I have been working on a piece of code to create a disparity map.
I don't want to use OpenCV for anything more than loading and saving the images and converting them to grayscale.
So far, I've managed to implement the algorithm explained in this website. I'm using the version of the algorithm that uses the Sum of Absolute Differences (SAD). To test my implementation, I'm using the stereo images from this dataset.
Here's my code:
import cv2
import numpy as np
# Load the stereo images
img = cv2.imread('bow-view1.png')
img2 = cv2.imread('bow-view5.png')
# convert stereo images to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
# get the size of the images
# l -> lines
# c -> columns
# v -> channel (RGB)
l,c,v = img.shape
# initialize arrays
minSAD = np.ones((l,c)) * 1000
sad = np.ones((l,c))
winsad = np.ones((l,c))
disp = np.zeros((l,c))
max_shift = 30
# set size of the SAD window
w_l = 2
w_c = 2
for shift in range(max_shift):
    print("New Shift: %d"%(shift))
    for u in range(0,l):
        for v in range(0,c):
            # calculate SAD
            if(u+shift < l):
                sad[u,v] = np.abs((int(gray[u,v]) - int(gray2[u+shift,v])))
            sum_sad = 0
            for d in range(w_l):
                for e in range(w_c):
                    if(u+d < l and v+e < c):
                        sum_sad += sad[u+d,v+e]
            winsad[u,v] = sum_sad
            # Save disparity
            if(sad[u,v] < minSAD[u,v]):
                minSAD[u,v] = winsad[u,v]
                disp[u,v] = shift
print("Process Complete")
# write disparity map to image
print("Disparity Map Generated")
This is the output generated by that code:
I should get an output similar (or very close to) this:
I've tried several window sizes (in the SAD step), but I keep getting results like this one or images that are all black.
Any answer that helps me figure out the problem or that at least points me in the right direction will be very appreciated!
One thing you are missing here is that every value in the disp array is a shift index between 0 and 29, and values that low all render as nearly black pixels in an 8-bit image. To spread these values over the 0 to 255 range, multiply the shift by 8 before saving.
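For reference, here is a minimal NumPy sketch of SAD block matching that also applies the scaling described above. It is not the asker's code. It assumes rectified images in which matching pixels are displaced along columns (horizontally), it compares the windowed cost rather than the per-pixel difference, and the names sad_disparity and to_uint8_image are hypothetical.

```python
import numpy as np

def sad_disparity(left, right, max_shift=30, win=2):
    """Brute-force SAD matching; assumes left[y, x] matches right[y, x - d]."""
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    h, w = left.shape
    best_cost = np.full((h, w), np.inf, dtype=np.float32)
    disp = np.zeros((h, w), dtype=np.uint8)
    for shift in range(min(max_shift, w)):
        # Per-pixel absolute difference; columns with no counterpart get the worst cost.
        diff = np.full((h, w), 255.0, dtype=np.float32)
        diff[:, shift:] = np.abs(left[:, shift:] - right[:, :w - shift])
        # Sum the differences over a win x win window anchored at each pixel.
        cost = np.zeros((h, w), dtype=np.float32)
        for d in range(win):
            for e in range(win):
                cost[:h - d, :w - e] += diff[d:, e:]
        # Keep the shift with the lowest windowed cost seen so far.
        better = cost < best_cost
        best_cost[better] = cost[better]
        disp[better] = shift
    return disp

def to_uint8_image(disp, max_shift=30):
    """Scale shift indices 0..max_shift-1 into the visible 0..255 range."""
    scale = 255 // max(max_shift - 1, 1)  # 8 when max_shift is 30
    return (disp.astype(np.uint16) * scale).astype(np.uint8)
```

The scaled array can then be written with cv2.imwrite as usual. Deriving the scale factor from max_shift keeps the output within 8 bits if the search range changes.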
I have tried inverting the colors of a negative film image with the bitwise_not() function in Python, but the result has a blue tint. I would like to know how I could develop a negative film image so that it looks reasonably good. Here's the outcome of what I did. (I cropped the negative image for a new test I was doing, so don't mind that.)
If you don't use exact maximum and minimum, but 1st and 99th percentile, or something nearby (0.1%?), you'll get some nicer contrast. It'll cut away outliers due to noise, compression, etc.
Additionally, you will want to adjust gamma, or scale the values linearly, to achieve white balance.
I'll apply a "gray world assumption" and scale each plane so the mean is gray. I'll also mess with gamma, but that's just messing around.
And... all of that completely ignores gamma mapping, both of the "negative" and of the outputs.
import numpy as np
import cv2 as cv
import skimage.exposure
im = cv.imread("negative.png")
(bneg,gneg,rneg) = cv.split(im)
def stretch(plane):
    # take 1st and 99th percentile
    imin = np.percentile(plane, 1)
    imax = np.percentile(plane, 99)
    # stretch the image
    plane = (plane - imin) / (imax - imin)
    return plane
b = 1 - stretch(bneg)
g = 1 - stretch(gneg)
r = 1 - stretch(rneg)
bgr = cv.merge([b,g,r])
cv.imwrite("positive.png", bgr * 255)
b = 1 - stretch(bneg)
g = 1 - stretch(gneg)
r = 1 - stretch(rneg)
# gray world
b *= 0.5 / b.mean()
g *= 0.5 / g.mean()
r *= 0.5 / r.mean()
bgr = cv.merge([b,g,r])
cv.imwrite("positive_grayworld.png", bgr * 255)
b = 1 - np.clip(stretch(bneg), 0, 1)
g = 1 - np.clip(stretch(gneg), 0, 1)
r = 1 - np.clip(stretch(rneg), 0, 1)
# goes in the right direction
b = skimage.exposure.adjust_gamma(b, gamma=b.mean()/0.5)
g = skimage.exposure.adjust_gamma(g, gamma=g.mean()/0.5)
r = skimage.exposure.adjust_gamma(r, gamma=r.mean()/0.5)
bgr = cv.merge([b,g,r])
cv.imwrite("positive_gamma.png", bgr * 255)
Here's what happens when gamma is applied to the inverted picture... a reasonably tolerable transfer function results from applying the same factor twice, instead of applying its inverse.
Trying to "undo" the gamma while ignoring that the values were inverted... causes serious distortions:
And the min/max values for contrast stretching also affect the whole thing.
A simple photo of a negative simply won't do. It'll include stray light that offsets the black point, at the very least. You need a proper scan of the negative.
Here is one simple way to do that in Python/OpenCV. Basically, one stretches each channel of the image to full dynamic range separately, recombines the channels, and then inverts the result.
import cv2
import numpy as np
import skimage.exposure
# read image
img = cv2.imread('boys_negative.png')
# separate channels (cv2.imread returns them in BGR order)
b,g,r = cv2.split(img)
# stretch each channel
r_stretch = skimage.exposure.rescale_intensity(r, in_range='image', out_range=(0,255)).astype(np.uint8)
g_stretch = skimage.exposure.rescale_intensity(g, in_range='image', out_range=(0,255)).astype(np.uint8)
b_stretch = skimage.exposure.rescale_intensity(b, in_range='image', out_range=(0,255)).astype(np.uint8)
# combine channels
img_stretch = cv2.merge([b_stretch, g_stretch, r_stretch])
# invert
result = 255 - img_stretch
cv2.imshow('input', img)
cv2.imshow('result', result)
# keep the windows open until a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('boys_negative_inverted.jpg', result)
Caveat: This works for this image, but may not be a universal solution for all images.
In the above, I did not clip when stretching, as I wanted to preserve all information. But if one wants to clip and still use skimage.exposure.rescale_intensity for stretching, that is easy enough with the following:
import cv2
import numpy as np
import skimage.exposure
# read image
img = cv2.imread('boys_negative.png')
# separate channels (cv2.imread returns them in BGR order)
b,g,r = cv2.split(img)
# compute clip points -- clip 1% only on high side
clip_rmax = np.percentile(r, 99)
clip_gmax = np.percentile(g, 99)
clip_bmax = np.percentile(b, 99)
clip_rmin = np.percentile(r, 0)
clip_gmin = np.percentile(g, 0)
clip_bmin = np.percentile(b, 0)
# stretch each channel
r_stretch = skimage.exposure.rescale_intensity(r, in_range=(clip_rmin,clip_rmax), out_range=(0,255)).astype(np.uint8)
g_stretch = skimage.exposure.rescale_intensity(g, in_range=(clip_gmin,clip_gmax), out_range=(0,255)).astype(np.uint8)
b_stretch = skimage.exposure.rescale_intensity(b, in_range=(clip_bmin,clip_bmax), out_range=(0,255)).astype(np.uint8)
# combine channels
img_stretch = cv2.merge([b_stretch, g_stretch, r_stretch])
# invert
result = 255 - img_stretch
cv2.imshow('input', img)
cv2.imshow('result', result)
# keep the windows open until a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('boys_negative_inverted2.jpg', result)
I have code that splits large images into 1024x1024 tiles with 10% overlap. After that, I process each 1024x1024 tile. Finally, I want to combine the processed tiles back into an image of the original size. How can I do the combining step? Could you share some sample code? Thanks.
import cv2
path_to_img = "demo.png"
img = cv2.imread(path_to_img)
img_h, img_w, _ = img.shape
split_width = 1024
split_height = 1024
def start_points(size, split_size, overlap=0):
    points = [0]
    stride = int(split_size * (1-overlap))
    counter = 1
    while True:
        pt = stride * counter
        if pt + split_size >= size:
            # last tile: align it with the image border and stop
            points.append(size - split_size)
            break
        else:
            points.append(pt)
        counter += 1
    return points
X_points = start_points(img_w, split_width, 0.1)
Y_points = start_points(img_h, split_height, 0.1)
splitted_images = []
for i in Y_points:
    for j in X_points:
        split = img[i:i+split_height, j:j+split_width]
        splitted_images.append(split)
To reconstruct the original image is almost the same principle. Simply use the horizontal and vertical coordinates you created and reverse the operation. You will of course need to use an external counter that will help you iterate through your list of patches. The only other intricacy you need is to declare a container that will house the final image. You can do that by declaring an array of the same type as the input image and setting it to all zeroes:
import numpy as np
final_image = np.zeros_like(img)
index = 0
for i in Y_points:
for j in X_points:
final_image[i:i+split_height, j:j+split_width] = splitted_images[index]
index += 1
final_image will now contain the image reconstructed from the patches. Take note that wherever patches overlap, I have simply overwritten the existing values, so the patch that is written last wins in any overlapping area.
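If overwriting is not what you want, one alternative is to average the patches wherever they overlap. The following is a minimal sketch under that assumption. merge_patches_averaged is a hypothetical helper, not part of OpenCV, and it expects the same coordinate lists and patch order that were used for splitting.

```python
import numpy as np

def merge_patches_averaged(patches, y_points, x_points, out_shape, patch_h, patch_w):
    """Rebuild an image from overlapping patches, averaging overlapped pixels."""
    acc = np.zeros(out_shape, dtype=np.float64)        # running sum of pixel values
    count = np.zeros(out_shape[:2], dtype=np.float64)  # how many patches cover each pixel
    index = 0
    for i in y_points:
        for j in x_points:
            acc[i:i + patch_h, j:j + patch_w] += patches[index]
            count[i:i + patch_h, j:j + patch_w] += 1
            index += 1
    count = np.maximum(count, 1)  # avoid dividing by zero for uncovered pixels
    if acc.ndim == 3:
        count = count[:, :, None]  # broadcast over the channel axis
    return np.rint(acc / count).astype(patches[0].dtype)
```

With the variables from the question, a call would look like merge_patches_averaged(splitted_images, Y_points, X_points, img.shape, split_height, split_width). Averaging tends to soften seams between independently processed tiles, at the cost of an extra floating-point buffer the size of the image.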
I am using Python and OpenCV for a brain segmentation project. I have segmented a brain MRI image using k-means segmentation. I want to save each segment produced by the k-means segmentation as a separate image. Please help me with this.
#k_means segmentation
epsilon = 0.01
number_of_iterations = 50
number_of_clusters = 4
print(criteria, 'Criteria K_means parameters')
#k means segmentation
_, labels, centers =cv2.kmeans(kmeans_input, number_of_clusters, None,
print(labels.shape, 'k-means segmentation')
#Adopting the labels
labels = labels.flatten('F')
for x in range (number_of_clusters): labels[labels == x] = centers [x]
print(labels.shape, 'adopting the tables value')
I would do it using sklearn kmeans segmentation as follows. I show how to create the segmented image and then select one color to present. I create a mask from thresholding the one color and then apply the mask to blacken out the other colors in the segmented image. You can write a loop over each color to get them all. It is also possible to use the mask to make the non-color be transparent rather than black. But I do not show that here. Or you can just save the binary masks.
from skimage import io
from sklearn import cluster
import sys
import cv2
# read input and convert to range 0-1
image = io.imread('barn.jpg')/255.0
h, w, c = image.shape
# flatten to a list of pixels with shape (h*w, c)
image_2d = image.reshape(h*w, c)
# set number of colors
numcolors = 6
# do kmeans processing
kmeans_cluster = cluster.KMeans(n_clusters=int(numcolors))
kmeans_cluster.fit(image_2d)
cluster_centers = kmeans_cluster.cluster_centers_
cluster_labels = kmeans_cluster.labels_
# need to scale result back to range 0-255
newimage = cluster_centers[cluster_labels].reshape(h, w, c)*255.0
newimage = newimage.astype('uint8')
# select cluster 3 (labels run from 0 to numcolors-1) and create mask
lower = cluster_centers[3]*255
upper = cluster_centers[3]*255
lower = lower.astype('uint8')
upper = upper.astype('uint8')
mask = cv2.inRange(newimage, lower, upper)
# apply mask to get layer 3
layer3 = newimage.copy()
layer3[mask == 0] = [0,0,0]
# save kmeans clustered image and layer 3
io.imsave('barn_kmeans.gif', newimage)
io.imsave('barn_kmeans_layer3.gif', layer3)
Clustered Image:
Result for color 3:
For a grayscale image, the following works for me.
from skimage import io
from sklearn import cluster
import sys
import cv2
# read input and convert to range 0-1
image = io.imread('barn_gray.jpg',as_gray=True)/255.0
h, w = image.shape
# flatten to a list of pixels with shape (h*w, 1)
image_2d = image.reshape(h*w,1)
# set number of colors
numcolors = 6
# do kmeans processing
kmeans_cluster = cluster.KMeans(n_clusters=int(numcolors))
kmeans_cluster.fit(image_2d)
cluster_centers = kmeans_cluster.cluster_centers_
cluster_labels = kmeans_cluster.labels_
# need to scale result back to range 0-255
newimage = cluster_centers[cluster_labels].reshape(h, w)*255.0
newimage = newimage.astype('uint8')
# select cluster 3 (labels run from 0 to numcolors-1) and create mask
# note the cluster numbers and corresponding colors are not constant from run to run
lower = cluster_centers[3]*255
upper = cluster_centers[3]*255
lower = lower.astype('uint8')
upper = upper.astype('uint8')
mask = cv2.inRange(newimage, lower, upper)
# apply mask to get layer 3
layer3 = newimage.copy()
layer3[mask == 0] = [0]
# save kmeans clustered image and layer 3
io.imsave('barn_gray_kmeans.gif', newimage)
io.imsave('barn_gray_kmeans_layer3.gif', layer3)
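The loop over all clusters mentioned earlier could look roughly like the sketch below. Note that it swaps the cv2.inRange color threshold for a direct comparison on the label array, which also keeps clusters apart if two centers happen to round to the same 8-bit color. The function name cluster_layers is hypothetical, and it assumes labels and centers in the layout produced by the code above (flat label array, centers scaled to the range 0-1).

```python
import numpy as np

def cluster_layers(labels, centers, h, w):
    """Return one uint8 image per cluster; pixels of other clusters are black."""
    labels_2d = labels.reshape(h, w)
    # Replace every pixel by its cluster center, scaled back to 0-255.
    quantized = (centers[labels] * 255.0).astype(np.uint8).reshape(h, w, -1)
    layers = []
    for k in range(len(centers)):
        mask = labels_2d == k          # boolean mask of cluster k
        layer = np.zeros_like(quantized)
        layer[mask] = quantized[mask]  # copy only the pixels of this cluster
        layers.append(layer)
    return layers
```

Each returned layer can then be saved with io.imsave, for example under a file name that includes the cluster index. The boolean masks themselves can be saved instead if only binary segment maps are needed.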
Disclaimer: This was part of a homework assignment, however, it has been passed in already. I'm simply looking for the correct solution for future know-how.
The goal with this program was to use the Python OpenCV library to implement image -> image steganography (Embedding/Extracting images inside other images). This is done with two images of equal size using the least significant bit(LSB) method.
The program allows the user to choose the number of bits used for embedding, so with 1 bit used the embedded image is nearly undetectable to the human eye, and with 7 you can clearly make out the hidden image.
I've implemented the embedding correctly, by taking the most significant bits (MSB) of each RGB byte from the secret image and setting them in the LSB positions of the cover image.
My problem is extracting the secret image after it has been embedded. After the code runs, the image that I'm left with seems to be only the blue representation of it. I'm not sure where I went wrong, but I have a feeling it has something to do with my bit manipulation techniques, or use of the OpenCV library. Any help is greatly appreciated, thanks in advance!
Code for extracting:
import cv2
import numpy
def extract(img1, bitsUsed):
    print("Extracting...")
    # Import image & get dimensions
    img = cv2.imread(img1)
    h = img.shape[0]
    w = img.shape[1]
    # Create new image to extract secret image
    # Same dimensions, and rgb channel
    secretImg = numpy.zeros((h,w,3), numpy.uint8)
    x, y = 0, 0
    # Loop thru each pixel
    while x < w:
        while y < h:
            # Grab the LSB (based on bitsUsed from embedding)
            lsb_B = img.item(y,x,0) & bitsUsed
            lsb_G = img.item(y,x,1) & bitsUsed
            lsb_R = img.item(y,x,2) & bitsUsed
            # Place those bits into MSB positions on new img
            secretImg.itemset((y,x,0), lsb_B << (8 - bitsUsed))
            secretImg.itemset((y,x,0), lsb_G << (8 - bitsUsed))
            secretImg.itemset((y,x,0), lsb_R << (8 - bitsUsed))
            y += 1
        y = 0
        x += 1
    cv2.imwrite("extractedImg.png", secretImg)
njuffa is correct. In the extraction, when you have embedded only 1 bit, you want to AND with 0b00000001 (1), for 2 bits with 0b00000011 (3), for 3 bits with 0b00000111 (7), etc. Generally, for k embedded bits, you want the mask 2**k - 1.
Moreover, cv2.imread() will generate a numpy array of the pixels. Instead of looping through each pixel, you can vectorise your computations. All in all, this is what your code could look like.
import cv2
def embed(cover_file, secret_file, k):
    cover = cv2.imread(cover_file)
    secret = cv2.imread(secret_file)
    mask = 256 - 2**k
    stego = (cover & mask) | (secret >> (8 - k))
    cv2.imwrite('stego.png', stego)
def extract(stego_file, k):
    stego = cv2.imread(stego_file)
    mask = 2**k - 1
    output = (stego & mask) << (8 - k)
    cv2.imwrite('extracted.png', output)
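To check the masks without touching any files, the same bit operations can be tried on in-memory arrays. This is a small sketch with hypothetical helper names (embed_bits, extract_bits), assuming uint8 arrays of equal shape and k between 1 and 7.

```python
import numpy as np

def embed_bits(cover, secret, k):
    """Store the k most significant bits of secret in the k low bits of cover."""
    keep_mask = np.uint8(256 - 2**k)  # clears the k low bits of the cover
    return (cover & keep_mask) | (secret >> np.uint8(8 - k))

def extract_bits(stego, k):
    """Move the k low bits of stego back into the most significant positions."""
    low_mask = np.uint8(2**k - 1)     # selects the k low bits
    return (stego & low_mask) << np.uint8(8 - k)
```

Extraction recovers only the top k bits of each secret byte, so the result equals the secret image with its lower 8 - k bits set to zero. The stego image differs from the cover by at most 2**k - 1 per byte.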
I am trying to determine the centroid of one specific object using OpenCV and Python.
I am using the following code, but it is taking too much time to calculate the centroid.
I need a faster approach for this -- should I change the resolution of the cameras in order to increase the computing speed?
This is my code:
#taking infinite frames continuously to make a video
ret, frame = capture.read()
rgb_image = cv2.cvtColor(frame , 0)
content_red = rgb_image[:,:,2] #red channel of image
content_green = rgb_image[:,:,1] #green channel of image
content_blue = rgb_image[:,:,0] #blue channel of image
r = rgb_image.shape[0] #gives the rows of the image matrix
c = rgb_image.shape[1] # gives the columns of the image matrix
d = rgb_image.shape[2] #gives the depth order of the image matrix
binary_image = np.zeros((r,c),np.float32)
for i in range (1,r): #thresholding the object as per requirements
    for j in range (1,c):
        if((content_red[i][j]>186) and (content_red[i][j]<230) and \
            (content_green[i][j]>155) and (content_green[i][j]<165) and \
            (content_blue[i][j]> 175) and (content_blue[i][j]< 195)):
            binary_image[i][j] = 1
cox = np.mean(meanI) #x-coordinate of centroid
coy = np.mean(meanJ) #y-coordinate of centroid
As you have discovered, nested loops in Python are very slow. It is best to avoid iterating over every pixel using nested loops. Fortunately, OpenCV has some built-in functions that do exactly what you are trying to achieve: inRange(), which creates a binary image of pixels which fall in between the specified bounds, and moments(), which you can use to calculate the centroid of a binary image. I strongly suggest reading over OpenCV's documentation to get a feel for what the library offers.
Combining these two functions gives the following code:
import numpy as np
import cv2
lower = np.array([175, 155, 186], np.uint8) # Note these ranges are BGR ordered
upper = np.array([195, 165, 230], np.uint8)
binary = cv2.inRange(im, lower, upper) # im is your BGR image
moments = cv2.moments(binary, True)
cx = moments['m10'] / moments['m00']
cy = moments['m01'] / moments['m00']
cx and cy are the x- and y-coordinates of the image centroid. This version is a whopping 3000 times faster than using nested loops.
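One caveat: if no pixel falls inside the range, moments['m00'] is zero and the division above raises a ZeroDivisionError. Below is a minimal NumPy sketch of the same centroid computation with that case handled. mask_centroid is a hypothetical helper, and it assumes a binary mask such as the one returned by inRange().

```python
import numpy as np

def mask_centroid(mask):
    """Return (cx, cy) of the non-zero pixels of mask, or None if it is empty."""
    ys, xs = np.nonzero(mask)  # row and column indices of the object pixels
    if xs.size == 0:
        return None            # nothing was detected in this frame
    return float(xs.mean()), float(ys.mean())
```

In a video loop, a None result can simply be skipped for that frame. The mean of the pixel coordinates is the same quantity as m10/m00 and m01/m00 for a binary image.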