I have a disparity map.
Based on the disparity map, hovering over the 'left image' displays:
the x and y of the image, so if I hover over the top-left-most pixel, it displays x:0, y:0.
The next step is to display the distance of that specific pixel. To make my life easy, I will try to do it with reprojectImageTo3D(disp, Q).
I got Q from stereoRectify
Now, reprojectImageTo3D in Python returns an n by 3 matrix, so I can see it as rows of x, y, z coordinates. How can I know which pixel these coordinates correspond to?
This is a sample of the 3D points that I saved using numpy.savetxt
http://pastebin.com/wwDCYwjA
BTW: I'm doing everything in Python, but the GUI is in Java; I don't have time to study GUI programming in Python.
If you calculated your disparity map correctly, you should get an (n1, n2, 1)-dimensional array, where n1, n2 are the numbers of image pixels along each axis and 1 is the number of channels (a single channel, which contains the distance in pixels between corresponding pixels of the left and right images). You can check that by typing disp.shape. After that, pass your disparity map's ndarray to the reprojectImageTo3D function to get an ndarray of shape (n1, n2, 3), where the third dimension contains the X, Y, Z coordinates of each 3D point. You can check that by typing:
threeDImage = cv2.reprojectImageTo3D(disp, Q)
print(threeDImage.shape)
And finally, since you built your disparity map from the left image, each pixel at image coordinates (x, y) on the left image (or disparity map), with x the column and y the row, corresponds to the 3D point threeDImage[y][x]. Keep in mind that row 0, column 0 is the top-left element of the matrix, following how OpenCV handles images:
0/0---column--->
|
|
row
|
|
v
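For example, here is a minimal sketch of looking up the distance for a hovered pixel. It assumes disp and Q are already computed as described above; the helper name distance_at is purely illustrative:

import numpy as np
import cv2

threeDImage = cv2.reprojectImageTo3D(disp, Q)   # shape: (rows, cols, 3)

def distance_at(x, y):
    """Distance of the pixel at image coordinates (x, y), in the units encoded in Q."""
    X, Y, Z = threeDImage[y, x]                 # row index = y, column index = x
    return float(np.sqrt(X**2 + Y**2 + Z**2))   # or just Z for depth along the optical axis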
I am trying to associate rgb values to pixel coordinates after having done a perspective projection. The equation for the perspective projection is:

x = fx * X / Z + cx,    y = fy * Y / Z + cy

where x, y are the pixel locations of the point, X, Y, and Z are the locations of the point in the camera frame, and the other parameters (fx, fy, cx, cy) denote the intrinsic camera parameters. Given a point cloud containing the point locations and rgb values, I would like to associate rgb values to pixel locations according to this perspective projection.
The following code should create the correct image:
import matplotlib.pyplot as plt
import open3d as o3d
import numpy as np

cx = 325.5
cy = 253.5
fx = 518.0
fy = 519.0
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]])

pcd = o3d.io.read_point_cloud('freiburg.pcd', remove_nan_points=True)
points = np.array(pcd.points)
colors = np.array(pcd.colors)

projection = (K @ points.T).T
normalization = projection / projection[:, [2]]  # last element must be 1
pixel_coordinates = normalization.astype(int)

img = np.zeros((480, 640, 3))

# how can I fill the img appropriately? The matrix pixel_coordinates should
# inform about where to place the color intensities.
for position, intensity in zip(pixel_coordinates, colors):
    row, column = position[0], position[1]
    #img[row, column, :] = intensity # returns with error
    img[column, row, :] = intensity # gives a strange picture.
The point cloud can be read here. I expect to be able to associate the rgb values in the last loop shown above. Strangely, if the second-to-last line of the loop is uncommented, the program raises an IndexError while attempting to write an rgb value outside the range of available columns. The last line in the loop, however, runs without problems. The generated picture and the correct picture can be seen below:
How can I modify the code above to obtain the correct image?
A couple of issues:
You are ignoring the nonlinear distortion in the projection. Are the images you are comparing to undistorted? If they are, are you sure your projection matrix K is the one associated with the undistorted image?
Projecting the 3D points will inevitably produce a point cloud on the image plane, not a continuous image. To produce a somewhat natural image you will likely need to interpolate nearby samples of the 2D point cloud, and your choice of interpolation filter determines the quality of the result. For example, you could first make an image of rgb buckets and a similar image of weights, project the 3D points, and place each point's rgb values in the closest bucket (the one obtained by rounding the projected x, y coordinates), with a weight equal to the reciprocal of the distance of the projection from the bucket's center (i.e. the reciprocal of the Euclidean norm of the rounding residuals). You then compute the output pixel values as weighted averages at each bucket and, if there are any unfilled buckets, fill them by (say) bilinear interpolation of the filled neighbors. That last step will fill 1-pixel holes surrounded by already filled values; for larger holes you will need to choose some kind of infill procedure.
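Here is a minimal sketch of that bucket-and-weights idea from the second point. The name splat_points is illustrative; it assumes points, colors and K as defined in the question, and leaves the hole-filling step out:

import numpy as np

def splat_points(points, colors, K, height=480, width=640, eps=1e-6):
    """Project 3D points and accumulate their colors into weighted pixel buckets."""
    proj = (K @ points.T).T
    uv = proj[:, :2] / proj[:, [2]]                  # (x, y) image coordinates
    px = np.rint(uv).astype(int)                     # closest bucket
    weights = 1.0 / (np.linalg.norm(uv - px, axis=1) + eps)   # reciprocal rounding residual

    img = np.zeros((height, width, 3))
    wsum = np.zeros((height, width))
    inside = (px[:, 0] >= 0) & (px[:, 0] < width) & (px[:, 1] >= 0) & (px[:, 1] < height)
    for (x, y), w, c in zip(px[inside], weights[inside], colors[inside]):
        img[y, x] += w * c                           # note: row index first
        wsum[y, x] += w

    filled = wsum > 0
    img[filled] = img[filled] / wsum[filled][:, None]   # weighted average per bucket
    return img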
I have a binary image with dots, which I obtained using OpenCV's goodFeaturesToTrack, as shown on Image1.
Image1 : Cloud of points
I would like to fit a grid of 4*25 dots on it, such as the one shown in Image2 (not all points are visible in the image, but it is a regular 4*25-point rectangle).
Image2 : Model grid of points
My model grid of 4*25 dots is parametrized by:
1 - The position of the top-left corner
2 - The inclination of the rectangle relative to the horizontal
The code below shows a function that builds such a model.
This problem seems to be close to a chessboard corner problem.
I would like to know how to fit my model cloud of points to the input image and get the position and angle of the cloud.
I can easily measure a distance between the two images (the input one and the one with the model grid), but I would like to avoid having to check every pixel and every angle in the image to find the minimum of this distance.
import numpy as np

def ModelGrid(pos, angle, shape):
    # Initialization of output image of size shape
    table = np.zeros(shape)

    # Parameters
    size_pan = [32, 20]  # Pixels
    nb_corners = [4, 25]
    index = np.ndarray([nb_corners[0], nb_corners[1], 2], dtype=np.dtype('int16'))
    angle = angle * np.pi / 180

    # Creation of the table
    for i in range(nb_corners[0]):
        for j in range(nb_corners[1]):
            index[i, j, 0] = pos[0] + j * int(size_pan[1] * np.sin(angle)) + i * int(size_pan[0] * np.cos(angle))
            index[i, j, 1] = pos[1] + j * int(size_pan[1] * np.cos(angle)) - i * int(size_pan[0] * np.sin(angle))
            if 0 < index[i, j, 0] < table.shape[0]:
                if 0 < index[i, j, 1] < table.shape[1]:
                    table[index[i, j, 0], index[i, j, 1]] = 1

    return table
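As a usage example (the position, angle, and shape values below are purely illustrative):

# build a 480x640 model image with the grid's top-left corner at (50, 50) and a 10 degree tilt
model = ModelGrid(pos=[50, 50], angle=10, shape=(480, 640))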
A solution I found, which works relatively well, is the following:
First, I create an index of positions of all positive pixels, just going through the image. I will call these pixels corners.
I then use this index to compute an average angle of inclination:
For each corner, I look for other corners that are close enough in certain areas so as to define a cross: for each pixel I find the ones that are directly to its left, right, top, and bottom.
I use this cross to calculate an inclination angle, and then use the median of all obtained inclination angles as the angle for my model grid of points.
Once I have this angle, I simply build a table using this angle and the positions of each corner.
The optimization function measures the number of coincident pixels on both images, and returns the best position.
This approach works fine for most examples, but the returned 'best position' has to be one of the corners, which does not guarantee that it is actually the best position, mainly if the top-left corner of the grid within the cloud of corners is missing.
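A rough sketch of the angle-estimation part of this approach (estimate_grid_angle and max_dist are illustrative names/values; corners is assumed to be an (N, 2) array of row, column coordinates of the positive pixels):

import numpy as np

def estimate_grid_angle(corners, max_dist=40):
    """Median inclination angle (degrees) estimated from each corner's nearest right-hand neighbour."""
    angles = []
    for p in corners:
        d = corners - p                                   # (drow, dcol) offsets to every other corner
        dist = np.hypot(d[:, 0], d[:, 1])
        # candidates roughly to the right: positive column offset, closer than max_dist,
        # and more horizontal than vertical
        right = (d[:, 1] > 0) & (dist < max_dist) & (np.abs(d[:, 0]) < np.abs(d[:, 1]))
        if np.any(right):
            nearest = d[right][np.argmin(dist[right])]
            angles.append(np.degrees(np.arctan2(-nearest[0], nearest[1])))
    return float(np.median(angles)) if angles else 0.0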
I am trying to warp a 640x360 image via the OpenCV remap function (in Python 2.7). The steps executed are the following:
Generate a curve and store its x and y coordinates in two separate arrays, curve_x and curve_y. I am attaching the generated curve as an image (using pyplot):
Load image via the opencv imread function
original = cv2.imread('C:\\Users\\User\\Desktop\\alaskan-landscaps3.jpg')
Execute a nested for loop so that each pixel is shifted upwards in proportion to the height of the curve at that point. For each pixel I calculate a warping factor by dividing the distance between the curve's y coordinate and the "ceiling" (360) by the height of the image. The factor is then multiplied by the distance between the pixel's y-coordinate and the "ceiling" in order to find the new distance that the pixel must have from the "ceiling" (it will be shorter since we shift upwards). Finally, I subtract this new distance from the "ceiling" to obtain the new y-coordinate for the pixel. I chose this formula to ensure that all entries in the map_y array used in the remap function are within the area of the original image (see the quick check after the snippet below).
for i in range(0, y_size):
    for j in range(0, x_size):
        map_y[i][j] = y_size - ((y_size - i) * ((y_size - curve_y[j]) / y_size))
        map_x[i][j] = j
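For instance, a quick check of the formula's range with illustrative values (y_size = 360 and a hypothetical curve height of 90 at some column) shows that the mapped values stay inside the image:

y_size, curve_y_j = 360, 90.0                                       # illustrative values
for i in (0, 359):                                                  # topmost and bottommost source rows
    print(y_size - ((y_size - i) * ((y_size - curve_y_j) / y_size)))  # prints 90.0 and 359.25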
Then using the remap function
warped=cv2.remap(original,map_x,map_y,cv2.INTER_LINEAR)
The resulting image appears to be warped somewhat along the curve's path, but it is cropped. I am attaching both the original and the resulting image.
I know I must be missing something, but I can't figure out where the mistake in my code is. I don't understand why, since all y-coordinates in map_y are between 0 and 360, the top third of the image has disappeared following the remapping.
Any pointers or help will be appreciated. Thanks
[EDIT:] I have edited my function as follows:
# array to store previous y-coordinate, used as a counter during mapping process
floor_y = np.zeros((x_size), np.float32)

# for each row and column of picture
for i in range(0, y_size):
    for j in range(0, x_size):
        # calculate distance between top of the curve at given x coordinate and top
        height_above_curve = (y_size - 1) - curve_y_points[j]
        # calculate a mapping factor, using total height of picture and distance above curve
        mapping_factor = (y_size - 1) / height_above_curve
        # if there was no curve at given x-coordinate then do not change the pixel coordinate
        if curve_y_points[j] == 0:
            map_y[i][j] = j
        # if this is the first time the column is traversed, save the curve y-coordinate
        elif floor_y[j] == 0:
            # the pixel is translated upwards according to the height of the curve at that point
            floor_y[j] = i + curve_y_points[j]
            map_y[i][j] = i + curve_y_points[j]  # new coordinate saved
        # use a modulo operation to only translate each nth pixel where n is the mapping factor.
        # the idea is that in order to fit all pixels from the original picture into a new smaller space
        # (because the curve squashes the picture upwards) a number of pixels must be removed
        elif (math.floor(i % mapping_factor)) == 0:
            # increment the "floor" counter so that the next group of pixels from the original image
            # are mapped 1 pixel higher up than the previous group in the new picture
            floor_y[j] = floor_y[j] + 1
            map_y[i][j] = floor_y[j]
        else:
            # for pixels that must be skipped map them all to the last pixel actually translated to the new image
            map_y[i][j] = floor_y[j]
        # all x-coordinates remain unchanged as we only translate pixels upwards
        map_x[i][j] = j

# printout to test mappings at x=383
for j in range(0, 360):
    print('At x=383, y=' + str(j) + ' for curve_y_points[383]=' + str(curve_y_points[383]) +
          ' and floor_y[383]=' + str(floor_y[383]) + ' mapping is: ' + str(map_y[j][383]))
The bottom line is that now the higher part of the image should not receive mappings from the lowest part, so overwriting of pixels should not take place. Yet I am still getting a hugely exaggerated upwards warping effect in the picture, which I cannot explain (see the new image below). The top of the curved part is at around y=140 in the original picture, yet now it is very close to the top, i.e. around y=300. There is also the question of why I am not getting a blank space at the bottom for the pixels below the curve.
I'm thinking that maybe there is also something going on with the order of rows and columns in the map_y array?
I don't think the image is being cropped. Rather, the values are "crowded" in the top-middle pixels, so that they get overwritten. Consider the following example with a simple function on a checkerboard.
import numpy as np
import cv2
import matplotlib.pyplot as plt

y_size = 200
x_size = 200

x = np.linspace(0, x_size, x_size + 1)
y = (-(x - x_size / 2) * (x - x_size / 2)) / x_size + x_size
plt.plot(x, y)
plt.show()
The function looks like this:
Then let's produce an image with a regular pattern.
test=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
for j in range(0,x_size):
if i%2 and j%2:
test[i][j]=255
cv2.imwrite('checker.png',test)
Now let's apply your shift function to that pattern:
map_y=np.zeros((x_size,y_size),dtype=np.float32)
map_x=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
for j in range(0,x_size):
map_y[i][j]= y_size-((y_size - i) * ((y_size - y[j]) / y_size))
map_x[i][j]=j
warped=cv2.remap(test,map_x,map_y,cv2.INTER_LINEAR)
cv2.imwrite('warped.png',warped)
If you notice, because of the shift, more than one value corresponds to the top-middle areas, which makes it look like the image is cropped. But if you look at the top-left and top-right corners of the image, the values are sparser there, so the "cropping" effect does not occur as much. I hope this simple example helps to clarify what is going on.
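As a quick way to see this many-to-one behaviour (a sketch, assuming the map_y built in the snippet above), you can check which source rows a given column of the warped image actually samples; the source rows that never appear are the ones that seem to vanish:

# count how many distinct source rows are sampled at a given destination column
col = 100                                             # the column under the curve's peak
sampled_rows = np.unique(np.round(map_y[:, col]).astype(int))
print(len(sampled_rows), "of", y_size, "source rows are sampled at column", col)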
I am using the OpenCV HoughCircles method in Python as follows:
circles = cv2.HoughCircles(img, cv.CV_HOUGH_GRADIENT, 1, 20,
                           param1=50, param2=30, minRadius=0, maxRadius=0)
This seems to work quite well. However, one thing I noticed is that it detects circles which can extend outside of the image boundaries. Does anyone know how I can filter these results out?
Think of each circle as being bounded inside a square of dimensions 2r x 2r where r is the radius of the circle. Also, the centre of this box is located at (x,y) which also corresponds to where the centre of the circle is located in the image. To see if the circle is within the image boundaries, you simply need to make sure that the box that contains the circle does not go outside of the image. Mathematically speaking, you would need to ensure that:
r <= x <= cols-1-r
r <= y <= rows-1-r # Assuming 0-indexing
rows and cols are the rows and columns of your image. All you really have to do now is cycle through every circle in the detected result and filter out those that go outside the image boundaries by checking that the centre of each circle satisfies the two inequalities above. If a circle satisfies them, you keep it; any circle that doesn't is left out of the final result.
To put this logic to code, do something like this:
import cv # Load in relevant packages
import cv2
import numpy as np
img = cv2.imread(...,0) # Load in image here - Ensure 8-bit grayscale
final_circles = [] # Stores the final circles that don't go out of bounds
circles = cv2.HoughCircles(img,cv.CV_HOUGH_GRADIENT,1,20,param1=50,param2=30,minRadius=0,maxRadius=0) # Your code
rows = img.shape[0] # Obtain rows and columns
cols = img.shape[1]
circles = np.round(circles[0, :]).astype("int") # Convert to integer
for (x, y, r) in circles: # For each circle we have detected...
    if (r <= x <= cols-1-r) and (r <= y <= rows-1-r): # Check if circle is within boundary
        final_circles.append([x, y, r]) # If it is, add this to our final list

final_circles = np.asarray(final_circles).astype("int") # Convert to numpy array for compatibility
The peculiar thing about cv2.HoughCircles is that it returns a 3D matrix where the first dimension is a singleton dimension. To eliminate this singleton dimension, I did circles[0, :] which will result in a 2D matrix. Each row of this new 2D matrix contains a tuple of (x, y, r) and characterizes where a circle is located in your image as well as its radius. I also converted the centres and radii to integers so that if you decide to draw them later on, you will be able to do it with cv2.circle.
You could add a function that takes the centre and the radius of each circle, adds and subtracts the radius from the centre coordinates, and checks whether the result falls outside the boundaries of your image.
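A minimal sketch of that idea (in_bounds is an illustrative name; circles, rows and cols are as in the snippet above):

def in_bounds(x, y, r, rows, cols):
    """True if the circle's bounding box stays inside the image."""
    return (x - r >= 0) and (x + r <= cols - 1) and (y - r >= 0) and (y + r <= rows - 1)

kept = [(x, y, r) for (x, y, r) in circles if in_bounds(x, y, r, rows, cols)]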
Can anyone please explain if it is possible, and if so how, to work with cv2.getPerspectiveTransform().
I have 3D information about my image: I know the lengths of a and b, and also the different heights of c, d, e, f and g. I made the heights different to get more 3D information, but if that isn't needed I would prefer to keep them equal.
Ultimately, I need to know where the pink dot really is in the rectangle after applying the transform to the [x,y] position I get from the camera feed.
If you denote by C, D, E, F the positions of the four corners of the black polygon in the original image (each of them a 2D point), and by C', D', E', F' the positions of the corresponding points in your target image (probably (0,0), (a, 0), (a, b), (0, b)), then M = cv2.getPerspectiveTransform({C,D,E,F}, {C',D',E',F'}) (the two point sets passed as 4x2 float32 arrays) is the perspective transformation from one polygon to the other.
Given the position G of the vertical projection of g onto the black polygon in the original image, you can compute its position in the target image as cv2.transform(G, M). This will return a point (x,y,z), where the last coordinate z is a normalizing term. This z is zero when your point would be "at infinity" in the target image. If z is not zero, the point you are looking for is (x/z, y/z).
If z is zero, your point is at infinity, in the direction of the support of vector (x, y) (think of the case where G would be at the intersection of the supporting lines of two opposite sides of the black polygon in the source image).
If you know that the heights of c,d,e,f,g are equal, these points are also coplanar, and the exact same method applies to c,d,e,f,g instead of C,D,E,F,G.
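A minimal sketch of those two steps (all coordinate values below are placeholders; a and b stand for the rectangle's side lengths from the question):

import numpy as np
import cv2

# hypothetical image coordinates of the black polygon's corners C, D, E, F
src = np.float32([[110, 220], [480, 205], [500, 410], [90, 430]])

# their target positions in an a x b rectangle (illustrative side lengths)
a, b = 400, 300
dst = np.float32([[0, 0], [a, 0], [a, b], [0, b]])

M = cv2.getPerspectiveTransform(src, dst)

# G: vertical projection of the pink dot onto the polygon's plane, in image coordinates
G = np.float32([[[250, 300]]])                 # shape (1, 1, 2), as cv2.transform expects
x, y, z = cv2.transform(G, M)[0, 0]
if z != 0:
    print("dot position in the rectangle:", (x / z, y / z))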