I am trying to warp a 640x360 image via the OpenCV remap function (in Python 2.7). The steps executed are the following:
Generate a curve and store its x and y coordinates in two separate arrays, curve_x and curve_y. I am attaching the generated curve as an image (using pyplot):
Load the image via the OpenCV imread function:
original = cv2.imread('C:\\Users\\User\\Desktop\\alaskan-landscaps3.jpg')
Execute a nested for loop so that each pixel is shifted upwards in proportion to the height of the curve at that point. For each pixel I calculate a warping factor by dividing the distance between the curve's y coordinate and the "ceiling" (360) by the height of the image. The factor is then multiplied by the distance between the pixel's y-coordinate and the "ceiling" in order to find the new distance that the pixel must have from the "ceiling" (it will be shorter since we have an upward shift). Finally I subtract this new distance from the "ceiling" to obtain the new y-coordinate for the pixel. I thought of this formula in order to ensure that all entries in the map_y array used in the remap function will be within the area of the original image.
for i in range(0, y_size):
    for j in range(0, x_size):
        map_y[i][j] = y_size - ((y_size - i) * ((y_size - curve_y[j]) / y_size))
        map_x[i][j] = j
Then I use the remap function:
warped=cv2.remap(original,map_x,map_y,cv2.INTER_LINEAR)
The resulting image appears to be warped somewhat along the curve's path, but it is cropped - I am attaching both the original and resulting image.
I know I must be missing something, but I can't figure out where the mistake is in my code. I don't understand why the top third of the image has disappeared after the remapping, since all y-coordinates in map_y are between 0 and 360.
Any pointers or help will be appreciated. Thanks.
[EDIT:] I have edited my function as follows:
# array to store previous y-coordinate, used as a counter during mapping process
floor_y = np.zeros((x_size), np.float32)
# for each row and column of the picture
for i in range(0, y_size):
    for j in range(0, x_size):
        # calculate distance between the curve at the given x-coordinate and the top of the picture
        height_above_curve = (y_size-1) - curve_y_points[j]
        # calculate a mapping factor, using the total height of the picture and the distance above the curve
        mapping_factor = (y_size-1) / height_above_curve
        # if there was no curve at the given x-coordinate then do not change the pixel coordinate
        if curve_y_points[j] == 0:
            map_y[i][j] = i
        # if this is the first time the column is traversed, save the curve y-coordinate
        elif floor_y[j] == 0:
            # the pixel is translated upwards according to the height of the curve at that point
            floor_y[j] = i + curve_y_points[j]
            map_y[i][j] = i + curve_y_points[j]  # new coordinate saved
        # use a modulo operation to only translate each nth pixel where n is the mapping factor.
        # the idea is that in order to fit all pixels from the original picture into a new smaller space
        # (because the curve squashes the picture upwards) a number of pixels must be removed
        elif (math.floor(i % mapping_factor)) == 0:
            # increment the "floor" counter so that the next group of pixels from the original image
            # are mapped 1 pixel higher up than the previous group in the new picture
            floor_y[j] = floor_y[j] + 1
            map_y[i][j] = floor_y[j]
        else:
            # for pixels that must be skipped, map them all to the last pixel actually translated to the new image
            map_y[i][j] = floor_y[j]
        # all x-coordinates remain unchanged as we only translate pixels upwards
        map_x[i][j] = j

# printout to test mappings at x=383
for j in range(0, 360):
    print('At x=383, y=' + str(j) + ' for curve_y_points[383]=' + str(curve_y_points[383]) + ' and floor_y[383]=' + str(floor_y[383]) + ' mapping is: ' + str(map_y[j][383]))
The bottom line is that now the higher part of the image should not receive mappings from the lowest part, so overwriting of pixels should not take place. Yet I am still getting a hugely exaggerated upwards warping effect in the picture which I cannot explain (see new image below). The top of the curved part is at around y=140 in the original picture, yet now it is very close to the top, i.e. around y=300. There is also the question of why I am not getting a blank space at the bottom for the pixels below the curve.
I'm thinking that maybe there is also something going on with the order of rows and columns in the map_y array?
I don't think the image is being cropped. Rather, the values are "crowded" in the top-middle pixels, so that they get overwritten. Consider the following example with a simple function on a checkerboard.
import numpy as np
import cv2
import matplotlib.pyplot as plt

y_size = 200
x_size = 200

x = np.linspace(0, x_size, x_size+1)
y = (-(x - x_size/2) * (x - x_size/2)) / x_size + x_size
plt.plot(x, y)
The function looks like this:
Then let's produce an image with a regular pattern.
test = np.zeros((x_size, y_size), dtype=np.float32)
for i in range(0, y_size):
    for j in range(0, x_size):
        if i % 2 and j % 2:
            test[i][j] = 255
cv2.imwrite('checker.png', test)
Now let's apply your shift function to that pattern:
map_y = np.zeros((x_size, y_size), dtype=np.float32)
map_x = np.zeros((x_size, y_size), dtype=np.float32)
for i in range(0, y_size):
    for j in range(0, x_size):
        map_y[i][j] = y_size - ((y_size - i) * ((y_size - y[j]) / y_size))
        map_x[i][j] = j
warped = cv2.remap(test, map_x, map_y, cv2.INTER_LINEAR)
cv2.imwrite('warped.png', warped)
If you notice, because of the shift, more than one source value corresponds to the top-middle areas, so those pixels get overwritten, which makes it look like the image is cropped. But if you check the top left and right corners of the image, the values are sparser there, so the "cropping" effect does not occur as much. I hope this simple example helps you understand what is going on.
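As a side note, the two maps don't have to be built with nested Python loops. Here is a quick vectorized sketch under the same assumptions (y is the curve array from above, and cv2.remap requires float32 maps):
# column indices (jj) and row indices (ii) for every pixel, each of shape (y_size, x_size)
jj, ii = np.meshgrid(np.arange(x_size), np.arange(y_size))
# same formula as the nested loop, applied to whole arrays at once
map_y = (y_size - (y_size - ii) * ((y_size - y[:x_size]) / y_size)).astype(np.float32)
map_x = jj.astype(np.float32)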
Related
There are several packages available to digitize line graphs, e.g. GetData Graph Digitizer. However, for digitization of heat maps I could not find any packages or programs.
I want to digitize a heat map (an image in png or jpg format) using Python. How can I do it?
Do I need to write the entire code from scratch?
Or there are any packages available?
There are multiple ways to do it, and many machine learning libraries offer custom visualization functions, some easier to use than others.
You need to split the problem in two.
First, using OpenCV for Python or scikit-image, you have to load the image as a matrix. You can set some offsets to start right at the beginning of the cells.
import cv2
# 1 - read color image (3 color channels)
image = cv2.imread('test.jpg',1)
Then you iterate through the cells and read the color inside each one. You can normalise the result if you want. The reason we introduce some offsets is that the heatmap doesn't start in the top left corner of the original image at (0,0). offset_x and offset_y will be lists with 2 values each:
offset_x[0]: the offset from the left edge of the image up to the beginning of the heatmap (i.e. start_of_heatmap_x)
offset_x[1]: the offset from the right edge of the image up to the end of the heatmap (i.e. image_width - end_of_heatmap_x)
offset_y[0]: the offset from the top edge of the image up to the beginning of the heatmap (i.e. start_of_heatmap_y)
offset_y[1]: the offset from the bottom edge of the image up to the end of the heatmap (i.e. image_height - end_of_heatmap_y)
Also, we don't iterate up to the last column. That's because we start from the "0-th" column and add cell_size/2 to each base local coordinate to obtain the center value of each cell.
def read_as_digital(image, cell_size, offset_x, offset_y):
    # grab the image dimensions
    h = image.shape[0]
    w = image.shape[1]
    results = []

    # loop over the image, cell by cell
    for y in range(offset_y[0], h - offset_y[1] - cell_size, cell_size):
        row = []
        for x in range(offset_x[0], w - offset_x[1] - cell_size, cell_size):
            # append the color at the center of the heatmap cell to the row
            # (numpy images are indexed [row, column], i.e. [y, x])
            row.append(image[y + int(cell_size/2), x + int(cell_size/2)])
        results.append(row)

    # return the grid of sampled cell colors
    return results
Extracting the legend information is not hard, because we can derive the values from the limits (although this only applies to linear scales).
So, for example, we can derive the step of the legend (for x and y).
def generate_legend(length, offset, cell_size, legend_start, legend_end):
    nr_of_cells = (length - offset[0] - offset[1]) / cell_size
    step_size = (legend_end - legend_start) / nr_of_cells
    i = legend_start + step_size/2  # a little offset to center on the cell
    values = []
    while i < legend_end:
        values.append(i)
        i = i + step_size
    return values
Then you want to visualize the results to see if everything was done right. For example, with seaborn it's very easy [1]. If you want more control over just about anything, you can use scikit-learn and matplotlib [2].
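For example, here is a rough usage sketch tying the two functions together (the file name, cell size, offsets and legend limits are made-up values that you would measure on your own heatmap image):
import cv2
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

image = cv2.imread('test.jpg', 1)  # hypothetical heatmap image
cells = read_as_digital(image, cell_size=20, offset_x=[35, 80], offset_y=[40, 25])

# keep one channel (OpenCV loads BGR, so index 2 is red) to plot the digitized grid
grid = np.array(cells)[:, :, 2].astype(np.float32)

# legend values along x, assuming a linear scale from 0 to 10 (made-up limits)
x_values = generate_legend(image.shape[1], [35, 80], 20, legend_start=0.0, legend_end=10.0)

sns.heatmap(grid)
plt.show()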
I have a binary image with dots, which I obtained using OpenCV's goodFeaturesToTrack, as shown on Image1.
Image1 : Cloud of points
I would like to fit a grid of 4*25 dots on it, such as the one shown in Image2 (not all points are visible on the image, but it is a regular 4*25 points rectangle).
Image2 : Model grid of points
My model grid of 4*25 dots is parametrized by :
1 - The position of the top left corner
2 - The inclination of the rectangle with respect to the horizon
The code below shows a function that builds such a model.
This problem seems to be close to a chessboard corner problem.
I would like to know how to fit my model cloud of points to the input image and get the position and angle of the cloud.
I can easily measure a distance between the two images (the input one and the one with the model grid), but I would like to avoid having to check every pixel and angle on the image to find the minimum of this distance.
import numpy as np

def ModelGrid(pos, angle, shape):
    # Initialization of output image of size shape
    table = np.zeros(shape)

    # Parameters
    size_pan = [32, 20]  # Pixels
    nb_corners = [4, 25]
    index = np.ndarray([nb_corners[0], nb_corners[1], 2], dtype=np.dtype('int16'))
    angle = angle * np.pi / 180

    # Creation of the table
    for i in range(nb_corners[0]):
        for j in range(nb_corners[1]):
            index[i, j, 0] = pos[0] + j*int(size_pan[1]*np.sin(angle)) + i*int(size_pan[0]*np.cos(angle))
            index[i, j, 1] = pos[1] + j*int(size_pan[1]*np.cos(angle)) - i*int(size_pan[0]*np.sin(angle))
            if 0 < index[i, j, 0] < table.shape[0]:
                if 0 < index[i, j, 1] < table.shape[1]:
                    table[index[i, j, 0], index[i, j, 1]] = 1
    return table
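A call would look something like this (position, angle and image size are arbitrary example values):
# hypothetical values: top-left corner at row 50, column 80, tilted by 5 degrees, 480x640 image
grid_image = ModelGrid(pos=[50, 80], angle=5, shape=(480, 640))
print(grid_image.sum())  # number of model points that landed inside the image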
A solution I found, which works relatively well, is the following:
First, I create an index of the positions of all positive pixels, just by going through the image. I will call these pixels corners.
I then use this index to compute an average angle of inclination:
For each of the corners, I look for others which are close enough in certain areas, so as to define a cross. For each pixel I manage to find the ones that are directly to its left, right, top and bottom.
I use this cross to calculate an inclination angle, and then use the median of all obtained inclination angles as the angle for my model grid of points (a rough sketch of this step is given at the end of this answer).
Once I have this angle, I simply build a table using this angle and the positions of each corner.
The optimization function measures the number of coincident pixels in both images, and returns the best position.
This way works fine for most examples, but the returned 'best position' has to be one of the corners, which does not imply that it corresponds to the best position, especially if the top left corner of the grid within the cloud of corners is missing.
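A rough sketch of the angle-estimation step, assuming corners is a list of (row, col) positions of the positive pixels and search_radius is a made-up tolerance close to the expected grid spacing:
import numpy as np

def estimate_grid_angle(corners, search_radius=40):
    """Median inclination (degrees) of the vector joining each corner to its nearest right-hand neighbour."""
    angles = []
    for (r, c) in corners:
        # candidate corners strictly to the right of the current one, within the search window
        right = [(r2, c2) for (r2, c2) in corners
                 if c2 > c and (c2 - c) < search_radius and abs(r2 - r) < search_radius]
        if not right:
            continue
        # the nearest of those candidates defines the local horizontal direction of the grid
        r2, c2 = min(right, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
        angles.append(np.degrees(np.arctan2(r - r2, c2 - c)))
    return float(np.median(angles)) if angles else 0.0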
I have a disparity map.
Based on the disparity map, hovering on the 'left image' displays:
The x and y of the image, so if I hover on the top-left most pixel, it will display x:0, y:0.
The next step is to display the distance of the specific pixel. To make my life easy, I will try to do it with reprojectImageTo3D(disp, Q).
I got Q from stereoRectify.
Now, reprojectImageTo3D in python returns an n by 3 matrix.
So I can see it is rows of x y z coordinates. I am wondering: how can I know which pixel these coordinates correspond to?
This is a sample of the 3D points that I saved using numpy.savetxt
http://pastebin.com/wwDCYwjA
BTW: I'm doing everything in python, but the GUI is in Java; I don't have time to study GUI programming in python.
If you calculate your disparity map correctly, you should get an (n1, n2, 1) dimensional array, where n1, n2 are the numbers of the image's pixels along each axis and 1 is the number of channels (a single channel, which contains the distance in pixels between corresponding pixels from the left and right images). You should check that by typing disp.shape. After that you should pass your disparity map's ndarray to the reprojectImageTo3D function and get back an ndarray of shape (n1, n2, 3) (the third dimension contains the X, Y, Z coords of the 3D point). You can check that by typing:
threeDImage = reprojectImageTo3D(disp, Q)
print threeDImage.shape
And finally, since you made your disparity map based on the left image, each pixel with coords x, y on the left image (or disparity map) corresponds to the 3D point threeDImage[x][y]. Keep in mind that row:0, column:0 is the top-left element of the matrix, following how OpenCV handles images:
0/0---column--->
|
|
row
|
|
v
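In code, the lookup for a given pixel is just an index into that array. A minimal sketch, assuming disp and Q have already been computed (the example pixel coordinates are arbitrary):
import cv2

threeDImage = cv2.reprojectImageTo3D(disp, Q)   # shape (n1, n2, 3)

row, col = 100, 250                             # example pixel: row (y) first, then column (x)
X, Y, Z = threeDImage[row][col]
print 'pixel at row %d, col %d -> X=%.2f, Y=%.2f, Z=%.2f' % (row, col, X, Y, Z)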
I need to search for outliers in more or less homogeneous images representing some physical array. The images have a resolution which is much higher than the screen resolution, so every pixel on screen originates from a block of image pixels. Is there a possibility to customize the algorithm which calculates the displayed value for such a block? In particular, the possibility to use either the lowest or the highest value would be helpful.
Thanks in advance.
Scipy provides several such filters. To get a new image (new) whose pixels are the maximum/minimum over a w*w block of an original image (img), you can use:
new = scipy.ndimage.filters.maximum_filter(img, w)
new = scipy.ndimage.filters.minimum_filter(img, w)
scipy.ndimage.filters has several other filters available.
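A short usage sketch of that approach: taking every w-th value of the filtered image gives one value per block, so the reduced image can be displayed directly (the file name and block size w are placeholders, and the image is assumed to be grayscale):
import cv2
from scipy.ndimage import maximum_filter

img = cv2.imread('array_scan.png', 0)  # hypothetical high-resolution image, loaded as grayscale
w = 8                                  # made-up block size: one displayed pixel per 8x8 block

# per-pixel block maximum, then keep one sample per block so the result is w times smaller
block_max = maximum_filter(img, w)[w//2::w, w//2::w]
cv2.imwrite('preview_max.png', block_max)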
If the standard filters don't fit your requirements, you can roll your own. To get you started here is an example that shows how to get the minimum in each block in the image. This function reduces the size of the full image (img) by a factor of w in each direction. It returns a smaller image (new) in which each pixel is the minimum pixel in a w*w block of pixels from the original image. The function assumes the image is in a numpy array:
import numpy as np

def condense(img, w):
    # one output pixel per w*w block of the input
    new = np.zeros((img.shape[0]//w, img.shape[1]//w))
    for i in range(0, img.shape[1]//w):
        col1 = i * w
        # reshape each w-column strip into rows of w*w blocks and take the per-block minimum
        new[:, i] = img[:, col1:col1+w].reshape(-1, w*w).min(1)
    return new
If you wanted the maximum, replace min with max.
For the condense function to work well, the size of the full image must be a multiple of w in each direction. The handling of non-square blocks or images that don't divide exactly is left as an exercise for the reader.
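A quick usage example of condense (the file name and block size are placeholders):
import cv2
import numpy as np

img = cv2.imread('array_scan.png', 0)        # hypothetical image, loaded as grayscale
small_min = condense(img, 8)                 # each output pixel is the minimum of an 8x8 block
cv2.imwrite('preview_min.png', small_min.astype(np.uint8))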
I'm trying to determine if an image is squared (pixelated).
I've heard of the 2D Fourier transform with numpy or scipy, but it is a bit complicated.
The goal is to determine the amount of squared zones due to bad compression, like this (img a):
I have no idea if this would work, but something you could try is to get the nearest neighbors around a pixel. The pixellated squares will show up as a visible jump in RGB values around a region.
You can find the nearest neighbors for every pixel in an image with something like:
def get_neighbors(x, y, img):
    ops = [-1, 0, +1]
    pixels = []
    for opy in ops:
        for opx in ops:
            try:
                pixels.append(img[x+opx][y+opy])
            except IndexError:
                # neighbor falls outside the image border
                pass
    return pixels
This will give you the nearest pixels in a region of your source image.
To use it, you'd do something like
import operator
import numpy as np
from scipy import misc

def detect_pixellated(fp):
    img = misc.imread(fp)
    width, height = np.shape(img)[0:2]

    # Pixel change to detect edge
    threshold = 20

    for x in range(width):
        for y in range(height):
            neighbors = get_neighbors(x, y, img)

            # Neighbors come in this order:
            # 6 7 8
            # 3 4 5
            # 0 1 2
            center = neighbors[4]
            del neighbors[4]

            for neighbor in neighbors:
                diffs = map(operator.abs, map(operator.sub, neighbor, center))
                possibleEdge = all(diff > threshold for diff in diffs)
After further thought though, use OpenCV and do edge detection and get contour sizes. That would be significantly easier and more robust.
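For what it's worth, a minimal sketch of that contour-based idea (the file name and Canny thresholds are guesses you would have to tune, and the "many small, similar-sized contours" criterion is only a heuristic):
import cv2

img = cv2.imread('suspect.jpg', 0)               # hypothetical input, loaded as grayscale
edges = cv2.Canny(img, 50, 150)                  # threshold values are guesses

# cv2.findContours returns the contour list second-to-last in both OpenCV 3 and 4
contours = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

# many small, similar-sized contours hint at block compression artefacts
areas = [cv2.contourArea(c) for c in contours]
print(len(areas), sorted(areas)[-10:])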
If you scan through lines of it, it's a bit easier, because then you deal with linear graphs instead of 2D image graphs, which is always simpler.
Solution:
Scan a line across the pixels, put the line in an array if that is faster to access for computations, and then run algorithms on the line(s) to determine the blockiness:
1/ Run through every pixel in your line and compare it to the previous pixel by subtracting the values of the two pixels. Make an array of previous pixel values. If large jumps in pixel values occur at regular intervals, it's blocky. If there are large jumps in values combined with small jumps in values, it's blocky. You can assume that if there are many equal pixel differences, it's blocky, especially if you repeat the analysis twice at 2 and 4 neighbour pixel intervals, and on multiple lines.
You can also make graphs of pixel differences between pixels 3-5-10 pixels apart, to have additional information on gradient changes along sampled lines of the pic. If the ratio of pixel differences of neighbour pixels and 5th-neighbour pixels is similar, it also indicates unsmooth colors.
There can be many algorithms, including a fast Fourier transform on a linear graph, same as with audio, that you would use on line(s) from the pic; that is simpler than a 2D image algorithm.
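A rough sketch of the line-scanning idea with numpy, assuming img is a grayscale numpy array (the jump threshold and the way "regular intervals" are measured are assumptions to tune):
import numpy as np

def row_blockiness(img, row, jump_threshold=25):
    """Return the fraction of large pixel-to-pixel jumps in one scanline and their median spacing."""
    line = img[row].astype(np.int32)
    diffs = np.abs(np.diff(line))
    jumps = np.where(diffs > jump_threshold)[0]
    if len(jumps) < 2:
        return 0.0, None
    # blocky images tend to show large jumps at near-constant spacing (e.g. every 8 pixels)
    spacings = np.diff(jumps)
    return len(jumps) / float(len(diffs)), float(np.median(spacings))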