python split image in overlapping and rotating tiles - python

I am doing image classification on very imbalanced data. I am trying a couple of approaches to overcome the imbalance; one of them is oversampling the minority class.
The images I have are already high resolution (1392x1038), so I am splitting each of them into 16 tiles of size 348x256. Since oversampling simply replicates the minority class, I was thinking of splitting the image into overlapping tiles with a stride of 1 or 2, so I would have slightly different images, which would also help with oversampling. The following code splits the image into a specified number of overlapping tiles of a defined size:
for i in range(0, count):
    start_row_idx = random.randint(0, img_height-target_height-1)
    start_col_idx = random.randint(0, img_width-target_width-1)
    if mode == 'rgb':
        patch = img_array[start_row_idx:(start_row_idx+target_height), start_col_idx:(start_col_idx+target_width), :]
    else:
        patch = img_array[start_row_idx:(start_row_idx+target_height), start_col_idx:(start_col_idx+target_width)]
    patches.append(patch)
    idxs.append((start_row_idx, start_col_idx))
How can I make it work for rotated, overlapping tiles with a defined number of tiles and a defined size?
Edited Question:
In the following image, the black squares show the horizontal stride and the tiles that I am already able to get. I want to get the red squares in that shape. I think that with the red type of cropping I would be able to get more images for oversampling.

As discussed above, your tiles can already overlap, so that part is handled. What is missing is rotating the tiles as well. For each tile, we first need to generate a random angle of rotation.
After that, it is simply a matter of applying an affine transform that is purely a rotation to the tile and then appending the result to the list. The problem with rotating images in OpenCV is that the rotated image is cropped to the original dimensions, so the entire tile is no longer contained in the result.
I used the following post as inspiration to address this issue so that the rotated image is fully contained. Take note that the image will expand in dimensions to accommodate the rotation and to keep the entire image contained in the rotated result.
import math

import cv2
import numpy as np

def rotate_about_center(src, angle):
    h, w = src.shape[:2]
    rangle = np.deg2rad(angle)  # angle in radians
    # now calculate new image width and height
    nw = (abs(np.sin(rangle)*h) + abs(np.cos(rangle)*w))
    nh = (abs(np.cos(rangle)*h) + abs(np.sin(rangle)*w))
    # ask OpenCV for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw*0.5, nh*0.5), angle, 1)
    # calculate the move from the old centre to the new centre combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw-w)*0.5, (nh-h)*0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0,2] += rot_move[0]
    rot_mat[1,2] += rot_move[1]
    # math.ceil needs the math import above; the output size must be integers
    return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
Call this function with a random angle, then save the patch when you're done. You'll also need to specify a maximum angle of rotation.
import random

max_angle = 20  # +/- 20 degrees maximum rotation
patches = []
idxs = []
for i in range(0, count):
    start_row_idx = random.randint(0, img_height-target_height-1)
    start_col_idx = random.randint(0, img_width-target_width-1)
    # Generate an angle between +/- max_angle
    angle = (2*max_angle)*random.random() - max_angle
    if mode == 'rgb':
        patch = img_array[start_row_idx:(start_row_idx+target_height), start_col_idx:(start_col_idx+target_width), :]
    else:
        patch = img_array[start_row_idx:(start_row_idx+target_height), start_col_idx:(start_col_idx+target_width)]
    # Randomly rotate the image
    patch_r = rotate_about_center(patch, angle)
    # Save it now
    patches.append(patch_r)
    idxs.append((start_row_idx, start_col_idx))
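To get a feel for how much a rotated tile grows, here is a minimal sketch of the same width/height arithmetic used in rotate_about_center, written with only the standard library. The helper name rotated_bounds is hypothetical, and rounding before the ceiling is an assumption added here to suppress floating-point noise.

```python
import math

def rotated_bounds(width, height, angle_deg):
    """Size (width, height) of the smallest upright box that contains
    a width x height rectangle rotated by angle_deg degrees."""
    a = math.radians(angle_deg)
    nw = abs(math.sin(a)) * height + abs(math.cos(a)) * width
    nh = abs(math.cos(a)) * height + abs(math.sin(a)) * width
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(nw, 6)), math.ceil(round(nh, 6))

print(rotated_bounds(348, 256, 0))   # an unrotated tile keeps its size
print(rotated_bounds(348, 256, 20))  # a rotated tile needs a larger canvas
```

Because the output size depends on the angle, the rotated patches will not all share one shape; if the classifier needs a fixed input size, you would probably want to resize or centre-crop them afterwards.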

Related

Find minimal number of rectangles in the image

I have binary images where rectangles are placed randomly and I want to get the positions and sizes of those rectangles.
If possible I want the minimal number of rectangles necessary to exactly recreate the image.
On the left is my original image and on the right the image I get after applying scipy's find_objects()
(as suggested for this question).
import numpy as np
from scipy import ndimage
# image = ndimage.zoom(image, 9, order=0)
# label connected regions using 8-connectivity
labels, n = ndimage.label(image, np.ones((3, 3)))
bboxes = ndimage.find_objects(labels)
img_new = np.zeros_like(image)
for bb in bboxes:
    # fill the bounding box of each labelled region
    img_new[bb[0], bb[1]] = 1
This works fine if the rectangles are far apart, but if they overlap and build more complex structures this algorithm just gives me the largest bounding box (upsampling the image made no difference). I have the feeling that there should already exist a scipy or opencv method which does this.
I would be glad to know if somebody has an idea on how to tackle this problem or even better knows of an existing solution.
As a result I want a list of rectangles (i.e. lower-left-corner : upper-right-corner) in the image. The condition is that when I redraw those filled rectangles I get exactly the same image as before. If possible the number of rectangles should be minimal.
Here is the code for generating sample images (and a more complex example original vs scipy)
import numpy as np

def random_rectangle_image(grid_size, n_obstacles, rectangle_limits):
    n_dim = 2
    rect_pos = np.random.randint(low=0, high=grid_size-rectangle_limits[0]+1,
                                 size=(n_obstacles, n_dim))
    rect_size = np.random.randint(low=rectangle_limits[0],
                                  high=rectangle_limits[1]+1,
                                  size=(n_obstacles, n_dim))
    # Crop rectangle size if it goes over the boundaries of the world
    diff = rect_pos + rect_size
    ex = np.where(diff > grid_size, True, False)
    rect_size[ex] -= (diff - grid_size)[ex].astype(int)
    img = np.zeros((grid_size,)*n_dim, dtype=bool)
    for i in range(n_obstacles):
        p_i = np.array(rect_pos[i])
        ps_i = p_i + np.array(rect_size[i])
        img[tuple(map(slice, p_i, ps_i))] = True
    return img

img = random_rectangle_image(grid_size=64, n_obstacles=30,
                             rectangle_limits=[4, 10])
Here is something to get you started: a naïve algorithm that walks your image and creates rectangles as large as possible. As it is now, it only marks the rectangles but does not report back coordinates or counts. This is to visualize the algorithm alone.
It does not need any external libraries except for PIL, to load and access the left side image when saved as a PNG. I'm assuming a border of 15 pixels all around can be ignored.
from PIL import Image

def fill_rect (pixels,xp,yp,w,h):
    # paint the interior red
    for y in range(h):
        for x in range(w):
            pixels[xp+x,yp+y] = (255,0,0,255)
    # paint the border orange
    for y in range(h):
        pixels[xp,yp+y] = (255,192,0,255)
        pixels[xp+w-1,yp+y] = (255,192,0,255)
    for x in range(w):
        pixels[xp+x,yp] = (255,192,0,255)
        pixels[xp+x,yp+h-1] = (255,192,0,255)

def find_rect (pixels,x,y,maxx,maxy):
    # assume we're at the top left
    # get max horizontal span
    width = 0
    height = 1
    while x+width < maxx and pixels[x+width,y] == (0,0,0,255):
        width += 1
    # now walk down, stopping at the first row where the span is not all black
    # (check every column w of the span, not just the leftmost one)
    while y+height < maxy:
        if any(pixels[w,y+height] != (0,0,0,255) for w in range(x,x+width)):
            break
        height += 1
    # fill rectangle
    fill_rect (pixels,x,y,width,height)

image = Image.open('A.png')
pixels = image.load()
width, height = image.size
print (width,height)
for y in range(16,height-15,1):
    for x in range(16,width-15,1):
        if pixels[x,y] == (0,0,0,255):
            find_rect (pixels,x,y,width,height)
image.show()
From the output, you can observe that the detection algorithm can be improved: for example, the "obvious" two top left rectangles are split up into 3. Similarly, the larger structure in the center also contains one rectangle more than absolutely needed.
Possible improvements are either to adjust the find_rect routine to locate a best fit¹, or store the coordinates and use math (beyond my ken) to find which rectangles may be joined.
¹ A further idea on this. Currently all found rectangles are immediately filled with the "found" color. You could try to detect obviously multiple rectangles, and then, after marking the first, the other rectangle(s) to check may then either be black or red. Off the cuff I'd say you'd need to try different scan orders (top-to-bottom or reverse, left-to-right or reverse) to actually find the minimally needed number of rectangles in any combination.
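The routine above only paints the rectangles. As a minimal sketch of the same greedy idea that also reports coordinates, the version below works on a plain nested list of truthy/falsy cells instead of a PIL pixel map (that input format is an assumption, and greedy_rectangles and redraw are hypothetical names). It produces an exact, non-overlapping cover, but like the routine above it does not guarantee the minimal number of rectangles.

```python
def greedy_rectangles(grid):
    """Cover all truthy cells with non-overlapping rectangles.
    Returns a list of (x, y, width, height) tuples."""
    rows, cols = len(grid), len(grid[0])
    covered = [[False] * cols for _ in range(rows)]

    def free(r, c):
        # A cell can start or extend a rectangle if it is set and not yet used.
        return bool(grid[r][c]) and not covered[r][c]

    rects = []
    for y in range(rows):
        for x in range(cols):
            if not free(y, x):
                continue
            # Grow to the right as far as possible.
            w = 1
            while x + w < cols and free(y, x + w):
                w += 1
            # Grow downwards while the whole span stays free.
            h = 1
            while y + h < rows and all(free(y + h, c) for c in range(x, x + w)):
                h += 1
            for r in range(y, y + h):
                for c in range(x, x + w):
                    covered[r][c] = True
            rects.append((x, y, w, h))
    return rects

def redraw(rects, rows, cols):
    """Rebuild a boolean grid from the rectangle list."""
    out = [[False] * cols for _ in range(rows)]
    for x, y, w, h in rects:
        for r in range(y, y + h):
            for c in range(x, x + w):
                out[r][c] = True
    return out

grid = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
rects = greedy_rectangles(grid)
print(rects)
```

Redrawing the reported rectangles with redraw gives back the input grid, which is the "exactly the same image" condition from the question; reducing the count further would need the merging or scan-order ideas mentioned above.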

how do I fit a grid of points on a random point cloud

I have a binary image with dots, which I obtained using OpenCV's goodFeaturesToTrack, as shown on Image1.
Image1 : Cloud of points
I would like to fit a grid of 4*25 dots on it, such as the one shown on Image2 (not all points are visible on the image, but it is a regular 4*25 points rectangle).
Image2 : Model grid of points
My model grid of 4*25 dots is parametrized by :
1 - The position of the top left corner
2 - The inclination of the rectangle with the horizon
The code below shows a function that builds such a model.
This problem seems to be close to a chessboard corner problem.
I would like to know how to fit my model cloud of points to the input image and get the position and angle of the cloud.
I can easily measure a distance between the two images (the input one and the one with the model grid), but I would like to avoid having to check every pixel and angle on the image to find the minimum of this distance.
import numpy as np

def ModelGrid(pos, angle, shape):
    # Initialization of output image of size shape
    table = np.zeros(shape)
    # Parameters
    size_pan = [32, 20]  # Pixels
    nb_corners = [4, 25]
    index = np.ndarray([nb_corners[0], nb_corners[1], 2], dtype=np.dtype('int16'))
    angle = angle*np.pi/180  # degrees to radians
    # Creation of the table
    for i in range(nb_corners[0]):
        for j in range(nb_corners[1]):
            index[i,j,0] = pos[0] + j*int(size_pan[1]*np.sin(angle)) + i*int(size_pan[0]*np.cos(angle))
            index[i,j,1] = pos[1] + j*int(size_pan[1]*np.cos(angle)) - i*int(size_pan[0]*np.sin(angle))
            # only mark points that fall inside the image
            if 0 < index[i,j,0] < table.shape[0]:
                if 0 < index[i,j,1] < table.shape[1]:
                    table[index[i,j,0], index[i,j,1]] = 1
    return table
A solution I found, which works relatively well is the following :
First, I create an index of positions of all positive pixels, just going through the image. I will call these pixels corners.
I then use this index to compute an average angle of inclination :
For each of the corners, I look for others which would be close enough in certain areas, as to define a cross. I manage, for each pixel to find the ones that are directly on the left, right, top and bottom of it.
I use this cross to calculate an inclination angle, and then use the median of all obtained inclination angles as the angle for my model grid of points.
Once I have this angle, I simply build a table using this angle and the positions of each corner.
The optimization function measures the number of coincident pixels on both images, and returns the best position.
This works fine for most examples, but the returned 'best position' has to be one of the corners, so it is not necessarily the true best position, mainly when the top left corner of the grid is missing from the cloud of corners.
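The angle step can be sketched roughly as follows. This is a simplification, not the exact cross-based search described above: it assumes the corners are available as (x, y) pairs, uses only each corner's single nearest neighbour, and folds the direction into [-45, 45) degrees so that row and column neighbours give the same answer. The names estimate_grid_angle and make_grid are hypothetical, and make_grid only builds synthetic test data with the 20 and 32 pixel spacings from the question.

```python
import math
import statistics

def estimate_grid_angle(points):
    """Median direction (degrees, folded into [-45, 45)) from each
    point to its nearest neighbour."""
    angles = []
    for i, (x0, y0) in enumerate(points):
        best = None
        for j, (x1, y1) in enumerate(points):
            if i == j:
                continue
            d2 = (x1 - x0) ** 2 + (y1 - y0) ** 2
            if best is None or d2 < best[0]:
                best = (d2, x1 - x0, y1 - y0)
        a = math.degrees(math.atan2(best[2], best[1]))
        # Fold so that directions 90 degrees apart map to the same value.
        angles.append((a + 45.0) % 90.0 - 45.0)
    return statistics.median(angles)

def make_grid(angle_deg, rows=4, cols=25, step_col=20.0, step_row=32.0):
    """Synthetic rotated grid of points, for testing only."""
    t = math.radians(angle_deg)
    return [(j * step_col * math.cos(t) - i * step_row * math.sin(t),
             j * step_col * math.sin(t) + i * step_row * math.cos(t))
            for i in range(rows) for j in range(cols)]

print(estimate_grid_angle(make_grid(10.0)))
```

The folding means the result is only defined up to multiples of 90 degrees, and the median is what makes the estimate tolerant of a few spurious or missing corners. The loop is quadratic in the number of points, which is fine for a cloud of roughly a hundred corners.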

Distortion effect using OpenCv-python

I want to create distortion effects such as spiral, stretch, fisheye and wedge, as well as effects like underwater and snow (like this website), using the cv2 library in Python.
I figured out fisheye distortion.
In OpenCV version 3.0 and above it is possible to perform it using cv2.fisheye.undistortImage(). I have the code in python if you need.
This is what I got for the following input image:
Input Image:
Distorted image:
The function accepts a matrix which, when modified, yields different distortions of the image.
UPDATE
In order to add a snowfall effect you can add some noise like Poisson noise.
Here is a replacement block to map out a fisheye in the middle of the image. Please look elsewhere for details on the math. Use this in place of the 2 for loops in the previous code.
As stated in the first half of my answer (see previous answer), the purpose of this block is to create 2 maps that work together to remap the source image into the destination image.
To create the two maps, this block sweeps through 2 for loops with the dimensions of the image. Values are calculated for the X and y maps (flex_x and flex_y). It starts with assigning each to simply x and y for a 1-to-1 replacement map. Then, if the radius (r) is between 0 and 1, the map for the tangential slide for the fisheye is applied and new flex_x and flex_y values are mapped.
Please see my other answer for more details.
# create simple maps with a modified assignment
# outside the bulge is normal, inside is modified
# this is where the magic is assembled
for y in range(h):
    ny = ((2*y-250)/(h-250))-1  #play with the 250's to move the y
    ny2 = ny*ny
    for x in range(w):
        nx = ((2*x-50)/(w-50))-1  #play with the 50's to move the x
        nx2 = nx*nx
        r = math.sqrt(nx2+ny2)
        flex_x[y,x] = x
        flex_y[y,x] = y
        if r>0 and r<1:
            nr1 = 1 - r**2
            nr2 = math.sqrt(nr1)
            nr = (r + (1.0-nr2)) / 2.0
            theta = math.atan2(ny,nx)
            nxn = nr*math.cos(theta)
            nyn = nr*math.sin(theta)
            flex_x[y,x] = (((nxn+1)*w)/2.0)
            flex_y[y,x] = (((nyn+1)*h)/2.0)
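To see what the radial part of that block does on its own, here is a small sketch that isolates the radius formula (the lines computing nr1, nr2 and nr). The function name bulge_radius is hypothetical; the formula itself is copied from the block above.

```python
import math

def bulge_radius(r):
    """Source radius sampled for a destination pixel at normalised radius r.
    Only meaningful for 0 < r < 1, as in the loop above."""
    nr2 = math.sqrt(1.0 - r * r)
    return (r + (1.0 - nr2)) / 2.0

for r in (0.1, 0.6, 0.9):
    print(r, bulge_radius(r))
```

For r = 0.6 the square root is 0.8, so the source radius is (0.6 + 0.2) / 2 = 0.4. Inside the unit circle the function always returns a radius smaller than r, so each destination pixel samples a source pixel closer to the centre, which is what magnifies the middle of the bulge.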
Here is half of an answer. The cv2.remap function uses maps to choose a pixel from the source for each pixel in the destination. alkasm's answer to this: How do I use OpenCV's remap function?
does a great job of defining the process, but glosses over the usefulness of those maps. If you can get creative in the maps, you can make any effect you want. Here is what I came up with.
The program starts by loading the image and resizing it. This is a convenience for a smaller screen. Then the empty maps are created.
The maps need to be the same dimensions as the image that is being processed, but with a depth of 1. If the resized original is 633 x 400 x 3, the maps both need to be 633 x 400.
When the remapping is done, cv2.remap will use the value at each coordinate in the maps to determine which pixel in the original to use in the destination. For each x,y in the destination, dest[x,y] = src[map1[x,y],map2[x,y]].
The simplest mapping would be if for every (x,y), map1(x,y)=x and map2(x,y)=y. This creates a 1-to-1 map, and the destination would match the source. In this example, a small offset is added to each value. The cosine function in the offset creates both positive and negative shifts, creating waves in the final image.
Note that creating the maps is slow, but the cv2.remap is fast. Once you have created the map, the cv2.remap is fast enough to be applied to frames of video.
import numpy as np #create waves
import cv2
import math
# read in image and resize down to width of 400
# load your image file here
image = cv2.imread("20191114_154534.jpg")
r = 400.0 / image.shape[1]
dim = (400, int(image.shape[0] * r))
# Perform the resizing of the image
resized = cv2.resize(image, dim, interpolation = cv2.INTER_AREA)
# Grab the dimensions of the image and calculate the center
# of the image (center not needed at this time)
(h, w, c) = resized.shape
center = (w // 2, h // 2)
# set up the x and y maps as float32
flex_x = np.zeros((h,w),np.float32)
flex_y = np.zeros((h,w),np.float32)
# create simple maps with a modified assignment
# the math modifier creates ripples. increase the divisor for less waves,
# increase the multiplier for greater movement
# this is where the magic is assembled
for y in range(h):
    for x in range(w):
        flex_x[y,x] = x + math.cos(x/15) * 15
        flex_y[y,x] = y + math.cos(y/30) * 25
# do the remap this is where the magic happens
dst = cv2.remap(resized,flex_x,flex_y,cv2.INTER_LINEAR)
#show the results and wait for a key
cv2.imshow("Resized",resized)
cv2.imshow("Flexed",dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
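Since building the maps pixel by pixel is the slow part, one option is to compute them with NumPy array operations instead of the two Python loops. The sketch below assumes the same ripple formula as above; the function name wave_maps is hypothetical.

```python
import numpy as np

def wave_maps(h, w):
    """Build the ripple maps for an h x w image without Python loops."""
    # ys[y, x] == y and xs[y, x] == x for every pixel
    ys, xs = np.indices((h, w), dtype=np.float32)
    flex_x = xs + np.cos(xs / 15.0) * 15.0
    flex_y = ys + np.cos(ys / 30.0) * 25.0
    # cv2.remap expects float32 maps
    return flex_x.astype(np.float32), flex_y.astype(np.float32)

flex_x, flex_y = wave_maps(400, 633)
print(flex_x.shape, flex_x.dtype)
```

The returned arrays can be passed to cv2.remap in place of the loop-built flex_x and flex_y.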

Python - Perspective transform for OpenCV from a rotation angle

I'm working on a depth map with OpenCV. I can obtain it, but it is reconstructed from the left camera origin, which is slightly tilted, and as you can see in the figure, the depth is "shifted" (the depth should be close, with no horizontal gradient):
I would like to express it with a zero angle. I tried the warp perspective function, as you can see below, but I obtain a null field...
P = np.dot(cam,np.dot(Transl,np.dot(Rot,A1)))
dst = cv2.warpPerspective(depth, P, (2048, 2048))
with :
#Projection 2D -> 3D matrix
A1 = np.zeros((4,3))
A1[0,0] = 1
A1[0,2] = -1024
A1[1,1] = 1
A1[1,2] = -1024
A1[3,2] = 1
#Rotation matrix around the Y axis
theta = np.deg2rad(5)
Rot = np.zeros((4,4))
Rot[0,0] = np.cos(theta)
Rot[0,2] = -np.sin(theta)
Rot[1,1] = 1
Rot[2,0] = np.sin(theta)
Rot[2,2] = np.cos(theta)
Rot[3,3] = 1
#Translation matrix on the X axis
dist = 0
Transl = np.zeros((4,4))
Transl[0,0] = 1
Transl[0,2] = dist
Transl[1,1] = 1
Transl[2,2] = 1
Transl[3,3] = 1
#Camera intrinsics matrix 3D -> 2D
cam = np.concatenate((C1,np.zeros((3,1))),axis=1)
cam[2,2] = 1
P = np.dot(cam,np.dot(Transl,np.dot(Rot,A1)))
dst = cv2.warpPerspective(Z0_0, P, (2048*3, 2048*3))
EDIT LATER :
You can download the 32MB field dataset here: https://filex.ec-lille.fr/get?k=cCBoyoV4tbmkzSV5bi6. Then, load and view the image with:
from matplotlib import pyplot as plt
import numpy as np
img = np.load('testZ0.npy')
plt.imshow(img)
plt.show()
I have got a rough solution in place. You can modify it later.
I used the mouse handling operations available in OpenCV to crop the region of interest in the given heatmap.
(Did I just say I used a mouse to crop the region?) Yes, I did. To learn more about mouse functions in OpenCV, see the OpenCV documentation on mouse events. Besides, there are many other SO questions that can help you in this regard. :)
Using those functions I was able to obtain the following:
Now to your question of removing the tilt. I used the principle of homography, taking the corner points of the image above and mapping them onto a 'white' image of a definite size. I used the cv2.findHomography() function for this.
Now using the cv2.warpPerspective() function in OpenCV, I was able to obtain the following:
Now you can apply the required scale to this image as you wanted.
CODE:
I have also attached some snippets of code for your perusal:
#First I created an image of white color of a definite size
back = np.ones((435, 379, 3)) # size
back[:] = (255, 255, 255) # white color
Next I obtained the corner points pts_src on the tilted image below :
pts_src = np.array([[25.0, 2.0],[403.0,22.0],[375.0,436.0],[6.0,433.0]])
I wanted the points above to be mapped to the points 'pts_dst' given below :
pts_dst = np.array([[2.0, 2.0], [379.0, 2.0], [379.0, 435.0],[2.0, 435.0]])
Now I used the principle of homography:
h, status = cv2.findHomography(pts_src, pts_dst)
Finally I mapped the original image to the white image using perspective transform.
fin = cv2.warpPerspective(img, h, (back.shape[1],back.shape[0]))
# img -> original tilted image.
# back -> image of white color.
Hope this helps! I also got to learn a great deal from this question.
Note: The points fed to the 'cv2.findHomography()' must be in float.
For more info on Homography , visit THIS PAGE
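For intuition about what the 3x3 matrix returned by cv2.findHomography() does to a single point, here is a minimal sketch of the underlying arithmetic: multiply the point in homogeneous coordinates by the matrix, then divide by the third component. The helper apply_homography is a hypothetical name and the example matrix is made up for illustration; it is not the matrix computed above.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography H
    (nested lists or a NumPy array both work)."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    wh = H[2][0] * x + H[2][1] * y + H[2][2]
    if wh == 0:
        raise ValueError("point maps to infinity")
    # Divide by the homogeneous coordinate to get back to pixel coordinates.
    return xh / wh, yh / wh

# Made-up example matrix, for illustration only.
H_example = [[2.0, 0.0, 1.0],
             [0.0, 2.0, 3.0],
             [0.0, 0.0, 2.0]]
print(apply_homography(H_example, (4.0, 6.0)))
```

A quick sanity check on a real result is to pass each row of pts_src through the matrix from cv2.findHomography() and compare against the corresponding row of pts_dst.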

how to locate the center of a bright spot in an image?

Here is an example of the kinds of images I'll be dealing with:
(source: csverma at pages.cs.wisc.edu)
There is one bright spot on each ball. I want to locate the coordinates of the centre of the bright spot. How can I do it in Python or Matlab? The problem I'm having right now is that more than one points on the spot has the same (or roughly the same) white colour, but what I need is to find the centre of this 'cluster' of white points.
Also, for the leftmost and rightmost images, how can I find the centre of the whole circular object?
You can simply threshold the image and find the average coordinates of what is remaining. This handles the case when there are multiple values that have the same intensity. When you threshold the image, there will obviously be more than one bright white pixel, so if you want to bring it all together, find the centroid or the average coordinates to determine the centre of all of these white bright pixels. There isn't a need to filter in this particular case. Here's something to go with in MATLAB.
I've read in that image directly, converted to grayscale and cleared off the white border that surrounds each of the images. Next, I split up the image into 5 chunks, threshold the image, find the average coordinates that remain and place a dot on where each centre would be:
im = imread('http://pages.cs.wisc.edu/~csverma/CS766_09/Stereo/callight.jpg');
im = rgb2gray(im);
im = imclearborder(im);
%// Split up images and place into individual cells
split_point = floor(size(im,2) / 5);
images = mat2cell(im, size(im,1), split_point*ones(5,1));
%// Show image to place dots
imshow(im);
hold on;
%// For each image...
for idx = 1 : 5
    %// Get image
    img = images{idx};
    %// Threshold
    thresh = img > 200;
    %// Find coordinates of thresholded image
    [y,x] = find(thresh);
    %// Find average
    xmean = mean(x);
    ymean = mean(y);
    %// Place dot at centre
    %// Make sure you offset by the right number of columns
    plot(xmean + (idx-1)*split_point, ymean, 'r.', 'MarkerSize', 18);
end
I get this:
If you want a Python solution, I recommend using scikit-image combined with numpy and matplotlib for plotting. Here's the above code transcribed in Python. Note that I saved the image referenced by the link manually on disk and named it balls.jpg:
import skimage.io
import skimage.segmentation
import numpy as np
import matplotlib.pyplot as plt
# Read in the image as grayscale
# Note - intensities are floating point from [0,1]
im = skimage.io.imread('balls.jpg', as_gray=True)
# Threshold the image first then clear the border
im_clear = skimage.segmentation.clear_border(im > (200.0/255.0))
# Determine where to split up the image
split_point = int(im.shape[1]/5)
# Show image in figure and hold to place dots in
plt.figure()
plt.imshow(np.dstack([im,im,im]))
# For each image...
for idx in range(5):
    # Extract sub image
    img = im_clear[:,idx*split_point:(idx+1)*split_point]
    # Find coordinates of thresholded image
    y,x = np.nonzero(img)
    # Find average
    xmean = x.mean()
    ymean = y.mean()
    # Plot on figure
    plt.plot(xmean + idx*split_point, ymean, 'r.', markersize=14)
# Show image and make sure axis is removed
plt.axis('off')
plt.show()
We get this figure:
Small sidenote
I could have totally skipped the above code and used regionprops (MATLAB link, scikit-image link). You could simply threshold the image, then apply regionprops to find the centroids of each cluster of white pixels, but I figured I'd show you a more manual way so you can appreciate the algorithm and understand it for yourself.
Hope this helps!
Use a 2D convolution and then find the point with the highest intensity. You can apply a convex non-linear function (such as exp) to the intensity values before applying the 2D convolution, to intensify the bright spots relative to the dimmer parts of the image. Something like conv2(exp(img),ker).
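A rough NumPy sketch of that idea, under a few assumptions: the image is a 2D grayscale array with intensities scaled to [0, 1], the kernel is a plain box of ones, and only positions where the kernel fits entirely inside the image are considered. The function name brightest_spot is hypothetical, and sliding_window_view requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def brightest_spot(img, ksize=3):
    """Return (row, col) of the centre of the ksize x ksize window
    with the largest sum of exp(intensity)."""
    weights = np.exp(np.asarray(img, dtype=np.float64))
    # One ksize x ksize window per valid position; summing each window
    # is the same as convolving with a box kernel of ones.
    windows = sliding_window_view(weights, (ksize, ksize))
    sums = windows.sum(axis=(2, 3))
    row, col = np.unravel_index(np.argmax(sums), sums.shape)
    half = ksize // 2
    return int(row) + half, int(col) + half

# Synthetic example: a 3x3 bright blob centred at row 5, column 7.
demo = np.zeros((12, 12))
demo[4:7, 6:9] = 1.0
print(brightest_spot(demo))
```

The window size should be comparable to the size of the bright spot; a much smaller window gives many tied maxima inside a saturated spot, in which case argmax simply returns the first one.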
