I have an image with many cars; every car has polygon coordinates and keypoints. I use this code to crop an object by its polygon and get the new keypoints.
x,y,w,h = cv2.boundingRect(points_poly_int)
cropped_img = img[y:y+h,x:x+w]
head_coords_after_crop = np.asarray([head_coords_old[0] - x, head_coords_old[1] -y])
center_coords_after_crop = np.asarray([center_coords_old[0] - x, center_coords_old[1] -y])
Here is an example of a cropped image and keypoints:
What I need is to rotate the whole image by any angle and remap the coordinates of the polygons and keypoints for every object.
Here is the method which returns the rotated image and the transformation matrix:
def rotate_image(mat, angle):
    """
    Rotates an image (angle in degrees) and expands image to avoid cropping
    """
    height, width = mat.shape[:2]  # image shape has 3 dimensions
    image_center = (width/2, height/2)  # getRotationMatrix2D needs coordinates in reverse order (width, height) compared to shape
    rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1.)
    # rotation calculates the cos and sin, taking absolutes of those.
    abs_cos = abs(rotation_mat[0,0])
    abs_sin = abs(rotation_mat[0,1])
    # find the new width and height bounds
    bound_w = int(height * abs_sin + width * abs_cos)
    bound_h = int(height * abs_cos + width * abs_sin)
    # subtract old image center (bringing image back to origo) and adding the new image center coordinates
    rotation_mat[0, 2] += bound_w/2 - image_center[0]
    rotation_mat[1, 2] += bound_h/2 - image_center[1]
    # rotate image with the new bounds and translated rotation matrix
    rotated_mat = cv2.warpAffine(mat, rotation_mat, (bound_w, bound_h))
    return rotated_mat, rotation_mat
What I do next is multiply the old coordinates by the transformation matrix. Here is the code:
img_roated, C = rotate_image(img, 180)
# Remap polygon coordinates
ones = np.ones((points_poly.shape[0], 1))
new_poly = np.hstack((points_poly,ones))
new_poly = (C @ new_poly.T).T
new_poly = new_poly.astype(np.int32)
# Crop by new polygons
x,y,w,h = cv2.boundingRect(new_poly)
cropped_img = img_roated[y:y+h,x:x+w]
# Remap keypoint coordinates
head_coords_new = np.asarray([756.600, 1687.900, 1])
center_coords_new = np.asarray([762.300, 1708.400, 1])
head_coords_new = (C @ head_coords_new.T).T
center_coords_new = (C @ center_coords_new.T).T
head_coords_new = np.asarray([head_coords_old[0] - x, head_coords_old[1] - y])
center_coords_new = np.asarray([center_coords_old[0] - x, center_coords_old[1] - y])
head_coords_new = head_coords_new.astype(np.int32)
center_coords_new = center_coords_new.astype(np.int32)
But the result is different from the first picture. Here is the new picture:
Somehow the keypoints shift, and it happens with every angle. I don't know how to fix it.
Here is the source image: https://drive.google.com/file/d/14K_MQHMwtWlw-QCQbaB5ecrREbWwyKhO/view?usp=sharing
And the polygons with keypoints:
{'keypoints': [{'id': 'head', 'pos': '756.600;1687.900'},
{'id': 'roof_center', 'pos': '762.300;1708.400'}],
'polygon': '{(759.700;1717.300);(770.000;1714.200);(762.000;1687.400);(756.600;1687.900);(751.200;1690.700);(759.700;1717.300)}'}
If you wish to reproduce the issue.
Thanks in advance
Here is the difference. The right picture is the first image rotated in a picture viewer; the left is the transformed picture.
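One thing visible in the snippet above is that the final subtraction uses head_coords_old and center_coords_old, so the values computed with C are thrown away before the crop offset is applied. A minimal sketch of the remapping, assuming the keypoint should first go through the matrix and only then be shifted by the crop origin; `rotation_matrix` and `remap_points` are hypothetical helpers, the first one building the same 2x3 layout that cv2.getRotationMatrix2D documents (scale 1), and the coordinates are made-up numbers:

```python
import numpy as np

def rotation_matrix(center, angle_deg):
    """2x3 affine matrix in the layout of cv2.getRotationMatrix2D(center, angle, 1.0)."""
    a = np.cos(np.radians(angle_deg))
    b = np.sin(np.radians(angle_deg))
    cx, cy = center
    return np.array([[a, b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

def remap_points(points, mat):
    """Apply a 2x3 affine matrix to one (x, y) point or an (N, 2) array of points."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # append a column of ones
    return (mat @ homog.T).T

# Made-up data: a 100x50 image rotated by 180 degrees about its center.
C = rotation_matrix((50.0, 25.0), 180)
poly = np.array([[10.0, 5.0], [30.0, 5.0], [30.0, 20.0]])
head_old = np.array([12.0, 6.0])

new_poly = remap_points(poly, C)
head_rot = remap_points(head_old, C)[0]

# Crop origin: top-left corner of the rotated polygon's bounding box.
x, y = np.round(new_poly.min(axis=0)).astype(int)

# Subtract the crop origin from the ROTATED keypoint, not from the old one.
head_in_crop = head_rot - (x, y)
print(head_in_crop)
```

The same pattern would apply to the second keypoint; whether this fully explains the shift in the pictures is an assumption, since the images themselves are not available here.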
I am cropping an image using python PIL. Say my image is as such:
This is the simple code snippet I use for cropping:
from PIL import Image
im = Image.open(image)
cropped_image = im.crop((topLeft_x, topLeft_y, bottomRight_x, bottomRight_y))
cropped_image.save("Out.jpg")
The result of this is:
I want to expand this crop by, say, 20% while keeping the aspect ratio (proportionate width and height) the same, so that it looks something like this, without exceeding the image dimensions.
How should I scale out the crop such that the aspect ratio is maintained while not exceeding the image boundary/ dimensions?
You should calculate the center of your crop and work from there.
As an example:
crop_width = right - left
crop_height = bottom - top
crop_center_x = int(left + crop_width/2)
crop_center_y = int(top + crop_height/2)
In this way you will obtain the (x,y) point which corresponds to the center of your crop w.r.t your original image.
From that center, the largest half-width the crop can have is the smaller of the center's x value and the image width minus that value (and likewise for the height). For example, starting from this crop:
im = Image.open("brad.jpg")
l = 200
t = 200
r = 300
b = 300
cropped = im.crop((l, t, r, b))
Which gives you:
If you want to "enlarge" it to the maximum starting from the same center, then you will have:
max_width = min(crop_center_x, im.size[0]-crop_center_x)
max_height = min(crop_center_y, im.size[1]-crop_center_y)
new_l = crop_center_x - max_width
new_t = crop_center_y - max_height
new_r = crop_center_x + max_width
new_b = crop_center_y + max_height
new_crop = im.crop((new_l, new_t, new_r, new_b))
which gives as a result, having the same center:
Edit
If you want to keep the aspect ratio you should retrieve it (the ratio) before and apply the crop only if the resulting size would still fit the original image. As an example, if you want to enlarge it by 20%:
ratio = crop_height/crop_width
scale = 20/100
new_width = int(crop_width + (crop_width*scale))
# Here we are using the previously calculated value for max_width to
# determine if the new one would be too large.
# Note that the width that we calculated here (new_width) refers to both
# sides of the crop, while the max_width calculated previously refers to
# one side only; same for height. Sorry for the confusion.
if max_width < new_width/2:
    new_width = int(2*max_width)
new_height = int(new_width*ratio)
# Do the same for the height, update width if necessary
if max_height < new_height/2:
    new_height = int(2*max_height)
    new_width = int(new_height/ratio)
adjusted_scale = (new_width - crop_width)/crop_width
if adjusted_scale != scale:
    print("Scale adjusted to: {:.2f}".format(adjusted_scale))
new_l = int(crop_center_x - new_width/2)
new_r = int(crop_center_x + new_width/2)
new_t = int(crop_center_y - new_height/2)
new_b = int(crop_center_y + new_height/2)
Once you have the width and height values the process to get the crop is the same as above.
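The steps above can be folded into a single helper. This is only a sketch under a few assumptions: `expand_crop` is a hypothetical name, the box is `(left, top, right, bottom)` as passed to `Image.crop`, and instead of adjusting width and height one after the other it picks one growth factor that fits both directions, which keeps the aspect ratio by construction.

```python
def expand_crop(box, image_size, scale=0.2):
    """Grow a crop box around its center by `scale`, capped so it stays in the image.

    box        -- (left, top, right, bottom) of the current crop
    image_size -- (width, height) of the original image, e.g. im.size
    """
    left, top, right, bottom = box
    img_w, img_h = image_size
    crop_w, crop_h = right - left, bottom - top
    cx, cy = left + crop_w / 2, top + crop_h / 2

    # Largest half-extent available on each axis before hitting a border.
    max_half_w = min(cx, img_w - cx)
    max_half_h = min(cy, img_h - cy)

    # One factor for both axes keeps the aspect ratio; cap it at what fits.
    factor = min(1 + scale, 2 * max_half_w / crop_w, 2 * max_half_h / crop_h)

    half_w, half_h = crop_w * factor / 2, crop_h * factor / 2
    return (round(cx - half_w), round(cy - half_h),
            round(cx + half_w), round(cy + half_h))

print(expand_crop((200, 200, 300, 300), (1000, 800)))
print(expand_crop((0, 0, 100, 100), (1000, 800)))
```

The returned tuple can be passed straight to `im.crop(...)`. A crop that already touches a border cannot grow around the same center, so in that case the box comes back unchanged.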
I want to divide a picture in equally big squares and measure the average gray scale level and replace it with a blob, aka halftoning. This code gives me a picture but it doesn't look right. Any ideas what could be wrong?
im = scipy.misc.imread("uggla.tif")
def halftoning(im):
    im = im.astype('float64')
    width,height = im.shape
    halftone_pic = np.zeros((width, height))
    for x in range(width):
        for y in range(height):
            floating_matrix = im[x:x + 1, y:y + 1]
            sum = np.sum(floating_matrix)
            mean = np.mean(sum)
            round = (mean > 128) * 255
            halftone_pic[x,y] = round
    fig, ax = plt.subplots(1,2)
    ax[0].imshow(im, cmap="gray")
    ax[1].imshow(halftone_pic, cmap="gray")
    plt.show()
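Note that the slice im[x:x + 1, y:y + 1] in the loop above covers a single pixel, so the code thresholds each pixel rather than averaging over a square. A minimal sketch of the block-averaging step described in the question, assuming a 2-D greyscale array; `block_means` and the `block` size are hypothetical names, and edge pixels that do not fill a whole square are simply dropped:

```python
import numpy as np

def block_means(gray, block):
    """Mean grey level of each block x block square of a 2-D array."""
    h, w = gray.shape
    # Trim so both dimensions are exact multiples of the block size.
    h_trim, w_trim = h - h % block, w - w % block
    trimmed = gray[:h_trim, :w_trim].astype(np.float64)
    # Split each axis into (number of blocks, block size) and average per block.
    tiles = trimmed.reshape(h_trim // block, block, w_trim // block, block)
    return tiles.mean(axis=(1, 3))

demo = np.arange(16).reshape(4, 4)
means = block_means(demo, 2)
print(means)
```

Each entry of the result could then decide the size of the blob drawn for that square.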
Here's something that does what you want. It's essentially a simplification of the code in the accepted answer to the related question How to create CMYK halftone Images from a color image?:
from PIL import Image, ImageDraw, ImageStat
# Adaption of answer https://stackoverflow.com/a/10575940/355230
def halftone(img, sample, scale, angle=45):
    ''' Returns a halftone image created from the given input image `img`.
        `sample` (in pixels), determines the sample box size from the original
        image. The maximum output dot diameter is given by `sample` * `scale`
        (which is also the number of possible dot sizes). So `sample` == 1 will
        preserve the original image resolution, but `scale` must be > 1 to allow
        variations in dot size.
    '''
    img_grey = img.convert('L')  # Convert to greyscale.
    channel = img_grey.split()[0]  # Get grey pixels.
    channel = channel.rotate(angle, expand=1)
    size = channel.size[0]*scale, channel.size[1]*scale
    bitmap = Image.new('1', size)
    draw = ImageDraw.Draw(bitmap)
    for x in range(0, channel.size[0], sample):
        for y in range(0, channel.size[1], sample):
            box = channel.crop((x, y, x+sample, y+sample))
            mean = ImageStat.Stat(box).mean[0]
            diameter = (mean/255) ** 0.5
            edge = 0.5 * (1-diameter)
            x_pos, y_pos = (x+edge) * scale, (y+edge) * scale
            box_edge = sample * diameter * scale
            draw.ellipse((x_pos, y_pos, x_pos+box_edge, y_pos+box_edge),
                         fill=255)
    bitmap = bitmap.rotate(-angle, expand=1)
    width_half, height_half = bitmap.size
    xx = (width_half - img.size[0]*scale) / 2
    yy = (height_half - img.size[1]*scale) / 2
    bitmap = bitmap.crop((xx, yy, xx + img.size[0]*scale,
                          yy + img.size[1]*scale))
    return Image.merge('1', [bitmap])
# Sample usage
img = Image.open('uggla.tif')
img_ht = halftone(img, 8, 1)
img_ht.show()
Here are the results from using this as the input image:
Halftoned result produced:
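The square root in `diameter = (mean/255) ** 0.5` is what makes the dot's area, rather than its diameter, grow linearly with the mean grey level of the sample box. A small sketch of that relationship on its own; `dot_diameter` is a hypothetical helper, not part of the answer's code:

```python
import math

def dot_diameter(mean, sample, scale=1):
    """Dot diameter in output pixels for a box with the given mean grey level (0..255)."""
    return sample * scale * math.sqrt(mean / 255)

full = dot_diameter(255, 8)       # brightest box: dot spans the whole cell
quarter = dot_diameter(63.75, 8)  # a quarter of full brightness

# Area goes with the square of the diameter, so this ratio follows mean/255.
area_ratio = (quarter / full) ** 2
print(full, quarter, area_ratio)
```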
I'm pasting a randomly generated barcode on a background image.
This barcode has been randomly rotated, skewed, and scaled.
Then, this barcode is randomly placed onto the background image.
I'm trying to find out the coordinates of the actual barcode, ignoring the expanded black mask.
I'm a beginner in matrices and image manipulation so any help, especially in the math, would be appreciated.
This is where I generate the barcode, using pdf417gen library, along with the coordinates of the barcode.
import numpy as np
import os
import random
import sys
from pdf417gen import encode, render_image
from PIL import Image
def generate_barcode(self):
    barcode = encode("random text data", columns=5, security_level=5)
    scale = 5
    ratio = 3
    padding = 5
    barcode_image = render_image(barcode, scale=scale, ratio=ratio, padding=padding)
    barcode_coords = np.array([
        [(barcode_image.width - padding) / float(barcode_image.width), (barcode_image.height - padding) / float(barcode_image.height)],
        [padding / float(barcode_image.width), (barcode_image.height - padding) / float(barcode_image.height)],
        [padding / float(barcode_image.width), padding / float(barcode_image.height)],
        [(barcode_image.width - padding) / float(barcode_image.width), padding / float(barcode_image.height)]
    ])
    return (barcode_coords, barcode_image)
Once I have the barcode's image and coordinates, I do the following:
1. transform the barcode's image
2. attempt to match the coordinates with the image's transformation
3. paste the image onto a background image
4. then draw a red outline using the coordinates
The red outline should outline the barcode's image.
Here's where I transform the barcode image and paste it to the background image.
def composite_images(self, background_image, barcode_coords, barcode_image):
    coords = barcode_coords
    barcode = barcode_image
    # instantiating the transformation variables
    scale = random.randrange(4, 50) / 100.0
    size = int( min(background_image.size) * scale)  # background_image.size returns (width, height)
    barcode = barcode.resize((int(size * 2.625), size))  # width:height ratio is 2.625:1
    rotation = random.randrange(0, 360)
    xstretch = random.randrange(0, 100) / 100.0
    ystretch = random.randrange(0, 100) / 100.0
    xshear = random.randrange(0, 100) / 100.0
    yshear = random.randrange(0, 100) / 100.0
    # set affine transform on the barcode coordinates
    affine_transform = get_affine_transform(rotation, xstretch, ystretch, xshear, yshear)
    coords = transform_coords(coords, affine_transform, True)
    expand_mask = transform_coords(np.array([  # shifts expand mask based on transformation
        [0.0, 0.0],
        [float(size * 2.625), 0.0],
        [float(size * 2.625), float(size)],
        [0.0, float(size)]
    ]), mat, False)
    minx = min(expand_mask[:,0])
    maxx = max(expand_mask[:,0])
    miny = min(expand_mask[:,1])
    maxy = max(expand_mask[:,1])
    mat_inv = np.linalg.inv(np.array([  # the inverse matrix
        [mat[0,0], mat[0,1], -minx],
        [mat[1,0], mat[1,1], -miny],
        [0,0,1.0]
    ]))
    image_matrix = (mat_inv[0,0], mat_inv[0,1], mat_inv[0,2],
                    mat_inv[1,0], mat_inv[1,1], mat_inv[1,2])
    new_size = (int(maxx-minx), int(maxy-miny))
    # set affine transform on the barcode image using data from coordinates affine transformation
    barcode = barcode.transform(new_size, method=Image.AFFINE, data=image_matrix)
    # paste the barcode image onto a random position on background image
    region_x = random.randrange(0, background_image.width - size)
    region_y = random.randrange(0, background_image.height - size)
    background_image.paste(barcode, (region_x, region_y))
    coords *= scale
    coords += [region_x / float(background_image.width), region_y / float(background_image.height)]
    return(coords, background_image)
def get_affine_transform(self, rotation, xstretch, ystretch, xshear, yshear):
    theta = -(rotation / 180.0) * np.pi
    return np.array([
        [np.cos(theta) * xstretch, -np.sin(theta) * xshear],
        [np.sin(theta) * ystretch, np.cos(theta) * yshear]
    ])
def transform_coords(self, coords, affine_transform, center):
    if center:
        coords -= (.5, .5)  # center on origin
    coords = np.dot(coords, affine_transform.T)
    if center:
        coords += (.5, .5)  # reset centering
    return coords
Now I draw the red outline using the coords and image (with pasted barcode) returned from composite_images().
def draw_red_outline(self, box_coords, image):
    outline = box_coords * [image.width, image.height]
    outline = outline.astype(int)
    outline = tuple(map(tuple, outline))
    draw = ImageDraw.Draw(image)
    draw.poly(outline, outline=(255,0,0,0))
    del draw
    image.show()
I'm unsure as to where my math is going wrong.
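For reference, the coordinate side of the problem can be looked at separately from PIL. The sketch below is an illustration only, with assumptions that differ from the code above: it works in pixel units rather than normalized coordinates, `transform_corners` is a hypothetical helper, and the matrix is any 2x2 linear map. It applies the same map to the four corners that the image would receive, then shifts by the bounding-box minimum so the corners are expressed relative to the expanded canvas.

```python
import numpy as np

def transform_corners(width, height, mat):
    """Map the corners of a width x height rectangle through a 2x2 matrix.

    Returns the corners relative to the top-left of their bounding box,
    together with the (width, height) of that bounding box.
    """
    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]],
                       dtype=np.float64)
    moved = corners @ np.asarray(mat, dtype=np.float64).T
    offset = moved.min(axis=0)  # top-left of the expanded canvas
    size = moved.max(axis=0) - offset
    return moved - offset, size

# A quarter-turn as a simple check: a 4x2 rectangle becomes 2x4.
quarter_turn = np.array([[0.0, -1.0],
                         [1.0, 0.0]])
corners, size = transform_corners(4, 2, quarter_turn)
print(corners)
print(size)
```

Adding the paste position (region_x, region_y in the code above) to these corners would then give their position on the background, again in pixels.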
To get coordinates of transformed points you can do the following:
After getting the transformation matrix m, you apply it to the image:
transformed_img = cv2.warpPerspective(source_img, m, image_shape)
and transformed image contains result with coordinates which you want to calculate and some black region.
So, the solution for each of 4 points' coordinates (if there are no 0 coordinates) is the following:
point = np.array([w, h])  # x, y position of the source point (before transform)
homg_point = [point[0], point[1], 1]  # homogeneous coords
transf_homg_point = m.dot(homg_point)  # transform
transf_homg_point /= transf_homg_point[2]  # scale
transf_point = transf_homg_point[:2]  # back to Cartesian coords
print(transf_point)  # check the result
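The same steps can be applied to all four corners at once. A minimal sketch, assuming `m` is a 3x3 matrix like the one passed to cv2.warpPerspective; `transform_points` is a hypothetical helper written with NumPy only, and the matrices below are made-up examples:

```python
import numpy as np

def transform_points(m, points):
    """Apply a 3x3 perspective matrix to one (x, y) point or an (N, 2) array."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # to homogeneous coords
    moved = homog @ np.asarray(m, dtype=np.float64).T  # transform every point
    return moved[:, :2] / moved[:, 2:3]                # divide by w, drop it

# Pure translation by (10, 20).
shift = np.array([[1, 0, 10],
                  [0, 1, 20],
                  [0, 0, 1]])
moved = transform_points(shift, [[3, 4]])

# A matrix with a projective term, so the division by w matters.
proj = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0.001, 0, 1]])
warped = transform_points(proj, [[100, 50]])
print(moved, warped)
```

OpenCV also offers cv2.perspectiveTransform for this, which expects a floating-point array of points.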
For my neural network I want to augment my training data by adding small random rotations and zooms to my images. The issue I am having is that scipy changes the size of my images when it applies the rotations and zooms. I need it to just clip the edges if part of the image goes out of bounds. All of my images must be the same size.
def loadImageData(img, distort = False):
    c, fn = img
    img = scipy.ndimage.imread(fn, True)
    if distort:
        img = scipy.ndimage.zoom(img, 1 + 0.05 * rnd(), mode = 'constant')
        img = scipy.ndimage.rotate(img, 10 * rnd(), mode = 'constant')
        print(img.shape)
    img = img - np.min(img)
    img = img / np.max(img)
    img = np.reshape(img, (1, *img.shape))
    y = np.zeros(ncats)
    y[c] = 1
    return (img, y)
scipy.ndimage.rotate accepts a reshape= parameter:
reshape : bool, optional
If reshape is true, the output shape is adapted so that the input
array is contained completely in the output. Default is True.
So to "clip" the edges you can simply call scipy.ndimage.rotate(img, ..., reshape=False).
from scipy.ndimage import rotate
from scipy.misc import face
from matplotlib import pyplot as plt
img = face()
rot = rotate(img, 30, reshape=False)
fig, ax = plt.subplots(1, 2)
ax[0].imshow(img)
ax[1].imshow(rot)
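A quick way to confirm the effect without plotting is to compare shapes. This is a small sketch on a made-up random array rather than a real image:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
img = rng.random((64, 48))  # stand-in for a greyscale image

# reshape=False keeps the input shape and clips whatever rotates out of frame.
rot_same = rotate(img, 10, reshape=False, mode='constant')

# The default (reshape=True) grows the output so the whole input still fits.
rot_full = rotate(img, 10, mode='constant')

print(img.shape, rot_same.shape, rot_full.shape)
```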
Things are more complicated for scipy.ndimage.zoom.
A naive method would be to zoom the entire input array, then use slice indexing and/or zero-padding to make the output the same size as your input. However, in cases where you're increasing the size of the image it's wasteful to interpolate pixels that are only going to get clipped off at the edges anyway.
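The slicing and zero-padding part of that naive approach could look roughly like this. It is a sketch only: `center_crop_or_pad` is a hypothetical helper, and it would be applied to the output of a full zoom to bring it back to the input's height and width.

```python
import numpy as np

def center_crop_or_pad(arr, out_h, out_w):
    """Center-crop and/or zero-pad the first two axes of `arr` to (out_h, out_w)."""
    h, w = arr.shape[:2]
    # Crop: only has an effect on axes that are larger than the target.
    top = max((h - out_h) // 2, 0)
    left = max((w - out_w) // 2, 0)
    arr = arr[top:top + out_h, left:left + out_w]
    # Pad: only has an effect on axes that are smaller than the target.
    h, w = arr.shape[:2]
    pad_h, pad_w = out_h - h, out_w - w
    pad_spec = [(pad_h // 2, pad_h - pad_h // 2),
                (pad_w // 2, pad_w - pad_w // 2)] + [(0, 0)] * (arr.ndim - 2)
    return np.pad(arr, pad_spec, mode='constant')

big = center_crop_or_pad(np.ones((6, 6)), 4, 4)       # cropped down
small = center_crop_or_pad(np.ones((2, 2)), 4, 4)     # padded with zeros
rgb = center_crop_or_pad(np.ones((3, 5, 3)), 4, 4)    # channel axis left alone
print(big.shape, small.shape, rgb.shape)
```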
Instead you could index only the part of the input that will fall within the bounds of the output array before you apply zoom:
import numpy as np
from scipy.ndimage import zoom
def clipped_zoom(img, zoom_factor, **kwargs):
    h, w = img.shape[:2]
    # For multichannel images we don't want to apply the zoom factor to the RGB
    # dimension, so instead we create a tuple of zoom factors, one per array
    # dimension, with 1's for any trailing dimensions after the width and height.
    zoom_tuple = (zoom_factor,) * 2 + (1,) * (img.ndim - 2)
    # Zooming out
    if zoom_factor < 1:
        # Bounding box of the zoomed-out image within the output array
        zh = int(np.round(h * zoom_factor))
        zw = int(np.round(w * zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2
        # Zero-padding
        out = np.zeros_like(img)
        out[top:top+zh, left:left+zw] = zoom(img, zoom_tuple, **kwargs)
    # Zooming in
    elif zoom_factor > 1:
        # Bounding box of the zoomed-in region within the input array
        zh = int(np.round(h / zoom_factor))
        zw = int(np.round(w / zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2
        out = zoom(img[top:top+zh, left:left+zw], zoom_tuple, **kwargs)
        # `out` might still be slightly larger than `img` due to rounding, so
        # trim off any extra pixels at the edges
        trim_top = ((out.shape[0] - h) // 2)
        trim_left = ((out.shape[1] - w) // 2)
        out = out[trim_top:trim_top+h, trim_left:trim_left+w]
    # If zoom_factor == 1, just return the input array
    else:
        out = img
    return out
For example:
zm1 = clipped_zoom(img, 0.5)
zm2 = clipped_zoom(img, 1.5)
fig, ax = plt.subplots(1, 3)
ax[0].imshow(img)
ax[1].imshow(zm1)
ax[2].imshow(zm2)
I recommend using cv2.resize because it is way faster than scipy.ndimage.zoom, probably due to support for simpler interpolation methods.
For a 480x640 image :
cv2.resize takes ~2 ms
scipy.ndimage.zoom takes ~500 ms
scipy.ndimage.zoom(...,order=0) takes ~175ms
If you are doing the data augmentation on the fly, this amount of speedup is invaluable because it means more experiments in less time.
Here is a version of clipped_zoom using cv2.resize
def cv2_clipped_zoom(img, zoom_factor=0):
    """
    Center zoom in/out of the given image, returning an enlarged/shrunk view of
    the image without changing dimensions
    ------
    Args:
        img : ndarray
            Image array
        zoom_factor : float
            amount of zoom as a ratio [0 to Inf). Default 0.
    ------
    Returns:
        result: ndarray
            numpy ndarray of the same shape of the input img zoomed by the specified factor.
    """
    if zoom_factor == 0:
        return img
    height, width = img.shape[:2]  # It's also the final desired shape
    new_height, new_width = int(height * zoom_factor), int(width * zoom_factor)
    ### Crop only the part that will remain in the result (more efficient)
    # Centered bbox of the final desired size in resized (larger/smaller) image coordinates
    y1, x1 = max(0, new_height - height) // 2, max(0, new_width - width) // 2
    y2, x2 = y1 + height, x1 + width
    bbox = np.array([y1,x1,y2,x2])
    # Map back to original image coordinates
    # (plain int here: the np.int alias was removed in recent NumPy releases)
    bbox = (bbox / zoom_factor).astype(int)
    y1, x1, y2, x2 = bbox
    cropped_img = img[y1:y2, x1:x2]
    # Handle padding when downscaling
    resize_height, resize_width = min(new_height, height), min(new_width, width)
    pad_height1, pad_width1 = (height - resize_height) // 2, (width - resize_width) //2
    pad_height2, pad_width2 = (height - resize_height) - pad_height1, (width - resize_width) - pad_width1
    pad_spec = [(pad_height1, pad_height2), (pad_width1, pad_width2)] + [(0,0)] * (img.ndim - 2)
    result = cv2.resize(cropped_img, (resize_width, resize_height))
    result = np.pad(result, pad_spec, mode='constant')
    assert result.shape[0] == height and result.shape[1] == width
    return result