I am trying to implement an algorithm in Python to scale images by a factor or rotate them by a given angle (or both at the same time). I am using OpenCV to handle the images, and I know OpenCV has these functions built in, but I want to do this myself to better understand image transformations. I believe I calculate the rotation matrix correctly. However, when I try to implement the affine transformation, it does not come out correctly.
import numpy as np
import cv2
import math as m
import sys
img = cv2.imread(sys.argv[1])
angle = sys.argv[2]
#get rotation matrix
def getRMat((cx, cy), angle, scale):
    a = scale*m.cos(angle*np.pi/180)
    b = scale*(m.sin(angle*np.pi/180))
    u = (1-a)*cx-b*cy
    v = b*cx+(1-a)*cy
    return np.array([[a,b,u], [-b,a,v]])
#determine shape of img
h, w = img.shape[:2]
#print h, w
#determine center of image
cx, cy = (w / 2, h / 2)
#calculate rotation matrix
#then grab sine and cosine of the matrix
mat = getRMat((cx,cy), -int(angle), 1)
print mat
cos = np.abs(mat[0,0])
sin = np.abs(mat[0,1])
#calculate new height and width to account for rotation
newWidth = int((h * sin) + (w * cos))
newHeight = int((h * cos) + (w * sin))
#print newWidth, newHeight
mat[0,2] += (newWidth / 2) - cx
mat[1,2] += (newHeight / 2) - cy
#this is how the image SHOULD look
dst = cv2.warpAffine(img, mat, (newWidth, newHeight))
cv2.imshow('dst', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
#apply transform
#attempt at my own warp affine function...still buggy tho
def warpAff(image, matrix, (width, height)):
    dst = np.zeros((width, height, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    #print oldh, oldw
    #loop through old img and transform its coords
    for x in range(oldh):
        for y in range(oldw):
            #print y, x
            #transform the coordinates
            u = int(x*matrix[0,0]+y*matrix[0,1]+matrix[0,2])
            v = int(x*matrix[1,0]+y*matrix[1,1]+matrix[1,2])
            #print u, v
            #v -= width / 1.5
            if (u >= 0 and u < height) and (v >= 0 and v < width):
                dst[u,v] = image[x,y]
    return dst
dst = warpAff(img, mat, (newWidth, newHeight))
cv2.imshow('dst', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
Image I am using for testing
You're applying the rotation backward.
This means that for an angle of 20, instead of rotating 20 degrees clockwise, you rotate 20 degrees counterclockwise. That on its own would be easy to fix—just negate the angle.
But it also means that, for each destination pixel, if no source pixel exactly rotates to it, you end up with an all-black pixel. You could solve that by using any interpolation algorithm, but it's making things more complicated.
If we instead just reverse the process, and instead of calculating the destination (u, v) for each (x, y), we calculate the source (x, y) for every destination (u, v), that solves both problems:
def warpAff(image, matrix, width, height):
    dst = np.zeros((width, height, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    # Loop over the destination, not the source, to ensure that you cover
    # every destination pixel exactly 1 time, rather than 0-4 times.
    for u in range(width):
        for v in range(height):
            x = u*matrix[0,0]+v*matrix[0,1]+matrix[0,2]
            y = u*matrix[1,0]+v*matrix[1,1]+matrix[1,2]
            intx, inty = int(x), int(y)
            # We could interpolate here by using something like this linear
            # interpolation matrix, but let's keep it simple and not do that.
            # fracx, fracy = x%1, y%1
            # interp = np.array([[fracx*fracy, (1-fracx)*fracy],
            #                    [fracx*(1-fracy), (1-fracx)*(1-fracy)]])
            # Use <= on the lower bound so source row/column 0 is not skipped.
            if 0 <= x < oldw and 0 <= y < oldh:
                dst[u, v] = image[intx, inty]
    return dst
Now the only remaining problem is that you didn't apply the shift backward, so we end up shifting the image in the wrong direction when we turn everything else around. That's trivial to fix:
mat[0,2] += cx - (newWidth / 2)
mat[1,2] += cy - (newHeight / 2)
You do have one more problem: your code (and this updated code) only works for square images. You're getting height and width backward multiple times, and they almost all cancel out, but apparently one of them doesn't. In general, you're treating your arrays as (width, height) rather than (height, width), but you end up comparing to (original version) or looping over (new version) (height, width). So, if height and width are different, you end up trying to write past the end of the array.
Trying to find all of these and fix them is probably as much work as just starting over and doing it consistently everywhere from the start:
mat = getRMat(cx, cy, int(angle), 1)
cos = np.abs(mat[0,0])
sin = np.abs(mat[0,1])
newWidth = int((h * sin) + (w * cos))
newHeight = int((h * cos) + (w * sin))
mat[0,2] += cx - (newWidth / 2)
mat[1,2] += cy - (newHeight / 2)
def warpAff2(image, matrix, width, height):
    # NumPy images are indexed [row, column], i.e. (height, width)
    dst = np.zeros((height, width, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    for u in range(width):
        for v in range(height):
            x = u*matrix[0,0]+v*matrix[0,1]+matrix[0,2]
            y = u*matrix[1,0]+v*matrix[1,1]+matrix[1,2]
            intx, inty = int(x), int(y)
            # Test the float coordinates with <= on the lower bound so that
            # source row 0 and column 0 are not dropped.
            if 0 <= x < oldw and 0 <= y < oldh:
                pix = image[inty, intx]
                dst[v, u] = pix
    return dst
dst = warpAff2(img, mat, newWidth, newHeight)
It's worth noting that there are much simpler (and more efficient) ways to implement this. If you build a 3x3 square matrix, you can vectorize the multiplication. Also, you can create the matrix more simply by just multiplying a shift matrix @ a rotation matrix @ an unshift matrix instead of manually fixing things up after the fact. But hopefully this version, since it's as close as possible to your original, should be easiest to understand.
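As a rough sketch of that 3x3 idea (not the code from this answer): the helper names `translation`, `rotation`, `warp_nearest` and `rotate_about_center` below are made up for illustration, and the pixel-centre convention `(w - 1) / 2` is an assumption. The forward transform is composed as unshift @ rotate @ shift, inverted once, and applied to all destination pixels in a single matrix product.

```python
import numpy as np

def translation(tx, ty):
    """3x3 homogeneous translation by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(angle_deg, scale=1.0):
    """3x3 homogeneous rotation about the origin with uniform scale."""
    t = np.deg2rad(angle_deg)
    c, s = scale * np.cos(t), scale * np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def warp_nearest(image, forward, out_w, out_h):
    """Map every destination pixel back to the source through the inverse
    of `forward` and copy the nearest source pixel."""
    inv = np.linalg.inv(forward)
    v, u = np.mgrid[:out_h, :out_w]            # destination rows (v), columns (u)
    dest = np.stack([u.ravel(), v.ravel(), np.ones(u.size)])
    x, y, _ = inv @ dest                       # source columns (x), rows (y)
    xi, yi = np.rint(x).astype(int), np.rint(y).astype(int)
    h, w = image.shape[:2]
    ok = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    dst = np.zeros((out_h, out_w) + image.shape[2:], dtype=image.dtype)
    dst[v.ravel()[ok], u.ravel()[ok]] = image[yi[ok], xi[ok]]
    return dst

def rotate_about_center(image, angle_deg, out_w, out_h):
    """Shift the source centre to the origin, rotate, then shift the origin
    to the centre of the destination canvas."""
    h, w = image.shape[:2]
    forward = (translation((out_w - 1) / 2.0, (out_h - 1) / 2.0)
               @ rotation(angle_deg)
               @ translation(-(w - 1) / 2.0, -(h - 1) / 2.0))
    return warp_nearest(image, forward, out_w, out_h)

demo = np.arange(9, dtype=np.uint8).reshape(3, 3)
same = rotate_about_center(demo, 0, 3, 3)
quarter = rotate_about_center(demo, 90, 3, 3)
```

With this sign convention a positive angle turns the picture clockwise on screen, because image rows grow downward; negate the angle if the opposite direction is wanted.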
Related
Let's say I have a grayscale image that has some black pixels, as shown below:
In this image, I am trying to find patches that contain no zero values. For simplicity, let's assume that overlapping patches are allowed. The challenge is that these patches aren't rectangular but circular in shape. Please see an example below:
Please note that there are many such patches possible in the image. However, for illustration purposes, I have just manually drawn a few.
It is possible to find such patches using nested for loops, but this doesn't look like the optimal way.
# find one circular patch
for y in range(-radius, radius):
    for x in range(-radius, radius):
        if x**2 + y**2 < radius**2:
            # this pixel is inside the circular patch
            patch_x, patch_y = img_x + x, img_y + y
I am trying to use a convolution operation, but no luck so far:
import cv2
import numpy as np
radius = 20
img = cv2.imread('img.png', cv2.CV_8UC1)
candidates = img != 0
patch_shape = (radius, radius)
out = np.lib.stride_tricks.as_strided(
    candidates,
    shape=(candidates.shape[0] - patch_shape[0] + 1,
           candidates.shape[1] - patch_shape[1] + 1,
           *patch_shape),
    strides=2*img.strides,
    writeable=False,
)
patches = np.argwhere(out.all(axis=(-2, -1)))
My goal is to find all (or at least a few, say 10) circular patches of a given size in a NumPy array that contain no zero values.
I would go with convolution.
There is a nice trick to generate a circular kernel (mask):
def create_circular_kernel(radius):
    # Use an odd side length (2*radius + 1) so the circle is centred on a pixel.
    center = (radius, radius)
    h = 2*radius + 1
    w = 2*radius + 1
    Y, X = np.ogrid[:h, :w]
    dist_from_center = np.sqrt((X - center[0])**2 + (Y - center[1])**2)
    mask = dist_from_center <= radius
    return mask
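And then circle centers can be found using convolution. Correlating a mask of the zero pixels with the kernel counts the zeros inside the circle centered at each pixel.
In your case, that count should be equal to zero.
from scipy.signal import correlate2d
radius = 20
kernel = create_circular_kernel(radius).astype(int)
# 1 where the image is zero, so the correlation counts zeros per circle
zeros = (img == 0).astype(int)
# fillvalue=1 treats everything outside the image as zero, so circles that
# stick out past the border are rejected
zero_count = correlate2d(zeros, kernel, 'same', fillvalue=1)
patch_centers = np.where(zero_count == 0, 1, 0)
This gives an image with the value 1 at every center where you can draw a circle containing no zero values.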
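To make the idea easy to check on a toy array, here is a small NumPy-only sketch. It is an assumption-laden variant, not the answer's code: `circular_mask` and `zero_free_centers` are hypothetical names, it uses `sliding_window_view` (NumPy 1.20 or newer) instead of SciPy, and it only reports circles that lie fully inside the image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def circular_mask(radius):
    """Boolean (2r+1, 2r+1) disk mask centred on the middle pixel."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2

def zero_free_centers(img, radius):
    """Return (row, col) centres whose disk lies fully inside `img`
    and covers no zero pixel."""
    mask = circular_mask(radius)
    size = 2 * radius + 1
    windows = sliding_window_view(img != 0, (size, size))
    # A window qualifies when every pixel under the mask is non-zero;
    # pixels outside the disk are ignored via `| ~mask`.
    ok = (windows | ~mask).all(axis=(-2, -1))
    return np.argwhere(ok) + radius    # window corner -> window centre

demo = np.ones((7, 7), dtype=np.uint8)
demo[0, 0] = 0    # corner pixel: not covered by any radius-2 disk
demo[3, 6] = 0    # right edge: only the disk centred at (3, 4) touches it
centers = zero_free_centers(demo, radius=2)
```

On large images the boolean window view is reduced lazily but still touches every pixel of every window, so the correlation approach is likely the faster of the two.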
For an assignment I want to resize a .jpg image with Python code, but without using PIL's Image.resize() or another similar function. I want to write the code myself but I can't figure out how. The image is RGB. I have found this can be solved by nearest neighbor interpolation (as well as other methods, but this one is fine for my specific assignment). It should be possible to make both the height and the width bigger or smaller. So far I only have this:
import numpy as np
import scipy as sc
import matplotlib as plt
import math
import PIL
from PIL import Image
img = np.array(Image.open("foto1.jpg"))
height = img.shape[0]
width = img.shape[1]
dim = img.shape[2]
new_h = int(input("New height: "))
new_w = int(input("New width: "))
imgR = img[:,:,0] #red pixels
imgG = img[:,:,1] #green pixels
imgB = img[:,:,2] #blue pixels
newR = np.empty([new_h, new_w])
newG = np.empty([new_h, new_w])
newB = np.empty([new_h, new_w])
So now all three colours have a new array of the right dimensions. Unfortunately on the web I can only find people who use resize() functions... Does anyone know?
Thanks in advance!
The key to doing any image transformation like resizing is to have a mapping from output coordinates to input coordinates. Then you can simply iterate over the entire output and grab a pixel from the input. Nearest neighbor makes this particularly easy, because there's never a need to interpolate a pixel that doesn't lie exactly on integer coordinates - you simply round the coordinates to the nearest integer.
for new_y in range(new_h):
    # map the output row back to the nearest input row; max(..., 1) avoids a
    # division by zero when the output has a single row
    old_y = int(round(new_y * (height - 1) / max(new_h - 1, 1)))
    if old_y < 0: old_y = 0
    if old_y >= height: old_y = height - 1
    for new_x in range(new_w):
        # same mapping for the columns
        old_x = int(round(new_x * (width - 1) / max(new_w - 1, 1)))
        if old_x < 0: old_x = 0
        if old_x >= width: old_x = width - 1
        newR[new_y,new_x] = imgR[old_y,old_x]
        newG[new_y,new_x] = imgG[old_y,old_x]
        newB[new_y,new_x] = imgB[old_y,old_x]
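The same output-to-input mapping can also be written without explicit loops. The following is only a sketch under the assumption that the image is a NumPy array of shape (H, W) or (H, W, C); `resize_nearest` is a name chosen here, not something from the question.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    height, width = img.shape[:2]
    # Input row/column for each output row/column (endpoints map to endpoints).
    rows = np.rint(np.arange(new_h) * (height - 1) / max(new_h - 1, 1)).astype(int)
    cols = np.rint(np.arange(new_w) * (width - 1) / max(new_w - 1, 1)).astype(int)
    # Broadcasting rows[:, None] against cols picks one source pixel per
    # output pixel; any trailing channel axis is carried along unchanged.
    return img[rows[:, None], cols]

small = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
big = resize_nearest(small, 4, 4)
```

Because all three channels are indexed at once, there is no need to split the image into separate red, green and blue arrays first.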
The following code could do the trick.
def resize_img(image, resize_width, resize_height):
    """
    :params
        image: shape -> (width, height, channels). Note that "width" here
            means the first array axis; for arrays loaded with PIL or OpenCV
            the first axis is the image height.
        resize_width: The resize width dimension.
        resize_height: The resize height dimension.
    :returns
        array of shape -> (resized_width, resized_height, channels)
    """
    original_width, original_height, channel = image.shape
    red_channel = image[:, :, 0]
    green_channel = image[:, :, 1]
    blue_channel = image[:, :, 2]
    resized_image = np.zeros((resize_width, resize_height, channel), dtype=np.uint8)
    x_scale = original_width/resize_width
    y_scale = original_height/resize_height
    # Source index for every output index. Building exactly resize_width
    # (resize_height) entries avoids the extra element that np.arange can
    # produce with a float step.
    resize_index_x = np.ceil(np.arange(resize_width) * x_scale).astype(int)
    resize_index_y = np.ceil(np.arange(resize_height) * y_scale).astype(int)
    # Clamp indices that ceil pushed past the last row/column.
    resize_index_x = np.minimum(resize_index_x, original_width - 1)
    resize_index_y = np.minimum(resize_index_y, original_height - 1)
    resized_image[:, :, 0] = red_channel[resize_index_x, :][:, resize_index_y]
    resized_image[:, :, 1] = green_channel[resize_index_x, :][:, resize_index_y]
    resized_image[:, :, 2] = blue_channel[resize_index_x, :][:, resize_index_y]
    return resized_image
I want to apply a pinch/bulge filter to an image using Python OpenCV. The result should look something like this example:
https://pixijs.io/pixi-filters/tools/screenshots/dist/bulge-pinch.gif
I've read the following Stack Overflow post, which should contain the correct formula for the filter: Formulas for Barrel/Pincushion distortion
But I'm struggling to implement this in Python OpenCV.
I've read about using maps to apply a filter to an image: Distortion effect using OpenCv-python
As far as I understand, the code could look something like this:
import numpy as np
import cv2 as cv
f_img = 'example.jpg'
im_cv = cv.imread(f_img)
# grab the dimensions of the image
(h, w, _) = im_cv.shape
# set up the x and y maps as float32
flex_x = np.zeros((h, w), np.float32)
flex_y = np.zeros((h, w), np.float32)
# create map with the barrel pincushion distortion formula
for y in range(h):
    for x in range(w):
        flex_x[y, x] = APPLY FORMULA TO X
        flex_y[y, x] = APPLY FORMULA TO Y
# do the remap this is where the magic happens
dst = cv.remap(im_cv, flex_x, flex_y, cv.INTER_LINEAR)
cv.imshow('src', im_cv)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
Is this the correct way to achieve the distortion presented in the example image? Any help in the form of useful resources or, preferably, examples is much appreciated.
After familiarizing myself with the ImageMagick source code, I've found a way to apply the formula for distortion. With the help of the OpenCV remap function, this is a way to distort an image:
import math
import numpy as np
import cv2 as cv
f_img = 'example.jpg'
im_cv = cv.imread(f_img)
# grab the dimensions of the image
(h, w, _) = im_cv.shape
# set up the x and y maps as float32
flex_x = np.zeros((h, w), np.float32)
flex_y = np.zeros((h, w), np.float32)
# effect parameters (assumed example values, they are not given above)
scale_x = 1.0
scale_y = 1.0
center_x, center_y = w / 2, h / 2
radius = min(center_x, center_y)
amount = 0.5  # positive pinches toward the center, negative bulges outward
# create map with the barrel pincushion distortion formula
for y in range(h):
    delta_y = scale_y * (y - center_y)
    for x in range(w):
        # determine if pixel is within an ellipse
        delta_x = scale_x * (x - center_x)
        distance = delta_x * delta_x + delta_y * delta_y
        if distance >= (radius * radius):
            # outside the effect radius: identity mapping
            flex_x[y, x] = x
            flex_y[y, x] = y
        else:
            factor = 1.0
            if distance > 0.0:
                factor = math.pow(math.sin(math.pi * math.sqrt(distance) / radius / 2), -amount)
            flex_x[y, x] = factor * delta_x / scale_x + center_x
            flex_y[y, x] = factor * delta_y / scale_y + center_y
# do the remap this is where the magic happens
dst = cv.remap(im_cv, flex_x, flex_y, cv.INTER_LINEAR)
cv.imshow('src', im_cv)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
This has the same effect as using the convert -implode function from ImageMagick.
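The per-pixel Python loop above gets slow on large images. A possible vectorized way to build the same two maps with NumPy is sketched here; `implode_maps` is a hypothetical helper name, and the parameter values in the demo call are arbitrary.

```python
import numpy as np

def implode_maps(h, w, center_x, center_y, radius, amount,
                 scale_x=1.0, scale_y=1.0):
    """Build float32 x/y sampling maps for the pinch/bulge effect."""
    y, x = np.mgrid[:h, :w].astype(np.float64)
    delta_x = scale_x * (x - center_x)
    delta_y = scale_y * (y - center_y)
    distance = delta_x**2 + delta_y**2
    # Only pixels strictly inside the radius (and not at the exact centre)
    # are displaced; everywhere else the factor stays 1 (identity mapping).
    inside = (distance > 0.0) & (distance < radius * radius)
    factor = np.ones((h, w))
    factor[inside] = np.sin(
        np.pi * np.sqrt(distance[inside]) / radius / 2) ** (-amount)
    map_x = factor * delta_x / scale_x + center_x
    map_y = factor * delta_y / scale_y + center_y
    return map_x.astype(np.float32), map_y.astype(np.float32)

map_x, map_y = implode_maps(h=9, w=9, center_x=4, center_y=4,
                            radius=3, amount=0.5)
```

The resulting arrays can be passed to `cv.remap(im_cv, map_x, map_y, cv.INTER_LINEAR)` in place of `flex_x` and `flex_y`.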
You can do that using implode and explode options in Python Wand, which uses ImageMagick.
Input:
from wand.image import Image
import numpy as np
import cv2
with Image(filename='zelda1.jpg') as img:
    img.virtual_pixel = 'black'
    img.implode(0.5)
    img.save(filename='zelda1_implode.jpg')
    # convert to opencv/numpy array format
    img_implode_opencv = np.array(img)
    img_implode_opencv = cv2.cvtColor(img_implode_opencv, cv2.COLOR_RGB2BGR)
with Image(filename='zelda1.jpg') as img:
    img.virtual_pixel = 'black'
    img.implode(-0.5)
    img.save(filename='zelda1_explode.jpg')
    # convert to opencv/numpy array format
    img_explode_opencv = np.array(img)
    img_explode_opencv = cv2.cvtColor(img_explode_opencv, cv2.COLOR_RGB2BGR)
# display result with opencv
cv2.imshow("IMPLODE", img_implode_opencv)
cv2.imshow("EXPLODE", img_explode_opencv)
cv2.waitKey(0)
Implode:
Explode:
In order to better understand how image manipulation works, I've decided to create my own image rotation algorithm rather than using cv2.rotate(). However, I'm encountering a weird picture cropping and pixel misplacement issue.
I think it may have something to do with my padding, but there may be other errors.
import cv2
import math
import numpy as np
# Load & Show original image
img = cv2.imread('Lena.png', 0)
cv2.imshow('Original', img)
# Variable declarations
h = img.shape[0] # Also known as rows
w = img.shape[1] # Also known as columns
cX = h / 2 #Image Center X
cY = w / 2 #Image Center Y
theta = math.radians(100) #Change to adjust rotation angle
imgArray = np.array((img))
imgArray = np.pad(imgArray,pad_width=((100,100),(100,100)),mode='constant',constant_values=0)
#Add padding in an attempt to prevent image cropping
# loop pixel by pixel in image
for x in range(h + 1):
    for y in range(w + 1):
        try:
            TX = int((x-cX)*math.cos(theta)+(y-cY)*math.sin(theta)+cX) #Rotation formula
            TY = int(-(x-cX)*math.sin(theta)+(y-cY)*math.cos(theta)+cY) #Rotation formula
            imgArray[x,y] = img[TX,TY]
        except IndexError as error:
            print(error)
cv2.imshow('Rotated', imgArray)
cv2.waitKey(0)
Edit:
I think the misplaced image position may have something to do with lack of proper origin point, however I cannot seem to find a functioning solution to that problem.
I haven't dived into the math of this domain, but based on the given information I think the rotation matrix formula should work like this:
UPDATE:
As promised, I dug a bit deeper into the domain and arrived at the solution below. The main trick is that I swapped the source and destination indices in the loop, so rounding no longer causes any problems:
import cv2
import math
import numpy as np
# Load & Show original image
img = cv2.imread('/home/george/Downloads/lena.png', 0)
cv2.imshow('Original', img)
# Variable declarations
h = img.shape[0] # Also known as rows
w = img.shape[1] # Also known as columns
p = 120
h += 2 * p
w += 2 * p
cX = h / 2 #Image Center X
cY = w / 2 #Image Center Y
theta = math.radians(45) #Change to adjust rotation angle
imgArray = np.zeros_like((img))
#Add padding in an attempt to prevent image cropping
imgArray = np.pad(imgArray, pad_width=p, mode='constant', constant_values=0)
img = np.pad(img, pad_width=p, mode='constant', constant_values=0)
# loop pixel by pixel in the destination image
for TX in range(h):
    for TY in range(w):
        x = int( +(TX - cX) * math.cos(theta) + (TY - cY) * math.sin(theta) + cX) #Rotation formula
        y = int( -(TX - cX) * math.sin(theta) + (TY - cY) * math.cos(theta) + cY) #Rotation formula
        # Skip source coordinates outside the image. Negative indices would
        # otherwise wrap around silently instead of raising IndexError.
        if 0 <= x < h and 0 <= y < w:
            imgArray[TX, TY] = img[x, y]
cv2.imshow('Rotated', imgArray)
cv2.waitKey(0)
exit()
Note: See usr2564301's comment too, if you want to dive deeper into the domain.
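Why swapping the loop direction matters can be seen on a toy image. The sketch below is illustrative only (`rotate_nearest` and its `loop_over_destination` flag are names invented here): it rotates an all-white square both ways and counts black "holes" in a central region that is guaranteed to be covered by the source.

```python
import math
import numpy as np

def rotate_nearest(src, theta, loop_over_destination):
    """Rotate a 2-D array about its centre with nearest-pixel rounding."""
    h, w = src.shape
    cx, cy = (h - 1) / 2, (w - 1) / 2
    # Looping over the destination needs the inverse rotation (angle -theta).
    angle = -theta if loop_over_destination else theta
    cos, sin = math.cos(angle), math.sin(angle)
    out = np.zeros_like(src)
    for i in range(h):
        for j in range(w):
            # (i, j) rotated about the centre, rounded to the nearest pixel
            ri = int(round((i - cx) * cos - (j - cy) * sin + cx))
            rj = int(round((i - cx) * sin + (j - cy) * cos + cy))
            if 0 <= ri < h and 0 <= rj < w:
                if loop_over_destination:
                    out[i, j] = src[ri, rj]   # pull: every (i, j) gets a value
                else:
                    out[ri, rj] = src[i, j]   # push: some (ri, rj) are never hit
    return out

src = np.full((41, 41), 255, dtype=np.uint8)
theta = math.radians(30)
inner = (slice(12, 29), slice(12, 29))  # stays inside the source at any angle
holes_push = int((rotate_nearest(src, theta, False)[inner] == 0).sum())
holes_pull = int((rotate_nearest(src, theta, True)[inner] == 0).sum())
```

Pushing source pixels forward lets two of them round to the same destination while a neighbouring destination receives none, which is where the black speckles come from; pulling assigns each destination pixel exactly once.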
For my neural network I want to augment my training data by adding small random rotations and zooms to my images. The issue I am having is that scipy is changing the size of my images when it applies the rotations and zooms. I just need to clip the edges if part of the image goes out of bounds. All of my images must be the same size.
def loadImageData(img, distort = False):
    c, fn = img
    img = scipy.ndimage.imread(fn, True)
    if distort:
        img = scipy.ndimage.zoom(img, 1 + 0.05 * rnd(), mode = 'constant')
        img = scipy.ndimage.rotate(img, 10 * rnd(), mode = 'constant')
        print(img.shape)
    img = img - np.min(img)
    img = img / np.max(img)
    img = np.reshape(img, (1, *img.shape))
    y = np.zeros(ncats)
    y[c] = 1
    return (img, y)
scipy.ndimage.rotate accepts a reshape= parameter:
reshape : bool, optional
If reshape is true, the output shape is adapted so that the input
array is contained completely in the output. Default is True.
So to "clip" the edges you can simply call scipy.ndimage.rotate(img, ..., reshape=False).
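from scipy.ndimage import rotate
# scipy.misc.face has been removed from recent SciPy releases;
# scipy.datasets.face (SciPy >= 1.10) is its replacement
from scipy.datasets import face
from matplotlib import pyplot as plt
img = face()
rot = rotate(img, 30, reshape=False)
fig, ax = plt.subplots(1, 2)
ax[0].imshow(img)
ax[1].imshow(rot)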
Things are more complicated for scipy.ndimage.zoom.
A naive method would be to zoom the entire input array, then use slice indexing and/or zero-padding to make the output the same size as your input. However, in cases where you're increasing the size of the image it's wasteful to interpolate pixels that are only going to get clipped off at the edges anyway.
Instead you could index only the part of the input that will fall within the bounds of the output array before you apply zoom:
import numpy as np
from scipy.ndimage import zoom
def clipped_zoom(img, zoom_factor, **kwargs):
    h, w = img.shape[:2]
    # For multichannel images we don't want to apply the zoom factor to the RGB
    # dimension, so instead we create a tuple of zoom factors, one per array
    # dimension, with 1's for any trailing dimensions after the width and height.
    zoom_tuple = (zoom_factor,) * 2 + (1,) * (img.ndim - 2)
    # Zooming out
    if zoom_factor < 1:
        # Bounding box of the zoomed-out image within the output array
        zh = int(np.round(h * zoom_factor))
        zw = int(np.round(w * zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2
        # Zero-padding
        out = np.zeros_like(img)
        out[top:top+zh, left:left+zw] = zoom(img, zoom_tuple, **kwargs)
    # Zooming in
    elif zoom_factor > 1:
        # Bounding box of the zoomed-in region within the input array
        zh = int(np.round(h / zoom_factor))
        zw = int(np.round(w / zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2
        out = zoom(img[top:top+zh, left:left+zw], zoom_tuple, **kwargs)
        # `out` might still be slightly larger than `img` due to rounding, so
        # trim off any extra pixels at the edges
        trim_top = ((out.shape[0] - h) // 2)
        trim_left = ((out.shape[1] - w) // 2)
        out = out[trim_top:trim_top+h, trim_left:trim_left+w]
    # If zoom_factor == 1, just return the input array
    else:
        out = img
    return out
For example:
zm1 = clipped_zoom(img, 0.5)
zm2 = clipped_zoom(img, 1.5)
fig, ax = plt.subplots(1, 3)
ax[0].imshow(img)
ax[1].imshow(zm1)
ax[2].imshow(zm2)
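For comparison, the naive approach mentioned earlier (zoom the whole array, then crop or zero-pad back to the original size) mostly needs a centring helper. This is a minimal NumPy-only sketch; `center_crop_or_pad` is a hypothetical name, not part of SciPy.

```python
import numpy as np

def center_crop_or_pad(arr, out_h, out_w):
    """Return an (out_h, out_w, ...) array holding `arr` centred:
    cropped where it is too large, zero-padded where it is too small."""
    out = np.zeros((out_h, out_w) + arr.shape[2:], dtype=arr.dtype)
    h, w = arr.shape[:2]
    copy_h, copy_w = min(h, out_h), min(w, out_w)
    # Top-left corner of the copied block in the source and in the output
    src_top, src_left = (h - copy_h) // 2, (w - copy_w) // 2
    dst_top, dst_left = (out_h - copy_h) // 2, (out_w - copy_w) // 2
    out[dst_top:dst_top + copy_h, dst_left:dst_left + copy_w] = \
        arr[src_top:src_top + copy_h, src_left:src_left + copy_w]
    return out

big = np.arange(36).reshape(6, 6)
cropped = center_crop_or_pad(big, 4, 4)             # drops one pixel per side
padded = center_crop_or_pad(np.ones((2, 2)), 4, 4)  # adds a one-pixel zero border
```

Wrapping a full-size zoom in this helper would give the same output shape as `clipped_zoom`, at the cost of interpolating pixels that are then thrown away when zooming in.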
I recommend using cv2.resize because it is way faster than scipy.ndimage.zoom, probably due to support for simpler interpolation methods.
For a 480x640 image :
cv2.resize takes ~2 ms
scipy.ndimage.zoom takes ~500 ms
scipy.ndimage.zoom(...,order=0) takes ~175ms
If you are doing the data augmentation on the fly, this amount of speedup is invaluable because it means more experiments in less time.
Here is a version of clipped_zoom using cv2.resize
def cv2_clipped_zoom(img, zoom_factor=0):
    """
    Center zoom in/out of the given image, returning an enlarged/shrunk view of
    the image without changing dimensions
    ------
    Args:
        img : ndarray
            Image array
        zoom_factor : float
            amount of zoom as a ratio [0 to Inf). Default 0.
    ------
    Returns:
        result: ndarray
            numpy ndarray of the same shape of the input img zoomed by the specified factor.
    """
    if zoom_factor == 0:
        return img
    height, width = img.shape[:2] # It's also the final desired shape
    new_height, new_width = int(height * zoom_factor), int(width * zoom_factor)
    ### Crop only the part that will remain in the result (more efficient)
    # Centered bbox of the final desired size in resized (larger/smaller) image coordinates
    y1, x1 = max(0, new_height - height) // 2, max(0, new_width - width) // 2
    y2, x2 = y1 + height, x1 + width
    bbox = np.array([y1,x1,y2,x2])
    # Map back to original image coordinates
    # (np.int was removed in NumPy 1.24; the builtin int is equivalent here)
    bbox = (bbox / zoom_factor).astype(int)
    y1, x1, y2, x2 = bbox
    cropped_img = img[y1:y2, x1:x2]
    # Handle padding when downscaling
    resize_height, resize_width = min(new_height, height), min(new_width, width)
    pad_height1, pad_width1 = (height - resize_height) // 2, (width - resize_width) //2
    pad_height2, pad_width2 = (height - resize_height) - pad_height1, (width - resize_width) - pad_width1
    pad_spec = [(pad_height1, pad_height2), (pad_width1, pad_width2)] + [(0,0)] * (img.ndim - 2)
    result = cv2.resize(cropped_img, (resize_width, resize_height))
    result = np.pad(result, pad_spec, mode='constant')
    assert result.shape[0] == height and result.shape[1] == width
    return result