First off, I am relatively new to Python and its libraries.
The purpose of the following code is to convert an HDR image to RGBM, as detailed in WebGL Insights, Chapter 16.
import argparse
import numpy
import imageio
import math
# Parse arguments
parser = argparse.ArgumentParser(description = 'Convert a HDR image to a 32bit RGBM image.')
parser.add_argument('file', metavar = 'FILE', type = str, help ='Image file to convert')
args = parser.parse_args()
# Load image
image = imageio.imread(args.file)
height = image.shape[0]
width = image.shape[1]
output = numpy.zeros((height, width, 4))
# Convert image
for i in numpy.ndindex(image.shape[:2]):
    rgb = image[i]
    rgba = numpy.zeros(4)
    rgba[0:3] = (1.0 / 7.0) * numpy.sqrt(rgb)
    rgba[3] = max(max(rgba[0], rgba[1]), rgba[2])
    rgba[3] = numpy.clip(rgba[3], 1.0 / 255.0, 1.0)
    rgba[3] = math.ceil(rgba[3] * 255.0) / 255.0
    output[i] = rgba
# Save image to png
imageio.imsave(args.file.split('.')[0] + '_rgbm.png', output)
The code works and produces correct results, but it does so very slowly. This is of course caused by iterating over each pixel separately in Python, which for larger images can take a long time (about 4:30 minutes for a 3200x1600 image).
My question is: Is there a more efficient way to achieve what I'm after? I briefly looked into vectorization and broadcasting in numpy but haven't found a way to apply those to my problem yet.
Edit:
Thanks to Mateen Ulhaq, I found a solution:
# Convert image
rgb = (1.0 / 7.0) * numpy.sqrt(image)
alpha = numpy.amax(rgb, axis=2)
alpha = numpy.clip(alpha, 1.0 / 255.0, 1.0)
alpha = numpy.ceil(alpha * 255.0) / 255.0
alpha = numpy.reshape(alpha, (height, width, 1))
output = numpy.concatenate((rgb, alpha), axis=2)
This completes in only a few seconds.
The line
for i in numpy.ndindex(image.shape[:2]):
is just iterating over every pixel. It's probably faster to get rid of the loop and process every pixel in each line of code ("vectorized").
rgb = (1.0 / 7.0) * np.sqrt(image)
alpha = np.amax(rgb, axis=2)
alpha = np.clip(alpha, 1.0 / 255.0, 1.0)
alpha = np.ceil(alpha * 255.0) / 255.0
alpha = np.reshape(alpha, (height, width, 1))
output = np.concatenate((rgb, alpha), axis=2)
I think it's also a bit clearer.
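One caveat (my addition): output is a float array, and some imageio versions warn when writing float data to PNG. A minimal sketch of an explicit 8-bit conversion before saving (assuming the RGBM values are in [0, 1]; the filename is hypothetical):
# convert the float RGBM array to 8-bit before writing
out8 = (numpy.clip(output, 0.0, 1.0) * 255.0).astype(numpy.uint8)
imageio.imsave('output_rgbm.png', out8)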
I am using two different ways to resize an image, but the original and both resized versions look exactly the same...
What am I doing wrong that no scaling occurs?
import cv2 as cv
import numpy as np
path = "resources/Shapes.png"
img = cv.imread(path)
cv.imshow("img", img)
res1 = cv.resize(img, None, fx = 2, fy = 2, interpolation = cv.INTER_CUBIC)
cv.imshow("res1", res1)
height, width = img.shape[:2]
res2 = cv.resize(img, (2 * width, 2 * height), interpolation = cv.INTER_CUBIC)
cv.imshow("res2", res2)
k = cv.waitKey(0)
Just putting this here for future reference:
The code above works; the issue was that imshow does not always display an image at its true size. By saving the different images, or simply comparing res1.shape with img.shape, you can see their true sizes.
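For instance, a quick check along those lines (a small sketch reusing the variables from the snippet above; the filenames are hypothetical):
# the shapes reveal whether scaling actually occurred
print(img.shape, res1.shape, res2.shape)  # res1/res2 should be doubled in height and width
# saved files show their true sizes, unlike imshow's window preview
cv.imwrite("res1.png", res1)
cv.imwrite("res2.png", res2)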
Given an input image test.png, how can I resize it (downsizing) so that it fits inside a 400x500 pixel box, keeping the aspect ratio?
Is it possible directly with cv2.resize, or do we have to compute the sizing factor manually?
It seems there is no built-in helper function for this in cv2. Here is a working solution:
import cv2

maxwidth, maxheight = 400, 500
img = cv2.imread('test2.png')
f1 = maxwidth / img.shape[1]
f2 = maxheight / img.shape[0]
f = min(f1, f2) # resizing factor
dim = (int(img.shape[1] * f), int(img.shape[0] * f))
resized = cv2.resize(img, dim)
cv2.imwrite('out.png', resized)
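One possible refinement (my addition, not part of the original answer): for downsizing, OpenCV's documentation recommends INTER_AREA interpolation, which can be passed explicitly:
resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)  # suited to downscaling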
I want to apply a pinch/bulge filter on an image using Python OpenCV. The result should look something like this example:
https://pixijs.io/pixi-filters/tools/screenshots/dist/bulge-pinch.gif
I've read the following stackoverflow post that should be the correct formula for the filter: Formulas for Barrel/Pincushion distortion
But I'm struggling to implement this in Python OpenCV.
I've read about using maps to apply a filter to an image: Distortion effect using OpenCv-python
As far as I understand, the code could look something like this:
import numpy as np
import cv2 as cv
f_img = 'example.jpg'
im_cv = cv.imread(f_img)
# grab the dimensions of the image
(h, w, _) = im_cv.shape
# set up the x and y maps as float32
flex_x = np.zeros((h, w), np.float32)
flex_y = np.zeros((h, w), np.float32)
# create map with the barrel pincushion distortion formula
for y in range(h):
    for x in range(w):
        flex_x[y, x] = APPLY FORMULA TO X
        flex_y[y, x] = APPLY FORMULA TO Y
# do the remap this is where the magic happens
dst = cv.remap(im_cv, flex_x, flex_y, cv.INTER_LINEAR)
cv.imshow('src', im_cv)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
Is this the correct way to achieve the distortion shown in the example image? Any help regarding useful resources, or preferably examples, is much appreciated.
After familiarizing myself with the ImageMagick source code, I've found a way to apply the formula for distortion. With the help of the OpenCV remap function, this is a way to distort an image:
import math

import numpy as np
import cv2 as cv
f_img = 'example.jpg'
im_cv = cv.imread(f_img)
# grab the dimensions of the image
(h, w, _) = im_cv.shape
# effect parameters (undefined in the original snippet; these are plausible
# defaults): distort within a centered circle, where positive `amount`
# pinches (implode) and negative `amount` bulges (explode)
center_x, center_y = w / 2, h / 2
radius = min(center_x, center_y)
scale_x, scale_y = 1.0, 1.0
amount = 0.5
# set up the x and y maps as float32
flex_x = np.zeros((h, w), np.float32)
flex_y = np.zeros((h, w), np.float32)
# create map with the barrel pincushion distortion formula
for y in range(h):
    delta_y = scale_y * (y - center_y)
    for x in range(w):
        # determine if pixel is within an ellipse
        delta_x = scale_x * (x - center_x)
        distance = delta_x * delta_x + delta_y * delta_y
        if distance >= (radius * radius):
            flex_x[y, x] = x
            flex_y[y, x] = y
        else:
            factor = 1.0
            if distance > 0.0:
                factor = math.pow(math.sin(math.pi * math.sqrt(distance) / radius / 2), -amount)
            flex_x[y, x] = factor * delta_x / scale_x + center_x
            flex_y[y, x] = factor * delta_y / scale_y + center_y
# do the remap this is where the magic happens
dst = cv.remap(im_cv, flex_x, flex_y, cv.INTER_LINEAR)
cv.imshow('src', im_cv)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
This has the same effect as using the convert -implode function from ImageMagick.
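As with the RGBM question at the top of this page, the per-pixel Python loop becomes slow for large images. Below is a numpy-vectorized sketch of the same map construction (my addition, reusing h, w and the effect parameters from the snippet above):
y, x = np.indices((h, w), dtype=np.float32)
delta_x = scale_x * (x - center_x)
delta_y = scale_y * (y - center_y)
distance = delta_x ** 2 + delta_y ** 2
# same formula as the loop, applied only to pixels strictly inside the circle
factor = np.ones((h, w), dtype=np.float32)
inside = (distance > 0.0) & (distance < radius ** 2)
factor[inside] = np.sin(np.pi * np.sqrt(distance[inside]) / radius / 2) ** -amount
flex_x = np.where(distance >= radius ** 2, x, factor * delta_x / scale_x + center_x).astype(np.float32)
flex_y = np.where(distance >= radius ** 2, y, factor * delta_y / scale_y + center_y).astype(np.float32)
dst = cv.remap(im_cv, flex_x, flex_y, cv.INTER_LINEAR)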
You can do that using implode and explode options in Python Wand, which uses ImageMagick.
Input:
from wand.image import Image
import numpy as np
import cv2
with Image(filename='zelda1.jpg') as img:
    img.virtual_pixel = 'black'
    img.implode(0.5)
    img.save(filename='zelda1_implode.jpg')
    # convert to opencv/numpy array format
    img_implode_opencv = np.array(img)
    img_implode_opencv = cv2.cvtColor(img_implode_opencv, cv2.COLOR_RGB2BGR)

with Image(filename='zelda1.jpg') as img:
    img.virtual_pixel = 'black'
    img.implode(-0.5)
    img.save(filename='zelda1_explode.jpg')
    # convert to opencv/numpy array format
    img_explode_opencv = np.array(img)
    img_explode_opencv = cv2.cvtColor(img_explode_opencv, cv2.COLOR_RGB2BGR)
# display result with opencv
cv2.imshow("IMPLODE", img_implode_opencv)
cv2.imshow("EXPLODE", img_explode_opencv)
cv2.waitKey(0)
Implode:
Explode:
I have two images, one with and the other without an alpha channel. Thus, images A and B have shapes (x, y, 4) and (x, y, 3) respectively.
I want to merge both images into a single tensor using Python, where B is the background and A is the upper image. The final image must have a shape of (x, y, 3). I checked whether scikit-image or cv2 can do this, but I couldn't find any solution.
Here is alpha blending in Python:
import numpy as np
import cv2
alpha = 0.4
img1 = cv2.imread('Desert.jpg')
img2 = cv2.imread('Penguins.jpg')
#r,c,z = img1.shape
out_img = np.zeros(img1.shape, dtype=img1.dtype)
out_img[:,:,:] = (alpha * img1[:,:,:]) + ((1-alpha) * img2[:,:,:])
'''
# if you want to loop over the whole image instead
for y in range(r):
    for x in range(c):
        out_img[y,x,0] = (alpha * img1[y,x,0]) + ((1-alpha) * img2[y,x,0])
        out_img[y,x,1] = (alpha * img1[y,x,1]) + ((1-alpha) * img2[y,x,1])
        out_img[y,x,2] = (alpha * img1[y,x,2]) + ((1-alpha) * img2[y,x,2])
'''
cv2.imshow('Output',out_img)
cv2.waitKey(0)
The above solution works; however, I have a more efficient one that uses image A's own per-pixel alpha channel:
# assuming 8-bit images, normalize the alpha channel to [0, 1]
alpha = A[:,:,3] / 255.0
A1 = A[:,:,:3]
# x and y are the image dimensions from A.shape
C = np.multiply(A1, alpha.reshape(x, y, 1)) + np.multiply(B, 1 - alpha.reshape(x, y, 1))
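A self-contained sketch of the same idea (hypothetical filenames; cv2.IMREAD_UNCHANGED preserves the alpha channel, and both images are assumed to share the same width and height):
import cv2
import numpy as np

A = cv2.imread('overlay.png', cv2.IMREAD_UNCHANGED)  # shape (x, y, 4)
B = cv2.imread('background.jpg')                     # shape (x, y, 3)
assert A.shape[:2] == B.shape[:2]
alpha = A[:, :, 3:4].astype(np.float32) / 255.0      # per-pixel alpha in [0, 1], shape (x, y, 1)
C = (A[:, :, :3] * alpha + B * (1.0 - alpha)).astype(B.dtype)
cv2.imwrite('merged.png', C)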
For my neural network I want to augment my training data by adding small random rotations and zooms to my images. The issue I am having is that scipy changes the size of my images when it applies the rotations and zooms. I need it to just clip the edges if part of the image goes out of bounds. All of my images must be the same size.
def loadImageData(img, distort=False):
    c, fn = img
    img = scipy.ndimage.imread(fn, True)
    if distort:
        img = scipy.ndimage.zoom(img, 1 + 0.05 * rnd(), mode='constant')
        img = scipy.ndimage.rotate(img, 10 * rnd(), mode='constant')
        print(img.shape)
    img = img - np.min(img)
    img = img / np.max(img)
    img = np.reshape(img, (1, *img.shape))
    y = np.zeros(ncats)
    y[c] = 1
    return (img, y)
scipy.ndimage.rotate accepts a reshape= parameter:
reshape : bool, optional
If reshape is true, the output shape is adapted so that the input
array is contained completely in the output. Default is True.
So to "clip" the edges you can simply call scipy.ndimage.rotate(img, ..., reshape=False).
from scipy.ndimage import rotate
from scipy.misc import face
from matplotlib import pyplot as plt

img = face()
rot = rotate(img, 30, reshape=False)

fig, ax = plt.subplots(1, 2)
ax[0].imshow(img)
ax[1].imshow(rot)
plt.show()
Things are more complicated for scipy.ndimage.zoom.
A naive method would be to zoom the entire input array, then use slice indexing and/or zero-padding to make the output the same size as your input. However, in cases where you're increasing the size of the image it's wasteful to interpolate pixels that are only going to get clipped off at the edges anyway.
Instead you could index only the part of the input that will fall within the bounds of the output array before you apply zoom:
import numpy as np
from scipy.ndimage import zoom

def clipped_zoom(img, zoom_factor, **kwargs):
    h, w = img.shape[:2]

    # For multichannel images we don't want to apply the zoom factor to the RGB
    # dimension, so instead we create a tuple of zoom factors, one per array
    # dimension, with 1's for any trailing dimensions after the width and height.
    zoom_tuple = (zoom_factor,) * 2 + (1,) * (img.ndim - 2)

    # Zooming out
    if zoom_factor < 1:
        # Bounding box of the zoomed-out image within the output array
        zh = int(np.round(h * zoom_factor))
        zw = int(np.round(w * zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2

        # Zero-padding
        out = np.zeros_like(img)
        out[top:top+zh, left:left+zw] = zoom(img, zoom_tuple, **kwargs)

    # Zooming in
    elif zoom_factor > 1:
        # Bounding box of the zoomed-in region within the input array
        zh = int(np.round(h / zoom_factor))
        zw = int(np.round(w / zoom_factor))
        top = (h - zh) // 2
        left = (w - zw) // 2

        out = zoom(img[top:top+zh, left:left+zw], zoom_tuple, **kwargs)

        # `out` might still be slightly larger than `img` due to rounding, so
        # trim off any extra pixels at the edges
        trim_top = ((out.shape[0] - h) // 2)
        trim_left = ((out.shape[1] - w) // 2)
        out = out[trim_top:trim_top+h, trim_left:trim_left+w]

    # If zoom_factor == 1, just return the input array
    else:
        out = img

    return out
For example:
zm1 = clipped_zoom(img, 0.5)
zm2 = clipped_zoom(img, 1.5)

fig, ax = plt.subplots(1, 3)
ax[0].imshow(img)
ax[1].imshow(zm1)
ax[2].imshow(zm2)
plt.show()
I recommend using cv2.resize because it is way faster than scipy.ndimage.zoom, probably due to support for simpler interpolation methods.
For a 480x640 image:
cv2.resize takes ~2 ms
scipy.ndimage.zoom takes ~500 ms
scipy.ndimage.zoom(..., order=0) takes ~175 ms
If you are doing the data augmentation on the fly, this amount of speedup is invaluable because it means more experiments in less time.
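To reproduce the comparison yourself, a minimal sketch (timings vary with hardware, image size, and interpolation settings):
import timeit
import cv2
import numpy as np
from scipy.ndimage import zoom

img = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
# note: cv2.resize takes dsize as (width, height)
print(timeit.timeit(lambda: cv2.resize(img, (1280, 960)), number=10))
print(timeit.timeit(lambda: zoom(img, (2, 2, 1)), number=10))
print(timeit.timeit(lambda: zoom(img, (2, 2, 1), order=0), number=10))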
Here is a version of clipped_zoom using cv2.resize:
def cv2_clipped_zoom(img, zoom_factor=0):
    """
    Center zoom in/out of the given image and return an enlarged/shrunken view
    of the image without changing its dimensions.

    Args:
        img : ndarray
            Image array
        zoom_factor : float
            Amount of zoom as a ratio [0 to Inf). Default 0 (no zoom).
    Returns:
        result : ndarray
            numpy ndarray of the same shape as the input img, zoomed by the
            specified factor.
    """
    if zoom_factor == 0:
        return img

    height, width = img.shape[:2]  # it's also the final desired shape
    new_height, new_width = int(height * zoom_factor), int(width * zoom_factor)

    ### Crop only the part that will remain in the result (more efficient)
    # Centered bbox of the final desired size in resized (larger/smaller) image coordinates
    y1, x1 = max(0, new_height - height) // 2, max(0, new_width - width) // 2
    y2, x2 = y1 + height, x1 + width
    bbox = np.array([y1, x1, y2, x2])
    # Map back to original image coordinates
    bbox = (bbox / zoom_factor).astype(int)
    y1, x1, y2, x2 = bbox
    cropped_img = img[y1:y2, x1:x2]

    # Handle padding when downscaling
    resize_height, resize_width = min(new_height, height), min(new_width, width)
    pad_height1, pad_width1 = (height - resize_height) // 2, (width - resize_width) // 2
    pad_height2, pad_width2 = (height - resize_height) - pad_height1, (width - resize_width) - pad_width1
    pad_spec = [(pad_height1, pad_height2), (pad_width1, pad_width2)] + [(0, 0)] * (img.ndim - 2)

    result = cv2.resize(cropped_img, (resize_width, resize_height))
    result = np.pad(result, pad_spec, mode='constant')
    assert result.shape[0] == height and result.shape[1] == width
    return result
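A quick usage sketch (assuming some image array img; both calls preserve the input shape):
zoomed_in = cv2_clipped_zoom(img, 1.2)   # content enlarged by 20%, same output shape
zoomed_out = cv2_clipped_zoom(img, 0.8)  # content shrunk to 80%, zero-padded borders
assert zoomed_in.shape == img.shape == zoomed_out.shape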