The goal is to extract a random 2x5 patch from a 5x10 image, and do so randomly for all images in a batch. Looking to write a faster implementation that avoids for loops. Haven't been able to figure out how to use the torch .gather operation with two index arrays (idx_h and idx_w in code example).
Naive for loop:
import torch
b = 3 # batch size
h = 5 # height
w = 10 # width
crop_border = (3, 5) # number of pixels (height, width) to crop
x = torch.arange(b * h * w).reshape(b, h, w)
print(x)
dh_ = torch.randint(0, crop_border[0], size=(b,))
dw_ = torch.randint(0, crop_border[1], size=(b,))
_dh = h - (crop_border[0] - dh_)
_dw = w - (crop_border[1] - dw_)
idx_h = torch.stack([torch.arange(d_, _d) for d_, _d in zip(dh_, _dh)])
idx_w = torch.stack([torch.arange(d_, _d) for d_, _d in zip(dw_, _dw)])
print(idx_h, idx_w)
new_shape = (b, idx_h.shape[1], idx_w.shape[1])
cropped_x = torch.empty(new_shape)
for batch in range(b):
for height in range(idx_h.shape[1]):
for width in range(idx_w.shape[1]):
cropped_x[batch, height, width] = x[
batch, idx_h[batch, height], idx_w[batch, width]
]
print(cropped_x)
Index arrays needed to be repeated and reshaped to work with gather operation. Fast_crop code based pytorch discussion: https://discuss.pytorch.org/t/similar-to-torch-gather-over-two-dimensions/118827
def fast_crop(x, idx1, idx2):
"""
Compute
x: N x B x V
idx1: N x K matrix where idx1[i, j] is between [0, B)
idx2: N x K matrix where idx2[i, j] is between [0, V)
Return:
cropped: N x K matrix where y[i, j] = x[i, idx1[i,j], idx2[i,j]]
"""
x = x.contiguous()
assert idx1.shape == idx2.shape
lin_idx = idx2 + x.size(-1) * idx1
x = x.view(-1, x.size(1) * x.size(2))
lin_idx = lin_idx.view(-1, lin_idx.shape[1] * lin_idx.shape[2])
cropped = x.gather(-1, lin_idx)
return cropped.reshape(idx1.shape)
idx1 = torch.repeat_interleave(idx_h, idx_w.shape[1]).reshape(new_shape)
idx2 = torch.repeat_interleave(idx_w, idx_h.shape[1], dim=0).reshape(new_shape)
cropped = fast_crop(x, idx1, idx2)
(cropped == cropped_x).all()
Using realistic numbers for b = 100, h = 100, w = 130 and crop_border = (40, 95), a 10 trial run takes the for loop 32s while fast_crop only 0.043s.
I have an image INPUT, a painted CANVAS, and a homography matrix H, this code below will "crop" the CANVAS (bigger than INPUT) which was warped using the homography matrix H to the size of the INPUT. But so far my code is so inefficient that the loop starts to slow down the whole process. The example code below will effectively produce 1280*720 of loops for 3 channels image (I will be dealing with larger channels image/tensor). Is there a way to optimize/vectorize this process? Thanks in advance!
INPUT = np.zeros(720, 1280, 3)
CANVAS = np.zeros(1548, 1104, 3)
H = np.random.uniform(size=(3, 3))
inputIm_shape = INPUT.shape
h, w, c = inputIm_shape
xs, ys, a = [], [], np.zeros((w, h))
# this is where the bottleneck happens
for index, _ in np.ndenumerate(a):
xs.append(index[0]), ys.append(index[1])
input_coords = np.vstack(
(np.array(xs), np.array(ys), np.ones(len(xs))))
transformed = np.matmul(H, input_coords)
transformed[0, :] = np.divide(transformed[0, :], transformed[2, :])
transformed[1, :] = np.divide(transformed[1, :], transformed[2, :])
map_image = np.zeros(inputIm_shape)
hc, wc, cc = CANVAS.shape
badcount = 0
# and this too
for k in range(0, input_coords.shape[1]):
if int(transformed[0, k]) < 0 or int(transformed[0, k]) > wc - 1 or int(transformed[1, k]) < 0 or int(transformed[1, k]) > hc - 1:
badcount += 1
x_input = max(min(int(input_coords[0, k]), w - 1), 0)
y_input = max(min(int(input_coords[1, k]), h - 1), 0)
x_canvas = max(min(int(transformed[0, k]), wc - 1), 0)
y_canvas = max(min(int(transformed[1, k]), hc - 1), 0)
map_image[y_input, x_input] = CANVAS[y_canvas, x_canvas]
A disclaimer first, I'm not very skilled in Python, you guys have my admiration.
My problem:
I need to generate 10k+ images from templates (128px by 128px) with various hues and luminances.
I load the images and turn them into arrays
image = Image.open(dir + "/" + file).convert('RGBA')
arr=np.array(np.asarray(image).astype('float'))
From what I can understand, handling numpy arrays in this fashion is much faster than looping over every pixels and using colorsys.
Now, I've stumbled upon a couple functions to convert rgb to hsv.
This helped me generate my images with different hues, but I also need to play with the brightness so that some can be black, and others white.
def rgb_to_hsv(rgb):
# Translated from source of colorsys.rgb_to_hsv
hsv=np.empty_like(rgb)
hsv[...,3:]=rgb[...,3:]
r,g,b=rgb[...,0],rgb[...,1],rgb[...,2]
maxc = np.max(rgb[...,:2],axis=-1)
minc = np.min(rgb[...,:2],axis=-1)
hsv[...,2] = maxc
hsv[...,1] = (maxc-minc) / maxc
rc = (maxc-r) / (maxc-minc)
gc = (maxc-g) / (maxc-minc)
bc = (maxc-b) / (maxc-minc)
hsv[...,0] = np.select([r==maxc,g==maxc],[bc-gc,2.0+rc-bc],default=4.0+gc-rc)
hsv[...,0] = (hsv[...,0]/6.0) % 1.0
idx=(minc == maxc)
hsv[...,0][idx]=0.0
hsv[...,1][idx]=0.0
return hsv
def hsv_to_rgb(hsv):
# Translated from source of colorsys.hsv_to_rgb
rgb=np.empty_like(hsv)
rgb[...,3:]=hsv[...,3:]
h,s,v=hsv[...,0],hsv[...,1],hsv[...,2]
i = (h*6.0).astype('uint8')
f = (h*6.0) - i
p = v*(1.0 - s)
q = v*(1.0 - s*f)
t = v*(1.0 - s*(1.0-f))
i = i%6
conditions=[s==0.0,i==1,i==2,i==3,i==4,i==5]
rgb[...,0]=np.select(conditions,[v,q,p,p,t,v],default=v)
rgb[...,1]=np.select(conditions,[v,v,v,q,p,p],default=t)
rgb[...,2]=np.select(conditions,[v,p,t,v,v,q],default=p)
return rgb
How easy is it to modify these functions to convert to and from HSL?
Any trick to convert HSV to HSL?
Any info you can give me is greatly appreciated, thanks!
Yes, numpy, namely the vectorised code, can speed-up color conversions.
The more, for massive production of 10k+ bitmaps, you may want to re-use a ready made professional conversion, or sub-class it, if it is not exactly matching your preferred Luminance model.
a Computer Vision library OpenCV, currently available for python as a cv2 module, can take care of the colorsystem conversion without any additional coding just with:
a ready-made conversion one-liner
out = cv2.cvtColor( anInputFRAME, cv2.COLOR_YUV2BGR ) # a bitmap conversion
A list of some color-systems available in cv2 ( you may notice RGB to be referred to as BRG due to OpenCV convention of a different ordering of an image's Blue-Red-Green color-planes ),
( symmetry applies COLOR_YCR_CB2BGR <-|-> COLOR_BGR2YCR_CB not all pairs shown )
>>> import cv2
>>> for key in dir( cv2 ): # show all ready conversions
... if key[:7] == 'COLOR_Y':
... print key
COLOR_YCR_CB2BGR
COLOR_YCR_CB2RGB
COLOR_YUV2BGR
COLOR_YUV2BGRA_I420
COLOR_YUV2BGRA_IYUV
COLOR_YUV2BGRA_NV12
COLOR_YUV2BGRA_NV21
COLOR_YUV2BGRA_UYNV
COLOR_YUV2BGRA_UYVY
COLOR_YUV2BGRA_Y422
COLOR_YUV2BGRA_YUNV
COLOR_YUV2BGRA_YUY2
COLOR_YUV2BGRA_YUYV
COLOR_YUV2BGRA_YV12
COLOR_YUV2BGRA_YVYU
COLOR_YUV2BGR_I420
COLOR_YUV2BGR_IYUV
COLOR_YUV2BGR_NV12
COLOR_YUV2BGR_NV21
COLOR_YUV2BGR_UYNV
COLOR_YUV2BGR_UYVY
COLOR_YUV2BGR_Y422
COLOR_YUV2BGR_YUNV
COLOR_YUV2BGR_YUY2
COLOR_YUV2BGR_YUYV
COLOR_YUV2BGR_YV12
COLOR_YUV2BGR_YVYU
COLOR_YUV2GRAY_420
COLOR_YUV2GRAY_I420
COLOR_YUV2GRAY_IYUV
COLOR_YUV2GRAY_NV12
COLOR_YUV2GRAY_NV21
COLOR_YUV2GRAY_UYNV
COLOR_YUV2GRAY_UYVY
COLOR_YUV2GRAY_Y422
COLOR_YUV2GRAY_YUNV
COLOR_YUV2GRAY_YUY2
COLOR_YUV2GRAY_YUYV
COLOR_YUV2GRAY_YV12
COLOR_YUV2GRAY_YVYU
COLOR_YUV2RGB
COLOR_YUV2RGBA_I420
COLOR_YUV2RGBA_IYUV
COLOR_YUV2RGBA_NV12
COLOR_YUV2RGBA_NV21
COLOR_YUV2RGBA_UYNV
COLOR_YUV2RGBA_UYVY
COLOR_YUV2RGBA_Y422
COLOR_YUV2RGBA_YUNV
COLOR_YUV2RGBA_YUY2
COLOR_YUV2RGBA_YUYV
COLOR_YUV2RGBA_YV12
COLOR_YUV2RGBA_YVYU
COLOR_YUV2RGB_I420
COLOR_YUV2RGB_IYUV
COLOR_YUV2RGB_NV12
COLOR_YUV2RGB_NV21
COLOR_YUV2RGB_UYNV
COLOR_YUV2RGB_UYVY
COLOR_YUV2RGB_Y422
COLOR_YUV2RGB_YUNV
COLOR_YUV2RGB_YUY2
COLOR_YUV2RGB_YUYV
COLOR_YUV2RGB_YV12
COLOR_YUV2RGB_YVYU
COLOR_YUV420P2BGR
COLOR_YUV420P2BGRA
COLOR_YUV420P2GRAY
COLOR_YUV420P2RGB
COLOR_YUV420P2RGBA
COLOR_YUV420SP2BGR
COLOR_YUV420SP2BGRA
COLOR_YUV420SP2GRAY
COLOR_YUV420SP2RGB
COLOR_YUV420SP2RGBA
I did some prototyping for Luminance conversions ( based on >>> http://en.wikipedia.org/wiki/HSL_and_HSV )
But not tested for release.
def get_YUV_V_Cr_Rec601_BRG_frame( brgFRAME ): # For the Rec. 601 primaries used in gamma-corrected sRGB, fast, VECTORISED MUL/ADD CODE
out = numpy.zeros( brgFRAME.shape[0:2] )
out += 0.615 / 255 * brgFRAME[:,:,1] # // Red # normalise to <0.0 - 1.0> before vectorised MUL/ADD, saves [usec] ... on 480x640 [px] faster goes about 2.2 [msec] instead of 5.4 [msec]
out -= 0.515 / 255 * brgFRAME[:,:,2] # // Green
out -= 0.100 / 255 * brgFRAME[:,:,0] # // Blue # normalise to <0.0 - 1.0> before vectorised MUL/ADD
return out
# -*- coding: utf-8 -*-
# #File : rgb2hls.py
# #Info : # TSMC
# #Desc :
import colorsys
import numpy as np
import scipy.misc
import tensorflow as tf
from PIL import Image
def rgb2hls(img):
""" note: elements in img is a float number less than 1.0 and greater than 0.
:param img: an numpy ndarray with shape NHWC
:return:
"""
assert len(img.shape) == 3
hue = np.zeros_like(img[:, :, 0])
luminance = np.zeros_like(img[:, :, 0])
saturation = np.zeros_like(img[:, :, 0])
for x in range(height):
for y in range(width):
r, g, b = img[x, y]
h, l, s = colorsys.rgb_to_hls(r, g, b)
hue[x, y] = h
luminance[x, y] = l
saturation[x, y] = s
return hue, luminance, saturation
def np_rgb2hls(img):
r, g, b = img[:, :, 0], img[:, :, 1], img[:, :, 2]
maxc = np.max(img, -1)
minc = np.min(img, -1)
l = (minc + maxc) / 2.0
if np.array_equal(minc, maxc):
return np.zeros_like(l), l, np.zeros_like(l)
smask = np.greater(l, 0.5).astype(np.float32)
s = (1.0 - smask) * ((maxc - minc) / (maxc + minc)) + smask * ((maxc - minc) / (2.001 - maxc - minc))
rc = (maxc - r) / (maxc - minc + 0.001)
gc = (maxc - g) / (maxc - minc + 0.001)
bc = (maxc - b) / (maxc - minc + 0.001)
rmask = np.equal(r, maxc).astype(np.float32)
gmask = np.equal(g, maxc).astype(np.float32)
rgmask = np.logical_or(rmask, gmask).astype(np.float32)
h = rmask * (bc - gc) + gmask * (2.0 + rc - bc) + (1.0 - rgmask) * (4.0 + gc - rc)
h = np.remainder(h / 6.0, 1.0)
return h, l, s
def tf_rgb2hls(img):
""" note: elements in img all in [0,1]
:param img: a tensor with shape NHWC
:return:
"""
assert img.get_shape()[-1] == 3
r, g, b = img[:, :, 0], img[:, :, 1], img[:, :, 2]
maxc = tf.reduce_max(img, -1)
minc = tf.reduce_min(img, -1)
l = (minc + maxc) / 2.0
# if tf.reduce_all(tf.equal(minc, maxc)):
# return tf.zeros_like(l), l, tf.zeros_like(l)
smask = tf.cast(tf.greater(l, 0.5), tf.float32)
s = (1.0 - smask) * ((maxc - minc) / (maxc + minc)) + smask * ((maxc - minc) / (2.001 - maxc - minc))
rc = (maxc - r) / (maxc - minc + 0.001)
gc = (maxc - g) / (maxc - minc + 0.001)
bc = (maxc - b) / (maxc - minc + 0.001)
rmask = tf.equal(r, maxc)
gmask = tf.equal(g, maxc)
rgmask = tf.cast(tf.logical_or(rmask, gmask), tf.float32)
rmask = tf.cast(rmask, tf.float32)
gmask = tf.cast(gmask, tf.float32)
h = rmask * (bc - gc) + gmask * (2.0 + rc - bc) + (1.0 - rgmask) * (4.0 + gc - rc)
h = tf.mod(h / 6.0, 1.0)
h = tf.expand_dims(h, -1)
l = tf.expand_dims(l, -1)
s = tf.expand_dims(s, -1)
x = tf.concat([tf.zeros_like(l), l, tf.zeros_like(l)], -1)
y = tf.concat([h, l, s], -1)
return tf.where(condition=tf.reduce_all(tf.equal(minc, maxc)), x=x, y=y)
if __name__ == '__main__':
"""
HLS: Hue, Luminance, Saturation
H: position in the spectrum
L: color lightness
S: color saturation
"""
avatar = Image.open("hue.jpg")
width, height = avatar.size
print("width: {}, height: {}".format(width, height))
img = np.array(avatar)
img = img / 255.0
print(img.shape)
# # hue, luminance, saturation = rgb2hls(img)
# hue, luminance, saturation = np_rgb2hls(img)
img_tensor = tf.convert_to_tensor(img, tf.float32)
hls = tf_rgb2hls(img_tensor)
h, l, s = hls[:, :, 0], hls[:, :, 1], hls[:, :, 2]
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
hue, luminance, saturation = sess.run([h, l, s])
scipy.misc.imsave("hls_h_.jpg", hue)
scipy.misc.imsave("hls_l_.jpg", luminance)
scipy.misc.imsave("hls_s_.jpg", saturation)
In case someone is looking for a self-contained solution (I really didn't want to add OpenCV as a dependency), I rewrote the official python colorsys rgb_to_hls() and hls_to_rgb() functions to be usable for numpy:
import numpy as np
def rgb_to_hls(rgb_array: np.ndarray) -> np.ndarray:
"""
Expects an array of shape (X, 3), each row being RGB colours.
Returns an array of same size, each row being HLS colours.
Like `colorsys` python module, all values are between 0 and 1.
NOTE: like `colorsys`, this uses HLS rather than the more usual HSL
"""
assert rgb_array.ndim == 2
assert rgb_array.shape[1] == 3
assert np.max(rgb_array) <= 1
assert np.min(rgb_array) >= 0
r, g, b = rgb_array.T.reshape((3, -1, 1))
maxc = np.max(rgb_array, axis=1).reshape((-1, 1))
minc = np.min(rgb_array, axis=1).reshape((-1, 1))
sumc = (maxc+minc)
rangec = (maxc-minc)
with np.errstate(divide='ignore', invalid='ignore'):
rgb_c = (maxc - rgb_array) / rangec
rc, gc, bc = rgb_c.T.reshape((3, -1, 1))
h = (np.where(minc == maxc, 0, np.where(r == maxc, bc - gc, np.where(g == maxc, 2.0+rc-bc, 4.0+gc-rc)))
/ 6) % 1
l = sumc/2.0
with np.errstate(divide='ignore', invalid='ignore'):
s = np.where(minc == maxc, 0,
np.where(l < 0.5, rangec / sumc, rangec / (2.0-sumc)))
return np.concatenate((h, l, s), axis=1)
def hls_to_rgb(hls_array: np.ndarray) -> np.ndarray:
"""
Expects an array of shape (X, 3), each row being HLS colours.
Returns an array of same size, each row being RGB colours.
Like `colorsys` python module, all values are between 0 and 1.
NOTE: like `colorsys`, this uses HLS rather than the more usual HSL
"""
ONE_THIRD = 1 / 3
TWO_THIRD = 2 / 3
ONE_SIXTH = 1 / 6
def _v(m1, m2, h):
h = h % 1.0
return np.where(h < ONE_SIXTH, m1 + (m2 - m1) * h * 6,
np.where(h < .5, m2,
np.where(h < TWO_THIRD, m1 + (m2 - m1) * (TWO_THIRD - h) * 6,
m1)))
assert hls_array.ndim == 2
assert hls_array.shape[1] == 3
assert np.max(hls_array) <= 1
assert np.min(hls_array) >= 0
h, l, s = hls_array.T.reshape((3, -1, 1))
m2 = np.where(l < 0.5, l * (1 + s), l + s - (l * s))
m1 = 2 * l - m2
r = np.where(s == 0, l, _v(m1, m2, h + ONE_THIRD))
g = np.where(s == 0, l, _v(m1, m2, h))
b = np.where(s == 0, l, _v(m1, m2, h - ONE_THIRD))
return np.concatenate((r, g, b), axis=1)
def _test1():
import colorsys
rgb_array = np.array([[.5, .5, .8], [.3, .7, 1], [0, 0, 0], [1, 1, 1], [.5, .5, .5]])
hls_array = rgb_to_hls(rgb_array)
for rgb, hls in zip(rgb_array, hls_array):
assert np.all(abs(np.array(colorsys.rgb_to_hls(*rgb) - hls) < 0.001))
new_rgb_array = hls_to_rgb(hls_array)
for hls, rgb in zip(hls_array, new_rgb_array):
assert np.all(abs(np.array(colorsys.hls_to_rgb(*hls) - rgb) < 0.001))
assert np.all(abs(rgb_array - new_rgb_array) < 0.001)
print("tests part 1 done")
def _test2():
import colorsys
hls_array = np.array([[0.6456692913385826, 0.14960629921259844, 0.7480314960629921], [.3, .7, 1], [0, 0, 0], [0, 1, 0], [.5, .5, .5]])
rgb_array = hls_to_rgb(hls_array)
for hls, rgb in zip(hls_array, rgb_array):
assert np.all(abs(np.array(colorsys.hls_to_rgb(*hls) - rgb) < 0.001))
new_hls_array = rgb_to_hls(rgb_array)
for rgb, hls in zip(rgb_array, new_hls_array):
assert np.all(abs(np.array(colorsys.rgb_to_hls(*rgb) - hls) < 0.001))
assert np.all(abs(hls_array - new_hls_array) < 0.001)
print("All tests done")
def _test():
_test1()
_test2()
if __name__ == "__main__":
_test()
(see gist)
(off topic: converting the other functions in the same way is actually a great training for someone wanting to get their hands dirty with numpy (or other SIMD / GPU) programming). Let me know if you do so :)
edit: rgb_to_hsv and hsv_to_rgb now also in the gist.
I need to blur an image by taking a kernel K and averaging the values in the 2D array and setting the center value to the average of K. Here is the code I have written to do so...
def Clamp(pix):
pix = abs(pix)
if pix > 255:
pix = 255
return pix
def Convolve2D(image1, K, image2):
img = graphics.Image(graphics.Point(0, 0), image1)
img.save(image2)
secondimage=graphics.Image(graphics.Point(0,0),image2)
h = img.getHeight()
w = img.getWidth()
A = [[0]*h for y in range(w)]
B = [[0]*w for x in range(h)]
#iterate over all rows (ignore 1-pixel borders)
for v in range(1, h-3):
graphics.update() # this updates the output for each row
# for every row, iterate over all columns (again ignore 1-pixel borders)
for u in range(1, w-3):
#A[u][v] = 0
#B[u][v] = 0
# for every pixel, iterate over region of overlap between
# input image and 3x3 kernel centered at current pixel
for i in range (0, 3):
for j in range (0, 3):
A[u][v] = A[u][v] + B[v+i][u+j] * K[i][j]
r, g, b = img.getPixel(u, v)
if (r * A[u][v] >= 255):
Clamp(r)
else:
r = r * A[u][v]
if (g * A[u][v] >= 255):
Clamp(g)
else:
g = g * A[u][v]
if (b * A[u][v] >= 255):
Clamp(b)
else:
b = b * A[u][v]
newcolor = graphics.color_rgb(r, g, b)
secondimage.setPixel(u, v , newcolor)
print("Not yet implemented") # to be removed
secondimage.save(image2)
secondimage.move(secondimage.getWidth()/2, secondimage.getHeight()/2)
win = graphics.GraphWin(secondimage, secondimage.getWidth(), secondimage.getHeight())
secondimage.draw(win)
def Blur3(image1, image2):
K = [[1/9, 1/9, 1/9], [1/9, 1/9, 1/9], [1/9, 1/9, 1/9]]
return Convolve2D(image1, K, image2)
This is the image I am trying to blur
This is what comes out of my code
is it possibly my if and else statements and the clamp function that is doing this? I just want a blurred image to come out like this
do this :
for v in range(h):
graphics.update() # this updates the output for each row
for u in range(w):
for i in range (0, 3):
for j in range (0, 3):
if v-i>=0 and u-j>=0 and v+i<=256 and u+j<=256 :
img[u][v] = img[u][v] + img[v-i][u-j] * K[i][j]
this should work!
Can you please tell me why you have two images A, B and blur image A using B? I mean :
A[u][v] = A[u][v] + B[v+i][u+j] * K[i][j]
Here I add a code which works for gray scale image,you can expand it for your need!
import matplotlib.image as mpimg
Img=mpimg.imread('GrayScaleImg.jpg')
kernel=towDimGuassKernel(size)
def conv2D(I,kernel):
filterWidth=kernel.shape[0]
half=filterWidth/2
bluredImg=np.zeros(I.shape)
for imgRow in range(I.shape[0]):
print imgRow
for imgCol in range(I.shape[1]):
for filterRow in range(filterWidth):
for filterCol in range(filterWidth):
if imgRow-filterRow>=0 and imgCol-filterCol>=0 and imgRow+filterRow<=256 and imgCol+filterCol<=256 :
bluredImg[imgRow,imgCol]+=I[imgRow-filterRow,imgCol-filterCol]*kernel[filterRow,filterCol]
return bluredImg
You've initialized A and B to empty lists with a size of 0. You need to initialize them instead to be the size of the image, in both dimensions.
A = [[0]*w for y in range(h)]
Edit: Your second problem is that you're defining the kernel with 1/9 which is an integer division yielding 0.