I'm trying to convert an RGB image to a gray-value image of the same size (with values between 0 and 1). The mapping is done by a dictionary called MASK_LUT_IDX, which takes a tuple (RGB) and returns the corresponding value. The current code is 2x faster than before, but still takes 1.5s (according to timeit), which is proving to be an issue.
import numpy as np
def quickConv(numpy_triple):
    return MASK_LUT_IDX[tuple(numpy_triple)]
def ImageSegmenter(masked_img):
    rgb_tuples = np.array(masked_img.getdata(), dtype=tuple)
    class_idxs = np.apply_along_axis(quickConv, 1, rgb_tuples)
    return np.array(class_idxs).reshape(masked_img.size[1], masked_img.size[0])
class_img = ImageSegmenter(masked_img)
Is there a better way of converting this? I've looked into the palette functionalities, but it doesn't seem to quite fit my needs.
Thanks to Cris Luengo's help, here is a sped up version of the function.
def ImageSegmenter(masked_img):
    masked_img = np.array(masked_img)
    class_idxs = FASTER_MASK_LUT_IDX[masked_img[:,:,0], masked_img[:,:,1], masked_img[:,:,2]]
    return class_idxs
Here FASTER_MASK_LUT_IDX is a 3D lookup table, indexed by red, green and blue, built as follows:
FASTER_MASK_LUT_IDX = np.zeros((256,256,256))
for idx, label in zip(CLASS_IDX, CLASS_LABELS):
    red_idx = RGB_CLASS_MAPPING[label]['R']
    green_idx = RGB_CLASS_MAPPING[label]['G']
    blue_idx = RGB_CLASS_MAPPING[label]['B']
    FASTER_MASK_LUT_IDX[red_idx, green_idx, blue_idx] = idx/NUM_CLASSES
RGB_CLASS_MAPPING maps each class label to its RGB value; it was unrolled with enumerate in a list comprehension to create CLASS_IDX and CLASS_LABELS.
CLASS_IDX, CLASS_LABELS = zip(*[(idx, label) for idx, label in enumerate(RGB_CLASS_MAPPING)])
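For readers who want to try the lookup-table idea in isolation, here is a minimal, self-contained sketch. The three class colours in RGB_TO_CLASS and the helper name segment are made up for illustration and are not from the post above; float32 is an assumption chosen only to halve the table's memory footprint.

```python
import numpy as np

# Hypothetical class colours, for illustration only.
RGB_TO_CLASS = {(0, 0, 0): 0, (255, 0, 0): 1, (0, 255, 0): 2}
NUM_CLASSES = len(RGB_TO_CLASS)

# Build the table once. float32 needs 64 MiB; the float64 default needs 128 MiB.
lut = np.zeros((256, 256, 256), dtype=np.float32)
for (r, g, b), idx in RGB_TO_CLASS.items():
    lut[r, g, b] = idx / NUM_CLASSES

def segment(rgb):
    """Map an (H, W, 3) uint8 array to an (H, W) array of class values."""
    # Each channel acts as an index array; NumPy looks up all pixels at once.
    return lut[rgb[..., 0], rgb[..., 1], rgb[..., 2]]

img = np.array([[[0, 0, 0], [255, 0, 0]],
                [[0, 255, 0], [255, 0, 0]]], dtype=np.uint8)
result = segment(img)
```

Colours that are missing from the mapping silently end up as 0 here, so it may be worth checking the input for unexpected colours first.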
Related
I would like to downsample a 3d array by taking the most frequent value (mode) of the original values. After some research, I found the block_reduce function in the skimage library. For example, if I wanted to take the average of each block, I can do it easily:
import numpy as np
from skimage.measure import block_reduce
image = np.arange(4*4*4).reshape(4, 4, 4)
new_image = block_reduce(image, block_size=(2,2,2), func=np.mean, cval=np.mean(image))
In my case, I want to pass the func argument a mode function. However, numpy doesn't have a mode function. According to the documentation, the passed function should accept 'axis' as an argument. I tried some workarounds such as writing my own function and combining np.unique and np.argmax, as well as passing scipy.stats.mode as the function. All of them failed.
I wrote some nested for loops to do this, but they are far too slow with large arrays. Is there an easy way to do this? I don't necessarily need to use the scikit-image library.
Thank you in advance.
Let's start with the assumption that the input image shape is divisible by block_size, i.e. corresponding shape dimensions are divisible by each size parameter of block_size.
So, as pre-processing, we need to make blocks out of the input image, like so -
def blockify(image, block_size):
    shp = image.shape
    out_shp = [s//b for s,b in zip(shp, block_size)]
    reshape_shp = np.c_[out_shp,block_size].ravel()
    nC = np.prod(block_size)
    return image.reshape(reshape_shp).transpose(0,2,4,1,3,5).reshape(-1,nC)
Next up, for our specific case of mode finding, we will use bincount2D_vectorized along with argmax -
# https://stackoverflow.com/a/46256361/ #Divakar
def bincount2D_vectorized(a):
    N = a.max()+1
    a_offs = a + np.arange(a.shape[0])[:,None]*N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0]*N).reshape(-1,N)
out = bincount2D_vectorized(blockify(image, block_size=(2,2,2))).argmax(1)
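To make the two steps easier to check by hand, here is a small end-to-end sketch on a 2D array. The function name block_mode is hypothetical; it generalises the fixed transpose(0,2,4,1,3,5) to any number of dimensions and reshapes the result back to the block grid. Both are additions for illustration, and the sketch assumes non-negative integer values (a requirement of np.bincount).

```python
import numpy as np

def block_mode(image, block_size):
    """Most frequent value per block; shape must be divisible by block_size."""
    out_shape = [s // b for s, b in zip(image.shape, block_size)]
    # Split every axis into (number of blocks, block length) ...
    split = np.c_[out_shape, block_size].ravel()
    n = image.ndim
    # ... then move the block-length axes to the end and flatten each block.
    axes = list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2))
    blocks = image.reshape(split).transpose(axes)
    blocks = blocks.reshape(-1, int(np.prod(block_size)))
    # Offset each row into its own value range so one bincount counts them all.
    n_vals = int(blocks.max()) + 1
    offs = blocks + np.arange(blocks.shape[0])[:, None] * n_vals
    counts = np.bincount(offs.ravel(), minlength=blocks.shape[0] * n_vals)
    return counts.reshape(-1, n_vals).argmax(1).reshape(out_shape)

image = np.array([[1, 1, 2, 3],
                  [1, 0, 3, 3],
                  [4, 4, 5, 5],
                  [4, 2, 5, 5]])
result = block_mode(image, (2, 2))
```

Because argmax returns the first maximum, ties are resolved in favour of the smallest value in the block.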
Alternatively, we can use n-dim mode -
out = mode(blockify(image, block_size=(2,2,2)), axis=1)[0]
Finally, if the initial assumption of divisibility doesn't hold true, we need to pad with the appropriate pad value. For the same, we can use np.pad, as part of the blockify method.
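A possible way to do that padding step is sketched below. The helper name pad_to_multiple and the choice of padding only at the end of each axis are assumptions, not part of the answer above; the pad value also takes part in the mode count, so it should be chosen with that in mind.

```python
import numpy as np

def pad_to_multiple(image, block_size, pad_value=0):
    """Pad the end of each axis so its length is a multiple of the block size."""
    # (-s) % b is the number of elements missing to reach the next multiple.
    pad_width = [(0, (-s) % b) for s, b in zip(image.shape, block_size)]
    return np.pad(image, pad_width, mode="constant", constant_values=pad_value)

image = np.arange(3 * 5 * 4).reshape(3, 5, 4)
padded = pad_to_multiple(image, (2, 2, 2))
```

Here the axes of length 3 and 5 each gain one element, while the axis of length 4 is left unchanged.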
I am working on some code for converting an image to the palette of the NES. My current code is somewhat successful, but very very slow.
I am doing it by using Pythagoras' theorem. I'm using the RGB colour values as coordinates in 3D space and doing it that way. The colour in the palette with the smallest distance from the pixel's RGB is the colour that gets used.
import time
from PIL import Image
class image_filter():
    def load(self, path):
        self.i = Image.open(path)
        self.i = self.i.convert("RGB")
        self.pix = self.i.load()
    def colour_filter(self, colours=NES):
        start = time.time()
        for y in range(self.i.size[1]):
            for x in range(self.i.size[0]):
                pixel = list(self.pix[x,y])
                distances = []
                for colour in colours:
                    # Squared Euclidean distance in RGB space
                    distance = ((colour[0]-pixel[0])**2)+((colour[1]-pixel[1])**2)+((colour[2]-pixel[2])**2)
                    distances.append(distance)
                # Pick the palette colour with the smallest distance
                pixel = colours[distances.index(min(distances))]
                self.pix[x,y] = tuple(pixel)
        print("Took "+str(time.time()-start)+" seconds.")
f = image_filter()
f.load("C:\\path\\to\\image.png")
f.colour_filter()
f.i.save("C:\\path\\to\\new\\image.png")
Using the list:
NES = [(124,124,124),(0,0,252),
(0,0,188),(68,40,188),
(148,0,132),(168,0,32),
(168,16,0),(136,20,0),
(80,48,0),(0,120,0),
(0,104,0),(0,88,0),
(0,64,88),(0,0,0),
(188,188,188),(0,120,248),
(0,88,248),(104,68,252),
(216,0,204),(228,0,88),
(248,56,0),(228,92,16),
(172,124,0),(0,184,0),
(0,168,0),(0,168,68),
(0,136,136),(248,248,248),
(60,188,252),(104,136,252),
(152,120,248),
(248,120,248),(248,88,152),
(248,120,88),(252,160,68),
(184,248,24),(88,216,84),
(88,248,152),(0,232,216),
(120,120,120),(252,252,252),(164,228,252),
(184,184,248),(216,184,248),
(248,184,248),(248,164,192),
(240,208,176),(252,224,168),
(248,216,120),(216,248,120),
(184,248,184),(184,248,216),
(0,252,252),(216,216,216)]
This produces the following Input:
and Output:
This takes between 14 and 20 seconds, which is much too long for its intended application. Does anyone know of any ways to greatly speed this up?
As an idea, I was thinking it may be possible to use numpy arrays for this; however I am not at all familiar enough with numpy arrays to be able to pull it off.
If possible, I would also like to try avoiding using scipy -- I know that, at least under Windows, it can be a pain to install and would prefer to steer clear.
Approach #1 : We could use Scipy's cdist to get the euclidean distances and then look for the min distance arg and thus select the appropriate colour.
Thus, with NumPy arrays as the inputs, we would have an implementation like so -
from scipy.spatial.distance import cdist
out = colours[cdist(pix.reshape(-1,3),colours).argmin(1)].reshape(pix.shape)
Approach #2 : Here's another approach with broadcasting and np.einsum -
subs = pix - colours[:,None,None]
out = colours[np.einsum('ijkl,ijkl->ijk',subs,subs).argmin(0)]
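As a small worked example of the broadcasting idea that needs only NumPy, consider the sketch below. The three-colour palette, the pixel values and the function name nearest_palette are made up for illustration. The cast to a signed integer type is a precaution: if both arrays were uint8, the subtraction would wrap around instead of going negative.

```python
import numpy as np

def nearest_palette(pix, colours):
    """Replace every pixel of an (H, W, 3) image by its nearest palette colour."""
    pix = np.asarray(pix, dtype=np.int64)          # signed, so differences can be negative
    colours = np.asarray(colours, dtype=np.int64)  # (N, 3)
    diff = pix[:, :, None, :] - colours[None, None, :, :]  # (H, W, N, 3)
    dist2 = np.einsum('hwnc,hwnc->hwn', diff, diff)        # squared distances
    return colours[dist2.argmin(axis=2)].astype(np.uint8)

palette = [(0, 0, 0), (255, 255, 255), (248, 56, 0)]
pix = np.array([[[10, 10, 10], [250, 240, 245]],
                [[200, 60, 20], [100, 100, 100]]], dtype=np.uint8)
out = nearest_palette(pix, palette)
```

The intermediate array has H x W x N x 3 entries, so for large images it may be necessary to process the image in strips to keep memory use reasonable.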
Interfacing between PIL/lists and NumPy arrays
To accept images read through PIL, use :
pix = np.asarray(Image.open('input_filename'))
To Use colours as array :
colours = np.asarray(NES)
# .... Use one of the listed approaches and get out as output array
To output the image :
i = Image.fromarray(out.astype('uint8'),'RGB')
i.save("output_filename")
Sample input, output using given colour palette NES -
I am working with some image processing routines, using binary images. In Matlab I can create a lookup table which provides the output for every possible 2^9=512 configurations of 3 x 3 neighbourhoods. That is, I can write a function func which produces a 0 or 1 for such a neighbourhood, and then create a lookup table with
lut = makelut(func,3)
(the "3" indicating the size of neighbourhood). Then that lookup table can be applied to my binary image im with
applylut(im, lut)
But how can I do the same thing in Python? There is an example given here:
http://pydoc.net/Python/scikits-image/0.4.2/skimage.morphology.skeletonize/
which certainly works, but seems very complicated, at least compared to Matlab's commands.
The filters defined in scipy.ndimage may be of use to you. If none of the pre-defined filters match your intent, you can apply a custom filter using scipy.ndimage.generic_filter.
For example, you can reproduce the result shown on the Mathworks applylut doc page with:
import numpy as np
import scipy.ndimage as ndimage
from PIL import Image
filename = '/tmp/PerformErosionUsingA2by2NeighborhoodExample_01.png'
img = Image.open(filename).convert('L')
arr = np.array(img)
def func(x):
    return (x==255).all()*255
arr2 = ndimage.generic_filter(arr, func, size=(2,2))
new_img = Image.fromarray(arr2.astype('uint8'), 'L')
new_img.save('/tmp/out.png')
PerformErosionUsingA2by2NeighborhoodExample_01.png:
out.png:
Note that in this case, ndimage.grey_erosion can produce the same result, and
since it is not calling a Python function once for every pixel, it's also a lot
faster:
arr3 = ndimage.grey_erosion(arr, size=(2,2))
print(np.allclose(arr2,arr3))
# True
Depending on the kind of computation you wish to perform in func, another faster alternative may be to express the result as a NumPy computation on slices. For example, the above grey_erosion could also be expressed as
arr4 = np.pad(arr.astype(bool), ((1,0),(1,0)), 'reflect')
arr4 = arr4[:-1,:-1] & arr4[1:,:-1] & arr4[:-1,1:] & arr4[1:,1:]
arr4 = arr4.astype('uint8')*255
assert np.allclose(arr3, arr4)
Again this is much faster than using generic_filter since here the computation is being performed on whole arrays rather than pixel-by-pixel.
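For the lookup-table workflow itself, a rough NumPy-only analogue of makelut/applylut for 3 x 3 neighbourhoods could look like the sketch below. The names make_lut and apply_lut are hypothetical, the bit ordering (row-major, top-left pixel as the lowest bit) is a convention chosen here and is not claimed to match Matlab's, and pixels outside the image are assumed to be 0.

```python
import numpy as np

def make_lut(func):
    """Evaluate func on all 2**9 = 512 binary 3x3 neighbourhoods."""
    lut = np.zeros(512, dtype=np.uint8)
    for code in range(512):
        bits = [(code >> k) & 1 for k in range(9)]
        lut[code] = func(np.array(bits, dtype=np.uint8).reshape(3, 3))
    return lut

def apply_lut(im, lut):
    """Encode each pixel's 3x3 neighbourhood as a 9-bit index and look it up."""
    im = np.asarray(im)
    h, w = im.shape
    padded = np.pad(im.astype(np.int64), 1, mode="constant")  # zero border
    index = np.zeros((h, w), dtype=np.int64)
    for k in range(9):
        dy, dx = divmod(k, 3)  # position of bit k inside the neighbourhood
        index += padded[dy:dy + h, dx:dx + w] << k
    return lut[index]

# Example: binary erosion with a full 3x3 structuring element.
im = np.zeros((5, 5), dtype=np.uint8)
im[1:4, 1:4] = 1
erode = make_lut(lambda nb: nb.all())
eroded = apply_lut(im, erode)
```

The table is built with 512 Python calls once, after which applying it is a handful of whole-array operations, which is the same trade-off the slicing example above relies on.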
I'm trying to improve the speed of a function that calculates the normalized cross-correlation between a search image and a template image by using the anfft module, which provides Python bindings for the FFTW C library and seems to be ~2-3x quicker than scipy.fftpack for my purposes.
When I take the FFT of my template, I need the result to be padded to the same size as my search image so that I can convolve them. Using scipy.fftpack.fftn I would just use the shape parameter to do padding/truncation, but anfft.fftn is more minimalistic and doesn't do any zero-padding itself.
When I try and do the zero padding myself, I get a very different result to what I get using shape. This example uses just scipy.fftpack, but I have the same problem with anfft:
import numpy as np
from scipy.fftpack import fftn
from scipy.misc import lena
img = lena()
temp = img[240:281,240:281]
def procrustes(a,target,padval=0):
    # Forces an array to a target size by either padding it with a constant or
    # truncating it
    b = np.ones(target,a.dtype)*padval
    aind = [slice(None,None)]*a.ndim
    bind = [slice(None,None)]*a.ndim
    for dd in range(a.ndim):
        if a.shape[dd] > target[dd]:
            diff = (a.shape[dd]-b.shape[dd])/2.
            aind[dd] = slice(int(np.floor(diff)),a.shape[dd]-int(np.ceil(diff)))
        elif a.shape[dd] < target[dd]:
            diff = (b.shape[dd]-a.shape[dd])/2.
            bind[dd] = slice(int(np.floor(diff)),b.shape[dd]-int(np.ceil(diff)))
    # Index with tuples; newer NumPy versions reject lists of slices
    b[tuple(bind)] = a[tuple(aind)]
    return b
# using scipy.fftpack.fftn's shape parameter
F1 = fftn(temp,shape=img.shape)
# doing my own zero-padding
temp_padded = procrustes(temp,img.shape)
F2 = fftn(temp_padded)
# these results are quite different
np.allclose(F1,F2)
I suspect I'm probably making a very basic mistake, since I'm not overly familiar with the discrete Fourier transform.
Just do the inverse transform and you'll see that scipy does slightly different padding (only to top and right edges):
plt.imshow(ifftn(fftn(procrustes(temp,img.shape))).real)
plt.imshow(ifftn(fftn(temp,shape=img.shape)).real)
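The difference can also be checked numerically. The sketch below uses numpy.fft.fftn and its s argument in place of scipy.fftpack.fftn and shape (a substitution made here to keep the example dependency-free), with random data instead of the test image. It compares the built-in padding against a template placed at index 0 of each axis and against a centred template.

```python
import numpy as np

rng = np.random.default_rng(0)
temp = rng.random((4, 4))
shape = (16, 16)

# Template placed at the start of each axis, zeros appended after it.
corner = np.zeros(shape)
corner[:4, :4] = temp

# Template placed in the centre, as a centring pad function would do.
centred = np.zeros(shape)
centred[6:10, 6:10] = temp

F_s = np.fft.fftn(temp, s=shape, axes=(0, 1))
same_as_corner = np.allclose(F_s, np.fft.fftn(corner))
same_as_centred = np.allclose(F_s, np.fft.fftn(centred))
# A shift in space only changes the phase, so the magnitudes still agree.
same_magnitude = np.allclose(np.abs(F_s), np.abs(np.fft.fftn(centred)))
```

Under these assumptions same_as_corner and same_magnitude come out True while same_as_centred is False, which is consistent with the two padding schemes differing only by a translation of the template.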
So I have an array (it's large - 2048x2048), and I would like to do some element wise operations dependent on where they are. I'm very confused how to do this (I was told not to use for loops, and when I tried that my IDE froze and it was going really slow).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: Aperature image is a grayscale image converted to an array, it has a shape or text on it, and I would like to find all the 'white' regions of the aperature and perform my operation.
How can I access the individual x/y values of index which will allow me to perform my exponential operation? When I try index[:,None], leads to the program spitting out 'ValueError: broadcast dimensions too large'. I also get array is not broadcastable to correct shape. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white, z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct, it returns two empty arrays. When I do this though
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values and determine the index values (x,y) where the array is not black (I only have white/black, bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0, and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help - I think I have the problem above solved (I can now access members of the array and perform operations on them). But - for some reason the Fx values in my for loop never increase, it loops Fy forever....
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
    imsize = aperature*8 #Add 0 padding to make it nice
    middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
    im = Image.new("L", (imsize,imsize)) #Make a grayscale image with imsize*imsize pixels
    draw = ImageDraw.Draw(im) #Create a new draw method
    box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) #Bounding box for aperature
    if type == 'Rectangle':
        draw.rectangle(box, fill = 'white') #Draw rectangle in the box and color it white
    del draw
    return im, middle
def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
    # Constants
    deltaF = 1/8 # Image will be 8mm wide
    z = 1/3.
    wave = 0.001
    k = 2*pi/wave
    # Now let's get to work
    aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
    im, middle = createImage(aperature, type) #Create an image depending on type of aperature
    aperaturearray = np.array(im) # Turn image into numpy array
    # Fourier Transform of Aperature
    Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))
    # Transforming and calculating of Transfer Function Method
    H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
    H[:,:] = 0 # Set H to 0
    U = aperaturearray.copy()
    U[:,:] = 0
    index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
    H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
    Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z
    # Fourier Integral Method
    apindex = np.nonzero(aperaturearray)
    U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
    Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))
    # Save image
    fim = Image.fromarray(np.uint8(Ufim))
    fim.save("PATH\Fim.jpg")
    ftfm = Image.fromarray(np.uint8(Utfm))
    ftfm.save("PATH\FTFM.jpg")
    print "that may have worked..."
    return
if __name__ == '__main__':
    Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code, but there is no data in them (everything is black). Now I have a real problem here as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp on Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing elementwise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage while your code simply points h to the same array as aperatureimage. So changing one changes the other.
Be aware that copying very large arrays might cost you more memory than you would prefer.
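The difference between the two assignments is easy to see on a tiny array. This is only an illustration with made-up variable names:

```python
import numpy as np

aperture = np.ones((3, 3))

alias = aperture          # second name for the same array
alias[:, :] = 0           # also zeroes `aperture`
alias_changed_original = not aperture.any()

aperture = np.ones((3, 3))
independent = aperture.copy()   # separate buffer with the same contents
independent[:, :] = 0           # `aperture` keeps its ones
copy_left_original_intact = bool(aperture.all())

shares = np.shares_memory(aperture, independent)
```

Here alias_changed_original and copy_left_original_intact are both True, and shares is False because the copy owns its own data.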
What I think you are trying to do is this:
import numpy as np
N = 2048
M = 64
a = np.zeros((N, N))
a[N//2-M:N//2+M,N//2-M:N//2+M]=1
x,y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()
indices = np.where(a>0)
b[indices] = np.exp(x[indices]**2+y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
Concerning the edit
You should normalize your output so it fits in the 8 bit integer. Currently, one of your arrays has a maximum value much larger than 255 and one has a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
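One detail worth keeping in mind with np.zeros_like is the dtype: by default it copies the dtype of its argument, so an array created from a uint8 image cannot hold complex values unless a complex dtype is requested. The sketch below puts the pieces together on a small array. The sizes N and M are made up to keep it quick, k and z use the values from the question, and the variable names are illustrative.

```python
import numpy as np

N, M = 64, 8                         # hypothetical sizes, small for illustration
aperture = np.zeros((N, N), dtype=np.uint8)
aperture[N // 2 - M:N // 2 + M, N // 2 - M:N // 2 + M] = 255
middle = N // 2

k = 2 * np.pi / 0.001                # wave = 0.001, as in the question
z = 1 / 3.

# Request a complex dtype explicitly; the default would be uint8 here.
U = np.zeros_like(aperture, dtype=complex)
rows, cols = np.nonzero(aperture)    # coordinates of the white pixels
r2 = (rows - middle) ** 2 + (cols - middle) ** 2
U[rows, cols] = aperture[rows, cols] * np.exp(1j * k * r2 / (2 * z))

# Magnitude of the transform, rescaled to fill the 8-bit range before saving.
mag = np.abs(np.fft.fftshift(np.fft.fft2(U))) / len(U)
scaled = np.uint8(255 * (mag / mag.max()))
```

Only the white pixels of the aperture receive a value, each with magnitude 255 and a phase that depends on its distance from the centre.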
Finally, I personally very much like working with ipython when developing something like this. If you put the code in your Diffraction function in the top level of your script (in place of the if __name__ == '__main__': block), then you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values != 0. imshow() is always a nice way to look at matrices.