Numpy- How to create ROI iteratively in OpenCV python? - python

I am trying to split an image in a grid of smaller images so that I can process each small image separately. For that I realized that I'll have to define each small image as an ROI and I can use it easily from there.
Now, my grid size is not fixed. I.e, if user inputs 5, I have to make a grid of 5x5.
Iterating over the image pixel by pixel would be slow, so I decided to use Numpy to create ROI by using this construct :
#Assuming user entered grid size =5
roiwidth=w//5  # integer division, so the slice bounds stay integers
roiheight=h//5
roi0=img[0:roiheight,0:roiwidth]
This would be my first slice. h and w are height and width of the image respectively. For the next slice I'd have to do:
roi1=img[0:roiheight,roiwidth+1:2*roiwidth]
While my last roi will be:
roi25=img[4*roiheight+1:5*roiheight, 4*roiwidth+1:5*roiwidth]
But I need to do it iteratively, and cannot figure out the correct way to do that. I don't want to iterate over the image pixel by pixel and need it to be dynamic
EDIT: I am iterating like this now:
import cv2
import numpy
img=cv2.imread('01.jpg')
h,w,chan=img.shape
rh=h//5  # integer division, so the slice bounds stay integers
rw=w//5
z={}
count=0
for i in range (0,5):
    for j in range (0,5):
        yl=i*rh
        yh=(i+1)*rh
        xl=j*rw
        xh=(j+1)*rw
        z[count]=img[yl:yh,xl:xh]
        count=count+1
But I don't know whether this is the most efficient way of doing this.

If you want to split your image using Numpy functions, take a look at numpy.array_split.
In your case you would write something like this:
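import numpy as np
z = {}
count = 0
split1 = np.array_split(img, 5)  # 5 horizontal bands (split along axis 0)
for sub in split1:
    split2 = np.array_split(sub, 5, axis=1)  # 5 tiles per band (split along axis 1)
    for sub2 in split2:
        z[count] = sub2
        count += 1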
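One thing worth knowing about np.array_split is that it does not require the image size to be a multiple of the grid size: the leftover rows and columns are spread over the first tiles. Here is a minimal sketch of that behaviour on a synthetic array. The helper name grid_tiles and the 7x11 test array are made up for illustration.

```python
import numpy as np

def grid_tiles(img, n):
    """Split img into an n x n grid and return the tiles in row-major order."""
    tiles = []
    for band in np.array_split(img, n, axis=0):        # n horizontal bands
        tiles.extend(np.array_split(band, n, axis=1))  # n tiles per band
    return tiles

# Hypothetical 7x11 "image" whose sides are not multiples of 3.
img = np.arange(7 * 11).reshape(7, 11)
tiles = grid_tiles(img, 3)

# The tiles have uneven shapes, but together they cover every pixel once.
shapes = [t.shape for t in tiles]
total_pixels = sum(t.size for t in tiles)
```

If all tiles must have exactly the same shape, crop the image first instead of relying on this.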

For efficiency purposes, listed below is a vectorized approach using reshaping and permuting dimensions.
1) Let's define the input parameters and setup inputs :
M = 5 # Number of patches along height and width
img_slice = img[:rh*M,:rw*M] # Slice out valid image data
2) The main processing part comes here. Split the first two axes of sliced image such that we create two new axes of lengths M each by reshaping. Thus, the two remaining axes would represent the window (rh x rw). Our final aim is to bring them adjacent to each other so as to give us (rh,rw) patches and thus the other two split axes would also come next to each other. To do so, we need to permute dimensions with np.transpose. After permuting, we reshape to merge the two dimensions of lengths (M,M) so that we end up with one axis of length M^2, each of whose element would represent one window from the image.
So, finally we would have :
z = img_slice.reshape(M,rh,M,rw,-1).transpose(0,2,1,3,4).reshape(M**2,rh,rw,-1)
This gives us a NumPy array with M^2 elements along the first axis. Each slice along that axis would correspond to each window/patch. So, z[0] would be the top left corner patch and so on.
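To make the reshape/transpose steps concrete, here is a small self-contained sketch that runs them on a synthetic array and compares the result with plain slicing. The function name split_into_patches and the 23x17x3 test array are assumptions made for this example, not part of the original code.

```python
import numpy as np

def split_into_patches(img, m):
    """Return an array of shape (m*m, rh, rw, channels) holding the m x m grid of patches."""
    h, w = img.shape[:2]
    rh, rw = h // m, w // m
    cropped = img[:rh * m, :rw * m]             # drop rows/columns that do not fill a patch
    # Split height into (m, rh) and width into (m, rw), then bring the two grid axes together.
    grid = cropped.reshape(m, rh, m, rw, -1).transpose(0, 2, 1, 3, 4)
    return grid.reshape(m * m, rh, rw, -1)      # merge the two grid axes into one

# Hypothetical 23x17 image with 3 channels, split into a 5x5 grid of 4x3 patches.
img = np.arange(23 * 17 * 3).reshape(23, 17, 3)
patches = split_into_patches(img, 5)

# Patch 7 sits in grid row 1, column 2, so it should equal this slice.
same_as_slice = np.array_equal(patches[7], img[4:8, 6:9])
```

Note that the final reshape generally has to copy the data, whereas the slices produced by the loop are views into the original image.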

Related

Python Numpy in which Voxel is the Point?

I have a Point-Cloud saved in a Numpy-Array like this:
[[x1,y1,z1],[x2,y2,z2],....]
Now I want to create a voxel grid with a grid size that I can change. After that I want to know which voxel each point falls in. Is there a NumPy method that could help me do that fast? The only idea I have had so far is to solve it with slow for-loops and NumPy masks.
If I understand you correctly, one way to populate a 3D NumPy array with a set of 3D points is by using indexing.
This will work as long as your grid size is at least as big as your largest xyz value along each axis.
Note that your data will lose precision when voxelising in this way.
import numpy as np
xyz = np.random.rand(50000, 3) * 100
voxel_size = 0.5 # each voxel will be half of the point cloud's unit along each axis
xyz_q = np.round(xyz/voxel_size).astype(int) # quantized point values; this is where you lose precision
vox_grid = np.zeros((int(100/voxel_size)+1, int(100/voxel_size)+1, int(100/voxel_size)+1)) # empty voxel grid
vox_grid[xyz_q[:,0],xyz_q[:,1],xyz_q[:,2]] = 1 # set all voxels containing a point to 1
xyz_v = np.asarray(np.where(vox_grid == 1)) # get back the indices of populated voxels
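If the goal is to know which voxel each individual point belongs to, the dense grid is not strictly needed. The sketch below is one possible variant, not the method from the answer above: it uses np.floor instead of np.round, measures indices from the minimum corner of the cloud, and groups points with np.unique. The function name voxelize and the four sample points are made up.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Return per-point voxel indices, the occupied voxels, and a point-to-voxel map."""
    origin = points.min(axis=0)                                      # assumed grid origin
    idx = np.floor((points - origin) / voxel_size).astype(np.int64)  # (N, 3) integer indices
    voxels, inverse = np.unique(idx, axis=0, return_inverse=True)
    # reshape(-1) keeps the map 1-D regardless of the NumPy version in use
    return idx, voxels, inverse.reshape(-1)

# Four hypothetical points; the first two fall into the same 0.5-unit voxel.
points = np.array([[0.0, 0.0, 0.0],
                   [0.2, 0.3, 0.4],
                   [1.2, 0.1, 0.1],
                   [0.1, 1.7, 0.1]])
idx, voxels, point_to_voxel = voxelize(points, 0.5)

# All points that share voxel k can then be selected with a mask.
points_in_first_voxel = points[point_to_voxel == point_to_voxel[0]]
```

Here voxels[point_to_voxel] reproduces idx, so point_to_voxel answers "which occupied voxel is point i in" without allocating the full grid.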

Optimize 4D Numpy array construction

I have a 4D array data of shape (50,8,2048,256) which are 50 groups containing 8 2048x256 pixel images. times is an array of shape (50,8) giving the time that each image was taken.
I calculate a 1st order polynomial fit at each pixel for all images in each group, giving me an array of shape (50,2048,256,2). This is essentially a vector plot for each of the 50 groups. The code I use to store the polynomials is:
fits = np.ones((50,2048,256,2))
times = times.reshape(50,8,1).repeat(2048,2).reshape(50,8,2048,1).repeat(256,3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            px_data = data[:,:,ypos,xpos]
            fits[group,ypos,xpos,:] = np.polyfit(times[group,:,ypos,xpos],data[group,:,ypos,xpos],1)
Now the challenge is that I want to generate an array new_data of shape (50,12,2048,256) where I use the polynomial coefficients from fits and the times from new_time to generate 50 groups of 12 images.
I figure I can use something like np.polyval(fits, new_time) to generate the images but I'm very confused with how to phrase it. It should be something like:
new_data = np.ones((50,12,2048,256))
for i,(times,fit) in enumerate(zip(new_times,fits)):
    new_data[i] = np.polyval(fit,times)
But I'm getting broadcasting errors. Any assistance would be greatly appreciated!
Update
Ok, so I changed the code a bit so that it does work and do exactly what I want, but it is terribly slow with all these loops (~1 minute per group meaning this would take me almost an hour to run!). Can anyone suggest a way to optimize this to speed it up?
# Generate the polynomials for each pixel in each group
fits = np.ones((50,2048,256,2))
times = np.arange(0,50*8*grptme,grptme).reshape(50,8)
times = times.reshape(50,8,1).repeat(2048,2).reshape(50,8,2048,1).repeat(256,3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            fits[group,xpos,ypos] = np.polyfit(times[group,:,xpos,ypos],data[group,:,xpos,ypos],1)
# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.ones((50,12,2048,256))
times = np.arange(0,50*12*grptme,grptme).reshape(50,12)
times = times.reshape(50,12,1).repeat(2048,2).reshape(50,12,2048,1).repeat(256,3)
for group in range(50):
    for img in range(12):
        for xpos in range(2048):
            for ypos in range(256):
                new_data[group,img,xpos,ypos] = np.polynomial.polynomial.polyval(times[group,img,xpos,ypos],fits[group,xpos,ypos])
Regarding the speed, I see a lot of loops, which should be avoided and often can be thanks to NumPy. If I understand your problem fully, you want to fit a first-order polynomial to 8 data points for each of 50 groups, 2048 * 256 times. The shape of your image plays no role in the fit, so my suggestion is to flatten your images: np.polyfit can fit several sets of y-values against one set of x-values at the same time.
From the doc string
x : array_like, shape (M,)
x-coordinates of the M sample points ``(x[i], y[i])``.
y : array_like, shape (M,) or (M, K)
y-coordinates of the sample points. Several data sets of sample
points sharing the same x-coordinates can be fitted at once by
passing in a 2D-array that contains one dataset per column.
So I would go for
# Generate the polynomials for each pixel in each group
fits = np.ones((50,2048*256,2))
times = np.arange(0,50*8*grptme,grptme).reshape(50,8)
data_fit = data.reshape((50,8,2048*256))
for group in range(50):
    fits[group] = np.polyfit(times[group],data_fit[group],1).T
fits_original_shape = fits.reshape((50,2048,256,2))
The transposing is necessary since you want to have the parameters in the last index, but np.polyfit has them first and then the different data sets
And then to evaluate it it is basically the same trick again:
# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.zeros((50,12,2048*256))
times = np.arange(0,50*12*grptme,grptme).reshape(50,12)
#times = times.reshape(50,12,1).repeat(2048,2).reshape(50,12,2048,1).repeat(256,3)
for group in range(50):
    # np.polyfit returns the highest power first, while
    # np.polynomial.polynomial.polyval expects the constant term first,
    # so the coefficient axis is reversed before evaluating.
    new_data[group] = np.polynomial.polynomial.polyval(times[group],fits[group].T[::-1]).T
new_data_original_shape = new_data.reshape((50,12,2048,256))
The two transposes are again needed due to the ordering of the parameters vs. the different data sets so that matches with the shapes of your arrays.
The loop over the groups could probably also be avoided with some more advanced NumPy tricks, but the code already runs much faster this way.
I hope it helps!
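For what it's worth, the remaining loop over groups can be dropped for a first-order fit, because a straight-line least-squares fit has a closed form (slope = covariance of t and y divided by the variance of t). The sketch below is an alternative to np.polyfit, not what the answer above does. The function names fit_lines and eval_lines and the small array sizes are assumptions chosen so the example runs quickly.

```python
import numpy as np

def fit_lines(times, data):
    """times: (G, T); data: (G, T, H, W). Returns slope and intercept, each (G, H, W)."""
    t_mean = times.mean(axis=1, keepdims=True)          # (G, 1)
    t_centered = times - t_mean                         # (G, T)
    # sum over t of (t - t_mean) * y; centering y is unnecessary
    # because the centered times sum to zero.
    cov = np.einsum('gt,gthw->ghw', t_centered, data)
    var = (t_centered ** 2).sum(axis=1)                 # (G,)
    slope = cov / var[:, None, None]
    intercept = data.mean(axis=1) - slope * t_mean[:, :, None]
    return slope, intercept

def eval_lines(slope, intercept, new_times):
    """new_times: (G, T_new). Returns the fitted images, shape (G, T_new, H, W)."""
    return slope[:, None] * new_times[:, :, None, None] + intercept[:, None]

# Small stand-in for the (50, 8, 2048, 256) data set.
G, T, H, W = 3, 8, 4, 5
rng = np.random.default_rng(0)
times = np.arange(G * T, dtype=float).reshape(G, T)
data = rng.random((G, T, H, W))

slope, intercept = fit_lines(times, data)
new_times = np.arange(G * 12, dtype=float).reshape(G, 12)
new_data = eval_lines(slope, intercept, new_times)
```

This only covers degree 1. For higher degrees the per-group np.polyfit call is the simpler route.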

How to stack multiple images on top of each other using python or matlab?

How can I stack multiple images and save the new output image using python (or matlab)?
I need to set the alpha of each image and apply a small translation to each one.
here's an example based on my comment:
mask=zeros(50,50,5);
for n=1:size(mask,3)
    mask(randi(20):randi(20)+20,randi(20):randi(20)+20,n )=1;
    mask(:,:,n)= bwperim( mask(:,:,n),8);
end
A=permute(mask,[3 2 1]);
% plotting
h=slice(A,[],1:5,[]);
set(h,'EdgeColor','none','FaceColor','interp');
alpha(0.3);
colormap(flipud(flag))
You could make such a stack of translated (shifted) images with Python, using the numpy and matplotlib module. Pillow (another Python module) by itself could probably do it as well, but I would have to look up how to ensure values of overlapping pixels get added, rather than overwritten.
So, here's a numpy + matplotlib solution, that starts off with a test image:
import numpy as np
import matplotlib.pyplot as plt
img1 = plt.imread('img.png')
For those following along, a very simple test image is shown at the end of this post. It also serves to show the different options available for stacking: overwriting, or additive stacking, which is weighted opacity with equal weights.
layers = 5 # How many images should be stacked.
x_offset, y_offset = 40, 20 # Number of pixels to offset each image.
new_shape = ((layers - 1)*y_offset + img1.shape[0],
             (layers - 1)*x_offset + img1.shape[1],
             4) # the last number, i.e. 4, refers to the 4 different channels, being RGB + alpha
stacked = np.zeros(new_shape, dtype=float) # np.float was removed in NumPy 1.24; the builtin float is equivalent
for layer in range(layers):
    stacked[layer*y_offset:layer*y_offset + img1.shape[0],
            layer*x_offset:layer*x_offset + img1.shape[1],
            ...] += img1*1./layers
plt.imsave('stacked.png', stacked, vmin=0, vmax=1)
It's very simple really: you precalculate the size of the output image, initialize it to have full transparency and then you "drop" the base image in that file, each time offset by a certain offset vector. The interesting part comes when parts overlap. You then have some choices:
overwrite what was there before. In this case, change the += operator to simply =. Also, don't scale by the number of layers.
add in a weighted fashion. You should rescale all the intensity values in each channel by a certain weight (equal weights were used in the example above) and then add those values. Depending on the weights, among other things, you may saturate pixels. You then have the option to clip the array (thereby losing information) or simply rescale everything by the newly obtained maximum value. The example above uses clipping by specifying vmin and vmax in the call to imsave.
The test image shown here contains 4 transparent squares, but those are not easily distinguished from the 2 white ones in the top left row. They were added to illustrate the transparency addition and effect of rescaling (white becomes gray).
After running the above code, you end up with something like this (change your offsets though) ("add")
or like this ("overwrite")
There are a few more ways you can think of that reflect what you want to do when pixels overlap. The 2 situations here are probably the most common ones though. In any case, the approach laid out here should give you a good start.
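To compare the two overlap strategies without needing an image file, here is a small NumPy-only sketch. The function name stack_images, its mode argument and the all-white 4x6 RGBA test image are made up for illustration. It follows the same offset logic as the code above but clips explicitly instead of relying on imsave.

```python
import numpy as np

def stack_images(img, layers, x_offset, y_offset, mode="add"):
    """Stack shifted copies of an RGBA float image; mode is 'add' or 'overwrite'."""
    h, w, channels = img.shape
    out = np.zeros(((layers - 1) * y_offset + h,
                    (layers - 1) * x_offset + w,
                    channels), dtype=float)          # zeros = fully transparent
    for layer in range(layers):
        ys, xs = layer * y_offset, layer * x_offset
        if mode == "add":
            out[ys:ys + h, xs:xs + w] += img / layers  # equal weights
        else:
            out[ys:ys + h, xs:xs + w] = img            # later layers win
    return np.clip(out, 0.0, 1.0)

# Hypothetical opaque white test image with values in [0, 1].
img = np.ones((4, 6, 4))
added = stack_images(img, layers=3, x_offset=2, y_offset=1, mode="add")
overwritten = stack_images(img, layers=3, x_offset=2, y_offset=1, mode="overwrite")
```

In the additive result a pixel covered by only one layer has a third of the original intensity, while a pixel covered by all three layers is back at full intensity. In the overwrite result every covered pixel is simply at full intensity.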

Creating a grid in Fourier-space

I have a code which creates a square image with dimensions 4x4 arcsec running from -2 arcsec to +2 arcsec and is created on an 80x80 grid. To this I want to add another image.
This second image is created through a FFT of an 80x80 grid and thus starts out in Fourier space. After the FFT, I want the image to have exactly the same dimensions in real space as the first image.
Because Fourier space represents the scales and the wavenumber is defined as k = 2pi/x (although in this case the numpy.fft uses the definition where I think k = 1/x), I thought the largest scale would have to have the smallest k-value and the smallest scale the largest k-value.
So if x_max = 2 (the dimensions in the x-direction of the first image) and dim_x = 80 (the number of columns in the grid):
k_x,max = 1/(2*x_max/dim_x)
k_x,min = 1/(2*x_max)
and let the grid in Fourier-space run from k_x,min to k_x,max (same for the y-direction)
I hope I explained this clearly enough. I haven't been able to find any confirmation or explanation for this in the literature about FFTs and would really like to know if this is correct.
Thanks in advance
This is not correct. The k-space values will range from -N/2*omega_0 to (N-1)/2*omega_0, where omega_0 is the inverse of the sample length, given by 2*pi/(max(x)-min(x)) and N is the number of samples. So for your case you get something along the lines of this:
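import numpy as np
N = len(x)
dx = x[-1]-x[0]  # total length of the sampled interval
k = np.linspace(-N*np.pi/dx, (N-1)*np.pi/dx, N)  # bounds as stated above: -N/2*omega_0 to (N-1)/2*omega_0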
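As a cross-check, NumPy can report the frequency grid its own FFT uses via np.fft.fftfreq, which follows the k = 1/x (cycles per unit) convention mentioned in the question. The sketch below assumes that the 80 samples tile the 4 arcsec range evenly, so the sample spacing is 4/80 arcsec. That spacing is an assumption, since the question does not say whether both endpoints are included.

```python
import numpy as np

n = 80            # samples per axis
extent = 4.0      # arcsec covered by the image, from -2 to +2
d = extent / n    # assumed sample spacing in arcsec

# Frequencies in cycles per arcsec, reordered so they increase monotonically.
k = np.fft.fftshift(np.fft.fftfreq(n, d=d))

k_step = k[1] - k[0]          # spacing in Fourier space: 1 / (n * d) = 1 / extent
k_angular = 2 * np.pi * k     # the same grid in the k = 2*pi/x convention
```

Under these assumptions the grid includes k = 0 and negative frequencies, runs from -1/(2d) up to one step below +1/(2d), and has spacing 1/extent. So the smallest non-zero |k| is 1/(4 arcsec), not 1/(2*x_max) measured from zero alone.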

How to plot an image with non-linear y-axis with Matplotlib using imshow?

How can I plot an 2D array as an image with Matplotlib having the y scale relative to the power of two of the y value?
For instance the first row of my array will have a height in the image of 1, the second row will have a height of 4, etc. (units are irrelevant)
It's not simple to explain with words so look at this image please (that's the kind of result I want):
alt text http://support.sas.com/rnd/app/da/new/802ce/iml/chap1/images/wavex1k.gif
As you can see, the first row is half the height of the one above it, and so on.
For those interested in why I am trying to do this:
I have a pretty big array (10, 700000) of floats, representing the discrete wavelet transform coefficients of a sound file. I am trying to plot the scalogram using those coefficients.
I could copy the array x times until I get the desired image row size but the memory cannot hold so much information...
Have you tried to transform the axis? For example:
from pylab import *
ax = subplot(111)
ax.yaxis.set_ticks([0, 2, 4, 8])
imshow(data)
This means there must be gaps in the data for the non-existent coordinates, unless there is a way to provide a transform function instead of just lists (never tried).
Edit:
I admit it was just a lead, not a complete solution. Here is what I meant in more details.
Let's assume you have your data in an array, a. You can use a transform like this one:
class arr(object):
    @staticmethod
    def mylog2(x):
        # integer log2: number of times x can be halved before reaching 1
        lx = 0
        while x > 1:
            x >>= 1
            lx += 1
        return lx
    def __init__(self, array):
        self.array = array
    def __getitem__(self, index):
        return self.array[arr.mylog2(index+1)]
    def __len__(self):
        return 1 << len(self.array)
Basically it will transform the first coordinate of an array or list with the mylog2 function (that you can transform as you wish - it's home-made as a simplification of log2). The advantage is, you can re-use that for another transform should you need it, and you can easily control it too.
Then map your array to this one, which doesn't make a copy but a local reference in the instance:
b = arr(a)
Now you can display it, for example:
ax = subplot(111)
ax.yaxis.set_ticks([16, 8, 4, 2, 1, 0])
axis([-0.5, 4.5, 31.5, 0.5])
imshow(b, interpolation="nearest")
Here is a sample (with an array containing random values):
alt text http://img691.imageshack.us/img691/8883/clipboard01f.png
The best way I've found to make a scalogram using matplotlib is to use imshow, similar to the implementation of specgram. Using rectangles is slow, because you're having to make a separate glyph for each value. Similarly, you don't want to have to bake things into a uniform NumPy array, because you'll probably run out of memory fast, since your highest level is going to be about as long as half your signal.
Here's an example using SciPy and PyWavelets:
from pylab import *
import pywt
import scipy.io.wavfile as wavfile
# Find the highest power of two less than or equal to the input.
def lepow2(x):
    return int(2 ** floor(log2(x)))  # int so the result can be used as a slice bound
# Make a scalogram given an MRA tree.
def scalogram(data):
    bottom = 0
    vmin = min(map(lambda x: min(abs(x)), data))
    vmax = max(map(lambda x: max(abs(x)), data))
    gca().set_autoscale_on(False)
    for row in range(0, len(data)):
        scale = 2.0 ** (row - len(data))
        imshow(
            array([abs(data[row])]),
            interpolation = 'nearest',
            vmin = vmin,
            vmax = vmax,
            extent = [0, 1, bottom, bottom + scale])
        bottom += scale
# Load the signal, take the first channel, limit length to a power of 2 for simplicity.
rate, signal = wavfile.read('kitten.wav')
signal = signal[0:lepow2(len(signal)),0]
tree = pywt.wavedec(signal, 'db5')
# Plotting.
gray()
scalogram(tree)
show()
You may also want to scale values adaptively per-level.
This works pretty well for me. The only problem I have is that matplotlib creates a hairline-thin space between levels. I'm still looking for a way to fix this.
P.S. - Even though this question is pretty old now, I figured I'd respond here, because this page came up on Google when I was looking for a method of creating scalograms using MPL.
You can look at matplotlib.image.NonUniformImage. But that only assists with having nonuniform axis - I don't think you're going to be able to plot adaptively like you want to (I think each point in the image is always going to have the same area - so you are going to have to have the wider rows multiple times). Is there any reason you need to plot the full array? Obviously the full detail isn't going to show up in any plot - so I would suggest heavily downsampling the original matrix so you can copy rows as required to get the image without running out of memory.
If you want both to be able to zoom and save memory, you could do the drawing "by hand". Matplotlib allows you to draw rectangles (they would be your "rectangular pixels"):
from matplotlib import patches
axes = subplot(111)
axes.add_patch(patches.Rectangle((0.2, 0.2), 0.5, 0.5))
Note that the extents of the axes are not set by add_patch(), but you can set them yourself to the values you want (axes.set_xlim,…).
PS: It looks to me like thrope's response (matplotlib.image.NonUniformImage) can actually do what you want, in a simpler way than the "manual" method described here!
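Another option that sits between the two, sketched here as a possibility rather than a tested recipe for 700000-column data, is pcolormesh, which accepts explicit cell edges so that each row can have its own height without copying any data. The helper names row_edges and plot_rows are made up, and the rule that row i gets height 2**i is an assumption about the desired layout.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def row_edges(n_rows):
    """Edges of n_rows stacked rows where row i has height 2**i (assumed layout)."""
    heights = 2.0 ** np.arange(n_rows)
    return np.concatenate(([0.0], np.cumsum(heights)))

def plot_rows(ax, data):
    """Draw a 2-D array with one rectangle per value and non-uniform row heights."""
    n_rows, n_cols = data.shape
    x_edges = np.arange(n_cols + 1)   # uniform column edges
    y_edges = row_edges(n_rows)       # non-uniform row edges
    return ax.pcolormesh(x_edges, y_edges, data, shading="flat")

# Small random stand-in for the (10, 700000) coefficient array.
data = np.random.default_rng(0).random((4, 16))
fig, ax = plt.subplots()
mesh = plot_rows(ax, data)
edges = row_edges(4)
```

For the full-size array it would probably still be wise to downsample along the time axis first, as suggested above, since no plot can show 700000 columns anyway.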
