I have written the following code to scale an image to 50%. However, it took this algorithm 65 seconds to shrink a 3264x2448 image. Can someone who understands numpy explain why this algorithm is so inefficient and suggest more efficient changes?
import numpy as np

def shrinkX2(im):
    # // keeps the sizes as integers (plain / yields floats in Python 3)
    X, Y = im.shape[1] // 2, im.shape[0] // 2
    new = np.zeros((Y, X, 3))
    for y in range(Y):
        for x in range(X):
            new[y, x] = im[2*y:2*y + 2, 2*x:2*x + 2].reshape(4, 3).mean(axis=0)
    return new
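Going by the question and the code, you are halving the image by averaging non-overlapping 2x2 blocks. The loop is slow because it runs about two million iterations (1632 x 1224) in Python, and each one pays the overhead of a slice, a reshape and a mean call for just four pixels. Instead, we can reshape to split each of the first two axes into (number of blocks, block size), which gives a 4D array for a 2D input (5D with the trailing channel axis), and then compute the mean along the two block-size axes, like so -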
def block_mean(im, BSZ):
    m, n = im.shape[:2]
    return im.reshape(m//BSZ[0], BSZ[0], n//BSZ[1], BSZ[1], -1).mean((1, 3))
Sample run -
In [44]: np.random.seed(0)
...: im = np.random.randint(0,9,(6,8,3))
In [45]: im[:2,:2,:].mean((0,1)) # average of first block across all 3 channels
Out[45]: array([3.25, 3.75, 3.5 ])
In [46]: block_mean(im, BSZ=(2,2))
Out[46]:
array([[[3.25, 3.75, 3.5 ],
[4. , 4.5 , 3.75],
[5.75, 2.75, 5. ],
[3. , 3.5 , 3.25]],
[[4. , 5.5 , 5.25],
[6.25, 1.75, 2. ],
[4.25, 2.75, 1.75],
[2. , 4.75, 3.75]],
[[3.25, 3.5 , 5.25],
[4.25, 1.5 , 5.25],
[3.5 , 3.5 , 4.25],
[0.75, 5. , 5.5 ]]])
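To check that the reshape approach gives the same numbers as the original double loop, the two can be compared on a small random image. This is only a sketch: `block_mean_cropped` and `shrink_loop` are names made up for this example, and dropping leftover rows/columns when the size is not a multiple of the block size is an assumption about what you would want there.

```python
import numpy as np

def block_mean(im, bsz):
    """Average non-overlapping bsz[0] x bsz[1] blocks of an (H, W, C) image."""
    m, n = im.shape[:2]
    return im.reshape(m // bsz[0], bsz[0], n // bsz[1], bsz[1], -1).mean((1, 3))

def block_mean_cropped(im, bsz=(2, 2)):
    """Hypothetical wrapper: drop trailing rows/columns that do not fill a block."""
    m = im.shape[0] - im.shape[0] % bsz[0]
    n = im.shape[1] - im.shape[1] % bsz[1]
    return block_mean(im[:m, :n], bsz)

def shrink_loop(im):
    """Reference implementation with explicit Python loops (the slow approach)."""
    Y, X = im.shape[0] // 2, im.shape[1] // 2
    new = np.zeros((Y, X, im.shape[2]))
    for y in range(Y):
        for x in range(X):
            new[y, x] = im[2*y:2*y + 2, 2*x:2*x + 2].reshape(4, -1).mean(axis=0)
    return new

rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(7, 9, 3)).astype(np.uint8)  # odd sizes on purpose
fast = block_mean_cropped(im)
slow = shrink_loop(im)
print(fast.shape, np.allclose(fast, slow))
```

Both versions return float arrays; cast back with `.astype(np.uint8)` if you need an image dtype again.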
Related
I have vectors of this form:
test=np.linspace(0,1,10)
I want to stack them horizontally to make a matrix.
The problem is that I create them in a loop, so the first stack is between an empty matrix and the first column vector, which gives the following error:
ValueError: all the input arrays must have same number of dimensions
Bottom line: I have a for loop that creates a vector p1 on every iteration, and I want to add it to a final matrix of the form
[p1 p2 p3 p4], on which I could then do matrix operations such as multiplying by its transpose.
If you've got a list of 1D arrays that you want horizontally stacked, you could convert them all to columns first, but it's probably easier to just vertically stack them and then transpose:
In [6]: vector_list = [np.linspace(0, 1, 10) for _ in range(3)]
In [7]: np.vstack(vector_list).T
Out[7]:
array([[0. , 0. , 0. ],
[0.11111111, 0.11111111, 0.11111111],
[0.22222222, 0.22222222, 0.22222222],
[0.33333333, 0.33333333, 0.33333333],
[0.44444444, 0.44444444, 0.44444444],
[0.55555556, 0.55555556, 0.55555556],
[0.66666667, 0.66666667, 0.66666667],
[0.77777778, 0.77777778, 0.77777778],
[0.88888889, 0.88888889, 0.88888889],
[1. , 1. , 1. ]])
How did you get this dimension error? What does an empty array have to do with it?
A list of arrays of the same length:
In [610]: alist = [np.linspace(0,1,6), np.linspace(10,11,6)]
In [611]: alist
Out[611]:
[array([0. , 0.2, 0.4, 0.6, 0.8, 1. ]),
array([10. , 10.2, 10.4, 10.6, 10.8, 11. ])]
Several ways of making an array from them:
In [612]: np.array(alist)
Out[612]:
array([[ 0. , 0.2, 0.4, 0.6, 0.8, 1. ],
[10. , 10.2, 10.4, 10.6, 10.8, 11. ]])
In [614]: np.stack(alist)
Out[614]:
array([[ 0. , 0.2, 0.4, 0.6, 0.8, 1. ],
[10. , 10.2, 10.4, 10.6, 10.8, 11. ]])
If you want to join them in columns, you can transpose one of the above, or use:
In [615]: np.stack(alist, axis=1)
Out[615]:
array([[ 0. , 10. ],
[ 0.2, 10.2],
[ 0.4, 10.4],
[ 0.6, 10.6],
[ 0.8, 10.8],
[ 1. , 11. ]])
np.column_stack is also handy.
In newer numpy versions you can do:
In [617]: np.linspace((0,10),(1,11),6)
Out[617]:
array([[ 0. , 10. ],
[ 0.2, 10.2],
[ 0.4, 10.4],
[ 0.6, 10.6],
[ 0.8, 10.8],
[ 1. , 11. ]])
You don't specify how you create the 'empty array' and how you attempt to stack. I can't exactly recreate the error message (full traceback would have helped). But given that message did you check the number of dimensions of the inputs? Did they match?
Array stacking in a loop is tricky. You have to pay close attention to the shapes, especially of the initial 'empty' array. There isn't a close analog to the empty list []. np.array([]) is 1d with shape (0,). np.empty((0,6)) is 2d with shape (0,6). Also, all the stacking functions create a new array with each call (none operate in place), so they are inefficient compared to list append.
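In practice that usually means collecting the vectors in a list and stacking once after the loop. A minimal sketch, where the `linspace` expression is just a stand-in for whatever vector your loop produces:

```python
import numpy as np

columns = []                               # plain Python list: cheap to append to
for k in range(4):
    p = np.linspace(0, 1, 10) * (k + 1)    # stand-in for the vector built per iteration
    columns.append(p)

M = np.column_stack(columns)               # single allocation; shape (10, 4)
G = M.T @ M                                # e.g. multiply by the transpose; shape (4, 4)

# If you really must stack inside the loop, start from a 2D array with zero columns
M2 = np.empty((10, 0))
for p in columns:
    M2 = np.hstack([M2, p[:, None]])       # p[:, None] turns shape (10,) into (10, 1)

print(M.shape, G.shape, np.array_equal(M, M2))
```

The second loop works because both inputs to `np.hstack` are 2D, but it copies the whole matrix on every iteration, so the list version is the better default.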
I have a function that creates a 2-dim array, a Vandermonde matrix and is called as:
vandermonde(generator, rank)
where generator is an n-element array, for example
generator = np.array([-1/2, 1/2, 3/2, 5/2, 7/2, 9/2])
and rank=4
Then I need to create 4 Vandermonde matrices (because rank=4), each shifted by a multiple of h in my space (h is arbitrary here; let's say h=1).
So I came up with the following explicit code:
V = np.array([
    vandermonde(generator-0*h, rank),
    vandermonde(generator-1*h, rank),
    vandermonde(generator-2*h, rank),
    vandermonde(generator-3*h, rank)
])
To avoid making multiple manual calls to vandermonde, I then used a for loop:
V = []
for i in range(rank):
    V.append(vandermonde(generator - h*i, rank))
V = np.array(V)
This approach works fine, but seems too "patchy". I tried a np.append approach as below:
M = np.array([])
for i in range(rank):
    M = np.append(M, [vandermonde(generator - h*i, rank)])
But it didn't work as I expected: np.append seems to extend the flat array instead of adding a new element.
My questions are:
How can I avoid standard Python lists and use a NumPy approach directly, given that np.append does not behave as I expect (it just grows the array instead of adding a new array element)?
Are there any more direct NumPy approaches to this?
My vandermonde function is:
def vandermonde(generator, rank=None):
    """Returns a Vandermonde matrix.

    If rank is not passed, returns a square Vandermonde matrix.
    """
    if rank is None:
        rank = len(generator)
    return np.tile(generator, (rank, 1)) ** np.array(range(rank)).reshape((rank, 1))
The expected answer is a 3-dimensional array of shape (rank, rank, len(generator)), where each element along the first axis is one of the shifted Vandermonde matrices. For the constants above (generator, rank, h) we have:
V= array([[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -0.5 , 0.5 , 1.5 , 2.5 , 3.5 , 4.5 ],
[ 0.25, 0.25, 2.25, 6.25, 12.25, 20.25],
[ -0.12, 0.12, 3.38, 15.62, 42.88, 91.12]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -1.5 , -0.5 , 0.5 , 1.5 , 2.5 , 3.5 ],
[ 2.25, 0.25, 0.25, 2.25, 6.25, 12.25],
[ -3.38, -0.12, 0.12, 3.38, 15.62, 42.88]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -2.5 , -1.5 , -0.5 , 0.5 , 1.5 , 2.5 ],
[ 6.25, 2.25, 0.25, 0.25, 2.25, 6.25],
[-15.62, -3.38, -0.12, 0.12, 3.38, 15.62]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -3.5 , -2.5 , -1.5 , -0.5 , 0.5 , 1.5 ],
[ 12.25, 6.25, 2.25, 0.25, 0.25, 2.25],
[-42.88, -15.62, -3.38, -0.12, 0.12, 3.38]]])
Some related ideas can be found in this discussion on: efficient-way-to-compute-the-vandermonde-matrix
Use broadcasting to get the final 3D array in a vectorized manner -
r = np.arange(rank)
V_out = (generator - h*r[:,None,None]) ** r[:,None]
As another solution, we can use cumprod to build up the successive powers -
gr = np.repeat(generator - h*r[:,None,None], rank, axis=1)
gr[:,0] = 1
out = gr.cumprod(1)
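Both versions can be checked against NumPy's own np.vander, which returns one row per generator element, so it is transposed here to match the layout in the question. A sketch using the constants from the question (the names `V_broadcast`, `V_cumprod` and `V_ref` are just for this comparison):

```python
import numpy as np

generator = np.array([-1/2, 1/2, 3/2, 5/2, 7/2, 9/2])
rank, h = 4, 1

r = np.arange(rank)
# Broadcasting: (rank, 1, n) ** (rank, 1) -> (rank, rank, n)
V_broadcast = (generator - h * r[:, None, None]) ** r[:, None]

# cumprod: repeat each shifted generator rank times, set row 0 to 1, multiply down
gr = np.repeat(generator - h * r[:, None, None], rank, axis=1)
gr[:, 0] = 1
V_cumprod = gr.cumprod(1)

# Reference: np.vander with increasing powers, transposed to (rank, n) per shift
V_ref = np.array([np.vander(generator - h * i, rank, increasing=True).T
                  for i in range(rank)])

print(V_broadcast.shape)
print(np.allclose(V_broadcast, V_ref), np.allclose(V_cumprod, V_ref))
```

Note that the cumprod version assigns 1 into `gr`, so the generator should be a float array, as it is here.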
There is scipy.misc.imresize for resampling the first two dimensions of 3D arrays. It also supports bilinear interpolation. However, there does not seem to be an existing function for resizing all dimensions of arrays with any number of dimensions. How can I resample any array given a new shape of the same rank, using multi-linear interpolation?
You want scipy.ndimage.zoom, which can be used as follows:
>>> import numpy as np
>>> import scipy.ndimage
>>> x = np.arange(8, dtype=np.float64).reshape(2, 2, 2)
>>> scipy.ndimage.zoom(x, 1.5, order=1)
array([[[ 0. , 0.5, 1. ],
[ 1. , 1.5, 2. ],
[ 2. , 2.5, 3. ]],
[[ 2. , 2.5, 3. ],
[ 3. , 3.5, 4. ],
[ 4. , 4.5, 5. ]],
[[ 4. , 4.5, 5. ],
[ 5. , 5.5, 6. ],
[ 6. , 6.5, 7. ]]])
Note that this function always preserves the boundaries of the image, essentially resampling a mesh with a node at each pixel center. You might want to look at other functions in scipy.ndimage if you need more control over exactly where the resampling occurs.
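Since the question asks for a target shape rather than a zoom factor, one way to bridge the two is to pass zoom a per-axis factor of new size over old size. A sketch only: `resample_to_shape` is a hypothetical helper, not part of SciPy, and because zoom rounds the output size, it is worth checking the resulting shape for awkward ratios.

```python
import numpy as np
from scipy import ndimage

def resample_to_shape(a, new_shape, order=1):
    """Hypothetical helper: resize `a` to `new_shape` with scipy.ndimage.zoom.

    order=1 selects (multi-)linear interpolation.
    """
    if len(new_shape) != a.ndim:
        raise ValueError("new_shape must have the same rank as the input")
    factors = [n / o for n, o in zip(new_shape, a.shape)]  # one zoom factor per axis
    return ndimage.zoom(a, factors, order=order)

x = np.arange(8, dtype=np.float64).reshape(2, 2, 2)
y = resample_to_shape(x, (3, 4, 5))
print(y.shape)
```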
I am trying to generate a .wav file in python using Numpy. I have voltages ranging between 0-5V and I need to normalize them between -1 and 1 to use them in a .wav file.
I have seen this website which uses numpy to generate a wav file, but the algorithm used to normalize is no longer available.
Can anyone explain how I would go about generating these values in Python on my Raspberry Pi?
Isn't this just a simple calculation? Divide by half the maximum value and subtract 1:
In [12]: data=np.linspace(0,5,21)
In [13]: data
Out[13]:
array([ 0. , 0.25, 0.5 , 0.75, 1. , 1.25, 1.5 , 1.75, 2. ,
2.25, 2.5 , 2.75, 3. , 3.25, 3.5 , 3.75, 4. , 4.25,
4.5 , 4.75, 5. ])
In [14]: data/2.5-1.
Out[14]:
array([-1. , -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0. ,
0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
The following function should do what you want, irrespective of the range of the input data, i.e., it works also if you have negative values.
import numpy as np

def my_norm(a):
    ratio = 2/(np.max(a)-np.min(a))
    # as you want your data to be between -1 and 1, everything should be scaled to 2,
    # if your desired min and max are other values, replace 2 with your_max - your_min
    shift = (np.max(a)+np.min(a))/2
    # now you need to shift the center to the middle, this is not the average of the values.
    return (a - shift)*ratio

my_norm(data)
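To get from the normalized values to an actual .wav file, one option is the standard-library wave module, which wants integer PCM samples rather than floats in [-1, 1]. The sketch below makes several assumptions not stated in the question: mono audio, 16-bit samples, a 44100 Hz sample rate, and scaling against the fixed 0-5 V range instead of the data's own min and max (so a quiet recording is not amplified). `volts_to_pcm16` and `write_wav` are made-up names.

```python
import io
import wave
import numpy as np

def volts_to_pcm16(volts, vmin=0.0, vmax=5.0):
    """Map voltages in [vmin, vmax] to [-1, 1], then to 16-bit signed integers."""
    v = np.clip(np.asarray(volts, dtype=np.float64), vmin, vmax)
    scaled = (v - (vmax + vmin) / 2) * (2 / (vmax - vmin))   # now in [-1, 1]
    return np.round(scaled * 32767).astype("<i2")            # little-endian int16

def write_wav(target, samples, rate=44100):
    """Write mono 16-bit PCM to a filename or a binary file-like object."""
    with wave.open(target, "wb") as wf:
        wf.setnchannels(1)       # mono (assumption)
        wf.setsampwidth(2)       # 2 bytes per sample = 16 bit
        wf.setframerate(rate)
        wf.writeframes(samples.tobytes())

# Example: one second of a 440 Hz tone swinging between 0 V and 5 V
rate = 44100
t = np.arange(rate) / rate
volts = 2.5 + 2.5 * np.sin(2 * np.pi * 440 * t)
pcm = volts_to_pcm16(volts)

buf = io.BytesIO()               # pass a path such as "out.wav" to write a real file
write_wav(buf, pcm, rate)
```

Set `rate` to whatever rate the voltages were actually sampled at, otherwise the audio plays back at the wrong speed.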
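You can use the fit_transform method of sklearn.preprocessing.StandardScaler. It removes the mean from each column of your data and scales the column to unit variance. Note that unit variance does not confine the values to (-1, 1), as the output below shows; if you need exactly that range, sklearn.preprocessing.MinMaxScaler(feature_range=(-1, 1)) is the closer fit.
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.asarray([[0, 0, 0],
                   [1, 1, 1],
                   [2, 1, 3]])
data = StandardScaler().fit_transform(data)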
And if you print out data, you will now have:
[[-1.22474487 -1.41421356 -1.06904497]
[ 0. 0.70710678 -0.26726124]
[ 1.22474487 0.70710678 1.33630621]]
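Several of the standardized values above fall outside (-1, 1). If each column has to land exactly in [-1, 1], a per-column min-max rescale is one option. Here is a sketch in plain NumPy, doing the same kind of scaling as scikit-learn's MinMaxScaler with feature_range=(-1, 1); `minmax_columns` is a made-up name, and mapping constant columns to the lower bound is an arbitrary choice made here.

```python
import numpy as np

def minmax_columns(a, lo=-1.0, hi=1.0):
    """Rescale each column of `a` linearly so its min maps to lo and its max to hi."""
    a = np.asarray(a, dtype=np.float64)
    amin, amax = a.min(axis=0), a.max(axis=0)
    span = np.where(amax > amin, amax - amin, 1.0)   # avoid dividing by zero
    return lo + (a - amin) / span * (hi - lo)

data = np.array([[0, 0, 0],
                 [1, 1, 1],
                 [2, 1, 3]])
scaled = minmax_columns(data)
print(scaled)
```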
I have a file in which I need to use the first column. The remaining columns need to be integrated with respect to the first. Let's say my file looks like this:
100 1.0 1.1 1.2 1.3 0.9
110 1.8 1.9 2.0 2.1 2.2
120 1.8 1.9 2.0 2.1 2.2
130 2.0 2.1 2.3 2.4 2.5
Could I write a piece of code that takes the second column and integrates with the first then the third and integrates with respect to the first and so on? For my code I have:
import scipy as sp
import scipy.integrate  # make sure the integrate submodule is loaded
first_col=dat[:,0] #first column from data file
cols=dat[:,1:] #other columns from data file
col2 = cols[:,0] # gets the first column from variable cols
I = sp.integrate.cumtrapz(col2, first_col, initial = 0) #integration step
This works only for the first column of the variable cols. However, I don't want to write this out for all the other columns; it would look disgusting (the thought of it makes me shiver). I have seen similar questions but haven't been able to relate the answers to mine, and the ones that are more or less the same have vague answers. Any ideas?
The function cumtrapz (available as scipy.integrate.cumulative_trapezoid in recent SciPy releases) accepts an axis argument. For example, suppose you put your first column in x and the remaining columns in y, and they have these values:
In [61]: x
Out[61]: array([100, 110, 120, 130])
In [62]: y
Out[62]:
array([[ 1.1, 2.1, 2. , 1.1, 1.1],
[ 2. , 2.1, 1. , 1.2, 2.1],
[ 1.2, 1. , 1.1, 1. , 1.2],
[ 2. , 1.1, 1.2, 2. , 1.2]])
You can integrate each column of y with respect to x as follows:
In [63]: cumtrapz(y, x=x, axis=0, initial=0)
Out[63]:
array([[ 0. , 0. , 0. , 0. , 0. ],
[ 15.5, 21. , 15. , 11.5, 16. ],
[ 31.5, 36.5, 25.5, 22.5, 32.5],
[ 47.5, 47. , 37. , 37.5, 44.5]])
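To see what the axis=0 call is doing, the same cumulative trapezoid rule can be written in a few lines of plain NumPy. This is a sketch for illustration, not a replacement for the SciPy function; `cumtrapz_columns` is a made-up name, and it reuses the x and y values shown above.

```python
import numpy as np

def cumtrapz_columns(y, x):
    """Cumulative trapezoid integral of each column of y with respect to x."""
    y = np.asarray(y, dtype=np.float64)
    x = np.asarray(x, dtype=np.float64)
    dx = np.diff(x)[:, None]                 # step widths, shape (n-1, 1)
    areas = 0.5 * (y[1:] + y[:-1]) * dx      # area of each trapezoid, per column
    out = np.zeros_like(y)                   # first row stays 0, like initial=0
    out[1:] = np.cumsum(areas, axis=0)
    return out

x = np.array([100, 110, 120, 130])
y = np.array([[1.1, 2.1, 2.0, 1.1, 1.1],
              [2.0, 2.1, 1.0, 1.2, 2.1],
              [1.2, 1.0, 1.1, 1.0, 1.2],
              [2.0, 1.1, 1.2, 2.0, 1.2]])
I = cumtrapz_columns(y, x)
print(I)
```

Each column is integrated independently, and the uneven spacing case is handled too, since `dx` is computed from x rather than assumed constant.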