I have a 3D NumPy array arr. Here is an example:
>>> arr
array([[[0.05, 0.05, 0.9 ],
[0.4 , 0.5 , 0.1 ],
[0.7 , 0.2 , 0.1 ],
[0.1 , 0.2 , 0.7 ]],
[[0.98, 0.01, 0.01],
[0.2 , 0.3 , 0.95],
[0.33, 0.33, 0.34],
[0.33, 0.33, 0.34]]])
For each layer of the cube (i.e., for each matrix), I want to find the index of the column containing the largest number in the matrix. For example, let's take the first layer:
>>> arr[0]
array([[0.05, 0.05, 0.9 ],
[0.4 , 0.5 , 0.1 ],
[0.7 , 0.2 , 0.1 ],
[0.1 , 0.2 , 0.7 ]])
Here, the largest element is 0.9, and it is in the third column (i.e. index 2). In the second layer, the max is in the first column (the largest number is 0.98, the column index is 0).
The expected result from the previous example is:
array([2, 0])
Here's what I have done so far:
tmp = arr.max(axis=-1)
argtmp = arr.argmax(axis=-1)
indices = np.take_along_axis(
argtmp,
tmp.argmax(axis=-1).reshape((arr.shape[0], -1)),
1,
).reshape(-1)
The code above works, but I'm wondering whether it can be simplified, as it seems overly complicated to me.
Find the maximum in each column before applying argmax:
arr.max(-2).argmax(-1)
Reducing each column to its maximum value does not change which column holds the largest value. Since you don't care about the row index, this saves you a lot of trouble.
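As a quick check, here is a minimal sketch that applies the one-liner to the example array from the question and compares it with an alternative that flattens each layer first. The names by_column_max and by_flat_index are only illustrative.

```python
import numpy as np

arr = np.array([[[0.05, 0.05, 0.9 ],
                 [0.4 , 0.5 , 0.1 ],
                 [0.7 , 0.2 , 0.1 ],
                 [0.1 , 0.2 , 0.7 ]],
                [[0.98, 0.01, 0.01],
                 [0.2 , 0.3 , 0.95],
                 [0.33, 0.33, 0.34],
                 [0.33, 0.33, 0.34]]])

# Collapse the rows of each layer (axis -2), then pick the best column.
by_column_max = arr.max(axis=-2).argmax(axis=-1)

# Alternative: argmax over each flattened layer, then recover the column
# from the row-major flat index.
flat_index = arr.reshape(arr.shape[0], -1).argmax(axis=1)
by_flat_index = flat_index % arr.shape[-1]

print(by_column_max)   # [2 0]
print(by_flat_index)   # [2 0]
```

Both give the same answer here; the max-then-argmax form avoids the index arithmetic.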
We have a function f(x,y). We want to calculate the matrix Bij = f(xi,xj) = f(ih,jh) for 1 <= i,j <= n and h=1/(n+1).
If f(x,y)=x+y, then Bij = ih+jh and the matrix becomes (here, n=3, so h=0.25):
[[0.5 , 0.75, 1.  ],
 [0.75, 1.  , 1.25],
 [1.  , 1.25, 1.5 ]]
I would like to write a function that calculates the column vector b, which concatenates all the columns of Bij. For the previous example, b would contain 0.5, 0.75, 1, 0.75, 1, 1.25, 1, 1.25, 1.5, stacked vertically.
This is what I have so far (the function and n can be changed; here f(x,y)=x+y):
import numpy as np

n=3
def f(i,j):
    h=1.0/(n+1)
    a=((i+1)*h)+((j+1)*h)
    return a
B = np.fromfunction(f,(n,n))
print(B)
But I don't know how to build the vector b. With
np.concatenate((B[:,0],B[:,1],B[:,2]))
I get a row vector, not a column vector. Could you help me? Sorry for my bad English; I'm a beginner in Python.
The ravel function along with a new axis should do the trick:
import numpy as np
x = np.array([[0.5, 0.75, 1],
[0.75, 1, 1.25],
[1, 1.25, 1.5]])
x.T.ravel()[:, np.newaxis]
# array([[ 0.5 ],
# [ 0.75],
# [ 1. ],
# [ 0.75],
# [ 1. ],
# [ 1.25],
# [ 1. ],
# [ 1.25],
# [ 1.5 ]])
Ravel stitches together all the rows, so we first transpose the matrix (with .T). The result is a 1-D array (which prints like a row vector), and we turn it into a column vector by adding a new axis.
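An equivalent way to get the same column vector, sketched here under the assumption that B is built with np.fromfunction as in the question, is to ask reshape for column-major (Fortran) order directly instead of transposing first:

```python
import numpy as np

n = 3
h = 1.0 / (n + 1)

def f(i, j):
    # np.fromfunction passes 0-based index grids, hence the +1.
    return (i + 1) * h + (j + 1) * h

B = np.fromfunction(f, (n, n))

# order='F' reads the matrix column by column; -1 lets NumPy infer the length.
b = B.reshape(-1, 1, order='F')

print(b.shape)  # (9, 1)
```

For a symmetric matrix like this one, the row-wise and column-wise orders coincide, but order='F' (or the transpose) matters as soon as f is not symmetric in its arguments.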
import numpy as np
# create sample matrix `m`
m = np.matrix([[0.5, 0.75, 1], [0.75, 1, 1.25], [1, 1.25, 1.5]])
# convert matrix `m` to a 'flat' matrix
m_flat = m.flatten()
print(m_flat)
# `m_flat` is still a matrix, in case you need an array:
m_flat_arr = np.squeeze(np.asarray(m_flat))
print(m_flat_arr)
The snippet uses .flatten(), np.asarray() and np.squeeze() to convert the original matrix m, which is
matrix([[ 0.5 , 0.75, 1. ],
[ 0.75, 1. , 1.25],
[ 1. , 1.25, 1.5 ]])
into an array m_flat_arr of:
array([ 0.5 , 0.75, 1. , 0.75, 1. , 1.25, 1. , 1.25, 1.5 ])
I am trying to generate a .wav file in Python using NumPy. I have voltages in the range 0-5V and I need to normalize them to between -1 and 1 to use them in a .wav file.
I have seen this website which uses numpy to generate a wav file, but the algorithm used to normalize is no longer available.
Can anyone explain how I would go about generating these values in Python on my Raspberry Pi?
Isn't this just a simple calculation? Divide by half the maximum value and subtract 1:
In [12]: data=np.linspace(0,5,21)
In [13]: data
Out[13]:
array([ 0. , 0.25, 0.5 , 0.75, 1. , 1.25, 1.5 , 1.75, 2. ,
2.25, 2.5 , 2.75, 3. , 3.25, 3.5 , 3.75, 4. , 4.25,
4.5 , 4.75, 5. ])
In [14]: data/2.5-1.
Out[14]:
array([-1. , -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0. ,
0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
The following function should do what you want, irrespective of the range of the input data, i.e., it works also if you have negative values.
import numpy as np
def my_norm(a):
    # The target range [-1, 1] has width 2; for another target range,
    # replace 2.0 with your_max - your_min.
    ratio = 2.0/(np.max(a)-np.min(a))
    # Shift the midpoint of the data range (not the mean of the values) to zero.
    shift = (np.max(a)+np.min(a))/2.0
    return (a - shift)*ratio
my_norm(data)
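For the .wav use case the input range is known in advance (0-5V), so it may be preferable to scale against fixed bounds rather than the observed min and max. Otherwise a recording that never reaches 0V or 5V gets stretched to fill the whole output range. A minimal sketch, where rescale is a hypothetical helper and not part of NumPy:

```python
import numpy as np

def rescale(a, in_min, in_max, out_min=-1.0, out_max=1.0):
    """Linearly map values from [in_min, in_max] to [out_min, out_max]."""
    a = np.asarray(a, dtype=float)
    scale = (out_max - out_min) / (in_max - in_min)
    return (a - in_min) * scale + out_min

volts = np.array([0.0, 1.25, 2.5, 5.0])
normalized = rescale(volts, 0.0, 5.0)
print(normalized)  # approximately -1, -0.5, 0, 1
```

With in_min=0 and in_max=5 this reduces to data/2.5 - 1, the calculation shown in the first answer.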
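You can use the fit_transform method of sklearn.preprocessing.StandardScaler. This method removes the mean from each column of your data and scales it to unit variance. Note that this standardizes the data rather than confining it to (-1, 1), so individual values can fall outside that range.
import numpy as np
from sklearn.preprocessing import StandardScaler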
data = np.asarray([[0, 0, 0],
[1, 1, 1],
[2,1, 3]])
data = StandardScaler().fit_transform(data)
And if you print out data, you will now have:
[[-1.22474487 -1.41421356 -1.06904497]
[ 0. 0.70710678 -0.26726124]
[ 1.22474487 0.70710678 1.33630621]]
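To make explicit what the scaler computes, here is a NumPy-only sketch of column-wise standardization (subtract the column mean, divide by the population standard deviation). It reproduces the numbers printed above and shows that some values lie outside [-1, 1]:

```python
import numpy as np

data = np.asarray([[0, 0, 0],
                   [1, 1, 1],
                   [2, 1, 3]], dtype=float)

# Column-wise standardization: zero mean and unit (population) variance.
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

print(standardized)
print(np.abs(standardized).max())  # greater than 1
```

If the goal is a hard [-1, 1] range, as needed for audio samples, a min/max style rescaling is the better fit.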
I have a matrix
A = np.array([[0.2, 0.4, 0.6],
[0.5, 0.5, 0.5],
[0.6, 0.4, 0.2]])
I want a new matrix, where the value of the entry in row i and column j is the product of all the entries of the ith row of A, except for the cell of that row in the jth column.
array([[ 0.24, 0.12, 0.08],
[ 0.25, 0.25, 0.25],
[ 0.08, 0.12, 0.24]])
The solution that first occurred to me was
np.repeat(np.prod(A, 1, keepdims = True), 3, axis = 1) / A
But this only works as long as no entry is zero.
Any thoughts? Thank you!
Edit: I have developed
B = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        B[i, j] = np.prod(A[i, [x for x in range(3) if x != j]])
but surely there is a more elegant way to accomplish this, which makes use of numpy's efficient C backend instead of inefficient python loops?
If you're willing to tolerate a single loop:
B = np.empty_like(A)
for col in range(A.shape[1]):
    B[:,col] = np.prod(np.delete(A, col, 1), 1)
That computes what you need, a single column at a time. It is not as efficient as theoretically possible because np.delete() creates a copy; if you care a lot about memory allocation, use a mask instead:
B = np.empty_like(A)
mask = np.ones(A.shape[1], dtype=bool)
for col in range(A.shape[1]):
    mask[col] = False
    B[:,col] = np.prod(A[:,mask], 1)
    mask[col] = True
A variation on your solution replaces repeat with [:,None] broadcasting:
np.prod(A,axis=1)[:,None]/A
My 1st stab at handling 0s is:
In [21]: B
array([[ 0.2, 0.4, 0.6],
[ 0. , 0.5, 0.5],
[ 0.6, 0.4, 0.2]])
In [22]: np.prod(B,axis=1)[:,None]/(B+np.where(B==0,1,0))
array([[ 0.24, 0.12, 0.08],
[ 0. , 0. , 0. ],
[ 0.08, 0.12, 0.24]])
But as the comment pointed out, the [1,0] cell should be 0.25.
This corrects that problem, but now has problems when there are multiple 0s in a row.
In [30]: I=B==0
In [31]: B1=B+np.where(I,1,0)
In [32]: B2=np.prod(B1,axis=1)[:,None]/B1
In [33]: B3=np.prod(B,axis=1)[:,None]/B1
In [34]: np.where(I,B2,B3)
Out[34]:
array([[ 0.24, 0.12, 0.08],
[ 0.25, 0. , 0. ],
[ 0.08, 0.12, 0.24]])
In [55]: C
array([[ 0.2, 0.4, 0.6],
[ 0. , 0.5, 0. ],
[ 0.6, 0.4, 0.2]])
In [64]: np.where(I,sum1[:,None],sum[:,None])/C1
array([[ 0.24, 0.12, 0.08],
[ 0.5 , 0. , 0.5 ],
[ 0.08, 0.12, 0.24]])
Blaz Bratanic's epsilon approach is the best non iterative solution (so far):
In [74]: np.prod(C+eps,axis=1)[:,None]/(C+eps)
A different solution iterating over the columns:
def paulj(A):
    P = np.ones_like(A)
    for i in range(1,A.shape[1]):
        P *= np.roll(A, i, axis=1)
    return P
In [130]: paulj(A)
array([[ 0.24, 0.12, 0.08],
[ 0.25, 0.25, 0.25],
[ 0.08, 0.12, 0.24]])
In [131]: paulj(B)
array([[ 0.24, 0.12, 0.08],
[ 0.25, 0. , 0. ],
[ 0.08, 0.12, 0.24]])
In [132]: paulj(C)
array([[ 0.24, 0.12, 0.08],
[ 0. , 0. , 0. ],
[ 0.08, 0.12, 0.24]])
I tried some timings on a large matrix
In [13]: A=np.random.randint(0,100,(1000,1000))*0.01
In [14]: timeit paulj(A)
1 loops, best of 3: 23.2 s per loop
In [15]: timeit blaz(A)
10 loops, best of 3: 80.7 ms per loop
In [16]: timeit zwinck1(A)
1 loops, best of 3: 15.3 s per loop
In [17]: timeit zwinck2(A)
1 loops, best of 3: 65.3 s per loop
The epsilon approximation is probably the best speed we can expect, but has some rounding issues. Having to iterate over many columns hurts the speed. I'm not sure why the np.prod(A[:,mask], 1) approach is slowest.
eeclo https://stackoverflow.com/a/22441825/901925 suggested using as_strided. Here's what I think he has in mind (adapted from an overlapping block question, https://stackoverflow.com/a/8070716/901925)
def strided(A):
    h,w = A.shape
    A2 = np.hstack([A,A])
    x,y = A2.strides
    strides = (y,x,y)
    shape = (w, h, w-1)
    blocks = np.lib.stride_tricks.as_strided(A2[:,1:], shape=shape, strides=strides)
    P = blocks.prod(2).T # faster to prod on last dim
    # alt: shape = (w-1, h, w), and P=blocks.prod(0)
    return P
Timing for the (1000,1000) array is quite an improvement over the column iterations, though still much slower than the epsilon approach.
In [153]: timeit strided(A)
1 loops, best of 3: 2.51 s per loop
Another indexing approach, while relatively straightforward, is slower and produces memory errors sooner.
def foo(A):
    h,w = A.shape
    I = (np.arange(w)[:,None]+np.arange(1,w))
    I1 = np.array(I)%w
    P = A[:,I1].prod(2)
    return P
I'm on the run, so I do not have time to work out this solution, but what I'd do is create a contiguous circular view over the last axis by concatenating the array to itself along the last axis, and then use np.lib.stride_tricks.as_strided to select the appropriate elements to take an np.prod over. No Python loops, no numerical approximation.
edit: here you go:
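import numpy as np
A = np.array([[0.2, 0.4, 0.6],
              [0.5, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.6, 0.4, 0.2]])
# Doubling A along the last axis lets each window wrap around the row.
B = np.concatenate((A,A),axis=1)
# C[i, j, k] is B[i, j+k]: a sliding window of width A.shape[1] starting at column j.
C = np.lib.stride_tricks.as_strided(
    B,
    A.shape +A.shape[1:],
    B.strides+B.strides[1:])
# Dropping k=0 excludes column j itself from the product.
D = np.prod(C[...,1:], axis=-1)
print(D)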
Note: this method is not ideal, as it is O(n^3). See my other posted solution, which is O(n^2)
If you are willing to tolerate a small error, you could use the solution you first proposed.
A += 1e-10
np.around(np.repeat(np.prod(A, 1, keepdims = True), 3, axis = 1) / A, 9)
Here is an O(n^2) method without python loops or numerical approximation:
def double_cumprod(A):
    B = np.empty((A.shape[0],A.shape[1]+1),A.dtype)
    B[:,0] = 1
    B[:,1:] = A
    L = np.cumprod(B, axis=1)
    B[:,1:] = A[:,::-1]
    R = np.cumprod(B, axis=1)[:,::-1]
    return L[:,:-1] * R[:,1:]
Note: it appears to be about twice as slow as the numerical approximation method, which is in line with expectation.
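A short sanity check for the double cumulative product idea: the sketch below repeats the function so that it runs on its own and compares it against a brute-force reference on a matrix whose rows contain one or two zeros. brute_force is an illustrative helper, not taken from any of the answers.

```python
import numpy as np

def double_cumprod(A):
    # L[:, j] holds the product of the entries left of column j,
    # R[:, j+1] the product of the entries right of column j.
    B = np.empty((A.shape[0], A.shape[1] + 1), A.dtype)
    B[:, 0] = 1
    B[:, 1:] = A
    L = np.cumprod(B, axis=1)
    B[:, 1:] = A[:, ::-1]
    R = np.cumprod(B, axis=1)[:, ::-1]
    return L[:, :-1] * R[:, 1:]

def brute_force(A):
    # Reference: for each column, multiply all the other columns.
    out = np.empty_like(A)
    for col in range(A.shape[1]):
        out[:, col] = np.prod(np.delete(A, col, axis=1), axis=1)
    return out

C = np.array([[0.2, 0.4, 0.6],
              [0.0, 0.5, 0.0],
              [0.0, 0.5, 0.5]])

print(np.allclose(double_cumprod(C), brute_force(C)))  # True
```

Because no division is involved, zeros need no special handling.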
I'm not the best at Python, but I need to create this program for one of my courses and I keep getting this error.
Basically I have w_array = linspace(0.6, 1.1, 11), then I have zq = array([1, 1, w_array, 1])
and it comes up with the error message:
ValueError: setting an array element with a sequence.
The basic function of the code is to take a Bezier spline aerofoil, with control points and weights, run the data in xfoil and print cd and cl values. This addition is meant to show a graph of the range of cd for a certain control point.
I hope it makes sense. Any help would be greatly appreciated.
If you want zq to be an array containing both ints and arrays, use the dtype parameter:
In [300]: zq = array([1, 1, w_array, 1], dtype=object)
In [301]: zq
Out[301]:
array([1, 1,
array([ 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ,
1.05, 1.1 ]),
1], dtype=object)
Is this your intended result?
In [2]:
numpy.hstack((1,1,numpy.linspace(0.6,1.1,11),1))
Out[2]:
array([ 1. , 1. , 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 ,
0.95, 1. , 1.05, 1.1, 1. ])
You probably want the resulting array to have a float64 dtype rather than object (a mixed bag of types), as @DSM pointed out.
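Two other ways to build the same flat float array, sketched with the w_array from the question: np.r_ accepts a mix of scalars and arrays, and np.concatenate works once the scalars are wrapped in lists.

```python
import numpy as np

w_array = np.linspace(0.6, 1.1, 11)

# np.r_ concatenates scalars and arrays along the first axis.
zq = np.r_[1, 1, w_array, 1]

# The same result with an explicit concatenate call.
zq_alt = np.concatenate(([1, 1], w_array, [1]))

print(zq.shape, zq.dtype)  # (14,) float64
```

Either form avoids the ValueError, because every piece is flattened into one 1-D array instead of being nested inside another array.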