I have a numpy array of arrays, say 400x80. I want to turn it into an array 400x160 so that each item would be formed like this:
Here each frame of 80 is copied into the beginning of the next frame, and the first frame gets 80 zeroes. How can I do this in numpy? Is there a mechanism that generalizes to three or more frames?
Let's assume that your data is in X; then
np.hstack((np.vstack((np.zeros(X.shape[1]), X[:-1])), X))
where:
np.vstack((np.zeros(X.shape[1]), X[:-1]))
creates the first "column": we prepend a row of zeros and drop the last row of X,
and then hstack simply combines the two "columns".
import numpy as np
X = np.random.normal(size=(400, 80))
print(np.hstack((np.vstack((np.zeros(X.shape[1]), X[:-1])), X)).shape)
gives (400, 160) as expected.
Or you can do things manually:
Y = []
previous = np.zeros(X.shape[1])
for row in X:
    # join the previous frame and the current one into a single row of length 160
    Y.append(np.concatenate((previous, row)))
    previous = row
Y = np.array(Y)
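To address the part of the question about three or more frames, here is a minimal sketch that generalizes the same hstack/vstack idea. The function name stack_previous_frames and its parameter k (the number of frames per output row) are made up for illustration, and the sketch assumes X is a 2D array.

```python
import numpy as np

def stack_previous_frames(X, k):
    """Return an (m, k*n) array whose row i holds rows i-k+1 .. i of X,
    oldest first, with zeros standing in for frames before the start."""
    m, n = X.shape
    # Prepend k-1 rows of zeros so every row has k-1 predecessors.
    padded = np.vstack((np.zeros((k - 1, n), dtype=X.dtype), X))
    # Block j is the input shifted down by (k-1-j) rows.
    return np.hstack([padded[j:j + m] for j in range(k)])

X = np.arange(12.0).reshape(4, 3)
Y2 = stack_previous_frames(X, 2)  # shape (4, 6)
Y3 = stack_previous_frames(X, 3)  # shape (4, 9)
```

With k=2 this should give the same result as the hstack/vstack one-liner above.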
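You are trying to create a sliding window. If you want a view that looks into the original buffer, indexing the same memory locations multiple times, you can use as_strided. Assuming a is your C-contiguous input array and p is the desired row length:
m, n = a.shape
p = 2 * n
x = np.lib.stride_tricks.as_strided(a, shape=((m * n - p) // n + 1, p), strides=a.strides)
This is the general approach: row i of x starts at a[i] and runs for p elements, and the row count is chosen so that the last window still fits inside the buffer. If you're guaranteed p % n == 0, then for k = p // n this simplifies to
x = np.lib.stride_tricks.as_strided(a, shape=(m - k + 1, n * k), strides=a.strides)
Note that neither version adds the leading frame of zeros, so x has fewer rows than a. In either case, x is a view whose rows overlap in memory, so writing to it changes several positions at once; use x.copy() if you need an independent array.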
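A small check of the strided view may help here. This is only a sketch on a toy array: it assumes a is C-contiguous, and it passes writeable=False (an existing as_strided parameter) as a precaution against accidental writes into the overlapping view.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(12.0).reshape(4, 3)  # toy input; must be C-contiguous
m, n = a.shape
k = 2                              # frames per output row

# Row i of x covers a[i], a[i+1], ..., a[i+k-1] laid end to end.
x = as_strided(a, shape=(m - k + 1, n * k), strides=a.strides, writeable=False)

shares = np.shares_memory(x, a)    # the view reuses a's buffer
independent = x.copy()             # a safe, standalone array
```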
I've written the following code for the N-dimensional Fast Fourier Transform but it doesn't give me the same result as numpy's function.
def nffourier(f, direct):
    dim = f.ndim
    N = f.shape
    G = np.zeros(f.shape, dtype=complex)
    G = f
    for k in range(dim):
        for i in range(N[k]):
            aux = G[(slice(None),) * (k) + (i,)]
            trans = ffourier(aux, direct)
            G[(slice(None),) * (k) + (i,)] = trans
    return G
My code for calculating FFT in 1d is the following:
import math as m
import numpy as np
def ffourier(f, direct):
    N = len(f)
    M = int(m.log(N)/m.log(2))
    G = []
    order = []
    # binary representation of each index, stored as a decimal-looking int
    for i in range(N):
        order.append(int(bin(i)[2:]))
    digitos = len(str(order[-1]))  # width of the largest index in binary digits
    # left-pad every binary string with zeros to the same width
    for i in range(N):
        contenido_aux = str(int(order[i]))
        aux = len(str(order[i]))
        if(aux<digitos):
            añadir=digitos-aux
            for k in range(añadir):
                contenido_aux = '0'+contenido_aux
        G.append(contenido_aux)
    # reverse the bits and reorder the input accordingly
    for i in range(len(G)):
        G[i] = G[i][::-1]
    for i in range(len(G)):
        G[i] = int(G[i], 2)
    for i in range(len(G)):
        G[i] = f[G[i]]
    if direct == False:
        signo = 1
    else:
        signo = -1
    kmax = 1
    kmax = int(kmax)
    # butterfly stages
    for alfa in range(1,M+1):
        w1 = np.exp(signo*1j*2*m.pi/(2**alfa))
        kmax = int(2*kmax)
        W = 1
        for k in range(0, int(kmax/2)-1+1):
            for s in range(0, N-1+1, int(kmax)):
                T0 = G[s+k]
                T1 = G[s+k+int(kmax/2)]*W
                G[s+k]=T0+T1
                G[s+k+int(kmax/2)]=T0-T1
            W=W*w1
    cte = 1/m.sqrt(N)
    for i in range(0, N-1+1):
        G[i] = G[i]*cte
    return G
The fundamentals are quite hard to explain (the algorithm is based on bit reversal), but I've checked that it works properly, so the problem is with the N-dimensional code.
Your indexing G[(slice(None),) * (k) + (i,)] works in 2D but not in higher dimensions. Let’s see what it does:
Say G is 2D. Now when k=0, your indexing is the same as G[i], which is the same as G[i,:]. You are selecting rows. When k=1, then that indexing is G[:,i]. You are selecting columns.
But now say G is 3D. Now when k=0, you get G[i] again, which now is equivalent to G[i,:,:]. You are selecting a 2D subarray! What you need is a 1D subarray. You need to get G[i,j,:] for all i and all j. And then G[i,:,j], and then G[:,i,j].
Likewise, for a 5D array, you want G[i,j,k,l,:], etc. That is to say, you want to loop over all dimensions minus one.
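To make the point concrete, this short sketch prints nothing but records the shapes that the question's indexing expression selects in a 3D array. The array G and its shape (2, 3, 4) are arbitrary choices for illustration.

```python
import numpy as np

G = np.zeros((2, 3, 4))

# The question's index expression for k = 0, 1, 2 with i = 0.
slab_shapes = [G[(slice(None),) * k + (0,)].shape for k in range(G.ndim)]
# Each selection is a 2D slab, not a 1D line.

# What a 1D transform actually needs: all indices fixed except one.
line_shape = G[0, 1, :].shape
```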
To loop over all i and j, you could do a double loop, but then you have specific 3D code. It is possible to write a loop over an arbitrary number of dimensions, but it’s not pretty. So we’ll look for an alternative.
I think the simplest way to get this to work is to flatten those N-1 dimensions, turning an MxNxOxPxQ array into a 2D (M*N*O*P)xQ array. Now you can do a 1D loop over the first dimension.
You also need to loop over the dimensions, because a different dimension is left out of the flattening each time. We can simplify this by "rolling" the dimensions: make a different dimension the last one on every pass, then apply that same flattening code. Now it's easy to write a loop (not tested):
def nffourier(f, direct):
    dim = f.ndim
    G = f.astype(complex)
    for k in range(dim):
        G = np.moveaxis(G, 0, -1)  # shifts the dimensions by one to the left
        shape = G.shape
        m = shape[-1]
        G = np.reshape(G, (-1, m))  # flattens all but last dimension
        m = G.shape[0]
        for i in range(m):  # loop over first dimension
            G[i, :] = ffourier(G[i, :], direct)  # apply over last dimension
        G = np.reshape(G, shape)  # return to original shape
    # After applying moveaxis dim times, G should have the same dimension order it had at the start
    return G
(Note also, as we already discussed in the comments, that the G = f line causes the output array G to be of the same type as f, likely not complex, and so will cause errors also.)
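As a sanity check of the moveaxis-and-flatten idea, the sketch below applies it with np.fft.fft standing in for the 1D transform and compares against np.fft.fftn. The function name fftn_by_axes and its fft1d parameter are hypothetical; this is not the asker's ffourier.

```python
import numpy as np

def fftn_by_axes(f, fft1d=np.fft.fft):
    """Apply a 1D transform along every axis via moveaxis + reshape."""
    G = f.astype(complex)                          # complex working copy
    for _ in range(G.ndim):
        G = np.moveaxis(G, 0, -1)                  # rotate axes left by one
        shape = G.shape
        G = np.reshape(G, (-1, shape[-1])).copy()  # flatten all but last axis
        for i in range(G.shape[0]):
            G[i, :] = fft1d(G[i, :])               # transform each 1D line
        G = np.reshape(G, shape)                   # restore the rotated shape
    return G

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 8, 2))
matches = np.allclose(fftn_by_axes(f), np.fft.fftn(f))
```

Since ffourier in the question scales by 1/sqrt(N), the NumPy call to compare it against would presumably be np.fft.fftn(f, norm="ortho") rather than the default normalization.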
I'm trying to sum a two-dimensional function using arrays, because a for loop is somehow not producing the correct answer. I want to find (in LaTeX) $$\sum_{i=1}^{M}\sum_{j=1}^{M_2}\cos(i)\cos(j)$$ where, according to Mathematica, the answer for M=5 is 1.52725. According to the for loop:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
            return s1
print(f(4))
is 0.291927.
I have thus been trying to use some code of the form:
def f1(N):
    mat3=np.zeros((N,N),np.complex)
    for i in range(0,len(mat3)):
        for j in range(0,len(mat3)):
            mat3[i][j]=np.cos(i+1)*np.cos(j+1)
            return sum(mat3)
which again
print(f1(4))
outputs 0.291927. Looking at the array we should find for each value of i and j a matrix of the form
mat3=[[np.cos(1)*np.cos(1),np.cos(2)*np.cos(1),...],[np.cos(2)*np.cos(1),...]...[np.cos(N+1)*np.cos(N+1)]]
so for N=4 we should have
mat3=[[np.cos(1)*np.cos(1) np.cos(2)*np.cos(1) ...] [np.cos(2)*np.cos(1) ...]...[... np.cos(5)*np.cos(5)]]
but what I actually get is the following
mat3=[[0.29192658+0.j 0.+0.j 0.+0.j ... 0.+0.j] ... [... 0.+0.j]]
or a matrix of all zeros apart from the mat3[0][0] element.
Does anybody know a correct way to do this and get the correct answer? I chose this as an example because the problem I'm trying to solve involves plotting a function which has been summed over two indices and the function that python outputs is not the same as Mathematica (i.e., a function of the form $$f(E)=\sum_{i=1}^{M}\sum_{j=1}^{M_2}F(i,j,E)$$).
The return statement is not indented correctly in your sample code: it sits inside the inner loop, so the function returns during the very first iteration. Move it out to the function-body level, so that both for loops finish:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
    return s1
>>> print(f(5))
1.527247272700347
I have moved your code to a more numpy-ish version:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
x = x.reshape((-1, 1))
y = y.reshape((1, -1))
mat = np.cos(x) * np.cos(y)
print(mat.sum()) # 1.5272472727003474
The trick here is to reshape x to a column vector and y to a row vector. When you multiply them, broadcasting pairs every element of x with every element of y, just like your nested loop does.
This should be faster, since only 2*N cosines are evaluated instead of 2*N*N, and it avoids explicit Python loops, which are slow.
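For this particular summand there is also a shortcut worth noting: because cos(i)*cos(j) factorizes, the double sum equals the square of a single 1D sum. The sketch below compares the two; the variable names are chosen only for illustration.

```python
import numpy as np

N = 5
c = np.cos(np.arange(1, N + 1))     # cos(1), ..., cos(N)

total_outer = np.outer(c, c).sum()  # explicit N x N table, then sum
total_factored = c.sum() ** 2       # (sum_i cos i) * (sum_j cos j)
```

Both values should agree with the 1.52725 reported in the question for N=5. This shortcut only applies when the summand separates into a factor depending on i and a factor depending on j.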
UPDATE (regarding your comment):
This pattern can be extended to any number of dimensions. Basically, you get something like an outer product, where every value of x is matched up with every value of y, z, u, k, ... along the corresponding axes.
It's a bit confusing to describe, so here is some more code:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
z = np.arange(N) + 1
x = x.reshape((-1, 1, 1))
y = y.reshape((1, -1, 1))
z = z.reshape((1, 1, -1))
mat = z**2 * np.cos(x) * np.cos(y)
# x along first axis
# y along second, z along third
# mat[0, 0, 0] == 1**2 * np.cos(1) * np.cos(1)
# mat[0, 4, 2] == 3**2 * np.cos(1) * np.cos(5)
If you use this for many dimensions, and big values for N, you will run into memory problems, though.
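The same broadcasting pattern can cover the f(E) case mentioned in the question, with E on a third axis and the sum taken over the first two. In this sketch the summand F is a hypothetical stand-in, chosen only so the result is easy to verify; the real F from the question is not known.

```python
import numpy as np

def F(i, j, E):
    # Hypothetical summand for illustration only.
    return np.cos(i) * np.cos(j) * E

M = 5
E = np.linspace(0.0, 2.0, 11)              # values at which f(E) is wanted
i = np.arange(1, M + 1).reshape(-1, 1, 1)  # i along axis 0
j = np.arange(1, M + 1).reshape(1, -1, 1)  # j along axis 1

# Broadcast to shape (M, M, len(E)), then sum out i and j.
f_E = F(i, j, E.reshape(1, 1, -1)).sum(axis=(0, 1))
```

The intermediate array has M*M*len(E) elements, so for large sizes it may be necessary to process E in chunks.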
I have written the following code for creating a 2D array and filling the first element of each row. I am new to numpy. Is there a better way to do this?
y=np.zeros(N*T1).reshape(N,T1)
x = np.linspace(0,L,num = N)
for k in range(0,N):
    y[k][0] = np.sin(PI*x[k]/L)
Yes, since numpy vectorizes operations, you can just do:
y[:,0] = np.sin(np.pi * x / L)
Note that y[:,0] grabs the first column of y: the : in the first coordinate means "all rows", and the 0 in the second coordinate means "the column at index 0", i.e. the first column. Since np.sin(np.pi * x / L) is also an array of length N, you can assign it to that column directly.
This question is better suited to Code Review Stack Exchange, but this snippet works!
import numpy as np
N = 1000 # arbitrary
T1 = 1000 # arbitrary
L = 10 # arbitrary
x = np.linspace(0,L,num = N)
# you don't need reshape here, give the size as a tuple!
y = np.zeros((N,T1))
# use a vectorized call here:
y[:,0] = np.sin(np.pi*x/L)
I have a 2D array shaped (1002,1004). For this question it could be generated via:
a = numpy.arange( (1002 * 1004) ).reshape(1002, 1004)
What I do is generate two lists. The lists are generated via:
theta = (61/180.) * numpy.pi
x = numpy.arange(a.shape[0]) #(1002, )
y = numpy.arange(a.shape[1]) #(1004, )
max_y_for_angle = int(y[-1] - (x[-1] / numpy.tan(theta)))
The first list is given by:
x_list = numpy.linspace(0, x[-1], len(x))
Note that this list is identical to x. However, for illustration purposes and to give a clear picture I declared this 'list'.
What I now want to do is create a y_list which is as long as x_list. I want to use these lists to determine the elements in my 2D array. After I determine and store the sum of the elements, I want to shift my y_list by one and determine the sum of the elements again. I want to do this for max_y_for_angle iterations. The code I have is:
sum_list = numpy.zeros(max_y_for_angle)
for idx in range(max_y_for_angle):
    y_list = numpy.linspace((len(x) / numpy.tan(theta)) + idx, y[0] + idx , len(x))
    elements = 0
    for i in range(len(x)):
        elements += a[x_list[i]][y_list[i]]
    sum_list[idx] = elements
This operation works. However, as one might imagine, it takes a lot of time because of the for loop within a for loop, and the number of iterations does not help either. How can I speed things up? The operation now takes about 1 s. I'm looking for something below 200 ms.
Is it maybe possible to return a list of the 2D array elements when the inputs are x_list and y_list? I tried the following but this does not work:
a[x_list][y_list]
Thank you very much!
It's possible to return an array of elements from a 2D array by doing a[x, y], where x and y are both integer arrays. This is called advanced indexing, or sometimes fancy indexing. In your question you mention lists a lot but never actually use any lists in your code; x_list and y_list are both arrays. Also, numpy multidimensional arrays are generally indexed as a[i, j], even when i and j are integer values.
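A tiny example may make the difference between a[x, y] and a[x][y] clearer. The arrays below are arbitrary and exist only for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
rows = np.array([0, 1, 2])
cols = np.array([2, 0, 1])

# Advanced indexing pairs the indices up: a[0, 2], a[1, 0], a[2, 1].
picked = a[rows, cols]

# Chained indexing first selects whole rows, then selects rows again.
chained = a[rows][cols]
```

Here picked is a 1D array with one element per index pair, while chained is still a 2D array of whole rows, which is why a[x_list][y_list] does not give the elements you want.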
Using fancy indexing along with some cleanup of your code produced this:
import numpy
def line_sums(a, theta):
    xsize, ysize = a.shape
    tan_theta = numpy.tan(theta)
    max_y_for_angle = int(ysize - 1 - ((xsize - 1) / tan_theta))
    x = numpy.arange(xsize)
    # column index of the line in each row, before any shift is applied
    y_base = numpy.linspace(xsize / tan_theta, 0, xsize)
    y_base = y_base.astype(int)
    sum_list = numpy.zeros(max_y_for_angle)
    for idx in range(max_y_for_angle):
        # one fancy-indexing call replaces the inner loop over rows
        sum_list[idx] = a[x, y_base + idx].sum()
    return sum_list
a = numpy.arange( (1002 * 1004) ).reshape(1002, 1004)
theta = (61/180.) * numpy.pi
sum_list = line_sums(a, theta)
Hope that helps.
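The remaining loop over shifts can probably be removed as well by broadcasting the index arrays into 2D. This is a sketch under the same assumptions as line_sums above; the name line_sums_vectorized is made up, and the intermediate array holds xsize * max_y_for_angle elements, which trades memory for speed.

```python
import numpy as np

def line_sums_vectorized(a, theta):
    xsize, ysize = a.shape
    tan_theta = np.tan(theta)
    max_y_for_angle = int(ysize - 1 - (xsize - 1) / tan_theta)
    x = np.arange(xsize)
    y_base = np.linspace(xsize / tan_theta, 0, xsize).astype(int)
    shifts = np.arange(max_y_for_angle)
    # Index arrays broadcast to (xsize, max_y_for_angle):
    # entry [r, s] is a[r, y_base[r] + s]. Summing over rows gives one
    # total per shift.
    return a[x[:, None], y_base[:, None] + shifts].sum(axis=0)

a = np.arange(20 * 30).reshape(20, 30)
theta = (61 / 180.0) * np.pi
sums = line_sums_vectorized(a, theta)
```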
In the example below I have a 2D array that has some real results that are shifted and padded. The shifts depend on the row (the padding is used to make the array rectangular as required by numpy). Is it possible to extract the real results without a Python loop?
import numpy as np
# results are 'shifted' where the shift depends on the row
shifts = np.array([0, 8, 4, 2], dtype=int)
max_shift = shifts.max()
n = len(shifts)
t = 10 # length of the real results we care about
a = np.empty((n, t + max_shift), dtype=int)
b = np.empty((n, t), dtype=int)
for i in range(n):
    a[i] = np.concatenate([[0] * shifts[i],               # shift
                           (i+1) * np.arange(1, t+1),     # real data
                           [0] * (max_shift - shifts[i])  # padding
                           ])
print("shifted and padded\n", a)
# I'd like to remove this Python loop if possible
for i in range(n):
    b[i] = a[i, shifts[i]:shifts[i] + t]
print("real data\n", b)
You can use two index arrays to get the data out:
a[np.arange(4)[:, None], shifts[:, None] + np.arange(10)]
or:
i, j = np.ogrid[:4, :10]
a[i, shifts[:, None]+j]
This is called advanced indexing in the NumPy documentation.
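The following sketch rebuilds the question's data and checks the indexing approach against the expected rows. It also shows np.take_along_axis as one possible alternative; using n and t instead of the literal 4 and 10 is a small generalization that is not in the original answer.

```python
import numpy as np

shifts = np.array([0, 8, 4, 2])
t = 10                                   # length of the real data per row
n = len(shifts)

# Rebuild the shifted, zero-padded input from the question.
a = np.zeros((n, t + shifts.max()), dtype=int)
for i in range(n):
    a[i, shifts[i]:shifts[i] + t] = (i + 1) * np.arange(1, t + 1)

cols = shifts[:, None] + np.arange(t)    # shape (n, t): column per output cell
b = a[np.arange(n)[:, None], cols]       # advanced indexing, no Python loop
b_alt = np.take_along_axis(a, cols, axis=1)  # same selection, different API
```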