I am trying to index a large array so that ultimately I can have a 4-D array with values at each of the points. I can do this in MATLAB using sub2ind, but I can't figure out how to do it in Python; any help would be appreciated. (I am also not sure if my indexing is right: I know MATLAB starts from 1, Python from 0.)
import numpy as np

# Create the array
[Nx, Ny, Nz] = (60, 220, 85)
[I, J, K] = (np.arange(1, Nx + 1, 1), np.arange(1, Ny + 1, 1), np.arange(1, Nz + 1, 1))
[I, J, K] = np.meshgrid(I, J, K)
print([I])
# this is the call I can't get right
ix = np.ravel_multi_index((Nx, Ny, Nz), (I[:], J[:], K[:]), order='F')
Thanks in advance
This is a 3-D array. It seems to be working now; I think the problem was the 1-based indexing and that I hadn't structured the arguments correctly.
import numpy as np

# Create the array
[Nx, Ny, Nz] = (60, 220, 85)
[I, J, K] = (np.arange(0, Nx, 1), np.arange(0, Ny, 1), np.arange(0, Nz, 1))
[I, J, K] = np.meshgrid(I, J, K)

# Create the 1-D indexed array
ix = np.ravel_multi_index((I, J, K), (Nx, Ny, Nz), order='F')
print(ix)
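A quick sanity check on this (a sketch; it assumes the default indexing='xy' of np.meshgrid, which puts the J axis first, and 0-based indices, unlike MATLAB's 1-based sub2ind):

# an arbitrary 0-based point (hypothetical values, just for the check)
i0, j0, k0 = 3, 5, 7
lin = ix[j0, i0, k0]                           # default meshgrid ordering is (Ny, Nx, Nz)
assert lin == i0 + Nx * j0 + Nx * Ny * k0      # column-major (Fortran) formula
assert np.unravel_index(lin, (Nx, Ny, Nz), order='F') == (i0, j0, k0)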
I am attempting a numpy.matmul call using the following variables:
Matrix A of dimensions (p, t, q)
Matrix B of dimensions (r, t).
A categories vector of shape (r,) containing p distinct categories, used to take slices of B and to define which index of A to use.
The multiplications are done iteratively over the categories. For each category p_i, I extract from A a submatrix of shape (t, q). Then I multiply it with the subset of B of shape (x, t), where x is the number of rows selected by the mask categories == p_i. Finally, the matrix multiplication of (x, t) and (t, q) produces an output of shape (x, q), which is stored at S[mask].
I cannot figure out a non-iterative version of this algorithm. The first snippet below shows the iterative solution. The second one is an attempt at what I would like to get, where everything is calculated in a single step and would presumably be faster. However, it is incorrect because matrix A has three dimensions instead of two. Maybe there is no way to do this in NumPy with a single call; in any case, I am looking for advice/ideas to try out.
Thanks!
import numpy as np

p, q, r, t = 2, 9, 512, 4

# data initialization (random)
np.random.seed(500)
S = np.random.rand(r, q)
A = np.random.randint(0, 3, size=(p, t, q))
B = np.random.rand(r, t)
categories = np.random.randint(0, p, r)

print('iterative')  # iterative
for i in range(p):
    # print(i)
    a = A[i, :, :]
    mask = categories == i
    b = B[mask]
    print(b.shape, a.shape, S[mask].shape,
          np.matmul(b, a).shape)
    S[mask] = np.matmul(b, a)
print(S.shape)
A simple way to write it down:
S = np.random.rand(r, q)
print(A[:p,:,:].shape)
result = np.matmul(B, A[:p,:,:])
# iterative assignment
i = 0
S[categories == i] = result[i, categories == i, :]
i = 1
S[categories == i] = result[i, categories == i, :]
The next snippet will produce an error during the multiplication step.
# attempt to multiply once, indexing all categories only once (not possible)
np.random.seed(500)
S = np.random.rand(r, q)

# attempt to use the categories vector
a = A[categories, :, :]
b = B[categories]

# due to the shapes of the arrays, this multiplication is not possible
print('\nsingle step (error due to shapes of the matrix a)')
print(b.shape, a.shape, S[categories].shape)
S[categories] = np.matmul(b, a)
print(S.shape)
iterative
(250, 4) (4, 9) (250, 9) (250, 9)
(262, 4) (4, 9) (262, 9) (262, 9)
(512, 9)
single step (error due to shapes of the 2nd matrix a).
(512, 4) (512, 4, 9) (512, 9)
In [63]: (np.ones((512,4))@np.ones((512,4,9))).shape
Out[63]: (512, 512, 9)
This is because the first array is broadcast to (1,512,4). I think you want instead to do:
In [64]: (np.ones((512,1,4))@np.ones((512,4,9))).shape
Out[64]: (512, 1, 9)
Then remove the middle dimension to get a (512,9).
Another way:
In [72]: np.einsum('ij,ijk->ik', np.ones((512,4)), np.ones((512,4,9))).shape
Out[72]: (512, 9)
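Applied to the arrays from the question's last snippet (a sketch; b and a are the B[categories] and A[categories] slices defined there), that looks like:

# add a singleton middle axis so each row of b multiplies its own (t, q) slice of a
prod = np.matmul(b[:, None, :], a)[:, 0, :]     # shape (512, 9)

# equivalent einsum spelling
prod_einsum = np.einsum('ij,ijk->ik', b, a)     # shape (512, 9)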
To remove the loop altogether, you can try this:
bigmask = np.arange(p)[:, np.newaxis] == categories
C = np.matmul(B, A)
res = C[np.broadcast_to(bigmask[..., np.newaxis], C.shape)].reshape(r, q)
# `res` has the same rows as the iterative `S` but in the wrong order
# so we need to reorder the rows
sort_index = np.argsort(np.broadcast_to(np.arange(r), bigmask.shape)[bigmask])
assert np.allclose(S, res[sort_index])
Though I'm not sure it's much faster than the iterative version.
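A possibly simpler loop-free variant (a sketch, not benchmarked): index A by categories so that each row of B is paired directly with its category's (t, q) slice; the rows then come out in the original order, so no reordering is needed.

# B[i] (shape (t,)) is multiplied with A[categories[i]] (shape (t, q)) for every row i
res2 = np.matmul(B[:, None, :], A[categories])[:, 0, :]   # shape (r, q)
assert np.allclose(S, res2)   # should match the iterative S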
I am getting an error indicating mismatched array sizes when trying to insert values into a predefined array of zeros using slicing inside a for loop. In the three examples below, for array[start:stop] we always have stop - start = 5. But a problem occurs when the start and stop values are shifted, which is strange because, as I mentioned, the difference between start and stop is always the same.
Attached are three examples that I have tried. Only the third one works without a problem, while the other two raise the following error:
ValueError: could not broadcast input array from shape (y,1) into shape (x,1)
# First example:
arr0 = np.array([38, 78, 118, 158, 158, 198, 238, 278])
arr1 = np.array([43, 83, 123, 163, 163, 203, 243, 283])
res1 = [np.array([0, -0.5, -1, -0.5, 0])[:, None], np.array([0, -1., -2., -1., 0])[:, None]]
nod1 = np.array([0, 0, 0, 0, 1, 1, 1, 1])

out1 = []
for i in range(0, 8):
    print(i)
    out1.append(np.zeros((281, 1)))

for i, j, k, l in zip(range(0, 8), nod1, arr0, arr1):
    print(i, j, k, l)
    out1[i][k:l] = res1[j]

# Error: ValueError: could not broadcast input array from shape (5,1) into shape (3,1)
# Second example:
arr2 = arr0 - 1
arr3 = arr1 - 1

out1 = []
for i in range(0, 8):
    print(i)
    out1.append(np.zeros((281, 1)))

for i, j, k, l in zip(range(0, 8), nod1, arr2, arr3):
    print(i, j, k, l)
    out1[i][k:l] = res1[j]

# Error: ValueError: could not broadcast input array from shape (5,1) into shape (4,1)
# Third example:
arr4 = arr0 - 2
arr5 = arr1 - 2

out1 = []
for i in range(0, 8):
    print(i)
    out1.append(np.zeros((281, 1)))

for i, j, k, l in zip(range(0, 8), nod1, arr4, arr5):
    print(i, j, k, l)
    out1[i][k:l] = res1[j]

# Passed, both sides have shape (5, 1)
My question is: what is causing this error?
From what I read at Understanding slicing, one user pointed out that:
Of course, if (stop-start)%stride != 0, then the end point will be a little lower than stop-1.
but unfortunately I can't see how this applies to my examples.
I would appreciate your help and a discussion of how to overcome this problem, because the [start:stop] values are important to me in order to place new values at the exact location in the predefined array of zeros.
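For reference, the failing assignment from the first example can be isolated outside the loops like this (a sketch; 278 and 283 are the last entries of arr0 and arr1, and 281 is the length of the target array):

import numpy as np

target = np.zeros((281, 1))
block = np.array([0, -1., -2., -1., 0])[:, None]   # shape (5, 1)

print(target[278:283].shape)   # (3, 1): the slice stops at the end of the array
target[278:283] = block        # ValueError: could not broadcast (5,1) into (3,1)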
I need to multiply two 3-D arrays in an unusual way.
If needed to accomplish my task, I can "afford" to permute them (change their shape) as needed, since they are pretty small (less than (1_000, 200, 200) of np.complex128).
At the moment, I have the following inefficient triple nested for-loop:
import numpy as np

result = np.zeros((640, 39, 20))
a = np.random.rand(640, 640, 20)
b = np.random.rand(39, 640, 20)

for j in range(640):
    for m in range(39):
        for l in range(20):
            result[j, m, l] = (a[j, :, l] * b[m, :, l]).sum()
How can I make the above as efficient as possible using numpy's magic?
I know I could use numba and hope that I beat numpy by using compiled code and parallel=True, but I want to see if numpy suffices for my task.
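For reference, the triple loop above can be collapsed into a single einsum call (a sketch, not benchmarked; the index letters j, m, l follow the loop variables, with the shared middle axis summed over):

# result[j, m, l] = sum_i a[j, i, l] * b[m, i, l]
result_vec = np.einsum('jil,mil->jml', a, b)
assert np.allclose(result, result_vec)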
EDIT: Does it work for a more complex inner for loop like the one below?
for l in range(20):
    for m in range(-l, l + 1, 1):
        for j in range(640):
            result[j, m, l] = (a[j, :, l] * b[m, :, l]).sum()
After @hpaulj's comment, I now understand that the above is not possible.
Thank you!
I have a functional evaluation which has many parameters, and I want to vectorize the evaluation. Something like this:
import numpy as np

I = 100
J = 34
K = 6
i, j, k = np.array(range(I)), np.array(range(J)), np.array(range(K))
i, j, k = np.meshgrid(i, j, k)
f = myfun(i, j, k)
This is excellent; however, I now also have a parameter that I want to pass to myfun. I generate it with some other function, and it is invariant over some of the indices above, thus:
p = my_param_gen()
and let's say
p.shape
will output
(100, 6)
This would correspond to p being invariant over the index J. Now, I would like to expand the shape of p to be
(100, 34, 6)
in a meshgrid-like fashion, so that the new dimension is filled with constant copies of the old values. What is the best way to do this? The approach should also work when adding many new dimensions. I have looked at numpy.expand_dims, but it does not do this.
With your default meshgrid call:
In [116]: i.shape
Out[116]: (34, 100, 6)
If p.shape is (100,6), then p will broadcast with i, j, k without further change; the p[None,:,:] expansion is automatic.
If you'd used i, j, k = np.meshgrid(i, j, k, indexing='ij'),
In [121]: i.shape
Out[121]: (100, 34, 6)
And p[:,None,:] would be needed for broadcasting (equivalently np.expand_dims(p, 1)).
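A small demonstration of both cases, plus an explicit expansion if a materialized (100, 34, 6) array is really needed (a sketch; p is a random stand-in for the output of my_param_gen()):

import numpy as np

I, J, K = 100, 34, 6
p = np.random.rand(I, K)          # stand-in for my_param_gen(), shape (100, 6)

# default meshgrid ('xy' indexing): grids have shape (J, I, K) = (34, 100, 6)
i, j, k = np.meshgrid(np.arange(I), np.arange(J), np.arange(K))
print((p + i).shape)              # (34, 100, 6) -- p broadcasts as p[None, :, :]

# 'ij' indexing: grids have shape (I, J, K) = (100, 34, 6)
i, j, k = np.meshgrid(np.arange(I), np.arange(J), np.arange(K), indexing='ij')
print((p[:, None, :] + i).shape)  # (100, 34, 6)

# an explicit read-only (100, 34, 6) view, if a full array is required
p_full = np.broadcast_to(p[:, None, :], (I, J, K))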
To be more specific, here is the exact requirement. I'm not sure how to word the question.
I have an image of size, say, (500, 500). I extract only the r and g channels:
r = image[:, :, 0]
g = image[:, :, 1]
Then I compute the 2D histogram of r and g:
hist2d = np.histogram2d(r.ravel(), g.ravel(), bins=256, range=[[0, 256], [0, 256]])
Now, hist2d[0].shape is (256, 256), since it corresponds to every pair of the 256 x 256 colors. Fine.
The main requirement is: in a separate image called result, with the same shape as the original image, i.e. (500, 500), I want to populate each element of result with the value of the 2D histogram for that pixel's r and g values.
For example, if r[200, 200] is 23 and g[200, 200] is 26, I want result[200, 200] = hist2d[0][23, 26].
The naive way of doing this is a simple Python loop:
for i in range(r.shape[0]):
    for j in range(r.shape[1]):
        result[i, j] = hist2d[0][r[i, j], g[i, j]]
But for a large image, this takes a significant time to compute. Is there a numpy way of doing this?
Thanks
Just use hist2d[0][r, g]:
import numpy as np
r, g, b = np.random.randint(0, 256, size=(3, 500, 500)).astype(np.uint8)
hist2d = np.histogram2d(r.ravel(), g.ravel(), bins=256, range=[[0, 256], [0, 256]])
hist2d[0][r, g]
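As a quick check, this single indexing expression should reproduce the loop from the question (a sketch; it assumes r and g are integer-valued, so they can be used directly as indices into the 256 x 256 histogram):

result = hist2d[0][r, g]

# loop version from the question, for comparison
expected = np.empty((500, 500))
for i in range(r.shape[0]):
    for j in range(r.shape[1]):
        expected[i, j] = hist2d[0][r[i, j], g[i, j]]

assert np.allclose(result, expected)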