I have a 3D numpy array of shape (3,3,3). I would like to obtain the indices of the maximum value in each plane, where a "plane", as I mean it, is as follows:
a = np.random.rand(3,3,3)
>>> a[:,:,0]
array([[0.98423332, 0.44410844, 0.06945133],
       [0.69876575, 0.87411547, 0.53595041],
       [0.53418486, 0.16186808, 0.60579623]])
>>> a[:,:,1]
array([[0.38969199, 0.80202126, 0.62189662],
       [0.66609605, 0.09771614, 0.74061269],
       [0.77081531, 0.20068743, 0.72762023]])
>>> a[:,:,2]
array([[0.57110332, 0.29021439, 0.15433043],
       [0.21762439, 0.93112448, 0.05763075],
       [0.77880124, 0.36637245, 0.29070822]])
I have a solution, but I would like something shorter and quicker, without for loops. My solution is as follows:
for i in range(3):
    x = a[:,:,i].argmax() // 3
    y = a[:,:,i].argmax() % 3
    z = i
    print((x, y, z))
    print(a[x, y, z])
(0, 0, 0)
0.9842333247061394
(0, 1, 1)
0.8020212566990867
(1, 1, 2)
0.9311244845473187
We simply need to reshape the input array to 2D by merging the last two axes, and then apply argmax along the second (merged) axis to get a vectorized approach -
def argmax_each_plane(a):
    a2D = a.reshape(a.shape[0],-1)
    idx = a2D.argmax(1)
    indices = np.unravel_index(idx, a.shape[1:])
    vals = a2D[np.arange(len(idx)), idx]
    return vals, np.c_[indices]
Sample run -
In [60]: np.random.seed(0)
...: a = np.random.rand(3,3,3)
In [61]: a
Out[61]:
array([[[0.5488135 , 0.71518937, 0.60276338],
        [0.54488318, 0.4236548 , 0.64589411],
        [0.43758721, 0.891773  , 0.96366276]],

       [[0.38344152, 0.79172504, 0.52889492],
        [0.56804456, 0.92559664, 0.07103606],
        [0.0871293 , 0.0202184 , 0.83261985]],

       [[0.77815675, 0.87001215, 0.97861834],
        [0.79915856, 0.46147936, 0.78052918],
        [0.11827443, 0.63992102, 0.14335329]]])
In [62]: v, ind = argmax_each_plane(a)
In [63]: v
Out[63]: array([0.96366276, 0.92559664, 0.97861834])
In [64]: ind
Out[64]:
array([[2, 2],
       [1, 1],
       [0, 2]])
If you need z indices as well, use: np.c_[indices[0], indices[1], range(len(a2D))].
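For example, a small sketch of that variant (the name argmax_each_plane_xyz is mine; it keeps the same first-axis-as-plane convention as the function above):
def argmax_each_plane_xyz(a):
    # Same idea as argmax_each_plane, but also return the plane index as the first column
    a2D = a.reshape(a.shape[0], -1)
    idx = a2D.argmax(1)
    y, z = np.unravel_index(idx, a.shape[1:])
    vals = a2D[np.arange(len(idx)), idx]
    return vals, np.c_[np.arange(a.shape[0]), y, z]
With the seeded sample above, the index array would come out as [[0, 2, 2], [1, 1, 1], [2, 0, 2]].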
Related
Suppose I have an n-D array a, with an arbitrary n:
a = np.arange(3*4*5*...).reshape((3, 4, 5, ...))
...And I have an array of indices (exactly this shape or transposed):
ix = np.array([
[0,1,1,...],
[0,0,1,...],
[1,0,1,...],
])
What is the fastest way of getting an array of the elements from a at the indices from ix?
(e.g. the fastest way of obtaining:
def take_at(a, ix):
    res = np.empty((ix.shape[0], *a.shape[ix.shape[1]:]))
    for i in range(ix.shape[0]):
        res[i, ...] = a[(*ix[i],)]
    return res
...using (if possible) only numpy vectorized functions / loops / operations?)
Simple copy-pastable example to test:
a = np.arange(3*4*5).reshape((3,4,5))
ix = np.array([
[0,1,1],
[0,0,1],
[1,0,1],
])
def take_at(a, ix):
    res = np.empty((ix.shape[0], *a.shape[ix.shape[1]:]), dtype=a.dtype)
    for i in range(ix.shape[0]):
        res[i, ...] = a[(*ix[i],)]
    return res
take_at(a, ix)
You need to pack ix into a tuple:
take_at(a, ix)
Out[]: array([ 6, 1, 21])
a[tuple(ix.T)]
Out[]: array([ 6, 1, 21])
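As a side note, this also seems to cover the partial-indexing case from the question, where ix indexes only the leading axes and the trailing axes are kept whole (a quick check, reusing a and take_at from above; ix2 is a name I made up):
ix2 = np.array([[0, 1],
                [2, 3]])
a[tuple(ix2.T)].shape                             # (2, 5)
np.array_equal(a[tuple(ix2.T)], take_at(a, ix2))  # True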
You can do:
a[ix[:,0],ix[:,1],ix[:,2]]
Output:
array([ 6, 1, 21])
weight = np.array([[[ 0.38932115, -0.27430567]],
                   [[-0.04543304, -0.05643598]],
                   [[ 0.46912688, -0.07695298]]])
data = np.array([[-0.2056065, 0.7889058]])
I want to take the dot product of the row in data with each row in weight. And what if data is multi-dimensional, like
data = np.array([[1, 2, 3], [4, 5, 6]])
How could I accomplish this? I tried tensordot, but the way its axes argument works seems a bit convoluted / non-obvious. Is there an easier way?
Your use of row and vector is a bit ambiguous:
In [7]: weight = np.array([[[ 0.38932115, -0.27430567]],
   ...:
   ...:                    [[-0.04543304, -0.05643598]],
   ...:
   ...:                    [[ 0.46912688, -0.07695298]]])
   ...:
   ...: data = np.array([[-0.2056065, 0.7889058]])
In [8]: weight.shape
Out[8]: (3, 1, 2)
In [9]: data.shape
Out[9]: (1, 2)
Are your rows of shape (2,) or (1,2)?
dot is a 'sum of products' function, but sum on which axis?
With einsum we can control the sum axis.
Sum over both the size-1 and size-2 axes (j and k in the einsum):
In [11]: np.einsum('ijk,jk',weight, data)
Out[11]: array([-0.29644829, -0.03518134, -0.15716419]) # shape (3,)
or just the size-1 axis (j):
In [12]: np.einsum('ijk,jm',weight, data)
Out[12]:
array([[[-0.08004696,  0.30713771],
        [ 0.05639903, -0.21640133]],

       [[ 0.00934133, -0.03584239],
        [ 0.0116036 , -0.04452267]],

       [[-0.09645554,  0.37009692],
        [ 0.01582203, -0.06070865]]])
In [13]: _.shape
Out[13]: (3, 2, 2)
Or just the size-2 axis (k):
In [14]: np.einsum('ijk,mk',weight, data)
Out[14]:
array([[[-0.29644829]],

       [[-0.03518134]],

       [[-0.15716419]]])
In [16]: _.shape
Out[16]: (3, 1, 1)
matmul/@ also does this sum - data.T changes the (1,2) array to a (2,1). This pairs the (3,1,2) with a (2,1) to fit the "last of A with the second-to-last of B" rule for dot/@.
In [17]: weight @ data.T
Out[17]:
array([[[-0.29644829]],

       [[-0.03518134]],

       [[-0.15716419]]])
You ask about multidimensional data. Just what do you mean by that? It already is 2d. Do you mean an (n,2) array, or an (n,1,2)? What's the relation between this n dimension and the 3 dimension of weight? No hand waving please :)
You can also use np.apply_along_axis
np.apply_along_axis(lambda x:np.dot(x,data.T),2,weight)
which gives
array([[[-0.29644829]],

       [[-0.03518134]],

       [[-0.15716419]]])
If data contains more than one row, this will also work, for example
weight = np.array([[[ 0.38932115, -0.27430567]],
                   [[-0.04543304, -0.05643598]],
                   [[ 0.46912688, -0.07695298]]])
data = np.array([[-0.2056065, 0.7889058], [-0.2056065, 0.7889058]])
np.apply_along_axis(lambda x:np.dot(x,data.T),2,weight)
gives you
array([[[-0.29644829, -0.29644829]],

       [[-0.03518134, -0.03518134]],

       [[-0.15716419, -0.15716419]]])
First transpose data before taking the dot product.
>>> weight.dot(data.T)
array([[[-0.29644829]],

       [[-0.03518134]],

       [[-0.15716419]]])
# Multiple rows of data.
data = np.array([[-0.2056065, 0.7889058],
                 [0.7889058, -0.2056065]])
>>> weight.dot(data.T)
array([[[-0.29644829,  0.36353674]],

       [[-0.03518134, -0.02423878]],

       [[-0.15716419,  0.38591895]]])
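For what it's worth, if you would rather have a flat (3, n) result than the (3, 1, n) shape above, an einsum variant along the lines of the earlier answer should also work (this sketch is mine, not from the answers above):
np.einsum('ijk,mk->im', weight, data)
# array([[-0.29644829,  0.36353674],
#        [-0.03518134, -0.02423878],
#        [-0.15716419,  0.38591895]])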
I am new to python/numpy.
I need to do the following calculation:
for an array of discrete times t, calculate $e^{At}$ for a $2\times 2$ matrix $A$
What I did:
import numpy as np
import scipy.linalg

def calculate(t_, x_0, v_0, omega_0, c):
    # define A
    a_11, a_12, a_21, a_22 = 0, 1, -omega_0^2, -c
    A = np.matrix([[a_11, a_12], [a_21, a_22]])
    print(A)
    # use vectorization
    temps = np.array(t_)
    A_ = np.array([A for k in range(1, n+1, 1)])
    temps*A_
    x_ = scipy.linalg.expm(temps*A)
    v_ = A*scipy.linalg.expm(temps*A)
    return x_, v_
n=10
omega_0=1
c=1
x_0=1
v_0=1
t_ = [float(5*k*np.pi/n) for k in range (1,n+1,1)]
x_, v_ = calculate(t_,x_0,v_0,omega_0,c)
However, I get this error when multiplying A_ (an array containing A n times) and temps (containing the times for which I want to calculate exp(At)):
ValueError: operands could not be broadcast together with shapes (10,) (10,2,2)
As I understand vectorization, each element in A_ would be multiplied by the element at the same index from temps, but I think I don't get it right.
Any help/comments much appreciated.
A pure numpy calculation of t_ is (creates an array instead of a list):
In [254]: t = 5*np.arange(1,n+1)*np.pi/n
In [255]: t
Out[255]:
array([ 1.57079633,  3.14159265,  4.71238898,  6.28318531,  7.85398163,
        9.42477796, 10.99557429, 12.56637061, 14.13716694, 15.70796327])
In [256]: a_11,a_12, a_21, a_22=0,1,-omega_0^2,-c
In [257]: a_11
Out[257]: 0
In [258]: A = np.array([[a_11,a_12], [a_21, a_22]])
In [259]: A
Out[259]:
array([[ 0,  1],
       [-3, -1]])
(Note that ^ is Python's bitwise XOR, not exponentiation, which is why -omega_0^2 comes out as -3 here; -omega_0**2 would give -1.)
In [260]: t.shape
Out[260]: (10,)
In [261]: A.shape
Out[261]: (2, 2)
In [262]: A_ = np.array([A for k in range (1,n+1,1)])
In [263]: A_.shape
Out[263]: (10, 2, 2)
A_ is an np.ndarray. I made A an np.ndarray as well; yours is an np.matrix, but your A_ will still be an np.ndarray. np.matrix can only be 2d, whereas A_ is 3d.
So t * A_ is attempted as array elementwise multiplication, hence the broadcasting error with shapes (10,) and (10,2,2).
To do that elementwise multiplication right you need something like
In [264]: result = t[:,None,None]*A[None,:,:]
In [265]: result.shape
Out[265]: (10, 2, 2)
But if you want matrix multiplication of the (10,) with (10,2,2), then einsum does it easily:
In [266]: result1 = np.einsum('i,ijk', t, A_)
In [267]: result1
Out[267]:
array([[   0.        ,   86.39379797],
       [-259.18139392,  -86.39379797]])
np.dot can't do it because its rule is 'last with 2nd to last'. tensordot can, but I'm more comfortable with einsum.
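For reference, the tensordot version would look something like the line below (a sketch: it contracts t's only axis with the first axis of A_):
np.tensordot(t, A_, axes=(0, 0))   # same (2, 2) result as the einsum above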
But that einsum expression makes it obvious (to me) that I can get the same thing from the elementwise *, by summing on the 1st axis:
In [268]: (t[:,None,None]*A[None,:,:]).sum(axis=0)
Out[268]:
array([[   0.        ,   86.39379797],
       [-259.18139392,  -86.39379797]])
Or (t[:,None,None]*A[None,:,:]).cumsum(axis=0) to get a 2x2 for each time.
This is what I would do.
import numpy as np
from scipy.linalg import expm
A = np.array([[1, 2], [3, 4]])
for t in np.linspace(0, 5*np.pi, 20):
    print(expm(t*A))
No attempt at vectorization here. The expm function applies to one matrix at a time, and it surely takes the bulk of computation time. No need to worry about the cost of multiplying A by a scalar.
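If you do want the results gathered into a single array (one 2x2 matrix per time step), a small sketch along the same lines, assuming the question's parameters:
import numpy as np
from scipy.linalg import expm

omega_0, c, n = 1.0, 1.0, 10                    # parameters from the question
A = np.array([[0.0, 1.0], [-omega_0**2, -c]])   # note ** (power), not ^ (XOR)
t = 5 * np.arange(1, n + 1) * np.pi / n
exp_At = np.array([expm(ti * A) for ti in t])   # shape (10, 2, 2)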
I have a set of data in Python with columns like:
x y angle
I want to calculate the distance between every possible pair of points and plot those distances against the difference between the corresponding angles.
x, y, a = np.loadtxt('w51e2-pa-2pk.log', unpack=True)
n = 0
f=(((x[n])-x[n+1:])**2+((y[n])-y[n+1:])**2)**0.5
d = a[n]-a[n+1:]
plt.scatter(f,d)
There are 255 points in my data.
f is the distance and d is the difference between two angles.
My question is: can I set n = [1, 2, 3, ..., 255] and redo the calculation to get f and d for all possible pairs?
You can obtain the pairwise distances through broadcasting by considering it as an outer operation on the array of 2-dimensional vectors as follows:
vecs = np.stack((x, y)).T
np.linalg.norm(vecs[np.newaxis, :] - vecs[:, np.newaxis], axis=2)
For example,
In [1]: import numpy as np
...: x = np.array([1, 2, 3])
...: y = np.array([3, 4, 6])
...: vecs = np.stack((x, y)).T
...: np.linalg.norm(vecs[np.newaxis, :] - vecs[:, np.newaxis], axis=2)
...:
Out[1]:
array([[ 0.        ,  1.41421356,  3.60555128],
       [ 1.41421356,  0.        ,  2.23606798],
       [ 3.60555128,  2.23606798,  0.        ]])
Here, the (i, j)'th entry is the distance between the i'th and j'th vectors.
The case of the pairwise differences between angles is similar, but simpler, as you only have one dimension to deal with:
In [2]: a = np.array([10, 12, 15])
...: a[np.newaxis, :] - a[: , np.newaxis]
...:
Out[2]:
array([[ 0,  2,  5],
       [-2,  0,  3],
       [-5, -3,  0]])
Moreover, plt.scatter does not care that the results are given as matrices, so putting everything together using the notation of the question, you can obtain the plot of angle differences against distances with something like
vecs = np.stack((x, y)).T
f = np.linalg.norm(vecs[np.newaxis, :] - vecs[:, np.newaxis], axis=2)
d = angle[np.newaxis, :] - angle[: , np.newaxis]
plt.scatter(f, d)
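As an aside (assuming scipy is available and x, y, angle are the arrays loaded earlier), the same pairwise quantities can be computed in condensed form, with each unordered pair counted once, using scipy.spatial.distance:
import matplotlib.pyplot as plt
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Condensed pairwise distances between the (x, y) points
f = pdist(np.column_stack((x, y)))
# Matching pairwise angle differences, taken from the upper triangle of the square matrix
d = squareform(angle[np.newaxis, :] - angle[:, np.newaxis], checks=False)
plt.scatter(f, d)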
You could use a for loop and range() to iterate over n, e.g. like this:
n = len(x)
for i in range(n):
    # do something with the current index
    # e.g. print the points
    print(x[i])
    print(y[i])
But note that if you use i+1 in the last iteration, that index is already outside of your list.
Also, be careful with (x[n])-x[n+1:]: if x were a plain Python list this would not work, because x[n] is a single value while x[n+1:] is a list starting from the (n+1)'th element, and you cannot subtract a list from a number. Since np.loadtxt returns NumPy arrays, however, the subtraction broadcasts and does work.
Maybe you will even have to use two nested loops to do what you want. I guess you want the distance between each pair of points, so a two-dimensional array may be the data structure you want.
If you are interested in all combinations of the points in x and y, I suggest using itertools, which will give you all possible index combinations. Then you can do it as follows:
import itertools
f = [((x[i]-x[j])**2 + (y[i]-y[j])**2)**0.5 for i, j in itertools.product(range(255), range(255)) if i != j]
# and similar for the angles
But maybe there is even an easier way...
I have a matrix (2d numpy ndarray, to be precise):
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
And I want to roll each row of A independently, according to roll values in another array:
r = np.array([2, 0, -1])
That is, I want to do this:
print(np.array([np.roll(row, x) for row, x in zip(A, r)]))

[[0 0 4]
 [1 2 3]
 [0 5 0]]
Is there a way to do this efficiently? Perhaps using fancy indexing tricks?
Sure you can do it using advanced indexing; whether it is the fastest way probably depends on your array size (if your rows are large it may not be):
rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]

# Always use a negative shift, so that column_indices are valid.
# (could also use a modulo operation)
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:, np.newaxis]
result = A[rows, column_indices]
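A quick check with the question's A and r (note that the r[r < 0] line above modifies r in place, so pass a copy if you still need the original shifts):
print(result)
[[0 0 4]
 [1 2 3]
 [0 5 0]]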
numpy.lib.stride_tricks.as_strided stricks (abbrev pun intended) again!
Speaking of fancy indexing tricks, there's the infamous np.lib.stride_tricks.as_strided. The idea/trick would be to take a sliced portion starting from the first column up to the second-last one and concatenate it at the end. This ensures that we can stride in the forward direction as needed to leverage np.lib.stride_tricks.as_strided and thus avoid the need of actually rolling back. That's the whole idea!
Now, in terms of actual implementation we would use scikit-image's view_as_windows to elegantly use np.lib.stride_tricks.as_strided under the hood. Thus, the final implementation would be -
from skimage.util.shape import view_as_windows as viewW

def strided_indexing_roll(a, r):
    # Concatenate with sliced version to cover all rolls
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)

    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = a.shape[1]
    return viewW(a_ext, (1, n))[np.arange(len(r)), (n-r) % n, 0]
Here's a sample run -
In [327]: A = np.array([[4, 0, 0],
     ...:               [1, 2, 3],
     ...:               [0, 0, 5]])
In [328]: r = np.array([2, 0, -1])
In [329]: strided_indexing_roll(A, r)
Out[329]:
array([[0, 0, 4],
       [1, 2, 3],
       [0, 5, 0]])
Benchmarking
# @seberg's solution
def advindexing_roll(A, r):
    rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
    r[r < 0] += A.shape[1]
    column_indices = column_indices - r[:, np.newaxis]
    return A[rows, column_indices]
Let's do some benchmarking on an array with a large number of rows and columns -
In [324]: np.random.seed(0)
...: a = np.random.rand(10000,1000)
...: r = np.random.randint(-1000,1000,(10000))
# @seberg's solution
In [325]: %timeit advindexing_roll(a, r)
10 loops, best of 3: 71.3 ms per loop
# Solution from this post
In [326]: %timeit strided_indexing_roll(a, r)
10 loops, best of 3: 44 ms per loop
In case you want a more general solution (dealing with any shape and any axis), I modified @seberg's solution:
def indep_roll(arr, shifts, axis=1):
    """Apply an independent roll for each dimension of a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.
    shifts : np.ndarray
        How much to shift each slice along the given axis. Shape: `(arr.shape[axis],)`.
    axis : int
        Axis along which elements are shifted.
    """
    arr = np.swapaxes(arr, axis, -1)
    all_idcs = np.ogrid[[slice(0, n) for n in arr.shape]]

    # Convert to a positive shift
    shifts[shifts < 0] += arr.shape[-1]
    all_idcs[-1] = all_idcs[-1] - shifts[:, np.newaxis]

    result = arr[tuple(all_idcs)]
    arr = np.swapaxes(result, -1, axis)
    return arr
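A hedged usage sketch (the shift values for axis=0 are mine): with the question's A, axis=1 reproduces the per-row roll, and axis=0 rolls each column independently:
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
print(indep_roll(A, np.array([2, 0, -1]), axis=1))   # same result as the question
[[0 0 4]
 [1 2 3]
 [0 5 0]]
print(indep_roll(A, np.array([1, 0, 2]), axis=0))    # roll each column independently
[[0 0 3]
 [4 2 5]
 [1 0 0]]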
I implemented a pure numpy.lib.stride_tricks.as_strided solution as follows:
from numpy.lib.stride_tricks import as_strided

def custom_roll(arr, r_tup):
    m = np.asarray(r_tup)
    arr_roll = arr[:, [*range(arr.shape[1]), *range(arr.shape[1]-1)]].copy()  # need `copy`
    strd_0, strd_1 = arr_roll.strides
    n = arr.shape[1]
    result = as_strided(arr_roll, (*arr.shape, n), (strd_0, strd_1, strd_1))
    return result[np.arange(arr.shape[0]), (n-m) % n]
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])

out = custom_roll(A, r)

out
Out[789]:
array([[0, 0, 4],
       [1, 2, 3],
       [0, 5, 0]])
By using a fast Fourier transform we can apply a transformation in the frequency domain and then use the inverse fast Fourier transform to obtain the row shift.
So this is a pure numpy solution that takes only one line:
import numpy as np
from numpy.fft import fft, ifft
# The row shift function using the fast Fourier transform
#   rshift(A, r) where A is a 2D array, r the row shift vector
def rshift(A, r):
    return np.real(ifft(fft(A, axis=1)*np.exp(2*1j*np.pi/A.shape[1]*r[:, None]*np.r_[0:A.shape[1]][None, :]), axis=1).round())
This will apply a left shift, but we can simply negate the exponent of the exponential to turn the function into a right-shift function:
ifft(fft(...)*np.exp(-2*1j...)
It can be used like that:
# Example:
A = np.array([[1,2,3,4],
              [1,2,3,4],
              [1,2,3,4]])
r = np.array([1,-1,3])
print(rshift(A,r))
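If I read the shift convention right, that example should print something like the following (each row left-shifted by its entry in r, up to floating-point rounding):
[[2. 3. 4. 1.]
 [4. 1. 2. 3.]
 [4. 1. 2. 3.]]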
Building on Divakar's excellent answer, you can apply this logic to a 3D array easily (which is the problem that brought me here in the first place). Here's an example - basically flatten your data, roll it and reshape it afterwards:
from skimage.util.shape import view_as_windows as viewW

def applyroll_30(cube, threshold=25, offset=500):
    flattened_cube = cube.copy().reshape(cube.shape[0]*cube.shape[1], cube.shape[2])

    roll_matrix = calc_roll_matrix_flattened(flattened_cube, threshold, offset)

    rolled_cube = strided_indexing_roll(flattened_cube, roll_matrix, cube_shape=cube.shape)
    rolled_cube = rolled_cube.reshape(cube.shape[0], cube.shape[1], cube.shape[2])
    return rolled_cube

def calc_roll_matrix_flattened(cube_flattened, threshold, offset):
    """ Calculates the number of positions along the time axis we need to shift
        elements in order to trigger the data.
        We return a 1D numpy array of (X*Y, time) elements.
    """
    # argmax(...) finds the position in the cube (3d) where we are above threshold
    roll_matrix = np.argmax(cube_flattened > threshold, axis=1) + offset
    # ensure we don't have an index out of bounds
    roll_matrix[roll_matrix > cube_flattened.shape[1]] = cube_flattened.shape[1]
    return roll_matrix

def strided_indexing_roll(cube_flattened, roll_matrix_flattened, cube_shape):
    # Negate the shifts, otherwise we shift in the wrong direction for my application
    roll_matrix_flattened = -1 * roll_matrix_flattened
    # Concatenate with sliced version to cover all rolls
    a_ext = np.concatenate((cube_flattened, cube_flattened[:, :-1]), axis=1)
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = cube_flattened.shape[1]
    result = viewW(a_ext, (1, n))[np.arange(len(roll_matrix_flattened)), (n - roll_matrix_flattened) % n, 0]
    result = result.reshape(cube_shape)
    return result
Divakar's answer doesn't do justice to how much more efficient this is on a large cube of data. I've timed it on 400x400x2000 data formatted as int8. An equivalent for-loop takes ~5.5 seconds, Seberg's answer ~3.0 seconds, and strided_indexing_roll ~0.5 seconds.