I have a Numpy array of dimensions (d1,d2,d3,d4), for instance A = np.arange(120).reshape((2,3,4,5)).
I would like to contract it so as to obtain B of dimensions (d1,d2,d4).
The d3-indices of parts to pick are collected in an indexing array Idx of dimensions (d1,d2).
Idx provides, for each pair (x1,x2) of indices along (d1,d2), the index x3 whose whole corresponding d4-line in A should be retained in B, for example Idx = rng.integers(4, size=(2,3)).
To sum up, for all (x1,x2), I want B[x1,x2,:] = A[x1,x2,Idx[x1,x2],:].
Is there an efficient, vectorized way to do that, without using a loop? I'm aware that this is similar to Easy way to do nd-array contraction using advanced indexing in Python but I have trouble extending the solution to higher dimensional arrays.
MWE
import numpy as np

rng = np.random.default_rng()
A = np.arange(120).reshape((2,3,4,5))
Idx = rng.integers(4, size=(2,3))
# correct result:
B = np.zeros((2,3,5))
for i in range(2):
    for j in range(3):
        B[i,j,:] = A[i,j,Idx[i,j],:]
# what I would like, which doesn't work:
B = A[:,:,Idx[:,:],:]
One way to do that is np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2).
For example,
In [49]: A = np.arange(120).reshape(2, 3, 4, 5)
In [50]: rng = np.random.default_rng(0xeeeeeeeeeee)
In [51]: Idx = rng.integers(4, size=(2,3))
In [52]: Idx
Out[52]:
array([[2, 0, 1],
[0, 2, 1]])
In [53]: C = np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2)
In [54]: C
Out[54]:
array([[[ 10,  11,  12,  13,  14],
        [ 20,  21,  22,  23,  24],
        [ 45,  46,  47,  48,  49]],

       [[ 60,  61,  62,  63,  64],
        [ 90,  91,  92,  93,  94],
        [105, 106, 107, 108, 109]]])
Check the known correct result:
In [55]: # correct result:
...: B = np.zeros((2,3,5))
...: for i in range(2):
...:     for j in range(3):
...:         B[i,j,:] = A[i,j,Idx[i,j],:]
...:
In [56]: B
Out[56]:
array([[[ 10.,  11.,  12.,  13.,  14.],
        [ 20.,  21.,  22.,  23.,  24.],
        [ 45.,  46.,  47.,  48.,  49.]],

       [[ 60.,  61.,  62.,  63.,  64.],
        [ 90.,  91.,  92.,  93.,  94.],
        [105., 106., 107., 108., 109.]]])
Times for 3 alternatives:
In [91]: %%timeit
...: B = np.zeros((2,3,5),A.dtype)
...: for i in range(2):
...:     for j in range(3):
...:         B[i,j,:] = A[i,j,Idx[i,j],:]
...:
11 µs ± 48.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [92]: timeit A[np.arange(2)[:,None],np.arange(3),Idx]
8.58 µs ± 44 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [94]: timeit np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2)
29.4 µs ± 448 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Relative times may differ with larger arrays. But this is a good size for testing the correctness.
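For reference, the fast advanced-indexing alternative timed above generalizes to arbitrary leading dimensions with np.ogrid. This is a minimal sketch of that idea (my own generalization, not part of the answer above):

i1, i2 = np.ogrid[:A.shape[0], :A.shape[1]]   # broadcastable grids of shape (d1,1) and (1,d2)
B = A[i1, i2, Idx, :]                         # picks A[x1, x2, Idx[x1,x2], :] for every (x1, x2)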
I have an array of values arr with shape (N,) and an array of coordinates coords with shape (2,N). I want to represent this in an (M,M) array grid such that grid takes the value 0 at coordinates that are not in coords, and for the coordinates that are included it should store the sum of all values in arr that have that coordinate. So if M=3, arr = np.arange(4)+1, and coords = np.array([[0,0,1,2],[0,0,2,2]]), then grid should be:
array([[3., 0., 0.],
[0., 0., 3.],
[0., 0., 4.]])
The reason this is nontrivial is that I need to be able to repeat this step many times and the values in arr change each time, and so can the coordinates. Ideally I am looking for a vectorized solution. I suspect that I might be able to use np.where somehow but it's not immediately obvious how.
Timing the solutions
I have timed the solutions posted so far, and it appears that the accumulator method is slightly faster than the sparse-matrix method, with the second accumulation method being the slowest for the reasons explained in the comments:
%timeit for x in range(100): accumulate_arr(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): accumulate_arr_v2(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): sparse.coo_matrix((np.random.normal(0,1,10000),np.random.randint(100,size=(2,10000))),(100,100)).A
47.3 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
103 ms ± 255 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
48.2 ms ± 36 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
One way would be to create a sparse.coo_matrix and convert that to dense:
from scipy import sparse
sparse.coo_matrix((arr,coords),(M,M)).A
# array([[3, 0, 0],
# [0, 0, 3],
# [0, 0, 4]])
With np.bincount -
def accumulate_arr(coords, arr):
    # Get output array shape
    m,n = coords.max(1)+1
    # Get linear indices to be used as IDs with bincount
    lidx = np.ravel_multi_index(coords, (m,n))
    # Or lidx = coords[0]*(coords[1].max()+1) + coords[1]
    # Accumulate arr with IDs from lidx
    return np.bincount(lidx,arr,minlength=m*n).reshape(m,n)
Sample run -
In [58]: arr
Out[58]: array([1, 2, 3, 4])
In [59]: coords
Out[59]:
array([[0, 0, 1, 2],
[0, 0, 2, 2]])
In [60]: accumulate_arr(coords, arr)
Out[60]:
array([[3., 0., 0.],
[0., 0., 3.],
[0., 0., 4.]])
Another, with np.add.at along similar lines, which might be easier to follow -
def accumulate_arr_v2(coords, arr):
    m,n = coords.max(1)+1
    out = np.zeros((m,n), dtype=arr.dtype)
    np.add.at(out, tuple(coords), arr)
    return out
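A side note on why np.add.at is used here rather than plain indexed assignment (my own illustration, not part of the answer): fancy-index assignment with += does not accumulate repeated coordinates, while np.add.at does.

import numpy as np

coords = np.array([[0,0,1,2],[0,0,2,2]])   # coordinate (0,0) appears twice
arr = np.arange(4) + 1

naive = np.zeros((3,3))
naive[tuple(coords)] += arr                # buffered: only one of the duplicate contributions survives

correct = np.zeros((3,3))
np.add.at(correct, tuple(coords), arr)     # unbuffered: correct[0,0] == 1 + 2 == 3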
scipy.sparse.diags allows me to enter multiple diagonal vectors, together with their location, to build a matrix such as
from scipy.sparse import diags
vec = np.ones((5,))
vec2 = vec + 1
diags([vec, vec2], [-2, 2])
I'm looking for an efficient way to do the same but build a dense matrix, instead of DIA. np.diag only supports a single diagonal. What's an efficient way to build a dense matrix from multiple diagonal vectors?
Expected output: the same as np.array(diags([vec, vec2], [-2, 2]).todense())
One way would be to index into the flattened output array using a step of N+1:
import numpy as np
from scipy.sparse import diags
from timeit import timeit
def diags_pp(vecs, offs, dtype=float, N=None):
    if N is None:
        N = len(vecs[0]) + abs(offs[0])
    out = np.zeros((N, N), dtype)
    outf = out.reshape(-1)
    for vec, off in zip(vecs, offs):
        if off < 0:
            outf[-N*off::N+1] = vec
        else:
            outf[off:N*(N-off):N+1] = vec
    return out

def diags_sp(vecs, offs):
    return diags(vecs, offs).A

for N, k in [(10, 2), (100, 20), (1000, 200)]:
    print(N)
    O = np.arange(-k, k)
    D = [np.arange(1, N+1-abs(o)) for o in O]
    for n, f in list(globals().items()):
        if n.startswith('diags_'):
            print(n.replace('diags_', ''), timeit(lambda: f(D, O), number=10000//N)*N)
            if n != 'diags_sp':
                assert np.all(f(D, O) == diags_sp(D, O))
Sample run:
10
pp 0.06757194991223514
sp 1.9529316504485905
100
pp 0.45834919437766075
sp 4.684177896706387
1000
pp 23.397524026222527
sp 170.66762899048626
With Paul Panzer's (10,2) case
In [107]: O
Out[107]: array([-2, -1, 0, 1])
In [108]: D
Out[108]:
[array([1, 2, 3, 4, 5, 6, 7, 8]),
 array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
 array([1, 2, 3, 4, 5, 6, 7, 8, 9])]
The diagonals have different lengths.
sparse.diags converts this to a sparse.dia_matrix:
In [109]: M = sparse.diags(D,O)
In [110]: M
Out[110]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 36 stored elements (4 diagonals) in DIAgonal format>
In [111]: M.data
Out[111]:
array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]])
Here the ragged list of diagonals has been converted to a padded 2d array. This can be a convenient way of specifying the diagonals, but it isn't particularly efficient. It has to be converted to csr format for most calculations:
In [112]: timeit sparse.diags(D,O)
99.8 µs ± 3.65 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [113]: timeit sparse.diags(D,O, format='csr')
371 µs ± 155 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Using np.diag I can construct the same array with an iteration
np.add.reduce([np.diag(v,k) for v,k in zip(D,O)])
In [117]: timeit np.add.reduce([np.diag(v,k) for v,k in zip(D,O)])
39.3 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
and with Paul's function:
In [120]: timeit diags_pp(D,O)
12.3 µs ± 24.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The key step in np.diag is a flat assignment:
res[:n-k].flat[i::n+1] = v
This is essentially the same as Paul's outf assignments. So the functionality is basically the same, assigning each diagonal via a slice. Paul streamlines it.
Creating the M.data array (Out[111]) also requires copying the D arrays to a 2d array - but with different slices.
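To make that flat-slice mechanism concrete, here is a minimal sketch (my own illustration, not from either answer) of placing a single diagonal at offset k the way diags_pp does:

import numpy as np

N, k = 5, 2
v = np.arange(1, N - abs(k) + 1)        # diagonal of length N - |k|
out = np.zeros((N, N))
outf = out.reshape(-1)                  # flat view of the output
if k >= 0:
    outf[k:N*(N - k):N + 1] = v         # upper diagonal: start at column k, step N+1
else:
    outf[-N*k::N + 1] = v               # lower diagonal: start at row -k, step N+1
assert np.array_equal(out, np.diag(v, k))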
Note 1: None of the answers given to this question work in my case.
Note 2: The solution must work in NumPy 1.14.
Assume I have the following structured array:
arr = np.array([(105.0, 34.0, 145.0, 217.0)], dtype=[('a', 'f4'), ('b', 'f4'), ('c', 'f4'), ('d', 'f4')]).
Now I'm slicing into the structured data type like so:
arr2 = arr[['a', 'b']]
And now I'm trying to convert that slice into a regular array:
out = arr2[0].view((np.float32, 2))
which results in
ValueError: Changing the dtype of a 0d array is only supported if the itemsize is unchanged
What I would like to get is just a regular array like so:
[105.0, 34.0]
Note that this example is simplified in order to be minimal. In my real use case I'm obviously not dealing with an array that holds one element.
I know that this solution works:
out = np.asarray(list(arr2[0]))
but I thought there must be a more efficient solution than copying data that is already in a NumPy array into a list and then back into an array. I assume there is a way to stay in NumPy and maybe not actually copy any data at all; I just don't know how.
The 1d array does convert with view:
In [270]: arr = np.array([(105.0, 34.0, 145.0, 217.0)], dtype=[('a', 'f4'), ('b','f4'), ('c', 'f4'), ('d', 'f4')])
In [271]: arr
Out[271]:
array([(105., 34., 145., 217.)],
dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4')])
In [272]: arr.view('<f4')
Out[272]: array([105., 34., 145., 217.], dtype=float32)
It's when we try to convert a single element, that we get this error:
In [273]: arr[0].view('<f4')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-273-70fbab8f61ba> in <module>()
----> 1 arr[0].view('<f4')
ValueError: Changing the dtype of a 0d array is only supported if the itemsize is unchanged
Earlier, view often required a tweak to the dimensions. I suspect this error is a consequence, intentional or not, of recent changes to the handling of structured arrays (most evident when indexing several fields at once).
In the whole-array case the view changed the 1d, 4-field array into a 1d, 4-element array, (1,) to (4,). But converting a single element would have to go from () to (4,).
In the past I have recommended tolist as the surest way around problems with view (and astype):
In [274]: arr[0].tolist()
Out[274]: (105.0, 34.0, 145.0, 217.0)
In [279]: list(arr[0].tolist())
Out[279]: [105.0, 34.0, 145.0, 217.0]
In [280]: np.array(arr[0].tolist())
Out[280]: array([105., 34., 145., 217.])
item is also a good way of pulling an element out of its numpy structure:
In [281]: arr[0].item()
Out[281]: (105.0, 34.0, 145.0, 217.0)
The result from tolist and item is a tuple.
You worry about speed. But you are just converting one element. It's one thing to worry about the speed when using tolist on a 1000 item array, quite another when working with 1 element.
In [283]: timeit arr[0]
131 ns ± 1.31 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [284]: timeit arr[0].tolist()
1.25 µs ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [285]: timeit arr[0].item()
1.27 µs ± 2.39 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [286]: timeit arr.tolist()
493 ns ± 17.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [287]: timeit arr.view('f4')
1.74 µs ± 18.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
You could index the element in a way that doesn't reduce the dimension to 0 (not that it helps much with speed):
In [288]: arr[[0]].view('f4')
Out[288]: array([105., 34., 145., 217.], dtype=float32)
In [289]: timeit arr[[0]].view('f4')
6.54 µs ± 15.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [290]: timeit arr[0:1].view('f4')
2.63 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [298]: timeit arr[0][None].view('f4')
4.28 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
view still requires a change in shape; consider a big array:
In [299]: arrs = np.repeat(arr, 10000)
In [301]: arrs.view('f4')
Out[301]: array([105., 34., 145., ..., 34., 145., 217.], dtype=float32)
In [303]: arrs.shape
Out[303]: (10000,)
In [304]: arrs.view('f4').shape
Out[304]: (40000,)
The view is still 1d, whereas we'd probably want a (10000,4) shaped 2d array.
A better view change:
In [306]: arrs.view(('f4',4))
Out[306]:
array([[105.,  34., 145., 217.],
       [105.,  34., 145., 217.],
       [105.,  34., 145., 217.],
       ...,
       [105.,  34., 145., 217.],
       [105.,  34., 145., 217.],
       [105.,  34., 145., 217.]], dtype=float32)
In [307]: _.shape
Out[307]: (10000, 4)
This works with the 1 element array, whether 1d or 0d:
In [308]: arr.view(('f4',4))
Out[308]: array([[105., 34., 145., 217.]], dtype=float32)
In [309]: _.shape
Out[309]: (1, 4)
In [310]: arr[0].view(('f4',4))
Out[310]: array([105., 34., 145., 217.], dtype=float32)
In [311]: _.shape
Out[311]: (4,)
This was suggested in one of the answers in your link: https://stackoverflow.com/a/10171321/901925
Contrary to your comment there, it works for me:
In [312]: arr[0].view((np.float32, len(arr.dtype.names)))
Out[312]: array([105., 34., 145., 217.], dtype=float32)
In [313]: np.__version__
Out[313]: '1.14.0'
With the edit:
In [84]: arr = np.array([(105.0, 34.0, 145.0, 217.0)], dtype=[('a', 'f4'), ('b','f4'), ('c', 'f4'), ('d', 'f4')])
In [85]: arr2 = arr[['a', 'b']]
In [86]: arr2
Out[86]:
array([(105., 34.)],
dtype={'names':['a','b'], 'formats':['<f4','<f4'], 'offsets':[0,4], 'itemsize':16})
In [87]: arr2.view(('f4',2))
...
ValueError: Changing the dtype to a subarray type is only supported if the total itemsize is unchanged
Note that the arr2 dtype includes an offsets value. In a recent numpy version, multiple field selection has changed. It is now a true view, preserving the original data - all of it, not just the selected fields. The itemsize is unchanged:
In [93]: arr.itemsize
Out[93]: 16
In [94]: arr2.itemsize
Out[94]: 16
arr.view(('f4',4)) and arr2.view(('f4',4)) produce the same thing.
So you can't view (change dtype) a partial set of the fields. You have to first take the view of the whole array, and then select rows/columns, or work with tolist.
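As a small sketch of that workaround (my own illustration, using the same arr as above): take the view of the whole record first, then slice out the columns that correspond to the wanted fields.

full = arr.view(('f4', 4))    # view of all four fields, shape (1, 4), no copy
ab = full[:, :2]              # columns 0 and 1 correspond to fields 'a' and 'b'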
I'm using 1.14.0. The release notes for 1.14.1 say:
The change in 1.14.0 that multi-field indexing of structured arrays returns a
view instead of a copy has been reverted but remains on track for NumPy 1.15.
Affected users should read the 1.14.1 Numpy User Guide section
"basics/structured arrays/accessing multiple fields" for advice on how to
manage this transition.
https://docs.scipy.org/doc/numpy-1.14.2/user/basics.rec.html#accessing-multiple-fields
This is still under development. That doc mentions a repack_fields function, but that doesn't exist yet.
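For completeness, and not applicable under the NumPy 1.14 constraint in the question: later releases (1.16 and up) added numpy.lib.recfunctions.structured_to_unstructured, which performs exactly this conversion:

from numpy.lib import recfunctions as rfn

out = rfn.structured_to_unstructured(arr2)   # array([[105., 34.]], dtype=float32)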
I have a numpy array like
m = np.array([[0,0,0,0,0],
              [0,0,0,1,1],
              [0,1,0,1,0],
              [0,1,0,1,0],
              [0,0,1,1,1],])
(m == 1).argmax(0) will give array([0, 2, 4, 1, 1]). Is there any similar function to get both the minimum and the maximum index of the 1s in each column? i.e.
array([[nan,  2.,  4.,  1.,  1.],
       [nan,  3.,  4.,  4.,  4.]])
One approach would be -
mask = m==1
mask_cumsum = mask.cumsum(0)
valid_mask = mask.any(0)
min_idx = (mask_cumsum==1).argmax(0)
max_idx = mask_cumsum.argmax(0)
min_max_idx = np.vstack((min_idx, max_idx))
out = np.where(valid_mask, min_max_idx, np.nan)
It's not a single function but you can wrap it into one and it should give the expected return:
def argwhere_both_sides_with_nan(arr):
    ones = arr == 1
    min_r = ones[::-1].argmax(0)  # reversed columns
    max_n = ones.argmax(0)
    res = np.array([max_n, arr.shape[0] - 1 - min_r], dtype=float)
    res[:, ~ones.any(0)] = np.nan
    return res
>>> argwhere_both_sides_with_nan(m)
array([[ nan, 2., 4., 1., 1.],
[ nan, 3., 4., 4., 4.]])
In case you have numba you could use it here, which could speed this up a bit:
import numba as nb

@nb.njit
def first_and_last_index(arr, val):
    rows, cols = arr.shape
    ret = np.full((2, cols), np.nan)
    for row_idx in range(rows):
        for col_idx in range(cols):
            if arr[row_idx, col_idx] == val:
                if np.isnan(ret[0, col_idx]):
                    ret[0, col_idx] = row_idx
                ret[1, col_idx] = row_idx
    return ret
>>> first_and_last_index(m, 1)
array([[ nan, 2., 4., 1., 1.],
[ nan, 3., 4., 4., 4.]])
Timings:
@nb.njit
def first_and_last_index(arr, val):
    rows, cols = arr.shape
    ret = np.full((2, cols), np.nan)
    for row_idx in range(rows):
        for col_idx in range(cols):
            if arr[row_idx, col_idx] == val:
                if np.isnan(ret[0, col_idx]):
                    ret[0, col_idx] = row_idx
                ret[1, col_idx] = row_idx
    return ret

def argwhere_both_sides_with_nan(arr):
    ones = arr == 1
    min_r = ones[::-1].argmax(0)  # reversed columns
    max_n = ones.argmax(0)
    res = np.array([max_n, arr.shape[0] - 1 - min_r], dtype=float)
    res[:, ~ones.any(0)] = np.nan
    return res

def divakars(m):
    mask = m==1
    mask_cumsum = mask.cumsum(0)
    valid_mask = mask.any(0)
    min_idx = (mask_cumsum==1).argmax(0)
    max_idx = mask_cumsum.argmax(0)
    min_max_idx = np.vstack((min_idx, max_idx))
    out = np.where(valid_mask, min_max_idx, np.nan)
    return out

m = np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 1]])
np.testing.assert_array_equal(first_and_last_index(m, 1), argwhere_both_sides_with_nan(m))
np.testing.assert_array_equal(first_and_last_index(m, 1), divakars(m))
%timeit first_and_last_index(m, 1)
# 6.77 µs ± 178 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit argwhere_both_sides_with_nan(m)
# 121 µs ± 3.57 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit divakars(m)
# 138 µs ± 4.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
m = (np.random.random((1000, 1000)) > 0.5).astype(np.int64)
np.testing.assert_array_equal(first_and_last_index(m, 1), argwhere_both_sides_with_nan(m))
np.testing.assert_array_equal(first_and_last_index(m, 1), divakars(m))
%timeit first_and_last_index(m, 1)
# 10 ms ± 248 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit argwhere_both_sides_with_nan(m)
# 12.8 ms ± 393 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit divakars(m)
# 67.2 ms ± 2.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Another solution, by flipping it:
import numpy as np
a = list((m == 1).argmax(0))
b = (np.flipud(m) == 1).argmax(0)
# a is your first output
c = [len(m) - elt - 1 for elt in b]
# c is your second output
Current output:
a = [0, 2, 4, 1, 1]
c = [4, 3, 4, 4, 4]
There is just a problem with the all-zero column left to fix (see the sketch below).
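A minimal sketch (my own addition, not part of the answer) of one way to patch that: mask the columns that contain no 1s with NaN, as the other answers do.

ones = (m == 1)
a = ones.argmax(0).astype(float)                             # first index of a 1 per column
c = (len(m) - 1 - np.flipud(ones).argmax(0)).astype(float)   # last index of a 1 per column
no_ones = ~ones.any(0)
a[no_ones] = np.nan
c[no_ones] = np.nan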
I have a 2d NumPy array of shape (w,h). I want to add a third dimension with a size greater than 1 and copy the array's values along it. I was hoping broadcasting would do it, but it can't. This is how I'm doing it:
arr = np.expand_dims(arr, axis=2)
arr = np.concatenate((arr,arr,arr), axis=2)
Is there a faster way to do so?
You can push all dims forward, introducing a singleton dim/new axis as the last dim to create a 3D array and then repeat three times along that one with np.repeat, like so -
arr3D = np.repeat(arr[...,None],3,axis=2)
Here's another approach using np.tile -
arr3D = np.tile(arr[...,None],3)
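A side note not taken from the answers above: since the question mentions broadcasting, np.broadcast_to can produce the (w, h, 3) result as a read-only view without copying any data; call .copy() if a writable array is needed.

arr3D_view = np.broadcast_to(arr[..., None], arr.shape + (3,))   # zero-copy, read-only view
arr3D = arr3D_view.copy()                                        # materialize if writes are needed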
Another approach that works:
x_train = np.stack((x_train,) * 3, axis=-1)
This is particularly helpful for converting a single-channel grayscale matrix into a 3-channel matrix.
import numpy as np
import matplotlib.pyplot as plt

# gray is the single-channel (grayscale) image
img3 = np.zeros((gray.shape[0], gray.shape[1], 3))
img3[:,:,0] = gray
img3[:,:,1] = gray
img3[:,:,2] = gray
fig = plt.figure(figsize=(15,15))
plt.imshow(img3)
Another simple approach is to multiply by an array of ones - broadcasting will essentially copy the values across the new dimension:
a=np.random.randn(4,4) #a.shape = (4,4)
a = np.expand_dims(a,-1) #a.shape = (4,4,1)
a = a*np.ones((1,1,3))
a.shape #(4, 4, 3)
I'd suggest you use the barebones numpy.concatenate(), simply because the timings below show that it's the fastest of the suggested approaches:
# sample 2D array to work with
In [51]: arr = np.random.random_sample((12, 34))
# promote the array `arr` to 3D and then concatenate along `axis 2`
In [52]: arr3D = np.concatenate([arr[..., np.newaxis]]*3, axis=2)
# verify for desired shape
In [53]: arr3D.shape
Out[53]: (12, 34, 3)
You can see the timings below to convince yourself (ordered best to worst):
In [42]: %timeit -n 100000 np.concatenate([arr[..., np.newaxis]]*3, axis=2)
1.94 µs ± 32.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [43]: %timeit -n 100000 np.repeat(arr[..., np.newaxis], 3, axis=2)
4.38 µs ± 46.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [44]: %timeit -n 100000 np.dstack([arr]*3)
5.1 µs ± 57.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [49]: %timeit -n 100000 np.stack([arr]*3, -1)
5.12 µs ± 125 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [46]: %timeit -n 100000 np.tile(arr[..., np.newaxis], 3)
7.13 µs ± 85.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Having said that, if you're looking for the shortest piece of code, then you can use:
# wrap your 2D array in an iterable and then multiply it by the needed depth
arr3D = np.dstack([arr]*3)
# verify shape
print(arr3D.shape)
(12, 34, 3)
This would work. (I don't think it's the recommended way :-) but it is perhaps closest to what you had in mind.)
np.array([img, img, img]).transpose(1,2,0)
Just stack the target (img) as many times as you want (3) and move the channel axis (size 3) to the last position.
Not sure if I understood correctly, but broadcasting seems to work for me in this case:
>>> a = numpy.array([[1,2], [3,4]])
>>> c = numpy.zeros((4, 2, 2))
>>> c[0] = a
>>> c[1:] = a+1
>>> c
array([[[ 1.,  2.],
        [ 3.,  4.]],

       [[ 2.,  3.],
        [ 4.,  5.]],

       [[ 2.,  3.],
        [ 4.,  5.]],

       [[ 2.,  3.],
        [ 4.,  5.]]])