How to add a dimension to each element of a numpy array? - python

I have a numpy.array like this:
[[1,2,3]
[4,5,6]
[7,8,9]]
How can I change it to this?
[[[1,0], [2,0], [3,0]]
[[4,0], [5,0], [6,0]]
[[7,0], [8,0], [9,0]]]
Thanks in advance.

With a as the input array, you can use array assignment, which works for a generic n-dim input -
out = np.zeros(a.shape+(2,),dtype=a.dtype)
out[...,0] = a
Sample run -
In [81]: a
Out[81]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [82]: out = np.zeros(a.shape+(2,),dtype=a.dtype)
...: out[...,0] = a
In [83]: out
Out[83]:
array([[[1, 0],
[2, 0],
[3, 0]],
[[4, 0],
[5, 0],
[6, 0]],
[[7, 0],
[8, 0],
[9, 0]]])
If you play around with broadcasting, here's a compact one -
a[...,None]*[1,0]
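To make the broadcasting one-liner concrete, here is a minimal sketch that compares it with the array-assignment version. The sample array is the one from the question; the names out_assign and out_bcast are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Array assignment: allocate zeros with one extra trailing axis, fill slot 0.
out_assign = np.zeros(a.shape + (2,), dtype=a.dtype)
out_assign[..., 0] = a

# Broadcasting: (3, 3, 1) * (2,) -> (3, 3, 2); each value v becomes [v*1, v*0].
out_bcast = a[..., None] * [1, 0]

print(out_bcast.shape)                         # (3, 3, 2)
print(np.array_equal(out_assign, out_bcast))   # True
```

The multiplication does extra arithmetic on every element, so the assignment version is usually the one to prefer when the array is large.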

I think numpy.dstack might provide the solution. Let's call A your first array. Do
B = np.zeros_like(A)  # same shape and dtype as A, so R is not upcast to float
R = np.dstack((A,B))
And R should be the array you want.
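One detail worth checking with the dstack approach is the dtype of the result: np.zeros((3, 3)) would produce float64 zeros, and stacking those with an integer array gives a float result. A small sketch, assuming the 3x3 example from the question (R_float and R_int are illustrative names):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Float zeros: the stacked result is upcast to float64.
R_float = np.dstack((A, np.zeros((3, 3))))

# Zeros with A's own dtype: the stacked result keeps the integer dtype.
R_int = np.dstack((A, np.zeros_like(A)))

print(R_float.dtype)            # float64
print(R_int.dtype == A.dtype)   # True
print(R_int.shape)              # (3, 3, 2)
```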

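If your input holds unsigned (non-negative) integers and your dtype is "large enough" (every value fits in half the dtype's width), you can use the following code to pad zeros without creating a copy. Note that it relies on little-endian byte order: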
b = str(a.dtype).split('int')
b = a[...,None].view(b[0]+'int'+str(int(b[1])//2))
With a equal to your example, the output looks like
array([[[1, 0],
[2, 0],
[3, 0]],
[[4, 0],
[5, 0],
[6, 0]],
[[7, 0],
[8, 0],
[9, 0]]], dtype=int16)
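A minimal sketch of what the view trick does, under the assumption of explicit little-endian dtypes ('<u4' and '<u2' are chosen here for illustration so the result does not depend on the host CPU). It also shows the failure mode when a value does not fit in the lower half:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype='<u4')

# Reinterpret each 4-byte value as two 2-byte values: [low half, high half].
b = a[..., None].view('<u2')

print(b.shape)                  # (3, 3, 2)
print(np.shares_memory(a, b))   # True: b is a view, no copy was made

# A value above 65535 spills into the "padding" slot, because both arrays
# share the same memory: 70000 = 1 * 65536 + 4464.
a[0, 0] = 70000
print(b[0, 0].tolist())         # [4464, 1]
```

Because b is a view, writing to either array changes the other, which is another reason to treat this as a trick rather than a general solution.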

Disclaimer: This one is fast (for large operands), but pretty unsound. Also it only works for 32 or 64 bit dtypes. Do not use in serious code.
def squeeze_in_zero(a):
    n = a.dtype.itemsize
    # reinterpret the bits as float, cast to complex (this appends a zero
    # imaginary part), then reinterpret the bits as the original dtype
    return a.view(f'f{n}').astype(f'c{2*n}').view(a.dtype).reshape(*a.shape, 2)
Speedwise, at 10000 elements on my machine it is roughly on par with @Divakar's array assignment. Below that it is slower, above it is faster.
Sample run:
>>> a = np.arange(-4, 5).reshape(3, 3)
>>> squeeze_in_zero(a)
array([[[-4, 0],
[-3, 0],
[-2, 0]],
[[-1, 0],
[ 0, 0],
[ 1, 0]],
[[ 2, 0],
[ 3, 0],
[ 4, 0]]])

Related

Create a 3D (partially) diagonal array from a 2D array

How can I efficiently generate a 3D numpy array from a 2D array, with each row filling the diagonal of one 2D slice of the new array?
For example, the input 2D array is
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
and I want the output to be
array([[[1, 0],
[0, 2]],
[[3, 0],
[0, 4]],
[[5, 0],
[0, 6]],
[[7, 0],
[0, 8]]])
Typically, the size of the first dimension is very large. Thanks in advance.
Assuming a is the input, and using indexing with unravel_index:
x, y = np.unravel_index(np.arange(a.size), a.shape)
out = np.zeros(a.shape+(a.shape[-1],), dtype=a.dtype)
out[x, y, y] = a.flat
Output:
array([[[1, 0],
[0, 2]],
[[3, 0],
[0, 4]],
[[5, 0],
[0, 6]],
[[7, 0],
[0, 8]]])
Another option is np.apply_along_axis with np.diag:
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
res = np.apply_along_axis(np.diag, 1, arr)
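As a quick cross-check, the sketch below builds the result with both approaches and compares them. The third variant, multiplying by an identity matrix via broadcasting, is not from the answers above; it is included only as an assumed alternative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Fancy-index assignment: write a[i, j] to out[i, j, j].
x, y = np.unravel_index(np.arange(a.size), a.shape)
out_idx = np.zeros(a.shape + (a.shape[-1],), dtype=a.dtype)
out_idx[x, y, y] = a.flat

# np.diag applied to every row; convenient, but it loops over rows in Python.
out_diag = np.apply_along_axis(np.diag, 1, a)

# Assumed alternative: (4, 2, 1) * (2, 2) identity broadcasts to (4, 2, 2).
out_eye = a[..., None] * np.eye(a.shape[-1], dtype=a.dtype)

print(out_idx.shape)                        # (4, 2, 2)
print(np.array_equal(out_idx, out_diag))    # True
print(np.array_equal(out_idx, out_eye))     # True
```

With a very large first dimension, the row-by-row np.apply_along_axis call is likely to be the slowest of the three, though that is worth measuring on your own data.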

Row-wise replacement of numpy array with values of another numpy array

I have 0s and 1s stored in a 3-dimensional numpy array:
g = np.array([[[0, 1], [0, 1], [1, 0]], [[0, 0], [1, 0], [1, 1]]])
# array([
# [[0, 1], [0, 1], [1, 0]],
# [[0, 0], [1, 0], [1, 1]]])
and I'd like to replace these values by those in another array using a row-wise replacement strategy. For example, replacing the values of g by x:
x = np.array([[2, 3], [4, 5]])
array([[2, 3],
[4, 5]])
to obtain:
array([
[[2, 3], [2, 3], [3, 2]],
[[4, 4], [5, 4], [5, 5]]])
The idea here would be to have the first row of g replaced by the first elements of x (0 becomes 2 and 1 becomes 3) and the same for the other row (the first dimension - number of "rows" - will always be the same for g and x)
I can't seem to be able to use np.where because there's a ValueError: operands could not be broadcast together with shapes (2,3,2) (2,2) (2,2).
IIUC,
np.stack([x[i, g[i]] for i in range(x.shape[0])])
Output:
array([[[2, 3],
[2, 3],
[3, 2]],
[[4, 4],
[5, 4],
[5, 5]]])
Vectorized approach with np.take_along_axis to index into the last axis of x with g using axis=-1 -
In [20]: np.take_along_axis(x[:,None],g,axis=-1)
Out[20]:
array([[[2, 3],
[2, 3],
[3, 2]],
[[4, 4],
[5, 4],
[5, 5]]])
Or with manual integer-based indexing -
In [27]: x[np.arange(len(g))[:,None,None],g]
Out[27]:
array([[[2, 3],
[2, 3],
[3, 2]],
[[4, 4],
[5, 4],
[5, 5]]])
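Both vectorized forms compute out[i, j, k] = x[i, g[i, j, k]]. The following sketch spells that rule out with explicit loops and checks the two expressions against it; the names ref, rows, by_index and by_take are illustrative only.

```python
import numpy as np

g = np.array([[[0, 1], [0, 1], [1, 0]], [[0, 0], [1, 0], [1, 1]]])
x = np.array([[2, 3], [4, 5]])

# Reference: out[i, j, k] = x[i, g[i, j, k]], written as explicit loops.
ref = np.empty_like(g)
for i in range(g.shape[0]):
    for j in range(g.shape[1]):
        for k in range(g.shape[2]):
            ref[i, j, k] = x[i, g[i, j, k]]

# Row index of shape (2, 1, 1) broadcasts against g of shape (2, 3, 2).
rows = np.arange(len(g))[:, None, None]
by_index = x[rows, g]

# x[:, None] has shape (2, 1, 2); its length-1 axis broadcasts against g.
by_take = np.take_along_axis(x[:, None], g, axis=-1)

print(np.array_equal(ref, by_index))   # True
print(np.array_equal(ref, by_take))    # True
```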
One solution is to simply use a list comprehension directly:
>>> np.array([[x[i][c] for c in r] for i, r in enumerate(g)])
array([[[2, 3],
[2, 3],
[3, 2]],
[[4, 4],
[5, 4],
[5, 5]]])
From what I understand, g is an array of indexes (each index being 0 or 1) and x is the array whose values you use.
Something like this should work (tested quickly)
import numpy as np
def swap_indexes(index_array, array):
    out_array = []
    for i, row in enumerate(index_array):
        out_array.append([array[i, indexes] for indexes in row])
    return np.array(out_array)
index_array = np.array([[[0, 1], [0, 1], [1, 0]], [[0, 0], [1, 0], [1, 1]]])
x = np.array([[2, 3], [4, 5]])
print(swap_indexes(index_array, x))
[EDIT: fixed typo that created duplicates]

np.choose not giving desired result after broadcasting

I would like to pick the nth elements, as specified in maxsuit, from suitCounts. I did broadcast the maxsuit array, so I do get a result, but not the desired one. Any suggestions about what I'm doing conceptually wrong are appreciated. I don't understand the result of np.choose(self.maxsuit[:,:,None]-1, self.suitCounts), which is not what I'm looking for.
>>> self.maxsuit
Out[38]:
array([[3, 3],
[1, 1],
[1, 1]], dtype=int64)
>>> self.maxsuit[:,:,None]-1
Out[33]:
array([[[2],
[2]],
[[0],
[0]],
[[0],
[0]]], dtype=int64)
>>> self.suitCounts
Out[34]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
>>> np.choose(self.maxsuit[:,:,None]-1, self.suitCounts)
Out[35]:
array([[[2, 2, 0, 0],
[1, 1, 1, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]]])
The desired result would be:
[[3,3],[4,3],[2,1]]
You could use advanced-indexing for a broadcasted way to index into the array, like so -
In [415]: val # Data array
Out[415]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
In [416]: idx # Indexing array
Out[416]:
array([[3, 3],
[1, 1],
[1, 1]])
In [417]: m,n = val.shape[:2]
In [418]: val[np.arange(m)[:,None],np.arange(n),idx-1]
Out[418]:
array([[3, 3],
[4, 3],
[2, 1]])
A bit cleaner way with np.ogrid to use open range arrays -
In [424]: d0,d1 = np.ogrid[:m,:n]
In [425]: val[d0,d1,idx-1]
Out[425]:
array([[3, 3],
[4, 3],
[2, 1]])
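The same selection can also be sketched with np.take_along_axis. This is an assumed alternative rather than part of the answer above; val and idx are the arrays from the sample run, and picked is an illustrative name.

```python
import numpy as np

val = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                [[4, 1, 2, 0], [3, 0, 3, 0]],
                [[2, 2, 0, 0], [1, 1, 1, 0]]])
idx = np.array([[3, 3], [1, 1], [1, 1]])

# The index array needs as many dimensions as val, so add a trailing axis,
# pick along the last axis, then drop the resulting length-1 axis.
picked = np.take_along_axis(val, (idx - 1)[..., None], axis=-1)[..., 0]

print(picked.tolist())   # [[3, 3], [4, 3], [2, 1]]
```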
This is the best I can do with choose
In [23]: np.choose([[1,2,0],[1,2,0]], suitcounts[:,:,:3])
Out[23]:
array([[4, 2, 3],
[3, 1, 3]])
choose prefers that we use a list of arrays, rather than a single one. It's supposed to prevent misuse. So the problem could be written as:
In [24]: np.choose([[1,2,0],[1,2,0]], [suitcounts[0,:,:3], suitcounts[1,:,:3], suitcounts[2,:,:3]])
Out[24]:
array([[4, 2, 3],
[3, 1, 3]])
The idea is to select items from the 3 subarrays, based on an index array like:
In [25]: np.array([[1,2,0],[1,2,0]])
Out[25]:
array([[1, 2, 0],
[1, 2, 0]])
The output will match the indexing array in shape. The choice arrays have to match in shape as well, hence my use of [...,:3].
Values for the first column are selected from suitcounts[1,:,:3], for the 2nd column from suitcounts[2...] etc.
choose is limited to 32 choices; this is a limitation imposed by the broadcasting mechanism.
Speaking of broadcasting, I could simplify the expression:
In [26]: np.choose([1,2,0], suitcounts[:,:,:3])
Out[26]:
array([[4, 2, 3],
[3, 1, 3]])
This broadcasts [1,2,0] to match the 2x3 shape of the subarrays.
I could get the target order by reordering the columns:
In [27]: np.choose([0,1,2], suitcounts[:,:,[2,0,1]])
Out[27]:
array([[3, 4, 2],
[3, 3, 1]])
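To see why the call in the question returned whole sub-arrays, it may help to write out the rule np.choose follows. In this sketch the first axis of suitCounts is treated as three choice arrays of shape (2, 4), and the (3, 2, 1) index array is broadcast against them to (3, 2, 4); the names idx, res and expected are illustrative.

```python
import numpy as np

suitCounts = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                       [[4, 1, 2, 0], [3, 0, 3, 0]],
                       [[2, 2, 0, 0], [1, 1, 1, 0]]])
maxsuit = np.array([[3, 3], [1, 1], [1, 1]])

idx = maxsuit[:, :, None] - 1        # shape (3, 2, 1)
res = np.choose(idx, suitCounts)     # shape (3, 2, 4)

# After broadcasting: res[i, j, k] = suitCounts[idx[i, j, 0], j, k].
# The index picks WHICH choice array to read, not a position on the last axis.
expected = suitCounts[idx[..., 0], np.arange(suitCounts.shape[1])]

print(res.shape)                       # (3, 2, 4)
print(np.array_equal(res, expected))   # True
```

So the index values select along the first axis of suitCounts, whereas the desired result needs them to select along the last axis.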

broadcast numpy difference to a third dimension

Setup
import numpy as np
A = np.array([[1, 2], [2, 3]])
B = np.array([[1, 1], [2, 2], [4, 3]])
A
array([[1, 2],
[2, 3]])
B
array([[1, 1],
[2, 2],
[4, 3]])
I need to take the difference of A with each row of B. For the first row of B, I can do:
A - B[0]
array([[0, 1],
[1, 2]])
I just need this for each row of B.
A non-vectorized approach is:
np.array([A - B[i] for i in range(B.shape[0])])
array([[[ 0, 1],
[ 1, 2]],
[[-1, 0],
[ 0, 1]],
[[-3, -1],
[-2, 0]]])
Question
What is a vectorized approach to get the same 3-dimensional array? I'm ok with using pandas if that makes it easier.
The easiest way is to add a dimension to your B array for numpy to properly broadcast it:
In [15]: A - B[:, np.newaxis]
Out[15]:
array([[[ 0, 1],
[ 1, 2]],
[[-1, 0],
[ 0, 1]],
[[-3, -1],
[-2, 0]]])
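A short sketch of the shapes involved, using the arrays from the setup; diff and loop are illustrative names. Broadcasting aligns trailing axes, so the inserted length-1 axis in B is what lets each row of B be paired with all of A.

```python
import numpy as np

A = np.array([[1, 2], [2, 3]])
B = np.array([[1, 1], [2, 2], [4, 3]])

# B[:, np.newaxis] has shape (3, 1, 2); A has shape (2, 2).
# (2, 2) against (3, 1, 2) broadcasts to (3, 2, 2).
diff = A - B[:, np.newaxis]

# The non-vectorized version from the question, for comparison.
loop = np.array([A - B[i] for i in range(B.shape[0])])

print(B[:, np.newaxis].shape)       # (3, 1, 2)
print(diff.shape)                   # (3, 2, 2)
print(np.array_equal(diff, loop))   # True
```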

How do I set cell values in `np.array()` based on condition?

I have a numpy array and a list of valid values in that array:
import numpy as np
arr = np.array([[1,2,0], [2,2,0], [4,1,0], [4,1,0], [3,2,0], ... ])
valid = [1,4]
Is there a nice pythonic way to set all array values that are not in the list of valid values to zero, and do it in-place? After this operation, the array should look like this:
[[1,0,0], [0,0,0], [4,1,0], [4,1,0], [0,0,0], ... ]
The following creates a copy of the array in memory, which is bad for large arrays:
arr = np.vectorize(lambda x: x if x in valid else 0)(arr)
It bugs me that for now I loop over each array element and set it to zero if it is not in the valid list.
Edit: I found an answer suggesting there is no in-place function to achieve this. Also, stop changing my whitespace. It's easier to see the changes in arr with it.
You can use np.place for an in-situ update -
np.place(arr,~np.in1d(arr,valid),0)
Sample run -
In [66]: arr
Out[66]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [67]: np.place(arr,~np.in1d(arr,valid),0)
In [68]: arr
Out[68]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Along the same lines, np.put could also be used -
np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
Sample run -
In [70]: arr
Out[70]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [71]: np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
In [72]: arr
Out[72]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Indexing with booleans would work too:
>>> arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
>>> arr[~np.in1d(arr, valid).reshape(arr.shape)] = 0
>>> arr
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
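On newer NumPy versions the same idea can be written with np.isin, which returns a mask with the shape of arr, so no reshape is needed (np.in1d is deprecated as of NumPy 2.0). A minimal sketch, assuming the five-row sample array used above:

```python
import numpy as np

arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
valid = [1, 4]

# Boolean mask with the same shape as arr: True where the value is NOT valid.
mask = ~np.isin(arr, valid)

# Masked assignment modifies arr itself; only the boolean mask is a new array.
arr[mask] = 0

print(arr.tolist())
# [[1, 0, 0], [0, 0, 0], [4, 1, 0], [4, 1, 0], [0, 0, 0]]
```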
