broadcast numpy difference to a third dimension - python

Setup
import numpy as np
A = np.array([[1, 2], [2, 3]])
B = np.array([[1, 1], [2, 2], [4, 3]])
A
array([[1, 2],
[2, 3]])
B
array([[1, 1],
[2, 2],
[4, 3]])
I need the difference between A and each row of B. For a single row of B I can do:
A - B[0]
array([[0, 1],
[1, 2]])
I just need this for each row of B.
A non-vectorized approach is:
np.array([A - B[i] for i in range(B.shape[0])])
array([[[ 0, 1],
[ 1, 2]],
[[-1, 0],
[ 0, 1]],
[[-3, -1],
[-2, 0]]])
Question
What is a vectorized approach to get the same 3-dimensional array? I'm ok with using pandas if that makes it easier.

The easiest way is to add a dimension to your B array for numpy to properly broadcast it:
In [15]: A - B[:, np.newaxis]
Out[15]:
array([[[ 0, 1],
[ 1, 2]],
[[-1, 0],
[ 0, 1]],
[[-3, -1],
[-2, 0]]])
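To see why the extra axis works, it can help to look at the shapes involved. The following is a minimal sketch using the A and B from the question; the names B3, result and loop are only introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [2, 3]])
B = np.array([[1, 1], [2, 2], [4, 3]])

# Insert a length-1 axis in the middle: (3, 2) -> (3, 1, 2)
B3 = B[:, np.newaxis]

# Broadcasting aligns trailing axes: (2, 2) against (3, 1, 2) -> (3, 2, 2)
result = A - B3

# The loop version from the question, for comparison
loop = np.array([A - B[i] for i in range(B.shape[0])])

print(B3.shape, result.shape)
print(np.array_equal(result, loop))
```

B[:, None] is an equivalent spelling, since np.newaxis is just an alias for None.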

Related

how to add dimensions to a numpy element?

I have a numpy.array like this:
[[1,2,3]
[4,5,6]
[7,8,9]]
How can I change it to this?
[[[1,0], [2,0], [3,0]]
[[4,0], [5,0], [6,0]]
[[7,0], [8,0], [9,0]]]
Thanks in advance.
With a as the input array, you can use array-assignment and this would work for a generic n-dim input -
out = np.zeros(a.shape+(2,),dtype=a.dtype)
out[...,0] = a
Sample run -
In [81]: a
Out[81]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [82]: out = np.zeros(a.shape+(2,),dtype=a.dtype)
...: out[...,0] = a
In [83]: out
Out[83]:
array([[[1, 0],
[2, 0],
[3, 0]],
[[4, 0],
[5, 0],
[6, 0]],
[[7, 0],
[8, 0],
[9, 0]]])
If you play around with broadcasting, here's a compact one -
a[...,None]*[1,0]
I think numpy.dstack might provide the solution. Let's call A your first array. Do
B = np.zeros_like(A)  # same shape and dtype as A, so the result stays integer
R = np.dstack((A,B))
And R should be the array you want.
If your input holds only non-negative integers that fit in half the width of its dtype (and the byte order is little-endian), you can use the following code to pad zeros without creating a copy:
b = str(a.dtype).split('int')
b = a[...,None].view(b[0]+'int'+str(int(b[1])//2))
with a equal to your example, the output looks like
array([[[1, 0],
[2, 0],
[3, 0]],
[[4, 0],
[5, 0],
[6, 0]],
[[7, 0],
[8, 0],
[9, 0]]], dtype=int16)
Disclaimer: This one is fast (for large operands), but pretty unsound. Also it only works for 32 or 64 bit dtypes. Do not use in serious code.
def squeeze_in_zero(a):
    sh = a.shape
    n = a.dtype.itemsize
    # reinterpret as float, widen to complex (imaginary part 0), reinterpret back
    return a.view(f'f{n}').astype(f'c{2*n}').view(a.dtype).reshape(*sh, 2)
Speedwise at 10000 elements on my machine it is roughly on par with @Divakar's array assignment. Below it is slower, above it is faster.
Sample run:
>>> a = np.arange(-4, 5).reshape(3, 3)
>>> squeeze_in_zero(a)
array([[[-4, 0],
[-3, 0],
[-2, 0]],
[[-1, 0],
[ 0, 0],
[ 1, 0]],
[[ 2, 0],
[ 3, 0],
[ 4, 0]]])

np.choose not giving desired result after broadcasting

I would like to pick the nth elements as specified in maxsuit from suitCounts. I did broadcast the maxsuit array so I do get a result, but not the desired one. Any suggestions about what I'm doing conceptually wrong are appreciated. I don't understand the result of np.choose(self.maxsuit[:,:,None]-1, self.suitCounts), which is not what I'm looking for.
>>> self.maxsuit
Out[38]:
array([[3, 3],
[1, 1],
[1, 1]], dtype=int64)
>>> self.maxsuit[:,:,None]-1
Out[33]:
array([[[2],
[2]],
[[0],
[0]],
[[0],
[0]]], dtype=int64)
>>> self.suitCounts
Out[34]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
>>> np.choose(self.maxsuit[:,:,None]-1, self.suitCounts)
Out[35]:
array([[[2, 2, 0, 0],
[1, 1, 1, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]]])
The desired result would be:
[[3,3],[4,3],[2,1]]
You could use advanced-indexing for a broadcasted way to index into the array, like so -
In [415]: val # Data array
Out[415]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
In [416]: idx # Indexing array
Out[416]:
array([[3, 3],
[1, 1],
[1, 1]])
In [417]: m,n = val.shape[:2]
In [418]: val[np.arange(m)[:,None],np.arange(n),idx-1]
Out[418]:
array([[3, 3],
[4, 3],
[2, 1]])
A bit cleaner way with np.ogrid to use open range arrays -
In [424]: d0,d1 = np.ogrid[:m,:n]
In [425]: val[d0,d1,idx-1]
Out[425]:
array([[3, 3],
[4, 3],
[2, 1]])
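If your NumPy is recent enough to have np.take_along_axis, the same lookup can be written without building the row and column ranges by hand. This is a sketch of an alternative, not the method used in the answer above; val and idx are the sample arrays from that answer, and picked and manual are names made up here.

```python
import numpy as np

val = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                [[4, 1, 2, 0], [3, 0, 3, 0]],
                [[2, 2, 0, 0], [1, 1, 1, 0]]])
idx = np.array([[3, 3], [1, 1], [1, 1]])  # 1-based positions along the last axis

# take_along_axis needs an index array with the same number of dimensions
# as val, so add a trailing axis and drop it again afterwards
picked = np.take_along_axis(val, (idx - 1)[..., None], axis=2)[..., 0]

# Same result via the advanced-indexing form shown above
m, n = val.shape[:2]
manual = val[np.arange(m)[:, None], np.arange(n), idx - 1]

print(picked)
print(np.array_equal(picked, manual))
```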
This is the best I can do with choose
In [23]: np.choose([[1,2,0],[1,2,0]], suitcounts[:,:,:3])
Out[23]:
array([[4, 2, 3],
[3, 1, 3]])
choose prefers that we use a list of arrays, rather than single one. It's supposed to prevent misuse. So the problem could be written as:
In [24]: np.choose([[1,2,0],[1,2,0]], [suitcounts[0,:,:3], suitcounts[1,:,:3], suitcounts[2,:,:3]])
Out[24]:
array([[4, 2, 3],
[3, 1, 3]])
The idea is to select items from the 3 subarrays, based on an index array like:
In [25]: np.array([[1,2,0],[1,2,0]])
Out[25]:
array([[1, 2, 0],
[1, 2, 0]])
The output will match the indexing array in shape. The choice arrays have to match in shape as well, hence my use of [...,:3].
Values for the first column are selected from suitcounts[1,:,:3], for the 2nd column from suitcounts[2...] etc.
choose is limited to 32 choices; this is a limitation imposed by the broadcasting mechanism.
Speaking of broadcasting I could simplify the expression
In [26]: np.choose([1,2,0], suitcounts[:,:,:3])
Out[26]:
array([[4, 2, 3],
[3, 1, 3]])
This broadcasts [1,2,0] to match the 2x3 shape of the subarrays.
I could get the target order by reordering the columns:
In [27]: np.choose([0,1,2], suitcounts[:,:,[2,0,1]])
Out[27]:
array([[3, 4, 2],
[3, 3, 1]])
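Since choose computes out[i, j] = choices[index[i, j]][i, j], the choices have to run along the first axis. In the question they run along the last axis, which is why the original call picked whole rows. A possible sketch, assuming the sample data from the question (the names val, idx, choices and out are only used here for illustration), is to move the last axis to the front before calling choose:

```python
import numpy as np

val = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                [[4, 1, 2, 0], [3, 0, 3, 0]],
                [[2, 2, 0, 0], [1, 1, 1, 0]]])
idx = np.array([[3, 3], [1, 1], [1, 1]])  # 1-based, as in maxsuit

# Turn the 4 entries along the last axis into a list of four (3, 2) choice arrays
choices = list(np.moveaxis(val, -1, 0))

# out[i, j] = choices[idx[i, j] - 1][i, j] = val[i, j, idx[i, j] - 1]
out = np.choose(idx - 1, choices)
print(out)
```

This only works while the number of choices stays within the limit choose imposes, so the indexing approaches are the more general option.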

How do I set cell values in `np.array()` based on condition?

I have a numpy array and a list of valid values in that array:
import numpy as np
arr = np.array([[1,2,0], [2,2,0], [4,1,0], [4,1,0], [3,2,0], ... ])
valid = [1,4]
Is there a nice pythonic way to set all array values that are not in the list of valid values to zero, and do it in-place? After this operation, the array should look like this:
[[1,0,0], [0,0,0], [4,1,0], [4,1,0], [0,0,0], ... ]
The following creates a copy of the array in memory, which is bad for large arrays:
arr = np.vectorize(lambda x: x if x in valid else 0)(arr)
It bugs me that for now I loop over each array element and set it to zero if it is not in the valid list.
Edit: I found an answer suggesting there is no in-place function to achieve this. Also stop changing my whitespaces. It's easier to see the changes in arr with them.
You can use np.place for an in-situ update -
np.place(arr,~np.in1d(arr,valid),0)
Sample run -
In [66]: arr
Out[66]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [67]: np.place(arr,~np.in1d(arr,valid),0)
In [68]: arr
Out[68]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Along the same lines, np.put could also be used -
np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
Sample run -
In [70]: arr
Out[70]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [71]: np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
In [72]: arr
Out[72]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Indexing with booleans would work too:
>>> arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
>>> arr[~np.in1d(arr, valid).reshape(arr.shape)] = 0
>>> arr
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
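In newer NumPy versions np.isin is the recommended replacement for np.in1d, and it keeps the shape of its first argument, so the reshape is not needed. A minimal sketch using the sample data from the question (mask is a name introduced here):

```python
import numpy as np

arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
valid = [1, 4]

# np.isin returns a boolean array with the same shape as arr
mask = ~np.isin(arr, valid)

# Boolean-mask assignment modifies arr in place
arr[mask] = 0
print(arr)
```

Note that the boolean mask itself is still a temporary array of the same shape as arr; only the values are updated in place.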

Is there a way to compose two matrix of numbers in one matrix of text?

I would like to combine two matrices of numbers into one matrix of formatted text in Python.
Is there an easy way?
I could use for loops, but I would prefer something more convenient to work with.
As a simple example:
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
to
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']])
You can use np.dstack to combine both the arrays and use string manipulation with comprehension to manipulate each cell of the combined array
>>> arr = np.dstack((arr1, arr2))
>>> np.array([np.array([':'.join(map(str,cell)) for cell in row ]) for row in arr])
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']],
dtype='|S3')
You could use nditer to iterate over the arrays, and make strings as needed: e.g.
import numpy as np
a1 = np.array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
a2 = np.array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
out=np.empty(a1.shape, dtype='S5')
for x,y,o in np.nditer([a1, a2, out], op_flags=['readwrite']):
    o[...] = "{}:{}".format(x,y)
print(out)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
Use list comprehensions and zip() to form a new array:
from numpy import array
ar1 = array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
ar2 = array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
res = array([['%s:%s' % (j1, j2) for j1, j2 in zip(i1, i2)] for i1, i2 in zip(ar1, ar2)])
print(res)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
This solution will also fit usual Python two-dimensional lists (just remove the 'array' functions).
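Another option that avoids an explicit Python loop is NumPy's element-wise string routines. This is a sketch of a different technique from the answers above, using np.char.add on the integer arrays converted to strings; a1 and a2 are rebuilt here to match the example, and res is a name chosen for illustration.

```python
import numpy as np

# Same two matrices as in the question
a1 = np.tile(np.arange(3), (5, 1))
a2 = np.repeat(np.arange(5), 3).reshape(5, 3)

# Convert to strings, then concatenate element-wise: a1 + ':' + a2
res = np.char.add(np.char.add(a1.astype(str), ':'), a2.astype(str))
print(res)
```

The string operations still run element by element internally, so this is mainly a convenience rather than a guaranteed speed-up.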

Numpy.select from 3D array

Suppose I have the following numpy arrays:
>>> a
array([[0, 0, 2],
[2, 0, 1],
[2, 2, 1]])
>>> b
array([[2, 2, 0],
[2, 0, 2],
[1, 1, 2]])
that I then stack depth-wise (along a third axis)
c=np.dstack((a,b))
resulting in:
>>> c
array([[[0, 2],
[0, 2],
[2, 0]],
[[2, 2],
[0, 0],
[1, 2]],
[[2, 1],
[2, 1],
[1, 2]]])
For each position along the first two dimensions of c, I want to check which combination from a list of classes is present in that length-2 subarray, and number it according to the index of the matching list entry. I've tried the following, but it is not working. The algorithm is simple enough with double for-loops, but because c is very large, it is prohibitively slow.
classes=[(0,0),(2,1),(2,2)]
out=np.select( [h==c for h in classes], range(len(classes)), default=-1)
My desired output would be
out = [[-1,-1,-1],
[3, 1,-1],
[2, 2,-1]]
How about this:
(np.array([np.array(h)[...,:] == c for h in classes]).all(axis = -1) *
(2 + np.arange(len(classes)))[:, None, None]).max(axis=0) - 1
It returns what you actually need:
array([[-1, -1, -1],
[ 3, 1, -1],
[ 2, 2, -1]])
You can test the a and b arrays separately like this:
clsa = (0,2,2)
clsb = (0,1,2)
np.select([(ca==a) & (cb==b) for ca,cb in zip(clsa, clsb)], range(3), default=-1)
which gets your desired result (except that it returns 0, 1, 2 instead of 1, 2, 3).
Here is another way to get what you want, thought I would post it in case it's useful to anyone.
import numpy as np
a = np.array([[0, 0, 2],
[2, 0, 1],
[2, 2, 1]])
b = np.array([[2, 2, 0],
[2, 0, 2],
[1, 1, 2]])
classes=[(0,0),(2,1),(2,2)]
c = np.empty(a.shape, dtype=[('a', a.dtype), ('b', b.dtype)])
c['a'] = a
c['b'] = b
classes = np.array(classes, dtype=c.dtype)
classes.sort()
out = classes.searchsorted(c)
out = np.where(c == classes[out], out+1, -1)
print(out)
#array([[-1, -1, -1]
# [ 3, 1, -1]
# [ 2, 2, -1]])
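For a small number of classes, a plain broadcasting comparison is another possibility. The sketch below is not taken from any of the answers above; it compares every pair in c against every class at once and uses the question's 1-based numbering. The names matches and out are introduced here.

```python
import numpy as np

a = np.array([[0, 0, 2], [2, 0, 1], [2, 2, 1]])
b = np.array([[2, 2, 0], [2, 0, 2], [1, 1, 2]])
classes = np.array([(0, 0), (2, 1), (2, 2)])

c = np.dstack((a, b))  # shape (3, 3, 2)

# (3, 3, 1, 2) against (3, 2) -> (3, 3, 3, 2); a class matches when both entries agree
matches = (c[:, :, None, :] == classes).all(axis=-1)

# argmax gives the first matching class; positions with no match get -1
out = np.where(matches.any(axis=-1), matches.argmax(axis=-1) + 1, -1)
print(out)
```

The intermediate comparison array has one entry per pixel, per class and per component, so for a very large c with many classes the searchsorted approach above is likely to use less memory.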
