np.choose not giving desired result after broadcasting - python

I would like to pick the nth element along the last axis of suitCounts, where n is given by maxsuit. I broadcast the maxsuit array, so I do get a result, but not the desired one. Any suggestions about what I'm doing conceptually wrong would be appreciated. I don't understand the result of np.choose(self.maxsuit[:,:,None]-1, self.suitCounts), which is not what I'm looking for.
>>> self.maxsuit
Out[38]:
array([[3, 3],
[1, 1],
[1, 1]], dtype=int64)
>>> self.maxsuit[:,:,None]-1
Out[33]:
array([[[2],
[2]],
[[0],
[0]],
[[0],
[0]]], dtype=int64)
>>> self.suitCounts
Out[34]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
>>> np.choose(self.maxsuit[:,:,None]-1, self.suitCounts)
Out[35]:
array([[[2, 2, 0, 0],
[1, 1, 1, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]]])
The desired result would be:
[[3,3],[4,3],[2,1]]

You could use advanced-indexing for a broadcasted way to index into the array, like so -
In [415]: val # Data array
Out[415]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
In [416]: idx # Indexing array
Out[416]:
array([[3, 3],
[1, 1],
[1, 1]])
In [417]: m,n = val.shape[:2]
In [418]: val[np.arange(m)[:,None],np.arange(n),idx-1]
Out[418]:
array([[3, 3],
[4, 3],
[2, 1]])
A bit cleaner way with np.ogrid to use open range arrays -
In [424]: d0,d1 = np.ogrid[:m,:n]
In [425]: val[d0,d1,idx-1]
Out[425]:
array([[3, 3],
[4, 3],
[2, 1]])
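If your NumPy version provides np.take_along_axis (added in 1.15, as far as I know), the same lookup can be sketched without building the index grids by hand. The names val and idx follow the session above; treat this as an alternative sketch rather than a replacement for the indexing shown there.

```python
import numpy as np

val = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                [[4, 1, 2, 0], [3, 0, 3, 0]],
                [[2, 2, 0, 0], [1, 1, 1, 0]]])
idx = np.array([[3, 3], [1, 1], [1, 1]])

# Give the zero-based indices a trailing axis so they line up with val's last axis.
picked = np.take_along_axis(val, (idx - 1)[:, :, None], axis=2)

# Drop the length-1 trailing axis to get one value per (row, column) pair.
result = picked[:, :, 0]
print(result)
```

The trailing-axis trick is the same broadcasting step the question attempted; the difference is that take_along_axis indexes along the last axis, while choose picks between whole sub-arrays.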

This is the best I can do with choose
In [23]: np.choose([[1,2,0],[1,2,0]], suitcounts[:,:,:3])
Out[23]:
array([[4, 2, 3],
[3, 1, 3]])
choose prefers that we use a list of arrays, rather than single one. It's supposed to prevent misuse. So the problem could be written as:
In [24]: np.choose([[1,2,0],[1,2,0]], [suitcounts[0,:,:3], suitcounts[1,:,:3], suitcounts[2,:,:3]])
Out[24]:
array([[4, 2, 3],
[3, 1, 3]])
The idea is to select items from the 3 subarrays, based on an index array like:
In [25]: np.array([[1,2,0],[1,2,0]])
Out[25]:
array([[1, 2, 0],
[1, 2, 0]])
The output will match the indexing array in shape. The choice arrays have to match in shape as well, hence my use of [...,:3].
Values for the first column are selected from suitcounts[1,:,:3], for the 2nd column from suitcounts[2...] etc.
choose is limited to 32 choices; this is a limitation imposed by the broadcasting mechanism.
Speaking of broadcasting, I could simplify the expression:
In [26]: np.choose([1,2,0], suitcounts[:,:,:3])
Out[26]:
array([[4, 2, 3],
[3, 1, 3]])
This broadcasts [1,2,0] to match the 2x3 shape of the subarrays.
I could get the target order by reordering the columns:
In [27]: np.choose([0,1,2], suitcounts[:,:,[2,0,1]])
Out[27]:
array([[3, 4, 2],
[3, 3, 1]])
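To see why the call in the question returns whole sub-arrays, it may help to spell out what choose does. The sketch below rebuilds that output with an explicit loop, then moves the last axis to the front so that the choices run along the axis the question cares about. The names suit_counts and max_suit are stand-ins for the question's attributes, and the loop is only an illustration of the rule, not how NumPy implements it.

```python
import numpy as np

suit_counts = np.array([[[2, 1, 3, 0], [1, 0, 3, 0]],
                        [[4, 1, 2, 0], [3, 0, 3, 0]],
                        [[2, 2, 0, 0], [1, 1, 1, 0]]])
max_suit = np.array([[3, 3], [1, 1], [1, 1]])
index = max_suit[:, :, None] - 1  # shape (3, 2, 1), as in the question

# choose treats axis 0 of the array as the list of choices: three (2, 4) arrays.
choices = list(suit_counts)

# The index is broadcast against the choices to shape (3, 2, 4) ...
full_index = np.broadcast_to(index, (3, 2, 4))
manual = np.empty((3, 2, 4), dtype=suit_counts.dtype)
for i, j, k in np.ndindex(3, 2, 4):
    # ... and each output cell reads position (j, k) of the choice named by the index.
    manual[i, j, k] = choices[full_index[i, j, k]][j, k]

same_as_choose = np.array_equal(manual, np.choose(index, suit_counts))
print(same_as_choose)

# With the last axis moved to the front there are four (3, 2) choices,
# so each index value now selects along the original last axis.
wanted = np.choose(max_suit - 1, list(np.moveaxis(suit_counts, -1, 0)))
print(wanted)
```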

Related

Removing values from a 3D array of indices

I have a 3D array of indices generated from np.argsort, sorted along the 0-th axis, so that each column is the sorting index. However, I want to drop some values from this array, say 0. Of course, I can remove the 0-th slice and then sort again, but I need to repeat this sort many times, each time removing different values, so I would like to see if there is a more efficient way to generate the array. I think this problem is equivalent to shifting NaN values along axis=0 to the end.
Example
Consider the following 3D array of sorting indices. Notice that along axis=0 the array has unique values.
arr = np.array(
[[[0, 0],
[1, 2]],
[[1, 2],
[0, 1]],
[[2, 1],
[2, 0]]]
)
Suppose I would like to remove the value 0 from it. The result would look like
array([[[1, 2],
[1, 2]],
[[2, 1],
[2, 1]]])
What I've tried
I tried removing the values using np.where and then reshaping the array, but the result differs from the expected array.
>>> arr[np.where(arr != 0)]
array([1, 2, 1, 2, 1, 2, 1, 2])
>>> arr[np.where(arr != 0)].reshape(-1, 2, 2)
array([[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]]])
Explanation of output
Consider arr[:, 1, 0] = [1, 0, 2]. After dropping 0, the new array is [1, 2]. Therefore new_arr[:, 1, 0] = [1, 2].
I just realized that you can specify the axis order in np.transpose, and I came up with a solution with that.
Solution
>>> arr_t = np.transpose(arr, (1, 2, 0))
>>> arr_dp = arr_t[arr_t != 0]
>>> arr_dp_rs = arr_dp.reshape(arr.shape[1], arr.shape[2], -1)
>>> new_arr = np.transpose(arr_dp_rs, (2, 0 ,1))
>>> new_arr
array([[[1, 2],
[1, 2]],
[[2, 1],
[2, 1]]])
Explanation
We first transpose arr so that the 0-th axis becomes the innermost axis. This ensures that after subsetting, the values that belonged to the same column along the 0-th axis stay next to each other and in order.
>>> arr_t = np.transpose(arr, (1, 2, 0))
>>> arr_t
array([[[0, 1, 2],
[0, 2, 1]],
[[1, 0, 2],
[2, 1, 0]]])
>>> arr_dp = arr_t[arr_t != 0]
>>> arr_dp
array([1, 2, 2, 1, 1, 2, 2, 1])
Now the values are in the desired order, but flattened. We reshape them so that the original 0-th axis is the last axis, then transpose it back to the front.
>>> arr_dp_rs = arr_dp.reshape(arr.shape[1], arr.shape[2], -1)
>>> arr_dp_rs
array([[[1, 2],
[2, 1]],
[[1, 2],
[2, 1]]])
>>> new_arr = np.transpose(arr_dp_rs, (2, 0, 1))
>>> new_arr
array([[[1, 2],
[1, 2]],
[[2, 1],
[2, 1]]])
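Since the removal has to be repeated for different values, the steps above could be wrapped in a small helper. This is a minimal sketch: drop_value is a hypothetical name, and it assumes the value occurs exactly once in every column along axis 0 (which holds for argsort output), otherwise the reshape fails.

```python
import numpy as np

def drop_value(order, value):
    """Remove `value` from every column along axis 0 of a 3D index array."""
    # Put axis 0 last so each column's entries are contiguous after masking.
    moved = np.moveaxis(order, 0, -1)
    # Boolean masking flattens the array but keeps the per-column order.
    kept = moved[moved != value]
    # One entry was dropped from each column, so the last axis shrinks by one.
    kept = kept.reshape(moved.shape[:-1] + (-1,))
    # Move the shortened axis back to the front.
    return np.moveaxis(kept, -1, 0)

arr = np.array([[[0, 0], [1, 2]],
                [[1, 2], [0, 1]],
                [[2, 1], [2, 0]]])
new_arr = drop_value(arr, 0)
print(new_arr)
```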

How can I transpose arrays in a numpy matrix?

Similar to the premise in this question, I'd like to transpose each sub-array in the matrix. However, my sub-arrays are of different sizes. I've tried the following lines of code:
import numpy as np
test_array = np.array([
np.array([[1, 1, 1, 1],
[1, 1, 1, 1]]),
np.array([[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]]),
np.array([[3, 3],
[3, 3],
[3, 3]])
])
new_test_array = np.apply_along_axis(test_array, 0, np.transpose)
*** numpy.AxisError: axis 0 is out of bounds for array of dimension 0
new_test_array = np.transpose(test_array, (0, 2, 1))
*** ValueError: axes don't match array
new_test_array = np.array(list(map(np.transpose, test_array)))
returns original array
My expected output is
new_test_array = np.array([
np.array([[1, 1],
[1, 1],
[1, 1],
[1, 1]]),
np.array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]),
np.array([[3, 3, 3],
[3, 3, 3]])
])
In short, you can do this on your data to get what you want:
new_test_array = [np.transpose(x) for x in test_array]
But in your example the sub-arrays have different sizes, so NumPy cannot combine them into one regular multi-dimensional array; what you have is a one-dimensional container of separate sub-arrays rather than a 3D array. That is also why your methods did not work.
So a cleaner approach is to start from a plain list, convert each nested list into its own NumPy array, and then transpose each array individually.
Here's some example code:
test_list = [[[1, 1, 1, 1],
[1, 1, 1, 1]],
[[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]],
[[3, 3],
[3, 3],
[3, 3]]]
list_of_arrays = [np.array(x) for x in test_list]
transposed_arrays = [np.transpose(x) for x in list_of_arrays]
Printing transposed_arrays will give you this:
[array([[1, 1],
[1, 1],
[1, 1],
[1, 1]]),
array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]),
array([[3, 3, 3],
[3, 3, 3]])]

Inserting an array of arrays as the last column

I have an array A:
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
and an array B:
array([[1, 0],
[1, 0],
[0, 1]])
I want to add array B as the last column of array A, so I want the result array (let's call it C) to look like this:
array([[1, 2, 3, [1, 0]],
[1, 1, 1, [1, 0]],
[2, 2, 2, [0, 1]]])
I tried: np.insert(a,-1,b,axis=1) , but this gave me an error:
ValueError: could not broadcast input array from shape (2,3) into shape (3,3)
Maybe that's what you're looking for:
import numpy as np
a = np.array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
b = np.array([[1, 0],
[1, 0],
[0, 1]])
np.hstack([a,b])
Which results in:
array([[1, 2, 3, 1, 0],
[1, 1, 1, 1, 0],
[2, 2, 2, 0, 1]])
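The following builds the nested rows with zip (written for Python 3), although the result won't be a NumPy array afterwards:
print(list(zip(*(list(zip(*a.tolist())) + [b.tolist()]))))
>>> a
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
>>> b
array([[1, 0],
[1, 0],
[0, 1]])
>>> list(zip(*(list(zip(*a.tolist())) + [b.tolist()])))
[(1, 2, 3, [1, 0]), (1, 1, 1, [1, 0]), (2, 2, 2, [0, 1])]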
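If the nested layout from the question really is needed as a NumPy array, an object-dtype array can hold a list in its last column. This is a rough sketch under that assumption (the name c is taken from the question's "C"); note that object arrays give up most of NumPy's vectorized speed, so np.hstack is usually the more practical result.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 1, 1], [2, 2, 2]])
b = np.array([[1, 0], [1, 0], [0, 1]])

# One extra column, with object dtype so that a cell can hold a Python list.
c = np.empty((a.shape[0], a.shape[1] + 1), dtype=object)
c[:, :-1] = a

# Fill the last column cell by cell; assigning the whole column at once
# would try to broadcast b's two columns into a single one.
for i, row in enumerate(b):
    c[i, -1] = row.tolist()

print(c)
```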

How do I set cell values in `np.array()` based on condition?

I have a numpy array and a list of valid values in that array:
import numpy as np
arr = np.array([[1,2,0], [2,2,0], [4,1,0], [4,1,0], [3,2,0], ... ])
valid = [1,4]
Is there a nice Pythonic way to set all array values that are not in the list of valid values to zero, and to do it in-place? After this operation, the array should look like this:
[[1,0,0], [0,0,0], [4,1,0], [4,1,0], [0,0,0], ... ]
The following creates a copy of the array in memory, which is bad for large arrays:
arr = np.vectorize(lambda x: x if x in valid else 0)(arr)
It bugs me that, for now, I loop over each array element and set it to zero if it is not in the valid list.
Edit: I found an answer suggesting there is no in-place function to achieve this. Also stop changing my whitespaces. It's easier to see the changes in arr with them.
You can use np.place for an in-situ update -
np.place(arr,~np.in1d(arr,valid),0)
Sample run -
In [66]: arr
Out[66]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [67]: np.place(arr,~np.in1d(arr,valid),0)
In [68]: arr
Out[68]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Along the same lines, np.put could also be used -
np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
Sample run -
In [70]: arr
Out[70]:
array([[1, 2, 0],
[2, 2, 0],
[4, 1, 0],
[4, 1, 0],
[3, 2, 0]])
In [71]: np.put(arr,np.where(~np.in1d(arr,valid))[0],0)
In [72]: arr
Out[72]:
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
Indexing with booleans would work too:
>>> arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
>>> arr[~np.in1d(arr, valid).reshape(arr.shape)] = 0
>>> arr
array([[1, 0, 0],
[0, 0, 0],
[4, 1, 0],
[4, 1, 0],
[0, 0, 0]])
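On newer NumPy versions, np.isin can stand in for np.in1d here (np.in1d is flagged as deprecated in recent releases, as far as I know). It returns a mask with the same shape as arr, so no reshape is needed. A small sketch with the sample data from the question:

```python
import numpy as np

arr = np.array([[1, 2, 0], [2, 2, 0], [4, 1, 0], [4, 1, 0], [3, 2, 0]])
valid = [1, 4]

# np.isin keeps arr's shape, so the mask can index arr directly.
mask = ~np.isin(arr, valid)

# Boolean-mask assignment writes into arr's existing buffer.
arr[mask] = 0
print(arr)
```

The mask itself is a temporary boolean array, but arr is modified in place rather than rebound to a new array.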

Is there a way to compose two matrix of numbers in one matrix of text?

I would like to combine two matrices of numbers into one matrix of formatted text in Python.
Is there an easy way?
I could use a for loop, but I would prefer something more convenient to work with.
As a simple example:
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
to
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']])
You can use np.dstack to combine both arrays, then use a comprehension with string manipulation to build the text for each cell of the combined array:
>>> arr = np.dstack((arr1, arr2))
>>> np.array([np.array([':'.join(map(str,cell)) for cell in row ]) for row in arr])
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']],
dtype='|S3')
You could use nditer to iterate over the arrays, and make strings as needed: e.g.
import numpy as np
a1 = np.array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
a2 = np.array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
out = np.empty(a1.shape, dtype='U5')  # unicode strings, so Python 3 prints them without a b'' prefix
for x, y, o in np.nditer([a1, a2, out], op_flags=['readwrite']):
    o[...] = "{}:{}".format(x, y)
print(out)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
Use list comprehensions and zip() to form a new array:
from numpy import array
ar1 = array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
ar2 = array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
res = array([['%s:%s' % (j1, j2) for j1, j2 in zip(i1, i2)] for i1, i2 in zip(ar1, ar2)])
print(res)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
This solution will also fit usual Python two-dimensional lists (just remove the 'array' functions).
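A loop-free variant is also possible with the element-wise string helpers in np.char. This is a sketch that assumes both inputs are integer arrays of the same shape; the names a1 and a2 mirror the examples above.

```python
import numpy as np

a1 = np.array([[0, 1, 2]] * 5)
a2 = np.array([[i] * 3 for i in range(5)])

# Convert both arrays to strings, then concatenate element-wise:
# first append the separator, then the second value.
left = np.char.add(a1.astype(str), ":")
res = np.char.add(left, a2.astype(str))
print(res)
```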
