Find indices of each bin using numpy - python

I'm encountering a problem that I hope you can help me solve.
I have a 2D numpy array which I want to divide into bins by value. Then I need to know the exact initial indices of all the numbers in each bin.
For example, consider the matrix
[[1,2,3], [4,5,6], [7,8,9]]
and the bin array
[0,2,4,6,8,10].
Then the first element ([0,0]) should be stored in one bin, the next two elements ([0,1],[0,2]) should be stored in another bin, and so on. The desired output looks like this:
[[[0,0]],[[0,1],[0,2]],[[1,0],[1,1]],[[1,2],[2,0]],[[2,1],[2,2]]]
Even though I tried several numpy functions, I'm not able to do this in an elegant way. The best attempt might be
>>> import numpy as np
>>> a = [[1,2,3], [4,5,6], [7,8,9]]
>>> bins = [0,2,4,6,8,10]
>>> bin_in_mat = np.digitize(a, bins, right=False)
>>> bin_in_mat
array([[1, 2, 2],
       [3, 3, 4],
       [4, 5, 5]])
>>> indices = np.argwhere(bin_in_mat)
>>> indices
array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
but this doesn't solve my problem. Any suggestions?

NumPy can't represent this result as a single array, because the bins hold different numbers of elements. You need to leave numpy and use a loop (here a list comprehension) for this:
>>> bin_in_mat = np.digitize(a, bins, right=False)
>>> bin_contents = [np.argwhere(bin_in_mat == i) for i in range(len(bins))]
>>> for b in bin_contents:
...     print(repr(b))
array([], shape=(0, 2), dtype=int64)
array([[0, 0]], dtype=int64)
array([[0, 1],
       [0, 2]], dtype=int64)
array([[1, 0],
       [1, 1]], dtype=int64)
array([[1, 2],
       [2, 0]], dtype=int64)
array([[2, 1],
       [2, 2]], dtype=int64)
Note that digitize is a bad choice for large integer input (until 1.15); np.searchsorted is faster and more correct there. The equivalent of digitize with right=False is bin_in_mat = np.searchsorted(bins, a, side='right')
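If there are many bins, scanning the whole matrix once per bin can get slow. Below is a minimal sketch of a sort-based alternative that groups the indices in one pass. It assumes the same a and bins as in the question; the names bin_ids, order, coords and grouped are only illustrative. It uses np.searchsorted with side='right', which gives the same bin numbers as np.digitize(a, bins, right=False).

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
bins = np.array([0, 2, 4, 6, 8, 10])

# Bin number of every element (same result as np.digitize(a, bins, right=False))
bin_ids = np.searchsorted(bins, a, side='right')

# Stable sort of the flattened bin numbers keeps the original order inside each bin
flat_ids = bin_ids.ravel()
order = np.argsort(flat_ids, kind='stable')

# Count how many elements fall into each bin number 0..len(bins)
counts = np.bincount(flat_ids, minlength=len(bins) + 1)
split_points = np.cumsum(counts)[:-1]

# Turn the flat positions back into (row, col) pairs and cut them per bin
coords = np.column_stack(np.unravel_index(order, a.shape))
grouped = np.split(coords, split_points)

for bin_number, members in enumerate(grouped):
    print(bin_number, members.tolist())
```

The result is still a Python list of arrays, since the groups have different lengths. grouped[0] and grouped[-1] hold the values that fall below the first edge or at and above the last edge, and both are empty here.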

Related

Extract list elements at diagonal above with numpy.tril

Hi, I want to extract the elements of a nested list that lie above the diagonal using numpy.tril. From my understanding, setting the parameter k>0 should return the elements above the diagonal. However, my code doesn't return the expected result.
>>> np.tril([[1,2,3],[4,5,6],[7,8,9]], 1)
array([[1, 2, 0],
       [4, 5, 6],
       [7, 8, 9]])
Expected output:
array([[1, 2, 3],
       [4, 5, 0],
       [7, 0, 0]])
You can flip the array, get the upper triangle, then flip it back:
In [1]: import numpy as np
In [2]: a = np.array([[1,2,3],[4,5,6],[7,8,9]])
In [3]: np.triu(a[:, ::-1])[:, ::-1]
Out[3]:
array([[1, 2, 3],
       [4, 5, 0],
       [7, 0, 0]])
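The same result can also be written as an explicit mask on the index sums, which may make it easier to see which entries survive. This is only a sketch; the names rows, cols, keep and result are illustrative. An entry is kept when its row index plus column index does not go past the anti-diagonal.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Row and column index of every entry
rows, cols = np.indices(a.shape)

# Entries on or above the anti-diagonal satisfy row + col <= number of columns - 1
keep = rows + cols <= a.shape[1] - 1

# Keep those entries and zero out the rest
result = np.where(keep, a, 0)
print(result)

# The flip-based version gives the same array
flipped = np.triu(a[:, ::-1])[:, ::-1]
print(np.array_equal(result, flipped))
```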
Two issues. First, np.tril (as its name indicates) gives a lower triangle. Second, triangular arrays are conventionally the mirror image of your desired output.
We can take a peek at the source code for np.triu and adapt it for a new triu_anti function via np.fliplr:
import numpy as np
def triu_anti(m, k=0):
    m = np.asanyarray(m)
    # Mirror the lower-triangle mask so it marks the entries below the anti-diagonal
    mask = np.fliplr(np.tri(*m.shape[-2:], k=k-1, dtype=bool))
    # Zero the masked entries and keep everything else
    return np.where(mask, np.zeros(1, m.dtype), m)
res = triu_anti([[1,2,3],[4,5,6],[7,8,9]])
print(res)
# [[1 2 3]
#  [4 5 0]
#  [7 0 0]]
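Using the transpose twice (note that this gives the ordinary upper triangle, the same as np.triu(a), not the anti-diagonal form shown in the question):
np.tril(a.T,0).T
array([[1, 2, 3],
       [0, 5, 6],
       [0, 0, 9]])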

numpy sort 2d: rearrange rows without changing values in row

How can the rows of an array be sorted without changing the values within each row?
Furthermore, how do I get the indices that produce this ordering?
input:
a = np.array([[4,3],[0,3],[3,0],[1,3],[1,2],[2,0]])
required sorting array:
b = np.array([1,4,3,5,2,0])
a = a[b]
output:
a = np.array([[0,3],[1,2],[1,3],[2,0],[3,0],[4,3]])
How do I get the array b?
You need lexsort here:
b = np.lexsort((a[:, 1], a[:, 0]))
# array([1, 4, 3, 5, 2, 0], dtype=int64)
And applied to your initial array:
>>> a[b]
array([[0, 3],
       [1, 2],
       [1, 3],
       [2, 0],
       [3, 0],
       [4, 3]])
As @miradulo pointed out, you may also use:
b = np.lexsort(np.fliplr(a).T)
This is less verbose than explicitly stating the columns to sort on.
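One detail that is easy to get wrong is the key order: np.lexsort treats the last key as the primary one. The short sketch below uses the array from the question and checks that both spellings give the same order. The names b_explicit and b_generic are only illustrative.

```python
import numpy as np

a = np.array([[4, 3], [0, 3], [3, 0], [1, 3], [1, 2], [2, 0]])

# The LAST key is the primary sort key, so column 0 goes last
b_explicit = np.lexsort((a[:, 1], a[:, 0]))

# Generic form: reverse the column order, then pass the columns as keys
b_generic = np.lexsort(np.fliplr(a).T)

print(b_explicit)      # order of the original rows
print(a[b_explicit])   # rows sorted by column 0, ties broken by column 1
```

The generic form works for any number of columns, with column 0 as the primary key and later columns breaking ties.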

Converting a list of numpy array to a single int numpy array

I have a list of numpy arrays, that I want to convert into a single int numpy array.
For example if I have 46 4 x 4 numpy arrays in a list of dimension 2 x 23, I want to convert it into a single integer numpy array of 2 x 23 x 4 x 4 dimension. I have found a way to do this by going through every single element and using numpy.stack(). Is there any better way?
You can simply use np.asarray, like so:
import numpy as np
# 2 x 23 nested list of 4 x 4 arrays
list_of_lists = [[np.random.normal(0, 1, (4, 4)) for _ in range(23)]
                 for _ in range(2)]
a = np.asarray(list_of_lists)
a.shape  # (2, 23, 4, 4)
The function will infer the shape of the list of lists for you and create an appropriate array.
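Since the question asks for an integer array, here is a minimal sketch of the conversion step as well. It assumes every inner array has the same shape (4 x 4), which is what lets NumPy build a regular 4-D array. The fill values and the names nested and arr are made up for illustration.

```python
import numpy as np

# Hypothetical data: a 2 x 23 nested list of 4 x 4 float arrays,
# each filled with a distinct whole number so the result is easy to check
nested = [[np.full((4, 4), float(i * 23 + j)) for j in range(23)]
          for i in range(2)]

# Build one 4-D array, then cast it to integers
arr = np.asarray(nested).astype(int)

print(arr.shape)     # dimensions of the combined array
print(arr.dtype)     # integer dtype (exact width depends on the platform)
print(arr[1, 5])     # the 4 x 4 block that came from nested[1][5]
```

Note that astype(int) truncates toward zero, so real-valued data should be rounded first if rounding is what you want.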
Stack works for me:
In [191]: A,B,C = np.zeros((2,2),int),np.ones((2,2),int),np.arange(4).reshape(2,2)
In [192]: x = [[A,B,C],[C,B,A]]
In [193]: x
Out[193]:
[[array([[0, 0],
[0, 0]]), array([[1, 1],
[1, 1]]), array([[0, 1],
[2, 3]])], [array([[0, 1],
[2, 3]]), array([[1, 1],
[1, 1]]), array([[0, 0],
[0, 0]])]]
In [194]: np.stack(x)
Out[194]:
array([[[[0, 0],
[0, 0]],
[[1, 1],
[1, 1]],
[[0, 1],
[2, 3]]],
[[[0, 1],
[2, 3]],
[[1, 1],
[1, 1]],
[[0, 0],
[0, 0]]]])
In [195]: _.shape
Out[195]: (2, 3, 2, 2)
np.stack treats x as a list of 2 items and applies np.asarray to each:
In [198]: np.array(x[0]).shape
Out[198]: (3, 2, 2)
It then adds a leading dimension to each, giving shape (1,3,2,2), and concatenates them along the first axis.
In this case np.array(x) works just as well:
In [201]: np.array(x).shape
Out[201]: (2, 3, 2, 2)

I have a numpy array and an array of indexes; how can I access these positions at the same time?

For example, I have a numpy array like this
a =
array([[1, 2, 3],
       [4, 3, 2]])
and an index array like this, which selects the max values
max_idx =
array([[0, 2],
       [1, 0]])
How can I access these positions at the same time in order to modify them?
Something like "a[max_idx] = 0", giving the following:
array([[1, 2, 0],
       [0, 3, 2]])
Simply use subscripted-indexing -
a[max_idx[:,0],max_idx[:,1]] = 0
If you are working with higher dimensional arrays and don't want to type out slices of max_idx for each axis, you can use linear-indexing to assign zeros, like so -
a.ravel()[np.ravel_multi_index(max_idx.T,a.shape)] = 0
Sample run -
In [28]: a
Out[28]:
array([[1, 2, 3],
       [4, 3, 2]])
In [29]: max_idx
Out[29]:
array([[0, 2],
       [1, 0]])
In [30]: a[max_idx[:,0],max_idx[:,1]] = 0
In [31]: a
Out[31]:
array([[1, 2, 0],
       [0, 3, 2]])
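NumPy supports advanced indexing like this:
a[b[:, 0], b[:, 1]] = 0
The code above fits your requirement.
If b has more than two columns (that is, a has more than two dimensions), a more general form is:
a[tuple(np.split(b, b.shape[1], axis=1))] = 0
np.split splits b into its columns, one per axis of a. The result is wrapped in a tuple because current NumPy versions treat a plain list of arrays as a single index array rather than as one index per axis.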
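A compact way to index with one coordinate per row, whatever the number of dimensions, is to transpose the index array and pass it as a tuple. The sketch below shows this for the 2-D example from the question and for a made-up 3-D array. The names a3 and idx3 are only illustrative.

```python
import numpy as np

# 2-D case from the question
a = np.array([[1, 2, 3], [4, 3, 2]])
max_idx = np.array([[0, 2], [1, 0]])

# tuple(max_idx.T) is (row_indices, col_indices)
a[tuple(max_idx.T)] = 0
print(a)

# Hypothetical 3-D case: each row of idx3 is one (i, j, k) position
a3 = np.arange(24).reshape(2, 3, 4)
idx3 = np.array([[0, 1, 2], [1, 2, 3]])
a3[tuple(idx3.T)] = -1
print(a3[0, 1, 2], a3[1, 2, 3])
```

Unlike assigning through a.ravel(), which only modifies a when ravel() can return a view (for example when a is contiguous), this form always writes into a itself.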

matrix entries also matrices in python

Is there a way to create a matrix whose entries are also matrices in Python? I don't see any way to do so with numpy.
In other words, I want A[i,j] to be a matrix as well.
If a 4d array is ok, then
x = np.zeros((3,4,2,2), dtype=int)
where
x[0,0].shape # (2,2)
If it must be np.matrix type, then it has to be 2d. It can be dtype=object, where each element is in turn a 2d matrix. That construction is a bit more convoluted (a lot more?).
Make an empty array with dtype=object
In [565]: x=np.zeros((2,2),dtype=object)
In [566]: x
Out[566]:
array([[0, 0],
[0, 0]], dtype=object)
Fill each element with a matrix:
In [567]: x[0,0]=np.matrix([[0,1],[2,3]])
In [569]: x[0,1]=np.matrix([[0,1],[2,3]])
In [570]: x[1,0]=np.matrix([[0,1],[2,3]])
In [571]: x[1,1]=np.matrix([[0,1],[2,3]])
In [572]: x
Out[572]:
array([[matrix([[0, 1],
[2, 3]]), matrix([[0, 1],
[2, 3]])],
[matrix([[0, 1],
[2, 3]]), matrix([[0, 1],
[2, 3]])]], dtype=object)
Turn it into a matrix:
In [573]: xm=np.matrix(x)
In [574]: xm
Out[574]:
matrix([[matrix([[0, 1],
[2, 3]]), matrix([[0, 1],
[2, 3]])],
[matrix([[0, 1],
[2, 3]]), matrix([[0, 1],
[2, 3]])]], dtype=object)
I don't know whether xm has any useful computational properties.
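For the 4-D array option, the sketch below shows how the blocks are addressed and, as one possible use, how the 4-D array can be laid out as a single 2-D block matrix. The shapes match the x = np.zeros((3,4,2,2)) example; the fill values and the name big are made up for illustration.

```python
import numpy as np

# 3 x 4 grid of 2 x 2 blocks, filled with distinct values so blocks are recognisable
x = np.arange(3 * 4 * 2 * 2).reshape(3, 4, 2, 2)

# x[i, j] is the 2 x 2 matrix stored at grid position (i, j)
block = x[1, 2]
print(block)

# Lay the blocks out as one 6 x 8 matrix:
# bring the block-row axis next to the grid-row axis, then merge the pairs of axes
big = x.transpose(0, 2, 1, 3).reshape(3 * 2, 4 * 2)

# The block at grid position (1, 2) occupies rows 2:4 and columns 4:6 of big
print(np.array_equal(big[2:4, 4:6], block))
```

Compared with a dtype=object array, the 4-D form keeps a numeric dtype, so ordinary vectorised operations apply to all blocks at once.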
