Numpy re-index to first N natural numbers

I have a matrix with a very sparse index (the largest row and column indices exceed 130000), but only a few of those rows and columns actually contain non-zero values.
I therefore want to remap the row and column indices so that only the rows and columns containing non-zero values remain, numbered by the first N natural numbers.
Visually, I want an example matrix like this
1 0 1
0 0 0
0 0 1
to look like this
1 1
0 1
A row or column should be dropped only if all of its values are zero.
Since I already have the matrix in a sparse format, I could simply create a dictionary that maps every index to an increasing counter (separately for rows and columns) and build the result from that.
row_dict = {}
col_dict = {}
row_ind = 0
col_ind = 0
# el looks like this: (row, column, value)
for el in sparse_matrix:
    if el[0] not in row_dict:
        row_dict[el[0]] = row_ind
        row_ind += 1
    if el[1] not in col_dict:
        col_dict[el[1]] = col_ind
        col_ind += 1
# now recreate matrix with new index
But I was hoping for a built-in function in NumPy. Also note that I do not really know how to word the question, so there might well be a duplicate out there that I do not know of; any pointers in the right direction are appreciated.

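You can use np.unique with return_inverse=True. The inverse array gives, for each stored entry, the position of its index among the sorted unique indices, which is exactly the compacted index: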
>>> import numpy as np
>>> from scipy import sparse
>>>
>>> A = np.random.randint(-100, 10, (10, 10)).clip(0, None)
>>> A
array([[6, 0, 5, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 7, 0, 0, 0, 0, 4, 9],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 4, 0],
       [9, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 4, 0, 0, 0, 0, 0, 0]])
>>> B = sparse.coo_matrix(A)
>>> B
<10x10 sparse matrix of type '<class 'numpy.int64'>'
with 8 stored elements in COOrdinate format>
>>> runq, ridx = np.unique(B.row, return_inverse=True)
>>> cunq, cidx = np.unique(B.col, return_inverse=True)
>>> C = sparse.coo_matrix((B.data, (ridx, cidx)))
>>> C.A
array([[6, 5, 0, 0, 0],
       [0, 0, 7, 4, 9],
       [0, 0, 0, 4, 0],
       [9, 0, 0, 0, 0],
       [0, 0, 4, 0, 0]])
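
The same idea can be wrapped in a small helper that also returns the original row and column labels, so the compacted matrix can be mapped back afterwards. This is only a sketch: compact_indices is a hypothetical name, not a NumPy or SciPy function, and it assumes the input can be converted with sparse.coo_matrix.

```python
import numpy as np
from scipy import sparse

def compact_indices(mat):
    """Renumber rows/columns so that only those holding non-zero values remain.

    Hypothetical helper (not part of NumPy or SciPy). Returns the compacted
    COO matrix together with the original row and column labels that were kept.
    """
    coo = sparse.coo_matrix(mat)
    keep = coo.data != 0                      # ignore explicitly stored zeros
    rows, cols, data = coo.row[keep], coo.col[keep], coo.data[keep]
    # Sorted unique labels, plus each entry's position among them
    row_labels, new_rows = np.unique(rows, return_inverse=True)
    col_labels, new_cols = np.unique(cols, return_inverse=True)
    shape = (len(row_labels), len(col_labels))
    compact = sparse.coo_matrix((data, (new_rows, new_cols)), shape=shape)
    return compact, row_labels, col_labels

# The 3x3 example from the question
A = np.array([[1, 0, 1],
              [0, 0, 0],
              [0, 0, 1]])
C, kept_rows, kept_cols = compact_indices(A)
print(C.toarray())
print(kept_rows, kept_cols)
```

Here kept_rows[i] is the original index of compacted row i, and likewise for kept_cols.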

Related

Transform 2d numpy array into 2d one hot encoding

How would I transform
a=[[0,6],
[3,7],
[5,5]]
into
b=[[1,0,0,0,0,0,1,0],
[0,0,0,1,0,0,0,1],
[0,0,0,0,0,1,0,0]]
Note that the last row of b has only one value set to 1, because the last row of a contains the same value twice.

Using indexing:
import numpy as np
a = np.array([[0,6],
              [3,7],
              [5,5]])
b = np.zeros((len(a), a.max()+1), dtype=int)
b[np.arange(len(a)), a.T] = 1
Output:
array([[1, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 1, 0, 0]])

This can also be done using NumPy broadcasting and boolean comparison in the following way:
import numpy as np
a = np.array([[0,6],
              [3,7],
              [5,5]])
# Add a trailing axis so that every element can be compared against all class indices.
# Comparing with arange(max+1) marks the position of each value as True.
b = (a[:,:,None] == np.arange(a.max()+1))
# A class is present in a row if any element of that row matched it; convert the result to int
b = b.any(1).astype(int)
Output:
array([[1, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 1, 0, 0]])
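
As a rough sketch, the indexing approach can be wrapped in a function that takes an explicit number of classes, which helps when the largest class does not happen to occur in the data. The name multi_hot and the num_classes parameter are assumptions made for illustration. The final assertion checks that the indexing and broadcasting approaches agree.

```python
import numpy as np

def multi_hot(a, num_classes=None):
    """Encode each row of integer labels as a 0/1 row (hypothetical helper)."""
    a = np.asarray(a)
    if num_classes is None:
        num_classes = a.max() + 1
    b = np.zeros((a.shape[0], num_classes), dtype=int)
    rows = np.arange(a.shape[0])[:, None]   # shape (n, 1), broadcasts against a's (n, k)
    b[rows, a] = 1                          # repeated labels in a row just set the same cell
    return b

a = np.array([[0, 6],
              [3, 7],
              [5, 5]])
by_index = multi_hot(a)
by_compare = (a[:, :, None] == np.arange(a.max() + 1)).any(axis=1).astype(int)
assert (by_index == by_compare).all()
print(multi_hot(a, num_classes=10).shape)   # (3, 10)
```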

encode a 0-1 matrix from an integer matrix numpy

I have an n*K integer matrix. [Note: it represents the number of samples drawn from K distributions (one per column).]
a =[[0,1,0,0,2,0],
[0,0,1,0,0,0],
[3,0,0,0,0,0],
]
[Note: in the application context, this matrix means that for row i (a simulation instance) we drew 1 element from "distribution 1" (1 \in [0,..K]) (a[0,1] = 1) and 2 from distribution 4 (a[0,4] = 2).]
What I need is to generate a 0-1 matrix that represents the same integer matrix using only ones. In this case, it is a 3D matrix of shape n*a.max()*K that has a 1 for each sample drawn from the distributions. [Note: we need this matrix so that we can multiply it by our K-distribution sample matrix.]
Output
b = [[[0,1,0,0,1,0], # we don't care whether the samples are stacked
[0,0,0,0,1,0],
[0,0,0,0,0,0]], # this is the first row representation
[[0,0,1,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0]], # this is the second row representation
[[1,0,0,0,0,0],
[1,0,0,0,0,0],
[1,0,0,0,0,0]], # this is the third row representation
]
How can I do that in NumPy?
Thanks!

From @michael-szczesny's comment:
import numpy as np
a = np.array([[0,1,0,0,2,0],
              [0,0,1,0,0,0],
              [3,0,0,0,0,0],
              ])
b = (np.arange(1, a.max()+1)[:,None] <= a[:,None]).astype('uint8')
print(repr(b))
array([[[0, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0]],
       [[0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]],
       [[1, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0]]], dtype=uint8)
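
To see why the one-liner works, it may help to spell out the shapes involved in the broadcast. The sketch below uses made-up intermediate names (levels, counts) and checks that summing the 0-1 result over its middle axis recovers the original counts.

```python
import numpy as np

a = np.array([[0, 1, 0, 0, 2, 0],
              [0, 0, 1, 0, 0, 0],
              [3, 0, 0, 0, 0, 0]])

levels = np.arange(1, a.max() + 1)[:, None]   # shape (a.max(), 1): thresholds 1..max
counts = a[:, None]                           # shape (n, 1, K)
# Broadcasting gives shape (n, a.max(), K); entry [i, l, k] is 1 when a[i, k] >= l + 1
b = (levels <= counts).astype('uint8')

assert b.shape == (a.shape[0], a.max(), a.shape[1])
# Each count c becomes c ones stacked along axis 1, so the sum recovers a
assert (b.sum(axis=1) == a).all()
```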

Inserting sub-matrices into scipy sparse matrix

How can I efficiently insert sub-matrices at specific positions into my sparse matrix? Also, which scipy sparse matrix class is recommended for such an incremental construction?
More specifically, how can I fill the matrix M in the code below?
def rrd(mesh, rel_rotations, neighbors, R_0):
    M = scipy.sparse.lil_matrix((N_FACES*9*3,N_FACES*9))
    for i in range(0,N_FACES*27,27):
        for j in range(3):
            for k in range(0,N_FACES*9,9):
                M[i+j*9:i+(j+1)*9,k:k+9] = -np.eye(9)
    for i in range(len(rel_rotations)):
        diagonals = [
            rel_rotations[i][0][2],
            np.append(rel_rotations[i][0][1].repeat(3), rel_rotations[i][1][2].repeat(3)),
            np.append(rel_rotations[i][0][0].repeat(3), np.append(rel_rotations[i][1][1].repeat(3),
                      rel_rotations[i][2][2].repeat(3))),
            np.append(rel_rotations[i][1][0].repeat(3), rel_rotations[i][2][1].repeat(3)),
            rel_rotations[i][2][0].repeat(3)
        ]
        diag_rel_rotations = scipy.sparse.diags(diagonals, [-6,-3,0,3,6], shape=(9,9)).todense()
        mod = i % 3
        div = int((i-mod)/3)
        n_idx = neighbors[div][mod]
        M[i+mod*9:i+(mod+1)*9][n_idx*9:(n_idx+1)*9] = diag_rel_rotations
Slicing doesn't work here, and I looked through several types of sparse matrices but couldn't figure out which one is appropriate for this problem.

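lil is the right format for assignment. Index with a single pair of comma-separated slices, M[rows, cols], rather than with chained brackets: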
In [553]: M = sparse.lil_matrix((9,9), dtype=int)
In [554]: M
Out[554]:
<9x9 sparse matrix of type '<class 'numpy.int64'>'
with 0 stored elements in LInked List format>
In [555]: M[2:5, 3:6] = np.eye(3)
In [556]: M
Out[556]:
<9x9 sparse matrix of type '<class 'numpy.int64'>'
with 3 stored elements in LInked List format>
In [557]: M.A
Out[557]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [558]: d = sparse.diags([[1,2],[1,2,3],[2,3]], [-1,0,1])
In [562]: M[0:3, 6:9] = d
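
A likely reason the chained form M[a:b][c:d] = ... from the question has no effect is that M[a:b] first builds a separate row-sliced matrix, so the second bracket never reaches M itself. The sketch below uses a hypothetical helper, insert_block, that writes a dense block at a given top-left corner using a single two-slice index, and converts to CSR once construction is finished.

```python
import numpy as np
from scipy import sparse

def insert_block(M, row, col, block):
    """Write a dense block into M with its top-left corner at (row, col).

    Hypothetical helper; M is expected to be a lil_matrix.
    """
    n_rows, n_cols = block.shape
    M[row:row + n_rows, col:col + n_cols] = block   # one index with two slices
    return M

M = sparse.lil_matrix((9, 9))
insert_block(M, 0, 0, -np.eye(3))
insert_block(M, 3, 6, np.diag([1.0, 2.0, 3.0]))

# lil is convenient for incremental assignment; convert once for arithmetic
M_csr = M.tocsr()
print(M_csr.nnz)
print(M_csr.toarray())
```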

How to set a probability of a value becoming a zero for an np.array?

I've got a 219 by 219 np.array with mostly 0s and 2% nonzeros, and I now want to create new arrays where each of the nonzero values has a 90% chance of becoming zero.
I now know how to change every n-th nonzero value to 0, but how do I work with probabilities?
Probably this can be modified:
index=0
for x in range(0, 219):
    for y in range(0, 219):
        if (index+1) % 10 == 0:
            B[x][y] = 0
        index+=1
print(B)

You could use np.random.random to create an array of random numbers to compare with 0.9, and then use np.where to select either the original value or 0. Since each draw is independent, it doesn't matter if we replace a 0 with a 0, so we don't need to treat zero and nonzero values differently. For example:
In [184]: A = np.random.randint(0, 2, (8,8))
In [185]: A
Out[185]:
array([[1, 1, 1, 0, 0, 0, 0, 1],
       [1, 1, 1, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 0, 0, 0],
       [0, 1, 0, 1, 0, 0, 0, 1],
       [0, 1, 0, 1, 1, 1, 1, 0],
       [1, 1, 0, 1, 1, 0, 0, 0],
       [1, 0, 0, 1, 0, 0, 1, 0],
       [1, 1, 0, 0, 0, 1, 0, 1]])
In [186]: np.where(np.random.random(A.shape) < 0.9, 0, A)
Out[186]:
array([[0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0]])
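
For a reproducible version, the same masking can be written with a seeded np.random.default_rng generator. This is a minimal sketch: the seed, the value range, and the 2% fill used to build the test array are made up to mimic the question's 219 by 219 setup.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen arbitrarily for reproducibility

# Made-up test data: a 219x219 array with roughly 2% non-zero entries
A = np.zeros((219, 219), dtype=int)
fill = rng.random(A.shape) < 0.02
A[fill] = rng.integers(1, 10, size=fill.sum())

# Each entry independently survives with probability 0.1 (is zeroed with probability 0.9)
keep = rng.random(A.shape) >= 0.9
B = np.where(keep, A, 0)

print(np.count_nonzero(A), np.count_nonzero(B))
```

Calling the last two steps again with the same rng yields a fresh, independent thinned copy of A each time.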

import random
import numpy as np
# first method: draw each element independently
prob=0.3
print(np.random.choice([2,5], (5,), p=[prob,1-prob]))
# second method (I prefer): fix the counts of a and b, then shuffle
def randomZerosOnes(a,b, N, prob):
    if prob > 1-prob:
        n1=int((1-prob)*N)
        n0=N-n1
    else:
        n0=int(prob*N)
        n1=N-n0
    zo=np.concatenate(([a for _ in range(n0)] ,[b for _ in range(n1)] ), axis=0 )
    random.shuffle(zo)
    return zo
zo=randomZerosOnes(2,5, N=5, prob=0.3)
print(zo)

How to shift the columns of a 2D array multiple times, while still considering its original position?

Alright, so consider that I have a matrix m, as follows:
m = [[0, 1, 0, 0, 0, 1],
[4, 0, 0, 3, 2, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
My goal is to check each row of the matrix and see if the sum of that row is zero. If the sum is not zero, I want to shift the column that corresponds to that row to the end of the matrix. If the sum of the row is zero, nothing happens. So in the given matrix above the following should occur:
The program discovers that the 0th row has a sum that does not equal zero
The 0th column of the matrix is shifted to the end of the matrix, as follows:
m = [[1, 0, 0, 0, 1, 0],
[0, 0, 3, 2, 0, 4],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
The program checks the next row and does the same, shifting the column to the end of the matrix
m = [[0, 0, 0, 1, 0, 1],
[0, 3, 2, 0, 4, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
Each of the other rows is checked, but since all of their sums are zero no shift is made, and the final result is the matrix above.
The issue arises after shifting the columns of the matrix for the first time: once all of the values are shifted, it becomes tricky to tell which column corresponds to which row.
I can't use numpy to solve this problem as I can only use the original Python 2 libraries.

Use a simple loop: whenever a row's sum is not equal to zero, loop over all rows again and append the popped first item to the end of each row.
>>> from pprint import pprint
>>> m = [[0, 1, 0, 0, 0, 1],
...      [4, 0, 0, 3, 2, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0]]
>>> for row in m:
...     # If all numbers are >= 0 then we can short-circuit this using `if any(row):`.
...     if sum(row) != 0:
...         for r in m:
...             r.append(r.pop(0))
...
>>> pprint(m)
[[0, 0, 0, 1, 0, 1],
[0, 3, 2, 0, 4, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
list.pop(0) is an O(N) operation; if you need something faster, use collections.deque.

deque can rotate elements.
from collections import deque
def rotate(matrix):
    matrix_d = [deque(row) for row in matrix]
    for row in matrix:
        if sum(row) != 0:
            for row_d in matrix_d:
                row_d.rotate(-1)
    return [list(row) for row in matrix_d]
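
Since every row with a non-zero sum triggers the same one-step left rotation, and rotating never changes a row's sum, the net effect is a single rotation by the number of such rows. A sketch of that shortcut, using only built-ins (shift_columns is a made-up name), could look like this:

```python
def shift_columns(matrix):
    """Return a copy of matrix rotated left once per row whose sum is non-zero."""
    if not matrix or not matrix[0]:
        return [list(row) for row in matrix]
    # Count the rows that would each trigger one shift
    shifts = sum(1 for row in matrix if sum(row) != 0)
    k = shifts % len(matrix[0])          # a full cycle leaves the rows unchanged
    return [row[k:] + row[:k] for row in matrix]

m = [[0, 1, 0, 0, 0, 1],
     [4, 0, 0, 3, 2, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0]]
result = shift_columns(m)
print(result[0])   # [0, 0, 0, 1, 0, 1]
print(result[1])   # [0, 3, 2, 0, 4, 0]
```

This does one slice per row instead of one pop per row per shift, and it leaves the input matrix untouched.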
