cupy/numpy ignores duplicate indexes - python

When arrays are used as indexes, cupy/numpy ignores duplicates.
Example:
import cupy as cp
matrix = cp.zeros((3, 3))
xi = cp.asarray([0, 1, 1, 2])
yi = cp.asarray([0, 1, 1, 2])
matrix[xi, yi] += 1
print(matrix.get())
Output:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
Desired output:
[[1. 0. 0.]
[0. 2. 0.]
[0. 0. 1.]]
The second (1, 1) index is ignored. How can I apply the operation to duplicate indexes as well?
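One way to accumulate over repeated indexes is the unbuffered ufunc method np.add.at. This is a minimal sketch, assuming NumPy rather than CuPy so that it runs without a GPU; on CuPy, cupyx.scatter_add(matrix, (xi, yi), 1) should be the counterpart, but that call is not exercised here.

```python
import numpy as np

matrix = np.zeros((3, 3))
xi = np.asarray([0, 1, 1, 2])
yi = np.asarray([0, 1, 1, 2])

# matrix[xi, yi] += 1 is buffered: each distinct position is written once.
# np.add.at performs the addition once per index pair, duplicates included.
np.add.at(matrix, (xi, yi), 1)
print(matrix)
```

With this, position (1, 1) receives both increments and ends up as 2.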

Related

How do I create a binary matrix with a specific repeating pattern of 1s and 0s?

I want to efficiently print a matrix in Python that follows a specific repeating pattern: one column holds 1s for 3 consecutive rows while every other column holds 0s, then the 1s move to the next column, and so on for 1000 rows, as shown below:
100000
100000
100000
010000
010000
010000
001000
001000
001000
000100
000100
000100
000010
000010
000010
000001
000001
000001
100000
100000
100000
010000
010000
010000
...
First, you can create a diagonal matrix of size (6, 6) with only ones on the diagonal:
>>> arr = np.diag(np.ones(6))
Then, you can repeat each row of that matrix 3 times:
>>> arr = np.repeat(arr, repeats=3, axis=0)
>>> arr
[[1. 0. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 1.]
[0. 0. 0. 0. 0. 1.]
[0. 0. 0. 0. 0. 1.]]
Finally, use np.tile to tile this matrix the number of times you want. In your case, as you want 1000 rows, you can repeat the array 1000 // 18 + 1 = 56 times, and only keep the first 1000 rows.
>>> arr = np.tile(arr, (56, 1))[:1000]
Build an identity matrix, and then take out the matrix you need by generating the row indices (elegant but inefficient):
>>> np.eye(6, dtype=int)[np.arange(1000) // 3 % 6]
array([[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
...,
[0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0]])
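The two approaches above can be checked against each other. This is a small sketch; the names n_rows, n_cols and run are only illustrative and do not come from the answers.

```python
import numpy as np

n_rows, n_cols, run = 1000, 6, 3

# Approach 1: identity -> repeat each row `run` times -> tile -> trim.
block = np.repeat(np.eye(n_cols, dtype=int), repeats=run, axis=0)  # shape (18, 6)
n_tiles = -(-n_rows // block.shape[0])  # ceiling division, 56 for these sizes
tiled = np.tile(block, (n_tiles, 1))[:n_rows]

# Approach 2: pick rows of the identity matrix with computed row numbers.
indexed = np.eye(n_cols, dtype=int)[np.arange(n_rows) // run % n_cols]

print(np.array_equal(tiled, indexed))  # True
```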

Is there a way in Python to get a sub matrix as in Matlab?

For example, let's say that I have the following matrices in Matlab:
A = zeros(10)
B = ones(2,2)
I want to add the matrix A with B in specific positions of A that are stored like this:
locations = [1, 3]
I can do this:
A(locations, locations) = A(locations, locations) + B
So the job is done. In Python, I would like to do the same using NumPy arrays, like:
import numpy as np
A = np.zeros([10,10])
B = np.ones([2,2])
locations = np.array([0, 2]) #Because NumPy arrays are zero indexed
A[locations, locations] = A[locations, locations] + B
But I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: value array of shape (2,2) could not be broadcast to indexing result of shape (2,)
Does anyone know how I can do this?
In [126]: A = np.zeros((5,5),int)
In [127]: B = np.arange(1,5).reshape(2,2)
In [128]: idx = np.array([0,2])
In numpy, indexing with two 1d arrays produces a 'diagonal': the points (0,0) and (2,2). To get the same in MATLAB you have to use some sort of sub2ind to convert the 2d indexing to 1d.
In [129]: A[idx,idx]
Out[129]: array([0, 0])
To get a block (as MATLAB does), we have to take advantage of broadcasting:
In [130]: A[idx[:,None],idx]
Out[130]:
array([[0, 0],
[0, 0]])
In [131]: A[idx[:,None],idx]=B
In [132]: A
Out[132]:
array([[1, 0, 2, 0, 0],
[0, 0, 0, 0, 0],
[3, 0, 4, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
This, and other indexing details, are covered in
https://numpy.org/doc/stable/reference/arrays.indexing.html
https://numpy.org/doc/stable/user/basics.broadcasting.html
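As a possible shorthand for the broadcasting form, np.ix_ builds the same kind of open-mesh index from the two 1d arrays. A minimal sketch using the 10x10 A and 2x2 B from the question:

```python
import numpy as np

A = np.zeros((10, 10))
B = np.ones((2, 2))
locations = np.array([0, 2])

# np.ix_ returns index arrays of shape (2, 1) and (1, 2), which broadcast
# to select the 2x2 block at rows {0, 2} and columns {0, 2}.
block = np.ix_(locations, locations)
A[block] += B
print(A[:3, :3])
```

Because the selected positions are all distinct, the in-place += is safe here.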
A = np.zeros([10,10])
B = np.ones([2,2])
print(B)
A[:2,:2]=B
print(A)
#output B
[[1. 1.]
[1. 1.]]
#output A
[[1. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[1. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

Convert categorical data back to numbers using keras utils to_categorical

I am using to_categorical from keras.utils to one-hot encode the numbers in a list. How can I get the numbers back from the categorical data? Is there a function available for that?
Y=to_categorical(y, num_classes=79)
You can do it simply by np.argmax():
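import numpy as np
from keras.utils import to_categorical
y = [0, 1, 2, 0, 4, 5]
# num_classes must exceed the largest label; len(y) == 6 happens to work here
Y = to_categorical(y, num_classes=len(y))
print(Y)
y = np.argmax(Y, axis=-1)
print(y)
# [0 1 2 0 4 5]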
Why use argmax(axis=-1)?
In the above example, to_categorical returns a matrix with shape (6,6). Setting axis=-1 means taking the index of the largest value in each row.
[[1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 1.]]
See the NumPy documentation on indexing for more.
What if my data have more than 1 dimension?
No difference. Each entry in the original list is converted to a one-hot vector of size [1, nb_classes] in which only one index is one and the rest are zero. As in the above example, taking the argmax along the last axis recovers the original list.
y = [[0, 1], [2, 0], [4, 5]]
Y = keras.utils.to_categorical(y, num_classes=6)
#[[[1. 0. 0. 0. 0. 0.]
# [0. 1. 0. 0. 0. 0.]]
#
# [[0. 0. 1. 0. 0. 0.]
# [1. 0. 0. 0. 0. 0.]]
#
# [[0. 0. 0. 0. 1. 0.]
# [0. 0. 0. 0. 0. 1.]]]
y = np.argmax(Y, axis=-1)
#[[0 1]
# [2 0]
# [4 5]]
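The same round trip can be sketched without Keras. Here np.eye(num_classes)[y] stands in for to_categorical; that substitution is an assumption made only so the example runs with NumPy alone.

```python
import numpy as np

y = np.array([[0, 1], [2, 0], [4, 5]])
num_classes = 6

# Indexing the identity matrix with integer labels one-hot encodes them.
Y = np.eye(num_classes)[y]          # shape (3, 2, 6)

# argmax over the last axis undoes the encoding, whatever the leading shape.
recovered = np.argmax(Y, axis=-1)   # shape (3, 2)
print(np.array_equal(recovered, y))  # True
```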

Add 2-d array to 3-d array with constantly changing index fast

I'm trying to add a 2-d array to a 3-d array at a constantly changing index. I came up with the following code:
import numpy as np
a = np.zeros([8, 3, 5])
k = 0
for i in range(2):
    for j in range(4):
        a[k, i: i + 2, j: j + 2] += np.ones([2, 2], dtype=int)
        k += 1
print(a)
which gives exactly what I want:
[[[1. 1. 0. 0. 0.]
[1. 1. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
[[0. 1. 1. 0. 0.]
[0. 1. 1. 0. 0.]
[0. 0. 0. 0. 0.]]
[[0. 0. 1. 1. 0.]
[0. 0. 1. 1. 0.]
[0. 0. 0. 0. 0.]]
[[0. 0. 0. 1. 1.]
[0. 0. 0. 1. 1.]
[0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
[1. 1. 0. 0. 0.]
[1. 1. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
[0. 1. 1. 0. 0.]
[0. 1. 1. 0. 0.]]
[[0. 0. 0. 0. 0.]
[0. 0. 1. 1. 0.]
[0. 0. 1. 1. 0.]]
[[0. 0. 0. 0. 0.]
[0. 0. 0. 1. 1.]
[0. 0. 0. 1. 1.]]]
I wish it could be faster, so I created an index array and tried to use np.vectorize. But as the manual says, vectorize is not for performance. My goal is to run this over an array of shape (10^6, 15, 15), which ends up as 10^6 iterations. I hope there is a cleaner solution that gets rid of all the for-loops.
This is my first time using Stack Overflow; any suggestions are appreciated.
Thank you.
An efficient solution uses numpy.lib.stride_tricks, which can "view" all the possibilities.
import numpy as np
from numpy.lib.stride_tricks import as_strided
N = 4  # tray size (square)
P = 3  # chunk size
R = N - P
tray = np.zeros((N, N), np.int32)
chunk = np.ones((P, P), np.int32)
tray[R:, R:] = chunk
tray = np.vstack((tray, tray))
# strides are given in bytes: 4 is the size of one int32
view = as_strided(tray, shape=(R+1, R+1, N, N), strides=(4*N, 4, 4*N, 4))
a_view = view.reshape(-1, N, N)
a_hard = a_view.copy()
Here is the result:
In [3]: a_view
Out[3]:
array([[[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1]],
[[0, 0, 0, 0],
[1, 1, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 0]],
[[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 0, 0, 0]],
[[1, 1, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 0],
[0, 0, 0, 0]]])
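view is just a view on the possible positions of a chunk on the tray. It doesn't cost any computation, and it uses just twice the tray space. Note that the reshape into a_view cannot merge the first two strided axes in place, so NumPy may already copy the data at that step.
a_hard is a hard copy, necessary if you need to modify it.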
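For the array in the question itself, another option is plain broadcasting with integer index arrays. This is a sketch under the question's sizes (2 row offsets, 4 column offsets, 2x2 blocks); the names row_off, col_off and d are illustrative only.

```python
import numpy as np

n_i, n_j, size = 2, 4, 2            # row offsets, column offsets, block size
a = np.zeros((n_i * n_j, 3, 5))

k = np.arange(n_i * n_j)            # slice index along the first axis
row_off, col_off = np.divmod(k, n_j)  # (i, j) offset of the block in slice k
d = np.arange(size)

# The three index arrays broadcast to shape (8, 2, 2): one 2x2 block per slice.
a[k[:, None, None],
  row_off[:, None, None] + d[:, None],
  col_off[:, None, None] + d] += 1
print(a[5])
```

Each (slice, row, column) triple appears only once, so the buffered += does not drop any update.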

Update 3 and 4 dimension elements of numpy array

I have a numpy array of shape [12, 8, 5, 5]. I want to modify the values of the 3rd and 4th dimensions for each element.
For example:
import numpy as np
x = np.zeros((12, 80, 5, 5))
print(x[0,0,:,:])
Output:
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
Modify values:
y = np.ones((5,5))
x[0,0,:,:] = y
print(x[0,0,:,:])
Output:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
I can modify all x[i,j,:,:] using two for loops, but I was wondering if there is a Pythonic way to do it without running two loops. Just curious to know :)
UPDATE
Actual use case:
dict_weights = copy.deepcopy(combined_weights)
for i in range(0, len(combined_weights[each_layer][:, 0, 0, 0])):
    for j in range(0, len(combined_weights[each_layer][0, :, 0, 0])):
        # Extract 5x5
        trans_weight = combined_weights[each_layer][i,j]
        trans_weight = np.fliplr(np.flipud(trans_weight ))
        # Update
        dict_weights[each_layer][i, j] = trans_weight
NOTE: The dimensions i, j of combined_weights can vary. There are around 200 elements in this list with varied i and j dimensions, but the 3rd and 4th dimensions are always the same (i.e. 5x5).
I just want to know if I can update the elements combined_weights[:,:,5, 5] with transposed values without running 2 for loops.
Thanks.
Simply do -
dict_weights[each_layer] = combined_weights[each_layer][...,::-1,::-1]
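A quick check that the slicing one-liner matches the double loop. The random weights array below is a hypothetical stand-in for one combined_weights[each_layer] entry.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.random((3, 4, 5, 5))   # hypothetical stand-in for one layer

# Loop version from the question: flip every 5x5 block in both directions.
looped = np.empty_like(weights)
for i in range(weights.shape[0]):
    for j in range(weights.shape[1]):
        looped[i, j] = np.fliplr(np.flipud(weights[i, j]))

# Vectorized version: reverse the last two axes for all blocks at once.
sliced = weights[..., ::-1, ::-1]
print(np.array_equal(looped, sliced))  # True
```

np.flip(weights, axis=(-2, -1)) should give the same result if a function call reads better than the slice.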
