Numpy Cyclic Broadcast of Fancy Indexing - python

A is a NumPy array with shape (6, 8).
I want something like:
x_id = np.array([0, 3])
y_id = np.array([1, 3, 4, 7])
A[x_id, y_id] += 1  # this doesn't actually work.
Tricks like ::2 won't work because the indices do not increase regularly.
I don't want to use extra memory to repeat [0, 3] and make a new array [0, 3, 0, 3], because that is slow.
The indices for the two dimensions do not have equal length.
The operation I want is equivalent to:
A[0, 1] += 1
A[3, 3] += 1
A[0, 4] += 1
A[3, 7] += 1
Can numpy do something like this?
Update:
Not sure if broadcast_to or stride_tricks is faster than nested python loops. (Repeat NumPy array without replicating data?)

You can reshape y_id into a 2d array whose second dimension has the same length as x_id; the two index arrays are then broadcast against each other automatically. This requires len(y_id) to be a multiple of len(x_id):
import numpy as np

x_id = np.array([0, 3])
y_id = np.array([1, 3, 4, 7])

A = np.zeros((6, 8))
A[x_id, y_id.reshape(-1, x_id.size)] += 1
A
array([[ 0., 1., 0., 0., 1., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 1., 0., 0., 0., 1.],
       [ 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0.]])
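To check that the reshape trick really pairs the indices cyclically, here is a minimal sketch that compares it against an explicit Python loop. The names expected and result are only illustrative, and the example assumes that no (row, column) pair occurs twice, since += with fancy indexing does not accumulate repeated pairs.

```python
import numpy as np

x_id = np.array([0, 3])
y_id = np.array([1, 3, 4, 7])

# Reference: pair y_id[k] with x_id[k % len(x_id)] in a plain loop.
expected = np.zeros((6, 8))
for k, col in enumerate(y_id):
    expected[x_id[k % x_id.size], col] += 1

# Broadcast version: y_id reshaped to (2, 2) is indexed against x_id of shape (2,).
result = np.zeros((6, 8))
result[x_id, y_id.reshape(-1, x_id.size)] += 1

print(np.array_equal(result, expected))  # True
```

If repeated pairs are possible, np.add.at(result, (rows, cols), 1) is the usual way to make every occurrence count.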

Related

Map numpy categorical data to a numpy vector

I have a NumPy array that looks like:
my_arr = array([[0., 0., 0., 0., 1., 0.],
                [0., 1., 0., 0., 0., 0.],
                [0., 0., 0., 1., 0., 0.],
                [0., 0., 0., 0., 1., 0.],
                [1., 0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 1., 0.],
                [0., 1., 0., 0., 0., 0.],
                [0., 0., 0., 0., 1., 0.],
                ...
                ...]
I want to return a vector that contains, for each row of my_arr, the index of the entry with value one. How can I do so?
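You can use np.argmax() for that.
inds = np.argmax(my_arr, axis=1)
# array([4, 1, 3, 4, 0, 4, 1, 4])
Another option is np.where, which returns the row and column indices of the nonzero entries; take the column indices:
np.where(my_arr)[1]
See the docs: https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html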
You can use np.argwhere to return an array of coordinates:
arr = np.random.randint(0, 2, (5, 5))
print(arr)
[[0 0 1 1 1]
 [0 1 0 1 1]
 [1 1 0 0 1]
 [1 1 1 0 0]
 [1 1 1 1 0]]
res = np.argwhere(arr)
print(res)
array([[0, 2], [0, 3], ..., [4, 2], [4, 3]], dtype=int64)
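The three suggestions above can be compared side by side on a small one-hot array. This is only a sketch with made-up data; it assumes every row contains exactly one 1, which is what makes the np.where and np.argwhere results line up with the rows.

```python
import numpy as np

# Small one-hot array: exactly one 1 per row (assumption of this sketch).
my_arr = np.array([[0., 0., 0., 0., 1., 0.],
                   [0., 1., 0., 0., 0., 0.],
                   [0., 0., 0., 1., 0., 0.],
                   [1., 0., 0., 0., 0., 0.]])

by_argmax = np.argmax(my_arr, axis=1)      # index of the largest entry per row
by_where = np.where(my_arr)[1]             # column indices of nonzero entries
by_argwhere = np.argwhere(my_arr)[:, 1]    # second column of the coordinate pairs

print(by_argmax)  # [4 1 3 0]
```

If a row could be all zeros, np.argmax would still return 0 for it, while np.where would skip the row entirely, so the two are only interchangeable for strictly one-hot data.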

Numpy: raise ValueError("shape too large to be a matrix.")

I am using a 12x12 numpy matrix, and I am getting "shape too large to be a matrix." My best guess is that numpy "kron" function is making trouble.
Here's my code:
a = np.matrix("0 1 0; 0 0 1; 0 0 0 ")
a_dag = np.matrix("0 0 0; 1 0 0 ; 0 1 0")
Sp = np.matrix("0 1; 0 0")
Sm = np.matrix("0 0; 1 0")
...
119 H_I1 = (np.exp(1j*(phi-omega*t))*kron(np.eye(3),Sp,np.eye(2))
120 +np.exp(-1j*(phi-omega*t))*kron(np.eye(3),Sm,np.eye(2)))
121 H_I2 = kron(a,Sp,np.eye(2)) + kron(a_dag,Sm,np.eye(2))
Here's the error:
Traceback (most recent call last):
  File "/home/fyodr/qc_final.py", line 121, in <module>
    H_I2 = kron(a,Sp,np.eye(2)) + kron(a_dag,Sm,np.eye(2))
  File "/home/fyodr/qc_final.py", line 70, in kron
    return np.kron(m[0],kron(m[1:]))
  File "/usr/lib/python2.7/dist-packages/numpy/lib/shape_base.py", line 754, in kron
    result = wrapper(result)
  File "/usr/lib/python2.7/dist-packages/numpy/matrixlib/defmatrix.py", line 303, in __array_finalize__
    raise ValueError("shape too large to be a matrix.")
ValueError: shape too large to be a matrix.
Thanks!
EDIT: I defined kron as
def kron(*m):
    if len(m) == 1:
        return m
    else:
        return np.kron(m[0],kron(m[1:]))
If np.kron were computing a regular kronecker product, then this should not be a problem.
As I commented, I don't know what your 3-argument kron does. But if it produces a 3d array at some stage, that could cause your error.
In [264]: np.kron(a.A, np.ones((3,3,3))).shape
Out[264]: (3, 9, 9)
np.kron of a 2d array with a 3d array returns a 3d array. But if a is an np.matrix, NumPy tries to convert the result to a matrix, which raises the error, because np.matrix is always 2d.
In [265]: np.kron(a, np.ones((3,3,3))).shape
---------------------------------------------------------------------------
....
ValueError: shape too large to be a matrix.
Experienced numpy users don't use np.matrix unless we really need its features, and can live with its drawbacks.
With the kron that you added, the recursive step does:
In [270]: m = (a, Sp, np.eye(2))
In [271]: kron(m[1:])
Out[271]:
((matrix([[0, 1],
          [0, 0]]), array([[ 1., 0.],
                           [ 0., 1.]])),)
In [272]: np.array(_)
Out[272]:
array([[[[ 0., 1.],
         [ 0., 0.]],
        [[ 1., 0.],
         [ 0., 1.]]]])
In [273]: _.shape
Out[273]: (1, 2, 2, 2)
For 2 items, your kron returns a nested tuple of arrays. np.kron applies a np.asanyarray(b) to that 2nd argument, which results in a 4d array.
Applying your kron to full *m, but turning the matrices into arrays:
In [275]: kron(a.A, Sp.A, np.eye(2))
Out[275]:
array([[[[ 0., 0., 0., 1., 0., 0.],
         [ 0., 0., 0., 0., 0., 0.],
         [ 0., 0., 0., 0., 0., 1.],
         [ 0., 0., 0., 0., 0., 0.],
         [ 0., 0., 0., 0., 0., 0.],
         [ 0., 0., 0., 0., 0., 0.]],
        [[ 0., 0., 1., 0., 0., 0.],
         [ 0., 0., 0., 1., 0., 0.],
         [ 0., 0., 0., 0., 1., 0.],
         [ 0., 0., 0., 0., 0., 1.],
         [ 0., 0., 0., 0., 0., 0.],
         [ 0., 0., 0., 0., 0., 0.]]]])
In [276]: _.shape
Out[276]: (1, 2, 6, 6)
Did you even test the kron function by itself? It should have been debugged before use in a more complicated task.
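For reference, here is one way the helper could be repaired. It is a sketch, not the original poster's code: it assumes plain 2d arrays rather than np.matrix, and the names kron_recursive and kron_reduce are made up for illustration. The two changes to the recursive version are returning m[0] instead of the tuple m, and unpacking the remaining arguments with *m[1:].

```python
from functools import reduce

import numpy as np


def kron_recursive(*m):
    # Base case returns the single array itself, not the 1-tuple that holds it.
    if len(m) == 1:
        return m[0]
    # Unpack the rest so the recursive call again receives separate arrays.
    return np.kron(m[0], kron_recursive(*m[1:]))


def kron_reduce(*m):
    # Same product as a left fold; the Kronecker product is associative.
    return reduce(np.kron, m)


a = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
Sp = np.array([[0, 1], [0, 0]])

H = kron_recursive(a, Sp, np.eye(2))
print(H.shape)  # (12, 12)
```

With three factors of shape (3, 3), (2, 2) and (2, 2), both versions give the 12x12 result the question expected.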

A neater way to set values at indexes with NumPy

I have a numpy array initially with zeros, like this:
v = np.zeros((5, 5))
v
array([[ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.]])
I also have a set of arrays idx1 and idx2.
idx1
array([[0, 3],
       [0, 4],
       [1, 3],
       [2, 4]])
idx2
array([[0, 1],
       [0, 2],
       [0, 4],
       [1, 3]])
Look upon each pair of values as row and column indices. So, for example, in idx1, the first pair (0, 3) would be indexers into v[0, 3] and so on.
I want to first set values at indexes specified by idx1 to 1, followed by all indexes specified by idx2 to 0.
Also, please note that if there is a pair (i, j) in some array, I want to set v[i, j] and v[j, i] at the same time.
My final result becomes:
array([[ 0., 0., 0., 1., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 1.],
       [ 1., 0., 0., 0., 0.],
       [ 0., 0., 1., 0., 0.]])
I currently achieve this by doing:
def set_vals(x, i, j, v):
    x[i, j] = x.T[i, j] = v
v = np.zeros((5, 5))
i1, j1 = idx1[:, 0], idx1[:, 1]
i2, j2 = idx2[:, 0], idx2[:, 1]
set_vals(v, i1, j1, 1)
set_vals(v, i2, j2, 0)
v # the result
However, I believe there might be a better way. Would love to hear any thoughts/suggestions for improvement. Thanks!
In search of a more "compact" way of expressing it, I got this -
v = np.zeros((5, 5))
v[tuple(np.r_[idx1,idx1[:,::-1]].T)] = 1
v[tuple(np.r_[idx2,idx2[:,::-1]].T)] = 0
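On Python 3.6+, you can use the * unpacking operator instead of tuple(). Unpack into a tuple rather than a list, because recent NumPy versions no longer treat a list of index arrays as a multidimensional index:
v[(*np.r_[idx1,idx1[:,::-1]].T,)] = 1
v[(*np.r_[idx2,idx2[:,::-1]].T,)] = 0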
v
array([[ 0., 0., 0., 1., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 1.],
       [ 1., 0., 0., 0., 0.],
       [ 0., 0., 1., 0., 0.]])
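The same idea can be written as a small helper, which may be easier to read than the one-liners. This is a sketch only; set_symmetric is a hypothetical name, and np.concatenate is used here in place of np.r_ purely for readability.

```python
import numpy as np

idx1 = np.array([[0, 3], [0, 4], [1, 3], [2, 4]])
idx2 = np.array([[0, 1], [0, 2], [0, 4], [1, 3]])


def set_symmetric(x, idx, value):
    # Stack the (i, j) pairs on top of their mirrored (j, i) pairs,
    # then index rows with the first column and columns with the second.
    both = np.concatenate([idx, idx[:, ::-1]])
    x[both[:, 0], both[:, 1]] = value


v = np.zeros((5, 5))
set_symmetric(v, idx1, 1)  # first set the idx1 positions to 1
set_symmetric(v, idx2, 0)  # then clear the idx2 positions
print(v)
```

The order of the two calls matters: pairs that appear in both idx1 and idx2, such as (0, 4), end up as 0.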

Scikit: Convert one-hot encoding to encoding with integers

I need to convert one-hot encoding to categories represented by unique integers. So one-hot encoding created with the following code:
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
labels = [[1],[2],[3]]
enc.fit(labels)
for x in [1,2,3]:
    print(enc.transform([[x]]).toarray())
Out:
[[ 1. 0. 0.]]
[[ 0. 1. 0.]]
[[ 0. 0. 1.]]
Could be converted back to a set of unique integers, for example:
[1,2,3] or [11,37, 45] or any other where each integer uniquely represents a single class.
Is it possible to do with scikit-learn or any other python lib?
* Update *
Tried to:
labels = [[1],[2],[3], [4], [5],[6],[7]]
enc.fit(labels)
lst = []
for x in [1,2,3,4,5,6,7]:
    lst.append(enc.transform([[x]]).toarray())
lst
Out:
[array([[ 1., 0., 0., 0., 0., 0., 0.]]),
array([[ 0., 1., 0., 0., 0., 0., 0.]]),
array([[ 0., 0., 1., 0., 0., 0., 0.]]),
array([[ 0., 0., 0., 1., 0., 0., 0.]]),
array([[ 0., 0., 0., 0., 1., 0., 0.]]),
array([[ 0., 0., 0., 0., 0., 1., 0.]]),
array([[ 0., 0., 0., 0., 0., 0., 1.]])]
a = np.array(lst)
np.where(a==1)[1]
Out:
array([0, 0, 0, 0, 0, 0, 0], dtype=int64)
This is not what I need.
You can do that using np.where as follows:
import numpy as np
a=np.array([[ 0., 1., 0.],
[ 1., 0., 0.],
[ 0., 0., 1.]])
np.where(a==1)[1]
This prints array([1, 0, 2], dtype=int64). This works since np.where(a==1)[1] returns the column indices of the 1's, which are exactly the labels.
In addition, since a is a 0,1-matrix, you can also replace np.where(a==1)[1] with just np.where(a)[1].
Update: The following solution should work with your format:
l=[np.array([[ 1., 0., 0., 0., 0., 0., 0.]]),
np.array([[ 0., 0., 1., 0., 0., 0., 0.]]),
np.array([[ 0., 1., 0., 0., 0., 0., 0.]]),
np.array([[ 0., 0., 0., 0., 1., 0., 0.]]),
np.array([[ 0., 0., 0., 0., 1., 0., 0.]]),
np.array([[ 0., 0., 0., 0., 0., 1., 0.]]),
np.array([[ 0., 0., 0., 0., 0., 0., 1.]])]
a=np.array(l)
np.where(a)[2]
This prints
array([0, 2, 1, 4, 4, 5, 6], dtype=int64)
Alternatively, you could use the original solution together with @ml4294's comment.
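The reason np.where(a==1)[1] returned all zeros in the question is that a list of (1, n) arrays becomes a 3d array of shape (k, 1, n), so axis 1 has length 1. A minimal sketch with made-up rows shows both ways around it: index the last axis, or stack the rows into a 2d array first. The variable names are illustrative only.

```python
import numpy as np

# Each encoded row has shape (1, 3), as produced by transform(...).toarray().
rows = [np.array([[1., 0., 0.]]),
        np.array([[0., 0., 1.]]),
        np.array([[0., 1., 0.]])]

stacked = np.array(rows)    # shape (3, 1, 3): the middle axis has length 1
flat = np.vstack(rows)      # shape (3, 3): one encoded row per line

labels_where = np.where(stacked)[2]    # column indices live on the last axis
labels_argmax = flat.argmax(axis=1)    # same labels from the 2d version

print(stacked.shape, flat.shape)  # (3, 1, 3) (3, 3)
print(labels_argmax)              # [0 2 1]
```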
You can use np.argmax():
from sklearn.preprocessing import OneHotEncoder
import numpy as np
enc = OneHotEncoder()
labels = [[1],[2],[3]]
enc.fit(labels)
x = enc.transform(labels).toarray()
# x = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
xr = (np.argmax(x, axis=1)+1).reshape(-1, 1)
print(xr)
This should return array([[1], [2], [3]]). If you want instead array([[0], [1], [2]]), just remove the +1 in the definition of xr.
Since you are using sklearn.preprocessing.OneHotEncoder to 'encode' the data, you can use its .inverse_transform() method to 'decode' the data (I think this requires scikit-learn version 0.20.1 or newer):
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
labels = [[1],[2],[3]]
encoder = enc.fit(labels)
encoded_labels = encoder.transform(labels)
decoded_labels = encoder.inverse_transform(encoded_labels)
decoded_labels
# array([[1],
#        [2],
#        [3]])
N.B. decoded_labels is a NumPy array, not a list.
Source: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html#sklearn.preprocessing.OneHotEncoder.inverse_transform

updating specific numpy matrix columns

I have the following list of indices, [2 4 3 4], which correspond to my target indices. I'm creating a matrix of zeros with targets = np.zeros((features.shape[0], 5)). I'm wondering if it's possible to index it in such a way that I can set the values at those specific indices to 1 all at once, without a for loop. Ideally the matrix would look like:
([0,0,1,0,0], [0,0,0,0,1], [0,0,0,1,0], [0,0,0,0,1])
I believe you can do something like this:
targets = np.zeros((4, 5))
ind = [2, 4, 3, 4]
targets[np.arange(0, 4), ind] = 1
Here is the result:
array([[ 0., 0., 1., 0., 0.],
       [ 0., 0., 0., 0., 1.],
       [ 0., 0., 0., 1., 0.],
       [ 0., 0., 0., 0., 1.]])
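The hard-coded 4 can be replaced by the length of the index list, so the same line works for any number of rows. The sketch below assumes 5 columns, as in the question; n_classes and targets_eye are illustrative names. It also shows an alternative that picks rows from an identity matrix.

```python
import numpy as np

ind = np.array([2, 4, 3, 4])
n_classes = 5  # assumed number of columns, as in the question

# Pair row k with column ind[k] and set those entries in one assignment.
targets = np.zeros((ind.size, n_classes))
targets[np.arange(ind.size), ind] = 1

# Alternative: row i of the identity matrix is the one-hot vector for class i.
targets_eye = np.eye(n_classes)[ind]

print(np.array_equal(targets, targets_eye))  # True
```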
