Say I have a matrix of shape (2,3). I need to diagonalize each 3-element vector into a matrix of shape (3,3), for both vectors at once. That is, I need to return an array of shape (2,3,3). How can I do that elegantly with NumPy?
Given data = np.array([[1,2,3],[4,5,6]])
I want the result [[[1,0,0],
[0,2,0],
[0,0,3]],
[[4,0,0],
[0,5,0],
[0,0,6]]]
Thanks
tl;dr, my one-liner: mydiag=np.vectorize(np.diag, signature='(n)->(n,n)')
I suppose here that by "diagonalize" you mean "applying np.diag".
As a teacher of linear algebra, this tickles me a bit, since "diagonalizing" has a specific meaning, which is not that: it means computing eigenvectors and eigenvalues and, from there, writing M=P⁻¹ΛP, which you cannot do from the inputs you have.
So, I suppose that if input matrix is
[[1, 2, 3],
[9, 8, 7]]
The output matrix you want is
[[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]],
[[9, 0, 0],
[0, 8, 0],
[0, 0, 7]]]
If not, you can ignore this answer [Edit: in the meantime, you explained exactly that, so you may continue to read].
There are many ways to do that.
My one liner would be
mydiag=np.vectorize(np.diag, signature='(n)->(n,n)')
This builds a new function that does what you want: it interprets the input as a list of 1D arrays, calls np.diag on each of them to get a 2D array, and puts each 2D array in a NumPy array, thus producing a 3D array.
Then, you just call mydiag(M)
One advantage of vectorize is that it handles NumPy broadcasting for you. I expected that to mean the loops run in C rather than in Python, and therefore to be faster. It is not: the NumPy documentation states that vectorize is provided primarily for convenience, not for performance, and that its implementation is essentially a for loop. In practice, on small matrices it is slower than Michael's method (in the comments), and on large matrices it has exactly the same speed, which is frustrating, since the einsum documentation itself says that it sacrifices broadcasting.
Plus, it is a one-liner, which has no other interest than bragging on forums. But well, here we are.
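To make the comparison concrete, here is a minimal sketch that runs the one-liner on the data from the question and checks it against an einsum formulation. The einsum expression is one common way to write this and is an assumption on my part, not necessarily the exact code from the comment mentioned above.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]])

# gufunc-style wrapper: each length-n vector is mapped to an (n, n) matrix
mydiag = np.vectorize(np.diag, signature='(n)->(n,n)')
via_vectorize = mydiag(data)

# einsum alternative (assumed formulation): out[i, j, k] = data[i, j] * eye[j, k]
n = data.shape[-1]
via_einsum = np.einsum('ij,jk->ijk', data, np.eye(n, dtype=data.dtype))

print(via_vectorize.shape)
print(np.array_equal(via_vectorize, via_einsum))
```

Both results have shape (2, 3, 3) and hold the same values, so the choice between them is mostly about readability and speed on your actual array sizes.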
Here is one way with indexing:
out = np.zeros(data.shape+(data.shape[-1],), dtype=data.dtype)
x,y = np.indices(data.shape).reshape(2, -1)
out[x,y,y] = data.ravel()
output:
array([[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]],
[[4, 0, 0],
[0, 5, 0],
[0, 0, 6]]])
We use array indexing to precisely grab those elements that are on the diagonal. Note that array indexing allows broadcasting between the indices, so we have index1 contain the index of the array, and index2 contain the index of the diagonal element.
index1 = np.arange(2)[:, None] # 2 is the number of arrays
index2 = np.arange(3)[None, :] # 3 is the square size of each matrix
result = np.zeros((2, 3, 3))
result[index1, index2, index2] = data
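The same indexing idea can be wrapped so that the sizes are not hard-coded. The sketch below is only an illustration: batched_diag is a hypothetical helper name, and it assumes the vectors live on the last axis of the input.

```python
import numpy as np

def batched_diag(data):
    """Hypothetical helper: place the last axis of `data` on a diagonal."""
    data = np.asarray(data)
    n = data.shape[-1]
    # One extra trailing axis of length n, filled with zeros
    out = np.zeros(data.shape + (n,), dtype=data.dtype)
    idx = np.arange(n)
    # The ellipsis covers every leading axis; idx, idx selects the diagonal
    out[..., idx, idx] = data
    return out

result = batched_diag(np.array([[1, 2, 3], [4, 5, 6]]))
print(result.shape)
print(result[1])
```

Because the ellipsis stands in for all leading axes, the same function should also accept inputs with more than one batch dimension, such as shape (2, 2, 3).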
Suppose having two matrices: X(m, n) and index matrix I(m, 1). Every item in index matrix I_k represents the index of the kth element X_k in X.
And suppose the index is in the range of [0, 1, 2, ..., j-1]
I would like to calculate the average of tensors in X with the same index i and return a result matrix R(j, n).
For example,
X = [[1, 1, 1],
[2, 2, 2],
[3, 3, 3]]
I = [0, 0, 1]
The result matrix should be:
R = [[torch.mean(([1, 1, 1], [2, 2, 2]))],
[torch.mean(([3, 3, 3]))]]
which equals:
R = [[1.5, 1.5, 1.5],
[3, 3, 3]]
My current solution is to traverse through m, stack the tensors with the same index and perform torch.mean.
Is there a way avoiding traversing through m? It seems not elegant and rather time-consuming.
ret = torch.empty_like(X)
ret.scatter_reduce_(0, I.unsqueeze(-1).expand_as(X), X, "mean", include_self=False)
should do what you want.
Now, note that this is a fairly new method so it may not be particularly performant. If you bump into an issue with this method, you may be better off running scatter_add_ on the tensor X and a tensor of ones and then divide.
If you also want a smaller tensor as output, you will need to work out how many distinct indices there are and use that to infer the size of the output.
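The "scatter-add, then divide by the counts" fallback can be sketched as follows. To keep the example dependency-light it is written in NumPy rather than PyTorch, so treat it as an illustration of the idea and not as the torch code itself: np.add.at plays the role of scatter_add_, and np.bincount plays the role of scattering a tensor of ones. The group count j is inferred from the indices, which is an assumption.

```python
import numpy as np

X = np.array([[1.0, 1.0, 1.0],
              [2.0, 2.0, 2.0],
              [3.0, 3.0, 3.0]])
I = np.array([0, 0, 1])

j = int(I.max()) + 1                   # number of groups (assumed: inferred from I)
sums = np.zeros((j, X.shape[1]))
np.add.at(sums, I, X)                  # unbuffered add: rows of X accumulate into sums[I]
counts = np.bincount(I, minlength=j)   # how many rows landed in each group
R = sums / counts[:, None]             # per-group mean, shape (j, n)

print(R)
```

A group index that never occurs in I would have a count of zero and produce a division by zero, so such groups need separate handling.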
Suppose I have the following numpy array.
Q = np.array([[0,1,1],[1,0,1],[0,2,0]])
Question: How do I identify the position of each max value along axis 1? So the desired output would be something like:
array([[1,2],[0,2],[1]]) # The dtype of the output is not required to be a np array.
With np.argmax I can identify the first occurrence of the maximum along the axis, but not the subsequent values.
In: np.argmax(Q, axis =1)
Out: array([1, 0, 1])
I've also seen answers that rely on using np.argwhere that use a term like this.
np.argwhere(Q == np.amax(Q))
This will also not work here because I can't limit argwhere to work along a single axis. I also can't just flatten out the np array to a single axis because the max's in each row will differ. I need to identify each instance of the max of each row.
Is there a pythonic way to achieve this without looping through each row of the entire array, or is there a function analogous to np.argwhere that accepts an axis argument?
Any insight would be appreciated thanks!
Try with np.where:
np.where(Q == Q.max(axis=1)[:,None])
Output:
(array([0, 0, 1, 1, 2]), array([1, 2, 0, 2, 1]))
Not quite the output you want, but contains equivalent information.
You can also use np.argwhere which gives you the zip data:
np.argwhere(Q==Q.max(axis=1)[:,None])
Output:
array([[0, 1],
[0, 2],
[1, 0],
[1, 2],
[2, 1]])
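If the ragged, per-row form from the question is needed, the flat np.where result can be split wherever the row index changes. This is a small sketch; it relies on np.where returning indices in row-major order and on every row having at least one maximum.

```python
import numpy as np

Q = np.array([[0, 1, 1], [1, 0, 1], [0, 2, 0]])

# keepdims=True does the same job as [:, None] on the row maxima
rows, cols = np.where(Q == Q.max(axis=1, keepdims=True))

# rows is sorted, so cut cols at every position where the row index changes
split_points = np.flatnonzero(np.diff(rows)) + 1
per_row = [chunk.tolist() for chunk in np.split(cols, split_points)]

print(per_row)
```

The result is a plain list of lists, since rows can have different numbers of maxima.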
This code swaps the first and last channels of an RGB image that is loaded into a NumPy array:
img = imread('image1.jpg')
# Convert from RGB -> BGR
img = img[..., [2, 1, 0]]
While I understand the use of Ellipsis for slicing in Numpy arrays, I couldn't understand the use of Ellipsis here. Could anybody explain what is exactly happening here?
tl;dr
img[..., [2, 1, 0]] produces the same result as taking the slices img[:, :, i] for each i in the index array [2, 1, 0], and then stacking the results along the last dimension of img. In other words:
img[..., [2,1,0]]
will produce the same output as:
np.stack([img[:,:,2], img[:,:,1], img[:,:,0]], axis=2)
The ellipsis ... is a placeholder that tells numpy which axis to apply the index array to. Without the ... the index array will be applied to the first axis of img instead of the last. Thus, without ..., the index statement:
img[[2,1,0]]
will produce the same output as:
np.stack([img[2,:,:], img[1,:,:], img[0,:,:]], axis=0)
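Both equivalences can be checked directly. The sketch below uses a small made-up array in place of a real image, so the shape (4, 5, 3) is an assumption chosen only to keep the example self-contained.

```python
import numpy as np

# Stand-in for an image: height 4, width 5, 3 channels (made-up data)
img = np.arange(4 * 5 * 3).reshape(4, 5, 3)

# Index array applied to the last axis (channels)
last_axis = img[..., [2, 1, 0]]
last_axis_stacked = np.stack([img[:, :, 2], img[:, :, 1], img[:, :, 0]], axis=2)

# Index array applied to the first axis (rows) when the ellipsis is dropped
first_axis = img[[2, 1, 0]]
first_axis_stacked = np.stack([img[2, :, :], img[1, :, :], img[0, :, :]], axis=0)

print(np.array_equal(last_axis, last_axis_stacked))
print(np.array_equal(first_axis, first_axis_stacked))
print(last_axis.shape, first_axis.shape)
```

For this particular channel order, img[..., ::-1] gives the same values as img[..., [2, 1, 0]].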
What the docs say
This is an example of what the docs call "Combining advanced and basic indexing":
When there is at least one slice (:), ellipsis (...) or np.newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element.
It goes on to describe that in this
case, the dimensions from the advanced indexing operations [in your example [2, 1, 0]] are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).
The 2D case
The docs aren't the easiest to understand, but in this case it's not too hard to pick apart. Start with a simpler 2D case:
arr = np.arange(12).reshape(4,3)
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
Using the same kind of advanced indexing with a single index value yields:
arr[:, [1]]
array([[ 1],
[ 4],
[ 7],
[10]])
which is the column of arr at index 1 (the second column). In other words, it's as if you yielded all possible values from arr while holding the index of the last axis fixed. As @hpaulj said in his comment, the ellipsis is there to act as a placeholder. It effectively tells numpy to iterate freely over all of the axes except for the last, to which the indexing array is applied.
You can also use this indexing syntax to shuffle the columns of arr around however you'd like:
arr[..., [1,0,2]]
array([[ 1, 0, 2],
[ 4, 3, 5],
[ 7, 6, 8],
[10, 9, 11]])
This is essentially the same operation as in your example, but on a 2D array instead of a 3D one.
You can explain what's going on with arr[..., [1,0,2]] by breaking it down to simpler indexing ops. It's kind of like you first take the return value of arr[..., [1]]:
array([[ 1],
[ 4],
[ 7],
[10]])
then the return value of arr[..., [0]]:
array([[0],
[3],
[6],
[9]])
then the return value of arr[..., [2]]:
array([[ 2],
[ 5],
[ 8],
[11]])
and then finally concatenate all of those results into a single array of shape (*arr.shape[:-1], len(ix)), where ix = [1, 0, 2] is the index array. The data along the last axis are ordered according to their order in ix.
One good way to understand exactly what the ellipsis is doing is to perform the same op without it:
arr[[1,0,2]]
array([[3, 4, 5],
[0, 1, 2],
[6, 7, 8]])
In this case, the index array is applied to the first axis of arr, so the output is an array containing the [1,0,2] rows of arr. Adding an ... before the index array tells numpy to apply the index array to the last axis of arr instead.
Your 3D case
The case you asked about is the 3D equivalent of the 2D arr[..., [1,0,2]] example above. Say that img.shape is (480, 640, 3). You can think about img[..., [2, 1, 0]] as looping over each value i in ix=[2, 1, 0]. For every i, the indexing operation will gather the slab of shape (480, 640, 1) that lies along the ith index of the last axis of img. Once all three slabs are collected, the final result will be the equivalent of concatenating along their last axis (and in the order they were found).
notes
The only difference between arr[..., [1]] and arr[:,1] is that arr[..., [1]] preserves the number of dimensions of the original array: it returns shape (4, 1), whereas arr[:,1] returns shape (4,).
For a 2D array, arr[:, [1]] is equivalent to arr[..., [1]]. : acts as a placeholder just like ..., but only for a single dimension.
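The notes above are easy to confirm on the 2D example array; this short sketch only prints shapes and compares values.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)

with_list = arr[..., [1]]   # list index: the indexed axis is kept
with_int = arr[:, 1]        # integer index: the indexed axis is dropped

print(with_list.shape)
print(with_int.shape)
print(np.array_equal(arr[:, [1]], arr[..., [1]]))
```

The values are the same in both cases; only the number of dimensions differs.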
Assume values and tensor T both have shape (N,K). Thinking of them as matrices, for each row of T I would like to get the element at the index where the corresponding row of values has its maximum. I can easily find those indices with
max_indices = tf.argmax(values, 1)
which returns a tensor of shape (N). Now, how can I gather up these indices from T such that I get something of shape N? I tried
result = tf.gather(T,max_indices)
but it doesn't do the right thing - it returns something of shape (N,K) which means that it didn't gather up anything.
You can use tf.gather_nd.
For example,
import tensorflow as tf
# Note: InteractiveSession and .eval() are TensorFlow 1.x APIs.
sess = tf.InteractiveSession()
values = tf.constant([[0, 0, 0, 1],
                      [0, 1, 0, 0],
                      [0, 0, 1, 0]])
T = tf.constant([[0, 1, 2 , 3],
                 [4, 5, 6 , 7],
                 [8, 9, 10, 11]])
max_indices = tf.argmax(values, axis=1)
# If T.get_shape()[0] is None, you can replace it with tf.shape(T)[0].
result = tf.gather_nd(T, tf.stack((tf.range(T.get_shape()[0],
                                            dtype=max_indices.dtype),
                                   max_indices),
                                  axis=1))
print(result.eval())
However when the ranks of values and T are higher, the use of tf.gather_nd will be a little awkward. I posted my current solution on this question. There might be a better solution in case of high dimensional values and T.
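What the stacked (row, column) pairs are doing may be easier to see in NumPy. The sketch below is a NumPy analogue of the gather_nd call, not TensorFlow code: it builds the same index pairs and uses them to pick one element per row, with np.take_along_axis shown as an alternative.

```python
import numpy as np

values = np.array([[0, 0, 0, 1],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0]])
T = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

max_indices = values.argmax(axis=1)

# Same (row, column) pairs that the answer feeds to tf.gather_nd
pairs = np.stack((np.arange(T.shape[0]), max_indices), axis=1)
result = T[pairs[:, 0], pairs[:, 1]]

# Alternative that avoids building the row indices explicitly
result_alt = np.take_along_axis(T, max_indices[:, None], axis=1)[:, 0]

print(pairs)
print(result)
```

Each pair names exactly one element of T, which is why the result has shape (N,) instead of (N, K).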
I wanted to repeat the rows of a scipy csr sparse matrix, but when I tried to call numpy's repeat method, it simply treated the sparse matrix like an object and would only repeat it as an object in an ndarray. I looked through the documentation, but I couldn't find any utility to repeat the rows of a scipy csr sparse matrix.
I wrote the following code that operates on the internal data, which seems to work
import numpy as np
from scipy import sparse

def csr_repeat(csr, repeats):
    if isinstance(repeats, int):
        repeats = np.repeat(repeats, csr.shape[0])
    repeats = np.asarray(repeats)
    rnnz = np.diff(csr.indptr)
    ndata = rnnz.dot(repeats)
    if ndata == 0:
        return sparse.csr_matrix((np.sum(repeats), csr.shape[1]),
                                 dtype=csr.dtype)
    # np.int was removed in NumPy 1.24; the builtin int is equivalent here
    indmap = np.ones(ndata, dtype=int)
    indmap[0] = 0
    rnnz_ = np.repeat(rnnz, repeats)
    indptr_ = rnnz_.cumsum()
    mask = indptr_ < ndata
    indmap -= np.int_(np.bincount(indptr_[mask],
                                  weights=rnnz_[mask],
                                  minlength=ndata))
    jumps = (rnnz * repeats).cumsum()
    mask = jumps < ndata
    indmap += np.int_(np.bincount(jumps[mask],
                                  weights=rnnz[mask],
                                  minlength=ndata))
    indmap = indmap.cumsum()
    return sparse.csr_matrix((csr.data[indmap],
                              csr.indices[indmap],
                              np.r_[0, indptr_]),
                             shape=(np.sum(repeats), csr.shape[1]))
and be reasonably efficient, but I'd rather not monkey patch the class. Is there a better way to do this?
Edit
As I revisit this question, I wonder why I posted it in the first place. Almost everything I could think to do with the repeated matrix would be easier to do with the original matrix, and then apply the repetition afterwards. My assumption is that post repetition will always be the better way to approach this problem than any of the potential answers.
from scipy.sparse import csr_matrix
repeated_row_matrix = csr_matrix(np.ones([repeat_number,1])) * sparse_row
It's not surprising that np.repeat does not work. It delegates the action to the hardcoded a.repeat method, and failing that, first turns a into an array (object if needed).
In the linear algebra world where sparse code was developed, most of the assembly work was done on the row, col, data arrays BEFORE creating the sparse matrix. The focus was on efficient math operations, and not so much on adding/deleting/indexing rows and elements.
I haven't worked through your code, but I'm not surprised that a csr format matrix requires that much work.
I worked out a similar function for the lil format (working from lil.copy):
def lil_repeat(S, repeat):
    # row repeat for lil sparse matrix
    # test for lil type and/or convert
    shape=list(S.shape)
    if isinstance(repeat, int):
        shape[0]=shape[0]*repeat
    else:
        shape[0]=sum(repeat)
    shape = tuple(shape)
    new = sparse.lil_matrix(shape, dtype=S.dtype)
    new.data = S.data.repeat(repeat) # flat repeat
    new.rows = S.rows.repeat(repeat)
    return new
But it is also possible to repeat using indices. Both lil and csr support indexing that is close to that of regular numpy arrays (at least in new enough versions). Thus:
S = sparse.lil_matrix([[0,1,2],[0,0,0],[1,0,0]])
print(S.toarray().repeat([1,2,3], axis=0))
print(S.toarray()[(0,1,1,2,2,2),:])
print(lil_repeat(S,[1,2,3]).toarray())
print(S[(0,1,1,2,2,2),:].toarray())
give the same result
and best of all?
print(S[np.arange(3).repeat([1,2,3]),:].toarray())
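The index-based repeat can also be written for the csr format from the question. This is a minimal sketch under the assumption that a reasonably recent SciPy is installed, since it relies on csr row indexing with an integer array; it compares the sparse result against the dense np.repeat reference.

```python
import numpy as np
from scipy import sparse

S = sparse.csr_matrix([[0, 1, 2], [0, 0, 0], [1, 0, 0]])
repeats = [1, 2, 3]

# Build the row index [0, 1, 1, 2, 2, 2] and let sparse indexing do the rest
row_idx = np.repeat(np.arange(S.shape[0]), repeats)
R = S[row_idx, :]

# Dense reference, for comparison only
expected = S.toarray().repeat(repeats, axis=0)

print(row_idx)
print(np.array_equal(R.toarray(), expected))
```

The result stays sparse, and the dense array is built here only to check it.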
After someone posted a really clever response for how best to do this, I revisited my original question to see if there was an even better way. I came up with one more way that has some pros and cons. Instead of repeating all of the data (as is done with the accepted answer), we can instead instruct scipy to reuse the data of the repeated rows, creating something akin to a view of the original sparse array (as you might do with broadcast_to). This can be done by simply tiling the indptr field.
repeated = sparse.csr_matrix((orig.data, orig.indices, np.tile(orig.indptr, repeat_num)))
This technique repeats the vector repeat_num times, while only modifying the indptr. The downside is that due to the way csr matrices encode data, instead of creating a matrix that's repeat_num x n in dimension, it creates one that's (2 * repeat_num - 1) x n where every odd row is 0. This shouldn't be too big of a deal as any operation will be quick given that each such row is 0, and they should be pretty easy to slice out afterwards (with something like [::2]), but it's not ideal.
I think the marked answer is probably still the "best" way to do this.
One of the most efficient ways to repeat the sparse matrix would be the way OP suggested. I modified indptr so that it doesn't output rows of 0s.
## original sparse matrix
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
x = scipy.sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
x.toarray()
array([[1, 0, 2],
[0, 0, 3],
[4, 5, 6]])
To repeat this, you need to repeat data and indices, and you need to fix-up the indptr. This is not the most elegant way, but it works.
## repeated sparse matrix
repeat = 5
new_indptr = indptr
for r in range(1,repeat):
    new_indptr = np.concatenate((new_indptr, new_indptr[-1]+indptr[1:]))
x = scipy.sparse.csr_matrix((np.tile(data,repeat), np.tile(indices,repeat), new_indptr))
x.toarray()
array([[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6]])
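The indptr fix-up can also be done without the Python loop, by adding a per-copy offset to indptr[1:] through broadcasting. The sketch below is one possible way to do it, not the author's code, and it checks the result against scipy.sparse.vstack, which gives the same matrix with less manual bookkeeping.

```python
import numpy as np
from scipy import sparse

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
x = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

repeat = 5
nnz = indptr[-1]                                # stored entries in one copy
offsets = nnz * np.arange(repeat)[:, None]      # shape (repeat, 1)
# Each copy's row pointers are the original ones shifted by its offset
new_indptr = np.concatenate(([0], (indptr[1:] + offsets).ravel()))

tiled = sparse.csr_matrix(
    (np.tile(data, repeat), np.tile(indices, repeat), new_indptr),
    shape=(repeat * x.shape[0], x.shape[1]),
)

# Reference built with the library helper
stacked = sparse.vstack([x] * repeat, format="csr")

print(new_indptr)
print(np.array_equal(tiled.toarray(), stacked.toarray()))
```

Passing shape explicitly avoids relying on SciPy to infer the number of columns from the largest column index.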