Suppose I have some Scipy sparse csr matrix, A, and a set of indices, inds=[i_1,...,i_n]. I can access the submatrix given by the rows and columns in inds via A[inds,:][:,inds]. But I cannot figure out how to modify it. All of the following fail (i.e. they do not change the matrix values):
A[inds,:][:,inds] *= 5.0
(A[inds,:][:,inds]).multiply(5.0)
A[inds,:][:,inds] = 5.0
Is there any easy way to modify a submatrix of a sparse matrix?
The rules for accessing a block, or submatrix, in sparse are similar to those for numpy: the two index arrays need to be broadcastable. The simplest way is to make the first one a column vector.
I'll illustrate:
In [13]: A = np.arange(24).reshape(4,6)
In [14]: M=sparse.csr_matrix(A)
In [15]: A[[[1],[2]],[1,2,3]]
Out[15]:
array([[ 7, 8, 9],
[13, 14, 15]])
In [16]: M[[[1],[2]],[1,2,3]].A
Out[16]:
array([[ 7, 8, 9],
[13, 14, 15]], dtype=int32)
In [17]: idx1=np.array([1,2])[:,None]
In [18]: idx1
Out[18]:
array([[1],
[2]])
In [19]: idx2=np.array([1,2,3])
In [20]: M[idx1, idx2].A
Out[20]:
array([[ 7, 8, 9],
[13, 14, 15]], dtype=int32)
In [21]: M[idx1, idx2] *= 2
In [22]: M.A
Out[22]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 14, 16, 18, 10, 11],
[12, 26, 28, 30, 16, 17],
[18, 19, 20, 21, 22, 23]], dtype=int32)
M[inds,:][:,inds] has the same problem in sparse as in numpy. With a list inds, M[inds,:] is a copy of the original, not a view. I've shown that with reference to the data buffers in numpy. I'm not quite sure how to demonstrate it with sparse.
Roughly, A[...][...] = ... translates to A.__getitem__(...).__setitem__(...,...). If A.__getitem__(...) is a copy, then modifying it won't modify A itself.
Actually, sparse matrices don't have a distinction between views and copies. Most, if not all, indexing produces a copy. M[:2,:] is a copy, even though A[:2,:] is a view.
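To make the copy-vs-view point concrete, here is a small sketch (rebuilding the same toy M as above) showing that chained indexing only touches a temporary copy, while a broadcastable index pair reaches M itself:

import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.arange(24).reshape(4, 6))
inds = [1, 2]

# M[inds,:] already returns a new sparse matrix (a copy), so the in-place
# multiply below only changes that temporary object, not M.
sub = M[inds, :][:, inds]
sub *= 5
print(M[1, 1])        # still 7

# A column-vector/row-vector index pair modifies M in place:
r = np.array(inds)[:, None]
c = np.array(inds)
M[r, c] *= 5
print(M[1, 1])        # now 35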
I should also add that changing the values of a sparse matrix is something you should do with caution. In-place multiplication (*=) is OK.
In place addition is not supported:
In [31]: M[idx1, idx2] += 2
...
NotImplementedError:
Modifying values may produce a SparseEfficiencyWarning if it turns a 0 value into a nonzero:
In [33]: M[:2, :2] = 3
/usr/lib/python3/dist-packages/scipy/sparse/compressed.py:690: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
SparseEfficiencyWarning)
The np.ix_ answer to your previous question works here as well.
Python - list of same columns / rows from matrix
M[np.ix_([1,2],[1,2,3])].A
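Since np.ix_ just builds the same broadcastable column/row index pair, in-place modification through it should work as well; a quick sketch on the same toy M:

import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.arange(24).reshape(4, 6))

# np.ix_ returns a (2,1) row index and a (1,3) column index, which broadcast
# exactly like idx1/idx2 above, so the update reaches M itself.
M[np.ix_([1, 2], [1, 2, 3])] *= 2
print(M.toarray())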
Related
I have a 3D array that contains n 2D arrays. Is there any way that I can map each 2D array to some value without using the built-in Python map or for loops (I have been asked to do it without them), only with numpy functions?
For example, I want to do something like this:
def map2d_arr(arr2d):  # this is for the example only; it can be any condition on the 2d array
    return sum(arr2d) > 10
a = np.arange(27).reshape((3,3,3))
res = np.map(map2d_arr,a) # how to do something like that only in numpy
#should return boolean array
There are np.vectorize and apply_along_axis, but they don't take full 2D arrays.
No, but you can easily make a lambda function to serve that purpose.
import numpy as np

arr = np.array([1, 2, 3, 4])

#define function
my_function = lambda x: x * 2

#apply function to NumPy array
my_function(arr)
#Result: array([2, 4, 6, 8])
Hope this helped!
Let's drop the boolean test, and just use sum - it clarifies the dimensional action:
In [130]: a = np.arange(27).reshape((3,3,3))
You apparently want to apply sum to each of the 2d arrays, iterating on the first dimension.
In [131]: sum(a[0])
Out[131]: array([ 9, 12, 15])
Effectively a[0][0]+a[0][1]+a[0][2].
For this the python map works just fine:
In [132]: list(map(sum, a))
Out[132]: [array([ 9, 12, 15]), array([36, 39, 42]), array([63, 66, 69])]
Equivalently in list comprehension form:
In [133]: [sum(i) for i in a]
Out[133]: [array([ 9, 12, 15]), array([36, 39, 42]), array([63, 66, 69])]
Or using vectorize with signature:
In [134]: f = np.vectorize(sum, signature='(n,m)->(m)')
In [135]: f(a)
Out[135]:
array([[ 9, 12, 15],
[36, 39, 42],
[63, 66, 69]])
but we can do the same with np.sum:
In [136]: np.sum(a, axis=1)
Out[136]:
array([[ 9, 12, 15],
[36, 39, 42],
[63, 66, 69]])
[136] is, by far, the fastest. vectorize is slower than the list comprehension for small cases, but scales better. vectorize with signature is generally slower (but that needs to be tested). map and the list comprehension are about the same.
apply_along_axis is generally slower, though it can simplify nested iterations. I don't recommend it for 2d cases, where list comprehension is clean enough and faster.
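For reference, a hedged sketch of the apply_along_axis form mentioned above; it reproduces np.sum(a, axis=1), just with extra Python-level call overhead:

import numpy as np

a = np.arange(27).reshape(3, 3, 3)

# np.sum is called once per 1d slice along axis 1, so the result matches
# np.sum(a, axis=1) but is built one slice at a time.
res = np.apply_along_axis(np.sum, 1, a)
print(res)
# [[ 9 12 15]
#  [36 39 42]
#  [63 66 69]]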
Here is the example:
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[7,8],[9,10],[11,12],[12,13]])
What I want is to add each row in a to every row in b and then sum the elements. For example, [1,2] should be added to every row in b: 1+7=8, 2+8=10, 8+10=18; 1+9=10, 2+10=12, 10+12=22; ... The result should look like [[18,22,26,28,...],[22,26,...],[26,30,...]].
My question is: how do I do that? I know numpy can be more efficient than a loop, but how do I use matrix operations to calculate this?
I believe this does what you want:
>>> a = np.array([[1,2],[3,4],[5,6]])
>>> b = np.array([[7,8],[9,10],[11,12],[12,13]])
>>> np.sum(a, axis=1)[:,None] + np.sum(b, axis=1)[None,:]
array([[18, 22, 26, 28],
[22, 26, 30, 32],
[26, 30, 34, 36]])
You can use list comprehensions:
import numpy as np
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])
[[sum(i) + sum(j) for j in b] for i in a]
Output:
[[18, 22, 26, 28], [22, 26, 30, 32], [26, 30, 34, 36]]
This can be done in the most precise and succinct way as follows:
np.einsum('ijk-> ij', a[:,None,:]+b)
Let me explain each step. It combines the einsum and broadcasting concepts of numpy.
Step 1: a[:,None,:] reshapes matrix a to shape (3,1,2). This middle axis of length 1 is helpful for broadcasting.
Step 2: a[:,None,:] + b broadcasts a and adds matrix b, giving a resultant matrix of shape (3,4,2).
Step 3: np.einsum('ijk-> ij', a[:,None,:]+b) does a sum reduction along the last axis of the matrix obtained from the previous step.
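A quick sketch to check the intermediate shape and the final reduction, using the a and b from the question:

import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])

tmp = a[:, None, :] + b          # broadcast: (3,1,2) + (4,2) -> (3,4,2)
print(tmp.shape)                 # (3, 4, 2)

res = np.einsum('ijk->ij', tmp)  # sum-reduce the last axis
print(res)
# [[18 22 26 28]
#  [22 26 30 32]
#  [26 30 34 36]]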
I have the following matrix:
M = np.matrix([[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
And I receive a vector indexing the columns of the matrix:
index = np.array([1,1,2,2,2,2,3,4,4,4])
This vector has 4 different values, so my objective is to create a list containing four new matrices so that the first matrix is made by the first two columns of M, the second matrix is made by columns 3 to 6 and so on:
M1 = np.matrix([[1,2],[11,12],[21,22]])
M2 = np.matrix([[3,4,5,6],[13,14,15,16],[23,24,25,26]])
M3 = np.matrix([[7],[17],[27]])
M4 = np.matrix([[8,9,10],[18,19,20],[28,29,30]])
l = [M1, M2, M3, M4]
I need to do this in an automated way, since the number of rows and columns of M, as well as the indexing scheme, are not fixed. How can I do this?
There are 3 points to note:
For a variable number of variables, as in this case, the recommended solution is to use a dictionary.
You can use simple numpy indexing for the individual case.
Unless you have a very specific reason, use numpy.array instead of numpy.matrix.
Combining these points, you can use a dictionary comprehension:
d = {k: np.array(M[:, np.where(index==k)[0]]) for k in np.unique(index)}
Result:
{1: array([[ 1, 2],
[11, 12],
[21, 22]]),
2: array([[ 3, 4, 5, 6],
[13, 14, 15, 16],
[23, 24, 25, 26]]),
3: array([[ 7],
[17],
[27]]),
4: array([[ 8, 9, 10],
[18, 19, 20],
[28, 29, 30]])}
import numpy as np
M = np.matrix([[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
index = np.array([1,1,2,2,2,2,3,4,4,4])
m = [[], [], [], []]
for i, c in enumerate(index):
    m[c-1].append(i)          # group the column positions by their index value
for idx in m:
    print(M[:, idx])
This is a little hard-coded; I assumed you will always want 4 matrices and such. You can change it for more generality.
How do I sum over the columns of a tensor?
torch.Size([10, 100]) ---> torch.Size([10])
The simplest and best solution is to use torch.sum().
To sum all elements of a tensor:
torch.sum(x) # gives back a scalar
To sum over all rows (i.e. for each column):
torch.sum(x, dim=0) # size = [ncol]
To sum over all columns (i.e. for each row):
torch.sum(x, dim=1) # size = [nrow]
It should be noted that the dimension summed over is eliminated from the resulting tensor.
Alternatively, you can use tensor.sum(axis) where axis indicates 0 and 1 for summing over rows and columns respectively, for a 2D tensor.
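A minimal sketch of those three calls for the shape in the question (a random 10 x 100 tensor stands in for the real data):

import torch

x = torch.randn(10, 100)           # stand-in for the real 10 x 100 tensor

print(torch.sum(x).shape)          # torch.Size([])    -- a scalar
print(torch.sum(x, dim=0).shape)   # torch.Size([100]) -- one sum per column
print(torch.sum(x, dim=1).shape)   # torch.Size([10])  -- one sum per row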
In [210]: X
Out[210]:
tensor([[ 1, -3, 0, 10],
[ 9, 3, 2, 10],
[ 0, 3, -12, 32]])
In [211]: X.sum(1)
Out[211]: tensor([ 8, 24, 23])
In [212]: X.sum(0)
Out[212]: tensor([ 10, 3, -10, 52])
As we can see from the above outputs, in both cases the output is a 1D tensor. If, on the other hand, you wish to retain the dimension of the original tensor in the output as well, then you need to set the boolean kwarg keepdim to True, as in:
In [217]: X.sum(0, keepdim=True)
Out[217]: tensor([[ 10, 3, -10, 52]])
In [218]: X.sum(1, keepdim=True)
Out[218]:
tensor([[ 8],
[24],
[23]])
If you have a tensor my_tensor, and you wish to sum across the second array dimension (that is, the one with index 1, which is the column dimension if the tensor is 2-dimensional, as yours is), use torch.sum(my_tensor,1) or, equivalently, my_tensor.sum(1); see the documentation here.
One thing that is not mentioned explicitly in the documentation is: you can sum across the last array-dimension by using -1 (or the second-to last dimension, with -2, etc.)
So, in your example, you could use outputs.sum(1) or torch.sum(outputs,1), or, equivalently, outputs.sum(-1) or torch.sum(outputs,-1). All of these would give the same result: an output tensor of size torch.Size([10]), with each entry being the sum over all the columns in a given row of the tensor outputs.
To illustrate with a 3-dimensional tensor:
In [1]: my_tensor = torch.arange(24).view(2, 3, 4)
Out[1]:
tensor([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [2]: my_tensor.sum(2)
Out[2]:
tensor([[ 6, 22, 38],
[54, 70, 86]])
In [3]: my_tensor.sum(-1)
Out[3]:
tensor([[ 6, 22, 38],
[54, 70, 86]])
Based on the doc https://pytorch.org/docs/stable/generated/torch.sum.html it should be:
dim (int or tuple of ints) – the dimension or dimensions to reduce.
dim=0 means reduce the row dimension: condense all rows = sum by column.
dim=1 means reduce the column dimension: condense all columns = sum by row.
Torch sum along multiple axis or dimensions
Just for the sake of completeness (I could not find it easily), I include how to sum along multiple dimensions with torch.sum, which is heavily used in computer vision tasks where you have to reduce along the H and W dimensions.
If you have an image x with shape C x H x W and want to compute the average pixel intensity value per channel you could do:
avg = torch.sum(x, dim=(1,2)) / (H*W) # Sum along (H,W) and norm
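A self-contained version of that snippet, with hypothetical C, H, W values filled in so it runs as written:

import torch

C, H, W = 3, 4, 5                       # hypothetical channel/height/width
x = torch.rand(C, H, W)

# Sum over the H and W dimensions, then normalise to a per-channel mean.
avg = torch.sum(x, dim=(1, 2)) / (H * W)
print(avg.shape)                        # torch.Size([3])

# Same result as taking the mean over those dims directly:
print(torch.allclose(avg, x.mean(dim=(1, 2))))  # True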
I am trying to implement a function, which can split a 3 dimensional numpy array in to 8 pieces, whilst keeping the order intact. Essentially I need the splits to be:
G[:21, :18,:25]
G[21:, :18,:25]
G[21:, 18:,:25]
G[:21, 18:,:25]
G[:21, :18,25:]
G[21:, :18,25:]
G[21:, 18:,25:]
G[:21, 18:,25:]
Where the original size of this particular matrix would have been 42, 36, 50. How is it possible to generalise these 8 "slices" so I do not have to hardcode all of them? Essentially, move the : into every possible position.
Thanks!
You could apply the 1d slice to successive (lists of) dimensions.
With a smaller 3d array
In [147]: X=np.arange(4**3).reshape(4,4,4)
A compound list comprehension produces a nested list. Here I'm using the simplest double split:
In [148]: S=[np.split(z,2,0) for y in np.split(X,2,2) for z in np.split(y,2,1)]
In this case, all sublists have the same size, so I can convert it to an array for convenient viewing:
In [149]: SA=np.array(S)
In [150]: SA.shape
Out[150]: (4, 2, 2, 2, 2)
There are your 8 subarrays, but grouped (4,2).
In [153]: SAA = SA.reshape(8,2,2,2)
In [154]: SAA[0]
Out[154]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
In [155]: SAA[1]
Out[155]:
array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]])
Is the order right? I can change it by changing the axis in the 3 split operations.
Another approach is to write your indexing expressions as tuples:
In [156]: x,y,z = 2,2,2 # define the split points
In [157]: ind = [(slice(None,x), slice(None,y), slice(None,z)),
                 (slice(x,None), slice(None,y), slice(None,z)),]
          # and so on
In [158]: S1=[X[i] for i in ind]
In [159]: S1[0]
Out[159]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
In [160]: S1[1]
Out[160]:
array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]])
Looks like the same order I got before.
That ind list of tuples can be produced with some sort of iteration and/or list comprehension. Maybe even using itertools.product or np.mgrid to generate the permutations.
An itertools.product version could look something like
In [219]: import itertools
In [220]: def foo(i):
              return tuple(slice(None,x) if j else slice(x,None)
                           for j,x in zip(i,[2,2,2]))
In [221]: SAA = np.array([X[foo(i)] for i in
                          itertools.product(range(2),range(2),range(2))])
In [222]: SAA[-1]
Out[222]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
product iterates the last value fastest, so the list is reversed (compared to your target).
To generate a particular order it may be easier to list the tuples explicitly, e.g.:
In [227]: [X[foo(i)] for i in [(1,1,1),(0,1,1),(0,0,1)]]
Out[227]:
[array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]]), array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]]), array([[[40, 41],
[44, 45]],
[[56, 57],
[60, 61]]])]
This highlights the fact that there are 2 distinct issues - generating the iteration pattern, and splitting the array based on this pattern.
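Putting the two pieces together, here is a hedged sketch that generates all 8 corner blocks for the question's original (42, 36, 50) shape; the split points and the stand-in array are assumptions:

import numpy as np
from itertools import product

G = np.arange(42 * 36 * 50).reshape(42, 36, 50)   # stand-in for the real data
splits = (21, 18, 25)                              # split point per axis

def corner(flags):
    # flag 0 -> first half (:s), flag 1 -> second half (s:) on that axis
    return tuple(slice(None, s) if f == 0 else slice(s, None)
                 for f, s in zip(flags, splits))

# product order: (0,0,0), (0,0,1), ..., (1,1,1) -- not the question's exact
# listing order, but all 8 corners are covered.
pieces = [G[corner(flags)] for flags in product((0, 1), repeat=3)]
print(len(pieces), pieces[0].shape)                # 8 (21, 18, 25)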