Here is the example:
import numpy as np
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])
What I want is to add each row of a to every row of b and then sum the entries. For example, [1,2] should be combined with every row in b: 1+7=8, 2+8=10, 8+10=18; 1+9=10, 2+10=12, 10+12=22; and so on. The result should look something like [[18, 22, 26, 28, ...], [22, 26, ...], [26, 30, ...]].
My question is: how do I accomplish this? I know numpy can be more efficient than a loop, but how do I express this as an array calculation?
I believe this does what you want:
>>> a = np.array([[1,2],[3,4],[5,6]])
>>> b = np.array([[7,8],[9,10],[11,12],[12,13]])
>>> np.sum(a, axis=1)[:,None] + np.sum(b, axis=1)[None,:]
array([[18, 22, 26, 28],
[22, 26, 30, 32],
[26, 30, 34, 36]])
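The same result can also be written with np.add.outer, which forms all pairwise sums of the two 1-D row-sum vectors; an equivalent sketch:
>>> np.add.outer(a.sum(axis=1), b.sum(axis=1))
array([[18, 22, 26, 28],
       [22, 26, 30, 32],
       [26, 30, 34, 36]])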
You can use list comprehensions:
import numpy as np
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])
[[sum(i) + sum(j) for j in b] for i in a]
Output:
[[18, 22, 26, 28], [22, 26, 30, 32], [26, 30, 34, 36]]
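If you need the result as a numpy array rather than nested Python lists, the same comprehension can simply be wrapped; a sketch:
np.array([[sum(i) + sum(j) for j in b] for i in a])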
This can also be done quite succinctly by combining broadcasting with einsum:
np.einsum('ijk->ij', a[:,None,:] + b)
Let me explain each step. It combines numpy's einsum and broadcasting concepts.
Step 1: a[:,None,:] reshapes matrix a to shape (3,1,2). This middle axis of length 1 is what enables broadcasting.
Step 2: a[:,None,:] + b broadcasts a against b and adds them, giving a resultant array of shape (3,4,2).
Step 3: np.einsum('ijk->ij', a[:,None,:]+b) performs a sum reduction along the last axis of the array obtained in the previous step.
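Putting the steps together, a small self-contained check (using the arrays from the question) that this reproduces the broadcasting result:
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])

# broadcast to shape (3, 4, 2), then sum-reduce the last axis
result = np.einsum('ijk->ij', a[:, None, :] + b)
print(result)
# [[18 22 26 28]
#  [22 26 30 32]
#  [26 30 34 36]]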
I have a numpy array
np.array(data).shape
(50,50)
Now, I want to find the cumulative sums across axis=1. The problem is cumsum creates an array of cumulative sums, but I just care about the final value of every row.
This is incorrect of course:
np.cumsum(data, axis=1)[-1]
Is there a succinct way of doing this without looping through the array?
You are almost there, but as you have it now, you are selecting just the final row. What you need is to select all rows from the last column, so your indexing at the end should be: [:,-1].
Example:
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> a.cumsum(axis=1)[:,-1]
array([ 10, 35, 60, 85, 110])
Note, I'm leaving this up as I think it explains what was going wrong with your attempt, but admittedly, there are more effective ways of doing this in the other answers!
The final cumulative sum of every row is in fact simply the sum of that row, i.e. the row-wise sum, so we can implement this as:
>>> x.sum(axis=1)
array([ 10, 35, 60, 85, 110])
So here, for every row, we calculate the sum over all the columns. We thus do not need to generate the intermediate cumulative sums first (internally numpy will still use an accumulator, but the intermediate values are never "emitted" into an output array).
You can use numpy.ufunc.reduce if you don't need the intermediary accumulated results of any ufunc.
>>> a = np.arange(9).reshape(3,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.add.reduce(a, axis=1)
array([ 3, 12, 21])
However, in the case of sum, Willem's answer is clearly superior and to be preferred. Just keep in mind that in the general case, there's ufunc.reduce.
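For instance, if you instead wanted a row-wise product or maximum, the same reduce pattern applies to any binary ufunc; a sketch with the same array a:
>>> np.multiply.reduce(a, axis=1)   # row-wise product
array([  0,  60, 336])
>>> np.maximum.reduce(a, axis=1)    # row-wise maximum
array([2, 5, 8])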
I have the following matrix:
M = np.matrix([[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
And I receive a vector indexing the columns of the matrix:
index = np.array([1,1,2,2,2,2,3,4,4,4])
This vector has 4 different values, so my objective is to create a list containing four new matrices, such that the first matrix is made of the first two columns of M, the second matrix is made of columns 3 to 6, and so on:
M1 = np.matrix([[1,2],[11,12],[21,22]])
M2 = np.matrix([[3,4,5,6],[13,14,15,16],[23,24,25,26]])
M3 = np.matrix([[7],[17],[27]])
M4 = np.matrix([[8,9,10],[18,19,20],[28,29,30]])
l = [M1, M2, M3, M4]
I need to do this in an automated way, since the number of rows and columns of M as well as the indexing scheme are not fixed. How can I do this?
There are 3 points to note:
For a variable number of variables, as in this case, the recommended solution is to use a dictionary.
You can use simple numpy indexing for the individual case.
Unless you have a very specific reason, use numpy.array instead of numpy.matrix.
Combining these points, you can use a dictionary comprehension:
d = {k: np.array(M[:, np.where(index==k)[0]]) for k in np.unique(index)}
Result:
{1: array([[ 1, 2],
[11, 12],
[21, 22]]),
2: array([[ 3, 4, 5, 6],
[13, 14, 15, 16],
[23, 24, 25, 26]]),
3: array([[ 7],
[17],
[27]]),
4: array([[ 8, 9, 10],
[18, 19, 20],
[28, 29, 30]])}
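If you really do need a list in the order of the sorted index values (the l = [M1, M2, M3, M4] from the question), the same indexing works in a list comprehension; a sketch:
l = [np.array(M[:, np.where(index == k)[0]]) for k in np.unique(index)]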
import numpy as np
M = np.matrix([[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
index = np.array([1,1,2,2,2,2,3,4,4,4])
m = [[], [], [], []]
for i, c in enumerate(index):
    m[c - 1].append(i)   # collect the column position i under its index value c
for idx in m:
    print(M[:, idx])
This is a little hard-coded; I assumed you will always want 4 matrices and so on. You can change it to be more general, as in the sketch below.
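For example, one way to drop the hard-coded number of groups, assuming the groups should simply be keyed by the values that appear in index, is to collect the column positions in a dictionary (the names groups and matrices are just illustrative):
from collections import defaultdict

groups = defaultdict(list)
for i, c in enumerate(index):
    groups[c].append(i)                 # column positions grouped by index value
matrices = [M[:, idx] for _, idx in sorted(groups.items())]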
I am trying to implement a function, which can split a 3 dimensional numpy array in to 8 pieces, whilst keeping the order intact. Essentially I need the splits to be:
G[:21, :18,:25]
G[21:, :18,:25]
G[21:, 18:,:25]
G[:21, 18:,:25]
G[:21, :18,25:]
G[21:, :18,25:]
G[21:, 18:,25:]
G[:21, 18:,25:]
Here the original size of this particular array would be (42, 36, 50). How is it possible to generalise these 8 "slices" so I do not have to hardcode all of them? Essentially, I want to move the : into every possible position.
Thanks!
You could apply the 1d slice to successive (lists of) dimensions.
With a smaller 3d array
In [147]: X=np.arange(4**3).reshape(4,4,4)
A compound list comprehension produces a nested list. Here I'm using the simplest double split
In [148]: S=[np.split(z,2,0) for y in np.split(X,2,2) for z in np.split(y,2,1)]
In this case, all sublists have the same size, so I can convert it to an array for convenient viewing:
In [149]: SA=np.array(S)
In [150]: SA.shape
Out[150]: (4, 2, 2, 2, 2)
There are your 8 subarrays, but grouped (4,2).
In [153]: SAA = SA.reshape(8,2,2,2)
In [154]: SAA[0]
Out[154]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
In [155]: SAA[1]
Out[155]:
array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]])
Is the order right? I can change it by changing the axis in the 3 split operations.
Another approach is to write your indexing expressions as tuples:
In [156]: x,y,z = 2,2,2 # define the split points
In [157]: ind = [(slice(None,x), slice(None,y), slice(None,z)),
(slice(x,None), slice(None,y), slice(None,z)),]
# and so on
In [158]: S1=[X[i] for i in ind]
In [159]: S1[0]
Out[159]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
In [160]: S1[1]
Out[160]:
array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]])
Looks like the same order I got before.
That ind list of tuples can be produced with some sort of iteration and/or list comprehension. Maybe even using itertools.product or np.mgrid to generate the permutations.
An itertools.product version could look something like
In [220]: def foo(i):
              return tuple(slice(None,x) if j else slice(x,None)
                           for j,x in zip(i,[2,2,2]))
In [221]: SAA = np.array([X[foo(i)] for i in
                          itertools.product(range(2),range(2),range(2))])
In [222]: SAA[-1]
Out[222]:
array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]])
product iterates the last value fastest, so the list is reversed (compared to your target).
To generate a particular order it may be easier to list the tuples explicitly, e.g.:
In [227]: [X[foo(i)] for i in [(1,1,1),(0,1,1),(0,0,1)]]
Out[227]:
[array([[[ 0, 1],
[ 4, 5]],
[[16, 17],
[20, 21]]]), array([[[32, 33],
[36, 37]],
[[48, 49],
[52, 53]]]), array([[[40, 41],
[44, 45]],
[[56, 57],
[60, 61]]])]
This highlights the fact that there are 2 distinct issues - generating the iteration pattern, and splitting the array based on this pattern.
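As a concrete illustration of both parts, here is a self-contained sketch that generates the eight pieces in the order listed in the question, assuming the split points are simply the axis midpoints (the names octant, mids and order are just illustrative):
import numpy as np

G = np.arange(42 * 36 * 50).reshape(42, 36, 50)   # stand-in for the real array
mids = [s // 2 for s in G.shape]                  # split points 21, 18, 25

def octant(flags, splits):
    # flags[k] == 0 -> first half of axis k, flags[k] == 1 -> second half
    return tuple(slice(None, m) if f == 0 else slice(m, None)
                 for f, m in zip(flags, splits))

# the ordering from the question, written out explicitly
# (itertools.product(range(2), repeat=3) would give the same 8 tuples in a different order)
order = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
         (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
pieces = [G[octant(f, mids)] for f in order]      # pieces[0] is G[:21, :18, :25], etc.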
Suppose I have some Scipy sparse csr matrix, A, and a set of indices, inds=[i_1,...,i_n]. I can access the submatrix given by the rows and columns given in inds via A[inds,:][:,inds]. But I cannot figure out how to modify them. All of the following fail (i.e. do not change the matrix values):
A[inds,:][:,inds] *= 5.0
(A[inds,:][:,inds]).multiply(5.0)
A[inds,:][:,inds] = 5.0
Is there any easy way to modify a submatrix of a sparse matrix?
The rules for accessing a block, or submatrix, of a sparse matrix are similar to those for numpy arrays: the two index arrays need to be broadcastable against each other. The simplest way is to make the first one a column vector.
I'll illustrate:
In [13]: A = np.arange(24).reshape(4,6)
In [14]: M=sparse.csr_matrix(A)
In [15]: A[[[1],[2]],[1,2,3]]
Out[15]:
array([[ 7, 8, 9],
[13, 14, 15]])
In [16]: M[[[1],[2]],[1,2,3]].A
Out[16]:
array([[ 7, 8, 9],
[13, 14, 15]], dtype=int32)
In [17]: idx1=np.array([1,2])[:,None]
In [18]: idx1
Out[18]:
array([[1],
[2]])
In [19]: idx2=np.array([1,2,3])
In [20]: M[idx1, idx2].A
Out[20]:
array([[ 7, 8, 9],
[13, 14, 15]], dtype=int32)
In [21]: M[idx1, idx2] *= 2
In [22]: M.A
Out[22]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 14, 16, 18, 10, 11],
[12, 26, 28, 30, 16, 17],
[18, 19, 20, 21, 22, 23]], dtype=int32)
M[inds,:][:,inds] has the same problem in sparse as in numpy. With a list inds, M[inds,:] is a copy of the original, not a view. I've shown that with reference to the data buffers in numpy; I'm not quite sure how to demonstrate it with sparse.
Roughly, A[...][...] = ... translates to A.__getitem__(...).__setitem__(...,...). If A.__getitem__(...) is a copy, then modifying it won't modify A itself.
Actually, sparse matrices don't have a distinction between views and copies. Most, if not all, indexing produces a copy. M[:2,:] is a copy, even though A[:2,:] is a view.
I should also add that changing the values of a sparse matrix is something you should do with caution. In-place multiplication (*=) is OK.
In-place addition is not supported:
In [31]: M[idx1, idx2] += 2
...
NotImplementedError:
Modifying values may also produce a SparseEfficiencyWarning if it turns a 0 value into a nonzero one:
In [33]: M[:2, :2] = 3
/usr/lib/python3/dist-packages/scipy/sparse/compressed.py:690: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
SparseEfficiencyWarning)
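If you need to make many such structural changes, one workaround (as the warning itself suggests) is to do the assignments in lil format and convert back afterwards; a sketch:
Ml = M.tolil()        # lil supports cheap changes to the sparsity structure
Ml[:2, :2] = 3
M = Ml.tocsr()        # back to csr for efficient arithmetic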
The np.ix_ answer to your previous question (Python - list of same columns / rows from matrix) works here as well:
M[np.ix_([1,2],[1,2,3])].A
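Applied to your original goal, and assuming A holds floating-point data, something like the following should modify A in place, since np.ix_ produces exactly the kind of broadcastable index arrays used in the M[idx1, idx2] *= 2 example above:
A[np.ix_(inds, inds)] *= 5.0   # unlike A[inds,:][:,inds] *= 5.0, this changes A itself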