subtracting a certain row in a matrix - python

So I have a 4 by 4 matrix: [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]]
I need to subtract [1,2,3,4] from the second row.
No NumPy if possible; I'm a beginner and don't know how to use it.
Thanks.

With regular Python loops:
a = [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]]
b = [1,2,3,4]
for i in range(4):
    a[1][i] -= b[i]
Simply loop over the entries in the b list and subtract each one from the corresponding entry in a[1], the second list (i.e. row) of the a matrix.
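The same subtraction can also be written without an explicit index by pairing the entries with zip. This is only a minimal sketch, assuming the row and b have the same length; the variable names follow the answer above.

```python
a = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
b = [1, 2, 3, 4]

# Build a new second row holding the element-wise difference row - b.
a[1] = [x - y for x, y in zip(a[1], b)]

print(a[1])  # [1, 1, 1, 1]
```

Note that this rebinds a[1] to a new list instead of changing the old one in place, which only matters if something else still refers to the old row.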
However, NumPy can do this for you faster and easier and isn't too hard to learn:
In [47]: import numpy as np
In [48]: a = np.array([[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]])
In [49]: a
Out[49]:
array([[1, 2, 3, 4],
       [2, 3, 4, 5],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
In [50]: a[1] -= [1,2,3,4]
In [51]: a
Out[51]:
array([[1, 2, 3, 4],
       [1, 1, 1, 1],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
Note that NumPy vectorizes many of its operations (such as subtraction), so the loops involved are handled for you (in fast, pre-compiled C-code).

Related

How to loop back to beginning of the array for out of bounds index in numpy?

I have a 2D numpy array that I want to extract a submatrix from.
I get the submatrix by slicing the array as below.
Here I want a 3*3 submatrix around an item at the index of (2,3).
>>> import numpy as np
>>> a = np.array([[0, 1, 2, 3],
...               [4, 5, 6, 7],
...               [8, 9, 0, 1],
...               [2, 3, 4, 5]])
>>> a[1:4, 2:5]
array([[6, 7],
       [0, 1],
       [4, 5]])
But what I want is that for indexes that are out of range, it goes back to the beginning of array and continues from there. This is the result I want:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
I know that I can do things like getting mod of the index to the width of the array; but I'm looking for a numpy function that does that.
And also, for a one-dimensional array this will cause an index out of range error, which is not really useful...
This is one way using np.pad with wraparound mode.
>>> a = np.array([[0, 1, 2, 3],
...               [4, 5, 6, 7],
...               [8, 9, 0, 1],
...               [2, 3, 4, 5]])
>>> pad_width = 1
>>> i, j = 2, 3
>>> startrow, endrow = i-1+pad_width, i+2+pad_width # for 3 x 3 submatrix
>>> startcol, endcol = j-1+pad_width, j+2+pad_width
>>> np.pad(a, (pad_width, pad_width), 'wrap')[startrow:endrow, startcol:endcol]
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
Depending on the shape of your patch (eg. 5 x 5 instead of 3 x 3) you can increase the pad_width and start and end row and column indices accordingly.
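To make that generalization concrete, here is a rough sketch of a helper that pads by the half-width of the patch and then slices. The name wrapped_patch and its half parameter are made up for illustration, and the sketch assumes half is smaller than both array dimensions.

```python
import numpy as np

def wrapped_patch(a, i, j, half):
    """Return the (2*half+1) x (2*half+1) patch centred on a[i, j], wrapping at the edges."""
    # Pad every side by `half` using wraparound values.
    padded = np.pad(a, half, mode="wrap")
    # After padding, the original element (i, j) sits at (i + half, j + half),
    # so the patch starts at (i, j) in the padded array.
    return padded[i:i + 2 * half + 1, j:j + 2 * half + 1]

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 0, 1],
              [2, 3, 4, 5]])
patch = wrapped_patch(a, 2, 3, half=1)  # same 3 x 3 result as above
```

Padding copies the whole array, so for many lookups on a large array an index-based approach may be cheaper.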
np.take does have a mode parameter which can wrap around out-of-bounds indices. But it's a bit hacky to use np.take for multidimensional arrays since the axis must be a scalar.
However, in your particular case you could do this:
a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 0, 1],
              [2, 3, 4, 5]])
np.take(a, np.r_[2:5], axis=1, mode='wrap')[1:4]
Output:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
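For the one-dimensional case mentioned in the question, the same mode argument avoids the out-of-range error. A small sketch with a made-up array v:

```python
import numpy as np

v = np.array([0, 10, 20, 30, 40])

# Indices 3..6 run past the end of v; mode="wrap" maps 5 -> 0 and 6 -> 1.
window = np.take(v, np.arange(3, 7), mode="wrap")
# window now holds the values 30, 40, 0, 10
```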
EDIT
This function might be what you are looking for (?)
def select3x3(a, idx):
    x, y = idx
    return np.take(np.take(a, np.r_[x-1:x+2], axis=0, mode='wrap'), np.r_[y-1:y+2], axis=1, mode='wrap')
But in retrospect, I recommend using modulo and fancy indexing for this kind of operation (it's basically what mode='wrap' does internally anyway):
def select3x3(a, idx):
    x, y = idx
    return a[np.r_[x-1:x+2][:,None] % a.shape[0], np.r_[y-1:y+2][None,:] % a.shape[1]]
The above solution is also generalized for any 2d shape on a.
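The modulo idea extends to other window sizes as well. The following is only a sketch: wrapped_window and its half parameter are hypothetical names, and np.ix_ is used here in place of the [:,None] / [None,:] broadcasting shown above.

```python
import numpy as np

def wrapped_window(a, center, half):
    """Return the (2*half+1)-sized square window around `center`, wrapping at the edges."""
    r, c = center
    # Reduce the row and column indices modulo the array shape so they wrap around.
    rows = np.arange(r - half, r + half + 1) % a.shape[0]
    cols = np.arange(c - half, c + half + 1) % a.shape[1]
    # np.ix_ builds the open mesh that selects every (row, col) combination.
    return a[np.ix_(rows, cols)]

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 0, 1],
              [2, 3, 4, 5]])
around_2_3 = wrapped_window(a, (2, 3), half=1)
around_0_0 = wrapped_window(a, (0, 0), half=1)
```

Because this uses fancy indexing, the result is a copy rather than a view.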

Add a matrix outside a loop

I have a function that returns a 17*3 matrix (float, shape (17,3)). I call that function again and again in a loop, and I want to join the matrices so that the number of rows stays at 17 while the columns keep accumulating into one big matrix.
Without NUMPY:
Transpose the matrix first because you are not gonna touch the 17 rows.
# a matrix is 17 * 3
a_transpose = [[a[j][i] for j in range(len(a))] for i in range(len(a[0]))]
Then, add the column of 17 rows as one row of 17 columns
a_transpose.append([1,2,3, ... 17])
Once you are done adding rows, transpose the matrix back as shown above. That way, you don't iterate through your array 17 times every time you add a column to your matrix.
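Put together, the pure-Python steps might look like the sketch below. The transpose helper and the sample data are invented for illustration, and zip(*m) is used as a shorter equivalent of the nested comprehension.

```python
def transpose(m):
    # zip(*m) groups the i-th entries of every row, which yields the columns.
    return [list(col) for col in zip(*m)]

# Made-up 17 x 3 matrix standing in for the function's result.
a = [[r * 3 + c for c in range(3)] for r in range(17)]

a_t = transpose(a)               # 3 x 17
a_t.append(list(range(1, 18)))   # new column, stored as a row of 17 values
result = transpose(a_t)          # back to 17 rows, now with 4 columns
```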
With NUMPY:
Transpose
# a matrix is 17 * 3
import numpy
a = numpy.array(a)
a_transpose = a.transpose()
Add a row (17 column values you wanted to add)
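# ndarray has no append method; numpy.append returns a new array, and with axis=0 the new row must be 2D
a_transpose = numpy.append(a_transpose, [[1,2,3, .... 17]], axis=0)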
Your function:
In [187]: def foo(i):
     ...:     return np.arange(i,i+6).reshape(3,2)
     ...:
Iteratively build a list of arrays:
In [188]: alist = []
In [189]: for i in range(4):
     ...:     alist.append(foo(i))
     ...:
In [190]: alist
Out[190]:
[array([[0, 1],
[2, 3],
[4, 5]]), array([[1, 2],
[3, 4],
[5, 6]]), array([[2, 3],
[4, 5],
[6, 7]]), array([[3, 4],
[5, 6],
[7, 8]])]
Make an array from that list:
In [191]: np.concatenate(alist, axis=1)
Out[191]:
array([[0, 1, 1, 2, 2, 3, 3, 4],
[2, 3, 3, 4, 4, 5, 5, 6],
[4, 5, 5, 6, 6, 7, 7, 8]])
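Applied to the (17, 3) shape from the question, the same pattern could look like this. make_block is a hypothetical stand-in for the asker's function, not something from the original post.

```python
import numpy as np

def make_block(i):
    # Hypothetical stand-in for the function that returns a (17, 3) float array.
    return np.full((17, 3), float(i))

# Collect the pieces in a list first, then join them once at the end.
blocks = [make_block(i) for i in range(4)]
big = np.concatenate(blocks, axis=1)  # rows stay at 17, columns add up to 12
```

Joining once after the loop avoids copying the growing array on every iteration.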

NumPy complicated slicing

I have a NumPy array, for example:
>>> import numpy as np
>>> x = np.random.randint(0, 10, size=(5, 5))
>>> x
array([[4, 7, 3, 7, 6],
[7, 9, 5, 7, 8],
[3, 1, 6, 3, 2],
[9, 2, 3, 8, 4],
[0, 9, 9, 0, 4]])
Is there a way to get a view (or copy) that contains indices 1:3 of the first row, indices 2:4 of the second row and indices 3:5 of the fourth row?
So, in the above example, I wish to get:
>>> # What to write here?
array([[7, 3],
[5, 7],
[8, 4]])
Obviously, I would like a general method that would work efficiently also for multi-dimensional large arrays (and not only for the toy example above).
Try:
>>> np.array([x[0, 1:3], x[1, 2:4], x[3, 3:5]])
array([[7, 3],
[5, 7],
[8, 4]])
You can use numpy.lib.stride_tricks.as_strided as long as the offsets between rows are uniform:
# How far to step along the rows
offset = 1
# How wide the chunk of each row is
width = 2
view = np.lib.stride_tricks.as_strided(x, shape=(x.shape[0], width), strides=(x.strides[0] + offset * x.strides[1],) + x.strides[1:])
The result is guaranteed to be a view into the original data, not a copy.
Since as_strided is ridiculously powerful, be very careful how you use it. For example, make absolutely sure that the view does not go out of bounds in the last few rows.
If you can avoid it, try not to assign anything into a view returned by as_strided. Assignment just increases the dangers of unpredictable behavior and crashing a thousandfold if you don't know exactly what you're doing.
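If those risks are a concern, one bounds-checked alternative is sliding_window_view, which is available in NumPy 1.20 and later. This is a swapped-in technique rather than the as_strided call above, and the rows and starts arrays are assumptions chosen to match the question's example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

width = 2
# windows[i, j] is x[i, j:j + width]; the shape here is (5, 4, 2).
# The view is read-only by default and cannot reach outside x.
windows = sliding_window_view(x, width, axis=1)

rows = np.array([0, 1, 3])    # which rows to take
starts = np.array([1, 2, 3])  # where each row's slice begins
result = windows[rows, starts]  # fancy indexing, so this is a copy
```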
I guess something like this :D
In:
import numpy as np
x = np.random.randint(0, 10, size=(5, 5))
Out:
array([[7, 3, 3, 1, 9],
[6, 1, 3, 8, 7],
[0, 2, 2, 8, 4],
[8, 8, 1, 8, 8],
[1, 2, 4, 3, 4]])
In:
list_of_indicies = [[0,1,3], [1,2,4], [3,3,5]] #[row, start, stop]
def func(array, row, start, stop):
    return array[row, start:stop]
for i in range(len(list_of_indicies)):
    print(func(x,list_of_indicies[i][0],list_of_indicies[i][1], list_of_indicies[i][2]))
Out:
[3 3]
[3 8]
[8 8]
So you can modify it for your needs. Good luck!
I would extract diagonal vectors and stack them together, like this:
def diag_slice(x, start, end):
    n_rows = min(*x.shape)-end+1
    columns = [x.diagonal(i)[:n_rows, None] for i in range(start, end)]
    return np.hstack(columns)
In [37]: diag_slice(x, 1, 3)
Out[37]:
array([[7, 3],
[5, 7],
[3, 2]])
For the general case it will be hard to beat a row by row list comprehension:
In [28]: idx = np.array([[0,1,3],[1,2,4],[4,3,5]])
In [29]: [x[i,j:k] for i,j,k in idx]
Out[29]: [array([7, 8]), array([2, 0]), array([9, 2])]
If the resulting arrays are all the same size, they can be combined into one 2d array:
In [30]: np.array(_)
Out[30]:
array([[7, 8],
[2, 0],
[9, 2]])
Another approach is to concatenate the indices before. I won't get into the details, but create something like this:
In [27]: x[[0,0,1,1,3,3],[1,2,2,3,3,4]]
Out[27]: array([7, 8, 2, 0, 3, 8])
Selecting from different rows complicates this 2nd approach. Conceptually the first is simpler. Past experience suggests the speed is about the same.
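One possible way to build those concatenated index arrays from (row, start, stop) triples is sketched below. The x array is the one from the question, and the idx triples are an assumption matching the slices the question asks for.

```python
import numpy as np

x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

idx = [(0, 1, 3), (1, 2, 4), (3, 3, 5)]  # (row, start, stop)

# Repeat each row number once per selected column in that row.
rows = np.concatenate([np.full(k - j, i) for i, j, k in idx])
# Expand each start:stop slice into explicit column numbers.
cols = np.concatenate([np.arange(j, k) for i, j, k in idx])

flat = x[rows, cols]  # 1D array holding all selected values in order
```

The result is flat, so slices of different lengths work too; it can be reshaped to 2D only when every slice has the same length.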
For uniform length slices, something like the as_strided trick may be faster, but it requires more understanding.
Some masking-based approaches have also been suggested. But the details are more complicated, so I'll leave those to people like @Divakar who have specialized in them.
Someone has already pointed out the as_strided tricks, and yes, you should really use it with caution.
Here is a broadcast / fancy index approach which is less efficient than as_strided but still works pretty well IMO
window_size, step_size = 2, 1
# index within window
index = np.arange(window_size)
# offset
offset = np.arange(1, 4, step_size)
# for your case it's [0, 1, 3], I'm not sure how to generalize it without further information
fancy_row = np.array([0, 1, 3]).reshape(-1, 1)
# fancy_col is
# array([[1, 2],
#        [2, 3],
#        [3, 4]])
fancy_col = offset.reshape(-1, 1) + index
x[fancy_row, fancy_col]

Put numpy arrays split with np.split() back together

I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible if you do:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
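Wrapped in a small helper, the pattern can be checked for the k = 0 case directly. The function name without_chunk is made up for this sketch, and an arange array replaces the random data so the result is easy to verify.

```python
import numpy as np

def without_chunk(chunks, k, axis=0):
    """Join every chunk except chunks[k] along `axis`."""
    # List concatenation handles k = 0 naturally: chunks[:0] is just an empty list.
    return np.concatenate(chunks[:k] + chunks[k + 1:], axis=axis)

x = np.arange(30).reshape(10, 3)
x_split = np.split(x, 5, axis=0)

first_dropped = without_chunk(x_split, 0)  # shape (8, 3), equals x[2:]
last_dropped = without_chunk(x_split, 4)   # shape (8, 3), equals x[:8]
```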
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
[6, 4, 3],
[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[2, 2, 5],
[4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[3, 2, 0],
[0, 3, 8],
[2, 2, 5],
[4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If the xsp (your x_split) was an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. Well we could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))

Apply same permutation for every row in a 2D numpy array

To permute a 1D array A I know that you can run the following code:
import numpy as np
A = np.random.permutation(A)
I have a 2D array and want to apply exactly the same permutation to every row of the array. Is there any way to tell NumPy to do that for you?
Generate random permutations for the number of columns in A and index into the columns of A, like so -
A[:,np.random.permutation(A.shape[1])]
Sample run -
In [100]: A
Out[100]:
array([[3, 5, 7, 4, 7],
[2, 5, 2, 0, 3],
[1, 4, 3, 8, 8]])
In [101]: A[:,np.random.permutation(A.shape[1])]
Out[101]:
array([[7, 5, 7, 4, 3],
[3, 5, 2, 0, 2],
[8, 4, 3, 8, 1]])
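The same idea can be written with NumPy's newer Generator interface, keeping the permutation in a variable so it can be reused or inspected. This is a sketch only; the seed is an arbitrary choice to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable

A = np.array([[3, 5, 7, 4, 7],
              [2, 5, 2, 0, 3],
              [1, 4, 3, 8, 8]])

perm = rng.permutation(A.shape[1])  # one random ordering of the column indices
shuffled = A[:, perm]               # every row is reordered by the same perm
```

Column c of shuffled is column perm[c] of A, so the rows stay aligned with each other.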
Actually you do not need to do this, from the documentation:
If x is a multi-dimensional array, it is only shuffled along its first
index.
So, taking Divakar's array:
a = np.array([
[3, 5, 7, 4, 7],
[2, 5, 2, 0, 3],
[1, 4, 3, 8, 8]
])
you can just do: np.random.permutation(a) and get something like:
array([[2, 5, 2, 0, 3],
[3, 5, 7, 4, 7],
[1, 4, 3, 8, 8]])
P.S. if you need to perform column permutations - just do np.random.permutation(a.T).T. Similar things apply to multi-dim arrays.
It depends on what you mean by every row.
If you want to permute all values (regardless of row and column), reshape your array to 1D, permute, and reshape back to 2D.
If you want to permute each row independently, so that every element stays in its own row, you need to loop through one axis and call permutation.
for i in range(len(A)):
    A[i] = np.random.permutation(A[i])
It can probably be done more concisely, but that is one way to do it.
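A shorter form of the per-row loop appears to be Generator.permuted, available in NumPy 1.20 and later, which shuffles each slice along an axis independently. A minimal sketch with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

A = np.array([[3, 5, 7, 4, 7],
              [2, 5, 2, 0, 3],
              [1, 4, 3, 8, 8]])

# axis=1 shuffles within each row, using a different permutation per row.
# Without an `out` argument a new array is returned and A is left untouched.
shuffled = rng.permuted(A, axis=1)
```

Note that this gives each row its own permutation, which is different from applying one shared permutation to all rows.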
