Numpy repeating a row or column - python

Suppose we have the matrix A:
A = [1,2,3
4,5,6
7,8,9]
I want to know if there is a way to obtain:
B = [1,2,3
4,5,6
7,8,9
7,8,9]
As well as:
B = [1,2,3,3
4,5,6,6
7,8,9,9]
This is because the function I want to implement is the following:
U(i,j) = min(A(i+1,j)^2, A(i,j)^2)
V(i,j) = min(A(i,j+1)^2, A(i,j)^2)
And the numpy.minimum seems to need two arrays with equal shapes.
My idea is the following:
np.minimum(np.square(A[1:]), np.square(A[:]))
but it will fail.

For your particular example you could use numpy.hstack and numpy.vstack:
In [11]: np.vstack((A, A[-1]))
Out[11]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[7, 8, 9]])
In [12]: np.hstack((A, A[:, [-1]]))
Out[12]:
array([[1, 2, 3, 3],
[4, 5, 6, 6],
[7, 8, 9, 9]])
An alternative to the last one is np.hstack((A, np.atleast_2d(A[:,-1]).T)) or np.vstack((A.T, A.T[-1])).T: you can't hstack a (3,) array onto a (3,3) one without first putting its elements in the rows of a (3,1) array.
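To connect this back to the question, here is a minimal sketch of how the padded arrays could be used to compute U and V with the same shape as A. The names B_rows and B_cols are only illustrative. Note that the last row of U and the last column of V end up comparing an element with itself.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Repeat the last row / last column so the shifted slices line up with A.
B_rows = np.vstack((A, A[-1]))        # shape (4, 3)
B_cols = np.hstack((A, A[:, [-1]]))   # shape (3, 4)

# U(i,j) = min(A(i+1,j)^2, A(i,j)^2)
U = np.minimum(np.square(B_rows[1:]), np.square(A))
# V(i,j) = min(A(i,j+1)^2, A(i,j)^2)
V = np.minimum(np.square(B_cols[:, 1:]), np.square(A))

print(U.shape, V.shape)  # (3, 3) (3, 3)
```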

A good answer to your literal question is provided by @xnx, but I wonder whether what you actually need is something else.
This type of question comes up a lot in comparisons, and the usual solution is to take only the valid entries, rather than using a misaligned comparison. That is, something like this is common:
import numpy as np
A = np.arange(9).reshape((3,3))
U = np.minimum(A[1:,:]**2, A[:-1,:]**2)
V = np.minimum(A[:,1:]**2, A[:,:-1]**2)
print(U)
# [[ 0 1 4]
# [ 9 16 25]]
print(V)
# [[ 0 1]
# [ 9 16]
# [36 49]]
I suspect that you're probably thinking, "that's a hassle, now U and V have different shapes than A, which is not what I want". But, to this I'd say, "yes, it is a hassle, but it's better to deal with the problem up front and directly than hide it within an invalid row of an array."
A standard example and typical use case of this approach would be numpy.diff, where, "the shape of the output is the same as a except along axis where the dimension is smaller by n."
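As a small illustration of that numpy.diff behaviour (a sketch using the same 3x3 array as above), the output loses one entry along the axis being differenced:

```python
import numpy as np

A = np.arange(9).reshape((3, 3))

d_rows = np.diff(A, axis=0)  # differences between consecutive rows
d_cols = np.diff(A, axis=1)  # differences between consecutive columns

print(d_rows.shape)  # (2, 3)
print(d_cols.shape)  # (3, 2)
```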

Related

Mean of each element of matrices in a list

Hello, I'm new to Python and couldn't solve my problem. Suppose I have a list (a) containing many matrices that all have the same shape. I want to get one matrix in which each element is the mean of the corresponding elements of those matrices.
here is the list and its elements:
a[0]=[1 2 3]
a[1]=[3 4 5]
a[2]=[6 7 8]
Here is the desired matrix:
mean=[10/3 13/3 16/3]
The answer to "Mean of each element of a list of matrices" does what I want, but it is for R, not Python. Sorry if I made a mistake while asking the question.
Using Python list comprehension
a = [[1, 2, 3],
[3, 4, 5],
[6, 7, 8]]
mean = [sum(row)/len(row) for row in zip(*a)] # use zip(*a) to transpose matrix
# since sum along columns
# by summing rows of transposed a
# [3.3333333333333335, 4.333333333333333, 5.333333333333333]
Here is a pure Python solution that works for a matrix of any dimensions:
matrice = [
    [1, 2, 3],
    [3, 4, 5],
    [6, 7, 8]
]
def mean_mat(mat):
    n_rows = len(mat)
    # one accumulator per column (not per row), so non-square input also works
    mean = [0 for _ in range(len(mat[0]))]
    for vector in mat:
        for i, value in enumerate(vector):
            mean[i] += value / n_rows
    return mean
print(mean_mat(matrice))
# [3.333333333333333, 4.333333333333334, 5.333333333333334]
However, as user1740577 pointed out, you should check out the NumPy library.
Try this:
import numpy as np
a= [[1,2,3],[3,4,5],[6,7,8]]
np.mean(a, axis=0)
# array([3.33333333, 4.33333333, 5.33333333])
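The same call also covers the case in the question title, a list of 2-D matrices of equal shape. The following is a minimal sketch; the names mats and elementwise_mean are made up for illustration. NumPy stacks the list into a (3, 2, 2) array and averages over the first axis:

```python
import numpy as np

# Three matrices with the same shape (illustrative data).
mats = [np.array([[1, 2], [3, 4]]),
        np.array([[3, 4], [5, 6]]),
        np.array([[5, 6], [7, 8]])]

# axis=0 averages across the list, element by element.
elementwise_mean = np.mean(mats, axis=0)
print(elementwise_mean.shape)  # (2, 2)
```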

How to index a numpy array of dimension N with a 1-dimensional array of shape (N,)

I would like to index an array of dimension N using an array of size (N,).
For example, let us consider a case where N is 2.
import numpy as np
foo = np.arange(9).reshape(3,3)
bar = np.array((2,1))
>>> foo
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> bar
array([2, 1])
>>> foo[bar[0], bar[1]]
7
This works fine. However, with this method, I would need to write N times bar[i], which is not a nice solution if N is high.
The following command does not give the result that I need:
>>> foo[bar]
array([[6, 7, 8],
[3, 4, 5]])
What could I do to get the result that I want in a nice and concise way?
I think you can turn bar into a tuple:
foo[tuple(bar)]
# 7
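The same idea carries over to higher dimensions. Here is a short sketch with N = 3, using an illustrative array that is not from the question:

```python
import numpy as np

foo = np.arange(24).reshape(2, 3, 4)
bar = np.array([1, 2, 3])

# A tuple is treated as one index per axis, i.e. foo[1, 2, 3];
# an array or list would instead select along the first axis only.
value = foo[tuple(bar)]
print(value)  # 23
```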

Given the indexes corresponding to each row, get the corresponding elements from a matrix

Given indexes for each row, how to return the corresponding elements in a 2-d matrix?
For instance, for the array np.array([[1,2,3,4],[4,5,6,7]]) I expect the output [[1,2],[4,5]] given indxs = np.array([[0,1],[0,1]]). Below is what I've tried:
a = np.array([[1,2,3,4],[4,5,6,7]])
indxs = np.array([[0,1],[0,1]]) # means: return the elements at positions 0 and 1 of each row
# I tried this, but it returns an array with shape (2, 2, 4)
a[indxs]
The reason you are getting two times your array is that when you do a[[0,1]] you are selecting the rows 0 and 1 from your array a, which are indeed your entire array.
In[]: a[[0,1]]
Out[]: array([[1, 2, 3, 4],
[4, 5, 6, 7]])
You can get the desired output using slicing. That would be the easiest way.
a = np.array([[1,2,3,4],[4,5,6,7]])
a[:,0:2]
Out []: array([[1, 2],
[4, 5]])
In case you are still interested in indexing, you could also get your output by doing:
In[]: [list(a[[0],[0,1]]),list(a[[1],[0,1]])]
Out[]: [[1, 2], [4, 5]]
The NumPy documentation gives you a really nice overview on how indexes work.
In [120]: a = np.array([[1,2,3,4],[4,5,6,7]])
In [121]: indxs = np.array([[0,1],[0,1]])
You need to provide an index for the first dimension, one that broadcasts with indxs.
In [122]: a[np.arange(2)[:,None], indxs]
Out[122]:
array([[1, 2],
[4, 5]])
indxs is (2,n), so you need a (2,1) array to give a (2,n) result
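The broadcasting approach also works when each row asks for different columns. The sketch below uses made-up indices to show this, and compares the result with np.take_along_axis, which is a different function from the one used in the answer (available in NumPy 1.15 and later) that builds the row index for you:

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [4, 5, 6, 7]])
# Illustrative indices: different columns for each row.
indxs = np.array([[0, 3],
                  [1, 2]])

# (2, 1) row index broadcasts against the (2, 2) column index.
rows = np.arange(a.shape[0])[:, None]
picked = a[rows, indxs]

# Alternative that handles the row index internally.
same = np.take_along_axis(a, indxs, axis=1)

print(picked)
```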

masking array with logical values along an arbitrary axis

Suppose I have a multidimensional array and a vector of logical values. I want to select items along an arbitrary (n-th) dimension. In the following example I am going to select the first and third values along the second dimension:
>>> A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
>>> mask = np.array([True, False, True, False])
>>> dim_to_mask = 1 # i.e. 2nd dimension because it's 0-indexed
>>> B = ... # here do mask the dim_to_mask-th dimension - HOW???
>>> B
[[1, 3],
[5, 7],
[9, 11]]
Note: assume that the length of the logical vector corresponds to the length of the given axis.
I know it would be easy with the [] operator if the array were one-dimensional, but this is a multidimensional problem.
Actually, I want something like the function take(indices, axis), which selects given indices along an arbitrary axis. The only difference is that I have logical values instead of numeric indices.
I am also aiming for the fastest solution, so converting the vector of logical values to indices and using take is probably not the best solution.
I guess it must be something obvious which I am missing. :)
You could use np.compress:
>>> A.compress(mask, axis=1)
array([[ 1, 3],
[ 5, 7],
[ 9, 11]])
This function returns slices of an array along a particular axis. It accepts a boolean array with which to make the selections.
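As a sketch of how this generalises to the dim_to_mask variable from the question, the axis can be passed as a variable. The comparison with np.take plus np.flatnonzero is included only to show that both routes select the same items; row_mask is an extra illustrative mask, not part of the question.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
mask = np.array([True, False, True, False])
dim_to_mask = 1

# Boolean selection along an arbitrary axis.
B = np.compress(mask, A, axis=dim_to_mask)

# Equivalent route via integer indices, for comparison only.
via_take = np.take(A, np.flatnonzero(mask), axis=dim_to_mask)

# The same call works along axis 0 with a mask of matching length.
row_mask = np.array([True, False, True])
C = np.compress(row_mask, A, axis=0)

print(B.shape, C.shape)  # (3, 2) (2, 4)
```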

How do I access the ith column of a NumPy multidimensional array?

Given:
test = numpy.array([[1, 2], [3, 4], [5, 6]])
test[i] gives the ith row (e.g. [1, 2]). How do I access the ith column? (e.g. [1, 3, 5]). Also, would this be an expensive operation?
To access column 0:
>>> test[:, 0]
array([1, 3, 5])
To access row 0:
>>> test[0, :]
array([1, 2])
This is covered in Section 1.4 (Indexing) of the NumPy reference. This is quick, at least in my experience. It's certainly much quicker than accessing each element in a loop.
>>> test[:,0]
array([1, 3, 5])
This command gives you a 1-D array of shape (3,). If you just want to loop over it, that's fine, but if you want to hstack it with some other array of dimension 3xN, you will get
ValueError: all the input arrays must have same number of dimensions
while
>>> test[:,[0]]
array([[1],
[3],
[5]])
gives you a column vector of shape (3, 1), so that you can use it in a concatenate or hstack operation.
e.g.
>>> np.hstack((test, test[:,[0]]))
array([[1, 2, 1],
[3, 4, 3],
[5, 6, 5]])
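To make the shape difference explicit, here is a small sketch comparing the ways of taking column 0. The slice form test[:, 0:1] is an addition not mentioned above; it also keeps two dimensions but returns a view instead of a copy.

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

col_1d = test[:, 0]         # integer index: 1-D, shape (3,)
col_2d = test[:, [0]]       # list index: 2-D, shape (3, 1), a copy
col_2d_view = test[:, 0:1]  # slice: 2-D, shape (3, 1), a view

print(col_1d.shape, col_2d.shape, col_2d_view.shape)  # (3,) (3, 1) (3, 1)

# Only the 2-D forms can be stacked next to a (3, N) array.
stacked = np.hstack((test, col_2d_view))
```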
And if you want to access more than one column at a time you could do:
>>> test = np.arange(9).reshape((3,3))
>>> test
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> test[:,[0,2]]
array([[0, 2],
[3, 5],
[6, 8]])
You could also transpose and return a row:
In [4]: test.T[0]
Out[4]: array([1, 3, 5])
Although the question has been answered, let me mention some nuances.
Let's say you are interested in the column at index 1 (the second column) of the array
arr = numpy.array([[1, 2],
[3, 4],
[5, 6]])
As you already know from other answers, to get it in the form of a "row vector" (an array of shape (3,)), you use slicing:
arr_col1_view = arr[:, 1] # creates a view of column 1 of arr
arr_col1_copy = arr[:, 1].copy() # creates a copy of column 1 of arr
To check if an array is a view or a copy of another array you can do the following:
arr_col1_view.base is arr # True
arr_col1_copy.base is arr # False
see ndarray.base.
Besides the obvious difference between the two (modifying arr_col1_view will affect the arr), the number of byte-steps for traversing each of them is different:
arr_col1_view.strides[0] # 8 bytes (with a 4-byte integer dtype such as int32; 16 with int64)
arr_col1_copy.strides[0] # 4 bytes (8 with int64)
see strides and this answer.
Why is this important? Imagine that you have a very big array A instead of the arr:
A = np.random.randint(2, size=(10000, 10000), dtype='int32')
A_col1_view = A[:, 1]
A_col1_copy = A[:, 1].copy()
and you want to compute the sum of all the elements of that column, i.e. A_col1_view.sum() or A_col1_copy.sum(). Using the copied version is much faster:
%timeit A_col1_view.sum() # ~248 µs
%timeit A_col1_copy.sum() # ~12.8 µs
This is due to the different strides mentioned before:
A_col1_view.strides[0] # 40000 bytes
A_col1_copy.strides[0] # 4 bytes
Although it might seem that using column copies is better, it is not always true for the reason that making a copy takes time too and uses more memory (in this case it took me approx. 200 µs to create the A_col1_copy). However if we needed the copy in the first place, or we need to do many different operations on a specific column of the array and we are ok with sacrificing memory for speed, then making a copy is the way to go.
In the case we are interested in working mostly with columns, it could be a good idea to create our array in column-major ('F') order instead of the row-major ('C') order (which is the default), and then do the slicing as before to get a column without copying it:
A = np.asfortranarray(A) # or np.array(A, order='F')
A_col1_view = A[:, 1]
A_col1_view.strides[0] # 4 bytes
%timeit A_col1_view.sum() # ~12.6 µs vs ~248 µs
Now, performing the sum operation (or any other) on a column-view is as fast as performing it on a column copy.
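The effect of the memory order on a column view can be checked on a small array. This is only a sketch with an illustrative 3x4 int32 array; it checks strides and contiguity and does not reproduce the timings above.

```python
import numpy as np

A_c = np.arange(12, dtype=np.int32).reshape(3, 4)  # row-major ('C')
A_f = np.asfortranarray(A_c)                       # column-major ('F') copy

col_c = A_c[:, 1]
col_f = A_f[:, 1]

# Same values, but different distances between consecutive elements.
print(col_c.strides[0])  # 16 (one full row of four int32 values)
print(col_f.strides[0])  # 4  (one int32 value)

# Only the column taken from the 'F'-ordered array is contiguous in memory.
print(col_c.flags['C_CONTIGUOUS'], col_f.flags['C_CONTIGUOUS'])  # False True
```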
Finally, let me note that transposing an array and slicing a row is the same as slicing a column of the original array, because transposing just swaps the shape and the strides of the original array. For the original row-major A:
A[:, 1].strides[0] # 40000 bytes
A.T[1, :].strides[0] # 40000 bytes
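The swap of shape and strides can be seen directly on a small array (again an illustrative 3x4 int32 array, not the large A from above):

```python
import numpy as np

A = np.arange(12, dtype=np.int32).reshape(3, 4)

print(A.shape, A.strides)      # (3, 4) (16, 4)
print(A.T.shape, A.T.strides)  # (4, 3) (4, 16)

# The transpose is a view on the same memory, not a copy.
shares = np.shares_memory(A, A.T)

# Column 1 of A and row 1 of A.T walk through memory identically.
same_strides = A[:, 1].strides == A.T[1, :].strides
```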
To get several independent columns, just use:
> test[:,[0,2]]
and you will get columns 0 and 2.
>>> test
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> ncol = test.shape[1]
>>> ncol
5
Then you can select the 2nd - 4th column this way:
>>> test[0:, 1:(ncol - 1)]
array([[1, 2, 3],
[6, 7, 8]])
This is not really multidimensional; it is a 2-dimensional array, and you can slice out whichever columns you wish.
test = numpy.array([[1, 2], [3, 4], [5, 6]])
test[:, a:b] # you can provide index in place of a and b
