Given an array A with shape (2, 6, 60), is it possible to index it based on a binary array B of shape (6,)?
The 6 and 60 are arbitrary; they are simply the dimensions of the 2D data I wish to access.
The underlying thing I am trying to do is to calculate two variants of the 2D data (in this case, (6,60)) and then efficiently select the ones with the lowest total sum - that is where the binary (6,) array comes from.
Example: For B = [1,0,1,0,1,0] what I wish to receive is equal to stacking
A[1,0,:]
A[0,1,:]
A[1,2,:]
A[0,3,:]
A[1,4,:]
A[0,5,:]
but I would like to do it by direct indexing and not a for-loop.
I have tried A[B], A[:,B,:], A[B,:,:] and A[:,:,B], but none of them gives the desired (6,60) matrix.
import numpy as np
A = np.array([[4, 4, 4, 4, 4, 4], [1, 1, 1, 1, 1, 1]])
A = np.atleast_3d(A)
A = np.tile(A, (1,1,60))
B = np.array([1, 0, 1, 0, 1, 0])
A[B]
The expected result is a (6,60) array containing the elements from A as described above; what I get instead has shape (2,6,60) or (6,6,60).
Thank you in advance,
Linus
You can generate a range of the indices you want to iterate over, in your case from 0 to 5:
count = A.shape[1]
indices = np.arange(count) # np.arange(6) for your particular case
>>> print(indices)
[0 1 2 3 4 5]
And then you can use that to do your advanced indexing:
result_array = A[B[indices], indices, :]
If you always use the full range from 0 to length - 1 (i.e. 0 to 5 in your case) of the second axis of A in increasing order, you can simplify that to:
result_array = A[B, indices, :]
# or the ugly result_array = A[B, np.arange(A.shape[1]), :]
Or even this if it's always 6:
result_array = A[B, np.arange(6), :]
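As a quick check of the indexing above, here is a minimal sketch that compares it against the explicit stacking described in the question. The array contents (np.arange data) are made up so that every element is distinguishable; only the shapes and B come from the question.

```python
import numpy as np

# Made-up data with the question's shape; every element is unique.
A = np.arange(2 * 6 * 60).reshape(2, 6, 60)
B = np.array([1, 0, 1, 0, 1, 0])

# Pair each row index j with the variant B[j] chosen for that row.
rows = np.arange(A.shape[1])
result = A[B, rows, :]

# Reference result built with an explicit loop, for comparison only.
expected = np.stack([A[B[j], j, :] for j in range(A.shape[1])])

print(result.shape)                       # (6, 60)
print(np.array_equal(result, expected))   # True
```

The two index arrays B and rows are paired element by element, which is why the result has one row per entry of B rather than one block per entry.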
An alternative solution uses np.take_along_axis (available from NumPy version 1.15):
import numpy as np
x = np.arange(2*6*6).reshape((2,6,6))
m = np.zeros(6, int)
m[0] = 1
#example: [1, 0, 0, 0, 0, 0]
np.take_along_axis(x, m[None, :, None], 0) #add dimensions to mask to match array dimensions
>>array([[[36, 37, 38, 39, 40, 41],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]])
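Applied to the original question, the same call might look like the sketch below. It assumes that B is meant to mark, for each of the 6 rows, the variant with the smaller row sum; the random data and the names picked and same are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)     # made-up data for illustration
A = rng.random((2, 6, 60))

# For each of the 6 rows, pick the variant whose row sum is smaller.
B = A.sum(axis=2).argmin(axis=0)   # shape (6,), values 0 or 1

# The index array needs the same number of dimensions as A.
picked = np.take_along_axis(A, B[None, :, None], axis=0)  # shape (1, 6, 60)
picked = picked[0]                                        # shape (6, 60)

# Same result as the advanced-indexing form.
same = np.array_equal(picked, A[B, np.arange(A.shape[1]), :])
print(picked.shape, same)
```

The leading axis of length 1 is left over from the selection along axis 0, so it is dropped with picked[0] to get the (6, 60) array.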
I want to define a 3d numpy array with shape = [3, 3, 5] and also with values as a range starting with 11 and step = 3 in column-wise manner. I mean:
B[:,:,0] = [[ 11, 20, 29],
[ 14, 23, 32],
[17, 26, 35]]
B[:,:,1] = [[ 38, ...],
[ 41, ...],
[ 44, ...]]
...
I am new to numpy and I suspect it can be done with np.arange or np.mgrid, but I don't know how.
How can this be done with minimal lines of code?
You can calculate the end of the range by multiplying the number of elements (the product of the shape) by the step and adding the start. Then it's just a reshape and a transpose to move the axes around:
start = 11
step = 3
shape = [5, 3, 3]
end = np.prod(shape) * step + start
B = np.arange(start, end, step).reshape([5, 3, 3]).transpose(2, 1, 0)
B[:, :, 0]
# array([[11, 20, 29],
# [14, 23, 32],
# [17, 26, 35]])
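A possible alternative, sketched here as an assumption rather than as part of the answer above, is to reshape directly in Fortran (column-major) order, where the first index varies fastest. That avoids the reversed shape and the transpose:

```python
import numpy as np

start, step = 11, 3
shape = (3, 3, 5)
count = int(np.prod(shape))        # 45 elements in total

# order="F" fills the first axis fastest, i.e. column by column.
B = (start + step * np.arange(count)).reshape(shape, order="F")

print(B[:, :, 0])
# [[11 20 29]
#  [14 23 32]
#  [17 26 35]]
```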
Suppose I have a numpy array img, with img.shape == (468,832,3). What does img[::2, ::2] do? It reduces the shape to (234,416,3) Can you please explain the logic?
Let's read documentation together (Source).
(Just read the bold part first)
The basic slice syntax is i:j:k where i is the starting index, j is the stopping index, and k is the step (k ≠ 0). This selects the m elements (in the corresponding dimension) with index values i, i + k, ..., i + (m - 1) k where m = q + (r ≠ 0) and q and r are the quotient and remainder obtained by dividing j - i by k: j - i = q k + r, so that i + (m - 1) k < j.
...
Assume n is the number of elements in the dimension being sliced.
Then, if i is not given it defaults to 0 for k > 0 and n - 1 for k < 0. If j is not given it defaults to n for k > 0 and -n-1 for k < 0. If k is not given it defaults to 1. Note that :: is the same as : and means select all indices along this axis.
Now looking at your part.
[::2, ::2] is equivalent to [0:468:2, 0:832:2] because you do not specify i and j (the start and stop in the i:j:k notation above), so they take their defaults. You only specify k, the step. A step of 2 means you select every other element along each of the two axes.
Because you did not specify for the 3rd dimension, all will be selected.
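The equivalence can be checked with a small sketch. The image here is a made-up placeholder array with the shape from the question, not real pixel data.

```python
import numpy as np

# Placeholder "image" with the question's shape; values are arbitrary.
img = np.arange(468 * 832 * 3).reshape(468, 832, 3)

short = img[::2, ::2]
explicit = img[0:468:2, 0:832:2, :]   # defaults written out in full

print(short.shape)                      # (234, 416, 3)
print(np.array_equal(short, explicit))  # True
```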
It takes every other row and every other column of the array, starting from index 0. An axis of length n is reduced to (n + 1) // 2 elements, which equals n // 2 when n is even.
Here's an example of slicing with a 2D array -
>>> a = np.arange(16).reshape(4, 4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> a[::2, ::2]
array([[ 0, 2],
[ 8, 10]])
And, here's another example with a 3D array -
>>> a = np.arange(27).reshape(3, 3, 3)
>>> a
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
>>> a[::2, ::2] # same as a[::2, ::2, :]
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[18, 19, 20],
[24, 25, 26]]])
Well, we have the RGB image as a 3D array of shape:
img.shape=(468,832,3)
Now, what does img[::2, ::2] do?
We're just downsampling the image: each of the first two dimensions is halved by keeping only every other pixel from the original image. The step size of 2 does this, since it means skipping one pixel after each one kept. This should be clear from the example below.
Let's take a simple grayscale image for easier understanding.
In [13]: arr
Out[13]:
array([[10, 11, 12, 13, 14, 15],
[20, 21, 22, 23, 24, 25],
[30, 31, 32, 33, 34, 35],
[40, 41, 42, 43, 44, 45],
[50, 51, 52, 53, 54, 55],
[60, 61, 62, 63, 64, 65]])
In [14]: arr.shape
Out[14]: (6, 6)
In [15]: arr[::2, ::2]
Out[15]:
array([[10, 12, 14],
[30, 32, 34],
[50, 52, 54]])
In [16]: arr[::2, ::2].shape
Out[16]: (3, 3)
Notice which pixels are in the sliced version. Also, observe how the array shape changes after slicing (i.e. each dimension is reduced by half).
Now, this downsampling happens for all three channels in the image since there's no slicing happening in the third axis. Thus, you will get the shape reduced only for the first two axes in your example.
(468, 832, 3)
. . |
. . |
(234, 416, 3)
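Two details are worth a small sketch: what happens when an axis has odd length, and the fact that basic slicing returns a view rather than a copy. The 5 x 7 array below is made up for illustration.

```python
import numpy as np

arr = np.arange(5 * 7).reshape(5, 7)   # odd-sized, made-up example
half = arr[::2, ::2]

# Rows 0, 2, 4 and columns 0, 2, 4, 6 are kept.
print(half.shape)                      # (3, 4)

# The slice is a view: it shares memory with the original array.
print(np.shares_memory(arr, half))     # True
half[0, 0] = -1
print(arr[0, 0])                       # -1
```

If the downsampled image must be independent of the original, call .copy() on the slice.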
Could anyone explain how the marked command (<---) below works in Python with numpy?
r = np.arange(36)
r.resize(6,6)
r.reshape(36)[::7] # <---
You just have to run the commands one by one and analyse their output:
Create an array of the integers from 0 to 35:
>>> r = np.arange(36)
>>> r
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35])
Reshape the array in-place to a 6 x 6 array:
>>> r.resize(6,6) # equivalent to r = r.reshape(6,6)
>>> r
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
Reshape the array r back to a 1-dimensional vector:
>>> tmp = r.reshape(36)
tmp above has the same shape and contents as r in the first step.
Take every 7th element:
>>> tmp[::7]
array([ 0, 7, 14, 21, 28, 35])
Slicing is written as i:j:k, where i = start, j = stop (exclusive) and k = step. Thus, 5:10:2 means: starting at index 5 and stopping before index 10, take every 2nd element (indices 5, 7 and 9). If i is not present, the slice starts at the beginning of the array. If j is not present, it runs until the end of the array. If k is not present, the step is 1 (all the elements in the range).
With all the above, you could rewrite your example in a single line as:
>>> np.arange(36)[::7]
Or if you already have r, which is N-Dimensional:
>>> r.ravel()[::7]
Here ravel returns a 1-dimensional array, as a view of r when possible (preferred to reshape(36)).
If you want to know more about slicing, please refer to the numpy documentation.
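One way to see why the step is 7: in a flattened 6 x 6 array, moving one row down and one column right advances the flat index by 6 + 1 = 7, so the slice picks out the main diagonal. A minimal sketch comparing it with np.diagonal:

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

strided = r.reshape(36)[::7]   # every 7th element of the flat array
diag = np.diagonal(r)          # main diagonal of the 6 x 6 array

print(strided)                       # [ 0  7 14 21 28 35]
print(np.array_equal(strided, diag)) # True
```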
First, you are using NumPy's ndarray.reshape, which gives the array the specified shape. In your case, you are converting it to a 1-dimensional array with 36 elements.
Second, the expression in brackets slices the array. A slice takes up to three values per dimension, in the form [number1:number2:number3]. If you leave values blank (as you do for numbers 1 and 2), they take their defaults: for a positive step, number1 defaults to 0, number2 defaults to the length of the array (so the last element is included), and number3 defaults to 1. In your case number3 is 7:
The first number indicates the array index where you will begin taking values.
The second number indicates the array index where you will stop taking values (this index itself is excluded).
Finally, the last number is the step, i.e. the distance between consecutive indices that are read. In your case, you are reading every 7th element.
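The defaults for the blank values can be inspected with Python's built-in slice object, whose indices() method resolves them for a given length. A short sketch for the 36-element case:

```python
# [::7] corresponds to slice(None, None, 7).
s = slice(None, None, 7)

# Resolve the blank start and stop for a sequence of length 36.
start, stop, step = s.indices(36)
print(start, stop, step)                   # 0 36 7

# The indices that the slice visits.
visited = list(range(start, stop, step))
print(visited)                             # [0, 7, 14, 21, 28, 35]
```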
One point to add: for a shape with the same number of elements, reshape() and resize() produce the same layout; the main difference between them is how they affect the calling array object r:
r.resize() returns None. It changes the shape of r in place (and, unlike reshape(), it may also change the total number of elements).
r.reshape() returns a new array object with the requested shape (a view of the same data when possible) and leaves the shape of r unchanged.
>>> import numpy as np
>>> r = np.arange(36)
>>> r.shape
(36,)
>>> # 1. --- `reshape()` returns a new object and keep the `r` ---
>>> new = r.reshape(6,6)
>>> new.shape
(6, 6)
>>>
>>> # 2. --- resize changes `r` directly and returns `None` ---
>>> nothing = r.resize(6,6)
>>> type(nothing)
<class 'NoneType'>
>>> r.shape
(6, 6)
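Two further differences can be sketched briefly: the array returned by reshape() usually shares its data with the original, and resize() can change the number of elements. The refcheck=False argument is used here only so the example also runs in environments that hold extra references to the array; the small arrays are made up.

```python
import numpy as np

# reshape() returns a new array object, but here it is a view of r's data.
r = np.arange(6)
view = r.reshape(2, 3)
shares = np.shares_memory(r, view)   # True
view[0, 0] = 99                      # also changes r[0]
print(shares, r[0])                  # True 99

# resize() works in place and may grow the array; new entries are zeros.
s = np.arange(4)
s.resize((2, 3), refcheck=False)
print(s)
# [[0 1 2]
#  [3 0 0]]
```

By contrast, np.arange(4).reshape(2, 3) raises a ValueError because the sizes do not match.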
Say I have a 2-D numpy array A of size 20 x 10.
I also have an array of length 20, del_ind.
I want to delete an element from each row of A according to del_ind, to get a resultant array of size 20 x 9.
How can I do this?
I looked into np.delete with a specified axis = 1, but this only deletes element from the same position for each row.
Thanks for the help
You will probably have to build a new array.
Fortunately you can avoid python loops for this task, using fancy indexing:
h, w = 20, 10
A = np.arange(h*w).reshape(h, w)
del_ind = np.random.randint(0, w, size=h)
mask = np.ones((h,w), dtype=bool)
mask[range(h), del_ind] = False
A_ = A[mask].reshape(h, w-1)
Demo with a smaller dataset:
>>> h, w = 5, 4
>>> %paste
A = np.arange(h*w).reshape(h, w)
del_ind = np.random.randint(0, w, size=h)
mask = np.ones((h,w), dtype=bool)
mask[range(h), del_ind] = False
A_ = A[mask].reshape(h, w-1)
## -- End pasted text --
>>> A
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19]])
>>> del_ind
array([2, 2, 1, 1, 0])
>>> A_
array([[ 0, 1, 3],
[ 4, 5, 7],
[ 8, 10, 11],
[12, 14, 15],
[17, 18, 19]])
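The same steps can be wrapped in a small function and checked against the demo values above. The function name delete_per_row is hypothetical; the body follows the mask approach of this answer.

```python
import numpy as np

def delete_per_row(A, del_ind):
    """Remove element del_ind[i] from row i of the 2-D array A."""
    h, w = A.shape
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), del_ind] = False   # mark one element per row
    # Boolean indexing flattens row by row, so each row keeps w - 1 values.
    return A[mask].reshape(h, w - 1)

# Values taken from the demo above.
A = np.arange(20).reshape(5, 4)
del_ind = np.array([2, 2, 1, 1, 0])
out = delete_per_row(A, del_ind)
print(out)
```

The reshape is valid because exactly one element is removed from every row, so the remaining values still form a rectangle.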
NumPy arrays have a fixed size, so elements cannot be removed in place. For that reason, I'd recommend doing this by copying the intended elements to a new array.
Assuming that it's sufficient to delete one column from every row:
def remove_indices(arr, indices):
    # result has one column fewer than arr; keep arr's dtype
    result = np.empty((arr.shape[0], arr.shape[1] - 1), dtype=arr.dtype)
    for i, (delete_index, row) in enumerate(zip(indices, arr)):
        result[i] = np.delete(row, delete_index)
    return result
I have several multidimensional arrays that have been zipped into a single list and am trying to remove values from the list according to a selection criteria applied to a single array. Specifically I have the 4 arrays, all of which have the same shape, that have all been zipped into one list of arrays:
in: array1.shape
out: (5,3)
...
in: array4.shape
out: (5,3)
in: array1
out: ([[0, 1, 1],
[0, 0, 1],
[0, 0, 1],
[0, 1, 1],
[0, 0, 0]])
in: array4
out: ([[20, 16, 20],
[15, 19, 17],
[21, 24, 23],
[22, 22, 26],
[27, 24, 23]])
in: fullarray = zip(array1,...,array4)
in: fullarray[0]
out: (array([0, 1, 1]), array([3, 4, 5]), array([33, 34, 35]), array([20, 16, 20]))
I am trying to iterate over the values from a single target array within each set of arrays and select the values from each array with the same index as the target array when the value equals 20. I doubt I explained that clearly so I'll give an example.
in: fullarray[0]
out: (array([0, 1, 1]), array([3, 4, 5]), array([33, 34, 35]), array([20, 16, 20]))
what I want is to iterate over the values of the fourth array in the list for
fullarray[x] and where the value = 20 to take the value of each array with
the same index and append them into a new list as an array.
so the output for fullarray[0] would be [[0, 3, 33, 20], [1, 5, 35, 20]]
My previous attempts have all generated a variety of error messages (example below). Any help would be appreciated.
in: for i in g:
        for n in i:
            if n == 3:
                for k in n:
                    if k == 0:
                        newlist.append(i[k])
out: for i in fullarray:
2 for n in i:
----> 3 if n == 3:
4 for k in n:
5 if k == 0:
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
Edit for modified question:
Here's a piece of code that does what you want. The time complexity could probably be improved though.
from numpy import array
fullarray = [(array([0, 1, 1]), array([3, 4, 5]), array([33, 34, 35]), array([20, 16, 20]))]
newlist = []
for arrays in fullarray:
    # arrays[3] is the target (fourth) array of this group
    for idx, value in enumerate(arrays[3]):
        if value == 20:
            # collect the element at the same index from every array
            newlist.append([arr[idx] for arr in arrays])
print(newlist)
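A possible vectorized alternative, offered as a sketch rather than as the method of this answer, stacks the arrays of one group into a 2-D array and selects rows with a boolean mask. It uses only the fullarray[0] values from the question; the names stacked and selected are made up.

```python
import numpy as np

# One group of arrays, with the values of fullarray[0] from the question.
arrays = (np.array([0, 1, 1]), np.array([3, 4, 5]),
          np.array([33, 34, 35]), np.array([20, 16, 20]))

# Row i of stacked holds the i-th element of every array.
stacked = np.stack(arrays, axis=1)     # shape (3, 4)

# Keep the rows where the fourth (target) array equals 20.
selected = stacked[arrays[3] == 20]
print(selected)
# [[ 0  3 33 20]
#  [ 1  5 35 20]]
```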
Old answer: Assuming all your arrays are the same size you could do the following:
full[idx] contains a tuple with the values of all your arrays at the index idx in the order you zipped them.
import numpy as np
ar1 = np.array([1] * 8)
ar2 = np.array([2] * 8)
full = list(zip(ar1, ar2))  # list() so that full can be indexed in Python 3
print(full)
newlist = []
for idx, v in enumerate(ar1):
    if v != 0:
        newlist.append(full[idx])  # Here you get a tuple such as (ar1[idx], ar2[idx])
But if len(ar1) > len(ar2), zip stops at the shorter array, so full[idx] is going to throw an exception (an IndexError) for the extra indices. Keep that in mind and adjust your code accordingly.
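The length mismatch can be reproduced with a short sketch. The array lengths 8 and 5 are made up for illustration.

```python
import numpy as np

ar1 = np.array([1] * 8)
ar2 = np.array([2] * 5)        # shorter than ar1

# zip stops as soon as the shorter input is exhausted.
full = list(zip(ar1, ar2))
n_pairs = len(full)
print(n_pairs)                 # 5

# Indexing past the zipped length fails.
try:
    full[7]
    raised = False
except IndexError:
    raised = True
print(raised)                  # True
```

Looping over range(len(full)) instead of enumerate(ar1) is one way to stay within bounds.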