Could anyone explain how the command marked (<---) below works in Python NumPy?
r = np.arange(36)
r.resize(6,6)
r.reshape(36)[::7] # <---
Run the commands one by one and look at their output:
Create an array of the integers from 0 to 35.
>>> r = np.arange(36)
>>> r
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35])
Reshape the array in place to a 6 x 6 array:
>>> r.resize(6,6) # gives the same shape as r = r.reshape(6,6)
>>> r
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
Reshape the 2-D array r into a 1-D array:
>>> tmp = r.reshape(36)
tmp above holds exactly the same values as r in the first step.
Take every 7th element:
>>> tmp[::7]
array([ 0, 7, 14, 21, 28, 35])
Slicing/indexing is written as i:j:k, where i = from, j = to (exclusive) and k = step. Thus, 5:10:2 means: starting at index 5 and stopping before index 10, give me every 2nd element. If i is not present, the slice starts from the beginning of the array. If j is not present, it runs until the end of the array. If k is not present, the step is 1 (all the elements in the range).
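To make the i:j:k rule concrete, here is a small sketch on a ten-element array (the name v and the chosen slices are only illustrative). Note that the element at the stop index is never included.

```python
import numpy as np

v = np.arange(10)          # array([0, 1, ..., 9])

a = v[5:10:2]              # start at index 5, stop before index 10, step 2
b = v[:4]                  # i omitted: start from the beginning
c = v[6:]                  # j omitted: run to the end
d = v[::3]                 # only k given: every 3rd element

print(a)  # [5 7 9]
print(b)  # [0 1 2 3]
print(c)  # [6 7 8 9]
print(d)  # [0 3 6 9]
```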
With all the above, you could rewrite your example in a single line as:
>>> np.arange(36)[::7]
Or if you already have r, which is N-Dimensional:
>>> r.ravel()[::7]
Here ravel returns a 1-D view of r whenever possible (preferred to reshape(36)).
If you want to know more about slicing, please refer to the numpy documentation.
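One way to see why a step of 7 is interesting here: in a flattened 6 x 6 array, moving 7 positions ahead means going one row down and one column to the right, so the slice walks along the main diagonal. A quick check, assuming the same r as in the question (np.diag is used only for comparison):

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

every_7th = r.ravel()[::7]     # step = number of columns + 1
diagonal = np.diag(r)          # main diagonal of the 6 x 6 array

print(every_7th)                              # [ 0  7 14 21 28 35]
print(np.array_equal(every_7th, diagonal))    # True
```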
First, you are using NumPy's ndarray.reshape, which returns the given array rearranged to the specified shape. In your case, you are converting it to a 1-dimensional array with 36 elements.
Second, the numbers between brackets slice the array, selecting certain values. A slice takes up to three values per dimension, in the form [number1:number2:number3]. If you leave a value blank (as you did for numbers 1 and 2), it takes its default: number1 defaults to 0, number2 defaults to the length of the array (so the slice runs through the last element), and number3 defaults to 1:
The first number is the array index where you begin taking values.
The second number is the array index where you stop taking values (this index itself is excluded).
Finally, the last number is the step, i.e. how far to advance from one index read to the next. In your case, you are reading every 7th index.
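Python's built-in slice object can report which values are filled in for the blanks. The sketch below uses slice.indices, which takes the sequence length and returns the resolved (start, stop, step); the array name tmp is just an example.

```python
import numpy as np

tmp = np.arange(36)

# [::7] is shorthand for slice(None, None, 7)
start, stop, step = slice(None, None, 7).indices(len(tmp))
print(start, stop, step)   # 0 36 7

# Spelling the defaults out gives the same result as the short form
same = np.array_equal(tmp[::7], tmp[0:36:7])
print(same)                # True
```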
One point to add: when the total number of elements stays the same, as here, reshape() and resize() produce the same shape. The main difference between them is how they affect the calling array object r (resize can additionally change the total number of elements, which reshape cannot):
r.resize() returns None. It changes the shape of the calling array object r in place.
r.reshape() returns a new reshaped array object and leaves the shape of the original r unchanged.
>>> import numpy as np
>>> r = np.arange(36)
>>> r.shape
(36,)
>>> # 1. --- `reshape()` returns a new object and keeps `r` unchanged ---
>>> new = r.reshape(6,6)
>>> new.shape
(6, 6)
>>>
>>> # 2. --- resize changes `r` directly and returns `None` ---
>>> nothing = r.resize(6,6)
>>> type(nothing)
<class 'NoneType'>
>>> r.shape
(6, 6)
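A related detail that is easy to miss: the array returned by reshape usually shares its data with the original, so writing through it changes r as well. A small sketch of this (np.shares_memory is used only to make the sharing visible, and the name independent is illustrative):

```python
import numpy as np

r = np.arange(36)
new = r.reshape(6, 6)            # new array object, same underlying data

print(np.shares_memory(r, new))  # True

new[0, 0] = 99                   # write through the reshaped array
print(r[0])                      # 99

independent = r.reshape(6, 6).copy()   # take a copy if you need separate data
independent[0, 1] = -1
print(r[1])                      # 1
```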
I have written code that builds an array from my lists. The array must be vertical, like a column vector, but when I use the reshape method I don't get anything.
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
index = []
for i in range(len(data)):
    index.append([data[i][0], data[i][1], data[i][2],
                  data[i][3], data[i][4], data[i][5]])
    y = np.array([index[i]])
# y.reshape(6,1)
Is there any solution for these cases? Thank you.
I'm looking for something like this to remain:
If you want to view each row as a column, first convert the list to an array with data = np.array(data), then transpose it in any one of the following ways:
index = data.T
index = np.transpose(data)
index = data.transpose()
index = np.swapaxes(data, 0, 1)
index = np.moveaxis(data, 1, 0)
...
Each column of index will be a row of data. If you just want to access one column at a time, you can do that too. For example, to get row 3 (4th row) of the original array, any of the following would work:
y = data[3, :]
y = data[3]
y = index[:, 3]
You can get a column vector from the result by explicitly reshaping it to one:
y = y.reshape(-1, 1)
y = np.reshape(y, (-1, 1))
y = np.expand_dims(y, 1)
Remember that reshaping creates a new array object which views the same data as the original. The only way I know to reshape an array in-place is to assign to its shape attribute:
y.shape = (y.size, 1)
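Putting these pieces together for the data in the question, a minimal sketch might look like this. It assumes data is first converted to an array, since a plain Python list has neither .T nor reshape:

```python
import numpy as np

data = np.array([[28, 29, 30, 19, 20, 21],
                 [31, 32, 33, 22, 23, 24],
                 [ 1, 34, 35, 36, 25, 26],
                 [ 2, 19, 20, 21, 10, 11],
                 [ 3,  4,  5,  6,  7,  8]])

index = data.T                 # shape (6, 5): each column is a row of data
y = data[3].reshape(-1, 1)     # 4th row as a (6, 1) column vector

print(index.shape)   # (6, 5)
print(y.shape)       # (6, 1)
print(y.ravel())     # [ 2 19 20 21 10 11]
```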
You can use flatten() from numpy https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html
(if you want a copy of the original array without modifying the original)
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
data = np.array(data).flatten()
print(data.shape)
(30,)
You can also use ravel()
(if you don't want a copy; ravel returns a view whenever possible)
data = np.array(data).ravel()
reshape(-1) also works, for an array with any number of dimensions:
data = data.reshape(-1)
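The copy-versus-view difference between flatten() and ravel() can be checked directly. This is a small sketch on a made-up 5 x 6 array; np.shares_memory is used only to show whether the result is tied to the original:

```python
import numpy as np

data = np.arange(30).reshape(5, 6)

flat_copy = data.flatten()     # always a copy
flat_view = data.ravel()       # a view here, because data is contiguous

print(np.shares_memory(data, flat_copy))  # False
print(np.shares_memory(data, flat_view))  # True

flat_view[0] = 100             # also changes data
print(data[0, 0])              # 100
print(flat_copy[0])            # 0
```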
Having an array A with the shape (2,6, 60), is it possible to index it based on a binary array B of shape (6,)?
The 6 and 60 are quite arbitrary; they are simply the dimensions of the 2D data I wish to access.
The underlying thing I am trying to do is to calculate two variants of the 2D data (in this case, (6,60)) and then efficiently select the ones with the lowest total sum - that is where the binary (6,) array comes from.
Example: For B = [1,0,1,0,1,0] what I wish to receive is equal to stacking
A[1,0,:]
A[0,1,:]
A[1,2,:]
A[0,3,:]
A[1,4,:]
A[0,5,:]
but I would like to do it by direct indexing and not a for-loop.
I have tried A[B], A[:,B,:], A[B,:,:] and A[:,:,B], with none of them providing the desired (6,60) matrix.
import numpy as np
A = np.array([[4, 4, 4, 4, 4, 4], [1, 1, 1, 1, 1, 1]])
A = np.atleast_3d(A)
A = np.tile(A, (1,1,60))
B = np.array([1, 0, 1, 0, 1, 0])
A[B]
The expected result is a (6,60) array containing the elements from A as described above; what I get is either (2,6,60) or (6,6,60).
Thank you in advance,
Linus
You can generate a range of the indices you want to iterate over, in your case from 0 to 5:
count = A.shape[1]
indices = np.arange(count) # np.arange(6) for your particular case
>>> indices
array([0, 1, 2, 3, 4, 5])
And then you can use that to do your advanced indexing:
result_array = A[B[indices], indices, :]
If you always use the full range from 0 to length - 1 (i.e. 0 to 5 in your case) of the second axis of A in increasing order, you can simplify that to:
result_array = A[B, indices, :]
# or the ugly result_array = A[B, np.arange(A.shape[1]), :]
Or even this if it's always 6:
result_array = A[B, np.arange(6), :]
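To convince yourself that this indexing matches the row-by-row stacking described in the question, you can compare it with an explicit loop. The sketch below fills A with random integers (the seed and value range are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, illustration only
A = rng.integers(0, 100, size=(2, 6, 60))
B = np.array([1, 0, 1, 0, 1, 0])

indices = np.arange(A.shape[1])
result_array = A[B, indices, :]         # pairs B[j] with j for each j

# Reference result built with an explicit loop
expected = np.stack([A[B[j], j, :] for j in range(A.shape[1])])

print(result_array.shape)                       # (6, 60)
print(np.array_equal(result_array, expected))   # True
```

The two index arrays B and indices are matched element by element, which is why the result has one row per entry of B rather than a (6, 6, 60) block.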
An alternative solution using np.take_along_axis (available from NumPy version 1.15):
import numpy as np
x = np.arange(2*6*6).reshape((2,6,6))
m = np.zeros(6, int)
m[0] = 1
#example: [1, 0, 0, 0, 0, 0]
np.take_along_axis(x, m[None, :, None], 0) #add dimensions to mask to match array dimensions
array([[[36, 37, 38, 39, 40, 41],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]])
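Note that the result above keeps a leading axis of length 1. As a rough sketch, this is how it could be reduced to 2-D and compared with integer-array indexing (the mask values are the ones from the question; picked_2d and fancy are illustrative names):

```python
import numpy as np

x = np.arange(2 * 6 * 6).reshape((2, 6, 6))
m = np.array([1, 0, 1, 0, 1, 0])

picked = np.take_along_axis(x, m[None, :, None], axis=0)
print(picked.shape)            # (1, 6, 6)

picked_2d = picked[0]          # drop the length-1 leading axis
fancy = x[m, np.arange(6), :]  # integer-array indexing for comparison

print(np.array_equal(picked_2d, fancy))   # True
```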
I need to define a function which reads a NumPy array and returns the mean of the k nearest points to a number p in the array.
Example:
array = np.array([1, 2, 3, 4, 5, 6, 7, 50, 24, 32, 9, 11, 12, 10])
p = 15 (note: this is not a number in the array; I will need to find the numbers closest to p, or p itself)
k = 3
In this case, I would need to generate the mean of [11, 12, 10], as they are closest to p = 15.
With the above numbers, I will need to find the mean of the k points closest to p, where p may or may not be present in the array.
I am new and very confused at this point and feel I have exhausted my resources. I feel this question has been asked before but the answers are much too complex for what I need.
Thanks in advance.
Given a (1d) array arr and scalar input p, here's how you could find the mean of the n nearest values:
def neighbor_mean(arr, p, n=3):
    idx = np.abs(arr - p).argsort()[:n]
    return arr[idx].mean()
arr = np.array([1, 2, 3, 4, 5, 6, 7, 50, 24, 32, 9, 11, 12, 10])
neighbor_mean(arr, p=15)
# 11.0
In the above, first you take the absolute differences:
np.abs(arr - 15)
# array([14, 13, 12, 11, 10, 9, 8, 35, 9, 17, 6, 4, 3, 5])
Then argsort() returns the indices that would sort the array. We're interested in the indices of the n smallest absolute differences, which is what you're really looking for, rather than the sorted differences themselves.
np.abs(arr - p).argsort()[:3]
# array([12, 11, 13])
Lastly you want to index your input array arr and take the mean of this:
arr[[12, 11, 13]]
# array([12, 11, 10]) # mean: 11.0
Suppose I have a numpy array img, with img.shape == (468,832,3). What does img[::2, ::2] do? It reduces the shape to (234,416,3). Can you please explain the logic?
Let's read documentation together (Source).
The basic slice syntax is i:j:k where i is the starting index, j is the stopping index, and k is the step (k ≠ 0). This selects the m elements (in the corresponding dimension) with index values i, i + k, ..., i + (m - 1) k where m = q + (r ≠ 0) and q and r are the quotient and remainder obtained by dividing j - i by k: j - i = q k + r, so that i + (m - 1) k < j.
...
Assume n is the number of elements in the dimension being sliced.
Then, if i is not given it defaults to 0 for k > 0 and n - 1 for k < 0. If j is not given it defaults to n for k > 0 and -n-1 for k < 0. If k is not given it defaults to 1. Note that :: is the same as : and means select all indices along this axis.
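The element count m from the quoted formula can be computed directly. This is only a sketch for the simple case of a positive step with 0 <= i <= j; slice_length is a made-up helper name, not a NumPy function:

```python
def slice_length(i, j, k):
    """Number of selected indices for i:j:k with k > 0 and 0 <= i <= j."""
    q, r = divmod(j - i, k)      # j - i = q*k + r
    return q + (r != 0)          # m = q + (r != 0)

print(slice_length(0, 468, 2))   # 234
print(slice_length(0, 832, 2))   # 416
print(slice_length(0, 36, 7))    # 6
print(slice_length(0, 3, 2))     # 2
```

The last call shows that an odd length is rounded up: slicing 3 elements with a step of 2 keeps 2 of them.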
Now looking at your part.
[::2, ::2] will be translated to [0:468:2, 0:832:2] because you do not specify the first two values, i and j in the documentation. (You only specify k here. Recall the i:j:k notation above.) You select elements on these axes with a step size of 2, which means you select every other element along the axes specified.
Because you did not specify anything for the 3rd dimension, everything along it will be selected.
It takes every alternate row, and then every alternate column, from an array. For axes of length n and m it returns an array of shape (ceil(n / 2), ceil(m / 2), ...), which is (n // 2, m // 2, ...) when the lengths are even.
Here's an example of slicing with a 2D array -
>>> a = np.arange(16).reshape(4, 4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> a[::2, ::2]
array([[ 0, 2],
[ 8, 10]])
And, here's another example with a 3D array -
>>> a = np.arange(27).reshape(3, 3, 3)
>>> a
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
>>> a[::2, ::2] # same as a[::2, ::2, :]
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[18, 19, 20],
[24, 25, 26]]])
Well, we have the RGB image as a 3D array of shape:
img.shape=(468,832,3)
Now, what does img[::2, ::2] do?
We're just downsampling the image, i.e. halving its height and width by taking only every other pixel from the original image. We do this by using a step size of 2, which means skipping one pixel each time. This should be clear from the example below.
Let's take a simple grayscale image for easier understanding.
In [13]: arr
Out[13]:
array([[10, 11, 12, 13, 14, 15],
[20, 21, 22, 23, 24, 25],
[30, 31, 32, 33, 34, 35],
[40, 41, 42, 43, 44, 45],
[50, 51, 52, 53, 54, 55],
[60, 61, 62, 63, 64, 65]])
In [14]: arr.shape
Out[14]: (6, 6)
In [15]: arr[::2, ::2]
Out[15]:
array([[10, 12, 14],
[30, 32, 34],
[50, 52, 54]])
In [16]: arr[::2, ::2].shape
Out[16]: (3, 3)
Notice which pixels are in the sliced version. Also, observe how the array shape changes after slicing (i.e. each axis is reduced by half).
Now, this downsampling happens for all three channels in the image since there's no slicing happening along the third axis. Thus, the shape is reduced only along the first two axes in your example.
(468, 832, 3)
. . |
. . |
(234, 416, 3)
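The same shapes can be reproduced without a real image. The sketch below uses an all-zero placeholder array of the size given in the question (the names img and small are illustrative):

```python
import numpy as np

img = np.zeros((468, 832, 3), dtype=np.uint8)   # placeholder image data

small = img[::2, ::2]          # every other row and column, all channels
print(small.shape)             # (234, 416, 3)

# The third axis is untouched, so this is the same selection
print(img[::2, ::2, :].shape)  # (234, 416, 3)

# Basic slicing returns a view, not a copy
print(np.shares_memory(img, small))   # True
```

Because the result is a view, call .copy() on it if the downsampled image should be modified independently of the original.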
I have a numpy array of numbers, for example,
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
I would like to find all the indexes of the elements within a specific range. For instance, if the range is (6, 10), the answer should be (3, 4, 5). Is there a built-in function to do this?
You can use np.where to get indices and np.logical_and to set two conditions:
import numpy as np
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.where(np.logical_and(a>=6, a<=10))
# returns (array([3, 4, 5]),)
As in @deinonychusaur's reply, but even more compact:
In [7]: np.where((a >= 6) & (a <=10))
Out[7]: (array([3, 4, 5]),)
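Two details in the compact form are worth spelling out: the parentheses around each comparison are required, and np.where returns a tuple with one index array per dimension. A short sketch (np.flatnonzero is shown as an equivalent for the 1-D case):

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

mask = (a >= 6) & (a <= 10)    # parentheses are required: & binds tighter than >=
idx = np.where(mask)[0]        # np.where returns a tuple; [0] is the index array

print(mask.astype(int))        # [0 0 0 1 1 1 0 0 0]
print(idx)                     # [3 4 5]
print(np.flatnonzero(mask))    # [3 4 5]
```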
Summary of the answers
To understand which answer is best, we can time the different solutions.
Unfortunately, the question was not well posed, so there are answers to different questions; here I try to point the answers at the same question. Given the array:
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
The answer should be the indexes of the elements within a certain range, which we assume to be inclusive, in this case from 6 to 10.
answer = (3, 4, 5)
Corresponding to the values 6,9,10.
To find the best answer we can use this code.
import timeit
setup = """
import numpy as np
import numexpr as ne
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
# or test it with an array of a similar size
# a = np.random.rand(100)*23 # change the number to an estimate of your array size.
# we define the left and right limit
ll = 6
rl = 10
def sorted_slice(a,l,r):
    start = np.searchsorted(a, l, 'left')
    end = np.searchsorted(a, r, 'right')
    return np.arange(start,end)
"""
functions = ['sorted_slice(a,ll,rl)', # works only for sorted values
'np.where(np.logical_and(a>=ll, a<=rl))[0]',
'np.where((a >= ll) & (a <=rl))[0]',
'np.where((a>=ll)*(a<=rl))[0]',
'np.where(np.vectorize(lambda x: ll <= x <= rl)(a))[0]',
'np.argwhere((a>=ll) & (a<=rl)).T[0]', # we transpose to get a single row
'np.where(ne.evaluate("(ll <= a) & (a <= rl)"))[0]',]
functions2 = [
'a[np.logical_and(a>=ll, a<=rl)]',
'a[(a>=ll) & (a<=rl)]',
'a[(a>=ll)*(a<=rl)]',
'a[np.vectorize(lambda x: ll <= x <= rl)(a)]',
'a[ne.evaluate("(ll <= a) & (a <= rl)")]',
]
rdict = {}
for i in functions:
    rdict[i] = timeit.timeit(i,setup=setup,number=1000)
    print("%s -> %s s" %(i,rdict[i]))
print("Sorted:")
for w in sorted(rdict, key=rdict.get):
    print(w, rdict[w])
Results
The results are reported in the following plot for a small array (the fastest solution is at the top). As noted by @EZLearner, they may vary depending on the size of the array. sorted_slice could be faster for larger arrays, but it requires your array to be sorted; for arrays with over 10 M entries, ne.evaluate could be an option. It is hence always better to run this test with an array of the same size as yours:
If you want to extract the values instead of the indexes, you can run the tests using functions2, but the results are almost the same.
I thought I would add this because the a in the example you gave is sorted:
import numpy as np
a = [1, 3, 5, 6, 9, 10, 14, 15, 56]
start = np.searchsorted(a, 6, 'left')
end = np.searchsorted(a, 10, 'right')
rng = np.arange(start, end)
rng
# array([3, 4, 5])
a = np.array([1,2,3,4,5,6,7,8,9])
b = a[(a>2) & (a<8)]
Another way is with:
np.vectorize(lambda x: 6 <= x <= 10)(a)
which returns:
array([False, False, False, True, True, True, False, False, False])
It is sometimes useful for masking time series, vectors, etc.
This code snippet returns all the numbers in a numpy array between two values:
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56] )
a[(a>6)*(a<10)]
It works as follows:
(a>6) returns a numpy array with True (1) and False (0), and so does (a<10). By multiplying these two together you get an array with either a True, if both statements are True (because 1x1 = 1), or a False otherwise (because 0x0 = 0 and 1x0 = 0).
The part a[...] returns all values of array a where the array between brackets returns a True statement.
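As a quick check of this reasoning, multiplying two boolean arrays gives the same mask as combining them with &. The sketch below uses the strict bounds from the snippet above (the names by_product and by_and are illustrative):

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

by_product = (a > 6) * (a < 10)    # element-wise product of two boolean arrays
by_and = (a > 6) & (a < 10)        # element-wise logical AND

print(by_product.dtype)                      # bool
print(np.array_equal(by_product, by_and))    # True
print(a[by_product])                         # [9]
```

With strict inequalities only 9 is selected; use >= and <= to include the bounds 6 and 10.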
Of course you can make this more complicated by adding, for instance,
...*~(a<10)
which acts like an "and not" condition.
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.argwhere((a>=6) & (a<=10))
Wanted to add numexpr into the mix:
import numpy as np
import numexpr as ne
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.where(ne.evaluate("(6 <= a) & (a <= 10)"))[0]
# array([3, 4, 5], dtype=int64)
This only makes sense for larger arrays with millions of elements, or if you are hitting memory limits.
This may not be the prettiest, but works for any dimension
a = np.array([[-1,2], [1,5], [6,7], [5,2], [3,4], [0, 0], [-1,-1]])
ranges = (0,4), (0,4)
def conditionRange(X : np.ndarray, ranges : list) -> np.ndarray:
    idx = None  # None means "no column checked yet"
    for column, r in enumerate(ranges):
        tmp = np.where(np.logical_and(X[:, column] >= r[0], X[:, column] <= r[1]))[0]
        # keep only the rows that also passed all previous columns
        idx = set(tmp) if idx is None else idx & set(tmp)
    idx = np.array(sorted(idx), dtype=int)
    return X[idx, :]
b = conditionRange(a, ranges)
print(b)
s=[52, 33, 70, 39, 57, 59, 7, 2, 46, 69, 11, 74, 58, 60, 63, 43, 75, 92, 65, 19, 1, 79, 22, 38, 26, 3, 66, 88, 9, 15, 28, 44, 67, 87, 21, 49, 85, 32, 89, 77, 47, 93, 35, 12, 73, 76, 50, 45, 5, 29, 97, 94, 95, 56, 48, 71, 54, 55, 51, 23, 84, 80, 62, 30, 13, 34]
dic={}
for i in range(0,len(s),10):
    dic[i,i+10]=list(filter(lambda x:((x>=i)&(x<i+10)),s))
print(dic)
for keys,values in dic.items():
    print(keys)
    print(values)
Output:
(0, 10)
[7, 2, 1, 3, 9, 5]
(20, 30)
[22, 26, 28, 21, 29, 23]
(30, 40)
[33, 39, 38, 32, 35, 30, 34]
(10, 20)
[11, 19, 15, 12, 13]
(40, 50)
[46, 43, 44, 49, 47, 45, 48]
(60, 70)
[69, 60, 63, 65, 66, 67, 62]
(50, 60)
[52, 57, 59, 58, 50, 56, 54, 55, 51]
You can use np.clip() to limit the values to the range:
a = [1, 3, 5, 6, 9, 10, 14, 15, 56]
np.clip(a,6,10)
However, it does not filter anything out: values less than 6 are replaced by 6 and values greater than 10 are replaced by 10, so the result has the same length as the input.
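The difference between clipping and selecting is easiest to see side by side. A minimal sketch on the same array (the names clipped and selected are illustrative):

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

clipped = np.clip(a, 6, 10)            # same length, out-of-range values replaced
selected = a[(a >= 6) & (a <= 10)]     # only the values inside the range

print(clipped)    # [ 6  6  6  6  9 10 10 10 10]
print(selected)   # [ 6  9 10]
```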