subsampling every nth entry in a numpy array - python

I am a beginner with numpy, and I am trying to extract some data from a long numpy array. What I need to do is start from a defined position in my array, and then subsample every nth data point from that position, until the end of my array.
Basically, if I had
a = [1,2,3,4,1,2,3,4,1,2,3,4....]
I want to subsample this to start at a[1] and then sample every fourth point from there, to produce something like
b = [2,2,2.....]

You can use numpy's slicing, simply start:stop:step.
>>> xs
array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
>>> xs[1::4]
array([2, 2, 2])
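As a small sketch of what the slice does, the same selection can be written with an explicit slice object, and the number of selected elements can be predicted in advance. The variable names `start`, `step`, `sub`, and `expected_len` are only illustrative.

```python
import numpy as np

xs = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
start, step = 1, 4  # illustrative values taken from the question

# xs[start::step] is shorthand for indexing with a slice object
sub = xs[slice(start, None, step)]

# Number of selected elements: ceil((len(xs) - start) / step),
# written with integer arithmetic only
expected_len = -(-(len(xs) - start) // step)

print(sub)           # [2 2 2]
print(expected_len)  # 3
```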
This creates a view of the original data, so it's constant time. The view also reflects changes to the original array and keeps the whole original array in memory:
>>> a
array([1, 2, 3, 4, 5])
>>> b = a[::2] # O(1), constant time
>>> b[:] = 0 # modifying the view changes original array
>>> a # original array is modified
array([0, 2, 0, 4, 0])
So if either of these is a problem, you can make a copy explicitly:
>>> a
array([1, 2, 3, 4, 5])
>>> b = a[::2].copy() # explicit copy, O(n)
>>> b[:] = 0 # modifying the copy
>>> a # original is intact
array([1, 2, 3, 4, 5])
This isn't constant time, but the result isn't tied to the original array. The copy is also contiguous in memory, which can make some operations on it faster.
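If you are unsure whether a given result is a view or a copy, you can check it directly. This is a minimal sketch using `np.shares_memory`, the `.base` attribute, and the contiguity flags; the names `view` and `copy` are just for illustration.

```python
import numpy as np

a = np.arange(1, 6)   # array([1, 2, 3, 4, 5])
view = a[::2]         # basic slicing: a view onto a's buffer
copy = a[::2].copy()  # explicit copy with its own buffer

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False
print(view.base is a)              # True
print(copy.base is None)           # True

# The strided view skips elements, so it is not contiguous; the copy is
print(view.flags['C_CONTIGUOUS'])  # False
print(copy.flags['C_CONTIGUOUS'])  # True
```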

Complementary to behzad.nouri's answer:
If you want to control the number of final elements and keep it fixed at a predefined value (rather than fixing the step between subsamples), you can use numpy's linspace function followed by integer rounding.
For example, with num_elements=4:
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> choice = np.round(np.linspace(1, len(a)-1, num=4)).astype(int)
>>> a[choice]
array([ 2, 5, 7, 10])
Or, more generally, subsampling an array with fixed start and end points:
>>> import numpy as np
>>> np.round(np.linspace(0, len(a)-1, num=4)).astype(int)
array([0, 3, 6, 9])
>>> np.round(np.linspace(0, len(a)-1, num=15)).astype(int)
array([0, 1, 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9])
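The linspace approach can be wrapped in a small helper. This is only a sketch: `subsample_fixed` and its `start` parameter are hypothetical names, not part of numpy. Note that integer-array indexing returns a copy rather than a view, and that asking for more elements than are available repeats some indices, as in the `num=15` example.

```python
import numpy as np

def subsample_fixed(a, num, start=0):
    """Return `num` elements of `a`, spread evenly from index `start` to the last index."""
    a = np.asarray(a)
    # Evenly spaced float positions, rounded to the nearest valid integer index
    idx = np.round(np.linspace(start, len(a) - 1, num=num)).astype(int)
    return a[idx]

a = np.arange(1, 11)  # array([ 1,  2, ..., 10])
print(subsample_fixed(a, 4))           # [ 1  4  7 10]
print(subsample_fixed(a, 4, start=1))  # [ 2  5  7 10]
```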

Related

Numpy Array: Slice several values at every step

I am trying to extract several values at once from an array but I can't seem to find a way to do it in a one-liner in Numpy.
Simply put, considering an array:
a = numpy.arange(10)
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
I would like to be able to extract, say, 2 values, skip the next 2, extract the 2 following values etc. This would result in:
array([0, 1, 4, 5, 8, 9])
This is an example but I am ideally looking for a way to extract x values and skip y others.
I thought this could be done with slicing, doing something like:
a[:2:2]
but it only returns 0, which is the expected behavior.
I know I could obtain the expected result by combining several slicing operations (similarly to Numpy Array Slicing) but I was wondering if I was not missing some numpy feature.
If you want to avoid creating copies and allocating new memory, you could use a sliding_window_view of two elements:
win = np.lib.stride_tricks.sliding_window_view(a, 2)
array([[0, 1],
[1, 2],
[2, 3],
[3, 4],
[4, 5],
[5, 6],
[6, 7],
[7, 8],
[8, 9]])
And then only take every 4th window (the windows themselves are views, but ravel() on the strided result does allocate a new array):
win[::4].ravel()
array([0, 1, 4, 5, 8, 9])
Or directly go with the more dangerous as_strided, but heed the warnings in the documentation. Strides are given in bytes, so the values below assume 8-byte (e.g. int64) elements:
np.lib.stride_tricks.as_strided(a, shape=(3,2), strides=(32,8))
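To avoid hard-coding byte counts, the shape and strides can be derived from the array itself. The following is a sketch under the assumption that `a` is 1D and has at least `x` elements; `n_blocks`, `step`, and `blocks` are illustrative names. Unlike a boolean mask, this only yields complete blocks of `x` elements, so a trailing partial block is dropped.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
x, y = 2, 2  # keep x elements, then skip y

# Number of complete blocks of x elements that fit when stepping by x + y
n_blocks = (len(a) - x) // (x + y) + 1
step = a.strides[0]  # bytes between consecutive elements of a

# writeable=False guards against accidental writes through the strided view
blocks = as_strided(a, shape=(n_blocks, x),
                    strides=((x + y) * step, step),
                    writeable=False)
out = blocks.ravel()  # ravel() copies here, since the blocks are not contiguous
print(out)            # [0 1 4 5 8 9]
```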
You can use a modulo operator:
x = 2 # keep
y = 2 # skip
out = a[np.arange(a.shape[0])%(x+y)<x]
Output: array([0, 1, 4, 5, 8, 9])
Output with x = 2 ; y = 3:
array([0, 1, 5, 6])
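The modulo mask generalizes easily to a reusable function. In this sketch, `keep_skip` and its `offset` parameter are hypothetical additions for illustration; the mask keeps a trailing partial block if the array ends in the middle of a "keep" run.

```python
import numpy as np

def keep_skip(a, x, y, offset=0):
    """Starting at index `offset`, keep x elements, skip y, and repeat."""
    a = np.asarray(a)
    pos = np.arange(a.shape[0])
    # Position inside each (x + y)-long cycle; the first x positions are kept
    mask = (pos >= offset) & ((pos - offset) % (x + y) < x)
    return a[mask]

a = np.arange(10)
print(keep_skip(a, 2, 2))            # [0 1 4 5 8 9]
print(keep_skip(a, 2, 3))            # [0 1 5 6]
print(keep_skip(a, 2, 2, offset=1))  # [1 2 5 6 9]
```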

How to reverse a numpy array and then also switch each 'pair' of positions?

For example, how would you do this sequence of operations on a np 1D array, x:
[1,2,3,4,5,6,7,8]
[8,7,6,5,4,3,2,1]
[7,8,5,6,3,4,1,2]
The transition from state 1 to state 2 can be done with numpy.flip(x):
x = numpy.flip(x)
How can you go from this intermediate state to the final state, in which the two elements of each 'pair' switch positions?
Notes: this is a variable-length array, and it will always be 1D.
Assuming that the length is always even, you only need to reshape, reverse and flatten:
>>> ar = np.arange(1, 9)
>>> ar.reshape(-1, 2)[::-1].ravel()
array([7, 8, 5, 6, 3, 4, 1, 2])
This always creates a copy: after the reversal the elements are no longer contiguous in memory, and ndarray.ravel() must return a contiguous array.
If it is necessary to transition from state 2 to state 3:
>>> ar = ar[::-1]
>>> ar # state 2
array([8, 7, 6, 5, 4, 3, 2, 1])
>>> ar.reshape(-1, 2)[:, ::-1].ravel()
array([7, 8, 5, 6, 3, 4, 1, 2])
This should work (assuming you have an even number of elements; otherwise you might want to check this first):
x = x.reshape((len(x)//2, 2)) # split into two columns
x[:,0], x[:,1] = x[:,1], x[:,0].copy() # switch the columns
x = x.reshape(2*len(x)) # reshape back into a 1D array
You can do:
import numpy as np
arr = np.array([8,7,6,5,4,3,2,1])
result = np.vstack((arr[1::2], arr[::2])).T.flatten()
output:
array([7, 8, 5, 6, 3, 4, 1, 2])
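The two reshape-based routes above can be sketched as small functions that also check the even-length assumption. The names `reverse_pairs` and `swap_within_pairs` are made up for this example.

```python
import numpy as np

def _check_even_1d(x):
    x = np.asarray(x)
    if x.ndim != 1 or x.size % 2:
        raise ValueError("expected a 1D array with an even number of elements")
    return x

def reverse_pairs(x):
    """State 1 -> state 3: reverse the order of the pairs, keeping each pair intact."""
    return _check_even_1d(x).reshape(-1, 2)[::-1].ravel()

def swap_within_pairs(x):
    """State 2 -> state 3: swap the two elements inside every pair."""
    return _check_even_1d(x).reshape(-1, 2)[:, ::-1].ravel()

x = np.arange(1, 9)
print(reverse_pairs(x))               # [7 8 5 6 3 4 1 2]
print(swap_within_pairs(np.flip(x)))  # [7 8 5 6 3 4 1 2]
```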

Finding differences between all values in a List

I want to find the differences between all values in a numpy array and append it to a new list.
Example: a = [1,4,2,6]
result : newlist= [3,1,5,3,2,2,1,2,4,5,2,4]
i.e., for each value i of a, determine the difference between it and each of the other values in the list.
So far I have been unable to find a solution.
You can do this:
a = [1,4,2,6]
newlist = [abs(i-j) for i in a for j in a if i != j]
Output:
print(newlist)
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
I believe what you are trying to do is to calculate the absolute differences between elements of the input list, excluding the self-differences. With that idea, this is one vectorized (array programming) approach:
import numpy as np
# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
[3, 0, 2, 2],
[1, 2, 0, 4],
[5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
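As an alternative sketch of the same idea, the diagonal can be dropped with a boolean mask instead of `np.delete`. The names `diffs` and `off_diag` are illustrative. One point to be aware of: this excludes pairs by position, whereas a filter such as `if i != j` on the values themselves would also drop differences between distinct elements that happen to be equal.

```python
import numpy as np

a = [1, 4, 2, 6]
arr = np.asarray(a)

# Pairwise absolute differences via broadcasting: shape (N, N)
diffs = np.abs(arr[:, None] - arr)

# True everywhere except on the diagonal (the self-differences)
off_diag = ~np.eye(arr.size, dtype=bool)

# Boolean indexing returns the selected entries in row-major order
out = diffs[off_diag]
print(out)  # [3 1 5 3 2 2 1 2 4 5 2 4]
```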

How to flatten a numpy array of dtype object

I'm taking ndarray slices with different length and I want my result to be flat.
For example:
a = np.array(((np.array((1,2)), np.array((1,2,3))), (np.array((1,2)), np.array((1,2,3,4,5,6,7,8)))))
Is there a straightforward way to make this array flat using numpy functionality (without a loop)?
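You could try flattening it and then using hstack, which stacks the arrays in sequence horizontally. (Recent NumPy versions refuse to build a ragged array like this implicitly, so dtype=object is passed explicitly here.)
>>> a = np.array(((np.array((1,2)), np.array((1,2,3))), (np.array((1,2)), np.array((1,2,3,4,5,6,7,8)))), dtype=object)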
>>> np.hstack(a.flatten())
array([1, 2, 1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 6, 7, 8])
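For a version that does not depend on how numpy interprets nested ragged input, the object array can be filled element by element and then concatenated. This is a sketch with illustrative names (`flat`); `np.concatenate` is used here in place of `hstack`, which gives the same result for 1D pieces.

```python
import numpy as np

# Build a 2x2 object array whose cells hold 1D arrays of different lengths
a = np.empty((2, 2), dtype=object)
a[0, 0] = np.array([1, 2])
a[0, 1] = np.array([1, 2, 3])
a[1, 0] = np.array([1, 2])
a[1, 1] = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# ravel() walks the cells in row-major order; concatenate joins the pieces
flat = np.concatenate(list(a.ravel()))
print(flat)  # [1 2 1 2 3 1 2 1 2 3 4 5 6 7 8]
```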
