I've got a list of sorted samples. They're sorted by their sample time, where each sample is taken one second after the previous one.
I'd like to find the minimum value in a neighborhood of a specified size.
For example, given a neighborhood size of 2 and the following samples:
samples = [ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ]
I'd expect the following output: [5, 2, 5, 2]
What would be the best way to achieve this in numpy / scipy?
Edited: Explained the reasoning behind the min values:
5 - the 2-number window next to it is [12.3 12.3]. 5 is smaller
2 - to the left [12.3, 7], to the right [6 9]. 2 is the min
5 - to the left [9 10], to the right [9 17]. 5 is the min
Notice that 9 isn't a min, as there's a 2-number window to its left and right with a smaller value (2)
Use scipy's argrelextrema:
>>> import numpy as np
>>> from scipy.signal import argrelextrema
>>> data = np.array([ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ])
>>> radius = 2 # number of elements to the left and right to compare to
>>> argrelextrema(data, np.less, order=radius)
(array([4, 8]),)
This indicates that the numbers at positions 4 and 8 (2 and 5) are the smallest within a neighbourhood of size 2. The numbers at the boundaries (5 and 2) are not detected, since argrelextrema only supports clip or wrap boundary conditions. Judging by your question, I guess you are interested in them too. To detect them, first add reflect boundary conditions by padding the array:
>>> new_data = np.pad(data, radius, mode='reflect')
>>> new_data
array([ 12.3, 12.3, 5. , 12.3, 12.3, 7. , 2. , 6. , 9. ,
10. , 5. , 9. , 17. , 2. , 17. , 9. ])
With the boundary conditions applied to the data, we can now run the previous extrema detector:
>>> arg_minimas = argrelextrema(new_data, np.less, order=radius)[0] - radius
>>> arg_minimas
array([ 0, 4, 8, 11])
This returns the positions where the local extrema (minima in this case, since we pass np.less) occur in a sliding window of radius=2.
NOTE: the -radius undoes the +radius index shift introduced by padding the array with reflect boundary conditions via np.pad.
EDIT: if you are interested in the values rather than the positions, it is straightforward:
>>> data[arg_minimas]
array([ 5., 2., 5., 2.])
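For readers without SciPy at hand, the same strict comparison can be sketched in plain NumPy. This is only an illustration, assuming NumPy 1.20 or later for sliding_window_view; the helper name local_minima_positions is made up for this example and is not part of any library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_minima_positions(data, radius):
    """Indices whose value is strictly below every neighbour within `radius`."""
    data = np.asarray(data, dtype=float)
    # Reflect-pad so the first and last samples get full-width windows
    padded = np.pad(data, radius, mode="reflect")
    # One row per original sample, each row holding 2*radius + 1 values
    windows = sliding_window_view(padded, 2 * radius + 1)
    centre = windows[:, radius]
    neighbours = np.delete(windows, radius, axis=1)
    return np.flatnonzero((centre[:, None] < neighbours).all(axis=1))


data = [5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2]
positions = local_minima_positions(data, radius=2)
print(positions)                    # [ 0  4  8 11]
print(np.asarray(data)[positions])
```

Like np.less in argrelextrema, the strict comparison means a plateau of equal values is never reported as a minimum.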
It seems you are basically finding local minima in a sliding window, where the window slides such that the end of the previous window acts as the start of the next one. For this specific problem, here is a vectorized approach that uses broadcasting:
import numpy as np
# Inputs
N = 2
samples = [ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ]
# Convert input list to a numpy array
S = np.asarray(samples)
# Calculate the number of Infs to be appended at the end
append_endlen = int(2*N*np.ceil((S.size+1)/(2*N))-1 - S.size)
# Append Infs at the start and end of the input array
S1 = np.concatenate((np.repeat(np.inf,N),S,np.repeat(np.inf,append_endlen)),0)
# Number of sliding windows
num_windows = int((S1.size-1)/(2*N))
# Get windowed values from input array into rows.
# Thus, get minimum from each row to get the desired local minimum.
indexed_vals = S1[np.arange(num_windows)[:,None]*2*N + np.arange(2*N+1)]
out = indexed_vals.min(1)
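The broadcasting step may be easier to follow by looking at the index matrix on its own. The sketch below assumes the values from the example (N = 2 and four windows) and only prints the indices that the indexing expression above would use; the name idx is introduced here for illustration.

```python
import numpy as np

N = 2            # window radius, as in the answer above
num_windows = 4  # value obtained for the 12-sample input

# Column of window start offsets plus a row of in-window offsets
idx = np.arange(num_windows)[:, None] * 2 * N + np.arange(2 * N + 1)
print(idx)
# [[ 0  1  2  3  4]
#  [ 4  5  6  7  8]
#  [ 8  9 10 11 12]
#  [12 13 14 15 16]]
```

Each row ends on the index where the next row starts, which is the overlap between consecutive windows described above.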
Sample runs
Run # 1: Original input data
In [105]: S # Input array
Out[105]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. ])
In [106]: N # Window radius
Out[106]: 2
In [107]: out # Output array
Out[107]: array([ 5., 2., 5., 2.])
Run # 2: Modified input data, Window radius = 2
In [101]: S # Input array
Out[101]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. , 0. , -3. , 7. , 99. , 1. , 0. ,
-4. , -2. ])
In [102]: N # Window radius
Out[102]: 2
In [103]: out # Output array
Out[103]: array([ 5., 2., 5., -3., -4., -4.])
Run # 3: Modified input data, Window radius = 3
In [97]: S # Input array
Out[97]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. , 0. , -3. , 7. , 99. , 1. , 0. ,
-4. , -2. ])
In [98]: N # Window radius
Out[98]: 3
In [99]: out # Output array
Out[99]: array([ 5., 2., -3., -4.])
>>> import numpy as np
>>> a = np.array(samples)
>>> [float(a[max(i-2,0):i+2].min()) for i in range(1, a.size)]
[5.0, 5.0, 2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 2.0, 2.0]
As Divakar pointed out in the comments, this is what a sliding window yields. If you want to remove duplicates, that can be done separately
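Removing the consecutive duplicates could look something like the sketch below. The input array is a made-up run of sliding-window minima, and the names window_mins, keep and deduped are assumptions for this example.

```python
import numpy as np

# Hypothetical sliding-window minima containing runs of repeated values
window_mins = np.array([5.0, 5.0, 2.0, 2.0, 5.0, 5.0, 5.0, 2.0])

# Keep the first element, then every element that differs from its predecessor
keep = np.concatenate(([True], window_mins[1:] != window_mins[:-1]))
deduped = window_mins[keep]
print(deduped)   # [5. 2. 5. 2.]
```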
This will look through each window, find the minimum value, and add it to a list if the window's minimum value isn't equal to the most recently added value.
samples = [5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2]
neighborhood = 2
minima = []
for i in range(len(samples)):
    window = samples[max(0, i - neighborhood):i + neighborhood + 1]
    windowMin = min(window)
    # Record the minimum only when it differs from the last recorded one
    if not minima or windowMin != minima[-1]:
        minima.append(windowMin)
This gives the output you described:
print(minima)
> [5, 2, 5, 2]
However, @imaluengo's answer is better, since it will include both of two consecutive equal minimum values if they have different indices in the original list!
I have this array (x,y,f(x,y)):
a=np.array([[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]])
I want to remove the duplicates with same x,y. In my array I have (4,5,6) and (4,5,6.1) and I want to remove one of them (no criterion).
If I had 2 columns (x,y) I could use
np.unique(a[:,:2], axis = 0)
But my array has 3 columns and I don't see how to do this in a simple way.
I can do a loop but my arrays can be very large.
Is there a way to do this more efficiently?
If I understand correctly, you need this:
a[np.unique(a[:,:2],axis=0,return_index=True)[1]]
output:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]
Please be mindful that it does not keep the original order of rows in a. If you want to keep the order, simply sort the indices:
a[np.sort(np.unique(a[:,:2],axis=0,return_index=True)[1])]
output:
[[ 1. 5. 3.]
[ 4. 5. 6.]
[ 1. 3. 42.]]
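To see why this works, it can help to look at the index array that return_index produces. The following is a small self-contained sketch using the array from the question; the names unique_xy, first_idx and deduped are introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 5, 3],
              [4, 5, 6],
              [4, 5, 6.1],
              [1, 3, 42]])

# first_idx[k] is the first row of `a` holding the k-th unique (x, y) pair
unique_xy, first_idx = np.unique(a[:, :2], axis=0, return_index=True)
print(first_idx)                 # [3 0 1]
deduped = a[np.sort(first_idx)]  # sorting restores the original row order
```

Because return_index reports first occurrences, the row (4, 5, 6) is kept and (4, 5, 6.1) is dropped.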
I think you want to do this?
np.rint rounds each element to the nearest integer, so 6.1 becomes 6 and the two (4, 5) rows become identical before np.unique is applied:
import numpy as np
a = np.array([
[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]
])
a = np.unique(np.rint(a), axis = 0)
print(a)
Result:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]
I'm having trouble understanding something. I read the following:
class MGridClass(nd_grid):
"""
`nd_grid` instance which returns a dense multi-dimensional "meshgrid".
An instance of `numpy.lib.index_tricks.nd_grid` which returns an dense
(or fleshed out) mesh-grid when indexed, so that each returned argument
has the same shape. The dimensions and number of the output arrays are
equal to the number of indexing dimensions. If the step length is not a
complex number, then the stop is not inclusive.
However, if the step length is a **complex number** (e.g. 5j), then
the integer part of its magnitude is interpreted as specifying the
number of points to create between the start and stop values, where
the stop value **is inclusive**.
So if I give real numbers, each range is divided in steps of the given size, with the stop excluded:
>>> numpy.mgrid[0:4:1, 10:15:2]
array([[[ 0, 0, 0],
[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3]],
[[10, 12, 14],
[10, 12, 14],
[10, 12, 14],
[10, 12, 14]]])
And with complex numbers (an integer with the j suffix, which Python uses instead of i for technical reasons), it's the number of resulting values along the corresponding axis:
>>> numpy.mgrid[0:4:3j, 10:15:5j]
array([[[ 0. , 0. , 0. , 0. , 0. ],
[ 2. , 2. , 2. , 2. , 2. ],
[ 4. , 4. , 4. , 4. , 4. ]],
[[10. , 11.25, 12.5 , 13.75, 15. ],
[10. , 11.25, 12.5 , 13.75, 15. ],
[10. , 11.25, 12.5 , 13.75, 15. ]]])
But what's special about complex numbers that makes them appropriate for signalling this change of perspective, instead of a simple flag? Is this another piece of NumPy's fanciness?
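One way to see the two behaviours described in the docstring is to compare a one-dimensional mgrid with np.arange and np.linspace. This is a minimal sketch of the documented semantics; the variable names are chosen for this example.

```python
import numpy as np

real_step = np.mgrid[0:4:1]      # step of 1, stop excluded
complex_step = np.mgrid[0:4:3j]  # 3 points, stop included

print(np.array_equal(real_step, np.arange(0, 4, 1)))     # True
print(np.allclose(complex_step, np.linspace(0, 4, 3)))   # True
```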
I need to change all nans of a matrix to a different value. I can easily get the nan positions using argwhere, but then I am not sure how to access those positions programmatically. Here is my nonworking code:
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
for pos in nanPositions:
    myMatrix[pos] = maxVal
The problem is that myMatrix[pos] does not accept pos as an array.
The more-efficient way of generating your output has already been covered by sacul. However, you're incorrectly indexing your 2D matrix in the case where you want to use an array.
At least to me, it's a bit unintuitive, but you need to use:
myMatrix[[all_row_indices], [all_column_indices]]
The following will give you what you expect:
import numpy as np
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
print(myMatrix[nanPositions[:, 0], nanPositions[:, 1]])
You can see more about advanced indexing in the documentation
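The same pair of index arrays can also be used on the left-hand side of an assignment, which is what the question was after. A minimal sketch, reusing the names from the question (rows and cols are introduced here):

```python
import numpy as np

myMatrix = np.array([[3.2, 2, np.nan, 3], [3, 1, 2, np.nan], [3, 3, 3, 3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(np.abs(myMatrix))

# Split the (n, 2) position array into row indices and column indices
rows, cols = nanPositions[:, 0], nanPositions[:, 1]
myMatrix[rows, cols] = maxVal
print(myMatrix)
```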
In [54]: arr = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
...:
In [55]: arr
Out[55]:
array([[3.2, 2. , nan, 3. ],
[3. , 1. , 2. , nan],
[3. , 3. , 3. , 3. ]])
Location of the nan:
In [56]: np.where(np.isnan(arr))
Out[56]: (array([0, 1]), array([2, 3]))
In [57]: np.argwhere(np.isnan(arr))
Out[57]:
array([[0, 2],
[1, 3]])
where produces a tuple of arrays; argwhere produces the same values, but as a 2d array.
In [58]: arr[Out[56]]
Out[58]: array([nan, nan])
In [59]: arr[Out[56]] = [100,200]
In [60]: arr
Out[60]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3. , 1. , 2. , 200. ],
[ 3. , 3. , 3. , 3. ]])
The argwhere can be used to index individual items:
In [72]: for ij in Out[57]:
...: print(arr[tuple(ij)])
100.0
200.0
The tuple() is needed here because np.array([1,3]) is interpreted as 2 element indexing on the first dimension.
Another way to get that indexing tuple is to use unpacking:
In [74]: [arr[i,j] for i,j in Out[57]]
Out[74]: [100.0, 200.0]
So while argwhere looks useful, it is trickier to use than plain where.
You could, as noted in the other answers, use boolean indexing (I've already modified arr so the isnan test no longer works):
In [75]: arr[arr>10]
Out[75]: array([100., 200.])
More on indexing with a list or array, and indexing with a tuple:
In [77]: arr[[0,0]] # two copies of row 0
Out[77]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [78]: arr[(0,0)] # one element
Out[78]: 3.2
In [79]: arr[np.array([0,0])] # same as list
Out[79]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [80]: arr[np.array([0,0]),:] # making the trailing : explicit
Out[80]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
You can do this instead (IIUC):
myMatrix[np.isnan(myMatrix)] = np.nanmax(abs(myMatrix))
I am a beginner in Python and am stuck on a problem. I have two lists of 60 floating point numbers; let's call them start and end. The numbers in both lists are not in increasing or decreasing order.
start = [ ]  # 60 floating point numbers
end = [ ]  # 60 floating point numbers
I would like to find 1000 interpolated values between start[0] and end[0] and repeat the process for all 60 pairs of values in the lists. How do I go about it?
You can do this with a list comprehension and numpy.linspace:
import numpy as np
[np.linspace(first, last, 1000) for first, last in zip(start, end)]
As a small example (with fewer values)
>>> start = [1, 5, 10]
>>> end = [2, 10, 20]
>>> [np.linspace(first, last, 5) for first, last in zip(start, end)]
[array([ 1. , 1.25, 1.5 , 1.75, 2. ]),
array([ 5. , 6.25, 7.5 , 8.75, 10. ]),
array([ 10. , 12.5, 15. , 17.5, 20. ])]
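If a single 2-D array is more convenient than a list of arrays, np.linspace can also take array endpoints directly. This sketch assumes NumPy 1.16 or later, where array-valued start/stop and the axis parameter are available; the name grid is introduced here.

```python
import numpy as np

start = np.array([1.0, 5.0, 10.0])
end = np.array([2.0, 10.0, 20.0])

# axis=-1 puts the interpolated samples along the last axis: shape (3, 5)
grid = np.linspace(start, end, 5, axis=-1)
print(grid[0])   # [1.   1.25 1.5  1.75 2.  ]
```

With 60 start/end pairs and 1000 points, the result would have shape (60, 1000).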
I have a 3x3 numpy array and I want to divide each column of it by the corresponding element of a 3x1 vector. I know how to divide each row by the elements of the vector, but I am unable to find a way to divide each column.
You can transpose your array to divide each column:
(arr_3x3.T/arr_3x1).T
Let's try several things:
In [347]: A=np.arange(9.).reshape(3,3)
In [348]: A
Out[348]:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
In [349]: x=10**np.arange(3).reshape(3,1)
In [350]: A/x
Out[350]:
array([[ 0. , 1. , 2. ],
[ 0.3 , 0.4 , 0.5 ],
[ 0.06, 0.07, 0.08]])
So this has divided each row by a different value
In [351]: A/x.T
Out[351]:
array([[ 0. , 0.1 , 0.02],
[ 3. , 0.4 , 0.05],
[ 6. , 0.7 , 0.08]])
And this has divided each column by a different value
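Dividing a (3,3) array by a (3,1) array broadcasts x across the columns, so each row is divided by its own value.
With the transpose, the (1,3) array is broadcast across the rows, so each column is divided by its own value.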
It's important that x be 2d when using .T (transpose). A (3,) array transposes to a (3,) array - that is, no change.
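When the vector is 1-D, adding an axis with None (np.newaxis) plays the role that the transpose plays for a 2-D vector. A minimal sketch, with v, by_column and by_row as names made up for this example:

```python
import numpy as np

A = np.arange(9.).reshape(3, 3)
v = np.array([1., 10., 100.])   # 1-D vector, shape (3,)

by_column = A / v            # v lines up with the last axis: column j / v[j]
by_row = A / v[:, None]      # v[:, None] has shape (3, 1): row i / v[i]
```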
The simplest seems to be
A = np.arange(1,10).reshape(3,3)
b=np.arange(1,4)
A/b
A will be
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and b will be
array([1, 2, 3])
and the division will produce
array([[1. , 1. , 1. ],
[4. , 2.5, 2. ],
[7. , 4. , 3. ]])
The first column is divided by 1, the second column by 2, and the third by 3.
If I've mistaken your columns for rows, simply transpose with .T, as C_Z_ answered above.