Numpy array getting only items and indexes in an ascending order - python

Suppose I have the following array:
arr = [1,2,4,5,6,5,4,3,2,3,4,5,6,7,8]
I want to get only the items which are in ascending order, and ignore the "reverse" in the middle.
So for this array I want to get:
res = [1,2,4,5,6,7,8]
at the indexes: [0,1,2,3,4,13,14]
Any idea?

I think you should approach this using the accumulated maximum value, i.e., the maximum value at each given step:
>>> arr
array([1, 2, 4, 5, 6, 5, 4, 3, 2, 3, 4, 5, 6, 7, 8])
>>> np.maximum.accumulate(arr)
array([1, 2, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 8])
You could do something like:
>>> arr[arr == np.maximum.accumulate(arr)]
array([1, 2, 4, 5, 6, 6, 7, 8])
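However, that doesn't deal with numbers that stay the same (you get that extra 6). To handle this, you could "roll" the accumulated maximum array and add the condition that the value isn't equal to that rolled array. Note that np.roll(m, -1) shifts the array left, so each value is compared with the next accumulated maximum; this keeps the last element of a run of equal maxima (here the 6 at index 12 rather than the one at index 4):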
>>> m = np.maximum.accumulate(arr)
>>> arr[(arr == m) & (arr != np.roll(m, -1))]
array([1, 2, 4, 5, 6, 7, 8])
But really, you want the unique values of the accumulated maximum, so you could also just use it with np.unique:
>>> np.unique(np.maximum.accumulate(arr))
array([1, 2, 4, 5, 6, 7, 8])
Not sure which would be faster, but coming up with good testing data isn't straightforward. If you have a sizeable array, I'd be interested in which approach is faster with your data.
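The question also asks for the indexes, which the snippets above don't return. A minimal sketch, assuming "ascending" means each kept item is strictly greater than everything before it: np.unique with return_index=True gives the first position of each accumulated maximum, and a comparison against the previous running maximum gives the same positions without sorting. The names values, first_idx and mask are only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 4, 5, 6, 5, 4, 3, 2, 3, 4, 5, 6, 7, 8])
m = np.maximum.accumulate(arr)

# Option 1: unique values of the running maximum plus the index
# at which each value first appears.
values, first_idx = np.unique(m, return_index=True)

# Option 2: keep an element when it exceeds the running maximum of
# everything before it (the first element is always kept).
mask = np.r_[True, arr[1:] > m[:-1]]
idx_from_mask = np.flatnonzero(mask)

print(values)         # [1 2 4 5 6 7 8]
print(first_idx)      # [ 0  1  2  3  4 13 14]
print(idx_from_mask)  # [ 0  1  2  3  4 13 14]
```

Both options pick the first occurrence of a repeated maximum, so arr[first_idx] reproduces values.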

Related

Shift values in numpy array by differing amounts

I have an array a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9]) that continues like that. I would like to shift this array but I'm not sure if I can use np.roll() here.
The array I would like to produce is [0, 0, 0, 2, 2, 3, 15, 15, 7].
As you can see, the first run of equal numbers in array a (in this case the three '2's) should be replaced with '0's. Everything should then be shifted such that the '3's are replaced with '2's, the '15' is replaced with the '3' etc. Ideally I would like to do this operation without any for loop as I need it to run quickly.
I realise this operation may be a bit confusing so please ask questions.
If you want to stick with NumPy, you can achieve this using np.unique by returning the counts per unique elements with the return_counts option.
Then, simply roll the values and construct a new array with np.repeat:
>>> s, i, c = np.unique(a, return_index=True, return_counts=True)
(array([ 2, 3, 7, 9, 15]), array([0, 3, 6, 8, 5]), array([3, 2, 2, 1, 1]))
The three outputs are, respectively: the sorted unique elements, the index of each element's first occurrence, and the count of each unique element.
np.unique sorts the values, so we need to unsort the values as well as the counts first. We can then shift the values with np.roll:
>>> idx = np.argsort(i)
>>> v = np.roll(s[idx], 1)
>>> v[0] = 0
array([ 0, 2, 3, 15, 7])
Alternatively, use np.append, though this requires a whole copy:
>>> v = np.append([0], s[idx][:-1])
array([ 0, 2, 3, 15, 7])
Finally reassemble:
>>> np.repeat(v, c[idx])
array([ 0, 0, 0, 2, 2, 3, 15, 15, 7])
Another, more general, solution also works when there are recurring values in a. It requires the use of np.diff.
You can get the indices of the elements with:
>>> i = np.diff(np.append(a, [0])).nonzero()[0] + 1
array([3, 5, 6, 8, 9])
>>> idx = np.append([0], i)
array([0, 3, 5, 6, 8, 9])
The shifted values are then taken from a copy of a with a 0 prepended, indexed with idx:
>>> v = np.append([0], a)[idx]
array([ 0, 2, 3, 15, 7, 9])
And the counts per element with:
>>> c = np.append(np.diff(i, prepend=0), [0])
array([3, 2, 1, 2, 1, 0])
Finally, reassemble:
>>> np.repeat(v, c)
array([ 0, 0, 0, 2, 2, 3, 15, 15, 7])
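The run-based steps above can be wrapped in one function. This is only a sketch under stated assumptions: shift_runs and its fill parameter are hypothetical names, the input is a non-empty 1-D array, and run boundaries are found by comparing neighbours directly instead of appending a 0 before np.diff (which would miss the final boundary if the array ended in 0).

```python
import numpy as np

def shift_runs(a, fill=0):
    """Replace each run of equal values with the value of the previous run."""
    a = np.asarray(a)
    # A run starts at index 0 and wherever a value differs from its left neighbour.
    starts = np.flatnonzero(np.r_[True, a[1:] != a[:-1]])
    # Run lengths are the gaps between consecutive starts (and the array end).
    lengths = np.diff(np.r_[starts, a.size])
    # Shift the run values right by one, filling the first run with `fill`.
    values = np.r_[fill, a[starts][:-1]]
    return np.repeat(values, lengths)

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
print(shift_runs(a))  # [ 0  0  0  2  2  3 15 15  7]
```

Because runs are detected by position rather than by value, a value that comes back later (for example [1, 1, 2, 1]) is treated as a separate run.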
This is not using numpy, but one approach that comes to mind is to use itertools.groupby to collect contiguous runs of the same elements. Then shift all the elements (by prepending a 0) and use the counts to repeat them.
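from itertools import chain, groupby

def shift(data):
    # (value, run length) for each contiguous run of equal elements
    values = [(k, len(list(g))) for k, g in groupby(data)]
    # Prepend 0 so every run takes the value of the run before it
    keys = [0] + [i[0] for i in values]
    reps = [i[1] for i in values]
    # zip stops at the shorter list, which drops the last key
    return list(chain.from_iterable([[k]*rep for k, rep in zip(keys, reps)]))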
For example
>>> a = np.array([2,2,2,3,3,15,7,7,9])
>>> shift(a)
[0, 0, 0, 2, 2, 3, 15, 15, 7]
You can try this code:
import numpy as np
a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
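diff_a = np.diff(a)                      # jumps between neighbours
idx = np.flatnonzero(diff_a)             # positions where the value changes
val = diff_a[idx]
val = np.insert(val[:-1], 0, a[0])       # shift the jumps right by one; the first jump becomes a[0]
diff_a[idx] = val
res = np.append([0], np.cumsum(diff_a))  # rebuild the values, starting from 0
print(res)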
You can try this:
import numpy as np
a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
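z = a - np.pad(a, (1,0))[:-1]  # differences between neighbours, with a[0] as the first entry
z[m] = np.pad(z[(m := z!=0)], (1,0))[:-1]  # shift the nonzero jumps right by one (:= needs Python 3.8+)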
print(z.cumsum())
It gives:
[ 0 0 0 2 2 3 15 15 7]

Indexing in NumPy: Access every other group of values

The [::n] indexing option in numpy provides a very useful way to index every nth item in a list. However, is it possible to use this feature to extract multiple values, e.g. every other pair of values?
For example:
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
And I want to extract every other pair of values i.e. I want to return
a[[0, 1, 4, 5, 8, 9]]
Of course the index could be built using loops or something, but I wonder if there's a faster way to use ::-style indexing in numpy but also specifying the width of the pattern to take every nth iteration of.
Thanks
With length of array being a multiple of the window size -
In [29]: W = 2 # window-size
In [30]: a.reshape(-1,W)[::2].ravel()
Out[30]: array([0, 1, 4, 5, 8, 9])
Explanation with breaking-down-the-steps -
# Reshape to split into W-sized groups
In [43]: a.reshape(-1,W)
Out[43]:
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
# Use stepsize to select every other pair starting from the first one
In [44]: a.reshape(-1,W)[::2]
Out[44]:
array([[0, 1],
       [4, 5],
       [8, 9]])
# Flatten for desired output
In [45]: a.reshape(-1,W)[::2].ravel()
Out[45]: array([0, 1, 4, 5, 8, 9])
If you are okay with 2D output, skip the last step, as that would still be a view into the input and virtually free at runtime. Let's verify the view part -
In [47]: np.shares_memory(a,a.reshape(-1,W)[::2])
Out[47]: True
For the generic case, where the length is not necessarily a multiple of the window size, we can use a masking-based approach -
In [64]: a[(np.arange(len(a))%(2*W))<W]
Out[64]: array([0, 1, 4, 5, 8, 9])
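To see the masking approach on an array whose length is not a multiple of the window size (where the reshape would raise an error), here is a small sketch. The helper name every_other_group and its width parameter are made up for illustration, and a 1-D input is assumed.

```python
import numpy as np

def every_other_group(a, width):
    """Keep groups of `width` items, skipping the `width` items after each group."""
    a = np.asarray(a)
    positions = np.arange(a.size)
    # Within each block of 2*width positions, keep only the first `width`.
    return a[(positions % (2 * width)) < width]

a = np.arange(11)  # 11 items: not a multiple of 2, so reshape(-1, 2) would fail
print(every_other_group(a, 2))  # [0 1 4 5 8 9]
```

Unlike the reshape version, boolean masking always returns a copy, so it trades the free view for generality.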
You can do that by reshaping the array into an n x 3 matrix, then slicing the first two elements of each row and finally flattening the result. Note that this keeps the first two of every three values rather than every other pair:
a.reshape((-1,3))[:,:2].flatten()
resulting in:
array([ 0, 1, 3, 4, 6, 7, 9, 10])

Find value indexes in a mother array with a filter array

I have two arrays: one is a mother array and the other is a "filtering array". The mother array is a 2D array (about 65 rows x 147 cols in size). The filtering array is an array that has the max value of each column of the mother array (1 row x 147 cols). I need to get the matching row values for the max values.
I tried using
for index,k in np.ndenumerate(MotherArr):
    for val in FiltArr:
        if k == val:
            print(index)
But for some reason, I am basically getting a print of val with the very last index printed afterwards.
Any ideas on how I could get this working?
You can just take the argmax of your array along an axis. The example below uses axis 1 (one maximum per row); for the maximum of each column, as in the question, pass axis 0 instead:
import numpy as np
np.random.seed(0)
A = np.random.randint(0, 10, (5, 5))
# array([[5, 0, 3, 3, 7],
#        [9, 3, 5, 2, 4],
#        [7, 6, 8, 8, 1],
#        [6, 7, 7, 8, 1],
#        [5, 9, 8, 9, 4]])
maxima = A.max(1)
# array([7, 9, 8, 8, 9])
maxima_args = A.argmax(1)
# array([4, 0, 2, 3, 1], dtype=int64)
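Since the question is about the maximum of each column, a column-wise variant may be closer to what was asked. This is a sketch on a small hand-written array; the names mother, col_max and row_of_max are illustrative stand-ins for MotherArr and FiltArr.

```python
import numpy as np

mother = np.array([[5, 0, 3],
                   [9, 3, 5],
                   [7, 6, 8]])

col_max = mother.max(axis=0)        # maximum of each column
row_of_max = mother.argmax(axis=0)  # row index where each column's maximum sits

print(col_max)     # [9 6 8]
print(row_of_max)  # [1 2 2]

# Pairing each row index with its column index recovers the maxima.
cols = np.arange(mother.shape[1])
print(mother[row_of_max, cols])  # [9 6 8]
```

argmax reports only the first row when a column's maximum occurs more than once.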

How to roll two arrays of different dimensions into a one-dimensional array in Python

I have two arrays (a, b) with different m x n dimensions.
I need to know how I can roll these two arrays into a single one-dimensional array.
I called flatten() on both a and b and then combined them, but what I get is a list containing two one-dimensional arrays (a, b):
a = np.array([[1,2,3,4],[3,4,5,6],[4,5,6,7]]) #3x4 array
b = np.array([ [1,2],[2,3],[3,4],[4,5],[5,6]]) #5x2 array
result = [a.flatten(),b.flatten()]
print(result)
[array([1, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7]), array([1, 2, 2, 3, ... 5, 6])]
In MATLAB, I would do it like this:
res = [a(:);b(:)]
Also, how can I retrieve a and b back from the result?
Use ravel + concatenate:
>>> np.concatenate((a.ravel(), b.ravel()))
array([1, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6])
ravel returns a 1D view of each array where possible (it copies only when it has to), so it is a cheap operation. concatenate joins the views together, returning a new array.
As an aside, if you want to be able to retrieve these arrays back, you'll need to store their shapes in some variable.
i = a.shape
j = b.shape
res = np.concatenate((a.ravel(), b.ravel()))
Later, to retrieve a and b from res,
a = res[:np.prod(i)].reshape(i)
b = res[np.prod(i):].reshape(j)
a
array([[1, 2, 3, 4],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
b
array([[1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6]])
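As a variation on the slicing above, np.split can cut the combined array at the point where a ends. A minimal sketch, assuming the original shapes are still available (here read straight from a and b); a_back and b_back are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [4, 5, 6, 7]])  # 3x4 array
b = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])    # 5x2 array

res = np.concatenate((a.ravel(), b.ravel()))

# Split at index a.size: everything before belongs to a, the rest to b.
flat_a, flat_b = np.split(res, [a.size])
a_back = flat_a.reshape(a.shape)
b_back = flat_b.reshape(b.shape)

print(np.array_equal(a_back, a), np.array_equal(b_back, b))  # True True
```

With more than two arrays, the split points would be the cumulative sizes of all but the last array.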
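How about changing the middle line to the following (a Python list has no flatten method, so the two flattened arrays have to be joined by NumPy):
result = np.concatenate([a.flatten(), b.flatten()])
Or, if plain Python lists are enough (+ joins lists, whereas on arrays it adds element-wise and fails here because the lengths differ):
result = a.flatten().tolist() + b.flatten().tolist()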

How to draw N elements of random indices from numpy array without repetition?

Say, I have a numpy array defined as:
X = numpy.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Now I want to draw 3 elements from this array, but with random indices and without repetition, so I'll get, say:
X_random_draw = numpy.array([5, 0, 9])
How can I achieve something like this with the least effort and with the greatest performance speed? Thank you in advance.
With NumPy 1.7 or newer, use np.random.choice, with replace=False:
In [85]: X = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [86]: np.random.choice(X, 3, replace=False)
Out[86]: array([7, 5, 9])
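On NumPy 1.17 or newer, the same draw can also be written with the Generator interface. The sketch below is one possible form, not the answer's own code; the seed is arbitrary and only there to make runs repeatable, and the names draw and idx are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only

X = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Draw 3 distinct elements directly.
draw = rng.choice(X, size=3, replace=False)

# Or draw 3 distinct indices first, then index with them
# (useful when the same indices must be applied to several arrays).
idx = rng.permutation(len(X))[:3]
draw_by_index = X[idx]

print(draw, idx, draw_by_index)
```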
