Reversed array in numpy? - python

Numpy tentative tutorial suggests that a[ : :-1] is a reversed a. Can someone explain to me how we got there?
I understand that a[:] means for each element of a (with axis=0). Next : should denote the number of elements to skip (or period) from my understanding.

It isn't numpy, it's Python.
In Python, sequences support slicing with the following syntax:
seq[start:stop:step] => a slice from start to stop, stepping step each time.
All the arguments are optional, but a : has to be there for Python to recognize this as a slice.
A negative step also works, and produces a copy of the same sequence in reverse order:
>>> L = list(range(10))
>>> L[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
And numpy follows that "rule" like any good 3rd party library.
>>> a = numpy.array(range(10))
>>> a[::-1]
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
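One difference may be worth a quick sketch: slicing a list builds a new list, whereas basic slicing of a NumPy array returns a view onto the same data. The variable names below are only illustrative.

```python
import numpy as np

lst = list(range(5))
rev_lst = lst[::-1]           # list slicing builds a new, independent list
rev_lst[0] = 99               # does not touch lst

arr = np.arange(5)
rev_arr = arr[::-1]           # basic NumPy slicing returns a view of arr
rev_arr[0] = 99               # writes through to arr[4]

rev_copy = arr[::-1].copy()   # explicit copy when an independent array is needed

print(lst)                    # [0, 1, 2, 3, 4]
print(arr)                    # [ 0  1  2  3 99]
```

If the reversed array is going to be modified, taking an explicit .copy() avoids surprising changes to the original.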

As others have noted, this is a python slicing technique, and numpy just follows suit. Hopefully this helps explain how it works:
The last bit is the stepsize. The 1 indicates to step by one element at a time, the - does that in reverse.
Blanks indicate the first and last, unless you have a negative stepsize, in which case they indicate last and first:
In [1]: import numpy as np
In [2]: a = np.arange(5)
In [3]: a
Out[3]: array([0, 1, 2, 3, 4])
In [4]: a[0:5:1]
Out[4]: array([0, 1, 2, 3, 4])
In [5]: a[0:5:-1]
Out[5]: array([], dtype=int64)
In [6]: a[5:0:-1]
Out[6]: array([4, 3, 2, 1])
In [7]: a[::-2]
Out[7]: array([4, 2, 0])
Input 5 gives an empty array since it tries to step backwards from index 0 to index 5.
A slice never includes its stop index, so input 6 misses 0 when going backwards.

This isn't specific to numpy, the slice a[::-1] is equivalent to slice(None, None, -1), where the first argument is the start index, the second argument is the end index, and the third argument is the step. None for start or stop will have the same behavior as using the beginning or end of the sequence, and -1 for step will iterate over the sequence in reverse order.
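To see which concrete indices a blank start or stop resolves to, the built-in slice.indices method can help. This is only a small sketch on a five-element list; the variable names are made up for the illustration.

```python
# slice.indices(length) reports the concrete (start, stop, step) that a
# slice resolves to for a sequence of the given length.
a = list(range(5))

forward = slice(None, None, 1).indices(len(a))    # (0, 5, 1)
backward = slice(None, None, -1).indices(len(a))  # (4, -1, -1)
clipped = slice(5, 0, -1).indices(len(a))         # (4, 0, -1)

# The resolved triple can be fed to range() to list the visited indices.
# Here the -1 stop means "one position before index 0", not "last element".
visited = list(range(*backward))                  # [4, 3, 2, 1, 0]

# The slice object and the colon syntax select the same elements.
same = a[::-1] == a[slice(None, None, -1)] == [a[i] for i in visited]
print(forward, backward, clipped, visited, same)
```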

You can use the reversed Python built-in:
import numpy as np
bins = np.arange(0.0, 1.1, .1)
for i in reversed(bins):
    print(i)

Related

numpy.diff with parameter n=2 produces strange result

I'm having a hard time understanding the behaviour of np.diff when n>1
The documentation gives the following example :
x = np.array([1, 2, 4, 7, 0])
np.diff(x)
array([ 1, 2, 3, -7])
np.diff(x, n=2)
array([ 1, 1, -10])
It seems from the first example that we are subtracting from each number the previous one (x[i+1]-x[i]) and all results make sense.
The second time the function is called, with n=2, it seems that we're doing x[i+2]-x[i+1]-x[i] and the two first numbers (1 and 1) in the resulting array make sense but I am surprised the last number is not -11 (0 -7 -4) but -10.
Looking in the documentation I found this explanation:
The first difference is given by out[i] = a[i+1] - a[i] along the given axis, higher differences are calculated by using diff recursively.
I fail to understand this 'recursively', so I'd be glad if someone had a clearer explanation!
np.diff(x, n=2) is the same as np.diff(np.diff(x)) (that's what "recursively" means in this case).
"Recursively" in this case simply means it's performing the same operation multiple times, each time on the array resulting from the previous step.
So:
x = np.array([1, 2, 4, 7, 0])
output = np.diff(x)
produces
output = [2-1, 4-2, 7-4, 0-7] = [1, 2, 3, -7]
If you use n=2, it simply does the same thing 2 times:
output = np.diff(x, n=2)
# first step, you won't see this result
output = [2-1, 4-2, 7-4, 0-7] = [1, 2, 3, -7]
# and then again (this will be your actual output)
output = [2-1, 3-2, -7-3] = [1, 1, -10]
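The same check can be run directly. The sketch below also expands the second difference by hand, which shows why the last value is -10: the formula is x[i+2] - 2*x[i+1] + x[i], not x[i+2] - x[i+1] - x[i].

```python
import numpy as np

x = np.array([1, 2, 4, 7, 0])

first = np.diff(x)          # [ 1  2  3 -7]
second = np.diff(first)     # [  1   1 -10]
direct = np.diff(x, n=2)    # same as applying diff twice

# Expanding (x[i+2] - x[i+1]) - (x[i+1] - x[i]) by hand:
manual = x[2:] - 2 * x[1:-1] + x[:-2]

print(np.array_equal(direct, second), np.array_equal(direct, manual))
```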

Reversing the list in python

In [122]: a = list(range(10))
In [123]: a[: : -1]
Out[123]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Could you explain the expression a[: : -1]?
a[:] is clearly understandable -> "start from the beginning (space before the colon) and retrieve the list up to the end (space after the colon)"
But I am not getting what the two colons are actually doing in the expression a[: : -1].
A slice takes three arguments, just like range: start, stop and step:
[0, 1, 2, 3, 4, 5][0:4:2] == list(range(0, 4, 2)) # every second element from 0 to 3
The negative step causes the slice to work backwards through the iterable. Without a start and stop (i.e. just the step [::-1]) it starts from the end, as it is working backwards.
The third argument (after two :'s) is the step size. -1 can be interpreted as stepping backwards. In other words, reversing the list.
Try a step size of -2, i.e. a[::-2], and you'll get:
[9, 7, 5, 3, 1]
Hope this helps.
More elaborate answers and explanations can be found in the question "Explain Python's slice notation".

How to return all the minimum indices in numpy

I am a little bit confused reading the documentation of argmin function in numpy.
It looks like it should do the job:
Reading this
Return the indices of the minimum values along an axis.
I might assume that
np.argmin([5, 3, 2, 1, 1, 1, 6, 1])
will return an array of all indices: which will be [3, 4, 5, 7]
But instead of this it returns only 3. Where is the catch, or what should I do to get my result?
That documentation makes more sense when you think about multidimensional arrays.
>>> x = numpy.array([[0, 1],
... [3, 2]])
>>> x.argmin(axis=0)
array([0, 0])
>>> x.argmin(axis=1)
array([0, 1])
With an axis specified, argmin takes one-dimensional subarrays along the given axis and returns the first index of each subarray's minimum value. It doesn't return all indices of a single minimum value.
To get all indices of the minimum value, you could do
numpy.where(x == x.min())
See the documentation for numpy.argmax (which is referred to by the docs for numpy.argmin):
In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.
The phrasing of the documentation ("indices" instead of "index") refers to the multidimensional case when axis is provided.
So, you can't do it with np.argmin. Instead, this will work:
np.where(arr == arr.min())
I would like to quickly add that, as user grofte mentioned, np.where returns a tuple. Its documentation states that, when called with only a condition, it is shorthand for nonzero, which has a related function, flatnonzero, that returns an array directly.
So, the cleanest version seems to be
my_list = np.array([5, 3, 2, 1, 1, 1, 6, 1])
np.flatnonzero(my_list == my_list.min())
=> array([3, 4, 5, 7])
Assuming that you want the indices of a list, not a numpy array, try
import numpy as np
my_list = [5, 3, 2, 1, 1, 1, 6, 1]
np.where(np.array(my_list) == min(my_list))[0]
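The index [0] is needed because np.where, called with only a condition, returns a tuple containing one index array per dimension; for a 1-D input that is a one-element tuple, and [0] extracts the array.
The way recommended by the NumPy documentation to get all indices of the minimum value is to call nonzero directly: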
x = np.array([5, 3, 2, 1, 1, 1, 6, 1])
a, = np.nonzero(x == x.min()) # a=>array([3, 4, 5, 7])
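Putting the variants above side by side, as a rough sketch (the 2-D array m is made up for the illustration):

```python
import numpy as np

x = np.array([5, 3, 2, 1, 1, 1, 6, 1])

first = np.argmin(x)                     # index of the first minimum only: 3
all_idx = np.flatnonzero(x == x.min())   # every index holding the minimum

# For a 2-D array, nonzero returns one index array per dimension.
m = np.array([[0, 1],
              [3, 0]])
rows, cols = np.nonzero(m == m.min())    # minimum 0 sits at (0, 0) and (1, 1)

print(first, all_idx, rows, cols)
```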

How to get indices of a sorted array in Python

I have a numerical list:
myList = [1, 2, 3, 100, 5]
If I sort this list, I obtain [1, 2, 3, 5, 100].
What I want is the indices of the elements from the original list in sorted order, i.e. [0, 1, 2, 4, 3], similar to MATLAB's sort function, which returns both values and indices.
If you are using numpy, you have the argsort() function available:
>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])
http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html
This returns the arguments that would sort the array or list.
Something like this:
>>> myList = [1, 2, 3, 100, 5]
>>> [i[0] for i in sorted(enumerate(myList), key=lambda x:x[1])]
[0, 1, 2, 4, 3]
enumerate(myList) yields tuples of (index, value):
[(0, 1), (1, 2), (2, 3), (3, 100), (4, 5)]
You sort the list by passing it to sorted and specifying a function to extract the sort key (the second element of each tuple; that's what the lambda is for). Finally, the original index of each sorted element is extracted using the [i[0] for i in ...] list comprehension.
myList = [1, 2, 3, 100, 5]
sorted(range(len(myList)),key=myList.__getitem__)
[0, 1, 2, 4, 3]
I did a quick performance check on these with perfplot (a project of mine) and found that it's hard to recommend anything else but
np.argsort(x)
[Plot not reproduced here: runtime against len(x) for each approach, log scale.]
Code to reproduce the plot:
import perfplot
import numpy as np

def sorted_enumerate(seq):
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]

def sorted_enumerate_key(seq):
    return [x for x, y in sorted(enumerate(seq), key=lambda x: x[1])]

def sorted_range(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)

b = perfplot.bench(
    setup=np.random.rand,
    kernels=[sorted_enumerate, sorted_enumerate_key, sorted_range, np.argsort],
    n_range=[2 ** k for k in range(15)],
    xlabel="len(x)",
)
b.save("out.png")
The answers with enumerate are nice, but I personally don't like the lambda used to sort by the value. The following just reverses the index and the value, and sorts that. So it'll first sort by value, then by index.
sorted((e,i) for i,e in enumerate(myList))
Updated answer with enumerate and itemgetter:
sorted(enumerate(a), key=lambda x: x[1])
# [(0, 1), (1, 2), (2, 3), (4, 5), (3, 100)]
enumerate pairs each element with its index: the first element of each tuple is the index and the second is the value. The result is then sorted by the second element of the tuple, x[1], where x is the tuple.
Or using itemgetter from the operator module:
from operator import itemgetter
sorted(enumerate(a), key=itemgetter(1))
Essentially you need to do an argsort; which implementation you need depends on whether you want to use external libraries (e.g. NumPy) or stay pure-Python without dependencies.
The question you need to ask yourself is: do you want
1. the indices that would sort the array/list, or
2. the indices that the elements would have in the sorted array/list?
Unfortunately the example in the question doesn't make it clear what is desired because both will give the same result:
>>> arr = np.array([1, 2, 3, 100, 5])
>>> np.argsort(np.argsort(arr))
array([0, 1, 2, 4, 3], dtype=int64)
>>> np.argsort(arr)
array([0, 1, 2, 4, 3], dtype=int64)
Choosing the argsort implementation
If you have NumPy at your disposal you can simply use the function numpy.argsort or method numpy.ndarray.argsort.
An implementation without NumPy was mentioned in some other answers already, so I'll just recap the fastest solution according to the benchmark answer here
def argsort(l):
    return sorted(range(len(l)), key=l.__getitem__)
Getting the indices that would sort the array/list
To get the indices that would sort the array/list you can simply call argsort on the array or list. I'm using the NumPy versions here, but the Python implementation should give the same results.
>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(arr)
array([1, 2, 0, 3], dtype=int64)
The result contains the indices that are needed to get the sorted array.
Since the sorted array would be [1, 2, 3, 4] the argsorted array contains the indices of these elements in the original.
The smallest value is 1 and it is at index 1 in the original so the first element of the result is 1.
The 2 is at index 2 in the original so the second element of the result is 2.
The 3 is at index 0 in the original so the third element of the result is 0.
The largest value is 4 and it is at index 3 in the original so the last element of the result is 3.
Getting the indices that the elements would have in the sorted array/list
In this case you would need to apply argsort twice:
>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(np.argsort(arr))
array([2, 0, 1, 3], dtype=int64)
In this case:
the first element of the original is 3, which is the third-smallest value so it would have index 2 in the sorted array/list so the first element is 2.
the second element of the original is 1, which is the smallest value so it would have index 0 in the sorted array/list so the second element is 0.
the third element of the original is 2, which is the second-smallest value so it would have index 1 in the sorted array/list so the third element is 1.
the fourth element of the original is 4 which is the largest value so it would have index 3 in the sorted array/list so the last element is 3.
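The two readings can be compared in a few lines. This is just a sketch; the inverse-permutation step is an alternative way to get the same ranks, added here for illustration and not taken from the answer above.

```python
import numpy as np

arr = np.array([3, 1, 2, 4])

order = np.argsort(arr)     # indices that would sort arr: [1 2 0 3]
ranks = np.argsort(order)   # position of each element in the sorted array: [2 0 1 3]

# order and ranks are inverse permutations of each other, so the ranks can
# also be built by scattering 0..n-1 into the positions given by order.
inverse = np.empty_like(order)
inverse[order] = np.arange(len(arr))

print(arr[order])            # the sorted array: [1 2 3 4]
print(np.sort(arr)[ranks])   # back to the original: [3 1 2 4]
```

For the list in the question, [1, 2, 3, 100, 5], the permutation happens to be its own inverse, which is why both readings give [0, 1, 2, 4, 3].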
If you do not want to use numpy,
sorted(range(len(seq)), key=seq.__getitem__)
is fastest, as demonstrated here.
The other answers are WRONG.
Running argsort once is not the solution.
For example, the following code:
import numpy as np
x = [3,1,2]
np.argsort(x)
yields array([1, 2, 0], dtype=int64) which is not what we want.
The answer should be to run argsort twice:
import numpy as np
x = [3,1,2]
np.argsort(np.argsort(x))
gives array([2, 0, 1], dtype=int64) as expected.
The easiest way is to use the NumPy package for that purpose:
import numpy
s = numpy.array([2, 3, 1, 4, 5])
sort_index = numpy.argsort(s)
print(sort_index)
But if you want your code to use only basic Python:
s = [2, 3, 1, 4, 5]
li = []
for i in range(len(s)):
    li.append([s[i], i])
li.sort()
sort_index = []
for x in li:
    sort_index.append(x[1])
print(sort_index)
We will create another array of indexes from 0 to n-1
Then zip this to the original array and then sort it on the basis of the original values
ar = [1,2,3,4,5]
new_ar = list(zip(ar,[i for i in range(len(ar))]))
new_ar.sort()
s = [2, 3, 1, 4, 5]
print([sorted(s, reverse=False).index(val) for val in s])
For a list with duplicate elements, tied values all receive the same (lowest) rank, e.g.
s = [2, 2, 1, 4, 5]
print([sorted(s, reverse=False).index(val) for val in s])
returns
[1, 1, 0, 3, 4]
import numpy as np
FOR INDEX
S=[11,2,44,55,66,0,10,3,33]
r=np.argsort(S)
[output]=array([5, 1, 7, 6, 0, 8, 2, 3, 4])
argsort returns the indices that would put S in sorted order
FOR VALUE
np.sort(S)
[output]=array([ 0, 2, 3, 10, 11, 33, 44, 55, 66])
Code:
s = [2, 3, 1, 4, 5]
li = []
for i in range(len(s)):
    li.append([s[i], i])
li.sort()
sort_index = []
for x in li:
    sort_index.append(x[1])
print(sort_index)
Try this, it worked for me. Cheers!
First, take your list:
myList = [1, 2, 3, 100, 5]
Add an index to each item of your list:
myList = [[0, 1], [1, 2], [2, 3], [3, 100], [4, 5]]
next :
sorted(myList, key=lambda k:k[1])
result:
[[0, 1], [1, 2], [2, 3], [4, 5], [3, 100]]
A variant on RustyRob's answer (which is already the most performant pure Python solution) that may be superior when the collection you're sorting either:
1. Isn't a sequence (e.g. it's a set, and there's a legitimate reason to want the indices corresponding to how far an iterator must be advanced to reach the item), or
2. Is a sequence without O(1) indexing (among Python's included batteries, collections.deque is a notable example of this)
Case #1 is unlikely to be useful, but case #2 is more likely to be meaningful. In either case, you have two choices:
1. Convert to a list/tuple and use the converted version, or
2. Use a trick to assign keys based on iteration order
This answer provides the solution to #2. Note that it's not guaranteed to work by the language standard; the language says each key will be computed once, but not the order they will be computed in. On every version of CPython, the reference interpreter, to date, it's precomputed in order from beginning to end, so this works, but be aware it's not guaranteed. In any event, the code is:
sizediterable = ...
sorted_indices = sorted(range(len(sizediterable)), key=lambda _, it=iter(sizediterable): next(it))
All that does is provide a key function that ignores the value it's given (an index) and instead provides the next item from an iterator preconstructed from the original container (cached as a defaulted argument to allow it to function as a one-liner). As a result, for something like a large collections.deque, where using its .__getitem__ involves O(n) work (and therefore computing all the keys would involve O(n²) work), sequential iteration remains O(1), so generating the keys remains just O(n).
If you need something guaranteed to work by the language standard, using built-in types, Roman's solution will have the same algorithmic efficiency as this solution (as neither of them rely on the algorithmic efficiency of indexing the original container).
To be clear, for the suggested use case with collections.deque, the deque would have to be quite large for this to matter; deques have a fairly large constant divisor for indexing, so only truly huge ones would have an issue. Of course, by the same token, the cost of sorting is pretty minimal if the inputs are small/cheap to compare, so if your inputs are large enough that efficient sorting matters, they're large enough for efficient indexing to matter too.
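As a small illustration of the trick on a collections.deque (the deque contents are made up), the sketch below runs the iterator-keyed sort next to a decorate-sort-undecorate variant that does not depend on the order in which keys are computed. As noted above, the first form relies on CPython computing keys from beginning to end, which the language does not guarantee.

```python
from collections import deque

d = deque([3, 1, 2, 4])

# Iterator-keyed sort: the key ignores the index it receives and returns the
# next item from the deque instead. Relies on CPython's key evaluation order.
by_iter = sorted(range(len(d)), key=lambda _, it=iter(d): next(it))

# Decorate-sort-undecorate: pair each value with its index, sort the pairs,
# then keep only the indices. Ties are broken by index.
by_pairs = [i for _, i in sorted((v, i) for i, v in enumerate(d))]

print(by_iter, by_pairs)
```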

Extended slice that goes to beginning of sequence with negative stride

Bear with me while I explain my question. Skip down to the bold heading if you already understand extended slice list indexing.
In python, you can index lists using slice notation. Here's an example:
>>> A = list(range(10))
>>> A[0:5]
[0, 1, 2, 3, 4]
You can also include a stride, which acts like a "step":
>>> A[0:5:2]
[0, 2, 4]
The stride is also allowed to be negative, meaning the elements are retrieved in reverse order:
>>> A[5:0:-1]
[5, 4, 3, 2, 1]
But wait! I wanted to see [4, 3, 2, 1, 0]. Oh, I see, I need to decrement the start and end indices:
>>> A[4:-1:-1]
[]
What happened? It's interpreting -1 as being at the end of the array, not the beginning. I know you can achieve this as follows:
>>> A[4::-1]
[4, 3, 2, 1, 0]
But you can't use this in all cases. For example, in a method that's been passed indices.
My question is:
Is there any good pythonic way of using extended slices with negative strides and explicit start and end indices that include the first element of a sequence?
This is what I've come up with so far, but it seems unsatisfying.
>>> A[0:5][::-1]
[4, 3, 2, 1, 0]
It is error-prone to change the semantics of start and stop. Use None or -(len(a) + 1) instead of 0 or -1. The semantics is not arbitrary. See Edsger W. Dijkstra's article "Why numbering should start at zero".
>>> a = list(range(10))
>>> start, stop, step = 4, None, -1
Or
>>> start, stop, step = 4, -(len(a) + 1), -1
>>> a[start:stop:step]
[4, 3, 2, 1, 0]
Or
>>> s = slice(start, stop, step)
>>> a[s]
[4, 3, 2, 1, 0]
When s is a sequence the negative indexes in s[i:j:k] are treated specially:
If i or j is negative, the index is relative to the end of the string:
len(s) + i or len(s) + j is substituted. But note that -0 is still 0.
That is why len(range(10)[4:-1:-1]) == 0: it is equivalent to range(10)[4:9:-1].
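The substitution can be checked with slice.indices, which reports what a slice resolves to for a given length. A short sketch, using a list of ten elements as in the examples above:

```python
a = list(range(10))

# A stop of -1 is replaced by len(a) + (-1) = 9, so nothing lies between
# start 4 and stop 9 when stepping backwards.
resolved_minus_one = slice(4, -1, -1).indices(len(a))        # (4, 9, -1)

# A stop of -(len(a) + 1) is still negative after adding len(a); with a
# negative step it is clipped to "one before index 0".
resolved_past_start = slice(4, -(len(a) + 1), -1).indices(len(a))  # (4, -1, -1)

print(a[4:-1:-1])                # []
print(a[4:-(len(a) + 1):-1])     # [4, 3, 2, 1, 0]
print(a[4:None:-1])              # [4, 3, 2, 1, 0]
```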
Ok, I think this is probably as good as I will get it. Thanks to Abgan for sparking the idea. This relies on the fact that None in a slice is treated as if it were a missing parameter. Anyone got anything better?
def getReversedList(aList, end, start, step):
    return aList[end:start if start!=-1 else None:step]
edit: check for start==-1, not 0
This is still not ideal, because you're clobbering the usual behavior of -1. It seems the problem here is two overlapping definitions of what's supposed to happen. Whoever wins takes away otherwise valid invocations looking for the other intention.
[ A[b] for b in range(end,start,stride) ]
Slower, however you can use negative indices, so this should work:
[ A[b] for b in range(9, -1, -1) ]
I realize this isn't using slices, but thought I'd offer the solution anyway if using slices specifically for getting the result isn't a priority.
I believe that the following doesn't satisfy you:
def getReversedList(aList, end, start, step):
    if step < 0 and start == 0:
        return aList[end::step]
    return aList[end:start:step]
or does it? :-)
But you can't use that if you are storing your indices in variables for example.
Is this satisfactory?
>>> a = list(range(10))
>>> start = 0
>>> end = 4
>>> a[4:start-1 if start > 0 else None:-1]
[4, 3, 2, 1, 0]
As you say very few people fully understand everything that you can do with extended slicing, so unless you really need the extra performance I'd do it the "obvious" way:
rev_subset = reversed(data[start:stop])
a[4::-1]
Example:
Python 2.6 (r26:66714, Dec 4 2008, 11:34:15)
[GCC 4.0.1 (Apple Inc. build 5488)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = list(range(10))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[4:0:-1]
[4, 3, 2, 1]
>>> a[4::-1]
[4, 3, 2, 1, 0]
>>>
The reason is that the second term is interpreted as "while not index ==". Leaving it out is "while index in range".
I know this is an old question, but in case someone like me is looking for answers:
>>> A[5-1::-1]
[4, 3, 2, 1, 0]
>>> A[4:1:-1]
[4, 3, 2]
You can use a slice(start, stop, step) object, which is such that
s = slice(start, stop, step)
print(a[s])
is the same as
print(a[start:stop:step])
and, moreover, you can set any of the arguments to None to indicate nothing in between the colons. So in the case you give, you can use slice(4, None, -1).
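Building on the None idea, here is one possible helper for a method that is passed explicit indices. The name reversed_window and its lo/hi convention (the same half-open bounds as a forward slice) are assumptions made for this sketch, not something from the answers above, and it only covers a step of -1 with non-negative bounds.

```python
def reversed_window(seq, lo, hi):
    """Return seq[lo:hi] in reverse order using one extended slice.

    Hypothetical helper; assumes 0 <= lo <= hi <= len(seq).
    """
    if hi <= lo:
        return seq[:0]                    # empty result of the same type
    # A stop of lo - 1 would wrap around to the end when lo == 0,
    # so None is used to mean "run through index 0".
    stop = lo - 1 if lo > 0 else None
    return seq[hi - 1:stop:-1]


A = list(range(10))
print(reversed_window(A, 0, 5))   # [4, 3, 2, 1, 0]
print(reversed_window(A, 2, 5))   # [4, 3, 2]
```

The simpler A[lo:hi][::-1] from the question gives the same result at the cost of an intermediate copy.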
