How does NumPy's partition function work?

I am trying to figure out how the np.partition function works.
For example, consider
arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
If I call np.partition(arr, kth=2), I get
np.array([-4, -3, -1, 0, 1, 4, 5, 0])
I expect that, after partition, the array will split into elements less than one, one, and elements greater than one.
But the second zero is placed on the last array position, which isn't its right place after partition.

The documentation says:
Creates a copy of the array with its elements rearranged in such a way that
the value of the element in kth position is in the position it would be in
a sorted array. All elements smaller than the kth element are moved before
this element and all equal or greater are moved behind it. The ordering of
the elements in the two partitions is undefined.
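In the example you give, you have selected the element at index 2 of the sorted list (counting from zero), which is -1, and in the result it sits exactly where it would be if the array were sorted. The second zero is greater than -1, so it may land anywhere after index 2, including the last position.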
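The guarantee quoted from the documentation can be checked directly. The following is a minimal sketch using the array from the question; the variable names `k` and `part` are chosen here for illustration only.

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
k = 2

part = np.partition(arr, kth=k)

# The element at index k is the one a full sort would put there.
assert part[k] == np.sort(arr)[k]          # both are -1

# Everything before index k is <= part[k]; everything after is >= part[k].
assert np.all(part[:k] <= part[k])
assert np.all(part[k + 1:] >= part[k])

# The order inside each side is unspecified, so compare after sorting.
print(sorted(part[:k].tolist()))           # [-4, -3]
```

Only the element at index 2 is guaranteed to be in its sorted position. Where the two zeros end up on the right-hand side is not specified.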

The docs talk of 'a sorted array'.
np.partition does not have to fully sort the array, but kth is defined with respect to the sorted order. In this case the original array is:
arr = [ 5, 4, 1, 0, -1, -3, -4, 0]
When sorted, we have:
arr_sorted = [-4, -3, -1, 0, 0, 1, 4, 5]
Hence, in the call np.partition(arr, kth=2), kth refers to the element at position 2 of arr_sorted, not of arr. That element is -1, and it is correctly placed at index 2 of the result.

When I first read the official documentation of numpy.partition, I interpreted it the same way the OP did. So I was confused by the examples given in the documentation, but could not figure out where my understanding was wrong. I googled it and got here.
Since this confusion seems to be frequent, the documentation should be revised. I suggest using the following:
Creates a copy of the array with its elements rearranged in such a way that:
the k-th element of the new array is in the position it would be in a sorted array. All elements smaller than the k-th element are moved before this element and all greater are moved behind it. The ordering of the elements in the two partitions is undefined. If there are other elements that are equal to the k-th element, these elements may appear before or behind the k-th element.

Related

What is a natural way to count the number of things in Python?

In Python, I learned that it is natural to start an index set from 0. That is, if there is an array, the first element is not the 1st but the 0th element.
My question is what about when you count the number of things.
For example, if there is an array, {3, -1, 2, 5, -7}, what is the number of the elements of the array? Is it 5 elements or 4 elements?
Is the number of the elements 0 or 1, when there is {3}, an array with a single element?
I'm not a computer engineer, so I have little knowledge about programming.
if there is an array, {3, -1, 2, 5, -7}, what is the number of the
elements of the array? Is it 5 elements or 4 elements? Is the number
of the elements 0 or 1, when there is {3}, an array with a single
element?
The list
a = []
is an empty list and has 0 elements.
The list
a = [1]
is a nonempty list and has 1 element. The first and only element can be accessed via a[0].
The list
a = [1, 2, 3, 4, 5]
is a nonempty list and has 5 elements. The first element is accessed via a[0], the second by a[1], the third by a[2], ..., the fifth by a[4].
In python, I learned that it is natural to start an index set from 0.
I.e.) if there is an array, the first element is not the 1st but the
0th element.
The first element is the first element. In computer science (not just in Python), we often use zero-based indexing whereby the initial/first element of a sequence is assigned the index 0, the second element is assigned the index 1, and so on.
In general, with zero-based indexing, given some nonempty sequence, the
nth element is assigned the index n - 1
When we want to access the third element of a, we know that the third element is assigned the index 2 and so we write a[2] to access it.
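The difference between counting and indexing can be sketched in a few lines. This example reuses the array from the question; the names `count`, `valid_indices`, `third` and `last` are illustrative only.

```python
a = [3, -1, 2, 5, -7]

count = len(a)                        # number of elements: 5
valid_indices = list(range(count))    # [0, 1, 2, 3, 4]

third = a[3 - 1]                      # the 3rd element lives at index 2
last = a[count - 1]                   # the 5th (last) element lives at index 4

print(count, valid_indices, third, last)   # 5 [0, 1, 2, 3, 4] 2 -7
```

A list with a single element, such as [3], has len 1 and its only valid non-negative index is 0.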
When you have an array like this
>>> array = [ 3, -1, 2, 5, -7]
Now I print the number of elements of the array with the help of the len function
>>> print(len(array))
5
The result is 5 because the list contains five elements. Counting starts at 1 while indexing starts at 0, so the valid indices run from 0 to 4. At first this can seem confusing.
elements [3, -1, 2, 5, -7] -> indices [0, 1, 2, 3, 4]

How negative indexing works in Python

What does Python do to find the item at index -2 in a list? Does it iterate through the list?
a = [1, 2, 3, 4, 5]
print(a[-2])
I don't understand how it works.
Python supports negative indexing of sequences, which many other programming languages do not offer for arrays. An index of -1 gives the last element, and -2 gives the second-to-last element. Negative indices count backward from the end of the sequence.
No iteration is involved: for a list, a[-k] refers to the same slot as a[len(a) - k], so the lookup is a direct index.
a = [1, 2, 3, 4, 5]
if you do a[-1] it returns last element (5).
if you do a[-2] it returns second last element(4) of the list and so on.
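The equivalence between negative and non-negative indices can be checked with a short sketch. The flag `out_of_range` is a hypothetical name used only to show what happens past the start of the list.

```python
a = [1, 2, 3, 4, 5]

# For a list, a[-k] refers to the same slot as a[len(a) - k].
for k in range(1, len(a) + 1):
    assert a[-k] == a[len(a) - k]

print(a[-2])   # 4

# Negative indices only reach back to -len(a); going further raises IndexError.
out_of_range = False
try:
    a[-6]
except IndexError:
    out_of_range = True
```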

How to return the positions of the maximum value in an array

As stated I want to return the positions of the maximum value of the array. For instance if I have the array:
A = np.matrix([[1,2,3,33,99],[4,5,6,66,22],[7,8,9,99,11]])
np.argmax(A) returns only the first position of the maximum, which in this case is 4. How do I write code so that it returns [4, 13]? Maybe there is a better function than argmax() for this, as all I actually need is the position of the final maximum value.
Find the max value of the array and then use np.where.
>>> m = A.max()
>>> np.where(A.reshape(1,-1) == m)
(array([0, 0]), array([ 4, 13]))
After that, just index the second element of the tuple. Note that we have to reshape the array in order to get the indices that you are interested in.
Since you mentioned that you're interested only in the last position of the maximum value, a slightly faster solution could be:
A.size - 1 - np.argmax(A.flat[::-1])
Here:
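A.flat is a 1-D iterator over A.
A.flat[::-1] is a reversed, flattened copy of A.
np.argmax(A.flat[::-1]) returns the first occurrence of the maximum in that reversed array, which corresponds to the last occurrence in A.
Subtracting that result from A.size - 1 converts it back to a position in the original flat order.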
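Both approaches can be put side by side in a small sketch. It assumes a plain np.array instead of the np.matrix from the question, and it swaps in np.flatnonzero and ravel() for the reshape and flat calls used above; the names `all_flat`, `last_flat`, `row` and `col` are illustrative.

```python
import numpy as np

# Assumption: a regular 2-D ndarray rather than np.matrix.
A = np.array([[1, 2, 3, 33, 99],
              [4, 5, 6, 66, 22],
              [7, 8, 9, 99, 11]])

# All flat positions of the maximum.
all_flat = np.flatnonzero(A == A.max())
print(all_flat.tolist())              # [4, 13]

# Last flat position only: search the reversed flat array, then map back.
last_flat = int(A.size - 1 - np.argmax(A.ravel()[::-1]))
print(last_flat)                      # 13

# Convert the flat position to (row, column) if needed.
row, col = np.unravel_index(last_flat, A.shape)
print(int(row), int(col))             # 2 3
```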

Negative index on nested list

Let's say I've got a nested list like this:
mylist = [[],[],[]]
and I want to insert elements at the end of the second nested list:
mylist[1].insert(-1, 1)
mylist[1].insert(-1, 2)
The output I expected was:
[[], [1, 2], []]
but instead I got:
[[], [2, 1], []]
Can somebody explain this to me? I thought the index -1 always pointed to the last position of a list.
According to the documentation (https://docs.python.org/3/tutorial/datastructures.html#more-on-lists), the first argument of the insert method is the index of the element before which to insert, and -1 designates the last element of a list. So insert(-1, ...) always places the new element just before the current last element, making it the next-to-last element of the list (or the only element, if the list was empty).
This is easy to verify. If you insert yet another element
mylist[1].insert(-1, 3)
you will notice the resulting list becomes
[[], [2, 3, 1], []]
So you should probably use append instead, or calculate the index dynamically like
mylist[1].insert(len(mylist[1]), 3)
If a list has for example 3 elements, they are numbered from 0 to 2 (i.e. 0, 1, 2).
If you want to use negative indices, they are numbered from -3 to -1 (i.e. -3, -2, -1).
Now, if you insert a new element at position -1, it is the same as inserting at position 2, i.e. the inserted element will become element[2].
But element[2] will then be an element of a 4-element list, so its position in negative notation is not -1 but -2:
element[-4], element[-3], element[-2], element[-1]
list.insert(index, element) inserts element before the item currently at position index. .insert(-1, value) therefore inserts value just before the last element of the list (index len(lst) - 1), not at the end. For the two inserts in the question, using index 1
mylist[1].insert(1, 1)
mylist[1].insert(1, 2)
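gives the expected [[], [1, 2], []]. This only works while the index is at least the current length of the inner list; a third call, insert(1, 3), would give [1, 3, 2].
A more general approach is to use append, as the other answer said.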
mylist[1].append(1)
mylist[1].append(2)
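The three variants discussed in the answers can be compared in one short sketch. The names `appended` and `by_len` are introduced here only to keep the three results separate.

```python
# insert(-1, x) places x before the current last element.
mylist = [[], [], []]
for value in (1, 2, 3):
    mylist[1].insert(-1, value)
print(mylist)     # [[], [2, 3, 1], []]

# append always adds at the end.
appended = [[], [], []]
for value in (1, 2, 3):
    appended[1].append(value)
print(appended)   # [[], [1, 2, 3], []]

# insert at index len(lst) is equivalent to append.
by_len = [[], [], []]
for value in (1, 2, 3):
    by_len[1].insert(len(by_len[1]), value)
print(by_len)     # [[], [1, 2, 3], []]
```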

Index of a random choice from a NumPy array

I have a 2-d numpy array populated with integers [-1, 0, +1]. I need to choose a random element that is not zero from it and calculate the sum of its adjacent elements.
Is there a way to get the index of a numpy.random.choice?
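import numpy as np
lattice = np.zeros(9, dtype=int)  # np.int was removed in NumPy 1.24; the builtin int works
lattice[:2] = -1
lattice[2:4] = 1
np.random.shuffle(lattice)  # assuming random here refers to numpy.random
lattice = lattice.reshape((3, 3))
np.random.choice(lattice[lattice != 0])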
This gives the draw from the right sample, but I would need the index of the choice to be able to identify its adjacent elements. My other idea is to just sample from the index and then check if the element is non-zero, but this is obviously quite wasteful when there are a lot of zeros.
You can use lattice.nonzero() to get the locations of the nonzero elements:
>>> lnz = lattice.nonzero()
>>> lnz
(array([0, 0, 1, 1]), array([1, 2, 0, 1]))
which returns a tuple of arrays corresponding to the coordinates of the nonzero elements. Then you draw an index:
>>> np.random.randint(0, len(lnz[0]))
3
and use that to decide which coordinate you're interested in.
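Putting the pieces together, one possible sketch is shown below. It uses np.argwhere and the Generator API (np.random.default_rng) instead of nonzero() and randint, and it assumes that "adjacent" means the up, down, left and right neighbours without wraparound; that definition, the fixed seed, and the names `coords` and `neighbour_sum` are assumptions made for illustration, not something stated in the question.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for repeatability

lattice = np.zeros(9, dtype=int)
lattice[:2] = -1
lattice[2:4] = 1
rng.shuffle(lattice)
lattice = lattice.reshape((3, 3))

# One (row, col) pair per nonzero element, shape (n_nonzero, 2).
coords = np.argwhere(lattice != 0)

# Draw one of those pairs uniformly at random.
row, col = coords[rng.integers(len(coords))]

# Assumption: neighbours are up/down/left/right, with no wraparound.
neighbour_sum = 0
for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
    r, c = row + dr, col + dc
    if 0 <= r < lattice.shape[0] and 0 <= c < lattice.shape[1]:
        neighbour_sum += lattice[r, c]

print((int(row), int(col)), int(lattice[row, col]), int(neighbour_sum))
```

Sampling from the coordinates of the nonzero entries avoids the rejection loop mentioned in the question, since zeros are never drawn in the first place.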
