How to read slicing with negative step - python

I have already seen some questions about slicing, but I haven't found a helpful answer for some cases that I can't manage to understand very well.
Let's say we have this list: a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
And I slice it in the following way:
a[:8:-1] # Output: [9]
Why? We give it an end of 8 and a step of -1. How come it behaves this way?

If you omit the first part of the slice expression, it defaults to None. When it comes time for list.__getitem__ to interpret what slice(None, 8, -1) means, it uses the sign of the step size to determine if you are counting up from 0 or down from the end of the list. In this case, you are counting down, so :8:-1 is equivalent to slice(-1, 8, -1).
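To check this interactively, the built-in slice.indices() method reports the concrete start, stop and step that a sequence of a given length would use. The following is a minimal sketch; the variable names are only for illustration.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# slice.indices(length) resolves omitted/negative values into the
# concrete (start, stop, step) used for a sequence of that length.
s = slice(None, 8, -1)
start, stop, step = s.indices(len(a))
print(start, stop, step)  # 9 8 -1

# Counting down from index 9 and stopping before index 8
# visits only index 9, hence the single-element result.
result = [a[i] for i in range(start, stop, step)]
print(result)     # [9]
print(a[:8:-1])   # [9]
```

The omitted start resolves to the last index (9) because the step is negative, and the stop of 8 is excluded.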

Related

Conditionally deleting element with 'del' from list. Not deleting all elements [duplicate]

The following code:
a = list(range(10))
remove = False
for b in a:
    if remove:
        a.remove(b)
    remove = not remove
print(a)
Outputs [0, 2, 3, 5, 6, 8, 9], instead of [0, 2, 4, 6, 8] when using Python 3.2.
Why does it output these particular values?
Why is no error given to indicate that underlying iterator is being modified?
Have the mechanics changed from earlier versions of Python with respect to this behaviour?
Note that I am not looking to work around the behaviour, but to understand it.
I debated answering this for a while, because similar questions have been asked many times here. But it's just unique enough to be given the benefit of the doubt. (Still, I won't object if others vote to close.) Here's a visual explanation of what is happening.
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] <- b = 0; remove? no
 ^
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] <- b = 1; remove? yes
    ^
[0, 2, 3, 4, 5, 6, 7, 8, 9] <- b = 3; remove? no
       ^
[0, 2, 3, 4, 5, 6, 7, 8, 9] <- b = 4; remove? yes
          ^
[0, 2, 3, 5, 6, 7, 8, 9] <- b = 6; remove? no
             ^
[0, 2, 3, 5, 6, 7, 8, 9] <- b = 7; remove? yes
                ^
[0, 2, 3, 5, 6, 8, 9] <- b = 9; remove? no
                   ^
Since no one else has, I'll attempt to answer your other questions:
Why is no error given to indicate that underlying iterator is being modified?
To throw an error without prohibiting many perfectly valid loop constructions, Python would have to know a lot about what's going on, and it would probably have to get that information at runtime. All that information would take time to process. It would make Python a lot slower, in just the place where speed really counts -- a loop.
Have the mechanics changed from earlier versions of Python with respect to this behaviour?
In short, no. Or at least I highly doubt it, and certainly it has behaved this way since I learned Python (2.4). Frankly I would expect any straightforward implementation of a mutable sequence to behave in just this way. Anyone who knows better, please correct me. (Actually, a quick doc lookup confirms that the text that Mikola cited has been in the tutorial since version 1.4!)
As Mikola explained, the actual result you observe is caused by the fact that deleting an entry from the list shifts the whole list over by one spot causing you to miss elements.
But the more interesting question, to my mind, is why python doesn't elect to produce an error message when this happens. It does produce such an error message if you try to modify a dictionary. I think there are two reasons for that.
Dicts are complex internally, whereas lists are not. Lists are basically just arrays. A dict has to detect when it's modified while being iterated over, so as to avoid crashing when its internal structure changes. A list can get away without that check because it just makes sure that its current index is still in range.
Historically (I'm not sure about now), Python lists were iterated over using the [] operator. Python would evaluate list[0], list[1], list[2] until it got an IndexError. In that case, Python wasn't tracking the size of the list before it began, so it had no way of detecting that the size of the list had changed.
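The difference between the two containers can be observed directly. This is a small sketch of the behaviour in CPython 3; the variable names are made up for the example.

```python
# A dict notices that its size changed during iteration and raises.
d = {1: "a", 2: "b", 3: "c"}
dict_error = None
try:
    for key in d:
        del d[key]
except RuntimeError as exc:
    dict_error = exc
print(type(dict_error).__name__)  # RuntimeError

# A list performs no such check; its iterator only tests whether
# the next index is still in range, so elements get skipped silently.
items = [1, 2, 3, 4]
seen = []
for x in items:
    seen.append(x)
    items.remove(x)
print(seen, items)  # [1, 3] [2, 4]
```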
Of course it is not safe to modify a list as you are iterating over it. The tutorial warns against it:
http://docs.python.org/tutorial/controlflow.html#for-statements
So, the next question is what exactly is happening under the hood here? If I had to guess, I would say that it is doing something like this:
for(int i=0; i<len(array); ++i)
{
    do_loop_body(i);
}
If you suppose that this is indeed what is going on, then it explains the observed behavior completely. When you remove an element at or before the current pointer, you shift the rest of the list one place to the left. The first time, you remove the 1 as expected, but now the list has shifted. On the next iteration, instead of hitting a 2, you hit a 3. Then you remove a 4, and the list shifts again, so the next iteration hits a 6 instead of a 5, and so on.
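The guessed loop can be written out in Python to confirm that it reproduces the output from the question. This is only a sketch of the assumed mechanism, with i standing in for the iterator's internal position; it is not the interpreter's actual code.

```python
a = list(range(10))
remove = False
i = 0                  # stand-in for the iterator's internal index
while i < len(a):      # the length is re-checked on every pass
    b = a[i]
    if remove:
        a.remove(b)    # everything after b shifts one place left
    remove = not remove
    i += 1             # the index advances even though the list shrank
print(a)  # [0, 2, 3, 5, 6, 8, 9]
```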
If you add reversed() to the for loop, you can traverse the list backwards while removing items and get the expected output, because an element's position in a list depends on the preceding elements, not the following ones. (Note that remove starts as True here, so the removals fall on the odd values.)
Therefore the code:
a = list(range(10))
remove = True
for b in reversed(a):
    if remove:
        a.remove(b)
    remove = not remove
print(a)
produces the expected: [0, 2, 4, 6, 8]
On your first iteration, you're not removing and everything's dandy.
On the second iteration you're at position [1] of the sequence, and you remove '1'. The iterator then takes you to position [2] in the sequence, which is now '3', so '2' gets skipped over (as '2' is now at position [1] because of the removal). Of course '3' doesn't get removed, so you go on to position [3] in the sequence, which is now '4'. That gets removed, taking you to position [4], which is now '6', and so on.
The fact that you're removing things means that a position gets skipped over every time you perform a removal.

What is the use case for negative slicing and indexing in lists?

Referring to 30 Python Language features
1.6 List slices with negative indexing:
>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> a[-4:-2]
[7, 8]
Where is negative slicing and indexing most commonly used?
Is there a case where it is even indispensable so that it must exist as a language feature?
One very common scenario where a negative attribute of the slice is handy is reversing of the sequence, i.e:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a[::-1]
where the slice has a negative step value.
Another, equally common, is grabbing the last element of a sequence with a[-1]. Without negative indexing you'd resort to ugly a[len(a)-1] code; now you can simply let Python add the len to the value behind the scenes without worrying.
This convenience has been around since at least version 1.4 (the oldest docs I could find). I doubt it is "indispensable" anywhere; it's just one of the many things that make Python a bit friendlier.
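As a quick illustration of what "adding the len behind the scenes" means, the negative forms below are compared with their explicit equivalents. This is just a sketch using the list from the question.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = len(a)

# A negative index -k refers to the same element as n - k.
last_matches = a[-1] == a[n - 1]
print(last_matches)  # True

# The same translation applies to both bounds of a slice.
tail = a[-4:-2]
explicit_tail = a[n - 4:n - 2]
print(tail, explicit_tail)  # [7, 8] [7, 8]
```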

Mysterious interaction between Python's slice bounds and "stride"

I understand that given an iterable such as
>>> it = [1, 2, 3, 4, 5, 6, 7, 8, 9]
I can turn it into a list and slice off the ends at arbitrary points with, for example
>>> it[1:-2]
[2, 3, 4, 5, 6, 7]
or reverse it with
>>> it[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
or combine the two with
>>> it[1:-2][::-1]
[7, 6, 5, 4, 3, 2]
However, trying to accomplish this in a single operation produces some results that puzzle me:
>>> it[1:-2:-1]
[]
>>> it[-1:2:-1]
[9, 8, 7, 6, 5, 4]
>>> it[-2:1:-1]
[8, 7, 6, 5, 4, 3]
Only after much trial and error, do I get what I'm looking for:
>>> it[-3:0:-1]
[7, 6, 5, 4, 3, 2]
This makes my head hurt (and can't help readers of my code):
>>> it[-3:0:-1] == it[1:-2][::-1]
True
How can I make sense of this? Should I even be pondering such things?
FWIW, my code does a lot of truncating, reversing, and listifying of iterables, and I was looking for something that was faster and clearer (yes, don't laugh) than list(reversed(it[1:-2])).
This is because in a slice like
list[start:stop:step]
start is inclusive: the resulting list starts at index start.
stop is exclusive: the resulting list stops short of stop, so the element at stop itself is never included (the last index is stop - 1 for a step of 1 and stop + 1 for a step of -1).
So in your case, it[1:-2], the 1 is inclusive, which means the result starts at index 1, whereas the -2 is exclusive, so the last element of the result comes from index -3.
Hence, to get the reverse of that, you have to write it[-3:0:-1]. Only then is index -3 included in the result, and the result runs down to index 1.
The important things to understand in your slices are
Start will be included in the slice
Stop will NOT be included in the slice
If you want to slice backwards, the step value should be a negative value.
Basically the range which you specify is a half-open (half-closed) range.
When you write it[-3:0:-1], you start at the third element from the back and step backwards one element at a time until you reach index 0, which is not included.
>>> it[-3:0:-1]
[7, 6, 5, 4, 3, 2]
Equivalently, you can write out the start value like this
>>> it[len(it)-3 : 0 : -1]
[7, 6, 5, 4, 3, 2]
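The inclusive/exclusive rule can be turned into a small helper that computes the single reversed slice for you. This is a sketch only; reversed_slice is a hypothetical name, not a built-in, and it assumes a step of 1 in the forward direction.

```python
def reversed_slice(seq, start, stop):
    """Return seq[start:stop][::-1] using a single slicing operation."""
    # Resolve negative/omitted bounds into plain indices.
    lo, hi, _ = slice(start, stop).indices(len(seq))
    if hi <= lo:
        return seq[0:0]  # empty slice of the same type
    # The reversed walk starts at hi - 1 (the last included index) and
    # must stop just before lo. When lo is 0 the stop has to be None,
    # because a stop of -1 would mean "the last element".
    new_stop = lo - 1 if lo > 0 else None
    return seq[hi - 1:new_stop:-1]

it = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(reversed_slice(it, 1, -2))  # [7, 6, 5, 4, 3, 2]
print(reversed_slice(it, 0, 3))   # [3, 2, 1]
```

The second call shows why hand-written reversed slices are error-prone: there is no integer stop that means "just before index 0".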
I think the other two answers disambiguate the usage of slicing and give a clearer image of how its parameters work.
But, since your question also involves readability -- which, let's not forget, is a big factor especially in Python -- I'd like to point out how you can improve it slightly by assigning slice() objects to variables thus removing all those hardcoded : separated numbers.
Your truncate-and-reverse slice could alternatively be given a name that implies its usage:
rev_slice = slice(-3, 0, -1)
This could live in some other config-like file. You could then use the named slice in slicing operations, which is a bit easier on the eyes:
it[rev_slice] # [7, 6, 5, 4, 3, 2]
This might be a trivial thing to mention, but I think it's probably worth it.
Why not create a function for readability:
def listify(it, start=0, stop=None, rev=False):
    if stop is None:
        the_list = it[start:]
    else:
        the_list = it[start:stop]
    if rev:
        return the_list[::-1]
    else:
        return the_list

listify(it, start=1, stop=-2) # [2, 3, 4, 5, 6, 7]
listify(it, start=1, stop=-2, rev=True) # [7, 6, 5, 4, 3, 2]
A good way to intuitively understand the Python slicing syntax is to see how it maps to the corresponding C for loop.
A slice like
x[a:b:c]
gives you the same elements as
for (int i = a; i < b; i += c) {
    ...
}
The special cases are just default values:
a defaults to 0
b defaults to len(x)
c defaults to 1
Plus one more special case:
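if c is negative, then the defaults for a and b are swapped (a defaults to the last index, len(x) - 1, and b to just before index 0) and the < is inverted to a >
In both directions, a negative a or b has len(x) added to it first.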
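The loop analogy can be checked in Python itself. The sketch below uses the built-in slice.indices() to apply the defaults and the negative-index adjustment, then walks the indices exactly like the C loop; slice_by_loop is a hypothetical helper written for this illustration.

```python
def slice_by_loop(x, a=None, b=None, c=None):
    """Emulate x[a:b:c] for a list with an explicit index loop."""
    # indices() fills in defaults and converts negative bounds.
    start, stop, step = slice(a, b, c).indices(len(x))
    out = []
    i = start
    # '<' for a positive step, '>' for a negative one.
    while (i < stop) if step > 0 else (i > stop):
        out.append(x[i])
        i += step
    return out

it = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(slice_by_loop(it, 1, -2))      # [2, 3, 4, 5, 6, 7]
print(slice_by_loop(it, 1, -2, -1))  # []
print(slice_by_loop(it, -3, 0, -1))  # [7, 6, 5, 4, 3, 2]
```

The empty result for it[1:-2:-1] falls out naturally: the loop starts at index 1 and would have to count down to reach index 7, so its condition fails immediately.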

Reversing the list in python

In [122]: a = range(10)
In [123]: a[: : -1]
Out[123]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Could you explain the expression a[: : -1]?
a[:] is clearly understandable -> "start from the beginning (space before the colon) and retrieve the list up to the end (space after the colon)"
But I am not getting what the two colons are actually doing in the expression a[: : -1].
A slice takes three arguments, just like range: start, stop and step:
[0, 1, 2, 3, 4, 5][0:4:2] == list(range(0, 4, 2)) # every second element from 0 to 3
The negative step causes the slice to work backwards through the iterable. Without a start and stop (i.e. just the step [::-1]) it starts from the end, as it is working backwards.
The third argument (after two :'s) is the step size. -1 can be interpreted as stepping backwards. In other words, reversing the list.
Try a step size of -2, i.e. a[::-2], and you'll get:
[9, 7, 5, 3, 1]
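To tie this back to the range comparison, the sketch below spells out which indices a negative-step slice visits when start and stop are omitted. It wraps range in list() so that it behaves the same on Python 3, where range no longer returns a list.

```python
a = list(range(10))
n = len(a)

# With start and stop omitted and a negative step, the slice visits
# indices n - 1, n - 1 + step, ... down to index 0.
full_reverse = [a[i] for i in range(n - 1, -1, -1)]
every_other = [a[i] for i in range(n - 1, -1, -2)]

print(full_reverse == a[::-1])  # True
print(every_other)              # [9, 7, 5, 3, 1]
print(a[::-2])                  # [9, 7, 5, 3, 1]
```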
Hope this helps.
More elaborate answers and explanations are in "Explain Python's slice notation".
