What is this behaviour of slicing in python?

What is this behaviour of slicing in python? - python

What happens when we give string[::-1] in python such that the string gets reversed we know that by default the places are acquired by the 0 index and the last index+1 of the string or the -last index and -1 then how on writing above it starts counting from last index whether neither -1 is present at the first place nor the last index so that it starts decreasing 1 from where I can get depth knowledge of working of slicing in python and what does it return like stuff

From the python documentation on slicing sequences like [i:j:k]
The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). When k is positive, i and j are reduced to len(s) if they are greater. When k is negative, i and j are reduced to len(s) - 1 if they are greater. If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.
https://docs.python.org/3/library/stdtypes.html#common-sequence-operations

Actually in case of python string slicing it takes three arguments:
[firstArgument:secondArgument:thirdArgument]
FirstArgument consists in the integer value from where the string get sliced.
SecondArgument tells about the up to where the string being sliced.
ThirdArgument tells about that how much step it get jumped
Example:
a="Hello"
print(a[::2])
# outputs 'hlo' without quoting.

Related

Why the final index using range in python is not equal to end parameter? [duplicate]

This question already has answers here:
Why are slice and range upper-bound exclusive?
(6 answers)
Closed last month.
>>> range(1,11)
gives you
[1,2,3,4,5,6,7,8,9,10]
Why not 1-11?
Did they just decide to do it like that at random or does it have some value I am not seeing?

Because it's more common to call range(0, 10) which returns [0,1,2,3,4,5,6,7,8,9] which contains 10 elements which equals len(range(0, 10)). Remember that programmers prefer 0-based indexing.
Also, consider the following common code snippet:
for i in range(len(li)):
pass
Could you see that if range() went up to exactly len(li) that this would be problematic? The programmer would need to explicitly subtract 1. This also follows the common trend of programmers preferring for(int i = 0; i < 10; i++) over for(int i = 0; i <= 9; i++).
If you are calling range with a start of 1 frequently, you might want to define your own function:
>>> def range1(start, end):
... return range(start, end+1)
...
>>> range1(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Although there are some useful algorithmic explanations here, I think it may help to add some simple 'real life' reasoning as to why it works this way, which I have found useful when introducing the subject to young newcomers:
With something like 'range(1,10)' confusion can arise from thinking that pair of parameters represents the "start and end".
It is actually start and "stop".
Now, if it were the "end" value then, yes, you might expect that number would be included as the final entry in the sequence. But it is not the "end".
Others mistakenly call that parameter "count" because if you only ever use 'range(n)' then it does, of course, iterate 'n' times. This logic breaks down when you add the start parameter.
So the key point is to remember its name: "stop".
That means it is the point at which, when reached, iteration will stop immediately. Not after that point.
So, while "start" does indeed represent the first value to be included, on reaching the "stop" value it 'breaks' rather than continuing to process 'that one as well' before stopping.
One analogy that I have used in explaining this to kids is that, ironically, it is better behaved than kids! It doesn't stop after it supposed to - it stops immediately without finishing what it was doing. (They get this ;) )
Another analogy - when you drive a car you don't pass a stop/yield/'give way' sign and end up with it sitting somewhere next to, or behind, your car. Technically you still haven't reached it when you do stop. It is not included in the 'things you passed on your journey'.
I hope some of that helps in explaining to Pythonitos/Pythonitas!

Exclusive ranges do have some benefits:
For one thing each item in range(0,n) is a valid index for lists of length n.
Also range(0,n) has a length of n, not n+1 which an inclusive range would.

It works well in combination with zero-based indexing and len(). For example, if you have 10 items in a list x, they are numbered 0-9. range(len(x)) gives you 0-9.
Of course, people will tell you it's more Pythonic to do for item in x or for index, item in enumerate(x) rather than for i in range(len(x)).
Slicing works that way too: foo[1:4] is items 1-3 of foo (keeping in mind that item 1 is actually the second item due to the zero-based indexing). For consistency, they should both work the same way.
I think of it as: "the first number you want, followed by the first number you don't want." If you want 1-10, the first number you don't want is 11, so it's range(1, 11).
If it becomes cumbersome in a particular application, it's easy enough to write a little helper function that adds 1 to the ending index and calls range().

It's also useful for splitting ranges; range(a,b) can be split into range(a, x) and range(x, b), whereas with inclusive range you would write either x-1 or x+1. While you rarely need to split ranges, you do tend to split lists quite often, which is one of the reasons slicing a list l[a:b] includes the a-th element but not the b-th. Then range having the same property makes it nicely consistent.

The length of the range is the top value minus the bottom value.
It's very similar to something like:
for (var i = 1; i < 11; i++) {
//i goes from 1 to 10 in here
}
in a C-style language.
Also like Ruby's range:
1...11 #this is a range from 1 to 10
However, Ruby recognises that many times you'll want to include the terminal value and offers the alternative syntax:
1..10 #this is also a range from 1 to 10

Consider the code
for i in range(10):
print "You'll see this 10 times", i
The idea is that you get a list of length y-x, which you can (as you see above) iterate over.
Read up on the python docs for range - they consider for-loop iteration the primary usecase.

Basically in python range(n) iterates n times, which is of exclusive nature that is why it does not give last value when it is being printed, we can create a function which gives
inclusive value it means it will also print last value mentioned in range.
def main():
for i in inclusive_range(25):
print(i, sep=" ")
def inclusive_range(*args):
numargs = len(args)
if numargs == 0:
raise TypeError("you need to write at least a value")
elif numargs == 1:
stop = args[0]
start = 0
step = 1
elif numargs == 2:
(start, stop) = args
step = 1
elif numargs == 3:
(start, stop, step) = args
else:
raise TypeError("Inclusive range was expected at most 3 arguments,got {}".format(numargs))
i = start
while i <= stop:
yield i
i += step
if __name__ == "__main__":
main()

The range(n) in python returns from 0 to n-1. Respectively, the range(1,n) from 1 to n-1.
So, if you want to omit the first value and get also the last value (n) you can do it very simply using the following code.
for i in range(1, n + 1):
print(i) #prints from 1 to n

It's just more convenient to reason about in many cases.
Basically, we could think of a range as an interval between start and end. If start <= end, the length of the interval between them is end - start. If len was actually defined as the length, you'd have:
len(range(start, end)) == start - end
However, we count the integers included in the range instead of measuring the length of the interval. To keep the above property true, we should include one of the endpoints and exclude the other.
Adding the step parameter is like introducing a unit of length. In that case, you'd expect
len(range(start, end, step)) == (start - end) / step
for length. To get the count, you just use integer division.

Two major uses of ranges in python. All things tend to fall in one or the other
integer. Use built-in: range(start, stop, step). To have stop included would mean that the end step would be assymetric for the general case. Consider range(0,5,3). If default behaviour would output 5 at the end, it would be broken.
floating pont. This is for numerical uses (where sometimes it happens to be integers too). Then use numpy.linspace.

What are the default values set to in a slice?

I have had a look at the top answers on why-does-list-1-not-equal-listlenlist-1 and what-are-the-default-slice-indices-in-python-really to learn about what the default values are set to in a slice. Both of the top answers refer to the following documentation:
Given s[i:j:k],
If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.
Suppose I have the following code:
s = "hello"
forward_s = s[::1]
print(forward_s) # => "hello"
backward_s = s[::-1]
print(backward_s) # => "olleh"
I know that if the indices are omitted then Python treats that as if the value of None was used in those places. According to the documentation the indices of i and j in [i:j:k] are set to "end" values depending on the sign of k. For a positive k, I'm assuming i is set to 0 and j is set to the length of the string. What are the values set for a negative k though?
For instance, the following also reverses the string:
reversed_str = s[-1:-6:-1]
So maybe the default value for i is set to -1 and j is set to -len(s) - 1 when k = -1?

The default values are None, None, and None.
class Sliceable(object):
def __getitem__(self, slice):
print(slice)
Sliceable()[::]
>>> slice(None, None, None)
This is regardless of the fact, that slice() does require a stop argument. The library documentation is a little bit less explicit on this, but the C API makes clear, that all three values maybe empty:
The start, stop, and step parameters are used as the values of the slice object attributes of the same names. Any of the values may be NULL, in which case the None will be used for the corresponding attribute.
It's up to the sliceable object to make sense of the default values. The convention is to use first element, past last element, minimal stepping as implemented by the builtin collections:
l = [1, 2, 3]
l[slice(None, None, None])
>>> [1, 2, 3]
s[None:None:None])
>>> [1, 2, 3]
Negative stepping will cause the default start and end values to be reversed semantically, i.e.:
s = 'abc'
s[slice(None, None, -1)]
>>> 'cba'
s[::-1]
>>> 'cba'
Note that this does not mean a simple value flip, the default value of end is typically "one past the end of the sequence, in whatever direction", since range() is not inclusive for the end value, but the default values for a slice should include the full sequence.
This is documented here:
s[i:j:k]
The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). When k is positive, i and j are reduced to len(s) if they are greater. When k is negative, i and j are reduced to len(s) - 1 if they are greater. If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.

Bisection index search - why compare to len(a)?

So I'm trying to understand bisection. I get that it's a useful, computation-saving algorithm, I get the general concept of what it does and how it does it.
What I don't get involves this search function that uses it, taken from https://docs.python.org/2/library/bisect.html
from bisect import bisect_left
def index(a, x):
'Locate the leftmost value exactly equal to x'
i = bisect_left(a, x)
if i != len(a) and a[i] == x:
return i
raise ValueError
Could somebody please break down for me exactly what the i != len(a) part of the if line does? I can read it - it checks whether the insertion index of x is equivalent to the length of the list a - but I can't understand it. Why is it necessary? What would happen without it?
I follow that, let's say x has an insertion index greater than the length of a, then x obviously doesn't exist in a so it spews out the error - but if that's the case then the a[i] == x check trips it up anyway...?
Thanks!

Since lists are indexed from 0 then a[i] doesn't make sense for i == len(a) (the last element has index len(a) - 1). Doing a[i] for such i will throw an IndexError.
Now if bisect_left(a, x) returns len(a) then it means that the element should be added at the end of the list in order to preserve the order.
On the other hand if there is a match for x in a then
bisect_left(a, x) != len(a)
because If x is already present in a, the insertion point will be before (to the left of) any existing entries. If it is before then obviously the insertion index has to be smaller or equal to the last index (since in worst case this matched element will have last index) and the last index is len(a)-1 which is smaller then the length.
All in all if bisect_left(a, x) == len(a) then there is no x in a. This allows us to easily get rid of "IndexError problem" that I've mentioned at the begining.
Note that this (obviously) is not true for bisect_right. However following similar logic you can reach something analogous: if bisect_right(a, x) == 0 then there is no x in a. Unfortunately implementing index function via bisect_right would be more difficult since you would still need this check for i == len(a) to avoid an IndexError.

Zero based indexing. The i != len(a) is to catch the exact case where i is equal to the length of the list, in which case a[i] will raise an index error (lists start at index 0 and go up to len(a) - 1).

Please explain Python sequence reversal

Not sure where I picked this up, but it stuck and I use it all the time.
Can someone explain how this string reversal works? I use it to test for palindromic strings without converting it to a mutable type first.
>>> word = "magic"
>>> magic = word[::-1]
>>> magic
'cigam'
I would put my best guess, but I don't want to walk in with any preconceptions about the internals behind this useful trick.

The slice notation goes like this:
my_list[start:end:step]
So, when you do [::-1], it means:
start: nothing (default)
end: nothing (default)
step: -1 (descendent order)
So, you're going from the end of the list (default) to the first element (default), decreasing the index by one (-1).
So, as many answers said, there is no sorting nor in-place swapping, just slice notation.

You can have a look here - it is an extended slice.

"What's New in Python 2.3", section 15, "Extended Slices".

This "trick" is just a particular instance of applying a slice operation to a sequence. You can use it to produce a reversed copy of a list or a tuple as well. Another "trick" from the same family: [:] is often used to produce a (shallow) copy of a list.
"What's new in Python 2.3" is an unexpected entry point into the maze. Let's start at a more obvious(?) place, the current 2.X documentation for sequence objects.
In the table of sequence operations, you'll see a row with Operation = s[i:j:k], Result = "slice of s from i to j with step k", and Notes = "(3)(5)".
Note 3 says "If i or j is negative, the index is relative to the end of the string: len(s) + i or len(s) + j is substituted. But note that -0 is still 0."
Note 5 says "The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). If i or j is greater than len(s), use len(s). If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1."
We have k == -1, so the indices used are i, i-1, i-2, i-3 and so on, stopping when j is reached (but never including j). To obtain the observed effect, the "end" value used for i must be len(s)-1, and the "end" value used for j must be -1. Thus the indices used are last, last-1, ..., 2, 1.
Another entry point is to consider how we might produce such a result for any sequence if [::-1] didn't exist in the language:
def reverse_traversal_of_sequence(s):
for x in range(len(s) - 1, -1, -1):
do_something_with(s[x])

Why does range(start, end) not include end? [duplicate]

This question already has answers here:
Why are slice and range upper-bound exclusive?
(6 answers)
Closed last month.
>>> range(1,11)
gives you
[1,2,3,4,5,6,7,8,9,10]
Why not 1-11?
Did they just decide to do it like that at random or does it have some value I am not seeing?

Because it's more common to call range(0, 10) which returns [0,1,2,3,4,5,6,7,8,9] which contains 10 elements which equals len(range(0, 10)). Remember that programmers prefer 0-based indexing.
Also, consider the following common code snippet:
for i in range(len(li)):
pass
Could you see that if range() went up to exactly len(li) that this would be problematic? The programmer would need to explicitly subtract 1. This also follows the common trend of programmers preferring for(int i = 0; i < 10; i++) over for(int i = 0; i <= 9; i++).
If you are calling range with a start of 1 frequently, you might want to define your own function:
>>> def range1(start, end):
... return range(start, end+1)
...
>>> range1(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Although there are some useful algorithmic explanations here, I think it may help to add some simple 'real life' reasoning as to why it works this way, which I have found useful when introducing the subject to young newcomers:
With something like 'range(1,10)' confusion can arise from thinking that pair of parameters represents the "start and end".
It is actually start and "stop".
Now, if it were the "end" value then, yes, you might expect that number would be included as the final entry in the sequence. But it is not the "end".
Others mistakenly call that parameter "count" because if you only ever use 'range(n)' then it does, of course, iterate 'n' times. This logic breaks down when you add the start parameter.
So the key point is to remember its name: "stop".
That means it is the point at which, when reached, iteration will stop immediately. Not after that point.
So, while "start" does indeed represent the first value to be included, on reaching the "stop" value it 'breaks' rather than continuing to process 'that one as well' before stopping.
One analogy that I have used in explaining this to kids is that, ironically, it is better behaved than kids! It doesn't stop after it supposed to - it stops immediately without finishing what it was doing. (They get this ;) )
Another analogy - when you drive a car you don't pass a stop/yield/'give way' sign and end up with it sitting somewhere next to, or behind, your car. Technically you still haven't reached it when you do stop. It is not included in the 'things you passed on your journey'.
I hope some of that helps in explaining to Pythonitos/Pythonitas!

Exclusive ranges do have some benefits:
For one thing each item in range(0,n) is a valid index for lists of length n.
Also range(0,n) has a length of n, not n+1 which an inclusive range would.

It works well in combination with zero-based indexing and len(). For example, if you have 10 items in a list x, they are numbered 0-9. range(len(x)) gives you 0-9.
Of course, people will tell you it's more Pythonic to do for item in x or for index, item in enumerate(x) rather than for i in range(len(x)).
Slicing works that way too: foo[1:4] is items 1-3 of foo (keeping in mind that item 1 is actually the second item due to the zero-based indexing). For consistency, they should both work the same way.
I think of it as: "the first number you want, followed by the first number you don't want." If you want 1-10, the first number you don't want is 11, so it's range(1, 11).
If it becomes cumbersome in a particular application, it's easy enough to write a little helper function that adds 1 to the ending index and calls range().

It's also useful for splitting ranges; range(a,b) can be split into range(a, x) and range(x, b), whereas with inclusive range you would write either x-1 or x+1. While you rarely need to split ranges, you do tend to split lists quite often, which is one of the reasons slicing a list l[a:b] includes the a-th element but not the b-th. Then range having the same property makes it nicely consistent.

The length of the range is the top value minus the bottom value.
It's very similar to something like:
for (var i = 1; i < 11; i++) {
//i goes from 1 to 10 in here
}
in a C-style language.
Also like Ruby's range:
1...11 #this is a range from 1 to 10
However, Ruby recognises that many times you'll want to include the terminal value and offers the alternative syntax:
1..10 #this is also a range from 1 to 10

Consider the code
for i in range(10):
print "You'll see this 10 times", i
The idea is that you get a list of length y-x, which you can (as you see above) iterate over.
Read up on the python docs for range - they consider for-loop iteration the primary usecase.

Basically in python range(n) iterates n times, which is of exclusive nature that is why it does not give last value when it is being printed, we can create a function which gives
inclusive value it means it will also print last value mentioned in range.
def main():
for i in inclusive_range(25):
print(i, sep=" ")
def inclusive_range(*args):
numargs = len(args)
if numargs == 0:
raise TypeError("you need to write at least a value")
elif numargs == 1:
stop = args[0]
start = 0
step = 1
elif numargs == 2:
(start, stop) = args
step = 1
elif numargs == 3:
(start, stop, step) = args
else:
raise TypeError("Inclusive range was expected at most 3 arguments,got {}".format(numargs))
i = start
while i <= stop:
yield i
i += step
if __name__ == "__main__":
main()

The range(n) in python returns from 0 to n-1. Respectively, the range(1,n) from 1 to n-1.
So, if you want to omit the first value and get also the last value (n) you can do it very simply using the following code.
for i in range(1, n + 1):
print(i) #prints from 1 to n

It's just more convenient to reason about in many cases.
Basically, we could think of a range as an interval between start and end. If start <= end, the length of the interval between them is end - start. If len was actually defined as the length, you'd have:
len(range(start, end)) == start - end
However, we count the integers included in the range instead of measuring the length of the interval. To keep the above property true, we should include one of the endpoints and exclude the other.
Adding the step parameter is like introducing a unit of length. In that case, you'd expect
len(range(start, end, step)) == (start - end) / step
for length. To get the count, you just use integer division.

Two major uses of ranges in python. All things tend to fall in one or the other
integer. Use built-in: range(start, stop, step). To have stop included would mean that the end step would be assymetric for the general case. Consider range(0,5,3). If default behaviour would output 5 at the end, it would be broken.
floating pont. This is for numerical uses (where sometimes it happens to be integers too). Then use numpy.linspace.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

What is this behaviour of slicing in python? - python

Related

Why the final index using range in python is not equal to end parameter? [duplicate]

What are the default values set to in a slice?

Bisection index search - why compare to len(a)?

Please explain Python sequence reversal

Why does range(start, end) not include end? [duplicate]

Categories

Resources