Python tuple ... is not a tuple? What does the comma do? [duplicate] - python

This question already has answers here:
Why does adding a trailing comma after an expression create a tuple?
(6 answers)
Closed last month.
I was looking at code in my course material and had to write a function which adds the value 99 to either a list or tuple. The final code looks like this:
def f(l):
    print(l)
    l += 99,
    print(l)

f([1,2,3])
f((1,2,3))
This was used to show something different, but I'm getting somewhat hung up on the line l += 99,. What it does is create an iterable that contains the 99; both list and tuple support the simple "addition" of such an object, either creating a new instance or adding a new element.
What I don't really get is what exactly is created by the syntax element,. If I do an assignment like x = 99, then type(x) will be tuple, but if I try to run x = tuple(99) it fails because 99 is not iterable. So is there:
Some kind of intermediate iterable object created using the syntax element,?
Is there a special function defined that would allow the calling of tuple without an iterable and somehow , is mapped to that?
Edit:
In case anyone wonders why the accepted answer is the one it is: the explanation for my second question made it. I should've been clearer with my question, but that += is what actually got me confused, and this answer includes information on this.

If the left-hand side of = is a simple name, the type of the object currently bound to that name is irrelevant. tuple(99) fails because tuple's argument is not iterable; it has nothing to do with whether or not x already refers to an instance of tuple.
99, creates a tuple with a single element; parentheses are only necessary to separate it from other uses of commas. For example, foo((99,100)) calls foo with a single tuple argument, while foo(99,100) calls foo with two distinct int arguments.
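To make the point above concrete, here is a small sketch. The helper count_args is a made-up name used only for illustration; everything else is plain built-in behaviour.

```python
x = 99,            # the trailing comma builds a one-element tuple
y = (99,)          # same value; the parentheses are optional here
z = (99)           # no comma: just the int 99 in parentheses

print(type(x).__name__, x)   # tuple (99,)
print(x == y)                # True
print(type(z).__name__)      # int


def count_args(*args):
    """Hypothetical helper: report how many positional arguments arrived."""
    return len(args)


print(count_args((99, 100)))  # 1 -> a single tuple argument
print(count_args(99, 100))    # 2 -> two separate int arguments
```

So it is the comma, not the parentheses, that makes the tuple; the parentheses only matter where a comma would otherwise be read as an argument separator.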

The syntax element, simply creates an "intermediate" tuple, not some other kind of object (though a tuple is of course iterable).
However, sometimes you need to use parentheses in order to avoid ambiguity. For this reason, you'll often see this:
l += (99,)
...even though the parentheses are not syntactically necessary. I also happen to think that is easier to read. But the parentheses ARE syntactically necessary in other situations, which you have already discovered:
list((99,))
tuple((99,))
set((99,))
You can also do these, since [] makes a list:
list([99])
tuple([99])
set([99])
...but you can't do these, because inside a call's argument list the trailing comma is just an argument separator, so 99, does not form a tuple object there:
list(99,)
tuple(99,)
set(99,)
To answer your second question, no, there is not a way to make the tuple() function receive a non-iterable. In fact this is the purpose of the element, or (element,) syntax - very similar to [] for list and {} for dict and set (since the list, dict, and set functions all also require iterable arguments):
[99] #list
(99,) #tuple - note the comma is required
{99} #set
As discussed in the question comments, it is surprising that you can extend a list in place (+=) using a tuple object. Note that you cannot do this:
l = [1]
l + (2,) # error
This is inconsistent, so it is probably something that should not have been allowed. Instead, you would need to do one of these:
l += [2]
l += list((2,))
However, fixing it would create problems for people (not to mention remove a ripe opportunity for confusion exploitation by evil computer science professors), so they didn't.
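For anyone who wants to see the inconsistency described in this answer, here is a minimal sketch. The variable names are arbitrary and the exact TypeError wording may differ between Python versions, so it is printed rather than assumed.

```python
l = [1]
alias = l
l += (2,)              # in-place extend: accepts any iterable
print(l)               # [1, 2]
print(alias is l)      # True -> the same list object was mutated

try:
    l + (3,)           # plain + insists on another list
except TypeError as exc:
    print("TypeError:", exc)

t = (1,)
original = t
t += (2,)              # tuples are immutable, so t is rebound to a new tuple
print(t)               # (1, 2)
print(t is original)   # False
print(original)        # (1,)
```

This is also why the function f in the question behaves differently for its two calls: the list passed in is changed for the caller, while the tuple is not.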

The tuple constructor requires an iterable (like it says in your error message) so in order to do x = tuple(99), you need to include it in an iterable like a list:
x = tuple([99])
or
x = tuple((99,))


Python - Random.shuffle returns nonetype instead of list whereas random.choice returns str or int [duplicate]

This question already has answers here:
Why does random.shuffle return None?
(5 answers)
Closed 5 months ago.
In the code shown in the snapshot, look at line 708 (random.choice). I assigned the result of this method to a variable to capture the randomly chosen element. This works fine: line 711 prints the type str, as expected.
By contrast, on line 726 (random.shuffle) I used another method, random.shuffle, which shuffles the list, and tried to assign its result to a variable, but it returns NoneType.
Question 1: Why does line 711 report 'str' as expected, whereas line 727 reports NoneType? Both use methods, one on a str and the other on a list.
Question 2: How do I change the NoneType to a list type (see lines 726 and 727)?
To put the question simply: random.shuffle returns NoneType instead of a list, whereas random.choice returns a str or int as appropriate. Why do the two methods behave differently? What is the rule, and the logic behind the rule?
How to change the nonetype to List type
You don't. Realise what it means for a method to apply an in place algorithm: it means that the result is available in the variable that you have provided as argument. There is no need to assign it, as you already have it: the variable you passed as argument has the result. See also the answers to your previous question.
Why there is a difference in behaviors of two different methods. What is the rule and the logic behind the rule?
There is indeed a logic as to why these methods were designed like that:
If the method intends to mutate an argument, then it should not return anything other than None
So in this particular case we have random.choice. This method is not about mutating a list. It is a "read-only" method, that just chooses an element from a given list. There is no mutation going on, and no new object is created. It merely uses what is already there. So this is the perfect situation for returning the result.
On the other hand we have random.shuffle. This method suggests (by its name) that it will rearrange the given list (randomly): that is a mutation. The same can be said about sort: it also mutates the list on which that method is called. So here the principle is to not return anything. If the name of the method were random.shuffled, then it would have been a different story: that name suggests you would get a variant of the given list that is rearranged; in other words: a new list would be made. In that case, the original list would not be mutated, and it would be appropriate for the new list to be the returned value. This method does not exist, but the same idea does exist for sorting: sorted returns a new list.
I hope this explains why the language designers have designed these methods to work like that.
random.shuffle() returns None since it shuffles the list in place, it does not return a random value from the list.
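A short sketch of the difference, using a made-up list called cards. Since there is no random.shuffled, the last part uses random.sample with k equal to the list length, which is one way (an assumption about what you want, not the only option) to get a shuffled copy.

```python
import random

cards = ["a", "b", "c", "d"]

result = random.shuffle(cards)   # shuffles cards in place
print(result)                    # None
print(cards)                     # same list object, new order

picked = random.choice(cards)    # reads one element, mutates nothing
print(type(picked).__name__)     # str

# A shuffled copy that leaves the original untouched.
original = ["a", "b", "c", "d"]
shuffled_copy = random.sample(original, k=len(original))
print(original)                  # ['a', 'b', 'c', 'd']
print(shuffled_copy)             # a new list in random order
```

In other words, keep using the variable you passed to random.shuffle; do not assign its return value.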

Why does Python return None on list.reverse()?

Was solving an algorithms problem and had to reverse a list.
When done, this is what my code looked like:
def construct_path_using_dict(previous_nodes, end_node):
    constructed_path = []
    current_node = end_node
    while current_node:
        constructed_path.append(current_node)
        current_node = previous_nodes[current_node]
    constructed_path = reverse(constructed_path)
    return constructed_path
But, along the way, I tried return constructed_path.reverse() and I realized it wasn't returning a list...
Why was it made this way?
Shouldn't it make sense that I should be able to return a reversed list directly, without first doing list.reverse() or list = reverse(list) ?
What I'm about to write was already said here, but I'll write it anyway because I think it will perhaps add some clarity.
You're asking why the reverse method doesn't return a (reference to the) result, and instead modifies the list in-place. In the official python tutorial, it says this on the matter:
You might have noticed that methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
In other words (or at least, this is the way I think about it): Python tries to mutate in place wherever possible (that is, when dealing with a mutable data structure), and when it mutates in place, it doesn't also return a reference to the list, because then it would appear that it is returning a new list when it is really returning the old one.
To be clear, this is only true for object methods, not functions that take a list, for example, because the function has no way of knowing whether or not it can mutate the iterable that was passed in. Are you passing a list or a tuple? The function has no way of knowing, unlike an object method.
list.reverse reverses in place, modifying the list it was called on. Generally, Python methods that operate in place don’t return what they operated on to avoid confusion over whether the returned value is a copy.
You can reverse and return the original list:
constructed_path.reverse()
return constructed_path
Or return a reverse iterator over the original list, which isn’t a list but doesn’t involve creating a second list just as big as the first:
return reversed(constructed_path)
Or return a new list containing the reversed elements of the original list:
return constructed_path[::-1]
# equivalent: return list(reversed(constructed_path))
If you’re not concerned about performance, just pick the option you find most readable.
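The three options above can be compared side by side in a quick sketch; path is just a stand-in list for illustration.

```python
path = [3, 2, 1]

print(path.reverse())        # None -> the list was reversed in place
print(path)                  # [1, 2, 3]

it = reversed(path)          # lazy reverse iterator over path
print(isinstance(it, list))  # False
print(list(it))              # [3, 2, 1]

copy = path[::-1]            # a new list; path itself is unchanged
print(copy, path)            # [3, 2, 1] [1, 2, 3]
```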
methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
PyDocs 5.1
As I understand it, you can see the distinction quickly by comparing the results of mutating a list (mutable), e.g. with list.reverse(), and mutating a list that is an element of a tuple (immutable), while calling
id(list)
id(tuple_with_list)
before and after the mutations. Having mutating methods return None goes hand in hand with letting mutable objects be changed, expanded, or referenced from several places without allocating a new object.

Change in Type of Brackets changes type of return [duplicate]

This question already has answers here:
Generator expressions vs. list comprehensions
(13 answers)
Closed 3 years ago.
I'm writing some code in my project and came across a problem that I solved, but I don't understand how the fix works. When I change the type of brackets used in the code, the value in years is different.
When I use square brackets at the start and end of the expression after = on line 2:
import datetime
years=[x for x in range(2015,datetime.datetime.now().year)]
print(years) gives the output [2015,2016,2017,2018]
But when I use round brackets on line 2, like this:
years=(x for x in range(1940,datetime.datetime.now().year))
printing it gives the output <generator object <genexpr> at 0x041DB630>
I don't understand why this happens. Can anyone please explain? Thanks
These are two different, though related, constructs.
[x for x in range(2015,datetime.datetime.now().year)]
is known as a list comprehension, whereas
(x for x in range(2015,datetime.datetime.now().year))
is known as a generator expression.
Read more at https://djangostars.com/blog/list-comprehensions-and-generator-expressions/
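A small sketch of how the two constructs behave differently at runtime. The names squares_list and squares_gen are arbitrary, and a fixed range is used instead of the current year so the output is predictable.

```python
squares_list = [n * n for n in range(5)]   # built immediately, stored in memory
squares_gen = (n * n for n in range(5))    # nothing is computed yet

print(squares_list)          # [0, 1, 4, 9, 16]
print(next(squares_gen))     # 0
print(list(squares_gen))     # [1, 4, 9, 16] -> only the remaining items
print(list(squares_gen))     # [] -> a generator can be consumed only once

# A generator expression can be passed straight to a function that iterates.
print(sum(n * n for n in range(5)))  # 30
```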
Generators are functions that can be paused and resumed on the fly, returning an object that can be iterated over. Unlike lists, they are lazy and thus produce items one at a time and only when asked. So they are much more memory efficient when dealing with large datasets.
Just like list comprehensions, generators can also be written in the same manner except they return a generator object rather than a list:
>>> my_list = ['a', 'b', 'c', 'd']
>>> gen_obj = (x for x in my_list)
>>> for val in gen_obj:
...     print(val)
...
a
b
c
d
Here are the explanations:
With round brackets it's called a generator expression, where you would have to do list(..) to make it a list and tuple(..) to make it a tuple and so on... more on the documentation
Generator iterators are created by functions that use the yield keyword. The real difference between them and ordinary functions is that yield, unlike return, is both an exit and an entry point for the function's body. That means that after each yield the generator not only hands back a value but also remembers its state. Calling next() on the generator resumes execution right after the last executed yield. Execution then runs to the next yield it reaches (a yield inside a loop can run many times), and iteration ends when the function body finishes.
With square brackets it's called a list comprehension, where it would give a list, since square brackets are for lists, more on the documentation
A list comprehension follows the form of the mathematical set-builder notation (set comprehension) as distinct from the use of map() and filter() functions.
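To illustrate the yield behaviour described above, here is a minimal sketch with a made-up generator function called countdown. Note that the single yield sits inside a loop, so it runs once per iteration.

```python
def countdown(start):
    """Hypothetical generator function: yields start, start - 1, ..., 1."""
    while start > 0:
        yield start      # pause here; resume on the next next() call
        start -= 1


gen = countdown(3)
print(next(gen))   # 3
print(next(gen))   # 2 -> state (the value of start) was remembered
print(list(gen))   # [1] -> whatever is left
print(list(gen))   # [] -> exhausted
```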
What you are trying is comprehension and it works by looping or iterating over items and assigning them into a container.
Below is the list comprehension using square brackets:
[thing for thing in things]
But what you have tried uses parentheses, which gives a generator expression, not a tuple comprehension, as parentheses are reserved for generator expressions. Hence:
(thing for thing in things)
will result in a generator iterator, not a tuple. To get a tuple, pass the generator expression to tuple():
tuple(thing for thing in things)
You are creating a generator expression in the 2nd instance.
You would need to wrap it in list() or tuple() to get a concrete list or tuple that you can print.
While in the 1st instance you're generating a list.
You can read more about the issue in Getting <generator object <genexpr>

Why does isinstance require a tuple instead of any iterable? [duplicate]

I've been playing for a bit with startswith() and I've discovered something interesting:
>>> tup = ('1', '2', '3')
>>> lis = ['1', '2', '3', '4']
>>> '1'.startswith(tup)
True
>>> '1'.startswith(lis)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: startswith first arg must be str or a tuple of str, not list
Now, the error is obvious and casting the list into a tuple will work just fine as it did in the first place:
>>> '1'.startswith(tuple(lis))
True
Now, my question is: why the first argument must be str or a tuple of str prefixes, but not a list of str prefixes?
AFAIK, the Python code for startswith() might look like this:
def startswith(src, prefix):
    return src[:len(prefix)] == prefix
But that just confuses me more, because even with it in mind, it still shouldn't make any difference whether it is a list or a tuple. What am I missing?
There is technically no reason to accept other sequence types, no. The source code roughly does this:
if isinstance(prefix, tuple):
    for substring in prefix:
        if not isinstance(substring, str):
            raise TypeError(...)
        if tailmatch(...):
            return True
    return False
elif not isinstance(prefix, str):
    raise TypeError(...)
return tailmatch(...)
(where tailmatch(...) does the actual matching work).
So yes, any iterable would do for that for loop. But, all the other string test APIs (as well as isinstance() and issubclass()) that take multiple values also only accept tuples, and this tells you as a user of the API that it is safe to assume that the value won't be mutated. You can't mutate a tuple but the method could in theory mutate the list.
Also note that you usually test for a fixed number of prefixes or suffixes or classes (in the case of isinstance() and issubclass()); the implementation is not suited for a large number of elements. A tuple implies that you have a limited number of elements, while lists can be arbitrarily large.
Next, if any iterable or sequence type would be acceptable, then that would include strings; a single string is also a sequence. Should then a single string argument be treated as separate characters, or as a single prefix?
So in other words, it's a limitation to self-document that the sequence won't be mutated, is consistent with other APIs, it carries an implication of a limited number of items to test against, and removes ambiguity as to how a single string argument should be treated.
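In practice this means converting to a tuple at the call site. A small sketch follows; the prefixes list and the example.com URL are placeholders, and the TypeError text is printed rather than assumed because its wording can vary.

```python
prefixes = ["http://", "https://"]      # a list, e.g. loaded from configuration
url = "https://example.com"

print(url.startswith(tuple(prefixes)))  # True
print(url.startswith("https"))          # True -> a single str is one prefix

try:
    url.startswith(prefixes)            # a list is rejected
except TypeError as exc:
    print("TypeError:", exc)

# isinstance() follows the same convention: one class or a tuple of classes.
print(isinstance(3, (int, float)))      # True
```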
Note that this was brought up before on the Python Ideas list; see this thread; Guido van Rossum's main argument there is that you either special case for single strings or for only accepting a tuple. He picked the latter and doesn't see a need to change this.
This has already been suggested on Python-ideas a couple of years back see: str.startswith taking any iterator instead of just tuple and GvR had this to say:
The current behavior is intentional, and the ambiguity of strings
themselves being iterables is the main reason. Since startswith() is
almost always called with a literal or tuple of literals anyway, I see
little need to extend the semantics.
In addition to that, there seemed to be no real motivation as to why to do this.
The current approach keeps things simple and fast,
unicode_startswith (and endswith) check for a tuple argument and then for a string one. They then call tailmatch in the appropriate direction. This is, arguably, very easy to understand in its current state, even for strangers to C code.
Adding other cases will only lead to more bloated and complex code for little benefit while also requiring similar changes to any other parts of the unicode object.
On a similar note, here is an excerpt from a talk by core developer, Raymond Hettinger discussing API design choices regarding certain string methods, including recent changes to the str.startswith signature. While he briefly mentions this fact that str.startswith accepts a string or tuple of strings and does not expound, the talk is informative on the decisions and pain points both core developers and contributors have dealt with leading up to the present API.

Is arr.__len__() the preferred way to get the length of an array in Python? [duplicate]

This question already has answers here:
How do I get the number of elements in a list (length of a list) in Python?
(11 answers)
Closed 13 days ago.
The community is reviewing whether to reopen this question as of 8 days ago.
In Python, is the following the only way to get the number of elements?
arr.__len__()
If so, why the strange syntax?
my_list = [1,2,3,4,5]
len(my_list)
# 5
The same works for tuples:
my_tuple = (1,2,3,4,5)
len(my_tuple)
# 5
And strings, which are really just arrays of characters:
my_string = 'hello world'
len(my_string)
# 11
It was intentionally done this way so that lists, tuples and other container types or iterables didn't all need to explicitly implement a public .length() method, instead you can just check the len() of anything that implements the 'magic' __len__() method.
Sure, this may seem redundant, but length checking implementations can vary considerably, even within the same language. It's not uncommon to see one collection type use a .length() method while another type uses a .length property, while yet another uses .count(). Having a single built-in function unifies the entry point for all these types. So even objects you may not consider to be lists of elements could still be length-checked. This includes strings, queues, trees, etc.
The functional nature of len() also lends itself well to functional styles of programming.
lengths = map(len, list_of_containers)
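Here is that line as a runnable sketch, with a made-up list_of_containers. In Python 3, map returns a lazy iterator, so it is wrapped in list() to show the values.

```python
list_of_containers = [[1, 2, 3], "ab", (), {"k": 1}]

lengths = list(map(len, list_of_containers))
print(lengths)   # [3, 2, 0, 1]

# Because len is an ordinary function, it also works as a key function.
longest = max(list_of_containers, key=len)
print(longest)   # [1, 2, 3]
```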
The way you take a length of anything for which that makes sense (a list, dictionary, tuple, string, ...) is to call len on it.
l = [1,2,3,4]
s = 'abcde'
len(l) #returns 4
len(s) #returns 5
The reason for the "strange" syntax is that internally python translates len(object) into object.__len__(). This applies to any object. So, if you are defining some class and it makes sense for it to have a length, just define a __len__() method on it and then one can call len on those instances.
Just use len(arr):
>>> import array
>>> arr = array.array('i')
>>> arr.append(2)
>>> arr.__len__()
1
>>> len(arr)
1
Python uses duck typing: it doesn't care about what an object is, as long as it has the appropriate interface for the situation at hand. When you call the built-in function len() on an object, you are actually calling its internal __len__ method. A custom object can implement this interface and len() will return the answer, even if the object is not conceptually a sequence.
For a complete list of interfaces, have a look here: http://docs.python.org/reference/datamodel.html#basic-customization
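As a minimal sketch of that interface, here is a made-up Playlist class (not from any library) that supports len() simply by defining __len__.

```python
class Playlist:
    """Hypothetical container that is not a sequence but still has a length."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(playlist) ends up here
        return len(self._tracks)


p = Playlist(["intro", "verse", "outro"])
print(len(p))               # 3 -> the preferred spelling
print(p.__len__())          # 3 -> works, but is not idiomatic
print(bool(Playlist([])))   # False -> truthiness falls back to __len__
```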
The preferred way to get the length of any python object is to pass it as an argument to the len function. Internally, python will then try to call the special __len__ method of the object that was passed.
You can use len(arr),
as suggested in previous answers, to get the length of the array. If arr is a 2D NumPy array and you want its dimensions, arr.shape returns its height and width as a tuple.
len(list_name) function takes list as a parameter and it calls list's __len__() function.
Python suggests that users call len() instead of __len__() for consistency, as others have said. However, there are some other benefits:
For some built-in types like list, str, bytearray and so on, the CPython implementation of len() takes a shortcut. It directly returns the ob_size field of a C structure, which is faster than calling __len__().
If you are interested in such details, you could read the book called "Fluent Python" by Luciano Ramalho. There're many interesting details in it, and may help you understand Python more deeply.
