[].append(x) behaviour - python

This executes as I'd expect:
>>> x = []
>>> x.append(3)
>>> x
[3]
Why does the following return None?
>>> x = [].append(3)
>>> x
>>>

because list.append changes the list itself and returns None ;)
You can use help to see the docstring of a function or method:
In [11]: help(list.append)
Help on method_descriptor:
append(...)
L.append(object) -- append object to end
EDIT:
This is explained in docs of python3:
Some collection classes are mutable. The methods that add, subtract, or rearrange
their members in place, and don’t return a specific item, never return the
collection instance itself but None.
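To make the point above concrete, here is a minimal sketch (the variable names are made up for illustration). The call mutates the temporary list and evaluates to None, so you need to keep a reference to the list itself if you want to see the result.

```python
# append() mutates the list and evaluates to None
result = [].append(3)

# Keep a reference to the list first, then mutate it
x = []
x.append(3)

# Or build the list in a single expression
y = [] + [3]

print(result, x, y)  # None [3] [3]
```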

If the question is "why is x None?", then it's because list.append returns None as others say.
If the question is "why was append designed to return None instead of self?" then ultimately it is because of Guido van Rossum's decision, which he explains here as it applies to a related case:
https://mail.python.org/pipermail/python-dev/2003-October/038855.html
I'd like to explain once more why I'm so adamant that sort() shouldn't
return 'self'.
This comes from a coding style (popular in various other languages, I
believe especially Lisp revels in it) where a series of side effects
on a single object can be chained like this:
x.compress().chop(y).sort(z)
which would be the same as
x.compress()
x.chop(y)
x.sort(z)
I find the chaining form a threat to readability;
In short, he doesn't like methods that return self because he doesn't like method chaining. To prevent you from chaining methods like x.append(3).append(4).append(5) he returns None from append.
I speculate that this could perhaps be considered a specific case of the more general principle that distinguishes between:
pure functions, that have no side-effects and return values
procedures, that have side-effects and do not return values
Of course the Python language doesn't make any such distinction, and the Python libraries do not apply the general principle. For example list.pop() has a side-effect and returns a value, but since the value it returns isn't (necessarily) self it doesn't violate GvR's more specific rule.
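As a rough illustration of that last point (a sketch with made-up names, not anything taken from the mailing list thread): list.pop() both mutates and returns, but what it returns is the removed item, while list.sort() only mutates.

```python
stack = [1, 2, 3]

# list.pop() has a side effect and also returns a value,
# but the value is the removed item, not the list itself
last = stack.pop()

# list.sort() has only a side effect, so it evaluates to None
outcome = stack.sort()

print(last, stack, outcome)  # 3 [1, 2] None
```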

The append method of lists returns None. It only modifies the list it is bound to. The same happens with:
x = {}.update(a=3)
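A small sketch of the dict case and two ways around it (the names d and e are just for illustration):

```python
# update() evaluates to None, so the dict itself is lost
x = {}.update(a=3)

# Keep the reference first...
d = {}
d.update(a=3)

# ...or build a new dict in one expression
e = dict({}, a=3)

print(x, d, e)  # None {'a': 3} {'a': 3}
```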

The method list.append() changes the list in place and returns no result (so it returns None, if you prefer).
Many list methods work on the list in place, so you don't need to assign a new list over the old one.
>>> lst = []
>>> id(lst)
4294245644L
>>> lst.append(1)
>>> id(lst)
4294245644L # <-- same object, doesn't change.
With [].append(1), you're adding 1 to a freshly created list that nothing else references. So once the append is done, you have lost the list (and it will be collected by the garbage collector).
By the way, a fun fact to back up my answer:
>>> id([].append(1))
1852276280
>>> id(None)
1852276280

Related

Why does Python return None on list.reverse()?

Was solving an algorithms problem and had to reverse a list.
When done, this is what my code looked like:
def construct_path_using_dict(previous_nodes, end_node):
    constructed_path = []
    current_node = end_node
    while current_node:
        constructed_path.append(current_node)
        current_node = previous_nodes[current_node]
    constructed_path.reverse()  # reverses in place; there is no built-in reverse() function
    return constructed_path
But, along the way, I tried return constructed_path.reverse() and I realized it wasn't returning a list...
Why was it made this way?
Shouldn't it make sense that I should be able to return a reversed list directly, without first doing list.reverse() or list = reverse(list) ?
What I'm about to write was already said here, but I'll write it anyway because I think it will perhaps add some clarity.
You're asking why the reverse method doesn't return a (reference to the) result, and instead modifies the list in-place. In the official python tutorial, it says this on the matter:
You might have noticed that methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
In other words (or at least, this is the way I think about it): Python tries to mutate in place wherever possible (that is, when dealing with a mutable data structure), and when it mutates in place, it doesn't also return a reference to the list, because then it would appear that it is returning a new list, when it is really returning the old list.
To be clear, this is only true for object methods, not functions that take a list, for example, because the function has no way of knowing whether or not it can mutate the iterable that was passed in. Are you passing a list or a tuple? The function has no way of knowing, unlike an object method.
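One way to see the method/function split described above is to compare list.sort() with the built-in sorted(). This is only a sketch with made-up variable names:

```python
numbers_list = [3, 1, 2]
numbers_tuple = (3, 1, 2)

# The method knows it is working on a list, so it sorts in place
returned = numbers_list.sort()

# The function accepts any iterable (even an immutable tuple),
# so it builds and returns a new list instead of mutating its input
from_tuple = sorted(numbers_tuple)

print(returned, numbers_list)     # None [1, 2, 3]
print(from_tuple, numbers_tuple)  # [1, 2, 3] (3, 1, 2)
```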
list.reverse reverses in place, modifying the list it was called on. Generally, Python methods that operate in place don’t return what they operated on to avoid confusion over whether the returned value is a copy.
You can reverse and return the original list:
constructed_path.reverse()
return constructed_path
Or return a reverse iterator over the original list, which isn’t a list but doesn’t involve creating a second list just as big as the first:
return reversed(constructed_path)
Or return a new list containing the reversed elements of the original list:
return constructed_path[::-1]
# equivalent: return list(reversed(constructed_path))
If you’re not concerned about performance, just pick the option you find most readable.
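The three options above can be compared side by side. This is a sketch using a made-up path list rather than the asker's data:

```python
path = ["end", "middle", "start"]

# A new list containing the reversed elements; the original is untouched
new_list = path[::-1]

# A lazy iterator over the original list; no second list is built
# until something (here, list()) consumes it
from_iterator = list(reversed(path))

# Reverse in place; the call itself evaluates to None
returned = path.reverse()

print(new_list, from_iterator, returned, path)
```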
methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
PyDocs 5.1
As I understand it, you can see the distinction quickly by mutating a list (mutable) with list.reverse(), and by mutating a list that is an element of a tuple (immutable), while calling
id(list)
id(tuple_with_list)
before and after the mutations. Having mutations of mutable data types return None is part of what allows them to be changed, expanded, or pointed to by multiple references without reallocating memory.
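A minimal sketch of that comparison, with made-up names (inner, holder): neither id changes, because the list is mutated in place and the tuple still holds a reference to the same list object.

```python
inner = [1, 2, 3]
holder = (inner, "label")

id_list_before = id(inner)
id_tuple_before = id(holder)

inner.reverse()  # mutates in place and evaluates to None

same_list = id(inner) == id_list_before
same_tuple = id(holder) == id_tuple_before
print(same_list, same_tuple, holder)  # True True ([3, 2, 1], 'label')
```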

Should I ever return a list that was passed by reference and modified?

I have recently discovered that lists in python are automatically passed by reference (unless the notation array[:] is used). For example, these two functions do the same thing:
def foo(z):
    z.append(3)
def bar(z):
    z.append(3)
    return z
x = [1, 2]
y = [1, 2]
foo(x)
bar(y)
print(x, y)
Before now, I always returned arrays that I manipulated, because I thought I had to. Now, I understand it's superfluous (and perhaps inefficient), but it seems like returning values is generally good practice for code readability. My question is: are there any issues with either of these approaches, and what are the best practices? Is there a third option that I am missing? I'm sorry if this has been asked before, but I couldn't find anything that really answers my question.
This answer works on the assumption that the decision as to whether to modify your input in-place or return a copy has already been made.
As you noted, whether or not to return a modified object is a matter of opinion, since the result is functionally equivalent. In general, it is considered good form to not return a list that is modified in-place. According to the Zen of Python (item #2):
Explicit is better than implicit.
This is borne out in the standard library. List methods are notorious for this on SO: list.append, insert, extend, list.sort, etc.
Numpy also uses this pattern frequently, since it often deals with large data sets that would be impractical to copy and return. A common example is the array method numpy.ndarray.sort, not to be confused with the top-level function numpy.sort, which returns a new copy.
The idea is something that is very much a part of the Python way of thinking. Here is an excerpt from Guido's email that explains the whys and wherefores:
I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. The second [unchained] form makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else.
Python built-ins, as a rule, will not do both, to avoid confusion over whether the function/method modifies its argument in place or returns a new value. When modifying in place, no return is performed (making it implicitly return None). The exceptions are cases where a mutating function returns something other than the object mutated (e.g. dict.pop, dict.setdefault).
It's generally a good idea to follow the same pattern, to avoid confusion.
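The exceptions mentioned above look like this in practice. This is a sketch; the settings dict is made up for illustration:

```python
settings = {"debug": True}

# dict.pop mutates the dict and returns the removed value, not the dict
removed = settings.pop("debug")

# dict.setdefault may insert a key and returns the value stored under it
level = settings.setdefault("level", "info")

print(removed, level, settings)  # True info {'level': 'info'}
```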
The "best practice" is technically to not modify the thing at all:
def baz(z):
    return z + [3]
x = [1, 2]
y = baz(x)
print(x, y)
but in general it's clearer if you restrict yourself to either returning a new object or modifying an object in-place, but not both at once.
There are examples in the standard library that both modify an object in-place and return something (the foremost example being list.pop()), but that's a special case because it's not returning the object that was modified.
There's no strict rule, of course. However, a function should either do something or return something. So you'd better either modify the list in place without returning anything, or return a new one and leave the original unchanged.
Note: the list is not exactly passed by reference. It's the value of the reference that is actually passed. Keep that in mind if you re-assign

Python methods: modify original vs return a different object

I'm new to Python and object-oriented programming, and have a very basic 101 question:
I see some methods return a modified object, and preserve the original:
In: x="hello"
In: x.upper()
Out: 'HELLO'
In: x
Out: 'hello'
I see other methods modify and overwrite the original object:
In: y=[1,2,3]
In: y.pop(0)
Out: 1
In: y
Out: [2, 3]
Are either of these the norm? Is there a way to know which case I am dealing with for a given class and method?
Your examples show the difference between immutable built-in objects (e.g., strings and tuples) and mutable objects (e.g., lists, dicts, and sets).
In general, if a class (object) is described as immutable, you should expect the former behavior, and the latter for mutable objects.
Both of these are idiomatic in Python, although list.pop() is a slightly special case.
In general, methods in Python either mutate the object or return a value. list.pop() is a little unusual in that, by definition, it must do both: remove an item from the list, and return it to you.
What is not common in Python, although it is in other languages, is to mutate an object and then return that same object - which would allow for methods to be chained together like so:
shape.stretch(x=2).move(3, 5)
... but can cause programs to be harder to debug.
If an object is immutable, like a string, you can be sure that a method won't mutate it (because, by definition, it can't). Failing that, the only way to tell whether a method mutates its object is to read the documentation (normally excellent for Python's built-in and standard library objects), or, of course, the source.

Python: emulate C-style pass-by-reference for variables

I have a framework with some C-like language. Now I'm re-writing that framework and the language is being replaced with Python.
I need to find appropriate Python replacement for the following code construction:
SomeFunction(&arg1)
What this does is a C-style pass-by-reference so the variable can be changed inside the function call.
My ideas:
just return the value like v = SomeFunction(arg1)
is not so good, because my generic function can have a lot of arguments like SomeFunction(1,2,'qqq','vvv',.... and many more)
and I want to give the user ability to get the value she wants.
Return the collection of all the arguments, whether or not they have changed, like:
resulting_list = SomeFunction(1,2,'qqq','vvv',.... and many more)
interesting_value = resulting_list[3]
This can be improved by giving names to the values and returning a dictionary:
interesting_value = resulting_list['magic_value1']
It's not good because we have constructions like
DoALotOfStaff( [SomeFunction1(1,2,3,&arg1,'qq',val2),
SomeFunction2(1,&arg2,v1),
AnotherFunction(),
...
], flags1, my_var,... )
And I wouldn't like to burden the user with lists of lists of variables, with names or indexes she (the user) should know. Kind-of-references would be very useful here ...
Final Response
I compiled all the answers with my own ideas and was able to produce the solution. It works.
Usage
SomeFunction(1,12, get.interesting_value)
AnotherFunction(1, get.the_val, 'qq')
Explanation
Anything prefixed with get. is a kind of reference, and its value will be filled in by the function. There is no need to define the value beforehand.
Limitation: currently I support only numbers and strings, but these are sufficient for my use case.
Implementation
I wrote a Getter class which overrides __getattribute__ and produces any variable on demand
all newly created variables have a pointer to their container Getter and support the method set(self, value)
when set() is called, it checks whether the value is an int or a string and creates an object inheriting from int or str accordingly, but with the same set() method added. This new object then replaces our instance in the Getter container
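The actual code is not shown in the post, so the following is only a rough sketch of the idea under some simplifying assumptions: it uses __getattr__ instead of __getattribute__, it stores plain values instead of int/str subclasses, and the names Ref, _values and some_function are hypothetical.

```python
class Ref:
    """Hypothetical placeholder that a function fills in by calling set()."""

    def __init__(self, container, name):
        self._container = container
        self._name = name

    def set(self, value):
        # Store the value back in the container under this reference's name
        self._container._values[self._name] = value


class Getter:
    """Hands out a Ref for any attribute name that has no value yet."""

    def __init__(self):
        self._values = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._values:
            return self._values[name]
        return Ref(self, name)


def some_function(a, b, out):
    # The callee writes its result through the reference
    out.set(a + b)


get = Getter()
some_function(1, 12, get.interesting_value)
print(get.interesting_value)  # 13
```

Unlike the design described above, the stored value here is a plain int, so it cannot be passed on as a reference a second time.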
Thank you everybody. I will mark as "answer" the response which led me on my way, but all of you helped me somehow.
I would say that your best, cleanest, bet would be to construct an object containing the values to be passed and/or modified - this single object can be passed, (and will automatically be passed by reference), in as a single parameter and the members can be modified to return the new values.
This will simplify the code enormously and you can cope with optional parameters, defaults, etc., cleanly.
>>> class C:
...     def __init__(self):
...         self.a = 1
...         self.b = 2
...
>>> c = C()
>>> def f(o):
...     o.a = 23
...
>>> f(c)
>>> c.a
23
>>>
Note
I am sure that you could extend this idea to have a class of parameter that carried immutable and mutable data into your function with fixed member names plus storing the names of the parameters actually passed then on return map the mutable values back into the caller parameter name. This technique could then be wrapped into a decorator.
I have to say that it sounds like a lot of work compared to re-factoring your existing code to a more object oriented design.
This is how Python works already:
def func(arg):
    arg += ['bar']
arg = ['foo']
func(arg)
print(arg)
Here, the change to arg automatically propagates back to the caller.
For this to work, you have to be careful to modify the arguments in place instead of re-binding them to new objects. Consider the following:
def func(arg):
    arg = arg + ['bar']
arg = ['foo']
func(arg)
print(arg)
Here, func rebinds arg to refer to a brand new list and the caller's arg remains unchanged.
Python doesn't come with this sort of thing built in. You could make your own class which provides this behavior, but it will only support a slightly more awkward syntax where the caller would construct an instance of that class (equivalent to a pointer in C) before calling your functions. It's probably not worth it. I'd return a "named tuple" (look it up) instead--I'm not sure any of the other ways are really better, and some of them are more complex.
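For what the named tuple suggestion might look like, here is a minimal sketch. The Result type, its field names and some_function are made up for illustration and are not from the question:

```python
from collections import namedtuple

# Hypothetical result type; the field names are invented for this example
Result = namedtuple("Result", ["status", "interesting_value"])


def some_function(a, b):
    # Return every output by name instead of writing through a reference
    return Result(status="ok", interesting_value=a + b)


result = some_function(1, 12)
print(result.interesting_value)  # 13
```

The caller picks out the value she wants by name, without needing to know its position in the result.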
There is a major inconsistency here. The drawbacks you're describing against the proposed solutions are related to such subtle rules of good design, that your question becomes invalid. The whole problem lies in the fact that your function violates the Single Responsibility Principle and other guidelines related to it (function shouldn't have more than 2-3 arguments, etc.). There is really no smart compromise here:
either you accept one of the proposed solutions (i.e. Steve Barnes's answer concerning your own wrappers or John Zwinck's answer concerning usage of named tuples) and refrain from focusing on good design subtleties (as your whole design is bad anyway at the moment)
or you fix the design. Then your current problem will disappear as you won't have the God Objects/Functions (the name of the function in your example - DoALotOfStuff really speaks for itself) to deal with anymore.

Are python methods chainable?

s = set([1,2,3])
It should be elegant to do the following:
a.update(s).update(s)
It doesn't work as I thought it would, making a contain set([1,2,3,1,2,3,1,2,3])
So I'm wondering: does Python advocate this chainable practice?
set.update() returns None, so you can't chain updates like that.
The usual rule in Python is that methods that mutate don't return the object.
Contrast this with methods on immutable objects, which obviously must return a new object, such as str.replace(), which can be chained.
It depends.
Methods that modify an object usually return None so you can't call a sequence of methods like this:
L.append(2).append(3).append(4)
And hope to have the same effect as:
L.append(2)
L.append(3)
L.append(4)
You'll probably get an AttributeError because the first call to append returns None and None does not have an append method.
Methods that create a new object return that object, so for example:
"some string".replace("a", "b").replace("c", "d")
Is perfectly fine.
Usually immutable objects return new objects, while mutable ones return None but it depends on the method and the specific object.
Anyway it's certainly not a feature of the language itself but only a way to implement the methods. Chainable methods can be implemented in probably any language so the question "are python methods chainable" does not make much sense.
The only reasonable question would be "Are python methods always/forced to be/etc. chainable?", and the answer to this question is "No".
In your example, a set can only contain unique items, so the result that you show does not make any sense. You probably wanted to use a simple list.
The update method does not return a set, but None.
So you cannot chain another update call on NoneType.
So this will give you an error anyway:
a.update(s).update(s)
Also, a set can contain only unique values, so even if you put your updates on separate lines, you won't get the set you want.
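Both points can be checked with a short sketch (the names items, a and b are just for illustration): update() evaluates to None, a set ignores repeated items, and a list keeps them.

```python
items = [1, 2, 3]

a = set()
first = a.update(items)  # update() mutates a and evaluates to None
a.update(items)          # adding the same items again changes nothing

b = []
for _ in range(3):
    b.extend(items)      # a list keeps the repeats

print(first, sorted(a), b)  # None [1, 2, 3] [1, 2, 3, 1, 2, 3, 1, 2, 3]
```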
Yes, you can chain method calls in Python. As to whether it's good practice, there are a number of libraries out there which advocate using chained calls. For example, SQLAlchemy's tutorial makes extensive use of this style of coding. You frequently encounter code snippets like
session.query(User).filter(User.name.in_(['Edwardo', 'fakeuser'])).all()
A sensible guideline to adopt is to ask yourself whether it'll make the code easier to read and maintain. Always strive to make code readable.
I wrote a simple example. Chainable methods should always return an object, like self:
class Chain(object):
    """Chain example"""
    def __init__(self):
        self._content = ''
    def update(self, new_content):
        """Append content and return self so that calls can be chained"""
        self._content += new_content
        return self
    def __str__(self):
        return self._content
>>> c = Chain()
>>> _ = c.update('str1').update('str2').update('str3')  # assigned to suppress the echoed repr
>>> print(c)
str1str2str3
