I've noticed that many operations on lists that modify the list's contents will return None, rather than returning the list itself. Examples:
>>> mylist = ['a', 'b', 'c']
>>> empty = mylist.clear()
>>> restored = mylist.extend(range(3))
>>> backwards = mylist.reverse()
>>> with_four = mylist.append(4)
>>> in_order = mylist.sort()
>>> without_one = mylist.remove(1)
>>> mylist
[0, 2, 4]
>>> [empty, restored, backwards, with_four, in_order, without_one]
[None, None, None, None, None, None]
What is the thought process behind this decision?
To me, it seems hampering, since it prevents "chaining" of list processing (e.g. mylist.reverse().append('a string')[:someLimit]). I imagine it might be that "The Powers That Be" decided that list comprehension is a better paradigm (a valid opinion), and so didn't want to encourage other methods - but it seems perverse to prevent an intuitive method, even if better alternatives exist.
This question is specifically about Python's design decision to return None from mutating list methods like .append. Novices often write incorrect code that expects .append (in particular) to return the same list that was just modified.
For the simple question of "how do I append to a list?" (or debugging questions that boil down to that problem), see Why does "x = x.append([i])" not work in a for loop?.
To get modified versions of the list, see:
For .sort: How can I get a sorted copy of a list?
For .reverse: How can I get a reversed copy of a list (avoid a separate statement when chaining a method after .reverse)?
The same issue applies to some methods of other built-in data types, e.g. set.discard (see How to remove specific element from sets inside a list using list comprehension) and dict.update (see Why doesn't a python dict.update() return the object?).
The same reasoning applies to designing your own APIs. See Is making in-place operations return the object a bad idea?.
The general design principle in Python is for functions that mutate an object in-place to return None. I'm not sure it would have been the design choice I'd have chosen, but it's basically to emphasise that a new object is not returned.
Guido van Rossum (our Python BDFL) states the design choice on the Python-Dev mailing list:
I'd like to explain once more why I'm so adamant that sort() shouldn't
return 'self'.
This comes from a coding style (popular in various other languages, I
believe especially Lisp revels in it) where a series of side effects
on a single object can be chained like this:
x.compress().chop(y).sort(z)
which would be the same as
x.compress()
x.chop(y)
x.sort(z)
I find the chaining form a threat to readability; it requires that the
reader must be intimately familiar with each of the methods. The
second form makes it clear that each of these calls acts on the same
object, and so even if you don't know the class and its methods very
well, you can understand that the second and third call are applied to
x (and that all calls are made for their side-effects), and not to
something else.
I'd like to reserve chaining for operations that return new values,
like string processing operations:
y = x.rstrip("\n").split(":").lower()
There are a few standard library modules that encourage chaining of
side-effect calls (pstat comes to mind). There shouldn't be any new
ones; pstat slipped through my filter when it was weak.
I can't speak for the developers, but I find this behavior very intuitive.
If a method works on the original object and modifies it in-place, it doesn't return anything, because there is no new information - you obviously already have a reference to the (now mutated) object, so why return it again?
If, however, a method or function creates a new object, then of course it has to return it.
So l.reverse() returns nothing (because now the list has been reversed, but the identifier l still points to that list), but reversed(l) has to return a newly created object (a reverse iterator, which list(reversed(l)) turns into a new list), because l still points to the old, unmodified list.
EDIT: I just learned from another answer that this principle is called Command-Query separation.
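A minimal sketch of that distinction, using a throwaway list named l as in the text above: the in-place method returns None, while the built-in that leaves l alone has to return a new object.

```python
l = [1, 2, 3]

# In-place: the list itself changes, and the call returns None.
result = l.reverse()
print(result)  # None
print(l)       # [3, 2, 1]

# Not in-place: l is left alone, so the result has to be returned.
rev = reversed(l)    # a reverse iterator, not a list
as_list = list(rev)  # materialise it into a new list
print(as_list)  # [1, 2, 3]
print(l)        # [3, 2, 1] (unchanged by reversed)
```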
One could argue that the signature itself makes it clear that the function mutates the list rather than returning a new one: if the function returned a list, its behavior would have been much less obvious.
If you were sent here after asking for help fixing your code:
In the future, please try to look for problems in the code yourself, by carefully studying what happens when the code runs. Rather than giving up because there is an error message, check the result of each calculation, and see where the code starts working differently from what you expect.
If you had code calling a method like .append or .sort on a list, you will notice that the return value is None, while the list is modified in place. Study the example carefully:
>>> x = ['e', 'x', 'a', 'm', 'p', 'l', 'e']
>>> y = x.sort()
>>> print(y)
None
>>> print(x)
['a', 'e', 'e', 'l', 'm', 'p', 'x']
y got the special None value, because that is what was returned. x changed, because the sort happened in place.
It works this way on purpose, so that code like x.sort().reverse() breaks. See the other answers to understand why the Python developers wanted it that way.
To fix the problem
First, think carefully about the intent of the code. Should x change? Do we actually need a separate y?
Let's consider .sort first. If x should change, then call x.sort() by itself, without assigning the result anywhere.
If a sorted copy is needed instead, use y = sorted(x). See How can I get a sorted copy of a list? for details.
For other methods, we can get modified copies like so:
.clear -> there is no point to this; a "cleared copy" of the list is just an empty list. Just use y = [].
.append and .extend -> probably the simplest way is to use the + operator. To add multiple elements from a list l, use y = x + l rather than .extend. To add a single element e wrap it in a list first: y = x + [e]. Another way in 3.5 and up is to use unpacking: y = [*x, *l] for .extend, y = [*x, e] for .append. See also How to allow list append() method to return the new list for .append and How do I concatenate two lists in Python? for .extend.
.reverse -> First, consider whether an actual copy is needed. The built-in reversed gives you an iterator that can be used to loop over the elements in reverse order. To make an actual copy, simply pass that iterator to list: y = list(reversed(x)). See How can I get a reversed copy of a list (avoid a separate statement when chaining a method after .reverse)? for details.
.remove -> Figure out the index of the element that will be removed (using .index), then use slicing to find the elements before and after that point and put them together. As a function:
def without(a_list, value):
    index = a_list.index(value)
    return a_list[:index] + a_list[index+1:]
(We can translate .pop similarly to make a modified copy, though of course .pop actually returns an element from the list.)
See also A quick way to return list without a specific element in Python.
(If you plan to remove multiple elements, strongly consider using a list comprehension (or filter) instead. It will be much simpler than any of the workarounds needed for removing items from the list while iterating over it. This way also naturally gives a modified copy.)
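As a rough illustration of that suggestion (the names x and unwanted are made up for this example), a list comprehension builds the filtered copy in one step and leaves the original alone:

```python
x = [0, 1, 2, 1, 3, 1]
unwanted = {1, 3}  # hypothetical set of values to drop

# Keep every element that is not in the unwanted set; x itself is untouched.
y = [item for item in x if item not in unwanted]

print(y)  # [0, 2]
print(x)  # [0, 1, 2, 1, 3, 1]
```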
For any of the above, of course, we can also make a modified copy by explicitly making a copy and then using the in-place method on the copy. The most elegant approach will depend on the context and on personal taste.
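The alternatives above can be put side by side in a short sketch. The variable names are placeholders, and x is never modified by any of these lines:

```python
x = [3, 1, 2]
extra = [10, 20]  # stands in for the list called l in the text

sorted_copy = sorted(x)            # instead of x.sort()
reversed_copy = list(reversed(x))  # instead of x.reverse()
appended_copy = x + [4]            # instead of x.append(4)
extended_copy = [*x, *extra]       # instead of x.extend(extra), Python 3.5+

# The generic fallback: copy explicitly, then mutate the copy in place.
removed_copy = x.copy()
removed_copy.remove(1)

print(x)  # [3, 1, 2]
```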
As we know, a list in Python is a mutable object, and one characteristic of a mutable object is that its state can be modified without assigning the new state to a variable. To understand the root of this issue, it helps to look at mutability more closely.
An object whose internal state can be changed is mutable. An immutable object, on the other hand, doesn't allow any change once it has been created. Mutability is a property of an object's type; it is separate from the fact that Python is dynamically typed.
Every object in python has three attributes:
Identity – A value that is unique to the object for its lifetime; in CPython this is the object's address in memory.
Type – This refers to the kind of object that is created. For example integer, list, string etc.
Value – This refers to the value stored by the object. For example str = "a".
While the identity and type cannot be changed once the object is created, the value can be changed for mutable objects.
Let us walk through the code below step by step to see what this means in Python:
Creating a list which contains name of cities
cities = ['London', 'New York', 'Chicago']
Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(cities)))
Output [1]: 0x1691d7de8c8
Adding a new city to the list cities
cities.append('Delhi')
Printing the elements from the list cities, separated by a comma
for city in cities:
    print(city, end=', ')
Output [2]: London, New York, Chicago, Delhi
Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(cities)))
Output [3]: 0x1691d7de8c8
The above example shows us that we were able to change the internal state of the object cities by adding one more city 'Delhi' to it, yet, the memory address of the object did not change. This confirms that we did not create a new object, rather, the same object was changed or mutated. Hence, we can say that the object which is a type of list with reference variable name cities is a MUTABLE OBJECT.
The internal state of an immutable object, by contrast, cannot be changed. For instance, consider the code below and the error message it produces when trying to change the value of a tuple at index 0:
Creating a Tuple with variable name foo
foo = (1, 2)
Changing the index 0 value from 1 to 3
foo[0] = 3
TypeError: 'tuple' object does not support item assignment
We can conclude from these examples why a method that modifies a mutable object doesn't need to return anything: it changes the internal state of the object directly, so there is no point in returning a new, modified object. Operations on an immutable object, by contrast, must return a new object holding the modified state.
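A small sketch of that conclusion, using str as the immutable type and list as the mutable one (the variable names are only for illustration):

```python
name = "delhi"
upper_name = name.upper()  # str is immutable: a new string is returned
print(name, upper_name)    # delhi DELHI

cities = ["London"]
before = id(cities)
returned = cities.append("Delhi")  # list is mutable: changed in place
print(returned)                    # None
print(cities)                      # ['London', 'Delhi']
print(id(cities) == before)        # True, still the same object
```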
First of all, I should say that what I am suggesting is, without a doubt, bad programming practice. But if you want to use append in a lambda function and you don't care about code readability, there is a way to do just that.
Imagine you have a list of lists and you want to append an element to each inner list using map and lambda. Here is how you can do that:
my_list = [[1, 2, 3, 4],
           [3, 2, 1],
           [1, 1, 1]]
my_new_element = 10
new_list = list(map(lambda x: [x.append(my_new_element), x][1], my_list))
print(new_list)
How it works:
When the lambda computes its output, it first has to evaluate the expression [x.append(my_new_element), x]. Evaluating it runs append, so the expression becomes [None, x] (with x already modified), and taking the second element, [None, x][1], gives x.
Using a custom function is more readable and the better option:
def append_my_list(input_list, new_element):
    input_list.append(new_element)
    return input_list
my_list = [[1, 2, 3, 4],
           [3, 2, 1],
           [1, 1, 1]]
my_new_element = 10
new_list = list(map(lambda x: append_my_list(x, my_new_element), my_list))
print(new_list)
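If the goal is just to get the extended lists and the inner lists do not need to be modified in place, a list comprehension with + avoids the side effect altogether. This is a different technique from the two above (it builds new inner lists rather than mutating the existing ones), shown here as one possible sketch:

```python
my_list = [[1, 2, 3, 4],
           [3, 2, 1],
           [1, 1, 1]]
my_new_element = 10

# x + [my_new_element] creates a new list each time; my_list is not mutated.
new_list = [x + [my_new_element] for x in my_list]

print(new_list)  # [[1, 2, 3, 4, 10], [3, 2, 1, 10], [1, 1, 1, 10]]
print(my_list)   # [[1, 2, 3, 4], [3, 2, 1], [1, 1, 1]]
```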
Why does multiple assignment make distinct references for ints, but not lists or other objects?
>>> a = b = 1
>>> a += 1
>>> a is b
False
>>> a = b = [1]
>>> a.append(1)
>>> a is b
True
In the int example, you first assign the same object to both a and b, but then reassign a with another object (the result of a+1). a now refers to a different object.
In the list example, you assign the same object to both a and b, but then you don't do anything to change that. append only changes the internal state of the list object, not its identity. Thus they remain the same.
If you replace a.append(1) with a = a + [1], you end up with different object, because, again, you assign a new object (the result of a+[1]) to a.
Note that a+=[1] will behave differently, but that's a whole other question.
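For reference, a quick sketch of that difference: for lists, += extends the existing object in place, so the two names keep referring to the same list, whereas a = a + [1] rebinds the name to a new list.

```python
a = b = [1]
a += [1]       # in-place: behaves like a.extend([1])
print(a is b)  # True
print(b)       # [1, 1]

c = d = [1]
c = c + [1]    # builds a new list and rebinds c
print(c is d)  # False
print(d)       # [1]
```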
Immutable types such as int cannot be changed in place. When a += 1 runs, a no longer refers to the same memory location as b:
https://docs.python.org/2/library/functions.html#id
CPython implementation detail: This is the address of the object in memory.
In [1]: a = b = 100000000000000000000000000000
print id(a), id(b)
print a is b
Out [1]: 4400387016 4400387016
True
In [2]: a += 1
print id(a), id(b)
print a is b
Out [2]: 4395695296 4400387016
False
Python works differently when changing the values of mutable and immutable objects.
Immutable objects:
These are objects whose values cannot change after initialization,
e.g. int, string, tuple.
Mutable objects:
These are objects whose values can change after initialization,
e.g. dict, list and (by default) user-defined objects.
Changing the value of a mutable object does not create a new object in a new memory location; it just changes the object where it already lives.
It is exactly the opposite for immutable objects: a "change" creates a new object in a new memory location.
For example:
s="awe"
s[0]="e"
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-19-9f16ce5bbc72> in <module>()
----> 1 s[0]="e"
TypeError: 'str' object does not support item assignment
This is telling you that you cannot change the contents of the string in place.
You could do this instead:
"e"+s[1:]
Out[20]: 'ewe'
This creates a new string object in a new memory location.
Likewise, after A=B=1, running A=2 creates a new object and makes the name A refer to it, which is why B's value is not changed when A is changed.
This is not the case for a list: since it is a mutable object, changing its value does not move it to a new object; the existing list is modified (and grows) in place.
For example:
a=b=[]
a.append(1)
print a
[1]
print b
[1]
Both give the same value, since both names reference the same list object.
The difference is not in the multiple assignment, but in what you subsequently do with the objects. With the int, you do +=, and with the list you do .append.
However, even if you do += for both, you won't necessarily see the same result, because what += does depends on what type you use it on.
So that is the basic answer: operations like += may work differently on different types. Whether += returns a new object or modifies the existing object is behavior that is defined by that object. To know what the behavior is, you need to know what kind of object it is and what behavior it defines (i.e., the documentation). Likewise, you cannot assume that using an operation like += will have the same result as using a method like .append. What a method like .append does is defined by the object you call it on.
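To make this concrete, here is a small sketch comparing += on a list and on a tuple. The helper same_object_after_iadd is a made-up name for this example, not a standard function.

```python
def same_object_after_iadd(obj, other):
    """Return True if `obj += other` kept the same object (in-place update)."""
    original = obj
    obj += other
    return obj is original

print(same_object_after_iadd([1], [2]))    # True: list defines in-place +=
print(same_object_after_iadd((1,), (2,)))  # False: tuple builds a new object
```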
I am new to Python and I was just reading up about lists. I have been trying to find out if a list is a variable,
e.g. Hello = []
This is because I read that you assign a variable by using the '=' sign. Or am I just assigning the empty list a name in the example above?
No. A list is an object. You assign a list to a name-reference with =.
Thus a = [1,2] produces a which is a name-reference (a pointer essentially) to the underlying list object which you see by looking at globals().
>>> a = [1,2]
>>> globals()
{'a': [1, 2], '__builtins__': <module '__builtin__' (built-in)>, '__package__': None, '__name__': '__main__', '__doc__': None}
A list is an instance of list (exposed as types.ListType in Python 2), which is a subclass of object.
>>> import types
>>> types.ListType.mro()
[<type 'list'>, <type 'object'>]
>>> object
<type 'object'>
>>> b = types.ListType()
>>> b
[]
In Python, the concept of object is quite important (as other users might have pointed out already, I am being slow!).
You can think of list as a list (or actually, an Object) of elements. As a matter of fact, list is a variable-sized object that represents a collection of items. Python lists are a bit special because you can have mixed types of elements in a list (e.g. strings with int). But at the same time, you can also argue, "What about set, map, tuple, etc.?". As an example,
>>> p = [1,2,3,'four']
>>> p
[1, 2, 3, 'four']
>>> isinstance(p[1], int)
True
>>> isinstance(p[3], str)
True
>>>
In a set, you can vary the size of the set - yes. In that respect, set is a variable that contains unique items - if that satisfies you....
In this way, a map is also a "Variable" sized collection of key-value pairs where every unique key has a value mapped to it. The same goes for a dictionary.
If you are curious because of the = sign - you have already used a keyword in your question: "assignment". In all high-level languages (well, most of them anyway), = is the assignment operator, with a variable name on the lhs and a valid value on the rhs (either a variable of identical type/supertype, or a literal value).
The line:
Hello = []
creates a new, empty list object (instance of the list class), and assigns a reference to that object to the name (or "identifier") Hello.
You can then access that list object via the name Hello (as long as it is accessible in the current scope), e.g.:
Hello.append('World')
Lists are mutable, i.e. they can be changed in-place, in this case by appending (adding) a new string object to the list referenced by the name Hello. This may be what you meant by "variable".
For more on names in Python, see http://nedbatchelder.com/text/names.html.
For more on lists (and Python's other built-in sequence types), see the official docs.
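A short sketch of the name/object distinction described above; Alias is just an extra name introduced for the illustration:

```python
Hello = []             # create a list object and bind the name Hello to it
Alias = Hello          # bind a second name to the same object

Hello.append('World')  # mutate the object through one name
print(Alias)           # ['World'], visible through the other name too

Hello = ['Goodbye']    # rebind Hello to a brand-new list
print(Alias)           # ['World'], the original object is unaffected
print(Hello is Alias)  # False
```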
Python lists are a data structure that lets you hold a collection of data items. Data structures are objects or programming types that let you store any kind of data, be it integers, strings etc., in memory or permanently on disk.
In your case, you have defined a list structure, which will store a collection of data items referenced by the symbol hello. The symbol hello is termed a variable in programming.
Using a variable, you can reference your list from any part of your program to locate and access its member items.
Example:
hello = [1, 3, 100]
Indexing hello with N (starting at 0) accesses and returns the value stored at the Nth position.
print hello[0]
Which will output 1.
See Array data structure for more examples.
There are multiple iterator classes depending on what you're iterating over:
>>> import re
>>> re.finditer("\d+", "1 ha 2 bah").__class__
<type 'callable-iterator'>
>>> iter([1, 2]).__class__
<type 'listiterator'>
>>> iter("hurm").__class__
<type 'iterator'>
Two questions:
Is there any meaningful distinction between them?
Why is the first one called a callable-iterator? You definitely cannot call it.
BrenBarn answers #1 quite delightfully, but I believe I have unlocked the mysteries of #2. To wit, a callable-iterator is that which is returned for using iter with its second form:
>>> help(iter)
iter(...)
iter(collection) -> iterator
iter(callable, sentinel) -> iterator
Get an iterator from an object. In the first form, the argument must
supply its own iterator, or be a sequence.
In the second form, the callable is called until it returns the sentinel.
To wit:
>>> def globals_are_bad_mmkay():
...     global foo
...     foo += 1
...     return foo
...
>>> foo = 0
>>> it = iter(globals_are_bad_mmkay, 10)
>>> it
<callable-iterator object at 0x021609B0>
>>> list(it)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
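The same two-argument form works with any zero-argument callable. As a sketch that avoids the global, the example below reads lines from an in-memory file until readline returns the empty string; io.StringIO is used only to keep the example self-contained, and Python 3 syntax is assumed.

```python
import io

stream = io.StringIO("first\nsecond\nthird\n")

# readline is called repeatedly until it returns the sentinel ''.
lines = list(iter(stream.readline, ''))

print(lines)  # ['first\n', 'second\n', 'third\n']
```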
Being an iterator means implementing the iterator protocol, not being a member of a particular class -- an iterator is as an iterator does. You can write your own custom classes that are iterators, and they won't be any of those classes you list.
From the point of view of "being an iterator", there is no difference between them. They are all iterators, and that just means you can iterate over them. There might of course be other differences between them -- they might have additional methods or behavior defined -- but as iterators qua iterators they are the same.
You can view an iterator as some kind of doodad that "knows how" to iterate over a particular data structure. Different kinds of data structures might have their own custom classes for iterating over them; these iterators may do different things under the hood, but all share the same public interface (the iterator protocol).
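As an illustration of "an iterator is as an iterator does", here is a minimal custom iterator written for Python 3. Countdown is a hypothetical class made up for this sketch; it is unrelated to the classes listed in the question, yet it works anywhere an iterator is expected.

```python
class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


print(list(Countdown(3)))  # [3, 2, 1]
for n in Countdown(2):
    print(n)               # 2, then 1
```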
a = "Stack"
aList = list(a)
This gives me an array like this: ['S','t','a','c','k']
I want to know how this list(string) function works!
A string is an iterable type. For example, if you do this:
for c in 'string':
    print c
You get
s
t
r
i
n
g
So passing a string to list just iterates over the characters in the string, and places each one in the list.
String is iterable in python because you can do
>>> for ch in "abc":
... print ch
...
a
b
c
>>>
and if you take a look at the list class constructor using help("list") in your Python interpreter
class list(object)
| list() -> new empty list
| list(iterable) -> new list initialized from iterable's items
So, list("hello") returns a new list initialized from the characters of the string.
>>> x = list("hello")
>>> type(x)
<type 'list'>
>>>
This works because str implements __iter__, which allows iter(obj) to be called on the object. For str, the implementation returns character by character.
Most of the work is actually being done by the string. You are actually using the constructor of the list type. In Python, we are usually not so concerned with what things are, but rather with what they can do.
What a string can do is it can give you (e.g. for the purpose of using a for loop) its characters (actually, length-1 substrings; there isn't a separate character type) one at a time. The list constructor expects any object that can do this, and constructs a list that contains the elements thus provided.
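Conceptually, the constructor does something like the following. This is a pure-Python approximation for illustration only (the real list is implemented in C), and my_list_from is a made-up name:

```python
def my_list_from(iterable):
    """Rough equivalent of list(iterable), built on the iterator protocol."""
    result = []
    iterator = iter(iterable)      # ask the object for an iterator
    while True:
        try:
            item = next(iterator)  # fetch the next element
        except StopIteration:      # the iterator is exhausted
            break
        result.append(item)
    return result


print(my_list_from("Stack"))  # ['S', 't', 'a', 'c', 'k']
```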