This question already has answers here:
What does "list comprehension" and similar mean? How does it work and how can I use it?
(5 answers)
Closed 7 months ago.
I saw some code like:
foo = [x for x in bar if x.occupants > 1]
What does this mean, and how does it work?
The current answers are good, but do not talk about how they are just syntactic sugar to some pattern that we are so used to.
Let's start with an example, say we have 10 numbers, and we want a subset of those that are greater than, say, 5.
>>> numbers = [12, 34, 1, 4, 4, 67, 37, 9, 0, 81]
For the above task, the approaches below are equivalent to one another, and go from most verbose to most concise, readable and Pythonic:
Approach 1
result = []
for index in range(len(numbers)):
    if numbers[index] > 5:
        result.append(numbers[index])
print(result)  # Prints [12, 34, 67, 37, 9, 81]
Approach 2 (Slightly cleaner, for-in loops)
result = []
for number in numbers:
    if number > 5:
        result.append(number)
print(result)  # Prints [12, 34, 67, 37, 9, 81]
Approach 3 (Enter List Comprehension)
result = [number for number in numbers if number > 5]
or more generally:
[function(number) for number in numbers if condition(number)]
where:
function(x) takes an x and transforms it into something useful (like for instance: x*x)
if condition(x) returns any falsy value (False, None, empty string, empty list, etc.) then the current iteration is skipped (think continue). If it returns a truthy value, the current value goes through the transformation step above and makes it into the resulting list.
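As a minimal sketch of the general form, assuming a hypothetical transformation named square and a hypothetical condition named is_big (neither name comes from the answer above), the explicit loop and the comprehension build the same list:

```python
numbers = [12, 34, 1, 4, 4, 67, 37, 9, 0, 81]

def square(x):
    # Hypothetical transformation step.
    return x * x

def is_big(x):
    # Hypothetical condition; any truthy return value keeps the item.
    return x > 5

# Explicit loop: skip items that fail the condition, transform the rest.
loop_result = []
for number in numbers:
    if not is_big(number):
        continue
    loop_result.append(square(number))

# The same thing written as a list comprehension.
comp_result = [square(number) for number in numbers if is_big(number)]

print(comp_result == loop_result)
```

Note that the condition is tested on the original item, not on the transformed one.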
To understand the syntax in a slightly different manner, look at the Bonus section below.
For further information, follow the tutorial all other answers have linked: List Comprehension
Bonus
(Slightly un-pythonic, but putting it here for sake of completeness)
The example above can be written as:
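result = list(filter(lambda x: x > 5, numbers))  # list() is needed in Python 3; in Python 2 filter already returns a list
The general expression above can be written as:
result = list(map(function, filter(condition, numbers)))  # without list(), the result is a list only in Python 2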
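In Python 3, filter and map return lazy iterators rather than lists. The sketch below (the doubling lambda is only an illustrative stand-in for function) shows the map/filter form and the comprehension producing the same list once the iterator is materialized:

```python
numbers = [12, 34, 1, 4, 4, 67, 37, 9, 0, 81]

# In Python 3 this is a lazy filter object, not a list.
lazy = filter(lambda x: x > 5, numbers)
is_list = isinstance(lazy, list)

# list() walks the iterator and collects the surviving items.
filtered = list(lazy)

# map over filter, compared with the equivalent comprehension.
doubled = list(map(lambda x: x * 2, filter(lambda x: x > 5, numbers)))
same_as_comprehension = [x * 2 for x in numbers if x > 5]

print(is_list, filtered, doubled == same_as_comprehension)
```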
It's a list comprehension
foo will be a filtered list of bar containing the objects with the attribute occupants > 1
bar can be a list, set, dict or any other iterable
Here is an example to clarify
>>> class Bar(object):
...     def __init__(self, occupants):
...         self.occupants = occupants
...
>>> bar=[Bar(0), Bar(1), Bar(2), Bar(3)]
>>> foo = [x for x in bar if x.occupants > 1]
>>> foo
[<__main__.Bar object at 0xb748516c>, <__main__.Bar object at 0xb748518c>]
So foo has 2 Bar objects, but how do we check which ones they are? Let's add a __repr__ method to Bar so it is more informative
>>> Bar.__repr__=lambda self:"Bar(occupants={0})".format(self.occupants)
>>> foo
[Bar(occupants=2), Bar(occupants=3)]
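To illustrate the point that the source does not have to be a list, here is a small sketch using made-up data (the room names and numbers are assumptions for illustration only). The same comprehension shape works over a dict, a set and a generator:

```python
# Hypothetical data: room name -> number of occupants.
occupants_by_room = {"attic": 0, "kitchen": 2, "hall": 3}

# Iterating over a dict yields its keys.
busy_rooms = [room for room in occupants_by_room if occupants_by_room[room] > 1]

# Iterating over a set works too, but the order is not guaranteed, so sort it.
counts = {0, 1, 2, 3}
big_counts = sorted([n for n in counts if n > 1])

# Any other iterable is fine, for example a generator expression.
squares_over_one = [n for n in (i * i for i in range(5)) if n > 1]

print(busy_rooms, big_counts, squares_over_one)
```

Whatever the source is, the result of a list comprehension is always a new list.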
Since the programming part of the question is fully answered by others, it is nice to know its relation to mathematics (set theory). It is essentially Python's version of set-builder notation:
Defining a set by the axiom of specification:
B = { x ∈ A : S(x) }
English translation: B is a set whose members are chosen from A,
so B is a subset of A (B ⊂ A), where the characteristic specified by
function S holds: S(x) == True
Defining B using list comprehension:
B = [x for x in A if S(x)]
So to build B with a list comprehension, the members of B (denoted by x) are chosen from A where S(x) == True (the inclusion condition).
Note: A function S that returns a boolean is called a predicate.
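Python also has a set comprehension, which is the closer match to the mathematical notation. A minimal sketch, assuming an example set A and an example predicate S that tests for even numbers:

```python
A = {1, 2, 3, 4, 5, 6}

def S(x):
    # Example predicate: holds for even numbers.
    return x % 2 == 0

# Set comprehension: the result is a real set.
B = {x for x in A if S(x)}
is_subset = B <= A

# A list comprehension keeps order and duplicates, unlike a set.
B_list = [x for x in [1, 2, 2, 3, 4, 4] if S(x)]

print(B, is_subset, B_list)
```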
This returns a list which contains all the elements in bar which have occupants > 1.
As far as I can tell, it works like this: the comprehension moves through bar one item at a time, binds each item to x, and checks x.occupants, an attribute that each item is expected to have. foo is not called; it is simply the name given to the resulting list, which holds every item that passes the condition x.occupants > 1.
In a language like Java, you'd build a class called X whose objects are stored in an array or similar. X would have a field called occupants, and each element would be checked through an accessor method that returns its number of occupants (we assume an int here, as a partial occupant would be odd). A foo method would loop over the container and return a new array or similar, whose elements are the X objects from the first array that fit the criterion "greater than 1".
Python's list comprehension syntax handles this in a much more succinct way. Rather than implementing two full classes and several methods, I write that one line of code.
I used a loop to add a bunch of elements to a list with
mylist = []
for x in otherlist:
    mylist.append(x[0:5])
But instead of the expected result ['x1','x2',...], I got: [u'x1', u'x2',...]. Where did the u's come from and why? Also, is there a better way to loop through the other list, inserting the first five characters of each element into a new list?
The u means unicode; you probably will not need to worry about it.
mylist.extend(x[:5] for x in otherlist)
The u means unicode. It's Python's internal string representation (from version ... ?).
Most times you don't need to worry about it. (Until you do.)
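A small sketch of both points, using made-up strings in place of the asker's otherlist. The u prefix only shows up in the repr of a Python 2 unicode string; the text itself compares equal to the plain literal, and a list comprehension does the slicing in one line:

```python
# Hypothetical stand-in for the asker's data.
otherlist = [u"x1-long-name", u"x2-long-name"]

# x[:5] is the same slice as x[0:5]: the first five characters.
mylist = [x[:5] for x in otherlist]

# The u prefix does not change the text for ASCII content like this.
same_text = (u"x1" == "x1")

print(mylist, same_text)
```

In Python 3 every str is already a unicode string, so the u prefix is accepted but has no effect.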
The answers above already covered the "u" part: the strings are unicode strings. As for whether there's a better way to extract the first five characters from the items in a list:
>>> a = ["abcdefgh", "012345678"]
>>> b = map(lambda n: n[0:5], a)
>>> for x in b:
...     print(x)
...
abcde
01234
So, map applies a function (lambda n: n[0:5]) to each element of a and returns the results of the function for every element. In Python 2, that is a new list. In Python 3, it returns an iterator, so the function gets called only as many times as needed (i.e. if your list has 5000 items, but you only pull 10 from the result b, lambda n: n[0:5] gets called only 10 times). To get the same lazy behaviour in Python 2, you need to use itertools.imap instead.
>>> a = [1, 2, 3]
>>> def plusone(x):
...     print("called with {}".format(x))
...     return x + 1
...
>>> b = map(plusone, a)
>>> print("first item: {}".format(next(b)))
called with 1
first item: 2
Of course, you can apply the function "eagerly" to every element by calling list(b), which will give you a normal list with the function applied to each element on creation.
>>> b = map(plusone, a)
>>> list(b)
called with 1
called with 2
called with 3
[2, 3, 4]
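To make the "called only as many times as needed" behaviour visible, here is a Python 3 sketch that records every call. The helper first_five and the word list are assumptions for illustration; itertools.islice pulls just the first two results:

```python
from itertools import islice

calls = []  # records every argument the function is actually called with

def first_five(s):
    calls.append(s)
    return s[:5]

words = ["abcdefgh", "012345678", "zyxwvuts", "98765432"]

lazy = map(first_five, words)       # nothing has been called yet
calls_before = len(calls)

first_two = list(islice(lazy, 2))   # pull only two results
calls_after = len(calls)

print(calls_before, first_two, calls_after)
```

The remaining two words are only processed if something keeps consuming lazy.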
This question already has answers here:
How do I pass a variable by reference?
(39 answers)
Closed 8 months ago.
I am not sure I understand the concept of Python's call by object style of passing function arguments (explained here http://effbot.org/zone/call-by-object.htm). There don't seem to be enough examples to clarify this concept well (or my google-fu is probably weak! :D)
I wrote this little contrived Python program to try to understand this concept
def foo(itnumber, ittuple, itlist, itdict):
    itnumber += 1
    print id(itnumber), itnumber
    print id(ittuple), ittuple
    itlist.append(3.4)
    print id(itlist), itlist
    itdict['mary'] = 2.3
    print id(itdict), itdict
# Initialize a number, a tuple, a list and a dictionary
tnumber = 1
print id( tnumber ), tnumber
ttuple = (1, 2, 3)
print id( ttuple ) , ttuple
tlist = [1, 2, 3]
print id( tlist ) , tlist
tdict = tel = {'jack': 4098, 'sape': 4139}
print id(tdict), tdict
print '-------'
# Invoke a function and test it
foo(tnumber, ttuple, tlist , tdict)
print '-------'
#Test behaviour after the function call is over
print id(tnumber) , tnumber
print id(ttuple) , ttuple
print id(tlist) , tlist
print id(tdict), tdict
The output of the program is
146739376 1
3075201660 (1, 2, 3)
3075103916 [1, 2, 3]
3075193004 {'sape': 4139, 'jack': 4098}
---------
146739364 2
3075201660 (1, 2, 3)
3075103916 [1, 2, 3, 3.4]
3075193004 {'sape': 4139, 'jack': 4098, 'mary': 2.3}
---------
146739376 1
3075201660 (1, 2, 3)
3075103916 [1, 2, 3, 3.4]
3075193004 {'sape': 4139, 'jack': 4098, 'mary': 2.3}
As you can see, except for the integer that was passed, the object ids (which, as I understand it, refer to memory locations) remain unchanged.
So in the case of the integer, it was (effectively) passed by value and the other data structures were (effectively) passed by reference. I tried changing the list, the number and the dictionary to test whether the data structures were changed in place. The number was not, but the list and the dictionary were.
I use the word effectively above, since the 'call-by-object' style of argument passing seems to behave both ways depending on the data structure passed in the above code.
For more complicated data structures (say numpy arrays etc.), is there any quick rule of thumb to recognize which arguments will be passed by reference and which ones passed by value?
The key difference is that in a C-style language, a variable is a box in memory in which you put stuff. In Python, a variable is a name.
Python is neither call-by-reference nor call-by-value. It's something much more sensible! (In fact, I learned Python before I learned the more common languages, so call-by-value and call-by-reference seem very strange to me.)
In Python, there are things and there are names. Lists, integers, strings, and custom objects are all things. x, y, and z are names. Writing
x = []
means "construct a new thing [] and give it the name x". Writing
x = []
foo = lambda x: x.append(None)
foo(x)
means "construct a new thing [] with name x, construct a new function (which is another thing) with name foo, and call foo on the thing with name x". Now foo just appends None to whatever it received, so this reduces to "append None to the empty list". Writing
x = 0
def foo(x):
    x += 1
foo(x)
means "construct a new thing 0 with name x, construct a new function foo, and call foo on x". Inside foo, the assignment just says "make the local name x refer to 1 plus what it used to refer to", but that doesn't change the thing 0, and the outer x still names it.
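The difference between rebinding a name and changing a thing can be checked directly. In this sketch the three helper functions are hypothetical names chosen for illustration:

```python
def rebind_number(n):
    n += 1  # ints are immutable, so this binds the local name n to a new object
    return n

def rebind_list(items):
    items = items + [None]  # builds a new list and rebinds only the local name
    return items

def mutate_list(items):
    items.append(None)  # changes the one list object the caller also names

x = 0
y = rebind_number(x)

original = [1, 2]
new_list = rebind_list(original)
after_rebind = list(original)  # snapshot: the caller's list is untouched so far

mutate_list(original)          # now the caller's list really changes

print(x, y)
print(after_rebind, new_list, original)
```

A rough rule of thumb that follows from this: assignment to a parameter name never affects the caller, while calling a mutating method on the object does.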
Others have already posted good answers. One more thing that I think will help:
x = expr
evaluates expr and binds x to the result. On the other hand:
x.operate()
does something to x and hence can change it (resulting in the same underlying object having a different value).
The funny cases come in with things like:
x += expr
which translates into either x = x + expr (rebinding) or x.__iadd__(expr) (modifying), sometimes in very peculiar ways:
>>> x = 1
>>> x += 2
>>> x
3
(so x was rebound, since integers are immutable)
>>> x = ([1], 2)
>>> x
([1], 2)
>>> x[0] += [3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> x
([1, 3], 2)
Here x[0], which is itself mutable, was mutated in-place by its __iadd__ method; but then Python also attempted to assign the result back to x[0] (as with x.__setitem__), which errored out because tuples are immutable. But by then x[0] was already mutated!
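The two steps hidden inside x[0] += [3] can be spelled out by hand. This is a sketch of roughly what the statement does, not a literal account of the interpreter's internals:

```python
x = ([1], 2)

inner = x[0]
result = inner.__iadd__([3])  # step 1: extend the list in place; returns the same list
same_object = result is inner

try:
    x[0] = result             # step 2: store the result back into the tuple
    error = None
except TypeError as exc:
    error = str(exc)          # tuples do not support item assignment

print(same_object, x, error)

# Mutating the inner list without the assignment step raises nothing.
x[0].extend([4])
print(x)
```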
Numbers, strings, and tuples in Python are immutable; using augmented assignment will rebind the name.
Your other types are merely mutated, and remain the same object.
I'm confused about "x" in the python code below.
>>> # Grocery list
... grocery_list = ['apples', 'bananas', 'oranges', 'milk']
>>> for x in grocery_list:
...     print(x, len(x))
I am confused about x's role in the for statement above. Is "x" a variable that is being defined within the for statement, or is it something else? It just seems different than how I am used to defining a variable, but I noticed "x" can be anything, which makes me think it is, in fact, a user-defined variable.
Please help.
Yes it's defined within the for statement. It's just a placeholder for an element in the list and can be called anything, e.g.
grocery_list = ['apples', 'bananas', 'oranges', 'milk']
for grocery in grocery_list:
    print(grocery, len(grocery))
Python is a late-binding dynamic language. While Python programmers and the Python documentation frequently use the terms "variable" and "assignment" the more precise terms are "name" and "binding."
In your example x is a name of an object. At each iteration over the loop it's bound to the next object from your list. Python lists, and most other Python containers as well as many other Python object classes, feature iteration functions. That is to say that they define functions following a protocol which allows them to be used in for loops.
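The protocol mentioned above can be driven by hand with the built-in iter() and next(). This sketch is roughly what the for statement does with the grocery list on each pass:

```python
grocery_list = ['apples', 'bananas', 'oranges', 'milk']

seen = []
iterator = iter(grocery_list)   # ask the list for an iterator
while True:
    try:
        x = next(iterator)      # bind x to the next object
    except StopIteration:       # raised when the list is exhausted
        break
    seen.append((x, len(x)))

print(seen)
```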
As you've noted a Python name (analogous to a "variable" in other languages) can be bound to any object of any type. A list can contain any mixture of object references. Thus, when you iterate over a list your "variable" (or loop name(s)) can be bound to objects of different types, potentially different types on each pass through the loop.
You can also have multiple names bound through "tuple unpacking" at each step through the iteration. For example the following snippet of code is a commonly used way to deal with dictionaries:
for key, value in some_dictionary.items():
    # do something with each key and its associated value
    pass
This form isn't specific to dictionaries. The .items() method of dictionaries returns a sequence of two-item tuples, and this form of for loop can be used with any list or sequence that yields two-item tuples (or even two-item lists). Similarly, you may see tuple unpacking used on a sequence whose items each contain a uniform number of elements:
for category, item, quantity, price in some_invoice_query():
    # do something with these items, etc.
    pass
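A runnable version of both loops, with made-up data standing in for some_dictionary and for the result of some_invoice_query() (the prices and invoice rows are assumptions for illustration):

```python
# Hypothetical stand-in for some_dictionary.
prices = {"apples": 1.5, "milk": 0.99}

lines = []
for name, price in prices.items():   # each item is a (key, value) tuple
    lines.append("{0}: {1}".format(name, price))

# Hypothetical stand-in for the rows returned by some_invoice_query().
invoice = [("fruit", "apples", 3, 1.5), ("dairy", "milk", 2, 0.99)]

total = 0.0
for category, item, quantity, price in invoice:  # four names bound per row
    total += quantity * price

print(lines)
print(round(total, 2))
```

If a row has a different number of elements than there are names, Python raises ValueError when it tries to unpack that row.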
Conceptually a "variable" in most programming languages is a placeholder for data of a certain type (as well as a name by which that placeholder is referred throughout a program's source code). However, in Python (and other late-binding dynamic languages) a name is a reference to an object and conveys no constraint regarding the type or class of object to which the reference is made.
You can, rather crudely, think of Python names as if they were all void pointers in C.
x is a name in your current namespace, and the objects in grocery_list are assigned to this name one after another.
I think it is okay for you to treat x as a variable that is being defined within the for statement. Any other name is okay too. Python does not require a separate "define" step; any variable is "defined" the first time it is given a value.
The variable will be assigned behind the scenes each of the values of the list, in order. If the list holds references, then the reference will be assigned to the loop variable, of course.
It's almost equivalent to:
for i in range(len(grocery_list)):  # xrange in Python 2
    x = grocery_list[i]
    # rest of code here
But much much cleaner and (potentially) faster. The name (x) is not significant; it can be whatever you please.
After the loop has executed, the variable remains in scope, with the value of the last iteration that ran. So if you use a break to get out of the loop, that will show:
>>> for i in range(100):  # xrange in Python 2
...     if i == 10: break
...
>>> i
10
x is a temporary variable that steps through a sequence. In lists, x will step through each item in the list.
>>> grocery_list = ['apples', 'bananas', 'oranges', 'milk']
>>> for x in grocery_list:
...     print(x, len(x))
...
apples 6
bananas 7
oranges 7
milk 4
>>> print(x)
milk
EDIT: Apparently x remains defined even after the for loop exits.
A few more examples should clear things up:
>>> for x in 'some string': # x is bound to each character in the string
...     print(x)
...
s
o
m
e
 
s
t
r
i
n
g
>>> for x in (0, 1, 2, 3): # x is bound to each number in the tuple
...     print(x)
...
0
1
2
3
>>> for x in [(0,0), (1,1), (2,2)]: # x is bound to each tuple in the list
...     print(x)
...
(0, 0)
(1, 1)
(2, 2)
In your example, x is the user-defined variable to which each value in grocery_list will be assigned in turn.
One must remember what a Python variable stores. It stores the location in memory of the object it points at. In other words, Python variables are basically pointers (void*s). They "know how to find their objects."
If you have
x = 5
y = 3
the assignment y = x actually hands y a copy of the memory address where 5 is stored. Now x and y both point at the same 5 in memory; nothing was copied except the address. Suppose you attempt this
x = [1,2,3]
for k in x:
    k = 0
What happens? On each pass you hand k a copy of the memory address where the next item is stored. You then reassign k to point at 0. The items in x are left undisturbed.
Now do this
x = [[1,2,3], [4,5,6], [7,8,9]]
for k in x:
    k[0] = 0
Then x holds the list
[[0, 2, 3], [0, 5, 6], [0, 8, 9]]
You can change state of a mutable object via its memory address.
The moral: Python variables know WHERE to find their objects because they know where they are located in memory. When you assign one variable to another, you hand the recipient variable a copy of the donor variable's address. The loop variable in a for loop is just another variable.
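The moral can be checked with a short sketch. It also shows two common ways to actually replace the items of a list, since rebinding the loop variable does not; the variable names are illustrative only:

```python
x = [1, 2, 3]
for k in x:
    k = 0                      # rebinds k only; the list is untouched
unchanged = list(x)

# Assigning through an index changes the list itself.
for i, k in enumerate(x):
    x[i] = 0
zeroed = list(x)

# A comprehension builds a new list instead of modifying the old one.
doubled = [k * 2 for k in [1, 2, 3]]

# Two names, one object: a change made through one name is visible through the other.
a = [1, 2]
b = a
b.append(3)
same_object = a is b

print(unchanged, zeroed, doubled, a, same_object)
```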