I have no problem understanding this:
a = [1,2,3,4]
b = [x for x in a]
I thought that was all, but then I found this snippet:
a = [[1,2],[3,4],[5,6]]
b = [x for xs in a for x in xs]
Which makes b = [1,2,3,4,5,6]. The problem is I'm having trouble understanding the syntax in [x for xs in a for x in xs]. Could anyone explain how it works?
Ah, the incomprehensible "nested" comprehensions. Loops unroll in the same order as in the comprehension.
[leaf for branch in tree for leaf in branch]
It helps to think of it like this.
for branch in tree:
    for leaf in branch:
        yield leaf
PEP 202 asserts that this syntax, with "the last index varying fastest", is "the Right One", notably without explaining why.
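Since yield only works inside a function, here is a minimal runnable sketch of the loop form above next to the comprehension. The function name iter_leaves is made up for illustration, and tree reuses the values from the question.

```python
def iter_leaves(tree):
    """Yield every leaf, visiting the branches from left to right."""
    for branch in tree:        # outer loop: written first in the comprehension
        for leaf in branch:    # inner loop: written second, varies fastest
            yield leaf


tree = [[1, 2], [3, 4], [5, 6]]  # same values as the question's list a

from_generator = list(iter_leaves(tree))
from_comprehension = [leaf for branch in tree for leaf in branch]

print(from_generator)      # [1, 2, 3, 4, 5, 6]
print(from_comprehension)  # [1, 2, 3, 4, 5, 6]
```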
If a = [[1,2],[3,4],[5,6]], then if we unroll that list comp, we get:
    +----------------a------------------+
    | +--xs---+ , +--xs---+ , +--xs---+ | for xs in a
    | | x , x |   | x , x |   | x , x | | for x in xs
a = [ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ]
b = [ x for xs in a for x in xs ] == [1,2,3,4,5,6] #a list of just the "x"s
b = [x for xs in a for x in xs] is similar to the following nested loop:
b = []
for xs in a:
    for x in xs:
        b.append(x)
Effectively:
...for xs in a...]
is iterating over your main (outer) list and returning each of your sublists in turn.
...for x in xs]
is then iterating over each of these sub lists.
This can be re-written as:
b = []
for xs in a:
    for x in xs:
        b.append(x)
It can be written like this
result = []
for xs in a:
    for x in xs:
        result.append(x)
You can read more about it here
This is an example of a nested comprehension. Think of a = [[1,2],[3,4],[5,6]] as a 3 by 2 matrix (matrix= [[1,2],[3,4],[5,6]]).
       +---+---+
row 1  | 1 | 2 |
       +---+---+
row 2  | 3 | 4 |
       +---+---+
row 3  | 5 | 6 |
       +---+---+
The list comprehension you see is another way to get all the elements from this matrix into a list.
I will try to explain this using different variables which will hopefully make more sense.
b = [element for row in matrix for element in row]
The first for loop iterates over the rows of the matrix, i.e. [1,2], [3,4], [5,6]. The second for loop iterates over each element of the current row (a list of 2 elements).
I have written a small article on List Comprehension on my website http://programmathics.com/programming/python/python-list-comprehension-tutorial/ which actually covered a very similar scenario to this question. I also give some other examples and explanations of python list comprehension.
Disclaimer: I am the creator of that website.
Here is how I best remember it:
(pseudocode, but has this type of pattern)
[(x,y,z) (loop 1) (loop 2) (loop 3)]
where the right most loop (loop 3) is the inner most loop.
[(x,y,z) for x in range(3) for y in range(3) for z in range(3)]
has the structure as:
for x in range(3):
    for y in range(3):
        for z in range(3):
            print((x,y,z))
Edit: I wanted to add another pattern:
[(result) (loop 1) (loop 2) (loop 3) (condition)]
Ex:
[(x,y,z) for x in range(3) for y in range(3) for z in range(3) if x == y == z]
Has this type of structure:
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                print((x,y,z))
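To check that the two forms really match, here is a small sketch that collects the tuples from the explicit loops instead of printing them and compares the result with the comprehension. The names from_loops and from_comprehension are only for illustration.

```python
# Explicit loops: loop 1, loop 2, loop 3, then the condition
from_loops = []
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                from_loops.append((x, y, z))

# Same clauses, same order, written on separate lines for readability
from_comprehension = [
    (x, y, z)
    for x in range(3)
    for y in range(3)
    for z in range(3)
    if x == y == z
]

print(from_comprehension)  # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
```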
Yes, you can nest for loops INSIDE of a list comprehension. You can even nest if statements in there.
dice_rolls = []
for roll1 in range(1,7):
    for roll2 in range(1,7):
        for roll3 in range(1,7):
            dice_rolls.append((roll1, roll2, roll3))
# becomes
dice_rolls = [(roll1, roll2, roll3) for roll1 in range(1, 7) for roll2 in range(1, 7)
              for roll3 in range(1, 7)]
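As a rough sketch of the "nest if statements" part, the comprehension below adds a condition after the three loops. The sum-to-4 filter is only an illustrative choice. For comparison it also builds the unfiltered rolls with itertools.product, which is a different tool from the comprehension but yields the same tuples in the same order.

```python
from itertools import product

dice_rolls = [
    (roll1, roll2, roll3)
    for roll1 in range(1, 7)
    for roll2 in range(1, 7)
    for roll3 in range(1, 7)
]

# Alternative from the standard library: same tuples, same order
dice_rolls_product = list(product(range(1, 7), repeat=3))

# A condition placed after the loops filters the results
sum_to_four = [
    (roll1, roll2, roll3)
    for roll1 in range(1, 7)
    for roll2 in range(1, 7)
    for roll3 in range(1, 7)
    if roll1 + roll2 + roll3 == 4
]

print(len(dice_rolls))  # 216
print(sum_to_four)      # [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
```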
I wrote a short article on medium explaining list comprehensions and some other cool things you can do with python, you should have a look if you're interested : )
You are asking for nested lists.
Let me try to answer this question in a step-by-step basis, covering these topics:
For loops
list comprehensions
both nested for loops and list comprehensions
For Loop
You have this list: lst = [0,1,2,3,4,5,6,7,8] and you want to iterate the list one item at a time and add them to a new list. You do a simple for loop:
lst = [0,1,2,3,4,5,6,7,8]
new_list = []
for lst_item in lst:
    new_list.append(lst_item)
You can do exactly the same thing with a list comprehension (it's more pythonic).
List Comprehension
List comprehensions are a (*sometimes) simpler and more elegant way to create lists.
new_list = [lst_item for lst_item in lst]
You read it this way: for every lst_item in lst, add lst_item to new_list
Nested Lists
What are nested lists?
A simple definition: it's a list which contains sublists. You have lists within another list.
*Depending on who you talk with, nested lists are one of those cases where list comprehensions can be more difficult to read than regular for loops.
Let's say you have this nested list: nested_list = [[0,1,2], [3,4,5], [6,7,8]], and you want to transform it into a flattened list like this one: flattened_list = [0,1,2,3,4,5,6,7,8].
If you use the same for loop as before, you won't get it.
flattened_list = []
for list_item in nested_list:
    flattened_list.append(list_item)
Why? Because each list_item is actually one of the sublists. In the first iteration you get [0,1,2], then [3,4,5] and finally [6,7,8].
You can check it like this:
nested_list[0] == [0, 1, 2]
nested_list[1] == [3, 4, 5]
nested_list[2] == [6, 7, 8]
You need a way to go into the sublists and add each sublist item to the flattened list.
How? You add an extra layer of iteration. Actually, you add one for each layer of sublists.
In the example above you have two layers.
The for loop solution.
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
Let's read this code out loud.
for sublist in nested_list: each sublist is [0,1,2], [3,4,5], [6,7,8]. In the first iteration of the first loop we go inside [0,1,2].
for item in sublist: the first item of [0,1,2] is 0, which is appended to flattened_list. Then comes 1 and finally 2.
Up until this point flattened_list is [0,1,2].
We finish the last iteration of the second loop, so we go to the next iteration of the first loop. We go inside [3,4,5].
Then we go to each item of this sublist and append it to flattened_list. And then we go to the next iteration, and so on.
How can you do it with List Comprehensions?
The List Comprehension solution.
flattened_list = [item for sublist in nested_list for item in sublist]
You read it like this: add each item from each sublist from nested_list.
It's more concise, but if you have many layers it could become more difficult to read.
Let's see both together
#for loop
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
----------------------------------------------------------------------
#list comprehension
flattened_list = [item for sublist in nested_list for item in sublist]
The more layers of iteration you have, the more for x in y clauses you add.
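As a sketch of what one extra layer looks like, here is a hypothetical three-level list (deep_list and its values are made up for illustration) flattened both ways. Each extra level adds one more for clause on the right.

```python
# Hypothetical three-level list: a list of lists of lists
deep_list = [[[0, 1], [2]], [[3, 4, 5]], [[6], [7, 8]]]

# for loop: one loop per layer
flattened_loop = []
for block in deep_list:
    for sublist in block:
        for item in sublist:
            flattened_loop.append(item)

# list comprehension: the same three loops, in the same order
flattened_comp = [item for block in deep_list for sublist in block for item in sublist]

print(flattened_comp)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```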
EDIT April 2021.
You can flatten a nested list with NumPy. Technically speaking, in NumPy the term would be 'array'.
For a small list it's overkill, but if you're crunching millions of numbers in a list you may need NumPy.
From NumPy's documentation: arrays have a flat attribute.
import numpy as np

b = np.array(
    [
        [ 0,  1,  2,  3],
        [10, 11, 12, 13],
        [20, 21, 22, 23],
        [30, 31, 32, 33],
        [40, 41, 42, 43]
    ]
)
for element in b.flat:
    print(element)
0
1
2
...
41
42
43
The whole confusion in this syntax arises from the leading variable and from bad naming conventions.
[door for room in house for door in room]
Here 'door' is what causes the confusion.
Imagine there were no 'door' variable at the start:
[for room in house for door in room]
This way the loop order is easier to see.
It becomes even more confusing with variable names like [x, xs, y], so variable naming is also key.
You can also do something with the loop variable, like:
doors = [door for room in house for door in str(room)]
which is equivalent to:
doors = []
for room in house:
    for door in str(room):
        doors.append(door)
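To see what the str(room) variant produces, here is a small sketch with made-up data. It assumes house is a list of integers, which cannot be iterated directly, so each room is converted to a string first and then walked character by character.

```python
house = [12, 345]  # hypothetical rooms; ints are not iterable, their str() is

doors = [door for room in house for door in str(room)]

# Equivalent explicit loops
doors_loop = []
for room in house:
    for door in str(room):  # str(12) is "12", which iterates as "1", "2"
        doors_loop.append(door)

print(doors)  # ['1', '2', '3', '4', '5']
```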
english grammar:
b = "a list of 'specific items' taken from 'what loop?' "
b = [x for xs in a for x in xs]
x is the specific item
for xs in a for x in xs is the loop
Related
I'm trying to understand the following snippet of python code:
lst = [[c for c in range(r)] for r in range(3)] #line1
for x in lst: #line2
    for y in x: #line3
        if y < 2: #line4
            print('*', end='') #line5
I know what functions like range(3) mean on their own, but I don't get the context. It's a bit complicated to see through this nested piece of code. The first line with the 'lst' is most confusing. Why is the first line producing the following output:
[[], [0], [0, 1]]
and how does line2 and line3 works together? Thanks in advance for your answer. Every idea is welcome!
Re
"The first line with the 'lst' is most confusing.":
Wherever you see [ ...for...] you have what's called a "list comprehension." This is a way to build up a list based on a one-line-loop description of the elements. For example:
list1 = [letter for letter in 'abcd']
and
list2 = []
for letter in 'abcd':
list2.append(letter)
yield identical lists list1 and list2
In your case, you have two sets of [] and two for statements, so you have a list comprehension inside a list comprehension: so the result is not just a list but a nested list.
Re
"and how does line2 and line3 works together?"
Line2 iterates through all the items in your list lst.
But each of those items is also a list, because you have a nested list. So line3 iterates through each item in that inner list.
In that nested list comprehension, r begins with a value of 0, so the inside list evaluates to [c for c in range(0)], which is []. When r is 1, it evaluates to [c for c in range(1)], which is [0]. When r is 2, you have [c for c in range(2)], which is [0,1]. When these are generated within the outer list comprehension, they are returned as elements of the outer list. So you have a list of lists in lst.
The for loop then iterates over each of these lists in line 2. Line 3 then iterates over the integer elements within each list.
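The walkthrough above can be sketched as runnable code. The explicit-loop version of line 1 and the stars variable (which collects the output instead of printing it) are additions for illustration only.

```python
lst = [[c for c in range(r)] for r in range(3)]  # line 1 of the question

# The same construction with explicit loops
lst_loop = []
for r in range(3):
    inner = []              # one inner list per value of r
    for c in range(r):      # range(0) is empty, so the first inner list is []
        inner.append(c)
    lst_loop.append(inner)

# Lines 2 to 5 of the question, collecting the stars instead of printing them
stars = ''
for x in lst:        # x is each inner list: [], [0], [0, 1]
    for y in x:      # y is each integer inside that inner list
        if y < 2:
            stars += '*'

print(lst)    # [[], [0], [0, 1]]
print(stars)  # ***
```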
lst = [[c for c in range(r)] for r in range(3)]
this is a nested list comprehension.
But here it could be simplified as we don't need a list of lists, just a list of range objects so we can iterate on them. So
lst = [range(r) for r in range(3)]
is way simpler.
And while we're at it, why create a list at all? Just remove it and use a classic loop:
for r in range(3):
    for y in range(r):
        if y < 2:
            print('*', end='')
The snippet creates a LIST of LISTS where size (and item's value actually) depends on the index in the first level list.
The snippet is the same as
result = []
for k in range(3):
    result.append([])
    for v in range(k):
        result[k].append(v)
print(result)
=>>>
[[], [0], [0, 1]]
The first line is equivalent to
lst = [list(range(r)) for r in range(3)]
It is a list comprehension that generates a list of lists, each counting up to r-1. list(range(r)) generates [0, 1, 2, ..., r-1], and because the r variable takes the values 0, 1 and 2 in the list comprehension, it generates a list containing the lists [] (r=0), [0] (r=1) and [0, 1] (r=2).
Line 2 iterates over the lists of lst, and line 3 iterates over the values of each list. Therefore, these two lines jointly iterate over all the numbers in lst.
This answer works very well for finding indices of items from a list in another list, but the problem with it is, it only gives them once. However, I would like my list of indices to have the same length as the searched for list.
Here is an example:
thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]
With this solution, ilist = [1,2,4] but what I want is ilist = [1,2,1,4] so that len(ilist) = len(Mylist). It leaves out the index that has already been found, but if my items repeat in the list, it will not give me the duplicates.
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
Basically, "for each element of Mylist, get its position in thelist."
This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
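To illustrate that assumption: list.index raises ValueError when the value is absent. The sketch below uses a made-up value 'Z' and records None for missing items, which is just one possible fallback and not part of the original answer.

```python
thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['B', 'Z', 'B', 'E']  # 'Z' is a hypothetical value missing from thelist

try:
    ilist = [thelist.index(x) for x in Mylist]
except ValueError as exc:
    # list.index raises ValueError for a value that is not in the list
    print('lookup failed:', exc)

# One possible fallback: record None for values that are not found
ilist_safe = [thelist.index(x) if x in thelist else None for x in Mylist]
print(ilist_safe)  # [1, None, 1, 4]
```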
UPDATE
For substrings:
thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
UPDATE 2
Here's a version that does substrings in the other direction using the example in the comments below:
thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']
ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]
print(ilist) # [1, 2, 1, 4, 2, 0]
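Both substring versions call next() without a default, so an item with no match raises StopIteration. A minimal sketch of one way around that, using next()'s optional default argument; the 'xyz' entry and the choice of None as the placeholder are assumptions for illustration.

```python
thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['Boo', 'Cup', 'xyz']  # 'xyz' (hypothetical) contains no entry of thelist

ilist = [
    # The second argument to next() is returned when the generator is empty
    next((i for i, y in enumerate(thelist) if y in x), None)
    for x in Mylist
]
print(ilist)  # [1, 2, None]
```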
The code below would work:
ilist = [thelist.index(i) for i in Mylist]
Make a reverse lookup from strings to indices:
string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]
This avoids the quadratic behaviour of repeated .index() lookups.
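One caveat worth sketching: if thelist contains repeated values, the dict comprehension keeps the last position of each value, whereas .index() returns the first. The example data below is made up, and the setdefault loop is just one way to keep the first position.

```python
thelist = ['A', 'B', 'C', 'B']  # hypothetical list with a repeated 'B'
Mylist = ['B', 'C', 'B']

# Later duplicates overwrite earlier ones, so this maps 'B' to 3
last_index = {c: i for i, c in enumerate(thelist)}
print([last_index[c] for c in Mylist])  # [3, 2, 3]

# setdefault only stores a key the first time it is seen, matching .index()
first_index = {}
for i, c in enumerate(thelist):
    first_index.setdefault(c, i)

ilist = [first_index[c] for c in Mylist]
print(ilist)  # [1, 2, 1]
```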
If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized and NlogN) manner.
import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)
npi.indices is essentially the array-generalization of list.index. Also, it has a kwarg to give you control over how to deal with missing values and such.
vec = [[1,2,3], [4,5,6], [7,8,9]]
print([num for elem in vec for num in elem])  # <----- this
>>> [1, 2, 3, 4, 5, 6, 7, 8, 9]
This is tricking me out.
I understand that elem is each of the lists inside the list, from for elem in vec
I don't quite understand the usage of num and for num in elem in the beginning and the end.
How does python interpret this?
What's the order it looks at?
Let's break it down.
A simple list-comprehension:
[x for x in collection]
This is easy to understand if we break it into parts: [A for B in C]
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself.
In this way, one could write:
[x.lower() for x in words]
In order to convert all words in a list to lowercase.
It is when we complicate this with another list like so:
[x for y in collection for x in y] # [A for B in C for D in E]
Here, something special happens. We want our final list to include A items, and A items are found inside B items, so we have to tell the list-comprehension that.
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself
D is each item in the collection E (in this case, also A)
E is another collection (in this case, B)
This logic is similar to the normal for loop:
for y in collection:     # for B in C:
    for x in y:          #     for D in E: (in this case: for A in B)
        # receive x      #         # receive A
To expand on this, and give a great example + explanation, imagine that there is a train.
The train engine (the front) is always going to be there (the result of the list-comprehension)
Then, there are any number of train cars, each train car is in the form: for x in y
A list comprehension could look like this:
[z for b in a for c in b for d in c ... for z in y]
Which would be like having this regular for-loop:
for b in a:
    for c in b:
        for d in c:
            ...
                for z in y:
                    # have z
In other words, instead of going down a line and indenting, in a list-comprehension you just add the next loop on to the end.
To go back to the train analogy:
Engine - Car - Car - Car ... Tail
What is the tail? The tail is a special thing in list-comprehensions. You don't need one, but if you have a tail, the tail is a condition. Look at this example:
[line for line in file if not line.startswith('#')]
This would give you every line in a file as long as the line didn't start with a hash sign (#); other lines are just skipped.
The trick to using the "tail" of the train is that it is checked for True/False at the same time as you have your final 'Engine' or 'result' from all the loops. The above example in a regular for-loop would look like this:
for line in file:
    if not line.startswith('#'):
        # have line
Please note: though in my analogy of a train there is only a 'tail' at the end of the train, the condition or 'tail' can come after every 'car' or loop...
for example:
>>> z = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
>>> [x for y in z if sum(y)>10 for x in y if x < 10]
[5, 6, 7, 8, 9]
In regular for-loop:
>>> for y in z:
...     if sum(y)>10:
...         for x in y:
...             if x < 10:
...                 print(x)
...
5
6
7
8
9
From the list comprehension documentation:
When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached.
In other words, pretend that the for loops are nested. Reading from left to right your list comprehension can be nested as:
for elem in vec:
    for num in elem:
        num # the *single expression* from the spec
where the list comprehension will use that last, innermost block as the values of the resulting list.
Your code equals:
temp = []
for elem in vec:
    for num in elem:
        temp.append(num)
You can look at a list comprehension as just sequential statements. This applies to any number of levels of for and if statements.
For example, consider a double for loop, each with its own if:
vec = [[1,2,3], [4,5,6], [7,8,9]]
result = [i for e in vec if len(e)==3 for i in e if i%2==0]
Here the list comprehension is same as:
result = []
for e in vec:
    if len(e)==3:
        for i in e:
            if i%2==0:
                result.append(i)
As you can see, a list comprehension is simply the for and if statements without indentation, in the same sequence.
How do you make a list of lists within a for loop?
Here is what I have coded right now:
a = 0
xy=[[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]],[[],[]]]
for liness in range(len(NNCatelogue)):
    a=0
    for iii in range(len(NNCatelogue[liness])):
        while a < len(catid):
            if catid[a]==NNCatelogue[liness][iii]:
                xyiii = (catid[a], a)
                xy.append(xyiii)
            a += 1
The output that I get is a lengthy list of pairs, as expected. It looks somewhat like the following:
[...,('C-18-1262', 30908),
('C-18-1264', 30910),
('C-18-1265', 30911),
('C-18-1267', 30913),
('C-18-1250', 30896),
('C-18-1254', 30900),...]
I would like to turn this list of pairs into a list of lists of pairs though. There are 1268 iterations, and the length of each list should be 12. (So 1268 lists with 12 elements in each of them). Any ideas for how to approach this when in a loop?
Something like this, perhaps. Note that I am using iteration over the lists directly to save a lot of unnecessary indexing.
xy = []
for line in NNCatelogue:
    l = []
    for c in line:
        for a, ca in enumerate(catid):
            if ca == c:
                l.append((ca, a))
    xy.append(l)
If you're using the inner loop just to search for the category index, as I suspect you are, a dictionary may be a useful addition to avoid the inner loop.
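A minimal sketch of that dictionary idea, under two assumptions that the question does not confirm: each id appears only once in catid, and ids absent from catid should be skipped (as the loop version does). The sample data and the name index_of are hypothetical stand-ins.

```python
# Hypothetical stand-ins for the question's data
catid = ['C-18-1250', 'C-18-1254', 'C-18-1262']
NNCatelogue = [['C-18-1262', 'C-18-1250'], ['C-18-1254', 'C-18-9999']]

# Built once: category id -> its position in catid (assumes unique ids)
index_of = {ca: a for a, ca in enumerate(catid)}

# One inner list of (id, index) pairs per line of the catalogue
xy = [
    [(c, index_of[c]) for c in line if c in index_of]
    for line in NNCatelogue
]
print(xy)
# [[('C-18-1262', 2), ('C-18-1250', 0)], [('C-18-1254', 1)]]
```

Each lookup in index_of replaces one full scan of catid, which is where the saving comes from.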
I have a few friendly suggestions right off the bat:
First of all, the a=0 at the very beginning is redundant. You do the same thing twice with the a=0 inside of the first for loop.
Second, why are you declaring a huge framework of list elements for xy at the top? You can always append() what you need as you go along.
Finally, your while loop is just a simple for loop:
for n in range(len(catid)):
You can make a list of lists using list comprehensions like so:
list_of_lists = [[j for j in range(0, 3)] for _ in range(0, 3)]
Which outputs a 3x3 list:
[ [0, 1, 2],
[0, 1, 2],
[0, 1, 2]
]
My question is about Python List Comprehension readability. When I come across code with complex/nested list comprehensions, I find that I have to re-read them several times in order to understand the intent.
Is there an intuitive way to read aloud list comprehensions? Seems like I should start "reading" from the middle, then read the if conditions (if any), and read the expression last.
Here's how I would read the following line of code aloud, in order to understand it:
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
"For each element in List x, and each element in List y, if the two elements are not the same, create a list of tuples."
Two examples that I am struggling with:
How would you read the following List Comprehensions aloud?
From another question in Stack Overflow: [x for b in a for x in b]
Python docs has this example:
[[row[i] for row in matrix] for i in range(4)]
Any suggestions or pointers for ways to read aloud list comprehensions such that the intention becomes clearer is much appreciated.
I usually unfold it in my mind into a generating loop, so for example
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
is the list comprehension for the generator
for x in [1,2,3]:
    for y in [3,1,4]:
        if x != y:
            yield (x, y)
Example #1
[x for b in a for x in b] is the comprehension for
for b in a:
    for x in b:
        yield x
Example result for a = [[1,2,3],[4,5,6]]: [1, 2, 3, 4, 5, 6]
Example #2
[[row[i] for row in matrix] for i in range(4)] (note the inner expression is another comprehension!):
for i in range(4):
    yield [row[i] for row in matrix]
which is unfolded
for i in range(4):
    l = []
    for row in matrix:
        l.append(row[i])
    yield l
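Because yield needs a function around it, here is a runnable sketch of Example #2. The 3x4 matrix values and the function name columns are assumptions chosen for illustration.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]  # sample 3x4 matrix, values chosen for illustration


def columns(matrix):
    """Unfolded form: yield one new list per column index."""
    for i in range(4):
        column = []
        for row in matrix:
            column.append(row[i])
        yield column


transposed = [[row[i] for row in matrix] for i in range(4)]

print(transposed)             # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
print(list(columns(matrix)))  # same result
```

Read aloud: "for each column index i, build the list of row[i] over every row", which swaps the rows and columns.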
"Construct a list of X's based on Y's and Z's for which Q is true."