I have no problem understanding this:
a = [1,2,3,4]
b = [x for x in a]
I thought that was all, but then I found this snippet:
a = [[1,2],[3,4],[5,6]]
b = [x for xs in a for x in xs]
Which makes b = [1,2,3,4,5,6]. The problem is that I'm having trouble understanding the syntax in [x for xs in a for x in xs]. Could anyone explain how it works?
Ah, the incomprehensible "nested" comprehensions. Loops unroll in the same order as in the comprehension.
[leaf for branch in tree for leaf in branch]
It helps to think of it like this.
for branch in tree:
    for leaf in branch:
        yield leaf
PEP 202 asserts that this syntax, with "the last index varying fastest", is "the Right One", notably without explaining why.
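As a minimal sketch of that unrolling, the nested loops can be wrapped in a generator function and compared with the comprehension. The helper name iter_leaves and the sample tree are made up for illustration.

```python
def iter_leaves(tree):
    """Yield every leaf, visiting branches in order (hypothetical helper)."""
    for branch in tree:        # outer loop: first `for` in the comprehension
        for leaf in branch:    # inner loop: second `for` in the comprehension
            yield leaf

tree = [[1, 2], [3, 4], [5, 6]]  # sample data, same shape as `a` in the question
from_loops = list(iter_leaves(tree))
from_comprehension = [leaf for branch in tree for leaf in branch]
print(from_loops == from_comprehension)
```

Both forms visit the leaves in the same order, which is the point of reading the for clauses left to right.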
If a = [[1,2],[3,4],[5,6]], then unrolling that list comprehension gives:
     +----------------a------------------+
     | +--xs---+ , +--xs---+ , +--xs---+ | for xs in a
     | | x , x |   | x , x |   | x , x | | for x in xs
a  = [ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ]
b  = [ x for xs in a for x in xs ] == [1,2,3,4,5,6] #a list of just the "x"s
b = [x for xs in a for x in xs] is similar to the following nested loop:
b = []
for xs in a:
    for x in xs:
        b.append(x)
Effectively:
...for xs in a...]
is iterating over your main (outer) list and returning each of your sublists in turn.
...for x in xs]
is then iterating over each of these sub lists.
This can be re-written as:
b = []
for xs in a:
    for x in xs:
        b.append(x)
It can be written like this:
result = []
for xs in a:
    for x in xs:
        result.append(x)
You can read more about it here
This is an example of a nested comprehension. Think of a = [[1,2],[3,4],[5,6]] as a 3 by 2 matrix (matrix= [[1,2],[3,4],[5,6]]).
row 1 | 1 | 2 |
row 2 | 3 | 4 |
row 3 | 5 | 6 |
The list comprehension you see is another way to get all the elements from this matrix into a list.
I will try to explain this using different variables which will hopefully make more sense.
b = [element for row in matrix for element in row]
The first for loop iterates over the rows of the matrix, i.e. [1,2], [3,4] and [5,6]. The second for loop iterates over the elements of each 2-element row.
I have written a small article on List Comprehension on my website http://programmathics.com/programming/python/python-list-comprehension-tutorial/ which actually covered a very similar scenario to this question. I also give some other examples and explanations of python list comprehension.
Disclaimer: I am the creator of that website.
Here is how I best remember it:
(pseudocode, but has this type of pattern)
[(x,y,z) (loop 1) (loop 2) (loop 3)]
where the right most loop (loop 3) is the inner most loop.
[(x,y,z) for x in range(3) for y in range(3) for z in range(3)]
has the structure as:
for x in range(3):
    for y in range(3):
        for z in range(3):
            print((x,y,z))
Edit: I wanted to add another pattern:
[(result) (loop 1) (loop 2) (loop 3) (condition)]
Ex:
[(x,y,z) for x in range(3) for y in range(3) for z in range(3) if x == y == z]
Has this type of structure:
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                print((x,y,z))
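A small sketch to check that the two forms agree. Here the loop version collects into a list instead of printing, so the results can be compared; the variable names are only for illustration.

```python
# Comprehension form: three loops followed by one condition.
triples = [(x, y, z)
           for x in range(3)
           for y in range(3)
           for z in range(3)
           if x == y == z]

# Loop form, collecting into a list instead of printing.
triples_loop = []
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                triples_loop.append((x, y, z))

print(triples)
```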
Yes, you can nest for loops INSIDE of a list comprehension. You can even nest if statements in there.
dice_rolls = []
for roll1 in range(1,7):
    for roll2 in range(1,7):
        for roll3 in range(1,7):
            dice_rolls.append((roll1, roll2, roll3))
# becomes
dice_rolls = [(roll1, roll2, roll3) for roll1 in range(1, 7) for roll2 in range(1, 7)
              for roll3 in range(1, 7)]
I wrote a short article on medium explaining list comprehensions and some other cool things you can do with python, you should have a look if you're interested : )
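For loops that are fully independent like these dice rolls, the standard library's itertools.product is a possible alternative to chaining for clauses. This is a different technique from the answer above, shown here only as a sketch for comparison.

```python
from itertools import product

dice_rolls = [(roll1, roll2, roll3)
              for roll1 in range(1, 7)
              for roll2 in range(1, 7)
              for roll3 in range(1, 7)]

# product(..., repeat=3) iterates with the rightmost position varying fastest,
# which matches the nesting order of the comprehension above.
dice_rolls_product = list(product(range(1, 7), repeat=3))

print(len(dice_rolls))                     # 6 * 6 * 6 outcomes
print(dice_rolls == dice_rolls_product)
```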
You are asking for nested lists.
Let me try to answer this question step by step, covering these topics:
For loops
list comprehensions
both nested for loops and list comprehensions
For Loop
You have this list: lst = [0,1,2,3,4,5,6,7,8] and you want to iterate the list one item at a time and add them to a new list. You do a simple for loop:
lst = [0,1,2,3,4,5,6,7,8]
new_list = []
for lst_item in lst:
    new_list.append(lst_item)
You can do exactly the same thing with a list comprehension (it's more pythonic).
List Comprehension
List comprehensions are a (*sometimes) simpler and more elegant way to create lists.
new_list = [lst_item for lst_item in lst]
You read it this way: for every lst_item in lst, add lst_item to new_list
Nested Lists
What are nested lists?
A simple definition: it's a list which contains sublists. You have lists within another list.
*Depending on who you talk with, nested lists are one of those cases where list comprehensions can be more difficult to read than regular for loops.
Let's say you have this nested list: nested_list = [[0,1,2], [3,4,5], [6,7,8]], and you want to transform it into a flattened list like this one: flattened_list = [0,1,2,3,4,5,6,7,8].
If you use the same single for loop as before, you won't get it.
flattened_list = []
for list_item in nested_list:
    flattened_list.append(list_item)
Why? Because each list_item is actually one of the sublists. In the first iteration you get [0,1,2], then [3,4,5] and finally [6,7,8].
You can check it like this:
nested_list[0] == [0, 1, 2]
nested_list[1] == [3, 4, 5]
nested_list[2] == [6, 7, 8]
You need a way to go into the sublists and add each sublist item to the flattened list.
How? You add an extra layer of iteration. Actually, you add one for each layer of sublists.
In the example above you have two layers.
The for loop solution.
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
Let's read this code out loud.
for sublist in nested_list: each sublist is [0,1,2], [3,4,5], [6,7,8]. In the first iteration of the first loop we go inside [0,1,2].
for item in sublist: the first item of [0,1,2] is 0, which is appended to flattened_list. Then comes 1 and finally 2.
Up until this point flattened_list is [0,1,2].
We finish the last iteration of the second loop, so we go to the next iteration of the first loop. We go inside [3,4,5].
Then we go to each item of this sublist and append it to flattened_list. Then we go to the next iteration, and so on.
How can you do it with List Comprehensions?
The List Comprehension solution.
flattened_list = [item for sublist in nested_list for item in sublist]
You read it like this: add each item from each sublist from nested_list.
It's more concise, but if you have many layers it could become more difficult to read.
Let's see both together
#for loop
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
----------------------------------------------------------------------
#list comprehension
flattened_list = [item for sublist in nested_list for item in sublist]
The more layers of iteration you have, the more for x in y clauses you will need to add.
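As a sketch of that rule, here is a made-up three-layer list flattened with one for clause per layer. The name deep_list and its contents are assumptions for illustration.

```python
# Hypothetical three-layer data: a list of lists of lists.
deep_list = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]

# One `for` clause per layer, written outermost first.
flat = [item
        for outer in deep_list
        for inner in outer
        for item in inner]
print(flat)
```

Breaking the comprehension over several lines, one clause per line, keeps the nesting order visible once there are more than two layers.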
EDIT April 2021.
You can also flatten a nested list with NumPy. Technically speaking, in NumPy the term would be 'array'.
For a small list it's overkill, but if you're crunching millions of numbers you may need NumPy.
From the NumPy documentation: arrays have a flat attribute, which is an iterator over all the elements.
import numpy as np

b = np.array(
    [
        [ 0,  1,  2,  3],
        [10, 11, 12, 13],
        [20, 21, 22, 23],
        [30, 31, 32, 33],
        [40, 41, 42, 43]
    ]
)
for element in b.flat:
    print(element)
0
1
2
...
41
42
43
The whole confusion in this syntax arises from the first variable and from bad naming conventions.
[door for room in house for door in room]
Here, the leading 'door' is what causes the confusion.
Imagine there were no 'door' variable at the start:
[for room in house for door in room]
This way it is easier to follow.
It becomes even more confusing with variable names like [x, xs, y], so variable naming is also key.
You can also do something with the loop variable like:
doors = [door for room in house for door in str(room)]
which is equivalent to:
doors = []
for room in house:
    for door in str(room):
        doors.append(door)
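A runnable sketch of this, assuming a made-up house whose "rooms" are plain integers, so that str(room) gives a string to iterate over:

```python
house = [12, 34, 5]   # made-up "rooms"; str(room) turns each into a string

doors = [door for room in house for door in str(room)]

doors_loop = []
for room in house:
    for door in str(room):   # iterating a string yields its characters
        doors_loop.append(door)

print(doors)
```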
english grammar:
b = "a list of 'specific items' taken from 'what loop?' "
b = [x for xs in a for x in xs]
x is the specific item
for xs in a for x in xs is the loop
This is a general question that I have not been able to understand.
If I have this:
somelist = [[a for a, b in zip(X, y) if b == c] for c in np.unique(y)]
How can I write this as normal multiline for loop? I never seem to get it right.
EDIT: So far I've tried this:
somelist = []
for c in np.unique(y):
    for x, t in zip(X, y):
        if t == c:
            somelist.append(x)
But I wasn't sure if this was right because I wasn't getting an expected result in some other part of my code.
Let me know if this works:
Expand the outer list comprehension first into the outer loop, then expand the inner list comprehension inside it.
somelist=[]
for c in np.unique(y):
    ans=[]
    for a,b in zip(X,y):
        if b==c:
            ans.append(a)
    somelist.append(ans)
To flatten a nested comprehension out, follow these steps:
First, create an empty container: somelist = []
If the comprehension has an if clause, put it right after the for it belongs to
Then expand the nested comprehensions, starting with the innermost
The inner comprehension is:
row = []
for a, b in zip(X, y):
    if b == c:
        row.append(a)
Then, somelist is nothing more than [row for c in np.unique(y)], where row depends on the current value of c.
This one is equivalent to:
somelist = []
for c in np.unique(y):
    somelist.append(row)
So the complete version is:
somelist = []
for c in np.unique(y):
    row = []
    for a, b in zip(X, y):
        if b == c:
            row.append(a)
    somelist.append(row)
This is how it looks using a "normal" for loop (a.k.a. without using a list comprehension):
somelist = []
for c in np.unique(y):
    l = []
    for a, b in zip(X, y):
        if b == c:
            l.append(a)
    somelist.append(l)
You were very close. The problem with your approach is that you forgot an important point: the result of the list comprehension will be a list of lists. Thus, the values computed in the inner loop need to be held in a temporary list that will be appended to the "main" list somelist to create a list of lists:
somelist = []
for c in np.unique(y):
    # create a temporary list that will hold the values computed in the
    # inner loop.
    sublist = []
    for x, t in zip(X, y):
        if t == c:
            sublist.append(x)
    # after the list has been computed, add the temporary list to the main
    # list `somelist`. That way, a list of lists is created.
    somelist.append(sublist)
The general rule of thumb when converting a list comprehension to a vanilla for loop is that for each level of nesting, you'll need another nested for loop and another temporary list to hold the values computed in the nested loop.
As a caveat, once you get past 2-3 levels of nesting in your comprehension, you should seriously consider converting it to a normal for loop. Whatever efficiency you gain is offset by the unreadability of the nested list comprehension. Remember: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."
After offering the obvious caveat that, for performance and Pythonic reasons, you should not expand your list comprehension into a multi-line loop, you would write it from the outside in:
somelist = []
for c in np.unique(y):
    inner_list = []
    for a, b in zip(X, y):
        if b == c:
            inner_list.append(a)
    somelist.append(inner_list)
And now you see the beauty of list comprehensions.
somelist = []
for c in np.unique(y):
    somelist.append([a for a, b in zip(X, y) if b == c])
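To check an expansion like this, it can help to run both forms on a small input and compare. The sketch below uses made-up data and, as an assumption to stay free of NumPy, replaces np.unique(y) with sorted(set(y)), which also gives the sorted distinct labels.

```python
# Made-up sample data: X holds values, y holds the label of each value.
X = ["a", "b", "c", "d", "e"]
y = [1, 0, 1, 2, 0]

# Stand-in for np.unique(y): the sorted distinct labels, without NumPy.
labels = sorted(set(y))

# Nested comprehension from the question.
somelist = [[a for a, b in zip(X, y) if b == c] for c in labels]

# Expanded form: one temporary list per outer iteration.
expanded = []
for c in labels:
    group = []
    for a, b in zip(X, y):
        if b == c:
            group.append(a)
    expanded.append(group)

print(somelist)
```

Note that here the inner brackets produce one sublist per label, which is why the expanded form needs the temporary list.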
Given a list, how can I select element pairs that satisfy some criterion?
I know a brute-force search over all pairs can achieve this:
b = []
for i in range(len(a)-1):
    for j in range(i+1,len(a)):
        if isTrue(a[i],a[j]):
            b.append([a[i],a[j]])
Is there a better solution that does this more efficiently?
Update
@scytale's comment inspired a solution, but it is not perfect.
For example, take a = [1.2,3.1,0.3,4.2,5.6,2.7,1.1]. I want the pairs of elements whose sum is less than 3.
b = [(x,y) for x in a for y in a if (x+y)<3 and x!=y]
This will give duplicate pairs of:
[(1.2,0.3),(1.2,1.1),(0.3,1.2),(0.3,1.1),(1.1,1.2),(1.1,0.3)]
But what I want is:
[(1.2,0.3),(1.2,1.1),(0.3,1.1)]
What about using combinations and filter?
from itertools import combinations
c = combinations(a, 2)
f = filter(lambda pair: isTrue(*pair), c)  # filter passes each pair as a single tuple
Or using list comprehension:
result = [(x, y) for x, y in c if isTrue(x, y)]
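A sketch of this approach on the sample data from the question. The function sums_below_three is a made-up stand-in for the isTrue criterion.

```python
from itertools import combinations

def sums_below_three(x, y):
    """Stand-in for the isTrue criterion: the pair sums to less than 3."""
    return x + y < 3

a = [1.2, 3.1, 0.3, 4.2, 5.6, 2.7, 1.1]

# combinations(a, 2) yields each unordered pair once, in input order.
pairs = [(x, y) for x, y in combinations(a, 2) if sums_below_three(x, y)]
print(pairs)
```

Because combinations never yields both (x, y) and (y, x), the duplicate pairs from the double-loop comprehension do not appear. It still examines every pair, so the amount of work stays quadratic in the length of the list.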
I am new to Python. I have the following code, which is part of a string algorithm that I'm currently developing.
>>> newlist = []
>>> i = 0
>>> for x in range(len(list1)):
...     new_item = [y for y in list1[i] if y not in list2[i]]
...     newlist.append(new_item)
...     i = i + 1
...
>>> print(newlist)
I would like to do this using a list comprehension, as I've read it is performance optimized. Can somebody suggest a method?
Thank you.
[Edit]
example:
list1= [[['pat'],['cut'],['rat']], [['sat','pat'],['cut','pat']],[['instructor','plb','error0992'],['instruction','address','00x0993'],['data','address','017x112']]]
list2= [[['pat'], ['cut'], ['rat']], [['sat', 'pat']], [['instructor', 'plb', 'error0992'], ['instruction', 'address', '00x0993']]]
So the new list,
newlist= [[], [['cut', 'pat']], [['data', 'address', '017x112']]]
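If you just want all elements that are in one list and not in another, I would suggest looking into Python sets. They don't allow duplicates, but the performance and readability benefits are large. Note that set elements must be hashable, so with lists of lists like those in the question you would first need to convert the inner lists to tuples.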
You would implement this like so:
newlist = list(set(list1).difference(set(list2)))
If you want to apply this in place of your current solutions, you should do something along the lines of what Dominic suggested (slightly edited for readability):
[list(set(a)-set(b)) for a, b in zip(list1, list2)]
If the order matters, or you have duplicates, then the single list comprehension you had above should do the trick, just wrap it as a lambda function to make it more readable:
single_item = lambda i: [y for y in list1[i] if y not in list2[i]]
newlist = [single_item(i) for i in range(len(list1))]
This is a nested list comprehension that does the same thing as your code (although it will not preserve the value of i).
newlist = [[y for y in list1[i] if y not in list2[i]] for i in range(len(list1))]
TL;DR: [[y for y in sub if y not in list2[i]] for i, sub in enumerate(list1)]
You should use enumerate instead of the range(len()) non-idiom. You may also want to consider making this a generator expression, either with a concrete nested list:
([y for y in sub if y not in list2[i]] for i, sub in enumerate(list1))
or not:
((y for y in sub if y not in list2[i]) for i, sub in enumerate(list1))
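Since the two lists are walked in parallel, zip (already used in one of the answers above) can pair the sublists directly so that no index is needed. A sketch on the example data from the question:

```python
list1 = [[['pat'], ['cut'], ['rat']],
         [['sat', 'pat'], ['cut', 'pat']],
         [['instructor', 'plb', 'error0992'],
          ['instruction', 'address', '00x0993'],
          ['data', 'address', '017x112']]]
list2 = [[['pat'], ['cut'], ['rat']],
         [['sat', 'pat']],
         [['instructor', 'plb', 'error0992'],
          ['instruction', 'address', '00x0993']]]

# zip pairs up the sublists, so no index variable is needed.
newlist = [[y for y in sub1 if y not in sub2]
           for sub1, sub2 in zip(list1, list2)]
print(newlist)
```

This keeps the original order and works with unhashable elements, at the cost of a linear `not in` scan for every item.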
vec = [[1,2,3], [4,5,6], [7,8,9]]
print([num for elem in vec for num in elem])  # <----- this
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
This is tripping me up.
I understand that elem is each of the lists inside the list, from for elem in vec.
I don't quite understand the usage of num and for num in elem in the beginning and the end.
How does python interpret this?
What's the order it looks at?
Let's break it down.
A simple list-comprehension:
[x for x in collection]
This is easy to understand if we break it into parts: [A for B in C]
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself.
In this way, one could write:
[x.lower() for x in words]
In order to convert all words in a list to lowercase.
It is when we complicate this with another list like so:
[x for y in collection for x in y] # [A for B in C for D in E]
Here, something special happens. We want our final list to include A items, and A items are found inside B items, so we have to tell the list-comprehension that.
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself
D is each item in the collection E (in this case, also A)
E is another collection (in this case, B)
This logic is similar to the normal for loop:
for y in collection:     # for B in C:
    for x in y:          # for D in E: (in this case: for A in B)
        # receive x      # receive A
To expand on this, and give a great example + explanation, imagine that there is a train.
The train engine (the front) is always going to be there (the result of the list-comprehension)
Then, there are any number of train cars, each train car is in the form: for x in y
A list comprehension could look like this:
[z for b in a for c in b for d in c ... for z in y]
Which would be like having this regular for-loop:
for b in a:
    for c in b:
        for d in c:
            ...
                for z in y:
                    # have z
In other words, instead of going down a line and indenting, in a list-comprehension you just add the next loop on to the end.
To go back to the train analogy:
Engine - Car - Car - Car ... Tail
What is the tail? The tail is a special thing in list-comprehensions. You don't need one, but if you have a tail, the tail is a condition. Look at this example:
[line for line in file if not line.startswith('#')]
This would give you every line in a file as long as the line didn't start with a hash sign (#); other lines are just skipped.
The trick to using the "tail" of the train is that it is checked for True/False at the same time as you have your final 'Engine' or 'result' from all the loops. The above example in a regular for-loop would look like this:
for line in file:
    if not line.startswith('#'):
        # have line
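A runnable sketch of the same filter, assuming a plain list of strings in place of an open file (iterating either one yields lines). The sample lines are made up.

```python
# A list of strings stands in for an open file; iterating either yields lines.
lines = ["# header comment\n", "x = 1\n", "# another comment\n", "y = 2\n"]

kept = [line for line in lines if not line.startswith('#')]

kept_loop = []
for line in lines:
    if not line.startswith('#'):
        kept_loop.append(line)

print(kept)
```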
Please note: though in my analogy of a train there is only a 'tail' at the end of the train, the condition or 'tail' can come after every 'car' or loop.
for example:
>>> z = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
>>> [x for y in z if sum(y)>10 for x in y if x < 10]
[5, 6, 7, 8, 9]
In a regular for-loop:
>>> for y in z:
...     if sum(y)>10:
...         for x in y:
...             if x < 10:
...                 print(x)
...
5
6
7
8
9
From the list comprehension documentation:
When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached.
In other words, pretend that the for loops are nested. Reading from left to right your list comprehension can be nested as:
for elem in vec:
    for num in elem:
        num           # the *single expression* from the spec
where the list comprehension will use that last, innermost block as the values of the resulting list.
Your code equals:
temp = []
for elem in vec:
    for num in elem:
        temp.append(num)
You can look at a list comprehension as just a sequence of statements. This applies to any number of levels of for and if clauses.
For example, consider double for loop with their own ifs:
vec = [[1,2,3], [4,5,6], [7,8,9]]
result = [i for e in vec if len(e)==3 for i in e if i%2==0]
Here the list comprehension is the same as:
result = []
for e in vec:
    if len(e)==3:
        for i in e:
            if i%2==0:
                result.append(i)
As you can see, a list comprehension is simply the for and if clauses written without indentation, but in the same sequence.