python list comprehension double for - python

vec = [[1,2,3], [4,5,6], [7,8,9]]
print([num for elem in vec for num in elem])  # <----- this
>>> [1, 2, 3, 4, 5, 6, 7, 8, 9]
This is tripping me out.
I understand that elem is each of the lists inside the list, from for elem in vec.
I don't quite understand the usage of num and for num in elem in the beginning and the end.
How does python interpret this?
What's the order it looks at?

Let's break it down.
A simple list-comprehension:
[x for x in collection]
This is easy to understand if we break it into parts: [A for B in C]
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself.
In this way, one could write:
[x.lower() for x in words]
In order to convert all words in a list to lowercase.
It is when we complicate this with another list like so:
[x for y in collection for x in y] # [A for B in C for D in E]
Here, something special happens. We want our final list to include A items, and A items are found inside B items, so we have to tell the list-comprehension that.
A is the item that will be in the resulting list
B is each item in the collection C
C is the collection itself
D is each item in the collection E (in this case, also A)
E is another collection (in this case, B)
This logic is similar to the normal for loop:
for y in collection:     # for B in C:
    for x in y:          # for D in E: (in this case: for A in B)
        # receive x      # receive A
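A quick way to convince yourself of the mapping above is to run both forms side by side. This is only a minimal sketch; the sample data in collection is made up for illustration.

```python
# Hypothetical sample data: a list (C) of lists (each one is a B).
collection = [["a", "b"], ["c"], ["d", "e", "f"]]

# Comprehension form: [A for B in C for D in E]
flat_comprehension = [x for y in collection for x in y]

# Equivalent nested loops, written in the same left-to-right order.
flat_loop = []
for y in collection:      # for B in C
    for x in y:           # for D in E (here: for A in B)
        flat_loop.append(x)

print(flat_comprehension)               # ['a', 'b', 'c', 'd', 'e', 'f']
print(flat_comprehension == flat_loop)  # True
```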
To expand on this, and give a great example + explanation, imagine that there is a train.
The train engine (the front) is always going to be there (the result of the list-comprehension)
Then, there are any number of train cars, each train car is in the form: for x in y
A list comprehension could look like this:
[z for b in a for c in b for d in c ... for z in y]
Which would be like having this regular for-loop:
for b in a:
    for c in b:
        for d in c:
            ...
                for z in y:
                    # have z
In other words, instead of going down a line and indenting, in a list-comprehension you just add the next loop on to the end.
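As a small sketch of the "add another car" idea, here is a three-level version. The nested list a is invented sample data, not something from the question.

```python
# Hypothetical three-level nesting: a list of lists of lists.
a = [[[1, 2], [3]], [[4], [5, 6]]]

# Each extra loop is appended on the right, in the same order
# in which the regular for-loops would be nested.
flat = [d for b in a for c in b for d in c]

# The same thing, written as indented loops.
expected = []
for b in a:
    for c in b:
        for d in c:
            expected.append(d)

print(flat)              # [1, 2, 3, 4, 5, 6]
print(flat == expected)  # True
```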
To go back to the train analogy:
Engine - Car - Car - Car ... Tail
What is the tail? The tail is a special thing in list-comprehensions. You don't need one, but if you have a tail, the tail is a condition. Look at this example:
[line for line in file if not line.startswith('#')]
This would give you every line in a file as long as the line didn't start with a hash character (#); other lines are just skipped.
The trick to using the "tail" of the train is that it is checked for True/False at the point where you have your final 'Engine' or 'result' from all the loops. The above example in a regular for-loop would look like this:
for line in file:
    if not line.startswith('#'):
        # have line
please note: Though in my analogy of a train there is only a 'tail' at the end of the train, the condition or 'tail' can be after every 'car' or loop...
for example:
>>> z = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
>>> [x for y in z if sum(y)>10 for x in y if x < 10]
[5, 6, 7, 8, 9]
In regular for-loop:
>>> for y in z:
...     if sum(y)>10:
...         for x in y:
...             if x < 10:
...                 print(x)
...
5
6
7
8
9

From the list comprehension documentation:
When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached.
In other words, pretend that the for loops are nested. Reading from left to right your list comprehension can be nested as:
for elem in vec:
    for num in elem:
        num  # the *single expression* from the spec
where the list comprehension will use that last, innermost block as the values of the resulting list.

Your code equals:
temp = []
for elem in vec:
    for num in elem:
        temp.append(num)

You can read a list comprehension as a sequence of statements. This applies to any number of for and if clauses.
For example, consider a double for loop in which each loop has its own if:
vec = [[1,2,3], [4,5,6], [7,8,9]]
result = [i for e in vec if len(e)==3 for i in e if i%2==0]
Here the list comprehension is the same as:
result = []
for e in vec:
    if len(e)==3:
        for i in e:
            if i%2==0:
                result.append(i)
As you can see, a list comprehension is simply the for and if clauses without indentation, in the same sequence.

Related

2 dimensional matrix as a one dimensional matrix in python [duplicate]

I have no problem understanding this:
a = [1,2,3,4]
b = [x for x in a]
I thought that was all, but then I found this snippet:
a = [[1,2],[3,4],[5,6]]
b = [x for xs in a for x in xs]
Which makes b = [1,2,3,4,5,6]. The problem is I'm having trouble understanding the syntax in [x for xs in a for x in xs], Could anyone explain how it works?
Ah, the incomprehensible "nested" comprehensions. Loops unroll in the same order as in the comprehension.
[leaf for branch in tree for leaf in branch]
It helps to think of it like this.
for branch in tree:
    for leaf in branch:
        yield leaf
PEP 202 asserts that this syntax, with "the last index varying fastest", is "the Right One", notably without explaining why.
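To see what "the last index varying fastest" means in practice, here is a tiny sketch (the ranges are arbitrary):

```python
# With two for clauses, the rightmost one is the inner loop,
# so its variable (j) changes fastest in the output.
pairs = [(i, j) for i in range(2) for j in range(3)]

print(pairs)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```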
if a = [[1,2],[3,4],[5,6]], then if we unroll that list comp, we get:
      +----------------a------------------+
      | +--xs---+ , +--xs---+ , +--xs---+ |   for xs in a
      | | x , x |   | x , x |   | x , x | |   for x in xs
a  =  [ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ]
b  =  [ x for xs in a for x in xs ] == [1,2,3,4,5,6]   # a list of just the "x"s
b = [x for xs in a for x in xs] is similar to following nested loop.
b = []
for xs in a:
    for x in xs:
        b.append(x)
Effectively:
...for xs in a...]
is iterating over your main (outer) list and returning each of your sublists in turn.
...for x in xs]
is then iterating over each of these sub lists.
This can be re-written as:
b = []
for xs in a:
    for x in xs:
        b.append(x)
It can be written like this
result = []
for xs in a:
    for x in xs:
        result.append(x)
You can read more about it here
This is an example of a nested comprehension. Think of a = [[1,2],[3,4],[5,6]] as a 3 by 2 matrix (matrix= [[1,2],[3,4],[5,6]]).
       _________
row 1  | 1 | 2 |
       _________
row 2  | 3 | 4 |
       _________
row 3  | 5 | 6 |
       _________
The list comprehension you see is another way to get all the elements from this matrix into a list.
I will try to explain this using different variables which will hopefully make more sense.
b = [element for row in matrix for element in row]
The first for loop iterates over the rows of the matrix, i.e. [1,2], [3,4], [5,6]. The second for loop iterates over each element in the current row (a list of 2 elements).
I have written a small article on List Comprehension on my website http://programmathics.com/programming/python/python-list-comprehension-tutorial/ which actually covered a very similar scenario to this question. I also give some other examples and explanations of python list comprehension.
Disclaimer: I am the creator of that website.
Here is how I best remember it:
(pseudocode, but has this type of pattern)
[(x,y,z) (loop 1) (loop 2) (loop 3)]
where the right most loop (loop 3) is the inner most loop.
[(x,y,z) for x in range(3) for y in range(3) for z in range(3)]
has the structure as:
for x in range(3):
    for y in range(3):
        for z in range(3):
            print((x,y,z))
Edit I wanted to add another pattern:
[(result) (loop 1) (loop 2) (loop 3) (condition)]
Ex:
[(x,y,z) for x in range(3) for y in range(3) for z in range(3) if x == y == z]
Has this type of structure:
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                print((x,y,z))
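Here is a runnable sketch of that pattern. It also shows, as an aside, that a condition does not have to wait until the end: it can follow any loop whose variables it needs, which skips the inner iterations early. The names diagonal and diagonal_early are made up for this example.

```python
# Condition at the very end: all 27 combinations are generated, then filtered.
diagonal = [(x, y, z)
            for x in range(3)
            for y in range(3)
            for z in range(3)
            if x == y == z]

# Same result, but x == y is checked before the z loop even starts.
diagonal_early = [(x, y, z)
                  for x in range(3)
                  for y in range(3) if x == y
                  for z in range(3) if y == z]

print(diagonal)                    # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(diagonal == diagonal_early)  # True
```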
Yes, you can nest for loops INSIDE of a list comprehension. You can even nest if statements in there.
dice_rolls = []
for roll1 in range(1,7):
    for roll2 in range(1,7):
        for roll3 in range(1,7):
            dice_rolls.append((roll1, roll2, roll3))
# becomes
dice_rolls = [(roll1, roll2, roll3) for roll1 in range(1, 7) for roll2 in range(1, 7)
              for roll3 in range(1, 7)]
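As a side note, when the loops are independent of each other like this, the standard library's itertools.product gives the same tuples in the same order. This is an alternative technique, not what the snippet above does; a minimal sketch:

```python
from itertools import product

# The nested comprehension from above.
dice_rolls = [(roll1, roll2, roll3)
              for roll1 in range(1, 7)
              for roll2 in range(1, 7)
              for roll3 in range(1, 7)]

# product(..., repeat=3) iterates like three nested loops,
# with the rightmost position changing fastest.
via_product = list(product(range(1, 7), repeat=3))

print(len(dice_rolls))            # 216
print(dice_rolls == via_product)  # True
```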
I wrote a short article on medium explaining list comprehensions and some other cool things you can do with python, you should have a look if you're interested : )
You are asking for nested lists.
Let me try to answer this question in a step-by-step basis, covering these topics:
For loops
list comprehensions
both nested for loops and list comprehensions
For Loop
You have this list: lst = [0,1,2,3,4,5,6,7,8] and you want to iterate the list one item at a time and add them to a new list. You do a simple for loop:
lst = [0,1,2,3,4,5,6,7,8]
new_list = []
for lst_item in lst:
    new_list.append(lst_item)
You can do exactly the same thing with a list comprehension (it's more pythonic).
List Comprehension
List comprehensions are a (*sometimes) simpler and elegant way to create lists.
new_list = [lst_item for lst_item in lst]
You read it this way: for every lst_item in lst, add lst_item to new_list
Nested Lists
What are nested lists?
A simple definition: it's a list which contains sublists. You have lists within another list.
*Depending on who you talk with, nested lists are one of those cases where list comprehensions can be more difficult to read than regular for loops.
Let's say you have this nested list: nested_list = [[0,1,2], [3,4,5], [6,7,8]], and you want to transform it to a flattened list like this one: flattened_list = [0,1,2,3,4,5,6,7,8].
If you use the same for loops as before you wouldn't get it.
flattened_list = []
for list_item in nested_list:
    flattened_list.append(list_item)
Why? Because each list_item is actually one of the sublists. In the first iteration you get [0,1,2], then [3,4,5] and finally [6,7,8].
You can check it like this:
nested_list[0] == [0, 1, 2]
nested_list[1] == [3, 4, 5]
nested_list[2] == [6, 7, 8]
You need a way to go into the sublists and add each sublist item to the flattened list.
How? You add an extra layer of iteration. Actually, you add one for each layer of sublists.
In the example above you have two layers.
The for loop solution.
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
Let's read this code out loud.
for sublist in nested_list: each sublist is [0,1,2], [3,4,5], [6,7,8]. In the first iteration of the first loop we go inside [0,1,2].
for item in sublist: the first item of [0,1,2] is 0, which is appended to flattened_list. Then comes 1 and finally 2.
Up until this point flattened_list is [0,1,2].
We finish the last iteration of the second loop, so we go to the next iteration of the first loop. We go inside [3,4,5].
Then we go to each item of this sublist and append it to flattened_list. And then we go the next iteration and so on.
How can you do it with List Comprehensions?
The List Comprehension solution.
flattened_list = [item for sublist in nested_list for item in sublist]
You read it like this: add each item from each sublist from nested_list.
It's more concise, but if you have many layers it could become more difficult to read.
Let's see both together
#for loop
nested_list = [[0,1,2], [3,4,5], [6,7,8]]
flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)
----------------------------------------------------------------------
#list comprehension
flattened_list = [item for sublist in nested_list for item in sublist]
The more layers of iteration you have, the more for x in y clauses you will be adding.
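For the common two-layer case, the standard library also offers itertools.chain.from_iterable, which some people find easier to read than the double for. This is a different technique from the comprehension above, shown here only as a sketch for comparison:

```python
from itertools import chain

nested_list = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# The comprehension from above.
flattened_list = [item for sublist in nested_list for item in sublist]

# chain.from_iterable walks each sublist in turn (one layer only).
flattened_with_chain = list(chain.from_iterable(nested_list))

print(flattened_with_chain)                    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(flattened_list == flattened_with_chain)  # True
```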
EDIT April 2021.
You can flatten a nested list with Numpy. Technically speaking, in Numpy the term would be 'array'.
For a small list it's an overkill, but if you're crunching millions of numbers in a list you may need Numpy.
From NumPy's documentation: arrays have an attribute flat, which iterates over every element of the array.
import numpy as np

b = np.array(
    [
        [ 0,  1,  2,  3],
        [10, 11, 12, 13],
        [20, 21, 22, 23],
        [30, 31, 32, 33],
        [40, 41, 42, 43]
    ]
)
for element in b.flat:
    print(element)
0
1
2
...
41
42
43
The whole confusion in this syntax arises from the first variable and from bad naming conventions.
[door for room in house for door in room]
Here the leading 'door' is what causes the confusion.
Imagine that there was no 'door' variable at the start:
[for room in house for door in room]
Read this way, the loops are easier to follow.
It becomes even more confusing with variable names like x, xs and y, so variable naming is also key.
You can also do something with the loop variable like:
doors = [door for room in house for door in str(room)]
which is equivalent to:
doors = []
for room in house:
    for door in str(room):
        doors.append(door)
english grammar:
b = "a list of 'specific items' taken from 'what loop?' "
b = [x for xs in a for x in xs]
x is the specific item
for xs in a for x in xs is the loop

Filtering a list of two-element sublists with a list comprehension

I have a list of lists
input = [[2,13],[5,3],[10,8],[13,4],[15,0],[17,10],[20,5],[25,9],[28,7],[31,0]]
I want to write a list comprehension where for the [a,b] pairs above I get the pairs where b > a. In the above example that would be [2,13].
My attempt
x = [[item[i],[j]] for item in inputArray if j>i]
produces a NameError
NameError: name 'j' is not defined
The problem with your attempt is that you never tell Python what i and j are supposed to be. The check j > i cannot be computed and the list [item[i],[j]] can't be built without that information.
You can issue
>>> inp = [[2,13],[5,3],[10,8],[13,4],[15,0],[17,10],[20,5],[25,9],[28,7],[31,0]]
>>> [[a, b] for a, b in inp if b > a]
[[2, 13]]
This solution does not produce a NameError because for a, b in inp tells Python to iterate over the elements of inp (two-element sublists) and in each iteration assign the name a to the first element of a sublist and the name b to the second element.
I used the name inp instead of input because the latter is already taken by a builtin function for getting user input.
Explanation of the list comprehension
The comprehension is equivalent to
>>> result = []
>>> for a, b in inp:
...     if b > a:
...         result.append([a, b])
...
>>> result
[[2, 13]]
Every two-element list in inp is unpacked into the variables a and b. If the filter condition b > a is True, then a list [a, b] is built and included in the final result.
If you don't want to use unpacking, you can also index into the sublists of inp like this:
>>> [sub[:] for sub in inp if sub[1] > sub[0]]
[[2, 13]]
Taking a full slice of sub via sub[:] ensures that, like in the other solutions presented so far, the filtered result stores (shallow) copies of the sublists of inp. If copying is not necessary, you can omit the [:].
This code does not produce a NameError because for sub in inp tells Python to iterate over inp and in each iteration assign the name sub to the next sublist. In addition, explicit numbers (0 and 1) are used for the indices.
Personally, I prefer the solution with unpacking. It is easier to read and will run even if the elements of inp don't support indexing, but are iterables from which two elements can be extracted.
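To illustrate that last point, here is a small sketch where the pairs are plain iterators, which can be unpacked but not indexed. The pairs list is invented for this example.

```python
# Hypothetical input: each pair is an iterator, which does not support indexing.
pairs = [iter((2, 13)), iter((5, 3)), iter((10, 8))]

# Unpacking pulls exactly two values out of each iterator.
result = [[a, b] for a, b in pairs if b > a]
print(result)  # [[2, 13]]

# Indexing the same kind of object raises a TypeError.
indexing_failed = False
try:
    iter((2, 13))[1]
except TypeError:
    indexing_failed = True
print(indexing_failed)  # True
```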
You should unpack each pair into the i, j variables, and then you can compare them:
x = [[i, j] for i,j in inputList if j > i]
(note that I have renamed inputArray to inputList)
Or without unpacking:
x = [item for item in inputList if item[1] > item[0]]

Check number not a sum of 2 ints on a list

Given a list of integers, I want to check a second list and remove from the first only those which can not be made from the sum of two numbers from the second. So given a = [3,19,20] and b = [1,2,17], I'd want [3,19].
Seems like a cinch with two nested loops - except that I've gotten stuck with break and continue commands.
Here's what I have:
def myFunction(list_a, list_b):
    for i in list_a:
        for a in list_b:
            for b in list_b:
                if a + b == i:
                    break
            else:
                continue
            break
        else:
            continue
        list_a.remove(i)
    return list_a
I know what I need to do, just the syntax seems unnecessarily confusing. Can someone show me an easier way? TIA!
You can do like this,
In [13]: from itertools import combinations
In [15]: [item for item in a if item in [sum(i) for i in combinations(b,2)]]
Out[15]: [3, 19]
combinations(b, 2) yields every pair of elements from b; the inner comprehension builds the list of their sums, and the outer one keeps each item of a that appears in that list.
Edit
If you don't want to use itertools, you can write a function for it. Like this,
def comb(s):
    for i, v1 in enumerate(s):
        for j in range(i+1, len(s)):
            yield [v1, s[j]]
result = [item for item in a if item in [sum(i) for i in comb(b)]]
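Note that both versions above rebuild the list of sums for every item of a. A small variation, sketched below, computes the sums once and stores them in a set so that each membership test is fast. The name pair_sums is made up for this example.

```python
from itertools import combinations

a = [3, 19, 20]
b = [1, 2, 17]

# Build the set of pair sums once, instead of once per element of a.
pair_sums = {x + y for x, y in combinations(b, 2)}

# Keep only the items of a that can be made from two numbers of b.
result = [item for item in a if item in pair_sums]

print(sorted(pair_sums))  # [3, 18, 19]
print(result)             # [3, 19]
```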
Comments on code:
It's very dangerous to delete elements from a list while iterating over it. Perhaps you could append items you want to keep to a new list, and return that.
Your current algorithm is O(nm^2), where n is the size of list_a, and m is the size of list_b. This is pretty inefficient, but a good start to the problem.
There's also a lot of unnecessary continue and break statements, which can lead to complicated code that is hard to debug.
You also put everything into one function. You could split each task into its own function, such as dedicating one function to finding pairs, and one to checking each item in list_a against list_b. This is a way of splitting a problem into smaller problems, and using them to solve the bigger problem.
Overall I think your function is doing too much, and the logic could be condensed into much simpler code by breaking down the problem.
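To make the first comment concrete, here is a minimal sketch of what can go wrong when you remove items from a list while looping over it, and the safer "build a new list" alternative. The numbers are made up for illustration.

```python
# Removing while iterating: the list shrinks under the loop,
# so the element right after each removed one is skipped.
items = [2, 4, 6, 7]
for n in items:
    if n % 2 == 0:
        items.remove(n)
print(items)  # [4, 7]  (4 was never examined)

# Safer: leave the original alone and collect what you want to keep.
original = [2, 4, 6, 7]
kept = [n for n in original if n % 2 != 0]
print(kept)   # [7]
```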
Another approach:
Since I found this task interesting, I decided to try it myself. My outlined approach is illustrated below.
1. You can first check if a list has a pair of a given sum in O(n) time using hashing:
def check_pairs(lst, sums):
    lookup = set()
    for x in lst:
        current = sums - x
        if current in lookup:
            return True
        lookup.add(x)
    return False
2. Then you could use this function to check if any pair in list_b sums to each number iterated in list_a:
def remove_first_sum(list_a, list_b):
    new_list_a = []
    for x in list_a:
        check = check_pairs(list_b, x)
        if check:
            new_list_a.append(x)
    return new_list_a
Which keeps numbers in list_a that contribute to a sum of two numbers in list_b.
3. The above can also be written with a list comprehension:
def remove_first_sum(list_a, list_b):
    return [x for x in list_a if check_pairs(list_b, x)]
Both of which work as follows:
>>> remove_first_sum([3,19,20], [1,2,17])
[3, 19]
>>> remove_first_sum([3,19,20,18], [1,2,17])
[3, 19, 18]
>>> remove_first_sum([1,2,5,6],[2,3,4])
[5, 6]
Note: check_pairs runs in O(m) time for a list_b of size m, so the overall algorithm is O(nm), where n is the size of list_a. That improves on the O(nm^2) of the nested loops without requiring anything too complicated. However, it also needs O(m) extra auxiliary space, because a set is kept to record which items have been seen.
You can do it by first creating all possible sum combinations, then filtering out elements which don't belong to that combination list
Define the input lists
>>> a = [3,19,20]
>>> b = [1,2,17]
Next we will define all possible combinations of sum of two elements
>>> y = [i+j for k,j in enumerate(b) for i in b[k+1:]]
Next we apply a function to every element of list a and check whether it is present in the list calculated above. map can be used with an if/else expression: the lambda returns None whenever the else branch is taken. To cater for this, we filter the list to remove the None values
>>> list(filter(None, map(lambda x: x if x in y else None,a)))
The above operation will output:
[3, 19]
You can also write a one-liner by combining all these lines into one, but I don't recommend this.
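One caveat worth knowing about the filter(None, ...) step: it drops every falsy value, not just None, so a legitimate 0 would disappear as well. A plain list comprehension with an if avoids that. The sketch below uses made-up data that includes 0 to show the difference.

```python
a = [0, 3, 19, 20]
y = [0, 3, 18, 19]   # hypothetical list of reachable sums, including 0

# map + filter(None, ...): 0 is falsy, so it is filtered out along with None.
via_filter = list(filter(None, map(lambda x: x if x in y else None, a)))

# Comprehension with a condition: keeps exactly the matching items.
via_comprehension = [x for x in a if x in y]

print(via_filter)         # [3, 19]
print(via_comprehension)  # [0, 3, 19]
```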
You can try something like this:
a = [3,19,20]
b= [1,2,17,5]
n_m_s=[]
data=[n_m_s.append(i+j) for i in b for j in b if i+j in a]
print(set(n_m_s))
print("after remove")
final_data=[]
for j,i in enumerate(a):
    if i not in n_m_s:
        final_data.append(i)
print(final_data)
output:
{19, 3}
after remove
[20]

How would I change this one line for loop to normal for loop?

This is a general question that I was not to able to understand.
If I have this:
somelist = [[a for a, b in zip(X, y) if b == c] for c in np.unique(y)]
How can I write this as normal multiline for loop? I never seem to get it right.
EDIT: So far I've tried this:
somelist = []
for c in np.unique(y):
    for x, t in zip(X, y):
        if t == c:
            separated.append(x)
But I wasn't sure if this was right because I wasn't getting an expected result in some other part of my code.
Let me know if this works:
Evaluate the outer list comprehension first, as the outer loop; then evaluate the inner list comprehension.
somelist=[]
for c in np.unique(y):
    ans=[]
    for a,b in zip(X,y):
        if b==c:
            ans.append(a)
    somelist.append(ans)
To flatten a nested comprehension out, follow these steps:
First create an empty container: somelist = []
If the comprehension has an if clause, put it right after the for
Then, flatten the nested comprehensions out, starting with the innermost
The inner comprehension is:
row = []
for a, b in zip(X, y):
    if b == c:
        row.append(a)
Then, somelist is nothing more than [row for c in np.unique(y)], where row depends on several factors.
This one is equivalent to:
somelist = []
for c in np.unique(y):
    somelist.append(row)
So the complete version is:
somelist = []
for c in np.unique(y):
    row = []
    for a, b in zip(X, y):
        if b == c:
            row.append(a)
    somelist.append(row)
This is how it looks using a "normal" for loop (a.k.a. without using a list comprehension):
somelist = []
for c in np.unique(y):
    l = []
    for a, b in zip(X, y):
        if b == c:
            l.append(a)
    somelist.append(l)
You were very close. The problem with your approach is that you forgot an important point: the result of the list comprehension will be a list of lists. Thus, the values computed in the inner loop need to be held in a temporary list that will be appended to the "main" list somelist to create a list of lists:
somelist = []
for c in np.unique(y):
    # create a temporary list that will hold the values computed in the
    # inner loop.
    sublist = []
    for x, t in zip(X, y):
        if t == c:
            sublist.append(x)
    # after the list has been computed, add the temporary list to the main
    # list `somelist`. That way, a list of lists is created.
    somelist.append(sublist)
The general rule of thumb when converting a list comprehension to a vanilla for loop is that for each level of nesting, you'll need another nested for loop and another temporary list to hold the values computed in the nested loop.
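If you want to check a conversion like this yourself, run both forms on a small input and compare. The sketch below uses made-up data and plain Python lists; sorted(set(y)) is used here as a stand-in for np.unique(y) so that the example has no NumPy dependency.

```python
# Hypothetical sample data: X holds items, y holds the label of each item.
X = ["a", "b", "c", "d", "e"]
y = [1, 0, 1, 2, 0]

labels = sorted(set(y))  # stand-in for np.unique(y): sorted unique labels

# Nested comprehension: one inner list per label.
from_comprehension = [[a for a, b in zip(X, y) if b == c] for c in labels]

# Expanded version: one loop and one temporary list per level of nesting.
somelist = []
for c in labels:
    sublist = []
    for a, b in zip(X, y):
        if b == c:
            sublist.append(a)
    somelist.append(sublist)

print(somelist)                        # [['b', 'e'], ['a', 'c'], ['d']]
print(somelist == from_comprehension)  # True
```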
As a caveat, once you start getting past 2-3 levels of nesting in your comprehension, you should seriously consider converting it to a normal for loop. Whatever efficiency you're gaining is offset by the unreadability of the nested list comprehension. Remember, "97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%".
After offering the obvious caveat that, for performance and Pythonic reasons, you should not expand your list comprehension into a multi-line loop, you would write it from the outside in:
somelist = []
for c in np.unique(y):
    inner_list = []
    for a, b in zip(X, y):
        if b == c:
            inner_list.append(a)
    somelist.append(inner_list)
And now you see the beauty of list comprehensions.
somelist = []
for c in np.unique(y):
    somelist.append([a for a, b in zip(X, y) if b == c])

Grasping the basics of in-place operations

So...
a = [2,3,4,5]
for x in a:
    x += 1
a = [2,3,4,5]
Nada.
but if I ...
a[2] += 1
a = [2,3,5,5]
Clearly my mind fails to comprehend the basics. print(x) returns only the integer within the cell so it should simply add the one automatically for each list cell. What's the solution and what am I not grasping?
In this case you are defining a new variable x, that references each element of a in turn. You cannot modify the int that x refers to, because ints are immutable in Python. When you use the += operator, a new int is created and x refers to this new int, rather than the one in a. If you created a class that wrapped up an int, then you could use your loop as-is because instances of this class would be mutable. (This isn't necessary as Python provides better ways of doing what you want to do)
for x in a:
    x += 1
What you want to do is generate a new list based on a, and possibly store it back to a.
a = [x + 1 for x in a]
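For the curious, this is roughly what the "class that wraps an int" idea mentioned above could look like. MutableInt is a hypothetical class written only for this illustration; as noted, you would not normally need it.

```python
class MutableInt:
    """Hypothetical wrapper around an int, for illustration only."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        # Change the object in place and return the same object,
        # so the list keeps referring to the updated instance.
        self.value += other
        return self


a = [MutableInt(n) for n in [2, 3, 4, 5]]
for x in a:
    x += 1   # calls MutableInt.__iadd__, which mutates the wrapped value

print([x.value for x in a])  # [3, 4, 5, 6]
```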
To understand what's happening here, consider these two pieces of code. First:
for i in range(len(a)):
    x = a[i]
    x += 1
Second:
for x in a:
    x += 1
These two for loops do exactly the same thing to x. You can see from the first that changing the value of x doesn't change a at all; the same holds in the second.
As others have noted, a list comprehension is a good way to create a new list with new values:
new_a = [x + 1 for x in a]
If you don't want to create a new list, you can use the following patterns to alter the original list:
for i in range(len(a)):  # this gets the basic idea across
    a[i] += 1
for i, _ in enumerate(a):  # this one uses enumerate() instead of range()
    a[i] += 1
for i, x in enumerate(a):  # this one is nice for more complex operations
    a[i] = x + 1
If you want to +1 on elements of a list of ints:
In [775]: a = [2,3,4,5]
In [776]: b=[i+1 for i in a]
...: print(b)
[3, 4, 5, 6]
Why for x in a: x += 1 fails ?
Because x is an immutable object that can't be modified in-place. If x is a mutable object (for example, if a = [[], []]), += might work:
In [785]: for x in a:
     ...:     x+=[1,2,3]  # here x==[] and "+=" does the same thing as list.extend
In [786]: a
Out[786]: [[1, 2, 3], [1, 2, 3]]
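A self-contained sketch of the same contrast, with made-up variable names. It also shows that for lists, x = x + [...] is not the same as x += [...]: the former builds a new list and only rebinds x.

```python
# Mutable elements: += extends each inner list in place.
lists = [[], []]
for x in lists:
    x += [1, 2, 3]
print(lists)      # [[1, 2, 3], [1, 2, 3]]

# Same elements, but x = x + [...] creates a new list and rebinds x,
# so the lists stored in `untouched` never change.
untouched = [[], []]
for x in untouched:
    x = x + [1, 2, 3]
print(untouched)  # [[], []]

# Immutable elements: += can only rebind x, so the list is unchanged.
numbers = [2, 3, 4, 5]
for x in numbers:
    x += 1
print(numbers)    # [2, 3, 4, 5]
```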
When you say
for x in a:
    x += 1
Python simply binds the name x with the items from a on each iteration. So, in the first iteration x will be referring to the item which is in the 0th index of a. But when you say
x += 1
it is equivalent to
x = x + 1
So, you are adding 1 to the value of x and making x refer to the newly created number (result of x + 1). That is why the change is not visible in the actual list.
To fix this, you can add 1 to each and every element like this
for idx in range(len(a)):
    a[idx] += 1
Now the same thing happens, but we are replacing the old element at index idx with the new element.
Output
[3, 4, 5, 6]
Note: We should prefer the list comprehension way whenever possible, since it leaves the original list unaltered and constructs a new list based on the old one. So, the same thing can be done like this
a = [item + 1 for item in a]
# [3, 4, 5, 6]
The major difference is that, earlier we were making changes to the same list now we have created a new list and make a refer to the newly created list.
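That difference only becomes visible when something else still refers to the original list. A minimal sketch, with made-up names in_place, rebuilt and the two aliases:

```python
# In-place update: every name that refers to the list sees the change.
in_place = [2, 3, 4, 5]
alias_1 = in_place
for idx in range(len(in_place)):
    in_place[idx] += 1
print(alias_1)   # [3, 4, 5, 6]

# List comprehension: a new list is created and the name is rebound to it;
# the old list, still reachable through the alias, is left as it was.
rebuilt = [2, 3, 4, 5]
alias_2 = rebuilt
rebuilt = [item + 1 for item in rebuilt]
print(rebuilt)   # [3, 4, 5, 6]
print(alias_2)   # [2, 3, 4, 5]
```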
In your for loop, you declare a new variable x,
for x in a:
It's this variable that you then add one to
x += 1
And then you do nothing with x.
You should save x somewhere if you want to use it later on :)
The variable x inside the for loop is bound to each element of the list a in turn. Rebinding x (which is what x += 1 does for an int) does not affect a.
A more "correct" way to increment each element of a list by one is using a list comprehension:
a = [elem + 1 for elem in a]
You could also use the map function (wrapped in list(), since map returns an iterator in Python 3):
a = list(map(lambda x: x + 1, a))
When you write a[2], you are referring to the third element of the list a,
because indexing starts at 0: the first element, which in your case is 2, is stored at a[0]; similarly, 3 is at a[1], 4 at a[2] and 5 at a[3].
