Itertools Chain on Nested List - python

I have two lists combined sequentially to create a nested list with Python's map and zip functionality; however, I wish to recreate this with itertools.
Furthermore, I am trying to understand why itertools.chain returns a flattened list when I pass it two lists, but simply returns the nested list when I pass it a nested list.
Any help on these two issues would be greatly appreciated.
from itertools import chain
a = [0,1,2,3]
b = [4,5,6,7]
#how can I produce this with itertools?
c = list(map(list, zip(a,b)))
print(c) #[[0, 4], [1, 5], [2, 6], [3, 7]]
d = list(chain(c))
print(d) #[[0, 4], [1, 5], [2, 6], [3, 7]]
d = list(chain(a,b))
print(d) #[0, 1, 2, 3, 4, 5, 6, 7]

I'll try to answer your questions as best I can.
First off, itertools.chain doesn't work the way you think it does. chain takes any number of iterables and iterates over them in sequence. When you call chain, it essentially (internally) packs the arguments into a list:
chain("ABC", "DEF") # Internally creates ["ABC", "DEF"]
Inside the method, it accesses each of these items one at a time, and iterates through them:
for iter_item in arguments:
    for item in iter_item:
        yield item
So when you call chain([[a,b],[c,d,e],[f,g]]), it creates a list with one iterable object: the list you passed as an argument. So now it looks like this:
[ #outer
    [ #inner
        [a,b],
        [c,d,e],
        [f,g]
    ]
]
chain therefore iterates over the inner list and yields three elements: [a,b], [c,d,e], and [f,g], in order. Then they get repacked by list, giving you what you had in the first place.
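The behaviour described above can be reproduced with a small pure-Python generator. This is only a sketch that mirrors the loop shown earlier: simple_chain is a hypothetical name, not part of itertools, and the data is taken from the question.

```python
from itertools import chain

def simple_chain(*iterables):
    # *iterables collects every positional argument into one tuple
    for iterable in iterables:
        for item in iterable:
            yield item

a = [0, 1, 2, 3]
b = [4, 5, 6, 7]
c = [[0, 4], [1, 5], [2, 6], [3, 7]]

two_args = list(simple_chain(a, b))   # two iterables -> flattened
one_arg = list(simple_chain(c))       # one iterable -> its elements, unchanged

print(two_args)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(one_arg)   # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

With a single argument there is only one iterable to walk through, so its elements (the sub-lists) come back untouched.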
Incidentally, there is a way to do what you want to: chain.from_iterable. This is an alternate constructor for chain which accepts a single iterable, such as your list, and pulls the elements out to iterate over. So instead of this:
# chain(l)
[ #outer
    [ #inner
        [a,b],
        [c,d,e],
        [f,g]
    ]
]
You get this:
# chain.from_iterable(l)
[
    [a,b],
    [c,d,e],
    [f,g]
]
This will iterate through the three sub-lists, and return them in one sequence, so list(chain.from_iterable(l)) will return [a,b,c,d,e,f,g].
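Applied to the nested list from the question, the difference looks roughly like this (a minimal sketch; the names not_flat and flat are only illustrative):

```python
from itertools import chain

c = [[0, 4], [1, 5], [2, 6], [3, 7]]

not_flat = list(chain(c))             # c is treated as the only iterable
flat = list(chain.from_iterable(c))   # each sub-list of c is an iterable

print(not_flat)  # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(flat)      # [0, 4, 1, 5, 2, 6, 3, 7]
```

Note that flattening c gives the pairs in order, which is not the same sequence as chain(a, b).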
As for your second question: While I don't know why itertools is a necessity to this process, you can do this in Python 2.x:
list(itertools.izip(x,y))
However, in 3.x, the izip function has been removed, because the built-in zip is itself lazy there. There is still zip_longest, which will match up as many pairs as it can, and accept a filler value for extra pairs: list(zip_longest([a,b,c],[d,e,f,g,h],fillvalue="N")) returns [(a,d),(b,e),(c,f),(N,g),(N,h)] since the second list is longer than the first. Normal zip will take the shortest iterable and cut off the rest.
In other words, unless you want zip_longest instead of zip, itertools does not have a built-in method for zipping.
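To make the zip versus zip_longest difference concrete, here is a short Python 3 sketch. The list b is deliberately longer than in the question (made-up data) so that the two functions disagree.

```python
from itertools import zip_longest

a = [0, 1, 2, 3]
b = [4, 5, 6, 7, 8, 9]   # deliberately longer than a (made-up data)

# zip stops at the shortest input
short = list(map(list, zip(a, b)))
# zip_longest keeps going and pads the missing side with fillvalue
padded = list(map(list, zip_longest(a, b, fillvalue=None)))

print(short)   # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(padded)  # [[0, 4], [1, 5], [2, 6], [3, 7], [None, 8], [None, 9]]
```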

You can also run itertools.chain(*your_list_of_lists). For example:
import itertools
for p in itertools.chain(*[[1,2],[3,4]]):
    print(p)
1
2
3
4
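One difference between the two spellings, as far as I can tell: the * form has to unpack the whole outer iterable before chain is even called, whereas chain.from_iterable reads it lazily. For ordinary lists this does not matter, but it does for generators. The sketch below uses a hypothetical endless generator, endless_pairs, to show it.

```python
from itertools import chain, count, islice

def endless_pairs():
    # hypothetical infinite source of two-element lists: [0, 1], [2, 3], ...
    for n in count(0, 2):
        yield [n, n + 1]

# chain.from_iterable pulls sub-lists one at a time, so this terminates
first_five = list(islice(chain.from_iterable(endless_pairs()), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# chain(*endless_pairs()) would never return: * must exhaust the generator first
```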

Related

How to get a flat list while avoiding to make a nested list in the first place?

My goal
My question is about a list comprehension that does not put elements in the resulting list as they are (which would result in a nested list), but extends the results into a flat list. So my question is not about flattening a nested list, but about how to get a flat list while avoiding making a nested list in the first place.
Example
Consider that I have class instances with an attribute that contains a list of integers:
class Foo:
    def __init__(self, l):
        self.l = l
foo_0 = Foo([1, 2, 3])
foo_1 = Foo([4, 5])
list_of_foos = [foo_0, foo_1]
Now I want to have a list of all integers in all instances of Foo. My best solution using extend is:
result = []
for f in list_of_foos:
    result.extend(f.l)
As desired, result is now [1, 2, 3, 4, 5].
Is there something better? For example list comprehensions?
Since I expect a list comprehension to be faster, I'm looking for a Pythonic way to get the desired result with a list comprehension. My best approach is to get a list of lists (a 'nested list') and flatten this list again, which seems quirky:
result = [item for sublist in [f.l for f in list_of_foos] for item in sublist]
What functionality I'm looking for
result = some_module.list_extends(f.l for f in list_of_foos)
Questions and Answers I read before
I was quite sure there is an answer to this problem, but during my search, I only found list.extend and list comprehension where the reason why a nested list occurs is different; and python list comprehensions; compressing a list of lists? where the answers are about avoiding the nested list, or how to flatten it.
You can use multiple fors in a single comprehension:
result = [
    n
    for foo in list_of_foos
    for n in foo.l
]
Note that the order of fors is from the outside in -- same as if you wrote a nested for-loop:
for foo in list_of_foos:
    for n in foo.l:
        print(n)
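To check that the comprehension and the extend loop really agree, here is a small self-contained sketch using the Foo class from the question; the names looped and flat are only illustrative.

```python
class Foo:
    def __init__(self, l):
        self.l = l

list_of_foos = [Foo([1, 2, 3]), Foo([4, 5])]

# loop version, as in the question
looped = []
for foo in list_of_foos:
    looped.extend(foo.l)

# comprehension version: outer "for" first, inner "for" second
flat = [n for foo in list_of_foos for n in foo.l]

print(flat)  # [1, 2, 3, 4, 5]
```

No intermediate nested list is built here; each n goes straight into the result.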
If you want to combine multiple lists, as if they were all one list, I'd immediately think of itertools.chain. However, you have to access an attribute on each item, so we're also going to need operator.attrgetter. To get those together, I used map and itertools.chain.from_iterable()
https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable
from itertools import chain
from operator import attrgetter
class Foo:
    def __init__(self, l):
        self.l = l
foo_0 = Foo([1, 2, 3])
foo_1 = Foo([4, 5])
list_of_foos = [foo_0, foo_1]
for item in chain.from_iterable(map(attrgetter('l'), list_of_foos)):
    print(item)
That demonstrates iterating through iterators with chain, as if they were one. If you don't specifically need to keep the list around, don't. But in case you do, here is the comprehension:
final = [item for item in chain.from_iterable(map(attrgetter('l'), list_of_foos))]
print(final)
[1, 2, 3, 4, 5]
With lists, you can use the + operator to concatenate two or more lists. It acts like extend, except that it returns a new list.
foo_0.l + foo_1.l
Out[7]: [1, 2, 3, 4, 5]
or you can use sum to perform this operation
sum([foo_0.l, foo_1.l], [])
Out[15]: [1, 2, 3, 4, 5]
In fact, it's in one of the posts you have read ;)

Get one resulting list with the max or min by index in a nested list

Let's say I have this structure:
[[[1,2],[3,4]],[[8,9],[7,7]]]
I want to iterate the list and have this result:
[[3,2],[8,7]]
This would reduce each first-level list of arrays, such as [[1,2],[3,4]], to a single array in which the first element is the maximum of the first elements and the second element is the minimum of the second elements.
I have already done it manually, just iterating the groups, iterating again, storing the first value and seeing if the next is bigger or smaller, I store it in a list and create another list.
I would like to find a more elegant method with list comprehensions and so on, I'm pretty sure I can use zip here to group the values in the same group but I haven't been successful so far.
You can use zip, and by unpacking the result into individual values it is pretty easy to do what you are looking for, e.g.:
>>> x = [[[1,2],[3,4]],[[8,9],[7,7]]]
>>> [[max(a), min(b)] for k in x for a, b in [zip(*k)]]
[[3, 2], [8, 7]]
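The one-liner is compact, so here is the same logic unrolled into a plain loop (a sketch; columns and result are illustrative names). Wrapping zip(*k) in a one-element list is what lets the comprehension's second for clause bind a and b in a single step.

```python
x = [[[1, 2], [3, 4]], [[8, 9], [7, 7]]]

result = []
for k in x:
    columns = list(zip(*k))   # [(1, 3), (2, 4)] for the first group
    a, b = columns            # a = first elements, b = second elements
    result.append([max(a), min(b)])

print(result)  # [[3, 2], [8, 7]]
```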
An alternative way without unpacking is to have a cycling function iterable (max, min, max, min, ...) and use nested list comprehensions, e.g.:
>>> import itertools as it
>>> maxmin = it.cycle([max, min])
>>> [[next(maxmin)(a) for a in zip(*k)] for k in x]
[[3, 2], [8, 7]]
Or index into a list of functions:
>>> maxmin = [max, min]
>>> [[maxmin[i](a) for i, a in enumerate(zip(*k))] for k in x]
[[3, 2], [8, 7]]
This will work without zip:
mylist = [[[1,2],[3,4]],[[8,9],[7,7]]]
[[max(y[0] for y in x), min(y[1] for y in x)] for x in mylist]
The main disadvantage of this is that it looks through each sub-list twice, once to find the maximum (of the first items) and once to find the minimum (of the second items).

Zipping together entries of a list of lists using Python

If I have a list with multiple lists (for simplicity 3, but I actually have a very large amount):
list = [[1,a,2],[3,b,4],[5,c,6]]
How do I obtain a new list of lists that combines the original list items based on their positions using Python?
new_list = [[1,3,5],[a,b,c],[2,4,6]]
I've been trying the zip function on "list" but it's not working, what am I doing wrong?
This does what you want.
mylist = [[1,"a",2],[3,"b",4],[5,"c",6]]
mylist2 = list(map(list, zip(*mylist)))
Please don't use list, or any other built-in, as a variable name.
list(map(list, zip(*mylist)))
*mylist        -- unpacks the list into separate arguments
zip(*mylist)   -- creates an iterable over the unpacked lists,
                  with the i-th element being a tuple
                  containing the i-th element of each sublist of mylist
list           -- is the built-in function list()
map(f, iter)   -- applies a function f to all elements of an iterable iter
list( )        -- creates a list from whatever is inside
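The steps listed above can be run one at a time to see the intermediate values. This is a sketch with the same data; as_tuples, transposed and round_trip are illustrative names.

```python
mylist = [[1, "a", 2], [3, "b", 4], [5, "c", 6]]

# zip(*mylist) is the same call as zip(mylist[0], mylist[1], mylist[2])
as_tuples = list(zip(*mylist))
print(as_tuples)   # [(1, 3, 5), ('a', 'b', 'c'), (2, 4, 6)]

# map(list, ...) turns each tuple into a list
transposed = list(map(list, as_tuples))
print(transposed)  # [[1, 3, 5], ['a', 'b', 'c'], [2, 4, 6]]

# transposing twice gives back the original rows
round_trip = list(map(list, zip(*transposed)))
print(round_trip == mylist)  # True
```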
You can use zip:
a = 1
b = 2
c = 3
l = [[1,a,2],[3,b,4],[5,c,6]]
new_l = list(map(list, zip(*l)))
Output:
[[1, 3, 5], [1, 2, 3], [2, 4, 6]]
Notice that the variables are now displayed in the second element of new_l
You can use zip. Also keep in mind that it's bad practice to use the names of built-in functions as variable names.
l = [[1,a,2],[3,b,4],[5,c,6]]
list(zip(*l))
output (zip yields tuples, so the inner items are tuples rather than lists)
[(1,3,5),(a,b,c),(2,4,6)]

Create List from Elements of Tuples with Exclusion

I want to make a list of items from the elements of tuples in a list such that those elements don't belong to some other list, and I know that each tuple contains one element from the list I don't want it to belong to and one element that's not in that list. For example, with
tuples = [(2,1), (1,4), (1,7), (3,10), (4,3)]
exclude = [1, 3]
I am looking to create the list
[2, 4, 7, 10]
This is easy enough to accomplish in a clumsy for loop, but it seems like there's a more pythonic way using some function or list comprehension. Any ideas?
I didn't fully understand the question, but assuming this is what you want:
>>> list(set([j for i in tuples for j in i if j not in exclude]))
[2, 4, 10, 7]
Assuming your requirement is to convert the list of tuples to a flat list, keep only the unique elements, exclude everything in the list exclude, and then sort the result:
from itertools import chain
tuples_final = sorted(set(chain(*tuples)) - set(exclude))
You forgot a 4 in your example; the code returns:
>>> [num for tup in tuples for num in tup if num not in exclude]
[2, 4, 7, 10, 4]
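If the goal is exactly the list from the question, [2, 4, 7, 10], with first-seen order kept and the repeated 4 dropped, one possible sketch uses dict.fromkeys. This is a different technique from the answers above, and it relies on dictionaries preserving insertion order (guaranteed since Python 3.7). Turning exclude into a set is my own choice here, not something the question requires.

```python
from itertools import chain

tuples = [(2, 1), (1, 4), (1, 7), (3, 10), (4, 3)]
exclude = {1, 3}   # a set makes the membership test cheap

# flatten the tuples lazily and drop the excluded values
kept = (n for n in chain.from_iterable(tuples) if n not in exclude)
# dict keys are unique and keep the order in which they were first seen
unique_in_order = list(dict.fromkeys(kept))

print(unique_in_order)  # [2, 4, 7, 10]
```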

How to sort like values in Python

I was wondering how to sort like values in a list, and then break like values into a sub-list.
For example: I would want a function that probably does something like
def sort_by_like_values(list):
    #python magic
>>>list=[2,2,3,4,4,10]
>>>[[2,2],[3],[4,4],[10]]
OR
>>>[2,2],[3],[4,4],[10]
I read up on the sorted api and it works well for sorting things within their own list, but doesn't break lists up into sub-lists. What module would help me out here?
Use groupby from the itertools module.
from itertools import groupby
L = [2, 2, 3, 4, 4, 10]
L.sort()
for key, iterator in groupby(L):
    print(key, list(iterator))
Result:
2 [2, 2]
3 [3]
4 [4, 4]
10 [10]
A couple of things to be aware of: groupby needs the data it works on to be sorted by the same key you wish to group by, or it won't work. Also, the iterator needs to be consumed before continuing to the next group, so make sure you store list(iterator) to another list or something. One-liner giving you the result you want:
>>> [list(it) for key, it in groupby(sorted(L))]
[[2, 2], [3], [4, 4], [10]]
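To see what goes wrong without sorting, here is a small sketch with made-up, deliberately unsorted data. groupby only merges consecutive equal items, so equal values that are not adjacent end up in separate groups.

```python
from itertools import groupby

L = [4, 2, 4, 2, 3, 10]   # made-up, deliberately unsorted

unsorted_groups = [list(g) for _, g in groupby(L)]
sorted_groups = [list(g) for _, g in groupby(sorted(L))]

print(unsorted_groups)  # [[4], [2], [4], [2], [3], [10]]
print(sorted_groups)    # [[2, 2], [3], [4, 4], [10]]
```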
Check the itertools module, it has the useful groupby function:
import itertools as i
for k,g in i.groupby(sorted([2,2,3,4,4,10])):
    print(list(g))
....
[2, 2]
[3]
[4, 4]
[10]
You should be able to modify this to get the values in a list.
As everyone else has suggested itertools.groupby (which would be my first choice) - it's also possible with collections.Counter to obtain key and frequency, sort by the key, then expand back out freq times.
from itertools import repeat
from collections import Counter
grouped = [list(repeat(key, freq)) for key, freq in sorted(Counter(L).items())]
itertools.groupby() with a list comprehension works fine.
In [20]: a = [1, 1, 2, 3, 3, 4, 5, 5, 5, 6]
In [21]: [ list(subgroup) for key, subgroup in itertools.groupby(sorted(a)) ]
Out[21]: [[1, 1], [2], [3, 3], [4], [5, 5, 5], [6]]
Note that groupby() returns an iterator of (key, group) pairs, where each group is itself an iterator, and you have to consume these groups in order. As per the docs:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:
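A minimal sketch of the pitfall the docs describe, with illustrative names (stale, first_stale, stored): if the group iterators are kept and only read after groupby has moved on, the earlier groups come back empty.

```python
from itertools import groupby

data = [2, 2, 3, 4, 4, 10]

# wrong: keep the group iterators and read them only after groupby has moved on
stale = [group for _, group in groupby(data)]
first_stale = list(stale[0])

# right: turn each group into a list while it is still the current group
stored = [list(group) for _, group in groupby(data)]

print(first_stale)  # []
print(stored)       # [[2, 2], [3], [4, 4], [10]]
```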
If you do not wish to use itertools and can wrap your head around list comprehensions, this should also do the trick:
def group(a):
    a = sorted(a)
    d = [0] + [x+1 for x in range(len(a)-1) if a[x]!=a[x+1]] + [len(a)]
    return [a[(d[x]):(d[x+1])] for x in range(len(d)-1)]
where a is your list
