Python convert CSV into Lists [duplicate] - python

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).

Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.

You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1

Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.

I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()

Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.

To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library

The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]

Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.

matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636

According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(

One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.

You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:

There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x

Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."

If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.

def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])

I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output

This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.

Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]

If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))

If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.

Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]

A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]

I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)

If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]

Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))

Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Related

Nested loops in one line in python [duplicate]

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1
Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.
I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()
Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.
To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(
One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.
You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:
There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."
If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.
def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])
I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output
This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.
Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]
If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))
If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]
A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)
If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]
Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))
Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Converting a nested list of scalars into a singular list [duplicate]

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1
Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.
I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()
Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.
To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(
One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.
You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:
There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."
If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.
def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])
I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output
This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.
Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]
If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))
If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]
A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)
If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]
Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))
Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Getting flattened list of all values in dictionary [duplicate]

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1
Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.
I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()
Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.
To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(
One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.
You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:
There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."
If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.
def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])
I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output
This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.
Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]
If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))
If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]
A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)
If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]
Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))
Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

How to flatten a list in Python [duplicate]

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1
Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.
I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()
Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.
To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(
One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.
You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:
There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."
If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.
def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])
I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output
This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.
Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]
If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))
If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]
A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)
If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]
Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))
Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

How to merge a list of list in python? [duplicate]

I have a list of lists like [[1, 2, 3], [4, 5, 6], [7], [8, 9]]. How can I flatten it to get [1, 2, 3, 4, 5, 6, 7, 8, 9]?
If your list of lists comes from a nested list comprehension, the problem can be solved more simply/directly by fixing the comprehension; please see python list comprehensions; compressing a list of lists?.
The most popular solutions here generally only flatten one "level" of the nested list. See Flatten an irregular (arbitrarily nested) list of lists for solutions that completely flatten a deeply nested structure (recursively, in general).
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
return [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
You can use itertools.chain():
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))
Or you can use itertools.chain.from_iterable() which doesn't require unpacking the list with the * operator:
>>> import itertools
>>> list2d = [[1,2,3], [4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))
This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
20000 loops, best of 5: 10.8 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 5: 258 usec per loop
$ python3 -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;from functools import reduce' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 5: 292 usec per loop
$ python3 --version
Python 3.7.5rc1
Note from the author: This is very inefficient. But fun, because monoids are awesome.
>>> xss = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> sum(xss, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
sum sums the elements of the iterable xss, and uses the second argument as the initial value [] for the sum. (The default initial value is 0, which is not a list.)
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that only works on lists of lists. For lists of lists of lists, you'll need another solution.
I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found
import functools
import operator
functools.reduce(operator.iconcat, a, [])
to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)
A simpler and also acceptable variant is
out = []
for sublist in a:
out.extend(sublist)
If the number of sublists is large, this performs a little worse than the above suggestion.
Code to reproduce the plot:
import functools
import itertools
import operator
import numpy as np
import perfplot
def forfor(a):
return [item for sublist in a for item in sublist]
def sum_brackets(a):
return sum(a, [])
def functools_reduce(a):
return functools.reduce(operator.concat, a)
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(np.array(a).flat)
def numpy_concatenate(a):
return list(np.concatenate(a))
def extend(a):
out = []
for sublist in a:
out.extend(sublist)
return out
b = perfplot.bench(
setup=lambda n: [list(range(10))] * n,
# setup=lambda n: [list(range(n))] * 10,
kernels=[
forfor,
sum_brackets,
functools_reduce,
functools_reduce_iconcat,
itertools_chain,
numpy_flat,
numpy_concatenate,
extend,
],
n_range=[2 ** k for k in range(16)],
xlabel="num lists (of length 10)",
# xlabel="len lists (10 lists total)"
)
b.save("out.png")
b.show()
Using functools.reduce, which adds an accumulated list xs to the next list ys:
from functools import reduce
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(lambda xs, ys: xs + ys, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
A faster way using operator.concat:
from functools import reduce
import operator
xss = [[1,2,3], [4,5,6], [7], [8,9]]
out = reduce(operator.concat, xss)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Here is a general approach that applies to numbers, strings, nested lists and mixed containers. This can flatten both simple and complicated containers (see also Demo).
Code
from typing import Iterable
#from collections import Iterable # < py38
def flatten(items):
"""Yield items from any nested iterable; see Reference."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
for sub_x in flatten(x):
yield sub_x
else:
yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
In Python 3.8, abstract base classes are moved from collection.abc to the typing module.
Demo
simple = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(simple))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
complicated = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"] # numbers, strs, nested & mixed
list(flatten(complicated))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
Reference
This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
Found an earlier SO post, possibly the original demonstration.
To flatten a data-structure that is deeply nested, use iteration_utilities.deepflatten1:
>>> from iteration_utilities import deepflatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
It's a generator so you need to cast the result to a list or explicitly iterate over it.
To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:
>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Just to add some timings (based on Nico Schlömer's answer that didn't include the function presented in this answer):
It's a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.
The results show that if the iterable contains only a few inner iterables then sum will be fastest, however for long iterables only the itertools.chain.from_iterable, iteration_utilities.deepflatten or the nested comprehension have reasonable performance with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).
from itertools import chain
from functools import reduce
from collections import Iterable # or from collections.abc import Iterable
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
return sum(lsts, [])
def reduce_add(lsts):
return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
return list(flatten(lsts))
def flatten(items):
"""Yield items from any nested iterable; see REF."""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
yield from flatten(x)
else:
yield x
def reduce_concat(lsts):
return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
[nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I'm the author of that library
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
Consider installing the more_itertools package.
> pip install more_itertools
It ships with an implementation for flatten (source, from the itertools recipes):
import more_itertools
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Note: as mentioned in the docs, flatten requires a list of lists. See below on flattening more irregular inputs.
As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9] # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The reason your function didn't work is because the extend extends an array in-place and doesn't return it. You can still return x from lambda, using something like this:
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.
import matplotlib
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))
Result:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
This is 18x faster than underscore._.flatten:
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
According your list [[1, 2, 3], [4, 5, 6], [7], [8, 9]] which is 1 list level, we can simply use sum(list,[]) without using any libraries
sum([[1, 2, 3], [4, 5, 6], [7], [8, 9]],[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To extend the advantage of this method when there is a tuple or number existing inside. Simply adding a mapping function for each element by map to the list
#For only tuple
sum(list(map(list,[[1, 2, 3], (4, 5, 6), (7,), [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
#In general
def convert(x):
if type(x) is int or type(x) is float:
return [x]
else:
return list(x)
sum(list(map(convert,[[1, 2, 3], (4, 5, 6), 7, [8, 9]])),[])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
In here, there is a clear explanation of the drawback in terms of memory for this approach. In short, it recursively creates list objects, which should be avoided :(
One can also use NumPy's flat:
import numpy as np
list(np.array(l).flat)
It only works when sublists have identical dimensions.
You can use the list extend method. It shows to be the fastest:
flat_list = []
for sublist in l:
flat_list.extend(sublist)
Performance:
import functools
import itertools
import numpy
import operator
import perfplot
def functools_reduce_iconcat(a):
return functools.reduce(operator.iconcat, a, [])
def itertools_chain(a):
return list(itertools.chain.from_iterable(a))
def numpy_flat(a):
return list(numpy.array(a).flat)
def extend(a):
n = []
list(map(n.extend, a))
return n
perfplot.show(
setup = lambda n: [list(range(10))] * n,
kernels = [
functools_reduce_iconcat, extend, itertools_chain, numpy_flat
],
n_range = [2**k for k in range(16)],
xlabel = 'num lists',
)
Output:
There are several answers with the same recursive appending scheme as below, but none makes use of try, which makes the solution more robust and Pythonic.
def flatten(itr):
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Usage: this is a generator, and you typically want to enclose it in an iterable builder like list() or tuple() or use it in a for loop.
Advantages of this solution are:
works with any kind of iterable (even future ones!)
works with any combination and deepness of nesting
works also if top level contains bare items
no dependencies
fast and efficient (you can flatten the nested iterable partially, without wasting time on the remaining part you don't need)
versatile (you can use it to build an iterable of your choice or in a loop)
N.B.: Since all iterables are flattened, strings are decomposed into sequences of single characters. If you don't like/want such behavior, you can use the following version which filters out from flattening iterables like strings and bytes:
def flatten(itr):
if type(itr) in (str,bytes):
yield itr
else:
for x in itr:
try:
yield from flatten(x)
except TypeError:
yield x
Note: Below applies to Python 3.3+ because it uses yield_from. six is also a third-party package, though it is stable. Alternately, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
There are several problems here:
One element, 6, is just a scalar; it's not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don't want to treat it as such--you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract "1 level down."
You can remedy this as follows:
>>> from collections import Iterable
>>> from six import string_types
>>> def flatten(obj):
... for i in obj:
... if isinstance(i, Iterable) and not isinstance(i, string_types):
... yield from flatten(i)
... else:
... yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from itertools, but also want to ensure that (2) the element is not "string-like."
If you are willing to give up a tiny amount of speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
You can find out more here in the documentation, numpy.concatenate and numpy.ravel.
def flatten(alist):
if alist == []:
return []
elif type(alist) is not list:
return [alist]
else:
return flatten(alist[0]) + flatten(alist[1:])
I wanted a solution which can deal with multiple nesting ([[1], [[[2]], [3]]], [1, 2, 3] for example), but would also not be recursive (I had a big level of recursion and I got a recursion error.
This is what I came up with:
def _flatten(l) -> Iterator[Any]:
stack = l.copy()
while stack:
item = stack.pop()
if isinstance(item, list):
stack.extend(item)
else:
yield item
def flatten(l) -> Iterator[Any]:
return reversed(list(_flatten(l)))
and tests:
#pytest.mark.parametrize('input_list, expected_output', [
([1, 2, 3], [1, 2, 3]),
([[1], 2, 3], [1, 2, 3]),
([[1], [2], 3], [1, 2, 3]),
([[1], [2], [3]], [1, 2, 3]),
([[1], [[2]], [3]], [1, 2, 3]),
([[1], [[[2]], [3]]], [1, 2, 3]),
])
def test_flatten(input_list, expected_output):
assert list(flatten(input_list)) == expected_output
This may not be the most efficient way, but I thought to put a one-liner (actually a two-liner). Both versions will work on arbitrary hierarchy nested lists, and exploits language features (Python 3.5) and recursion.
def make_list_flat (l):
flist = []
flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)
The output is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
This works in a depth first manner. The recursion goes down until it finds a non-list element, then extends the local variable flist and then rolls back it to the parent. Whenever flist is returned, it is extended to the parent's flist in the list comprehension. Therefore, at the root, a flat list is returned.
The above one creates several local lists and returns them which are used to extend the parent's list. I think the way around for this may be creating a gloabl flist, like below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]
make_list_flat(a)
print (flist)
The output is again
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Although I am not sure at this time about the efficiency.
Not a one-liner, but seeing all the answers here, I guess this long list missed some pattern matching, so here it is :)
The two methods are probably not efficient, but anyway, it's easy to read (to me at least; perhaps I'm spoiled by functional programming):
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*sublist, *flat(r)]
The second version considers lists of lists of lists... whatever the nesting:
def flat(x):
match x:
case []:
return []
case [[*sublist], *r]:
return [*flat(sublist), *flat(r)]
case [h, *r]:
return [h, *flat(r)]
If you want to unnest everything and keep a distinct list of elements, you could use this as well.
list_of_lists = [[1,2], [2,3], [3,4]]
list(set.union(*[set(s) for s in list_of_lists]))
If you have a numpy array a:
a = np.array([[1,2], [3,4]])
a.flatten('C')
produces:
[1, 2, 3, 4]
np.flatten also accepts other parameters:
C:
F
A
K
More details about parameters are available here.
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
"""Flatten an arbitrary deep nested list of lists of integers.
Examples:
>>> flatten([1, 2, [1, [10]]])
[1, 2, 1, 10]
Args:
l: Union[l, Union[int, List[int]]
Returns:
Flatted list of integer
"""
return [int(i.strip('[ ]')) for i in str(l).split(',')]
A non-recursive function to flatten lists of lists of any depth:
def flatten_list(list1):
out = []
inside = list1
while inside:
x = inside.pop(0)
if isinstance(x, list):
inside[0:0] = x
else:
out.append(x)
return out
l = [[[1,2],3,[4,[[5,6],7],[8]]],[9,10,11]]
flatten_list(l)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
For a list containing multiple list here a recursive solution that work for me and that i hope is correct:
# Question 4
def flatten(input_ls=[]) -> []:
res_ls = []
res_ls = flatten_recursive(input_ls, res_ls)
print("Final flatten list solution is: \n", res_ls)
return res_ls
def flatten_recursive(input_ls=[], res_ls=[]) -> []:
tmp_ls = []
for i in input_ls:
if isinstance(i, int):
res_ls.append(i)
else:
tmp_ls = i
tmp_ls.append(flatten_recursive(i, res_ls))
print(res_ls)
return res_ls
flatten([0, 1, [2, 3], 4, [5, 6]]) # test
flatten([0, [[[1]]], [[2, 3], [4, [[5, 6]]]]])
Output:
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
[0, 1]
[0, 1]
[0, 1]
[0, 1, 2, 3]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]
Final flatten list solution is:
[0, 1, 2, 3, 4, 5, 6]
I would suggest using generators with yield statement and yield from.
Here's an example:
from collections.abc import Iterable
def flatten(items, ignore_types=(bytes, str)):
"""
Flatten all of the nested lists to the one. Ignoring flatting of iterable types str and bytes by default.
"""
for x in items:
if isinstance(x, Iterable) and not isinstance(x, ignore_types):
yield from flatten(x)
else:
yield x
values = [7, [4, 3, 5, [7, 3], (3, 4), ('A', {'B', 'C'})]]
for v in flatten(values):
print(v)
If I want to add something to the great previous answers, here is my recursive flatten function which can flatten not only nested lists, but also any given container or any generally any object which can throw out items. This does also work for any depth of nesting and it is a lazy iterator which yields the items as requested:
def flatten(iterable):
# These types won't considered a sequence or generally a container
exclude = str, bytes
for i in iterable:
try:
if isinstance(i, exclude):
raise TypeError
iter(i)
except TypeError:
yield i
else:
yield from flatten(i)
This way, you can exclude types you don't want to be flattened, like str or what else.
The idea is if an object can pass the iter() it's ready to yield items. So the iterable can have even generator expressions as an item.
Someone could argue: Why did you write this that generic when the OP didn't ask for it? OK, you're right. I just felt like this might help someone (like it did for myself).
Test cases:
lst1 = [1, {3}, (1, 6), [[3, 8]], [[[5]]], 9, ((((2,),),),)]
lst2 = ['3', B'A', [[[(i ** 2 for i in range(3))]]], range(3)]
print(list(flatten(lst1)))
print(list(flatten(lst2)))
Output:
[1, 3, 1, 6, 3, 8, 5, 9, 2]
['3', b'A', 0, 1, 4, 0, 1, 2]
Considering the list has just integers:
import re
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(map(int,re.sub('(\[|\])','',str(l)).split(',')))
Simplest Way to do in python without any library
This function will work for even multidimensional list also
using recursion we can achieve any combination of list inside list, we can flatten it without using any library.
#Devil
x = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
output = []
def flatten(v):
if isinstance(v, int):
output.append(v)
if isinstance(v, list):
for i in range(0, len(v)):
flatten(v[i])
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
#Adding more dimensions
x = [ [1, [2, 3, [4, 5], [6]], 7 ], [8, [9, [10]]] ]
flatten(x)
print("Output:", output)
#Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Categories