Interleave two lists of different length in python v. 2? - python

I am trying to write a Python function that takes two lists as arguments and interleaves them. The order of the component lists should be preserved. If the lists do not have the same length, the elements of the longer list should end up at the end of the resulting list.
For example, I'd like to put this in Shell:
interleave(["a", "b"], [1, 2, 3, 4])
And get this back:
["a", 1, "b", 2, 3, 4]
If you can help me I'd appreciate it.

Here's how I'd do it, using various bits of the itertools module. It works for any number of iterables, not just two:
from itertools import chain, izip_longest  # or zip_longest in Python 3
def interleave(*iterables):
    sentinel = object()  # unique filler that cannot clash with real items
    z = izip_longest(*iterables, fillvalue=sentinel)
    c = chain.from_iterable(z)
    f = filter(lambda x: x is not sentinel, c)
    return list(f)

You could try this:
In [30]: from itertools import izip_longest
In [31]: l = ['a', 'b']
In [32]: l2 = [1, 2, 3, 4]
In [33]: [item for slist in izip_longest(l, l2) for item in slist if item is not None]
Out[33]: ['a', 1, 'b', 2, 3, 4]
izip_longest 'zips' the two lists together, but instead of stopping at the length of the shortest list, it continues until the longest one is exhausted:
In [36]: list(izip_longest(l, l2))
Out[36]: [('a', 1), ('b', 2), (None, 3), (None, 4)]
You then build the result by iterating through each item of each pair in the zipped list, omitting those whose value is None. As pointed out by @Blckknight, this will not work properly if your original lists already contain None values. If that is possible in your situation, you can use the fillvalue argument of izip_longest to pad with something other than None (as @Blckknight does in his answer).
Here is the above example as a function:
In [37]: def interleave(*iterables):
   ....:     return [item for slist in izip_longest(*iterables) for item in slist if item is not None]
   ....:
In [38]: interleave(l, l2)
Out[38]: ['a', 1, 'b', 2, 3, 4]
In [39]: interleave(l, l2, [44, 56, 77])
Out[39]: ['a', 1, 44, 'b', 2, 56, 3, 77, 4]
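The None caveat above can be sketched for Python 3, where izip_longest is named zip_longest. This is a minimal sketch, not the answerer's code: the sentinel name _MISSING is an assumption made for illustration.

```python
from itertools import chain, zip_longest  # Python 3 name of izip_longest

# Hypothetical sentinel name; a fresh object() can never equal real data.
_MISSING = object()

def interleave(*iterables):
    """Interleave iterables, keeping leftover items from the longer ones."""
    padded = zip_longest(*iterables, fillvalue=_MISSING)
    # Drop only the padding, so genuine None values survive.
    return [item for item in chain.from_iterable(padded) if item is not _MISSING]

result_plain = interleave(["a", "b"], [1, 2, 3, 4])
result_with_none = interleave(["a", None], [1, 2, 3])
print(result_plain)      # ['a', 1, 'b', 2, 3, 4]
print(result_with_none)  # ['a', 1, None, 2, 3]
```

Filtering on identity with a private sentinel, rather than on None, is what lets None appear as an ordinary list element.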

A not very elegant solution, but it may still be helpful:
def interleave(lista, listb):
    # Work on reversed copies so that pop() yields items in the original order.
    tempa, tempb = list(reversed(lista)), list(reversed(listb))
    result = []
    while tempa or tempb:
        if tempa:
            result.append(tempa.pop())
        if tempb:
            result.append(tempb.pop())
    return result
or in a single expression (Python 2; in Python 3 use range and functools.reduce):
def interleave2(lista, listb):
    return reduce(lambda x, y: x + y,
                  map(lambda x: x[0] + x[1],
                      [(lista[i:i+1], listb[i:i+1])
                       for i in xrange(max(len(lista), len(listb)))]))

Another solution is based on the question: how would I do it by hand? Well, almost by hand: use the built-in zip(), which stops at the length of the shorter list, and then extend the zipped result with the tail of the longer list:
#!python2
def interleave(lst1, lst2):
    minlen = min(len(lst1), len(lst2))    # find the length of the shorter
    tail = lst1[minlen:] + lst2[minlen:]  # get the tail (one slice is empty)
    result = []
    for t in zip(lst1, lst2):             # use a standard zip
        result.extend(t)                  # expand tuple to two items
    return result + tail                  # result of zip() plus the tail
print interleave(["a", "b"], [1, 2, 3, 4])
print interleave([1, 2, 3, 4], ["a", "b"])
print interleave(["a", None, "b"], [1, 2, 3, None, 4])
It prints the results:
['a', 1, 'b', 2, 3, 4]
[1, 'a', 2, 'b', 3, 4]
['a', 1, None, 2, 'b', 3, None, 4]

Related

creating a dataframe out of multiple variables containing lists in python [duplicate]

How do I concatenate two lists in Python?
Example:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
Expected outcome:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
Use the + operator to combine the lists:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
joinedlist = listone + listtwo
Output:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
Python >= 3.5 alternative: [*l1, *l2]
Another alternative has been introduced via the acceptance of PEP 448 which deserves mentioning.
The PEP, titled Additional Unpacking Generalizations, generally reduced some syntactic restrictions when using the starred * expression in Python; with it, joining two lists (applies to any iterable) can now also be done with:
>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> joined_list = [*l1, *l2] # unpack both iterables in a list literal
>>> print(joined_list)
[1, 2, 3, 4, 5, 6]
This functionality was defined for Python 3.5, but it hasn't been backported to previous versions in the 3.x family. In unsupported versions a SyntaxError is going to be raised.
As with the other approaches, this too creates a shallow copy of the elements in the corresponding lists.
The upside to this approach is that you really don't need lists in order to perform it; anything that is iterable will do. As stated in the PEP:
This is also useful as a more readable way of summing iterables into a
list, such as my_list + list(my_tuple) + list(my_range) which is now
equivalent to just [*my_list, *my_tuple, *my_range].
So while addition with + would raise a TypeError due to type mismatch:
l = [1, 2, 3]
r = range(4, 7)
res = l + r
The following won't:
res = [*l, *r]
because it will first unpack the contents of the iterables and then simply create a list from the contents.
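A short sketch of the contrast described above, assuming Python 3.5 or later. The variable names are illustrative only.

```python
numbers = [1, 2, 3]
more = range(4, 7)

# list + range is a type mismatch, so the + operator raises TypeError.
try:
    numbers + more
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# Unpacking accepts any iterable: a list, a range and a generator here.
joined = [*numbers, *more, *(n * 10 for n in (1, 2))]
print(joined)  # [1, 2, 3, 4, 5, 6, 10, 20]
```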
It's also possible to create a generator that simply iterates over the items in both lists using itertools.chain(). This allows you to chain lists (or any iterable) together for processing without copying the items to a new list:
import itertools
for item in itertools.chain(listone, listtwo):
    # Do something with each list item
    print(item)
You could also use the list.extend() method in order to add a list to the end of another one:
listone = [1,2,3]
listtwo = [4,5,6]
listone.extend(listtwo)
If you want to keep the original list intact, you can create a new list object, and extend both lists to it:
mergedlist = []
mergedlist.extend(listone)
mergedlist.extend(listtwo)
How do I concatenate two lists in Python?
As of 3.9, these are the most popular stdlib methods for concatenating two (or more) lists in Python.
Method                  Version restrictions  In-place?  Generalize to N lists?
a + b                   -                     No         sum([a, b, c], []) [1]
list(chain(a, b)) [2]   >=2.3                 No         list(chain(a, b, c))
[*a, *b] [3]            >=3.5                 No         [*a, *b, *c]
a += b                  -                     Yes        No
a.extend(b)             -                     Yes        No
Footnotes
[1] This is a slick solution because of its succinctness. But sum performs concatenation in a pairwise fashion, which means this is a quadratic operation, as memory has to be allocated for each step. DO NOT USE if your lists are large.
[2] See chain and chain.from_iterable from the docs. You will need to from itertools import chain first. Concatenation is linear in memory, so this is the best in terms of performance and version compatibility. chain.from_iterable was introduced in 2.6.
[3] This method uses Additional Unpacking Generalizations (PEP 448), but cannot generalize to N lists unless you manually unpack each one yourself.
a += b and a.extend(b) are more or less equivalent for all practical purposes. += when called on a list will internally call list.__iadd__, which extends the first list by the second.
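The methods compared above can be tried side by side. This is a minimal sketch with made-up sample data. It shows that the results agree and which methods mutate in place, without measuring performance.

```python
from itertools import chain

lists = [[1, 2], [3], [], [4, 5, 6]]  # sample data, chosen for illustration

via_sum = sum(lists, [])                      # pairwise; quadratic for many lists
via_chain = list(chain.from_iterable(lists))  # single linear pass
via_unpack = [*lists[0], *lists[1], *lists[2], *lists[3]]  # each list unpacked by hand

# In-place methods change the left-hand list, so every alias sees the change.
a = [1, 2]
alias = a
a += [3]
a.extend([4])

print(via_sum, via_chain, via_unpack, alias)
```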
Performance
2-List Concatenation [1]
There's not much difference between these methods, which makes sense given that they all have the same order of complexity (linear). There's no particular reason to prefer one over the other except as a matter of style.
N-List Concatenation
Plots have been generated using the perfplot module. Code, for your reference.
[1] The iadd (+=) and extend methods operate in-place, so a copy has to be generated each time before testing. To keep things fair, all methods have a pre-copy step for the left-hand list which can be ignored.
Comments on Other Solutions
DO NOT USE THE DUNDER METHOD list.__add__ directly in any way, shape or form. In fact, stay clear of dunder methods, and use the operators and operator functions like they were designed for. Python has careful semantics baked into these which are more complicated than just calling the dunder directly. Here is an example. So, to summarise, a.__add__(b) => BAD; a + b => GOOD.
Some answers here offer reduce(operator.add, [a, b]) for pairwise concatenation -- this is the same as sum([a, b], []) only more wordy.
Any method that uses set will drop duplicates and lose ordering. Use with caution.
for i in b: a.append(i) is wordier and slower than a.extend(b), which is a single function call and more idiomatic. append is slower because of the semantics with which memory is allocated and grown for lists. See here for a similar discussion.
heapq.merge will work, but its use case is for merging sorted lists in linear time. Using it in any other situation is an anti-pattern.
yielding list elements from a function is an acceptable method, but chain does this faster and better (it has a code path in C, so it is fast).
operator.add(a, b) is an acceptable functional equivalent to a + b. Its use cases are mainly for dynamic method dispatch. Otherwise, prefer a + b which is shorter and more readable, in my opinion. YMMV.
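One way to see why the operator and the dunder method are not interchangeable is the reflected-operand fallback. The sketch below is an illustration under stated assumptions: the Tail class is hypothetical and exists only to provide __radd__.

```python
class Tail:
    """Hypothetical helper that knows how to be added to the right of a list."""

    def __init__(self, items):
        self.items = list(items)

    def __radd__(self, other):
        # Called by the + operator when the left operand cannot handle Tail.
        return list(other) + self.items


# The operator tries the right operand's __radd__ as a fallback.
via_operator = [1, 2] + Tail([3])

# Calling the dunder directly skips that fallback entirely.
try:
    direct = [1, 2].__add__(Tail([3]))
    direct_worked = direct is not NotImplemented
except TypeError:
    direct_worked = False

# The same effect with numbers: the int method gives up, the operator does not.
int_direct = (1).__add__(2.5)
int_operator = 1 + 2.5

print(via_operator, direct_worked, int_direct, int_operator)
```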
You can use sets to obtain a merged list of unique values (note that the original order is not preserved):
mergedlist = list(set(listone + listtwo))
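If the order of first appearance matters, a dict can be used in place of a set. This is a sketch assuming Python 3.7 or later, where dicts keep insertion order, and hashable items. The sample values are illustrative.

```python
listone = [3, 1, 2, 3]
listtwo = [2, 5, 1, 4]

# dict keys are unique and keep insertion order, so duplicates are
# dropped while the order of first appearance is preserved.
unique_in_order = list(dict.fromkeys(listone + listtwo))
print(unique_in_order)  # [3, 1, 2, 5, 4]
```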
This is quite simple, and I think it was even shown in the tutorial:
>>> listone = [1,2,3]
>>> listtwo = [4,5,6]
>>>
>>> listone + listtwo
[1, 2, 3, 4, 5, 6]
This question directly asks about joining two lists. However, it ranks pretty high in search results even when you are looking for a way of joining many lists (including the case when you are joining zero lists).
I think the best option is to use list comprehensions:
>>> a = [[1,2,3], [4,5,6], [7,8,9]]
>>> [x for xs in a for x in xs]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
You can create generators as well:
>>> map(str, (x for xs in a for x in xs))
['1', '2', '3', '4', '5', '6', '7', '8', '9']
Old Answer
Consider this more generic approach:
a = [[1,2,3], [4,5,6], [7,8,9]]
reduce(lambda c, x: c + x, a, [])
Will output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Note, this also works correctly when a is [] or [[1,2,3]].
However, this can be done more efficiently with itertools:
a = [[1,2,3], [4,5,6], [7,8,9]]
list(itertools.chain(*a))
If you don't need a list, but just an iterable, omit list().
Update
Alternative suggested by Patrick Collins in the comments could also work for you:
sum(a, [])
You could simply use the + or += operator as follows:
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
Or:
c = []
a = [1, 2, 3]
b = [4, 5, 6]
c += (a + b)
Also, if you want the values in the merged list to be unique you can do:
c = list(set(a + b))
It's worth noting that the itertools.chain function accepts a variable number of arguments:
>>> l1 = ['a']; l2 = ['b', 'c']; l3 = ['d', 'e', 'f']
>>> [i for i in itertools.chain(l1, l2)]
['a', 'b', 'c']
>>> [i for i in itertools.chain(l1, l2, l3)]
['a', 'b', 'c', 'd', 'e', 'f']
If the input is a single iterable of iterables (a tuple, list, generator, etc.), the from_iterable class method may be used:
>>> il = [['a'], ['b', 'c'], ['d', 'e', 'f']]
>>> [i for i in itertools.chain.from_iterable(il)]
['a', 'b', 'c', 'd', 'e', 'f']
For cases with a small number of lists, you can simply add the lists together or use iterable unpacking (available in Python 3.5+):
In [1]: listone = [1, 2, 3]
...: listtwo = [4, 5, 6]
In [2]: listone + listtwo
Out[2]: [1, 2, 3, 4, 5, 6]
In [3]: [*listone, *listtwo]
Out[3]: [1, 2, 3, 4, 5, 6]
As a more general way, for cases with a larger number of lists, you can use the chain.from_iterable() [1] function from the itertools module. Also, based on this answer, this function is the best, or at least a very good, way of flattening a nested list as well.
>>> l=[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> import itertools
>>> list(itertools.chain.from_iterable(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
1. Note that `chain.from_iterable()` is available in Python 2.6 and later. In other versions, use `chain(*l)`.
With Python 3.3+ you can use yield from:
listone = [1,2,3]
listtwo = [4,5,6]
def merge(l1, l2):
    yield from l1
    yield from l2
>>> list(merge(listone, listtwo))
[1, 2, 3, 4, 5, 6]
Or, if you want to support an arbitrary number of iterators:
def merge(*iters):
    for it in iters:
        yield from it
>>> list(merge(listone, listtwo, 'abcd', [20, 21, 22]))
[1, 2, 3, 4, 5, 6, 'a', 'b', 'c', 'd', 20, 21, 22]
If you want to merge the two lists in sorted form, you can use the merge function from the heapq library.
from heapq import merge
a = [1, 2, 4]
b = [2, 4, 6, 7]
print(list(merge(a, b)))
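A sketch of heapq.merge in use, assuming Python 3.5 or later for the key argument. The word lists are made-up sample data. Each input must already be sorted by the same rule, otherwise the output is not sorted either.

```python
from heapq import merge

a = [1, 2, 4]
b = [2, 4, 6, 7]
merged = list(merge(a, b))  # duplicates are kept, unlike a set union
print(merged)  # [1, 2, 2, 4, 4, 6, 7]

# With key=, both inputs must be sorted by that key (here: word length).
words_a = ["kiwi", "banana"]
words_b = ["fig", "cherries"]
by_length = list(merge(words_a, words_b, key=len))
print(by_length)  # ['fig', 'kiwi', 'banana', 'cherries']
```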
If you can't use the plus operator (+), you can use the operator module:
import operator
listone = [1,2,3]
listtwo = [4,5,6]
result = operator.add(listone, listtwo)
print(result)
>>> [1, 2, 3, 4, 5, 6]
Alternatively, you could also use the __add__ dunder function:
listone = [1,2,3]
listtwo = [4,5,6]
result = list.__add__(listone, listtwo)
print(result)
>>> [1, 2, 3, 4, 5, 6]
If you need to merge two ordered lists with complicated sorting rules, you might have to roll it yourself like in the following code (using a simple sorting rule for readability :-) ).
list1 = [1,2,5]
list2 = [2,3,4]
newlist = []
while list1 and list2:
    if list1[0] == list2[0]:
        # Equal heads: keep one copy and drop the other.
        newlist.append(list1.pop(0))
        list2.pop(0)
    elif list1[0] < list2[0]:
        newlist.append(list1.pop(0))
    else:
        newlist.append(list2.pop(0))
if list1:
    newlist.extend(list1)
if list2:
    newlist.extend(list2)
assert(newlist == [1, 2, 3, 4, 5])
If you are using NumPy, you can concatenate two arrays of compatible dimensions with this command:
numpy.concatenate([a,b])
Use a simple list comprehension:
joined_list = [item for list_ in [list_one, list_two] for item in list_]
It has all the advantages of the newest approach of using Additional Unpacking Generalizations - i.e. you can concatenate an arbitrary number of different iterables (for example, lists, tuples, ranges, and generators) that way - and it's not limited to Python 3.5 or later.
Another way:
>>> listone = [1, 2, 3]
>>> listtwo = [4, 5, 6]
>>> joinedlist = [*listone, *listtwo]
>>> joinedlist
[1, 2, 3, 4, 5, 6]
>>>
list(set(listone) | set(listtwo))
The above code does not preserve order and removes all duplicates: both those within each list and those shared between the two lists.
As already pointed out by many, itertools.chain() is the way to go if one needs to apply exactly the same treatment to both lists. In my case, I had a label and a flag which were different from one list to the other, so I needed something slightly more complex. As it turns out, behind the scenes itertools.chain() simply does the following:
for it in iterables:
    for element in it:
        yield element
(see https://docs.python.org/2/library/itertools.html), so I took inspiration from here and wrote something along these lines:
for iterable, header, flag in ((newList, 'New', ''), (modList, 'Modified', '-f')):
    print header + ':'
    for path in iterable:
        [...]
        command = 'cp -r' if os.path.isdir(srcPath) else 'cp'
        print >> SCRIPT, command, flag, srcPath, mergedDirPath
        [...]
The main points to understand here are that lists are just a special case of iterable, which are objects like any other; and that for ... in loops in Python can unpack each tuple into several loop variables, so it is simple to loop on multiple variables at the same time.
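The tuple-unpacking loop described above can be reduced to a self-contained Python 3 sketch. The file names and the build_commands helper are hypothetical stand-ins, not the answerer's actual script.

```python
def build_commands(new_paths, modified_paths):
    """Return header lines and copy commands, one group per input list."""
    lines = []
    # Each tuple pairs a list with the label and flag that apply to it.
    groups = ((new_paths, "New", ""), (modified_paths, "Modified", "-f"))
    for paths, header, flag in groups:
        lines.append(header + ":")
        for path in paths:
            # Skip the flag when it is empty to avoid a double space.
            lines.append(" ".join(part for part in ("cp", flag, path) if part))
    return lines

commands = build_commands(["a.txt", "b.txt"], ["c.txt"])
print("\n".join(commands))
```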
You could use the append() method defined on list objects:
mergedlist =[]
for elem in listone:
    mergedlist.append(elem)
for elem in listtwo:
    mergedlist.append(elem)
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)
Output
>>> [1, 2, 3, 4, 5, 6]
In the above code, the "+" operator is used to concatenate the two lists into a single list.
Another solution
a = [1, 2, 3]
b = [4, 5, 6]
c = [] # Empty list in which we are going to append the values of list (a) and (b)
for i in a:
    c.append(i)
for j in b:
    c.append(j)
print(c)
Output
>>> [1, 2, 3, 4, 5, 6]
All the possible ways to join lists that I could find
import itertools
A = [1,3,5,7,9] + [2,4,6,8,10]
B = [1,3,5,7,9]
B.append([2,4,6,8,10])
C = [1,3,5,7,9]
C.extend([2,4,6,8,10])
D = list(zip([1,3,5,7,9],[2,4,6,8,10]))
E = [1,3,5,7,9]+[2,4,6,8,10]
F = list(set([1,3,5,7,9] + [2,4,6,8,10]))
G = []
for a in itertools.chain([1,3,5,7,9], [2,4,6,8,10]):
    G.append(a)
print("A: " + str(A))
print("B: " + str(B))
print("C: " + str(C))
print("D: " + str(D))
print("E: " + str(E))
print("F: " + str(F))
print("G: " + str(G))
Output
A: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
B: [1, 3, 5, 7, 9, [2, 4, 6, 8, 10]]
C: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
D: [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
E: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
F: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
G: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
Here are three methods to concatenate the lists; the first is the one I recommend most:
# Easiest and least complexity method <= recommended
listone = [1, 2, 3]
listtwo = [4, 5, 6]
newlist = listone + listtwo
print(newlist)
# Second-easiest method
newlist = listone.copy()
newlist.extend(listtwo)
print(newlist)
In the second method, I assign newlist to a copy of the listone, because I don't want to change listone.
# Third method
newlist = listone.copy()
for j in listtwo:
    newlist.append(j)
print(newlist)
This is not a good way to concatenate lists, because it appends the items one at a time in an explicit for loop. It is still linear in the length of listtwo, but it is noticeably slower in practice than the other two methods.
The most common ways to add items to a list are the plus operator and the built-in method append, for example:
list = [1,2]
list = list + [3]
# list = [1,2,3]
list.append(3)
# list = [1,2,3,3]
list.append([3,4])
# list = [1,2,3,3,[3,4]]
In most cases this will work, but append adds its argument as a single element, so appending a list nests it instead of concatenating it. If that is not what you want, use the extend method, which adds each item of any iterable. (Note that naming a variable list shadows the built-in type; prefer another name in real code.)
list = [1,2]
list.extend([3,4])
# list = [1,2,3,4]
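The difference between append and extend can be checked directly. This is a minimal sketch using the illustrative names items and flat rather than list, so the built-in type is not shadowed.

```python
items = [1, 2]
items = items + [3]   # new list: [1, 2, 3]
items.append(4)       # one element added: [1, 2, 3, 4]
items.append([5, 6])  # the whole list is added as ONE element
print(items)          # [1, 2, 3, 4, [5, 6]]

flat = [1, 2]
flat.extend([3, 4])   # each item added separately: [1, 2, 3, 4]
flat.extend("ab")     # works with any iterable, including strings
print(flat)           # [1, 2, 3, 4, 'a', 'b']
```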
A really concise way to combine a list of lists is
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
reduce(list.__add__, list_of_lists)
which gives us
[1, 2, 3, 4, 5, 6, 7, 8, 9]
So there are two easy ways.
Using +: It creates a new list from provided lists
Example:
In [1]: a = [1, 2, 3]
In [2]: b = [4, 5, 6]
In [3]: a + b
Out[3]: [1, 2, 3, 4, 5, 6]
In [4]: %timeit a + b
10000000 loops, best of 3: 126 ns per loop
Using extend: It appends new list to existing list. That means it does not create a separate list.
Example:
In [1]: a = [1, 2, 3]
In [2]: b = [4, 5, 6]
In [3]: %timeit a.extend(b)
10000000 loops, best of 3: 91.1 ns per loop
Thus, of the two most popular methods, extend is the faster one in this measurement.
You could also just use sum.
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> sum([a, b], [])
[1, 2, 3, 4, 5, 6]
>>>
This works for any length and any element type of list:
>>> a = ['a', 'b', 'c', 'd']
>>> b = [1, 2, 3, 4]
>>> c = [1, 2]
>>> sum([a, b, c], [])
['a', 'b', 'c', 'd', 1, 2, 3, 4, 1, 2]
>>>
The reason I pass [] is that the start argument defaults to 0, so sum loops through the list and adds each item to start. But 0 + [1, 2, 3] raises an error, so we set start to []; then [] + [1, 2, 3] works as expected.
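The role of the start argument can be seen by leaving it out. This is a small sketch, and the exact wording of the error message may vary between Python versions.

```python
a = [1, 2, 3]
b = [4, 5, 6]

# Without a start value, sum begins with 0 and tries 0 + [1, 2, 3].
try:
    sum([a, b])
except TypeError as exc:
    message = str(exc)
else:
    message = None

joined = sum([a, b], [])  # starts from [] instead, so list + list works
empty = sum([], [])       # no lists at all: the start value is returned
print(message)
print(joined)  # [1, 2, 3, 4, 5, 6]
```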
I assume you want one of the two methods:
Keep duplicate elements
It is very easy. Just concatenate like a string:
def concat_list(l1, l2):
    l3 = l1 + l2
    return l3
Next, if you want to eliminate duplicate elements:
def concat_list(l1, l2):
    l3 = []
    for i in [l1, l2]:
        for j in i:
            if j not in l3:
                # Check if element exists in final list, if no then add element to list
                l3.append(j)
    return l3
The solutions above join two flat lists. If each list contains sublists and the corresponding sublists need to be merged, the + operation inside a for loop does the work:
a = [[1,2,3], [4,5,6]]
b = [[0,1,2], [7,8,9]]
cc = []  # result list
for i in range(len(a)):
    cc.append(a[i] + b[i])
print(cc)
Output: [[1, 2, 3, 0, 1, 2], [4, 5, 6, 7, 8, 9]]
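The same pairwise merge can be sketched with zip and a list comprehension, which avoids index arithmetic. Note that zip stops at the shorter of the two outer lists, so this assumes a and b have the same number of sublists.

```python
a = [[1, 2, 3], [4, 5, 6]]
b = [[0, 1, 2], [7, 8, 9]]

# Pair up corresponding sublists and concatenate each pair.
merged = [left + right for left, right in zip(a, b)]
print(merged)  # [[1, 2, 3, 0, 1, 2], [4, 5, 6, 7, 8, 9]]
```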

Pass several lists of columns to index a dataframe (without using loc/iloc) [duplicate]

This question's answers are a community effort. Edit existing answers to improve this post. It is not currently accepting new answers or interactions.
How do I concatenate two lists in Python?
Example:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
Expected outcome:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
Use the + operator to combine the lists:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
joinedlist = listone + listtwo
Output:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
Python >= 3.5 alternative: [*l1, *l2]
Another alternative has been introduced via the acceptance of PEP 448 which deserves mentioning.
The PEP, titled Additional Unpacking Generalizations, generally reduced some syntactic restrictions when using the starred * expression in Python; with it, joining two lists (applies to any iterable) can now also be done with:
>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> joined_list = [*l1, *l2] # unpack both iterables in a list literal
>>> print(joined_list)
[1, 2, 3, 4, 5, 6]
This functionality was defined for Python 3.5, but it hasn't been backported to previous versions in the 3.x family. In unsupported versions a SyntaxError is going to be raised.
As with the other approaches, this too creates as shallow copy of the elements in the corresponding lists.
The upside to this approach is that you really don't need lists in order to perform it; anything that is iterable will do. As stated in the PEP:
This is also useful as a more readable way of summing iterables into a
list, such as my_list + list(my_tuple) + list(my_range) which is now
equivalent to just [*my_list, *my_tuple, *my_range].
So while addition with + would raise a TypeError due to type mismatch:
l = [1, 2, 3]
r = range(4, 7)
res = l + r
The following won't:
res = [*l, *r]
because it will first unpack the contents of the iterables and then simply create a list from the contents.
It's also possible to create a generator that simply iterates over the items in both lists using itertools.chain(). This allows you to chain lists (or any iterable) together for processing without copying the items to a new list:
import itertools
for item in itertools.chain(listone, listtwo):
# Do something with each list item
You could also use the list.extend() method in order to add a list to the end of another one:
listone = [1,2,3]
listtwo = [4,5,6]
listone.extend(listtwo)
If you want to keep the original list intact, you can create a new list object, and extend both lists to it:
mergedlist = []
mergedlist.extend(listone)
mergedlist.extend(listtwo)
How do I concatenate two lists in Python?
As of 3.9, these are the most popular stdlib methods for concatenating two (or more) lists in Python.
Version Restrictions
In-Place?
Generalize to N lists?
a+b
-
No
sum([a, b, c], [])1
list(chain(a,b))2
>=2.3
No
list(chain(a, b, c))
[*a, *b]3
>=3.5
No
[*a, *b, *c]
a += b
-
Yes
No
a.extend(b)
-
Yes
No
Footnotes
This is a slick solution because of its succinctness. But sum performs concatenation in a pairwise fashion, which means this is a
quadratic operation as memory has to be allocated for each step. DO
NOT USE if your lists are large.
See chain
and
chain.from_iterable
from the docs. You will need to from itertools import chain first.
Concatenation is linear in memory, so this is the best in terms of
performance and version compatibility. chain.from_iterable was introduced in 2.6.
This method uses Additional Unpacking Generalizations (PEP 448), but cannot
generalize to N lists unless you manually unpack each one yourself.
a += b and a.extend(b) are more or less equivalent for all practical purposes. += when called on a list will internally call
list.__iadd__, which extends the first list by the second.
Performance
2-List Concatenation1
There's not much difference between these methods but that makes sense given they all have the same order of complexity (linear). There's no particular reason to prefer one over the other except as a matter of style.
N-List Concatenation
Plots have been generated using the perfplot module. Code, for your reference.
1. The iadd (+=) and extend methods operate in-place, so a copy has to be generated each time before testing. To keep things fair, all methods have a pre-copy step for the left-hand list which can be ignored.
Comments on Other Solutions
DO NOT USE THE DUNDER METHOD list.__add__ directly in any way, shape or form. In fact, stay clear of dunder methods, and use the operators and operator functions like they were designed for. Python has careful semantics baked into these which are more complicated than just calling the dunder directly. Here is an example. So, to summarise, a.__add__(b) => BAD; a + b => GOOD.
Some answers here offer reduce(operator.add, [a, b]) for pairwise concatenation -- this is the same as sum([a, b], []) only more wordy.
Any method that uses set will drop duplicates and lose ordering. Use with caution.
for i in b: a.append(i) is more wordy, and slower than a.extend(b), which is single function call and more idiomatic. append is slower because of the semantics with which memory is allocated and grown for lists. See here for a similar discussion.
heapq.merge will work, but its use case is for merging sorted lists in linear time. Using it in any other situation is an anti-pattern.
yielding list elements from a function is an acceptable method, but chain does this faster and better (it has a code path in C, so it is fast).
operator.add(a, b) is an acceptable functional equivalent to a + b. It's use cases are mainly for dynamic method dispatch. Otherwise, prefer a + b which is shorter and more readable, in my opinion. YMMV.
You can use sets to obtain merged list of unique values
mergedlist = list(set(listone + listtwo))
This is quite simple, and I think it was even shown in the tutorial:
>>> listone = [1,2,3]
>>> listtwo = [4,5,6]
>>>
>>> listone + listtwo
[1, 2, 3, 4, 5, 6]
This question directly asks about joining two lists. However it's pretty high in search even when you are looking for a way of joining many lists (including the case when you joining zero lists).
I think the best option is to use list comprehensions:
>>> a = [[1,2,3], [4,5,6], [7,8,9]]
>>> [x for xs in a for x in xs]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
You can create generators as well:
>>> map(str, (x for xs in a for x in xs))
['1', '2', '3', '4', '5', '6', '7', '8', '9']
Old Answer
Consider this more generic approach:
a = [[1,2,3], [4,5,6], [7,8,9]]
reduce(lambda c, x: c + x, a, [])
Will output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Note, this also works correctly when a is [] or [[1,2,3]].
However, this can be done more efficiently with itertools:
a = [[1,2,3], [4,5,6], [7,8,9]]
list(itertools.chain(*a))
If you don't need a list, but just an iterable, omit list().
Update
Alternative suggested by Patrick Collins in the comments could also work for you:
sum(a, [])
You could simply use the + or += operator as follows:
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
Or:
c = []
a = [1, 2, 3]
b = [4, 5, 6]
c += (a + b)
Also, if you want the values in the merged list to be unique you can do:
c = list(set(a + b))
It's worth noting that the itertools.chain function accepts variable number of arguments:
>>> l1 = ['a']; l2 = ['b', 'c']; l3 = ['d', 'e', 'f']
>>> [i for i in itertools.chain(l1, l2)]
['a', 'b', 'c']
>>> [i for i in itertools.chain(l1, l2, l3)]
['a', 'b', 'c', 'd', 'e', 'f']
If an iterable (tuple, list, generator, etc.) is the input, the from_iterable class method may be used:
>>> il = [['a'], ['b', 'c'], ['d', 'e', 'f']]
>>> [i for i in itertools.chain.from_iterable(il)]
['a', 'b', 'c', 'd', 'e', 'f']
For cases with a low number of lists you can simply add the lists together or use in-place unpacking (available in Python-3.5+):
In [1]: listone = [1, 2, 3]
...: listtwo = [4, 5, 6]
In [2]: listone + listtwo
Out[2]: [1, 2, 3, 4, 5, 6]
In [3]: [*listone, *listtwo]
Out[3]: [1, 2, 3, 4, 5, 6]
As a more general way for cases with more number of lists you can use chain.from_iterable()1 function from itertools module. Also, based on this answer this function is the best; or at least a very good way for flatting a nested list as well.
>>> l=[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> import itertools
>>> list(itertools.chain.from_iterable(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
1. Note that `chain.from_iterable()` is available in Python 2.6 and later. In other versions, use `chain(*l)`.
With Python 3.3+ you can use yield from:
listone = [1,2,3]
listtwo = [4,5,6]
def merge(l1, l2):
yield from l1
yield from l2
>>> list(merge(listone, listtwo))
[1, 2, 3, 4, 5, 6]
Or, if you want to support an arbitrary number of iterators:
def merge(*iters):
for it in iters:
yield from it
>>> list(merge(listone, listtwo, 'abcd', [20, 21, 22]))
[1, 2, 3, 4, 5, 6, 'a', 'b', 'c', 'd', 20, 21, 22]
If you want to merge the two lists in sorted form, you can use the merge function from the heapq library.
from heapq import merge
a = [1, 2, 4]
b = [2, 4, 6, 7]
print list(merge(a, b))
If you can't use the plus operator (+), you can use the operator import:
import operator
listone = [1,2,3]
listtwo = [4,5,6]
result = operator.add(listone, listtwo)
print(result)
>>> [1, 2, 3, 4, 5, 6]
Alternatively, you could also use the __add__ dunder function:
listone = [1,2,3]
listtwo = [4,5,6]
result = list.__add__(listone, listtwo)
print(result)
>>> [1, 2, 3, 4, 5, 6]
If you need to merge two ordered lists with complicated sorting rules, you might have to roll it yourself like in the following code (using a simple sorting rule for readability :-) ).
list1 = [1,2,5]
list2 = [2,3,4]
newlist = []
while list1 and list2:
if list1[0] == list2[0]:
newlist.append(list1.pop(0))
list2.pop(0)
elif list1[0] < list2[0]:
newlist.append(list1.pop(0))
else:
newlist.append(list2.pop(0))
if list1:
newlist.extend(list1)
if list2:
newlist.extend(list2)
assert(newlist == [1, 2, 3, 4, 5])
If you are using NumPy, you can concatenate two arrays of compatible dimensions with this command:
numpy.concatenate([a,b])
Use a simple list comprehension:
joined_list = [item for list_ in [list_one, list_two] for item in list_]
It has all the advantages of the newest approach of using Additional Unpacking Generalizations - i.e. you can concatenate an arbitrary number of different iterables (for example, lists, tuples, ranges, and generators) that way - and it's not limited to Python 3.5 or later.
Another way:
>>> listone = [1, 2, 3]
>>> listtwo = [4, 5, 6]
>>> joinedlist = [*listone, *listtwo]
>>> joinedlist
[1, 2, 3, 4, 5, 6]
>>>
list(set(listone) | set(listtwo))
The above code does not preserve order and removes duplicates from each list (but not from the concatenated list).
As already pointed out by many, itertools.chain() is the way to go if one needs to apply exactly the same treatment to both lists. In my case, I had a label and a flag which were different from one list to the other, so I needed something slightly more complex. As it turns out, behind the scenes itertools.chain() simply does the following:
for it in iterables:
for element in it:
yield element
(see https://docs.python.org/2/library/itertools.html), so I took inspiration from here and wrote something along these lines:
for iterable, header, flag in ( (newList, 'New', ''), (modList, 'Modified', '-f')):
print header + ':'
for path in iterable:
[...]
command = 'cp -r' if os.path.isdir(srcPath) else 'cp'
print >> SCRIPT , command, flag, srcPath, mergedDirPath
[...]
The main points to understand here are that lists are just a special case of iterable, which are objects like any other; and that for ... in loops in python can work with tuple variables, so it is simple to loop on multiple variables at the same time.
You could use the append() method defined on list objects:
mergedlist =[]
for elem in listone:
mergedlist.append(elem)
for elem in listtwo:
mergedlist.append(elem)
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)
Output
>>> [1, 2, 3, 4, 5, 6]
In the above code, the "+" operator is used to concatenate the two lists into a single list.
Another solution
a = [1, 2, 3]
b = [4, 5, 6]
c = [] # Empty list in which we are going to append the values of list (a) and (b)
for i in a:
c.append(i)
for j in b:
c.append(j)
print(c)
Output
>>> [1, 2, 3, 4, 5, 6]
All the possible ways to join lists that I could find
import itertools
A = [1,3,5,7,9] + [2,4,6,8,10]
B = [1,3,5,7,9]
B.append([2,4,6,8,10])
C = [1,3,5,7,9]
C.extend([2,4,6,8,10])
D = list(zip([1,3,5,7,9],[2,4,6,8,10]))
E = [1,3,5,7,9]+[2,4,6,8,10]
F = list(set([1,3,5,7,9] + [2,4,6,8,10]))
G = []
for a in itertools.chain([1,3,5,7,9], [2,4,6,8,10]):
    G.append(a)
print("A: " + str(A))
print("B: " + str(B))
print("C: " + str(C))
print("D: " + str(D))
print("E: " + str(E))
print("F: " + str(F))
print("G: " + str(G))
Output
A: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
B: [1, 3, 5, 7, 9, [2, 4, 6, 8, 10]]
C: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
D: [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
E: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
F: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
G: [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
I recommend three methods to concatenate lists; the first is the one I recommend most.
# Easiest and least complexity method <= recommended
listone = [1, 2, 3]
listtwo = [4, 5, 6]
newlist = listone + listtwo
print(newlist)
# Second-easiest method
newlist = listone.copy()
newlist.extend(listtwo)
print(newlist)
In the second method, I assign newlist to a copy of the listone, because I don't want to change listone.
# Third method
newlist = listone.copy()
for j in listtwo:
    newlist.append(j)
print(newlist)
This is the least attractive way to concatenate lists because it appends one element at a time in a Python-level for loop. The asymptotic cost is the same, O(n), but the per-element overhead makes it slower than the other two methods.
The most common ways to concatenate lists are the plus operator and the built-in method append, for example:
lst = [1,2]
lst = lst + [3]
# lst == [1,2,3]
lst.append(3)
# lst == [1,2,3,3]
lst.append([3,4])
# lst == [1,2,3,3,[3,4]]
In most cases this works, but append adds its argument as a single element, so appending a list nests it instead of extending the original. When that is not what you want, use the extend method, which adds each element of any iterable:
lst = [1,2]
lst.extend([3,4])
# lst == [1,2,3,4]
A really concise way to combine a list of lists is (in Python 3, reduce has to be imported from functools)
from functools import reduce
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
reduce(list.__add__, list_of_lists)
which gives us
[1, 2, 3, 4, 5, 6, 7, 8, 9]
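For comparison, here is a hedged sketch that runs the reduce version next to itertools.chain.from_iterable, which other answers in this thread point to. The variable names flat_reduce and flat_chain are only illustrative:

```python
from functools import reduce  # reduce is not a builtin in Python 3
from itertools import chain

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# reduce builds a new intermediate list at every step.
flat_reduce = reduce(list.__add__, list_of_lists)

# chain.from_iterable walks the sublists lazily, flattening one level.
flat_chain = list(chain.from_iterable(list_of_lists))

print(flat_reduce)
print(flat_chain)
```

Both produce the same flat list; the chain version avoids the repeated copying, which may matter when there are many sublists.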
So there are two easy ways.
Using +: It creates a new list from provided lists
Example:
In [1]: a = [1, 2, 3]
In [2]: b = [4, 5, 6]
In [3]: a + b
Out[3]: [1, 2, 3, 4, 5, 6]
In [4]: %timeit a + b
10000000 loops, best of 3: 126 ns per loop
Using extend: It appends the elements of the second list to the existing list. That means it does not create a separate list.
Example:
In [1]: a = [1, 2, 3]
In [2]: b = [4, 5, 6]
In [3]: %timeit a.extend(b)
10000000 loops, best of 3: 91.1 ns per loop
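Thus, of the two most popular methods, extend is slightly faster in this measurement because it modifies a in place instead of allocating a new list. Note that %timeit calls a.extend(b) repeatedly, so a keeps growing during the benchmark and the two timings are not strictly comparable.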
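The practical difference between the two is easier to see in a short sketch than in the timings. The names a_before and alias are introduced here only for illustration:

```python
a = [1, 2, 3]
b = [4, 5, 6]

c = a + b            # builds a new list; a is left untouched
a_before = list(a)   # snapshot of a before it is mutated

alias = a            # second name for the same list object
a.extend(b)          # mutates a in place and returns None

print(c)             # the concatenation as a separate object
print(a)             # a itself now holds all six elements
print(alias is a)    # the alias sees the change too
```

If other code still holds a reference to the original list, extend changes what that code sees, whereas + does not.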
You could also just use sum.
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> sum([a, b], [])
[1, 2, 3, 4, 5, 6]
>>>
This works for any length and any element type of list:
>>> a = ['a', 'b', 'c', 'd']
>>> b = [1, 2, 3, 4]
>>> c = [1, 2]
>>> sum([a, b, c], [])
['a', 'b', 'c', 'd', 1, 2, 3, 4, 1, 2]
>>>
The reason I add [] is that the start argument defaults to 0, so sum would begin by evaluating 0 + [1, 2, 3], which raises an error. If we set start to [], it begins with [] + [1, 2, 3], which works as expected.
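A minimal sketch of the behaviour described above, showing the failure without a start value and the working call with one (the variable names error and merged are illustrative):

```python
a = [1, 2, 3]
b = [4, 5, 6]

error = None
try:
    sum([a, b])              # start defaults to 0, so this tries 0 + [1, 2, 3]
except TypeError as exc:
    error = exc
    print("without start:", exc)

merged = sum([a, b], [])     # evaluates [] + a + b
print(merged)
```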
I assume you want one of the two methods:
Keep duplicate elements
It is very easy. Just concatenate like a string:
def concat_list(l1,l2):
    l3 = l1+l2
    return l3
Next, if you want to eliminate duplicate elements
def concat_list(l1,l2):
    l3 = []
    for i in [l1,l2]:
        for j in i:
            if j not in l3:
                # Check if element exists in final list, if no then add element to list
                l3.append(j)
    return l3
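The membership test j not in l3 scans the result list for every element. As a possible alternative, assuming all elements are hashable and Python 3.7 or later, dict.fromkeys gives the same order-preserving result. The function name concat_unique is made up for this sketch:

```python
def concat_unique(l1, l2):
    """Concatenate two lists, keeping only the first occurrence of each element."""
    # dict keys are unique and keep insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(l1 + l2))


print(concat_unique([1, 2, 2, 3], [3, 4, 1, 5]))
```

This does not work for unhashable elements such as nested lists; in that case the explicit loop above is still the way to go.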
The solutions above concatenate flat lists. If you have lists within a list and need to merge the corresponding sublists, the "+" operation inside a for loop does the work.
a = [[1,2,3], [4,5,6]]
b = [[0,1,2], [7,8,9]]
cc = []
for i in range(len(a)):
    cc.append(a[i] + b[i])
print(cc)
Output: [[1, 2, 3, 0, 1, 2], [4, 5, 6, 7, 8, 9]]
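The same pairwise merge can also be sketched with zip instead of indexing, assuming both outer lists have the same length (zip silently stops at the shorter one). The name merged is illustrative:

```python
a = [[1, 2, 3], [4, 5, 6]]
b = [[0, 1, 2], [7, 8, 9]]

# zip pairs up the sublists at the same position; + concatenates each pair.
merged = [x + y for x, y in zip(a, b)]

print(merged)
```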

Iterating over only a part of list in Python

I have a list in python, that consists of both alphabetic and numeric elements, say something like list = ["a", 1, 2, 3, "b", 4, 5, 6] and I want to slice it into 2 lists, containing numbers that follow the alphabetic characters, so list1 = [1, 2, 3] and list2 = [4, 5, 6]. a and b elements could be in reversed order, but generally, I want to store numeric elements that follow a and b elements in separate lists. The easiest solution that I came up with was creating a loop with condition:
#Generating a list for numeric elements following "a":
for e in list[list.index("a")+1:]:
    if not str.isdigit(e):
        break
    else:
        list1.append(e)
I'd do it similarly for list2 and numeric elements after "b".
But could there be more elegant solutions? I'm new to Python, but I've seen beautiful one-liner constructions, could there be something like that in my case? Thanks in advance.
Something like this, maybe?
>>> import itertools
>>> import numbers
>>> lst = ["a", 1, 2, 3, "b", 4, 5, 6]
>>> groups = itertools.groupby(lst, key=lambda x: isinstance(x, numbers.Number))
>>> result = [[x for x in group_iter] for is_number, group_iter in groups if is_number]
>>> result
[[1, 2, 3], [4, 5, 6]]
And here is a less “sexy” version that outputs a list of tuple pairs (group_key, group_numbers):
>>> import itertools
>>> import numbers
>>> lst = ["a", 1, 2, 3, "b", 4, 5, 6]
>>> groups = itertools.groupby(lst, key=lambda x: isinstance(x, numbers.Number))
>>> group_key = None
>>> result = []
>>> for is_number, group_iter in groups:
...     if not is_number:
...         for x in group_iter:
...             group_key = x
...     else:
...         result.append((group_key, [x for x in group_iter]))
>>> result
[('a', [1, 2, 3]), ('b', [4, 5, 6])]
Note that it is a quick and dirty version which expects the input data to be well-formed.
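Here is a functional approach (Python 2 syntax: the tuple-unpacking lambda and the list-returning zip and map do not work the same way in Python 3):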
>>> l = ["a", 1, 2, 3, "b", 4, 5, 6]
>>> dig = [x for (x, y) in enumerate(l) if type(y) is str] + [len(l)]
>>> dig
[0, 4, 8]
>>> slices = zip(map(lambda x:x+1, dig), dig[1:])
>>> slices
[(1, 4), (5, 8)]
>>> lists = map(lambda (i, e): l[i:e], slices)
>>> lists
[[1, 2, 3], [4, 5, 6]]
First we get the indices of the letters; notice that we also need the length of the list to mark where the last group ends:
[x for (x, y) in enumerate(l) if type(y) is str] + [len(l)]
Then we get the pair of slices where the lists are:
zip(map(lambda x:x+1, dig), dig[1:])
Finally, we get each slice from the original list:
map(lambda (i, e): l[i:e], slices)
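The three steps above can be sketched in Python 3 as follows. This is a port under the assumption that the markers are the only str elements in the list; the comprehensions replace the lambdas, which Python 3 no longer allows to unpack tuples:

```python
l = ["a", 1, 2, 3, "b", 4, 5, 6]

# Step 1: indices of the string markers, plus the list length as final boundary.
dig = [i for i, item in enumerate(l) if isinstance(item, str)] + [len(l)]

# Step 2: pair (marker index + 1) with the next boundary to get slice bounds.
slices = list(zip((i + 1 for i in dig), dig[1:]))

# Step 3: cut each slice out of the original list.
lists = [l[start:end] for start, end in slices]

print(dig)
print(slices)
print(lists)
```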
You can use slices:
list = ["a", 1, 2, 3, "b", 4, 5, 6]
lista = list[list.index('a')+1:list.index('b')]
listb = list[list.index('b')+1:]
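The slices above assume that 'a' comes before 'b'. Since the question says the markers may appear in reversed order, here is a hedged sketch that does not depend on the order, using itertools.takewhile. The helper name numbers_after is hypothetical:

```python
from itertools import takewhile


def numbers_after(items, marker):
    """Return the run of ints that directly follows `marker` in `items`."""
    start = items.index(marker) + 1  # raises ValueError if marker is absent
    # Stop at the first element that is not an int (for example the next marker).
    return list(takewhile(lambda x: isinstance(x, int), items[start:]))


data = ["b", 4, 5, 6, "a", 1, 2, 3]
list_a = numbers_after(data, "a")
list_b = numbers_after(data, "b")
print(list_a, list_b)
```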
Another approach (Python 3 only):
def chunks(values, idx=0):
    ''' Yield successive integer values delimited by a character. '''
    tmp = []
    for idx, val in enumerate(values[1:], idx):
        if not isinstance(val, int):
            yield from chunks(values[idx + 1:], idx)
            break
        tmp.append(val)
    yield tmp
>>> values = ['a', 1, 2, 3, 'b', 4, 5, 6]
>>> list(chunks(values))
[[4, 5, 6], [1, 2, 3]]

How to get items in a list before and after a specified index

Suppose I have the list f=[1,2,3] and index i -- I want to iterate over f, excluding i. Is there a way I can use i to split the list, something like f[:i:], where I would be given a new list of [1,3] when ran with i=1?
Code I'm trying to fit this into:
# permutations, excluding self addition
# <something here> would be f excluding f[x]
f = [1,2,3]
r = [x + y for x in f for y in <something here>]
# Expected Output (notice absence of any f[i]+f[i])
[3, 4, 3, 5, 4, 5]
Use enumerate() in order to have access to index at iteration time.
[item for i, item in enumerate(f) if i != 3]
In this case you can skip the intended index, or, if you have a set of indices, you can check membership with in:
[item for i, item in enumerate(f) if i not in {3, 4, 5}]
If you want to remove an item in a certain index you can use del statement:
>>> l = ['a', 'b', 'c', 'd', 'e']
>>>
>>> del l[3]
>>> l
['a', 'b', 'c', 'e']
>>>
If you want to create a new list without that item and preserve the main list, you can use simple slicing (starting again from the original five-element list):
>>> l = ['a', 'b', 'c', 'd', 'e']
>>> new = l[:3] + l[4:]
>>> new
['a', 'b', 'c', 'e']
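Applied to the comprehension from the question, the slicing idea can be sketched like this (f and r are the names used in the question):

```python
f = [1, 2, 3]

# For each position i, pair f[i] with every element except the one at i.
r = [x + y for i, x in enumerate(f) for y in f[:i] + f[i + 1:]]

print(r)  # [3, 4, 3, 5, 4, 5]
```

Because the exclusion is done by index, repeated values in f are still added to each other; only an element's sum with itself is left out.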
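Iterate over the indices rather than comparing values, so that equal values at different positions are still paired:
f = [10,20,30,40,50,60]
r = [x + y for i, x in enumerate(f) for j, y in enumerate(f) if i != j]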
Probably not the most elegant solution, but this might work:
f = [1,2,3,4,5]
for i, x in enumerate(f):
    if i == 0:
        new_list = f[1:]
    elif i == len(f) -1:
        new_list = f[:-1]
    else:
        new_list = f[:i]+f[i+1:]
    print i, new_list
prints:
0 [2, 3, 4, 5]
1 [1, 3, 4, 5]
2 [1, 2, 4, 5]
3 [1, 2, 3, 5]
4 [1, 2, 3, 4]
Well, it may seem scary but that's a one-liner that does the work:
>>> from numpy import array
>>> import itertools
>>> list(itertools.chain(*(i+array(l) for i,l in zip(reversed(f), itertools.combinations(f, len(f)-1)))))
[3, 4, 3, 5, 4, 5]
If you look at it slowly, it's not so complicated:
itertools.combinations gives all the combinations of length len(f)-1:
>>> list(itertools.combinations(f, len(f)-1))
[(1, 2), (1, 3), (2, 3)]
You wrap it with zip and reversed(f) so you can get each combination together with the missing value:
>>> [(i,l) for i,l in zip(reversed(f), itertools.combinations(f, len(f)-1))]
[(3, (1, 2)), (2, (1, 3)), (1, (2, 3))]
Then you convert l to a numpy.array so you can add the missing value:
>>> list((i+array(l) for i,l in zip(reversed(f), itertools.combinations(f, len(f)-1))))
[array([4, 5]), array([3, 5]), array([3, 4])]
And finally you use itertools.chain to get the desired result.
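If pulling in numpy is not desirable, a simpler sketch with itertools.permutations gives the same list for the question's input. This is an alternative technique, not the method used in the one-liner above:

```python
import itertools

f = [1, 2, 3]

# permutations(f, 2) yields every ordered pair of elements at different positions.
r = [x + y for x, y in itertools.permutations(f, 2)]

print(r)  # [3, 4, 3, 5, 4, 5]
```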

Slicing sublists with different lengths

I have a list of lists. Each sublist has a length that varies between 1 and 100. Each sublist contains a particle ID at different times in a set of data. I would like to form lists of all particle IDs at a given time. To do this I could use something like:
list = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8]]
list2 = [item[0] for item in list]
list2 would contain the first elements of each sublist in list. I would like to do this operation not just for the first element, but for every element between 1 and 100. My problem is that element number 100 (or 66 or 77 or whatever) does not exists for every sublist.
Is there some way of creating a list of lists, where each sublist is the list of all particle IDs at a given time?
I have thought about trying to use numpy arrays to solve this problem, as if the lists were all the same length this would be trivial. I have tried adding -1's to the end of each list to make them all the same length, and then masking the negative numbers, but this hasn't worked for me so far. I will use the list of IDs at a given time to slice another separate array:
pos = pos[satIDs]
lst = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8]]
func = lambda x: [line[x] for line in lst if len(line) > x]
func(3)
[4, 8, 7]
func(4)
[5, 8]
--update--
func = lambda x: [ (line[x],i) for i,line in enumerate(lst) if len(line) > x]
func(4)
[(5, 0), (8, 2)]
You could use itertools.zip_longest. This will zip the lists together and insert None when one of the lists is exhausted.
>>> import itertools
>>> lst = [[1,2,3,4,5],['A','B','C'],['a','b','c','d','e','f','g']]
>>> list(itertools.zip_longest(*lst))
[(1, 'A', 'a'),
(2, 'B', 'b'),
(3, 'C', 'c'),
(4, None, 'd'),
(5, None, 'e'),
(None, None, 'f'),
(None, None, 'g')]
If you don't want the None elements, you can filter them out:
>>> [[x for x in sublist if x is not None] for sublist in itertools.zip_longest(*lst)]
[[1, 'A', 'a'], [2, 'B', 'b'], [3, 'C', 'c'], [4, 'd'], [5, 'e'], ['f'], ['g']]
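Filtering on None drops genuine None values from the data as well. If that can happen, one option is to pass a private sentinel object as fillvalue and filter on that instead. A sketch, with the name _MISSING and the sample data chosen only for illustration:

```python
from itertools import zip_longest

# Sample data in which None is a legitimate value.
lst = [[1, None, 3], ["A"], ["a", "b"]]

_MISSING = object()  # unique marker that cannot collide with real data

columns = [
    [x for x in column if x is not _MISSING]
    for column in zip_longest(*lst, fillvalue=_MISSING)
]

print(columns)
```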
Approach #1
One almost* vectorized approach assigns each element an ID based on its position within its sublist, sorts by that ID, and splits the result at the ID boundaries, like so (np is NumPy, imported with import numpy as np) -
def position_based_slice(L):
    # Get lengths of each element in input list
    lens = np.array([len(item) for item in L])
    # Form ID array that has *ramping* IDs within an element starting from 0
    # and restarts with a new element at 0
    id_arr = np.ones(lens.sum(),int)
    id_arr[lens[:-1].cumsum()] = -lens[:-1]+1
    # Get order maintained sorted indices for sorting flattened version of list
    ids = np.argsort(id_arr.cumsum(),kind='mergesort')
    # Get sorted version and split at boundaries decided by lengths of ids
    vals = np.take(np.concatenate(L),ids)
    cut_idx = np.where(np.diff(ids)<0)[0]+1
    return np.split(vals,cut_idx)
*There is a loop comprehension involved at the start, but being meant to collect just the lengths of the input elements of the list, its effect on the total runtime should be minimal.
Sample run -
In [76]: input_list = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8],[3,2]]
In [77]: position_based_slice(input_list)
Out[77]:
[array([1, 2, 1, 3]), # input_list[ID=0]
array([2, 6, 3, 2]), # input_list[ID=1]
array([3, 7, 6]), # input_list[ID=2]
array([4, 8, 7]), # input_list[ID=3]
array([5, 8])] # input_list[ID=4]
Approach #2
Here's another approach that creates a 2D array, which is easier to index and trace back to the original input elements. This uses NumPy broadcasting along with boolean indexing. The implementation would look something like this -
def position_based_slice_2Dgrid(L):
    # Get lengths of each element in input list
    lens = np.array([len(item) for item in L])
    # Create a mask of valid places in a 2D grid mapped version of list
    mask = lens[:,None] > np.arange(lens.max())
    out = np.full(mask.shape,-1,dtype=int)
    out[mask] = np.concatenate(L)
    return out
Sample run -
In [126]: input_list = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8],[3,2]]
In [127]: position_based_slice_2Dgrid(input_list)
Out[127]:
array([[ 1, 2, 3, 4, 5],
[ 2, 6, 7, 8, -1],
[ 1, 3, 6, 7, 8],
[ 3, 2, -1, -1, -1]])
Each column of the output now holds the IDs at one position (time step), with -1 marking the sublists that are too short to have an entry there.
If you want it as a one-line comprehension that returns a list of lists, you can do this:
list2 = [[item[i] for item in list if len(item) > i] for i in range(0, 100)]
And if you want to know which sublist each ID came from, you can do this (enumerate is used rather than list.index(item), which would return the wrong position when two sublists are equal):
list2 = [{idx: item[i] for idx, item in enumerate(list) if len(item) > i} for i in range(0, 100)]
list2 would be like this:
[{0: 1, 1: 2, 2: 1}, {0: 2, 1: 6, 2: 3}, {0: 3, 1: 7, 2: 6}, {0: 4, 1: 8, 2: 7},
{0: 5, 2: 8}, {}, {}, ... ]
You could pad your short lists with numpy.nan and afterwards create a numpy array (this is Python 2 code; in Python 3 use itertools.zip_longest and print())
import numpy
import itertools
lst = [[1,2,3,4,5],[2,6,7,8],[1,3,6,7,8,9]]
arr = numpy.array(list(itertools.izip_longest(*lst, fillvalue=numpy.nan)))
Afterwards you can use numpy slicing as usual.
print arr
print arr[1, :] # [2, 6, 3]
print arr[4, :] # [5, nan, 8]
print arr[5, :] # [nan, nan, 9]
