Combining lists of lists in Python

I want to combine a list of lists. Here is a sample:
mylist = [[['a', 'b'], ['c', 'd']],
[['e', 'f'], ['g', 'h']]]
and the output should be:
output = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
I also tried using itertools, but here is what it returned:
>>> combined = list(itertools.chain.from_iterable(mylist))
>>> combined
[['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
How can I achieve ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']?
Can anyone point out what I'm missing?

The itertools method didn't work because what you have isn't a list of lists, but a list of lists of lists. itertools is working properly; it's just flattening the list once. Calling the same function again on the partially flattened result will work:
flat = list(itertools.chain.from_iterable(itertools.chain.from_iterable(mylist)))
Or, a simple list comprehension solution:
flat = [item for slist in mylist for sslist in slist for item in sslist]
This basically translates to:
flat = []
for slist in mylist:
    for sslist in slist:
        for item in sslist:
            flat.append(item)
Keep in mind, both these solutions are only good for dealing with double nesting. If there is a chance you will have to deal with even more nesting, I suggest you look up how to flatten arbitrarily nested lists.
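For deeper or uneven nesting, one possible approach is a recursive generator. The sketch below is not taken from the answers here: the name `flatten` is hypothetical, and it assumes that only lists and tuples count as containers, so strings are kept whole.

```python
def flatten(nested):
    """Yield the leaf items of an arbitrarily nested list/tuple, left to right."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            # Recurse into sub-containers instead of yielding them whole.
            yield from flatten(item)
        else:
            yield item


mylist = [[['a', 'b'], ['c', 'd']],
          [['e', 'f'], ['g', 'h']]]
flat = list(flatten(mylist))
print(flat)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Because it recurses once per nesting level, extremely deep structures could hit Python's recursion limit.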

As others have noted, you have two levels here so you need two calls to chain. But you don't actually need the from_iterable call; you can use the * syntax instead:
list(itertools.chain(*itertools.chain(*mylist)))

With numpy.ndarray.flatten():
import numpy as np
mylist = [ [['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']] ]
a = np.array(mylist).flatten().tolist()
print(a)
The output:
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Related

remove lists with same elements but in different order, from a list of lists

I want to filter a list of lists for duplicates. I consider two lists to be a duplicate of each other when they contain the same elements but not necessarily in the same order. So for example
[['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
should become
[['A', 'B', 'C'], ['D', 'B', 'A']]
since ['C', 'B', 'A'] is a duplicate of ['A', 'B', 'C'].
It does not matter which one of the duplicates gets removed, as long as the final list of lists no longer contains any duplicates. All lists need to keep the order of their elements, so using set() may not be an option.
I found these related questions:
Determine if 2 lists have the same elements, regardless of order? ,
How to efficiently compare two unordered lists (not sets)?
But they only talk about how to compare two lists, not how to efficiently remove duplicates. I'm using Python.
Using a dictionary comprehension:
>>> data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> result = {tuple(sorted(i)): i for i in data}.values()
>>> result
dict_values([['C', 'B', 'A'], ['D', 'B', 'A']])
>>> list( result )
[['C', 'B', 'A'], ['D', 'B', 'A']]
You can use frozenset
>>> x = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> [list(s) for s in set([frozenset(item) for item in x])]
[['A', 'B', 'D'], ['A', 'B', 'C']]
Or, with map:
>>> [list(s) for s in set(map(frozenset, x))]
[['A', 'B', 'D'], ['A', 'B', 'C']]
If you want to keep the order of elements:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = set()
result = []
for obj in data:
    if frozenset(obj) not in seen:
        result.append(obj)
        seen.add(frozenset(obj))
Output:
[['A', 'B', 'C'], ['D', 'B', 'A']]
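Note that `frozenset` ignores how often an element occurs, so `['A', 'A', 'B']` and `['A', 'B', 'B']` would be treated as duplicates of each other. If that matters, a sorted tuple can serve as the key instead. A sketch, using the hypothetical helper name `unique_unordered` and assuming the elements are sortable and hashable:

```python
def unique_unordered(lists):
    """Keep the first list of each group that has the same elements in any order."""
    seen = set()
    result = []
    for lst in lists:
        key = tuple(sorted(lst))  # order-independent, but keeps repeat counts
        if key not in seen:
            seen.add(key)
            result.append(lst)  # the original list, element order untouched
    return result


data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A'],
        ['A', 'A', 'B'], ['A', 'B', 'B']]
deduped = unique_unordered(data)
print(deduped)
# [['A', 'B', 'C'], ['D', 'B', 'A'], ['A', 'A', 'B'], ['A', 'B', 'B']]
```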
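If you do not need to keep the order of elements, you can use groupby. Note that groupby only merges consecutive items with the same key, so this relies on the duplicates being adjacent: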
from itertools import groupby
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
print([k for k, _ in groupby(data, key=sorted)])
Output:
[['A', 'B', 'C'], ['A', 'B', 'D']]
Rather than removing items from the list while iterating over it, build a new list and append only the items you have not seen before.
The simplest way is as follows:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = []  # sorted copies of the lists kept so far
temp = []
for i in data:
    key = sorted(i)
    if key not in seen:
        seen.append(key)
        temp.append(i)
print(temp)

Combination of elements in a list with constraints

I am writing Python code and I need help with a task. I have a list of 8 elements
[A,B,C,D,E,F,G,H]
and I need to find all the combinations of shorter lists (4 elements) in lexicographic order such that two elements are taken from the subset A,C,E,G and the other two from B,D,F,H. I know about the itertools library, but I don't know how to combine its functions properly to perform this task.
The wording of the question is unclear, but I think this is what you want:
array = ['f','g','d','e','c','b','h','a']
first = sorted(array[::2]) # ['c', 'd', 'f', 'h']
second = sorted(array[1::2]) # ['a', 'b', 'e', 'g']
I need the set of all the new lists with length 4 such that the first two elements are taken from A,C,E,G and the other two are from B,D,F,H and I need them to be in lexicographic order.
We get the possible starting letters and ending letters then combine all possible pairs of each of them into all_lists:
from itertools import combinations
lst = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
starters = lst[::2] # ['A', 'C', 'E', 'G']
enders = lst[1::2] # ['B', 'D', 'F', 'H']
all_lists = []
for a in combinations(starters, 2):
    for b in combinations(enders, 2):
        all_lists.append(sorted(a + b))
print(all_lists) # Gives [['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'F'], ['A', 'B', 'C', 'H'], ['A', 'C', 'D', 'F'], ['A', 'C', 'D', 'H'], ['A', 'C', 'F', 'H'], ...
print(all_lists == sorted(all_lists)) # False now
(Updated to sort each mini-list.)
Come to think of it you could maybe do the second part with itertools.product.
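A sketch of that idea, which is not part of the original answer: `itertools.product` pairs every 2-combination of the starters with every 2-combination of the enders. Each mini-list is sorted as in the updated answer, and sorting the outer list afterwards is one way to put the results in lexicographic order.

```python
from itertools import combinations, product

lst = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
starters = lst[::2]   # ['A', 'C', 'E', 'G']
enders = lst[1::2]    # ['B', 'D', 'F', 'H']

# product() replaces the two nested for loops; the outer sorted()
# orders the finished mini-lists lexicographically.
all_lists = sorted(
    sorted(a + b)
    for a, b in product(combinations(starters, 2), combinations(enders, 2))
)
print(len(all_lists))  # 36
print(all_lists[0])    # ['A', 'B', 'C', 'D']
```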

Iterate lists at intervals based on list values

I've been trying to accomplish this in a few different ways and just can't quite seem to get it to work for me.
I'm trying to iterate over a list in blocks, where the first index value is an integer for how many elements are in the first block. After that, another integer with n elements, and another, etc.
Example:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
I want to read 3, pull 'a', 'b', 'c' from the list and perform some operation on them.
Then return to the list at 2, pull 'd', 'e' - more operations, etc.
Or even just using the integers to split into sub-lists would work.
I'm thinking list slicing with updated [start:stop:step] variables but am having trouble pulling it together.
Any suggestions?
Can only use the standard Python library.
You could create a generator to iterate lazily on the parts of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']

def parts(lst):
    idx = 0
    while idx < len(lst):
        part_length = lst[idx]
        yield lst[idx + 1: idx + part_length + 1]
        idx += part_length + 1

for part in parts(test):
    print(part)
Output:
['a', 'b', 'c']
['d', 'e']
['f', 'g', 'h']
If your input structure is always like this you can do the following:
result = [test[i:i+j] for i, j in enumerate(test, 1) if isinstance(j, int)]
print(result)
# [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
Using an iterator on the list makes this super simple. Just grab the next item which tells you how much more to grab next, and so on until the end of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
it = iter(test)
for num in it:
    print(", ".join(next(it) for _ in range(num)))
which prints:
a, b, c
d, e
f, g, h
You can also convert this to a list if you need to save the result:
>>> it = iter(test)
>>> [[next(it) for _ in range(num)] for num in it]
[['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
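The `next(it)` calls above assume the list is well formed. If the data might be cut short, `itertools.islice` is one way to take each block and then check its length. A sketch; the name `split_blocks` and the choice to raise `ValueError` are assumptions made for illustration, not part of the answer above.

```python
from itertools import islice


def split_blocks(seq):
    """Split [n, x1..xn, m, y1..ym, ...] into [[x1..xn], [y1..ym], ...]."""
    it = iter(seq)
    blocks = []
    for count in it:
        block = list(islice(it, count))  # take up to `count` items
        if len(block) != count:
            raise ValueError(f"expected {count} items, got {len(block)}")
        blocks.append(block)
    return blocks


test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
blocks = split_blocks(test)
print(blocks)  # [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
```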

slice a list up to a given element

If you have a list my_list = ['a', 'd', 'e', 'c', 'b', 'f'] and you want to construct a sublist, containing all elements up to a given one, for example my_list_up_to_c = ['a', 'd', 'e'], how can this be done in a way that scales easily? Also can this be made faster by using numpy arrays?
The least amount of code would probably be using .index() (note that this searches only up to the first occurrence of the element in the list):
>>> my_list = ['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list
['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list[:my_list.index('c')] # excluding the specified element
['a', 'd', 'e']
>>> my_list[:my_list.index('c')+1] # including the specified element
['a', 'd', 'e', 'c']
The time complexity of the call to .index() is O(n), meaning it will at most iterate once over the list. The list slicing has complexity O(k) (according to this source), meaning it depends on the size of the slice.
So in the worst case the element you look for is at the end of the list, so your search will run till the end of the list (O(n)) and the slice will copy the whole list as well (also O(n)), resulting in a worst case of O(2n) which is still linear complexity.
Use index() to get the first occurrence of a list item. Then use the slice notation to get the desired part of the list.
>>> my_list = ['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list[:my_list.index('c')]
['a', 'd', 'e']
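One caveat: `list.index()` raises `ValueError` when the element is not in the list. A small sketch of a helper that handles this case; the name `up_to` and the decision to return the whole list when the element is missing are assumptions made for illustration.

```python
def up_to(items, stop, inclusive=False):
    """Return the items before the first occurrence of `stop`."""
    try:
        end = items.index(stop)
    except ValueError:
        return items[:]  # `stop` not found: return a copy of everything
    return items[:end + 1] if inclusive else items[:end]


my_list = ['a', 'd', 'e', 'c', 'b', 'f']
print(up_to(my_list, 'c'))                  # ['a', 'd', 'e']
print(up_to(my_list, 'c', inclusive=True))  # ['a', 'd', 'e', 'c']
print(up_to(my_list, 'z'))                  # ['a', 'd', 'e', 'c', 'b', 'f']
```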
The itertools solution
In[9]: from itertools import takewhile
In[10]: my_list = ['a', 'd', 'e', 'c', 'b', 'f']
In[11]: list(takewhile(lambda x: x != 'c', my_list))
Out[11]: ['a', 'd', 'e']
In Haskell it would be
takeWhile ((/=) 'c') "adecbf"

Merging a list of lists

How do I merge a list of lists?
[['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']]
into
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
Even better if I can add a value on the beginning and end of each item before merging the lists, like html tags.
i.e., the end result would be:
['<tr>A</tr>', '<tr>B</tr>', '<tr>C</tr>', '<tr>D</tr>', '<tr>E</tr>', '<tr>F</tr>', '<tr>G</tr>', '<tr>H</tr>', '<tr>I</tr>']
Don't use sum(), it is slow for joining lists.
Instead a nested list comprehension will work:
>>> x = [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']]
>>> [elem for sublist in x for elem in sublist]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
>>> ['<tr>' + elem + '</tr>' for elem in _]
The advice to use itertools.chain was also good.
import itertools
x = [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']]
print(['<tr>%s</tr>' % elem for elem in itertools.chain.from_iterable(x)])
You can use sum, but I think that is kinda ugly because you have to pass the [] parameter. As Raymond points out, it will also be expensive. So don't use sum.
To concatenate the lists, you can use sum
values = sum([['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']], [])
To add the HTML tags, you can use a list comprehension.
html_values = ['<tr>' + i + '</tr>' for i in values]
Use itertools.chain:
>>> import itertools
>>> list(itertools.chain(*mylist))
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
Wrapping the elements in HTML can be done afterwards.
>>> ['<tr>' + x + '</tr>' for x in itertools.chain(*mylist)]
['<tr>A</tr>', '<tr>B</tr>', '<tr>C</tr>', '<tr>D</tr>', '<tr>E</tr>', '<tr>F</tr>',
'<tr>G</tr>', '<tr>H</tr>', '<tr>I</tr>']
Note that if you are trying to generate valid HTML you may also need to HTML escape some of the content in your strings.
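For example, with the standard library's `html.escape` (Python 3). This is only a sketch, and the sample `rows` data is made up for illustration:

```python
import html
from itertools import chain

rows = [['A & B', '<C>'], ['D']]  # made-up sample data
# Escape each value before wrapping it, so &, < and > cannot break the markup.
cells = ['<tr>' + html.escape(x) + '</tr>' for x in chain.from_iterable(rows)]
print(cells)
# ['<tr>A &amp; B</tr>', '<tr>&lt;C&gt;</tr>', '<tr>D</tr>']
```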
