Flatten lists of variable depths in Python

I have a list of n lists. Each internal list contains a combination of (a) strings, (b) the empty list, or (c) a list containing one string. I would like to transform the inside lists so they only contain the strings.
I have a list like this for example:
[[[],["a"],"a"],[["ab"],[],"abc"]]
and I would like it to be like this:
[["","a","a"],["ab","","abc"]]
I know I could probably go through with a loop but I am looking for a more elegant solution, preferably with a list comprehension.

List comprehension:
>>> original = [[[],["a"],"a"],[["ab"],[],"abc"]]
>>> result = [['' if not item else ''.join(item) for item in sublist] for sublist in original]
>>> result
[['', 'a', 'a'], ['ab', '', 'abc']]

As every element of the list that you'd like to flatten is iterable, you can rely on duck typing instead of checking whether each one is an instance of some class (list, str):
>>> my_list = [[[],["a"],"a"],[["ab"],[],"abc"]]
>>> [list(map(lambda x: ''.join(x), elem)) for elem in my_list]
Or a more readable version:
result = []
for elem in my_list:
    flatten = map(lambda x: ''.join(x), elem)
    result.append(list(flatten))
Result:
[['', 'a', 'a'], ['ab', '', 'abc']]
It's quite Pythonic not to check what something is, but rather to rely on an operation that every one of the structures supports: here ''.join works on an empty list, a list of strings, and a plain string alike.

Via list comprehension:
lst = [[[],["a"],"a"],[["ab"],[],"abc"]]
result = [ ['' if not v else (v[0] if isinstance(v, list) else v) for v in sub_l]
for sub_l in lst ]
print(result)
The output:
[['', 'a', 'a'], ['ab', '', 'abc']]

original_list = [[[],["a"],"a"],[["ab"],[],"abc"]]
flatten = lambda x: "" if x == [] else x[0] if isinstance(x, list) else x
flattened_list = [[flatten(i) for i in j] for j in original_list]
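
The answers above differ in one detail: some join the contents of an inner list, others take its first element. For the input in the question both give the same result. A minimal sketch with a named helper (to_text is a hypothetical name, not taken from any answer) makes the joining variant explicit:

```python
def to_text(item):
    """Return a string unchanged; join the strings of a list into one string."""
    if isinstance(item, str):
        return item
    # An empty list joins to '', a one-element list to its single string.
    return "".join(item)

original = [[[], ["a"], "a"], [["ab"], [], "abc"]]
result = [[to_text(item) for item in sublist] for sublist in original]
print(result)  # [['', 'a', 'a'], ['ab', '', 'abc']]
```

If an inner list could ever hold more than one string, joining would concatenate them, whereas the x[0] variants would silently drop all but the first.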

Related

Removing strings from a list dependent on their length?

I am trying to remove every string longer than 5 characters from this list of strings, but the code seems to remove strings at random and I don't know why. Please help. The "item for sublist" part of the code just flattens the list of lists into a plain list of strings.
list2 = [['name'],['number'],['continue'],['stop'],['signify'],['tester'],['racer'],['stopping']]
li = [item for sublist in list2 for item in sublist]
var=0
for words in li:
    if len(li[var])>5:
        li.pop()
    var+=1
print(li)
The output is: ['name', 'number', 'continue', 'stop', 'signify']
Just include the check when flattening the list:
list2 = [['name'],['number'],['continue'],['stop'],['signify'],['tester'],['racer'],['stopping']]
li = [item for sublist in list2 for item in sublist if len(item) <= 5]
['name', 'stop', 'racer']
You can use a list comprehension to build a new list with only items that are 5 or less in length.
>>> l = ['123456', '123', '12345', '1', '1234567', '12', '1234567']
>>> l = [x for x in l if len(x) <= 5]
>>> l
['123', '12345', '1', '12']
Or use filter on the original nested list (each match stays wrapped in its one-element list):
list(filter(lambda x: len(x[0]) <= 5, list2))
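
As for why the loop in the question misbehaves: li.pop() with no argument always removes the last element rather than the one just tested, and the list shrinks while it is being iterated, so the loop also stops early. If an explicit loop is preferred, a sketch that collects the short strings into a new list (kept is a hypothetical name) avoids both problems:

```python
list2 = [['name'], ['number'], ['continue'], ['stop'],
         ['signify'], ['tester'], ['racer'], ['stopping']]
li = [item for sublist in list2 for item in sublist]

kept = []
for word in li:
    # Keep the word only if it has at most 5 characters;
    # li itself is never modified while we iterate over it.
    if len(word) <= 5:
        kept.append(word)

print(kept)  # ['name', 'stop', 'racer']
```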

How to split a list into smaller lists python

I have a nested list that looks something like:
lst = [['ID1', 'A'],['ID1','B'],['ID2','AAA'], ['ID2','DDD']...]
Is it possible for me to split the lst into small lists by their ID so that each small list contained elements with the same ID? The results should look something looks like:
lst1 = [['ID1', 'A'], ['ID1', 'B']...]
lst2 = [['ID2', 'AAA'], ['ID2', 'DDD']...]
You can use groupby:
from itertools import groupby
grp_lists = []
for i, grp in groupby(lst, key=lambda x: x[0]):
    grp_lists.append(list(grp))
print(grp_lists[0])
[['ID1', 'A'], ['ID1', 'B']]
print(grp_lists[1])
[['ID2', 'AAA'], ['ID2', 'DDD']]
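
Note that itertools.groupby only groups consecutive items with equal keys, so the snippet above relies on the list already being ordered by ID. A small sketch, assuming the input may arrive unordered (unsorted_lst is made-up sample data), sorts by the same key first:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which the IDs are interleaved.
unsorted_lst = [['ID2', 'AAA'], ['ID1', 'A'], ['ID2', 'DDD'], ['ID1', 'B']]

by_id = itemgetter(0)
# sorted() is stable, so items keep their relative order within each ID.
groups = [list(grp) for _, grp in groupby(sorted(unsorted_lst, key=by_id), key=by_id)]

print(groups[0])  # [['ID1', 'A'], ['ID1', 'B']]
print(groups[1])  # [['ID2', 'AAA'], ['ID2', 'DDD']]
```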
using collections.defaultdict:
lst = [['ID1', 'A'],['ID1','B'],['ID2','AAA'], ['ID2','DDD']]
from collections import defaultdict
result = defaultdict(list)
for item in lst:
    result[item[0]].append(item)
print(list(result.values()))
output:
[[['ID1', 'A'], ['ID1', 'B']], [['ID2', 'AAA'], ['ID2', 'DDD']]]
Without external functions: build a set of unique indexes, then loop over the original list building a new list for each of the indexes and filling it with list items that contain that index:
lst = [['ID1', 'A'],['ID1','B'],['ID2','AAA'], ['ID2','DDD']]
unique_set = set(elem[0] for elem in lst)
lst2 = [ [elem for elem in lst if elem[0] == every_unique] for every_unique in unique_set]
print (lst2)
Result (the order of the groups may vary, because sets are unordered):
[[['ID2', 'AAA'], ['ID2', 'DDD']], [['ID1', 'A'], ['ID1', 'B']]]
(It is possible to move unique_set into the final line, making it a one-liner. But that would make it less clear what happens.)
If you want to get separate variables like your example of a result:
lst1 = [sub_lst for sub_lst in lst if sub_lst[0] == 'ID1']
and
lst2 = [sub_lst for sub_lst in lst if sub_lst[0] == 'ID2']
from that, you can make a function:
def create_sub_list(id_str, original_lst):
    return [x for x in original_lst if x[0] == id_str]
And call it like that:
lst1 = create_sub_list('ID1', lst)
If you want a dictionary of the sub-lists, for easier access, you can use:
from functools import reduce
def reduce_dict(ret_dict, sub_lst):
    if sub_lst[0] not in ret_dict:
        ret_dict[sub_lst[0]] = sub_lst[1:]
    else:
        ret_dict[sub_lst[0]] += sub_lst[1:]
    return ret_dict
grouped_dict = reduce(reduce_dict, lst, dict())
(If there is only ever one string after each ID, sub_lst[1:] is already a one-element list, so the code works as it is. Do not replace it with sub_lst[1]: the += would then concatenate the strings into 'AB' instead of building ['A', 'B'].)
And then to access the elements of the dictionary you use the ID strings:
print(grouped_dict['ID1'])
This will print:
['A', 'B']
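
The same dictionary can also be built without reduce. A minimal sketch using dict.setdefault (the name grouped is hypothetical), assuming each sub-list starts with its ID:

```python
lst = [['ID1', 'A'], ['ID1', 'B'], ['ID2', 'AAA'], ['ID2', 'DDD']]

grouped = {}
for key, *values in lst:
    # setdefault returns the list stored under key, creating it on first use.
    grouped.setdefault(key, []).extend(values)

print(grouped['ID1'])  # ['A', 'B']
print(grouped['ID2'])  # ['AAA', 'DDD']
```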

'NoneType' object is not iterable?

def merge_list(list1,list2):
    res_list=[]
    list2=list2.reverse()
    conv = lambda i : i or ''
    res = [conv(i) for i in list2]
    for i in range(0,len(list1)):
        res_list.append(list1[i]+list2[i])
    merged_data=' '.join(res_list)
    return merged_data
list1 = ['A', 'app', 'a', 'd','ke','th','doc','awa']
list2=['y','tor','e','eps','ay',None,'le','n']
data=merge_list(list1,list2)
data
I'm trying to reverse list2 and concatenate the strings from both the lists to get a string as result. The objective is to ignore None in the list if there is any, and print the final sentence.
The error is that you have used
list2=list2.reverse()
The list.reverse() method reverses the list in place and returns None, so the assignment replaces list2 with None. To fix this, change the line to just read:
list2.reverse()
Also, you have used
res_list.append(list1[i]+list2[i])
This should probably read
res_list.append(list1[i]+res[i])
These corrections give the output:
'An apple a day keeps the doctor away'
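
Putting both corrections together, the function might look like the sketch below. It uses reversed() rather than list2.reverse(), which is a deliberate change from the fix above so that the caller's list is left untouched:

```python
def merge_list(list1, list2):
    # Reverse a copy of list2 and replace None (or any falsy value) with ''.
    res = [item or '' for item in reversed(list2)]
    # Pair up the two lists and concatenate each pair.
    res_list = [a + b for a, b in zip(list1, res)]
    return ' '.join(res_list)

list1 = ['A', 'app', 'a', 'd', 'ke', 'th', 'doc', 'awa']
list2 = ['y', 'tor', 'e', 'eps', 'ay', None, 'le', 'n']
data = merge_list(list1, list2)
print(data)  # An apple a day keeps the doctor away
```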
Another way to solve this problem is as a one-liner, using zip, list slicing to reverse list2, and a conditional expression to apply the None -> '' logic.
>>> list1 = ['A', 'app', 'a', 'd', 'ke', 'th', 'doc', 'awa']
>>> list2 = ['y', 'tor', 'e', 'eps', 'ay', None, 'le', 'n']
>>> ' '.join(''.join(y if y else '' for y in x) for x in zip(list1, list2[::-1]))
'An apple a day keeps the doctor away'
1) Replace None in both the lists:
>>> list1 = [x if x else "" for x in list1]
>>> list2 = [x if x else "" for x in list2]
2) zip and iterate:
>>> lis=[]
>>> for x,y in zip(list1,list2[::-1]):
...     lis.append(x+y)
3) join:
>>> " ".join(lis)
'An apple a day keeps the doctor away'
CDJB's one liner combines all these steps

How to create a list from lists?

fp.readline()
for line in fp:
    line_lst = line.strip().split(',')
Suppose I get a bunch of lists after running the code above:
['a','b','c']['1','2','3']['A','B','C']
How could I get another set of lists
['a','1','A']['b','2','B']['c','3','C']
from the lists above, instead of creating them directly?
Assuming all lists have the same number of elements:
my_lists = [my_list_1, my_list2, ...]
for i in range(len(my_lists[0])):
    print([l[i] for l in my_lists])
An example...
>>> my_lists = [['a','b','c'],['1','2','3'],['A','B','C']]
>>> for i in range(len(my_lists[0])):
...     print([l[i] for l in my_lists])
...
['a', '1', 'A']
['b', '2', 'B']
['c', '3', 'C']
You could use
all_list = [['a','b','c'],['1','2','3'],['A','B','C']]
result = [x[0] for x in all_list]
print(result)
This is called a list comprehension in Python.
For your needs, though, you should use the zip function; the link was given by @Arc676.
all_list = [['a','b','c'],['1','2','3'],['A','B','C']]
# result = list(zip(all_list[0], all_list[1], all_list[2]))
# if all_list holds more lists, unpack it with * instead
result = list(zip(*all_list))
print(result)
You could try something like below.
var aggregateList = line_lst.map(function (value) { return value[0]; });
EDIT: Oops, thought I was in the javascript section, for python:
aggregateList = map(lambda value: value[0], line_lst)
This should do it... but it's really just getting the first item from each list, so I'm not sure if it's exactly what you want
new_list = []
for sub_list in line_lst:
    new_list.append(sub_list[0])
you could also do this (which I like better)
new_list = [sub_list[0] for sub_list in line_lst]
Just:
list(zip(['a','b','c'], ['1','2','3'], ['A','B','C']))
Which gives:
[('a', '1', 'A'), ('b', '2', 'B'), ('c', '3', 'C')]
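
To tie the zip approach back to the file-reading loop in the question, here is a minimal sketch. It assumes the file has one header line followed by comma-separated rows of equal length, and it uses io.StringIO with made-up contents as a stand-in for the real file object fp:

```python
import io

# Stand-in for the opened file; the contents are invented for illustration.
fp = io.StringIO("header\na,b,c\n1,2,3\nA,B,C\n")

fp.readline()  # skip the first line, as in the question
rows = [line.strip().split(',') for line in fp]

# zip(*rows) pairs up the i-th items of every row; list() turns each tuple into a list.
columns = [list(col) for col in zip(*rows)]
print(columns)  # [['a', '1', 'A'], ['b', '2', 'B'], ['c', '3', 'C']]
```

If the rows can differ in length, zip stops at the shortest one, so trailing items of longer rows would be dropped.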

How to convert list of intable strings to int

In Python, I want to convert a list of strings:
l = ['sam','1','dad','21']
and convert the integers to integer types like this:
t = ['sam',1,'dad',21]
I tried:
t = [map(int, x) for x in l]
but it raises an error.
How could I convert all intable strings in a list to int, leaving other elements as strings?
My list might be multi-dimensional. A method which works for a generic list would be preferable:
l=[['aa','2'],['bb','3']]
I'd use a custom function:
def try_int(x):
    try:
        return int(x)
    except ValueError:
        return x
Example:
>>> [try_int(x) for x in ['sam', '1', 'dad', '21']]
['sam', 1, 'dad', 21]
Edit: If you need to apply the above to a list of lists, why not convert those strings to int while building the nested list?
Anyway, if you need to, it's just a matter of choosing how to iterate over the nested list and apply the function above.
One way of doing that might be:
>>> list_of_lists = [['aa', '2'], ['bb', '3']]
>>> [[try_int(x) for x in lst] for lst in list_of_lists]
[['aa', 2], ['bb', 3]]
You can obviously reassign that to list_of_lists:
>>> list_of_lists = [[try_int(x) for x in lst] for lst in list_of_lists]
How about using map and lambda:
>>> list(map(lambda x:int(x) if x.isdigit() else x,['sam','1','dad','21']))
['sam', 1, 'dad', 21]
or with a list comprehension:
>>> [int(x) if x.isdigit() else x for x in ['sam','1','dad','21']]
['sam', 1, 'dad', 21]
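As mentioned in the comments, isdigit does not recognise negative numbers. The condition below treats a string as a number if it is alphanumeric and not purely alphabetic. Be aware that it does not accept a minus sign either ('-10'.isalnum() is False), and that a mixed string such as 'a1' passes the test and then makes int() raise ValueError, so a try/except approach is safer for general input: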
>>> [int(x) if x.isalnum() and not x.isalpha() else x for x in ['sam','1','dad','21']]
['sam', 1, 'dad', 21]
I would create a generator to do it:
def intify(lst):
    for i in lst:
        try:
            i = int(i)
        except ValueError:
            pass
        yield i
lst = ['sam','1','dad','21']
intified_list = list(intify(lst))
# or if you want to modify an existing list
# lst[:] = intify(lst)
If you want this to work on a list of lists, just:
new_list_of_lists = list(map(list, map(intify, list_of_lists)))
For multidimensional lists, a recursive technique may help.
from collections.abc import Iterable
def intify(maybeLst):
    try:
        return int(maybeLst)
    except (ValueError, TypeError):
        if isinstance(maybeLst, Iterable) and not isinstance(maybeLst, str):
            return [intify(i) for i in maybeLst]  # here we call intify itself!
        else:
            return maybeLst
maybeLst = [[['sam', 2],'1'],['dad','21']]
print(intify(maybeLst))
Use isdigit() to check each character in the string to see if it is a digit.
Example:
mylist = ['foo', '3', 'bar', '9']
t = [ int(item) if item.isdigit() else item for item in mylist ]
print(t)
Use a list comprehension to check whether each list item is numeric.
str.isnumeric returns False for a string with a leading minus sign.
Use str.lstrip to remove the -, check .isnumeric, and convert to int if the check passes.
Alternatively, use str.isdigit in place of .isnumeric.
Keep all values in the list
l = ['sam', '1', 'dad', '21', '-10']
t = [int(v) if v.lstrip('-').isnumeric() else v for v in l]
print(t)
>>> ['sam', 1, 'dad', 21, -10]
Remove non-numeric values
l = ['sam', '1', 'dad', '21', '-10']
t = [int(v) for v in l if v.lstrip('-').isnumeric()]
print(t)
>>> [1, 21, -10]
Nested list
l = [['aa', '2'], ['bb', '3'], ['sam', '1', 'dad', '21', '-10']]
t = [[int(v) if v.lstrip('-').isnumeric() else v for v in x] for x in l]
print(t)
>>> [['aa', 2], ['bb', 3], ['sam', 1, 'dad', 21, -10]]
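
One caveat with the lstrip('-') check: it strips every leading minus sign, so a string such as '--5' passes the test and int() then raises ValueError. A sketch that combines the try/except idea with recursion sidesteps this by letting int() itself decide (convert is a hypothetical name, and only nested lists are handled here):

```python
def convert(value):
    """Turn int-like strings into ints, recursing into nested lists."""
    if isinstance(value, list):
        return [convert(v) for v in value]
    try:
        return int(value)
    except (ValueError, TypeError):
        # Not something int() accepts: leave the value unchanged.
        return value

data = ['sam', '1', 'dad', '21', '-10', ['aa', '2'], '--5']
print(convert(data))  # ['sam', 1, 'dad', 21, -10, ['aa', 2], '--5']
```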
