Grouping two lists of lists in Python based on value - python

I have two lists in Python.
list1 = ['a','a','b','a','c','b','c','a','d','a','b']
list2 = ['1','2','21','12','1','32','11','12','21','3','31']
I have to group the similar elements in list1. The corresponding elements in list2 should also get grouped based on this. Output should be this:
list1 = [['a','a','a','a','a'],['b','b','b'],['c','c'],['d']]
list2 = [['1','2','12','12','3'],['21','32','31'],['1','11'],['21']]
What is the best way to do this?

If you do not care about the order of elements in the first list, you may use defaultdict:
In [7]: from collections import defaultdict
In [8]: from itertools import izip
In [9]: res = defaultdict(list)
In [10]: for k, v in izip(list1, list2):
....:     res[k].append(v)
....:
In [11]: print(res)
defaultdict(<type 'list'>, {'a': ['1', '2', '12', '12', '3'], 'c': ['1', '11'], 'b': ['21', '32', '31'], 'd': ['21']})
In [12]: res.items()
Out[12]:
[('a', ['1', '2', '12', '12', '3']),
('c', ['1', '11']),
('b', ['21', '32', '31']),
('d', ['21'])]
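If the exact list-of-lists output from the question is needed, the same idea can be pushed one step further. The following is only a minimal Python 3 sketch: `group_parallel` is a hypothetical helper name (not from any answer here), and it assumes Python 3.7+, where a plain dict keeps keys in insertion order, so groups come out in order of first appearance.

```python
def group_parallel(keys, values):
    """Group values by key, keeping groups in order of first appearance."""
    if len(keys) != len(values):
        raise ValueError("both lists must have the same length")
    groups = {}  # plain dicts preserve insertion order on Python 3.7+
    for k, v in zip(keys, values):
        groups.setdefault(k, []).append(v)
    # Rebuild the grouped key list: one copy of the key per grouped value
    grouped_keys = [[k] * len(vs) for k, vs in groups.items()]
    grouped_values = list(groups.values())
    return grouped_keys, grouped_values


list1 = ['a', 'a', 'b', 'a', 'c', 'b', 'c', 'a', 'd', 'a', 'b']
list2 = ['1', '2', '21', '12', '1', '32', '11', '12', '21', '3', '31']
new1, new2 = group_parallel(list1, list2)
print(new1)
print(new2)
```

Because the groups are stored in a dict, each element is visited once, which avoids the repeated `index` lookups of a nested-loop approach.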

This code worked for me:
groups = list(set(list1))  # note: set order is arbitrary, so group order may vary
list1_tmp, list2_tmp = [], []
for char in groups:
    list1_tmp.append([])
    list2_tmp.append([])
for i in range(len(list1)):
    list1_tmp[groups.index(list1[i])].append(list1[i])
    list2_tmp[groups.index(list1[i])].append(list2[i])
list1 = list1_tmp
list2 = list2_tmp
The output should be valid as well for any other similar input.

This code should do it:
final_list1 = []
final_list2 = []
for distinct in sorted(list(set(list1))):
    index = 0
    distinct_list1 = []
    distinct_list2 = []
    for element in list1:
        if element == distinct:
            distinct_list1.append(element)
            distinct_list2.append(list2[index])
        index += 1
    final_list1.append(distinct_list1)
    final_list2.append(distinct_list2)
list1 = final_list1
list2 = final_list2
This will give you exactly the output you asked for. If you don't really care about the exact output format, there are probably better ways, as @soon proposed.

Here's a (kind of ugly) implementation that would do the trick:
list1 = ['a','a','b','a','c','b','c','a','d','a','b']
list2 = ['1','2','21','12','1','32','11','12','21','3','31']
def transform(in_list, other_list):
    if len(in_list) != len(other_list):
        raise ValueError("Lists must have the same length!")
    out_list = list()
    out_other_list = list()
    for i, c in enumerate(in_list):
        # look for an existing group that already holds this element
        for inner_list, inner_other_list in zip(out_list, out_other_list):
            if c in inner_list:
                inner_list.append(c)
                inner_other_list.append(other_list[i])
                break
        else:
            # no group found (loop ended without break): start a new one
            out_list.append([c])
            out_other_list.append([other_list[i]])
    return out_list, out_other_list
print(transform(list1, list2))

Though I personally like soon's answer, this one successfully retrieves your desired output.
lst = sorted(zip(list1, list2), key=lambda x: x[0])
intList = []
initial = lst[0][0]
count = 0
for index, value in enumerate(lst):
    if value[0] == initial:
        continue
    else:
        intList.append(lst[count:index])
        initial = value[0]
        count = index
intList.append(lst[count:])  # the last group is never closed inside the loop
finList1 = [[a for a, b in innerList] for innerList in intList]
finList2 = [[b for a, b in innerList] for innerList in intList]
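The sort-then-slice approach above can also be written with `itertools.groupby` from the standard library. This is a different technique from the one in the answer, shown here only as a sketch; the names `pairs`, `fin1` and `fin2` are made up for illustration. It relies on `sorted` being stable, so values keep their original relative order inside each group.

```python
from itertools import groupby
from operator import itemgetter

list1 = ['a', 'a', 'b', 'a', 'c', 'b', 'c', 'a', 'd', 'a', 'b']
list2 = ['1', '2', '21', '12', '1', '32', '11', '12', '21', '3', '31']

# groupby only merges adjacent equal keys, so the pairs must be sorted by key first
pairs = sorted(zip(list1, list2), key=itemgetter(0))

fin1, fin2 = [], []
for key, group in groupby(pairs, key=itemgetter(0)):
    group = list(group)  # materialize the group iterator before reusing it
    fin1.append([a for a, b in group])
    fin2.append([b for a, b in group])

print(fin1)
print(fin2)
```

Note that the groups come out in sorted key order, not in order of first appearance; for this input the two happen to coincide.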

Related

How to convert a flat list into a dictionary in python?

I have a flat list containing information of multiple variables and need to convert it into a dictionary. For example, 'a','b','c' are variable names and need to be the keys in the dictionary. The list could be split by '_' and ':'.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
The desired output would be:
dict_x = {'a':[1,2,4],'b':[45,24,78],'c':['abc','def','xxx']}
I am not sure how to loop to get the keys for the dictionary since it is the same for all elements in the list.
lst = [y.split(":") for x in [x.split("_") for x in list_x] for y in x]
d = {x:[] for x in set([x[0] for x in lst])}
for k, v in lst:
    d[k].append(v)
# Out[40]: {'a': ['1', '2', '4'], 'c': ['abc', 'def', 'xxx'], 'b': ['45', '24', '78']}
Try this method (explanation inline as code comments) -
#Function to turn a list of tuples into a dict after converting integers and keeping string types.
def convert(tup):
    di = {}
    for a, b in tup:
        if b.isdecimal(): #convert to int if possible
            b = int(b)
        di.setdefault(a, []).append(b)
    return di
#convert the input into a list of tuples
k = [tuple(j.split(':')) for i in list_x for j in i.split('_')]
#convert list of tuples into dict
convert(k)
{'a': [1, 2, 4], 'b': [45, 24, 78], 'c': ['abc', 'def', 'xxx']}
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
result_dict = {}
for list_element in list_x:
    key_val_pair = list_element.split('_')
    for key_val in key_val_pair:
        key, val = key_val.split(':')
        if key not in result_dict:
            result_dict[key] = []
        result_dict[key].append(val)
print(result_dict)
The dictionary must map each string key to a list. That is why the code checks whether the key is already present: if it is, the value is appended to the existing list; if it is not, a new key is added with a list containing only that value.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
print(list_x)
dic_x = dict()
for x in list_x:
    keyValueList = x.split('_')
    for keyValue in keyValueList:
        split = keyValue.split(':')
        key = split[0]
        value = split[1]
        if key in dic_x:
            dic_x[key].append(value)
        else:
            dic_x.update({key: [value]})
print(dic_x)
Assuming strings in your list_x always have the same format as: a:integer_b:integer_c:string, you can do this:
dict_x = {'a':[],'b':[],'c':[]}
for s in list_x:
    sl = s.split('_')
    dict_x['a'].append(int(sl[0][2:]))
    dict_x['b'].append(int(sl[1][2:]))
    dict_x['c'].append(sl[2][2:])
Maybe this can solve your problem in an easy way, without being too verbose or too compact. It is versatile, so you can add as many identifiers as you want, but as you can see, they must all follow the same format.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
dict_x ={}
for val in list_x:
    elements = val.split('_')
    for el in elements:
        key, value = el.split(':')[0], el.split(':')[1]
        if dict_x.get(key) is None: # the key is seen for the first time
            dict_x[key] = [value]
        else: # the key was already seen, so the value is appended
            dict_x[key].append(value)
print(dict_x)
As you can see, the core is the if statement that checks whether the key already exists: if it does not, a new list containing the first value is created; otherwise the value is appended to the existing list.
First split each string based on _ as delimiter and then split it based on : as delimiter, and add each item to a dict
>>> list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
>>>
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for s in list_x:
...     for kv in s.split('_'):
...         k,v = kv.split(':')
...         d[k].append(int(v) if v.isdigit() else v)
...
>>> dict(d)
{'a': [1, 2, 4], 'b': [45, 24, 78], 'c': ['abc', 'def', 'xxx']}
First, let's split each string by '_' and then by ':'
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
def parse(s):
    return [t.split(":") for t in s.split("_")]
parsed_to_lists = [parse(st) for st in list_x]
We now have
[[['a', '1'], ['b', '45'], ['c', 'abc']], [['a', '2'], ['b', '24'], ['c', 'def']], [['a', '4'], ['b', '78'], ['c', 'xxx']]]
we can flatten that by
flat_list = [item for sublist in parsed_to_lists for item in sublist]
flat_list
Which returns
[['a', '1'], ['b', '45'], ['c', 'abc'], ['a', '2'], ['b', '24'], ['c', 'def'], ['a', '4'], ['b', '78'], ['c', 'xxx']]
We want the result as a dictionary of lists, so let's create an empty one
from collections import defaultdict
res = defaultdict(list)
and fill it
for k,v in flat_list:
    res[k].append(v)
res
defaultdict(<class 'list'>, {'a': ['1', '2', '4'], 'b': ['45', '24', '78'], 'c': ['abc', 'def', 'xxx']})
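Several of the answers above share the same core steps: split on '_', split on ':', and append to a per-key list. They can be combined into one small function. The following is only a sketch under a few assumptions: `parse_records` and its parameter names are hypothetical, and only values made up entirely of decimal digits are converted to `int` (so negative numbers and floats would stay as strings).

```python
from collections import defaultdict


def parse_records(records, item_sep='_', kv_sep=':'):
    """Collect 'key:value' items from every record into a dict of lists."""
    result = defaultdict(list)
    for record in records:
        for item in record.split(item_sep):
            # Split only once, so a value may itself contain the separator
            key, value = item.split(kv_sep, 1)
            # Assumption: only all-decimal-digit strings are turned into ints
            result[key].append(int(value) if value.isdecimal() else value)
    return dict(result)


list_x = ['a:1_b:45_c:abc', 'a:2_b:24_c:def', 'a:4_b:78_c:xxx']
dict_x = parse_records(list_x)
print(dict_x)
```

Returning `dict(result)` rather than the `defaultdict` itself means a later lookup of a missing key raises `KeyError` instead of silently creating an empty list.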

How to create a list from lists?

fp.readline()
for line in fp:
    line_lst = line.strip().split(',')
Suppose I get a bunch of lists after running the code above:
['a','b','c']['1','2','3']['A','B','C']
How could I get another set of lists
['a','1','A']['b','2','B']['c','3','C']
from these lists, instead of creating it directly?
Assuming all lists have the same number of elements:
my_lists = [my_list_1, my_list2, ...]
for i in range(len(my_lists[0])):
    print([l[i] for l in my_lists])
An example...
>>> my_lists = [['a','b','c'],['1','2','3'],['A','B','C']]
>>> for i in range(len(my_lists[0])):
...     print([l[i] for l in my_lists])
...
['a', '1', 'A']
['b', '2', 'B']
['c', '3', 'C']
You could use
all_list = [['a','b','c'],['1','2','3'],['A','B','C']]
result = [x[0] for x in all_list]
print(result)
This is called list comprehension in Python.
For your need, you should use the zip function here; the link was given by @Arc676.
all_list = [['a','b','c'],['1','2','3'],['A','B','C']]
# result = list(zip(all_list[0], all_list[1], all_list[2]))
# if you have more list in all_list, you could use this
result = list(zip(*all_list))
print(result)
You could try something like below.
var aggregateList = line_lst.map(function (value) { return value[0]; });
EDIT: Oops, thought I was in the javascript section, for python:
aggregateList = map(lambda value: value[0], line_lst)
This should do it... but it's really just getting the first item from each list, so I'm not sure if it's exactly what you want
new_list = []
for sub_list in line_lst:
    new_list.append(sub_list[0])
you could also do this (which I like better)
new_list = [sub_list[0] for sub_list in line_lst]
Just:
zip(['a','b','c'], ['1','2','3'], ['A','B','C'])
Which gives:
[('a', '1', 'A'), ('b', '2', 'B'), ('c', '3', 'C')]
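To connect `zip` back to the file-reading loop in the question, here is a minimal Python 3 sketch. The file contents are made up, and `io.StringIO` merely stands in for the real file object `fp`; the names `rows` and `columns` are also illustrative. In Python 3 `zip` returns an iterator of tuples, so the result is converted to lists explicitly.

```python
import io

# Stand-in for the real file; these contents are invented for illustration
fp = io.StringIO("header\na,b,c\n1,2,3\nA,B,C\n")

fp.readline()  # skip the first line, as in the question
rows = [line.strip().split(',') for line in fp]  # collect every row first

# zip(*rows) pairs up the i-th item of every row; it stops at the shortest row
columns = [list(col) for col in zip(*rows)]
print(columns)
```

The key change from the question's loop is that each `line_lst` is kept in `rows` instead of being overwritten on every iteration.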

I want to move a item to first index in list. How to simplify code?

This is my code.
lst=['0','1','2','3','4']
i = lst.index('2')
lst.pop(i)
tmp=[]
tmp.append('2')
tmp.extend(lst)
lst = tmp
print(lst)  # output: ['2','0','1','3','4']
Now I want to write prettier code. I think there is room for improvement, so I hope someone can explain and guide me. Thanks!
sorted([0,1,2,3,4,5], key=lambda x: x == 2, reverse=True)
As an alternative answer you can use slicing :
>>> i = lst.index('2')
>>> ['2']+lst[:i]+lst[i+1:]
['2', '0', '1', '3', '4']
You can embed it inside a function :
>>> def move(elem,l):
...     i = l.index(elem)
...     return [elem]+l[:i]+l[i+1:]
...
>>> move('2',lst)
['2', '0', '1', '3', '4']
Doing it in place (altering the list):
lst = lst.pop(lst.index(2)) and (not lst.insert(0, 2)) and lst
Creating a new list for the result:
[2] + (lst.pop(lst.index(2)) and lst)
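The in-place variant can also be written without relying on the truthiness of the popped value (the `and` chain above breaks if the element is falsy, such as `0` or `''`). A small sketch, where `move_to_front` is a hypothetical helper name:

```python
def move_to_front(items, elem):
    """Move the first occurrence of elem to index 0, modifying items in place."""
    # list.index raises ValueError if elem is not present
    items.insert(0, items.pop(items.index(elem)))
    return items


lst = ['0', '1', '2', '3', '4']
move_to_front(lst, '2')
print(lst)
```

`pop` removes the element and returns it, and `insert(0, ...)` puts it back at the front, so no temporary list is needed.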

Add entry to beginning of list and remove the last one

I have a list of about 40 entries. And I frequently want to append an item to the start of the list (with id 0) and want to delete the last entry (with id 40) of the list.
How do I do this the best?
Example with 5 entries:
[0] = "herp"
[1] = "derp"
[2] = "blah"
[3] = "what"
[4] = "da..."
after adding "wuggah" and deleting last it should be like:
[0] = "wuggah"
[1] = "herp"
[2] = "derp"
[3] = "blah"
[4] = "what"
I don't want to end up manually moving every entry, one after another, to the next id.
Use collections.deque:
>>> import collections
>>> q = collections.deque(["herp", "derp", "blah", "what", "da.."])
>>> q.appendleft('wuggah')
>>> q.pop()
'da..'
>>> q
deque(['wuggah', 'herp', 'derp', 'blah', 'what'])
Use insert() to place an item at the beginning of the list:
myList.insert(0, "wuggah")
Use pop() to remove and return an item in the list. Pop with no arguments pops the last item in the list
myList.pop() #removes and returns "da..."
Use collections.deque
In [21]: from collections import deque
In [22]: d = deque([], 3)
In [24]: for c in '12345678':
....:     d.appendleft(c)
....:     print(d)
....:
deque(['1'], maxlen=3)
deque(['2', '1'], maxlen=3)
deque(['3', '2', '1'], maxlen=3)
deque(['4', '3', '2'], maxlen=3)
deque(['5', '4', '3'], maxlen=3)
deque(['6', '5', '4'], maxlen=3)
deque(['7', '6', '5'], maxlen=3)
deque(['8', '7', '6'], maxlen=3)
Here's a one-liner, but it probably isn't as efficient as some of the others ...
myList=["wuggah"] + myList[:-1]
Also note that it creates a new list, which may not be what you want ...
Another approach
L = ["herp", "derp", "blah", "what", "da..."]
L[:0]= ["wuggah"]
L.pop()
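Applied to the five-entry example from the question, a bounded deque does both steps in one call. This is a small sketch; the variable name `entries` is made up, and `maxlen=5` matches the example (it would be 40 for the real list).

```python
from collections import deque

entries = deque(["herp", "derp", "blah", "what", "da..."], maxlen=5)

# The deque is already full, so appending on the left drops the rightmost item
entries.appendleft("wuggah")
print(list(entries))
```

With `maxlen` set there is no separate `pop()` call: once the deque is full, each `appendleft` discards one item from the opposite end.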

How to convert list of intable strings to int

In Python, I want to convert a list of strings:
l = ['sam','1','dad','21']
and convert the integers to integer types like this:
t = ['sam',1,'dad',21]
I tried:
t = [map(int, x) for x in l]
but is showing an error.
How could I convert all intable strings in a list to int, leaving other elements as strings?
My list might be multi-dimensional. A method which works for a generic list would be preferable:
l=[['aa','2'],['bb','3']]
I'd use a custom function:
def try_int(x):
    try:
        return int(x)
    except ValueError:
        return x
Example:
>>> [try_int(x) for x in ['sam', '1', 'dad', '21']]
['sam', 1, 'dad', 21]
Edit: If you need to apply the above to a list of lists, why didn't you convert those strings to int while building the nested list?
Anyway, if you need to, it is just a matter of choosing how to iterate over the nested list and apply the method above.
One way of doing that might be:
>>> list_of_lists = [['aa', '2'], ['bb', '3']]
>>> [[try_int(x) for x in lst] for lst in list_of_lists]
[['aa', 2], ['bb', 3]]
You can obviously reassign that to list_of_lists:
>>> list_of_lists = [[try_int(x) for x in lst] for lst in list_of_lists]
How about using map and lambda
>>> list(map(lambda x:int(x) if x.isdigit() else x,['sam','1','dad','21']))
['sam', 1, 'dad', 21]
or with List comprehension
>>> [int(x) if x.isdigit() else x for x in ['sam','1','dad','21']]
['sam', 1, 'dad', 21]
>>>
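As mentioned in the comments, isdigit does not capture negative numbers. Here is an alternative condition that treats a string as a number if it is alphanumeric and not purely alphabetic. Note that it still does not accept a leading minus sign (the '-' is not alphanumeric), and mixed strings such as 'a1' would pass the test and make int() raise ValueError: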
>>> [int(x) if x.isalnum() and not x.isalpha() else x for x in ['sam','1','dad','21']]
['sam', 1, 'dad', 21]
I would create a generator to do it:
def intify(lst):
    for i in lst:
        try:
            i = int(i)
        except ValueError:
            pass
        yield i
lst = ['sam','1','dad','21']
intified_list = list(intify(lst))
# or if you want to modify an existing list
# lst[:] = intify(lst)
If you want this to work on a list of lists, just:
new_list_of_lists = list(map(list, map(intify, list_of_lists)))
For multidimensional lists, a recursive technique may help.
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10
def intify(maybeLst):
    try:
        return int(maybeLst)
    except (ValueError, TypeError):
        if isinstance(maybeLst, Iterable) and not isinstance(maybeLst, str):
            return [intify(i) for i in maybeLst] # here we call intify itself!
        else:
            return maybeLst
maybeLst = [[['sam', 2],'1'],['dad','21']]
print(intify(maybeLst))
Use isdigit() to check each character in the string to see if it is a digit.
Example:
mylist = ['foo', '3', 'bar', '9']
t = [ int(item) if item.isdigit() else item for item in mylist ]
print(t)
Use a list comprehension to validate the numeracy of each list item.
str.isnumeric won't pass a negative sign
Use str.lstrip to remove the -, check .isnumeric, and convert to int if it is.
Alternatively, use str.isdigit in place of .isnumeric.
Keep all values in the list
l = ['sam', '1', 'dad', '21', '-10']
t = [int(v) if v.lstrip('-').isnumeric() else v for v in l]
print(t)
>>> ['sam', 1, 'dad', 21, -10]
Remove non-numeric values
l = ['sam', '1', 'dad', '21', '-10']
t = [int(v) for v in l if v.lstrip('-').isnumeric()]
print(t)
>>> [1, 21, -10]
Nested list
l = [['aa', '2'], ['bb', '3'], ['sam', '1', 'dad', '21', '-10']]
t = [[int(v) if v.lstrip('-').isnumeric() else v for v in x] for x in l]
print(t)
>>> [['aa', 2], ['bb', 3], ['sam', 1, 'dad', 21, -10]]
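The string-method checks above (`isdigit`, `isnumeric`, `lstrip('-')`) each miss some inputs, so another option is to let `int()` decide and recurse into nested lists. The following is only a sketch: `to_int_deep` is a hypothetical name, and it assumes the nesting is made of plain `list` objects.

```python
def to_int_deep(obj):
    """Convert int-like strings to int at any nesting depth; leave the rest alone."""
    if isinstance(obj, list):
        return [to_int_deep(item) for item in obj]
    if isinstance(obj, str):
        try:
            # int() accepts a leading sign and surrounding whitespace
            return int(obj)
        except ValueError:
            return obj
    return obj  # values that are neither str nor list are returned unchanged


data = [['aa', '2'], ['bb', '-3'], ['sam', ['1', 'dad'], 21]]
converted = to_int_deep(data)
print(converted)
```

Checking for `list` before `str` keeps the recursion from ever iterating over the characters of a string.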
