Splitting Nested List at every ['-1'] - python

If I have a nested list like this:
[['01'], ['02'], ['-1'], ['03'], ['04']]
Is there a way to split this nested list at every ['-1']?
So that it looks like this:
[[['01'], ['02']], [['03'], ['04']]]
Any sort of help would be appreciated :)

You can use itertools.groupby to group at every occurrence of your split value (here ['-1']). The if not k filter ensures that we leave out the split value itself.
orig = [['01'], ['02'], ['-1'], ['03'], ['04']]
from itertools import groupby
n = [list(g) for k, g in groupby(orig, lambda x: x == ['-1']) if not k]
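To see how groupby behaves at the edges, here is a small sketch that wraps the same expression in a function (the name split_at is made up for illustration) and runs it on a list with leading, trailing and consecutive separators. Adjacent separators land in one group, which is then dropped, so no empty sub-lists come out.

```python
from itertools import groupby

def split_at(items, sep):
    """Split items into runs of non-separator elements (hypothetical helper)."""
    # groupby yields (key, group) pairs for consecutive elements sharing a key;
    # the key is True for runs of separators, and those runs are skipped.
    return [list(group)
            for is_sep, group in groupby(items, lambda x: x == sep)
            if not is_sep]

orig = [['-1'], ['01'], ['02'], ['-1'], ['-1'], ['03'], ['04'], ['-1']]
print(split_at(orig, ['-1']))
# [[['01'], ['02']], [['03'], ['04']]]
```

If you need the empty groups (the way str.split keeps empty strings), groupby alone will not give them to you.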

Try this,
lists = [['01'], ['02'], ['-1'], ['03'], ['04'], ['-1'], ['05'], ['-1']]
results = list()
prev_idx = 0
for idx, l in enumerate(lists):
    if l == ['-1']:
        results.append(lists[prev_idx:idx])
        prev_idx = idx+1
if prev_idx <= idx: # the last group might be [] as shown in this case
    results.append(lists[prev_idx:])
print(results)
# Output
[[['01'], ['02']], [['03'], ['04']], [['05']]]

Seems like a use case for groupby
>>> from itertools import groupby
>>> l = [['01'], ['02'], ['-1'], ['03'], ['04'], ['-1'], ['05'], ['06']]
>>> [list(g) for k,g in groupby(l, lambda x: x == ['-1']) if not k]
[[['01'], ['02']], [['03'], ['04']], [['05'], ['06']]]
itertools.groupby docs

A good old fashioned loop should do it:
l = [['01'], ['02'], ['-1'], ['03'], ['04']]
new = []
current = [] # Build a new list here
for i, item in enumerate(l):
    if item != ['-1']:
        current.append(item)
        if i == len(l) - 1: # If the item is the last in the list
            new.append(current)
    else:
        new.append(current)
        current = []
>>> [[['01'], ['02']], [['03'], ['04']]]
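The loop above emits an empty list for a leading separator or for two adjacent separators, but not for a trailing one. If you want empty groups handled uniformly, the way str.split does it, a sketch along these lines might help (split_keep_empty is a made-up name, not something from the answers here):

```python
def split_keep_empty(items, sep):
    """Split like str.split: every separator starts a new group (hypothetical helper)."""
    groups = [[]]
    for item in items:
        if item == sep:
            groups.append([])        # start a new, possibly empty, group
        else:
            groups[-1].append(item)  # extend the group currently being built
    return groups

data = [['01'], ['-1'], ['-1'], ['02'], ['-1']]
print(split_keep_empty(data, ['-1']))
# [[['01']], [], [['02']], []]
```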

Related

What is an easy way to remove duplicates from only part of the string in Python?

I have a list of strings that goes like this:
1;213;164
2;213;164
3;213;164
4;213;164
5;213;164
6;213;164
7;213;164
8;213;164
9;145;112
10;145;112
11;145;112
12;145;112
13;145;112
14;145;112
15;145;112
16;145;112
17;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
I would like to remove all duplicates where the last two numbers are the same. After running the list through the program I would get something like this:
1;213;164
9;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
But something like
8;213;164
15;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
would also be correct.
Here is a nice and fast trick you can use (assuming l is your list):
list({ s.split(';', 1)[1] : s for s in l }.values())
No need to import anything, and it makes only a single pass over the list.
In general you can define:
def custom_unique(L, keyfunc):
    return list({ keyfunc(li): li for li in L }.values())
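One thing to be aware of: in a dict comprehension a later entry overwrites an earlier one with the same key, so the trick above keeps the last string seen for each key. The question accepts either, but if you want the first occurrence instead, a variant using setdefault could look like this (custom_unique_first is a hypothetical name; it relies on dicts preserving insertion order, which is guaranteed from Python 3.7):

```python
def custom_unique_first(items, keyfunc):
    """Keep the first item seen for each key (hypothetical variant)."""
    seen = {}
    for item in items:
        # setdefault stores the item only if the key is not present yet
        seen.setdefault(keyfunc(item), item)
    return list(seen.values())

l = ['1;213;164', '2;213;164', '9;145;112', '10;145;112', '1001;1;151']
print(custom_unique_first(l, lambda s: s.split(';', 1)[1]))
# ['1;213;164', '9;145;112', '1001;1;151']
```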
You can group the items by this key and then use the first item in each group (assuming l is your list).
import itertools
keyfunc = lambda x: x.split(";", 1)[1]
[next(g) for k, g in itertools.groupby(sorted(l, key=keyfunc), keyfunc)]
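Keep in mind that the sorted call reorders the list by the key string, so the result does not follow the original order. A quick sketch on a shortened version of the list from the question (the sample values are picked only for illustration):

```python
import itertools

l = ['1;213;164', '2;213;164', '9;145;112', '10;145;112', '1001;1;151']
keyfunc = lambda x: x.split(";", 1)[1]

# sorted() is stable, so within each key the original order is kept
# and next(g) returns the first original occurrence for that key.
result = [next(g) for k, g in itertools.groupby(sorted(l, key=keyfunc), keyfunc)]
print(result)
# ['9;145;112', '1001;1;151', '1;213;164']
```

The keys are compared as strings, which is why '145;112' sorts before '1;151' here.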
Here is code for the first few items; just swap my list for yours:
x = [
    '7;213;164',
    '8;213;164',
    '9;145;112',
    '10;145;112',
    '11;145;112',
]
new_list = []
for i in x:
    check = True
    s_part = i[i.find(';'):]  # everything from the first ';' onwards
    for j in new_list:
        # compare whole suffixes; a substring test would treat ';1;15' and ';1;151' as equal
        if j[j.find(';'):] == s_part:
            check = False
    if check:
        new_list.append(i)
print(new_list)
Output:
['7;213;164', '9;145;112']

list method to group elements in Python

Thanks for all your answers. I have edited my question because it was not clear to everyone.
I have the following list of tuples:
[("ok",1),("yes",1),("no",0),("why",1),("some",1),("eat",0),("give",0),("about",0),("tell",1),("ask",0),("be",0)]
I would like to have :
[("ok yes","no"),("why some","eat give about"),("tell","ask be")]
Thank you !
So I want to join consecutive words flagged 1, and when 0s appear I add their words as the second value and start a new element for the values that follow.
You can use itertools.groupby:
from itertools import groupby
d = [("ok",1),("yes",1),("no",0),("why",1),("some",1),("eat",0),("give",0),("about",0),("tell",1),("ask",0),("be",0)]
new_d = [' '.join(j for j, _ in b) for _, b in groupby(d, key=lambda x:x[-1])]
result = [(new_d[i], new_d[i+1]) for i in range(0, len(new_d), 2)]
Output:
[('ok yes', 'no'), ('why some', 'eat give about'), ('tell', 'ask be')]
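To make the two steps easier to follow, here is a sketch that prints the intermediate list on a shortened input. It assumes, as the answer above does, that the data starts with a run of 1s and that runs of 1s and 0s alternate. The pairing is done with zip here, which is my own substitution: it silently drops an unpaired trailing run, whereas indexing with new_d[i+1] would raise an IndexError in that case.

```python
from itertools import groupby

d = [("ok", 1), ("yes", 1), ("no", 0), ("why", 1), ("some", 1), ("eat", 0)]

# Step 1: join the words of each run of equal flags.
runs = [' '.join(word for word, _ in group)
        for _, group in groupby(d, key=lambda x: x[-1])]
print(runs)   # ['ok yes', 'no', 'why some', 'eat']

# Step 2: pair each 1-run with the 0-run that follows it.
pairs = list(zip(runs[::2], runs[1::2]))
print(pairs)  # [('ok yes', 'no'), ('why some', 'eat')]
```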
As I understand your question, the following code should work:
list_tuples = [("ok",1),("yes",1),("no",0),("why",1),("some",1),("eat",0)]
tups = []
updated_list = []
for elem in list_tuples:
    if elem[1] == 0:
        updated_list.append((' '.join(tups), elem[0]))
        tups = []
    else:
        tups.append(elem[0])
print(updated_list)
One possible solution using itertools.groupby:
from operator import itemgetter
from itertools import groupby
lst = [("ok",1), ("yes",1), ("no",0), ("why",1), ("some",1), ("eat",0)]
def generate(lst):
    rv = []
    for v, g in groupby(lst, itemgetter(1)):
        if v:
            rv.append(' '.join(map(itemgetter(0), g)))
        else:
            for i in g:
                rv.append(i[0])
                yield tuple(rv)
                rv = []
    # yield the trailing group if the last flag was 1
    if v:
        yield tuple(rv)
print([*generate(lst)])
Prints:
[('ok yes', 'no'), ('why some', 'eat')]

Grouping the nested attribute list in Python

I have a list
lst = ['orb|2|3|4', 'obx|2|3|4', 'orb|2|3|4', 'obx|1|2|3', 'obx|1|2|3','obx|1|2|3']
How can I group the list by the first three characters of each line, so that in the end it looks like the result below? If a line starts with "orb", it opens a new group, and the subsequent lines are added to that group. Thanks for the answer.
result = [['orb|2|3|4', 'obx|2|3|4'], ['orb|2|3|4', 'obx|1|2|3', 'obx|1|2|3','obx|1|2|3']]
Here is an algorithm of O(N) complexity:
res = []
tmp = []
for x in lst:
    if x.startswith('orb'):
        if tmp:
            res.append(tmp)
        tmp = [x]
    elif tmp:
        tmp.append(x)
res.append(tmp)
result:
In [133]: res
Out[133]:
[['orb|2|3|4', 'obx|2|3|4'],
['orb|2|3|4', 'obx|1|2|3', 'obx|1|2|3', 'obx|1|2|3']]
You can use itertools.groupby:
import itertools, re
lst = ['orb|2|3|4', 'obx|2|3|4', 'orb|2|3|4', 'obx|1|2|3', 'obx|1|2|3','obx|1|2|3']
new_result = [list(b) for _, b in itertools.groupby(lst, key=lambda x:re.findall(r'^\w+', x)[0])]
final_result = [new_result[i]+new_result[i+1] for i in range(0, len(new_result), 2)]
Output:
[['orb|2|3|4', 'obx|2|3|4'], ['orb|2|3|4', 'obx|1|2|3', 'obx|1|2|3', 'obx|1|2|3']]

Slice a list into a nested list based on special characters using Python

I have a list of strings like this:
lst = ['23532','user_name=app','content=123',
'###########################',
'54546','user_name=bee','content=998 hello','source=fb',
'###########################',
'12/22/2015']
I want a method similar to string.split('#') that gives me output like this:
[['23532','user_name=app','content=123'],
['54546','user_name=bee','content=998 hello','source=fb'],
['12/22/2015']]
But I know a list has no split method. I cannot use ''.join(lst) either, because this list comes from part of a text file I read in, and the file is so big that joining throws a memory error.
I don't think there's a one-liner for this, but you can easily write a generator to do what you want:
def sublists(lst):
    x = []
    for item in lst:
        if item == '###########################': # or whatever condition you like
            if x:
                yield x
                x = []
        else:
            x.append(item)
    if x:
        yield x
new_list = list(sublists(lst))
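Since the question mentions a large text file, the same generator idea can be pointed straight at the file object so that only one group is held in memory at a time. This is a rough sketch, not a drop-in solution: the helper name sublists_from_lines and the separator test (a line made up only of '#' characters) are my assumptions, and io.StringIO stands in for a real file opened with open().

```python
import io

def sublists_from_lines(lines, is_separator):
    """Lazily group lines that sit between separator lines (hypothetical helper)."""
    chunk = []
    for line in lines:
        line = line.rstrip('\n')
        if is_separator(line):
            if chunk:
                yield chunk
            chunk = []
        else:
            chunk.append(line)
    if chunk:          # emit whatever follows the last separator
        yield chunk

# io.StringIO stands in for a real file object here
fake_file = io.StringIO("23532\nuser_name=app\n####\n54546\nsource=fb\n####\n12/22/2015\n")
groups = list(sublists_from_lines(fake_file, lambda s: set(s) == {'#'}))
print(groups)
# [['23532', 'user_name=app'], ['54546', 'source=fb'], ['12/22/2015']]
```

In real use you would iterate over the generator instead of calling list() on it, otherwise everything ends up in memory again.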
If you can't use .join(), you can loop through the list and save the index of any string that contains # then loop again to slice the list:
lst = ['23532', 'user_name=app', 'content=123', '###########################' ,'54546','user_name=bee','content=998 hello','source=fb','###########################','12/22/2015']
idx = []
new_lst = []
for i, val in enumerate(lst):
    if '#' in val:
        idx.append(i)
j = 0
for x in idx:
    new_lst.append(lst[j:x])
    j = x+1
new_lst.append(lst[j:])
print(new_lst)
output:
[['23532', 'user_name=app', 'content=123'], ['54546', 'user_name=bee', 'content=998 hello', 'source=fb'], ['12/22/2015']]
from pprint import pprint
sep = '###########################'
def split_list(_list):
    lists = list()
    sub_list = list()
    for x in _list:
        if x == sep:
            lists.append(sub_list)
            sub_list = list()
        else:
            sub_list.append(x)
    lists.append(sub_list)
    return lists
l = ['23532','user_name=app','content=123',
'###########################',
'54546','user_name=bee','content=998 hello','source=fb',
'###########################',
'12/22/2015']
pprint(split_list(l))
Output:
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]
You can achieve this with itertools.groupby
from itertools import groupby
lst = ['23532','user_name=app','content=123',
'###########################','54546','user_name=bee','content=998 hello','source=fb',
'###########################','12/22/2015']
[list(g) for k, g in groupby(lst, lambda x: x == '###########################') if not k ]
Output
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]

how to apply a groupby on list of tuples in python?

In my function I will create different tuples and add to an empty list :
tup = (pattern,matchedsen)
matchedtuples.append(tup)
The patterns are regular expressions. I would like to apply groupby() to matchedtuples in the following way:
For example :
matchedtuples = [(p1, s1) , (p1,s2) , (p2, s5)]
And I am looking for this result:
result = [ (p1,(s1,s2)) , (p2, s5)]
So, in this way I will have groups of sentences with the same pattern. How can I do this?
My answer works for the input structures shown below, as long as equal patterns sit next to each other, and prints the same output as you gave. It uses only groupby from the itertools module:
# Let's suppose your input is something like this
a = [("p1", "s1"), ("p1", "s2"), ("p2", "s5")]
from itertools import groupby
result = []
for key, values in groupby(a, lambda x : x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(tuple(j for j in b)[0])
print(result)
Output:
[('p1', ('s1', 's2')), ('p2', 's5')]
The same solution works if you add more values to your input:
# When you add more values to your input
a = [("p1", "s1"), ("p1", "s2"), ("p2", "s5"), ("p2", "s6"), ("p3", "s7")]
from itertools import groupby
result = []
for key, values in groupby(a, lambda x : x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(tuple(j for j in b)[0])
print(result)
Output:
[('p1', ('s1', 's2')), ('p2', ('s5', 's6')), ('p3', 's7')]
Now, if you modify your input structure:
# Let's suppose your modified input is something like this
a = [(["p1"], ["s1"]), (["p1"], ["s2"]), (["p2"], ["s5"])]
from itertools import groupby
result = []
for key, values in groupby(a, lambda x : x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(tuple(j for j in b)[0])
print(result)
Output:
[(['p1'], (['s1'], ['s2'])), (['p2'], ['s5'])]
Also, the same solution works if you add more values to your new input structure:
# When you add more values to your new input
a = [(["p1"], ["s1"]), (["p1"], ["s2"]), (["p2"], ["s5"]), (["p2"], ["s6"]), (["p3"], ["s7"])]
from itertools import groupby
result = []
for key, values in groupby(a, lambda x : x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(tuple(j for j in b)[0])
print(result)
Output:
[(['p1'], (['s1'], ['s2'])), (['p2'], (['s5'], ['s6'])), (['p3'], ['s7'])]
Ps: Test this code and if it breaks with any other kind of inputs please let me know.
If you require the output you present, you'll need to manually loop through the grouping of matchedtuples and build your list.
First, of course, if the matchedtuples list isn't sorted, sort it with itemgetter:
from itertools import groupby
from operator import itemgetter as itmg
li = sorted(matchedtuples, key=itmg(0))
Then, loop through the result supplied by groupby on the sorted list and append to the list r based on the size of the group:
r = []
for i, j in groupby(li, key=itmg(0)):
    j = list(j)
    ap = (i, j[0][1]) if len(j) == 1 else (i, tuple(s[1] for s in j))
    r.append(ap)
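If sorting is inconvenient, one alternative (not groupby, so it steps outside what the question literally asks for) is to collect the sentences in a dictionary keyed by pattern. A minimal sketch, assuming the patterns are hashable, which holds for strings and compiled regular expressions but not for lists; group_sentences is a hypothetical name:

```python
from collections import defaultdict

def group_sentences(pairs):
    """Group second elements by first element, in first-seen key order (hypothetical helper)."""
    grouped = defaultdict(list)
    for pattern, sentence in pairs:
        grouped[pattern].append(sentence)
    # Single-sentence groups stay bare, as in the question's expected result
    return [(p, s[0] if len(s) == 1 else tuple(s)) for p, s in grouped.items()]

matchedtuples = [("p1", "s1"), ("p2", "s5"), ("p1", "s2")]
print(group_sentences(matchedtuples))
# [('p1', ('s1', 's2')), ('p2', 's5')]
```

Unlike groupby, this does not care whether equal patterns are adjacent in the input.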
