How to pack consecutive duplicates of list elements into sublists?


How can I "pack" consecutive duplicated elements in a list into sublists of the repeated element?
What I mean is:
l = [1, 1, 1, 2, 2, 3, 4, 4, 1]
pack(l) -> [[1,1,1], [2,2], [3], [4, 4], [1]]
I want to solve this in a very basic way, using only loops and list methods, as I have just started learning. I have looked at other methods, but they were difficult for me to understand.
For removing the duplicates instead of packing them, see Removing elements that have consecutive duplicates

You can use groupby:
from itertools import groupby
def pack(List):
    result = []
    for key, group in groupby(List):
        result.append(list(group))
    return result
l = [1, 1, 1, 2, 2, 3, 4, 4, 1]
print(pack(l))
Or one-line:
l = [1, 1, 1, 2, 2, 3, 4, 4, 1]
result = [list(group) for key,group in groupby(l)]
# [[1, 1, 1], [2, 2], [3], [4, 4], [1]]
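To see what groupby actually hands back, here is a small sketch that records each key together with the length of its run. The helper name run_lengths is made up for illustration, and the input is assumed to be any iterable whose elements can be compared with ==.

```python
from itertools import groupby

def run_lengths(items):
    """Return (value, count) pairs for each run of consecutive equal items."""
    # groupby yields (key, group) pairs; each group is a one-shot iterator
    # over the consecutive items that share that key.
    return [(key, sum(1 for _ in group)) for key, group in groupby(items)]

print(run_lengths([1, 1, 1, 2, 2, 3, 4, 4, 1]))
# [(1, 3), (2, 2), (3, 1), (4, 2), (1, 1)]
```

Because each group is an iterator, it has to be consumed (here with sum, in the answer above with list) before moving on to the next pair.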

You can use:
lst = [1, 1, 1, 2, 2, 3, 4, 4, 1]
# bootstrap: initialize a sublist with the first element of lst (lst must not be empty)
out = [[lst[0]]]
for it1, it2 in zip(lst, lst[1:]):
    # if the previous item and the current one are equal, append the current item to the last sublist
    if it1 == it2:
        out[-1].append(it2)
    # else start a new sublist containing the current item
    else:
        out.append([it2])
Output:
>>> out
[[1, 1, 1], [2, 2], [3], [4, 4], [1]]
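The snippet above indexes lst[0], so it raises an IndexError on an empty list. A loop-only variant that also copes with empty input might look like the sketch below. It is wrapped in a function named pack as in the question, and it is only one of several ways to write this.

```python
def pack(lst):
    """Pack consecutive equal elements of lst into sublists."""
    out = []
    for item in lst:
        # Extend the current run if the item equals the last packed element.
        if out and out[-1][-1] == item:
            out[-1].append(item)
        else:
            # Otherwise start a new run with this item.
            out.append([item])
    return out

print(pack([1, 1, 1, 2, 2, 3, 4, 4, 1]))
# [[1, 1, 1], [2, 2], [3], [4, 4], [1]]
print(pack([]))
# []
```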

This code will do:
data = [0,0,1,2,3,4,4,5,6,6,6,7,8,9,4,4,9,9,9,9,9,3,3,2,45,2,11,11,11]
newdata=[]
for i,l in enumerate(data):
    if i==0 or l!=data[i-1]:
        newdata.append([l])
    else:
        newdata[-1].append(l)
#Output
[[0,0],[1],[2],[3],[4,4],[5],[6,6,6],[7],[8],[9],[4,4],[9,9,9,9,9],[3,3],[2],[45],[2],[11,11,11]]

Related

Subtract previous list from current list in a list of lists loop

I have a list of dataframes in which each dataframe repeats the data of the previous one, and I need to subtract each dataframe from the next.
the_list[0] = [1, 2, 3]
the_list[1] = [1, 2, 3, 4, 5, 6, 7]
There are also df headers. Dataframes are only different in number of rows.
Wanted solution:
the_list[0] = [1, 2, 3]
the_list[1] = [4, 5, 6, 7]
Because the_list contains several dataframes, I have to work backward, going from the last df to the first, with the first remaining intact.
My current code (estwin is the_list):
estwin = [df1, df2, df3, df4]
output=([])
estwin.reverse()
for i in range(len(estwin) -1):
    difference = Diff(estwin[i], estwin[i+1])
    output.append(difference)
return(output)
def Diff(li_bigger, li_smaller):
    c = [x for x in li_bigger if x not in li_smaller]
    return (c)
Currently, the result is an empty list. I need an updated the_list that contains only the differences (no duplicate values between lists).

You should not need to go backward for this problem; it is easier to keep track of what you have already seen while going forward.
Keep a set that gets updated with new items as you traverse through each list, and use it to filter out the items that should be present in the output.
list1 = [1,2,3]
list2 = [1,2,3,4,5,6,7]
estwin = [list1, list2]
lookup = set() #to check which items/numbers have already been seen.
output = []
for lst in estwin:
    updated_lst = [i for i in lst if i not in lookup] #only new items present
    lookup.update(updated_lst)
    output.append(updated_lst)
print(output) #[[1, 2, 3], [4, 5, 6, 7]]

Your code is not runnable, but if I guess what you meant to write, it works, except that you have one bug in your algorithm:
the_list = [
[1, 2, 3],
[1, 2, 3, 4, 5, 6, 7],
[1, 2, 3, 4, 5, 6, 7, 8, 9]
]
def process(lists):
    output = []
    lists.reverse()
    for i in range(len(lists)-1):
        difference = diff(lists[i], lists[i+1])
        output.append(difference)
    # BUGFIX: Always add first list (now last because of reverse)
    output.append(lists[-1])
    output.reverse()
    return output
def diff(li_bigger, li_smaller):
    return [x for x in li_bigger if x not in li_smaller]
print(the_list)
print(process(the_list))
Output:
[[1, 2, 3], [1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7, 8, 9]]
[[1, 2, 3], [4, 5, 6, 7], [8, 9]]
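If every list really does start with the complete previous list, as in the example data, plain slicing is enough and no membership tests are needed. Here is a sketch under that assumption. The name strip_prefixes is hypothetical, and the function does not verify that the prefixes actually match.

```python
def strip_prefixes(lists):
    """Keep the first list; for every later list drop the leading part
    that has the length of its predecessor."""
    result = []
    prev_len = 0
    for current in lists:
        # Assumes current[:prev_len] repeats the previous list exactly.
        result.append(current[prev_len:])
        prev_len = len(current)
    return result

the_list = [
    [1, 2, 3],
    [1, 2, 3, 4, 5, 6, 7],
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
]
print(strip_prefixes(the_list))
# [[1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

This version leaves the input list untouched and keeps repeated values that occur inside the new part of each list.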

One-liner:
from itertools import chain
l = [[1, 2], [1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
new_l = [sorted(set(v).difference(chain.from_iterable(l[:num])))
         for num, v in enumerate(l)]
print(new_l)
# [[1, 2], [3], [4], [5]]

Python groupby to split list by delimiter

I am pretty new to Python (3.6) and struggling to understand itertools groupby.
I've got the following list containing integers:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
But the list could also be much longer and the '0' doesn't have to appear after every pair of numbers. It can also be after 3, 4 or more numbers. My goal is to split this list into sublists where the '0' is used as a delimiter and doesn't appear in any of these sublists.
list2 = [[1, 2], [2, 3], [4, 5]]
A similar problem has been solved here already:
Python spliting a list based on a delimiter word
Answer 2 seemed to help me a lot but unfortunately it only gave me a TypeError.
import itertools as it
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in it.groupby(list1, lambda x: x == 0) if not key]
print(list2)
File "H:/Python/work/ps0001/example.py", line 13, in <module>
list2 = [list(group) for key, group in it.groupby(list, lambda x: x == '0') if not key]
TypeError: 'list' object is not callable
I would appreciate any help and be very happy to finally understand groupby.

You were checking for "0" (str) but you only have 0 (int) in your list. Also, you were using list as a variable name for your first list, which shadows Python's built-in list type; that is what caused the TypeError.
from itertools import groupby
list1 = [1, 2, 0, 2, 7, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in groupby(list1, lambda x: x == 0) if not key]
print(list2)
This should give you:
[[1, 2], [2, 7, 3], [4, 5]]
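The TypeError from the question can be reproduced in isolation. In this sketch the name list is rebound inside a throwaway function (demo_shadowing is a made-up name), so the built-in is not clobbered for the rest of the program.

```python
from itertools import groupby

def demo_shadowing():
    # Rebinding the name `list` hides the built-in type within this function.
    list = [1, 2, 0, 2, 3, 0, 4, 5, 0]
    try:
        # list(group) now tries to call the list object itself.
        return [list(group) for key, group in groupby(list, lambda x: x == 0) if not key]
    except TypeError as exc:
        return str(exc)

print(demo_shadowing())
# 'list' object is not callable
```

Renaming the variable (for example to list1, as in the answer above) makes list(group) refer to the built-in again.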

In your code, you need to change lambda x: x == '0' to lambda x: x == 0, since you're working with a list of int, not a list of str.
Since others have shown how to improve your solution with itertools.groupby, you can also do this task with no libraries:
>>> list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> zeroes = [-1] + [i for i, e in enumerate(list1) if e == 0]
>>> result = [list1[zeroes[i] + 1: zeroes[i + 1]] for i in range(len(zeroes) - 1)]
>>> print(result)
[[1, 2], [2, 3], [4, 5]]

You can use string replacement together with ast.literal_eval for this:
>>> import ast
>>> your_list = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a_list = str(your_list).replace(', 0,', '], [').replace(', 0]', ']')
>>> your_result = ast.literal_eval(a_list)
>>> your_result
([1, 2], [2, 3], [4, 5])
>>> your_result[0]
[1, 2]
Or a single line solution:
ast.literal_eval(str(your_list).replace(', 0,', '], [').replace(', 0]', ']'))

You could do that within a loop, as shown in the commented snippet below:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # tmp HOLDS A TEMPORARY LIST :: result => RESULT
for i in list1:
    if not i:
        # CURRENT VALUE IS 0 SO WE BUILD THE SUB-LIST
        result.append(tmp)
        # RE-INITIALIZE THE tmp VARIABLE
        tmp = []
    else:
        # SINCE CURRENT VALUE IS NOT 0, WE POPULATE THE tmp LIST
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
Effectively:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # tmp HOLDS A TEMPORARY LIST
for i in list1:
    if not i:
        result.append(tmp); tmp = []
    else:
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
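Note that the loop above only emits a group when it meets a 0, so any items after the last 0 are dropped if the list does not end with the delimiter. Here is a sketch that flushes the remaining items and skips empty groups. The function name split_on and its delimiter parameter are hypothetical.

```python
def split_on(items, delimiter=0):
    """Split items into sublists at each occurrence of delimiter."""
    result, tmp = [], []
    for item in items:
        if item == delimiter:
            # Close the current group, ignoring empty ones
            # (e.g. two delimiters in a row).
            if tmp:
                result.append(tmp)
            tmp = []
        else:
            tmp.append(item)
    # Flush whatever follows the last delimiter.
    if tmp:
        result.append(tmp)
    return result

print(split_on([1, 2, 0, 2, 3, 0, 4, 5, 0]))
# [[1, 2], [2, 3], [4, 5]]
print(split_on([1, 2, 0, 0, 3]))
# [[1, 2], [3]]
```

Comparing with == instead of testing truthiness also means other falsy values in the data are not mistaken for the delimiter.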

Use zip to build tuples from two slices and convert them to lists afterwards (this only works when every group has exactly two elements):
>>> a
[1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a[0::3]
[1, 2, 4]
>>> a[1::3]
[2, 3, 5]
>>> list(zip(a[0::3],a[1::3]))
[(1, 2), (2, 3), (4, 5)]
>>> [list(i) for i in zip(a[0::3],a[1::3])]
[[1, 2], [2, 3], [4, 5]]

Try using join and then splitting on '0' (this only works for single-digit values):
lst = [1, 2, 0, 2, 3, 0, 4, 5, 0]
lst_string = "".join([str(x) for x in lst])
lst2 = lst_string.split('0')
lst3 = [list(y) for y in lst2]
lst4 = [list(map(int, z)) for z in lst3]
print(lst4)

How to sort list of list by size of list and by element in list in mixed list

I have the following list
list = [1, 2, 3, [3, [1, 2]]]
the result would be:
[[[2, 1], 3], 3, 2, 1]
How to sort that list by size of list and by element?

Here's one way to recursively sort the list:
def recursive_sort(item):
    if isinstance(item, list):
        item[:] = sorted(item, key=recursive_sort)
        return 0, -len(item)
    else:
        return 1, -item
lst = [1, 2, 3, [3, [1, 2], [2, 3, 6]]]
print(sorted(lst, key=recursive_sort))
# [[[6, 3, 2], [2, 1], 3], 3, 2, 1]
Caveat: This is more of an academic exercise and should never be used in production code. The state of the list during a sort (at least with Timsort in CPython) is undefined, so you shouldn't count on this to always work.
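One way to sidestep that caveat is to keep the key function free of side effects and build a new nested list instead. A sketch follows. The name sort_nested is made up, and the code assumes the non-list elements are numbers, since the key negates them.

```python
def sort_nested(item):
    """Return a sorted copy of a nested list: lists first (longest first),
    then numbers in descending order. Non-list values are returned as-is."""
    if not isinstance(item, list):
        return item
    # Sort the children first, bottom-up, without touching the input.
    children = [sort_nested(child) for child in item]

    def key(child):
        if isinstance(child, list):
            return (0, -len(child))
        return (1, -child)

    return sorted(children, key=key)

lst = [1, 2, 3, [3, [1, 2], [2, 3, 6]]]
print(sort_nested(lst))
# [[[6, 3, 2], [2, 1], 3], 3, 2, 1]
```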

How to retrieve list(s) that contains specific query items

I am trying to group list of items relevant to a query item. Below is an example of the problem and my attempt at it:
>>> _list=[[1,2,3],[2,3,4]]
>>> querylist=[1,2,4]
>>> relvant=[]
>>> for x in querylist:
...     for y in _list:
...         if x in y:
...             relvant.append(y)
My output:
>>> relvant
[[1, 2, 3], [1, 2, 3], [2, 3, 4], [2, 3, 4]]
Desired output:
[[[1, 2, 3]], [[1, 2, 3], [2, 3, 4]],[[2, 3, 4]]]
The issue is after each loop of a query item, I expected the relevant lists to be grouped but that isn't the case with my attempt.
Thanks for your suggestions.

I think it's clearer to use a list comprehension:
>>> _list = [[1,2,3],[2,3,4]]
>>> querylist = [1,2,4]
>>> [[l for l in _list if x in l] for x in querylist]
[[[1, 2, 3]], [[1, 2, 3], [2, 3, 4]], [[2, 3, 4]]]
The inner expression [l for l in _list if x in l] describes the list of all sublists that contain x. The outer expression's job is to get that list for all values of x in the query list.
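The comprehension scans every sublist once per query item. If there are many queries, it may be worth building a lookup table first. A sketch of that idea, assuming the elements are hashable (build_index is a hypothetical helper name):

```python
from collections import defaultdict

def build_index(lists):
    """Map each element to the sublists that contain it."""
    index = defaultdict(list)
    for sub in lists:
        # set(sub) avoids registering a sublist twice for a repeated element.
        for value in set(sub):
            index[value].append(sub)
    return index

_list = [[1, 2, 3], [2, 3, 4]]
querylist = [1, 2, 4]
index = build_index(_list)
# .get avoids inserting empty entries for query items that never occur.
print([index.get(x, []) for x in querylist])
# [[[1, 2, 3]], [[1, 2, 3], [2, 3, 4]], [[2, 3, 4]]]
```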

With minimal changes to the code provided, you can create a new dummy list to store the matches and, once each inner loop has finished, append it to the main list.
_list=[[1,2,3],[2,3,4]]
querylist=[1,2,4]
relvant=[]
for x in querylist:
    dummy = []
    for y in _list:
        if x in y:
            dummy.append(y)
    relvant.append(dummy)
print(relvant)
# [[[1, 2, 3]], [[1, 2, 3], [2, 3, 4]], [[2, 3, 4]]]

Splitting a list of arbitrary size into N-not-equal parts [duplicate]

This question already has answers here:
How to group a list of tuples/objects by similar index/attribute in python?
(3 answers)
Closed 8 years ago.
I have seen splitting-a-list-of-arbitrary-size-into-only-roughly-n-equal-parts. What about unequal splitting? I have a list of items that each have some attribute (a value that can be retrieved by running the same function against every item). How can I split the list so that items sharing the same attribute end up in their own sublist? Could something lambda-related work here?
Simple example could be:
list = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
After fancy operation we could have:
list = [[1, 1, 1], [2], [3, 3, 3, 3], [4, 4]]

>>> import itertools, operator
>>> L = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
>>> [list(g) for i, g in itertools.groupby(L)]
[[1, 1, 1], [2], [3, 3, 3, 3], [4, 4]]
>>> L2 = ['apple', 'aardvark', 'banana', 'coconut', 'crow']
>>> [list(g) for i, g in itertools.groupby(L2, operator.itemgetter(0))]
[['apple', 'aardvark'], ['banana'], ['coconut', 'crow']]

You should use the itertools.groupby function from the standard library.
This function groups consecutive elements of the iterable it receives (by default using the identity function as the key, i.e., comparing consecutive elements for equality), and for each streak of grouped elements, it returns a 2-tuple consisting of the streak's key (by default, the element itself) and an iterator over the elements within the streak.
Indeed:
from itertools import groupby
l = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
list(list(k[1]) for k in groupby(l))
# [[1, 1, 1], [2], [3, 3, 3, 3], [4, 4]]
P.S. you should avoid using list as a variable name, as it would conflict with the built-in type/function.
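Keep in mind that groupby only merges neighbouring elements, so equal keys that are not adjacent end up in separate groups. If the items are not already ordered by the attribute, a dictionary can collect them instead. A sketch (group_by_attribute is a hypothetical helper; it relies on dicts preserving insertion order, which is guaranteed from Python 3.7):

```python
def group_by_attribute(items, key):
    """Group items by key(item), keeping groups in first-seen order."""
    groups = {}
    for item in items:
        # setdefault creates the sublist the first time a key shows up.
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())

words = ['apple', 'banana', 'aardvark', 'crow', 'coconut']
print(group_by_attribute(words, key=lambda w: w[0]))
# [['apple', 'aardvark'], ['banana'], ['crow', 'coconut']]
```

Sorting by the same key before calling groupby would also work, at the cost of changing the order of the groups.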

Here's a pretty simple roll your own solution. If the 'attribute' in question is simply the value of the item, there are more straightforward approaches.
def split_into_sublists(data_list, sizes_list):
    if sum(sizes_list) != len(data_list):
        raise ValueError
    count = 0
    output = []
    for size in sizes_list:
        output.append(data_list[count:count+size])
        count += size
    return output
if __name__ == '__main__':
    data_list = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
    sizes_list = [3,1,4,2]
    list2 = [[1, 1, 1], [2], [3, 3, 3, 3], [4, 4]]
    print(split_into_sublists(data_list, sizes_list) == list2) # True
