Split List By Value and Keep Separators - python
I have a list called list_of_strings that looks like this:
['a', 'b', 'c', 'a', 'd', 'c', 'e']
I want to split this list by a value (in this case c). I also want to keep c in the resulting split.
So the expected result is:
[['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
Any easy way to do this?
You can use more_itertools+ to accomplish this simply and clearly:
from more_itertools import split_after
lst = ["a", "b", "c", "a", "d", "c", "e"]
list(split_after(lst, lambda x: x == "c"))
# [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
Another example: here we split words simply by changing the predicate:
lst = ["ant", "bat", "cat", "asp", "dog", "carp", "eel"]
list(split_after(lst, lambda x: x.startswith("c")))
# [['ant', 'bat', 'cat'], ['asp', 'dog', 'carp'], ['eel']]
+ A third-party library that implements itertools recipes and more. Install it with:
> pip install more_itertools
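If adding a dependency is not an option, similar behaviour can be approximated with a small generator from the standard language alone. This is only a sketch, not the library's implementation; the name split_after_pred is made up for this example.

```python
def split_after_pred(iterable, pred):
    """Yield lists of items, closing each list right after an item where pred is true."""
    chunk = []
    for item in iterable:
        chunk.append(item)
        if pred(item):
            yield chunk
            chunk = []
    # Emit whatever is left over, but never an empty trailing list.
    if chunk:
        yield chunk


lst = ["a", "b", "c", "a", "d", "c", "e"]
result = list(split_after_pred(lst, lambda x: x == "c"))
print(result)  # [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
```

Because it takes a predicate, the same function also covers the startswith("c") example.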
stuff = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
You can find out the indices with 'c' like this, and add 1 because you'll be splitting after it, not at its index:
indices = [i + 1 for i, x in enumerate(stuff) if x == 'c']
Then extract slices like this:
split_stuff = [stuff[i:j] for i, j in zip([0] + indices, indices + [None])]
The zip gives you pairs analogous to (indices[i], indices[i + 1]); the prepended [0] lets you extract the first part, and the appended [None] extracts the last slice (stuff[i:]).
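To make the pairing concrete, here is the same approach with the intermediate values printed. The second half is an added illustration (the names ends_with_c, cuts and parts are invented here) of what happens when the list ends with the separator.

```python
stuff = ['a', 'b', 'c', 'a', 'd', 'c', 'e']

# Positions just past each 'c': these are the cut points.
indices = [i + 1 for i, x in enumerate(stuff) if x == 'c']

# Pair each cut point with the next one; 0 opens the first slice, None closes the last.
pairs = list(zip([0] + indices, indices + [None]))

split_stuff = [stuff[i:j] for i, j in pairs]

print(indices)      # [3, 6]
print(pairs)        # [(0, 3), (3, 6), (6, None)]
print(split_stuff)  # [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]

# If the list ends with 'c', the last slice is empty; filter it out if unwanted.
ends_with_c = ['a', 'c']
cuts = [i + 1 for i, x in enumerate(ends_with_c) if x == 'c']
parts = [ends_with_c[i:j] for i, j in zip([0] + cuts, cuts + [None])]
print(parts)                    # [['a', 'c'], []]
print([p for p in parts if p])  # [['a', 'c']]
```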
You could try something like the following:
list_of_strings = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
output = [[]]
for x in list_of_strings:
    output[-1].append(x)
    if x == 'c':
        output.append([])
Note, though, that this leaves an empty list at the end of your output if the last element of your input is 'c'.
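If the trailing empty list is unwanted, one possible tweak (an addition to the answer, not part of it) is to drop it after the loop:

```python
list_of_strings = ['a', 'b', 'c', 'a', 'd', 'c']  # ends with the separator

output = [[]]
for x in list_of_strings:
    output[-1].append(x)
    if x == 'c':
        output.append([])

# The loop always leaves one open list at the end; remove it if nothing was added to it.
if not output[-1]:
    output.pop()

print(output)  # [['a', 'b', 'c'], ['a', 'd', 'c']]
```

For an empty input this yields [] rather than [[]].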
def spliter(value, array):
    res = []
    while value in array:
        index = array.index(value)
        res.append(array[:index + 1])
        array = array[index + 1:]
    if array:
        # Append last elements
        res.append(array)
    return res
a = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
print(spliter('b',a))
# [['a', 'b'], ['c', 'a', 'd', 'c', 'e']]
print(spliter('c',a))
# [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
What about this? It should iterate over the input only once, and part of that work happens inside the index method, which is executed as native code.
def splitkeep(v, c):
    curr = 0
    try:
        nex = v.index(c)
        while True:
            yield v[curr: (nex + 1)]
            curr = nex + 1
            nex += v[curr:].index(c) + 1
    except ValueError:
        if v[curr:]: yield v[curr:]
print(list(splitkeep( ['a', 'b', 'c', 'a', 'd', 'c', 'e'], 'c')))
Result:
[['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
I wasn't sure whether you wanted an empty list at the end of the result when the final value is the one you are splitting on. I assumed you wouldn't, so I added a condition that drops the final chunk if it is empty.
As a consequence, the input [] yields only [] when arguably it might yield [[]].
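The slice v[curr:] copies the rest of the list before each search. A possible variant, sketched here under the assumption that v is a real list (so that list.index accepts a start position), searches in place instead. The name splitkeep_no_copy is invented for this example.

```python
def splitkeep_no_copy(v, c):
    """Yield slices of v that each end with c; a non-empty remainder comes last."""
    curr = 0
    while True:
        try:
            # Search from position curr onwards without slicing first.
            nex = v.index(c, curr)
        except ValueError:
            break
        yield v[curr:nex + 1]
        curr = nex + 1
    # Same convention as above: no empty trailing chunk.
    if curr < len(v):
        yield v[curr:]


print(list(splitkeep_no_copy(['a', 'b', 'c', 'a', 'd', 'c', 'e'], 'c')))
# [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
```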
How about this rather playful script:
a = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
b = ''.join(a).split('c') # ['ab', 'ad', 'e']
c = [x + 'c' if i < len(b)-1 else x for i, x in enumerate(b)] # ['abc', 'adc', 'e']
d = [list(x) for x in c if x]
print(d) # [['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
It can also handle input that begins or ends with a "c" (note that the join/split trick relies on every element being a single character):
a = ['c', 'a', 'b', 'c', 'a', 'd', 'c', 'e', 'c']
d -> [['c'], ['a', 'b', 'c'], ['a', 'd', 'c'], ['e', 'c']]
list_of_strings = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
value = 'c'
new_list = []
temp_list = []
for item in list_of_strings:
    # Compare by value with ==; "is" tests identity and is not reliable for strings.
    if item == value:
        temp_list.append(item)
        new_list.append(temp_list[:])
        temp_list.clear()
    else:
        temp_list.append(item)
if temp_list:
    new_list.append(temp_list)
print(new_list)
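You can try the snippet below, which uses more_itertools. Note that sliced cuts the list into fixed-length chunks (here of length l.index('c') + 1), so it gives the expected result only when the separators are evenly spaced, as they happen to be in this example.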
>>> l = ['a', 'b', 'c', 'a', 'd', 'c', 'e']
>>> from more_itertools import sliced
>>> list(sliced(l,l.index('c')+1))
Output is:
[['a', 'b', 'c'], ['a', 'd', 'c'], ['e']]
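A caveat worth illustrating: as I understand it, sliced(seq, n) yields seq[0:n], seq[n:2n] and so on, regardless of where the separators are. The sketch below mimics that fixed-length chunking in plain Python (it does not call more_itertools) on an input whose separators are unevenly spaced.

```python
l = ['a', 'c', 'a', 'b', 'd', 'c', 'e']  # separators at uneven positions

n = l.index('c') + 1  # chunk length taken from the first separator: 2

# Fixed-length chunking, standing in for what sliced(l, n) is assumed to do.
chunks = [l[i:i + n] for i in range(0, len(l), n)]

print(chunks)
# [['a', 'c'], ['a', 'b'], ['d', 'c'], ['e']]
# A split after each 'c' would instead give [['a', 'c'], ['a', 'b', 'd', 'c'], ['e']].
```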
Related
How can I iterate in a list with separation of them?
I want to write for f in x: and do a certain thing, but I don't want the elements connected to each other: first do the work for x[0], then for x[1], and so on.
x = ['ABCED', 'BACF', 'BCD']
For example, my loop can't separate them by index; it just repeats the print for every q in x. How can I fix it?
for f in x:
    for q in f:
        print('hi')
The output that I expect is:
[[A,B,C,E,D],[B,A,C,F],[B,C,D]]
Calling list on a str creates a list of its characters:
out = [list(i) for i in x]
print(out)
# Output
[['A', 'B', 'C', 'E', 'D'], ['B', 'A', 'C', 'F'], ['B', 'C', 'D']]
x = ['ABCED', 'BACF', 'BCD']
new_list = [list(i) for i in x]
or:
list(map(list, x))
output:
[['A', 'B', 'C', 'E', 'D'], ['B', 'A', 'C', 'F'], ['B', 'C', 'D']]
Removing duplicate characters from a list in Python where the pattern repeats
I am monitoring a serial port that sends data that looks like this: ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e', '','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e','',''] I need to be able to convert this into: ['a','b','c','d','a','b','c','d','a','b','c','d','a','b','c','d'] So I'm removing duplicates and empty strings, but also retaining the number of times the pattern repeats itself. I haven't been able to figure it out. Can someone help?
Here's a solution using a list comprehension and itertools.zip_longest: keep an element only if it's not an empty string, and not equal to the next element. You can use an iterator to skip the first element, to avoid the cost of slicing the list.
from itertools import zip_longest

def remove_consecutive_duplicates(lst):
    ahead = iter(lst)
    next(ahead, None)  # skip the first element; the default avoids StopIteration on an empty list
    return [ x for x, y in zip_longest(lst, ahead) if x and x != y ]
Usage:
>>> remove_consecutive_duplicates([1, 1, 2, 2, 3, 1, 3, 3, 3, 2])
[1, 2, 3, 1, 3, 2]
>>> remove_consecutive_duplicates(my_list)
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
I'm assuming either that there are no duplicates separated by empty strings (e.g. 'a', '', 'a'), or that you don't want to remove such duplicates. If this assumption is wrong, then you should filter out the empty strings first:
>>> example = ['a', '', 'a']
>>> remove_consecutive_duplicates([ x for x in example if x ])
['a']
You can loop over the list and add the appropriate conditions. For the response that you are expecting, you just need to check whether the previous character is different from the current character:
current_sequence = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
sequence_list = []
for x in range(len(current_sequence)):
    if current_sequence[x]:
        # At x == 0 there is no previous element; index -1 would wrap around to the last item.
        if x == 0 or current_sequence[x] != current_sequence[x-1]:
            sequence_list.append(current_sequence[x])
print(sequence_list)
You need something like this:
li = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
new_li = []
e_ = ''
for e in li:
    if len(e) > 0 and e_ != e:
        new_li.append(e)
    e_ = e
print(new_li)
Output
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
You can use itertools.groupby. If your list is ll:
from itertools import groupby

ll = [i for i in ll if i]
out = []
for k, g in groupby(ll, key=lambda x: ord(x)):
    out.append(chr(k))
print(out)
#prints ['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', ...
from itertools import groupby
from operator import itemgetter

# data <- your data
a = [k for k, v in groupby(data) if k]  # approach 1
b = list(filter(bool, map(itemgetter(0), groupby(data))))  # approach 2
assert a == b
print(a)
Result:
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
Using set, you can remove the duplicates from the list:
data = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e', '','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
print(set(data))
Note that a set keeps only the distinct values (including the empty string) in no guaranteed order, so this does not preserve the repeating pattern that the question asks for.
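To see how set differs from collapsing consecutive runs, here is a small comparison on a shortened, made-up input (not the serial data from the question):

```python
from itertools import groupby

data = ['', 'a', 'a', '', 'b', 'b', '', 'a', 'a', '', 'b', '']

# set keeps one copy of each distinct value, in no guaranteed order.
distinct = set(data)

# groupby collapses each run of equal neighbours to its key; empty strings are dropped.
collapsed = [k for k, _ in groupby(data) if k]

print(sorted(distinct))  # ['', 'a', 'b']
print(collapsed)         # ['a', 'b', 'a', 'b']
```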
How can I copy each element of a list a distinct, specified number of times?
I am using Python 3 and I want to create a new list in which each element of the first list is repeated as many times as the corresponding number in the second list.
For example:
char = ['a', 'b', 'c']
int = [2, 4, 3]
result = ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
Thanks, all, for your help.
One-liner solution
Iterate over both lists simultaneously with zip, and create sub-lists for each element with the correct length. Join them with itertools.chain:
# from itertools import chain
list(chain(*([l]*n for l, n in zip(char, int))))
Output:
['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
char = ['a', 'b', 'c']
ints = [2, 4, 3]
Solution 1: Using numpy
import numpy as np
result = np.repeat(char, ints)
Solution 2: Pure python
result = []
for i, c in zip(ints, char):
    result.extend(c*i)
Output:
['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
Using zip
Ex:
c = ['a', 'b', 'c']
intVal = [2, 4, 3]
result = []
for i, v in zip(c, intVal):
    result.extend(list(i*v))
print(result)
Output:
['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
With for loops, very basic:
results = list()
for k, i in enumerate(integers):
    results_to_add = char[k]*i
    results.extend(results_to_add)
char = ['a', 'b', 'c']
rep = [2, 4, 3]
res = [c*i.split(",") for i,c in zip(char, rep )] # [['a', 'a'], ['b', 'b', 'b', 'b'], ['c', 'c', 'c']]
print([item for sublist in res for item in sublist]) # flattening the list
EDIT: one-liner using itertools.chain:
from itertools import chain
print(list(chain(*[c*i.split(",") for (i,c) in zip(char, rep)])))
OUTPUT:
['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
One-liner using a list comprehension and sum(list_, []):
sum([[x]*y for x,y in zip(char_, int_)], [])
>>> char_ = ['a', 'b', 'c']
>>> int_ = [2, 4, 3]
>>> print(sum([[x]*y for x,y in zip(char_, int_)], []))
['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
Alternative:
list(itertools.chain.from_iterable([[x]*y for x,y in zip(char_, int_)]))
For this small input, sum looks slightly faster than using itertools (sum(..., []) builds a new list at every step, so this may not hold for long inputs):
>>> timeit.repeat(lambda:list(itertools.chain.from_iterable([[x]*y for x,y in zip(char_, int_)])), number = 1000000)
[1.2130177360377274, 1.115080286981538, 1.1174913379945792]
>>> timeit.repeat(lambda:sum([[x]*y for x,y in zip(char_, int_)], []), number = 1000000)
[1.0470570910256356, 0.9831087450147606, 0.9912429330288433]
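Several of the answers above multiply a string (c*i) and rely on each element being a single character. A sketch that avoids that assumption, using itertools.repeat and chain.from_iterable; the names chars, counts and words are chosen for this example only.

```python
from itertools import chain, repeat

chars = ['a', 'b', 'c']
counts = [2, 4, 3]

# repeat(ch, n) yields ch exactly n times; chain.from_iterable flattens the pieces lazily.
result = list(chain.from_iterable(repeat(ch, n) for ch, n in zip(chars, counts)))
print(result)  # ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c']

# Elements longer than one character are repeated whole, not split into letters.
words = list(chain.from_iterable(repeat(w, n) for w, n in zip(['ab', 'cd'], [2, 1])))
print(words)   # ['ab', 'ab', 'cd']
```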
Python List Column Move
I'm trying to move the second value in a list to the third value in a list for each nested list. I tried the below, but it's not working as expected.
Code
List = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
print(List)
col_out = [List.pop(1) for col in List]
col_in = [List.insert(2,List) for col in col_out]
print(List)
Result
[['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]
[['a', 'b', 'c', 'd'], [...], [...]]
Desired Result
[['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
UPDATE
Based upon pynoobs comment, I came up with the following. But I'm still not there. Why is 'c' printing?
Code
List = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
col_out = [col.pop(1) for col in List for i in col]
print(col_out)
Result
['b', 'c', 'b', 'c', 'b', 'c']
[List.insert(2,List) for col in col_out]
               ^^^^ -- See below.
You are inserting an entire list as an element within the same list. Think recursion!
Also, please refrain from using state-changing expressions in list comprehension. A list comprehension should NOT modify any variables. It is bad manners!
In your case, you'd do:
lists = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
for lst in lists:
    lst[1], lst[2] = lst[2], lst[1]
print(lists)
Output:
[['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
You can do it like this:
myList = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
myOrder = [0,2,1,3]
myList = [[sublist[i] for i in myOrder] for sublist in myList]
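The pop-and-insert idea from the question can also work, provided it is applied to each inner list (not the outer one) and written as a plain loop. A minimal sketch:

```python
lists = [['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]

for lst in lists:
    # Remove the second value, then put it back at index 2 of the shortened list.
    lst.insert(2, lst.pop(1))

print(lists)
# [['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
```

Unlike the index-list approach, this mutates the inner lists in place.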