This question already has answers here:
How to split a list-of-strings into sublists-of-strings by a specific string element
(6 answers)
Closed 9 months ago.
I am trying to split a list into sublists wherever it contains a certain element, such as '----'.
For example, if I have a list:
['a', 'b', 'c', '----', 'd', 'e'], then the resulting list should be
[['a', 'b', 'c'], ['d', 'e']]
I am new to Python and struggling with this. This is the code I wrote for the problem, but it's not working:
start_index = 0
end_index = 0
new_list = []
for character in range(0, len(characters_list) - 1):
    if characters_list[character] == '----':
        end_index = character - 1
        if character == characters_list.index('----'):
            start_index = 0
        else:
            start_index = character + 1
        for char in range(start_index, end_index):
            new_list.append(characters_list[char])
Use groupby from itertools. It groups consecutive items of the list into sublists according to the criterion given by the key function. Use match (a boolean value) to filter the sublists.
import itertools as it
characters_list = ['a', 'b', 'c', '----', 'd', 'e']  # example list from the question
new_lst = list(list(i) for match, i in it.groupby(characters_list, lambda p: p == '----') if not match)
print(new_lst)
To make clear how the key works, here is an example of grouping with the opposite condition:
list(list(i) for match, i in it.groupby(characters_list, lambda p: p != '----') if match)
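To see what the filter is acting on, it can help to print the raw (key, group) pairs that groupby produces before anything is discarded. This is only an illustrative sketch; the name pairs is introduced here and the list is the example from the question.

```python
import itertools as it

characters_list = ['a', 'b', 'c', '----', 'd', 'e']

# Materialise each group so it can be printed; the key is True for delimiter runs
pairs = [(match, list(group))
         for match, group in it.groupby(characters_list, lambda p: p == '----')]
print(pairs)
# [(False, ['a', 'b', 'c']), (True, ['----']), (False, ['d', 'e'])]
```

Keeping the pairs whose key is False gives the sublists; keeping the True ones would give the delimiter runs instead.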
A more intuitive approach
lst = ['a', 'b', 'c', '----', 'd', 'e', '----', '1']
out = [[]]
for term in lst:
    if term != '----':
        out[-1].append(term)
    else:
        out.append([])
print(out)
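The same loop can be wrapped in a function, which also makes its edge-case behaviour easy to check. The helper name split_on is hypothetical and not part of the answer. Note that, unlike the groupby version with its filter, this loop keeps empty sublists for consecutive or trailing delimiters.

```python
def split_on(items, delimiter):
    """Split items into sublists at every occurrence of delimiter."""
    out = [[]]
    for term in items:
        if term != delimiter:
            out[-1].append(term)   # extend the current sublist
        else:
            out.append([])         # delimiter found: start a new sublist
    return out

print(split_on(['a', 'b', 'c', '----', 'd', 'e'], '----'))
# [['a', 'b', 'c'], ['d', 'e']]
print(split_on(['a', '----', '----', 'b', '----'], '----'))
# [['a'], [], ['b'], []]
```

If the empty sublists are unwanted, they can be filtered out afterwards with a comprehension such as [part for part in result if part].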
Related
I have a list of lists in which all the lists have the same length. I need to find their common elements while maintaining the order of occurrence.
For example:
Suppose the list of lists is [['a','e','d','c','f'],['e','g','a','d','c'],['c','a','h','e','j']].
The output list should contain ['a','e','c']. Priority should be given to elements that occur earlier in most of the lists. In this example, 'a' occurs earliest, then 'e', and so on.
How to proceed with this?
You could find the common items first, then sort them:
from collections import defaultdict
data = [['a','e','d','c','f'],['e','g','a','d','c'],['c','a','h','e','j']]
common = set(data[0])
for line in data:
    common = common.intersection(set(line))
res = defaultdict(int)
for line in data:
    for idx, item in enumerate(line):
        if item in common:
            res[item] += idx
[item[0] for item in sorted(res.items(), key=lambda x: x[1])]
output:
['a', 'e', 'c']
Here's a quick solution that I managed to get working:
data = [['a', 'e', 'd', 'c', 'f'],
['e', 'g', 'a', 'd', 'c'], ['c', 'a', 'h', 'e', 'j']]
# count number of times each character appears
char_count = {}
for arr in data:
    for char in arr:
        if char not in char_count:
            char_count[char] = 1
        else:
            char_count[char] += 1
# select characters that appear multiple times
common_chars = [i[0] for i in char_count.items() if i[1] > 1]
# remove characters that are not present in all lists
# (loop over a copy: removing from a list while iterating over it skips items)
for char in common_chars[:]:
    count = 0
    for arr in data:
        if char in arr:
            count += 1
    if count < len(data):
        common_chars.remove(char)
# final result with common characters
print(common_chars)
Resulting output:
['a', 'e', 'c']
Probably not the most efficient solution if you're working with lots of data though.
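For larger inputs, a more compact variant might compute the intersection with sets and rank the survivors by the sum of their positions. This is a sketch only: the function name common_in_order is hypothetical, it assumes no element repeats within a single list, and it breaks ties alphabetically, which the question does not specify.

```python
def common_in_order(lists):
    """Return elements present in every list, earliest overall positions first."""
    common = set(lists[0]).intersection(*lists[1:])
    # One position lookup table per list, so each rank costs O(len(lists))
    positions = [{item: idx for idx, item in enumerate(lst)} for lst in lists]
    # Rank by summed position; the element itself is an assumed tie-breaker
    return sorted(common, key=lambda item: (sum(p[item] for p in positions), item))

data = [['a', 'e', 'd', 'c', 'f'],
        ['e', 'g', 'a', 'd', 'c'],
        ['c', 'a', 'h', 'e', 'j']]
print(common_in_order(data))
# ['a', 'e', 'c']
```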
This question already has answers here:
How to get all subsets of a set? (powerset)
(32 answers)
Closed 1 year ago.
I am not sure of the technical terminology for what I am trying to do, but this is the gist of it. I have the following list:
x = ['a', 'b', 'c']
I want to create a new list y where len(y) = 2 ** len(x) such that:
y = ['∅', 'a', 'b', 'c', 'a,b', 'a,c', 'b,c', 'a,b,c']
I am unsure of what operations to use when looping through x to create the desired list y.
Although this is much less efficient than itertools, if you are not allowed to use libraries, you could make a recursive function to produce the power-set and assemble the strings using join() in a list comprehension:
def powerSet(L):
    return [[]] if not L else [c for p in powerSet(L[1:]) for c in (p,L[:1]+p)]
x = ['a','b','c']
y = [",".join(s) or "ø" for s in powerSet(x)]
print(y)
['ø', 'a', 'b', 'a,b', 'c', 'a,c', 'b,c', 'a,b,c']
You can also do this directly in an iterative function that extends all previous combinations with each letter in the list:
def allCombos(L):
    result = [""]
    for c in L:
        result.extend([f"{r},{c}" if r else c for r in result])
    result[0] = "ø"
    return result
print(allCombos(x))
['ø', 'a', 'b', 'a,b', 'c', 'a,c', 'b,c', 'a,b,c']
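For comparison, the itertools route mentioned above could look roughly like this. The function name power_set_labels and its empty parameter are made up for illustration; the combination of chain.from_iterable and combinations is the recipe commonly used for power sets. It yields subsets grouped by size, which matches the order asked for in the question.

```python
from itertools import chain, combinations

def power_set_labels(items, empty="∅"):
    """Return comma-joined labels for every subset of items, smallest subsets first."""
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    # The empty tuple joins to "", so fall back to the placeholder label
    return [",".join(s) or empty for s in subsets]

print(power_set_labels(['a', 'b', 'c']))
# ['∅', 'a', 'b', 'c', 'a,b', 'a,c', 'b,c', 'a,b,c']
```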
This question already has answers here:
Python split for lists
(6 answers)
Closed 2 years ago.
I want to create sub-lists from a list which has many repeating elements, i.e.
l = ['a', 'b', 'c', 'c', 'b', 'a', 'b', 'c', 'b', 'a']
The list should be split wherever an 'a' occurs (preferably removing the 'a', but that is not a must).
As such:
l = [ ['b', 'c', 'c', 'b'], ['b', 'c', 'b'] ]
I have tried new_list = [x.split('a')[-1] for x in l] but I am not getting the desired "New list" effect.
When you write,
new_list = [x.split('a')[-1] for x in l]
you are essentially performing,
result = []
for elem in l:
    result.append(elem.split('a')[-1])
That is, you're splitting each string contained in l on the letter 'a', and collecting the last element of each of the strings into the result.
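Running the comprehension on the list from the question makes the effect visible: each one-character string is split on its own, so every 'a' turns into an empty string and the list is never actually divided.

```python
l = ['a', 'b', 'c', 'c', 'b', 'a', 'b', 'c', 'b', 'a']

# 'a'.split('a') gives ['', ''], so its last piece is ''; 'b'.split('a') gives ['b']
new_list = [x.split('a')[-1] for x in l]
print(new_list)
# ['', 'b', 'c', 'c', 'b', '', 'b', 'c', 'b', '']
```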
Here's one possible implementation of the mechanic you're looking for:
def extract_parts(my_list, delim):
    # Locate the indices of all instances of ``delim`` in ``my_list``
    indices = [i for i, x in enumerate(my_list) if x == delim]
    # Collect each end-exclusive sublist bounded by each pair of consecutive indices
    sublists = []
    for i in range(len(indices)-1):
        part = my_list[indices[i]+1:indices[i+1]]
        sublists.append(part)
    return sublists
Using this function, we have
>>> l = ['a', 'b', 'c', 'c', 'b', 'a', 'b', 'c', 'b', 'a']
>>> extract_parts(l, 'a')
[['b', 'c', 'c', 'b'], ['b', 'c', 'b']]
You can use zip and enumerate to do that. Build a list of the indices at which to split, then slice the list at those points.
size = len(l)
# index just past each occurrence of 'a'
id_list = [idx + 1 for idx, val in
           enumerate(l) if val == 'a']
result = [l[i:j] for i, j in zip([0] + id_list, id_list +
          ([size] if id_list[-1] != size else []))]
The following groupby approach does not include the delimiter:
import itertools
lst = ['a', 'b', 'c', 'c', 'b', 'a', 'b', 'c', 'b', 'a']
delimiter = lst[0]
li=[list(value) for key,value in itertools.groupby(lst, lambda e: e == delimiter) if not key]
print(li)
Explanation: groupby starts a new group each time the key changes.
Key    Value
True   itertools._grouper object pointing to group 'a'
False  itertools._grouper object pointing to group 'b', 'c', 'c', 'b'
True   itertools._grouper object pointing to group 'a'
False  itertools._grouper object pointing to group 'b', 'c', 'b'
True   itertools._grouper object pointing to group 'a'
The if condition keeps only the groups whose key is False; each of those itertools._grouper objects is then passed to list.
Create an array of counters, one for each element you want to split at, then write a condition in this fashion:
l = ['a', 'b', 'c', 'c', 'b', 'a', 'b', 'c', 'b', 'a']
counters = [0,0,0] #to count instances
index = 0 #index of l
startIndex = 0 #where to start split
endIndex = 0 #where to end split
splitLists = [] #container for splits
for element in l:
    if element == 'a': #find 'a'
        counters[0] += 1 #increase counter
        if counters[0] == 1: #if first instance
            startIndex = index + 1 #start split after
        if counters[0] == 2: #if second instance
            endIndex = index #end split here
            splitList = l[startIndex:endIndex]
            counters[0] = 1 #make second starting location
            startIndex = index + 1
            splitLists.append(splitList) #append to main list of lists
    index += 1
print(splitLists)
So basically you are finding the start and end index of the matching pattern within the list. You use these to split the list, and append this list into a main list of lists (2d list).
Trying to implement and form a very simple algorithm. This algorithm takes in a sequence of letters or numbers. It first creates an array (list) out of each character or digit. Then it checks each individual character compared with the following character in the sequence. If the two are equal, it removes the character from the array.
For example the input: 12223344112233 or AAAABBBCCCDDAAABB
And the output should be: 1234123 or ABCDAB
I believe the issue stems from the fact I created a counter and increment each loop. I use this counter for my comparison using the counter as an index marker in the array. Although, each time I remove an item from the array it changes the index while the counter increases.
Here is the code I have:
def sort(i):
    iter = list(i)
    counter = 0
    for item in iter:
        if item == iter[counter + 1]:
            del iter[counter]
        counter = counter + 1
    return iter
You're iterating over the same list that you are deleting from. That usually causes behaviour that you would not expect. Make a copy of the list & iterate over that.
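A small experiment shows the kind of surprise this causes. The variable names below are illustrative only: removing items from the list being looped over shifts the remaining items left, so the loop skips one, while looping over a copy visits every item.

```python
in_place = [1, 1, 1, 2]
for n in in_place:          # iterating over the list that is being modified
    if n == 1:
        in_place.remove(n)
print(in_place)
# [1, 2]  (one of the 1s was skipped)

via_copy = [1, 1, 1, 2]
for n in via_copy[:]:       # iterating over a shallow copy instead
    if n == 1:
        via_copy.remove(n)
print(via_copy)
# [2]
```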
However, there is a simpler solution: Use itertools.groupby
import itertools
def sort(i):
    return [x for x, _ in itertools.groupby(list(i))]
print(sort('12223344112233'))
Output:
['1', '2', '3', '4', '1', '2', '3']
A few alternatives, all using s = 'AAAABBBCCCDDAAABB' as setup:
>>> import re
>>> re.sub(r'(.)\1+', r'\1', s)
'ABCDAB'
>>> p = None
>>> [c for c in s if p != (p := c)]
['A', 'B', 'C', 'D', 'A', 'B']
>>> [c for c, p in zip(s, [None] + list(s)) if c != p]
['A', 'B', 'C', 'D', 'A', 'B']
>>> [c for i, c in enumerate(s) if not s.endswith(c, None, i)]
['A', 'B', 'C', 'D', 'A', 'B']
The other answers are good. This one iterates over the list in reverse to avoid skipping items, and uses the look-ahead style of algorithm the OP described. Quick note to the OP: this really isn't a sorting algorithm.
def sort(input_str: str) -> str:
    as_list = list(input_str)
    # Walk backwards so a deletion never shifts the indices still to be visited
    for idx in range(len(as_list) - 1, 0, -1):
        if as_list[idx] == as_list[idx - 1]:
            del as_list[idx]
    return ''.join(as_list)
This question already has answers here:
Sorting a List by frequency of occurrence in a list
(7 answers)
Closed 4 years ago.
Input:
"tree"
Output:
"eert"
Explanation:
'e' appears twice while 'r' and 't' both appear once.
So 'e' must appear before both 'r' and 't'. Therefore "eetr" is also a valid answer.
I tried something like this :
class Solution(object):
    def frequencySort(self, s):
        """
        :type s: str
        :rtype: str
        """
        has = dict()
        l = list()
        for c in s:
            if c not in has:
                has[c] = 1
            else:
                has[c] += 1
        for k in sorted(has, key=has.get, reverse=True):
            for i in range(has[k]):
                l.extend(k)
        return "".join(l)
but it's O(n * m), where
n = length of the string, m = maximum number of occurrences of a character.
How can I improve this to O(n)?
Is there a reason you cannot use the built in sort with a lambda key?
>>> a = 'aabbbcccccd'
>>> sorted(a, key=lambda c: a.count(c))
['d', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'c']
>>> sorted(a, key=lambda c: a.count(c), reverse=True)
['c', 'c', 'c', 'c', 'c', 'b', 'b', 'b', 'a', 'a', 'd']
>>> ''.join(sorted(a, key=lambda c: a.count(c), reverse=True))
'cccccbbbaad'
I believe Python's sort is O(n log n), but calling count for every character makes this O(n^2).
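One way to avoid the repeated count calls, and to get close to the linear time the question asks for, is to count once with collections.Counter and then place characters into buckets indexed by frequency. This is a sketch under the assumption that any order among equally frequent characters is acceptable; the function name frequency_sort is illustrative.

```python
from collections import Counter

def frequency_sort(s: str) -> str:
    """Return the characters of s ordered from most to least frequent."""
    counts = Counter(s)                         # single O(n) counting pass
    # buckets[k] holds the characters that occur exactly k times
    buckets = [[] for _ in range(len(s) + 1)]
    for char, count in counts.items():
        buckets[count].append(char)
    parts = []
    for count in range(len(s), 0, -1):          # highest frequency first
        for char in buckets[count]:
            parts.append(char * count)
    return "".join(parts)

print(frequency_sort("tree"))
# eetr
```

Because a character can occur at most len(s) times, the bucket list has at most n + 1 entries, so no comparison sort is needed.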