Iterate lists at intervals based on list values - python

I've been trying to accomplish this in a few different ways and just can't quite seem to get it to work for me.
I'm trying to iterate over a list in blocks, where the first index value is an integer for how many elements are in the first block. After that, another integer with n elements, and another, etc.
Example:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
I want to read 3, pull 'a', 'b', 'c' from the list and perform some operation on them.
Then return to the list at 2, pull 'd', 'e' - more operations, etc.
Or even just using the integers to split into sub-lists would work.
I'm thinking list slicing with updated [start:stop:step] variables but am having trouble pulling it together.
Any suggestions?
Can only use the standard Python library.

You could create a generator to iterate lazily on the parts of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
def parts(lst):
    idx = 0
    while idx < len(lst):
        part_length = lst[idx]  # the integer says how many items follow
        yield lst[idx + 1: idx + part_length + 1]
        idx += part_length + 1  # jump to the next length marker
for part in parts(test):
    print(part)
Output:
['a', 'b', 'c']
['d', 'e']
['f', 'g', 'h']

If your input structure is always like this you can do the following:
result = [test[i:i+j] for i, j in enumerate(test, 1) if isinstance(j, int)]
print(result)
# [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
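The "always like this" condition matters: the comprehension treats every int as a block length, so it misfires once a block itself contains an integer. A small sketch comparing it with an index-walking generator; the list `mixed` is a made-up input, not from the question:

```python
def parts(lst):
    # walk the list by index, trusting each length marker
    idx = 0
    while idx < len(lst):
        n = lst[idx]
        yield lst[idx + 1: idx + n + 1]
        idx += n + 1

# hypothetical input: the first block contains the integer 1 as data
mixed = [2, 1, 'a', 1, 'b']

by_type = [mixed[i:i + j] for i, j in enumerate(mixed, 1) if isinstance(j, int)]
by_walk = list(parts(mixed))

print(by_type)  # [[1, 'a'], ['a'], ['b']]  (the data value 1 is misread as a length)
print(by_walk)  # [[1, 'a'], ['b']]
```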

Using an iterator on the list makes this super simple. Just grab the next item which tells you how much more to grab next, and so on until the end of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
it = iter(test)
for num in it:
    print(", ".join(next(it) for _ in range(num)))
which prints:
a, b, c
d, e
f, g, h
You can also convert this to a list if you need to save the result:
>>> it = iter(test)
>>> [[next(it) for _ in range(num)] for num in it]
[['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
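A possible variant of the same iterator idea uses itertools.islice to take each block. This is only a sketch (the function name `blocks` is mine), but it has the side effect that a truncated list gives a short final block instead of an exception from next():

```python
from itertools import islice

def blocks(seq):
    it = iter(seq)
    for count in it:
        # islice consumes up to `count` items from the shared iterator
        yield list(islice(it, count))

test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
print(list(blocks(test)))      # [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
print(list(blocks([3, 'a'])))  # [['a']]  (input ended early)
```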

Related

Use list of lists created from original list to get remaining values as a list of lists

I have a list that was used to create a new list of lists (sub_lst1) and I want to use that sub_list1 to filter the remaining values in the list.
And use those remaining values to create a new list of lists (sub_lst2).
I have provided a toy example below of the problem.
I have tried the following:
lst = ['f','f','a','g','h','a','b','g','h','a','h','d','a','b']
sub_lst1 = []
sub_lst2 = []
>>> for i, v in enumerate(lst):
...     if "b" in v:
...         sub_lst1.append(lst[i-3:i+1])
>>> print(sub_lst1)
[['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
>>> for i, v in enumerate(lst):
...     if sub_lst1[0:][0:] not in v:
...         sub_lst2.append(lst[i-2:i+1])
>>> print(sub_lst2)
[[], [], ['f', 'f', 'a'],['f', 'a', 'g'], ['a', 'g', 'h'], ['g', 'h', 'a'], ['a', 'b', 'g'], ['b', 'g', 'h'], ['g', 'h', 'a'], ['h', 'a', 'h'], ['a', 'h', 'd'], ['h', 'd', 'a']]
But the desired result would be to have the two sub-lists where one sub-list has the two preceding values to 'a' and 'b' and the second sub-list has the two preceding values of 'a' where 'b' does not follow 'a'. The sub-lists would look as follows:
>>> print(sub_lst1)
[['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
>>> print(sub_lst2)
[['f', 'f', 'a'], ['g', 'h', 'a']]
I'd recommend tackling this problem by simply finding where the 'b's are and then slicing up the main list in one go, rather than doing it in two steps. For example:
lst = ['f','f','a','g','h','a','b','g','h','a','h','d','a','b']
sub_lst1 = []
sub_lst2 = []
to_find = 'b'
found_indexes = []
for i, v in enumerate(lst):
    if v == to_find:
        found_indexes.append(i)
last_idx = -1
for idx in found_indexes:
    sub_lst2.append(lst[last_idx+1:idx-3])  # everything between the previous 'b' and this group
    sub_lst1.append(lst[idx-3:idx+1])       # the three items before 'b', plus 'b' itself
    last_idx = idx
if lst[-1] != to_find:  # don't forget to check in case 'b' isn't the last entry
    sub_lst2.append(lst[last_idx+1:])
This gets the result you are looking for, assuming that if 'b' isn't the last entry, you would want sub_lst2 to include the trailing letters.
I'm sure there's a solution with superior speed that involves turning the first lst into a single string of characters and then split()ing it using 'b', but I don't think speed is your concern and this answer will work for a list with contents of any type, not just single characters.
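If it helps, the same slicing idea can be wrapped in a function. This is a rough sketch rather than a tested library routine; the name `split_around` and the `before` parameter are my own, and clamping the start index is an extra assumption so that two markers close together do not produce overlapping slices:

```python
def split_around(lst, marker, before=3):
    """Return (groups ending in marker, leftover stretches between them)."""
    with_marker, between = [], []
    last_idx = -1
    for idx, value in enumerate(lst):
        if value == marker:
            # never reach back past the previous marker
            start = max(idx - before, last_idx + 1)
            between.append(lst[last_idx + 1:start])
            with_marker.append(lst[start:idx + 1])
            last_idx = idx
    if last_idx != len(lst) - 1:
        # trailing items after the final marker
        between.append(lst[last_idx + 1:])
    return with_marker, between

lst = ['f','f','a','g','h','a','b','g','h','a','h','d','a','b']
sub_lst1, sub_lst2 = split_around(lst, 'b')
print(sub_lst1)  # [['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
print(sub_lst2)  # [['f', 'f', 'a'], ['g', 'h', 'a']]
```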

Combination of elements in a list with constraints

I am writing a python code and I need help with a task. I have a list of 8 elements
[A,B,C,D,E,F,G,H]
and I need to find all the combinations of shorter lists (4 elements) in lexicographic order such that two elements are taken from the subset A,C,E,G and the other two from B,D,F,H. I know that there is the library itertools, but I don't know how to combine its functions properly to perform this task
The wording of the question is unclear, but I think this is what you want:
array = ['f','g','d','e','c','b','h','a']
first = sorted(array[::2]) # ['c', 'd', 'f', 'h']
second = sorted(array[1::2]) # ['a', 'b', 'e', 'g']
I think this is what you want.
I need the set of all the new lists with length 4 such that the first two elements are taken from A,C,E,G and the other two are from B,D,F,H and I need them to be in lexicographic order.
We get the possible starting letters and ending letters then combine all possible pairs of each of them into all_lists:
from itertools import combinations
lst = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
starters = lst[::2] # ['A', 'C', 'E', 'G']
enders = lst[1::2] # ['B', 'D', 'F', 'H']
all_lists = []
for a in combinations(starters, 2):
    for b in combinations(enders, 2):
        all_lists.append(sorted(a + b))
print(all_lists) # Gives [['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'F'], ['A', 'B', 'C', 'H'], ['A', 'C', 'D', 'F'], ['A', 'C', 'D', 'H'], ['A', 'C', 'F', 'H'], ...
print(all_lists == sorted(all_lists)) # False now
(Updated to sort each mini-list.)
Come to think of it you could maybe do the second part with itertools.product.
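For what it's worth, here is roughly how the itertools.product version might look. It assumes, as above, that each 4-element list should itself be sorted, and it additionally sorts the outer list so the whole result is in lexicographic order:

```python
from itertools import combinations, product

lst = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
starters = lst[::2]   # ['A', 'C', 'E', 'G']
enders = lst[1::2]    # ['B', 'D', 'F', 'H']

# product pairs every 2-combination of starters with every 2-combination of enders
pairs = product(combinations(starters, 2), combinations(enders, 2))
all_lists = sorted(sorted(a + b) for a, b in pairs)

print(len(all_lists))   # 36
print(all_lists[0])     # ['A', 'B', 'C', 'D']
print(all_lists[-1])    # ['E', 'F', 'G', 'H']
```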

Removing duplicate characters from a list in Python where the pattern repeats

I am monitoring a serial port that sends data that looks like this:
['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e',
'','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
I need to be able to convert this into:
['a','b','c','d','a','b','c','d','a','b','c','d','a','b','c','d']
So I'm removing duplicates and empty strings, but also retaining the number of times the pattern repeats itself.
I haven't been able to figure it out. Can someone help?
Here's a solution using a list comprehension and itertools.zip_longest: keep an element only if it's not an empty string, and not equal to the next element. You can use an iterator to skip the first element, to avoid the cost of slicing the list.
from itertools import zip_longest
def remove_consecutive_duplicates(lst):
    ahead = iter(lst)
    next(ahead)  # advance by one so `ahead` yields each element's successor
    return [x for x, y in zip_longest(lst, ahead) if x and x != y]
Usage:
>>> remove_consecutive_duplicates([1, 1, 2, 2, 3, 1, 3, 3, 3, 2])
[1, 2, 3, 1, 3, 2]
>>> remove_consecutive_duplicates(my_list)
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd',
'e', 'a', 'b', 'c', 'd', 'e']
I'm assuming either that there are no duplicates separated by empty strings (e.g. 'a', '', 'a'), or that you don't want to remove such duplicates. If this assumption is wrong, then you should filter out the empty strings first:
>>> example = ['a', '', 'a']
>>> remove_consecutive_duplicates([ x for x in example if x ])
['a']
You can loop over the list and add the appropriate conditions. For the output you are expecting, you just need to check that the current character is non-empty and differs from the previous one:
current_sequence = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
sequence_list = []
for x in range(len(current_sequence)):
    if current_sequence[x]:
        if current_sequence[x] != current_sequence[x-1]:
            sequence_list.append(current_sequence[x])
print(sequence_list)
You need something like this:
li = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
new_li = []
e_ = ''
for e in li:
    if len(e) > 0 and e_ != e:
        new_li.append(e)
    e_ = e  # remember the previous element
print(new_li)
Output
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
You can use itertools.groupby. If your list is ll:
from itertools import groupby
ll = [i for i in ll if i]  # drop the empty strings first
out = []
for k, g in groupby(ll, key=lambda x: ord(x)):
    out.append(chr(k))
print(out)
# prints ['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', ...
from itertools import groupby
from operator import itemgetter
# data <- your data
a = [k for k, v in groupby(data) if k] # approach 1
b = list(filter(bool, map(itemgetter(0), groupby(data)))) # approach 2
assert a == b
print(a)
Result:
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
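To make the behaviour around empty strings a bit more concrete, here is a small sketch. The helper name `collapse_runs`, its flag, and the `sample` list are mine, not part of the question. Without pre-filtering, an empty string between two equal values keeps them as separate runs:

```python
from itertools import groupby

def collapse_runs(items, drop_empty_first=False):
    if drop_empty_first:
        # remove '' up front so values separated only by '' merge into one run
        items = [x for x in items if x]
    # groupby yields one key per run of equal adjacent items
    return [key for key, _ in groupby(items) if key]

sample = ['', 'a', 'a', '', 'a', 'b', 'b', '', '', 'c']
print(collapse_runs(sample))                         # ['a', 'a', 'b', 'c']
print(collapse_runs(sample, drop_empty_first=True))  # ['a', 'b', 'c']
```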
Using set removes the duplicates from the list, but note that it also discards the order and the repeated pattern, and it keeps the empty string:
data = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e',
'','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
print(set(data))

Merge lists in Python by placing every nth item from one list and others from another?

I have two lists, list1 and list2.
Here len(list2) << len(list1).
Now I want to merge both of the lists such that every nth element of final list is from list2 and the others from list1.
For example:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
Now the final list should be:
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
What is the most Pythonic way to achieve this?
I want to add all elements of list2 to the final list, final list should include all elements from list1 and list2.
Making the larger list an iterator makes it easy to take multiple elements for each element of the smaller list:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
iter1 = iter(list1)
res = []
for x in list2:
    res.extend([next(iter1) for _ in range(n - 1)])
    res.append(x)
res.extend(iter1)
>>> res
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
This avoids insert, which can be expensive for large lists because each call has to shift every element after the insertion point.
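As a reusable sketch of the same idea (the function name `merge_every_nth` is made up here), itertools.islice can take the n - 1 items. One difference from calling next() directly is that if list1 runs out early, islice simply returns fewer items rather than raising StopIteration:

```python
from itertools import islice

def merge_every_nth(main, extra, n):
    it = iter(main)
    result = []
    for item in extra:
        result.extend(islice(it, n - 1))  # up to n-1 items from the long list
        result.append(item)               # then one item from the short list
    result.extend(it)                     # whatever is left of the long list
    return result

list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
print(merge_every_nth(list1, list2, 3))
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
```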
To preserve the original list, you could try the following:
import copy
result = copy.deepcopy(list1)
index = n - 1
for elem in list2:
    result.insert(index, elem)
    index += n
print(result)
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
Using the itertools module and the supplementary more_itertools package, you can construct an iterable solution a couple different ways. First the imports:
import itertools as it, more_itertools as mt
This first one seems the cleanest, but it relies on more_itertools.chunked().
it.chain(*mt.roundrobin(mt.chunked(list1, n-1), list2))
This one uses only more_itertools.roundrobin(), whose implementation is taken from the itertools documentation, so if you don't have access to more_itertools you can just copy it yourself.
mt.roundrobin(*([iter(list1)]*(n-1) + [list2]))
Alternatively, this does nearly the same thing as the first sample without using any more_itertools-specific functions. Basically, grouper can replace chunked, but it will add Nones at the end in some cases, so I wrap it in it.takewhile to remove those. Naturally, if you are using this on lists which actually do contain None, it will stop once it reaches those elements, so be careful.
it.takewhile(lambda o: o is not None,
    it.chain(*mt.roundrobin(mt.grouper(n-1, list1), list2))
)
I tested these on Python 3.4, but I believe these code samples should also work in Python 2.7.
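If the lists might legitimately contain None, one way around the takewhile caveat is to pad with a private sentinel object instead. The following is a standard-library-only sketch, not the more_itertools implementation; the names `interleave_nth` and `_MISSING` are mine, and it assumes n >= 2 and Python 3.5+:

```python
from itertools import chain, zip_longest

_MISSING = object()  # unique sentinel, cannot collide with real data such as None

def interleave_nth(main, extra, n):
    # groups of n-1 items from main, the last one padded with the sentinel
    groups = zip_longest(*[iter(main)] * (n - 1), fillvalue=_MISSING)
    # pair each group with one item from extra; the shorter side is padded
    paired = zip_longest(groups, extra, fillvalue=_MISSING)
    flat = chain.from_iterable(
        (*group, item) if group is not _MISSING else (item,)
        for group, item in paired
    )
    # drop only the padding, keeping any real None values
    return [x for x in flat if x is not _MISSING]

list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
print(interleave_nth(list1, list2, 3))
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
print(interleave_nth([None, 'b', 'c'], ['x'], 3))
# [None, 'b', 'x', 'c']
```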
What about the below solution? However I don't have a better one...
>>> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> list2 = ['x', 'y']
>>> n = 2
>>> for i in range(len(list2)):
...     list1.insert(n, list2[i])
...     n += 3
...
>>> list1
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
n starts at 2 because the third element of a list has index 2, since indexing starts at 0. Note that this modifies list1 in place.
list(list1[i-1-min((i-1)//n, len(list2))] if i % n or (i-1)//n >= len(list2) else list2[(i-1)//n] for i in range(1, len(list1)+len(list2)+1))
Definitely not pythonic, but I thought it might be fun to do it in a one-liner. More readable (really?) version:
list(
    list1[i-1-min((i-1)//n, len(list2))]
    if i % n or (i-1)//n >= len(list2)
    else
    list2[(i-1)//n]
    for i in range(1, len(list1)+len(list2)+1)
)
Basically, some tinkering around with indexes and determining which list and which index to take next element from.
Yet another way, calculating the slice steps:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
res = []
m = n - 1
start, end = 0, m
for x in list2:
    res.extend(list1[start:end])
    res.append(x)
    start, end = end, end + m
res.extend(list1[start:])
>>> res
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
new = list1[:]
for index, item in enumerate(list2):
    pos = n * (index + 1) - 1
    new[pos:pos] = [item]  # wrap in a list so multi-character strings are inserted whole
print(new)
I admire @David Z's use of more_itertools. Updates to the tools can simplify the solution:
import more_itertools as mit
n = 3
groups = mit.windowed(list1, n-1, step=n-1)
list(mit.flatten(mit.interleave_longest(groups, list2)))
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
Summary: list2 is being interleaved into groups from list1 and finally flattened into one list.
Notes
groups: n-1 size sliding windows, e.g. [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h')]
interleave_longest is presently equivalent to roundrobin
None is the default fillvalue. Optionally remove with filter(None, ...)
Here is another solution: slice list1 at the correct index, then insert the element of list2 between the two halves.
>>> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> list2 = ['x', 'y']
>>> n = 3
>>> for i in range(len(list2)):
...     list1 = list1[:n*(i+1) - 1] + [list2[i]] + list1[n*(i+1)-1:]
...
>>> list1
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']

Keep strings that occur N times or more

I have a list that is
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
And I used Counter from collections on this list to get the result:
from collections import Counter
counts = Counter(mylist)
#Counter({'a': 3, 'c': 2, 'b': 2, 'd': 1})
Now I want to subset this so that I have all elements that occur some number of times, for example: 2 times or more - so that the output looks like this:
['a', 'b', 'c']
This seems like it should be a simple task - but I have not found anything that has helped me so far.
Can anyone suggest somewhere to look? I am also not attached to using Counter if I have taken the wrong approach. I should note I am new to python so I apologise if this is trivial.
[s for s, c in counts.items() if c >= 2]
# => ['a', 'b', 'c']  (Python 3; on Python 2 use counts.iteritems())
Try this...
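def get_duplicatesarrval(arrval):
    dup_array = arrval[:]
    # remove one occurrence of each distinct value; what remains occurred at least twice
    for i in set(arrval):
        dup_array.remove(i)
    return list(set(dup_array))
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(get_duplicatesarrval(mylist))
Result (order may vary, since it comes from a set):
['a', 'b', 'c']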
The usual way would be to use a list comprehension as @Adaman does.
In the special case of 2 or more, you can also subtract one Counter from another
>>> counts = Counter(mylist) - Counter(set(mylist))
>>> list(counts)
['a', 'b', 'c']
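The subtraction trick only covers "two or more". For an arbitrary threshold, a small helper along these lines might do; the name `at_least` is made up, and the output order relies on Python 3.7+, where a Counter keeps first-seen order:

```python
from collections import Counter

def at_least(items, n):
    """Return the distinct items that occur n or more times."""
    counts = Counter(items)
    return [item for item, c in counts.items() if c >= n]

mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(at_least(mylist, 2))  # ['a', 'b', 'c']
print(at_least(mylist, 3))  # ['a']
```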
from itertools import groupby
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
# groupby only groups adjacent equal items, so sort mylist first if it is not already grouped
res = [i for i, j in groupby(mylist) if len(list(j)) >= 2]
print(res)
['a', 'b', 'c']
I think above mentioned answers are better, but I believe this is the simplest method to understand:
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
newlist=[]
newlist.append(mylist[0])
for i in mylist:
    if i in newlist:
        continue
    else:
        newlist.append(i)
print(newlist)
['a', 'b', 'c', 'd']
