Get indexes that list items start to change - python

I have a list of strings in which the value changes after several consecutive repetitions. Let me explain with an example. If I have this list:
lst = ['A', 'A', 'A', 'A', 'F', 'F', 'F', 'F', 'D', 'G', 'G', 'G', 'W']
I want to get a list of the indexes at which the elements start to change (including the first index), so the output should look like this:
out = [0, 4, 8, 9, 12]
How can I do this in the best possible way?

[i for i, z in enumerate(lst) if i == 0 or z != lst[i - 1]]
I haven't tested this, but I believe it should work.

Here's a for loop that can get the job done for you. It iterates through the list and checks to see if each value it contains is different from the previous one it identified. If so, it adds the index to a list and updates its curr_val.
lst = ['A', 'A', 'A', 'A', 'F', 'F', 'F', 'F', 'D', 'G', 'G', 'G', 'W']
def find_change(lst):
    curr_val = object()  # sentinel that never equals a real element
    indices = []
    for i, v in enumerate(lst):
        if curr_val != v:
            indices.append(i)
            curr_val = v
    return indices

i = find_change(lst)
print(i)  # [0, 4, 8, 9, 12]
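As a cross-check on the two answers above, here is a minimal sketch that pairs each element with its successor using zip. The function name change_indexes is hypothetical and not part of the original answers.

```python
def change_indexes(items):
    """Return the indexes at which a new run of equal values starts."""
    if not items:
        return []
    # Index 0 always starts the first run; after that, compare each
    # element with the one directly before it.
    pairs = zip(items, items[1:])
    return [0] + [i for i, (prev, cur) in enumerate(pairs, start=1) if prev != cur]


lst = ['A', 'A', 'A', 'A', 'F', 'F', 'F', 'F', 'D', 'G', 'G', 'G', 'W']
print(change_indexes(lst))  # [0, 4, 8, 9, 12]
```

For the sample list the last run ('W') starts at index 12, since the three 'G' values occupy indexes 9 to 11.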

Related

Use list of lists created from original list to get remaining values as a list of lists

I have a list that was used to create a new list of lists (sub_lst1) and I want to use that sub_list1 to filter the remaining values in the list.
And use those remaining values to create a new list of lists (sub_lst2).
I have provided a toy example below of the problem.
I have tried the following:
lst = ['f','f','a','g','h','a','b','g','h','a','h','d','a','b']
sub_lst1 = []
sub_lst2 = []
>>> for i, v in enumerate(lst):
...     if "b" in v:
...         sub_lst1.append(lst[i-3:i+1])
>>> print(sub_lst1)
[['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
>>> for i, v in enumerate(lst):
...     if sub_lst1[0:][0:] not in v:
...         sub_lst2.append(lst[i-2:i+1])
>>> print(sub_lst2)
[[], [], ['f', 'f', 'a'], ['f', 'a', 'g'], ['a', 'g', 'h'], ['g', 'h', 'a'], ['a', 'b', 'g'], ['b', 'g', 'h'], ['g', 'h', 'a'], ['h', 'a', 'h'], ['a', 'h', 'd'], ['h', 'd', 'a']]
But the desired result is two lists of sub-lists. In the first, each sub-list holds an 'a' followed by 'b', together with the two values preceding the 'a'. In the second, each sub-list holds an 'a' that is not followed by 'b', together with its two preceding values. The sub-lists would look as follows:
>>> print(sub_lst1)
[['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
>>> print(sub_lst2)
[['f', 'f', 'a'], ['g', 'h', 'a']]
I'd recommend tackling this problem by simply finding where the 'b's are and then slicing up the main list in one go, rather than doing it in two steps. For example:
lst = ['f','f','a','g','h','a','b','g','h','a','h','d','a','b']
sub_lst1 = []
sub_lst2 = []
to_find = 'b'
found_indexes = []
for i, v in enumerate(lst):
    if v == to_find:
        found_indexes.append(i)
last_idx = -1
for idx in found_indexes:
    sub_lst2.append(lst[last_idx+1:idx-3])
    sub_lst1.append(lst[idx-3:idx+1])
    last_idx = idx
if lst[-1] != to_find:  # don't forget to check in case 'b' isn't the last entry
    sub_lst2.append(lst[last_idx+1:])
This gets the result you are looking for, assuming that if 'b' isn't the last entry, you would want sub_lst2 to include the trailing letters.
I'm sure there's a solution with superior speed that involves turning the first lst into a single string of characters and then split()ing it using 'b', but I don't think speed is your concern and this answer will work for a list with contents of any type, not just single characters.
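The same slicing idea can be wrapped in a function so the result is easy to check. This is only a sketch: the name split_around_marker, the before parameter, and the clamp that stops a slice from reaching back past the previous marker are assumptions added here, not part of the answer above.

```python
def split_around_marker(items, marker, before=3):
    """Split items into slices that end at marker and the slices in between."""
    with_marker, without_marker = [], []
    last_idx = -1
    for idx, value in enumerate(items):
        if value != marker:
            continue
        # Assumption: never reach back past the previous marker (or index 0).
        start = max(idx - before, last_idx + 1)
        without_marker.append(items[last_idx + 1:start])
        with_marker.append(items[start:idx + 1])
        last_idx = idx
    if last_idx != len(items) - 1:  # trailing items after the final marker
        without_marker.append(items[last_idx + 1:])
    return with_marker, without_marker


lst = ['f', 'f', 'a', 'g', 'h', 'a', 'b', 'g', 'h', 'a', 'h', 'd', 'a', 'b']
sub_lst1, sub_lst2 = split_around_marker(lst, 'b')
print(sub_lst1)  # [['g', 'h', 'a', 'b'], ['h', 'd', 'a', 'b']]
print(sub_lst2)  # [['f', 'f', 'a'], ['g', 'h', 'a']]
```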

Iterate lists at intervals based on list values

I've been trying to accomplish this in a few different ways and just can't quite seem to get it to work for me.
I'm trying to iterate over a list in blocks, where the first index value is an integer for how many elements are in the first block. After that, another integer with n elements, and another, etc.
Example:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
I want to read 3, pull 'a', 'b', 'c' from the list and perform some operation on them.
Then return to the list at 2, pull 'd', 'e' - more operations, etc.
Or even just using the integers to split into sub-lists would work.
I'm thinking list slicing with updated [start:stop:step] variables but am having trouble pulling it together.
Any suggestions?
Can only use the standard Python library.
You could create a generator to iterate lazily on the parts of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']

def parts(lst):
    idx = 0
    while idx < len(lst):
        part_length = lst[idx]
        yield lst[idx+1: idx + part_length + 1]
        idx += part_length + 1

for part in parts(test):
    print(part)
Output:
['a', 'b', 'c']
['d', 'e']
['f', 'g', 'h']
If your input structure is always like this you can do the following:
result = [test[i:i+j] for i, j in enumerate(test, 1) if isinstance(j, int)]
print(result)
# [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
Using an iterator on the list makes this super simple. Just grab the next item which tells you how much more to grab next, and so on until the end of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
it = iter(test)
for num in it:
    print(", ".join(next(it) for _ in range(num)))
which prints:
a, b, c
d, e
f, g, h
You can also convert this to a list if you need to save the result:
>>> it = iter(test)
>>> [[next(it) for _ in range(num)] for num in it]
[['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
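If the list might end before a block is complete, next(it) raises StopIteration in the middle of the expression. A possible variation, sketched here with itertools.islice and a hypothetical function name blocks, checks the block length explicitly instead.

```python
from itertools import islice


def blocks(seq):
    """Yield sub-lists whose lengths are given by the integers in seq."""
    it = iter(seq)
    for count in it:
        # islice takes up to `count` items and simply stops early
        # if the underlying iterator runs out.
        block = list(islice(it, count))
        if len(block) != count:
            raise ValueError(f"expected {count} items, got {len(block)}")
        yield block


test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
print(list(blocks(test)))
# [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
```

This still uses only the standard library, as the question requires.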

Removing duplicate characters from a list in Python where the pattern repeats

I am monitoring a serial port that sends data that looks like this:
['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e',
'','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
I need to be able to convert this into:
['a','b','c','d','a','b','c','d','a','b','c','d','a','b','c','d']
So I'm removing duplicates and empty strings, but also retaining the number of times the pattern repeats itself.
I haven't been able to figure it out. Can someone help?
Here's a solution using a list comprehension and itertools.zip_longest: keep an element only if it's not an empty string, and not equal to the next element. You can use an iterator to skip the first element, to avoid the cost of slicing the list.
from itertools import zip_longest
def remove_consecutive_duplicates(lst):
    ahead = iter(lst)
    next(ahead, None)  # skip the first element; the default avoids StopIteration on an empty list
    return [x for x, y in zip_longest(lst, ahead) if x and x != y]
Usage:
>>> remove_consecutive_duplicates([1, 1, 2, 2, 3, 1, 3, 3, 3, 2])
[1, 2, 3, 1, 3, 2]
>>> remove_consecutive_duplicates(my_list)
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd',
'e', 'a', 'b', 'c', 'd', 'e']
I'm assuming either that there are no duplicates separated by empty strings (e.g. 'a', '', 'a'), or that you don't want to remove such duplicates. If this assumption is wrong, then you should filter out the empty strings first:
>>> example = ['a', '', 'a']
>>> remove_consecutive_duplicates([ x for x in example if x ])
['a']
You can loop over the list and add the appropriate conditions. For the output you are expecting, you just need to check that the current character is non-empty and differs from the previous one:
current_sequence = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
sequence_list = []
for x in range(len(current_sequence)):
    if current_sequence[x]:
        # at x == 0 there is no previous element; index -1 would wrap to the last item
        if x == 0 or current_sequence[x] != current_sequence[x-1]:
            sequence_list.append(current_sequence[x])
print(sequence_list)
You need something like this:
li = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
new_li = []
e_ = ''
for e in li:
    if len(e) > 0 and e_ != e:
        new_li.append(e)
    e_ = e
print(new_li)
Output
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
You can use itertools.groupby. If your list is ll:
from itertools import groupby
ll = [i for i in ll if i]
out = []
for k, g in groupby(ll, key=lambda x: ord(x)):
    out.append(chr(k))
print(out)
# prints ['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', ...
from itertools import groupby
from operator import itemgetter
# data <- your data
a = [k for k, v in groupby(data) if k] # approach 1
b = list(filter(bool, map(itemgetter(0), groupby(data)))) # approach 2
assert a == b
print(a)
Result:
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
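The groupby answers differ in whether empty strings are dropped before or after the runs are collapsed, which matters when two equal runs are separated only by empty strings. A small sketch with a shortened, made-up sample (not the serial-port data from the question):

```python
from itertools import groupby

sample = ['', 'a', 'a', '', 'a', 'b', 'b', '']

# Collapse runs first, then drop the empty-string runs:
# the '' between the two 'a' runs keeps them apart.
runs_then_filter = [key for key, _ in groupby(sample) if key]

# Drop empty strings first, then collapse runs:
# the two 'a' runs become adjacent and merge into one.
filter_then_runs = [key for key, _ in groupby(x for x in sample if x)]

print(runs_then_filter)  # ['a', 'a', 'b']
print(filter_then_runs)  # ['a', 'b']
```

For the data in the question both orders give the same result, because no two equal runs are adjacent once the empty strings are ignored.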
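Using set you can remove duplicates from the list, but note that a set keeps each value only once and has no guaranteed order, so it does not preserve the repeating pattern asked for: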
data = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e',
'','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
'','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
'','','e','e','e','e','e','e','','','a','a','a','a','a','a',
'','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
'','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
print(set(data))
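One caveat worth checking before relying on set here: it keeps each distinct value once, regardless of how often the pattern repeats. A minimal sketch with a shortened, made-up sample:

```python
sample = ['', 'a', 'a', '', 'b', 'b', '', 'a', 'a', '', 'b', 'b']

unique = set(sample)
# Sorting only makes the printout deterministic; the set itself is unordered.
print(sorted(unique))  # ['', 'a', 'b']
```

The pattern 'a', 'b' occurs twice in the sample, but the set contains each letter only once, and the empty string is still present.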

Remove certain indexes in a list

Suppose I have a list filled with indexes to remove
remove = [0, 2, 4, 5, 7, 9, 10, 11]
Then I have another list of lists, such as
l = [['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'], ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']]
I want to remove the values at the indexes in remove
If you don't have to do this in place, you can construct new lists based on the index:
[[v for i, v in enumerate(s) if i not in remove] for s in l]
# [['b', 'd', 'g', 'i'], ['b', 'd', 'g', 'i']]
If you perform a step by step execution, the problem will become evident.
As you remove elements, the position of the following elements changes. For example, if you remove element 0 from a list, what was element 1 will become element 0.
If you want to stick with the current approach, just traverse the indices in reverse order (you don't need the values, just use a range).
If you don't want list comprehension, you can use a couple of loops like so:
for x in remove[::-1]:  # remove is sorted ascending, so this deletes from the highest index down
    for sub in l:
        del sub[x]
print(l)
# [['b', 'd', 'g', 'i'], ['b', 'd', 'g', 'i']]
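Reversing the index list only works when it is already sorted in ascending order. If that is not guaranteed, a sketch along these lines sorts the indexes in descending order first; the name delete_indexes is hypothetical, and dropping repeated indexes via set is an assumption about the desired behaviour.

```python
def delete_indexes(rows, indexes):
    """Delete the given positions from every row, in place."""
    # Highest index first, so earlier deletions never shift later targets.
    for idx in sorted(set(indexes), reverse=True):
        for row in rows:
            del row[idx]


rows = [list('abcdefghijkl'), list('abcdefghijkl')]
delete_indexes(rows, [9, 0, 2, 4, 5, 7, 10, 11])  # deliberately unsorted
print(rows)  # [['b', 'd', 'g', 'i'], ['b', 'd', 'g', 'i']]
```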

Merge lists in Python by placing every nth item from one list and others from another?

I have two lists, list1 and list2.
Here len(list2) << len(list1).
Now I want to merge both of the lists such that every nth element of final list is from list2 and the others from list1.
For example:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
Now the final list should be:
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
What is the most Pythonic way to achieve this?
I want to add all elements of list2 to the final list, final list should include all elements from list1 and list2.
Making the larger list an iterator makes it easy to take multiple elements for each element of the smaller list:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
iter1 = iter(list1)
res = []
for x in list2:
    res.extend([next(iter1) for _ in range(n - 1)])
    res.append(x)
res.extend(iter1)
>>> res
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
This avoids insert, which can be expensive for large lists because each insertion has to shift every element after the insertion point.
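The iterator approach can also be written with itertools.islice, which simply yields fewer items when list1 runs out instead of raising StopIteration. This is a sketch under that assumption about the desired behaviour for short inputs; merge_every_nth is a hypothetical name.

```python
from itertools import islice


def merge_every_nth(main, extra, n):
    """Place each item of extra after n-1 items taken from main."""
    it = iter(main)
    merged = []
    for item in extra:
        merged.extend(islice(it, n - 1))  # up to n-1 items from main
        merged.append(item)
    merged.extend(it)  # whatever is left of main
    return merged


list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
print(merge_every_nth(list1, list2, 3))
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
```

Neither input list is modified, and every element of both lists ends up in the result.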
To preserve the original list, you could try the following:
import copy
result = copy.deepcopy(list1)
index = n - 1
for elem in list2:
    result.insert(index, elem)
    index += n
result
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
Using the itertools module and the supplementary more_itertools package, you can construct an iterable solution a couple different ways. First the imports:
import itertools as it, more_itertools as mt
This first one seems the cleanest, but it relies on more_itertools.chunked().
it.chain(*mt.roundrobin(mt.chunked(list1, n-1), list2))
This one uses only more_itertools.roundrobin(), whose implementation is taken from the itertools documentation, so if you don't have access to more_itertools you can just copy it yourself.
mt.roundrobin(*([iter(list1)]*(n-1) + [list2]))
Alternatively, this does nearly the same thing as the first sample without using any more_itertools-specific functions. Basically, grouper can replace chunked, but it will add Nones at the end in some cases, so I wrap it in it.takewhile to remove those. Naturally, if you are using this on lists which actually do contain None, it will stop once it reaches those elements, so be careful.
it.takewhile(lambda o: o is not None,
    # recent more_itertools releases take the iterable first: grouper(iterable, n)
    it.chain(*mt.roundrobin(mt.grouper(list1, n-1), list2))
)
I tested these on Python 3.4, but I believe these code samples should also work in Python 2.7.
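For readers without more_itertools, the roundrobin-only variant can be tried with a small stand-in. The function below is a hand-written sketch of round-robin iteration, not the recipe from the itertools documentation copied verbatim.

```python
def roundrobin(*iterables):
    """Yield one item from each iterable in turn, skipping exhausted ones."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for iterator in iterators:
            try:
                value = next(iterator)
            except StopIteration:
                continue  # this iterator is done; drop it from later rounds
            yield value
            still_active.append(iterator)
        iterators = still_active


list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
# n-1 references to the same iterator pull n-1 items of list1 per round.
merged = list(roundrobin(*([iter(list1)] * (n - 1) + [list2])))
print(merged)
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
```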
What about the solution below? I don't have a better one, though...
>>> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> list2 = ['x', 'y']
>>> n = 2
>>> for i in range(len(list2)):
...     list1.insert(n, list2[i])
...     n += 3
...
>>> list1
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
n is 2 because the index of third element in a list is 2, since it starts at 0.
list(list1[i-1-min((i-1)//n, len(list2))] if i % n or (i-1)//n >= len(list2) else list2[(i-1)//n] for i in range(1, len(list1)+len(list2)+1))
Definitely not pythonic, but I thought it might be fun to do it in a one-liner. More readable (really?) version:
list(
list1[i-1-min((i-1)//n, len(list2))]
if i % n or (i-1)//n >= len(list2)
else
list2[(i-1)//n]
for i in range(1, len(list1)+len(list2)+1)
)
Basically, some tinkering around with indexes and determining which list and which index to take next element from.
Yet another way, calculating the slice steps:
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
res = []
m = n - 1
start, end = 0, m
for x in list2:
    res.extend(list1[start:end])
    res.append(x)
    start, end = end, end + m
res.extend(list1[start:])
>>> res
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = ['x', 'y']
n = 3
new = list1[:]
for index, item in enumerate(list2):
    # wrap item in a list so multi-character strings are inserted whole
    new[n * (index + 1) - 1: n * (index + 1) - 1] = [item]
print(new)
I admire @David Z's use of more_itertools. Updates to the tools can simplify the solution:
import more_itertools as mit
n = 3
groups = mit.windowed(list1, n-1, step=n-1)
list(mit.flatten(mit.interleave_longest(groups, list2)))
# ['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
Summary: list2 is being interleaved into groups from list1 and finally flattened into one list.
Notes
groups: n-1 size sliding windows, e.g. [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h')]
interleave_longest is presently equivalent to roundrobin
None is the default fillvalue. Optionally remove with filter(None, ...)
Here is another solution: slice list1 at the correct index, then add the element of list2 to list1.
>>> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> list2 = ['x', 'y']
>>> n = 3
>>> for i in range(len(list2)):
...     list1 = list1[:n*(i+1) - 1] + [list2[i]] + list1[n*(i+1)-1:]
...
>>> list1
['a', 'b', 'x', 'c', 'd', 'y', 'e', 'f', 'g', 'h']
