For example, we have [0, 1, 3, 5, 7, 8, 9, 10, 12, 13].
The result must be 7, 8, 9, 10 because these values are adjacent by index, are consecutive integers, and form a longer chain than 0, 1.
English is not my first language, excuse me if the writing is a bit obscure.
Group the items into subsequences with itertools.groupby, keyed on the difference between each item and an increasing counter (an itertools.count object); that difference stays constant within a run of consecutive integers. Then take the longest subsequence with the built-in max, passing key=len:
from itertools import groupby, count
lst = [0, 1, 3, 5, 7, 8, 9, 10, 12, 13]
c = count()
val = max((list(g) for _, g in groupby(lst, lambda x: x-next(c))), key=len)
print(val)
# [7, 8, 9, 10]
You may include the group key in the result (suppressed as _) to further understand how this works.
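For instance, keeping the key next to each group shows why this works: within a run of consecutive integers the value and the counter grow in step, so their difference stays the same. A small sketch (the name keyed_groups is only for illustration):

```python
from itertools import groupby, count

lst = [0, 1, 3, 5, 7, 8, 9, 10, 12, 13]
c = count()

# Keep the key (value minus running position) alongside each group
# instead of discarding it as _.
keyed_groups = [(key, list(group)) for key, group in groupby(lst, lambda x: x - next(c))]

for key, group in keyed_groups:
    print(key, group)
# 0 [0, 1]
# 1 [3]
# 2 [5]
# 3 [7, 8, 9, 10]
# 4 [12, 13]
```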
Alternative solution using numpy module:
import numpy as np
nums = np.array([0, 1, 3, 5, 7, 8, 9, 10, 12, 13])
longest_seq = max(np.split(nums, np.where(np.diff(nums) != 1)[0]+1), key=len).tolist()
print(longest_seq)
The output:
[7, 8, 9, 10]
np.where(np.diff(nums) != 1)[0]+1 - gets the indices at which the array should be split (wherever the difference between two consecutive numbers is not equal to 1, e.g. between 3 and 5)
np.split(...) - split the array into sub-arrays
https://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.diff.html#numpy.diff
https://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.split.html
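To see the intermediate values, the same "split where the difference is not 1" idea can be sketched in plain Python, without NumPy. This is only an illustration of the steps, not what NumPy does internally; the names diffs, split_points, bounds and runs are made up for this example:

```python
nums = [0, 1, 3, 5, 7, 8, 9, 10, 12, 13]

# Differences between neighbouring elements (the plain-Python analogue of np.diff).
diffs = [b - a for a, b in zip(nums, nums[1:])]
print(diffs)  # [1, 2, 2, 2, 1, 1, 1, 2, 1]

# A new run starts one position after every difference that is not 1.
split_points = [i + 1 for i, d in enumerate(diffs) if d != 1]
print(split_points)  # [2, 3, 4, 8]

# Slice the list between consecutive boundaries (the analogue of np.split).
bounds = [0] + split_points + [len(nums)]
runs = [nums[start:stop] for start, stop in zip(bounds, bounds[1:])]
print(runs)  # [[0, 1], [3], [5], [7, 8, 9, 10], [12, 13]]

longest_run = max(runs, key=len)
print(longest_run)  # [7, 8, 9, 10]
```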
Code
Using itertools.groupby (similar to @Moses Koledoye's answer):
import itertools
iterable = [0, 1, 3, 5, 7, 8, 9, 10, 12, 13]
groups = [[y[1] for y in g] for k, g in itertools.groupby(enumerate(iterable), key=lambda x: x[0]-x[1])]
groups
# [[0, 1], [3], [5], [7, 8, 9, 10], [12, 13]]
max(groups, key=len)
# [7, 8, 9, 10]
Alternative
Consider the third-party tool more_itertools.consecutive_groups:
import more_itertools as mit
iterable = [0, 1, 3, 5, 7, 8, 9, 10, 12, 13]
max((list(g) for g in mit.consecutive_groups(iterable)), key=len)
# [7, 8, 9, 10]
Related
I need to resize the columns in the Excel datasheet with Python's xlsxwriter. For this purpose I've prepared three lists of maximum text lengths in dataframe columns.
l1 = [5, 10, 12, 3, 6, 2]
l2 = []
l3 = [6, 9, 11, 5, 4, 4, 8, 7]
I need to get the maximum values in these lists so that the resulting list will look like this:
common = [6, 10, 12, 5, 6, 4, 8, 7]
I know I should write here my own solution, but it's trivial: we should find the list with maximum length, then compare each value with each other list if its length allows that. But is there a more optimal way?
If they are always guaranteed to be positive numbers, you can zip_longest them using 0 as a fillvalue:
>>> from itertools import zip_longest
>>> list(map(max, zip_longest(l1, l2, l3, fillvalue=0)))
[6, 10, 12, 5, 6, 4, 8, 7]
If there can be negative numbers as well, you can use negative infinity instead of 0:
>>> from math import inf
>>> list(map(max, zip_longest(l1, l2, l3, fillvalue=-inf)))
[6, 10, 12, 5, 6, 4, 8, 7]
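To make the role of the fill value concrete, here is a small sketch that looks at the tuples zip_longest produces and at a case where 0 gives the wrong answer. The lists a and b are hypothetical inputs added for this illustration only:

```python
from itertools import zip_longest
from math import inf

l1 = [5, 10, 12, 3, 6, 2]
l2 = []
l3 = [6, 9, 11, 5, 4, 4, 8, 7]

# Each tuple is one "column"; positions past the end of a list get the fill value.
columns = list(zip_longest(l1, l2, l3, fillvalue=0))
print(columns[0])   # (5, 0, 6)
print(columns[-1])  # (0, 0, 7)

# Hypothetical lists with negative numbers: a fill value of 0 wrongly wins
# in the second position, while -inf never beats a real value.
a = [-5, -2]
b = [-3]
with_zero = list(map(max, zip_longest(a, b, fillvalue=0)))
with_neg_inf = list(map(max, zip_longest(a, b, fillvalue=-inf)))
print(with_zero)     # [-3, 0]
print(with_neg_inf)  # [-3, -2]
```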
I believe this is an easy problem to solve. I have searched and found a few similar answers but not an efficient way to exactly what I want to achieve.
Assuming the following list:
x = [6, 7, 8]
I want to create a new list by repeating each number k times. Assuming k=3, the result should be:
xr = [6, 6, 6, 7, 7, 7, 8, 8, 8]
I was able to accomplish this using nested loops, which I believe is very inefficient:
xr = []
for num in x:  # for each number in the list
    for t in range(3):  # repeat 3 times
        xr.append(num)
I also tried:
[list(itertools.repeat(x[i], 3)) for i in range(len(x))]
but I get:
[[6, 6, 6], [7, 7, 7], [8, 8, 8]]
Is there a more efficient direct method to accomplish this?
You can use list comprehension:
x = [6, 7, 8]
k = 3
out = [v for v in x for _ in range(k)]
print(out)
Prints:
[6, 6, 6, 7, 7, 7, 8, 8, 8]
def repeat_k(l, k):
    lo = []
    for x in l:
        for i in range(k):
            lo.append(x)
    return lo
print (repeat_k([1,2,3],5))
Output:
[1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
With list comprehension:
def repeat_k(l, k):
    return [x for x in l for i in range(k)]
print (repeat_k([1,2,3],5))
Output:
[1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
Another possibility:
>>> x = [6, 7, 8]
>>> k = 3
>>> l = []
>>> for item in x:
...     l += k * [item]
...
>>> l
[6, 6, 6, 7, 7, 7, 8, 8, 8]
You can create a convenient function:
def repeat(it, n):
    for elem in it:
        yield from [elem] * n
Use it like:
>>> list(repeat(x, n=3))
[6, 6, 6, 7, 7, 7, 8, 8, 8]
Thanks, everyone for the answers.
It seems there is an easier and more direct way to solve this using NumPy (imported as np):
np.repeat(x, 3).tolist()
prints exactly what I needed:
[6, 6, 6, 7, 7, 7, 8, 8, 8]
import itertools
x = [4, 5, 6]
k = 3
res = list(itertools.chain.from_iterable(itertools.repeat(i, k) for i in x))
print(res)
It can also be solved with the itertools module from the standard library. repeat handles the repetition, and chain.from_iterable flattens the repeated groups into a single sequence, which list then collects.
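The two steps can be looked at separately. A minimal sketch, using the x = [6, 7, 8] and k = 3 from the question (the names repeated and flat are only for illustration):

```python
from itertools import chain, repeat

x = [6, 7, 8]
k = 3

# Step 1: repeat each element k times (materialised as lists here only for display).
repeated = [list(repeat(value, k)) for value in x]
print(repeated)  # [[6, 6, 6], [7, 7, 7], [8, 8, 8]]

# Step 2: chain.from_iterable flattens the groups into one stream of items.
flat = list(chain.from_iterable(repeated))
print(flat)  # [6, 6, 6, 7, 7, 7, 8, 8, 8]
```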
I have a sample list of lists like:
lol = [[1,2,3,4],[5,6],[7,8,9,0,11],[21]]
the expected combined list is:
cl = [1,5,7,21,2,6,8,3,9,4,0,11]
Is there an elegant way of doing this preferably without nested for loops?
You can use itertools.zip_longest:
from itertools import zip_longest
lol = [[1, 2, 3, 4], [5, 6], [7, 8, 9, 0, 11], [21]]
out = [i for v in zip_longest(*lol) for i in v if i is not None]
print(out)
Prints:
[1, 5, 7, 21, 2, 6, 8, 3, 9, 4, 0, 11]
itertools is your friend. Use zip_longest to zip ignoring the differing lengths, chain it to flatten the zipped lists, and then just filter the Nones.
import itertools
lol = [[1,2,3,4],[5,6],[7,8,9,0,11],[21]]
print([x for x in itertools.chain.from_iterable(itertools.zip_longest(*lol)) if x is not None])
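One caveat with filtering on None: if the sublists can legitimately contain None, those values are dropped too. A possible workaround, sketched here under that assumption, is to fill with a private sentinel object instead. The function name interleave and the variable missing are made up for this example:

```python
from itertools import chain, zip_longest

def interleave(lists):
    """Interleave lists of unequal length, keeping any None values they contain."""
    missing = object()  # unique sentinel that cannot appear in the input
    zipped = zip_longest(*lists, fillvalue=missing)
    return [x for x in chain.from_iterable(zipped) if x is not missing]

lol = [[1, 2, 3, 4], [5, 6], [7, 8, 9, 0, 11], [21]]
print(interleave(lol))
# [1, 5, 7, 21, 2, 6, 8, 3, 9, 4, 0, 11]

# None is preserved because only the sentinel is filtered out.
print(interleave([[1, None], [2]]))
# [1, 2, None]
```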
In case it helps, a lazy (generator) version of this interleaving is available as more_itertools.interleave_longest.
from more_itertools import interleave_longest, take
lol = [[1, 2, 3, 4], [5, 6], [7, 8, 9, 0, 11], [21]]
gen_from_lol = interleave_longest(*lol)
print(next(gen_from_lol), next(gen_from_lol))
print(take(6, gen_from_lol))
print(next(gen_from_lol))
print(next(gen_from_lol), next(gen_from_lol))
Output
1 5
[7, 21, 2, 6, 8, 3]
9
4 0
Note that interleave_longest(*iterables) is basically the same as chain.from_iterable(zip_longest(*iterables)), except that it skips the fill values once the shorter iterables are exhausted.
I have two lists, a reference list and an input list:
Ref = [3, 2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4]
Input = [9, 5, 2, 3, 10, 4, 11, 8]
I want to sort the Input list in the same order as Ref. If an element of Ref is missing from Input, it is simply skipped and the next element is considered.
Hence the sorted Input list, based on Ref, will look like this:
Sorted_Input = [3, 2, 11, 10, 9, 8, 5, 4]
I think this answers your question:
>>> [x for x in Ref if x in Input]
[3, 2, 11, 10, 9, 8, 5, 4]
Hope it helps.
UPDATE:
Making Input a set for faster access:
>>> Input_Set = set(Input)
>>> [x for x in Ref if x in Input_Set]
[3, 2, 11, 10, 9, 8, 5, 4]
Another approach in addition to dcg's answer would be as follows:
Ref = [3, 2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4]
Input = [9, 5, 2, 3, 10, 4, 11, 8]
ref = set(Ref)
inp = set(Input)
sorted_list = sorted(ref.intersection(inp), key = Ref.index)
This outputs to:
[3, 2, 11, 10, 9, 8, 5, 4]
Here you convert the lists into sets, find their intersection, and sort them. The set is sorted based on the 'Ref' list's indexing.
You can use the built-in sorted function:
# keep in a dict the index for each value from Ref
ref = {val: i for i, val in enumerate(Ref)}
# sort by the index value from Ref for each number from Input
sorted(Input, key=ref.get)
output:
[3, 2, 11, 10, 9, 8, 5, 4]
Here's the naive approach:
sorted(Input, key=Ref.index)
Or in-place:
Input.sort(key=Ref.index)
Either way it's just one line.
Although I think it's slow -- O(n*m) where n and m are the lengths of Input and Ref. @rusu_ro1's solution uses a similar method but builds the index lookup only once, so it should be closer to O(m + n log n).
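A sketch of the dictionary-based variant wrapped in a function. It assumes that values in Input which do not occur in Ref should be dropped (the question does not say), and that Ref has no duplicates; the name sort_by_reference is made up for this example:

```python
def sort_by_reference(values, reference):
    """Sort values by their position in reference.

    Assumption: values that do not occur in reference are dropped,
    since they have no position to sort by.
    """
    # Build the value -> index lookup once, so each key lookup is O(1).
    position = {val: i for i, val in enumerate(reference)}
    return sorted((v for v in values if v in position), key=position.__getitem__)

Ref = [3, 2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4]
Input = [9, 5, 2, 3, 10, 4, 11, 8]

print(sort_by_reference(Input, Ref))
# [3, 2, 11, 10, 9, 8, 5, 4]
```

Without the filter, sorted(Input, key=position.get) would raise a TypeError in Python 3 as soon as an unknown value produced a None key next to integer keys.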
I have the following list of lists of integers, which I need to cross-compare with each other:
compare = [[2,4,5,7,8,10,12],[1,3,5,8,9,10,12],[1,2,4,6,8,10,11,12],[2,3,4,6,7,9,12]]
Even though you can't name lists within lists in Python (I think), we'll just call the sublists a, b, c and d.
What I wish to do is write a for loop that checks whether any one integer is present in 2, 3, or all of the lists. The loop itself is simple, it iterates over all the integers in a-d, but the conditions for the comparisons are quite complex, or perhaps just long-winded, e.g.:
if i in a and i in b, or i in a and i in c... or i in a and i in b and i in c... or i in (every list):
    pattern.append(i)
Obviously this is impractical. I've looked up solutions to the issue but to no avail. Also, would the & and | operators be usable in any way, or should I stick to and and or?
Thanks in advance for any help!
For this problem I suggest using itertools.chain to chain all the elements together, then counting how often each element occurs:
>>> import itertools
>>> new_list = list(itertools.chain(*compare))
>>> new_list
[2, 4, 5, 7, 8, 10, 12, 1, 3, 5, 8, 9, 10, 12, 1, 2, 4, 6, 8, 10, 11, 12, 2, 3, 4, 6, 7, 9, 12]
>>> pattern=[i for i in new_list if new_list.count(i)>2]
>>> pattern
[2, 4, 8, 10, 12, 8, 10, 12, 2, 4, 8, 10, 12, 2, 4, 12]
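Note that the result above repeats values and uses count(i) > 2, i.e. "in at least three lists". If each value should appear once, and "in 2, 3, or all of the lists" is the goal, collections.Counter may be a simpler fit. A sketch, assuming a value counts at most once per sublist (the names counts, in_two_or_more and in_three_or_more are only for illustration):

```python
from collections import Counter
from itertools import chain

compare = [
    [2, 4, 5, 7, 8, 10, 12],
    [1, 3, 5, 8, 9, 10, 12],
    [1, 2, 4, 6, 8, 10, 11, 12],
    [2, 3, 4, 6, 7, 9, 12],
]

# Convert each sublist to a set first so a value is counted once per list.
counts = Counter(chain.from_iterable(set(sub) for sub in compare))

in_two_or_more = sorted(v for v, n in counts.items() if n >= 2)
in_three_or_more = sorted(v for v, n in counts.items() if n >= 3)

print(in_two_or_more)    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
print(in_three_or_more)  # [2, 4, 8, 10, 12]
```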
So you want to iterate only over values in c, and check whether it's in any of the other lists (a, b, d)? Then you can use the any() builtin for this:
compare = [
    [2, 4, 5, 7, 8, 10, 12],
    [1, 3, 5, 8, 9, 10, 12],
    [1, 2, 4, 6, 8, 10, 11, 12],
    [2, 3, 4, 6, 7, 9, 12]
]
a, b, c, d = compare
pattern = []
for value in c:
    if any(value in lst for lst in (a, b, d)):
        pattern.append(value)
The line a, b, c, d = compare is using list unpacking to assign each of the 4 sub-lists in compare to a separate variable, and the expression inside any() is called a generator expression.
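The same loop can also be written as a list comprehension, which some find easier to read. A sketch that should behave the same as the loop above:

```python
compare = [
    [2, 4, 5, 7, 8, 10, 12],
    [1, 3, 5, 8, 9, 10, 12],
    [1, 2, 4, 6, 8, 10, 11, 12],
    [2, 3, 4, 6, 7, 9, 12],
]
a, b, c, d = compare

# Keep every value of c that occurs in at least one of a, b or d.
pattern = [value for value in c if any(value in lst for lst in (a, b, d))]
print(pattern)  # [1, 2, 4, 6, 8, 10, 12]
```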
With set() you can use intersection to find all values that are in both lists.
a = [2,4,5,7,8,10,12]
b = [1,3,5,8,9,10,12]
set(a).intersection(set(b))
=> set([8, 10, 12, 5])
It will be explained here
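On the & and | question: these operators do work on sets, where & is intersection and | is union. On plain lists they are not defined, so the lists have to be converted first. A short sketch with the same a and b:

```python
a = [2, 4, 5, 7, 8, 10, 12]
b = [1, 3, 5, 8, 9, 10, 12]

# & on sets: values present in both lists (same as set(a).intersection(b)).
in_both = set(a) & set(b)
print(sorted(in_both))  # [5, 8, 10, 12]

# | on sets: values present in either list (same as set(a).union(b)).
in_either = set(a) | set(b)
print(sorted(in_either))  # [1, 2, 3, 4, 5, 7, 8, 9, 10, 12]
```

sorted is used only to get a stable display order, since sets are unordered.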
Make an array of booleans, like so:
present = [any_given_integer in L for L in compare]
Now present has values like [True, True, False, False], etc.
Then you can do tests like:
if present.count(True) == 2:
    ...
or
if all(present):
    ...
etc.
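Put together with the compare list from the question, this could look like the following sketch (the helper name presence is made up for this example):

```python
compare = [
    [2, 4, 5, 7, 8, 10, 12],
    [1, 3, 5, 8, 9, 10, 12],
    [1, 2, 4, 6, 8, 10, 11, 12],
    [2, 3, 4, 6, 7, 9, 12],
]

def presence(value, lists):
    """Return one boolean per list: True if value occurs in that list."""
    return [value in lst for lst in lists]

present = presence(5, compare)
print(present)              # [True, True, False, False]
print(present.count(True))  # 2

# 12 occurs in every sublist, so all() is True for it.
print(all(presence(12, compare)))  # True
```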