Find the longest consecutive sub-array (not sorted) in Python

v=[1,2,3,11,5,8,9,10,11,6,4]
In the list above, 1,2,3 are consecutive numbers (the first consecutive set) and 8,9,10,11 are consecutive numbers (the second set, and the largest one). How can I find this second set? The code below prints the consecutive numbers:
for i in range(len(v)-1):
    if v[i+1]==v[i]+1:
        if v[i-1]!=v[i]-1:
            print(v[i])
        print(v[i]+1)
Output: 1,2,3,8,9,10,11
I was thinking of using something like the code below, adding the outputs to a new list, and then finding the longest one. I can't think of the logic for combining those two ideas.
for i in range(len(v)-1):
    for j in range(i+1,len(v)):
        if v[j]-v[i]
I looked at this example but I think that solution is different from what I am looking for. Thanks in advance for your time and suggestion.

You're pretty close. Store the current run as a list, update the best list when necessary and clear it whenever you break the run. Care should be taken to include the last grouping if it appears at the very end of the list.
v = [1,2,3,11,5,8,9,10,11,6,4]
best = []
run = []
for i in range(1, len(v) + 1):
    run.append(v[i-1])
    if i == len(v) or v[i-1] + 1 != v[i]:
        if len(best) < len(run):
            best = run
        run = []
print(best)
Output:
[8, 9, 10, 11]
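The same idea can be wrapped in a reusable function. This is only a sketch: the name `longest_run` is hypothetical, and tracking a start index and a length instead of copying lists is a choice made for this example, not something taken from the answer above.

```python
def longest_run(values):
    """Return the longest slice of `values` in which each item is the previous one plus 1."""
    best_start, best_len = 0, 0
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or when the next item is not previous + 1.
        if i == len(values) or values[i] != values[i - 1] + 1:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = i
    return values[best_start:best_start + best_len]


print(longest_run([1, 2, 3, 11, 5, 8, 9, 10, 11, 6, 4]))  # [8, 9, 10, 11]
```

If two runs tie for the longest, this version returns the earlier one, and an empty input gives an empty list.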

You can iterate over the list and keep appending each item to the candidate for the longest consecutive sub-list. Start a new candidate whenever the item is not consecutive to the last item of the current one, and make the candidate the new longest sub-list whenever it is longer than the current longest:
candidate = []
longest = []
for i in v:
    if candidate and candidate[-1] != i - 1:
        if len(candidate) > len(longest):
            longest = candidate
        candidate = []
    candidate.append(i)
if len(candidate) > len(longest):
    longest = candidate
longest becomes:
[8, 9, 10, 11]

You can use a sliding window of shrinking size and check whether the numbers in each window increase by exactly 1:
from itertools import islice
def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    " s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ... "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
def longestConsecutiveSeq(s):
    # Start at len(s) so that a fully consecutive list is also found
    for seq in (window(s, i) for i in range(len(s), 1, -1)):
        for subseq in seq:
            l = list(subseq)
            if all((y-x) == 1 for (x, y) in zip(l, l[1:])):
                return l
print(longestConsecutiveSeq([1,2,3,11,5,8,9,10,11,6,4]))
Result: [8, 9, 10, 11]
This algorithm stops at the first qualifying window of the largest size, so a tie goes to the earliest run. It returns None if no two neighbouring items are consecutive.

You can use pandas:
import pandas as pd
v=[1,2,3,11,5,8,9,10,11,6,4]
s = pd.Series(v)
sgc = s.groupby(s.diff().ne(1).cumsum()).transform('count')
result = s[sgc == sgc.max()].tolist()
result
Output:
[8, 9, 10, 11]
Details:
Create a pandas Series and use diff to calculate each value's difference from the previous value. Next, use ne to create a boolean Series that is True where the difference is not equal to 1, then cumsum this boolean Series to create group labels, so that consecutive values all share one label. Use groupby with transform to attach the size of its group to each record. Lastly, use boolean indexing to select only the parts of the Series whose group count equals the maximum count over all groups, and convert the result to a list with tolist. If several runs tie for the maximum length, this selects the elements of all of them.
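The labelling trick behind diff / ne / cumsum can also be seen without pandas. The sketch below is an illustration using only the standard library, not the answer's code; the names `starts`, `labels` and `groups` are made up for the example.

```python
from itertools import accumulate

v = [1, 2, 3, 11, 5, 8, 9, 10, 11, 6, 4]

# True wherever an item does NOT continue the previous one; the first item always starts a group.
starts = [True] + [b - a != 1 for a, b in zip(v, v[1:])]
# Running total of the flags: every item in the same run gets the same label.
labels = list(accumulate(int(flag) for flag in starts))

# Collect the items under their label and pick the largest group.
groups = {}
for label, item in zip(labels, v):
    groups.setdefault(label, []).append(item)
longest = max(groups.values(), key=len)

print(labels)   # [1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 6]
print(longest)  # [8, 9, 10, 11]
```

Unlike the pandas version, `max` here returns only the first of any tied runs.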

You can use the difference between each element's index and its value to group elements with the function groupby():
from itertools import groupby
l = [1, 2, 3, 11, 5, 8, 9, 10, 11, 6, 4]
gb = groupby(enumerate(l), lambda x: x[0] - x[1])
max(([i for _, i in g] for _, g in gb), key=len)
# [8, 9, 10, 11]
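To see why this key works, it may help to print it: index minus value stays constant while the values rise by 1 per step. The sketch below just exposes the intermediate values; `keys` and `runs` are names chosen for this illustration.

```python
from itertools import groupby

l = [1, 2, 3, 11, 5, 8, 9, 10, 11, 6, 4]

# index - value is the same for every member of a consecutive run
keys = [i - x for i, x in enumerate(l)]
print(keys)  # [-1, -1, -1, -8, -1, -3, -3, -3, -3, 3, 6]

# groupby only merges ADJACENT equal keys, so the lone 5 (key -1) stays apart from 1, 2, 3
runs = [[x for _, x in g] for _, g in groupby(enumerate(l), lambda p: p[0] - p[1])]
print(runs)  # [[1, 2, 3], [11], [5], [8, 9, 10, 11], [6], [4]]
```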

Related

Find 4 values in a list that are close together

I am trying to find the 4 closest values in a given list, within a defined value for the difference. The list can be of any length and is sorted in increasing order. Below is what I have tried:
holdlist=[]
m=[]
nlist = []
t = 1
q = [2,3,5,6,7,8]
for i in range(len(q)-1):
    for j in range(i+1,len(q)):
        if abs(q[i]-q[j])<=1:
            holdlist.append(i)
            holdlist.append(j)
            t=t+1
            break
        else:
            if t != 4:
                holdlist=[]
                t=1
            elif t == 4:
                nlist = holdlist
                holdlist=[]
                t=1
nlist = list(dict.fromkeys(nlist))
for num in nlist:
    m.append(q[num])
The defined difference value here is 1, "q" is the list, and I am trying to get the result in "m" to be [5,6,7,8], but it turns out to be an empty list.
This works only if the list "q" is [5,6,7,8,10,11]. My guess is that after comparing the last value the for loop ends, and the result does not go into "holdlist".
Is there a more elegant way of writing the code?
Thank you.
One solution would be to sort the input list and find the smallest window of four elements. Given the example input, this is
min([sorted(q)[i:i+4] for i in range(len(q) - 3)],
key=lambda w: w[3] - w[0])
For a different input this will still return a window even if its spacing is bigger than 1. I'd still use this solution, with a bit of error handling:
assert len(q) >= 4
answer = min([sorted(q)[i:i+4] for i in range(len(q) - 3)], key=lambda w: w[3] - w[0])
assert answer[3] - answer[0] < 4
Written out and annotated:
def closest_four(q):  # wrapper name added so that the return statements are valid
    if len(q) < 4:
        raise RuntimeError("Need at least four members in the list!")
    sorted_q = sorted(q)
    windows = [sorted_q[i:i+4] for i in range(len(q) - 3)] # All the chunks of four elements
    def size(window):
        """The size of the window."""
        return window[3] - window[0]
    answer = min(windows, key=size) # The smallest window, by size
    if size(answer) > 3:
        return "No group of four elements has a maximum distance of 1"
    return answer
This would be one easy approach to find the four closest numbers in a list:
# Let's have a list of numbers. It has to be at least 4 numbers long.
numbers = [10, 4, 9, 1,7,12,25,26,28,29,30,77,92]
numbers.sort()
# Now we have a sorted list.
delta = numbers[3]-numbers[0] # Let's see how far apart the first four numbers in the sorted list are.
idx = 0 # Let's save our starting index
for i in range(len(numbers)-3):
    d = numbers[i+3]-numbers[i]
    if d < delta:
        # If some window of four is closer together, save its spread and the index where it was found
        delta = d
        idx = i
if delta <= 3: # four values whose neighbours differ by at most 1 span at most 3
    print("closest numbers are {}".format(numbers[idx:idx+4]))
else:
    print("No sequence with the defined difference was found")
Here is my stab at the issue for OP's reference, as @kojiro and @ex4 have already supplied answers that deserve credit.
def find_neighbor(nums, dist, k=4):
    res = []
    nums.sort()
    for i in range(len(nums) - k + 1):  # + 1 so that the last window is checked too
        if nums[i + k - 1] - nums[i] <= dist * k:
            res.append(nums[i: i + k])
    return res
Here is the function in action:
>>> nums = [10, 11, 5, 6, 7, 8, 9] # slightly modified input for better demo
>>> find_neighbor(nums, 1)
[[5, 6, 7, 8], [6, 7, 8, 9], [7, 8, 9, 10], [8, 9, 10, 11]]
Assuming sorting is allowed in tackling this problem, we first sort the input array. (I decided to sort in place for a marginal performance gain, but we could also use sorted(nums).) Then we essentially slide a window of size k over the array and check whether the difference between the first and last element within that window is less than or equal to dist * k. In the provided example, for instance, we would expect the difference between the two elements to be less than or equal to 1 * 4 = 4. If such a window exists, we append that subarray to res, which we return in the end. Note that k elements have only k - 1 gaps between them, so dist * (k - 1) would be the tighter bound.
If the goal is to find one window instead of all windows, we could simply return the subarray without appending it to res.
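A minimal sketch of that single-window variant might look like this. The name `find_first_neighbor` is hypothetical, it keeps the dist * k test from the answer above, and it works on a sorted copy so the caller's list is not reordered.

```python
def find_first_neighbor(nums, dist, k=4):
    ordered = sorted(nums)  # sorted copy, so the caller's list is left untouched
    for i in range(len(ordered) - k + 1):
        # Same test as in find_neighbor: spread of the window against dist * k
        if ordered[i + k - 1] - ordered[i] <= dist * k:
            return ordered[i:i + k]  # stop at the first qualifying window
    return None  # no window qualified (or fewer than k numbers were given)


print(find_first_neighbor([10, 11, 5, 6, 7, 8, 9], 1))  # [5, 6, 7, 8]
```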
You can do this in a generic fashion (i.e. for any size of delta or resulting largest group) using the zip function:
def deltaGroups(aList,maxDiff):
    sList = sorted(aList)
    diffs = [ (b-a)<=maxDiff for a,b in zip(sList,sList[1:]) ]
    breaks = [ i for i,(d0,d1) in enumerate(zip(diffs,diffs[1:]),1) if d0!=d1 ]
    groups = [ sList[s:e+1] for s,e in zip([0]+breaks,breaks+[len(sList)]) if diffs[s] ]
    return groups
Here's how it works:
Sort the list in order to have each number next to the closest other numbers
Identify positions where the next number is within the allowed distance (diffs)
Get the index positions where compliance with the allowed distance changes (breaks) from eligible to non-eligible and from non-eligible to eligible
This corresponds to start and end of segments of the sorted list that have consecutive eligible pairs.
Extract subsets of the sorted list based on the start/end positions of consecutive eligible differences (groups)
The deltaGroups function returns a list of groups with at least 2 values that are within the distance constraints. You can use it to find the largest group using the max() function.
output:
q = [10,11,5,6,7,8]
m = deltaGroups(q,1)
print(q)
print(m)
print(max(m,key=len))
# [10, 11, 5, 6, 7, 8]
# [[5, 6, 7, 8], [10, 11]]
# [5, 6, 7, 8]
q = [15,1,9,3,6,16,8]
m = deltaGroups(q,2)
print(q)
print(m)
print(max(m,key=len))
# [15, 1, 9, 3, 6, 16, 8]
# [[1, 3], [6, 8, 9], [15, 16]]
# [6, 8, 9]
m = deltaGroups(q,3)
print(m)
print(max(m,key=len))
# [[1, 3, 6, 8, 9], [15, 16]]
# [1, 3, 6, 8, 9]
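For comparison, the same grouping can be sketched as a single pass over the sorted list. This is an alternative formulation, not the answer's method; `delta_groups` is a hypothetical name, and like deltaGroups it keeps only groups with at least two values.

```python
def delta_groups(values, max_diff):
    groups = []
    run = []
    for x in sorted(values):
        # A gap larger than max_diff closes the current run.
        if run and x - run[-1] > max_diff:
            if len(run) > 1:
                groups.append(run)
            run = []
        run.append(x)
    # Do not forget the run that reaches the end of the list.
    if len(run) > 1:
        groups.append(run)
    return groups


print(delta_groups([15, 1, 9, 3, 6, 16, 8], 2))  # [[1, 3], [6, 8, 9], [15, 16]]
```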

Is there a way to find a max value between a range?

For example,
l = [1, -9, 2, 5, 9, 16, 11, 0, 21]
and if the range is 10 (10 meaning any numbers higher than 10 won't be considered as the max), I want the code to return 9.
You can first delete all elements too large and then find the max:
filtered = filter(lambda x: x <= limit, list)
val = max(filtered, default = None) # the `default` part means that that's returned if there are no elements
filtered is a filter object which contains all elements less than or equal to the limit. val is the maximum value in that.
Alternatively,
filtered = [x for x in list if x <= limit]
val = max(filtered, default = None)
filtered contains exactly those elements of the list that are less than or equal to the limit. val is the maximum of filtered.
Alternatively,
val = max((x for x in list if x <= limit), default = None)
This combines the two steps from the above method by passing a generator expression as the argument.
Alternatively,
val = max(filter(limit.__ge__, list), default = None)
limit.__ge__ is a function that means x => limit >= x (ge means Greater-Equal). This is the shortest and least readable way of writing it.
Also, please rename list.
list is a built-in name (the list type in Python). Please don't shadow built-ins ;_;
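Put together with a non-shadowing name, a short sketch could look like this. The names `values`, `limit`, `best` and `nothing` are chosen for the example; the data is the list from the question.

```python
values = [1, -9, 2, 5, 9, 16, 11, 0, 21]
limit = 10

# Largest element that does not exceed the limit
best = max((x for x in values if x <= limit), default=None)
print(best)  # 9

# When nothing qualifies, `default` is returned instead of raising ValueError
nothing = max((x for x in values if x <= -100), default=None)
print(nothing)  # None
```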
The following is not radically different, conceptually, from @HyperNeutrino's excellent answer, but I think it's somewhat clearer (per the Zen):
from __future__ import print_function
l = [1, -9, 2, 5, 9, 16, 11, 0, 21]
def lim(x, n):
    if x <= n:
        return x
    # Python 3 cannot compare None with an int, so return a value that never wins
    return float('-inf')
print(max(lim(a,10) for a in l))
The cleanest and most space efficient method is to utilize a conditioned generator expression:
maxl = max(num for num in l if num <= 10)
This loops over the list l once, ignoring any numbers not satisfying num <= 10, and finds the maximum. No additional list is built.

How to put sequential numbers into lists from a list

I have a list of numbers, say,
[1,2,3,6,8,9,10,11]
First, I want to get the sum of the differences (step size) between the numbers (n, n+1) in the list.
Second, if a set of consecutive numbers has a difference of 1 between them, put them in a list; there are two such lists in this example,
[1,2,3]
[8,9,10,11]
and then put the rest of the numbers in another list, i.e. there is only one such list in the example,
[6].
Third, get the lists with the max/min sizes from the sequential lists, i.e. [1,2,3], [8,9,10,11] in this example, the max list is,
[8,9,10,11]
min list is
[1,2,3].
What's the best way to implement this?
First, I want to get the sum of the differences (step size) between
the numbers (n, n+1) in the list.
Use sum on the successive differences of elements in the list:
>>> lst = [1, 2, 3, 4, 6, 8, 9, 10, 11]
>>> sum(lst[i] - x for i, x in enumerate(lst[:-1], start=1))
10
Second, if a set of consecutive numbers has a difference of 1 between them, put them in a list; there are two such lists in
this example, and then put the rest of the numbers in another list, i.e.
there is only one such list in the example,
itertools.groupby does this by grouping on the difference of each element on a reference itertools.count object:
>>> from itertools import groupby, count
>>> c = count()
>>> result = [list(g) for i, g in groupby(lst, key=lambda x: x-next(c))]
>>> result
[[1, 2, 3, 4], [6], [8, 9, 10, 11]]
Third, get the lists with the max/min sizes from above
max and min with the key function as sum:
>>> max(result, key=sum)
[8, 9, 10, 11]
>>> min(result, key=sum)
[6] #??? shouldn't this be [6]
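Since the question asks for the lists with the max/min sizes, comparing by len rather than sum, and only among runs with more than one element, may be closer to what is wanted. The following is a sketch under that reading of the question, using the list exactly as the question states it; `runs`, `sequential` and `rest` are names made up here.

```python
from itertools import groupby, count

lst = [1, 2, 3, 6, 8, 9, 10, 11]

c = count()
# value - position is constant within a run of consecutive numbers
runs = [list(g) for _, g in groupby(lst, key=lambda x: x - next(c))]

sequential = [r for r in runs if len(r) > 1]  # real runs
rest = [r for r in runs if len(r) == 1]       # isolated numbers

print(sequential)                # [[1, 2, 3], [8, 9, 10, 11]]
print(rest)                      # [[6]]
print(max(sequential, key=len))  # [8, 9, 10, 11]
print(min(sequential, key=len))  # [1, 2, 3]
```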
I wonder if you've already got the answer to this (given the missing 4 from your answers) as the first thing I naively tried produced that answer. (That and/or it reads like a homework question)
>>> a=[1,2,3,4,6,8,9,10,11]
>>> sum([a[x+1] - a[x] for x in range(len(a)-1)])
10
>>> [a[x] for x in range(len(a)-1) if abs(a[x] - a[x+1]) ==1]
[1, 2, 3, 8, 9, 10]
Alternatively, try :
a=[1,2,3,6,8,9,10,11]
sets = []
cur_set = set()
total_diff = 0
for index in range(len(a)-1):
    total_diff += a[index +1] - a[index]
    if a[index +1] - a[index] == 1:
        cur_set = cur_set | set([ a[index +1], a[index]])
    else:
        if len(cur_set) > 0:
            sets.append(cur_set)
        cur_set = set()
if len(cur_set) > 0:
    sets.append(cur_set)
all_seq_nos = set()
for seq_set in sets:
    all_seq_nos = all_seq_nos | seq_set
non_seq_set = set(a) - all_seq_nos
print("Sum of differences is {0:d}".format(total_diff))
print("sets of sequential numbers are :")
for seq_set in sets:
    print(sorted(list(seq_set)))
print("set of non-sequential numbers is :")
print(sorted(list(non_seq_set)))
# Compare by size (len), since the question asks for the max/min sizes
big_set=max(sets, key=len)
sml_set=min(sets, key=len)
print ("Biggest set of sequential numbers is :")
print (sorted(list(big_set)))
print ("Smallest set of sequential numbers is :")
print (sorted(list(sml_set)))
Which will produce the output :
Sum of differences is 10
sets of sequential numbers are :
[1, 2, 3]
[8, 9, 10, 11]
set of non-sequential numbers is :
[6]
Biggest set of sequential numbers is :
[8, 9, 10, 11]
Smallest set of sequential numbers is :
[1, 2, 3]
Hopefully that all helps ;-)

Python - finding next and previous values in list with different criteria

I'm looking for a way to iterate through a list of numbers in Python to find the index of a particular element and then find the nearest elements to it that meet certain criteria. I can't seem to find any built in functions that will hold my place in a list so that I can find previous and next items with different search criteria. Does anything like this exist in Python?
I have a long list of numbers in which I'm trying to find a particular recurring pattern.
For example:
L = [1, 1, 3, 5, 7, 5, 1, 2, 1, 1, 1, 8, 9, 1, 1, 1]
Say I want to find the peaks by looking for the index of the first number in the list >4, and then the indices of the nearest numbers <2 on either side. Then I want to find the next peak and do the same thing. (The actual pattern is more complicated than this.)
So the eventual output I'm looking for in this example is 1:6, 10:13.
I'm using this to find the first value:
a = next(i for i, v in enumerate(L) if v > 4)
Or this to find all values > 4 to later group them:
indexes = [i for i, v in enumerate(L) if v > 4]
I've tried next, iter, generators, many kinds of for loops and more without success. I've looked at islice also, but it seems like overkill to slice the list in two for every index found and then do forward and reverse searches on the two pieces. There must be a less convoluted way?
Any help would be much appreciated.
I would use a generator function and track the indices matching your conditions as you iterate over the input:
L = [1, 1, 3, 5, 7, 5, 1, 2, 1, 1, 1, 8, 9, 1, 1, 1]
def peak_groups(l):
    start_i = 0
    peak_i = None
    for i,x in enumerate(l):
        if peak_i is None:
            # look for start of peak group, or peak itself
            if x < 2:
                start_i = i
            elif x > 6:
                peak_i = i
        else:
            # look for end of peak group
            if x < 2:
                yield (start_i, peak_i, i)
                start_i = i
                peak_i = None
    # finally check edge condition if we reached the end of the list
    if peak_i is not None:
        yield (start_i, peak_i, i)
for group in peak_groups(L):
    print(group)
Results in:
(1, 4, 6)
(10, 11, 13)
The nice thing is you're only iterating over the input once. Though it might not be so simple with your real world grouping conditions.
You'll have to think about what should happen if multiple "peak groups" overlap, and this currently doesn't find the greatest peak in the group, but it should be a starting point.
This finds the peaks as you stated, but requires the initial index of the list to determine where to search:
def findpeaks(lst, index, v1, v2):
    l = lst[index:]
    ele = next(i for (i, v) in enumerate(l) if v > v1)
    idx = ele - next(i for (i, v) in enumerate(l[:ele+1][::-1]) if v < v2)
    jdx = ele + next(i for (i, v) in enumerate(l[ele:]) if v < v2)
    # Returns a tuple containing:
    #
    # the index of the element > v1
    # the index of the element < v2 (found before the element > v1),
    # the index of the element < v2 (found after the element > v1).
    return (ele + index, idx + index, jdx + index)
This works by:
Finding the element matching the first value's criterion (> 4, in your example)
Finding the element before this index that matches that criterion of the second value (< 2). It does this by creating a slice of the list from where it finds the index from part 1, and reversing it. The index you find then has to be subtracted from where the index of part 1 is.
Search forward by creating a slice of the original list, and looking from there.
The end result has to take into account the starting index, so add that to the results. And, that's it.
For example:
L = [1, 1, 3, 5, 7, 5, 1, 2, 1, 1, 1, 8, 9, 1, 1, 1]
print(findpeaks(L, 0, 4, 2)) # prints (3, 1, 6)
print(findpeaks(L, 6, 4, 2)) # prints (11, 10, 13)
The next obvious step is to find all the elements that meet this criterion. One suggestion would be to make this recursive - but you can do that on your own.
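As one possible way to do that without recursion, the search can simply be restarted at the end of each group that was found. This is a sketch, not part of the answer: `all_peaks` is a hypothetical helper, it assumes v1 >= v2 so that every search moves forward, and it stops at the first point where findpeaks cannot complete a group.

```python
def findpeaks(lst, index, v1, v2):
    # Same logic as the function above, repeated so this example runs on its own.
    l = lst[index:]
    ele = next(i for (i, v) in enumerate(l) if v > v1)
    idx = ele - next(i for (i, v) in enumerate(l[:ele + 1][::-1]) if v < v2)
    jdx = ele + next(i for (i, v) in enumerate(l[ele:]) if v < v2)
    return (ele + index, idx + index, jdx + index)


def all_peaks(lst, v1, v2):
    # Hypothetical helper: restart the search where the previous group ended.
    found = []
    start = 0
    while True:
        try:
            peak, before, after = findpeaks(lst, start, v1, v2)
        except StopIteration:
            # next() found no further match, so there are no more complete groups
            break
        found.append((peak, before, after))
        start = after
    return found


L = [1, 1, 3, 5, 7, 5, 1, 2, 1, 1, 1, 8, 9, 1, 1, 1]
print(all_peaks(L, 4, 2))  # [(3, 1, 6), (11, 10, 13)]
```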

Getting the indices of the X largest numbers in a list

Please no built-ins besides len() or range(). I'm studying for a final exam.
Here's an example of what I mean.
def find_numbers(x, lst):
    pass  # body to be written
lst = [3, 8, 1, 2, 0, 4, 8, 5]
find_numbers(3, lst) # this should return -> (1, 6, 7)
I tried this but did not finish it; I couldn't figure out the best way of going about it:
def find_K_highest(lst, k):
    newlst = [0] * k
    maxvalue = lst[0]
    for i in range(len(lst)):
        if lst[i] > maxvalue:
            maxvalue = lst[i]
            newlst[0] = i
Take the first 3 (x) numbers from the list. These are your initial candidates for the maximum values. In your case: 3, 8, 1. Their indices are (0, 1, 2). Build (value, index) pairs of them: ((3,0), (8,1), (1,2)).
Now sort them by value, largest first: ((8,1), (3,0), (1,2)).
With this initial list, you can traverse the rest of the list recursively. Compare the smallest value (1, _) with the next element in the list (2, 3). If that is larger (it is), sort it into the list ((8,1), (3,0), (2,3)) and throw away the smallest.
In the beginning you have many changes in the top 3, but later on they become rare. Of course you also have to keep track of the current position (3, 4, 5, ...) while traversing.
An insertion sort for the top N elements should perform well.
Here is a similar problem in Scala but without the need to report the indexes.
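The description above might be sketched as follows, using only len() and range() as the question requires. This is an iterative interpretation rather than the recursive traversal the answer mentions, and it returns a list instead of the tuple shown in the question; the function name is taken from the question.

```python
def find_numbers(x, lst):
    # Keep the x best (value, index) pairs seen so far, largest value first.
    top = []
    for i in range(len(lst)):
        # Insert after every pair whose value is >= this one, so earlier indices win ties.
        pos = len(top)
        for j in range(len(top)):
            if lst[i] > top[j][0]:
                pos = j
                break
        if pos < x:
            top = top[:pos] + [(lst[i], i)] + top[pos:]
            top = top[:x]  # throw away the smallest if we now hold too many
    # Collect the indices in ascending order with a small insertion sort.
    indices = []
    for value, i in top:
        pos = len(indices)
        for j in range(len(indices)):
            if i < indices[j]:
                pos = j
                break
        indices = indices[:pos] + [i] + indices[pos:]
    return indices


print(find_numbers(3, [3, 8, 1, 2, 0, 4, 8, 5]))  # [1, 6, 7]
```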
I don't know whether it is good to post a full solution, but this seems to work:
def find_K_highest(lst, k):
    # escape index error
    if k>len(lst):
        k=len(lst)
    # the output array
    idxs = [None]*k
    # a real list, so entries can be deleted (a Python 3 range cannot be modified)
    to_watch = [i for i in range(len(lst))]
    # do it k times
    for i in range(k):
        # guess that max value is at least at idx '0' of to_watch
        to_del=0
        idx = to_watch[to_del]
        max_val = lst[idx]
        # search through the list for bigger value and its index
        for jj in range(len(to_watch)):
            j=to_watch[jj]
            val = lst[j]
            # check that it's bigger than the previously found max
            if val > max_val:
                idx = j
                max_val = val
                to_del=jj
        # append it
        idxs[i] = idx
        del to_watch[to_del]
    # return answer
    return idxs
PS I tried to explain every line of code.
Can you use list methods (e.g. append, sort, index)? If so, this should work (I think...)
def find_numbers(n,lst):
    ll=lst[:]
    ll.sort()
    biggest=ll[-n:]
    idx=[lst.index(i) for i in biggest] #This has the indices already, but we could have trouble if one of the numbers appeared twice
    idx.sort()
    #check for duplicates. Duplicates will always be next to each other since we sorted.
    for i in range(1,len(idx)):
        if(idx[i-1]==idx[i]):
            #found a duplicate, chop up the input list and find the new index of that number
            #(+1 because the slice starts one position past idx[i])
            idx[i]=idx[i]+1+lst[idx[i]+1:].index(lst[idx[i]])
    idx.sort()
    return idx
lst = [3, 8, 1, 2, 0, 4, 8, 5]
print(find_numbers(3, lst))
Dude. You have two ways you can go with this.
First way is to be clever. Psych your teacher out. What she is looking for is recursion. You can write this with NO recursion and NO built-in functions or methods:
#!/usr/bin/python
lst = [3, 8, 1, 2, 0, 4, 8, 5]
minval=-2**64
largest=[]
def enum(lst):
    for i in range(len(lst)):
        yield i,lst[i]
for x in range(3):
    m=minval
    m_index=None
    for i,j in enum(lst):
        if j>m:
            m=j
            m_index=i
    if m_index is not None:  # a plain truth test would wrongly skip index 0
        largest=largest+[m_index]
        lst[m_index]=minval
print(largest)
This works. It is clever. Take that teacher!!! BUT, you will get a C or lower...
OR -- you can be the teacher's pet. Write it the way she wants. You will need a recursive max of a list. The rest is easy!
def max_of_l(l):
    if len(l) <= 1:
        if not l:
            raise ValueError("Max() arg is an empty sequence")
        else:
            return l[0]
    else:
        m = max_of_l(l[1:])
        return m if m > l[0] else l[0]
print(max_of_l([3, 8, 1, 2, 0, 4, 8, 5]))
