Find 4 values in a list that are close together - python

I am trying to find the 4 closest values in a given list, within a defined value for the difference. The list can be of any length and is sorted in increasing order. Below is what I have tried:
holdlist=[]
m=[]
nlist = []
t = 1
q = [2,3,5,6,7,8]
for i in range(len(q)-1):
    for j in range(i+1,len(q)):
        if abs(q[i]-q[j])<=1:
            holdlist.append(i)
            holdlist.append(j)
            t=t+1
            break
        else:
            if t != 4:
                holdlist=[]
                t=1
            elif t == 4:
                nlist = holdlist
                holdlist=[]
                t=1
nlist = list(dict.fromkeys(nlist))
for num in nlist:
    m.append(q[num])
The defined difference value here is 1. "q" is the list, and I am trying to get the result in "m" to be [5,6,7,8], but it turns out to be an empty list.
The code works only if the list "q" is [5,6,7,8,10,11]. My guess is that after comparing the last value, the for loop ends and the result does not go into "holdlist".
Is there a more elegant way of writing the code?
Thank you.

One solution is to sort the input list and take the tightest window of four consecutive elements. For the example input, that is
min([sorted(q)[i:i+4] for i in range(len(q) - 3)],
    key=lambda w: w[3] - w[0])
For a different input, this still returns a window even when the tightest one is spaced wider than 1. I'd still use this solution, with a bit of error handling:
assert len(q) >= 4
answer = min([sorted(q)[i:i+4] for i in range(len(q) - 3)], key=lambda w: w[3] - w[0])
assert answer[3] - answer[0] <= 3  # four values at most 1 apart span at most 3
Written out and annotated (wrapped in a function so that the return statements are valid):
def closest_four(q):
    if len(q) < 4:
        raise RuntimeError("Need at least four members in the list!")
    sorted_q = sorted(q)
    windows = [sorted_q[i:i+4] for i in range(len(q) - 3)]  # All the chunks of four elements
    def size(window):
        """The size of the window."""
        return window[3] - window[0]
    answer = min(windows, key=size)  # The smallest window, by size
    if size(answer) > 3:
        return "No group of four elements has a maximum distance of 1"
    return answer
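One caveat with testing only the window's span: with duplicate values, a window such as [1, 1, 1, 4] has a span of 3 even though its last gap is 3. The sketch below checks every adjacent gap instead, in a single pass over the sorted list. The name `first_close_run` and its parameters are made up for this illustration, and it assumes "close together" means each neighbouring pair differs by at most `max_diff`.

```python
def first_close_run(values, max_diff=1, k=4):
    """Return the first k sorted values whose adjacent gaps are all <= max_diff, or None."""
    ordered = sorted(values)
    run_start = 0  # index where the current run of close neighbours began
    for i in range(len(ordered)):
        if i > 0 and ordered[i] - ordered[i - 1] > max_diff:
            run_start = i  # gap too large: a new run starts here
        if i - run_start + 1 >= k:
            return ordered[i - k + 1 : i + 1]
    return None


print(first_close_run([2, 3, 5, 6, 7, 8]))    # [5, 6, 7, 8]
print(first_close_run([5, 6, 7, 8, 10, 11]))  # [5, 6, 7, 8]
print(first_close_run([1, 1, 1, 4]))          # None
```

Because the check happens inside the loop, a run that ends at the last element is still reported, which is the case the code in the question misses.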

This is one easy approach to finding the four closest numbers in a list:
# Let's have a list of numbers. It has to be at least 4 numbers long.
numbers = [10, 4, 9, 1, 7, 12, 25, 26, 28, 29, 30, 77, 92]
max_span = 3  # four values whose neighbours are at most 1 apart span at most 3
numbers.sort()
# Now we have a sorted list.
delta = numbers[3] - numbers[0]  # How far apart the first four numbers in the sorted list are.
idx = 0  # Let's save our starting index.
for i in range(len(numbers) - 3):
    d = numbers[i+3] - numbers[i]
    if d < delta:
        # If some window of four is closer together, save its span and the index where it was found.
        delta = d
        idx = i
if delta <= max_span:
    print("closest numbers are {}".format(numbers[idx:idx+4]))
else:
    print("No sequence within the defined difference was found")

Here is my jab at the issue for OP's reference, as @kojiro and @ex4 have already supplied answers that deserve credit.
def find_neighbor(nums, dist, k=4):
    res = []
    nums.sort()
    for i in range(len(nums) - k + 1):  # + 1 so that the last window is included
        if nums[i + k - 1] - nums[i] <= dist * (k - 1):
            res.append(nums[i: i + k])
    return res
Here is the function in action:
>>> nums = [10, 11, 5, 6, 7, 8, 9] # slightly modified input for better demo
>>> find_neighbor(nums, 1)
[[5, 6, 7, 8], [6, 7, 8, 9], [7, 8, 9, 10], [8, 9, 10, 11]]
Assuming sorting is legal in tackling this problem, we first sort the input array. (I decided to sort in place for a marginal performance gain, but we could use sorted(nums) as well.) Then we slide a window of size k over the array and check whether the difference between the first and last element in that window is less than or equal to dist * (k - 1), since k elements that are each at most dist from their neighbour span at most that much. In the provided example, we would expect the difference between the two elements to be less than or equal to 1 * 3 = 3. If such a window exists, we append that subarray to res, which we return in the end.
If the goal is to find one window instead of all windows, we could simply return the subarray as soon as it is found instead of appending it to res.

You can do this in a generic fashion (i.e. for any size of delta or resulting largest group) using the zip function:
def deltaGroups(aList,maxDiff):
    sList = sorted(aList)
    diffs = [ (b-a)<=maxDiff for a,b in zip(sList,sList[1:]) ]
    breaks = [ i for i,(d0,d1) in enumerate(zip(diffs,diffs[1:]),1) if d0!=d1 ]
    groups = [ sList[s:e+1] for s,e in zip([0]+breaks,breaks+[len(sList)]) if diffs[s] ]
    return groups
Here's how it works:
1. Sort the list so that each number sits next to its closest neighbours.
2. Identify positions where the next number is within the allowed distance (diffs).
3. Get the index positions where compliance with the allowed distance changes (breaks), from eligible to non-eligible and from non-eligible to eligible. These correspond to the start and end of segments of the sorted list that have consecutive eligible pairs.
4. Extract subsets of the sorted list based on the start/end positions of consecutive eligible differences (groups).
The deltaGroups function returns a list of groups of at least 2 values that are within the distance constraint. You can use it to find the largest group using the max() function. Note that the input needs at least two values: with fewer, diffs is empty and diffs[s] raises an IndexError.
output:
q = [10,11,5,6,7,8]
m = deltaGroups(q,1)
print(q)
print(m)
print(max(m,key=len))
# [10, 11, 5, 6, 7, 8]
# [[5, 6, 7, 8], [10, 11]]
# [5, 6, 7, 8]
q = [15,1,9,3,6,16,8]
m = deltaGroups(q,2)
print(q)
print(m)
print(max(m,key=len))
# [15, 1, 9, 3, 6, 16, 8]
# [[1, 3], [6, 8, 9], [15, 16]]
# [6, 8, 9]
m = deltaGroups(q,3)
print(m)
print(max(m,key=len))
# [[1, 3, 6, 8, 9], [15, 16]]
# [1, 3, 6, 8, 9]
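For comparison, the same grouping can be sketched as a single explicit loop, which some readers may find easier to follow than the three comprehensions. This is an illustrative variant, not the answer's own code: `close_groups` and its `min_size` parameter are names invented here. Setting `min_size=4` ties it back to the original question.

```python
def close_groups(values, max_diff, min_size=2):
    """Group sorted values so that neighbours within a group differ by at most max_diff."""
    ordered = sorted(values)
    groups = []
    current = ordered[:1]  # empty when the input is empty
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev <= max_diff:
            current.append(nxt)       # still within reach of the previous value
        else:
            if len(current) >= min_size:
                groups.append(current)
            current = [nxt]           # gap too large: start a new group
    if len(current) >= min_size:
        groups.append(current)
    return groups


print(close_groups([10, 11, 5, 6, 7, 8], 1))              # [[5, 6, 7, 8], [10, 11]]
print(close_groups([10, 11, 5, 6, 7, 8], 1, min_size=4))  # [[5, 6, 7, 8]]
```

Unlike deltaGroups, this version returns an empty list for inputs with fewer than two values instead of raising.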

Related

How can I separate the numbers with the highest difference into different groups

So I wrote this code to find the differences between numbers.
def partition(lst: list):
    f = []
    for i in range(len(lst)):
        if i < len(lst)-1:
            diff = lst[i+1] - lst[i]
            f.append(diff)
        else:
            return f
and it works, but now I want to create another function, grouping, to split the list where the difference between two numbers is the greatest. For instance, if this is my list
[1,3,5,7,12,14,15]
after running partition(lst), I get
[2, 2, 2, 5, 2, 1]
Now I want grouping(lst, n) to split the list where the difference is widest, which is 5, so I want grouping to return
[(1,3,5,7),(12,14,15)]
Furthermore, grouping(lst, n) takes an int n that determines how many groupings are required. So, as long as n <= len(lst), n will be the number of groupings made.
To make clear what I mean: if n = 1, I want grouping(lst, n) to separate the numbers where the MAXIMUM DIFFERENCE OCCURS.
If n = 2, grouping(lst, n) should separate the numbers where the TOP TWO MAXIMUM DIFFERENCES OCCUR.
If n = 3, grouping(lst, n) should separate the numbers where the top 3 maximum differences occur, and so on.
Here is my code so far, including grouping(lst, n):
def partition(lst: list):
    f = []
    for i in range(len(lst)):
        if i < len(lst)-1:
            diff = lst[i+1] - lst[i]
            f.append(diff)
        else:
            return f
print(partition([1,3,5,7,12,14,15]))
def grouping(lst: list, n):
    for x in lst:
        if partition(x) == max(partition(lst)):
            lst.append(x)
What should I write under grouping(lst,n) to make it right?
Here is my answer to your problem:
import numpy
def grouping(input_user, n):
    diff = numpy.diff(input_user)
    heighests = numpy.sort(diff)[-n:]  # the n largest differences
    results = []
    result = [input_user[0]]
    for i, cell in enumerate(input_user[1:]):
        if diff[i] in heighests:
            # One of the n largest gaps sits just before cell: close the current group first.
            heighests = numpy.delete(heighests, numpy.where(heighests == diff[i])[0][0])
            results.append(result)
            result = []
        result.append(cell)
    if result:
        results.append(result)
    return results
for n in range(1, 4):
    print(f"n={n}: {grouping([1,3,5,7,12,14,15], n)}")
Result:
n=1: [[1, 3, 5, 7], [12, 14, 15]]
n=2: [[1], [3, 5, 7], [12, 14, 15]]
n=3: [[1], [3], [5, 7], [12, 14, 15]]
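The same idea can be sketched without numpy by ranking the gap positions and cutting the list there. This is only an illustration under two assumptions: the input is already sorted, and ties between equal gaps are broken in favour of the earlier position. The name `grouping_by_gaps` is made up for the example.

```python
def grouping_by_gaps(values, n):
    """Split values (assumed sorted) at the positions of the n largest gaps."""
    gaps = [b - a for a, b in zip(values, values[1:])]
    # Rank gap positions from largest to smallest gap; position i means
    # "cut between values[i] and values[i + 1]". sorted() is stable, so
    # equal gaps keep their left-to-right order.
    ranked = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    cut_after = sorted(ranked[:n])
    groups, start = [], 0
    for i in cut_after:
        groups.append(values[start : i + 1])
        start = i + 1
    groups.append(values[start:])
    return groups


for n in range(1, 4):
    print(f"n={n}: {grouping_by_gaps([1, 3, 5, 7, 12, 14, 15], n)}")
```

With n = 1 this gives [[1, 3, 5, 7], [12, 14, 15]], the split the question asks for; n cuts always produce n + 1 groups.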

Checking sum of items in a list if equals target value

I am trying to make a program that checks which items in a list sum to a target value and then outputs their indexes.
E.g.
li = [2, 5, 7, 9, 3]
target = 16
output: [2, 3]
li = [2, 5, 7, 9, 3]
target = 7
output: [0, 1]
Another way, assuming you can sort the list, is the following:
original_l = [1,2,6,4,9,3]
# Pair each value with its original index so the index survives sorting.
my_l = [ [index, item] for item,index in zip(original_l, range(0,len(original_l)))]
my_l_sort = sorted(my_l, key=lambda x: x[1])
start_i = 0
end_i = len(my_l_sort)-1
result = []
target = 7
while start_i < end_i:
    if my_l_sort[start_i][1] + my_l_sort[end_i][1] == target:
        result.append([my_l_sort[start_i][0], my_l_sort[end_i][0]])
        break
    elif my_l_sort[start_i][1] + my_l_sort[end_i][1] < target:
        start_i+=1
    else:
        end_i-=1
if len(result) != 0:
    print(f"Match for indices {result[0]}")
else:
    print("No match")
result[0] holds, as a two-element list, the two positions in original_l whose values sum to the target.
Is this homework?
Anyway, here is the answer you are looking for:
def check_sum(l, target):
    for i in range(len(l)):
        sum_temp = 0
        for j in range(i, len(l)):
            # Add first, then compare, so a run ending at the last element is also found.
            sum_temp += l[j]
            if sum_temp == target:
                return [i, j]
    return None
print(check_sum([2, 5, 7, 9, 3], 16))
"""
check_sum([2, 5, 7, 9, 3], 16)
>>> [2, 3]
check_sum([2, 5, 7, 9, 3], 7)
>>> [0, 1]
check_sum([2, 5, 7, 9, 3], 99)
>>> None
"""
For each starting index, the code accumulates consecutive values until the running sum equals the target, and returns the first and last index of that run.
If you are not worried about stack depth, for smaller inputs you can recurse.
We split the solutions into those that contain an index and those that do not, and merge them. The function returns the indices of all possible solutions.
There are O(2^n) subsets to explore.
def solve(residual_sum, original_list, present_index):
    '''Returns list of list of indices where sum gives residual_sum'''
    if present_index == len(original_list)-1:
        # If at end of list
        if residual_sum == original_list[-1]:
            # if residual sum is equal to present element
            # then this index is part of solution
            return [[present_index]]
        if residual_sum == 0:
            # 0 sum, empty solution
            return [[]]
        # Reaching here would mean list at caller side can not
        # lead to desired sum, so there is no solution possible
        return []
    all_sols = []
    # Get all solutions which contain i
    # since i is part of solution,
    # we only need to find residual_sum-original_list[present_index]
    solutions_with_i = solve(residual_sum-original_list[present_index], original_list, present_index+1)
    if solutions_with_i:
        # Add solutions containing i
        all_sols.extend([[present_index] + x for x in solutions_with_i])
    # solutions that don't contain i, so use same residual sum
    solutions_without_i = solve(residual_sum, original_list, present_index+1)
    if solutions_without_i:
        all_sols.extend(solutions_without_i)
    return all_sols
print(solve(16, [2, 5, 7, 9, 3], 0))
Indices:
[[0, 1, 3], [2, 3]]

Group Consecutive Increasing Numbers in List [duplicate]

This question already has answers here:
Decompose a list of integers into lists of increasing sequences
(6 answers)
Closed 2 years ago.
How can I group together consecutive increasing integers in a list? For example, I have the following list of integers:
numbers = [0, 5, 8, 3, 4, 6, 1]
I would like to group elements together as follow:
[[0, 5, 8], [3, 4, 6], [1]]
While the next integer is greater than the previous one, keep adding to the same nested list; once the next integer is smaller, add the nested list to the main list and start again.
I have tried a few different ways (while loop, for loop, enumerate and range), but cannot figure out how to make it append to the same nested list as long as the next integer is larger.
result = []
while (len(numbers) - 1) != 0:
    group = []
    first = numbers.pop(0)
    second = numbers[0]
    while first < second:
        group.append(first)
        if first > second:
            result.append(group)
            break
You could use a for loop:
numbers = [0, 5, 8, 3, 4, 6, 1]
result = [[]]
last_num = numbers[0]  # last number (to check if the next number is greater or equal)
for number in numbers:
    if number < last_num:
        result.append([])  # add a new consecutive list
    result[-1].append(number)
    last_num = number  # set last_num to this number, so it can be used later
print(result)
NOTE: This doesn't use .pop(), so the numbers list stays intact. Also, a single loop means O(N) time complexity.
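The same single-pass idea can also be wrapped in a generator, which avoids indexing numbers[0] and therefore copes with an empty input. This is just a sketch of one possible packaging; the name `increasing_runs` is made up here, and equal neighbours are kept in the same run, as in the loop above.

```python
def increasing_runs(numbers):
    """Yield runs of values in which each value is >= the one before it."""
    run = []
    for number in numbers:
        if run and number < run[-1]:
            yield run      # the run ended: hand it out and start a fresh one
            run = []
        run.append(number)
    if run:
        yield run          # do not forget the final run


print(list(increasing_runs([0, 5, 8, 3, 4, 6, 1])))  # [[0, 5, 8], [3, 4, 6], [1]]
print(list(increasing_runs([])))                     # []
```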
If pandas is allowed, I would do this:
import pandas as pd
numbers = [0, 5, 8, 3, 4, 6, 1]
df = pd.DataFrame({'n':numbers})
[ g['n'].values.tolist() for _,g in df.groupby((df['n'].diff()<0).cumsum())]
produces
[[0, 5, 8], [3, 4, 6], [1]]
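You can do this:
numbers = [0, 5, 8, 3, 4, 6, 1]
result = []
while numbers:
    secondresult = [numbers.pop(0)]
    # Keep taking numbers while the next one is larger than the last one taken.
    while numbers and numbers[0] > secondresult[-1]:
        secondresult.append(numbers.pop(0))
    result.append(secondresult)
print(result)
Use two nested while loops: the inner one appends to secondresult for as long as the numbers keep increasing, and the outer one appends each finished secondresult to result. Note that this empties the numbers list.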

Obtaining non decreasing subsequence from a python list efficiently

I wrote some code to obtain a non-decreasing subsequence from a Python list:
lis = [3, 6, 3, 8, 6, 4, 2, 9, 5]
ind = 0
newlis = []
while ind < len(lis):
    minele = min(lis[ind:])
    newlis.append(minele)
    ind = lis.index(minele, ind) + 1  # search from ind so an earlier duplicate is not picked up
print(newlis)
Though it seems to work fine with the test cases I tried, is there a more efficient way to do this? The worst-case time complexity of this code is O(n^2) when the list is already sorted, assuming the built-in min function uses linear search.
To be more precise, I want the longest possible non-decreasing sub-list, and the sub-list should start with the minimum element of the list. By sub-list, I mean that the elements need not be in a contiguous stretch of the original list (lis).
I'm almost convinced it can run in linear time. You just need to keep trying to build the longest sequence during one scan and either store them all, as I did, or keep only the currently longest one.
lis = [9, 1, 3, 8, 9, 6, 9, 8, 2, 3, 4, 5]
newlis = [[]]
minele = min(lis)
ind = lis.index(minele)
currmin = minele
seq = 0
longest = 0
longest_idx = None
while ind < len(lis):
    if lis[ind] >= currmin:
        newlis[seq].append(lis[ind])
        currmin = lis[ind]
        ind += 1
    else:
        if len(newlis[seq]) > longest:
            longest = len(newlis[seq])
            longest_idx = seq
        newlis.append([minele])
        currmin = minele
        seq += 1
if len(newlis[seq]) > longest:
    longest = len(newlis[seq])
    longest_idx = seq
print(newlis)
print(newlis[longest_idx])
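For reference, the standard way to get a longest non-decreasing subsequence is the bisect-based "patience" technique, which runs in O(n log n) rather than linear time. The sketch below is that swapped-in technique, not the scan above. The function name is invented for the example, and it assumes that "start with the minimum element" means starting at the first occurrence of the minimum.

```python
from bisect import bisect_right


def longest_non_decreasing_from_min(values):
    """Longest non-decreasing subsequence that starts at the first minimum."""
    if not values:
        return []
    suffix = values[values.index(min(values)):]  # nothing before the minimum can be used
    tails = []      # tails[k]: smallest possible last value of a subsequence of length k + 1
    tail_idx = []   # position in suffix of that last value
    parent = [-1] * len(suffix)  # previous element in the best subsequence ending here
    for i, v in enumerate(suffix):
        pos = bisect_right(tails, v)  # bisect_right lets equal values extend a subsequence
        if pos > 0:
            parent[i] = tail_idx[pos - 1]
        if pos == len(tails):
            tails.append(v)
            tail_idx.append(i)
        else:
            tails[pos] = v
            tail_idx[pos] = i
    # Walk the parent links back from the end of the longest subsequence.
    result = []
    i = tail_idx[-1]
    while i != -1:
        result.append(suffix[i])
        i = parent[i]
    return result[::-1]


print(longest_non_decreasing_from_min([9, 1, 3, 8, 9, 6, 9, 8, 2, 3, 4, 5]))  # [1, 2, 3, 4, 5]
```

Because every later value is at least the minimum, the first slot of `tails` is never overwritten, so the reconstructed subsequence always begins with the minimum element.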

Python, converting a list of indices to slices

So I have a list of indices,
[0, 1, 2, 3, 5, 7, 8, 10]
and want to convert it to this,
[[0, 3], [5], [7, 8], [10]]
this will run on a large number of indices.
Also, this technically isn't for slices in python, the tool I am working with is faster when given a range compared to when given the individual ids.
The pattern is based on being in a range, like slices work in Python. So in the example, the 1 and 2 are dropped because they are already included in the range of 0 to 3. The 5 would need to be accessed individually since it is not in a range, etc. This is more helpful when a large number of ids get included in a range such as [0, 5000].
Since you want the code to be fast, I wouldn't try to be too fancy. A straight-forward approach should perform quite well:
a = [0, 1, 2, 3, 5, 7, 8, 10]
it = iter(a)
start = next(it)
slices = []
for i, x in enumerate(it):
    # x is a[i + 1], so a[i] is the element just before it
    if x - a[i] != 1:
        end = a[i]
        if start == end:
            slices.append([start])
        else:
            slices.append([start, end])
        start = x
if a[-1] == start:
    slices.append([start])
else:
    slices.append([start, a[-1]])
Admittedly, that doesn't look too nice, but I expect the nicer solutions I can think of to perform worse. (I did not do a benchmark.)
Here is a slightly nicer, but slower solution:
from itertools import groupby
a = [0, 1, 2, 3, 5, 7, 8, 10]
slices = []
# Within a run of consecutive values, value minus position is constant.
for key, it in groupby(enumerate(a), lambda x: x[1] - x[0]):
    indices = [y for x, y in it]
    if len(indices) == 1:
        slices.append([indices[0]])
    else:
        slices.append([indices[0], indices[-1]])
A generator version:
import itertools
def runs(seq):
    previous = None
    start = None
    # The trailing None is a sentinel that flushes the last run.
    for value in itertools.chain(seq, [None]):
        if start is None:
            start = value
        if previous is not None and value != previous + 1:
            if start == previous:
                yield [previous]
            else:
                yield [start, previous]
            start = value
        previous = value
Since performance is an issue, go with the first solution by @SvenMarnach, but here is a fun one-liner split into two lines! :D
>>> from itertools import groupby, count
>>> indices = [0, 1, 2, 3, 5, 7, 8, 10]
>>> [[next(v)] + list(v)[-1:]
...  for k,v in groupby(indices, lambda x,c=count(): x-next(c))]
[[0, 3], [5], [7, 8], [10]]
Below is some simple Python code using numpy:
import numpy
def list_to_slices(inputlist):
    """
    Convert a flat list to a list of slices (note: sorts inputlist in place):
    test = [0,2,3,4,5,6,12,99,100,101,102,13,14,18,19,20,25]
    list_to_slices(test)
    -> [(0, 0), (2, 6), (12, 14), (18, 20), (25, 25), (99, 102)]
    """
    inputlist.sort()
    pointers = numpy.where(numpy.diff(inputlist) > 1)[0]
    pointers = zip(numpy.r_[0, pointers+1], numpy.r_[pointers, len(inputlist)-1])
    slices = [(inputlist[i], inputlist[j]) for i, j in pointers]
    return slices
If your input is a sorted sequence, which I assume it is, you can do it in a minimalistic way in three steps by employing the good old zip() function:
x = [0, 1, 2, 3, 5, 7, 8, 10]
# find beginnings and endings of sequential runs,
# N.B. the first beginning and the last ending are not included
begs_ends_iter = zip(
    *[(x1, x0) for x0, x1 in zip(x[:-1], x[1:]) if x1 - x0 > 1]
)
# handling case when there is only one sequential run
begs, ends = tuple(begs_ends_iter) or ((), ())
# add the first beginning and the last ending,
# combine corresponding beginnings and endings,
# and convert isolated elements into the lists of length one
y = [
    [beg] if beg == end else [beg, end]
    for beg, end in zip(tuple(x[:1]) + begs, ends + tuple(x[-1:]))
]
If your input is unsorted, sort it first; the result is a sorted list, which is a sequence. If you have a sorted iterable and do not want to convert it to a sequence (e.g., because it is too long), you can use the chain() and pairwise() functions from the itertools module (pairwise() is available since Python 3.10):
from itertools import chain, pairwise
x = [0, 1, 2, 3, 5, 7, 8, 10]
# find beginnings and endings of sequential runs,
# N.B. the last beginning and the first ending are None's
begs, ends = zip(
    *[
        (x1, x0)
        for x0, x1 in pairwise(chain([None], x, [None]))
        if x0 is None or x1 is None or x1 - x0 > 1
    ]
)
# removing the last beginning and the first ending,
# combine corresponding beginnings and endings,
# and convert isolated elements into the lists of length one
y = [
    [beg] if beg == end else [beg, end]
    for beg, end in zip(begs[:-1], ends[1:])
]
These solutions are similar to the one proposed by bougui, but without numpy. That may be more efficient if the data is not already in a numpy array and is either not a very large sequence or, on the contrary, an iterable too large to fit into memory.
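Whichever variant is used, the [start] / [start, end] output can be sanity-checked by expanding it back into range objects and comparing with the original indices. This is a minimal sketch: `to_ranges` is a hypothetical helper, and how the asker's tool actually accepts ranges is not stated in the question.

```python
def to_ranges(slices):
    """Turn [start] or [start, end] entries (end inclusive) into range objects."""
    return [range(s[0], s[-1] + 1) for s in slices]


indices = [0, 1, 2, 3, 5, 7, 8, 10]
slices = [[0, 3], [5], [7, 8], [10]]

# Expanding every range again should reproduce the original indices.
expanded = [i for r in to_ranges(slices) for i in r]
print(expanded == indices)  # True
```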
