I have a list of numbers, say,
[1,2,3,6,8,9,10,11]
First, I want to get the sum of the differences (step sizes) between adjacent numbers (n, n+1) in the list.
Second, if a set of consecutive numbers has a difference of 1 between them, put them in a list; there are two such lists in this example,
[1,2,3]
[8,9,10,11]
and then put the remaining numbers in another list; there is only one such list in the example,
[6].
Third, get the lists with the max/min sizes from the sequential lists ([1,2,3] and [8,9,10,11] in this example); the max list is,
[8,9,10,11]
min list is
[1,2,3].
What's the best way to implement this?
First, I want to get the sum of the differences (step size) between
the numbers (n, n+1) in the list.
Use sum on the successive differences of elements in the list:
>>> sum(lst[i] - x for i, x in enumerate(lst[:-1], start=1))
10
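As a cross-check, the same total can be sketched with zip over adjacent pairs. Because the differences telescope, the sum always equals the last element minus the first. The name lst follows the snippet above; the helper name sum_of_steps is only illustrative.

```python
def sum_of_steps(lst):
    """Sum of the differences between each element and its predecessor."""
    return sum(b - a for a, b in zip(lst, lst[1:]))

lst = [1, 2, 3, 6, 8, 9, 10, 11]
total = sum_of_steps(lst)
print(total)             # 10
print(lst[-1] - lst[0])  # 10, because the differences telescope
```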
Second, if a set of consecutive numbers having a difference of 1 between them, put them in a list, i.e. there are two such lists in
this example, and then put the rest numbers in another list, i.e.
there is only one such list in the example,
itertools.groupby does this by grouping on the difference between each element and a running itertools.count counter; consecutive numbers all share the same difference:
>>> from itertools import groupby, count
>>> c = count()
>>> result = [list(g) for i, g in groupby(lst, key=lambda x: x-next(c))]
>>> result
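[[1, 2, 3], [6], [8, 9, 10, 11]]
Third, get the lists with the max/min sizes from above
Use max and min with len as the key function, since the question asks about size (key=sum would compare totals instead):
>>> max(result, key=len)
[8, 9, 10, 11]
>>> min(result, key=len)
[6]
Note that [6] is the shortest group overall. To consider only the sequential runs, filter out the single-element groups first, e.g. min((g for g in result if len(g) > 1), key=len), which gives [1, 2, 3] for the list in the question.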
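One possible way to package the grouping step, sketched with enumerate instead of a shared count object so the function can be called repeatedly without resetting any state. The function name split_runs and its returned tuple are assumptions made for illustration, not part of the answer above.

```python
from itertools import groupby

def split_runs(lst):
    """Split lst into runs of consecutive integers (step of exactly 1).

    Returns (runs, singles): runs holds groups of two or more numbers,
    singles collects the numbers that belong to no run.
    """
    # Elements of one consecutive run share the same (value - index) offset.
    groups = [[x for _, x in g]
              for _, g in groupby(enumerate(lst), key=lambda p: p[1] - p[0])]
    runs = [g for g in groups if len(g) > 1]
    singles = [g[0] for g in groups if len(g) == 1]
    return runs, singles

runs, singles = split_runs([1, 2, 3, 6, 8, 9, 10, 11])
print(runs)                # [[1, 2, 3], [8, 9, 10, 11]]
print(singles)             # [6]
print(max(runs, key=len))  # [8, 9, 10, 11]
print(min(runs, key=len))  # [1, 2, 3]
```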
I wonder if you've already got the answer to this (given the 4 missing from your expected results), as the first thing I naively tried produced that answer. (That, and/or it reads like a homework question.)
>>> a=[1,2,3,4,6,8,9,10,11]
>>> sum([a[x+1] - a[x] for x in range(len(a)-1)])
10
>>> [a[x] for x in range(len(a)-1) if abs(a[x] - a[x+1]) ==1]
[1, 2, 3, 8, 9, 10]
Alternatively, try :
a = [1, 2, 3, 6, 8, 9, 10, 11]
sets = []
cur_set = set()
total_diff = 0
for index in range(len(a) - 1):
    total_diff += a[index + 1] - a[index]
    if a[index + 1] - a[index] == 1:
        # Extend the current run with both neighbours
        cur_set = cur_set | set([a[index + 1], a[index]])
    else:
        # The run is broken: store it and start a new one
        if len(cur_set) > 0:
            sets.append(cur_set)
            cur_set = set()
if len(cur_set) > 0:
    sets.append(cur_set)
all_seq_nos = set()
for seq_set in sets:
    all_seq_nos = all_seq_nos | seq_set
non_seq_set = set(a) - all_seq_nos
print("Sum of differences is {0:d}".format(total_diff))
print("sets of sequential numbers are :")
for seq_set in sets:
    print(sorted(list(seq_set)))
print("set of non-sequential numbers is :")
print(sorted(list(non_seq_set)))
# Compare by size (len), since the question asks for the largest/smallest lists
big_set = max(sets, key=len)
sml_set = min(sets, key=len)
print("Biggest set of sequential numbers is :")
print(sorted(list(big_set)))
print("Smallest set of sequential numbers is :")
print(sorted(list(sml_set)))
Which will produce the output :
Sum of differences is 10
sets of sequential numbers are :
[1, 2, 3]
[8, 9, 10, 11]
set of non-sequential numbers is :
[6]
Biggest set of sequential numbers is :
[8, 9, 10, 11]
Smallest set of sequential numbers is :
[1, 2, 3]
Hopefully that all helps ;-)
I got a list like this: my_list = [5, 9, 3, 4, 1, 8, 7, 6]
And it can have an undefined number of integers in it. I need to perform a calculation that skips the second number of each group of three and computes something like this: (5 - 3) + (4 - 8) + 7, repeating the same pattern for the rest of the list.
I have tried a for loop like this one:
for i in range(0, len(my_list), 2):
    print(my_list[i])
But it seems wrong and I don't know how to proceed further.
You want the sum:
my_list[0] - my_list[2]
+ my_list[3] - my_list[5]
+ my_list[6] - ...
The elements in the left column, with a + sign, are the elements of the slice my_list[::3]. The elements in the right column, with a - sign, are the elements of the slice my_list[2::3].
Solution:
Thus the function you are looking for is:
def f(my_list):
    return sum(my_list[::3]) - sum(my_list[2::3])
# f([5, 9, 3, 4, 1, 8, 7, 6]) == 5
Iterating to print intermediary results:
If you want to print the intermediary results, you can iterate through my_list[::3] and my_list[2::3] simultaneously using itertools.zip_longest:
from itertools import zip_longest
for a,b in zip_longest(my_list[::3], my_list[2::3], fillvalue=0):
    print('{} - {} = {}'.format(a, b, a-b))
# OUTPUT:
# 5 - 3 = 2
# 4 - 8 = -4
# 7 - 0 = 7
See also:
Understanding slice notation
builtin function zip documentation
itertools.zip_longest documentation
A nice Pythonic way to achieve it would be with itertools:
from itertools import compress, cycle
my_list = [5, 9, 3, 4, 1, 8, 7, 6]
# Choose first and third items of each 3
narrowed = compress(my_list, cycle([1,0,1]))
# Alternate between positive and negative numbers
signs = map(lambda x,y: x*y, narrowed, cycle([1,-1]))
# Sum everything
sum(signs)
Or in one line:
sum(map(lambda x,sign: sign*x, compress(my_list, cycle([1,0,1])), cycle([1,-1])))
cycle - repeats an iterable indefinitely
compress - selects a subset of the items using a mask of truthy/falsy selectors
map - applies a function to the items of one or more iterables
sum - adds up the results
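To see what each step contributes, the intermediate iterators can be materialised as lists. This is a small sketch using the same my_list; the variable names picked and signed are illustrative only.

```python
from itertools import compress, cycle

my_list = [5, 9, 3, 4, 1, 8, 7, 6]

# The mask 1,0,1 keeps the first and third item of every group of three.
picked = list(compress(my_list, cycle([1, 0, 1])))
print(picked)  # [5, 3, 4, 8, 7]

# Alternate the sign: +, -, +, -, ...
signed = [x * s for x, s in zip(picked, cycle([1, -1]))]
print(signed)  # [5, -3, 4, -8, 7]

print(sum(signed))  # 5
```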
Simple and Efficient Solution.
score = 0
for i in range(0, len(my_list), 3):
    j = i+2
    if j < len(my_list):
        score += my_list[i] - my_list[j]
    else:
        score += my_list[i]
print(score)
I am trying to find the 4 closest values in a given list, within a defined value for the difference. The list can be of any length and is sorted in increasing order. Below is what I have tried:
holdlist=[]
m=[]
nlist = []
t = 1
q = [2,3,5,6,7,8]
for i in range(len(q)-1):
    for j in range(i+1,len(q)):
        if abs(q[i]-q[j])<=1:
            holdlist.append(i)
            holdlist.append(j)
            t=t+1
            break
        else:
            if t != 4:
                holdlist=[]
                t=1
            elif t == 4:
                nlist = holdlist
                holdlist=[]
                t=1
nlist = list(dict.fromkeys(nlist))
for num in nlist:
    m.append(q[num])
The defined difference value here is 1, "q" is the list, and I am trying to get the result in "m" to be [5,6,7,8], but it turns out to be an empty list.
This works only if the list "q" is [5,6,7,8,10,11]. My guess is that after comparing the last value, the for loop ends and the result does not go into "holdlist".
Is there a more elegant way of writing the code?
Thank you.
One solution would be to sort the input list and find the smallest window of four elements. Given the example input, this is
min([sorted(q)[i:i+4] for i in range(len(q) - 3)],
    key=lambda w: w[3] - w[0])
For other inputs this still returns a window even when the closest four elements are spaced more than 1 apart, so I'd use it with a bit of error handling:
assert len(q) >= 4
answer = min([sorted(q)[i:i+4] for i in range(len(q) - 3)], key=lambda w: w[3] - w[0])
assert answer[3] - answer[0] < 4
Written out and annotated:
def closest_four(q):
    if len(q) < 4:
        raise RuntimeError("Need at least four members in the list!")
    sorted_q = sorted(q)
    windows = [sorted_q[i:i+4] for i in range(len(q) - 3)]  # All the chunks of four elements
    def size(window):
        """The size of the window."""
        return window[3] - window[0]
    answer = min(windows, key=size)  # The smallest window, by size
    if size(answer) > 3:
        return "No group of four elements has a maximum distance of 1"
    return answer
This would be one easy approach to find the four closest numbers in a list:
# Let's have a list of numbers. It has to be at least 4 numbers long
numbers = [10, 4, 9, 1, 7, 12, 25, 26, 28, 29, 30, 77, 92]
numbers.sort()
# Now we have a sorted list
delta = numbers[3] - numbers[0]  # How far apart the first four numbers in the sorted list are
idx = 0  # Let's save our starting index
for i in range(len(numbers) - 3):
    d = numbers[i+3] - numbers[i]
    if d < delta:
        # If some window of four is closer together, save its spread and the index where it was found
        delta = d
        idx = i
if len(numbers[idx:idx+4]) == 4:
    print("closest numbers are {}".format(numbers[idx:idx+4]))
else:
    print("No sequence with the defined difference was found")
Here is my stab at the issue for OP's reference, as @kojiro and @ex4 have already supplied answers that deserve credit.
def find_neighbor(nums, dist, k=4):
    res = []
    nums.sort()  # sorts the caller's list in place
    for i in range(len(nums) - k + 1):  # + 1 so the last window is included
        # k elements have k - 1 gaps between them
        if nums[i + k - 1] - nums[i] <= dist * (k - 1):
            res.append(nums[i: i + k])
    return res
Here is the function in action:
>>> nums = [10, 11, 5, 6, 7, 8, 9] # slightly modified input for better demo
>>> find_neighbor(nums, 1)
[[5, 6, 7, 8], [6, 7, 8, 9], [7, 8, 9, 10], [8, 9, 10, 11]]
Assuming sorting is legal in tackling this problem, we first sort the input array. (I decided to sort in place for a marginal performance gain, but we could use sorted(nums) as well.) Then we essentially slide a window of size k over the list and check whether the difference between the first and last elements within that window is less than or equal to dist * (k - 1), since k elements have only k - 1 gaps between them. In the provided example, for instance, we would expect the difference between the two elements to be less than or equal to 1 * 3 = 3. If such a window exists, we append that subarray to res, which we return in the end.
If the goal is to find a window instead of all windows, we could simply return the subarray without appending it to res.
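A minimal sketch of that single-window variant, assuming a window qualifies when its span is at most dist * (k - 1). The name find_first_neighbor and the choice to return None when nothing qualifies are assumptions for illustration. It works on a sorted copy so the caller's list is left untouched.

```python
def find_first_neighbor(nums, dist, k=4):
    """Return the first window of k sorted values spanning at most dist * (k - 1), or None."""
    ordered = sorted(nums)  # sorted copy; the input list is not modified
    for i in range(len(ordered) - k + 1):
        window = ordered[i:i + k]
        if window[-1] - window[0] <= dist * (k - 1):
            return window
    return None

print(find_first_neighbor([2, 3, 5, 6, 7, 8], 1))  # [5, 6, 7, 8]
print(find_first_neighbor([1, 4, 9, 20], 1))       # None
```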
You can do this in a generic fashion (i.e. for any size of delta or resulting largest group) using the zip function:
def deltaGroups(aList,maxDiff):
    sList = sorted(aList)
    diffs = [ (b-a)<=maxDiff for a,b in zip(sList,sList[1:]) ]
    breaks = [ i for i,(d0,d1) in enumerate(zip(diffs,diffs[1:]),1) if d0!=d1 ]
    groups = [ sList[s:e+1] for s,e in zip([0]+breaks,breaks+[len(sList)]) if diffs[s] ]
    return groups
Here's how it works:
Sort the list in order to have each number next to the closest other numbers
Identify positions where the next number is within the allowed distance (diffs)
Get the index positions where compliance with the allowed distance changes (breaks) from eligible to non-eligible and from non-eligible to eligible
This corresponds to start and end of segments of the sorted list that have consecutive eligible pairs.
Extract subsets of the sorted list based on the start/end positions of consecutive eligible differences (groups)
The deltaGroups function returns a list of groups with at least 2 values that are within the distance constraints. You can use it to find the largest group using the max() function.
output:
q = [10,11,5,6,7,8]
m = deltaGroups(q,1)
print(q)
print(m)
print(max(m,key=len))
# [10, 11, 5, 6, 7, 8]
# [[5, 6, 7, 8], [10, 11]]
# [5, 6, 7, 8]
q = [15,1,9,3,6,16,8]
m = deltaGroups(q,2)
print(q)
print(m)
print(max(m,key=len))
# [15, 1, 9, 3, 6, 16, 8]
# [[1, 3], [6, 8, 9], [15, 16]]
# [6, 8, 9]
m = deltaGroups(q,3)
print(m)
print(max(m,key=len))
# [[1, 3, 6, 8, 9], [15, 16]]
# [1, 3, 6, 8, 9]
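For comparison, the same grouping can be sketched as a single pass over the sorted list, which may be easier to step through than the index arithmetic. The name delta_groups_loop is made up for this illustration; like deltaGroups, it keeps only groups with at least two values.

```python
def delta_groups_loop(values, max_diff):
    """Group sorted values whose neighbours differ by at most max_diff."""
    groups = []
    current = []
    for v in sorted(values):
        if current and v - current[-1] > max_diff:
            # Gap too large: close the current group and start a new one.
            groups.append(current)
            current = []
        current.append(v)
    if current:
        groups.append(current)
    # Drop isolated values so only groups of two or more remain.
    return [g for g in groups if len(g) > 1]

print(delta_groups_loop([10, 11, 5, 6, 7, 8], 1))     # [[5, 6, 7, 8], [10, 11]]
print(delta_groups_loop([15, 1, 9, 3, 6, 16, 8], 2))  # [[1, 3], [6, 8, 9], [15, 16]]
```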
Given a list of numbers, create a new list of numbers such that the first and last numbers are added and stored as the first number, the second and second-to-last numbers are stored as the second number, and so on
num_list = [1,2,3,4,5,6]
num_list2 = [num_list[-1] + num_list[0], num_list[-2] + num_list[1],
num_list[-3] + num_list[2]]
print(num_list2)
output is [7,7,7]
I got the correct output this way, but I am sure this is not an efficient way to do it. Is there a better way? I am also supposed to check for even and odd lengths of the list, and if it's an odd number of integers, add the central integer in the original list to the end of the new list, but I don't know how I would go about doing this.
I think this is more efficient; I simply used a for loop:
num_list2 = []
num_list = [1,2,3,4,5,6]
# Integer division; round(len(num_list)/2) would behave inconsistently for odd lengths
for i in range(len(num_list)//2):
    num_list2.append(num_list[i]+num_list[-(i+1)])
print(num_list2)
Output:
[7, 7, 7]
Using reversed:
[x + y for x, y in zip(num_list, list(reversed(num_list)))][:len(num_list)//2]
Out[406]: [7, 7, 7]
Here's an inefficient[1], but clear way of doing this:
from itertools import zip_longest # or izip_longest in Python2
lst = [1,2,3,4,5,6]
chop_index = len(lst) // 2 # (or +1, depending on how you want to handle odd sized lists)
lh, rh = lst[:chop_index], lst[:chop_index-1:-1]
print(lh, rh) # To see what's going on in the "chopping"
sums = [x + y for (x,y) in zip_longest(lh, rh, fillvalue=0)]
print(sums)
You could improve it by using islice and reversed iterators, or use index math exclusively.
Output:
lst = [1,2,3,4,5,6] => [7, 7, 7]
lst = [1,2,3,4,5,6,7] => [8, 8, 8, 4]
[1] This makes two copies of the list parts. For long lists this is silly, and you shouldn't use this method. It was mostly written to highlight zip_longest's fillvalue optional argument.
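One way the suggested improvement might look, sketched with islice and reversed so that neither half of the list is copied. The function name fold_sums is hypothetical, and appending the central element unchanged for odd lengths is an assumption taken from the question.

```python
from itertools import islice

def fold_sums(lst):
    """Add the i-th element from the front to the i-th element from the back.

    For odd lengths the central element is appended unchanged.
    """
    half = len(lst) // 2
    # zip stops after `half` pairs; reversed() walks backwards without copying.
    sums = [x + y for x, y in zip(islice(lst, half), reversed(lst))]
    if len(lst) % 2:
        sums.append(lst[half])
    return sums

print(fold_sums([1, 2, 3, 4, 5, 6]))     # [7, 7, 7]
print(fold_sums([1, 2, 3, 4, 5, 6, 7]))  # [8, 8, 8, 4]
```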
Using itertools.islice on a generator:
from itertools import islice
num_list = [1,2,3,4,5,6]
generator = (x + y for x, y in zip(num_list, num_list[::-1]))
print(list(islice(generator, len(num_list)//2)))
# [7, 7, 7]
You can use the following method, which also handles lists of odd length.
def sum_start_end(list_):
    result = [x + y for x, y in zip(list_, list_[::-1])][:len(list_) // 2]
    if len(list_) % 2 != 0:
        # Odd length: append the central element unchanged
        result.append(list_[len(list_) // 2])
    return result
so for an even-length list
>>> num_list = [1, 2, 3, 4, 5, 6]
>>> sum_start_end(num_list)
[7, 7, 7]
and for an odd-length list
>>> num_list = [1, 2, 3, 4, 5, 6, 7]
>>> sum_start_end(num_list)
[8, 8, 8, 4]
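It's simpler than you imagine.
Just observe your manual attempt and try to infer from it. We can simply do
num_list2 = []
x = len(num_list)//2 + len(num_list)%2
for i in range(x):
    sumBoth = num_list[i] + num_list[-i-1]
    num_list2.append(sumBoth)
or with a simpler one-liner
num_list2 = [num_list[i] + num_list[-i-1] for i in range(len(num_list)//2 + len(num_list)%2)]
This runs for even as well as odd lengths because of the len(num_list)%2 at the end of the range. Note that for an odd length the last iteration pairs the central element with itself, so it is added twice (4 becomes 8 for [1,2,3,4,5,6,7]); append num_list[len(num_list)//2] separately if it should be kept unchanged.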
I'm trying to compare two huge lists which each contain 10,000+ lists of integers. Each sub-list contains 20 integers, which are random between 1 and 99. Within the sub-lists all integers are unique.
list1 = [[1, 25, 23, 44, ...], [3, 85, 9, 24, 34, ...], ...]
list2 = [[3, 83, 45, 24, ...], [9, 82, 3, 47, 36, ...], ...]
result = compare_lists(list1, list2)
The compare_lists() function should compare the integers at the same positions in two sub-lists, and return the two sub-lists if the integers differ at every position.
It is obviously very inefficient to loop through each sub-list as there are 100 Million+ possible combinations. (each of the 10,000+ sub-lists in list1 gets compared to 10,000+ in list2)
import itertools
def compare_lists(list1, list2):
    for (a, b) in itertools.product(list1, list2):
        count = 0
        for z in range(20):
            if a[z] != b[z]:
                count += 1
        if count == 20:
            yield [a, b]
For example (I'll use 4 integers per list):
a = [1, 2, 3, 4] # True
b = [5, 6, 7, 8] # (integers are different)
a = [1, 2, 3, 4] # True
b = [2, 3, 4, 1] # (same integers but not in same position, still true)
a = [1, 2, 3, 4] # False
b = [1, 6, 7, 8] # (position [0] is identical)
itertools.product appears to be very inefficient in situations like this. Is there a faster or more efficient way to do this?
Sorry if this is unclear, I've only recently started using Python.
I don't know how to reduce the number of list-list comparisons based on some precomputed data in general.
Maybe you can get some advantage if the dataset has some property. For example, if you know that the vast majority of the possible 100M+ pairs will be in your output, I would focus on finding the small minority of rejected pairs. If value V appears at position P in a sublist, you can categorize the data in such a way that every sublist belongs to 20 categories (P,V) out of roughly 2K possibilities (20 positions * 99 values). Two sublists compare False if they share a category. This way you could build, in a few steps, a set of (i,j) pairs such that list1[i] compares False with list2[j]. The output is then everything else from the Cartesian product of possible indices i,j.
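A rough sketch of that categorisation idea, under the assumption that rejected pairs are a small minority (otherwise the rejected set itself becomes large). The names rejected_pairs and accepted_pairs are invented for this example, and the final loop still visits every index pair, so only the per-pair element comparison is saved.

```python
from collections import defaultdict

def rejected_pairs(list1, list2):
    """Return the set of (i, j) where list1[i] and list2[j] share a value at some position."""
    # Index list2 by category: (position, value) -> indices of sublists in that category.
    by_category = defaultdict(list)
    for j, sub in enumerate(list2):
        for pos, val in enumerate(sub):
            by_category[(pos, val)].append(j)

    rejected = set()
    for i, sub in enumerate(list1):
        for pos, val in enumerate(sub):
            for j in by_category.get((pos, val), ()):
                rejected.add((i, j))
    return rejected

def accepted_pairs(list1, list2):
    """Yield [a, b] for every pair that differs at every position."""
    rejected = rejected_pairs(list1, list2)
    for i, a in enumerate(list1):
        for j, b in enumerate(list2):
            if (i, j) not in rejected:
                yield [a, b]

list1 = [[1, 2, 3, 4], [5, 6, 7, 8]]
list2 = [[1, 6, 7, 8], [2, 3, 4, 1]]
print(list(accepted_pairs(list1, list2)))
# [[[1, 2, 3, 4], [2, 3, 4, 1]], [[5, 6, 7, 8], [2, 3, 4, 1]]]
```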
BTW, you can make the comparison a little bit more efficient than it currently is.
One matching pair a[z] == b[z] is enough to know the result is False.
for z in range(20):
    if a[z] == b[z]:
        break
else:
    yield [a, b]
or equivalent:
if all(i != j for i,j in zip(a,b)):
    yield [a, b]
I did not run a timing test to see which one is faster. Either way, the speedup is probably marginal.
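For anyone who wants to measure it, a timing comparison could be sketched with timeit along these lines. The two helper names are hypothetical wrappers around the snippets above, the sample data is random, and the numbers will vary by machine and Python version, so no result is claimed here.

```python
import random
import timeit

def differs_loop(a, b):
    """for/else version: stop at the first matching position."""
    for z in range(len(a)):
        if a[z] == b[z]:
            break
    else:
        return True
    return False

def differs_all(a, b):
    """all()/zip version of the same test."""
    return all(i != j for i, j in zip(a, b))

random.seed(0)  # fixed seed so runs are repeatable
a = random.sample(range(1, 100), 20)
b = random.sample(range(1, 100), 20)

# Both versions must agree before timing them.
assert differs_loop(a, b) == differs_all(a, b)
print("loop:", timeit.timeit(lambda: differs_loop(a, b), number=10_000))
print("all: ", timeit.timeit(lambda: differs_all(a, b), number=10_000))
```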
I have two lists of the same length which contains a variety of different elements. I'm trying to compare them to find the number of elements which exist in both lists, but have different indexes.
Here are some example inputs/outputs to demonstrate what I mean:
>>> compare([1, 2, 3, 4], [4, 3, 2, 1])
4
>>> compare([1, 2, 3], [1, 2, 3])
0
# Each item in the first list has the same index in the other
>>> compare([1, 2, 4, 4], [1, 4, 4, 2])
2
# The '4' at index 2 doesn't count, since it has the same index in both lists
>>> compare([1, 2, 3, 3], [5, 3, 5, 5])
1
# Duplicates don't count
The lists are always the same size.
This is the algorithm I have so far:
def compare(list1, list2):
    # Eliminate any direct matches (filter both lists from the same pairs,
    # so the second filter does not see an already shortened list1)
    pairs = [(a, b) for (a, b) in zip(list1, list2) if a != b]
    list1 = [a for (a, b) in pairs]
    list2 = [b for (a, b) in pairs]
    out = 0
    for possible in list1:
        if possible in list2:
            index = list2.index(possible)
            del list2[index]
            out += 1
    return out
Is there a more concise and eloquent way to do the same thing?
This Python function works for the examples you provided:
def compare(list1, list2):
    D = {e:i for i, e in enumerate(list1)}
    return len(set(e for i, e in enumerate(list2) if D.get(e) not in (None, i)))
Since duplicates don't count, you can use sets to find only the distinct elements in each list. A set only holds unique elements. Then keep only the elements shared between both lists whose first positions, found with list.index, differ:
def compare(l1, l2):
    s1, s2 = set(l1), set(l2)
    shared = s1 & s2  # intersection, only the elements in both
    return len([e for e in shared if l1.index(e) != l2.index(e)])
You can actually bring this down to a one-liner if you want
def compare(l1, l2):
    return len([e for e in set(l1) & set(l2) if l1.index(e) != l2.index(e)])
Alternative:
Functionally, you can use reduce (in Python 3, you have to do from functools import reduce first, since it is no longer a builtin). This avoids constructing the list, which saves memory. It uses a lambda function to do the work.
def compare(l1, l2):
    return reduce(lambda acc, e: acc + int(l1.index(e) != l2.index(e)),
                  set(l1) & set(l2), 0)
A brief explanation:
reduce is a functional programming construct that traditionally reduces an iterable to a single item. Here we use reduce to reduce the set intersection to a single value.
lambda functions are anonymous functions. Saying lambda x, y: x + y is like saying def func(x, y): return x + y except that the function has no name. reduce takes a function as its first argument. The first argument the lambda receives when used with reduce (acc here) is the result of the previous call, the accumulator.
set(l1) & set(l2) is a set consisting of unique elements that are in both l1 and l2. It is iterated over, and each element is taken out one at a time and used as the second argument to the lambda function.
0 is the initial value for the accumulator. We use this since we assume there are 0 shared elements with different indices to start.
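Since the lambda only ever adds 0 or 1 per element, the same result can be sketched with sum over a generator expression, which also avoids building a list. compare_reduce repeats the function above under a different name so the two can be run side by side; compare_sum is an illustrative name, not something from the answer.

```python
from functools import reduce

def compare_reduce(l1, l2):
    return reduce(lambda acc, e: acc + int(l1.index(e) != l2.index(e)),
                  set(l1) & set(l2), 0)

def compare_sum(l1, l2):
    # True counts as 1 and False as 0 when summed.
    return sum(l1.index(e) != l2.index(e) for e in set(l1) & set(l2))

for l1, l2 in [([1, 2, 3, 4], [4, 3, 2, 1]), ([1, 2, 3], [1, 2, 3])]:
    print(compare_reduce(l1, l2), compare_sum(l1, l2))
# 4 4
# 0 0
```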
I don't claim it is the simplest answer, but it is a one-liner.
import numpy as np
import itertools
l1 = [1, 2, 3, 4]
l2 = [1, 3, 2, 4]
print(len(np.unique(list(itertools.chain.from_iterable([[a,b] for a,b in zip(l1,l2) if a!= b])))))
Explanation:
[[a,b] for a,b in zip(l1,l2) if a!= b]
is the list of pairs from zip(l1,l2) whose items differ. The number of elements in this list is the number of positions where the two lists differ.
Then, list(itertools.chain.from_iterable(...)) flattens a list of lists into a single list. For instance:
>>> list(itertools.chain.from_iterable([[3,2,5],[5,6],[7,5,3,1]]))
[3, 2, 5, 5, 6, 7, 5, 3, 1]
Then, discard duplicates with np.unique(), and take len().
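The same count can be sketched without NumPy by collecting the mismatched values in a set. The function name count_mismatched_values is hypothetical. Note that this mirrors the one-liner above rather than the asker's compare function, so results can differ for inputs containing values that appear in only one of the lists.

```python
def count_mismatched_values(l1, l2):
    """Number of distinct values that appear at a position where l1 and l2 differ."""
    mismatched = {v for a, b in zip(l1, l2) if a != b for v in (a, b)}
    return len(mismatched)

print(count_mismatched_values([1, 2, 3, 4], [1, 3, 2, 4]))  # 2
```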