I've got an optimization problem in which I need to minimize the sum product of two uneven but consecutive arrays, say:
A = [1, 2, 3]
B = [4, 9, 5, 3, 2, 10]
Shuffling of values is not allowed i.e. the index of the arrays must remain the same.
In other words, it is a distribution minimization of array A over array B in consecutive order.
Or: given that len(B) >= len(A), minimize the sum product of the values of array A (of length n) with n values of array B, without changing the order of array A or B.
In this case, the minimum would be:
min_sum = 1*4 + 2*3 + 3*2 = 16
A brute force approach to this problem would be:
from itertools import combinations
sums = [sum(a*b for a,b in zip(A,b)) for b in combinations(B,len(A))]
min_sum = min(sums)
I need to do this for many sets of arrays however. I see a lot of overlap with the knapsack problem and I have the feeling that it should be solved with dynamic programming. I am stuck however in how to write an efficient algorithm to perform this.
Any help would be greatly appreciated!
Having two lists
A = [1, 2, 3]
B = [4, 9, 5, 3, 2, 10]
the optimal sum product can be found using:
min_sum = sum(a*b for a,b in zip(sorted(A), sorted(B)[:len(A)][::-1]))
In case A is always given sorted, this simplified version can be used:
min_sum = sum(a*b for a,b in zip(A, sorted(B)[:len(A)][::-1]))
The important part(s) to note:
You need factors of A sorted. sorted(A) will do this job, without modifying the original A (in contrast to A.sort()). In case A is already given sorted, this step can be left out.
You need the N lowest values from B, where N is the length of A. This can be done with sorted(B)[:len(A)]
In order to get the minimal sum of products, you need to multiply the highest number of A with the lowest of B, the second highest of A with the second lowest of B, and so on. That is why, after getting the N lowest values of B, the order gets reversed with [::-1].
Output
print(min_sum)
# 16
print(A)
# [1, 2, 3] <- The original list A is not modified
print(B)
# [4, 9, 5, 3, 2, 10] <- The original list B is not modified
With Python, you can easily sort and reverse lists. The code you are looking for is
A, B = sorted(A), sorted(B)[:len(A)]
min_sum = sum([a*b for a,b in zip(A, B[::-1])])
You may need to get the values one by one from B, and keep the order of the list by having each value assigned to a key.
A = [1, 3, 2]
B = [4, 9, 5, 3, 2, 10]
# create a new dictionary with key/value pairs of the B array values
new_dict = {}
j = 0
for k in B:
    new_dict[j] = k
    j += 1

# create a new list of the smallest values in B, up to the length of array A
min_Bmany = []
for lp in range(0, len(A)):
    # get the smallest remaining value from dictionary new_dict
    rmvky = min(zip(new_dict.values(), new_dict.keys()))
    # append this item to the minimums list
    min_Bmany.append((rmvky[1], rmvky[0]))
    # delete this key from the dictionary new_dict
    del new_dict[rmvky[1]]

# sort the list by the keys (instead of the values)
min_Bmany.sort(key=lambda r: r[0])

# create a list of only the values, still in the same order as in the original array
min_B = []
for z in min_Bmany:
    min_B.append(z[1])

print(A)
print(min_B)

ResultStr = ""
Result = 0
# Calculate the result
for s in range(0, len(A)):
    ResultStr = ResultStr + str(A[s]) + "*" + str(min_B[s]) + " + "
    Result = Result + A[s] * min_B[s]

print(ResultStr)
print("Result = ", Result)
The output will be as follows:
A = [1, 3, 2]
B = [4, 9, 5, 3, 2, 10]
1*4 + 3*3 + 2*2 +
Result = 17
Then change the A, and the output becomes:
A = [1, 2, 3]
B = [4, 9, 5, 3, 2, 10]
1*4 + 2*3 + 3*2 +
Result = 16
Not sure if this is helpful, but anyway.
This can be formulated as a mixed-integer programming (MIP) problem. Basically, an assignment problem with some side constraints.
min sum((i,j),x(i,j)*a(i)*b(j))
sum(j, x(i,j)) = 1 ∀i "each a(i) is assigned to exactly one b(j)"
sum(i, x(i,j)) ≤ 1 ∀j "each b(j) can be assigned to at most one a(i)"
v(i) = sum(j, j*x(i,j)) "position of each a(i) in b"
v(i) ≥ v(i-1)+1 ∀i>1 "maintain ordering"
x(i,j) ∈ {0,1} "binary variable"
v(i) ≥ 1 "continuous (or integer) variable"
Example output:
---- 40 VARIABLE z.L = 16.000

---- 40 VARIABLE x.L  assign

            j1          j4          j5
i1       1.000
i2                   1.000
i3                               1.000

---- 40 VARIABLE v.L  position of a(i) in b

i1 1.000,    i2 4.000,    i3 5.000
Cute little MIP model.
Just as an experiment I generated a random problem with len(a)=50 and len(b)=500. This leads to a MIP with 650 rows and 25k columns. Solved in 50 seconds (to proven global optimality) on my slow laptop.
It turns out that using a shortest path algorithm on a directed graph is pretty fast. Erwin did a post showing a MIP model. As you can see in the comments section there, a few of us independently tried shortest path approaches, and on examples with 100 for the length of A and 1000 for the length of B we get optimal solutions in the vicinity of 4 seconds.
The graph can look like:
Nodes are labeled n(i,j) indicating that visiting the node means assigning a(i) to b(j). The costs a(i)*b(j) can be associated with any incoming (or any outgoing) arc. After that calculate the shortest path from src to snk.
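Because every arc in such a graph goes from a(i-1) to a(i) and from an earlier b to a later b, the shortest path can also be computed with a small dynamic program over the same n(i,j) states. The following is a minimal sketch under that assumption, not the code from the post or its comments; the function name min_ordered_sum_product is made up for illustration.

```python
def min_ordered_sum_product(a, b):
    """Minimum of sum(a[i] * b[j_i]) over index choices j_0 < j_1 < ... in b.

    Hypothetical helper: a plain O(len(a) * len(b)) dynamic program.
    """
    n, m = len(a), len(b)
    if n > m:
        raise ValueError("b must be at least as long as a")
    inf = float("inf")
    # best[i] = cheapest way to assign a[:i] to values of b scanned so far
    best = [0] + [inf] * n
    for j, bj in enumerate(b):
        # Walk i downwards so that b[j] is used for at most one a[i]
        for i in range(min(n, j + 1), 0, -1):
            candidate = best[i - 1] + a[i - 1] * bj
            if candidate < best[i]:
                best[i] = candidate
    return best[n]


print(min_ordered_sum_product([1, 2, 3], [4, 9, 5, 3, 2, 10]))  # 16
```

Only one row of the table is kept in memory, since each b(j) only needs the costs computed before it was scanned.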
BTW can you tell a bit about the background of this problem?
I am trying to find the 4 closest values in a given list, within a defined value for the difference. The list can be of any length and is sorted in increasing order. Below is what I have tried:
holdlist=[]
m=[]
nlist = []
t = 1
q = [2,3,5,6,7,8]
for i in range(len(q)-1):
    for j in range(i+1,len(q)):
        if abs(q[i]-q[j])<=1:
            holdlist.append(i)
            holdlist.append(j)
            t=t+1
            break
        else:
            if t != 4:
                holdlist=[]
                t=1
            elif t == 4:
                nlist = holdlist
                holdlist=[]
                t=1
nlist = list(dict.fromkeys(nlist))
for num in nlist:
    m.append(q[num])
The defined difference value here is 1, "q" is the list, and I am trying to get the result in "m" to be [5,6,7,8], but it turns out to be an empty list.
This works only if the list "q" is [5,6,7,8,10,11]. My guess is that after comparing the last value, the for loop ends and the result does not go into "holdlist".
Is there a more elegant way of writing the code?
Thank you.
One solution would be to sort the input list and find the smallest window of four elements. Given the example input, this is
min([sorted(q)[i:i+4] for i in range(len(q) - 3)],
key=lambda w: w[3] - w[0])
Given a different input, though, this still returns a window even if the smallest window is spaced more widely than 1. I'd still use this solution, with a bit of error handling:
assert len(q) >= 4
answer = min([sorted(q)[i:i+4] for i in range(len(q) - 3)], key=lambda w: w[3] - w[0])
assert answer[3] - answer[0] <= 3
Written out and annotated:
def closest_four(q):  # wrapper name chosen here so the return statements are valid
    if len(q) < 4:
        raise RuntimeError("Need at least four members in the list!")
    sorted_q = sorted(q)
    # All the chunks of four elements
    windows = [sorted_q[i:i+4] for i in range(len(sorted_q) - 3)]

    def size(window):
        """The size of the window."""
        return window[3] - window[0]

    answer = min(windows, key=size)  # The smallest window, by size
    if size(answer) > 3:
        return "No group of four elements has a maximum distance of 1"
    return answer
This is one easy approach to finding the four closest numbers in a list:
# Let's have a list of numbers. It has to be at least 4 numbers long.
numbers = [10, 4, 9, 1, 7, 12, 25, 26, 28, 29, 30, 77, 92]
max_span = 3  # assumed threshold: four values at most 1 apart span at most 3
numbers.sort()
# Now we have a sorted list.
delta = numbers[3] - numbers[0]  # How far apart the first four numbers in the sorted list are.
idx = 0  # Let's save our starting index.
for i in range(len(numbers) - 3):
    d = numbers[i + 3] - numbers[i]
    if d < delta:
        # If some window of four is closer together, save its span and starting index.
        delta = d
        idx = i
if delta <= max_span:
    print("closest numbers are {}".format(numbers[idx:idx+4]))
else:
    print("No sequence within the defined difference was found")
Here is my stab at the issue for the OP's reference, as @kojiro and @ex4 have already supplied answers that deserve credit.
def find_neighbor(nums, dist, k=4):
    res = []
    nums.sort()
    # + 1 so that the last window of k elements is checked as well
    for i in range(len(nums) - k + 1):
        if nums[i + k - 1] - nums[i] <= dist * k:
            res.append(nums[i: i + k])
    return res
Here is the function in action:
>>> nums = [10, 11, 5, 6, 7, 8, 9] # slightly modified input for better demo
>>> find_neighbor(nums, 1)
[[5, 6, 7, 8], [6, 7, 8, 9], [7, 8, 9, 10], [8, 9, 10, 11]]
Assuming sorting is legal in tackling this problem, we first sort the input array. (I decided to sort in-place for a marginal performance gain, but we could use sorted(nums) as well.) Then, we essentially slide a window of size k over the array and check whether the difference between the first and last element within that window is less than or equal to dist * k. In the provided example, for instance, we would expect the difference between the two elements to be less than or equal to 1 * 4 = 4. If such a window exists, we append that subarray to res, which we return in the end.
If the goal is to find a window instead of all windows, we could simply return the subarray without appending it to res.
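A minimal sketch of that single-window variant, assuming the same span test (dist * k) as find_neighbor above. The name find_first_neighbor and the choice to return None when nothing qualifies are illustrative assumptions, not part of the original answer.

```python
def find_first_neighbor(nums, dist, k=4):
    """Return the first window of k sorted values whose span is <= dist * k.

    Hypothetical variant: returns None when no window qualifies.
    """
    ordered = sorted(nums)  # sorted copy, so the caller's list is left untouched
    # + 1 so that the last possible window is also inspected
    for i in range(len(ordered) - k + 1):
        window = ordered[i:i + k]
        if window[-1] - window[0] <= dist * k:
            return window  # stop at the first match instead of collecting all
    return None


print(find_first_neighbor([10, 11, 5, 6, 7, 8, 9], 1))  # [5, 6, 7, 8]
```

Returning early avoids building the full result list when only one group is needed.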
You can do this in a generic fashion (i.e. for any size of delta or resulting largest group) using the zip function:
def deltaGroups(aList,maxDiff):
    sList = sorted(aList)
    diffs = [ (b-a)<=maxDiff for a,b in zip(sList,sList[1:]) ]
    breaks = [ i for i,(d0,d1) in enumerate(zip(diffs,diffs[1:]),1) if d0!=d1 ]
    groups = [ sList[s:e+1] for s,e in zip([0]+breaks,breaks+[len(sList)]) if diffs[s] ]
    return groups
Here's how it works:
Sort the list in order to have each number next to the closest other numbers
Identify positions where the next number is within the allowed distance (diffs)
Get the index positions where compliance with the allowed distance changes (breaks) from eligible to non-eligible and from non-eligible to eligible
This corresponds to start and end of segments of the sorted list that have consecutive eligible pairs.
Extract subsets of the sorted list based on the start/end positions of consecutive eligible differences (groups)
The deltaGroups function returns a list of groups with at least 2 values that are within the distance constraints. You can use it to find the largest group using the max() function.
output:
q = [10,11,5,6,7,8]
m = deltaGroups(q,1)
print(q)
print(m)
print(max(m,key=len))
# [10, 11, 5, 6, 7, 8]
# [[5, 6, 7, 8], [10, 11]]
# [5, 6, 7, 8]
q = [15,1,9,3,6,16,8]
m = deltaGroups(q,2)
print(q)
print(m)
print(max(m,key=len))
# [15, 1, 9, 3, 6, 16, 8]
# [[1, 3], [6, 8, 9], [15, 16]]
# [6, 8, 9]
m = deltaGroups(q,3)
print(m)
print(max(m,key=len))
# [[1, 3, 6, 8, 9], [15, 16]]
# [1, 3, 6, 8, 9]
I can't import any modules to find the average, only a 'for loop'.
I essentially have to find the average of a nested list.
def get_average(map: List[List[int]]) -> float:
    """Return the average across all cells in the map.

    >>> get_average(3X3)
    5.0
    >>> get_average(4X4)
    3.8125
    """
    total = 0
    for sublist in range(len(map)): #gives sublist
        for i in range(sublist): #access the items in the sublist
            total = total + i
    average = total / len(map)
    return average
The output for get_average(4X4) is 1.0
L =[[1, 2, 6, 5],
[4, 5, 3, 2],
[7, 9, 8, 1],
[1, 2, 1, 4]]
def func(l):
    # sum each inner list, collect those sums in a list, then sum that list
    total_sum = sum([sum(i) for i in l])
    count = sum([len(i) for i in l])  # total number of elements in the nested list
    return total_sum/count  # return the average

print(func(L))
output
3.8125
What the OP's code should be:
def get_average_elevation(elevation_map):
    """Return the average elevation across all cells in the elevation map
    elevation_map.

    Precondition: elevation_map is a valid elevation map.

    >>> get_average_elevation(UNIQUE_3X3)
    5.0
    >>> get_average_elevation(FOUR_BY_FOUR)
    3.8125
    """
    total = 0
    count = 0
    for sublist in range(len(elevation_map)):  # gives sublist index
        for i in range(len(elevation_map[sublist])):  # gives index of item in sublist
            count += 1
            total = total + elevation_map[sublist][i]
    return total/count
l = [[1, 2, 6, 5],
[4, 5, 3, 2],
[7, 9, 8, 1],
[1, 2, 1, 4]]
print(get_average_elevation(l))
Why the results differ:
Say the list is l = [1, 2, 3, 4].
For this list, for i in range(len(l)) iterates 4 times, but it does not yield the elements of the list (which the OP expected). range returns an object that produces the integers from start up to end - 1, so here i takes the values 0, 1, 2, 3.
What the OP wanted was the elements themselves. For that, use for element in l, which yields each individual element; for the nested list in this question, each element is an inner list.
To get the average, the OP needs the sum of all elements, which the code does accumulate, but the average must be computed outside the for loops.
The OP also needs a counter for the total number of elements to divide by.
You misunderstand the difference between an index and the list contents.
for sublist in range(len(elevation_map)): #gives sublist
No, it does not. sublist iterates through the indices of elevation_map, the values 0-3.
for i in range(sublist): #access the items in the sublist
Again, no. i iterates through the values 0 to sublist - 1, where sublist is itself in the range 0-3.
As a result, total winds up being the sum 0 + (0 + 1) + (0 + 1 + 2) = 4. That's how you got a mean of 1.0
Instead, write your loops to work as your comments describe:
def get_average_elevation(elevation_map):
    """Return the average elevation across all cells in the elevation map
    elevation_map.

    Precondition: elevation_map is a valid elevation map.

    >>> get_average_elevation(UNIQUE_3X3)
    5.0
    >>> get_average_elevation(FOUR_BY_FOUR)
    3.8125
    """
    total = 0
    for sublist in elevation_map: #gives sublist
        for i in sublist: #access the items in the sublist
            total = total + i
    average = total / len(elevation_map)
    return average
Now, this adds up all 16 elements and divides by the quantity of rows, giving you 15.25 as the result. I think you want the quantity of elements, so you'll need to count or compute that, instead.
Can you take it from there?
Your code is likely not working because for sublist in range(len(elevation_map)): iterates over the indices 0, 1, 2, 3 rather than over the sublists. You never access the actual numbers within the elevation_map array. The inner loop suffers from the same issue.
You can make the code simpler by using a list comprehension to flatten the array, then get the average from the flattened list.
flat_list = [item for sublist in elevation_map for item in sublist]
average = sum(flat_list) / len(flat_list)
You can just turn your list of lists in to a flat list, then use the sum function:
FOUR_BY_FOUR = [[1, 2, 6, 5],
[4, 5, 3, 2],
[7, 9, 8, 1],
[1, 2, 1, 4]]
UNIQUE_3X3 = [[1, 2, 3],
[9, 8, 7],
[4, 5, 6]]
answer = [i for k in FOUR_BY_FOUR for i in k]
print(sum(answer)/16)
answer = [i for k in UNIQUE_3X3 for i in k]
print(sum(answer)/9)
This returns:
3.8125
5.0
I have a list of numbers, say,
[1,2,3,6,8,9,10,11]
First, I want to get the sum of the differences (step size) between the numbers (n, n+1) in the list.
Second, if a set of consecutive numbers having a difference of 1 between them, put them in a list, i.e. there are two such lists in this example,
[1,2,3]
[8,9,10,11]
and then put the rest numbers in another list, i.e. there is only one such list in the example,
[6].
Third, get the lists with the max/min sizes from the sequential lists, i.e. [1,2,3], [8,9,10,11] in this example, the max list is,
[8,9,10,11]
min list is
[1,2,3].
What's the best way to implement this?
First, I want to get the sum of the differences (step size) between
the numbers (n, n+1) in the list.
Use sum on the successive differences of elements in the list:
>>> sum(lst[i] - x for i, x in enumerate(lst[:-1], start=1))
10
Second, if a set of consecutive numbers having a difference of 1 between them, put them in a list, i.e. there are two such lists in
this example, and then put the rest numbers in another list, i.e.
there is only one such list in the example,
itertools.groupby does this by grouping on the difference of each element on a reference itertools.count object:
>>> from itertools import groupby, count
>>> c = count()
>>> result = [list(g) for i, g in groupby(lst, key=lambda x: x-next(c))]
>>> result
[[1, 2, 3, 4], [6], [8, 9, 10, 11]]
Third, get the lists with the max/min sizes from above
max and min with the key function as sum:
>>> max(result, key=sum)
[8, 9, 10, 11]
>>> min(result, key=sum)
[6]  # the leftover group [6] has the smallest sum, so key=sum picks it
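Since the question asks for the largest and smallest sequential lists by size, a variant that filters out the one-element groups and compares by len may be closer to what was wanted. This is only a sketch under that reading of the question; the names runs and rest are made up here, and lst is taken to be the list exactly as given in the question (without the 4).

```python
from itertools import count, groupby

lst = [1, 2, 3, 6, 8, 9, 10, 11]  # assumed input: the list from the question

c = count()
# Consecutive numbers share the same value of (element - position)
groups = [list(g) for _, g in groupby(lst, key=lambda x: x - next(c))]

runs = [g for g in groups if len(g) > 1]      # sequential lists (step size 1)
rest = [g[0] for g in groups if len(g) == 1]  # numbers that belong to no run

longest = max(runs, key=len)
shortest = min(runs, key=len)

print(runs)      # [[1, 2, 3], [8, 9, 10, 11]]
print(rest)      # [6]
print(longest)   # [8, 9, 10, 11]
print(shortest)  # [1, 2, 3]
```

With key=len, ties are resolved in favour of the first group encountered, which is how max and min behave in general.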
I wonder if you've already got the answer to this (given the missing 4 from your answers) as the first thing I naively tried produced that answer. (That and/or it reads like a homework question)
>>> a=[1,2,3,4,6,8,9,10,11]
>>> sum([a[x+1] - a[x] for x in range(len(a)-1)])
10
>>> [a[x] for x in range(len(a)-1) if abs(a[x] - a[x+1]) ==1]
[1, 2, 3, 8, 9, 10]
Alternatively, try :
a=[1,2,3,6,8,9,10,11]
sets = []
cur_set = set()
total_diff = 0
for index in range(len(a)-1):
    total_diff += a[index +1] - a[index]
    if a[index +1] - a[index] == 1:
        cur_set = cur_set | set([ a[index +1], a[index]])
    else:
        if len(cur_set) > 0:
            sets.append(cur_set)
        cur_set = set()
if len(cur_set) > 0:
    sets.append(cur_set)

all_seq_nos = set()
for seq_set in sets:
    all_seq_nos = all_seq_nos | seq_set
non_seq_set = set(a) - all_seq_nos

print("Sum of differences is {0:d}".format(total_diff))
print("sets of sequential numbers are :")
for seq_set in sets:
    print(sorted(list(seq_set)))
print("set of non-sequential numbers is :")
print(sorted(list(non_seq_set)))

big_set=max(sets, key=sum)
sml_set=min(sets, key=sum)
print ("Biggest set of sequential numbers is :")
print (sorted(list(big_set)))
print ("Smallest set of sequential numbers is :")
print (sorted(list(sml_set)))
Which will produce the output :
Sum of differences is 10
sets of sequential numbers are :
[1, 2, 3]
[8, 9, 10, 11]
set of non-sequential numbers is :
[6]
Biggest set of sequential numbers is :
[8, 9, 10, 11]
Smallest set of sequential numbers is :
[1, 2, 3]
Hopefully that all helps ;-)
I am currently working on a Python problem that involves taking a list consisting of 2 sublists of numbers and an identifier, for a total of three things. The procedure name is compareTeams(lstTeams), and it is meant to calculate the average winning percentage of teams over a number of seasons. The first list is the games won, the second list is the games lost. The procedure in question takes a list of these lists and tries to find the highest average by adding up the games won over the total games, then dividing that by the length of the list. Both lists have the same size. It then sorts the averages in order from greatest to least as pairs of lists, with the identifier tagging along as the first element in each list. To provide an example:
teamA = [[6, 4, 8, 5, 0], [3, 6, 0, 2, 4], 'A'] avg winning percentage = 0.56
(in case my explanation is poor and hard to follow, for teamA, the percentage is calculated as (6/9 + 4/10 + 8/8 + 5/7 + 0/4) / 5)
teamB = [[3, 6, 8, 2, 4], [3, 6, 8, 2, 4], 'B'] avg winning percentage = 0.50
teamC = [[3, 6, 8, 2, 4], [0, 0, 0, 0, 0], 'C'] avg winning percentage = 1
compareTeams([teamA, teamB, teamC]) gives [['C', 1],['A', 0.56],['B', 0.50]]
I have given this problem a good amount of thought, but am new to python, so I am unsure if I am calling everything correctly. The interpreter I am using does not even display my procedure when I run it, which leads me to believe that I may be doing something wrong. Here is my code:
def compareTeams(lstTeams):
    a = 0
    x = 0
    lst = []
    y = lstTeams[a]
    for a in range(0, len(y)):
        x = x + ((float(y[0][0]) / (y[1][0])) / len(y[0]))
        a = a + 1
    lst.append(x)
    return lst.reverse(lst.sort())
Is this correct? Am I doing anything wrong? Any help would be greatly appreciated.
NOTE: I am using python 2.7 for this.
You can use zip here:
def compare_team(teams):
    lis = []
    for team in teams:
        #zip fetches items from the same index one by one from the lists passed to it
        avg = sum( (x*1.0)/(x+y) for x,y in zip(team[0],team[1]))/ len(team[0])
        lis.append([team[-1],avg])
    lis.sort(key = lambda x:x[1],reverse = True) #reverse sort based on the second item
    return lis
>>> compare_team(([teamA, teamB, teamC]))
[['C', 1.0], ['A', 0.5561904761904761], ['B', 0.5]]