I am stuck writing the Python code for the problem below. Any help finding the issue with my code would be much appreciated.
List Reduction:
You are given a list of integers L of size N and an integer K. You can perform the following operation on the list at most K times:
Pick any two elements of the list.
Multiply one element by 2.
Divide the other element by 2, taking the ceiling if the element is odd.
Note that after such an operation the list is changed, and the changed list is used in subsequent operations.
You need to minimize the sum of all elements in the list after performing at most K such operations.
Input Format:
First line contains N K as described above.
The second line contains N space-separated integers representing the initial list.
Output Format:
Print the minimum possible sum of all elements in the list at the end, after performing at most k operations.
Code:
def solve(X, arr):
    Sum = 0
    largestDivisible, minimum = -1, arr[0]
    for i in range(0, N):
        Sum += arr[i]
        if(arr[i]%X == 0 and largestDivisible < arr[i]):
            largestDivisible = arr[i]
        if arr[i] < minimum:
            minimum = arr[i]
    if largestDivisible == -1:
        return Sum
    sumAfterOperation = (Sum-minimum-largestDivisible+(X*minimum)+(largestDivisible//X))
    return min(Sum,sumAfterOperation)

N=5
X =2
#map(int, input().split())
arr = [10, 7, 4, 2, 1]
#list(map(int, input().split()))
out_ = solve(X, arr)
print (out_)
output: 20
expected output: 19
A simple, but not optimal, program.
Idea: multiplying the minimal element and dividing the maximal element gives the sequence with the minimal total sum. Do you need to handle negative numbers? Can K be negative?
N, K = map(int, input().split())  # first line: N K
arr = list(map(int, input().split()))
for _ in range(K):
    arr.sort()
    arr[0] *= 2
    arr[-1] = arr[-1] // 2 + arr[-1] % 2  # divide by 2, rounding up
print(sum(arr))
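A more efficient solution, which avoids sorting on every iteration.
N, K = map(int, input().split())  # first line: N K
arr = list(map(int, input().split()))
for _ in range(K):
    mn, mx = min(arr), max(arr)
    mn_i = arr.index(mn)
    if mn != mx:
        mx_i = arr.index(mx)
    else:
        # all elements are equal: pick a second, different position
        mx_i = arr.index(mx, mn_i+1)
    arr[mn_i] *= 2
    arr[mx_i] = arr[mx_i] // 2 + arr[mx_i] % 2  # divide by 2, rounding up
print(sum(arr))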
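And a solution that finds both indices in a single pass over the list.
N, K = map(int, input().split())  # first line: N K
arr = list(map(int, input().split()))
for _ in range(K):
    mn, mx = 0, 0
    for i, x in enumerate(arr):
        if x < arr[mn]:
            mn = i  # index of the first minimum
        if x >= arr[mx]:
            mx = i  # index of the last maximum
    arr[mn] *= 2
    arr[mx] = arr[mx] // 2 + arr[mx] % 2  # divide by 2, rounding up
print(sum(arr))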
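The programs above always apply the operation exactly K times, although the task allows "at most K" operations. A minimal sketch of the same min/max greedy idea that stops as soon as an operation would no longer reduce the sum is given below. The function name `minimize_sum` is hypothetical, the input is assumed to consist of positive integers, and the greedy choice itself is taken from the answer above and is not proven optimal here.

```python
def minimize_sum(values, k):
    """Greedy sketch: double the smallest element and halve the largest
    (rounding up) while that still lowers the total, at most k times.
    Assumes positive integers; optimality of the greedy is not proven."""
    arr = list(values)
    if len(arr) < 2:
        return sum(arr)  # the operation needs two elements
    for _ in range(k):
        lo = min(range(len(arr)), key=arr.__getitem__)  # index of a minimum
        hi = max(range(len(arr)), key=arr.__getitem__)  # index of a maximum
        # Halving removes arr[hi] // 2, doubling adds arr[lo].
        gain = arr[hi] // 2 - arr[lo]
        if lo == hi or gain <= 0:
            break  # further operations cannot reduce the sum
        arr[lo] *= 2
        arr[hi] = (arr[hi] + 1) // 2  # ceiling of arr[hi] / 2
    return sum(arr)


print(minimize_sum([10, 7, 4, 2, 1], 2))
```

With the list from the question, two operations give the expected output of 19, which suggests the expected answer was computed for K = 2.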
The Task:
You are given two parameters, an array arr and a number n. Find every pair of elements that adds up to n and return the sum of the indices of all such pairs.
input is: arr = [1, 4, 2, 3, 0, 5] and n = 7
output: 11
since the perfect pairs are (4,3) and (2,5) with indices 1 + 3 + 2 + 5 = 11
So far I have this, which prints out the perfect pairs
from itertools import combinations

def pairwise(arr, n):
    for i in combinations(arr, 2): # for each combination of 2 elements in arr
        if i[0] + i[1] == n: # if their sum is equal to n
            print(i[0],i[1])
Output:
4 3
2 5
However, does anyone have tips on how to get the indices of the perfect pairs? Should I use numpy or should I change the whole function?
Instead of generating combinations of array elements you can generate combinations of indices.
from itertools import combinations

def pairwise(arr, n):
    s = 0
    for i in combinations(range(len(arr)), 2): # for each combination of 2 indices into arr
        if arr[i[0]] + arr[i[1]] == n: # if their sum is equal to n
            # print(arr[i[0]],arr[i[1]])
            # print(i[0],i[1])
            s += i[0] + i[1]
    # print(s)
    return s
You can use a dictionary mapping each value to its index:
from itertools import combinations

def pairwise(arr, n):
    d = {b:a for a,b in enumerate(arr)} # map each value to its index
    for i in combinations(arr, 2): # for each combination of 2 elements in arr
        if i[0] + i[1] == n: # if their sum is equal to n
            print(d[i[0]],d[i[1]])
Rather than generating combinations and checking if they add up to n, it's faster to turn your list into a dict where you can look up the exact number you need to add up to n. For each number x you can easily calculate n - x and then look up the index of that number in your dict.
This only works if the input list doesn't contain any duplicate numbers.
arr = [1, 4, 2, 3, 0, 5]
n = 7
indices = {x: i for i, x in enumerate(arr)}
total = 0
for i, x in enumerate(arr):
    remainder = n - x
    # skip the case where an element would be paired with itself (2 * x == n)
    if remainder in indices and indices[remainder] != i:
        idx = indices[remainder]
        total += i + idx
# the loop counts each pair twice (once as [a,b] and once as [b,a]), so
# we have to divide the result by two to get the correct value
total //= 2
print(total) # output: 11
If the input does contain duplicate numbers, you have to rewrite the code to store more than one index in the dict:
import collections

arr = [1, 4, 2, 3, 0, 5, 2]
n = 7
indices = collections.defaultdict(list)
for i, x in enumerate(arr):
    indices[x].append(i)
total = 0
for i, x in enumerate(arr):
    remainder = n - x
    for idx in indices[remainder]:
        if idx != i:  # never pair an element with itself
            total += i + idx
# the loop counts each pair twice (once as [a,b] and once as [b,a]), so
# we have to divide the result by two to get the correct value
total //= 2
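The dictionary idea can also be written as a single pass that only looks backwards, so no pair is counted twice and no final division is needed. This is a sketch under the assumption that every pair of positions (i, j) with i < j and arr[i] + arr[j] == n should be counted; the name `pair_index_sum` is made up for illustration.

```python
from collections import defaultdict


def pair_index_sum(arr, n):
    """Sum i + j over all positions i < j with arr[i] + arr[j] == n."""
    # value -> [how many times it was seen so far, sum of its indices so far]
    seen = defaultdict(lambda: [0, 0])
    total = 0
    for j, x in enumerate(arr):
        # .get() does not create a new entry in the defaultdict
        count, index_sum = seen.get(n - x, (0, 0))
        # every earlier partner i contributes i + j
        total += index_sum + count * j
        entry = seen[x]
        entry[0] += 1
        entry[1] += j
    return total


print(pair_index_sum([1, 4, 2, 3, 0, 5], 7))     # 11
print(pair_index_sum([1, 4, 2, 3, 0, 5, 2], 7))  # 22
```

Because only earlier positions are looked up, an element is never paired with itself, and duplicates are handled through the stored count.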
You should use the naive approach here:
process each element of the array together with its index
for each element, test all the elements after it (to avoid duplicates) to see whether their sum is the expected number, and if it is, add the sum of their indices
Code could be:
def myfunc(arr, number):
    tot = 0
    for i, val in enumerate(arr):
        for j in range(i+1, len(arr)):
            if val + arr[j] == number:
                tot += i + j
    return tot
Control:
>>> myfunc([1, 4, 2, 3, 0, 5], 7)
11
>>> myfunc([2, 4, 6], 8)
2
I have the problem that I want to count the number of combinations that fulfill the following condition:
a < b < a+d < c < b+d
Where a, b, c are elements of a list, and d is a fixed delta.
Here is a vanilla implementation:
def count(l, d):
    s = 0
    for a in l:
        for b in l:
            for c in l:
                if a < b < a + d < c < b + d:
                    s += 1
    return s
Here is a test:
def testCount():
    l = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 7, 7, 8, 9, 9, 10, 10]
    assert(32 == count(l, 4)) # Gone through everything by hand.
Question
How can I speed this up? I am looking at list sizes of 2 Million.
Supplementary Information
I am dealing with floats in the range of [-pi, pi]. For example, this limits a < 0.
What I have so far:
I have some implementation where I build indices that I use for b and c. However, the below code fails some cases. (i.e. This is wrong).
from math import pi

def count(l, d=pi):
    low = lower(l, d)
    high = upper(l, d)
    s = 0
    for indA in range(len(l)):
        for indB in range(indA+1, low[indA]+1):
            s += low[indB] + 1 - high[indA]
    return s

def lower(l, d=pi):
    '''Returns ind, s.t l[ind[i]] < l[i] + d and l[ind[i]+1] >= l[i] + d, for all i
    Input must be sorted!
    '''
    ind = []
    x = 0
    length = len(l)
    for elem in l:
        while x < length and l[x] < elem + d:
            x += 1
        if l[x-1] < elem + d:
            ind.append(x-1)
        else:
            assert(x == length)
            ind.append(x)
    return ind

def upper(l, d=pi):
    ''' Returns, for each i, the first index j where l[j] > l[i] + d'''
    ind = []
    x = 0
    length = len(l)
    for elem in l:
        while x < length and l[x] <= elem + d:
            x += 1
        ind.append(x)
    return ind
Original Problem
The original problem is from a well known math/comp-sci competition. The competition asks that you don't post solutions on the net. But it is from two weeks ago.
I can generate the list with this function:
from math import atan2, pi

def points(n):
    x = 1
    y = 1
    for _ in range(n):
        x = (x * 1248) % 32323
        y = (y * 8421) % 30103
        yield atan2(x - 16161, y - 15051)

def C(n):
    # points() is a generator, so it has no sort() method; use sorted()
    angles = sorted(points(n))
    return count(angles, pi)
There is an approach to your problem that yields an O(n log n) algorithm. Let X be the set of values. Now let's fix b. Let A_b be the set of values { x in X : b - d < x < b } and C_b be the set of values { x in X : b < x < b + d }. If we can find |{ (x, y) in A_b × C_b : y > x + d }| fast, we have solved the problem.
If we sort X, we can represent A_b and C_b as pointers into the sorted array, because they are contiguous. If we process the b candidates in non-decreasing order, we can thus maintain these sets using a sliding window algorithm. It goes like this:
1) Sort X. Let X = { x_1, x_2, ..., x_n }, x_1 <= x_2 <= ... <= x_n.
2) Set left = i = 1 and set right so that C_b = { x_{i + 1}, ..., x_right }. Set count = 0.
3) Iterate i from 1 to n. In every iteration we find the number of valid triples (a, b, c) with b = x_i. To do that, increase left and right as much as necessary so that A_b = { x_left, ..., x_{i-1} } and C_b = { x_{i + 1}, ..., x_right } still hold. In the process, you basically add and remove elements from the imaginary sets A_b and C_b.
4) If you remove or add an element to one of the sets, check how many pairs (a, c) with c > a + d, a from A_b and c from C_b, you add or destroy (this can be achieved by a simple binary search in the other set). Update count accordingly so that the invariant count = |{ (x, y) in A_b × C_b : y > x + d }| still holds.
5) Sum up the values of count over all iterations. This is the final result.
The complexity is O(n log n).
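For comparison, here is a sketch that reaches the same O(n log n) bound with prefix sums and binary search instead of the sliding-window bookkeeping described above, so it is a different technique and not the answer's exact algorithm. For a fixed b, every a with a < b < a + d contributes the number of values c with a + d < c < b + d, and that per-a term can be summed with a prefix array. The name `count_triples` is hypothetical, d > 0 is assumed, and the wrap-around modulo 2*pi is not handled.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def count_triples(values, d):
    """Count triples with a < b < a + d < c < b + d (assumes d > 0)."""
    xs = sorted(values)
    shifted = [x + d for x in xs]  # still sorted, used to test a + d > b
    # upper[i] = number of values <= xs[i] + d
    upper = [bisect_right(xs, s) for s in shifted]
    prefix = [0] + list(accumulate(upper))
    total = 0
    for b in xs:
        lo = bisect_right(shifted, b)  # first index with xs[i] + d > b
        hi = bisect_left(xs, b)        # first index with xs[i] >= b
        if lo >= hi:
            continue                   # no valid a for this b
        c_end = bisect_left(xs, b + d)  # number of values < b + d
        # sum over a in xs[lo:hi] of (c_end - upper[a])
        total += (hi - lo) * c_end - (prefix[hi] - prefix[lo])
    return total


l = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 7, 7, 8, 9, 9, 10, 10]
print(count_triples(l, 4))  # 32, the value from the question's test
```

Duplicates need no special treatment here because every occurrence is kept in the sorted list and counted by position.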
If you want to solve the Euler problem with this algorithm, you have to avoid floating point issues. I suggest sorting the points by angle using a custom comparison function that uses integer arithmetic only (using 2D vector geometry). Implementing the |a-b| < d comparisons can also be done using integer operations only. Also, since you are working modulo 2*pi, you would probably have to introduce three copies of every angle a: a - 2*pi, a and a + 2*pi. You then only look for b in the range [0, 2*pi) and divide the result by three.
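One common way to sort by angle with integer arithmetic only is to split the plane into two half-planes and compare vectors within the same half by the sign of their cross product. The sketch below illustrates that idea under my own conventions: the names `half` and `compare_by_angle` are hypothetical, vectors are (x, y) integer tuples, angles are measured counter-clockwise from the positive x-axis in [0, 2*pi), and the zero vector is not supported. It is not taken from the answer.

```python
from functools import cmp_to_key


def half(v):
    """Return 0 for angles in [0, pi) and 1 for angles in [pi, 2*pi)."""
    x, y = v
    return 0 if (y > 0 or (y == 0 and x > 0)) else 1


def compare_by_angle(u, v):
    """Integer-only comparison of two non-zero vectors by polar angle."""
    hu, hv = half(u), half(v)
    if hu != hv:
        return hu - hv
    # Within one half-plane, u comes first if v is counter-clockwise of u.
    cross = u[0] * v[1] - u[1] * v[0]
    if cross > 0:
        return -1
    if cross < 0:
        return 1
    return 0  # same direction


vectors = [(0, -1), (1, 1), (-1, 0), (1, 0), (0, 1)]
print(sorted(vectors, key=cmp_to_key(compare_by_angle)))
```

Since Python integers do not overflow, the cross product stays exact for coordinates of any size.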
UPDATE: The OP implemented this algorithm in Python. Apparently it contains some bugs, but it demonstrates the general idea:
from bisect import bisect_left, bisect_right

def count(X, d):
    X.sort()
    count = 0
    s = 0
    length = len(X)
    a_l = 0
    a_r = 1
    c_l = 0
    c_r = 0
    for b in X:
        if X[a_r-1] < b:
            # find boundaries of A s.t. b -d < a < b
            while a_r < length and X[a_r] < b:
                a_r += 1 # This adds an element to A_b.
                ind = bisect_right(X, X[a_r-1]+d, c_l, c_r)
                if c_l <= ind < c_r:
                    count += (ind - c_l)
            while a_l < length and X[a_l] <= b - d:
                a_l += 1 # This removes an element from A_b
                ind = bisect_right(X, X[a_l-1]+d, c_l, c_r)
                if c_l <= ind < c_r:
                    count -= (c_r - ind)
        # Find boundaries of C s.t. b < c < b + d
        while c_l < length and X[c_l] <= b:
            c_l += 1 # this removes an element from C_b
            ind = bisect_left(X, X[c_l-1]-d, a_l, a_r)
            if a_l <= ind <= a_r:
                count -= (ind - a_l)
        while c_r < length and X[c_r] < b + d:
            c_r += 1 # this adds an element to C_b
            ind = bisect_left(X, X[c_r-1]-d, a_l, a_r)
            if a_l <= ind <= a_r:
                count += (ind - a_l)
        s += count
    return s
from bisect import bisect_left, bisect_right
from collections import Counter

def count(l, d):
    # cdef long bleft, bright, cleft, cright, ccount, s
    s = 0

    # Find the unique elements and their counts
    cc = Counter(l)
    l = sorted(cc.keys())

    # Generate a cumulative sum array
    cumulative = [0] * (len(l) + 1)
    for i, key in enumerate(l, start=1):
        cumulative[i] = cumulative[i-1] + cc[key]

    # Pregenerate all the left and right lookups
    lefthand = [bisect_right(l, a + d) for a in l]
    righthand = [bisect_left(l, a + d) for a in l]

    aright = bisect_left(l, l[-1] - d)
    for ai in range(len(l)):
        bleft = ai + 1
        # Search only the values of a that have a+d in range
        if bleft > aright:
            break
        # This finds b such that a < b < a + d.
        bright = righthand[ai]
        for bi in range(bleft, bright):
            # This finds the range for c such that a+d < c < b+d.
            cleft = lefthand[ai]
            cright = righthand[bi]
            if cleft != cright:
                # Find the count of c elements in the range cleft..cright.
                ccount = cumulative[cright] - cumulative[cleft]
                s += cc[l[ai]] * cc[l[bi]] * ccount
    return s

def testCount():
    l = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 7, 7, 8, 9, 9, 10, 10]
    result = count(l, 4)
    assert(32 == result)

testCount()
This version:
gets rid of repeated, identical values
iterates over only the required range for a value
uses a cumulative count across two indices to eliminate the loop over c
caches the lookups on x + d
This is no longer O(n^3) but more like O(n^2).
This clearly does not yet scale up to 2 million. Here are my times on smaller floating point data sets (i.e. few or no duplicates) using cython to speed up execution:
50: 0:00:00.157849 seconds
100: 0:00:00.003752 seconds
200: 0:00:00.022494 seconds
400: 0:00:00.071192 seconds
800: 0:00:00.253750 seconds
1600: 0:00:00.951133 seconds
3200: 0:00:03.508596 seconds
6400: 0:00:10.869102 seconds
12800: 0:00:55.986448 seconds
Here is my benchmarking code (not including the operative code above):
from math import atan2, pi

def points(n):
    x, y = 1, 1
    for _ in range(n):
        x = (x * 1248) % 32323
        y = (y * 8421) % 30103
        yield atan2(x - 16161, y - 15051)

def C(n):
    angles = sorted(points(n))
    return count(angles, pi)

def test_large():
    from datetime import datetime
    for n in [50, 100, 200, 400, 800, 1600, 3200, 6400, 12800]:
        s = datetime.now()
        C(n)
        elapsed = datetime.now() - s
        print("{1}: {0} seconds".format(elapsed, n))

if __name__ == '__main__':
    testCount()
    test_large()
Since l is sorted and a < b < c must be true, you could use itertools.combinations() to do fewer loops:
sum(1 for a, b, c in combinations(l, r=3) if a < b < a + d < c < b + d)
Looking at combinations only reduces this loop to 816 iterations.
>>> l = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 7, 7, 8, 9, 9, 10, 10]
>>> d = 4
>>> sum(1 for a, b, c in combinations(l, r=3))
816
>>> sum(1 for a, b, c in combinations(l, r=3) if a < b < a + d < c < b + d)
32
where the a < b test is redundant.
1) To reduce the number of iterations on each level, you can filter out the elements that don't pass the condition for that level.
2) Using a set together with collections.Counter, you can reduce the iterations further by removing duplicates:
from collections import Counter

def count(l, d):
    n = Counter(l)
    l = set(l)
    s = 0
    for a in l:
        for b in (i for i in l if a < i < a+d):
            for c in (i for i in l if a+d < i < b+d):
                s += (n[a] * n[b] * n[c])
    return s
>>> l = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 7, 7, 8, 9, 9, 10, 10]
>>> count(l, 4)
32
Tested count of iterations (a, b, c) for your version:
>>> count1(l, 4)
18 324 5832
my version:
>>> count2(l, 4)
9 16 7
The basic ideas are:
Get rid of repeated, identical values.
Have each value iterate only over the range it has to iterate over.
As a result you can increase s unconditionally, and the running time is roughly proportional to the number of matching (a, b, c) triples among the unique values rather than to N^3, with N being the size of the array.
import collections

def count(l, d):
    s = 0
    # at first we get rid of repeated items
    counter = collections.Counter(l)
    # sort the list
    uniq = sorted(set(l))
    n = len(uniq)
    # kad is the index of the first element > a+d
    kad = 0
    # ka is the index of a
    for ka in range(n):
        a = uniq[ka]
        while uniq[kad] <= a+d:
            kad += 1
            if kad == n:
                return s
        for kb in range( ka+1, kad ):
            # b only runs in the range (a..a+d)
            b = uniq[kb]
            if b >= a+d:
                break
            for kc in range( kad, n ):
                # c only runs in the range (a+d..b+d)
                c = uniq[kc]
                if c >= b+d:
                    break
                print( a, b, c )
                s += counter[a] * counter[b] * counter[c]
    return s
EDIT: Sorry, I messed up the submission. Fixed.
The longest arithmetic progression subsequence problem is as follows. Given an array of integers A, devise an algorithm to find the longest arithmetic progression in it. In other words find a sequence i1 < i2 < … < ik, such that A[i1], A[i2], …, A[ik] form an arithmetic progression, and k is maximal. The following code solves the problem in O(n^2) time and space. (Modified from http://www.geeksforgeeks.org/length-of-the-longest-arithmatic-progression-in-a-sorted-array/ . )
#!/usr/bin/env python
import sys

def arithmetic(arr):
    n = len(arr)
    if (n<=2):
        return n
    llap = 2
    # L[i][j] = length of the longest progression starting with arr[i], arr[j]
    L = [[0]*n for i in range(n)]
    for i in range(n):
        L[i][n-1] = 2
    for j in range(n-2,0,-1):
        i = j-1
        k = j+1
        while (i >=0 and k <= n-1):
            if (arr[i] + arr[k] < 2*arr[j]):
                k = k + 1
            elif (arr[i] + arr[k] > 2*arr[j]):
                L[i][j] = 2
                i -= 1
            else:
                L[i][j] = L[j][k] + 1
                llap = max(llap, L[i][j])
                i = i - 1
                k = j + 1
        while (i >=0):
            L[i][j] = 2
            i -= 1
    return llap

arr = [1,4,5,7,8,10]
print(arithmetic(arr))
This outputs 4.
However I would like to be able to find arithmetic progressions where up to one value is missing. So if arr = [1,4,5,8,10,13] I would like it to report that there is a progression of length 5 with one value missing.
Can this be done efficiently?
Adapted from my answer to Longest equally-spaced subsequence. n is the length of A, and d is the range, i.e. the largest item minus the smallest item.
A = [1, 4, 5, 8, 10, 13] # in sorted order
Aset = set(A)
for d in range(1, 13):
    already_seen = set()
    for a in A:
        if a not in already_seen:
            b = a
            count = 1
            while b + d in Aset:
                b += d
                count += 1
                already_seen.add(b)
            # if there is a hole to jump over:
            if b + 2 * d in Aset:
                b += 2 * d
                count += 1
                while b + d in Aset:
                    b += d
                    count += 1
                    # don't record in already_seen here
            print("found %d items in %d .. %d" % (count, a, b))
            # collect here the largest 'count'
I believe that this solution is still O(n*d), simply with larger constants than looking without a hole, despite the two "while" loops inside the two nested "for" loops. Indeed, fix a value of d: then we are in the "a" loop that runs n times; but each of the inner two while loops run at most n times in total over all values of a, giving a complexity O(n+n+n) = O(n) again.
Like the original, this solution is adaptable to the case where you're not interested in the absolute best answer but only in subsequences with a relatively small step d: e.g. n might be 1'000'000, but you're only interested in subsequences of step at most 1'000. Then you can make the outer loop stop at 1'000.
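The loop above only prints each candidate. A small sketch that wraps the same idea in a function and keeps the largest count could look as follows. The function name `longest_ap_with_one_gap` is hypothetical, the input is assumed to consist of integers, and the returned number counts the values actually present, so a progression of length 5 with one value missing is reported as 4.

```python
def longest_ap_with_one_gap(values):
    """Largest number of present values that lie on one arithmetic
    progression from which at most one term is missing."""
    items = sorted(set(values))
    if len(items) < 2:
        return len(items)
    present = set(items)
    span = items[-1] - items[0]  # largest step worth trying
    best = 1
    for step in range(1, span + 1):
        seen = set()  # values already covered by an unbroken run
        for start in items:
            if start in seen:
                continue
            current, count = start, 1
            while current + step in present:
                current += step
                count += 1
                seen.add(current)
            # jump over a single hole, if there is one
            if current + 2 * step in present:
                current += 2 * step
                count += 1
                while current + step in present:
                    current += step
                    count += 1
            best = max(best, count)
    return best


print(longest_ap_with_one_gap([1, 4, 5, 8, 10, 13]))  # 4 present values: 1, 4, 10, 13
```

To restrict the search to small steps, as suggested above, the upper bound of the outer loop could be replaced by the largest step of interest.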
fib = [0,1]
a = 1
b = 0
i = 0
while i < n:
    i = a+b
    a,b = i, a
    fib.append(i)
This works in cases where 'n' (which is a given variable) is a number in an actual Fibonacci sequence, like 21 or 13. However, if the number is something like six, it adds one more number than it should. The list should not contain a number that is greater than n.
You could always add a to the list first, then do your incrementing.
fib = [0]
a, b = 1, 0
while a <= n:
    fib.append(a)
    a,b = a+b, a
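The same check-before-append idea can be wrapped in a function so the bound is passed in as an argument. This is a minimal sketch; the name `fib_up_to` is made up for illustration.

```python
def fib_up_to(n):
    """Return the Fibonacci numbers that are <= n, starting from 0."""
    result = []
    a, b = 0, 1
    while a <= n:  # test before appending, so nothing above n is stored
        result.append(a)
        a, b = b, a + b
    return result


print(fib_up_to(6))   # [0, 1, 1, 2, 3, 5]
print(fib_up_to(13))  # [0, 1, 1, 2, 3, 5, 8, 13]
```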
Using the classic shnazzy recursive Fibonacci function (which took me a few tries to remember and get right):
def fib(num):
    if ((num == 0) or (num == 1)): return 1
    fib_num = fib(num - 1) + fib(num - 2)
    return fib_num

x, n, i = 2, 15, []
while (fib(x) < n):
    i.append(fib(x))
    x += 1