Say I have a 1D array x with positive and negative values in Python, e.g.:
x = np.random.rand(10) * 10 - 5
For a given positive value of K, I would like to find the offset c that makes the sum of positive elements of the array y = x + c equal to K.
How can I solve this problem efficiently?
How about binary search to determine which elements of x + c are going to contribute to the sum, followed by solving the linear equation? The running time of this code is O(n log n), but only O(log n) work is done in Python. The running time could be dropped to O(n) via a more complicated partitioning strategy. I'm not sure whether a practical improvement would result.
import numpy as np

def findthreshold(x, K):
    # sort descending and precompute prefix sums
    x = np.sort(np.array(x))[::-1]
    z = np.cumsum(x)
    # binary search for the cut: with the top m+1 elements included, the offset
    # is c = (K - z[m]) / (m + 1); z[m] - (m + 1) * x[m] >= K means element m
    # would already be clipped, so the cut is at or before m
    l = 0
    u = x.size
    while u - l > 1:
        m = (l + u) // 2
        if z[m] - (m + 1) * x[m] >= K:
            u = m
        else:
            l = m
    return (K - z[l]) / (l + 1)
def test():
    x = np.random.rand(10)
    K = np.random.rand() * x.size
    c = findthreshold(x, K)
    assert np.abs(K - np.sum(np.clip(x + c, 0, np.inf))) / K <= 1e-8
Here's a randomized expected O(n) variant. It's faster (on my machine, for large inputs), but not dramatically so. Watch out for catastrophic cancellation in both versions.
def findthreshold2(x, K):
    # quickselect-style partitioning: pick a random pivot and decide on which
    # side of it the cut-off element lies, discarding the other side
    sumincluded = 0
    includedsize = 0
    while x.size > 0:
        pivot = x[np.random.randint(x.size)]
        above = x[x > pivot]
        # if making the elements above the pivot sum to K still clips the pivot,
        # the cut lies strictly above it
        if sumincluded + np.sum(above) - (includedsize + above.size) * pivot >= K:
            x = above
        else:
            notbelow = x[x >= pivot]
            sumincluded += np.sum(notbelow)
            includedsize += notbelow.size
            x = x[x < pivot]
    return (K - sumincluded) / includedsize
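It can be checked the same way as the first version (a small usage snippet of mine, mirroring test() above):

x = np.random.rand(1000)
K = np.random.rand() * x.size
c = findthreshold2(x, K)
assert np.abs(K - np.sum(np.clip(x + c, 0, np.inf))) / K <= 1e-8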
You can sort x in descending order, loop over x, and recompute the required c as you go. If the next element plus c is positive, it should be included in the sum, and c gets smaller.
Note that there might be no solution: if you include elements up to m, c may be such that element m+1 should also be included, but including m+1 decreases c, and x[m+1] + c might become negative.
In pseudocode:
sortDescending(x)
i = 0, c = 0, sum = 0
while i < x.length and x[i] + c >= 0
    sum += x[i]
    i++
    c = (K - sum) / i
if i == 0 or x[i-1] + c < 0
    # no solution
The running time is obviously O(n log n) because it is dominated by the initial sort.
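For completeness, here is a direct Python translation of the pseudocode (a minimal sketch; the function name find_offset is mine):

def find_offset(x, K):
    # sort descending and grow the included prefix one element at a time
    x = sorted(x, reverse=True)
    c = 0.0
    total = 0.0
    i = 0
    while i < len(x) and x[i] + c >= 0:
        total += x[i]
        i += 1
        c = (K - total) / i  # offset that makes the included prefix sum to K
    if i == 0 or x[i - 1] + c < 0:
        return None  # no solution
    return c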
I have integer input: 0 < a, K, N < 10^9
I need to find all numbers b that satisfy:
a + b <= N
(a + b) % K = 0
For example: 10 6 40 -> [2, 8, 14, 20, 26]
I tried a simple brute force and failed (Time Limit Exceeded). Can anyone suggest a better approach? Thanks.
a, K, N = [int(x) for x in input().split()]
count = 0
b = 1
while a + b <= N:
    if (a + b) % K == 0:
        count += 1
        print(b, end=" ")
    b += 1
if count == 0:
    print(-1)
The first condition is trivial in the sense that it just poses an upper limit on b. The second condition can be rephrased using the definition of % as
a + b = P * K
for some integer P. From this, it is simple to compute the smallest b by finding the smallest P that makes P * K - a non-negative. In other words
P * K - a >= 0
P * K >= a
P >= a / K
P = ceil(a / K)
So you have
b0 = ceil(a / K) * K - a
b = range(b0, N - a + 1, K)
Note that the upper limit is N - a, since the condition is a + b <= N.
range is lazy, so it won't compute the values up front. You can force that by doing list(b).
At the same time, if you only need the count of elements, range objects will do the math on the limits and step size for you conveniently, all without computing the actual values, so you can just do len(b).
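Putting it together (a small sketch; the helper name find_bs is mine, and the integer ceiling avoids any float issues for large inputs):

def find_bs(a, K, N):
    b0 = (a + K - 1) // K * K - a   # smallest non-negative b with (a + b) % K == 0
    if b0 == 0:
        b0 = K                      # b must be positive, so start one period later
    return range(b0, N - a + 1, K)  # a + b <= N caps b at N - a

print(list(find_bs(10, 6, 40)))  # [2, 8, 14, 20, 26]
print(len(find_bs(10, 6, 40)))   # 5, without materializing the list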
To find the list of bs, you can use some maths. First, note that (a + b) % K equals (a % K + b % K) % K. Also, when n % K is 0, n is a multiple of K. So the smallest value of b is n * K - a for the smallest n where this is still positive. Once you find that value, you can simply add K repeatedly to find all other values of b.
b = K - a % K (assuming a % K != 0; if a is a multiple of K, the smallest positive b is K)
Example: a = 19, K = 11: b = 11 - 19 % 11 = 11 - 8 = 3.
Using Python, I would like to implement a function that takes a natural number n as input and outputs a list of natural numbers [y1, y2, y3, ...] such that each of n + y1*y1, n + y2*y2, n + y3*y3, and so forth is again a square.
What I tried so far is to obtain one y-value using the following function:
def find_square(n: int) -> tuple[int, int] | None:
    if n % 2 == 1:
        y = (n - 1) // 2
        x = n + y * y
        return (y, x)
    return None
It works fine, e.g. find_square(13689) gives me a correct solution y = 6844. It would be great to have an algorithm that yields all possible y-values, such as y = 44 or y = 156.
The simplest slow approach is, for a given N, to iterate over all possible Y and check whether N + Y^2 is a square.
But there is a much faster approach using an integer factorization technique:
To solve the equation N + Y^2 = X^2, i.e. to find all integer pairs (X, Y) for a given fixed integer N, we can rewrite it as N = X^2 - Y^2 = (X + Y) * (X - Y), using the difference-of-squares identity.
Now rename the two factors as A and B, i.e. N = (X + Y) * (X - Y) = A * B, which gives X = (A + B) / 2 and Y = (A - B) / 2.
Notice that A and B must have the same parity, either both odd or both even; otherwise the divisions by 2 in the formulas above are not whole.
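As a tiny illustration of the (A, B) -> (X, Y) mapping before the general algorithm, here is a brute-force enumeration of factor pairs for the small odd N = 45 (this example snippet is mine):

N = 45
for A in range(1, N + 1):
    if N % A == 0:
        B = N // A
        if A >= B and (A - B) % 2 == 0:  # same parity; A >= B keeps Y >= 0
            X, Y = (A + B) // 2, (A - B) // 2
            print(N, '=', A, '*', B, '-> X =', X, ', Y =', Y)  # N + Y*Y == X*X

This prints the pairs giving (X, Y) = (7, 2), (9, 6) and (23, 22), and indeed 45 + 2^2 = 7^2, 45 + 6^2 = 9^2, 45 + 22^2 = 23^2.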
We will factorize N into all possible pairs of factors (A, B) of the same parity. For fast factorization the code below uses Pollard's rho, which is simple to implement yet quite fast, plus two helper algorithms: the Fermat primality test (for quickly checking whether a number is probably prime) and trial-division factorization (which strips out small factors that could otherwise make Pollard's rho fail).
Pollard's rho factors a composite number in O(N^(1/4)) time, which is very fast even for 64-bit numbers. A faster factorization algorithm can be substituted if a bigger search space is needed. The total time of the fast algorithm is dominated by factorization; the remaining part is just a few loop iterations with simple formulas.
If your N is itself a square (so that its root is easily known), Pollard's rho can factor it much faster still, in O(N^(1/8)) time. Even for 128-bit numbers that is only about 2^16 operations, and I hope you're solving your task for numbers of fewer than 128 bits.
If you want to process a whole range of N values, the fastest way to factorize them all is a technique similar to the Sieve of Eratosthenes: using a set of primes, it computes the factors of every N in the range at once. This is much faster than factorizing each N separately with Pollard's rho.
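For example, a smallest-prime-factor sieve makes per-number factorization nearly free once the sieve is built (a sketch of that idea; the names spf_sieve and factor_with_spf are mine):

def spf_sieve(limit):
    # spf[n] = smallest prime factor of n, for every n in [2, limit)
    spf = list(range(limit))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit, p):
                if spf[m] == m:
                    spf[m] = p
    return spf

def factor_with_spf(n, spf):
    # factorize n by repeatedly peeling off its smallest prime factor
    fs = []
    while n > 1:
        fs.append(spf[n])
        n //= spf[n]
    return fs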
After factoring N into pairs (A, B) we compute (X, Y) from (A, B) by the formulas above, and output the resulting Y values as the solution of the fast algorithm.
The example code below is implemented in pure Python. Of course one can use Numba to speed it up; Numba usually gives a 30-200x speedup and brings Python roughly to the speed of optimized C++. But the main thing here is the fast algorithm itself; Numba optimization can easily be added afterwards.
I added time measurement to the code. Although it is pure Python, the fast algorithm achieves about an 8500x speedup over the regular brute-force approach for a limit of around 1,000,000.
You can change the limit variable to tweak the amount of searched space, or the num_tests variable to tweak the number of different tests.
The code implements both solutions: the fast solution find_fast() described above, plus a tiny brute-force solution find_slow() that scans all possible candidates. The slow solution is only used in the tests to check correctness and measure the speedup.
The code uses nothing but a few standard Python library modules; no external modules are needed.
def find_slow(N):
    import math
    def is_square(x):
        root = int(math.sqrt(float(x)) + 0.5)
        return root * root == x, root
    l = []
    for y in range(N):
        if is_square(N + y ** 2)[0]:
            l.append(y)
    return l
def find_fast(N):
    import itertools, functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = factor(N)
    # collect the multiplicity of each prime factor
    mfs = {}
    for e in fs:
        mfs[e] = mfs.get(e, 0) + 1
    fs = sorted(mfs.items())
    del mfs
    Ys = set()
    # enumerate all factorizations N = A * B with A and B of the same parity;
    # for the prime 2 we never put all of its copies into one factor, so that
    # A and B are either both odd or both even
    for take_a in itertools.product(*[
            (range(v + 1) if k != 2 else range(1, v)) for k, v in fs]):
        A = Prod([p ** t for (p, _), t in zip(fs, take_a)])
        B = N // A
        assert A * B == N, (N, A, B, take_a)
        if A < B:
            continue
        X = (A + B) // 2
        Y = (A - B) // 2
        assert N + Y ** 2 == X ** 2, (N, A, B, X, Y)
        Ys.add(Y)
    return sorted(Ys)
def trial_div_factor(n, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    all_checked = False
    for d in range(3, (limit or n) + 1, 2):
        if d * d > n:
            all_checked = True
            break
        while True:
            q, r = divmod(n, d)
            if r != 0:
                break
            fs.append(d)
            n = q
    if n > 1 and all_checked:
        fs.append(n)
        n = 1
    return fs, n
def fermat_prp(n, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True
def pollard_rho_factor(n):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random
    # strip small factors first; they can make Pollard's rho fail
    fs, n = trial_div_factor(n, 1 << 7)
    if n <= 1:
        return fs
    if fermat_prp(n):
        return sorted(fs + [n])
    for itry in range(8):
        failed = False
        x = random.randint(2, n - 2)
        for cycle in range(1, 1 << 60):
            y = x
            for i in range(1 << cycle):
                x = (x * x + 1) % n
                d = math.gcd(x - y, n)
                if d == 1:
                    continue
                if d == n:
                    failed = True
                    break
                return sorted(fs + pollard_rho_factor(d) + pollard_rho_factor(n // d))
            if failed:
                break
    assert False, f'Pollard Rho failed! n = {n}'
def factor(N):
    import functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = pollard_rho_factor(N)
    assert N == Prod(fs), (N, fs)
    return sorted(fs)
def test():
    import random, time
    limit = 1 << 20
    num_tests = 20
    t0, t1 = 0, 0
    for i in range(num_tests):
        if (round(i / num_tests * 1000)) % 100 == 0 or i + 1 >= num_tests:
            print(f'test {i}, ', end = '', flush = True)
        N = random.randrange(limit)
        tb = time.time()
        r0 = find_slow(N)
        t0 += time.time() - tb
        tb = time.time()
        r1 = find_fast(N)
        t1 += time.time() - tb
        assert r0 == r1, (N, r0, r1, t0, t1)
    print(f'\nTime slow {t0:.05f} sec, fast {t1:.05f} sec, speedup {round(t0 / max(1e-6, t1))} times')

if __name__ == '__main__':
    test()
Output:
test 0, test 2, test 4, test 6, test 8, test 10, test 12, test 14, test 16, test 18, test 19,
Time slow 26.28198 sec, fast 0.00301 sec, speedup 8732 times
For the easiest solution, you can try this:

import math

n = 13689  # or we can ask the user to input n
for i in range(1, 9999):
    if math.sqrt(n + i ** 2).is_integer():
        print(i)
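Note that math.sqrt goes through floating point and can misclassify squares for large values; with math.isqrt (Python 3.8+) the check stays exact (a variant of the same loop):

import math

n = 13689
for i in range(1, 9999):
    r = math.isqrt(n + i * i)
    if r * r == n + i * i:  # exact integer check, no float rounding
        print(i)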
I'm a bit stuck on a Python problem.
I'm supposed to write a function that takes a positive integer n (2 < n < 201) and returns the number of ways n can be written as a sum of decreasing, unique elements.
To give an example:
If n = 3 then f(n) = 1 (because the only possible solution is 2+1).
If n = 5 then f(n) = 2 (because the possible solutions are 4+1 and 3+2).
If n = 10 then f(n) = 9 (because the possible solutions are 9+1, 8+2, 7+3, 7+2+1, 6+4, 6+3+1, 5+4+1, 5+3+2 and 4+3+2+1).
For the code, I started like this:
def solution(n):
    nb = list(range(1, n))
    itt = 0
    for index in range(len(nb)):
        x = nb[-(index + 1)]
        if x > 3:
            for index2 in range(x - 1):
                y = nb[index2]
                # print(str(x) + ' + ' + str(y))
                if (x + y) == n:
                    itt = itt + 1
                for index3 in range(y - 1):
                    z = nb[index3]
                    if (x + y + z) == n:
                        itt = itt + 1
                    for index4 in range(z - 1):
                        w = nb[index4]
                        if (x + y + z + w) == n:
                            itt = itt + 1
    return itt
It works when n is small, but around n = 100 it becomes super slow, and I would need to add more for loops, which will make things even worse...
Do you have an idea how I could solve this issue? Is there an obvious solution I missed?
This problem is called integer partition into distinct parts. There is an OEIS sequence for it (its values are off by 1 because the single-part partition n => n doesn't count here).
I already have code for partitions into k distinct parts, so I modified it a bit to calculate the number of partitions into any number of parts:
import functools

@functools.lru_cache(20000)
def diffparts(n, k, last):
    # partitions of n into exactly k distinct parts, each greater than last
    result = 0
    if n == 0 and k == 0:
        result = 1
    if n == 0 or k == 0:
        return result
    for i in range(last + 1, n // k + 1):
        result += diffparts(n - i, k - 1, i)
    return result

def dparts(n):
    # sum over all feasible part counts k >= 2 (need k*(k+1)/2 <= n)
    res = 0
    k = 2
    while k * (k + 1) <= 2 * n:
        res += diffparts(n, k, 0)
        k += 1
    return res

print(dparts(201))
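An alternative without recursion is the classic knapsack-style DP over distinct parts; each part may be used at most once, so the inner loop runs downwards (a sketch; the name dparts_dp is mine):

def dparts_dp(n):
    # dp[m] = number of ways to write m as a sum of distinct parts seen so far
    dp = [0] * (n + 1)
    dp[0] = 1
    for part in range(1, n + 1):
        for m in range(n, part - 1, -1):  # downwards, so each part is used once
            dp[m] += dp[m - part]
    return dp[n] - 1  # drop the single-part partition n => n

print(dparts_dp(10))  # 9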
I can't use inner loops
I can't use if-else
I need to compute the following series:
x - x^3/3! + x^5/5! - x^7/7! + x^9/9! ...
I am thinking of something like the following:

n = 1
x = 0.3
one = 1
fact1 = 1
fact2 = 1
term = 0
sum = 0
for i in range(1, n+1, 2):
    one = one * (-1)
    fact1 = fact1 * i
    fact2 = fact2 * i + 1
    fact = fact1 * fact2
    x = x * x
    term = x / fact
    sum = sum + term

But I am finding it hard to keep track of the multiplications for both fact and x.
You want to compute a sum of terms. Each term is the previous term multiplied by -1 * x * x and divided by (n+1) * (n+2), where n is the exponent of the previous term. Just write it:
def func(x):
    eps = 1e-6  # the expected precision order
    term = x
    sum = term
    n = 1
    while True:
        term *= -x * x
        term /= (n + 1) * (n + 2)
        if abs(term) < eps:
            break
        sum += term
        n += 2
    return sum
Demo:
>>> import math
>>> func(math.pi / 6)
0.4999999918690232
giving 0.5 as expected, to within the 1e-6 precision.
Note: the series is the well-known expansion of the sine function...
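Since the series converges to sin(x), a quick sanity check is to compare func against math.sin at a few points:

import math

for x in (0.1, 0.5, math.pi / 6, 1.0):
    print(x, func(x), math.sin(x))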
Isn't that the Taylor series for sin(x)? And can you use a list comprehension? With a list comprehension it could be something like
x = 0.3
sum([(-1)**(n+1) * x**(2*n-1) / fact(2*n-1) for n in range(1, numberOfTerms)])
If you can't use a list comprehension, you could simply write the loop like this:
x = 0.3
terms = []
for n in range(1, numberOfTerms):
    term = (-1)**(n+1) * x**(2*n-1) / fact(2*n-1)
    terms.append(term)
sumOfTerms = sum(terms)
Then calculate the factorial by recursion:
def fact(k):
    if k == 1:
        return 1
    else:
        return fact(k-1) * k
Or calculate the factorial using Stirling's approximation:
fact(k) ≈ sqrt(2*pi*k) * k**k * e**(-k)
No if-else here, nor inner loops. But then there will be precision errors, and you either need the math lib to get the constants, or you hard-code values for pi and e and get even more precision error.
Hope this can help!
n = NUMBER_OF_TERMS
x = VALUE_OF_X
m = -1
sum = x  # final sum

def fact(i):
    f = 1
    while i >= 1:
        f = f * i
        i = i - 1
    return f

for i in range(1, n):
    r = 2 * i + 1
    a = pow(x, r)
    term = a * m / fact(r)
    sum = sum + term
    m = m * (-1)
Given:
I : a positive integer
n : a positive integer
nth term of the sequence for input I:
F(I,1) = (I * (I+1)) / 2
F(I,2) = F(I,1) + F(I-1,1) + F(I-2,1) + ... + F(2,1) + F(1,1)
F(I,3) = F(I,2) + F(I-1,2) + F(I-2,2) + ... + F(2,2) + F(1,2)
...
F(I,n) = F(I,n-1) + F(I-1,n-1) + F(I-2,n-1) + ... + F(2,n-1) + F(1,n-1)
nth term --> F(I,n)
Approach 1: use recursion to find the above:
def recursive_sum(I, n):
    if n == 1:
        return (I * (I + 1)) // 2
    else:
        return sum(recursive_sum(j, n - 1) for j in range(I, 0, -1))
Approach 2: iterate, storing reusable values in a dictionary, and use that dictionary to get the nth term:
def non_recursive_sum_using_data(I, n):
    global data
    if n == 1:
        return (I * (I + 1)) // 2
    else:
        return sum(data[j][n - 1] for j in range(I, 0, -1))

def iterate(I, n):
    global data
    data = {}
    # fill the table layer by layer so every lookup is already computed
    for i in range(1, n + 1):
        for j in range(I + 1):
            if j not in data:
                data[j] = {}
            data[j][i] = non_recursive_sum_using_data(j, i)
    return data[I][n]
The recursion approach is obviously not efficient due to the maximum recursion depth, and the second approach's time and space complexity are also poor.
Is there a better way to recurse, or a different approach than recursion?
I am curious whether we can find a formula for the nth term.
You could just cache your recursive results:
from functools import lru_cache

@lru_cache(maxsize=None)
def recursive_sum(I, n):
    if n == 1:
        return (I * (I + 1)) // 2
    return sum(recursive_sum(j, n - 1) for j in range(I, 0, -1))
That way you can get the readability and brevity of the recursive approach without most of the performance issues since the function is only called once for each argument combination (I, n).
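For example:

print(recursive_sum(7, 4))  # 462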
Using the usual binomial(n,k) = n!/(k!*(n-k)!), you have
F(I,n) = binomial(I+n, n+1).
Then you can choose the method you like most to compute binomial coefficients.
Here is an example:

def binomial(n, k):
    numerator = denominator = 1
    t = max(k, n - k)
    for low, high in enumerate(range(t + 1, n + 1), 1):
        numerator *= high
        denominator *= low
    return numerator // denominator

def F(I, n):
    return binomial(I + n, n + 1)
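A quick sanity check of the closed form against the memoized recursion from the previous answer (assuming recursive_sum is defined as above):

for I in range(1, 8):
    for n in range(1, 6):
        assert F(I, n) == recursive_sum(I, n), (I, n)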
The formula for the nth term of the sequence is the one you have already mentioned, and you have rightly identified that it leads to an inefficient algorithm and stack overflow. You can look into a dynamic programming approach where you calculate F(I,n) just once and reuse the value.
For example, this is how the nth Fibonacci number is calculated:
[just an example] https://www.geeksforgeeks.org/program-for-nth-fibonacci-number/
You need to find the same pattern and cache the values. I have an example of this in a small piece of Go code:
https://play.golang.org/p/vRi-QMj7z2v
The standard DP
One can do a (tiny) bit of math to rewrite your function:
F(i,n) = sum_{k=0}^{i-1} F(i-k, n-1) = sum_{k=1}^{i} F(k, n-1)
Now notice that if you arrange the values F as a matrix indexed by i and n, then to compute F(i,n) we just need to add up the first i elements of the previous column:
x----+---
| + |
|----+ |
|----+-F(i,n)
We conclude that we can build the first layer (i.e. column), then the second one, and so forth until we reach the n-th layer.
Finally we take the last element of the final layer, which is F(i,n).
The computation time is about O(I*n).
More math based but faster
Another way is to consider a layer as a vector variable X.
We can write the recurrence relation as
X_n = M * X_{n-1}
where M is a triangular matrix with 1s in its lower part.
We then want the general term of X_n, so we want to compute M^n.
By following Yves Daoust
(I just copy from the link above.)
Coefficients should be indexed _{n+1} and _n, but here they are written _1 and plain for readability. Moreover the matrix here is upper triangular, but we can just take the transpose afterwards...

a_1 b_1 c_1 d_1     1 1 1 1     a b c d
  0 a_1 b_1 c_1  =  0 1 1 1  *  0 a b c
  0   0 a_1 b_1     0 0 1 1     0 0 a b
  0   0   0 a_1     0 0 0 1     0 0 0 a
Going from the last row to the first:
a = 1
from b_1 = a + b = 1 + b, solving gives b = n
from c_1 = a + b + c = 1 + n + c, solving gives c = n(n+1)/2
from d_1 = a + b + c + d = 1 + n + n(n+1)/2 + d, solving gives d = n(n+1)(n+2)/6
I have not proved it, but I expect e = n(n+1)(n+2)(n+3)/24 (so basically binomial coefficients).
(I think the proof lies in the fact that F(i,n) = F(i,n-1) + F(i-1,n).)
More generally, instead of calling the variables a, b, c, ..., write them X_n(0), X_n(1), ...:
X_n(0) = 1
X_n(i) = n * ... * (n+i-1) / i!
and by applying the recursion for computing X:
X_n(0) = 1
X_n(i) = X_n(i-1) * (n+i-1) / i
Finally we obtain F(i,n) as the scalar product Y_{n-1} * X_1, where Y_n is the reversed vector of X_n and X_1(i) = i*(i+1)/2.
from functools import lru_cache

# this is copy-pasted from schwobaseggl's answer
@lru_cache(maxsize=None)
def recursive_sum(I, n):
    if n == 1:
        return (I * (I + 1)) // 2
    return sum(recursive_sum(j, n - 1) for j in range(I, 0, -1))

def iterative_sum(I, n):
    layer = [i * (i + 1) // 2 for i in range(1, I + 1)]
    x = 2
    while x <= n:
        next_layer = [layer[0]]
        for i in range(1, I):
            # no need to recompute the whole sum every time:
            # take the previous sum and add the new number
            next_layer.append(next_layer[i - 1] + layer[i])
        layer = next_layer
        x += 1
    return layer[-1]

def brutus(I, n):
    if n == 1:
        return I * (I + 1) // 2
    X_1 = [i * (i + 1) // 2 for i in range(1, I + 1)]
    X_n = [1]
    for i in range(1, I):
        # exact integer division: each prefix product is a binomial coefficient
        X_n.append(X_n[-1] * (n - 1 + i - 1) // i)
    X_n.reverse()
    s = 0
    for i in range(0, I):
        s += X_1[i] * X_n[i]
    return s

def do(k, n):
    print('rec', recursive_sum(k, n))
    print('it ', iterative_sum(k, n))
    print('bru', brutus(k, n))
    print('---')

do(1, 4)
do(2, 1)
do(3, 2)
do(4, 7)
do(7, 4)