How can I generate three random integers that satisfy some condition? [closed] - python
I'm a beginner in programming, and I'm looking for a good way to generate three integers that satisfy a condition.
Example:
We are given n = 30, and we've been asked to generate three integers a, b and c, so that 7*a + 5*b + 3*c = n.
I tried to use for loops, but it takes too much time and I have a maximum testing time of 1000 ms.
I'm using Python 3.
My attempt:
x = int(input())
c = []
k = []
w = []
for i in range(x):
    for j in range(x):
        for h in range(x):
            if 7*i + 5*j + 3*h == x:
                c.append(i)
                k.append(j)
                w.append(h)
if len(c) == 0:
    print(-1)
else:
    print(str(k[0]) + ' ' + str(c[0]) + ' ' + str(w[0]))
First, let me note that your task is underspecified in at least two respects:
The allowed range of the generated values is not specified. In particular, you don't specify whether the results may include negative integers.
The desired distribution of the generated values is not specified.
Normally, if not specified, one might assume that a uniform distribution on the set of possible solutions to the equation was expected (since it is, in a certain sense, the most random possible distribution on a given set). But a (discrete) uniform distribution is only possible if the solution set is finite, which it won't be if the range of results is unrestricted. (In particular, if (a, b, c) is a solution, then so is (a, b + 3k, c − 5k) for any integer k.) So if we interpret the task as asking for a uniform distribution with unlimited range, it's actually impossible!
On the other hand, if we're allowed to choose any distribution and range, the task becomes trivial: just make the generator always return a = −n, b = n, c = n. Clearly this is a solution to the equation (since 7·(−n) + 5n + 3n = (−7 + 5 + 3)n = n), and a degenerate distribution that assigns all probability mass to a single point is still a valid probability distribution!
If you wanted a slightly less degenerate solution, you could pick a random integer k (using any distribution of your choice) and return a = −n, b = n + 3k, c = n − 5k. As noted above, this is also a solution to the equation for any k. Of course, this distribution is still somewhat degenerate, since the value of a is fixed.
If you want to let all return values be at least somewhat random, you could also pick a random h and return a = −n + h, b = n − 2h + 3k and c = n + h − 5k. Again, this is guaranteed to be a valid solution for any h and k, since it clearly satisfies the equation for h = k = 0, and it's also easy to see that increasing or decreasing either h or k will leave the value of the left-hand side of the equation unchanged.
In fact, it can be proved that this method can generate all possible solutions to the equation, and that each solution will correspond to a unique (h, k) pair! (One fairly intuitive way to see this is to plot the solutions in 3D space and observe that they form a regular lattice of points on a 2D plane, and that the vectors (+1, −2, +1) and (0, +3, −5) span this lattice.) If we pick h and k from some distribution that (at least in theory) assigns a non-zero probability to every integer, then we'll have a non-zero probability of returning any valid solution. So, at least for one somewhat reasonable interpretation of the task (unbounded range, any distribution with full support) the following code should solve the task efficiently:
from random import gauss

def random_solution(n):
    h = int(gauss(0, 1000))  # any distribution with full support on the integers will do
    k = int(gauss(0, 1000))
    return (-n + h, n - 2*h + 3*k, n + h - 5*k)
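As a quick sanity check (this snippet is mine, not part of the answer above), every triple the generator returns satisfies the equation by construction:

for _ in range(5):
    a, b, c = random_solution(30)
    assert 7*a + 5*b + 3*c == 30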
If the range of possible values is restricted, the problem becomes a bit trickier. On the positive side, if all values are bounded below (or above), then the set of possible solutions is finite, and so a uniform distribution exists on it. On the flip side, efficiently sampling this uniform distribution is not trivial.
One possible approach, which you've used yourself, is to first generate all possible solutions (assuming there's a finite number of them) and then sample from the list of solutions. We can do the solution generation fairly efficiently like this:
1. find all possible values of a for which the equation might have a solution,
2. for each such a, find all possible values of b for which the equation might still have a solution,
3. for each such (a, b) pair, solve the equation for c and check whether it's valid (i.e. an integer within the specified range), and
4. if it is, add (a, b, c) to the set of solutions.
The tricky part is step 2, where we want to calculate the range of possible b values. For this, we can make use of the observation that, for a given a, setting c to its smallest allowed value and solving the equation gives an upper bound for b (and vice versa).
In particular, solving the equation for a, b and c respectively, we get:
a = (n − 5b − 3c) / 7
b = (n − 7a − 3c) / 5
c = (n − 7a − 5b) / 3
Given lower bounds on some of the values, we can use these solutions to compute corresponding upper bounds on the others. For example, the following code will generate all non-negative solutions efficiently (and can be easily modified to use a lower bound other than 0, if needed):
def all_nonnegative_solutions(n):
    a_min = b_min = c_min = 0
    a_max = (n - 5*b_min - 3*c_min) // 7
    for a in range(a_min, a_max + 1):
        b_max = (n - 7*a - 3*c_min) // 5
        for b in range(b_min, b_max + 1):
            if (n - 7*a - 5*b) % 3 == 0:
                c = (n - 7*a - 5*b) // 3
                yield (a, b, c)
We can then store the solutions in a list or a tuple and sample from that list:
from random import choice
solutions = tuple(all_nonnegative_solutions(30))
a, b, c = choice(solutions)
PS. Apparently Python's random.choice is not smart enough to use reservoir sampling to sample from an arbitrary iterable, so we do need to store the full list of solutions even if we only want to sample from it once. Or, of course, we could always implement our own sampler:
from random import randrange

def reservoir_choice(iterable):
    r = None
    n = 0
    for x in iterable:
        n += 1
        if randrange(n) == 0:
            r = x
    return r
a, b, c = reservoir_choice(all_nonnegative_solutions(30))
BTW, we could make the all_nonnegative_solutions function above a bit more efficient by observing that the (n - 7*a - 5*b) % 3 == 0 condition (which checks whether c = (n − 7a − 5b) / 3 is an integer, and thus a valid solution) is true for every third value of b. Thus, if we first calculated the smallest value of b that satisfies the condition for a given a (which can be done with a bit of modular arithmetic), we could iterate over b with a step size of 3 starting from that minimum value and skip the divisibility check entirely. I'll leave implementing that optimization as an exercise.
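For the curious, here is one possible sketch of that exercise (my own, not the answer author's): since 7 ≡ 1 and 5 ≡ 2 (mod 3), the condition (n - 7*a - 5*b) % 3 == 0 reduces to b ≡ 2*(n - a) (mod 3), so we can start b at that residue and step by 3:

def all_nonnegative_solutions_fast(n):
    a_max = n // 7
    for a in range(a_max + 1):
        b_max = (n - 7*a) // 5
        # smallest non-negative b with (n - 7*a - 5*b) divisible by 3
        b_start = (2 * (n - a)) % 3
        for b in range(b_start, b_max + 1, 3):
            yield (a, b, (n - 7*a - 5*b) // 3)

A quick check that it agrees with the original: sorted(all_nonnegative_solutions_fast(30)) == sorted(all_nonnegative_solutions(30)).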
import numpy as np

def generate_answer(n: int, low_limit: int, high_limit: int):
    while True:
        a = np.random.randint(low_limit, high_limit + 1, 1)[0]
        b = np.random.randint(low_limit, high_limit + 1, 1)[0]
        c = (n - 7 * a - 5 * b) / 3.0
        if int(c) == c and low_limit <= c <= high_limit:
            break
    return a, b, int(c)

if __name__ == "__main__":
    n = 30
    ans = generate_answer(low_limit=-5, high_limit=50, n=n)
    assert ans[0] * 7 + ans[1] * 5 + ans[2] * 3 == n
    print(ans)
If you select two of the numbers a, b, c, you know the third. In this case, I randomize ints for a and b, and find c by c = (n - 7 * a - 5 * b) / 3.0. If c is an integer and within the allowed limits, we are done. If it is not, randomize again.
If you want to generate all possibilities:

def generate_all_answers(n: int, low_limit: int, high_limit: int):
    results = []
    for a in range(low_limit, high_limit + 1):
        for b in range(low_limit, high_limit + 1):
            c = (n - 7 * a - 5 * b) / 3.0
            if int(c) == c and low_limit <= c <= high_limit:
                results.append((a, b, int(c)))
    return results
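If the end goal is still random sampling, the precomputed list combines naturally with random.choice (a usage sketch, assuming the function above):

from random import choice

solutions = generate_all_answers(30, -5, 50)
if solutions:
    a, b, c = choice(solutions)
    print(a, b, c)
else:
    print(-1)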
If third-party libraries are allowed, you can use diop_linear, SymPy's linear Diophantine equation solver:
from sympy.solvers.diophantine.diophantine import diop_linear
from sympy import symbols
from numpy.random import randint

n = 30
N = 8  # Number of solutions needed

# Unknowns
a, b, c = symbols('a, b, c', integer=True)

# Coefficients
x, y, z = 7, 5, 3

# Parameters of the parametric equation of the solution
t_0, t_1 = symbols('t_0, t_1', integer=True)

solution = diop_linear(x * a + y * b + z * c - n)

if not (None in solution):
    for s in range(N):
        # -10000 and 10000 (min and max for t_0 and t_1)
        t_sub = [(t_0, randint(-10000, 10000)), (t_1, randint(-10000, 10000))]
        a_val, b_val, c_val = map(lambda t: t.subs(t_sub), solution)
        print('Solution #%d' % (s + 1))
        print('a =', a_val, ', b =', b_val, ', c =', c_val)
else:
    print('no solutions')
Output (random):
Solution #1
a = -141 , b = -29187 , c = 48984
Solution #2
a = -8532 , b = -68757 , c = 134513
Solution #3
a = 5034 , b = 30729 , c = -62951
Solution #4
a = 7107 , b = 76638 , c = -144303
Solution #5
a = 4587 , b = 23721 , c = -50228
Solution #6
a = -9294 , b = -106269 , c = 198811
Solution #7
a = -1572 , b = -43224 , c = 75718
Solution #8
a = 4956 , b = 68097 , c = -125049
Why your solution can't cope with large values of n
Everything inside a for loop over range(i) runs i times, so the loop multiplies the time taken by its body by i.
For example, let's pretend (to keep things simple) that this runs in 4 milliseconds:
if 7*i + 5*j + 3*h == n:
    c.append(i)
    k.append(j)
    w.append(h)
then this will run in 4×n milliseconds:
for h in range(n):
    if 7*i + 5*j + 3*h == n:
        c.append(i)
        k.append(j)
        w.append(h)
Approximately:
n = 100 would take 0.4 seconds
n = 250 would take 1 second
n = 15000 would take 60 seconds
If you put that inside a for loop over a range of n then the whole thing will be repeated n times. I.e.
for j in range(n):
    for h in range(n):
        if 7*i + 5*j + 3*h == n:
            c.append(i)
            k.append(j)
            w.append(h)
will take 4n² milliseconds.
n = 30 would take 4 seconds
n = 50 would take 10 seconds
n = 120 would take 60 seconds
Putting it in a third for-loop will take 4n³ milliseconds.
n = 10 would take 4 seconds
n = 14 would take 10 seconds.
n = 24 would take 60 seconds.
Now, what if you halved the original if to 2 milliseconds? In the first case, the largest feasible n would double from 15000 to 30000, but in the last case it would only rise from about 24 to about 31 (since 2n³ ≤ 60000 gives n ≈ 31). The lesson here is that fewer for-loops is usually much more important than speeding up what's inside them. As you can see in Gulzar's answer part 2, there are only two for loops, which makes a big difference; a sketch of eliminating the innermost loop follows below. (This only applies if the loops are inside each other; if they are just one after another you don't have the multiplication problem.)
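To make that concrete, here is a sketch (mine, not part of this answer) of dropping the innermost loop entirely by solving the equation for the third value, which turns n³ iterations into n²:

x = 30
solutions = []
for i in range(x):
    for j in range(x):
        rem = x - 7*i - 5*j
        if rem >= 0 and rem % 3 == 0:  # the third value exists and is a non-negative integer
            solutions.append((i, j, rem // 3))
print(solutions[0] if solutions else -1)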
From my perspective, the last of the three numbers is never random. Say you generate a and b first; then c is never random, because it has to be calculated from the equation:

n = 7*a + 5*b + 3*c
c = (7*a + 5*b - n) / -3

This means that we need to generate two random values (a, b) such that 7*a + 5*b - n is divisible by 3.
import random

n = 30
hi = 1000000
lo = -1000000

while True:
    a = random.randint(lo, hi)
    b = random.randint(lo, hi)
    t = (7*a) + (5*b) - n
    if t % 3 == 0:
        break

c = t // -3  # exact, since t is divisible by 3
print("A = " + str(a))
print("B = " + str(b))
print("C = " + str(c))
print("7A + 5B + 3C =>")
print("(7 * " + str(a) + ") + (5 * " + str(b) + ") + (3 * " + str(c) + ") = ")
print((7*a) + (5*b) + (3*c))
Related
Is there a better way to find ‘highly composite’ pythagorean triples in Python?
I'm trying to find 'highly composite' pythagorean triples - numbers (c) that have more than one unique (a, b) (in the naturals) that satisfy a² + b² = c². I've written a short python script to find these - it cycles through c in the range (0, 1000), and for each c, finds all possible (a, b) such that b < a < c. This is a more brute force method, and I know if I did some reading on number theory I could find some more methods for different cases of a and b. I have a feeling that my script isn't particularly efficient, especially for large c. I don't really know what to change or how to make it more efficient. I'd be really grateful for any help or pointers!

a = 0
b = 0
l = []
for i in range(0, 1000):  # i is our c.
    while a < i:
        while b < a:
            # For each a, we cycle through b = 1, b = 2, ... until b = a.
            # Then we make b = 0 and a = a+1, and start the iterative process again.
            if a*a + b*b == i*i:
                l.append(a)
                l.append(b)
                # I tried adding a break here - my thought process was that we can't find any
                # other b^2 that satisfies a^2 + b^2 = i^2 without changing our a^2. This
                # actually made the runtime longer, and I don't know why.
            b = b + 1
        a = a + 1
        b = 0
    if len(l) > 4:  # all our pairs of pythagorean triples, with the c at the end.
        print(l, i)
    # Reset, and find pairs again for i = i+1.
    l = []
    b = 0
    a = 0
Your code seems quite inefficient, because you are repeating many of the same computations. You could make it more efficient by not calculating things that are not useful. The most important detail is the computation of a and b. You are looping through all possible values for a and b and checking if they form a pythagorean triplet. But once you give yourself a value for a, there is only one possible choice for b, so the b loop is useless. By removing that loop, you're basically lowering the degree of the polynomial complexity by one, which will make it increasingly faster (compared to your current script) as c grows.

Also, your code seems to be wrong, as it misses some triplets. I ran it, and the first triplets found were with 65 and 85, but 25, 50 and 75 are also highly composite pythagorean triplets. That's because you're checking len(l) > 4, while you should check len(l) >= 4 instead; as written, you're missing numbers that have exactly two decompositions.

As a comparison, I programmed a similar python script (except I did it myself and tried to make it as efficient as possible). On my computer, your script ran in 66 seconds, while mine ran in 4 seconds, so you have a lot of room for improvement.

EDIT: I added my code for the sake of sharing. Here is a list of what differs from yours:

- I stored all squares of numbers from 1 to N in a list called squares so I can check efficiently if a number is a square
- I store the results in a dictionary where the value at key c is a list of tuples corresponding to (a, b)
- The loop for a goes from 1 to floor(c/sqrt(2))
- Instead of looping for b, I check whether c² - a² is a square
- On a general note, I pre-compute every value that has to be used several times (invsqrt2, csqr)

from math import floor, sqrt

invsqrt2 = 1 / sqrt(2)
N = 1000

highly_composite_triplets = {}
squares = list(map(lambda x: x**2, range(0, N + 1)))

for c in range(2, N + 1):
    if c % 50 == 0:
        print(c)  # Just to keep track of the thing
    csqr = c**2
    listpairs = []
    for a in range(1, floor(c * invsqrt2) + 1):
        sqrdiff = csqr - a**2
        if sqrdiff in squares:
            listpairs.append((a, squares.index(sqrdiff)))
    if len(listpairs) > 1:
        highly_composite_triplets[c] = listpairs

print(highly_composite_triplets)
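One further tweak worth noting (my own suggestion, not part of the answer above): sqrdiff in squares is a linear scan of a list, and squares.index(sqrdiff) scans it again. A dict mapping each square to its root makes both the membership test and the root lookup O(1):

from math import floor, sqrt

N = 1000
root_of = {x*x: x for x in range(N + 1)}  # square -> root

highly_composite_triplets = {}
for c in range(2, N + 1):
    csqr = c*c
    listpairs = []
    for a in range(1, floor(c / sqrt(2)) + 1):
        b = root_of.get(csqr - a*a)
        if b is not None:
            listpairs.append((a, b))
    if len(listpairs) > 1:
        highly_composite_triplets[c] = listpairs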
First of all, and as already mentioned, you should fix that > 4 to >= 4.

For performance, I would suggest using the tree of primitive Pythagorean triples. It allows generating all possible primitive triples, such that the three "children" of a given triple have a c-value that is at least as great as that of the "parent". The non-primitive triples can be easily generated from a primitive one, by multiplying all three values with a coefficient (until the maximum value of c is reached). This has to be done only for the initial triplet, as the others will follow from it. That is the part where most of the efficiency gain is made.

Then in a second phase: group those triples by their c value. You can use itertools.groupby for that.

In a third phase: only select the groups that have at least 2 members (i.e. 4 values).

Here is an implementation:

import itertools
import operator

def pythagorian(end):
    # DFS traversal through the pythagorian tree:
    def recur(a, b, c):
        if c < end:
            yield c, max(a, b), min(a, b)
            yield from recur( a - 2*b + 2*c,  2*a - b + 2*c,  2*a - 2*b + 3*c)
            yield from recur( a + 2*b + 2*c,  2*a + b + 2*c,  2*a + 2*b + 3*c)
            yield from recur(-a + 2*b + 2*c, -2*a + b + 2*c, -2*a + 2*b + 3*c)

    # Start traversal from basic triplet, and its multiples
    for i in range(1, end // 5):
        yield from recur(4*i, 3*i, 5*i)

def grouped_pythagorian(end):
    # Group by value of c, and flatten the a, b pairs into a list
    return [
        (c, [a for _, *ab in group for a in ab])
        for c, group in itertools.groupby(sorted(pythagorian(end)),
                                          operator.itemgetter(0))
    ]

def highly_pythagorian(end):
    # Select the groups of triples that have at least 2 members (i.e. 4 values)
    return [(group, c) for c, group in grouped_pythagorian(end)
            if len(group) >= 4]

Run the function as follows:

for result in highly_pythagorian(1000):
    print(*result)

This produces the triples within a fraction of a second, and is thousands of times faster than your version and the one in @Mateo's answer.

Simplified

As discussed in comments, I provide here code that uses the same algorithm, but without imports, list comprehensions, generators (yield), and unpacking operators (*):

def highly_pythagorian(end):
    triples = []

    # DFS traversal through the pythagorian tree:
    def dfs(a, b, c):
        if c < end:
            triples.append((c, max(a, b), min(a, b)))
            dfs( a - 2*b + 2*c,  2*a - b + 2*c,  2*a - 2*b + 3*c)
            dfs( a + 2*b + 2*c,  2*a + b + 2*c,  2*a + 2*b + 3*c)
            dfs(-a + 2*b + 2*c, -2*a + b + 2*c, -2*a + 2*b + 3*c)

    # Start traversal from basic triplet, and its multiples
    for i in range(1, end // 5):
        dfs(4*i, 3*i, 5*i)

    # Sort the triples by their c-component (first one),
    # ...and then their a-component
    triples.sort()

    # Group the triples in a dict, keyed by c values
    groups = {}
    for c, a, b in triples:
        if not c in groups:
            groups[c] = []
        groups[c].append(a)
        groups[c].append(b)

    # Select the groups of triples that have at least 2 members (i.e. 4 values)
    results = []
    for c, ab_pairs in sorted(groups.items()):
        if len(ab_pairs) >= 4:
            results.append((ab_pairs, c))

    return results

Call as:

for ab_pairs, c in highly_pythagorian(1000):
    print(ab_pairs, c)
Here is a solution based on the mathematical intuition behind Gaussian integers. We are working in the "ring" R of all numbers of the form a + ib, where a, b are integers. This is the ring of Gaussian integers. Here, i is the square root of -1, so i² = -1. Such numbers lead to an arithmetic similar to that of the (usual) integers: each such number has a unique decomposition into Gaussian primes (up to the order of the factors). Such a domain is called a unique factorization domain, UFD.

Which are the primes in R? (Those elements that cannot be split multiplicatively into more than two non-invertible pieces.) There is a concrete characterization for them. The classical primes of the shape 4k + 3 remain primes in R; they are inert. So we cannot split primes like 3, 7, 11, 19, 23, 31, ... in R. But we can always split uniquely (up to unit conjugation, a unit being one among 1, -1, i, -i) the (classical) primes of the shape 4k + 1 in R. For instance:

(*)
5 = (2 + i)(2 - i)
13 = (3 + 2i)(3 - 2i)
17 = (4 + i)(4 - i)
29 = (5 + 2i)(5 - 2i)
37 = (6 + i)(6 - i)
41 = (5 + 4i)(5 - 4i)
53 = (7 + 2i)(7 - 2i)
61 = (6 + 5i)(6 - 5i)

and so on; I hope the scheme is clear. For our purpose, the remaining prime two is the oddest prime, since we have its decomposition 2 = (1 + i)(1 - i), where the two Gaussian primes (1 + i) and (1 - i) are associated: multiplying by a unit turns one into the other. I will avoid this prime below.

Now consider the product of some of the numbers on the L.H.S. in (*). For instance 5·5·13·17 = 5525, and let us pick from each of the four (classical) prime factors one of the Gaussian primes inside. We may thus pick (2 + i) twice from the two 5-factors, (3 - 2i) from 13 and (4 + i) from the 17. We multiply and get:

sage: (2 + i)^2 * (3 - 2*i) * (4 + i)
41*I + 62

And indeed, a = 41 and b = 62 is a solution of 41² + 62² = 5525. Unfortunately 5525 is not a square. OK, let us start with a square, one like

1105² = 5²·13²·17² = (2+i)²(2-i)² · (3+2i)²(3-2i)² · (4+i)²(4-i)²

and now separate the factors into "two parts", so that one part holds some factors, and the other part the conjugates. Here are the possibilities for 25 = 5²:

(2+i)² and (2-i)²
5 and 5
(2-i)² and (2+i)²

There are three possibilities. Do the same for the other two squares, then combine. For instance:

sage: (2 + i)^2 * (3 - 2*i)^2 * 17
-272*I + 1071

And indeed, 272² + 1071² = 1105². This solution is not "primitive", in the sense that 17 is a divisor of the three involved numbers, 272, 1071, 1105. Well, this happens because we took the factor 17 from the separation of 17² into two (equal) parts. To get some other solutions, we may take each possible first part from 5² with... each possible first part from 13² with... each possible first part from 17², and thus get "many solutions". Here they are:

sage: [(m, n) for m in range(1, 1105) for n in range(1, 1105)
....:  if m <= n and m^2 + n^2 == 1105**2]
[(47, 1104), (105, 1100), (169, 1092), (264, 1073), (272, 1071),
 (425, 1020), (468, 1001), (520, 975), (561, 952), (576, 943),
 (663, 884), (700, 855), (744, 817)]

We expect 3·3·3 solutions. One of them is the trivial one, 1105² = 1105² + 0². The other solutions of 1105² = a² + b² may be arranged to have a < b. (No chance to get equality.) So we expect (27 - 1)/2 = 13 solutions, yes, the ones above.

Which solution is produced by taking the "first parts" as follows: (2 + i)^2 * (3 - 2*i)^2 * (4 + i)^2?

sage: (2 + i)^2 * (3 - 2*i)^2 * (4 + i)^2
264*I + 1073

And indeed, (264, 1073) is among the solutions above.
So if getting "highly composite" numbers is the issue, with an accent on highly, then just pick for c such a product of primes of the shape 4k + 1. For instance c = 5³·13·17 or c = 5·13·17·29. Then compute all representations c² = (a + ib)(a - ib) = a² + b², best by using the UFD property of the Gaussian integers. For instance, in a python3 dialog with the interpreter...

In [16]: L25 = [complex(2, 1)**4, complex(2, 1)**2 * 5, 25, complex(2, -1)**2 * 5, complex(2, -1)**4]
In [17]: L13 = [complex(3, 2)**2, 13, complex(3, -2)**2]
In [18]: L17 = [complex(4, 1)**2, 17, complex(4, -1)**2]
In [19]: solutions = []
In [20]: for z1 in L25:
    ...:     for z2 in L13:
    ...:         for z3 in L17:
    ...:             z = z1 * z2 * z3
    ...:             a, b = int(abs(z.real)), int(abs(z.imag))
    ...:             if a > b:
    ...:                 a, b = b, a
    ...:             solutions.append((a, b))
    ...:
In [21]: solutions = list(set(solutions))
In [22]: solutions.sort()
In [23]: len(solutions)
Out[23]: 23
In [24]: solutions
Out[24]:
[(0, 5525), (235, 5520), (525, 5500), (612, 5491), (845, 5460),
 (1036, 5427), (1131, 5408), (1320, 5365), (1360, 5355), (1547, 5304),
 (2044, 5133), (2125, 5100), (2163, 5084), (2340, 5005), (2600, 4875),
 (2805, 4760), (2880, 4715), (3124, 4557), (3315, 4420), (3468, 4301),
 (3500, 4275), (3720, 4085), (3861, 3952)]

We have 23 = 22 + 1 solutions. The last one is the trivial one. All other solutions (a, b) listed have a < b, so there are in total 1 + 22·2 = 45 = 5·3·3, as expected from the triple for loop above.

A similar code can be written for c = 5 * 13 * 17 * 29 = 32045, leading to (3⁴ - 1)/2 = 40 non-trivial solutions.

In [26]: L5 = [complex(2, 1)**2, 5, complex(2, -1)**2]
In [27]: L13 = [complex(3, 2)**2, 13, complex(3, -2)**2]
In [28]: L17 = [complex(4, 1)**2, 17, complex(4, -1)**2]
In [29]: L29 = [complex(5, 2)**2, 29, complex(5, -2)**2]
In [30]: z_list = [z1*z2*z3*z4
    ...:           for z1 in L5 for z2 in L13
    ...:           for z3 in L17 for z4 in L29]
In [31]: ab_list = [(int(abs(z.real)), int(abs(z.imag))) for z in z_list]
In [32]: len(ab_list)
Out[32]: 81
In [33]: ab_list = list(set([(min(a, b), max(a, b)) for (a, b) in ab_list]))
In [34]: ab_list.sort()
In [35]: len(ab_list)
Out[35]: 41
In [36]: ab_list[:10]
Out[36]:
[(0, 32045), (716, 32037), (1363, 32016), (2277, 31964), (2400, 31955),
 (3045, 31900), (3757, 31824), (3955, 31800), (4901, 31668), (5304, 31603)]

(Feel free to also use powers of two in c.)
There is a general formula for Pythagorean triples: take two numbers m and n with m > n, and set

a = m² - n²
b = 2mn
c = m² + n²

That will always give you a Pythagorean triple. It's more efficient, but it might not be what you're looking for.
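A quick sketch (my own addition) verifying the formula for a few small (m, n) pairs:

for m in range(2, 5):
    for n in range(1, m):
        a, b, c = m*m - n*n, 2*m*n, m*m + n*n
        assert a*a + b*b == c*c
        print(a, b, c)  # e.g. 3 4 5, 8 6 10, 5 12 13, ...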
Let n be a square number. Using Python, how can we efficiently calculate natural numbers y up to a limit l such that n+y^2 is again a square number?
Using Python, I would like to implement a function that takes a natural number n as input and outputs a list of natural numbers [y1, y2, y3, ...] such that n + y1*y1 and n + y2*y2 and n + y3*y3 and so forth is again a square. What I tried so far is to obtain one y-value using the following function:

def find_square(n: int) -> tuple[int, int]:
    if n % 2 == 1:
        y = (n - 1) // 2
        x = n + y*y
        return (y, x)
    return None

It works fine; e.g. find_square(13689) gives me a correct solution y=6844. It would be great to have an algorithm that yields all possible y-values such as y=44 or y=156.
The simplest slow approach is of course, for a given N, just to iterate all possible Y and check if N + Y^2 is a square. But there is a much faster approach using integer factorization techniques.

Let's notice that to solve the equation N + Y^2 = X^2, that is to find all integer pairs (X, Y) for a given fixed integer N, we can rewrite this equation as N = X^2 - Y^2 = (X + Y) * (X - Y), which follows from the well-known difference-of-squares formula. Now let's rename the two factors as A, B, i.e. N = (X + Y) * (X - Y) = A * B, which means that X = (A + B) / 2 and Y = (A - B) / 2. Notice that A and B should be of the same parity, either both odd or both even; otherwise the divisions by 2 in the last formulas would not be whole.

We will factorize N into all possible pairs of two factors (A, B) of the same parity. For fast factorization in the code below I used the simple-to-implement but still quite fast Pollard Rho algorithm. Two extra algorithms were needed as helpers to Pollard Rho: one is the Fermat primality test (which allows fast checking if a number is probably prime), and the second is trial division factorization (which helps Pollard Rho to factor out small factors, which could otherwise cause Pollard Rho to fail). Pollard Rho for a composite number has time complexity O(N^(1/4)), which is very fast even for 64-bit numbers. A faster factorization algorithm can be chosen if a bigger space needs to be searched. My fast algorithm's time is dominated by the speed of factorization; the remaining part of the algorithm is blazingly fast, just a few iterations of a loop with simple formulas.

If your N is a square itself (hence we know its root easily), then Pollard Rho can factor N much faster, within O(N^(1/8)) time. Even for 128-bit numbers this means a very small time, 2^16 operations, and I hope you're solving your task for numbers of fewer than 128 bits.

If you want to process a range of possible N values, then the fastest way to factorize them is to use techniques similar to the Sieve of Eratosthenes: using a set of prime numbers, it allows computing all factors for all N numbers within some range. Using a Sieve of Eratosthenes for a range of Ns is much faster than factorizing each N with Pollard Rho.

After factoring N into pairs (A, B), we compute (X, Y) from (A, B) by the formulas above, and output the resulting Y as a solution of the fast algorithm.

The following example code is implemented in pure Python. Of course one can use Numba to speed it up; Numba usually gives a 30-200x speedup, achieving in Python the same speed as optimized C++. But I thought that the main thing here is to implement the fast algorithm; Numba optimizations can be done easily afterwards.

I added time measurement to the following code. Although it is pure Python, my fast algorithm achieves a roughly 8500x speedup compared to the regular brute-force approach for a limit of 1 000 000. You can change the limit variable to tweak the amount of searched space, or the num_tests variable to tweak the number of different tests.

The following code implements both solutions - the fast solution find_fast() described above, plus a very tiny brute-force solution find_slow() which is very slow as it scans all possible candidates. The slow solution is only used to check correctness in tests and compare the speedup.

The code below uses nothing except a few standard Python library modules; no external modules were used. Try it online!
def find_slow(N):
    import math
    def is_square(x):
        root = int(math.sqrt(float(x)) + 0.5)
        return root * root == x, root
    l = []
    for y in range(N):
        if is_square(N + y ** 2)[0]:
            l.append(y)
    return l

def find_fast(N):
    import itertools, functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = factor(N)
    mfs = {}
    for e in fs:
        mfs[e] = mfs.get(e, 0) + 1
    fs = sorted(mfs.items())
    del mfs
    Ys = set()
    for take_a in itertools.product(*[
            (range(v + 1) if k != 2 else range(1, v))
            for k, v in fs]):
        A = Prod([p ** t for (p, _), t in zip(fs, take_a)])
        B = N // A
        assert A * B == N, (N, A, B, take_a)
        if A < B:
            continue
        X = (A + B) // 2
        Y = (A - B) // 2
        assert N + Y ** 2 == X ** 2, (N, A, B, X, Y)
        Ys.add(Y)
    return sorted(Ys)

def trial_div_factor(n, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    all_checked = False
    for d in range(3, (limit or n) + 1, 2):
        if d * d > n:
            all_checked = True
            break
        while True:
            q, r = divmod(n, d)
            if r != 0:
                break
            fs.append(d)
            n = q
    if n > 1 and all_checked:
        fs.append(n)
        n = 1
    return fs, n

def fermat_prp(n, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True

def pollard_rho_factor(n):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random
    fs, n = trial_div_factor(n, 1 << 7)
    if n <= 1:
        return fs
    if fermat_prp(n):
        return sorted(fs + [n])
    for itry in range(8):
        failed = False
        x = random.randint(2, n - 2)
        for cycle in range(1, 1 << 60):
            y = x
            for i in range(1 << cycle):
                x = (x * x + 1) % n
                d = math.gcd(x - y, n)
                if d == 1:
                    continue
                if d == n:
                    failed = True
                    break
                return sorted(fs + pollard_rho_factor(d) + pollard_rho_factor(n // d))
            if failed:
                break
    assert False, f'Pollard Rho failed! n = {n}'

def factor(N):
    import functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = pollard_rho_factor(N)
    assert N == Prod(fs), (N, fs)
    return sorted(fs)

def test():
    import random, time
    limit = 1 << 20
    num_tests = 20
    t0, t1 = 0, 0
    for i in range(num_tests):
        if (round(i / num_tests * 1000)) % 100 == 0 or i + 1 >= num_tests:
            print(f'test {i}, ', end = '', flush = True)
        N = random.randrange(limit)
        tb = time.time()
        r0 = find_slow(N)
        t0 += time.time() - tb
        tb = time.time()
        r1 = find_fast(N)
        t1 += time.time() - tb
        assert r0 == r1, (N, r0, r1, t0, t1)
    print(f'\nTime slow {t0:.05f} sec, fast {t1:.05f} sec, speedup {round(t0 / max(1e-6, t1))} times')

if __name__ == '__main__':
    test()

Output:

test 0, test 2, test 4, test 6, test 8, test 10, test 12, test 14, test 16, test 18, test 19,
Time slow 26.28198 sec, fast 0.00301 sec, speedup 8732 times
For the easiest solution, you can try this:

import math

n = 13689  # or we can ask the user to input a square number.
for i in range(1, 9999):
    if math.sqrt(n + i**2).is_integer():
        print(i)
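A caveat: for very large n, math.sqrt works in floating point and can misclassify near-squares. An exact variant using math.isqrt (Python 3.8+) avoids that (my variation on the snippet above):

import math

n = 13689
for i in range(1, 9999):
    s = n + i**2
    r = math.isqrt(s)  # exact integer square root
    if r * r == s:
        print(i)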
Efficiency when adding values from two lists
I'm trying to learn algorithms by writing a python application that tests out Fermat's last theorem. It iterates all combinations of a^n + b^n = c^n, where a and b hit a ceiling at 10000 and n hits a ceiling at 100. I realize I won't get any hits, but it's just a bit of fun. Anyway, the specifics don't really matter. What it boils down to is a + b where a and b iterate all combinations 1 to 10000. But here's the problem: 4 + 5 is exactly the same as 5 + 4. So my program is doing twice the work it needs to do. How can I iterate these combinations while skipping over mirrored inputs?

base_ceiling = 10000  # max values for a and b
n_ceiling = 100  # max value for power of n

powers = []
for i in range(n_ceiling):
    jarr = []
    for j in range(base_ceiling):
        jarr.append(j ** i)
    powers.append(jarr)

for k in range(3, n_ceiling):
    for i in range(1, base_ceiling):
        for j in range(1, base_ceiling):
            pow_vals = powers[k]
            a = powers[k][i]
            b = powers[k][j]
            c = a + b
            try:
                idx = pow_vals.index(c)
                if idx > -1:
                    print k, ": ", i, j, "=", idx, " results in ", a, b, "=", c
            except ValueError:
                continue
It's as simple as using for j in range(i, base_ceiling). This works because it starts from i instead of 1, so it doesn't repeat anything less than i. You could use i + 1 instead, because i^n + i^n will never be a power of n.
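A minimal illustration of the effect (my own snippet, with a small ceiling so the output is readable):

base_ceiling = 6
pairs = [(i, j) for i in range(1, base_ceiling)
                for j in range(i, base_ceiling)]
print(pairs)  # (4, 5) appears once; the mirrored (5, 4) never does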
How to find sum of cubes of the divisors for every number from 1 to input number x in python where x can be very large
Examples:

1. Input = 4, Output = 111. Explanation:

1 = 1³ (divisors of 1)
2 = 1³ + 2³ (divisors of 2)
3 = 1³ + 3³ (divisors of 3)
4 = 1³ + 2³ + 4³ (divisors of 4)
sum = 111 (output)

2. Input = 5, Output = 237. Explanation:

1 = 1³ (divisors of 1)
2 = 1³ + 2³ (divisors of 2)
3 = 1³ + 3³ (divisors of 3)
4 = 1³ + 2³ + 4³ (divisors of 4)
5 = 1³ + 5³ (divisors of 5)
sum = 237 (output)

x = int(raw_input().strip())
tot = 0
for i in range(1, x+1):
    for j in range(1, i+1):
        if i % j == 0:
            tot += j**3
print tot

Using this code I can find the answer for numbers less than one million, but I want to find the answer for very large numbers. Is there an algorithm for how to solve it easily for large numbers?
Offhand I don't see a slick way to make this truly efficient, but it's easy to make it a whole lot faster. If you view your examples as matrices, you're summing them a row at a time. This requires, for each i, finding all the divisors of i and summing their cubes. In all, this requires a number of operations proportional to x**2.

You can easily cut that to a number of operations proportional to x, by summing the matrix by columns instead. Given an integer j, how many integers in 1..x are divisible by j? That's easy: there are x // j multiples of j in the range, so divisor j contributes j**3 * (x // j) to the grand total.

def better(x):
    return sum(j**3 * (x // j) for j in range(1, x+1))

That runs much faster, but still takes time proportional to x. There are lower-level tricks you can play to speed that in turn by constant factors, but they still take O(x) time overall. For example, note that x // j == 1 for all j such that x // 2 < j <= x. So about half the terms in the sum can be skipped, replaced by a closed-form expression for a sum of consecutive cubes:

def sum3(x):
    """Return sum(i**3 for i in range(1, x+1))"""
    return (x * (x+1) // 2)**2

def better2(x):
    result = sum(j**3 * (x // j) for j in range(1, x//2 + 1))
    result += sum3(x) - sum3(x//2)
    return result

better2() is about twice as fast as better(), but to get faster than O(x) would require deeper insight.

Quicker

Thinking about this in spare moments, I still don't have a truly clever idea. But the last idea I gave can be carried to a logical conclusion: don't just group together divisors with only one multiple in range, but also those with two multiples in range, and three, and four, and ... That leads to better3() below, which does a number of operations roughly proportional to the square root of x:

def better3(x):
    result = 0
    for i in range(1, x+1):
        q1 = x // i  # value i has q1 multiples in range
        result += i**3 * q1
        # which values have i multiples?
        q2 = x // (i+1) + 1
        assert x // q1 == i == x // q2
        if i < q2:
            result += i * (sum3(q1) - sum3(q2 - 1))
        if i+1 >= q2:  # this becomes true when i reaches roughly sqrt(x)
            break
    return result

Of course O(sqrt(x)) is an enormous improvement over the original O(x**2), but for very large arguments it's still impractical. For example better3(10**6) appears to complete instantly, but better3(10**12) takes a few seconds, and better3(10**16) is time for a coffee break ;-)

Note: I'm using Python 3. If you're using Python 2, use xrange() instead of range().

One more

better4() has the same O(sqrt(x)) time behavior as better3(), but does the summations in a different order that allows for simpler code and fewer calls to sum3(). For "large" arguments, it's about 50% faster than better3() on my box.

def better4(x):
    result = 0
    for i in range(1, x+1):
        d = x // i
        if d >= i:
            # d is the largest divisor that appears `i` times, and
            # all divisors less than `d` also appear at least that
            # often. Account for one occurrence of each.
            result += sum3(d)
        else:
            i -= 1
            lastd = x // i
            # We already accounted for i occurrences of all divisors
            # < lastd, and all occurrences of divisors >= lastd.
            # Account for the rest.
            result += sum(j**3 * (x // j - i) for j in range(1, lastd))
            break
    return result

It may be possible to do better by extending the algorithm in "A Successive Approximation Algorithm for Computing the Divisor Summatory Function". That takes O(cube_root(x)) time for the possibly simpler problem of summing the number of divisors.
But it's much more involved, and I don't care enough about this problem to pursue it myself ;-)

Subtlety

There's a subtlety in the math that's easy to miss, so I'll spell it out, but only as it pertains to better4(). After d = x // i, the comment claims that d is the largest divisor that appears i times. But is that true? The actual number of times d appears is x // d, which we did not compute. How do we know that x // d in fact equals i?

That's the purpose of the if d >= i: guarding that comment. After d = x // i we know that x == d*i + r for some integer r satisfying 0 <= r < i. That's essentially what floor division means. But since d >= i is also known (that's what the if test ensures), it must also be the case that 0 <= r < d. And that's how we know x // d is i.

This can break down when d >= i is not true, which is why a different method needs to be used then. For example, if x == 500 and i == 51, d (x // i) is 9, but it's certainly not the case that 9 is the largest divisor that appears 51 times. In fact, 9 appears 500 // 9 == 55 times.

While for positive real numbers d == x/i if and only if i == x/d, that's not always so for floor division. But, as above, the first does imply the second if we also know that d >= i.

Just for Fun

better5() rewrites better4() for about another 10% speed gain. The real pedagogical point is to show that it's easy to compute all the loop limits in advance. Part of the point of the odd code structure above is that it magically returns 0 for a 0 input without needing to test for that. better5() gives up on that:

def isqrt(n):
    "Return floor(sqrt(n)) for int n > 0."
    g = 1 << ((n.bit_length() + 1) >> 1)
    d = n // g
    while d < g:
        g = (d + g) >> 1
        d = n // g
    return g

def better5(x):
    assert x > 0
    u = isqrt(x)
    v = x // u
    return (sum(map(sum3, (x // d for d in range(1, u+1))))
            + sum(x // i * i**3 for i in range(1, v))
            - u * sum3(v-1))
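A small consistency check (mine, not part of the answer; it assumes the better*() functions above are in scope) that all the variants agree:

for x in range(1, 500):
    assert better(x) == better2(x) == better3(x) == better4(x) == better5(x)
print("all variants agree")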
def sum_divisors(n):
    total = 0
    for i in range(1, n):
        if n % i == 0 and n != 0:
            total = total + i
    # Return the sum of all divisors of n, not including n
    return total

print(sum_divisors(0))    # 0
print(sum_divisors(3))    # Should be sum of 1 -> 1
print(sum_divisors(36))   # Should be sum of 1+2+3+4+6+9+12+18 -> 55
print(sum_divisors(102))  # Should be sum of 1+2+3+6+17+34+51 -> 114
How does Big O notation work?
Ok so I'm fairly new to coding, and I am to approximate a WCET T(a, b) and the complexity of a function. Example function:

def testFunction(self):
    x = 0
    for r in range(a):
        for c in range(b):
            if testFunction2(r, c):
                x = x + 1
    return x

I understand that the complexity of this function is quadratic O(N^2), but I'm not sure about approximating the WCET. Also, aren't there only two assignments in that function, namely x = 0 and x = x + 1? If so, how do I express the assignments with T(a, b)? Maths has never been my strong point, but I want to learn how to do this. None of the materials I've read explains it in a way I understand. Thanks in advance.
def testFunction(self):
    x = 0                            # 1
    for r in range(a):               # a
        for c in range(b):           # b
            if testFunction2(r, c):  # a*b
                x = x + 1            # depends on testFunction2
    return x                         # 1

The WCET for this function is dominated by a*b; with a = b = n you can say it is O(n^2). If testFunction2 always returns True, then x = x + 1 will execute a*b times, but that won't change the order of the total execution time. Finally, you sum all these execution counts:

1 + a + b + a*b + a*b + 1 = 2 + a + b + 2*a*b

For example, with n = 1000 and a = b = n:

2 + 1000 + 1000 + 2*1000*1000 = 2002 + 2000000

so when you evaluate this result you will see that 2002 is nothing next to 2000000.
For a Worst Case Execution Time, you can simply assume there is an input specifically crafted to make your program slow. In this case, that would be testFunction2 always returning true. Within the body of the loop, the assignment x = x + 1 happens a * b times in the worst case. Instead of describing this as O(N^2), I would describe it as O(ab), and then note that for a ~= b ~= N it is O(N^2).
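A quick empirical sketch of that worst case (names follow the question; this testFunction2 is a stand-in that always takes the branch):

def testFunction2(r, c):
    return True  # worst-case input: the branch is always taken

a, b = 1000, 1000
x = 0
for r in range(a):
    for c in range(b):
        if testFunction2(r, c):
            x = x + 1
print(x)  # 1000000 == a*b increments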