I’m trying to find ‘highly composite’ pythagorean triples - numbers (c) that have more than one unique a,b (in the naturals) that satisfy a² + b² = c².
I’ve written a short python script to find these - it cycles through c in the range (0,1000), and for each c, finds all possible (a,b) such that b < a < c. This is a more brute force method, and I know if I did some reading on number theory I could find some more methods for different cases of a and b.
I have a feeling that my script isn’t particularly efficient, especially for large c. I don’t really know what to change or how to make it more efficient.
I’d be really grateful for any help or pointers!
a = 0
b = 0
l=[]
for i in range (0,1000):
#i is our c.
while a<i:
while b<a:
#for each a, we cycle through b = 1, b = 2, … until b = a.
#Then we make b = 0 and a = a+1, and start the iterative process again.
if a*a + b*b == i*i:
l.append(a)
l.append(b)
#I tried adding a break here - my thought process was that we can’t find any
#other b^2 that satisfies a^2 + b^2 = i^2 without changing our a^2. This
#actually made the runtime longer, and I don’t know why.
b = b+1
a = a+1
b = 0
if len(l) > 4:
#all our pairs of pythagorean triples, with the c at the end.
print(l, i)
#reset, and find pairs again for i = i+1.
l = []
b = 0
a = 0
Your code seems quite inefficient, because you are doing many times the same computations. You could make it more efficient by not calculating things that are not useful. The most important detail is the computation of a and b. You are looping through all possible values for a and b and checking if it's a pythagorean triplet. But once you give yourself a value for a, there is only one possible choice for b, so the b loop is useless.
By removing that loop, you're basically lowering the degree of the polynomial complexity by one, which will make it increasingly faster (compared to your current script) when c grows
Also, your code seems to be wrong, as it misses some triplets. I ran it and the first triplets found were with 65 and 85, but 25, 50 and 75 are also highly composite pythagoren triplets. That's because you're checking len(l)>4, while you should check len(l)>=4 instead because you're missing numbers that have two decompositions.
As a comparison, I programmed a similar python script as yours (except I did it myself and tried to make it as efficient as possible). On my computer, your script ran in 66 seconds, while mine ran in 4 seconds, so you have a lot of room for improvement.
EDIT : I added my code for the sake of sharing. Here is a list of what differs from yours :
I stored all squares of numbers from 1 to N in a list called squares so I can check efficiently if a number is a square
I store the results in a dictionary where the value at key c is a list of tuples corresponding to (a, b)
The loop for a goes from 1 to floor(c/sqrt(2))
Instead of looping for b, I check whether c²-a² is a square
On a general note, I pre-compute every value that has to be used several times (invsqrt2, csqr)
from math import floor, sqrt
invsqrt2 = 1/sqrt(2)
N=1000
highly_composite_triplets = {}
squares = list(map(lambda x: x**2, range(0,N+1)))
for c in range(2,N+1):
if c%50==0: print(c) # Just to keep track of the thing
csqr = c**2
listpairs = []
for a in range(1,floor(c*invsqrt2)+1):
sqrdiff = csqr-a**2
if sqrdiff in squares:
listpairs.append((a, squares.index(sqrdiff)))
if len(listpairs)>1:
highly_composite_triplets[c] = listpairs
print(highly_composite_triplets)
First of all, and as already mentioned, you should fix that > 4 by >= 4.
For performance, I would suggest using the Tree of primitive Pythagorean triples. It allows to generate all possible primitive triples, such that three "children" of a given triple have a c-value that is at least as great as the one of the "parent".
The non-primitive triples can be easily generated from a primitive one, by multiplying all three values with a coefficient (until the maximum value of c is reached). This has to only be done for the initial triplet, as the others will follow from it.
That is the part where most efficiency gain is made.
Then in a second phase: group those triples by their c value. You can use itertools.groupby for that.
In a third phase: only select the groups that have at least 2 members (i.e. 4 values).
Here is an implementation:
import itertools
import operator
def pythagorian(end):
# DFS traversal through the pythagorian tree:
def recur(a, b, c):
if c < end:
yield c, max(a, b), min(a, b)
yield from recur( a - 2*b + 2*c, 2*a - b + 2*c, 2*a - 2*b + 3*c)
yield from recur( a + 2*b + 2*c, 2*a + b + 2*c, 2*a + 2*b + 3*c)
yield from recur(-a + 2*b + 2*c, -2*a + b + 2*c, -2*a + 2*b + 3*c)
# Start traversal from basic triplet, and its multiples
for i in range(1, end // 5):
yield from recur(4*i, 3*i, 5*i)
def grouped_pythagorian(end):
# Group by value of c, and flatten the a, b pairs into a list
return [
(c, [a for _, *ab in group for a in ab])
for c, group in itertools.groupby(sorted(pythagorian(end)),
operator.itemgetter(0))
]
def highly_pythagorian(end):
# Select the groups of triples that have at least 2 members (i.e. 4 values)
return [(group, c) for c, group in grouped_pythagorian(end) if len(group) >= 4]
Run the function as follows:
for result in highly_pythagorian(1000):
print(*result)
This produces the triples within a fraction of a second, and is thousands of times faster than your version and the one in #Mateo's answer.
Simplified
As discussed in comments, I provide here code that uses the same algorithm, but without imports, list comprehensions, generators (yield), and unpacking operators (*):
def highly_pythagorian(end):
triples = []
# DFS traversal through the pythagorian tree:
def dfs(a, b, c):
if c < end:
triples.append((c, max(a, b), min(a, b)))
dfs( a - 2*b + 2*c, 2*a - b + 2*c, 2*a - 2*b + 3*c)
dfs( a + 2*b + 2*c, 2*a + b + 2*c, 2*a + 2*b + 3*c)
dfs(-a + 2*b + 2*c, -2*a + b + 2*c, -2*a + 2*b + 3*c)
# Start traversal from basic triplet, and its multiples
for i in range(1, end // 5):
dfs(4*i, 3*i, 5*i)
# Sort the triples by their c-component (first one),
# ...and then their a-component
triples.sort()
# Group the triples in a dict, keyed by c values
groups = {}
for c, a, b in triples:
if not c in groups:
groups[c] = []
groups[c].append(a)
groups[c].append(b)
# Select the groups of triples that have at least 2 members (i.e. 4 values)
results = []
for c, ab_pairs in sorted(groups.items()):
if len(ab_pairs) >= 4:
results.append((ab_pairs, c))
return results
Call as:
for ab_pairs, c in highly_pythagorian(1000):
print(ab_pairs, c)
Here is a solution based on the mathematical intuition behind Gaussian integers. We are working in the "ring" R of all numbers of the form
a + ib
where a, b are integers. This is the ring of Gaussian integers.
Here, i is the square root of -1. So i² = -1.
Such numbers lead to a similar arithmetic as in the case of the (usual) integers. Each such number has a unique decomposition in gaussian primes. (Up to the order of the factors.) Such a domain is called a unique factorization domain, UFD.
Which are the primes in R? (Those elements that cannot be split multiplicatively in more than two non-invertible pieces.) There is a concrete characterization for them.
The classical primes of the shapes 4k + 3 remain primes in R, are inert. So we cannot split primes like 3, 7, 11, 19, 23, 31, ... in R. But we can always split uniquely (up to unit conjugation, a unit being one among 1, -1, i, -i) the (classical) primes of the shape 4k + 1 in R. For instance:
(*)
5 = (2 + i)(2 - i)
13 = (3 + 2i)(3 - 2i)
17 = (4 + i)(4 - i)
29 = (5 + 2i)(5 - 2i)
37 = (6 + i)(6 - i)
41 = (5 + 4i)(5 - 4i)
53 = (7 + 2i)(7 - 2i)
61 = (6 + 5i)(6 - 5i)
and so on, i hope the scheme is clear. For our purpose, the remained prime two is the oddest prime. Since we have its decomposition
2 = (1 + i)(1 -i), where the two Gaussian primes (1 + i) and (1 - i) are associated, multiplying with a unit bring one in the other one. I will avoid this prime below.
Now consider the product of some of the numbers on the L.H.S. in (*). For instance 5.5.13.17 = 5525 - and let us pick from each of the four (classical) prime factors one of the Gaussian primes inside.
We may thus pick (2 + i) twice from the two 5-factors, (3 - 2i) from 13 and (4 + i) from the 17. We multiply and get:
sage: (2 + i)^2 * (3 - 2*i) * (4 + i)
41*I + 62
And indeed, a = 41 and b = 62 is a solution of 41² + 62² = 5525. Unfortunately 5525 is not a square. OK, let us start with a square, one like
1105² = 5².13².17² = (2+i)²(2-i)² . (3+2i)²(3-2i)² . (4+i)²(4-i)²
and now separate the factors in "two parts", so that in one part we have some factors, and in the other part the conjugates. Here are the possibilities for 25 = 5²:
(2+i)² and (2-i)²
5 and 5
(2-i)² and (2+i)²
There are three possibilities. Do the same for the other two squares, then combine. For instance:
sage: (2 + i)^2 * (3 - 2*i)^2 * 17
-272*I + 1071
And indeed, 272² + 1071² = 1105² . This solution is not "primitive", in the sense that 17 is a divisor of the three involved numbers, 272, 1071, 1105. Well, this happens because we took the factor 17 from the separation of 17² in two (equal) parts. To get some other solutions, we may take
each possible first part from 5² with...
each possible first part from 13² with...
each possible first part from 17²
and thus get "many solutions". Here are they:
sage: [ (m, n) for m in range(1, 1105) for n in range(1, 1105)
....: if m <= n and m2 + n2 == 1105**2 ]
[(47, 1104),
(105, 1100),
(169, 1092),
(264, 1073),
(272, 1071),
(425, 1020),
(468, 1001),
(520, 975),
(561, 952),
(576, 943),
(663, 884),
(700, 855),
(744, 817)]
We expect 3.3.3 solutions. One of them is the trivial one, 1105² = 1105² + 0².
The other solutions of 1105² = a² + b² may be arranged to have a < b. (No chance to get equality.) So we expect (27 - 1)/2 = 13 solutions, yes, the ones above.
Which solution is produced by taking the "first parts" as follows: (2 + i)^2 * (3 - 2*i)^2 * (4 + i)^2 ?!
sage: (2 + i)^2 * (3 - 2*i)^2 * (4 + i)^2
264*I + 1073
And indeed, (264, 1073) is among the solutions above.
So if getting "highly composite" numbers is the issue, with an accent on highly, then just pick for c such a product of primes of the shape 4k + 1.
For instance c = 5³.13.17 or c = 5.13.17.29. Then compute all representations c² = (a + ib)(a - ib) = a² + b² best by using the UFD property of the Gaussian integers.
For instance, in a python3 dialog with the interpreter...
In [16]: L25 = [complex(2, 1)**4, complex(2, 1)**2 * 5, 25, complex(2, -1)**2 * 5, complex(2, -1)**4]
In [17]: L13 = [complex(3, 2)**2, 13, complex(3, -2)**2]
In [18]: L17 = [complex(4, 1)**2, 17, complex(4, -1)**2]
In [19]: solutions = []
In [20]: for z1 in L25:
...: for z2 in L13:
...: for z3 in L17:
...: z = z1 * z2 * z3
...: a, b = int(abs(z.real)), int(abs(z.imag))
...: if a > b:
...: a, b = b, a
...: solutions.append((a, b))
...:
In [21]: solutions = list(set(solutions))
In [22]: solutions.sort()
In [23]: len(solutions)
Out[23]: 23
In [24]: solutions
Out[24]:
[(0, 5525),
(235, 5520),
(525, 5500),
(612, 5491),
(845, 5460),
(1036, 5427),
(1131, 5408),
(1320, 5365),
(1360, 5355),
(1547, 5304),
(2044, 5133),
(2125, 5100),
(2163, 5084),
(2340, 5005),
(2600, 4875),
(2805, 4760),
(2880, 4715),
(3124, 4557),
(3315, 4420),
(3468, 4301),
(3500, 4275),
(3720, 4085),
(3861, 3952)]
We have 23 = 22 + 1 solutions. The last one is the trivial one. All other solutions (a, b) listed have a < b, so there are totally 1 + 22*2 = 45 = 5 * 3 * 3, as expected from the triple for loop above. A similar code can be written for c = 5 * 13 * 17 * 29 = 32045 leading to (3^4 - 1)/2 = 40 non-trivial solutions.
In [26]: L5 = [complex(2, 1)**2, 5, complex(2, -1)**2]
In [27]: L13 = [complex(3, 2)**2, 13, complex(3, -2)**2]
In [28]: L17 = [complex(4, 1)**2, 17, complex(4, -1)**2]
In [29]: L29 = [complex(5, 2)**2, 29, complex(5, -2)**2]
In [30]: z_list = [z1*z2*z3*z4
...: for z1 in L5 for z2 in L13
...: for z3 in L17 for z4 in L29]
In [31]: ab_list = [(int(abs(z.real)), int(abs(z.imag))) for z in z_list]
In [32]: len(ab_list)
Out[32]: 81
In [33]: ab_list = list(set([(min(a, b), max(a, b)) for (a, b) in ab_list]))
In [34]: ab_list.sort()
In [35]: len(ab_list)
Out[35]: 41
In [36]: ab_list[:10]
Out[36]:
[(0, 32045),
(716, 32037),
(1363, 32016),
(2277, 31964),
(2400, 31955),
(3045, 31900),
(3757, 31824),
(3955, 31800),
(4901, 31668),
(5304, 31603)]
(Feel free to also use powers of two in c.)
#There is a general formula for pythagoran triples
take 2 numbers, m & n where m > n
a = (m^2) - (n^2)
b = 2mn
c = (m^2) + (n^2)
That will always give you a pythagoran triple. Its more efficient but it might not be what you're looking for.
I would like SymPy to evaluate an expression like the following:
How would I define the symbols and the expression so that SymPy could handle it nicely? I would like to keep N as just a symbol, i.e. not make an actual finite list of x's. I have tried various combinations of IndexedBase and Sum /Product, but didn't get it working right.
Ideally it would be this:
x = IndexedBase("x")
i, j, N = symbols("i j N")
expr = Sum(Product(exp(-x[j]**2), (j, 1, N)).diff(x[i]), (i, 1, N))
So far this is unevaluated, expr is
Sum(Derivative(Product(exp(-x[j]**2), (j, 1, N)), x[i]), (i, 1, N))
The method doit can be used to evaluate it. Unfortunately the differentiation of a product doesn't quite work yet: expr.doit() returns
N*Derivative(Product(exp(-x[j]**2), (j, 1, N)), x[i])
Rewriting the product as the sum prior to differentiation helps:
expr = Sum(Product(exp(-x[j]**2), (j, 1, N)).rewrite(Sum).diff(x[i]), (i, 1, N))
expr.doit()
returns
Sum(Piecewise((-2*exp(Sum(log(exp(-x[j]**2)), (j, 1, N)))*x[i], (1 <= i) & (i <= N)), (0, True)), (i, 1, N))
which is the correct result of differentiation. Sadly we have that extraneous condition in Piecewise, and also log(exp(...)) that should have been simplified. SymPy doesn't infer that (1 <= i) & (i <= N) is True from the context of the outer sum, and it also hesitates to simplify log(exp thinking x[j] might be complex. So I resort to surgical procedure with Piecewise, replacing it by the first piece, and to forcefully expanding logs:
e = expr.doit()
p = next(iter(e.atoms(Piecewise)))
e = expand_log(e.xreplace({p: p.args[0][0]}), force=True)
Now e is
Sum(-2*exp(Sum(-x[j]**2, (j, 1, N)))*x[i], (i, 1, N))
Couldn't get exp(Sum(..)) to become a Product again, unfortunately.