Been trying to implement Rabin-Miller Strong Pseudoprime Test today.
Have used Wolfram Mathworld as reference, lines 3-5 sums up my code pretty much.
However, when I run the program, it says (sometimes) that primes (even low such as 5, 7, 11) are not primes. I've looked over the code for a very long while and cannot figure out what is wrong.
For help I've looked at this site aswell as many other sites but most use another definition (probably the same, but since I'm new to this kind of math, I can't see the same obvious connection).
My Code:
import random
def RabinMiller(n, k):
# obviously not prime
if n < 2 or n % 2 == 0:
return False
# special case
if n == 2:
return True
s = 0
r = n - 1
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
# k = accuracy
for i in range(k):
a = random.randrange(1, n)
# a^(s) mod n = 1?
if pow(a, s, n) == 1:
return True
# a^(2^(j) * s) mod n = -1 mod n?
for j in range(r):
if pow(a, 2**j*s, n) == -1 % n:
return True
return False
print(RabinMiller(7, 5))
How does this differ from the definition given at Mathworld?
1. Comments on your code
A number of the points I'll make below were noted in other answers, but it seems useful to have them all together.
In the section
s = 0
r = n - 1
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
you've got the roles of r and s swapped: you've actually factored n − 1 as 2sr. If you want to stick to the MathWorld notation, then you'll have to swap r and s in this section of the code:
# factor n - 1 as 2^(r)*s, where s is odd.
r, s = 0, n - 1
while s % 2 == 0:
r += 1
s //= 2
In the line
for i in range(k):
the variable i is unused: it's conventional to name such variables _.
You pick a random base between 1 and n − 1 inclusive:
a = random.randrange(1, n)
This is what it says in the MathWorld article, but that article is written from the mathematician's point of view. In fact it is useless to pick the base 1, since 1s = 1 (mod n) and you'll waste a trial. Similarly, it's useless to pick the base n − 1, since s is odd and so (n − 1)s = −1 (mod n). Mathematicians don't have to worry about wasted trials, but programmers do, so write instead:
a = random.randrange(2, n - 1)
(n needs to be at least 4 for this optimization to work, but we can easily arrange that by returning True at the top of the function when n = 3, just as you do for n = 2.)
As noted in other replies, you've misunderstood the MathWorld article. When it says that "n passes the test" it means that "n passes the test for the base a". The distinguishing fact about primes is that they pass the test for all bases. So when you find that as = 1 (mod n), what you should do is to go round the loop and pick the next base to test against.
# a^(s) = 1 (mod n)?
x = pow(a, s, n)
if x == 1:
continue
There's an opportunity for optimization here. The value x that we've just computed is a20 s (mod n). So we could test it immediately and save ourselves one loop iteration:
# a^(s) = ±1 (mod n)?
x = pow(a, s, n)
if x == 1 or x == n - 1:
continue
In the section where you calculate a2j s (mod n) each of these numbers is the square of the previous number (modulo n). It's wasteful to calculate each from scratch when you could just square the previous value. So you should write this loop as:
# a^(2^(j) * s) = -1 (mod n)?
for _ in range(r - 1):
x = pow(x, 2, n)
if x == n - 1:
break
else:
return False
It's a good idea to test for divisibility by small primes before trying Miller–Rabin. For example, in Rabin's 1977 paper he says:
In implementing the algorithm we incorporate some laborsaving steps. First we test for divisibility by any prime p < N, where, say N = 1000.
2. Revised code
Putting all this together:
from random import randrange
small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31] # etc.
def probably_prime(n, k):
"""Return True if n passes k rounds of the Miller-Rabin primality
test (and is probably prime). Return False if n is proved to be
composite.
"""
if n < 2: return False
for p in small_primes:
if n < p * p: return True
if n % p == 0: return False
r, s = 0, n - 1
while s % 2 == 0:
r += 1
s //= 2
for _ in range(k):
a = randrange(2, n - 1)
x = pow(a, s, n)
if x == 1 or x == n - 1:
continue
for _ in range(r - 1):
x = pow(x, 2, n)
if x == n - 1:
break
else:
return False
return True
In addition to what Omri Barel has said, there is also a problem with your for loop. You will return true if you find one a that passes the test. However, all a have to pass the test for n to be a probable prime.
I'm wondering about this piece of code:
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
Let's take n = 7. So n - 1 = 6. We can express n - 1 as 2^1 * 3. In this case r = 1 and s = 3.
But the code above finds something else. It starts with r = 6, so r % 2 == 0. Initially, s = 0 so after one iteration we have s = 1 and r = 3. But now r % 2 != 0 and the loop terminates.
We end up with s = 1 and r = 3 which is clearly incorrect: 2^r * s = 8.
You should not update s in the loop. Instead, you should count how many times you can divide by 2 (this will be r) and the result after the divisions will be s. In the example of n = 7, n - 1 = 6, we can divide it once (so r = 1) and after the division we end up with 3 (so s = 3).
Here's my version:
# miller-rabin pseudoprimality checker
from random import randrange
def isStrongPseudoprime(n, a):
d, s = n-1, 0
while d % 2 == 0:
d, s = d/2, s+1
t = pow(a, d, n)
if t == 1:
return True
while s > 0:
if t == n - 1:
return True
t, s = pow(t, 2, n), s - 1
return False
def isPrime(n, k):
if n % 2 == 0:
return n == 2
for i in range(1, k):
a = randrange(2, n)
if not isStrongPseudoprime(n, a):
return False
return True
If you want to know more about programming with prime numbers, I modestly recommend this essay on my blog.
You should also have a look at Wikipedia, where known "random" sequences gives guaranteed answers up to a given prime.
if n < 1,373,653, it is enough to test a = 2 and 3;
if n < 9,080,191, it is enough to test a = 31 and 73;
if n < 4,759,123,141, it is enough to test a = 2, 7, and 61;
if n < 2,152,302,898,747, it is enough to test a = 2, 3, 5, 7, and 11;
if n < 3,474,749,660,383, it is enough to test a = 2, 3, 5, 7, 11, and 13;
if n < 341,550,071,728,321, it is enough to test a = 2, 3, 5, 7, 11, 13, and 17;
Related
I am currently stuck on Project Euler problem 60. The problem goes like this:
The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property.
Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime.
In Python, I used and created the following functions.
def isprime(n):
"""Primality test using 6k+-1 optimization."""
if n <= 3:
return n > 1
elif n % 2 == 0 or n % 3 == 0:
return False
i = 5
while i * i <= n:
if n % i == 0 or n % (i + 2) == 0:
return False
i += 6
return True
def prime_pair_sets(n):
heap = [[3,7]]
x = 11
#Phase 1: Create a list of length n.
while getMaxLength(heap)<n:
if isprime(x):
m = [x]
for lst in heap:
tmplst = [x]
for p in lst:
if primePair(x,p):
tmplst.append(p)
if len(tmplst)>len(m):
m=tmplst
heap.append(m)
x+=2
s = sum(maxList(heap))
#Phase 2: Find the lowest sum.
for li in heap:
y=x
while s>(sum(li)+y) and len(li)<n:
b = True
for k in li:
if not primePair(k,y):
b = False
if b == True:
li.append(y)
y+=2
if len(li)>=n:
s = sum(li)
return s
def getMaxLength(h):
m = 0
for s in h:
if len(s) > m:
m = len(s)
return m
def primePair(x, y):
return isprime(int(str(x)+str(y))) and isprime(int(str(y)+str(x)))
def maxList(h):
result = []
for s in h:
if len(s)> len(result):
result = s
return result
I executed the primePairSets function using the first phase only. After about an hour of waiting I got the sum of 74,617 (33647 + 23003 + 16451 + 1493 + 23). It turns out that it is not the sum we're looking for. I tried the problem again using both phases. After about two hours, I end up with the same wrong result. Obviously, the answer is less than 74,617. Can somebody help me find a more efficient way to solving this problem. Please tell me how to solve it.
Solution found
Primes: 13, 5197, 5701, 6733, 8389
Sum: 26,033
Run time reduced to ~10 seconds vs. 1-2 hours reported by OP solution.
Approach
Backtracking algorithm based upon extending path containing list of primes
A prime can be added to the path if its pairwise prime with all primes already in path
Use Sieve of Eratosthenes to precompute primes up to max we expect to need
For each prime, precompute which other primes it is pairwise prime with
Code
import functools
import time
# Decorator for function timing
def timer(func):
"""Print the runtime of the decorated function"""
#functools.wraps(func)
def wrapper_timer(*args, **kwargs):
start_time = time.perf_counter() # 1
value = func(*args, **kwargs)
end_time = time.perf_counter() # 2
run_time = end_time - start_time # 3
print(f"Finished {func.__name__!r} in {run_time:.4f} secs")
return value
return wrapper_timer
#*************************************************************
# Prime number algorithms
#-------------------------------------------------------------
def _try_composite(a, d, n, s):
if pow(a, d, n) == 1:
return False
for i in range(s):
if pow(a, 2**i * d, n) == n-1:
return False
return True # n is definitely composite
def is_prime(n, _precision_for_huge_n=16, _known_primes = set([2, 3, 5, 7])):
if n in _known_primes:
return True
if n > 6 and not (n % 6) in (1, 5):
# Check of form 6*q +/- 1
return False
if any((n % p) == 0 for p in _known_primes) or n in (0, 1):
return False
d, s = n - 1, 0
while not d % 2:
d, s = d >> 1, s + 1
# Returns exact according to http://primes.utm.edu/prove/prove2_3.html
if n < 1373653:
return not any(_try_composite(a, d, n, s) for a in (2, 3))
if n < 25326001:
return not any(_try_composite(a, d, n, s) for a in (2, 3, 5))
if n < 118670087467:
if n == 3215031751:
return False
return not any(_try_composite(a, d, n, s) for a in (2, 3, 5, 7))
if n < 2152302898747:
return not any(_try_composite(a, d, n, s) for a in (2, 3, 5, 7, 11))
if n < 3474749660383:
return not any(_try_composite(a, d, n, s) for a in (2, 3, 5, 7, 11, 13))
if n < 341550071728321:
return not any(_try_composite(a, d, n, s) for a in (2, 3, 5, 7, 11, 13, 17))
# otherwise
return not any(_try_composite(a, d, n, s)
for a in _known_primes[:_precision_for_huge_n])
def primes235(limit):
' Prime generator using Sieve of Eratosthenes with factorization wheel of 2, 3, 5 '
# Source: https://rosettacode.org/wiki/Sieve_of_Eratosthenes#Python
yield 2; yield 3; yield 5
if limit < 7: return
modPrms = [7,11,13,17,19,23,29,31]
gaps = [4,2,4,2,4,6,2,6,4,2,4,2,4,6,2,6] # 2 loops for overflow
ndxs = [0,0,0,0,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5,5,5,6,6,7,7,7,7,7,7]
lmtbf = (limit + 23) // 30 * 8 - 1 # integral number of wheels rounded up
lmtsqrt = (int(limit ** 0.5) - 7)
lmtsqrt = lmtsqrt // 30 * 8 + ndxs[lmtsqrt % 30] # round down on the wheel
buf = [True] * (lmtbf + 1)
for i in range(lmtsqrt + 1):
if buf[i]:
ci = i & 7; p = 30 * (i >> 3) + modPrms[ci]
s = p * p - 7; p8 = p << 3
for j in range(8):
c = s // 30 * 8 + ndxs[s % 30]
buf[c::p8] = [False] * ((lmtbf - c) // p8 + 1)
s += p * gaps[ci]; ci += 1
for i in range(lmtbf - 6 + (ndxs[(limit - 7) % 30])): # adjust for extras
if buf[i]: yield (30 * (i >> 3) + modPrms[i & 7])
def prime_pair(x, y):
' Checks if two primes are prime pair (i.e. concatenation of two in either order is also a prime)'
return is_prime(int(str(x)+str(y))) and is_prime(int(str(y)+str(x)))
def find_pairs(primes):
' Creates dictionary of what primes can go with others as a pair'
prime_pairs = {}
for i, p in enumerate(primes):
pairs = set()
for j in range(i+1, len(primes)):
if prime_pair(p, primes[j]):
pairs.add(primes[j])
prime_pairs[p] = pairs
return prime_pairs
#*************************************************************
# Main functionality
#-------------------------------------------------------------
#timer
def find_groups(max_prime = 9000, n = 5):
'''
Find group smallest sum of primes that are pairwise prime
max_prime - max prime to consider
n - the size of the group
'''
def fully_connected(p, path):
'''
checks if p is connected to all elements of the path
(i.e. group of primes)
'''
return all(p in prime_pairs.get(path_item, set()) for path_item in path)
def backtracking(prime_pairs, n, path = None):
if path is None:
path = []
if len(path) == n:
yield path[:]
else:
if not path:
for p, v in prime_pairs.items():
if v:
yield from backtracking(prime_pairs, n, path + [p])
else:
p = path[-1]
for t in sorted(prime_pairs[p]):
if t > p and fully_connected(t, path):
yield from backtracking(prime_pairs, n, path + [t])
primes = list(primes235(max_prime)) # Sieve for list of primes up to max_pair
set_primes = set(primes) # set of primes (for easy test if number is prime)
prime_pairs = find_pairs(primes) # Table of primes and set of other primes they can pair with
return next(backtracking(prime_pairs, n), None)
Test Runs
for n, max_prime in [(2, 1000), (3, 1000), (4, 1000), (5, 9000)]:
print(find_groups(max_prime = max_prime, n = n))
Run Times
n max_prime Primes Found Run Time (secs)
2 1000 (3, 7) 0.1280
3 1000 (3, 37, 67) 0.1240
4 1000 (3, 7, 109, 673) 0.1233
5 40,000 (13, 5197, 5701, 6733, 8389) 7.1142
Note: Timings above performed on an older Windows desktop computer,
namely:
~7-year-old HP-Pavilion Desktop i7 CPU 920 # 2.67 GHz
I wrote this code based on an sieve of Eratosthenes which only stores the numbers 1 mod 6 and -1 mod 6. It is very fast.
def find_lowest_sum (pmax,num):
n=int(str(pmax)+str(pmax))
sieve5m6 = [True] * (n//6+1)
sieve1m6 = [True] * (n//6+1)
for i in range(1,int((n**0.5+1)/6)+1):
if sieve5m6[i]:
sieve5m6[6*i*i::6*i-1]=[False]*(((n//6+1)-6*i*i-1)//(6*i-1)+1)
sieve1m6[6*i*i-2*i::6*i-1]=[False]*(((n//6+1)-6*i*i+2*i-1)//(6*i-1)+1)
if sieve1m6[i]:
sieve5m6[6*i*i::6*i+1]=[False]*(((n//6+1)-6*i*i-1)//(6*i+1)+1)
sieve1m6[6*i*i+2*i::6*i+1]=[False]*(((n//6+1)-6*i*i-2*i-1)//(6*i+1)+1)
def test_concatenate (p1,p2):
ck=0
p3=int(str(p1)+str(p2))
if sieve1m6[(p3-1)//6] and p3%6==1:
ck=1
elif sieve5m6[(p3+1)//6]and p3%6==5:
ck=1
if ck==1:
p3=int(str(p2)+str(p1))
if sieve1m6[(p3-1)//6] and p3%6==1:
ck=2
elif sieve5m6[(p3+1)//6]and p3%6==5:
ck=2
if ck==2:
return True
else:
return False
kmax=(pmax+1)//6
s1=5*pmax
P1=[]
P2=[]
p=3
kmin=0
while p<s1//5:
if p==3 or(p%6==1 and sieve1m6[(p-1)//6])or(p%6==5 and sieve5m6[(p+1)//6]):
P=[]
if p%6==1:
kmin=p//6
elif p%6==5:
kmin=(p+1)//6
if sieve1m6[kmin]:
if test_concatenate(p,6*kmin+1):
P.append(6*kmin+1)
for k in range(kmin+1,kmax):
if sieve5m6[k]:
if test_concatenate(p,6*k-1):
P.append(6*k-1)
if sieve1m6[k]:
if test_concatenate(p,6*k+1):
P.append(6*k+1)
i1=0
while i1<len(P) and P[i1]<s1:
P1=[p]
s=p
P1.append(P[i1])
s+=P[i1]
for i2 in range(i1+1,len(P)):
for i3 in range(1,len(P1)):
ck=test_concatenate(P[i2],P1[i3])
if ck==False:
break
if len(P1)-1==i3 and ck==True:
P1.append(P[i2])
s+=P[i2]
if len(P1)==num:
if s<s1:
P2=P1
s1=s
break
i1+=1
p+=2
return P2
P=find_lowest_sum(9001,5)
print(P)
s=0
for i in range(0,len(P)):
s+=P[i]
print(s)
I've implemented Miller-Rabin primality test and every function seems to be working properly in isolation. However, when I try to find a prime by generating random numbers of 70 bits my program generates in average more than 100000 numbers before finding a number that passes the Miller-Rabin test (10 steps). This is very strange, the probability of being prime for a random odd number of less than 70 bits should be very high (more than 1/50 according to Hadamard-de la Vallée Poussin Theorem). What could be wrong with my code? Would it be possible that the random number generator throws prime numbers with very low probability? I guess not... Any help is very welcome.
import random
def miller_rabin_rounds(n, t):
'''Runs miller-rabin primallity test t times for n'''
# First find the values r and s such that 2^s * r = n - 1
r = (n - 1) / 2
s = 1
while r % 2 == 0:
s += 1
r /= 2
# Run the test t times
for i in range(t):
a = random.randint(2, n - 1)
y = power_remainder(a, r, n)
if y != 1 and y != n - 1:
# check there is no j for which (a^r)^(2^j) = -1 (mod n)
j = 0
while j < s - 1 and y != n - 1:
y = (y * y) % n
if y == 1:
return False
j += 1
if y != n - 1:
return False
return True
def power_remainder(a, k, n):
'''Computes (a^k) mod n efficiently by decomposing k into binary'''
r = 1
while k > 0:
if k % 2 != 0:
r = (r * a) % n
a = (a * a) % n
k //= 2
return r
def random_odd(n):
'''Generates a random odd number of max n bits'''
a = random.getrandbits(n)
if a % 2 == 0:
a -= 1
return a
if __name__ == '__main__':
t = 10 # Number of Miller-Rabin tests per number
bits = 70 # Number of bits of the random number
a = random_odd(bits)
count = 0
while not miller_rabin_rounds(a, t):
count += 1
if count % 10000 == 0:
print(count)
a = random_odd(bits)
print(a)
The reason this works in python 2 and not python 3 is that the two handle integer division differently. In python 2, 3/2 = 1, whereas in python 3, 3/2=1.5.
It looks like you should be forcing integer division in python 3 (rather than float division). If you change the code to force integer division (//) as such:
# First find the values r and s such that 2^s * r = n - 1
r = (n - 1) // 2
s = 1
while r % 2 == 0:
s += 1
r //= 2
You should see the correct behaviour regardless of what python version you use.
I've been trying to implement Dixon's factorization method in python, and I'm a bit confused. I know that you need to give some bound B and some number N and search for numbers between sqrtN and N whose squares are B-smooth, meaning all their factors are in the set of primes less than or equal to B. My question is, given N of a certain size, what determines B so that the algorithm will produce non-trivial factors of N? Here is a wikipedia article about the algorithm, and if it helps, here is my code for my implementation:
def factor(N, B):
def isBsmooth(n, b):
factors = []
for i in b:
while n % i == 0:
n = int(n / i)
if not i in factors:
factors.append(i)
if n == 1 and factors == b:
return True
return False
factor1 = 1
while factor1 == 1 or factor1 == N:
Bsmooth = []
BsmoothMod = []
for i in range(int(N ** 0.5), N):
if len(Bsmooth) < 2 and isBsmooth(i ** 2 % N, B):
Bsmooth.append(i)
BsmoothMod.append(i ** 2 % N)
gcd1 = (Bsmooth[0] * Bsmooth[1]) % N
gcd2 = int((BsmoothMod[0] * BsmoothMod[1]) ** 0.5)
factor1 = gcd(gcd1 - gcd2, N)
factor2 = int(N / factor1)
return (factor1, factor2)
Maybe someone could help clean my code up a bit, too? It seems very inefficient.
This article discusses the optimal size for B: https://web.archive.org/web/20160205002504/https://vmonaco.com/dixons-algorithm-and-the-quadratic-sieve/. Briefly, the optimal value is thought to be exp((logN loglogN)^(1/2)).
[ I wrote this for a different purpose, but you might find it interesting. ]
Given x2 ≡ y2 (mod n) with x ≠ ± y, about half the time gcd(x−y, n) is a factor of n. This congruence of squares, observed by Maurice Kraitchik in the 1920s, is the basis for several factoring methods. One of those methods, due to John Dixon, is important in theory because its sub-exponential run time can be proven, though it is too slow to be useful in practice.
Dixon's method begins by choosing a bound b ≈ e√(log n log log n) and identifying the factor base of all primes less than b that are quadratic residues of n (their jacobi symbol is 1).
function factorBase(n, b)
fb := [2]
for p in tail(primes(b))
if jacobi(n, p) == 1
append p to fb
return fb
Then repeatedly choose an integer r on the range 1 < r < n, calculate its square modulo n, and if the square is smooth over the factor base add it to a list of relations, stopping when there are more relations than factors in the factor base, plus a small reserve for those cases that fail. The idea is to identify a set of relations, using linear algebra, where the factor base primes combine to form a square. Then take the square root of the product of all the factor base primes in the relations, take the product of the related r, and calculate the gcd to identify the factor.
struct rel(x, ys)
function dixon(n, fb, count)
r, rels := floor(sqrt(n)), []
while count > 0
fs := smooth((r * r) % n, fb)
if fs is not null
append rel(r, fs) to rels
count := count - 1
r := r + 1
return rels
A number n is smooth if all its factors are in the factor base, which is determined by trial division; the smooth function returns a list of factors, which is null if n doesn't completely factor over the factor base.
function smooth(n, fb)
fs := []
for f in fb
while n % f == 0
append f to fs
n := n / f
if n == 1 return fs
return []
A factor is determined by submitting the accumulated relations to the linear algebra of the congruence of square solver.
For example, consider the factorization of 143. Choose r = 17, so r2 ≡ 3 (mod 143). Then choose r = 19, so r2 ≡ 75 ≡ 3 · 52. Those two relations can be combined as (17 · 19)2 ≡ 32 · 52 ≡ 152 (mod 143), and the two factors are gcd(17·19 − 15, 143) = 11 and gcd(17·19 + 15, 143) = 13. This sometimes fails; for instance, the relation 212 ≡ 22 (mod 143) can be combined with the relation on 19, but the two factors produced, 1 and 143, are trivial.
Thanks for very interesting question!
In pure Python I implemented from scratch Dixon Factorization Algorithm in 3 different flavors:
Using simplest sieve. I'm creating u64 array with all numbers in range [N; N * 2), which signify z^2 value. This array hold result of multiplication of prime numbers. Then through sieving process I iterate all factor base prime numbers and do array[k] *= p in those k positions that are divisible by p. Finally when sieved array is ready I check both that a) array index k is a perfect square, b) and array[k] == k - N. Second b) condition means that all multiplied p primes give final number, this is only true if number is divisible only by factor-base primes, i.e. it is B-smooth. This is simplest and most slowest out of my 3 solutions.
Second solution uses SymPy library to factorize every z^2. I iterate all possible z and do sympy.factorint(z * z), this gives factorization of z^2. If this factorization contains only small primes, i.e. from factor base, then I collect such z and z^2 for later processing. This version of algorithm is also slow, but much faster than first one.
Third solution uses a kind of sieving used in Quadratic Sieve. This sieving process is fastest of all three algorithms. Basically what it does, it finds all roots of equation x^2 = N (mod p) for all primes in factor base, as I have just few primes root finding is done through simple loop through all variants, for bigger primes one can use Shanks Tonelli algorithm of finding root, which is really fast. Only around 50% of primes give a root solution at all, hence only half of primes are actually used in Quadratic Sieve. Roots of such equation can be used to generate lots of solutions at once, because root + k * p is also a valid solution for all k. Sieving is done through array[offset(root) :: p] += Log2(p). Here instead of multiplication of first algorithm I used adding a logarithm of prime. First it is a bit faster to add a number than to multiply. Secondly, what is more important is that it supports any size of number, e.g. even 256-bit. While multiplying is possible only till 64-bit number, because Numpy has no 128 or 256 bit integers support. After logartithms are added, I check which logarithms are equal to logarithm of original z^2 number, this numbers are final sieved numbers.
After all three algorithms above have sieved all z^2 then I do Linear Algebra stage through Gaussian Elemination algorithm. This stage is meant to find such combination of B-smooth z^2 numbers which after multiplication of their prime factors give final number with all EVEN prime powers.
Lets call a Relation a triple z, z^2, prime factors of z^2. Basically all relations are given to Gaussian Elemination stage, where even combinations are found.
Even powers of prime numbers give us equality a^2 = b^2 (mod N), from where we can get a factor by doing factor = GCD(a + b, N), here GCD is Greatest Common Divisor found through Euclidean Algorithm. This GCD sometimes gives trivial factors 1 and N, in this case other even combinations should be checked.
To be 100% sure to get even combinations I do Sieving stage till I find a bit more than amount of prime numbers amount of relations, actually around 105% of amount of prime numbers. This extra 5% of relations ensure us that we certainly will get dependent linear equations in Gaussian stage. All these dependent equation form even combinations.
Actually we need a bit more dependent equations, not just 1 more than amount of primes, but around 5%-10% more, only because some (50-60% of them as I can see experimentally) dependencies give only trivial factor 1 or N. Hence extra equations are needed.
Put a look at console output at the end of my post. This console output shows all the impressions from my program. There I run in parallel (multi-threaded) both 2nd (Sieve_B) and 3rd (Sieve_C) algorithms. 1st one (Sieve_A) is not run by my program because it is so slow that you'll wait forever for it to finish.
At the very end of source file you can tweak variable bits = 64 to some other size, like bits = 96. This is amount of bits in composite number N. This N is created as a product of just two random prime numbers of equal size. Such a composite consisting of two equal in size primes is usually called RSA Number.
Also find B = 1 << 10, this tells degree of B-smoothness, basically factor base consists of all possible primes < B. You may increase this B limit, this will give more frequent answers of sieved z^2 hence whole factoring becomes much faster. The only limitation of huge size of B is Linear Algebra stage (Gaussian Elemination), because with bigger factor base you have to solve more linear equations of bigger size. And my Gauss is done not in very optimal way, for example instead of keeping bits as np.uint8 you may keep bits as dense np.uint64, this will increase Linear Algebra speed by 8x times more.
You may also find variable M = 1 << 23, which tells how large is sieving array size, in other words it is block size that is processed at once. Bigger block is a bit faster, but not much. Bigger values of M will not give much difference because it only tells what size of tasks sieving process is split into, it doesn't influence any computation power. More than that bigger M will occupy more memory, so you can't increases it infinitely, only till you have enough memory.
Besides all mentioned above algorithms I also used Fermat Primality Test, also Sieve of Eratosthenes (for generating prime factor base).
Plus also implemented my own algorithm of filtering square numbers. For this I take some composite modulus that looks close to Primorial, like mod = 2 * 2 * 2 * 3 * 3 * 5 * 7 * 11 * 13. And inside boolean array I mark all numbers modulus mod that are squares. Later when any number K should be checked if it is square or not I get flag_array[K % mod] and if it is True then number is "Possibly" squares, while if it is False then number is "Definitely" not square. Thus this filter gives false positives sometimes but never false negatives. This filter checking stage filters out 95% of non-squares, remaining 5% of possibly squares can be double-checked through math.isqrt().
Please, click below on Try it online! link, to test run my program on online server of ReplIt. This will give you best impression, especially if you have no Python or no personal laptop. My code below can be just run straight away after only PIP-installing python -m pip numpy sympy.
Try it online!
import threading
def GenPrimes_SieveOfEratosthenes(end):
import numpy as np
composites = np.zeros((end,), dtype = np.uint8)
for p in range(2, len(composites)):
if composites[p]:
continue
if p * p >= end:
break
composites[p * p :: p] = 1
primes = []
for p in range(2, len(composites)):
if not composites[p]:
primes.append(p)
return np.array(primes, dtype = np.uint32)
def Print(*pargs, __state = (threading.RLock(),), **nargs):
with __state[0]:
print(*pargs, flush = True, **nargs)
def IsSquare(n, *, state = []):
if len(state) == 0:
import numpy as np
Print('Pre-computing squares filter...')
squares_filter = 2 * 2 * 2 * 3 * 3 * 5 * 7 * 11 * 13
squares = np.zeros((squares_filter,), dtype = np.uint8)
squares[(np.arange(0, squares_filter, dtype = np.uint64) ** 2) % squares_filter] = 1
state.extend([squares_filter, squares])
if not state[1][n % state[0]]:
return False, None
import math
root = math.isqrt(n)
return root ** 2 == n, root
def FactorRef(x):
import sympy
return dict(sorted(sympy.factorint(x).items()))
def CheckZ(z, N, primes):
z2 = pow(z, 2, N)
factors = FactorRef(z2)
assert all(p <= primes[-1] for p in factors), (primes[-1], factors, N, z, z2)
return z
def SieveSimple(N, primes):
import time, math, numpy as np
Print('Simple Sieve of B-smooth z^2...')
sieve_block = 1 << 21
rep0_time = 0
for iiblock, iblock in enumerate(range(N, N * 2, sieve_block)):
if time.time() - rep0_time >= 30:
Print(f'Block {iiblock:>3} (2^{math.log2(max(iblock - N, 1)):>5.2f})')
rep0_time = time.time()
iblock_end = iblock + sieve_block
sieve_arr = np.ones((sieve_block,), dtype = np.uint64)
iblock_modN = iblock % N
for p in primes:
mp = 1
while True:
if mp * p >= sieve_block:
break
mp *= p
off = (mp - iblock_modN % mp) % mp
sieve_arr[off :: mp] *= p
for i in range(1 if iblock == N else 0, sieve_block):
num = iblock + i
z2 = num - N
if sieve_arr[i] < z2:
continue
assert sieve_arr[i] == z2, (sieve_arr[i], round(math.log2(sieve_arr[i]), 3), z2)
is_square, z = IsSquare(num)
if not is_square:
continue
#Print('z', z, 'z^2', z2)
yield CheckZ(z, N, primes)
def SieveFactor(N, primes):
import math
Print('Factor Sieve of B-smooth z^2...')
for iz, z in enumerate(range(math.isqrt(N - 1) + 1, math.isqrt(N * 2 - 1) + 1)):
z2 = z ** 2 - N
assert 0 <= z2 and z2 < N, (z, z2)
factors = FactorRef(z2)
if any(p > primes[-1] for p in factors):
continue
#Print('iz', iz, 'z', z, 'z^2', z2, 'z^2 factors', factors)
yield CheckZ(z, N, primes)
def BinarySearch(begin, end, Test):
while begin + 1 < end:
mid = (begin + end - 1) >> 1
if Test(mid):
end = mid + 1
else:
begin = mid + 1
assert begin + 1 == end and Test(begin), (begin, end, Test(begin))
return begin
def ModSqrt(n, p):
n %= p
def Ret(x):
if pow(x, 2, p) != n:
return []
nx = (p - x) % p
if x == nx:
return [x]
elif x <= nx:
return [x, nx]
else:
return [nx, x]
#if p % 4 == 3 and sympy.isprime(p):
# return Ret(pow(n, (p + 1) // 4, p))
for i in range(p):
if pow(i, 2, p) == n:
return Ret(i)
return []
def SieveQuadratic(N, primes):
import math, numpy as np
# https://en.wikipedia.org/wiki/Quadratic_sieve
# https://www.rieselprime.de/ziki/Multiple_polynomial_quadratic_sieve
M = 1 << 23
def Log2I(x):
return int(round(math.log2(max(1, x)) * (1 << 24)))
def Log2IF(li):
return li / (1 << 24)
Print('Quadratic Sieve of B-smooth z^2...')
plogs = {}
for p in primes:
plogs[int(p)] = Log2I(int(p))
qprimes = []
B = int(primes[-1]) + 1
for p in primes:
p = int(p)
res = []
mp = 1
while True:
if mp * p >= B:
break
mp *= p
roots = ModSqrt(N, mp)
if len(roots) == 0:
if mp == p:
break
continue
res.append((mp, tuple(roots)))
if len(res) > 0:
qprimes.append(res)
qprimes_lin = np.array([pinfo[0][0] for pinfo in qprimes], dtype = np.uint32)
yield qprimes_lin
Print('QSieve num primes', len(qprimes), f'({len(qprimes) * 100 / len(primes):.1f}%)')
x_begin0 = math.isqrt(N - 1) + 1
assert N <= x_begin0 ** 2
for iblock in range(1 << 30):
if (x_begin0 + (iblock + 1) * M) ** 2 - N >= N:
break
x_begin = x_begin0 + iblock * M
if iblock != 0:
Print('\n', end = '')
Print(f'Block {iblock} (2^{math.log2(max(1, x_begin ** 2 - N)):>6.2f})...')
a = np.zeros((M,), np.uint32)
for pinfo in qprimes:
p = pinfo[0][0]
plog = np.uint32(plogs[p])
for imp, (mp, roots) in enumerate(pinfo):
off_done = set()
for root in roots:
for off in range(mp):
if ((x_begin + off) ** 2 - N) % mp == 0 and off not in off_done:
break
else:
continue
a[off :: mp] += plog
off_done.add(off)
logs = np.log2(np.array((np.arange(M).astype(np.float64) + x_begin) ** 2 - N, dtype = np.float64))
logs2if = Log2IF(a.astype(np.float64))
logs_diff = np.abs(logs - logs2if)
for ix in range(M):
if logs_diff[ix] > 0.3:
continue
z = x_begin + ix
z2 = z * z - N
factors = FactorRef(z2)
assert all(p <= primes[-1] for p, c in factors.items())
#Print('iz', ix, 'z', z, 'z^2', z2, f'(2^{math.log2(max(1, z2)):>6.2f})', ', z^2 factors', factors)
yield CheckZ(z, N, primes)
def LinAlg(N, zs, primes):
import numpy as np
Print('Linear algebra...')
Print('Factoring...')
m = np.zeros((len(zs), len(primes) + len(zs)), dtype = np.uint8)
def SwapRows(i, j):
t = np.copy(m[i])
m[i][...] = m[j][...]
m[j][...] = t[...]
def MatToStr(m):
s = '\n'
for i in range(len(m)):
for j in range(len(m[i])):
s += str(m[i, j])
s += '\n'
return s[1:-1]
for iz, z in enumerate(zs):
z2 = z * z - N
fs = FactorRef(z2)
for p, c in fs.items():
i = np.searchsorted(primes, p, 'right') - 1
assert i >= 0 and i < len(primes) and primes[i] == p, (i, primes[i])
m[iz, i] = (int(m[iz, i]) + c) % 2
m[iz, len(primes) + iz] = 1
Print('Gaussian elemination...')
#Print(MatToStr(m)); Print()
one_col, one_rows = 0, 0
while True:
while True:
for i in range(one_rows, len(m)):
if m[i, one_col]:
break
else:
one_col += 1
if one_col >= len(primes):
break
continue
break
if one_col >= len(primes):
break
assert m[i, one_col]
assert np.all(m[i, :one_col] == 0)
for j in range(len(m)):
if i == j:
continue
if not m[j, one_col]:
continue
m[j][...] ^= m[i][...]
SwapRows(one_rows, i)
one_rows += 1
one_col += 1
assert np.all(m[one_rows:, :len(primes)] == 0)
zeros = m[one_rows:, len(primes):]
Print(f'Even combinations ({len(m) - one_rows}):')
Print(MatToStr(zeros))
return zeros
def ProcessResults(N, zs, la_zeros):
import math
Print('Computing final results...')
factors = []
for i in range(len(la_zeros)):
zero = la_zeros[i]
assert len(zero) == len(zs)
cz = []
for j in range(len(zero)):
if not zero[j]:
continue
z = zs[j]
z2 = z * z - N
cz.append((z, z2, FactorRef(z2)))
a = 1
for z, z2, fs in cz:
a = (a * z) % N
cnts = {}
for z, z2, fs in cz:
for p, c in fs.items():
cnts[p] = cnts.get(p, 0) + c
cnts = dict(sorted(cnts.items()))
b = 1
for p, c in cnts.items():
assert c % 2 == 0, (p, c, cnts)
b = (b * pow(p, c // 2, N)) % N
factor = math.gcd(a + b, N)
Print('a', str(a).rjust(len(str(N))), ' b', str(b).rjust(len(str(N))), ' factor', factor if factor != N else 'N')
if factor != 1 and factor != N:
factors.append(factor)
return factors
def SieveCollectResults(N, its):
import time, threading, queue, traceback, math
K = len(its)
qs = [queue.Queue() for i in range(K)]
last_dot, finish = False, False
def Get(it, ty, need, compul):
nonlocal last_dot, finish
try:
cnt = 0
for iz, z in enumerate(it):
if finish:
break
if iz < 4:
z2 = z * z - N
Print(('\n' if last_dot else '') + 'Sieve_' + ('C', 'B', 'A')[K - 1 - ty], ' iz', iz,
'z', z, 'z^2', z2, f'(2^{math.log2(max(1, z2)):>6.2f})', ', z^2 factors', FactorRef(z2))
last_dot = False
else:
Print(('.', 'b', 'a')[K - 1 - ty], end = '')
last_dot = True
qs[ty].put(z)
cnt += 1
if cnt >= need:
break
except:
Print(traceback.format_exc())
thr = []
for ty, (it, need, compul) in enumerate(its):
thr.append(threading.Thread(target = Get, args = (it, ty, need, compul), daemon = True))
thr[-1].start()
for ithr, t in enumerate(thr):
if its[ithr][2]:
t.join()
finish = True
if last_dot:
Print()
zs = [[] for i in range(K)]
for iq, q in enumerate(qs):
while not qs[iq].empty():
zs[iq].append(qs[iq].get())
return zs
def DixonFactor(N):
import time, math, numpy as np, sys
B = 1 << 10
primes = GenPrimes_SieveOfEratosthenes(B)
Print('Num primes', len(primes), 'last prime', primes[-1])
IsSquare(0)
it = SieveQuadratic(N, primes)
qprimes = next(it)
zs = SieveCollectResults(N, [
#(SieveSimple(N, primes), 3, False),
(SieveFactor(N, primes), 3, False),
(it, round(len(qprimes) * 1.06 + 0.5), True),
])[-1]
la_zeros = LinAlg(N, zs, qprimes)
fs = ProcessResults(N, zs, la_zeros)
if len(fs) > 0:
Print('Factored, factors', sorted(set(fs)))
else:
Print('Failed to factor! Try running program again...')
def IsPrime_Fermat(n, *, ntrials = 32):
import random
if n <= 16:
return n in (2, 3, 5, 7, 11, 13)
for i in range(ntrials):
if pow(random.randint(2, n - 2), n - 1, n) != 1:
return False
return True
def GenRandom(bits):
import random
return random.randrange(1 << (bits - 1), 1 << bits)
def RandPrime(bits):
while True:
n = GenRandom(bits) | 1
if IsPrime_Fermat(n):
return n
def Main():
import math
bits = 64
N = RandPrime(bits // 2) * RandPrime((bits + 1) // 2)
Print('N to factor', N, f'(2^{math.log2(N):>5.1f})')
DixonFactor(N)
if __name__ == '__main__':
Main()
Console output:
N to factor 10086068308526249063 (2^ 63.1)
Num primes 172 last prime 1021
Pre-computing squares filter...
Quadratic Sieve of B-smooth z^2...
Factor Sieve of B-smooth z^2...
QSieve num primes 78 (45.3%)
Block 0 (2^ 32.14)...
Sieve_C iz 0 z 3175858067 z^2 6153202727426 (2^ 42.48) , z^2 factors {2: 1, 29: 2, 67: 1, 191: 1, 487: 1, 587: 1}
Sieve_C iz 1 z 3175859246 z^2 13641877439453 (2^ 43.63) , z^2 factors {31: 1, 61: 1, 167: 1, 179: 1, 373: 1, 647: 1}
Sieve_C iz 2 z 3175863276 z^2 39239319203113 (2^ 45.16) , z^2 factors {31: 1, 109: 1, 163: 1, 277: 1, 311: 1, 827: 1}
Sieve_C iz 3 z 3175867115 z^2 63623612174162 (2^ 45.85) , z^2 factors {2: 1, 29: 1, 41: 1, 47: 1, 61: 1, 127: 1, 197: 1, 373: 1}
.........................................................................
Sieve_B iz 0 z 3175858067 z^2 6153202727426 (2^ 42.48) , z^2 factors {2: 1, 29: 2, 67: 1, 191: 1, 487: 1, 587: 1}
......
Linear algebra...
Factoring...
Gaussian elemination...
Even combinations (7):
01000000000000000000000000000000000000000000000000001100000000000000000000000000000
11010100000010000100100000010011100000000001001001001001011001000000110001010000000
11001011000101111100011111001011010011000111101000001001011000001111100101001110000
11010010010000110110101100110101000100001100010011100011101000100010011011001001000
00010110111010000010000010000111010001010010111001000011011011101110110001001100100
00000010111000110010100110001111010101001000011010110011101000110001101101100100010
10010001111111101100011110111110110100000110111011010001010001100000010100000100001
Computing final results...
a 9990591196683978238 b 9990591196683978238 factor 1
a 936902490212600845 b 3051457985176300292 factor 3960321451
a 1072293684177681642 b 8576178744296269655 factor 2546780213
a 1578121372922149955 b 1578121372922149955 factor 1
a 2036768191033218175 b 8049300117493030888 factor N
a 1489997751586754228 b 2231890938565281666 factor 3960321451
a 9673227070299809069 b 3412883990935144956 factor 3960321451
Factored, factors [2546780213, 3960321451]
I am trying this problem for a while but getting wrong answer again and again.
number can be very large <=2^2014.
22086. Prime Power Test
Explanation about my algorithm:
For a Given number I am checking if the number can be represented as form of prime power or not.
So the the maximum limit to check for prime power is log n base 2.
Finally problem reduced to finding nth root of a number and if it is prime we have our answer else check for all i till log (n base 2) and exit.
I have used all sort of optimizations and have tested enormous test-cases and for all my algorithm gives correct answer
but Judge says wrong answer.
Spoj have another similar problem with small constraints n<=10^18 for which I already got accepted with Python and C++(Best solver in c++)
Here is My python code Please suggest me if I am doing something wrong I am not very proficient in python so my algorithm is a bit lengthy. Thanks in advance.
My Algorithm:
import math
import sys
import fractions
import random
import decimal
write = sys.stdout.write
def sieve(n):
sqrtn = int(n**0.5)
sieve = [True] * (n+1)
sieve[0] = False
sieve[1] = False
for i in range(2, sqrtn+1):
if sieve[i]:
m = n//i - i
sieve[i*i:n+1:i] = [False] * (m+1)
return sieve
def gcd(a, b):
while b:
a, b = b, a%b
return a
def mr_pass(a, s, d, n):
a_to_power = pow(a, d, n)
if a_to_power == 1:
return True
for i in range(s-1):
if a_to_power == n - 1:
return True
a_to_power = (a_to_power * a_to_power) % n
return a_to_power == n - 1
isprime=sieve(1000000)
sprime= [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,523,541,547,557,563,569,571,577,587,593,599,601,607,613,617,619,631,641,643,647,653,659,661,673,677,683,691,701,709,719,727,733,739,743,751,757,761,769,773,787,797,809,811,821,823,827,829,839,853,857,859,863,877,881,883,887,907,911,919,929,937,941,947,953,967,971,977,983,991,997]
def smooth_num(n):
c=0
for a in sprime:
if(n%a==0):
c+=1
if(c>=2):
return True;
return False
def is_prime(n):
if(n<1000000):
return isprime[n]
if any((n % p) == 0 for p in sprime):
return False
if n==2:
return True
d = n - 1
s = 0
while d % 2 == 0:
d >>= 1
s += 1
for repeat in range(10):
a=random.randint(1,n-1)
if not mr_pass(a, s, d, n):
return False
return True
def iroot(n,k):
hi = 1
while pow(hi, k) < n:
hi *= 2
lo = hi // 2
while hi - lo > 1:
mid = (lo + hi) // 2
midToK = (mid**k)
if midToK < n:
lo = mid
elif n < midToK:
hi = mid
else:
return mid
if (hi**k) == n:
return hi
else:
return lo
def isqrt(x):
n = int(x)
if n == 0:
return 0
a, b = divmod(n.bit_length(), 2)
x = pow(2,(a+b))
while True:
y = (x + n//x)>>1
if y >= x:
return x
x = y
maxx=2**1024;minn=2**64
def nth_rootp(n,k):
return int(round(math.exp(math.log(n)/k),0))
def main():
for cs in range(int(input())):
n=int(sys.stdin.readline().strip())
if(smooth_num(n)):
write("Invalid order\n")
continue;
order = 0;m=0
power =int(math.log(n,2))
for i in range(1,power+1):
if(n<=maxx):
if i==1:m=n
elif(i==2):m=isqrt(n)
elif(i==4):m=isqrt(isqrt(n))
elif(i==8):m=isqrt(isqrt(isqrt(n)))
elif(i==16):m=isqrt(isqrt(isqrt(isqrt(n))))
elif(i==32):m=isqrt(isqrt(isqrt(isqrt(isqrt(n)))))
elif(i==64):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n))))))
elif(i==128):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n)))))))
elif(i==256):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n))))))))
else:m=int(nth_rootp(n,i))
else:
if i==1:m=n
elif i==2:m=isqrt(n)
elif(i==4):m=isqrt(isqrt(n))
elif(i==8):m=isqrt(isqrt(isqrt(n)))
elif(i==16):m=isqrt(isqrt(isqrt(isqrt(n))))
elif(i==32):m=isqrt(isqrt(isqrt(isqrt(isqrt(n)))))
elif(i==64):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n))))))
elif(i==128):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n)))))))
elif(i==256):m=isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(isqrt(n))))))))
else:m=iroot(n,i)
if m<2:
order=0
break
if(is_prime(m) and n==(m**i)):
write("%d %d\n"%(m,i))
order = 1
break
if(order==0):
write("Invalid order\n")
main()
I'm not going to read all that code, though I suspect the problem is floating-point inaccuracy. Here is my program to determine if a number n is a prime power; it returns the prime p and the power k:
# prime power predicate
from random import randint
from fractions import gcd
def findWitness(n, k=5): # miller-rabin
s, d = 0, n-1
while d % 2 == 0:
s, d = s+1, d/2
for i in range(k):
a = randint(2, n-1)
x = pow(a, d, n)
if x == 1 or x == n-1: continue
for r in range(1, s):
x = (x * x) % n
if x == 1: return a
if x == n-1: break
else: return a
return 0
# returns p,k such that n=p**k, or 0,0
# assumes n is an integer greater than 1
def primePower(n):
def checkP(n, p):
k = 0
while n > 1 and n % p == 0:
n, k = n / p, k + 1
if n == 1: return p, k
else: return 0, 0
if n % 2 == 0: return checkP(n, 2)
q = n
while True:
a = findWitness(q)
if a == 0: return checkP(n, q)
d = gcd(pow(a,q,n)-a, q)
if d == 1 or d == q: return 0, 0
q = d
The program uses Fermat's Little Theorem and exploits the witness a to the compositeness of n that is found by the Miller-Rabin algorithm. It is given as Algorithm 1.7.5 in Henri Cohen's book A Course in Computational Algebraic Number Theory. You can see the program in action at http://ideone.com/cNzQYr.
this is not really an answer, but I don't have enough space to write it as a comment.
So, if the problem still not solved, you may try the following function for nth_rootp, though it is a bit ugly (it is just a binary search to find the precise value of the function):
def nth_rootp(n,k):
r = int(round(math.log(n,2)/k))
left = 2**(r-1)
right = 2**(r+1)
if left**k == n:
return left
if right**k == n:
return right
while left**k < n and right**k > n:
tmp = (left + right)/2
if tmp**k == n:
return tmp
if tmp == left or tmp == right:
return tmp
if tmp**k < n:
left = tmp
else:
if tmp**k > n:
right = tmp
your code look like a little overcomplicated for this task, I will not bother to check it, but the thing you need are the following
is_prime, naturally
a prime generator, optional
calculate the nth root of a number in a precise way
for the first one I recommend the deterministic form of the Miller-Rabin test with a appropriate set of witness to guaranty a exact result until 1543267864443420616877677640751301 (1.543 x 1033) for even bigger numbers you can use the probabilistic one or use a bigger list of witness chosen at your criteria
with all that a template for the solution is as follow
import math
def is_prime(n):
...
def sieve(n):
"list of all primes p such that p<n"
...
def inthroot(x,n):
"calculate floor(x**(1/n))"
...
def is_a_power(n):
"return (a,b) if n=a**b otherwise throw ValueError"
for b in sieve( math.log2(n) +1 ):
a = inthroot(n,b)
if a**b == n:
return a,b
raise ValueError("is not a power")
def smooth_factorization(n):
"return (p,e) where p is prime and n = p**e if such value exists, otherwise throw ValueError"
e=1
p=n
while True:
try:
p,n = is_a_power(p)
e = e*n
except ValueError:
break
if is_prime(p):
return p,e
raise ValueError
def main():
for test in range( int(input()) ):
try:
p,e = smooth_factorization( int(input()) )
print(p,e)
except ValueError:
print("Invalid order")
main()
And the code above should be self explanatory
Filling the blacks
As you are familiar with Miller-Rabin test, I will only mention that if you are interested you can find a implementation of the determinist version here just update the list of witness and you are ready to go.
For the sieve, just change the one you are using to return a list with primes number like this for instance [ p for p,is_p in enumerate(sieve) if is_p ]
With those out of the way, the only thing left is calculate the nth root of the number and to do that in a precise way we need to get rip of that pesky floating point arithmetic that only produce headaches, and the answer is implement the Nth root algorithm using only integer arithmetic, which is pretty similar to the one of isqrt that you already use, I guide myself with the one made by Mark Dickinson for cube root and generalize it and I get this
def inthroot(A, n) :
"calculate floor( A**(1/n) )"
#https://en.wikipedia.org/wiki/Nth_root_algorithm
#https://en.wikipedia.org/wiki/Nth_root#nth_root_algorithm
#https://stackoverflow.com/questions/35254566/wrong-answer-in-spoj-cubert/35276426#35276426
#https://stackoverflow.com/questions/39560902/imprecise-results-of-logarithm-and-power-functions-in-python/39561633#39561633
if A<0:
if n%2 == 0:
raise ValueError
return - inthroot(-A,n)
if A==0:
return 0
n1 = n-1
if A.bit_length() < 1024: # float(n) safe from overflow
xk = int( round( pow(A,1.0/n) ) )
xk = ( n1*xk + A//pow(xk,n1) )//n # Ensure xk >= floor(nthroot(A)).
else:
xk = 1 << -(-A.bit_length()//n) # 1 << sum(divmod(A.bit_length(),n))
# power of 2 closer but greater than the nth root of A
while True:
sig = A // pow(xk,n1)
if xk <= sig:
return xk
xk = ( n1*xk + sig )//n
and with all the above you can solve the problem without inconvenient
from sympy.ntheory import factorint
q=int(input("Give me the number q="))
fact=factorint(q) #We factor the number q=p_1^{n_1}*p_2^{n_2}*...
p_1=list(fact.keys()) #We create a list from keys to be the the numbers p_1,p_2,...
n_1=list(fact.values()) #We create a list from values to be the the numbers n_1,n_2,...
p=int(p_1[0])
n=int(n_1[0])
if q!=p**n: #Check if the number q=p_{1}[0]**n_{1}[0]=p**n.
print("The number "+str(q)+" is not a prime power")
else:
print("The number "+str(q)+" is a prime power")
print("The prime number p="+str(p))
print("The natural number n="+str(n))
I am trying to find an efficient way to compute Euler's totient function.
What is wrong with this code? It doesn't seem to be working.
def isPrime(a):
return not ( a < 2 or any(a % i == 0 for i in range(2, int(a ** 0.5) + 1)))
def phi(n):
y = 1
for i in range(2,n+1):
if isPrime(i) is True and n % i == 0 is True:
y = y * (1 - 1/i)
else:
continue
return int(y)
Here's a much faster, working way, based on this description on Wikipedia:
Thus if n is a positive integer, then φ(n) is the number of integers k in the range 1 ≤ k ≤ n for which gcd(n, k) = 1.
I'm not saying this is the fastest or cleanest, but it works.
from math import gcd
def phi(n):
amount = 0
for k in range(1, n + 1):
if gcd(n, k) == 1:
amount += 1
return amount
You have three different problems...
y needs to be equal to n as initial value, not 1
As some have mentioned in the comments, don't use integer division
n % i == 0 is True isn't doing what you think because of Python chaining the comparisons! Even if n % i equals 0 then 0 == 0 is True BUT 0 is True is False! Use parens or just get rid of comparing to True since that isn't necessary anyway.
Fixing those problems,
def phi(n):
y = n
for i in range(2,n+1):
if isPrime(i) and n % i == 0:
y *= 1 - 1.0/i
return int(y)
Calculating gcd for every pair in range is not efficient and does not scales. You don't need to iterate throught all the range, if n is not a prime you can check for prime factors up to its square root, refer to https://stackoverflow.com/a/5811176/3393095.
We must then update phi for every prime by phi = phi*(1 - 1/prime).
def totatives(n):
phi = int(n > 1 and n)
for p in range(2, int(n ** .5) + 1):
if not n % p:
phi -= phi // p
while not n % p:
n //= p
#if n is > 1 it means it is prime
if n > 1: phi -= phi // n
return phi
I'm working on a cryptographic library in python and this is what i'm using. gcd() is Euclid's method for calculating greatest common divisor, and phi() is the totient function.
def gcd(a, b):
while b:
a, b=b, a%b
return a
def phi(a):
b=a-1
c=0
while b:
if not gcd(a,b)-1:
c+=1
b-=1
return c
Most implementations mentioned by other users rely on calling a gcd() or isPrime() function. In the case you are going to use the phi() function many times, it pays of to calculated these values before hand. A way of doing this is by using a so called sieve algorithm.
https://stackoverflow.com/a/18997575/7217653 This answer on stackoverflow provides us with a fast way of finding all primes below a given number.
Oke, now we can replace isPrime() with a search in our array.
Now the actual phi function:
Wikipedia gives us a clear example: https://en.wikipedia.org/wiki/Euler%27s_totient_function#Example
phi(36) = phi(2^2 * 3^2) = 36 * (1- 1/2) * (1- 1/3) = 30 * 1/2 * 2/3 = 12
In words, this says that the distinct prime factors of 36 are 2 and 3; half of the thirty-six integers from 1 to 36 are divisible by 2, leaving eighteen; a third of those are divisible by 3, leaving twelve numbers that are coprime to 36. And indeed there are twelve positive integers that are coprime with 36 and lower than 36: 1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, and 35.
TL;DR
With other words: We have to find all the prime factors of our number and then multiply these prime factors together using foreach prime_factor: n *= 1 - 1/prime_factor.
import math
MAX = 10**5
# CREDIT TO https://stackoverflow.com/a/18997575/7217653
def sieve_for_primes_to(n):
size = n//2
sieve = [1]*size
limit = int(n**0.5)
for i in range(1,limit):
if sieve[i]:
val = 2*i+1
tmp = ((size-1) - i)//val
sieve[i+val::val] = [0]*tmp
return [2] + [i*2+1 for i, v in enumerate(sieve) if v and i>0]
PRIMES = sieve_for_primes_to(MAX)
print("Primes generated")
def phi(n):
original_n = n
prime_factors = []
prime_index = 0
while n > 1: # As long as there are more factors to be found
p = PRIMES[prime_index]
if (n % p == 0): # is this prime a factor?
prime_factors.append(p)
while math.ceil(n / p) == math.floor(n / p): # as long as we can devide our current number by this factor and it gives back a integer remove it
n = n // p
prime_index += 1
for v in prime_factors: # Now we have the prime factors, we do the same calculation as wikipedia
original_n *= 1 - (1/v)
return int(original_n)
print(phi(36)) # = phi(2**2 * 3**2) = 36 * (1- 1/2) * (1- 1/3) = 36 * 1/2 * 2/3 = 12
It looks like you're trying to use Euler's product formula, but you're not calculating the number of primes which divide a. You're calculating the number of elements relatively prime to a.
In addition, since 1 and i are both integers, so is the division, in this case you always get 0.
With regards to efficiency, I haven't noticed anyone mention that gcd(k,n)=gcd(n-k,n). Using this fact can save roughly half the work needed for the methods involving the use of the gcd. Just start the count with 2 (because 1/n and (n-1)/k will always be irreducible) and add 2 each time the gcd is one.
Here is a shorter implementation of orlp's answer.
from math import gcd
def phi(n): return sum([gcd(n, k)==1 for k in range(1, n+1)])
As others have already mentioned it leaves room for performance optimization.
Actually to calculate phi(any number say n)
We use the Formula
where p are the prime factors of n.
So, you have few mistakes in your code:
1.y should be equal to n
2. For 1/i actually 1 and i both are integers so their evaluation will also be an integer,thus it will lead to wrong results.
Here is the code with required corrections.
def phi(n):
y = n
for i in range(2,n+1):
if isPrime(i) and n % i == 0 :
y -= y/i
else:
continue
return int(y)