I'm trying to write Python code to find the prime factors of any given number:
def pf(n):
    for i in range(2,n):
        if n%i==0: #find the factors
            for j in range(2,i): #check if the factor is prime
                if i%j==0:
                    break
            else: #find the prime ones
                print(i)
My problem is that this code works fine with small numbers, but with big numbers I have to interrupt the execution.
For example:
pf(600851475143)
71
839
1471
6857
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
pf(600851475143)
File "<pyshell#1>", line 2, in pf
for i in range(2,n):
KeyboardInterrupt
The prime factors of this big number were found in less than a second, so my question is: how can I tweak this code to stop the unnecessary iterations once the factors have been found, using a for loop rather than a while loop?
You can speed things up by dividing n by the obtained value in each iteration step. This way you decrease the number you are iterating. I would implement something like this (not yet sure if this is optimal and results in the lowest number of operations):
from math import sqrt

def pf(n):
    if n == 1:
        return
    if n % 2 == 0:
        print(2)
        pf(n // 2)  # integer division keeps n an int (plain / yields a float on Python 3)
        return
    for i in range(3, int(sqrt(n))+1, 2):
        if n % i == 0:
            for j in range(3, int(sqrt(i))+1, 2):
                if i % j == 0:
                    break
            else:
                print(i)
                pf(n // i)
                return
    print(n)
Note that by looping only up to the square root of the number, the loop never tests the number itself, so it cannot detect that the number is prime. However, if the loop does not find any factor, it is safe to assume that the input is itself a prime number.
The return statements stop the main loop (and the function) after the recursive call. So each call of the function only results in one value and a call for the function on the result of the division of the number by its found prime.
If you make a set with all the prime numbers and check if the value is in this set you will win some time, instead of looping over all values.
Compared to the non-recursive solution by jonrsharpe this one is almost four times as fast:
>>> print timeit.timeit("pf(600851475143)", setup="from __main__ import pf", number=1)
71
839
1471
6857
0.00985789299011
>>> print timeit.timeit("pf2(600851475143)", setup="from __main__ import pf2", number=1)
71
839
1471
6857
0.0450129508972
The implementation is limited by the overflow limit of range(), which results in an overflow for the input value (600851475143**2)+1. More details on the maximum size for range can be found in this question: Python: Range() maximum size; dynamic or static?
A possible issue with this solution is that the maximum recursion depth could be reached. For more details on that visit this question: Maximum recursion depth
You could try adding prime factors to a list as you find them, and see if they multiply to make the number you are trying to factorize, but I think that might add more time than it would save.
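One way to read this suggestion, as a rough sketch in Python 3: collect the factors in a list, divide each one out of the remaining value as it is found, and break out of the for loop once nothing is left to factor. The name pf_list is made up for this example, and dividing out is an assumption about what "multiply to make the number" would look like in practice.

```python
def pf_list(n):
    """Return the prime factors of n (with repeats) using only for loops."""
    factors = []
    remaining = n
    for i in range(2, n + 1):
        if i * i > remaining:
            # Nothing up to sqrt(remaining) divides it, so it is 1 or prime.
            if remaining > 1:
                factors.append(remaining)
            break
        # Divide i out as often as it occurs. The bit length is a safe
        # upper bound on how many times any i >= 2 can divide remaining.
        for _ in range(remaining.bit_length()):
            if remaining % i != 0:
                break
            factors.append(i)
            remaining //= i
    return factors


print(pf_list(600851475143))
```

Because smaller factors are divided out first, every i that still divides the remaining value is necessarily prime, so no separate primality check is needed.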
As suggested in the comments, you could also stop at the square root of the number, using for i in range(2, int(sqrt(n)) + 1): (range() needs an integer bound, hence the int()).
In terms of generally speeding it up you could also try creating a set of primes, and adding to it when you find them in the 5th line. Example:
if i in primes:
    print(i)
else:
    for j in range(2,i): # check if the factor is prime
        if i%j==0:
            break
One further point - use xrange() rather than range(), so you do not internally create the list of all numbers to iterate: (if you are using Python 2 !)
What is the difference between range and xrange functions in Python 2.X?
Just iterate up to the square root of the value; that way you iterate through fewer numbers. You can also use a generator together with for/else to avoid repeated iteration:
from math import sqrt

def pf(n):
    limit = int(sqrt(n))
    for i in xrange(2, limit + 1): # xrange in python2; use `range` for python3
        if n % i != 0: # skip numbers that do not divide n
            continue
        for j in xrange(2,i):
            if i%j == 0:
                break
        else:
            yield i # yielded only when i is a prime factor of n

print list(pf(4356750))
Here's how I would do it:
from math import sqrt

def pf(n):
    """Print the prime factors of n."""
    if n % 2 == 0:
        print(2)
    for i in range(3, int(sqrt(n))+1, 2):
        if n % i == 0: # i is a factor of n
            for j in range(3, int(sqrt(i))+1, 2):
                if i % j == 0:
                    break
            else: # i is also prime
                print(i)
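By handling the factor 2 separately you can almost halve the search space, and since every composite number has a prime factor no larger than its square root, stopping at sqrt(n) cuts it down even further. (At most one prime factor can exceed sqrt(n); this version does not report it, which does not matter for 600851475143.) This takes about a quarter of a second for 600851475143: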
>>> import timeit
>>> timeit.timeit("pf(600851475143)", setup="from __main__ import pf", number=1)
71
839
1471
6857
0.27306951168483806
Another option would be to use a prime sieve to generate all primes below n, then filter out those that are also factors of n (effectively the reverse operation).
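As a rough illustration of the sieve-then-filter idea in Python 3, the sketch below builds the primes with a Sieve of Eratosthenes and keeps those that divide n. The helper names primes_up_to and prime_factors_by_sieve are invented here, and the approach needs memory proportional to n, so it only suits modest inputs, not 600851475143.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off the multiples of p, starting at p * p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_factors_by_sieve(n):
    """Return the distinct prime factors of n by filtering the sieved primes."""
    return [p for p in primes_up_to(n) if n % p == 0]


print(prime_factors_by_sieve(13195))
```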
To test if a number is prime, I do:
def isprime(n):
    if n < 2: return False
    for i in range(2, n):
        if n % i == 0:
            return False
    else:
        return True
I wonder how to make that more efficient/shorter to write.
With generator:
def isprime(n):
    return (all(False for i in range(2,n) if n % i == 0) and not n < 2)
print (isprime(0))
print (isprime(1))
print (isprime(2))
print (isprime(3))
print (isprime(9))
print (isprime(10))
print (isprime(13))
Output:
False
False
True
True
False
False
True
Alongside @Théophile's suggestion, I'd add the following recommendations:
Test whether a number is even and greater than 2 before even calling is_prime (eliminating the need for a function call).
Instead of range(2, n), use range(3, n, 2). This will consider only odd numbers; the third parameter of range is the step by which you'll increment.
Instead of looping through all the integers (or all the odd integers) up to the square root of n, create a cache of the prime numbers you've already found and loop through them. One of the fastest and most elegant ways to do this is using functools.lru_cache, but it will suffice simply to provide a cache yourself.
Here's a quick and dirty way of doing this that is longer but more efficient than your original proposal:
from math import sqrt

# We seed the cache with the first two odd primes because (1) it's low-
# hanging fruit and (2) we avoid the need to add extra lines of code (that
# will increase execution time) for cases in which the square roots of numbers
# are less than any primes in the list
odd_primes = [3, 5]

def is_prime(n):
    # If n is not prime, it must have at least one factor no greater than its
    # square root.
    max_prime_factor_limit = int(sqrt(n))
    # use a generator here to avoid evaluating all the qualifying values in
    # odd_primes until you need them. For example, the square root of 27 is
    # 5.1962, but, since it's divisible by 3, you'll only test if 27 is prime
    # one time. Not a big deal with smaller integers, but the time to compute
    # the next prime will increase pretty fast as there are more potential
    # factors.
    available_primes = (p for p in odd_primes if p <= max_prime_factor_limit)
    for prime in available_primes:
        if n % prime == 0:
            return False
    return True

for i in range(7, 99, 2):
    # if no prime factors were found, add to cache
    if is_prime(i):
        odd_primes.append(i)
print(odd_primes)
There are additional things you can do to speed this up. The one that immediately springs to mind is, instead of calculating the square root of each number you're checking, use the squares of the primes to determine the upper limit of the set of primes you'll check. In other words, since you know that the square root of 169 is 13, you know that any composite number greater than 169 and less than 289 (the square of 17) will have a prime factor <= 13. You can also use that to save time by calculating the list of prime factors and passing the list to the function you're using to determine if an integer is prime. Note, of course, that this will only work if you're actually creating a list of primes.
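A possible sketch of the prime-squares idea in Python 3, with no square roots at all: keep an index into the prime list marking the largest prime that needs checking, and advance it whenever the candidate reaches the square of the next prime. The function name primes_below and its bookkeeping variables are assumptions made for this example, not part of the code above.

```python
def primes_below(stop):
    """Return the primes below stop, using prime squares as the divisor bound."""
    primes = [2, 3]
    limit_index = 1    # primes[1:limit_index + 1] are the odd divisors to try
    next_square = 25   # square of the first prime not yet in the divisor set
    for candidate in range(5, stop, 2):
        # Once the candidate reaches the next prime's square, that prime
        # has to join the divisors being checked.
        if candidate >= next_square:
            limit_index += 1
            next_square = primes[limit_index + 1] ** 2
        if all(candidate % p for p in primes[1:limit_index + 1]):
            primes.append(candidate)
    return [p for p in primes if p < stop]


print(primes_below(100))
```

Only odd candidates are generated, so 2 is never needed as a divisor; that is why the slice starts at index 1.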
number = int(input('please enter a number:'))
if number>1:
    for numbers in range(2, number):
        if (number % numbers) ==0:
            print(f"{number} is not a prime number")
            break
    else:
        print(f"{number} is a prime number")
else:
    print(f"{number} is not a prime number")
Hi guys so I was wondering how is this code:
def is_prime(n):
    for i in range(2, int(n**.5 + 1)):
        if n % i == 0:
            return False
    return True
able to check for prime when on line 2: for i in range(2, int(n**.5 + 1)): the range is not : range(2, n)? Shouldn't it have to iterate through every number till n but excluding it? This one is not doing that but somehow it works... Could someone explain why it works please.
The loop iterates over all numbers from 2 to the square root of n. For any divisor it could find above that square root (if it continued iterating to n - 1), there would necessarily be a matching divisor below the square root, which the loop would already have found.
Because the prime factorisation of any number n (by trial division) only needs to check the prime numbers up to sqrt(n):
Furthermore, the trial factors need go no further than sqrt(n) because, if n is divisible by some number p, then n = p × q and if q were smaller than p, n would have been detected earlier as being divisible by q or by a prime factor of q.
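The pairing argument can be made concrete with a small Python 3 sketch (math.isqrt needs Python 3.8 or later). The helper divisor_pairs is hypothetical; it lists every divisor up to the square root together with its partner, which shows that each divisor above the square root is already accounted for by one below it.

```python
import math


def divisor_pairs(n):
    """Return (small, large) pairs with small * large == n and small <= large."""
    pairs = []
    for small in range(1, math.isqrt(n) + 1):
        if n % small == 0:
            pairs.append((small, n // small))
    return pairs


print(divisor_pairs(36))
print(divisor_pairs(37))
```

For 36 this prints [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)], and for the prime 37 only [(1, 37)], so a loop from 2 up to the square root finds nothing and the number is prime.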
On a sidenote, trial division is slow to check for primality or possible primality. There are faster probabilistic tests like the Miller-Rabin test which can check quickly if a number is composite or probably prime.
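For reference, a compact Miller-Rabin sketch in Python 3 might look as follows. The function name is_probable_prime is made up here, and the fixed bases (the first twelve primes) are a choice for this example; they are commonly cited as giving exact answers for 64-bit inputs, but for larger inputs the result should be treated as "probably prime" only.

```python
def is_probable_prime(n):
    """Miller-Rabin test using the first twelve primes as bases."""
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness, so n is definitely composite
    return True


print(is_probable_prime(6857), is_probable_prime(600851475143))
```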
I am trying to find the largest prime factor of a big number. When I run this, I get an error saying:
Traceback (most recent call last):
File "prb3.py", line 45, in
print prime_factor()
File "prb3.py", line 12, in prime_factor
for n in range(2,i):
MemoryError
It works fine when I run with small number like 13195
"""
Problem:
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
"""
import math
def prime_factor():
    i=600851475143
    factors=[] #list stores all the factors of a number
    for n in range(2,i):
        if(i%n==0 and (n%2!=0 and n!=2)):
            factors.append(n)
    """
    Now I have all the factors(which are not even numbers)
    Next step is to find prime number from factors list
    """
    for factor in factors:
        sqr_root=int(math.sqrt(factor))
        """
        I take a factor from list and divide it by numbers from 3
        to sqroot(factor-1).If I get a 0 as remainder I consider it
        as non prime and remove from the list.I apply this only to
        factors whose sqr root is greater than 3.If it is less than
        3 I divide it by each number between 3 and factor-1.
        """
        if(sqr_root<=3):
            for num in range(3,factor-1):
                if(factor%num==0):
                    factors.remove(factor)
                    break
        else:
            for num in range(3,sqr_root):
                if(factor%num==0):
                    factors.remove(factor)
                    break
    return len(factors)

if __name__ == "__main__":
    print prime_factor()
In Python2, range() returns a list. In your case the list would contain 600851475141 int objects. Since the list is so big, it can't fit in your memory so you get that memory error
Since you don't really need all those numbers in memory at the same time, you could try using xrange() instead.
I think you can simplify your problem by dividing out the factors as you find them. eg.
for n in xrange(2, i):
    while(i % n == 0 and (n % 2 != 0 and n != 2)):
        i /= n
        print n
    if i == 1:
        break
Not needing to loop 600851475141 times should make your program much faster
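Taken a step further, the same divide-out idea gives a short self-contained function. This is only a sketch in Python 3 syntax (the snippet above is Python 2), and the name largest_prime_factor is invented for the example.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by dividing factors out."""
    largest = None
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            largest = divisor
            n //= divisor
        # After 2, only odd divisors need to be tried.
        divisor += 1 if divisor == 2 else 2
    if n > 1:
        largest = n  # whatever remains is itself prime
    return largest


print(largest_prime_factor(13195))
print(largest_prime_factor(600851475143))
```

Since n shrinks every time a factor is divided out, the loop bound shrinks with it, and no list of candidates is ever held in memory.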
I'm relatively new to the python world, and the coding world in general, so I'm not really sure how to go about optimizing my python script. The script that I have is as follows:
import math
z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0
    calculated = open('Prime_Numbers.txt', 'r')
    readlines = calculated.readlines()
    calculated.close()
    a = len(readlines)
    b = readlines[(a-1)]
    b = int(b) + 1
    for num in range(b, (b+1000)):
        prime = True
        calculated = open('Prime_Numbers.txt', 'r')
        for i in calculated:
            i = int(i)
            q = math.ceil(num/2)
            if (q%i==0):
                prime = False
        if prime:
            calculated.close()
            writeto = open('Prime_Numbers.txt', 'a')
            num = str(num)
            writeto.write("\n" + num)
            writeto.close()
            print(num)
As some of you can probably guess I'm calculating prime numbers. The external file that it calls on contains all the prime numbers between 2 and 20.
The reason that I've got the while loop in there is that I wanted to be able to control how long it ran for.
If you have any suggestions for cutting out any clutter in there could you please respond and let me know, thanks.
Reading and writing to files is very, very slow compared to operations with integers. Your algorithm can be sped up 100-fold by just ripping out all the file I/O:
import itertools
primes = {2} # A set containing only 2
for n in itertools.count(3): # Start counting from 3, by 1
    for prime in primes: # For every prime less than n
        if n % prime == 0: # If it divides n
            break # Then n is composite
    else:
        primes.add(n) # Otherwise, it is prime
        print(n)
A much faster prime-generating algorithm would be a sieve. Here's the Sieve of Eratosthenes, in Python 3:
end = int(input('Generate primes up to: '))
numbers = {n: True for n in range(2, end)} # Assume every number is prime, and then
for n, is_prime in numbers.items(): # (Python 3 only)
    if not is_prime:
        continue # For every prime number
    for i in range(n ** 2, end, n): # Cross off its multiples
        numbers[i] = False
    print(n)
It is very inefficient to keep storing and loading all primes from a file. In general file access is very slow. Instead save the primes to a list or deque. For this initialize calculated = deque() and then simply add new primes with calculated.append(num). At the same time output your primes with print(num) and pipe the result to a file.
Once you find that num is not a prime, you do not have to keep checking all the other divisors. So break from the inner loop:
if q%i == 0:
    prime = False
    break
You do not need to go through all previous primes to check for a new prime. Since each non-prime factorizes into two integers, at least one of the factors has to be less than or equal to sqrt(num). So limit your search to these divisors.
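The points so far (keep the primes in memory, break at the first divisor, stop at the square root) could be combined roughly as below. This is a Python 3 sketch, not a drop-in replacement: the function next_primes and its arguments are made up for illustration, and it assumes the starting list holds every prime up to its last entry.

```python
def next_primes(known, count):
    """Extend the sorted prime list `known` by `count` further primes."""
    primes = list(known)
    candidate = primes[-1] + 1
    found = 0
    while found < count:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break          # no divisor up to sqrt(candidate) exists
            if candidate % p == 0:
                is_prime = False
                break          # stop at the first divisor
        if is_prime:
            primes.append(candidate)
            found += 1
        candidate += 1
    return primes


print(next_primes([2, 3, 5, 7, 11, 13, 17, 19], 3))
```

The found counter also bounds the run by the number of primes produced rather than by a fixed number of outer iterations.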
Also the first part of your code irritates me.
z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0
This part seems to do the same as:
for x in range(500):
Also you limit with x to 500 primes, why don't you simply use a counter instead, that you increase if a prime is found and check for at the same time, breaking if the limit is reached? This would be more readable in my opinion.
In general you do not need to introduce a limit. You can simply abort the program at any point in time by hitting Ctrl+C.
However, as others have already pointed out, your chosen algorithm will perform very poorly for medium or large primes. There are more efficient algorithms to find prime numbers: https://en.wikipedia.org/wiki/Generating_primes, especially https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes.
You're writing a blank line to your file, which makes int() raise an exception. Also, I'm guessing you need to rstrip() the newlines off each line.
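A tolerant reader for such a file might look like this sketch (Python 3). The helper name read_primes is hypothetical; it assumes one integer per line and simply skips blank lines instead of passing them to int().

```python
def read_primes(path):
    """Read one integer per line, ignoring blank lines and stray whitespace."""
    primes = []
    with open(path) as handle:
        for line in handle:
            text = line.strip()
            if text:
                primes.append(int(text))
    return primes
```

It would be called as read_primes('Prime_Numbers.txt') with the file name from the question.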
I'd suggest using two different files - one for initial values, and one for all values - initial and recently computed.
If you can keep your values in memory a while, that'd be a lot faster than going through a file repeatedly. But of course, this will limit the size of the primes you can compute, so for larger values you might return to the iterate-through-the-file method if you want.
For computing primes of modest size, a sieve is actually quite good, and worth a google.
When you get into larger primes, trial division by the first n primes is good, followed by m rounds of Miller-Rabin. If Miller-Rabin probabilistically indicates the number is probably a prime, then you do complete trial division or AKS or similar. Miller Rabin can say "This is probably a prime" or "this is definitely composite". AKS gives a definitive answer, but it's slower.
FWIW, I've got a bunch of prime-related code collected together at http://stromberg.dnsalias.org/~dstromberg/primes/
This is my code in python for calculation of sum of prime numbers less than a given number.
What more can I do to optimize it?
import math
primes = [2,] #primes stores the prime numbers
for i in xrange(3,20000,2): #i is the test number
    x = math.sqrt(i)
    isprime = True
    for j in primes: #j is the divisor. only primes are used as divisors
        if j <= x:
            if i%j == 0:
                isprime = False
                break
    if isprime:
        primes.append(i,)
print sum (primes,)
You can use a different algorithm called the Sieve of Eratosthenes which will be faster but take more memory. Keep an array of flags, signifying whether each number is a prime or not, and for each new prime set it to zero for all multiples of that prime.
N = 10000

# initialize an array of flags
is_prime = [1 for num in xrange(N)]
is_prime[0] = 0 # this is because indexing starts at zero
is_prime[1] = 0 # one is not a prime, but don't mark all of its multiples!

def set_prime(num):
    "num is a prime; set all of its multiples in is_prime to zero"
    for x in xrange(num*2, N, num):
        is_prime[x] = 0

# iterate over all integers up to N and update the is_prime array accordingly
for num in xrange(N):
    if is_prime[num] == 1:
        set_prime(num)

primes = [num for num in xrange(N) if is_prime[num]]
You can actually do this for pretty large N if you use an efficient bit array, such as in this example (scroll down on the page and you'll find a Sieve of Eratosthenes example).
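The linked example is not reproduced here. As a stand-in, the Python 3 sketch below uses a built-in bytearray, which stores one byte per flag rather than one bit but is still far more compact than a list of integers. The function name sieve_sum is an assumption for this example.

```python
def sieve_sum(limit):
    """Return the sum of all primes below limit using a bytearray sieve."""
    if limit < 3:
        return 0
    flags = bytearray([1]) * limit
    flags[0] = flags[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            # Zero every multiple of p from p * p on in one slice assignment;
            # the right-hand side must have exactly as many bytes as the slice.
            flags[p * p::p] = bytes(len(range(p * p, limit, p)))
    return sum(i for i, flag in enumerate(flags) if flag)


print(sieve_sum(20000))
```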
Another thing you could optimize is move the sqrt computation outside the inner loop. After all, i stays constant through it, so there's no need to recompute sqrt(i) every time.
primes = primes + (i,) is very expensive. It copies every element on every pass of the loop, converting your elegant dynamic programming solution into an O(N^2) algorithm. Use lists instead:
primes = [2]
...
primes.append(i)
Also, exit the loop early after passing sqrt(i). And, since you are guaranteed to pass sqrt(i) before running off the end of the list of primes, update the list in-place rather than storing isprime for later consumption:
...
if j > math.sqrt(i):
    primes.append(i)
    break
if i%j == 0:
    break
...
Finally, though this has nothing to do with performance, it is more Pythonic to use range instead of while:
for i in range(3, 10000, 2):
    ...
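Putting the suggestions from this answer together gives something like the following Python 3 sketch (math.isqrt needs Python 3.8 or later). The wrapper sum_primes_below is a name chosen for this example, and the root is computed once per candidate rather than once per divisor.

```python
import math


def sum_primes_below(limit):
    """Sum the primes below limit by trial division against earlier primes."""
    if limit <= 2:
        return 0
    primes = [2]
    for i in range(3, limit, 2):
        root = math.isqrt(i)      # computed once per candidate
        for j in primes:
            if j > root:
                primes.append(i)  # no prime up to sqrt(i) divides i
                break
            if i % j == 0:
                break             # i is composite
    return sum(primes)


print(sum_primes_below(20000))
```

Appending inside the loop is safe here because the break follows immediately, so iteration over the list stops before the new element is reached.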
Just another code without using any imports:
#This will check n, if it is prime, it will return n, if not, it will return 0
def get_primes(n):
    if n < 2:
        return 0
    i = 2
    while True:
        if i * i > n:
            return n
        if n % i == 0:
            return 0
        i += 1

#this will sum up every prime number up to n
def sum_primes(n):
    if n < 2:
        return 0
    i, s = 2, 0
    while i < n:
        s += get_primes(i)
        i += 1
    return s

n = 1000000
print sum_primes(n)
EDIT: removed some silliness while under influence
All brute-force type algorithms for finding prime numbers, no matter how efficient, will become drastically expensive as the upper bound increases. A heuristic approach to testing for primeness can actually save a lot of computation. Established divisibility rules can eliminate most non-primes "at-a-glance".
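As an illustration of what such "at-a-glance" rules could look like, the Python 3 sketch below applies the familiar digit rules for 2, 3 and 5. The function name obviously_composite is hypothetical, and it assumes n is greater than 5 so that the primes 2, 3 and 5 themselves are not rejected. In Python the modulo operator is usually cheaper than string handling, so this shows the idea rather than an optimization.

```python
def obviously_composite(n):
    """Digit-based checks for multiples of 2, 3 and 5 (assumes n > 5)."""
    digits = str(n)
    if digits[-1] in "024568":
        return True   # last digit even or 5: divisible by 2 or 5
    if sum(int(d) for d in digits) % 3 == 0:
        return True   # digit sum divisible by 3: divisible by 3
    return False


candidates = [n for n in range(6, 60) if not obviously_composite(n)]
print(candidates)
```

The numbers that survive this filter still need a real primality test; 49, for example, passes all three rules but is 7 × 7.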