Sieve of Eratosthenes in Python

I am trying to write a Python function to return the number of primes less than a given value and the values of all the primes. I need to use the Sieve of Eratosthenes algorithm. I believe I'm missing something in the function. For example, when I want to find the primes under 100, all I get is 2, 3, 5, 7. I am aware that if I don't use "square root", I can get all the primes I need; but I am told that I need to include square root there. Can someone please take a look at my code and let me know what I am missing? Thanks for your time.
def p(n):
    is_p=[False]*2 + [True]*(n-1)
    for i in range(2, int(n**0.5)):
        if is_p[i]:
            yield i
            for j in range(i*i, n, i):
                is_p[j] = False

"I am told I need to use square root". Why do you think that is? Usually the Sieve of Eratosthenes is used to remove all "non-prime" numbers from a list; you do this by finding a prime number, then checking off all multiples of that prime in your list. The next number "not checked off" is your next prime: you report it (with yield), then continue checking off again. You only need to check off multiples of numbers up to the square root, because any factor greater than the square root has a corresponding factor smaller than the square root, so those composites have already been found.
Unfortunately, when it comes to printing out the primes, you can't "stop in the middle". For example, 101 is prime; but if you only loop until 11, you will never discover that it's there. So there need to be two steps:
1) loop over all "possible multiples" - here you can go "only up to the square root"
2) check the list for all numbers that haven't been checked off - here you have to "go all the way"
This makes the following code:
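def p(n):
    is_p=[False]*2 + [True]*(n-1)
    # Step 1: check off multiples; i must reach int(sqrt(n)) inclusive, hence the + 1
    for i in range(2, int(n**0.5) + 1):
        if is_p[i]:
            for j in range(i*i, n, i):
                is_p[j] = False
    # Step 2: report every number below n that was never checked off
    for i in range(2, n):
        if is_p[i]:
            yield i
print(list(p(102)))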
The result is a list of primes up to and including 101.

Your logic is correct, except for the bounds of the for loop: it stops at int(sqrt(n)) - 1. For p(100), i runs only from 2 to 9, and because the yield sits inside that loop, you only get the primes up to 9.
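To see the early stop concretely, the bounds of the outer loop can be inspected directly. This is only an illustrative check of the range used in the question, with n = 100; the names outer and primes_seen are made up for the example:

```python
n = 100
outer = list(range(2, int(n**0.5)))   # the loop bounds used in the question
print(outer)                           # [2, 3, 4, 5, 6, 7, 8, 9]

# Only primes that appear in this range can ever be yielded
primes_seen = [i for i in outer if all(i % d for d in range(2, i))]
print(primes_seen)                     # [2, 3, 5, 7]
```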

Your use of the square root is terminating your results early. If you want to yield all the primes up to 100, your loop has to go to 100.
The square root isn't necessary in your code because it's implied in your second for loop. If i*i < n then i < sqrt(n).
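A minimal sketch of that idea, assuming Python 3: the outer loop runs all the way to n and yields as it goes, while the inner range(i*i, n, i) is simply empty once i*i >= n, so no explicit square root is needed. The names primes_below and count_and_primes are made up for this example; the second one returns both the count and the list, which is what the question asks for.

```python
def primes_below(n):
    """Yield every prime strictly less than n."""
    is_p = [True] * max(n, 2)
    is_p[0] = is_p[1] = False
    for i in range(2, n):
        if is_p[i]:
            yield i
            # This range is empty once i*i >= n, so the marking
            # stops "at the square root" without computing it
            for j in range(i * i, n, i):
                is_p[j] = False


def count_and_primes(n):
    """Return (number of primes below n, list of those primes)."""
    primes = list(primes_below(n))
    return len(primes), primes


count, primes = count_and_primes(100)
print(count)        # 25
print(primes[-3:])  # [83, 89, 97]
```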

Related

How does a Sieve of Eratosthenes work in this code?

I need some help explaining the Sieve of Eratosthenes below. I copied it from some other Stackoverflow page, but I do not understand it line by line. Can someone explain this?
def eratosthenes(n):
    multiples = []
    for i in range(2, n):
        if i not in multiples:
            print (i)
            for j in range(i*i, n, i): #Troubled Part
                multiples.append(j)
eratosthenes(100)
I especially do not get the range(i*i...) part. Why are there three attributes in one set of parentheses? Also, why is i squared? Thanks!

Let's go line by line:
multiples = [] makes a list of composite numbers; we'll add to it later
for i in range(2, n) we're looping from 2 to n - 1, trying to get all of the primes
if i not in multiples we're checking whether i is a composite number that we've already found
print(i) i was not recorded as composite, so it must be prime; print it out
for j in range(i*i, n, i) count by i's, from i*i up to (but not including) n, as each of these numbers is composite. Note that we start from i*i because products like i*k with k < i were already recorded in previous iterations
multiples.append(j) add j to the list of composite numbers

The third attribute (argument) in range is just the increment, or step. It is saying: start at i*i and keep adding i until we reach n (n itself is excluded).
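As a quick illustration of the three-argument form range(start, stop, step), here is what the inner loop generates for i = 3 with n = 30 (values picked only for this example):

```python
i, n = 3, 30
steps = list(range(i * i, n, i))  # start at 9, step by 3, stop before 30
print(steps)                      # [9, 12, 15, 18, 21, 24, 27]
```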
So the Sieve of Eratosthenes works by getting rid of multiples of primes (i.e. composites, which are tracked in the multiples list) as we go forward, so like [2,4,6,8,...], [3,9,12,...],[5,25,30,...] as we go on.
In each of the above sequences, the first element is the prime number; the subsequent elements are added to the multiples list.
The part where you've highlighted:
for j in range(i*i, n, i): #Troubled Part
    multiples.append(j)
This is just adding all the composite multiples of the prime numbers. To be more explicit, let's look at 2.
2 is not in multiples. So it is printed. Then let's think concretely about what happens next:
for j in range(2*2, 100, 2)
This will add 4, 6, 8, 10, and so forth to the multiples list, which are the numbers we ignore because they are composites. You can think of i*i as simply the next element in the sequence of multiples that we start at.
In this way, we continue for 3, starting from 9, 12, 15, and so forth.
Notice that the composite numbers 4,6,8 had already been excluded in the first iteration, so that is why we can start at 9 and continue.
This is in fact a significant point that Ryan Fu has pointed out.
So to put this in the clearest terms:
2 is printed. The multiple list is updated with all other even numbers [4,6,8,10,...]
We go to 3 next because it is not in multiples. We add [9,12,15,18,21,...] to the list.
Notice that we do not need to bother adding 6 to the list because it was already previously added when we considered 2. This is why we do not need to do something like
for j in range(i*2, n, i)
The process continues, the next number we have is 5, so [25,30,35,40,...] are added to multiples
Eventually only the primes are printed.
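The walkthrough above can be reproduced with a small instrumented variant. This is a sketch, not the code from the question: eratosthenes_trace is a made-up name, it returns values instead of printing, and it uses a set for multiples purely for convenience.

```python
def eratosthenes_trace(n):
    """Return (primes below n, {prime: composites that prime added})."""
    multiples = set()
    primes = []
    added = {}
    for i in range(2, n):
        if i not in multiples:
            primes.append(i)
            added[i] = list(range(i * i, n, i))  # same range as the question
            multiples.update(added[i])
    return primes, added


primes, added = eratosthenes_trace(30)
print(primes)    # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(added[3])  # [9, 12, 15, 18, 21, 24, 27]
print(added[5])  # [25]
print(added[7])  # []
```

Note that 6 never appears in added[3]: it was already recorded by 2, which is exactly why starting at i*i is enough.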

multiples stores all the multiples of each prime found so far. For example, 2 is prime, so 4, 6, 8, 10, 12, ... are multiples of 2 and therefore not prime; there is no need to test them later, because we already know they are composite.
if i not in multiples means that i is prime, and we then save all the multiples of i. In your solution, starting at i*i in for j in range(i*i, n, i) is not required for correctness: starting at 2*i, i.e. for j in range(2*i, n, i), also works and is easier to follow, although it re-adds composites that smaller primes have already recorded. More importantly, multiples should be a set rather than a list: a list stores duplicates, and a membership test costs O(N) on a list versus O(1) on average on a set.
def eratosthenes(n):
    multiples = set()
    for i in range(2, n):
        if i not in multiples:
            print (i)
            for j in range(2*i, n, i): #Troubled Part
                multiples.add(j)
eratosthenes(100)
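To check that the starting point of the inner loop does not change the result, the two variants can be compared side by side. This is a sketch with hypothetical names (sieve, and a start callback that picks the first multiple to record); it returns a list instead of printing.

```python
def sieve(n, start):
    """Primes below n; start(i) gives the first multiple of i to record."""
    multiples = set()
    primes = []
    for i in range(2, n):
        if i not in multiples:
            primes.append(i)
            multiples.update(range(start(i), n, i))
    return primes


from_square = sieve(100, lambda i: i * i)
from_double = sieve(100, lambda i: 2 * i)
print(from_square == from_double)  # True
print(len(from_square))            # 25
```

Both versions find the same primes; starting at i*i just does less redundant work.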

Optimising code for finding the next prime number

I'm new to both Python and StackOverflow so I apologise if this question has been repeated too much or if it's not a good question. I'm doing a beginner's Python course and one of the tasks I have to do is to make a function that finds the next prime number after a given input. This is what I have so far:
def nextPrime(n):
    num = n + 1
    for i in range(1, 500):
        for j in range(2, num):
            if num%j == 0:
                num = num + 1
    return num
When I run it on the site's IDE, it's fine and everything works well but then when I submit the task, it says the runtime was too long and that I should optimise my code. But I'm not really sure how to do this, so would it be possible to get some feedback or any suggestions on how to make it run faster?

When your function finds the answer, it will continue checking the same number hundreds of times. This is why it is taking so long. Also, when you increase num, you should break out of the nested loop so that the new number is checked against the small factors first (which is more likely to eliminate it and would accelerate progress).
To make this simpler and more efficient, you should break down your problem in areas of concern. Checking if a number is prime or not should be implemented in its own separate function. This will make the code of your nextPrime() function much simpler:
def nextPrime(n):
    n += 1
    while not isPrime(n): n += 1
    return n
Now you only need to implement an efficient isPrime() function:
def isPrime(x):
    p,inc = 2,1
    while p*p <= x:
        if x % p == 0: return False
        p,inc = p+inc,2
    return x > 1

Looping from 1 to 500, especially with another loop nested inside it, is not only inefficient but also limits how far the search for the "next prime number" can go. Instead, use a while loop with break, so that you leave the loop as soon as you have found the prime number (of course, if the prompt states that the number is less than 501, your approach makes sense).
Furthermore, to determine whether an integer is prime you only need to check the integers less than or equal to its square root (which in Python can be written as num**0.5): divisors always come in pairs, and the smaller divisor of each pair is never larger than the square root.
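A possible sketch of those two suggestions together, assuming Python 3.8+ so that math.isqrt is available (it is used here in place of num**0.5 to stay in integer arithmetic). The names is_prime and next_prime are illustrative, not taken from the course.

```python
from math import isqrt  # Python 3.8+


def is_prime(num):
    """Trial division by every integer from 2 up to the integer square root."""
    if num < 2:
        return False
    for d in range(2, isqrt(num) + 1):
        if num % d == 0:
            return False
    return True


def next_prime(n):
    """Smallest prime strictly greater than n."""
    num = n + 1
    while True:
        if is_prime(num):
            break      # stop as soon as a prime is found
        num += 1
    return num


print(next_prime(7))    # 11
print(next_prime(13))   # 17
print(next_prime(100))  # 101
```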

How does this python loop check for primality, if it loops less than the number n?

Hi guys so I was wondering how is this code:
def is_prime(n):
    for i in range(2, int(n**.5 + 1)):
        if n % i == 0:
            return False
    return True
able to check for prime when on line 2: for i in range(2, int(n**.5 + 1)): the range is not : range(2, n)? Shouldn't it have to iterate through every number till n but excluding it? This one is not doing that but somehow it works... Could someone explain why it works please.

The loop iterates over all numbers from 2 to the square root of n. For any divisor it could find above that square root (if it continued iterating to n - 1), there would necessarily be a matching divisor below it, which the loop has already tested.
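One way to convince yourself is to compare the square-root version against the full range(2, n) version on many inputs. This is only a sketch: is_prime_sqrt is the function from the question with one assumption added, a guard for n < 2 (as posted, the function returns True for 0 and 1), and is_prime_full is a made-up reference implementation.

```python
def is_prime_sqrt(n):
    """The function from the question, plus a guard for n < 2."""
    if n < 2:
        return False
    for i in range(2, int(n**.5 + 1)):
        if n % i == 0:
            return False
    return True


def is_prime_full(n):
    """Reference version that tries every i in range(2, n)."""
    return n >= 2 and all(n % i for i in range(2, n))


# Both versions agree on every n below 2000
agree = all(is_prime_sqrt(n) == is_prime_full(n) for n in range(2000))
print(agree)  # True
```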

Because the prime factorisation of any number n (by trial division) needs only check the prime numbers up to sqrt(n). Furthermore, the trial factors need go no further than sqrt(n) because, if n is divisible by some number p, then n = p × q, and if q were smaller than p, n would have been detected earlier as being divisible by q or by a prime factor of q.
On a sidenote, trial division is slow to check for primality or possible primality. There are faster probabilistic tests like the Miller-Rabin test which can check quickly if a number is composite or probably prime.
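For reference, a compact sketch of the Miller-Rabin test mentioned above. It relies on an assumption stated here and in the code: using the first twelve primes as bases is commonly cited as making the test deterministic for n below roughly 3.3 × 10^24; beyond that, a True result should be read as "probably prime". The name is_probable_prime is made up for this example.

```python
def is_probable_prime(n):
    """Miller-Rabin with fixed bases (assumed deterministic below ~3.3e24)."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p          # small prime, or a multiple of one
    # Write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)           # modular exponentiation
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False           # a is a witness: n is composite
    return True


print(is_probable_prime(6857))          # True
print(is_probable_prime(600851475143))  # False
print(is_probable_prime(2**61 - 1))     # True
```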

My python program won't execute or show anything in the terminal

So I was trying to solve a Project Euler question that asks us to find the largest prime factor of 600851475143.
This is my code:
factors = [i for i in range(1,600851475144) if 600851475143%i is 0]
prime_factors = []
for num in factors:
    factors_of_num = [i for i in range(1, num+1) if num%i is 0]
    if factors_of_num == [1, num]:
        prime_factors.append(num)
print(max(prime_factors))
The issue is that this code won't run for a large number like this. How can I get this to work?

Your program is executing, but range(1, 600851475144) is just taking a rrrrrrrrrrrrrrrrrrrrrrrreally long time. There are much better ways to get prime factors instead of first checking each number individually whether it is a divisor and then checking which of those are primes.
First, for each pair of divisors p * q = n, either p or q has to be <= sqrt(n), so you'd in fact only have to check the numbers in range(1, 775147) to get one part of those pairs and get the other for free. This alone should be enough to make your program finish in time. But you'd still get all the divisors, and then have to check which of those are prime.
Next, you do not actually have to get all the factors of those divisors to determine whether they are prime: you can use any to stop as soon as you find the first non-trivial factor. And here, too, testing up to sqrt(num) is enough. (Also, you could start with the largest divisor, so you can stop the loop as soon as you find the first one that's prime.)
Alternatively, as soon as you find a divisor, divide the target number by that divisor until it cannot be divided any more, then continue with the new, smaller target number and the next potential divisor. This way, all your divisors are guaranteed to be prime (otherwise the number would already have been reduced by its prime factors), and you will also need far fewer tests (unless the number itself is prime).
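The "divide it out" approach from the last paragraph could look roughly like this. It is a sketch under the assumption that n >= 2, and largest_prime_factor is a made-up name:

```python
def largest_prime_factor(n):
    """Repeatedly divide n by each divisor found; assumes n >= 2."""
    largest = None
    d = 2
    while d * d <= n:
        while n % d == 0:   # d is necessarily prime here: smaller
            largest = d     # factors have already been divided out
            n //= d
        d += 1
    if n > 1:               # whatever remains is a prime factor
        largest = n
    return largest


print(largest_prime_factor(13195))         # 29
print(largest_prime_factor(600851475143))  # 6857
```

Because the target shrinks every time a factor is divided out, the loop only has to run up to 1471 for this input rather than up to 775146.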

Reduce time complexity of brute forcing - largest prime factor

I am writing a code to find the largest prime factor of a very large number.
Problem 3 of Project Euler :
What is the largest prime factor of the number 600851475143 ?
I coded it in C...but the data type long long int is not sufficient enough to hold the value .
Now, I have rewritten the code in Python. How can I reduce the time taken for execution (as it is taking a considerable amount of time)?
def isprime(b):
    x=2
    while x<=b/2:
        if(b%x)==0:
            return 0
        x+=1
    return 1
def lpf(a):
    x=2
    i=2
    while i<=a/2:
        if a%i==0:
            if isprime(i)==1:
                if i>x:
                    x=i
                    print(x)
        i+=1
    print("final answer"+x)
z=600851475143
lpf(z)

There are many possible algorithmic speed ups. Some basic ones might be:
First, if you are only interested in the largest prime factor, you should search from the largest candidates downwards, not from the smallest upwards. So instead of looping from 2 to a/2, try checking from a down to 2.
You could load a precomputed table of primes instead of using an isprime function (there are dozens of such files on the net).
Also, 2 is the only even prime, so after handling 2 you can step by 2 in each iteration and test only odd numbers.
Your isprime checker could also be sped up: you do not have to look for divisors up to b/2; it is enough to check up to sqrt(b), which reduces the complexity from O(n) to O(sqrt(n)) (assuming the modulo operation takes constant time).

You could use the 128-bit integer type (__int128) provided by GCC: http://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html . This way, you can continue to use C and avoid having to optimize Python's speed. In addition, you can always add your own custom storage type to hold numbers bigger than long long in C.

I think you're checking too many numbers (incrementing by 1 and starting at 2 in each case). If you want to check is_prime by trial division, you need to divide by fewer numbers: only odd numbers to start (better yet, only primes). You can range over odd numbers in python the following way:
for x in range(3, some_limit, 2):
    if some_number % x == 0:
        etc.
In addition, once you have a list of primes, you should be able to run through that list backwards (because the question asks for highest prime factor) and test if any of those primes evenly divides into the number.
Lastly, people usually go up to the square-root of a number when checking trial division because anything past the square-root is not going to provide new information. Consider 100:
1 x 100
2 x 50
4 x 25
5 x 20
10 x 10
20 x 5
etc.
You can find all the important divisor information by just checking up to the square root of the number. This tip is useful both for testing primes and for testing where to start looking for a potential divisor for that huge number.
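The table for 100 can be generated by looping only up to the square root and reading the partner divisor off each hit. A small sketch, assuming Python 3.8+ for math.isqrt:

```python
from math import isqrt  # Python 3.8+

n = 100
# Each divisor d <= sqrt(n) gives its partner n // d for free
pairs = [(d, n // d) for d in range(1, isqrt(n) + 1) if n % d == 0]
print(pairs)  # [(1, 100), (2, 50), (4, 25), (5, 20), (10, 10)]
```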

First off, your two while loops only need to go up to the sqrt(n) since you will have hit anything past that earlier (you then need to check a/i for primeness as well). In addition, if you find the lowest number that divides it, and the result of the division is prime, then you have found the largest.
First, correct your isprime function:
from math import sqrt
def isprime(b):
    x=2
    sqrtb = sqrt(b)
    while x<=sqrtb:
        if(b%x)==0:
            return 0
        x+=1
    return 1
Then, your lpf:
def lpf(a):
    x=a # if no divisor is found below, a itself is prime
    i=2
    sqrta = sqrt(a)
    while i<=sqrta:
        if a%i==0:
            b = a//i # integer cofactor of i
            if isprime(b):
                return b # b shrinks as i grows, so the first prime b is the largest
            if isprime(i):
                x=i
                print(x)
        i+=1
    return x
