Issues in implementing Sieve of Eratosthenes algorithm

This is the 10th problem in Project Euler, in which we are supposed to find the sum of all the prime numbers below 2 million. I am using the Sieve of Eratosthenes algorithm to find the prime numbers, and I am now facing a performance issue with it.
The performance goes down by a substantial amount if the print(i,"",sum_of_prime) is kept inside the loop. Is there any way to see it working and keep the performance? If this is done with the conventional method, it takes some 13 minutes to get the result.
#Euler 10
#Problem:
#The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
#Find the sum of all the primes below two million.
#Author: Kshithij Iyer
#Date of creation: 15/1/2017
import time
#Recording start time of the program
start=time.time()
def sum_of_prime_numbers(limit):
    "A function to get the sum prime numbers till the limit"
    #Variable to store the sum of prime numbers
    sum_of_prime=0
    #Sieve of Eratosthenes algorithm
    sieve=[True]*limit
    for i in range(2,limit):
        if sieve[i]:
            sum_of_prime=sum_of_prime+i
            for j in range(i*i,limit,i):
                sieve[j]=False
            print(i,"",sum_of_prime)
    print("The sum of all the prime numbers below",limit,"is",sum_of_prime)
    return
#sum_of_prime_numbers(10)
sum_of_prime_numbers(2000001)
print("Execution time of program is",time.time()-start)
#Note:
#I did give the conventional method a try but it didn't work well and was taking
#some 13 minutes to get the results.
#Algorithm for reference
#Input: an integer n > 1
#Let A be an array of Boolean values, indexed by integers 2 to n,
#initially all set to true.
#for i = 2, 3, 4, ..., not exceeding √n:
#    if A[i] is true:
#        for j = i^2, i^2+i, i^2+2i, i^2+3i, ..., not exceeding n:
#            A[j] := false
#Output: all i such that A[i] is true.

There are several improvements that could be made here:
First, due to the nature of Eratosthenes' sieve, you can replace for i in range(2,limit): with for i in range(2,int(limit**0.5)+1): and the array will still be computed correctly, but much faster. As a result, however, you have to sum the primes in a separate pass afterwards.
Also, you are not going to be able to read every individual prime, nor would you want to. Instead, have the program report only milestones, for example each time i passes a multiple of some chosen step.
Your sieve list is indexed from 0, so sieve[0] and sieve[1] exist and stay True. Because your loop starts at 2, they are never added to the sum, so this is harmless as written.
It does matter once you sum over the whole list in a separate pass: 0 and 1 would then be counted as prime unless you set sieve[0] and sieve[1] to False first. This is an easy fix.
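A minimal sketch of these suggestions put together, in Python 3.8+ (for math.isqrt). The function name sum_of_primes_below and the report_every parameter are hypothetical, chosen only for this illustration:

```python
import math


def sum_of_primes_below(limit, report_every=None):
    """Return the sum of all primes strictly below `limit`."""
    if limit < 3:
        return 0  # there are no primes below 2
    sieve = [True] * limit
    sieve[0] = sieve[1] = False  # 0 and 1 are not prime
    # Crossing out only needs to run up to the square root of the largest index.
    for i in range(2, math.isqrt(limit - 1) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
        # Optional progress report at milestones instead of at every prime.
        if report_every and i % report_every == 0:
            print("sieved up to", i)
    # The sum is taken in a separate pass over the finished sieve.
    return sum(i for i, is_p in enumerate(sieve) if is_p)


print(sum_of_primes_below(10))  # the example from the problem statement: 17
```

Printing only at milestones keeps the output visible without paying for a print call on every one of the roughly 150,000 primes below two million.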

Related

Why does this 'optimized' prime checker run at the same speed as the regular version?

Given this plain is_prime1 function, which checks all the divisors from 2 to sqrt(p), with some bit-playing to avoid even numbers, which are of course not primes:
import time
def is_prime1(p):
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for k in range(2, int(p ** 0.5) + 1):
        if p % k == 0:
            return False
    return True
Versus this "optimized" version. The idea is to save all the primes we have found up to a certain number p, then iterate over those primes (using the basic arithmetic rule that every number is a product of primes), so we don't iterate through all the numbers up to sqrt(p) but only over the primes we found, which are supposed to be a tiny fraction of the numbers up to sqrt(p). We also iterate over only half the elements, because the largest prime most certainly won't "fit" into the number p.
import time
global mem
global lenMem
mem = [2]
lenMem = 1
def is_prime2(p):
    global mem
    global lenMem
    # if p is even then the LSD is off
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for div in mem[0: int(p ** 0.5) + 1]:
        if p % div == 0:
            return False
    mem.append(p)
    lenMem += 1
    return True
The only idea I have in mind is that "global variables are expensive and time consuming" but I don't know if there is another way, and if there is, will it really help?
On average, when running this same program:
start = time.perf_counter()
for p in range(2, 100000):
    print(f'{p} is a prime? {is_prime2(p)}') # change to is_prime1 or is_prime2
end = time.perf_counter()
I get that for is_prime1 the average time for checking the numbers 1-100K is ~0.99 seconds, and the same goes for is_prime2 (maybe a difference of +0.01s on average; maybe, as I said, the use of global variables ruins some performance?)

The difference is a combination of three things:
You're just not doing that much less work. Your test case includes testing a ton of small numbers, where the distinction between testing "all numbers from 2 to square root" and testing "all primes from 2 to square root" just isn't that much of a difference. Your "average case" is roughly the midpoint of the range, 50,000, which has a square root of 223.6. That means testing 48 primes instead of 222 numbers if the number is prime. But most numbers aren't prime, and most numbers have at least one small factor (proof left as exercise), so you short-circuit and don't actually test most of the numbers in either set (if there's a factor below 8, which applies to ~77% of all numbers, you've saved maybe two tests by limiting yourself to primes).
You're slicing mem every time, which is performed eagerly, and completely, even if you don't use all the values (and as noted, you almost never do for the non-primes). This isn't a huge cost, but then, you weren't getting huge savings from skipping non-primes, so it likely eats what little savings you got from the other optimization.
(You found this one, good show) Your slice of primes took a number of primes to test equal to the square root of number to test, not all primes less than the square root of the number to test. So you actually performed the same number of tests, just with different numbers (many of them primes larger than the square root that definitely don't need to be tested).
A side-note:
Your up-front tests aren't actually saving you much work; you redo both tests in the loop, so they're wasted effort when the number is prime (you test them both twice). And your test for divisibility by five is pointless; % 10 is no faster than % 5 (computers don't operate in base-10 anyway), and if not p % 5: is a slightly faster, more direct, and more complete (your test doesn't recognize multiples of 10, just multiples of 5 that aren't multiples of 10) way to test for divisibility.
The tests are also wrong, because they don't exclude the base case (they say 2 and 5 are not prime, because they're divisible by 2 and 5 respectively).
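A minimal sketch of how these points could be addressed together, assuming (as in the question's test loop) that the function is called with increasing values of p. The names primes_found and is_prime_cached are hypothetical. The loop walks the cached primes directly instead of slicing, and stops at the first prime whose square exceeds p, so it tests only primes up to the square root:

```python
primes_found = [2]  # hypothetical cache of primes, kept in increasing order


def is_prime_cached(p):
    """Trial division by cached primes; assumes calls arrive in increasing order of p."""
    if p < 2:
        return False
    for div in primes_found:
        if div * div > p:
            break  # no prime divisor up to sqrt(p), so p is prime
        if p % div == 0:
            return False
    # Only record p if it extends the cache (avoids duplicates on repeated calls).
    if p > primes_found[-1]:
        primes_found.append(p)
    return True


print([n for n in range(2, 30) if is_prime_cached(n)])
```

There are no separate up-front tests for 2 and 5 here: both are ordinary entries in the cache, and the base cases 2 and 5 are reported as prime because the loop breaks before dividing them by themselves.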

First of all, you should remove the print call; it is very time-consuming.
You should just time your function, not the print function, so you could do it like this:
start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime2(p)}') # change to is_prime1 or is_prime2
    is_prime1(p)
end = time.perf_counter()
print ("prime1", end-start)
start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime2(p)}') # change to is_prime1 or is_prime2
    is_prime2(p)
end = time.perf_counter()
print ("prime2", end-start)
is_prime1 is still faster for me.
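As an alternative way to take such measurements, here is a small sketch using the standard timeit module. The function is_prime_trial is only a stand-in so that the example runs on its own, and check_range is a hypothetical helper; neither comes from the question:

```python
import timeit


def is_prime_trial(p):
    """Plain trial division up to sqrt(p), used here only as something to time."""
    if p < 2:
        return False
    k = 2
    while k * k <= p:
        if p % k == 0:
            return False
        k += 1
    return True


def check_range(func, stop=20000):
    """Call func on every number below stop without printing anything."""
    return sum(1 for p in range(2, stop) if func(p))


# Best of three runs; the only output happens once, outside the timed region.
best = min(timeit.repeat(lambda: check_range(is_prime_trial), number=1, repeat=3))
print(f"is_prime_trial: {best:.3f} s")
```

Taking the minimum of several runs reduces the influence of other processes on the measurement.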

If you want to hold primes in global memory to accelerate multiple calls, you need to ensure that the primes list is properly populated even when the function is called with numbers in random order. The way is_prime2() stores and uses the primes assumes that, for example, it is called with 7 before being called with 343. If not, 343 will be treated as a prime because 7 is not yet in the primes list.
So the function must compute and store all primes up to √343 (about 18.5) before it can respond to the is_prime(343) call.
To build a primes list quickly, the Sieve of Eratosthenes is one of the fastest methods. But since you don't know in advance how many primes you need, you can't allocate the sieve's bit flags in advance. What you can do is use a rolling window of the sieve to move forward by chunks (of, let's say, 1000000 bits at a time). When a number beyond your maximum prime is requested, you just generate more primes chunk by chunk until you have enough to respond.
Also, since you're going to build a list of primes, you might as well make it a set and check if the requested number is in it to respond to the function call. This will require generating more primes than needed for divisions but, in the spirit of accelerating subsequent calls, that should not be an issue.
Here's an example of an isPrime() function that uses that approach:
primes = {3}
sieveMax = 3
sieveChunk = 1000000 # must be an even number
def isPrime(n):
    if not n&1: return n==2
    global primes,sieveMax, sieveChunk
    # sieveMax is the first number not yet sieved, so n == sieveMax also needs a new chunk
    while n>=sieveMax:
        base,sieveMax = sieveMax, sieveMax + sieveChunk
        sieve = [True]* sieveChunk
        # cross out multiples of the known primes inside this chunk
        for p in primes:
            i = (p - base%p)%p
            sieve[i::p]=[False]*len(sieve[i::p])
        # base is odd, so even offsets are odd numbers; the survivors are new primes
        for i in range(0, sieveChunk,2):
            if not sieve[i]: continue
            p = i + base
            primes.add(p)
            sieve[i::p] = [False]*len(sieve[i::p])
    return n in primes
On the first call to an unknown prime, it will perform slower than the divisions approach but as the prime list builds up, it will provide much better response time.

How could I make my prime generator function faster

To gain hands-on experience, I'm trying to solve the problems on SPOJ. The problem in the link asks to find all prime numbers between two given numbers. Here is how I implemented it with Python 2.7:
# printing all prime numbers between given two inputs
import math
def findPrimes(num1,num2):
    for num in range(num1,num2+1):
        isPrime=True
        for i in range(2,int(math.sqrt(num))+1):
            if num%i==0:
                isPrime=False
                break
        if isPrime:
            print num
def main():
    inputs=[]
    numOfTestCases=int(raw_input())
    while(numOfTestCases>0):
        line=raw_input()
        numbers=line.split()
        inputs.append(numbers)
        numOfTestCases-=1
    for testCase in inputs:
        findPrimes(int(testCase[0]),int(testCase[1]))
        print ""
main()
However, when I submit the code, I get a time limit exceeded error. How could I make my code fast enough?

You should use the Sieve of Eratosthenes; it is quite simple. First you mark all numbers as prime. Then, for each prime, you remove its multiples from the prime list. Its time complexity is near-linear, O(n log log n). Something like this:
N = 1000
is_prime = [1]*N
for i in xrange(2,N):
    if is_prime[i]:
        for j in xrange(2*i,N,i):
            is_prime[j] = 0
This implementation should do just fine. There are some extra optimizations that you can find in the link above.
Note that 0 and 1 are not prime.

No, the numbers aren't huge in spoj/PRIME1. The sieve of Eratosthenes works extremely well there, but even trial division gets you through there, if you test by primes, and test odds only (or better, only 6-coprimes or 30-coprimes).
You need only find primes below the square root of your top limit, in advance. sqrt(10^9) is about 32,000, so there are only about 3,400 primes to maintain. That's nothing.
6-coprimes: numbers coprime with 6, i.e. with 2 and 3, so there's no need to test divide them by 2 nor 3, when testing for primes. You need to find a way to generate them directly, so there won't be any multiples of 2 and 3 among the numbers you need to test, by construction.
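A rough Python 3.8+ sketch of both ideas, under the assumption that each query is an inclusive range [m, n] with n up to about 10^9. The function names small_primes, primes_in_range and six_coprimes are hypothetical. The first two functions precompute the primes up to sqrt(n) and use them to sieve only the requested segment; the generator produces the 6-coprimes directly, without ever visiting a multiple of 2 or 3:

```python
import math
from itertools import islice


def small_primes(limit):
    """All primes <= limit, by a plain Sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def primes_in_range(m, n):
    """Primes p with m <= p <= n, sieving only the segment [m, n]."""
    m = max(m, 2)
    if n < m:
        return []
    segment = [True] * (n - m + 1)
    for p in small_primes(math.isqrt(n)):
        # First multiple of p inside the segment, but never p itself.
        start = max(p * p, (m + p - 1) // p * p)
        for multiple in range(start, n + 1, p):
            segment[multiple - m] = False
    return [m + i for i, flag in enumerate(segment) if flag]


def six_coprimes():
    """Yield 5, 7, 11, 13, 17, 19, ...: the numbers of the form 6k-1 and 6k+1."""
    n = 5
    while True:
        yield n
        yield n + 2
        n += 6


print(primes_in_range(1, 10))
print(list(islice(six_coprimes(), 6)))
```

The segment list has only n - m + 1 entries, so memory depends on the width of the query and not on the size of n.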

Project Euler 2 python3

I've got, what I think is a valid solution to problem 2 of Project Euler (finding all even numbers in the Fibonacci sequence up to 4,000,000). This works for lower numbers, but crashes when I run it with 4,000,000. I understand that this is computationally difficult, but shouldn't it just take a long time to compute rather than crash? Or is there an issue in my code?
import functools
def fib(limit):
    sequence = []
    for i in range(limit):
        if(i < 3):
            sequence.append(i)
        else:
            sequence.append(sequence[i-1] + sequence[i-2])
    return sequence
def add_even(x, y):
    if(y % 2 == 0):
        return x + y
    return x + 0
print(functools.reduce(add_even,fib(4000000)))

The problem is about getting the Fibonacci numbers that are smaller than 4000000. Your code tries to find the first 4000000 Fibonacci values instead. Since Fibonacci numbers grow exponentially, this will reach numbers too large to fit in memory.
You need to change your function to stop when the last calculated value is more than 4000000.
Another possible improvement is to add the numbers as you are calculating them instead of storing them in a list, but this won't be necessary if you stop at the appropriate time.
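A minimal sketch of both suggestions combined: the loop stops as soon as a Fibonacci value exceeds the limit, and the even values are added to a running total instead of being stored in a list. The function name sum_even_fib is hypothetical:

```python
def sum_even_fib(limit):
    """Sum the even Fibonacci numbers that do not exceed `limit`."""
    total = 0
    a, b = 1, 2
    while a <= limit:  # stop on the value, not on the number of terms
        if a % 2 == 0:
            total += a
        a, b = b, a + b
    return total


print(sum_even_fib(100))  # 2 + 8 + 34 = 44
```

Only about thirty Fibonacci numbers are below four million, so this finishes immediately.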

Reduce time complexity of brute forcing - largest prime factor

I am writing code to find the largest prime factor of a very large number.
Problem 3 of Project Euler:
What is the largest prime factor of the number 600851475143?
I coded it in C, but the data type long long int is not sufficient to hold the value.
Now, I have rewritten the code in Python. How can I reduce the time taken for execution (as it is taking a considerable amount of time)?
def isprime(b):
    x=2
    while x<=b/2:
        if(b%x)==0:
            return 0
        x+=1
    return 1
def lpf(a):
    x=2
    i=2
    while i<=a/2:
        if a%i==0:
            if isprime(i)==1:
                if i>x:
                    x=i
                    print(x)
        i+=1
    print("final answer", x)
z=600851475143
lpf(z)

There are many possible algorithmic speed ups. Some basic ones might be:
First, if you are only interested in the largest prime factor, you should check candidates starting from the largest possible ones, not the smallest. So instead of looping from 2 to a/2, try checking from a down to 2.
You could load a database of primes instead of using the isprime function (there are dozens of such files on the net).
Also, only odd numbers can be prime (except for 2), so you can "jump" 2 values in each iteration.
Your isprime checker could also be sped up: you do not have to look for divisors up to b/2; it is enough to check up to sqrt(b), which reduces the complexity from O(n) to O(sqrt(n)) (assuming that the modulo operation takes constant time).

You could use the 128-bit integer type (__int128) provided by GCC: http://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html . This way, you can continue to use C and avoid having to optimize Python's speed. In addition, you can always add your own custom storage type to hold numbers bigger than long long in C.

I think you're checking too many numbers (incrementing by 1 and starting at 2 in each case). If you want to check is_prime by trial division, you need to divide by fewer numbers: only odd numbers to start (better yet, only primes). You can range over odd numbers in python the following way:
for x in range(3, some_limit, 2):
    if some_number % x == 0:
        etc.
In addition, once you have a list of primes, you should be able to run through that list backwards (because the question asks for highest prime factor) and test if any of those primes evenly divides into the number.
Lastly, people usually go up to the square-root of a number when checking trial division because anything past the square-root is not going to provide new information. Consider 100:
1 x 100
2 x 50
5 x 20
10 x 10
20 x 5
etc.
You can find all the important divisor information by just checking up to the square root of the number. This tip is useful both for testing primes and for testing where to start looking for a potential divisor for that huge number.
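A short sketch of trial division over odd candidates up to the square root. One step here goes beyond what is described above and is an addition for this illustration: each factor found is divided out of n, which shrinks the square-root bound as the loop proceeds. The function name largest_prime_factor is hypothetical:

```python
def largest_prime_factor(n):
    """Largest prime factor of an integer n >= 2, by trial division."""
    largest = None
    # Strip factors of 2 first so that only odd candidates remain.
    while n % 2 == 0:
        largest = 2
        n //= 2
    candidate = 3
    # Only candidates up to the square root of the remaining n need checking.
    while candidate * candidate <= n:
        while n % candidate == 0:
            largest = candidate
            n //= candidate
        candidate += 2
    # Whatever is left above 1 is itself prime and larger than any factor found.
    if n > 1:
        largest = n
    return largest


print(largest_prime_factor(13195))  # 13195 = 5 * 7 * 13 * 29, so this prints 29
```

Because smaller factors are removed before larger candidates are tried, every candidate that divides n at that point is automatically prime, so no separate primality test is needed.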

First off, your two while loops only need to go up to the sqrt(n) since you will have hit anything past that earlier (you then need to check a/i for primeness as well). In addition, if you find the lowest number that divides it, and the result of the division is prime, then you have found the largest.
First, correct your isprime function:
from math import sqrt
def isprime(b):
    x=2
    sqrtb = sqrt(b)
    while x<=sqrtb:
        if(b%x)==0:
            return 0
        x+=1
    return 1
Then, your lpf:
def lpf(a):
    x=2
    i=2
    sqrta = sqrt(a)
    while i<=sqrta:
        if a%i==0:
            b = a//i # integer
            if isprime(b):
                return b
            if isprime(i):
                x=i
                print(x)
        i+=1
    return x

Calculations on sliding windows and memoization

I am working on Project Euler Problem 50, which states:
The prime 41, can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
To determine the terms for a prime P (if it can be written as a sum of primes at all), I use a sliding window over all the primes (in increasing order) up to (but not including) P, and calculate the sum of each of these windows. If the sum is equal to the prime considered, I count the length of the window...
This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memoization would help; when calculating the sum of sliding windows, a lot of double work is done...(right?)
So I found the standard memoization implementation on the net and just pasted it in my code. Is this correct? (I have no idea how it is supposed to work here...)
primes = tuple(n for n in range(1, 10**6) if is_prime(n)==True)
count_best = 0
##http://docs.python.org/release/2.3.5/lib/itertools-example.html:
## Slightly modified (first for loop)
from itertools import islice
def window(seq):
    for n in range(2, len(seq) + 1):
        it = iter(seq)
        result = tuple(islice(it, n))
        if len(result) == n:
            yield result
        for elem in it:
            result = result[1:] + (elem,)
            yield result
def memoize(function):
    cache = {}
    def decorated_function(*args):
        if args in cache:
            return cache[args]
        else:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function
#memoize
def find_lin_comb(prime):
    global count_best
    for windows in window(primes[0 : primes.index(prime)]):
        if sum(windows) == prime and len(windows) > count_best:
            count_best = len(windows)
            print('Prime: ', prime, 'Terms: ', count_best)
##Find them:
for x in primes[::-1]: find_lin_comb(x)
(btw, the tuple of prime numbers is generated "decently" fast)
All input is appreciated. I am just a hobby programmer, so please don't get too advanced on me.
Thank you!
Edit: here is a working code paste that doesn't have ruined indentation:
http://pastebin.com/R1NpMqgb

This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memoization would help; when calculating the sum of sliding windows, a lot of double work is done...(right?)
Yes, right. And of course it's slow for the primes up to 10**6.
Say you have n primes up to N, numbered in increasing order, p_1 = 2, p_2 = 3, .... When considering whether prime no. k is the sum of consecutive primes, you consider all windows [p_i, ..., p_j], for pairs (i,j) with i < j < k. There are (k-1)*(k-2)/2 of them. Going through all k up to n, you examine about n³/6 windows in total (counting multiplicity, you're examining w(i,j) in total n-j times). Even ignoring the cost of creating the window and summing it, you can see how it scales badly:
For N = 1000, there are n = 168 primes and about 790000 windows to examine (counting multiplicity).
For N = 10**6, there are n = 78498 primes and about 8.3*10**13 windows to examine.
Now factor in the work for creating and summing the windows, estimate it low at j-i+1 for summing the j-i+1 primes in w(i,j), the work for p_k is about k³/6, and the total work becomes roughly n**4/24. Something like 33 million steps for N = 1000, peanuts, but nearly 1.6*10**18 for N = 1000000.
A year contains about 3.1*10**7 seconds, with a ~3GHz CPU, that's roughly 10**17 clock cycles. So we're talking of an operation needing something like 100 CPU-years (may be a factor of 10 off or so).
You aren't willing to wait that long, I suppose;)
Now, with memoisation, you still look at each window multiple times, but you do the computation of each window only once. That means you need about n³/6 work for the computation of the windows, and look about n³/6 times at any window.
Problem 1: You still need to look at windows about 8.3*10**13 times, that's several hours even if looking cost only one cycle.
Problem 2: There are about 8.3*10**13 windows to memoise. You don't have that much memory, unless you can use a really large HD.
You can circumvent the memory problem by throwing away data you don't need anymore and only calculating data for the windows when it is needed, but once you know which data you may throw away when, you should be able to see a much better approach.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
What does this tell you about the window generating that sum? Where can it start, where can it stop? How can you use that information to create an efficient algorithm to solve the problem?
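One possible way to act on these hints, sketched under my own assumptions and not taken from the answer itself: store prefix sums of the primes so that any window sum is a single subtraction, and only ever try windows longer than the best one found so far. The function name longest_consecutive_prime_sum is hypothetical:

```python
def longest_consecutive_prime_sum(limit):
    """Return (prime, terms) for the prime below `limit` that is the sum
    of the most consecutive primes, or (None, 0) if there is none."""
    # Sieve of Eratosthenes; also reused as an O(1) primality lookup.
    sieve = [True] * max(limit, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    primes = [i for i in range(limit) if sieve[i]]

    # prefix[k] is the sum of the first k primes, so the window
    # primes[i:j] has sum prefix[j] - prefix[i].
    prefix = [0]
    for p in primes:
        prefix.append(prefix[-1] + p)

    best_len, best_prime = 0, None
    for i in range(len(primes)):
        # Only windows longer than the current best are interesting.
        j = i + best_len + 1
        if j >= len(prefix) or prefix[j] - prefix[i] >= limit:
            break  # later start points only give larger sums
        while j < len(prefix) and prefix[j] - prefix[i] < limit:
            s = prefix[j] - prefix[i]
            if sieve[s]:
                best_len, best_prime = j - i, s
            j += 1
    return best_prime, best_len


print(longest_consecutive_prime_sum(100))   # (41, 6), as in the problem statement
print(longest_consecutive_prime_sum(1000))  # (953, 21), as in the problem statement
```

Because the primes are increasing, a window of a given length has its smallest sum when it starts at the first prime, which is why the outer loop can stop as soon as a window one longer than the best already overshoots the limit.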

The memoize decorator adds a wrapper to a function to cache the return value for each value of the argument (each combination of values in case of multiple arguments). It is useful when the function is called multiple times with the same arguments. You can only use it with a pure function, i.e.
The function has no side effects. Changing a global variable and doing output are examples of side effects.
The return value depends only on the values of the arguments, not on some global variables that may change values between calls.
Your find_lin_comb function does not satisfy the above criteria. For one thing, it is called with a different argument every time, for another, the function does not return a value.
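For contrast, here is a small sketch of a function that does meet both criteria, wrapped in a decorator equivalent to the memoize from the question. The function fib is only an illustration and has nothing to do with the prime problem:

```python
def memoize(function):
    """Cache return values keyed by the positional arguments."""
    cache = {}

    def decorated_function(*args):
        if args not in cache:
            cache[args] = function(*args)
        return cache[args]

    return decorated_function


@memoize
def fib(n):
    # Pure: no side effects, and the result depends only on n.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))  # 832040
```

Here the same arguments recur many times (fib(28) is needed by both fib(30) and fib(29)), and each call returns a value that can be stored, which is exactly the situation in which the cache pays off. The standard library offers the same behaviour through functools.lru_cache.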
