Fastest mutable data holder

Fastest mutable data holder - python

I recently downloaded the bitarray module from here, for a faster prime sieve, but the results are dismal.
from bitarray import bitarray
from numpy import ones
from timeit import timeit
def sieve1(n = 1000000):
'''Sieve of Eratosthenes (bitarray)'''
l = (n - 1)/2; a = bitarray(l); a.setall(True)
for i in xrange(500):
if a[i]:
s = i+i+3; t = (s*s-3)/2; a[t:l:s] = False
return [2] + [x+x+3 for x in xrange(l) if a[x]]
def sieve2(n = 1000000):
'''Sieve of Eratosthenes (list)'''
l = (n - 1)/2; a = [True] * l
for i in xrange(500):
if a[i]:
s = i+i+3; t = (s*s-3)/2; u = l-t-1
a[t:l:s] = [False] * (u/s + 1)
return [2] + [x+x+3 for x in xrange(l)]
def sieve3(n = 1000000):
'''Sieve of Eratosthenes (numpy.ones)'''
l = (n - 1)/2; a = ones(l, dtype=bool)
for i in xrange(500):
if a[i]:
s = i+i+3; t = (s*s-3)/2; a[t:l:s] = False
return [2] + [x+x+3 for x in xrange(l)]
print timeit(sieve1, number=10)
print timeit(sieve2, number=10)
print timeit(sieve3, number=10)
Here are the result -
1.59695601594
0.666230770593
0.523708537583
The bitarray sieve is more than twice as slow as a list.
Does anyone have suggestions for better arrays? Anything should be faster than a python list, or so I thought.
numpy.ones is fastest, but I don't like numpy since it takes a long time to import it.
I'm basically looking for a fast data-holder, which is mutable, and can hold True and False.

The actual setting and clearing of bits is very fast in bitarray. What is making is slower building the return list. Instead of iterating through a range, and then testing each bit, take advantage of bitarray's support of iterating through the bits.
Try this:
def bitarray_sieve(n = 1000000):
'''Sieve of Eratosthenes (bitarray)'''
l = (n - 1)//2; a = bitarray(l); a.setall(True)
for i in range(500):
if a[i]:
s = i+i+3; t = (s*s-3)//2; a[t:l:s] = False
return [2] + [x+x+3 for x,b in enumerate(a) if b]
It runs in about 0.38 seconds on my machine while the list version takes about 0.47 seconds.
Have you looked at this question?
I maintain gmpy2 and I added the ability to iterate over the bits in an integer and to set/clear bits.
The following example takes about 0.16 seconds.
def gmpy2_sieve2(n=1000000):
'''Sieve of Eratosthenes (gmpy2, version 2)'''
l = (n - 1)//2; a = gmpy2.xbit_mask(l)
for i in range(500):
if a[i]:
s = i+i+3; t = (s*s-3)//2; u = l-t-1
a[t:l:s] = 0
return [2] + [x+x+3 for x in a.iter_set()]
The bottleneck is now the calculation x+x+3. The following solution doesn't skip sieving 2. It takes twice as much memory but it allow the bit positions to be used immediately. It takes about 0.08 seconds on my machine:
def gmpy2_sieve(limit=1000000):
'''Returns a generator that yields the prime numbers up to limit.
Bits are set to 1 if their position is composite.'''
sieve_limit = gmpy2.isqrt(limit) + 1
limit += 1
# Mark bit positions 0 and 1 as not prime.
bitmap = gmpy2.xmpz(3)
# Process 2 separately. This allows us to use p+p for the step size
# when sieving the remaining primes.
bitmap[4 : limit : 2] = -1
# Sieve the remaining primes.
for p in bitmap.iter_clear(3, sieve_limit):
bitmap[p*p : limit : p+p] = -1
return list(bitmap.iter_clear(2, limit))
For the actual setting/clearing of the bits, bitarray is faster than gmpy2. And bitarray has many capabilities that gmpy2 lacks. However, I couldn't find a faster method in bitarrary to get the index of which bits are set or clear.
BTW, your benchmark functions sieve2() and sieve3() return incorrect results; you are missing if a[x].

Related

Python - "Fast Exponention Algorithm" performance outperformed by "worse" algorithm

Can anyone explain to me how this code:
def pow1(a, n):
DP = [None] * (n+1)
DP[1] = a
i = [1]
while not i[-1] == n:
if(i[-1]*2 <= n):
DP[i[-1]*2] = DP[i[-1]]*DP[i[-1]]
i.append(i[-1]+i[-1])
else:
missing = n-i[-1]
low = 0
high = len(i) - 1
mid = 0
while low <= high:
mid = (high + low) // 2
if i[mid] < missing:
if i[mid+1] > missing:
break
low = mid + 1
elif i[mid] > missing:
high = mid - 1
else:
break
DP[i[-1]+i[mid]] = DP[i[-1]]*DP[i[mid]]
i.append(i[mid]+i[-1])
return DP[n]
out-performs this code:
def pow2(a, n):
res = 1
while (n > 0):
if (n & 1):
res = res * a
a = a * a
n >>= 1
return res
Here is how I check them:
a = 34 # just arbitrary
n = 2487665 # just arbitrary
starttime = timeit.default_timer()
pow1(a, n)
print("pow1: The time difference is :", (
timeit.default_timer() - starttime))
starttime = timeit.default_timer()
pow2(a, n)
print("pow2: The time difference is :",
(timeit.default_timer() - starttime))
This is the result on my MacBook Air m1:
# pow1: The time difference is : 3.71763225
# pow2: The time difference is : 6.091892
As far as I can tell they work very similarly only the first one (pow1) stores all its intermediate results therefore has like O(log(n)) space complexity and then has to find all the required factors(sub-products) to get the final result. So O(log(n)) calculating them all, worst case has to do log(n) binarySearches (again O(log(n)) resulting in a runtime of O( logN*log(logN) ). Where as pow2 essentially never has to search through previous results... so pow2 has Time Complexity: O(logN) and Auxiliary Space: O(1) vs pow1 - Time Complexity: O(logN*logN*logN) and Auxiliary Space: O(logN).
Maybe (probably) I'm missing something essential but I don't see how the algorithm, the hardware (ARM), python or the testing could have this impact.
Also I just realised the space complexity for pow1 is O(n) the way I did it.

Okey so I figured it out. In pow2 (which I implemented from cp-algorithms.com but can also be found on geeksforgeeks.org in python) there is a bug.
The problem is this line gets executed one time too many:
res = 1
while (n > 0):
if (n & 1):
res = res * a
a = a * a #<--- THIS LINE
n >>= 1
return res
that gets called even tough the result has already been calculated, causing the function to do one more unnecessary multiplication, which with big numbers has a big impact. Here would be a very quick fix:
def pow2(a, n):
res = 1
while (n > 0):
if n & 1:
res = res * a
if n == 1:
return res
a = a * a
n >>= 1
return res
New measurements:
# a:34 n:2487665
# pow1(): The time difference is : 3.749621834
# pow2(): The time difference is : 3.072042833
# a**n: The time difference is : 2.119430791000001

Great catch. It's a weakness of while loops that they lead to wasted computations at the bottom of the loop body. Usually it doesn't matter.
Languages like Ada provide an unconditional loop expecting that an internal break (exit in Ada) will be used to leave for this reason.
So fwiw, a cleaner code using this "break in the middle" style would be:
def pow2(a, n):
res = 1
while True:
if (n & 1):
res *= a
n >>= 1
if n == 0:
return res
a *= a
With other algorithms, you might need to guard the loop for the case n = 0. But with this one, that's optional. You could check for that and return 1 explicitly before the loop if desired.
Overall, this avoids a comparison per loop wrt your solution. Maybe with big numbers this is worthwhile.

Is my Sieve of Eratosthenes implemented correctly? (Python) [duplicate]

This question already has answers here:
Sieve of Eratosthenes - Finding Primes Python
(22 answers)
Closed 4 years ago.
I need to generate a large number of prime numbers, however it is taking far too long using the Sieve of Eratosthenes. Currently it takes roughly 3 seconds to generate primes under 100,000 and roughly 30 for primes under 1,000,000. Which seems to indicate an O(n) complexity but as far as I know that's not right. Code:
def generate_primes(limit):
boolean_list = [False] * 2 + [True] * (limit - 1)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n] == True:
for i in range(n ** 2, limit + 1, n):
boolean_list[i] = False
Am I missing something obvious? How can I improve the performance of the sieve?

Loop indexing is well known in Python to be an incredibly slow operation. By replacing a loop with array slicing, and a list with a Numpy array, we see increases # 3x:
import numpy as np
import timeit
def generate_primes_original(limit):
boolean_list = [False] * 2 + [True] * (limit - 1)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n] == True:
for i in range(n ** 2, limit + 1, n):
boolean_list[i] = False
return np.array(boolean_list,dtype=np.bool)
def generate_primes_fast(limit):
boolean_list = np.array([False] * 2 + [True] * (limit - 1),dtype=bool)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n]:
boolean_list[n*n:limit+1:n] = False
return boolean_list
limit = 1000
print(timeit.timeit("generate_primes_fast(%d)"%limit, setup="from __main__ import generate_primes_fast"))
# 30.90620080102235 seconds
print(timeit.timeit("generate_primes_original(%d)"%limit, setup="from __main__ import generate_primes_original"))
# 91.12803511600941 seconds
assert np.array_equal(generate_primes_fast(limit),generate_primes_original(limit))
# [nothing to stdout - they are equal]
To gain even more speed, one option is to use numpy vectorization. Looking at the outer loop, it's not immediately obvious how one could vectorize that.
Second, you will see dramatic speed-ups if you port to Cython, which should be a fairly seamless process.
Edit: you may also see improvements by changing things like n**2 => math.pow(n,2), but minor improvements like that are inconsequential compared to the bigger problem, which is the iterator.

If your are still using Python 2 use xrange instead of range for greater speed

Homework: Implementing the Z algorithm in python, it's really slow, slower than naive string search

I have to implement the Z algorithm and use it to search a target text for a specific pattern. I've implemented what I thought was the correct algorithm and search function using it but it's really slow. For the naive implementation of string search I consistently got times lower than 1.5 seconds and for the z string search I consistently got times over 3 seconds (for my biggest test case) so I have to be doing something wrong. The results seem to be correct, or were at least for the few test cases we were given. The code for the functions mentioned in my rant is below:
import sys
import time
# z algorithm a.k.a. the fundemental preprocessing algorithm
def z(P, start=1, max_box_size=sys.maxsize):
n = len(P)
boxes = [0] * n
l = -1
r = -1
for k in range(start, n):
if k > r:
i = 0
while k + i < n and P[i] == P[k + i] and i < max_box_size:
i += 1
boxes[k] = i
if i:
l = k
r = k + i - 1
else:
kp = k - l
Z_kp = boxes[kp]
if Z_kp < r - k + 1:
boxes[k] = Z_kp
else:
i = r + 1
while i < n and P[i] == P[i - k] and i - k < max_box_size:
i += 1
boxes[k] = i - k
l = k
r = i - 1
return boxes
# a simple string search
def naive_string_search(P, T):
m = len(T)
n = len(P)
indices = []
for i in range(m - n + 1):
if P == T[i: i + n]:
indices.append(i)
return indices
# string search using the z algorithm.
# The pattern you're searching for is simply prepended to the target text
# and than the z algorithm is run on that concatenation
def z_string_search(P, T):
PT = P + T
n = len(P)
boxes = z(PT, start=n, max_box_size=n)
return list(map(lambda x: x[0]-n, filter(lambda x: x[1] >= n, enumerate(boxes))))

Your's implementation of z-function def z(..) is algorithmically ok and asymptotically ok.
It has O(m + n) time complexity in worst case while implementation of naive string search has O(m*n) time complexity in worst case, so I think that the problem is in your test cases.
For example if we take this test case:
T = ['a'] * 1000000
P = ['a'] * 1000
we will get for z-function:
real 0m0.650s
user 0m0.606s
sys 0m0.036s
and for naive string matching:
real 0m8.235s
user 0m8.071s
sys 0m0.085s
PS: You should understand that there are a lot of test cases where naive string matching works in linear time too, for example:
T = ['a'] * 1000000
P = ['a'] * 1000000
Thus the worst case for a naive string matching is where function should apply pattern and check again and again. But in this case it will do only one check because of the lengths of the input (it cannot apply pattern from index 1 so it won't continue).

Optimalization of the primes finding function

After 10 minutes of work I have written a function presented below. It returns a list of all primes lower than an argument. I have used all known for me programing and mathematical tricks in order to make this function as fast as possible. To find all the primes lower than a million it takes about 2 seconds.
Do you see any possibilities to optimize it even further? Any ideas?
def Primes(To):
if To<2:
return []
if To<3:
return [2]
Found=[2]
n=3
LastSqr=0
while n<=To:
k=0
Limit=len(Found)
IsPrime=True
while k<Limit:
if k>=LastSqr:
if Found[k]>pow(n,0.5):
LastSqr=k
break
if n%Found[k]==0:
IsPrime=False
break
k+=1
if IsPrime:
Found.append(n)
n+=1
return Found

You can use a couple tricks to speed things up, using the basic sieve of erastothenes. One is to use Wheel Factorization to skip calculating numbers that are known not to be prime. For example, besides 2 and 3, all primes are congruent to 1 or 5 mod 6. This means you don't have to process 4 of every 6 numbers at all.
At the next level, all primes are congruent to 1, 7, 11, 13, 17, 19, 23, or 29, mod 30. You can throw out 22 of every 30 numbers.
Here is a simple implementation of the sieve of Erastothenes that doesn't calculate or store even numbers:
def basic_gen_primes(n):
"""Return a list of all primes less then or equal to n"""
if n < 2:
return []
# The sieve. Each entry i represents (2i + 1)
size = (n + 1) // 2
sieve = [True] * size
# 2(0) + 1 == 1 is not prime
sieve[0] = False
for i, value in enumerate(sieve):
if not value:
continue
p = 2*i + 1
# p is prime. Remove all of its multiples from the sieve
# p^2 == (2i + 1)(2i + 1) == (4i^2 + 4i + 1) == 2(2i^2 + 2i) + 1
multiple = 2 * i * i + 2 * i
if multiple >= size:
break
while multiple < size:
sieve[multiple] = False
multiple += p
return [2] + [2*i+1 for i, value in enumerate(sieve) if value]
As mentioned, you can use more exotic sieves as well.

You can check only odd numbers. So why don't you use n+=2 instead of n+=1?

google and wikipedia for better algorithms. If you are only looking for small primes this might be fast enough. But the real algorithms are a lot faster for large primes.
http://en.wikipedia.org/wiki/Quadratic_sieve
start with that page.
Increment n by two instead of one. ?

How to implement an efficient infinite generator of prime numbers in Python?

This is not a homework, I am just curious.
INFINITE is the key word here.
I wish to use it as for p in primes(). I believe that this is a built-in function in Haskell.
So, the answer cannot be as naive as "Just do a Sieve".
First of all, you do not know how many consecutive primes will be consumed. Well, suppose you could concoct 100 of them at a time. Would you use the same Sieve approach as well as the frequency of prime numbers formula?
I prefer non-concurrent approach.
Thank you for reading (and writing ;) )!

“If I have seen further…”
The erat2 function from the cookbook can be further sped up (by about 20-25%):
erat2a
import itertools as it
def erat2a( ):
D = { }
yield 2
for q in it.islice(it.count(3), 0, None, 2):
p = D.pop(q, None)
if p is None:
D[q*q] = q
yield q
else:
# old code here:
# x = p + q
# while x in D or not (x&1):
# x += p
# changed into:
x = q + 2*p
while x in D:
x += 2*p
D[x] = p
The not (x&1) check verifies that x is odd. However, since both q and p are odd, by adding 2*p half of the steps are avoided along with the test for oddity.
erat3
If one doesn't mind a little extra fanciness, erat2 can be sped up by 35-40% with the following changes (NB: needs Python 2.7+ or Python 3+ because of the itertools.compress function):
import itertools as it
def erat3( ):
D = { 9: 3, 25: 5 }
yield 2
yield 3
yield 5
MASK= 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0,
MODULOS= frozenset( (1, 7, 11, 13, 17, 19, 23, 29) )
for q in it.compress(
it.islice(it.count(7), 0, None, 2),
it.cycle(MASK)):
p = D.pop(q, None)
if p is None:
D[q*q] = q
yield q
else:
x = q + 2*p
while x in D or (x%30) not in MODULOS:
x += 2*p
D[x] = p
The erat3 function takes advantage of the fact that all primes (except for 2, 3, 5) modulo 30 result to only eight numbers: the ones included in the MODULOS frozenset. Thus, after yielding the initial three primes, we start from 7 and work only with the candidates.
The candidate filtering uses the itertools.compress function; the “magic” is in the MASK sequence; MASK has 15 elements (there are 15 odd numbers in every 30 numbers, as chosen by the itertools.islice function) with a 1 for every possible candidate, starting from 7. The cycle repeats as specified by the itertools.cycle function.
The introduction of the candidate filtering needs another modification: the or (x%30) not in MODULOS check. The erat2 algorithm processed all odd numbers; now that the erat3 algorithm processes only r30 candidates, we need to make sure that all D.keys() can only be such —false— candidates.
Benchmarks
Results
On an Atom 330 Ubuntu 9.10 server, versions 2.6.4 and 3.1.1+:
$ testit
up to 8192
==== python2 erat2 ====
100 loops, best of 3: 18.6 msec per loop
==== python2 erat2a ====
100 loops, best of 3: 14.5 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
100 loops, best of 3: 19.2 msec per loop
==== python3 erat2a ====
100 loops, best of 3: 14.1 msec per loop
==== python3 erat3 ====
100 loops, best of 3: 11.7 msec per loop
On an AMD Geode LX Gentoo home server, Python 2.6.5 and 3.1.2:
$ testit
up to 8192
==== python2 erat2 ====
10 loops, best of 3: 104 msec per loop
==== python2 erat2a ====
10 loops, best of 3: 81 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
10 loops, best of 3: 116 msec per loop
==== python3 erat2a ====
10 loops, best of 3: 82 msec per loop
==== python3 erat3 ====
10 loops, best of 3: 66 msec per loop
Benchmark code
A primegen.py module contains the erat2, erat2a and erat3 functions. Here follows the testing script:
#!/bin/sh
max_num=${1:-8192}
echo up to $max_num
for python_version in python2 python3
do
for function in erat2 erat2a erat3
do
echo "==== $python_version $function ===="
$python_version -O -m timeit -c \
-s "import itertools as it, functools as ft, operator as op, primegen; cmp= ft.partial(op.ge, $max_num)" \
"next(it.dropwhile(cmp, primegen.$function()))"
done
done

Since the OP asks for an efficient implementation, here's a significant improvement to the active state 2002 code by David Eppstein/Alex Martelli (seen here in his answer): don't record a prime's info in the dictionary until its square is seen among the candidates. Brings space complexity down to below O(sqrt(n)) instead of O(n), for n primes produced ( π(sqrt(n log n)) ~ 2 sqrt(n log n) / log(n log n) ~ 2 sqrt(n / log n) ). Consequently, time complexity is also improved, i.e. it runs faster.
Creates a "sliding sieve" as a dictionary of current multiples of each base prime (i.e. below the sqrt of the current production point), together with their step values:
from itertools import count
# ideone.com/aVndFM
def postponed_sieve(): # postponed sieve, by Will Ness
yield 2; yield 3; yield 5; yield 7; # original code David Eppstein,
sieve = {} # Alex Martelli, ActiveState Recipe 2002
ps = postponed_sieve() # a separate base Primes Supply:
p = next(ps) and next(ps) # (3) a Prime to add to dict
q = p*p # (9) its sQuare
for c in count(9,2): # the Candidate
if c in sieve: # c's a multiple of some base prime
s = sieve.pop(c) # i.e. a composite ; or
elif c < q:
yield c # a prime
continue
else: # (c==q): # or the next base prime's square:
s=count(q+2*p,2*p) # (9+6, by 6 : 15,21,27,33,...)
p=next(ps) # (5)
q=p*p # (25)
for m in s: # the next multiple
if m not in sieve: # no duplicates
break
sieve[m] = s # original test entry: ideone.com/WFv4f
(the older, original code here was edited to incorporate changes as seen in the answer by Tim Peters, below). see also this for a related discussion.
Similar 2-3-5-7 wheel-based code runs ~ 2.15x faster (which is very close to the theoretical improvement of 3/2 * 5/4 * 7/6 = 2.1875).
2022 update: I've recently chanced upon this old "NESL" thing from the 1990s which actually uses the same sqrt-recursion trick. So nothing is new under the sun. :)
The code can be straightforwardly augmented to start the primes enumeration directly from a given value. This can be seen in this JS-based entry.

For posterity, here's a rewrite of Will Ness's beautiful algorithm for Python 3. Some changes are needed (iterators no longer have .next() methods, but there's a new next() builtin function). Other changes are for fun (using the new yield from <iterable> replaces four yield statements in the original. More are for readability (I'm not a fan of overusing ;-) 1-letter variable names).
It's significantly faster than the original, but not for algorithmic reasons. The speedup is mostly due to removing the original's add() function, doing that inline instead.
def psieve():
import itertools
yield from (2, 3, 5, 7)
D = {}
ps = psieve()
next(ps)
p = next(ps)
assert p == 3
psq = p*p
for i in itertools.count(9, 2):
if i in D: # composite
step = D.pop(i)
elif i < psq: # prime
yield i
continue
else: # composite, = p*p
assert i == psq
step = 2*p
p = next(ps)
psq = p*p
i += step
while i in D:
i += step
D[i] = step

This isn't originally my code, however, it's worth posting. The original can be found here: http://code.activestate.com/recipes/117119/
def gen_primes():
D = {}
q = 2 # first integer to test for primality.
while True:
if q not in D:
# not marked composite, must be prime
yield q
#first multiple of q not already marked
D[q * q] = [q]
else:
for p in D[q]:
D.setdefault(p + q, []).append(p)
# no longer need D[q], free memory
del D[q]
q += 1
It's a generator, so use it like any other.
primes = gen_primes()
for p in primes:
print p
It takes 1.62s to generate and put into a set, 1 million primes, on my desktop.

Do a segmented sieve, where the size of a segment is determined by available memory or the maximal size of a bitset.
For each segment represent the numbers in some interval [n; n + segment_size) as a bit set and sieve with all prime numbers below the square root of the upper bound.
Using a bit set uses less memory than a hash table or tree data structure, because you are working with dense sets of numbers.

Another way to do it:
import itertools
def primeseq():
prime = [2]
num = 0
yield 2
for i in itertools.count(3, 2):
is_prime = True
for num in prime:
if i % num == 0:
is_prime = False
break
elif num ** 2 > i:
break
if is_prime:
prime.append(i)
yield i

And another answer, more memory-efficient than my erat3 answer here:
import heapq
def heapprimegen():
hp= []
yield 2
yield 3
cn= 3
nn, inc= 3, 6
while 1:
while cn < nn:
yield cn
heapq.heappush(hp, (3*cn, 2*cn))
cn+= 2
cn= nn+2
nn, inc= heapq.heappushpop(hp, (nn+inc, inc))
It maintains a heap (a list) of prime multiples rather than a dictionary. It loses some speed, obviously.

Here is a complicated heap-based implementation, which is not much faster than other heap-based implementations (see the speed comparison in another answer of mine), but it uses much less memory.
This implementation uses two heaps (tu and wv), which contain the same number elements. Each element is an int pair. In order to find all primes up to q**2 (where q is a prime), each heap will contain at most 2*pi(q-1) elements, where pi(x) is the number of positive primes not larger than x. So the total number of integers is at most 4*pi(floor(sqrt(n))). (We could gain a factor on 2 on memory by pushing half as much stuff to the heap, but that would make the algorithm slower.)
Other dict and heap-based approaches (e.g. erat2b, and heap_prime_gen_squares and heapprimegen) above store about `2*pi(n)' integers, because they extend their heap or dict every time they find a prime. As a comparison: to find the 1_000_000 primes, this implementation stores less than 4141 integers, other implementations store more than 1_000_000 integers.
import heapq
def heap_prime_gen_smallmem():
yield 2
yield 3
f = 5
fmar3 = 2
q = 7
q6 = 7 * 6
qmar3 = 4
tu = [(25, 30), (35, 30)]
vw = [(25, 30), (35, 30)]
while True:
qmar3 += 2
if qmar3 == 6:
qb = q + 4
q6b = q6 + 24
qmar3 = 2
else:
qb = q + 2
q6b = q6 + 12
if q < tu[0][0]:
d = q * q
while f < d:
a, b = vw[0]
if f < a:
yield f
else:
a, b = vw[0]
heapq.heapreplace(vw, (a + b, b))
a, b = vw[0]
while f >= a:
heapq.heapreplace(vw, (a + b, b))
a, b = vw[0]
fmar3 += 2
if fmar3 == 6:
f += 4
fmar3 = 2
else:
f += 2
c = q * qb
heapq.heappush(tu, (d, q6))
heapq.heappush(tu, (c, q6))
heapq.heappush(vw, (d, q6))
heapq.heappush(vw, (c, q6))
else:
a, b = tu[0]
heapq.heapreplace(tu, (a + b, b))
a, b = tu[0]
while q >= a:
heapq.heapreplace(tu, (a + b, b))
a, b = tu[0]
q = qb
q6 = q6b

Here's a pretty fast infinite generator, written in Python2 but easily adjusted to Python3. To use it to add the primes up to 10**9, use the following:
from itertools import takewhile
from functools import partial
from operator import gt
print (sum(takewhile(partial(gt, 10**9), prime_gen_inf())))
It's a segmented sieve, faster but obviously less elegant than Will Ness's algorithm.
from operator import mul
from functools import reduce
def prod(x): return reduce(mul, x, 1)
def build_sieve(wheel):
w = prod(wheel)
w_phi = prod([p-1 for p in wheel])
rems = [a for a in range(w) if all(a % p for p in wheel)]
assert len(rems) == w_phi
inv = {a:pow(a, w_phi - 1, w) for a in rems}
try:
known_p = wheel + rems[1 : rems.index(rems[1]*rems[1])]
except ValueError:
known_p = wheel + rems[1:]
return wheel, w, w_phi, rems, inv, known_p
#Adjust the chunk variable based on your computer's architecture.
#
#Adjust the line with #! if you don't need "true" infinite. If you don't need
#primes larger than 1<<32, use array('H', []), if 1<<64 use 'L', if 1<<128 (in
#Python3) use 'Q', otherwise use empty list [].
#To save memory, comment out the lines with #*, and uncomment the commented-out
#lines
import itertools
from itertools import islice, count, compress, izip
chain_f = itertools.chain.from_iterable
from array import array
def prime_gen_inf(chunk=250000, sieve_info = build_sieve([2,3,5,7])):
""" Indefinitely yields primes """
wheel, w, w_phi, rems, inv, known_p = sieve_info
for p in known_p: yield p
new_n = 0;
while True:
size = min(chunk, (p * p - new_n) / w)
sieve = bytearray([1]) * size * w_phi
n, new_n = new_n, new_n + size * w
if not n:
zero = bytearray([0])
seen = len(known_p) - len(wheel) + 1
sieve[:seen:1] = zero * seen
p_gen = islice(prime_gen_inf(), len(wheel), None)
new_p = next(p_gen)
ps = [] #! array('H', [])
p_invs = bytearray([]) #*
while new_p * new_p < new_n:
ps.append(new_p)
p_invs.append(inv[new_p % w]) #*
new_p = next(p_gen)
for p, p_inv, modp in izip(ps, p_invs, [-n % p for p in ps]): #*
s = [(modp + p * (p_inv * (r - modp) % w)) / w for r in rems] #*
#for p in ps:
# s = [(-n%p + p * (inv[p%w] * (r - -n%p) % w)) / w for r in rems]
for i, start in enumerate(s):
slice_size = ((size - start - 1) / p + 1)
sieve[i + start * w_phi :: p * w_phi] = zero * slice_size
for p in compress(chain_f(izip(*[count(n+r, w) for r in rems])), sieve):
yield p

Here is a simple but not terribly slow one using a heap instead of a dict:
import heapq
def heap_prime_gen_squares():
yield 2
yield 3
h = [(9, 6)]
n = 5
while True:
a, b = h[0]
while n < a:
yield n
heapq.heappush(h, (n * n, n << 1))
n += 2
heapq.heapreplace(h, (a + b, b)) # Replace h[0], which is still (a, b).
My speed measurements of user time for the first 1 million primes (smaller numbers are better):
postponed_sieve (dict-based): 8.553s
erat2b (dict-based): 9.513s
erat2a (dict-based): 10.313s
heap_prime_gen_smallmem (heap-based): 23.935s
heap_prime_gen_squares (heap-based): 27.302s
heapprimegen (dict-based): 145.029s
So dict-based approaches seem to be the fastest.

Here's a generator that's a little truer to how it's done in Haskell: filtering against composites of known primes, then adding the remaining primes to the list.
def gen_primes():
primes = []
i = 2
while True:
prime = True
for p in primes:
if not (i % p):
prime = False
break
if prime:
yield i
primes.append(i)
i += 1

I know the post is old, but I came across this question... The following code is based on a very simple idea: a growing sieve of Eratosthenes. Although this solution is slower than the best ones here, it is easy to grasp and designed to be readable...
I used integers to store the results of the sieve.
In binary format, an integer is a list of 0s and 1s, 0 at position i if i is not a prime, 1 if it may be a prime.
The requisite infinity is a result of the fact that Python 3 integers are unbounded.
def primes():
container, size = 1 << 2, 3 # we start with 0b100 (from right to left: 0 and 1 are not primes, 2 is
last_prime = 1
while True:
prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None) # find the next prime
while not prime:
container, size = expand(container, size, 2**16) # add 65536 cells and sieve the container
prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None)
yield prime
last_prime = prime
How to expand the container? Just add a bunch of 1s at the left of the container (in binary format) and sieve them. This is
identical to the standard sieve, with a slight difference. In the standard sieve, if we find a prime i, we start to cross the cells at i*i, with a step of i.
Here, this may have been done for the first part of container. We just need to start at the beginning of the new part of the container if it is farther than i*i.
def expand(container, size, n):
new_size = size + n
container += (1 << (new_size + 1) - 1) - (1 << size) # add n 1's
for i in range(2, new_size):
if container & (1 << i): # i is a prime
t = sum(1 << j for j in range(max(i, size // i)*i, new_size, i)) # set 1 for all mutiple
container &= ~t # cross the cells
return container, new_size
Test for a million primes:
import itertools
assert 78498 == len(list(itertools.takewhile(lambda p: p<1000000, primes())))

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Fastest mutable data holder - python

Related

Python - "Fast Exponention Algorithm" performance outperformed by "worse" algorithm

Is my Sieve of Eratosthenes implemented correctly? (Python) [duplicate]

Homework: Implementing the Z algorithm in python, it's really slow, slower than naive string search

Optimalization of the primes finding function

How to implement an efficient infinite generator of prime numbers in Python?

Categories

Resources