Why is this algorithm worse? - python

In Wikipedia this is one of the given algorithms to generate prime numbers:
def eratosthenes_sieve(n):
# Create a candidate list within which non-primes will be
# marked as None; only candidates below sqrt(n) need be checked.
candidates = [i for i in range(n + 1)]
fin = int(n ** 0.5)
# Loop over the candidates, marking out each multiple.
for i in range(2, fin + 1):
if not candidates[i]:
continue
candidates[i + i::i] = [None] * (n // i - 1)
# Filter out non-primes and return the list.
return [i for i in candidates[2:] if i]
I changed the algorithm slightly.
def eratosthenes_sieve(n):
# Create a candidate list within which non-primes will be
# marked as None; only candidates below sqrt(n) need be checked.
candidates = [i for i in range(n + 1)]
fin = int(n ** 0.5)
# Loop over the candidates, marking out each multiple.
candidates[4::2] = [None] * (n // 2 - 1)
for i in range(3, fin + 1, 2):
if not candidates[i]:
continue
candidates[i + i::i] = [None] * (n // i - 1)
# Filter out non-primes and return the list.
return [i for i in candidates[2:] if i]
I first marked off all the multiples of 2, and then I considered odd numbers only. When I timed both algorithms (tried 40.000.000) the first one was always better (albeit very slightly). I don't understand why. Can somebody please explain?
P.S.: When I try 100.000.000, my computer freezes. Why is that? I have Core Duo E8500, 4GB RAM, Windows 7 Pro 64 Bit.
Update 1: This is Python 3.
Update 2: This is how I timed:
start = time.time()
a = eratosthenes_sieve(40000000)
end = time.time()
print(end - start)
UPDATE: Upon valuable comments (especially by nightcracker and Winston Ewert) I managed to code what I intended in the first place:
def eratosthenes_sieve(n):
# Create a candidate list within which non-primes will be
# marked as None; only c below sqrt(n) need be checked.
c = [i for i in range(3, n + 1, 2)]
fin = int(n ** 0.5) // 2
# Loop over the c, marking out each multiple.
for i in range(fin):
if not c[i]:
continue
c[c[i] + i::c[i]] = [None] * ((n // c[i]) - (n // (2 * c[i])) - 1)
# Filter out non-primes and return the list.
return [2] + [i for i in c if i]
This algorithm improves the original algorithm (mentioned at the top) by (usually) 50%. (Still, worse than the algorithm mentioned by nightcracker, naturally).
A question to Python Masters: Is there a more Pythonic way to express this last code, in a more "functional" way?
UPDATE 2: I still couldn't decode the algorithm mentioned by nightcracker. I guess I'm too stupid.

The question is, why would it even be faster? In both examples you are filtering multiples of two, the hard way. It doesn't matter whether you hardcode candidates[4::2] = [None] * (n // 2 - 1) or that it gets executed in the first loop of for i in range(2, fin + 1):.
If you are interested in an optimized sieve of Eratosthenes, here you go:
def primesbelow(N):
# https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
#""" Input N>=6, Returns a list of primes, 2 <= p < N """
correction = N % 6 > 1
N = (N, N-1, N+4, N+3, N+2, N+1)[N%6]
sieve = [True] * (N // 3)
sieve[0] = False
for i in range(int(N ** .5) // 3 + 1):
if sieve[i]:
k = (3 * i + 1) | 1
sieve[k*k // 3::2*k] = [False] * ((N//6 - (k*k)//6 - 1)//k + 1)
sieve[(k*k + 4*k - 2*k*(i%2)) // 3::2*k] = [False] * ((N // 6 - (k*k + 4*k - 2*k*(i%2))//6 - 1) // k + 1)
return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N//3 - correction) if sieve[i]]
Explanation here: Porting optimized Sieve of Eratosthenes from Python to C++
The original source is here, but there was no explanation. In short this primesieve skips multiples of 2 and 3 and uses a few hacks to make use of fast Python assignment.

You do not save a lot of time avoiding the evens. Most of the computation time within the algorithm is spent doing this:
candidates[i + i::i] = [None] * (n // i - 1)
That line causes a lot of action on the part of the computer. Whenever the number in question is even, this is not run as the loop bails on the if statement. The time spent running the loop for even numbers is thus really really small. So eliminating those even rounds does not produce a significant change in the timing of the loop. That's why your method isn't considerably faster.
When python produces numbers for range it uses a formula: start + index * step. Multiplying by two as you do in your case is going to be slightly more expensive then one as in the original case.
There is also quite possibly a small overhead to having a longer function.
Neither are of those are really significant speed issues, but they override the very small amount of benefit your version brings.

Its probably slightly slower because you are performing extra set up to do something that was done in the first case anyway (marking off multiples of two). That setup time might be what you see if it is as slight as you say

Your extra step is unnecessary and will actually traverse the whole collection n once doing that 'get rid of evens' operation rather than just operating on n^1/2.

Related

Runtime of two nested for loops

I am calculating the run time complexity of the below Python code as n^2 but it seems that it is incorrect. The correct answer shown is n(n-1)/2. Can anyone help help me in understanding why the inner loop is not running n*n times but rather n(n-1)/2?
for i in range(n):
for j in range(i):
val += 1
On the first run of the inner loop, when i = 0, the for loop does not execute at all. (There are zero elements to iterate over in range(0)).
On the second run of the inner loop, when i = 1, the for loop executes for one iteration. (There is one element to iterate over in range(1)).
Then, there are two elements in the third run, three in the fourth, following this pattern, up until i = n - 1.
The total number of iterations is 0 + 1 + ... + (n - 1). This summation has a well known form*: for any natural m, 0 + 1 + ... + m = m * (m + 1) / 2. In this case, we have m = n - 1, so there are n * (n - 1) / 2 iterations in total.
You're right that the asymptotic time complexity of this algorithm is O(n^2). However, given that you've said that the "correct answer" is n * (n - 1) / 2, the question most likely asked for the exact number of iterations (rather than an asymptotic bound).
* See here for a proof of this closed form.

What is the time complexity of this recursion?

def s_r(n):
if(n==1):
return 1
temp = 1
for i in range(int(n)):
temp += i
return temp * s_r(n/2) * s_r(n/2) * s_r(n/2) * s_r(n/2)
Using recursion tree what’s the Big Oh
And also how do we write this function into one recursive call.
I could only do the first part which I got O(n^2). This was one my exam questions which I want to know the answer and guidance to. Thank you
First, note that the program is incorrect: n/2 is floating point division in python (since Python3), so the program does not terminate unless n is a power of 2 (or eventually rounds to a power of 2). The corrected version of the program that works for all integer n>=1, is this:
def s_r(n):
if(n==1):
return 1
temp = 1
for i in range(n):
temp += i
return temp * s_r(n//2) * s_r(n//2) * s_r(n//2) * s_r(n//2)
If T(n) is the number of arithmetic operations performed by the function, then we have the recurrence relation:
T(1) = 0
T(n) = n + 4T(n//2)
T(n) is Theta(n^2) -- for n a power of 2 and telescoping we get: n + 4(n/2) + 16(n/4) + ... = 1n + 2n + 4n + 8n + ... + nn = (2n-1)n
We can rewrite the program to use Theta(log n) arithmetic operations. First, the temp variable is 1 + 0+1+...+(n-1) = n(n-1)/2 + 1. Second, we can avoid making the same recursive call 4 times.
def s_r(n):
return (1 + n * (n-1) // 2) * s_r(n//2) ** 4 if n > 1 else 1
Back to complexity, I have been careful to say that the first function does Theta(n^2) arithmetic operations and the second Theta(log n) arithmetic operations, rather than use the expression "time complexity". The result of the function grows FAST, so it's not practical to assume arithmetic operations run in O(1) time. If we print the length of the result and the time taken to compute it (using the second, faster version of the code) for powers of 2 using this...
import math
import timeit
def s_r(n):
return (1 + n * (n-1) // 2) * s_r(n//2) ** 4 if n > 1 else 1
for i in range(16):
n = 2 ** i
start = timeit.default_timer()
r = s_r(n)
end = timeit.default_timer()
print('n=2**%d,' % i, 'digits=%d,' % (int(math.log(r))+1), 'time=%.3gsec' % (end - start))
... we get this table, in which you can see the number of digits in the result grows FAST (s_r(2**14) has 101 million digits for example), and the time measured does not grow with log(n), but when n is doubled the time increases by something like a factor of 10, so it grows something like n^3 or n^4.
n=2**0, digits=1, time=6e-07sec
n=2**1, digits=1, time=1.5e-06sec
n=2**2, digits=5, time=9e-07sec
n=2**3, digits=23, time=1.1e-06sec
n=2**4, digits=94, time=2.1e-06sec
n=2**5, digits=382, time=2.6e-06sec
n=2**6, digits=1533, time=3.8e-06sec
n=2**7, digits=6140, time=3.99e-05sec
n=2**8, digits=24569, time=0.000105sec
n=2**9, digits=98286, time=0.000835sec
n=2**10, digits=393154, time=0.00668sec
n=2**11, digits=1572628, time=0.0592sec
n=2**12, digits=6290527, time=0.516sec
n=2**13, digits=25162123, time=4.69sec
n=2**14, digits=100648510, time=42.1sec
n=2**15, digits=402594059, time=377sec
Note that it's not wrong to say the time complexity of the original function is O(n^2) and the improved version of the code O(log n), it's just that these describe a measure of the program (arithmetic operations) and are not at all useful as an estimate of actual program running time.
As #kcsquared said in comments I also believe this function is O(n²). This code can be refactored into one function call by storing the result of recursion call, or just doing some math application. Also you can simplify the range sum, by using the built-in sum
def s_r(n):
if n == 1:
return 1
return (sum(range(int(n))) + 1) * s_r(n/2) ** 4

Is my Sieve of Eratosthenes implemented correctly? (Python) [duplicate]

This question already has answers here:
Sieve of Eratosthenes - Finding Primes Python
(22 answers)
Closed 4 years ago.
I need to generate a large number of prime numbers, however it is taking far too long using the Sieve of Eratosthenes. Currently it takes roughly 3 seconds to generate primes under 100,000 and roughly 30 for primes under 1,000,000. Which seems to indicate an O(n) complexity but as far as I know that's not right. Code:
def generate_primes(limit):
boolean_list = [False] * 2 + [True] * (limit - 1)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n] == True:
for i in range(n ** 2, limit + 1, n):
boolean_list[i] = False
Am I missing something obvious? How can I improve the performance of the sieve?
Loop indexing is well known in Python to be an incredibly slow operation. By replacing a loop with array slicing, and a list with a Numpy array, we see increases # 3x:
import numpy as np
import timeit
def generate_primes_original(limit):
boolean_list = [False] * 2 + [True] * (limit - 1)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n] == True:
for i in range(n ** 2, limit + 1, n):
boolean_list[i] = False
return np.array(boolean_list,dtype=np.bool)
def generate_primes_fast(limit):
boolean_list = np.array([False] * 2 + [True] * (limit - 1),dtype=bool)
for n in range(2, int(limit ** 0.5 + 1)):
if boolean_list[n]:
boolean_list[n*n:limit+1:n] = False
return boolean_list
limit = 1000
print(timeit.timeit("generate_primes_fast(%d)"%limit, setup="from __main__ import generate_primes_fast"))
# 30.90620080102235 seconds
print(timeit.timeit("generate_primes_original(%d)"%limit, setup="from __main__ import generate_primes_original"))
# 91.12803511600941 seconds
assert np.array_equal(generate_primes_fast(limit),generate_primes_original(limit))
# [nothing to stdout - they are equal]
To gain even more speed, one option is to use numpy vectorization. Looking at the outer loop, it's not immediately obvious how one could vectorize that.
Second, you will see dramatic speed-ups if you port to Cython, which should be a fairly seamless process.
Edit: you may also see improvements by changing things like n**2 => math.pow(n,2), but minor improvements like that are inconsequential compared to the bigger problem, which is the iterator.
If your are still using Python 2 use xrange instead of range for greater speed

Trying to figure out the run time of my function

I have this python code for finding the longest substring. I'm trying to figure out the asymptotic run time of it and I've arrived at an answer but I'm not sure if it's correct. Here is the code:
def longest_substring(s, t):
best = ' '
for s_start in range(0, len(s)):
for s_end in range(s_start, len(s)+1):
for t_start in range(0, len(t)):
for t_end in range(t_start, len(t)+1):
if s[s_start:s_end] == t[t_start:t_end]:
current = s[s_start:s_end]
if len(current) > len(best):
best = current
return best
Obviously this function has a very slow run time. It was designed that way. My approach was that because there is a for loop with 3 more nested for-loops, the run-time is something like O(n^4). I am not sure if this is correct due to not every loop iterating over the input size. Also, it is to be assumed that s = t = n(input size). Any ideas?
If you're not convinced that it's O(n^5), try calculating how many loops you run through for string s alone (i.e. the outer two loops). When s_start == 0, the inner loop runs n + 1 times; when s_start == 1, the inner loop runs n times, and so on, until s_start = n - 1, for which the inner loop runs twice.
The sum
(n + 1) + (n) + (n - 1) + ... + 2
is an arithmetic series for which the formula is
((n + 1) + 2) * n / 2
which is O(n^2).
An additional n factor comes from s[s_start:s_end] == t[t_start:t_end], which is O(n).

Sieve of Eratosthenes - Primes between X and N

I found this highly optimised implementation of the Sieve of Eratosthenes for Python on Stack Overflow. I have a rough idea of what it's doing but I must admit the details of it's workings elude me.
I would still like to use it for a little project (I'm aware there are libraries to do this but I would like to use this function).
Here's the original:
'''
Sieve of Eratosthenes
Implementation by Robert William Hanks
https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n/3035188
'''
def sieve(n):
"""Return an array of the primes below n."""
prime = numpy.ones(n//3 + (n%6==2), dtype=numpy.bool)
for i in range(3, int(n**.5) + 1, 3):
if prime[i // 3]:
p = (i + 1) | 1
prime[ p*p//3 ::2*p] = False
prime[p*(p-2*(i&1)+4)//3::2*p] = False
result = (3 * prime.nonzero()[0] + 1) | 1
result[0] = 3
return numpy.r_[2,result]
What I'm trying to achieve is to modify it to return all primes below n starting at x so that:
primes = sieve(50, 100)
would return primes between 50 and 100. This seemed easy enough, I tried replacing these two lines:
def sieve(x, n):
...
for i in range(x, int(n**.5) + 1, 3):
...
But for a reason I can't explain, the value of x in the above has no influence on the numpy array returned!
How can I modify sieve() to only return primes between x and n
The implementation you've borrowed is able to start at 3 because it replaces sieving out the multiples of 2 by just skipping all even numbers; that's what the 2*… that appear multiple times in the code are about. The fact that 3 is the next prime is also hardcoded in all over the place, but let's ignore that for the moment, because if you can't get past the special-casing of 2, the special-casing of 3 doesn't matter.
Skipping even numbers is a special case of a "wheel". You can skip sieving multiples of 2 by always incrementing by 2; you can skip sieving multiples of 2 and 3 by alternately incrementing by 2 and 4; you can skip sieving multiples of 2, 3, 5, and 7 by alternately incrementing by 2, 4, 2, 4, 6, 2, 6, … (there's 48 numbers in the sequence), and so on. So, you could extend this code by first finding all the primes up to x, then building a wheel, then using that wheel to find all the primes between x and n.
But that's adding a lot of complexity. And once you get too far beyond 7, the cost (both in time, and in space for storing the wheel) swamps the savings. And if your whole goal is not to find the primes before x, finding the primes before x so you don't have to find them seems kind of silly. :)
The simpler thing to do is just find all the primes up to n, and throw out the ones below x. Which you can do with a trivial change at the end:
primes = numpy.r_[2,result]
return primes[primes>=x]
Or course there are ways to do this without wasting storage for those initial primes you're going to throw away. They'd be a bit complicated to work into this algorithm (you'd probably want to build the array in sections, then drop each section that's entirely < x as you go, then stack all the remaining sections); it would be far easier to use a different implementation of the algorithm that isn't designed for speed and simplicity over space…
And of course there are different prime-finding algorithms that don't require enumerating all the primes up to x in the first place. But if you want to use this implementation of this algorithm, that doesn't matter.
Since you're now interested in looking into other algorithms or other implementations, try this one. It doesn't use numpy, but it is rather fast. I've tried a few variations on this theme, including using sets, and pre-computing a table of low primes, but they were all slower than this one.
#! /usr/bin/env python
''' Prime range sieve.
Written by PM 2Ring 2014.10.15
For range(0, 30000000) this is actually _faster_ than the
plain Eratosthenes sieve in sieve3.py !!!
'''
import sys
def potential_primes():
''' Make a generator for 2, 3, 5, & thence all numbers coprime to 30 '''
s = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
for i in s:
yield i
s = (1,) + s[3:]
j = 30
while True:
for i in s:
yield j + i
j += 30
def range_sieve(lo, hi):
''' Create a list of all primes in the range(lo, hi) '''
#Mark all numbers as prime
primes = [True] * (hi - lo)
#Eliminate 0 and 1, if necessary
for i in range(lo, min(2, hi)):
primes[i - lo] = False
ihi = int(hi ** 0.5)
for i in potential_primes():
if i > ihi:
break
#Find first multiple of i: i >= i*i and i >= lo
ilo = max(i, 1 + (lo - 1) // i ) * i
#Determine how many multiples of i >= ilo are in range
n = 1 + (hi - ilo - 1) // i
#Mark them as composite
primes[ilo - lo : : i] = n * [False]
return [i for i,v in enumerate(primes, lo) if v]
def main():
lo = int(sys.argv[1]) if len(sys.argv) > 1 else 0
hi = int(sys.argv[2]) if len(sys.argv) > 2 else lo + 30
#print lo, hi
primes = range_sieve(lo, hi)
#print len(primes)
print primes
#print primes[:10], primes[-10:]
if __name__ == '__main__':
main()
And here's a link to the plain Eratosthenes sieve that I mentioned in the docstring, in case you want to compare this program to that one.
You could improve this slightly by getting rid of the loop under #Eliminate 0 and 1, if necessary. And I guess it might be slightly faster if you avoided looking at even numbers; it'd certainly use less memory. But then you'd have to handle the cases when 2 was inside the range, and I figure that the less tests you have the faster this thing will run.
Here's a minor improvement to that code: replace
#Mark all numbers as prime
primes = [True] * (hi - lo)
#Eliminate 0 and 1, if necessary
for i in range(lo, min(2, hi)):
primes[i - lo] = False
with
#Eliminate 0 and 1, if necessary
lo = max(2, lo)
#Mark all numbers as prime
primes = [True] * (hi - lo)
However, the original form may be preferable if you want to return the plain bool list rather than performing the enumerate to build a list of integers: the bool list is more useful for testing if a given number is prime; OTOH, the enumerate could be used to build a set rather than a list.

Categories