NZEC error in Python code

My code works perfectly fine on my machine, but it gives an NZEC error when run on SPOJ.
Here is the link to my question:
http://www.spoj.com/problems/CPRIME/
Here is my code:
def smallPrimes(n):
    """Given an integer n, compute a list of the primes <= n"""
    if n <= 1:
        return []
    sieve = range(3, n+1, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3)//2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom-top)//si)
    return [2]+filter(None, sieve)
from math import *
import sys
def main():
    flag = True
    while(flag == True):
        x = input()
        if(x == 0):
            flag = False
            return 0
        z = x/log(x)
        v = len(smallPrimes(x))
        print round((abs(v-z)*100/(v)), 1)

if __name__ == "__main__":
    main()

On SPOJ, the NZEC error is raised when an exception occurs during the Python script's execution.
In your case the input format is well specified and terminates on a zero, and you handle that, so the problem can't be the input.
The error is most likely caused by using more memory than allowed. The problem specifies a memory limit of 256 MB, but in your code
sieve = range(3, n+1, 2)
this line creates a list of about n/2 elements. When n = 10^8, that is a list of 5*10^7 integers, which with a naive approximation (4 bytes per integer, ignoring all overhead) already comes to
(5*10^7) * 4 bytes
~ 200 MB
Add the overhead, plus the memory for your second big list declaration,
[0] * -((bottom-top)//si)
which can itself reach about 130 MB neglecting all overhead, and you will exceed the memory limit just storing that many integers. I saw memory usage of about 1 GB from your code on my machine, so your code crosses the memory limit on SPOJ and an exception is raised.
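As a rough, hedged illustration of where the memory goes (this only illustrates the storage overhead, it is not the intended approach for the problem), you can compare a list of odd numbers with a bytearray of flags for a much smaller n; scaling both up to n = 10^8 multiplies the figures by 100:
import sys

n = 10**6  # small n for illustration; the problem allows n up to 10**8
odd_list = list(range(3, n + 1, 2))  # one pointer per element, plus the int objects themselves
flags = bytearray(n // 2 + 1)        # one byte per odd number <= n

print(sys.getsizeof(odd_list))  # size of the list's pointer array alone
print(sys.getsizeof(flags))     # roughly n/2 bytes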
The best thing to do is to optimize your approach; declaring lists on the order of 10^8 elements is seldom needed in such questions. I can see a way in which you won't need to declare a list that big, but since it's a question from an online judge, it's best to let you figure out the approach. :)


I wrote a recursive search function in Python. Why does it complete successfully in Jupyter Notebook, but error out in PyCharm with a stack overflow? [duplicate]

I have this tail recursive function here:
def recursive_function(n, sum):
    if n < 1:
        return sum
    else:
        return recursive_function(n-1, sum+n)

c = 998
print(recursive_function(c, 0))
It works up to n=997, then it just breaks and spits out a RecursionError: maximum recursion depth exceeded in comparison. Is this just a stack overflow? Is there a way to get around it?
It is a guard against a stack overflow, yes. Python (or rather, the CPython implementation) doesn't optimize tail recursion, and unbridled recursion causes stack overflows. You can check the recursion limit with sys.getrecursionlimit:
import sys
print(sys.getrecursionlimit())
and change the recursion limit with sys.setrecursionlimit:
sys.setrecursionlimit(1500)
but doing so is dangerous -- the standard limit is a little conservative, and Python stack frames can be quite big.
Python isn't a functional language and tail recursion is not a particularly efficient technique. Rewriting the algorithm iteratively, if possible, is generally a better idea.
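For the function in the question, the iterative rewrite is straightforward; here is a minimal sketch that keeps the same name and arguments (and, like the original, still shadows the built-in sum):
def recursive_function(n, sum):
    # Iterative version: adds n, n-1, ..., 1 to sum without any recursion.
    while n >= 1:
        sum += n
        n -= 1
    return sum

print(recursive_function(998, 0))  # 498501, no recursion limit involved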
Looks like you just need to set a higher recursion depth:
import sys
sys.setrecursionlimit(1500)
If you often need to change the recursion limit (e.g. while solving programming puzzles) you can define a simple context manager like this:
import sys
class recursionlimit:
    def __init__(self, limit):
        self.limit = limit

    def __enter__(self):
        self.old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(self.limit)

    def __exit__(self, type, value, tb):
        sys.setrecursionlimit(self.old_limit)
Then to call a function with a custom limit you can do:
with recursionlimit(1500):
print(fib(1000, 0))
On exit from the body of the with statement the recursion limit will be restored to the default value.
P.S. You may also want to increase the stack size of the Python process for big values of the recursion limit. That can be done via the ulimit shell builtin or limits.conf(5) file, for example.
It's to avoid a stack overflow. The Python interpreter limits the depths of recursion to help you avoid infinite recursions, resulting in stack overflows.
Try increasing the recursion limit (sys.setrecursionlimit) or re-writing your code without recursion.
From the Python documentation:
sys.getrecursionlimit()
Return the current value of the recursion limit, the maximum depth of the Python interpreter stack. This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python. It can be set by setrecursionlimit().
resource.setrlimit must also be used to increase the stack size and prevent a segfault
The Linux kernel limits the stack of processes.
Python stores local variables on the stack of the interpreter, and so recursion takes up stack space of the interpreter.
If the Python interpreter tries to go over the stack limit, the Linux kernel kills it with a segmentation fault.
The stack limit size is controlled with the getrlimit and setrlimit system calls.
Python offers access to those system calls through the resource module.
sys.setrecursionlimit mentioned e.g. at https://stackoverflow.com/a/3323013/895245 only increases the limit that the Python interpreter self imposes on its own stack size, but it does not touch the limit imposed by the Linux kernel on the Python process.
Example program:
main.py
import resource
import sys
print resource.getrlimit(resource.RLIMIT_STACK)
print sys.getrecursionlimit()
print
# Will segfault without this line.
resource.setrlimit(resource.RLIMIT_STACK, [0x10000000, resource.RLIM_INFINITY])
sys.setrecursionlimit(0x100000)
def f(i):
    print i
    sys.stdout.flush()
    f(i + 1)
f(0)
Of course, if you keep increasing setrlimit, your RAM will eventually run out, which will either slow your computer to a halt due to swap madness, or kill Python via the OOM Killer.
From bash, you can see and set the stack limit (in kb) with:
ulimit -s
ulimit -s 10000
The default value for me is 8 MB.
See also:
Setting stacksize in a python script
What is the hard recursion limit for Linux, Mac and Windows?
Tested on Ubuntu 16.10, Python 2.7.12.
Use a language that guarantees tail-call optimisation. Or use iteration. Alternatively, get cute with decorators.
I realize this is an old question but for those reading, I would recommend against using recursion for problems such as this - lists are much faster and avoid recursion entirely. I would implement this as:
def fibonacci(n):
    f = [0, 1, 1]
    for i in xrange(3, n):
        f.append(f[i-1] + f[i-2])
    return 'The %.0fth fibonacci number is: %.0f' % (n, f[-1])
(Use n+1 in xrange if you start counting your fibonacci sequence from 0 instead of 1.)
I had a similar issue with the error "Max recursion depth exceeded". I discovered the error was being triggered by a corrupt file in the directory I was looping over with os.walk. If you have trouble solving this issue and you are working with file paths, be sure to narrow it down, as it might be a corrupt file.
If you want to get only a few Fibonacci numbers, you can use the matrix method.
from numpy import matrix

def fib(n):
    return (matrix('0 1; 1 1', dtype='object') ** n).item(1)
It's fast because numpy uses a fast exponentiation algorithm, so you get the answer in O(log n). It's also better than Binet's formula because it uses only integers. But if you want all Fibonacci numbers up to n, it's better to do it by memoization.
Of course Fibonacci numbers can be computed in O(n) by applying the Binet formula:
from math import floor, sqrt

def fib(n):
    return int(floor(((1+sqrt(5))**n - (1-sqrt(5))**n) / (2**n*sqrt(5)) + 0.5))
As the commenters note it's not O(1) but O(n) because of 2**n. Also a difference is that you only get one value, while with recursion you get all values of Fibonacci(n) up to that value.
RecursionError: maximum recursion depth exceeded in comparison
Solution:
First, it’s good to know that when you execute a recursive function in Python on a large input (> 10^4), you might encounter a “maximum recursion depth exceeded” error.
The sys module in Python has a function getrecursionlimit() that shows the recursion limit of your Python installation.
import sys
print("Python Recursive Limitation = ", sys.getrecursionlimit())
The default in some versions of Python is 1000, and in others it is 1500.
You can change this limit, but it’s very important to know that if you increase it too much you will run into stack or memory overflow errors.
So be careful before increasing it. You can use setrecursionlimit() to raise this limit in Python.
import sys
sys.setrecursionlimit(3000)
Please follow this link for more information about some things that cause this issue:
https://elvand.com/quick-sort-binary-search/
We can do that using the @lru_cache decorator and setrecursionlimit():
import sys
from functools import lru_cache

sys.setrecursionlimit(15000)

@lru_cache(128)
def fib(n: int) -> int:
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 2) + fib(n - 1)

print(fib(14000))
Output
3002468761178461090995494179715025648692747937490792943468375429502230242942284835863402333575216217865811638730389352239181342307756720414619391217798542575996541081060501905302157019002614964717310808809478675602711440361241500732699145834377856326394037071666274321657305320804055307021019793251762830816701587386994888032362232198219843549865275880699612359275125243457132496772854886508703396643365042454333009802006384286859581649296390803003232654898464561589234445139863242606285711591746222880807391057211912655818499798720987302540712067959840802106849776547522247429904618357394771725653253559346195282601285019169360207355179223814857106405285007997547692546378757062999581657867188420995770650565521377874333085963123444258953052751461206977615079511435862879678439081175536265576977106865074099512897235100538241196445815568291377846656352979228098911566675956525644182645608178603837172227838896725425605719942300037650526231486881066037397866942013838296769284745527778439272995067231492069369130289154753132313883294398593507873555667211005422003204156154859031529462152953119957597195735953686798871131148255050140450845034240095305094449911578598539658855704158240221809528010179414493499583473568873253067921639513996596738275817909624857593693291980841303291145613566466575233283651420134915764961372875933822262953420444548349180436583183291944875599477240814774580187144637965487250578134990402443365677985388481961492444981994523034245619781853365476552719460960795929666883665704293897310201276011658074359194189359660792496027472226428571547971602259808697441435358578480589837766911684200275636889192254762678512597000452676191374475932796663842865744658264924913771676415404179920096074751516422872997665425047457428327276230059296132722787915300105002019006293320082955378715908263653377755031155794063450515731009402407584683132870206376994025920790298591144213659942668622062191441346200098342943955169522532574271644954360217472458521489671859465232568419404182043966092211744372699797375966048010775453444600153524772238401414789562651410289808994960533132759532092895779406940925252906166612153699850759933762897947175972147868784008320247586210378556711332739463277940255289047962323306946068381887446046387745247925675240182981190836264964640612069909458682443392729946084099312047752966806439331403663934969942958022237945205992581178803606156982034385347182766573351768749665172549908638337611953199808161937885366709285043276595726484068138091188914698151703122773726725261370542355162118164302728812259192476428938730724109825922331973256105091200551566581350508061922762910078528219869913214146575557249199263634241165352226570749618907050553115468306669184485910269806225894530809823102279231750061652042560772530576713148647858705369649642907780603247428680176236527220826640665659902650188140474762163503557640566711903907798932853656216227739411210513756695569391593763704981001125
Source: functools.lru_cache
As @alex suggested, you could use a generator function to do this sequentially instead of recursively.
Here's the equivalent of the code in your question:
def fib(n):
    def fibseq(n):
        """ Iteratively return the first n Fibonacci numbers, starting from 0. """
        a, b = 0, 1
        for _ in xrange(n):
            yield a
            a, b = b, a + b
    return sum(v for v in fibseq(n))
print format(fib(100000), ',d') # -> no recursion depth error
Edit: 6 years later I realized my "Use generators" was flippant and didn't answer the question. My apologies.
I guess my first question would be: do you really need to change the recursion limit? If not, then perhaps my or any of the other answers that don't deal with changing the recursion limit will apply. Otherwise, as noted, override the recursion limit using sys.setrecursionlimit(n).
Use generators?
def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Seems to be the only way to get the following line to work:
# assign the infinite generator to a variable.
fibs = fib()

f = [fibs.next() for x in xrange(1001)]
for num in f:
    print num
The fib() function above is adapted from Introduction to Python Generators.
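A side note in case you are on Python 3 (an assumption, since the snippet above is Python 2): .next() and xrange are gone, and itertools.islice is a convenient way to take a slice of the infinite generator:
import itertools

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take the first 1001 Fibonacci numbers from the infinite generator.
for num in itertools.islice(fib(), 1001):
    print(num)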
Many people recommend increasing the recursion limit, but that is not a good solution because there will always be a limit. Use an iterative solution instead.
def fib(n):
    a, b = 1, 1
    for i in range(n-1):
        a, b = b, a+b
    return a

print fib(5)
I wanted to give you an example of using memoization to compute Fibonacci, as this will allow you to compute significantly larger numbers using recursion:
cache = {}

def fib_dp(n):
    if n in cache:
        return cache[n]
    if n == 0: return 0
    elif n == 1: return 1
    else:
        value = fib_dp(n-1) + fib_dp(n-2)
        cache[n] = value
        return value

print(fib_dp(998))
This is still recursive, but uses a simple hashtable that allows the reuse of previously calculated Fibonacci numbers instead of doing them again.
import sys
sys.setrecursionlimit(1500)

def fib(n, sum):
    if n < 1:
        return sum
    else:
        return fib(n-1, sum+n)

c = 998
print(fib(c, 0))
We could also use a variation of the bottom-up dynamic programming approach:
def fib_bottom_up(n):
    bottom_up = [None] * (n+1)
    bottom_up[0] = 1
    bottom_up[1] = 1
    for i in range(2, n+1):
        bottom_up[i] = bottom_up[i-1] + bottom_up[i-2]
    return bottom_up[n]

print(fib_bottom_up(20000))
I'm not sure I'm repeating someone, but some time ago some good soul wrote a Y-operator for recursively called functions, like:
def tail_recursive(func):
    y_operator = (lambda f: (lambda y: y(y))(lambda x: f(lambda *args: lambda: x(x)(*args))))(func)
    def wrap_func_tail(*args):
        out = y_operator(*args)
        while callable(out): out = out()
        return out
    return wrap_func_tail
and then the recursive function needs this form:
def my_recursive_func(g):
    def wrapped(some_arg, acc):
        if <condition>: return acc
        return g(some_arg, acc)
    return wrapped

# and finally you call it in code
(tail_recursive(my_recursive_func))(some_arg, acc)
For Fibonacci numbers your function looks like this:
def fib(g):
    def wrapped(n_1, n_2, n):
        if n == 0: return n_1
        return g(n_2, n_1 + n_2, n-1)
    return wrapped

print((tail_recursive(fib))(0, 1, 1000000))
output:
..684684301719893411568996526838242546875
(actually tons of digits)

Code finding the first triangular number with more than 500 divisors will not finish running

Okay, so I'm working on Euler Problem 12 (find the first triangular number with a number of factors over 500) and my code (in Python 3) is as follows:
factors = 0
y = 1

def factornum(n):
    x = 1
    f = []
    while x <= n:
        if n % x == 0:
            f.append(x)
        x += 1
    return len(f)

def triangle(n):
    t = sum(list(range(1, n)))
    return t

while factors <= 500:
    factors = factornum(triangle(y))
    y += 1
print(y-1)
Basically, one function goes through all the numbers below the input number n, checks if they divide into n evenly, and if so adds them to a list, then returns the length of that list. Another generates a triangular number by summing all the numbers in a list from 1 to the input number and returning the sum. Then a while loop keeps generating a triangular number, using an iterating variable y as the input for the triangle function, runs the factornum function on it, and puts the result in the factors variable. The loop continues to run and y continues to increment until the number of factors is over 500. The result is then printed.
However, when I run it, nothing happens - no errors, no output, it just keeps running and running. Now, I know my code isn't the most efficient, but I left it running for quite a bit and it still didn't produce a result, so it seems more likely to me that there's an error somewhere. I've been over it and over it and cannot seem to find an error.
I'd merely request that a full solution or a drastically improved one isn't given outright but pointers towards my error(s) or spots for improvement, as the reason I'm doing the Euler problems is to improve my coding. Thanks!
You have a very inefficient algorithm.
Since you ask for pointers rather than a full solution, the main pointers are:
There is a more efficient way to calculate the next triangular number; there is an explicit formula in the wiki. Also, if you generate the sequence of all triangular numbers, it is more efficient to just add the next n to the previous number. (Side note: the list in sum(list(range(1,n))) makes no sense to me at all; if you want to use this approach anyway, sum(xrange(1, n)) will probably be much more efficient as it doesn't require materializing the range.) A short sketch of this first pointer appears at the end of this answer.
There are much more efficient ways to factorize numbers
There is a more efficient way to calculate the number of factors, using the prime factorization (the divisor function d(n)), rather than testing every candidate divisor.
Generally, Project Euler problems (as in many other programming competitions) are not supposed to be solvable by sheer brute force. You should come up with some formula and/or a more efficient algorithm first.
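As promised, a short sketch of the first pointer only (the triangular-number part; the divisor counting is left to you):
# Generate triangular numbers without re-summing a range each time.
# T(n) = n*(n+1)//2, or equivalently add n to the previous triangular number.
def triangle(n):
    return n * (n + 1) // 2

t = 0
for n in range(1, 6):
    t += n                   # incremental version
    assert t == triangle(n)  # matches the closed-form version
    print(n, t)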
As far as I can tell your code will work, but it will take a very long time to calculate the number of factors. For 150 factors, it takes on the order of 20 seconds to run, and that time will grow dramatically as you look for higher and higher numbers of factors.
One way to reduce the processing time is to reduce the number of calculations that you're performing. If you analyze your code, you're calculating n%1 every single time, which is an unnecessary calculation because you know every single integer will be divisible by itself and one. Are there any other ways you can reduce the number of calculations? Perhaps by remembering that if a number is divisible by 20, it is also divisible by 2, 4, 5, and 10?
I can be more specific, but you wanted a pointer in the right direction.
From the looks of it the code works fine, it's just not the best approach. A simple way of optimizing is to only test divisors up to half the number, for example. Also, try thinking about how you could do this using prime factors; it might be another solution. Best of luck!
First you have to define a factors function:
from functools import reduce

def factors(n):
    step = 2 if n % 2 else 1
    return set(reduce(list.__add__,
                      ([i, n//i] for i in range(1, int(pow(n, 0.5) + 1), step)
                       if n % i == 0)))
This will create a set containing all of the factors of the number n.
Second, use a while loop until you get more than 500 factors:
a = 1
x = 1
while len(factors(a)) < 501:
    x += 1
    a += x
This loop stops once len(factors(a)) exceeds 500.
Simply print(a) and you will get your answer.

Sum of primes below 2,000,000 in Python

I am attempting problem 10 of Project Euler, which is the summation of all primes below 2,000,000. I have tried implementing the Sieve of Eratosthenes in Python, and the code I wrote works perfectly for numbers below 10,000.
However, when I attempt to find the summation of primes for bigger numbers, the code takes too long to run (finding the sum of primes up to 100,000 took 315 seconds). The algorithm clearly needs optimization.
Yes, I have looked at other posts on this website, like Fastest way to list all primes below N, but the solutions there had very little explanation as to how the code worked (I am still a beginner programmer) so I was not able to actually learn from them.
Can someone please help me optimize my code, and clearly explain how it works along the way?
Here is my code:
primes_below_number = 2000000  # number to find summation of all primes below number
numbers = (range(1, primes_below_number + 1, 2))  # creates a list excluding even numbers
pos = 0  # index position
sum_of_primes = 0  # total sum
number = numbers[pos]
while number < primes_below_number and pos < len(numbers) - 1:
    pos += 1
    number = numbers[pos]  # moves to next prime in list numbers
    sum_of_primes += number  # adds prime to total sum
    num = number
    while num < primes_below_number:
        num += number
        if num in numbers[:]:
            numbers.remove(num)  # removes multiples of prime found
print sum_of_primes + 2
As I said before, I am new to programming, therefore a thorough explanation of any complicated concepts would be deeply appreciated. Thank you.
As you've seen, there are various ways to implement the Sieve of Eratosthenes in Python that are more efficient than your code. I don't want to confuse you with fancy code, but I can show how to speed up your code a fair bit.
Firstly, searching a list isn't fast, and removing elements from a list is even slower. However, Python provides a set type which is quite efficient at performing both of those operations (although it does chew up a bit more RAM than a simple list). Happily, it's easy to modify your code to use a set instead of a list.
Another optimization is that we don't have to check for prime factors all the way up to primes_below_number, which I've renamed to hi in the code below. It's sufficient to just go to the square root of hi, since if a number is composite it must have a factor less than or equal to its square root.
We don't need to keep a running total of the sum of the primes. It's better to do that at the end using Python's built-in sum() function, which operates at C speed, so it's much faster than doing the additions one by one at Python speed.
# number to find summation of all primes below number
hi = 2000000

# create a set excluding even numbers
numbers = set(xrange(3, hi + 1, 2))

for number in xrange(3, int(hi ** 0.5) + 1):
    if number not in numbers:
        # number must have been removed because it has a prime factor
        continue
    num = number
    while num < hi:
        num += number
        if num in numbers:
            # Remove multiples of prime found
            numbers.remove(num)

print 2 + sum(numbers)
You should find that this code runs in a few seconds; it takes around 5 seconds on my 2GHz single-core machine.
You'll notice that I've moved the comments so that they're above the line they're commenting on. That's the preferred style in Python since we prefer short lines, and also inline comments tend to make the code look cluttered.
There's another small optimization that can be made to the inner while loop, but I let you figure that out for yourself. :)
First, removing numbers from the list will be very slow. Instead of this, make a list of booleans:
primes = [True] * primes_below_number
primes[0] = False
primes[1] = False
Now in your loop, when you find a prime p, change primes[k*p] to False for all suitable k. (You wouldn't actually multiply; you'd continually add p, of course.)
At the end,
primes = [n for n in range(primes_below_number) if primes[n]]
This should be a great deal faster.
Second, you can stop looking once your find a prime greater than the square root of primes_below_number, since a composite number must have a prime factor that doesn't exceed its square root.
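Putting those two points together, here is a minimal sketch of the boolean-list sieve described above (my own illustration, using the limit from the question), which marks multiples by repeated addition and stops sieving at the square root:
primes_below_number = 2000000

# primes[n] is True while n is still considered prime.
primes = [True] * primes_below_number
primes[0] = False
primes[1] = False

p = 2
while p * p < primes_below_number:
    if primes[p]:
        k = p * p
        while k < primes_below_number:
            primes[k] = False  # cross out this multiple of p
            k += p             # no multiplication, just repeated addition
    p += 1

print(sum(n for n in range(primes_below_number) if primes[n]))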
Try using numpy; it should make this faster. Replacing range with xrange may also help.
Here's an optimization for your code:
import itertools

primes_below_number = 2000000
numbers = list(range(3, primes_below_number, 2))
pos = 0
while pos < len(numbers) - 1:
    number = numbers[pos]
    numbers = list(
        itertools.chain(
            itertools.islice(numbers, 0, pos + 1),
            itertools.ifilter(
                lambda n: n % number != 0,
                itertools.islice(numbers, pos + 1, len(numbers))
            )
        )
    )
    pos += 1
sum_of_primes = sum(numbers) + 2
print sum_of_primes
The optimizations here are:
The sum was moved outside the loop.
Instead of removing elements from a list we just create another one; memory is not an issue here (I hope).
When creating the new list we create it by chaining two parts: the first part is everything before the current number (we already checked those), and the second part is everything after the current number, but only the elements that are not divisible by the current number.
Using itertools can make things faster since we'd be using iterators instead of looping through the whole list more than once.
Another solution would be to not remove parts of the list but to disable them, like @saulspatz said.
And here's the fastest way I was able to find: http://www.wolframalpha.com/input/?i=sum+of+all+primes+below+2+million 😁
Update
Here is the boolean method:
import itertools

primes_below_number = 2000000
numbers = [v % 2 != 0 for v in xrange(primes_below_number)]
numbers[0] = False
numbers[1] = False
numbers[2] = True
number = 3
while number < primes_below_number:
    n = number * 3  # We already excluded even numbers
    while n < primes_below_number:
        numbers[n] = False
        n += number
    number += 1
    while number < primes_below_number and not numbers[number]:
        number += 1
sum_of_numbers = sum(itertools.imap(lambda index_n: index_n[1] and index_n[0] or 0, enumerate(numbers)))
print(sum_of_numbers)
This executes in seconds (took 3 seconds on my 2.4GHz machine).
Instead of storing a list of numbers, you can store an array of boolean values. This use of a bitmap can be thought of as a way to implement a set, which works well for dense sets (there aren't big gaps between the values of members).
An answer on a recent python sieve question uses this implementation python-style. It turns out a lot of people have implemented a sieve, or something they thought was a sieve, and then come on SO to ask why it was slow. :P Look at the related-questions sidebar from some of them if you want more reading material.
Finding the element that holds the boolean that says whether a number is in the set or not is easy and extremely fast. array[i] is a boolean value that's true if i is in the set, false if not. The memory address can be computed directly from i with a single addition.
(I'm glossing over the fact that an array of boolean might be stored with a whole byte for each element, rather than the more efficient implementation of using every single bit for a different element. Any decent sieve will use a bitmap.)
Removing a number from the set is as simple as setting array[i] = false, regardless of the previous value. No searching, no comparison, no tracking of what happened, just one memory operation. (Well, two for a bitmap: load the old byte, clear the correct bit, store it back. Memory is byte-addressable, but not bit-addressable.)
An easy optimization of the bitmap-based sieve is to not even store the even-numbered bytes, because there is only one even prime, and we can special-case it to double our memory density. Then the membership-status of i is held in array[i/2]. (Dividing by powers of two is easy for computers. Other values are much slower.)
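As a hedged sketch of that idea in Python (my own illustration, using a bytearray, so one byte per odd number rather than one bit; a real bitmap would pack eight flags per byte):
def sum_primes_below(hi):
    # flags[i] says whether the odd number 2*i + 1 is still considered prime.
    flags = bytearray([1]) * ((hi + 1) // 2)
    flags[0] = 0  # 1 is not prime
    for i in range(1, int(hi ** 0.5) // 2 + 1):
        if flags[i]:
            p = 2 * i + 1
            start = (p * p) // 2  # index of p*p; step p covers the odd multiples of p
            flags[start::p] = bytearray(len(flags[start::p]))
    return 2 + sum(2 * i + 1 for i in range(1, len(flags)) if flags[i])

print(sum_primes_below(2000000))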
An SO question:
Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm? has many links to good material about the sieve. This one in particular has some good discussion about it, in words rather than just code. (Never mind the fact that it's talking about a common Haskell implementation that looks like a sieve but actually isn't; they call this the "unfaithful" sieve in their graphs, and so on.)
Discussion on that question brought up the point that trial division may be faster than big sieves for some uses, because clearing the bits for all multiples of every prime touches a lot of memory in a cache-unfriendly pattern. CPUs are much faster than memory these days.
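For reference, a hedged sketch of what trial division by a small list of primes looks like (my own illustration; in pure Python the interpreter overhead dominates, so this only shows the access pattern, not a fair benchmark):
def primes_up_to(limit):
    # Small helper sieve just to collect the trial-division primes.
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            flags[p * p::p] = [False] * len(flags[p * p::p])
    return [p for p, is_p in enumerate(flags) if is_p]

def is_prime(n, small_primes):
    # Trial division: each call only touches the small, cache-friendly prime list.
    for p in small_primes:
        if p * p > n:
            break
        if n % p == 0:
            return False
    return n >= 2

small = primes_up_to(1500)  # enough to test any n below 1500**2
print(sum(n for n in range(2, 100000) if is_prime(n, small)))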

Python prime number code giving runtime error (NZEC) on SPOJ

I am trying to get an accepted answer for this question: http://www.spoj.com/problems/PRIME1/
It's nothing new, just generating the prime numbers between two given numbers. Eventually, I coded the following, but SPOJ is giving me a runtime error (NZEC), and I have no idea how to deal with it. I hope you can help me with it. Thanks in advance.
def is_prime(m, n):
    myList = []
    mySieve = [True] * (n+1)
    for i in range(2, n+1):
        if mySieve[i]:
            myList.append(i)
            for x in range(i*i, n+1, i):
                mySieve[x] = False
    for a in [y for y in myList if y >= m]:
        print(a)

t = input()
count = 0
while count < int(t):
    m, n = input().split()
    count += 1
    is_prime(int(m), int(n))
    if count == int(t):
        break
    print("\n")
Looking at the problem definition:
In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.
Looking at your code:
mySieve= [True] * (n+1)
So, if n is 1000000000, you're going to try to create a list of 1000000001 boolean values. That means you're asking Python to allocate storage for a billion pointers. On a 64-bit platform, that's 8GB—which is fine as far as Python's concerned, but might well throw your system into swap hell or get it killed by a limit or watchdog. On a 32-bit platform, that's 4GB—which will guarantee you a MemoryError.
The problem also explicitly has this warning:
Warning: large Input/Output data, be careful with certain languages
So, if you want to implement it this way, you're going to have to come up with a more compact storage. For example, array.array('B', [True]) * (n+1) will only take 1GB instead of 4 or 8. And you can make it even smaller (128MB) if you store it in bits instead of bytes, but that's not quite as trivial a change to code.
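For example, a hedged sketch of just the storage change applied to the question's function (keeping the same names and logic; as noted above, at n = 10^9 this is still about 1 GB, so it only illustrates the compact-storage idea):
def is_prime(m, n):
    myList = []
    mySieve = bytearray([1]) * (n + 1)  # one byte per flag instead of a full Python object reference
    for i in range(2, n + 1):
        if mySieve[i]:
            myList.append(i)
            for x in range(i * i, n + 1, i):
                mySieve[x] = 0
    for a in myList:
        if a >= m:
            print(a)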
Calculating prime numbers only between two numbers is not really how a sieve works: you can only sieve primes up to a given number, using the other primes you found before, and then show only the range you wanted.
Here is some Python code and some precalculated primes you can continue from:
bzr branch http://bzr.ceremcem.net/calc-primes
This code is somewhat trivial, but it works correctly and is well tested.

Finding a Prime Sieve Inconsistency in Python

I'm attempting to learn python and I thought trying to develop my own prime sieve would be an interesting problem for the afternoon. When required thus far, I would just import a version of the Sieve of Eratosthenes that I found online -- it's this that I used as my benchmark.
After trying several different optimizations, I thought I had written a pretty decent sieve:
def sieve3(n):
    top = n+1
    sieved = dict.fromkeys(xrange(3,top,2), True)
    for si in sieved:
        if si * si > top:
            break
        if sieved[si]:
            for j in xrange((si*2) + si, top, si*2):  # [****]
                sieved[j] = False
    return [2] + [pr for pr in sieved if sieved[pr]]
Using the first 1,000,000 integers as my range, this code would generate the correct number of primes and was only about 3-5x slower than my benchmark. I was about to give up and pat myself on the back when I tried it on a larger range, but it no longer worked!
n = 1,000 -- Benchmark = 168 in 0.00010 seconds
n = 1,000 -- Sieve3 = 168 in 0.00022 seconds
n = 4,194,304 -- Benchmark = 295,947 in 0.288 seconds
n = 4,194,304 -- Sieve3 = 295,947 in 1.443 seconds
n = 4,194,305 -- Benchmark = 295,947 in 3.154 seconds
n = 4,194,305 -- Sieve3 = 2,097,153 in 0.8465 seconds
I think the problem comes from the line with [****], but I can't figure out why it's so broken. It's supposed to mark each odd multiple of 'j' as False and it works most of the time, but for anything above 4,194,304 the sieve is broken. (To be fair, it breaks on random other numbers too, like 10,000 for instance).
I made a change and it significantly slowed my code down, but it would actually work for all values. This version includes all numbers (not just odds) but is otherwise identical.
def sieve2(n):
    top = n+1
    sieved = dict.fromkeys(xrange(2,top), True)
    for si in sieved:
        if si * si > top:
            break
        if sieved[si]:
            for j in xrange((si*2), top, si):
                sieved[j] = False
    return [pr for pr in sieved if sieved[pr]]
Can anyone help me figure out why my original function (sieve3) doesn't work consistently?
Edit: I forgot to mention that when sieve3 'breaks', sieve3(n) returns n/2 values.
The sieve requires the loop over candidate primes to be ordered. The code in question is enumerating the keys of a dictionary, which are not guaranteed to be ordered. Instead, go ahead and use the xrange you used to initialize the dictionary for your main sieve loop as well as the return result loop as follows:
def sieve3(n):
    top = n+1
    sieved = dict.fromkeys(xrange(3,top,2), True)
    for si in xrange(3,top,2):
        if si * si > top:
            break
        if sieved[si]:
            for j in xrange(3*si, top, si*2):
                sieved[j] = False
    return [2] + [pr for pr in xrange(3,top,2) if sieved[pr]]
It's because dictionary keys are not ordered. Some of the time, by chance, for si in sieved: will loop through your keys in increasing order.
With your last example, the first value si gets is big enough to break the loop immediately.
You can simply use:
for si in sorted(sieved):
Well, look at the runtime -- you see that the runtime on the last case you showed was almost 5 times faster than the benchmark, while it had usually been 5 times slower. So that is a red flag, maybe you aren't performing all of the iterations? (And it is 5 times faster while having almost 10 times as many primes...)
I don't have time to look into the code more right now, but I hope this helps.
