Memory management for Python scripts

So I'm trying to solve some problems from Project Euler in Python. I'm currently working on Problem 92, square digit chains. The idea is that if you take any integer and repeatedly replace it with the sum of the squares of its digits (e.g. 42 → 4^2 + 2^2 = 20, then 2^2 + 0^2 = 4, etc.), you always end up at either 1 or 89.
I am trying to write a program that can compute how many numbers in the range 1 to 10^K end up at 89 and how many end up at 1. I am not trying to store which integers end up where, only how many. The goal is to be able to do that for the largest K possible. (This is a challenge from HackerRank, for those curious.)
In order to do this for large numbers within my lifetime, I need to use caching. But that's a balancing act between caching (which eventually takes up lots of RAM) and computing time.
My problem is that I eventually run out of memory. I have tried to cap the size of the cache I am using, but I still run out of memory, and I cannot seem to find what is causing it.
I am running it in PyCharm on Ubuntu 14.04 LTS.
My question:
Is there a way to check what is taking up my RAM? Is there some tool (or script) that would allow me to monitor memory use by the variables within my program? Or am I wrong in assuming that if I run out of RAM, it is necessarily because some variable in my program is too large? I have to admit I am not all that clear on the fine details of memory use within a program.
EDIT: I run out of memory when K = 8, i.e. for integers up to 10^8, which is not so large. I also tested below that (10^7 terminates, but takes some time and uses more memory than smaller computations). And capping the size of my cache variables doesn't seem to make a difference.
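(An editorial aside, not part of the original question: the standard library does ship a tool for exactly this. Assuming Python 3.4 or later, tracemalloc attributes allocations to source lines; a minimal sketch:)
import tracemalloc

tracemalloc.start()
# ... run the computation that eats memory ...
snapshot = tracemalloc.take_snapshot()
# print the ten source lines that allocated the most memory
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)
For a quick per-object check, sys.getsizeof(obj) reports the size of a single container in bytes (not counting the objects it references).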

I would suggest testing various cache sizes to see if it is actually beneficial to have as large a cache as possible.
If you take any 10-digit number and compute the sum of squares of its digits, the sum will be at most 10*9*9 = 810. Thus, if you cache the result for numbers 1 to 810, then you should be able to process all numbers with between 4 and 10 digits without recursion.
In this way, I processed the first 10^8 numbers in around 6 minutes, with memory usage staying constant at roughly 10 MB.
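The answer above gives no code; a minimal sketch of the approach it describes (the code and names below are mine, not the answerer's):
cache = [False] * 811   # cache[s] == True iff the chain starting at s reaches 89

def reaches_89(s):
    # s <= 810 here, so plain iteration terminates quickly
    while s != 1 and s != 89:
        s = sum(int(d)**2 for d in str(s))
    return s == 89

for s in range(1, 811):
    cache[s] = reaches_89(s)

def count_89(limit):
    # one digit-squaring step drops any i < 10**10 to at most 810,
    # after which the precomputed table answers immediately
    return sum(cache[sum(int(d)**2 for d in str(i))] for i in range(1, limit))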

This is a variation of Mathias Rav's excellent idea but keeps your idea of using a recursive function with memoization. The idea is to use a helper function to do the heavy lifting and have the main function do just the first step of the iteration. That very first step reduces the problem size to one for which caching is useful, so the cache remains small. I was able to do all numbers up to 10**8 in about 10 minutes (the overhead due to the recursion makes this solution less efficient than Mathias' solution):
cache = {}

def helper(n):
    if n == 1 or n == 89:
        return n
    elif n in cache:
        return cache[n]
    else:
        ss = sum(int(d)**2 for d in str(n))
        v = helper(ss)
        cache[n] = v
        return v

def f(n):
    ss = sum(int(d)**2 for d in str(n))
    return helper(ss)

def freq89(n):
    total = 0
    for i in range(1, n+1):
        if f(i) == 89: total += 1
    return total/n

This is an extended comment on the answers by Mathias Rav and John Coleman. I was going to make this a community wiki answer. John Coleman said not to do so, so I'm not.
I'll start with John Coleman's answer.
cache = {}

def helper(n):
    if n == 1 or n == 89:
        return n
    elif n in cache:
        return cache[n]
    else:
        ss = sum(int(d)**2 for d in str(n))
        v = helper(ss)
        cache[n] = v
        return v

def f(n):
    ss = sum(int(d)**2 for d in str(n))
    return helper(ss)
A small thing that will speed things up a bit is to avoid that first if in helper(n) by initializing cache to {1:some_value, 89:some_other_value}. The obvious initialization is {1:1, 89:89}. A less obvious, but ultimately faster initialization is {1:False, 89:True}. This enables changing if f(i) == 89: total += 1 to if f(i): total += 1.
Another small thing that sometimes helps is getting rid of the recursion. That doesn't help here, though. To get rid of the recursion, we'd have to do something along the lines of
def helper(n):
    l = []
    while n not in cache:
        l.append(n)
        n = sum(int(d)**2 for d in str(n))
    v = cache[n]
    for k in l:
        cache[k] = v
    return v
The problem is that almost all of the numbers encountered by f(n) will already be in the cache thanks to how helper is called from f(n). Getting rid of the recursion needlessly creates an empty list that needs to be garbage collected.
The big issue with John Coleman's answer is the calculation of the sum of the squares of the digits via sum(int(d)**2 for d in str(n)). While very pythonic, this is extremely expensive. I'll start by changing the variable ss in helper and in f into a function:
def ss(n):
    return sum(int(d)**2 for d in str(n))
This alone does nothing for performance. In fact, it hurts performance. Function calls are expensive in python. By making this a function, we can do some non-pythonic things by replacing the string operations with integer arithmetic:
def ss(n):
    s = 0
    while n != 0:
        d = n % 10
        n = n // 10
        s += d**2
    return s
The speedup here is quite significant; I get a 30% reduction in computation time. That's still not great, though. There's another problem: the use of the exponentiation operator. In almost any language but Fortran and Matlab, d*d is much faster than d**2. That's certainly the case in Python. That simple change almost halves the execution time on top of that already significant 30% reduction.
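If you want to verify micro-claims like this on your own machine, the standard timeit module is the usual tool (a sketch; absolute numbers vary by interpreter and version):
import timeit

print(timeit.timeit('d*d',  setup='d = 7'))   # multiplication
print(timeit.timeit('d**2', setup='d = 7'))   # exponentiation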
Putting this all together yields
cache = {1: False, 89: True}

def ss(n):
    s = 0
    while n != 0:
        d = n % 10
        n = n // 10
        s += d*d
    return s

def helper(n):
    if n in cache:
        return cache[n]
    else:
        v = helper(ss(n))
        cache[n] = v
        return v

def f(n):
    return helper(ss(n))

def freq89(n):
    total = 0
    for i in range(1, n+1):
        if f(i): total += 1
    return total/n

print(freq89(int(1e7)))
I have yet to take advantage of Mathias Rav's answer. In this case, it does make sense to get rid of the recursion. It also helps to embed the loop over the initial range inside the function that initializes the cache (function calls are expensive in Python).
N = int(1e7)
cache = {1: False, 89: True}

def ss(n):
    s = 0
    while n != 0:
        d = n % 10
        n //= 10
        s += d*d
    return s

def initialize_cache(maxsum):
    for n in range(1, maxsum+1):
        l = []
        while n not in cache:
            l.append(n)
            n = ss(n)
        v = cache[n]
        for k in l:
            cache[k] = v

def freq89(n):
    total = 0
    for i in range(1, n):
        if cache[ss(i)]:
            total += 1
    return total/n

maxsum = 81*len(str(N-1))
initialize_cache(maxsum)
print(freq89(N))
The above takes about 16.5 seconds (on my computer) to calculate the ratio for numbers between 1 (inclusive) and 10000000 (exclusive). This is almost three times faster than the initial version (44.7 seconds). It takes a bit over three minutes to calculate the ratio for numbers between 1 (inclusive) and 1e8 (exclusive).
It turns out I'm not done. There's no need to calculate the sum of the squares of the digits of (for example) 12345679 digit by digit when the program just did that for 12345678. A shortcut that reduces the calculation time for nine out of ten use cases pays off. The function ss(n) becomes a bit more complex:
prevn = 0
prevd = 0
prevs = 0

def ss(n):
    global prevn, prevd, prevs
    d = n % 10
    if (n == prevn+1) and (d == prevd+1):
        # n is the successor of the previous call and no digits rolled over,
        # so only the last digit changed: d*d - prevd*prevd == 2*prevd + 1
        s = prevs + 2*prevd + 1
        prevs = s
        prevn = n
        prevd = d
        return s
    s = 0
    prevn = n
    prevd = d
    while n != 0:
        d = n % 10
        n //= 10
        s += d*d
    prevs = s
    return s
With this, calculating the ratio for numbers up to (but not including) 1e7 takes 6.6 seconds, 68 seconds for numbers up to but not including 1e8.

Related

This program I am creating for Fibonacci returns None. Kindly solve this problem

def fibonacci(n):
    for i in range(n, 1):
        fab = 0
        if(i > 1):
            fab = fab + i
            i = i - 1
            return fab
        elif i == 0:
            return 0
        else:
            return 1

n1 = int(input("enter the nth term: "))
n2 = fibonacci(n1)
print(n2)
The only way your code can return None is if you enter an invalid range, where the start value is greater than or equal to the stop value (1).
You probably just need range(n) instead of range(n, 1).
You can do this too:
def fibonacci(n):
    return 0 if n == 1 else (1 if n == 2 else (fibonacci(n - 1) + fibonacci(n - 2) if n > 0 else None))
print(fibonacci(12))
You may need to use recursion for the nth Fibonacci number:
ex:
def Fibonacci(n):
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return Fibonacci(n-1) + Fibonacci(n-2)

print(Fibonacci(9))
# output: 21
If you do not plan to use large numbers, you can use the easy and simple typical recursive way of programming this function, although it may be slow for big numbers (n > 25 is noticeably slow), so take that into account.
def fibonacci(n):
    if n <= 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n-1) + fibonacci(n-2)
You can also add a cache for the numbers you have already computed in order to make it run much faster. It consumes a very small amount of extra memory but allows you to calculate larger numbers instantly (you can test it with fibonacci(1000), which is almost the largest number you can calculate due to the recursion limit).
cache_fib = {}

def fibonacci(n):
    if n <= 0:
        return 0
    if n == 1:
        return 1
    if n in cache_fib.keys():
        return cache_fib[n]
    result = fibonacci(n-1) + fibonacci(n-2)
    cache_fib[n] = result
    return result
In case you really need big numbers, you can use this trick to allow more recursion levels:
cache_fib = {1: 1}

def fibonacci(n):
    if n <= 0:
        return 0
    if n in cache_fib.keys():
        return cache_fib[n]
    max_cached = max(cache_fib.keys())
    if n - max_cached > 500:
        print("max_cached:", max_cached)
        fibonacci(max_cached + 500)
    result = fibonacci(n-1) + fibonacci(n-2)
    cache_fib[n] = result
    return result
range(n, 1) creates a range starting at n, incrementing in steps of 1, and stopping before 1. So in case n is negative or zero, your loop body will be executed. But in case n is 1 or larger, the loop body is never executed and the function just returns None.
If you would like a range going from n downwards to 1, you can use range(n, 1, -1), where -1 is the step value. Note, though, that the stop value is excluded when stepping down as well: range(5, 1, -1) is [5, 4, 3, 2], so to include 1 you need range(n, 0, -1). When stepping upwards, range(1, 5) is [1, 2, 3, 4]; the last element is not included either. range(n) with only one parameter also exists; it is equivalent to range(0, n) and goes from 0 to n-1, which means the loop body would be executed exactly n times.
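A quick interactive check of those claims (my sketch):
>>> list(range(5, 1, -1))
[5, 4, 3, 2]
>>> list(range(5, 0, -1))
[5, 4, 3, 2, 1]
>>> list(range(1, 5))
[1, 2, 3, 4]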
Also note that you write return in every branch of the if statement. That makes your function return a value and leave the for loop on the very first pass.
Further, note that you set fab = 0 at the start of your loop body, so it is reset to 0 again and again on each pass of the loop. It is better to put fab = 0 just before the start of the loop.
As others have mentioned, even with these changes, your function will not calculate the Fibonacci numbers. A recursive function is a simple though inefficient solution. Some fancy playing with two variables can calculate Fibonacci in a for loop. Another interesting approach is memoization or caching, as demonstrated by @Ganathor.
Here is a solution without recursion and without caching. Note that Fibonacci is a very special case where this works; recursion and caching are very useful tools for more real-world problems.
def fibonacci(n):
    a = 0
    b = 1
    for i in range(n):
        a, b = a + b, a  # note that a and b get new values simultaneously
    return a

print(fibonacci(100000))
And if you want really, really fast and fancy code:
def fibonacci_fast(n):
    a = 1
    b = 0
    p = 0
    q = 1
    while n > 0:
        if n % 2 == 0:
            p, q = p*p + q*q, 2*p*q + q*q
            n = n // 2
        else:
            a, b = b*q + a*q + a*p, b*p + a*q
            n = n - 1
    return b

print(fibonacci_fast(1000000))
Note that this relies on some special properties of the Fibonacci sequence. It also gets slow for Python to do calculations with really large numbers. The millionth Fibonacci number has more than 200,000 digits.

Most efficient way to find all factors with GMPY2 (or GMP)?

I know there's already a question similar to this, but I want to speed it up using GMPY2 (or something similar with GMP).
Here is my current code, it's decent but can it be better?
Edit: new code, checks divisors 2 and 3
def factors(n):
    result = set()
    result |= {mpz(1), mpz(n)}

    def all_multiples(result, n, factor):
        z = mpz(n)
        while gmpy2.f_mod(mpz(z), factor) == 0:
            z = gmpy2.divexact(z, factor)
            result |= {mpz(factor), z}
        return result

    result = all_multiples(result, n, 2)
    result = all_multiples(result, n, 3)
    for i in range(1, gmpy2.isqrt(n) + 1, 6):
        i1 = mpz(i) + 1
        i2 = mpz(i) + 5
        div1, mod1 = gmpy2.f_divmod(n, i1)
        div2, mod2 = gmpy2.f_divmod(n, i2)
        if mod1 == 0:
            result |= {i1, div1}
        if mod2 == 0:
            result |= {i2, div2}
    return result
If it's possible, I'm also interested in an implementation that checks divisors only up to n^(1/3) and 2^(2/3)*n^(1/3).
As an example, Mathematica's factoring is much faster than this Python code. I want to factor numbers between 20 and 50 decimal digits. I know ggnfs can factor these in less than 5 seconds.
I am also interested in whether any module implementing fast factorization exists for Python.
I just made some quick changes to your code to eliminate redundant name lookups. The algorithm is still the same but it is about twice as fast on my computer.
import gmpy2
from gmpy2 import mpz

def factors(n):
    result = set()
    n = mpz(n)
    for i in range(1, gmpy2.isqrt(n) + 1):
        div, mod = divmod(n, i)
        if not mod:
            result |= {mpz(i), div}
    return result

print(factors(12345678901234567))
Other suggestions will need more information about the size of the numbers, etc. For example, if you need all the possible factors, it may be faster to construct those from all the prime factors. That approach will let you decrease the limit of the range statement as you proceed and also will let you increment by 2 (after removing all the factors of 2).
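To illustrate that suggestion, a sketch (my code, not the answerer's) of building every divisor from a known prime factorization:
def divisors_from_primes(prime_powers):
    # prime_powers maps prime -> exponent, e.g. 360 = 2**3 * 3**2 * 5**1
    divs = [1]
    for p, e in prime_powers.items():
        divs = [d * p**k for d in divs for k in range(e + 1)]
    return sorted(divs)

print(divisors_from_primes({2: 3, 3: 2, 5: 1}))  # the 24 divisors of 360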
Update 1
I've made some additional changes to your code. I don't think your all_multiples() function is correct. Your range() statement isn't optimal since 2 is checked again, but my first fix made it worse.
The new code delays computing the co-factor until it knows the remainder is 0. I also tried to use the built-in functions as much as possible. For example, mpz % integer is faster than gmpy2.f_mod(mpz, integer) or gmpy2.f_mod(integer, mpz) where integer is a normal Python integer.
import gmpy2
from gmpy2 import mpz, isqrt

def factors(n):
    n = mpz(n)
    result = set()
    result |= {mpz(1), n}

    def all_multiples(result, n, factor):
        z = n
        f = mpz(factor)
        while z % f == 0:
            result |= {f, z // f}
            f += factor
        return result

    result = all_multiples(result, n, 2)
    result = all_multiples(result, n, 3)
    for i in range(1, isqrt(n) + 1, 6):
        i1 = i + 1
        i2 = i + 5
        if not n % i1:
            result |= {mpz(i1), n // i1}
        if not n % i2:
            result |= {mpz(i2), n // i2}
    return result

print(factors(12345678901234567))
I would change your program to just find all the prime factors less than the square root of n and then construct all the co-factors later. Then you decrease n each time you find a factor, check if n is prime, and only look for more factors if n isn't prime.
Update 2
The pyecm module should be able to factor the size numbers you are trying to factor. The following example completes in about a second.
>>> import pyecm
>>> list(pyecm.factors(12345678901234567890123456789012345678901, False, True, 10, 1))
[mpz(29), mpz(43), mpz(43), mpz(55202177), mpz(2928109491677), mpz(1424415039563189)]
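(An editorial note, not from the original answer: sympy also ships a general-purpose factorint that returns a {prime: exponent} dict; I have not benchmarked it against pyecm on numbers of this size.)
from sympy import factorint  # assumes sympy is installed

print(factorint(12345678901234567))  # {prime: exponent, ...}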
There exist various Python factoring modules on the Internet. But if you want to implement factoring yourself (without using external libraries), then I can suggest the quite fast and very easy to implement Pollard-Rho algorithm. I implemented it fully in my code below; just scroll down to the code (at the bottom of the answer) if you don't want to read the explanation.
With great probability the Pollard-Rho algorithm finds the smallest non-trivial factor P (not equal to 1 or N) within time O(Sqrt(P)). To compare, the Trial Division algorithm that you implemented in your question takes O(P) time to find a factor P. It means, for example, that if a prime factor is P = 1 000 003, then trial division will find it after 1 000 003 division operations, while Pollard-Rho on average will find it after just 1 000 operations (Sqrt(1 000 003) ≈ 1 000), which is much, much faster.
To make the Pollard-Rho algorithm much faster, we should be able to detect prime numbers, to exclude them from factoring and not waste time on them. For that, my code uses the Fermat Primality Test, which is very fast and easy to implement within just 7-9 lines of code.
The Pollard-Rho algorithm itself is very short, 13-15 lines of code; you can see it at the very bottom of my pollard_rho_factor() function. The remaining lines of code are supplementary helper functions.
I implemented all algorithms from scratch without using extra libraries (except the random module). That's why you can see my gcd() function there, although you could use Python's built-in math.gcd() instead (which finds the Greatest Common Divisor).
You can see the function Int() in my code; it is used just to convert Python's integers to GMPY2 integers. GMPY2 ints make the algorithm faster; you can just use Python's int(x) instead. I didn't use any specific GMPY2 function, just converted all ints to GMPY2 ints to get around a 50% speedup.
As an example I factor the first 190 digits of Pi!!! It takes 3-15 seconds to factor them. The Pollard-Rho algorithm is randomized, hence it takes a different amount of time to factor the same number on each run. You can restart the program and see that it prints a different running time.
Of course the factoring time depends greatly on the size of the prime divisors. Some 50-200 digit numbers can be factored in a fraction of a second, some will take months. My example with 190 digits of Pi has quite small prime factors, except the largest one; that's why it is fast. Other digits of Pi may not be that fast to factor. So the digit-size of the number doesn't matter very much; only the size of its prime factors matters.
I intentionally implemented pollard_rho_factor() as one standalone function, without breaking it into smaller separate functions, although this breaks Python's style guide, which (as I remember) suggests not to have nested functions and to place all possible functions at global scope. The style guide also suggests doing all imports at global scope in the first lines of the script. I made it a single function intentionally so that it is easy to copy-paste and fully ready to use in your code. The Fermat primality test sub-function is_fermat_probable_prime() is also copy-pastable and works without extra dependencies.
In very rare cases the Pollard-Rho algorithm may fail to find a non-trivial prime factor, especially for very small factors; for example, you can replace n inside test() with the small number 4 and see that Pollard-Rho fails. For such small failed factors you can easily use the Trial Division algorithm that you implemented in your question.
Try it online!
def pollard_rho_factor(N, *, trials = 16):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random

    def Int(x):
        import gmpy2
        return gmpy2.mpz(x) # int(x)

    def is_fermat_probable_prime(n, *, trials = 32):
        # https://en.wikipedia.org/wiki/Fermat_primality_test
        import random
        if n <= 16:
            return n in (2, 3, 5, 7, 11, 13)
        for i in range(trials):
            if pow(random.randint(2, n - 2), n - 1, n) != 1:
                return False
        return True

    def gcd(a, b):
        # https://en.wikipedia.org/wiki/Greatest_common_divisor
        # https://en.wikipedia.org/wiki/Euclidean_algorithm
        while b != 0:
            a, b = b, a % b
        return a

    def found(f, prime):
        print(f'Found {("composite", "prime")[prime]} factor, {math.log2(f):>7.03f} bits... {("Pollard-Rho failed to fully factor it!", "")[prime]}')
        return f

    N = Int(N)
    if N <= 1:
        return []
    if is_fermat_probable_prime(N):
        return [found(N, True)]
    for j in range(trials):
        i, stage, y, x = 0, 2, Int(1), Int(random.randint(1, N - 2))
        while True:
            r = gcd(N, abs(x - y))
            if r != 1:
                break
            if i == stage:
                y = x
                stage <<= 1
            x = (x * x + 1) % N
            i += 1
        if r != N:
            return sorted(pollard_rho_factor(r) + pollard_rho_factor(N // r))
    return [found(N, False)] # Pollard-Rho failed

def test():
    import time
    # http://www.math.com/tables/constants/pi.htm
    # pi = 3.
    # 1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679
    # 8214808651 3282306647 0938446095 5058223172 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196
    # n = first 190 fractional digits of Pi
    n = 1415926535_8979323846_2643383279_5028841971_6939937510_5820974944_5923078164_0628620899_8628034825_3421170679_8214808651_3282306647_0938446095_5058223172_5359408128_4811174502_8410270193_8521105559_6446229489
    tb = time.time()
    print('N:', n)
    print('Factors:', pollard_rho_factor(n))
    print(f'Time: {time.time() - tb:.03f} sec')

test()
Output:
N: 1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489
Found prime factor, 1.585 bits...
Found prime factor, 6.150 bits...
Found prime factor, 20.020 bits...
Found prime factor, 27.193 bits...
Found prime factor, 28.311 bits...
Found prime factor, 545.087 bits...
Factors: [mpz(3), mpz(71), mpz(1063541), mpz(153422959), mpz(332958319), mpz(122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473)]
Time: 2.963 sec

Finding digits in powers of 2 fast

The task is to search every power of two below 2^10000, returning the index of the first power in which a given string is contained. For example, if the string to search for is "7", the program will output 15, as 2^15 = 32768 is the first power to contain a 7.
I have approached this with a brute force attempt which times out on ~70% of test cases.
for i in range(1, 9999):
    if search in str(2**i):
        print i
        break
How would one approach this with a time limit of 5 seconds?
Try not to compute 2^i at each step.
pow = 1
for i in xrange(1, 9999):
    if search in str(pow):
        print i
        break
    pow *= 2
You can compute it as you go along. This should save a lot of computation time.
Using xrange will prevent a list from being built, but that will probably not make much of a difference here.
in is probably implemented as a quadratic string search algorithm. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
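For reference, a textbook KMP substring search (an editorial sketch, not the answerer's code; CPython's own str search is already heavily tuned, so measure before swapping it out):
def kmp_find(text, pattern):
    # failure table: fail[i] = length of the longest proper prefix of
    # pattern[:i+1] that is also a suffix of it (pattern assumed non-empty)
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # scan the text, never moving backwards in it
    k = 0
    for i, c in enumerate(text):
        while k and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
            if k == len(pattern):
                return i - k + 1  # index of the first match
    return -1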
A faster approach could be computing the numbers directly in decimal
def double(x):
    # x is a number stored as a list of base-100000000 "digits",
    # least significant chunk first; this doubles it in place
    carry = 0
    for i, v in enumerate(x):
        d = v*2 + carry
        if d > 99999999:
            x[i] = d - 100000000
            carry = 1
        else:
            x[i] = d
            carry = 0
    if carry:
        x.append(carry)
Then the search function can become
def p2find(s):
    x = [1]
    for y in xrange(10000):
        # rebuild the full decimal string of 2**y from the chunks,
        # padding every chunk but the most significant to 8 digits
        if s in str(x[-1]) + "".join(("00000000" + str(v))[-8:]
                                     for v in x[::-1][1:]):
            return y
        double(x)
    return None
Note also that the digits of all powers of two up to 2^10000 add up to just 15 million, and searching that static data is much faster. If the program must not be restarted each time, then
def p2find(s, digits = []):
    if len(digits) == 0:
        # This precomputation happens only ONCE
        p = 1
        for k in xrange(10000):
            digits.append(str(p))
            p *= 2
    for i, v in enumerate(digits):
        if s in v: return i
    return None
With this approach the first call will take some time; subsequent ones will be very, very fast.
Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.
I don't think you can beat this for computational complexity.
There are only 10000 numbers. You don't need any complex algorithms. Simply calculate them in advance and do the search. This should take merely 1 or 2 seconds.
powers_of_2 = [str(1<<i) for i in range(10000)]

def search(s):
    for i in range(len(powers_of_2)):
        if s in powers_of_2[i]:
            return i
Try this
twos = []
twoslen = []
two = 1
for i in xrange(10000):
    twos.append(two)
    twoslen.append(len(str(two)))
    two *= 2

tens = []
ten = 1
for i in xrange(len(str(two))):
    tens.append(ten)
    ten *= 10

s = raw_input()
l = len(s)
n = int(s)
for i in xrange(len(twos)):
    for j in xrange(twoslen[i]):
        k = twos[i] / tens[j]
        if k < n: continue
        if (k - n) % tens[l] == 0:
            print i
            exit()
The idea is to precompute every power of 2 and of 10, and also the number of digits of every power of 2. In this way the problem is reduced to finding the minimum i for which there exists a j such that, after removing the last j digits from 2 ** i, you obtain a number which ends with n; expressed as a formula, (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as @6502 did in his answer.
But it's very much faster to let Python's decimal module do it - at least under Python 3.3.2 (I don't know how much C acceleration is built into the decimal module in versions before that). Here's the code:
class S:
    def __init__(self):
        import decimal
        decimal.getcontext().prec = 4000 # way more than enough for 2**10000
        p2 = decimal.Decimal(1)
        full = []
        for i in range(10000):
            s = "%s<%s>" % (p2, i)
            ##assert s == "%s<%s>" % (str(2**i), i)
            full.append(s)
            p2 *= 2
        self.full = "".join(full)

    def find(self, s):
        import re
        pat = s + r"[^<>]*<(\d+)>"
        m = re.search(pat, self.full)
        if m:
            return int(m.group(1))
        else:
            print(s, "not found!")
and sample usage:
>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!
s.full is a string with a bit over 15 million characters. It looks like this:
>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>
So the string contains each power of 2, with the exponent following the power, enclosed in angle brackets. The find() method constructs a regular expression that searches for the desired substring and then scans ahead to the next <exponent> marker to recover the power's index.
Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.

Is there any more optimal way to solve this idempotent equation (modulo n ring)?

This is one of the problems on Project Euler:
If we calculate a^2 mod 6 for 0 <= a <= 5 we get: 0, 1, 4, 3, 4, 1.
The largest value of "a" such that a^2 mod 6 = a is 4.
Let's call M(n) the largest value of a < n such that a^2 mod n = a.
So M(6) = 4.
Find M(n) for 1 <=n <=10^7.
So far, this is what I have:
import time
start = time.time()
from math import sqrt

squares = []
for numba in xrange(0, 10000001/2+2):
    squares.append(numba*numba)

def primes1(n):
    """ Returns a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3, int(sqrt(n))+1, 2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1, n/2) if sieve[i]]

tot = 0
gor = primes1(10000001)

def factor1(n):
    '''Returns whether a number has more than 1 prime factor'''
    boo = False
    '''if n in gor:
        return True'''
    for e in xrange(0, len(gor)):
        z = gor[e]
        if n % z == 0:
            if boo:
                return False
            boo = True
        elif z*2 > n:
            break
    return True

for n in xrange(2, 10000001):
    if factor1(n):
        tot += 1
    else:
        for a in xrange(int(sqrt(n))+1, n/2+1):
            if squares[a] % n == a:
                tot += n+1-a
                break

print tot
print time.time()-start
I've tried this code on smaller cases and it works perfectly; however, it is way too slow to do the 10^7 cases.
Currently, for n being less than 20000, it runs in about 8 seconds.
When n is less than 90000, it runs in about 150 seconds.
As far as I can tell, for n is less than 10^7, it will run for many hours if not days.
I'm already using a sieve to generate the prime numbers, so that part is as fast as it can be. Is there anything I can do to speed up the rest of the code?
I've already tried different compilers/runtimes like Psyco, PyPy, and Shed Skin. Psyco provides a minimal increase, Shed Skin speeds it up about 7 times but produces errors when large numbers occur, and PyPy speeds it up the most (about 20-30x). But even then, it's still not fast enough for the number of cases it has to go through.
Edit:
I added
squares = []
for numba in xrange(0, 10000001/2+2):
    squares.append(numba*numba)
This pre-generates all the squares of a beforehand so that I don't have to keep generating the same ones over and over again. The program became slightly faster, but still not fast enough.
This might depend on the size of N because of memory usage, but in smaller tests I found something of an improvement by precalculating the factor counts. So something like this:
factors = [0]*N
for z in gor:
    for n in xrange(1, N):
        m = z*n
        if m >= N: break
        factors[m] += 1
where N is 10000001, or whatever counter you're using.
Then instead of if factor1(n) you do if factors[n] < 2.
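A self-contained variant of the same idea (an editorial sketch, not the answerer's code): the sieve can count distinct prime factors for every number below N in a single pass, with no separate prime list:
N = 10000001
factor_count = [0] * N
for p in xrange(2, N):
    if factor_count[p] == 0:        # no smaller prime marked p, so p is prime
        for m in xrange(p, N, p):
            factor_count[m] += 1
# factor_count[n] < 2 now plays the role of factor1(n)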

Project Euler #25: Keep getting OverflowError (result too large) - is it to do with calculating the Fibonacci numbers?

I'm working on solving the Project Euler problem 25:
What is the first term in the Fibonacci sequence to contain 1000
digits?
My piece of code works for smaller numbers of digits, but when I try 1000 digits, I get the error:
OverflowError: (34, 'Result too large')
I'm thinking it may be how I compute the Fibonacci numbers, but I've tried several different methods and I get the same error.
Here's my code:
'''
What is the first term in the Fibonacci sequence to contain 1000 digits
'''
def fibonacci(n):
    phi = (1 + pow(5, 0.5))/2 #Golden Ratio
    return int((pow(phi, n) - pow(-phi, -n))/pow(5, 0.5)) #Formula: http://bit.ly/qDumIg

n = 0
while len(str(fibonacci(n))) < 1000:
    n += 1
print n
Do you know what may be the cause of this problem and how I could alter my code to avoid it?
Thanks in advance.
The problem here is that only integers in Python have unlimited length; floating point values are still calculated using normal IEEE types, which have a maximum precision.
As such, since you're using an approximation, using floating point calculations, you will get that problem eventually.
Instead, try calculating the Fibonacci sequence the normal way, one number (of the sequence) at a time, until you get to 1000 digits.
ie. calculate 1, 1, 2, 3, 5, 8, 13, 21, 34, etc.
By "normal way" I mean this:
         / 1                   , n < 3
Fib(n) = |
         \ Fib(n-2) + Fib(n-1) , n >= 3
Note that the "obvious" approach given the above formulas is wrong for this particular problem, so I'll post the code for the wrong approach just to make sure you don't waste time on that:
def fib(n):
    if n <= 3:
        return 1
    else:
        return fib(n-2) + fib(n-1)

n = 1
while True:
    f = fib(n)
    if len(str(f)) >= 1000:
        print("#%d: %d" % (n, f))
        exit()
    n += 1
On my machine, the above code starts going really slow at around the 30th fibonacci number, which is still only 6 digits long.
I modified the above recursive approach to output the number of calls to the fib function for each number, and here are some values:
#1: 1
#10: 67
#20: 8361
#30: 1028457
#40: 126491971
I can reveal that the first Fibonacci number with 1000 digits or more is the 4782nd number in the sequence (unless I miscalculated), and so the number of calls to the fib function in a recursive approach would be this number:
1322674645678488041058897524122997677251644370815418243017081997189365809170617080397240798694660940801306561333081985620826547131665853835988797427277436460008943552826302292637818371178869541946923675172160637882073812751617637975578859252434733232523159781720738111111789465039097802080315208597093485915332193691618926042255999185137115272769380924184682248184802491822233335279409301171526953109189313629293841597087510083986945111011402314286581478579689377521790151499066261906574161869200410684653808796432685809284286820053164879192557959922333112075826828349513158137604336674826721837135875890203904247933489561158950800113876836884059588285713810502973052057892127879455668391150708346800909439629659013173202984026200937561704281672042219641720514989818775239313026728787980474579564685426847905299010548673623281580547481750413205269166454195584292461766536845931986460985315260676689935535552432994592033224633385680958613360375475217820675316245314150525244440638913595353267694721961
And that is just for the 4782th number. The actual value is the sum of all those values for all the fibonacci numbers from 1 up to 4782. There is no way this will ever complete.
In fact, if we gave the code 1 year of running time (simplified as 365 days), and assuming that the machine could make 10,000,000,000 calls every second, the algorithm would only get as far as the 83rd number, which is still only 18 digits long.
Actually, although the advice given above to avoid floating-point numbers is generally good advice for Project Euler problems, in this case it is incorrect. Fibonacci numbers can be computed by the formula F_n = round(phi^n / sqrt(5)), so the first Fibonacci number with a thousand digits can be found by solving 10^999 < phi^n / sqrt(5). Taking the logarithm to base ten of both sides -- recall that sqrt(5) is the same as 5^(1/2) -- gives 999 < n log_10(phi) - 1/2 log_10(5), and solving for n gives (999 + 1/2 log_10(5)) / log_10(phi) < n. The left-hand side of that inequality evaluates to 4781.85927, so the smallest n that gives a thousand digits is 4782.
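The arithmetic is easy to check (a quick sketch):
import math

phi = (1 + math.sqrt(5)) / 2
bound = (999 + 0.5 * math.log10(5)) / math.log10(phi)
print(bound)  # ~4781.86, so n = 4782 is the first term with 1000 digits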
You can use the sliding window trick to compute the terms of the Fibonacci sequence iteratively, rather than using the closed form (or doing it recursively as it's normally defined).
The Python version for finding fib(n) is as follows:
def fib(n):
    a = 1
    b = 1
    for i in range(2, n):
        b = a + b
        a = b - a
    return b
This works when F(1) is defined as 1, as it is in Project Euler 25.
I won't give the exact solution to the problem here, but the code above can be reworked so it keeps track of n until a sentinel value (10**999, the smallest 1000-digit number) is reached.
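For what it's worth, one way that rework could look (an editorial sketch, not the answerer's code; the neighboring answers give the same approach anyway):
def first_term_with_digits(limit=10**999):
    a, b, n = 1, 1, 2   # b is F(n), with F(1) = F(2) = 1
    while b < limit:    # loop until F(n) has at least 1000 digits
        a, b = b, a + b
        n += 1
    return n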
An iterative solution such as this one has no trouble executing. I get the answer in less than a second.
def fibonacci():
    current = 0
    previous = 1
    while True:
        temp = current
        current = current + previous
        previous = temp
        yield current

def main():
    for index, element in enumerate(fibonacci()):
        if len(str(element)) >= 1000:
            answer = index + 1  # enumerate starts from 0
            break
    print(answer)

main()
import math as m
import time

start = time.time()
fib0 = 0
fib1 = 1
n = 0
k = 0
count = 1
while k < 1000:
    n = fib0 + fib1
    k = int(m.log10(n)) + 1
    fib0 = fib1
    fib1 = n
    count += 1
print n
print count
print time.time() - start
This takes 0.005388 s on my PC. I did nothing fancy, just followed simple code.
Iteration will always be better here; recursion was taking too long for me as well.
I also used a math function to calculate the number of digits in a number, instead of putting the digits in a list and iterating through them. That saves a lot of time.
Here is my very simple solution
list = [1, 1, 2]
for i in range(2, 5000):
    if len(str(list[i] + list[i-1])) == 1000:
        print (i + 2)
        break
    else:
        list.append(list[i] + list[i-1])
This is sort of a "rogue" way of doing it, but if you change the 1000 to any number except one, it gets it right.
You can use the Decimal datatype. It is a little slower, but you get arbitrary precision. So your code becomes:
'''
What is the first term in the Fibonacci sequence to contain 1000 digits
'''
from decimal import Decimal, getcontext

getcontext().prec = 1100  # the default precision (28 digits) is far too low for 1000-digit results

def fibonacci(n):
    phi = (Decimal(1) + pow(Decimal(5), Decimal("0.5"))) / 2 #Golden Ratio
    return int((pow(phi, Decimal(n)) - pow(-phi, Decimal(-n))) / pow(Decimal(5), Decimal("0.5")))

n = 0
while len(str(fibonacci(n))) < 1000:
    n += 1
print n
