A very important equation in statistical mechanics is Stirling's approximation for large numbers, ln N! ≈ N ln N - N (N >> 1). Write a Python program to verify this approximation. More specifically, evaluate the ratio ln N! / (N ln N - N) for N = 1000000.
Here is my program, but I can't get it to work. It doesn't give me an error; Python just hangs. I haven't been taught much numpy, so I haven't been using it.
from math import log

N = 1000000
N_factorial = 1
for i in range(1, N + 1):
    N_factorial = N_factorial * i
a = log(N_factorial)
b = N * log(N) - N
print(a / b)
You can just use the math.factorial() function:
>>> import math
>>> n = 1000000
>>> math.log(math.factorial(n))/(n*math.log(n)-n)
1.0000006107204127
However, using the logarithm product rule, you can instead sum the natural logs of the factors of n (since log(a*b) = log(a) + log(b), we have log(n!) = log(n) + log(n-1) + log(n-2) + ... + log(2) + log(1)):
>>> import math
>>> n = 1000000
>>> sum([math.log(i+1) for i in range(n)])/(n*math.log(n)-n)
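Another standard-library option: math.lgamma(n + 1) computes log(n!) directly from the gamma function, without ever forming the factorial or summing a million logs:
>>> import math
>>> n = 1000000
>>> math.lgamma(n + 1) / (n * math.log(n) - n)   # agrees with the values above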
LOL, why is everyone suggesting computing the factorial directly?
Clearly, you should be summing logs, not attempting to calculate a number with over five million digits. Your code should look like:
import math

def logfac(n):
    return sum(math.log(i) for i in range(1, n + 1))

def stirling(n):
    return n * math.log(n) - n

n = 1000000
print(logfac(n) / stirling(n))
I'm working on RSA, so I'm dealing with very large numbers (308 digits). In RSA, a number N is the product of two primes p and q.
My N:
20254083928313901046078299908836135556415829454193867459405514358320313885965296062600909040071281223146837763723113350068483510086809787065437344845044248205975654791622356467691953988928774211033663314876745580293750456921795999384782277674803240671474563131823612882192899349325870727676292313218782419561
For the task I'm completing, I have been given N and am trying to find the primes p and q by implementing the method from this other post: https://crypto.stackexchange.com/questions/87417/finding-p-and-q-in-rsa-with-a-given-n-p-q10000.
When I take the square root of N I get:
4500453746936401829977490795263804776361530154559603855210407318900755249674017838942492466443373259250056015327414929135301293865748694108450793034088448
And when I square this number I would expect to get N back; however, I get:
20254083928313899038600080147064458144896171593553283932412228091641105206147936089547530020826698707611325067918592113664216112071557998883417732874096894330570809935758528713783460134686650819864956839352000831110894044634083630533310853814832242550420262010702947392454262240042077177552422858018628042752
I'm not sure why I'm getting this result so any help would greatly be appreciated.
My code:

import math
modulo = 20254083928313901046078299908836135556415829454193867459405514358320313885965296062600909040071281223146837763723113350068483510086809787065437344845044248205975654791622356467691953988928774211033663314876745580293750456921795999384782277674803240671474563131823612882192899349325870727676292313218782419561
sqrt = math.sqrt(modulo)
print('%i' %(sqrt))
print('%i' %(sqrt*sqrt))
There are a couple of related algorithms that can be used, and the answer in your linked question shows some of the math behind them. The one I'll show is Fermat's factorization method.
In Python it's relatively easy to implement. I've implemented it myself from the Basic Method writeup in the Wikipedia article.
import math

def is_perfect_square(x: int):
    """
    Return (True, sqrt(x)) if x is a perfect square,
    else (False, ceil(sqrt(x))).
    """
    sqrt = math.isqrt(x)
    return (True, sqrt) if sqrt ** 2 == x else (False, sqrt + 1)

def fermat_factor(n: int):
    # Search for a such that a**2 - n is a perfect square b**2;
    # then n = (a + b) * (a - b).
    _, a = is_perfect_square(n)
    b2 = a ** 2 - n
    while True:
        is_square, sqrt = is_perfect_square(b2)
        if is_square:
            break
        b2 += a + a + 1     # advance to (a + 1)**2 - n incrementally
        a += 1
    return a + sqrt, a - sqrt
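As a quick sanity check on a small semiprime (59 × 101 = 5959):
>>> fermat_factor(5959)
(101, 59)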
This method is fast only when there is a factor close to the square root of the number you're trying to factor, and this will be true if n = p*q and p and q are relatively close together as in your problem.
The number you are testing is not a perfect square, so the value returned by math.sqrt has been rounded.
At this size that rounding error is huge: a float carries only about 16 significant digits, so you are losing all the accuracy beyond that. I would recommend looking at the decimal module and explicitly declaring your number as a Decimal with enough precision. This should help.
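A minimal sketch of that idea, reusing the modulo variable from the question; the working precision is set comfortably above the 308 digits of N:

import decimal

with decimal.localcontext() as ctx:
    ctx.prec = 320                        # > 308 significant digits
    root = decimal.Decimal(modulo).sqrt()
    print(root)                # the high-precision square root
    print(root * root)         # round-trips back to the modulus (within prec)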
Does anyone know how to write a program in Python that will calculate the sum of the harmonic series, i.e. 1 + 1/2 + 1/3 + 1/4 + ...?
@Kiv's answer is correct, but it is slow for large n if you don't need infinite precision. It is better to use an asymptotic formula in this case:
#!/usr/bin/env python
from math import log

def H(n):
    """Returns an approximate value of the n-th harmonic number.

    http://en.wikipedia.org/wiki/Harmonic_number
    """
    # Euler-Mascheroni constant
    gamma = 0.57721566490153286060651209008240243104215933593992
    return gamma + log(n) + 0.5/n - 1./(12*n**2) + 1./(120*n**4)
@Kiv's answer for Python 2.6:
from fractions import Fraction
harmonic_number = lambda n: sum(Fraction(1, d) for d in xrange(1, n+1))
Example:
>>> N = 100
>>> h_exact = harmonic_number(N)
>>> h = H(N)
>>> rel_err = abs(h - h_exact) / h_exact
>>> print N, "%r" % h, "%.2g" % rel_err
100 5.1873775176396242 6.8e-16
At N = 100 the relative error is less than 1e-15.
@recursive's solution is correct for a floating point approximation. If you prefer, you can get the exact answer in Python 3.0 using the fractions module:
>>> from fractions import Fraction
>>> def calc_harmonic(n):
... return sum(Fraction(1, d) for d in range(1, n + 1))
...
>>> calc_harmonic(20) # sum of the first 20 terms
Fraction(55835135, 15519504)
Note that the number of digits grows quickly so this will require a lot of memory for large n. You could also use a generator to look at the series of partial sums if you wanted to get really fancy.
Just a footnote on the other answers that used floating point: starting with the largest divisor and iterating downward (i.e., adding the smallest reciprocals first) puts off accumulated round-off error as much as possible.
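A minimal sketch of that idea:

def harmonic_reversed(n):
    # Add the smallest terms first, so the running total stays small
    # while the tiny terms are being accumulated.
    return sum(1.0 / d for d in range(n, 0, -1))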
A fast, accurate, smooth, complex-valued version of the H function can be calculated using the digamma function as explained here. The Euler-Mascheroni (gamma) constant and the digamma function are available in the numpy and scipy libraries, respectively.
from numpy import euler_gamma
from scipy.special import digamma

def digamma_H(s):
    """If s is complex the result becomes complex."""
    return digamma(s + 1) + euler_gamma

from fractions import Fraction
from math import log

def Kiv_H(n):
    return sum(Fraction(1, d) for d in xrange(1, n + 1))

def J_F_Sebastian_H(n):
    return euler_gamma + log(n) + 0.5/n - 1./(12*n**2) + 1./(120*n**4)
Here's a comparison of the three methods for speed and precision (with Kiv_H for reference):
            Kiv_H(x)           J_F_Sebastian_H(x)     digamma_H(x)
   x     seconds    bits       seconds    bits        seconds    bits
   1     5.06e-05   exact      2.47e-06   8.8         1.16e-05   exact
  10     4.45e-04   exact      3.25e-06   29.5        1.17e-05   52.6
 100     7.64e-03   exact      3.65e-06   50.4        1.17e-05   exact
1000     7.62e-01   exact      5.92e-06   52.9        1.19e-05   exact
The harmonic series diverges, i.e. its sum is infinite.
edit: Unless you want partial sums, but you weren't really clear about that.
This ought to do the trick.
def calc_harmonic(n):
    return sum(1.0/d for d in range(1, n + 1))
How about this:
partialsum = 0
for i in xrange(1, 1000000):
    partialsum += 1.0 / i
print partialsum
where 1000000 is the (exclusive) upper bound, so this sums 1/1 through 1/999999.
Homework?
It's a divergent series, so it's impossible to sum it for all terms.
I don't know Python, but I know how to write it in Java.
public class Harmonic
{
    private static final int DEFAULT_NUM_TERMS = 10;

    public static void main(String[] args)
    {
        int numTerms = ((args.length > 0) ? Integer.parseInt(args[0]) : DEFAULT_NUM_TERMS);
        System.out.println("sum of " + numTerms + " terms=" + sum(numTerms));
    }

    public static double sum(int numTerms)
    {
        double sum = 0.0;
        if (numTerms > 0)
        {
            for (int k = 1; k <= numTerms; ++k)
            {
                sum += 1.0/k;
            }
        }
        return sum;
    }
}
Using a simple for loop:
def harmonicNumber(n):
    x = 0
    for i in range(0, n):
        x = x + 1.0 / (i + 1)
    return x
I'll add another solution, this time using recursion, to find the n-th harmonic number.
General implementation details
Function prototype: harmonic_recursive(n)
Function parameter: n - which harmonic number to compute
Base case: if n equals 1, return 1.
Recursive step: otherwise, call harmonic_recursive for the (n-1)-th number and add 1/n to that result. This way each i-th term of the harmonic series is added to the sum of all the previous terms up to that point.
Pseudocode
(this solution can be implemented easily in other languages too.)
harmonic_recursive(n):
    if n == 1:
        return 1
    else:
        return 1/n + harmonic_recursive(n-1)
Python code
def harmonic_recursive(n):
    if n == 1:
        return 1
    else:
        return 1.0/n + harmonic_recursive(n-1)
By using the numpy module, you can alternatively use:
import numpy as np

def HN(n):
    return np.sum(1.0 / np.arange(1, n + 1))
I was working with numbers of 200 digits in Python. When finding the square root of a number using math.sqrt(n), I get a wrong answer.
In [1]: n = 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999982920000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000726067

In [2]: x = int(math.sqrt(n))

In [3]: x
Out[3]: 10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104L

In [4]: x*x
Out[4]: 10000000000000000318057822195198363465741073616700940607300526125800264077231077619856175974095677538298443892851483731336069235827852 3336313169161345893842466001164011496325176947445331439002442530816L

In [5]: math.sqrt(n)
Out[5]: 1e+100
The value of x comes out larger than expected, since x*x (201 digits) is larger than n (200 digits). What is happening here? Is there some concept I am getting wrong? How else can I find the root of very large numbers?
Using the decimal module:
import decimal
D = decimal.Decimal

n = D(99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999982920000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000726067)

with decimal.localcontext() as ctx:
    ctx.prec = 300
    x = n.sqrt()
    print(x)
    print(x*x)
    print(n-x*x)
yields
9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999145.99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999983754999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999998612677
99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999982920000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000726067.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0E-100
math.sqrt returns an IEEE-754 64-bit result, which is roughly 17 digits. There are other libraries that will work with high-precision values. In addition to the decimal and mpmath libraries mentioned above, I maintain the gmpy2 library (https://code.google.com/p/gmpy/).
>>> import gmpy2
>>> n=gmpy2.mpz(99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999982920000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000726067)
>>> gmpy2.get_context().precision=2048
>>> x=gmpy2.sqrt(n)
>>> x*x
mpfr('99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999982920000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000726067.0',2048)
The gmpy2 library can also return integer square roots (isqrt) or quickly check if an integer is an exact square (is_square).
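Note that newer Python versions (3.8+) also provide math.isqrt in the standard library, so an exact integer square root no longer needs a third-party package:
>>> import math
>>> r = math.isqrt(int(n))      # floor of the exact square root
>>> r*r <= n < (r+1)*(r+1)
True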
Here's an integer square root program using Hero's method that I wrote a while back. For the initial approximation it uses a number of half the bit length of the input value, so it starts converging pretty quickly. However, I haven't timed it to see if it's faster in Python than just using a simpler initial approximation. :)
#! /usr/bin/env python

''' Long integer square roots. Newton's method.
    Written by PM 2Ring. Adapted from C to Python 2008.10.19
'''

import sys

def root(m):
    # Get initial approximation
    n, a, k = m, 1, 0
    while n > a:
        n >>= 1
        a <<= 1
        k += 1
        #print k, ':', n, a

    # Go back one step & average
    a = n + (a >> 2)
    #print a

    # Apply Newton's method
    while k:
        a = (a + m // a) >> 1
        k >>= 1
        #print k, ':', a
    return a

def main():
    m = len(sys.argv) > 1 and int(sys.argv[1]) or 2*10L**100
    print "The Square Root of", m
    print root(m)

if __name__ == '__main__':
    main()
The task is to search every power of two below 2^10000, returning the exponent of the first power that contains a given string. For example, if the given string to search for is "7", the program will output 15, as 2^15 = 32768 is the first power of two containing a 7.
I have approached this with a brute force attempt which times out on ~70% of test cases.
for i in range(1, 9999):
    if search in str(2**i):
        print i
        break
How would one approach this with a time limit of 5 seconds?
Try not to compute 2^i at each step.
pow = 1
for i in xrange(1, 9999):
    pow *= 2                  # pow == 2**i, updated incrementally
    if search in str(pow):
        print i
        break
You can compute it as you go along. This should save a lot of computation time.
Using xrange will prevent a list from being built, but that will probably not make much of a difference here.
in is probably implemented as a quadratic string search algorithm. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
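If you want to experiment with that idea, here is a minimal KMP sketch (CPython's built-in substring search is already a tuned algorithm with good average-case behavior, so measure before switching):

def kmp_search(needle, haystack):
    """Return True if needle occurs in haystack; O(len(needle) + len(haystack))."""
    if not needle:
        return True
    # fail[i]: length of the longest proper prefix of needle[:i+1]
    # that is also a suffix of needle[:i+1].
    fail = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k and needle[i] != needle[k]:
            k = fail[k - 1]
        if needle[i] == needle[k]:
            k += 1
        fail[i] = k
    # Scan the haystack without ever backing up in it.
    k = 0
    for ch in haystack:
        while k and ch != needle[k]:
            k = fail[k - 1]
        if ch == needle[k]:
            k += 1
            if k == len(needle):
                return True
    return False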
A faster approach could be computing the numbers directly in decimal:
def double(x):
    # x holds a big number in base 10**8, least significant limb first
    carry = 0
    for i, v in enumerate(x):
        d = v*2 + carry
        if d > 99999999:
            x[i] = d - 100000000
            carry = 1
        else:
            x[i] = d
            carry = 0
    if carry:
        x.append(carry)
Then the search function can become:
def p2find(s):
    x = [1]
    for e in xrange(10000):
        # Rebuild the decimal string: most significant limb as-is,
        # every following limb zero-padded to 8 digits.
        if s in str(x[-1]) + "".join(("00000000" + str(limb))[-8:]
                                     for limb in x[::-1][1:]):
            return e
        double(x)
    return None
Note also that all the powers of two up to 2^10000 contain only about 15 million digits in total, and searching static data is much faster. If the program doesn't have to be restarted each time, then:
def p2find(s, digits=[]):
    if len(digits) == 0:
        # This precomputation happens only ONCE
        p = 1
        for k in xrange(10000):
            digits.append(str(p))
            p *= 2
    for i, v in enumerate(digits):
        if s in v:
            return i
    return None
With this approach the first call will take some time, but subsequent ones will be very fast.
Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.
I don't think you can beat this for computational complexity.
There are only 10000 numbers. You don't need any complex algorithms. Simply calculate them in advance and do the search. This should take merely 1 or 2 seconds.
powers_of_2 = [str(1 << i) for i in range(10000)]

def search(s):
    for i in range(len(powers_of_2)):
        if s in powers_of_2[i]:
            return i
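For example, matching the case from the question:
>>> search("7")
15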
Try this:
twos = []
twoslen = []
two = 1
for i in xrange(10000):
    twos.append(two)
    twoslen.append(len(str(two)))
    two *= 2

tens = []
ten = 1
for i in xrange(len(str(two))):
    tens.append(ten)
    ten *= 10

s = raw_input()
l = len(s)
n = int(s)
for i in xrange(len(twos)):
    for j in xrange(twoslen[i]):
        k = twos[i] / tens[j]
        if k < n:
            continue
        if (k - n) % tens[l] == 0:
            print i
            exit()
The idea is to precompute every power of 2 and of 10, and also the number of digits of every power of 2. The problem then reduces to finding the minimum i for which there exists a j such that, after removing the last j digits from 2 ** i, you obtain a number which ends with n; or, expressed as a formula, (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
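For example, with s = "7" (so n = 7 and l = 1): 2 ** 15 = 32768, and removing its last two digits (j = 2) leaves 327; since (327 - 7) % 10 == 0, the number 327 ends with 7, and the program prints 15, matching the brute-force answer.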
A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as @6502 did in his answer.
But it's very much faster to let Python's decimal module do it - at least under Python 3.3.2 (I don't know how much C acceleration is built in to Python decimal versions before that). Here's code:
class S:
    def __init__(self):
        import decimal
        decimal.getcontext().prec = 4000   # way more than enough for 2**10000
        p2 = decimal.Decimal(1)
        full = []
        for i in range(10000):
            s = "%s<%s>" % (p2, i)
            ##assert s == "%s<%s>" % (str(2**i), i)
            full.append(s)
            p2 *= 2
        self.full = "".join(full)

    def find(self, s):
        import re
        pat = s + r"[^<>]*<(\d+)>"
        m = re.search(pat, self.full)
        if m:
            return int(m.group(1))
        else:
            print(s, "not found!")
and sample usage:
>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!
s.full is a string with a bit over 15 million characters. It looks like this:
>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>
So the string contains each power of 2, with the exponent enclosed in angle brackets following the power. The find() method constructs a regular expression that matches the desired substring and then scans ahead to capture the exponent.
Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.
I am calculating the n-th Fibonacci number using
(a) a linear approach, and
(b) the closed-form (Binet) expression implemented in efib() below.
Python code:
'Different implementations for computing the n-th fibonacci number'

def lfib(n):
    'Find the n-th fibonacci number iteratively'
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a

def efib(n):
    'Compute the n-th fibonacci number using the closed-form formula'
    from math import sqrt, floor
    x = (1 + sqrt(5)) / 2
    return long(floor((x**n) / sqrt(5) + 0.5))

if __name__ == '__main__':
    for i in range(60, 80):
        if lfib(i) != efib(i):
            print i, "lfib:", lfib(i)
            print "   efib:", efib(i)
For n > 71 I see that the two functions return different values.
Is this due to floating point arithmetic involved in efib()?
If so, is it then advisable to calculate the number using the matrix form?
You are indeed seeing rounding errors.
The matrix form is the more accurate and much faster algorithm. Literateprograms.org lists a good implementation, but it also lists the following algorithm based on Lucas numbers:
def powLF(n):
    # Returns the pair (L(n), F(n)) of the n-th Lucas and Fibonacci numbers.
    if n == 1:
        return (1, 1)
    L, F = powLF(n // 2)
    L, F = (L**2 + 5 * F**2) >> 1, L * F            # L(2k), F(2k)
    if n & 1:
        return ((L + 5 * F) >> 1, (L + F) >> 1)     # step to L(2k+1), F(2k+1)
    else:
        return (L, F)

def fib(n):
    if n & 1:
        return powLF(n)[1]
    else:
        L, F = powLF(n // 2)
        return L * F                                # F(2k) = L(k) * F(k)
Take a look at Lecture 3 of the MIT Open Courseware course on algorithms for a good analysis of the matrix approach.
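The timings below refer to fibL (the Lucas-number fib above) and a matrix implementation fibM that isn't shown in the answer; here is a minimal sketch of what such a fibM might look like:

def mat_mult(A, B):
    # 2x2 integer matrix product
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def fibM(n):
    # [[1,1],[1,0]]**n == [[F(n+1), F(n)], [F(n), F(n-1)]]; raise it to the
    # n-th power by repeated squaring and read F(n) off the top-right entry.
    result = [[1, 0], [0, 1]]            # identity matrix
    base = [[1, 1], [1, 0]]
    while n:
        if n & 1:
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        n >>= 1
    return result[0][1]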
Both the above algorithm and the matrix approach have Θ(lg n) complexity, yet without the rounding problems of the closed-form formula. The Lucas-numbers approach has the lowest constant cost, making it the faster of the two (about twice as fast as the matrix approach):
>>> timeit.timeit('fib(1000)', 'from __main__ import fibM as fib', number=10000)
0.40711593627929688
>>> timeit.timeit('fib(1000)', 'from __main__ import fibL as fib', number=10000)
0.20211100578308105
Is this due to floating point arithmetic involved in efib()?
Yes, it is. Within efib you have
>>> log(x**72)/log(2)
49.98541778140445
and Python floats are IEEE-754 doubles with 53 bits of precision, so you're running close to the edge.
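You can confirm the precision of a Python float from the interpreter:
>>> import sys
>>> sys.float_info.mant_dig
53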
I have a very simple, pure Python version:
def fibonum(n):  # print the nth fibonacci number
    x = [0, 1]
    for i in range(2, n):
        x.append(x[i-2] + x[i-1])
    print(x[n-1])
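For example, with the sequence starting 0, 1, 1, 2, 3, ...:
>>> fibonum(10)
34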