Speed up a for loop [duplicate] - python

Does anyone know how to write a program in Python that will calculate the addition of the harmonic series. i.e. 1 + 1/2 +1/3 +1/4...

#Kiv's answer is correct but it is slow for large n if you don't need an infinite precision. It is better to use an asymptotic formula in this case:
#!/usr/bin/env python
from math import log
def H(n):
"""Returns an approximate value of n-th harmonic number.
http://en.wikipedia.org/wiki/Harmonic_number
"""
# Euler-Mascheroni constant
gamma = 0.57721566490153286060651209008240243104215933593992
return gamma + log(n) + 0.5/n - 1./(12*n**2) + 1./(120*n**4)
#Kiv's answer for Python 2.6:
from fractions import Fraction
harmonic_number = lambda n: sum(Fraction(1, d) for d in xrange(1, n+1))
Example:
>>> N = 100
>>> h_exact = harmonic_number(N)
>>> h = H(N)
>>> rel_err = (abs(h - h_exact) / h_exact)
>>> print n, "%r" % h, "%.2g" % rel_err
100 5.1873775176396242 6.8e-16
At N = 100 relative error is less then 1e-15.

#recursive's solution is correct for a floating point approximation. If you prefer, you can get the exact answer in Python 3.0 using the fractions module:
>>> from fractions import Fraction
>>> def calc_harmonic(n):
... return sum(Fraction(1, d) for d in range(1, n + 1))
...
>>> calc_harmonic(20) # sum of the first 20 terms
Fraction(55835135, 15519504)
Note that the number of digits grows quickly so this will require a lot of memory for large n. You could also use a generator to look at the series of partial sums if you wanted to get really fancy.

Just a footnote on the other answers that used floating point; starting with the largest divisor and iterating downward (toward the reciprocals with largest value) will put off accumulated round-off error as much as possible.

A fast, accurate, smooth, complex-valued version of the H function can be calculated using the digamma function as explained here. The Euler-Mascheroni (gamma) constant and the digamma function are available in the numpy and scipy libraries, respectively.
from numpy import euler_gamma
from scipy.special import digamma
def digamma_H(s):
""" If s is complex the result becomes complex. """
return digamma(s + 1) + euler_gamma
from fractions import Fraction
def Kiv_H(n):
return sum(Fraction(1, d) for d in xrange(1, n + 1))
def J_F_Sebastian_H(n):
return euler_gamma + log(n) + 0.5/n - 1./(12*n**2) + 1./(120*n**4)
Here's a comparison of the three methods for speed and precision (with Kiv_H for reference):
Kiv_H(x) J_F_Sebastian_H(x) digamma_H(x)
x seconds bits seconds bits seconds bits
1 5.06e-05 exact 2.47e-06 8.8 1.16e-05 exact
10 4.45e-04 exact 3.25e-06 29.5 1.17e-05 52.6
100 7.64e-03 exact 3.65e-06 50.4 1.17e-05 exact
1000 7.62e-01 exact 5.92e-06 52.9 1.19e-05 exact

The harmonic series diverges, i.e. its sum is infinity..
edit: Unless you want partial sums, but you weren't really clear about that.

This ought to do the trick.
def calc_harmonic(n):
return sum(1.0/d for d in range(2,n+1))

How about this:
partialsum = 0
for i in xrange(1,1000000):
partialsum += 1.0 / i
print partialsum
where 1000000 is the upper bound.

Homework?
It's a divergent series, so it's impossible to sum it for all terms.
I don't know Python, but I know how to write it in Java.
public class Harmonic
{
private static final int DEFAULT_NUM_TERMS = 10;
public static void main(String[] args)
{
int numTerms = ((args.length > 0) ? Integer.parseInt(args[0]) : DEFAULT_NUM_TERMS);
System.out.println("sum of " + numTerms + " terms=" + sum(numTerms));
}
public static double sum(int numTerms)
{
double sum = 0.0;
if (numTerms > 0)
{
for (int k = 1; k <= numTerms; ++k)
{
sum += 1.0/k;
}
}
return sum;
}
}

Using the simple for loop
def harmonicNumber(n):
x=0
for i in range (0,n):
x=x+ 1/(i+1)
return x

I add another solution, this time using recursion, to find the n-th Harmonic number.
General implementation details
Function Prototype: harmonic_recursive(n)
Function Parameters: n - the n-th Harmonic number
Base case: If n equals 1 return 1.
Recur step: If not the base case, call harmonic_recursive for the n-1 term and add that result with 1/n. This way we add each time the i-th term of the Harmonic series with the sum of all the previous terms until that point.
Pseudocode
(this solution can be implemented easily in other languages too.)
harmonic_recursive(n):
if n == 1:
return 1
else:
return 1/n + harmonic_recursive(n-1)
Python code
def harmonic_recursive(n):
if n == 1:
return 1
else:
return 1.0/n + harmonic_recursive(n-1)

By using the numpy module, you can also alternatively use:
import numpy as np
def HN(n):
return sum(1/arange(1,n+1))

Related

Split up a number as the sum of two powers

x is my input.
I need to find i,j>=0 and n,m>1 such as x = i**m+j**n
For now I have been doing this , but it is way to slow !! How can I improve it ?
from math import sqrt
import numpy as np
def check(x):
for i in range(1,int(np.ceil(sqrt(x)))):
for j in range(1,int(np.ceil(sqrt(x)))):
for m in range(2,x/2+1):
for n in range(2,x/2+1):
if((pow(i,m) +pow(j,n))==x):
print 'Yes';
return ;
print 'No';
thank you !
You could reverse the process, by finding all powers (i**m) smaller than x. Then you just check if any pair of these powers adds up to x.
def check(x):
all_powers = set([1]) #add 1 as a special case
#find all powers smaller than x
for base in range(2,int(math.ceil(sqrt(x)))):
exponent = 2;
while pow(base, exponent) < x:
all_powers.add(pow(base, exponent))
exponent+=1
#check if a pair of elements in all_powers adds up to x
for power in all_powers:
if (x - power) in all_powers:
print 'Yes'
return
print 'No'
The code above is simple but can be optimized, e.g., by integrating the check if a pair adds up to x in the while loop you can stop early in most cases.
From time to time a question pops up here about determining if a positive integer is the integral power of another positive integer. I.e. given positive integer z find positive integers j and n such that z == j**n. This can be done in time complexity O(log(z)) so it is fairly fast.
So find or develop such a routine: call it is_a_power(z) which returns a tuple (j, n) if z is such a power of (0, 0) if it is not. Then loop over i and m, then check if x - i**m is a power. When it becomes one, you are done.
I'll let you finish the code from here, except for one more pointer. Given x and i where i > 1, you can find the upper limit on m such that i**m <= x with
m <= log(x) / log(i)
Note that i == 1 is a special case, since i**m does not actually depend on m in that case.
from math import sqrt
import numpy as np
def build(x):
# this function creates number that are in form of
# a^b such that a^b <= x and b>1
sq=sqrt(x);
dict[1]:1; # 1 is always obtainable
dict[0]:1; # also 0 is always obtainable
for i in range(1,sq): # try the base
number=i*i; # firstly our number is i^2
while number<=x:
dict[number]:1; # this number is in form of a^b
number*=i; # increase power of the number
def check(x):
sq=sqrt(x);
for i in range(1,sq): # we will try base of the first number
firstnumber=1;
while firstnumber<=x: # we are trying powers of i
remaining=x-firstnumber; # this number is remaining number when we substract firstnumber from x
if dict[remaining]==1: # if remaining number is in dictionary which means it is representable as a^b
print("YES"); # then print YES
return ;
firstnumber*=i; # increase the power of the base
print("NO");
return ;
Above code works in O( sqrt(x) * log(x) * log(x) ) which is faster.
You can read comments in code to understand it.

Sum of powers for lists of tuples

My assignment is to create a function to sum the powers of tuples.
def sumOfPowers(tups, primes):
x = 0;
for i in range (1, len(primes) + 1):
x += pow(tups, i);
return x;
So far I have this.
tups - list of one or more tuples, primes - list of one or more primes
It doesn't work because the inputs are tuples and not single integers. How could I fix this to make it work for lists?
[/edit]
Sample output:
sumOfPowers([(2,3), (5,6)], [3,5,7,11,13,17,19,23,29]) == 2**3 + 5**6
True
sumOfPowers([(2,10**1000000 + 1), (-2,10**1000000 + 1), (3,3)], primes)
27
Sum of powers of [(2,4),(3,5),(-6,3)] is 2^4 + 3^5 + (−6)^3
**The purpose of the prime is to perform the computation of a^k1 + ... a^kn modulo every prime in the list entered. (aka perform the sum computation specified by each input modulo each of the primes in the second input list, then solve using the chinese remainder theorem )
Primes list used in the example input:
15481619,15481633,15481657,15481663,15481727,15481733,15481769,15481787
,15481793,15481801,15481819,15481859,15481871,15481897,15481901,15481933
,15481981,15481993,15481997,15482011,15482023,15482029,15482119,15482123
,15482149,15482153,15482161,15482167,15482177,15482219,15482231,15482263
,15482309,15482323,15482329,15482333,15482347,15482371,15482377,15482387
,15482419,15482431,15482437,15482447,15482449,15482459,15482477,15482479
,15482531,15482567,15482569,15482573,15482581,15482627,15482633,15482639
,15482669,15482681,15482683,15482711,15482729,15482743,15482771,15482773
,15482783,15482807,15482809,15482827,15482851,15482861,15482893,15482911
,15482917,15482923,15482941,15482947,15482977,15482993,15483023,15483029
,15483067,15483077,15483079,15483089,15483101,15483103,15483121,15483151
,15483161,15483211,15483253,15483317,15483331,15483337,15483343,15483359
,15483383,15483409,15483449,15483491,15483493,15483511,15483521,15483553
,15483557,15483571,15483581,15483619,15483631,15483641,15483653,15483659
,15483683,15483697,15483701,15483703,15483707,15483731,15483737,15483749
,15483799,15483817,15483829,15483833,15483857,15483869,15483907,15483971
,15483977,15483983,15483989,15483997,15484033,15484039,15484061,15484087
,15484099,15484123,15484141,15484153,15484187,15484199,15484201,15484211
,15484219,15484223,15484243,15484247,15484279,15484333,15484363,15484387
,15484393,15484409,15484421,15484453,15484457,15484459,15484471,15484489
,15484517,15484519,15484549,15484559,15484591,15484627,15484631,15484643
,15484661,15484697,15484709,15484723,15484769,15484771,15484783,15484817
,15484823,15484873,15484877,15484879,15484901,15484919,15484939,15484951
,15484961,15484999,15485039,15485053,15485059,15485077,15485083,15485143
,15485161,15485179,15485191,15485221,15485243,15485251,15485257,15485273
,15485287,15485291,15485293,15485299,15485311,15485321,15485339,15485341
,15485357,15485363,15485383,15485389,15485401,15485411,15485429,15485441
,15485447,15485471,15485473,15485497,15485537,15485539,15485543,15485549
,15485557,15485567,15485581,15485609,15485611,15485621,15485651,15485653
,15485669,15485677,15485689,15485711,15485737,15485747,15485761,15485773
,15485783,15485801,15485807,15485837,15485843,15485849,15485857,15485863
I am not quite sure if I understand you correctly, but maybe you are looking for something like this:
from functools import reduce
def sumOfPowersModuloPrimes (tups, primes):
return [reduce(lambda x, y: (x + y) % p, (pow (b, e, p) for b, e in tups), 0) for p in primes]
You shouldn't run into any memory issues as your (intermediate) values never exceed max(primes). If your resulting list is too large, then return a generator and work with it instead of a list.
Ignoring primes, since they don't appear to be used for anything:
def sumOfPowers(tups, primes):
return sum( pow(x,y) for x,y in tups)
Is it possible that you are supposed to compute the sum modulo one or more of the prime numbers? Something like
2**3 + 5**2 mod 3 = 8 + 25 mod 3 = 33 mod 3 = 0
(where a+b mod c means to take the remainder of the sum a+b after dividing by c).
One guess at how multiple primes would be used is to use the product of the primes as the
divisor.
def sumOfPower(tups, primes):
# There are better ways to compute this product. Loop
# is for explanatory purposes only.
c = 1
for p in primes:
p *= c
return sum( pow(x,y,c) for x,y in tups)
(I also seem to remember that a mod pq == (a mod p) mod q if p and q are both primes, but I could be mistaken.)
Another is to return one sum for each prime:
def sumOfPower(tups, primes):
return [ sum( pow(x,y,c) for x,y in tups ) for c in primes ]
def sumOfPowers (powerPairs, unusedPrimesParameter):
sum = 0
for base, exponent in powerPairs:
sum += base ** exponent
return sum
Or short:
def sumOfPowers (powerPairs, unusedPrimesParameter):
return sum(base ** exponent for base, exponent in powerPairs)
perform the sum computation specified by each input modulo each of the primes in the second input list
That’s a completely different thing. However, you still haven’t really explained what your function is supposed to do and how it should work. Given that you mentioned Euler's theorem and the Chinese remainder theorem, I guess there is a lot more to it than you actually made us believe. You probably want to solve the exponentiations by using Euler's theorem to reduce those large powers. I’m not willing to further guess what is going on though; this seems to involve a non-trivial math problem you should solve on the paper first.
def sumOfPowers (powerPairs, primes):
for prime in primes:
sum = 0
for base, exponent in powerPairs:
sum += pow(base, exponent, prime)
# do something with the sum here
# Chinese remainder theorem?
return something

Finding digits in powers of 2 fast

The task is to search every power of two below 2^10000, returning the index of the first power in which a string is contained. For example if the given string to search for is "7" the program will output 15, as 2^15 is the first power to contain 7 in it.
I have approached this with a brute force attempt which times out on ~70% of test cases.
for i in range(1,9999):
if search in str(2**i):
print i
break
How would one approach this with a time limit of 5 seconds?
Try not to compute 2^i at each step.
pow = 1
for i in xrange(1,9999):
if search in str(pow):
print i
break
pow *= 2
You can compute it as you go along. This should save a lot of computation time.
Using xrange will prevent a list from being built, but that will probably not make much of a difference here.
in is probably implemented as a quadratic string search algorithm. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
A faster approach could be computing the numbers directly in decimal
def double(x):
carry = 0
for i, v in enumerate(x):
d = v*2 + carry
if d > 99999999:
x[i] = d - 100000000
carry = 1
else:
x[i] = d
carry = 0
if carry:
x.append(carry)
Then the search function can become
def p2find(s):
x = [1]
for y in xrange(10000):
if s in str(x[-1])+"".join(("00000000"+str(y))[-8:]
for y in x[::-1][1:]):
return y
double(x)
return None
Note also that the digits of all powers of two up to 2^10000 are just 15 millions, and searching the static data is much faster. If the program must not be restarted each time then
def p2find(s, digits = []):
if len(digits) == 0:
# This precomputation happens only ONCE
p = 1
for k in xrange(10000):
digits.append(str(p))
p *= 2
for i, v in enumerate(digits):
if s in v: return i
return None
With this approach the first check will take some time, next ones will be very very fast.
Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.
I don't think you can beat this for computational complexity.
There are only 10000 numbers. You don't need any complex algorithms. Simply calculated them in advance and do search. This should take merely 1 or 2 seconds.
powers_of_2 = [str(1<<i) for i in range(10000)]
def search(s):
for i in range(len(powers_of_2)):
if s in powers_of_2[i]:
return i
Try this
twos = []
twoslen = []
two = 1
for i in xrange(10000):
twos.append(two)
twoslen.append(len(str(two)))
two *= 2
tens = []
ten = 1
for i in xrange(len(str(two))):
tens.append(ten)
ten *= 10
s = raw_input()
l = len(s)
n = int(s)
for i in xrange(len(twos)):
for j in xrange(twoslen[i]):
k = twos[i] / tens[j]
if k < n: continue
if (k - n) % tens[l] == 0:
print i
exit()
The idea is to precompute every power of 2, 10 and and also to precompute the number of digits for every power of 2. In this way the problem is reduces to finding the minimum i for which there exist a j such that after removing the last j digits from 2 ** i you obtain a number which ends with n or expressed as a formula (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as #6502 did in his answer.
But it's very much faster to let Python's decimal module do it - at least under Python 3.3.2 (I don't know how much C acceleration is built in to Python decimal versions before that). Here's code:
class S:
def __init__(self):
import decimal
decimal.getcontext().prec = 4000 # way more than enough for 2**10000
p2 = decimal.Decimal(1)
full = []
for i in range(10000):
s = "%s<%s>" % (p2, i)
##assert s == "%s<%s>" % (str(2**i), i)
full.append(s)
p2 *= 2
self.full = "".join(full)
def find(self, s):
import re
pat = s + "[^<>]*<(\d+)>"
m = re.search(pat, self.full)
if m:
return int(m.group(1))
else:
print(s, "not found!")
and sample usage:
>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!
s.full is a string with a bit over 15 million characters. It looks like this:
>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>
So the string contains each power of 2, with the exponent following a power enclosed in angle brackets. The find() method constructs a regular expression to search for the desired substring, then look ahead to find the power.
Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.

Calculating nth fibonacci number using the formulae in python

I am calculating the n-th fibonacci number using
(a) a linear approach, and
(b) this expression
Python code:
'Different implementations for computing the n-th fibonacci number'
def lfib(n):
'Find the n-th fibonacci number iteratively'
a, b = 0, 1
for i in range(n):
a, b = b, a + b
return a
def efib(n):
'Compute the n-th fibonacci number using the formulae'
from math import sqrt, floor
x = (1 + sqrt(5))/2
return long(floor((x**n)/sqrt(5) + 0.5))
if __name__ == '__main__':
for i in range(60,80):
if lfib(i) != efib(i):
print i, "lfib:", lfib(i)
print " efib:", efib(i)
For n > 71 I see that the two functions return different values.
Is this due to floating point arithmetic involved in efib()?
If so, is it then advisable to calculate the number using the matrix form?
You are indeed seeing rounding errors.
The matrix form is the more accurate and much faster algorithm. Literateprograms.org lists a good implementation, but it also lists the following algorithm based on Lucas numbers:
def powLF(n):
if n == 1: return (1, 1)
L, F = powLF(n//2)
L, F = (L**2 + 5*F**2) >> 1, L*F
if n & 1:
return ((L + 5*F)>>1, (L + F) >>1)
else:
return (L, F)
def fib(n):
if n & 1:
return powLF(n)[1]
else:
L, F = powLF(n // 2)
return L * F
Take a look at Lecture 3 of the MIT Open Courseware course on algorithms for a good analysis of the matrix approach.
Both the above algorithm and the matrix approach has Θ(lg n) complexity, just like the naive recursive squaring method you used, yet without the rounding problems. The Lucas numbers approach has the lowest constant cost, making it the faster algorithm (about twice as fast as the matrix approach):
>>> timeit.timeit('fib(1000)', 'from __main__ import fibM as fib', number=10000)
0.40711593627929688
>>> timeit.timeit('fib(1000)', 'from __main__ import fibL as fib', number=10000)
0.20211100578308105
Is this due to floating point arithmetic involved in efib()?
Yes, it is. Within efib you have
>>> log(x**72)/log(2)
49.98541778140445
and Python floats have about 53 bits of precision on x86-64 hardware, so you're running close to the edge.
I have a very simple purely python code...
def fibonum(n): # Give the nth fibonacci number
x=[0,1]
for i in range(2,n):
x.append(x[i-2]+x[i-1])
print(x[n-1])

How do you a double factorial in python?

I've been stucked on this question for a really long time.
I've managed to do a single recursive factorial.
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n-1)
Double factorial
For an even integer n, the double factorial is the product of all even positive integers less than or equal to n. For an odd integer p, the double factorial is the product of all odd positive integers less than or equal to p.
If n is even, then n!! = n*(n - 2)*(n - 4)*(n - 6)* ... *4*2
If p is odd, then p!! = p*(p - 2)*(p - 4)*(p - 6)* ... *3*1
But I have no idea to do a double factorial. Any help?
from functools import reduce # only in Python 3
reduce(int.__mul__, range(n, 0, -2))
Isn't that just the same as the factorial with a different ending condition and a different parameter to the recursion call?
def doublefactorial(n):
if n <= 0:
return 1
else:
return n * doublefactorial(n-2)
If n is even, then it will halt when n == 0. If n is odd, then it will halt when n == -1.
The problem here is that the double factorial is defined for negative real numbers (-1)!! = 1, (-3)!! = -1 (even negative integers (such -2, -4, ...) should have solution as +/- inf) so... something is smelling bad in all solutions for negative numbers. If one want to define the double factorial for al reals those solutions don't work. The solution is to define the double factorial using gamma function.
import scipy.special as sp
from numpy import pi
def dfact(x):
n = (x + 1.)/2.
return 2.**n * sp.gamma(n + 0.5)/(pi**(0.5))
It works! :D
Starting Python 3.8, we can use the prod function from the math module which calculates the product of all elements in an iterable, which in our case is range(n, 0, -2):
import math
math.prod(range(n, 0, -2))
Note that this also handles the case n = 0 in which case the result is 1.
My version of the recursive solution, in one line:
dfact = lambda n: (n <= 0) or n * dfact(n-2)
However, it is also interesting to note that the double factorial can be expressed in terms of the "normal" factorial. For odd numbers,
n!! = (2*k)! / (2**k * k!)
where k = (n+1)/2. For even arguments n=2k, although this is not consistent with a generalization to complex arguments, the expression is simpler,
n!! = (2k)!! = 2*k * k!.
All this means that you can write code using the factorial function from the standard math library, which is always nice:
import math
fact = math.factorial
def dfact(n):
if n % 2 == 1:
k = (n+1)/2
return fact(2*k) / (2**k * fact(k))
else:
return 2**k * fact(k)
Now, this code is obviously not very efficient for large n, but it is quite instructive. More importantly, since we are dealing with standard factorials now, it is a very good starting point for optimizations when dealing with really large numbers. You try to use logarithms or gamma functions to get approximate double factorials for large numbers.
def doublefactorial(n):
if n in (0, 1):
return 1
else:
return n * doublefactorial(n-2)
should do it.
def double_fact(number):
if number==0 or number==1:
return 1
else:
return number*double_fact(number-2)
I think this should work for you.
I hope I understand it correctly, but will this work
def factorial(n):
if n == 0 or n == 1:
return 1
else:
return n * factorial(n-2)
reduce(lambda x,y: y*x, range(n,1,-2))
Which is basically the same as the simple iterative version:
x = n
for y in range(n-2, 1, -2):
x*=y
Obviously you can also do it recursively, but what's the point ? This kind of example implemented using recursivity are fine when using all recursive languages, but with imperative language it's always making simple tools like recursivity looking more complex than necessary, while recursivity can be a real simplifier when dealing with fundamentally recursive structures like trees.
def doublefactorial(n):
if n <= 0:
return 1
else:
return n * doublefactorial(n-2)
That should do it. Unless I'm misunderstanding

Categories