How to calculate derivative of moment generating function in python?

Here is my code so far. I thought I could use scipy, but it doesn't give me the right answer for the second derivative, moment(0, 2). My guess is that I'm not applying scipy.misc.derivative correctly and that I should use diffs_exp from sympy, but I couldn't get that to work either.
from scipy import misc
import numpy as np

def mgf(s):
    # moment generating function of a normal distribution
    mu = 2
    sigma = 0.5
    mgf = np.exp(mu*s + ((sigma**2)*(s**2))/2)
    return mgf

def moment(s, i):
    # i-th derivative of the MGF at s, via finite differences
    mo = misc.derivative(mgf, s, dx=0.000000001, n=i)
    return mo
moment(s, i) evaluates correctly when i=1 but not when i>1. moment(0, 2) should equal sigma^2, or 0.25, but the function currently returns 0.0.
The function will only ever be evaluated at s=0; the more important part is that the differentiation is correct.

Here's how one would do it symbolically with sympy, and numerically evaluate the result for particular mu, sigma, and s:
In [1]: from sympy import *
In [2]: mu, sigma, s = symbols("mu sigma s")
In [3]: expr = exp(mu*s+(sigma*s)**2/2)
In [4]: f = lambdify((mu, sigma, s), expr.diff(s, 2))
In [5]: f(2, 0.5, 0)
Out[5]: 4.25
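Since the value is only needed at s = 0 (the n-th derivative there is the n-th raw moment), here is a sketch along the same lines that differentiates symbolically and substitutes s = 0 directly:
from sympy import symbols, exp, diff

mu, sigma, s = symbols("mu sigma s")
mgf = exp(mu*s + (sigma*s)**2/2)

def raw_moment(n):
    # n-th derivative of the MGF, evaluated at s = 0
    return diff(mgf, s, n).subs(s, 0)

print(raw_moment(2))                            # mu**2 + sigma**2
print(raw_moment(2).subs({mu: 2, sigma: 0.5}))  # 4.25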

Choosing a good step size for a finite difference scheme is a tricky business. Too small a step and you're doomed by round-off error (as you've found). Too large a step, and the scheme is too coarse (as you've found as well). scipy.misc.derivative's default step is not very useful, BTW. There is some literature on how to choose a sensible step; e.g., Numerical Recipes has a brief introduction to a simple scheme.
In this particular case finding a sensible step is reasonably easy:
In [41]: from scipy.misc import derivative
In [42]: def f(x):
   ....:     arg = 2.*x + (0.5*x)**2 / 2.
   ....:     return np.exp(arg)
   ....:
In [53]: derivative(f, 0., dx=1e-5, n=2)
Out[53]: 4.2499981312005266
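To make the tradeoff concrete, here is a quick sketch that sweeps the step size (exact errors will vary slightly by platform; note also that scipy.misc.derivative has been deprecated in recent SciPy releases):
import numpy as np
from scipy.misc import derivative

def f(x):
    return np.exp(2.*x + (0.5*x)**2 / 2.)

for dx in (1e-1, 1e-3, 1e-5, 1e-7, 1e-9):
    d2 = derivative(f, 0., dx=dx, n=2)
    print(dx, d2, abs(d2 - 4.25))
# the error is smallest at intermediate dx: truncation error dominates for
# large dx, round-off error for small dx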
An alternative is to use a package which does a smarter step size selection (one keyword for internet/literature searches is Romberg extrapolation). For example, numdifftools:
In [57]: import numdifftools as nd
In [59]: fdd = nd.Derivative(f, n=2)
In [60]: fdd(0)
Out[60]: array([ 4.25])

So I mucked around in the source code after misreading the question in an earlier answer (since deleted).
SciPy's misc.derivative approximates the second-order derivative with the central difference
(f(x+h) - 2*f(x) + f(x-h)) / h^2
with h equal to the dx you pass in.
The problem here occurs because the output of np.exp() is a float64, which has limited precision: 52 bits for the mantissa and 11 for the exponent. When we decrease dx, the differences between the three terms move into digits beyond that precision, so they are lost, and the sum cancels to zero.
For reference, in the above function the values are f(x-h) ≈ 0.999999998, f(x) = 1, and f(x+h) ≈ 1.000000002, with x = 0 and h = 1e-9. One solution would be to use functions with higher precision, but that would involve changing the SciPy source code and is not a small undertaking (Python's pow is not arbitrary precision).
The other (practical) option is to use a larger dx. dx=1e-2 actually gives a close enough answer, 4.2501848996, compared to 4.25, which is the actual second derivative.
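A quick sketch that makes the cancellation visible, using the same mgf as above:
import numpy as np

def mgf(s):
    return np.exp(2*s + (0.5**2 * s**2)/2)

for h in (1e-2, 1e-9):
    fm, f0, fp = mgf(-h), mgf(0.0), mgf(h)
    print(h, fp, fm, (fp - 2*f0 + fm)/h**2)
# at h = 1e-9 the true numerator is about 4.25e-18, far below the float64
# resolution near 1.0, so it collapses to rounding noise (0.0 here)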

Related

Can scipy.optimize minimize functions of complex variables at all and how?

I am trying to minimize a function of a complex (vector) variable using scipy.optimize. My results so far indicate that it may not be possible. To investigate the problem, I have implemented a simple example - minimize the 2-norm of a complex vector with an offset:
import numpy as np
from scipy.optimize import fmin

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 2)

sol = fmin(fun, x0=np.ones(2) + 0j)
The output is
Optimization terminated successfully.
Current function value: 2.000000
Iterations: 38
Function evaluations: 69
>>> sol
array([-2.10235293e-05, 2.54845649e-05])
Clearly, the solution should be
array([0.+1.j, 0.+1.j])
Disappointed with this outcome, I have also tried scipy.optimize.minimize:
from scipy.optimize import minimize

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 1)

sol = minimize(fun, x0=np.ones(2) + 0j)
The output is
>>> sol
fun: 2.0
hess_inv: array([[ 9.99997339e-01, -2.66135332e-06],
[-2.66135332e-06, 9.99997339e-01]])
jac: array([0., 0.])
message: 'Optimization terminated successfully.'
nfev: 24
nit: 5
njev: 6
status: 0
success: True
x: array([6.18479071e-09+0.j, 6.18479071e-09+0.j])
Not good either. I have tried specifying all of the possible methods for minimize (supplying the Jacobian and Hessian as necessary), but none of them reach the correct result. Most of them cause ComplexWarning: Casting complex values to real discards the imaginary part, indicating that they cannot handle complex numbers correctly.
Is this possible at all using scipy.optimize?
If so, I would very much appreciate if someone can tell me what I am doing wrong.
If not, do you perhaps have suggestions for alternative optimization tools (for Python) that allow this?
The minimization methods of SciPy work with real arguments only. But minimization on the complex space C^n amounts to minimization on R^2n; the algebra of complex numbers never enters the consideration. Thus, adding two wrappers for conversion from C^n to R^2n and back, you can optimize over complex numbers.
def real_to_complex(z):    # real vector of length 2n -> complex of length n
    return z[:len(z)//2] + 1j * z[len(z)//2:]

def complex_to_real(z):    # complex vector of length n -> real of length 2n
    return np.concatenate((np.real(z), np.imag(z)))

sol = minimize(lambda z: fun(real_to_complex(z)),
               x0=complex_to_real(np.ones(2) + 0j))
print(real_to_complex(sol.x))  # [-7.40376620e-09+1.j -8.77719406e-09+1.j]
You mention Jacobian and Hessian... but minimization only makes sense for real-valued functions, and those are never differentiable with respect to complex variables. The Jacobian and Hessian would have to be computed over R2n anyway, treating the real and imaginary parts as separate variables.
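To illustrate that last point, here is a sketch (mine, not from the answer above) that uses the squared norm, which is smooth, and supplies a gradient computed over R^2n with the real and imaginary parts treated as separate variables:
import numpy as np
from scipy.optimize import minimize

target = 1j * np.ones(2)   # the complex point we want to reach

def objective(zr):
    # zr is a real vector of length 2n: first half real parts, second half imaginary
    z = zr[:2] + 1j * zr[2:]
    return np.sum(np.abs(z - target)**2)

def gradient(zr):
    z = zr[:2] + 1j * zr[2:]
    diff = z - target
    # partial derivatives with respect to Re(z) and Im(z)
    return np.concatenate((2*diff.real, 2*diff.imag))

sol = minimize(objective, x0=np.array([1., 1., 0., 0.]), jac=gradient)
print(sol.x[:2] + 1j * sol.x[2:])   # close to [0.+1.j, 0.+1.j]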
I have needed to minimize the departure of a complex-valued model function, with complex-valued parameters, over a real domain.
A toy example:
def f(x, a, b):
    ab = complex(a, b)
    return np.exp(x*ab)
And suppose that I have data DATA for x = np.arange(N). Note that x is real.
What I did was this:
def helper(x, a, b):
    return abs(f(x, a, b) - DATA[x])
and then I can use curve_fit():
curve_fit(helper, np.arange(N), np.zeros(N), p0 = [1,0])
What is happening is this: By subtracting the data from the model function, the new "ideal" output is all zeroes, which can be (must be) real in order for curve_fit() to work. The complex parameter ab = a + jb has been broken into its real and imaginary parts. The helper() function returns the absolute value of the difference between the model and the data.
A critical issue is that curve_fit() doesn't evaluate any other x values than those you give it. Otherwise DATA[x] would fail.
Note that by using abs() I'm achieving an L1 fit (more or less). One could just as well use abs()**2 to get an L2 fit ... but why one would use L1 or L2 is a topic for another day.
You might object that my code requires the x[] values to be integers, while yours may be arbitrary reals. That's doable, simply by putting them into an array and indexing that; there's probably some clever hack using a dictionary that would address this issue, too. A sketch follows.
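A sketch of that indexing approach for real-valued sample points (the names xdata and true_ab are mine, and the data is synthetic):
import numpy as np
from scipy.optimize import curve_fit

xdata = np.linspace(0.0, 1.0, 50)    # real, non-integer sample points
true_ab = complex(0.3, -1.2)
DATA = np.exp(xdata * true_ab)       # synthetic complex observations

def f(x, a, b):
    return np.exp(x * complex(a, b))

def helper(idx, a, b):
    # curve_fit hands idx back as floats, so cast to int before indexing
    idx = idx.astype(int)
    return np.abs(f(xdata[idx], a, b) - DATA[idx])

popt, _ = curve_fit(helper, np.arange(len(xdata)), np.zeros(len(xdata)), p0=[1, 0])
print(popt)   # should be close to [0.3, -1.2]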

Instability in Mittag-Leffler function using NumPy

In trying to reproduce the plot on Wolfram MathWorld, and in trying to help with this SO question, I ran into some numerical instability I don't understand:
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import gamma

def MLf(z, a):
    """Mittag-Leffler function
    """
    k = np.arange(100).reshape(-1, 1)
    E = z**k / gamma(a*k + 1)
    return np.sum(E, axis=0)

x = np.arange(-50, 10, 0.1)
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.plot(x, MLf(x, i), label="alpha = " + str(i))
plt.legend()
plt.ylim(-5, 5); plt.xlim(-55, 15); plt.grid()
You can see the instability best in the orange line where a = 1, starting at about x = -35, but there's a problem with a = 0 (blue line) too. Changing the number of terms to sum (the 100 in np.arange(100)) changes the x at which the instability occurs.
What's going on? How can I avoid this?
Case a = 0
If a = 0, the series definition of MLf that you are using only applies when |z| < 1. Indeed, when the base z is greater than 1 in absolute value, the powers z**k keep increasing and the series diverges. Looking at its 100th (or any other) partial sum is pointless; those sums have nothing to do with the function outside of the interval -1 < z < 1. Just use the formula 1/(1 - z) for the case a = 0.
Case a = 1
The function is exp(z) and technically, it is represented by the power series z**k / k! for all z. But for large negative z this power series experiences catastrophic loss of significance: the individual terms are huge, for example, (-40)**40/factorial(40) is over 1e16, but their sum is tiny (exp(-40) is nearly zero). Since 1e16 approaches the limits of double precision, the output becomes dominated by the noise of truncating/rounding operations.
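A quick sketch of that loss of significance:
import numpy as np
from scipy.special import gamma

z = -40.0
k = np.arange(100)
print(np.sum(z**k / gamma(k + 1)))   # the a = 1 series: O(1) garbage
print(np.exp(z))                     # true value, about 4.25e-18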
In general, evaluating polynomials by adding c(k) * z**k is not the best thing to do, both from the efficiency and precision standpoints. Horner's scheme is implemented in NumPy already, and using it simplifies the code:
k = np.arange(100)
return np.polynomial.polynomial.polyval(z, 1/gamma(a*k + 1))
However, this is not going to save the series for exp(z), its numeric issues are beyond NumPy.
You could use mpmath for evaluation, gaining in accuracy (mpmath supports arbitrarily high precision of floating point operations) and losing in speed (no compiled code, no vectorization).
Or you could just return exp(z) from MLf when a=1.
Case 0 < a < 1
The series converges, but again with catastrophic precision loss; and now there is no explicit formula to fall back on. The aforementioned mpmath is one option: set really high precision (mp.dps = 50) and hope it's enough to sum the series. The alternative is to look for another way to compute the function.
Looking around, I found the paper "Computation of the Mittag-Leffler function and its derivative" by
Rudolf Gorenflo, Joulia Loutchko & Yuri Luchko; 2002. I took formula (23) from it, and used it for negative z and 0 < a < 1.
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import gamma
from scipy.integrate import quad

def MLf(z, a):
    """Mittag-Leffler function
    """
    z = np.atleast_1d(z)
    if a == 0:
        return 1/(1 - z)
    elif a == 1:
        return np.exp(z)
    elif a > 1 or all(z > 0):
        k = np.arange(100)
        return np.polynomial.polynomial.polyval(z, 1/gamma(a*k + 1))

    # a helper for the tricky case, from Gorenflo, Loutchko & Luchko
    def _MLf(z, a):
        if z < 0:
            f = lambda x: (np.exp(-x*(-z)**(1/a)) * x**(a-1)*np.sin(np.pi*a)
                           / (x**(2*a) + 2*x**a*np.cos(np.pi*a) + 1))
            return 1/np.pi * quad(f, 0, np.inf)[0]
        elif z == 0:
            return 1
        else:
            return MLf(z, a)

    return np.vectorize(_MLf)(z, a)

x = np.arange(-50, 10, 0.1)
plt.figure(figsize=(10, 5))
for i in range(1, 5):
    plt.plot(x, MLf(x, i/3), label="alpha = " + str(i/3))
plt.legend()
plt.ylim(-5, 5); plt.xlim(-55, 15); plt.grid()
No numerical issues here.
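For the 0 < a < 1 (and a = 1) cases one could also take the mpmath route mentioned above; a minimal sketch, assuming 50 digits and a few hundred terms are enough for the z values of interest:
from mpmath import mp, mpf, fsum, gamma as mpgamma

mp.dps = 50   # work with 50 significant digits

def MLf_mp(z, a, nterms=300):
    # plain partial sum of the defining series, but in high precision
    z = mpf(z)
    return fsum(z**k / mpgamma(a*k + 1) for k in range(nterms))

print(MLf_mp(-40, 1))    # about 4.25e-18, matching exp(-40)
print(MLf_mp(-20, 0.5))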
There is a Python package implementation based on a better algorithm, which is more precise and does not have the numerical instabilities shown in the OP; it's available at https://pypi.org/project/numfracpy/
It is based on the Laplace transform and integration along an optimal path. The details are in the paper by Garrappa, Roberto, and Marina Popolizio, "Fast Methods for the Computation of the Mittag-Leffler Function," in Handbook of Fractional Calculus with Applications, Volume 3: Numerical Methods, 329–46, De Gruyter, 2019, https://doi.org/10.1515/9783110571684-013. This is a great and more efficient improvement on the paper by Gorenflo et al. used in the accepted answer above.
There was a dead link in the accepted answer, which I updated and added here as well: Gorenflo, Rudolf, "Computation of the Mittag-Leffler Function Eα,β(z) and Its Derivative," Fractional Calculus and Applied Analysis, January 1, 2002. Remember that the latter is prone to numerical problems.

Integration of the tail of a Gaussian function with Scipy, giving zero instead of 8.19e-26

I am trying to integrate a Gaussian function whose limits are way out in the Gaussian tail, so integrate.quad gave me zero. Is there a way to integrate a Gaussian function that is supposed to give an extremely small answer?
The function's integrand is:
import numpy as np

sigma = 9.5e-5
integrand = lambda delta: (1./(np.sqrt(2*np.pi)*sigma)) * np.exp(-(delta**2)/(2*sigma**2))
I need to integrate from 10^-3 to 0.3.
With Wolfram Alpha I got an answer of 8.19e-26.
But with SciPy's Romberg integration I got zero. Can I turn some knobs in SciPy to resolve such a small result?
Let F(x; s) be the CDF of the normal (i.e. Gaussian) distribution with
standard deviation s. You are computing
F(x1;s) - F(x0;s), where x0 = 1e-3 and x1 = 0.3.
This can be rewritten as S(x0;s) - S(x1;s) where S(x;s) = 1 - F(x;s) is the
"survival function".
You can compute this with the sf method of the norm object of scipy.stats.
In [99]: x0 = 1e-3
In [100]: x1 = 0.3
In [101]: s = 9.5e-5
In [102]: from scipy.stats import norm
In [103]: norm.sf(x0, scale=s)
Out[103]: 3.2671026385171459e-26
In [104]: norm.sf(x1, scale=s)
Out[104]: 0.0
Note that norm.sf(x1, scale=s) gives 0. The exact value of this expression is smaller than the smallest positive number representable as a 64-bit float (as @Zhenya points out in a comment).
So this calculation gives the answer 3.267e-26.
You could also compute this with scipy.special.ndtr. ndtr computes the CDF of the standard normal distribution, and by symmetry, S(x; s) = ndtr(-x/s).
In [105]: from scipy.special import ndtr
In [106]: ndtr(-x0/s)
Out[106]: 3.2671026385171459e-26
If you want achieve the same result using numerical integration, you'll have to experiment with the error control parameters of the integration algorithm. For example, to get this answer using scipy.integrate.romberg, I tweaked divmax and tol, as follows:
In [60]: from scipy.integrate import romberg
In [61]: def integrand(x, s):
   ....:     return np.exp(-0.5*(x/s)**2)/(np.sqrt(2*np.pi)*s)
   ....:
In [62]: romberg(integrand, 0.001, 0.3, args=(9.5e-5,), divmax=20, tol=1e-30)
Out[62]: 3.2671026554875259e-26
With scipy.integrate.quad, it required the trick of telling it that 0.002 was a "special" point that would require more work:
In [81]: from scipy.integrate import quad
In [82]: p, err = quad(integrand, 0.001, 0.3, args=(9.5e-5,), epsabs=1e-32, points=[0.002])
In [83]: p
Out[83]: 3.267102638517144e-26
In [84]: err
Out[84]: 4.769436484142494e-37
Yes.
>>> from scipy.special import erfc
>>> erfc(1e-3/9.5e-5/np.sqrt(2.))
6.534205277034387e-26
That far in the tail you're better off using the complemented error function (erfc) or, possibly, erfcx, which is the complemented error function scaled by exp(x**2). (Note that the tail probability itself is erfc(x/(sigma*sqrt(2)))/2, i.e. half the value printed above.)
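For even deeper tails, where erfc itself would underflow to 0.0, here is a sketch using erfcx and working in log space (variable names are mine):
import numpy as np
from scipy.special import erfcx

sigma = 9.5e-5
x0 = 1e-3
t = x0 / (sigma * np.sqrt(2.0))

# P(X > x0) = 0.5*erfc(t) = 0.5*erfcx(t)*exp(-t**2); keep exp(-t**2) in the log
log_tail = np.log(0.5 * erfcx(t)) - t**2
print(np.exp(log_tail))   # about 3.27e-26 for these numbers
print(log_tail)           # still usable when exp(log_tail) would underflow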
Thanks for the help.
After more consultations I went with the numerical integration option; after checking against a C++ script, I found that if I set divmax=120 in scipy.integrate.romberg I arrive at the same result I got from Wolfram Alpha.
But this solution takes a lot of time to calculate. I will try to work with the error function to see if I can make sense of it.
Cheers

Integration on python

Hi, I have been given a question by my lecturer to integrate a function with Python, and he gave us very little information. The boundaries are +infinity and -infinity and the function is
cos(a*x) * e**(-x**2)
So far I have
def gauss_cosine(a, n):
    sum = 0.0
    dx = ((math.cosine(a*x)*math.exp(-x**2)))
    return
    for k in range(0, n):
        x = a + k*dx
        sum = sum + f(x)
    return dx*sum
Not sure if this is right at all.
Kind regards
I don't see it recommended much on this site, but you could try sympy:
In [1]: import sympy as sp
In [2]: x, a = sp.symbols(('x', 'a'))
In [3]: f = sp.cos(a*x) * sp.exp(-x**2)
In [4]: res = sp.integrate(f, (x, -sp.oo, sp.oo))
In [5]: res
Out[5]: sqrt(pi)*exp(-a**2/4)
In [6]: sp.pprint(res)
          2
        -a
        ────
  ___    4
╲╱ π ⋅ℯ
For numerical integration, try the scipy package.
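For instance, a sketch with scipy.integrate.quad, which accepts infinite limits directly (a = 2 chosen arbitrarily):
import numpy as np
from scipy.integrate import quad

a = 2.0
val, err = quad(lambda x: np.cos(a*x) * np.exp(-x**2), -np.inf, np.inf)
print(val)                                   # numerical result
print(np.sqrt(np.pi) * np.exp(-a**2 / 4))    # closed form, for comparison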
Well, your integral has an analytical solution, and you can calculate it with sympy, as @Bill pointed out, +1.
However, what I think was the point of the question is how to numerically calculate this integral, and this is what I discuss here.
The integrand is even. We reduce the domain to [0,+inf], and will multiply by 2 the result.
We still have an oscillatory integral on an unbounded domain. This is often a nasty beast, but we know that it is convergent, and well behaved at +- inf. In other words, the exp(-x**2) decays to zero fast enough.
The trick is to change variable of integration, x=tan(t), so that dx=(1+x**2)dt. The domain becomes [0,pi/2], it is bounded and the numerical integration is then a piece of cake.
Example with Simpson's rule from scipy, with a = 2. With just 100 discretization points we get 5-digit precision!
from scipy.integrate import simps
from numpy import pi, sqrt, linspace, tan, cos, exp

N = 100
a = 2.
t = linspace(0, pi / 2, N)
x = tan(t)
f = cos(a * x) * exp(-x ** 2) * (1 + x ** 2)
print("numerical solution  =", 2 * simps(f, t))
print("analytical solution =", sqrt(pi) * exp(-a ** 2 / 4))
Your computer will have a very hard time representing those boundary limits.
Start by plotting your function.
It also helps to know the answer before you start.
I'd recommend breaking it into two integrals: one from minus-infinity to zero and another from zero to plus-infinity. As noted by flebool below, it's an even function. Make sure you know what that means and the implications for your solution.
Next you'll need an integration scheme that can deal with boundary conditions at infinity. Look for a log quadrature scheme.
A naive Euler integration would not be my first thought.
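One quadrature scheme built for exactly this weight (my suggestion, not named in the answer above) is Gauss-Hermite quadrature, which integrates exp(-x**2) * g(x) over the whole real line with a handful of nodes:
import numpy as np

a = 2.0
nodes, weights = np.polynomial.hermite.hermgauss(40)   # 40-point rule
approx = np.sum(weights * np.cos(a * nodes))
print(approx)                                # quadrature result
print(np.sqrt(np.pi) * np.exp(-a**2 / 4))    # closed form, for comparison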

Calculating the area underneath a mathematical function

I have a range of data that I have approximated using a polynomial of degree 2 in Python. I want to calculate the area underneath this polynomial between 0 and 1.
Is there a calculus, or similar package from numpy that I can use, or should I just make a simple function to integrate these functions?
I'm a little unclear what the best approach for defining mathematical functions is.
Thanks.
If you're integrating only polynomials, you don't need to represent a general mathematical function, use numpy.poly1d, which has an integ method for integration.
>>> import numpy
>>> p = numpy.poly1d([2, 4, 6])
>>> print p
   2
2 x + 4 x + 6
>>> i = p.integ()
>>> i
poly1d([ 0.66666667, 2. , 6. , 0. ])
>>> integrand = i(1) - i(0) # Use call notation to evaluate a poly1d
>>> integrand
8.6666666666666661
For integrating arbitrary numerical functions, you would use scipy.integrate, passing ordinary Python functions. For integrating functions analytically, you would use sympy. It doesn't sound like you want either in this case, especially not the latter.
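For completeness, a sketch of the sympy route for the same polynomial (not something this answer recommends for the OP's case):
import sympy as sp

x = sp.symbols('x')
area = sp.integrate(2*x**2 + 4*x + 6, (x, 0, 1))
print(area)   # 26/3, i.e. 8.666...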
Look, Ma, no imports!
>>> coeffs = [2., 4., 6.]
>>> sum(coeff / (i+1) for i, coeff in enumerate(reversed(coeffs)))
8.6666666666666661
>>>
Our guarantee: Works for a polynomial of any positive degree or your money back!
Update from our research lab: Guarantee extended; s/positive/non-negative/ :-)
Update: Here's the industrial-strength version that is robust in the face of stray ints in the coefficients, without having a function call in the loop, and uses neither enumerate() nor reversed() in the setup:
>>> icoeffs = [2, 4, 6]
>>> tot = 0.0
>>> divisor = float(len(icoeffs))
>>> for coeff in icoeffs:
...     tot += coeff / divisor
...     divisor -= 1.0
...
>>> tot
8.6666666666666661
>>>
It might be overkill to resort to general-purpose numeric integration algorithms for your special case...if you work out the algebra, there's a simple expression that gives you the area.
You have a polynomial of degree 2: f(x) = ax^2 + bx + c
You want to find the area under the curve for x in the range [0, 1].
The antiderivative is F(x) = ax^3/3 + bx^2/2 + cx + C
The area under the curve from 0 to 1 is: F(1) - F(0) = a/3 + b/2 + c
So if you're only calculating the area for the interval [0,1], you might consider
using this simple expression rather than resorting to the general-purpose methods.
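As a quick check of that expression against the poly1d result above (using the same coefficients [2, 4, 6]):
# area under 2*x**2 + 4*x + 6 on [0, 1], via a/3 + b/2 + c
a, b, c = 2.0, 4.0, 6.0
print(a/3 + b/2 + c)   # 8.666..., matching the other answers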
'quad' in scipy.integrate is the general purpose method for integrating functions of a single variable over a definite interval. In a simple case (such as the one described in your question) you pass in your function and the lower and upper limits, respectively. 'quad' returns a tuple comprised of the integral result and an upper bound on the error term.
from scipy import integrate as TG
fnx = lambda x: 3*x**2 + 9*x # some polynomial of degree two
aoc, err = TG.quad(fnx, 0, 1)
[Note: after I posted this I saw an answer, posted before mine, which represents polynomials using 'poly1d' in NumPy. My scriptlet just above can also accept a polynomial in this form:
import numpy as NP
px = NP.poly1d([2,4,6])
aoc, err = TG.quad(px, 0, 1)
# returns (8.6666666666666661, 9.6219328800846896e-14)
If one is integrating quadratic or cubic polynomials from the get-go, an alternative to deriving the explicit integral expressions is to use Simpson's rule; it is a deep fact that this method exactly integrates polynomials of degree 3 and lower.
To borrow Mike Graham's example (I haven't used Python in a while; apologies if the code looks wonky):
>>> import numpy
>>> p = numpy.poly1d([2, 4, 6])
>>> print p
2
2 x + 4 x + 6
>>> integrand = (1 - 0)*(p(0) + 4*p((0 + 1)/2.0) + p(1))/6
uses Simpson's rule to compute the value of integrand. You can verify for yourself that the method works as advertised.
Of course, I did not simplify the expression for integrand to indicate that the 0 and 1 can be replaced with arbitrary values u and v, and the code will still work for finding the integral of the function from u to v.
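A sketch of that general form as a small helper (exact for polynomials of degree 3 or lower, per the fact quoted above):
import numpy as np

def simpson_interval(p, u, v):
    # Simpson's rule on [u, v]; exact when p is a polynomial of degree <= 3
    return (v - u) * (p(u) + 4.0*p(0.5*(u + v)) + p(v)) / 6.0

p = np.poly1d([2, 4, 6])
print(simpson_interval(p, 0.0, 1.0))   # 8.666..., same as the other approaches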
