How to calculate derivatives at the boundary in SciPy?

I have a script drawing a set of (x,y) curves at various z.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,1,100)
z = np.linspace(0,30,30)
def y(z, x):
    return z**(1-x)
for i in z:
    plt.plot(x, y(i,x))
How can I draw dy/dx at x=0 versus z?
plt.plot(z, dy/dx at x=0)
In fact, I need to calculate the slope at the x=0 boundary for each (x,y) curve (shown below), then plot the slopes against z.

You can use the derivative function (note that scipy.misc.derivative was deprecated in SciPy 1.10 and removed in 1.12, so this only works on older versions):
scipy.misc.derivative(func, x0, dx=1.0, n=1, args=(), order=3)
Find the n-th derivative of a function at a point.
Given a function, use a central difference formula with spacing dx to
compute the n-th derivative at x0.
func : function
    Input function.
x0 : float
    The point at which n-th derivative is found.
dx : float, optional
    Spacing.
n : int, optional
    Order of the derivative. Default is 1.
args : tuple, optional
    Arguments.
order : int, optional
    Number of points to use, must be odd.
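In your case (pass a small dx; with the default dx=1.0 the central difference spans x=-1 to x=1 and is far too coarse):
import numpy as np
import matplotlib.pyplot as plt
from scipy.misc import derivative
x = np.linspace(0,1,100)
z = np.linspace(0,30,30)
x0 = 0
def y(z, x):
    return z**(1-x)
# central difference around x0 = 0 for each curve
dydx = [derivative(lambda x: y(zi, x), x0, dx=1e-6) for zi in z]
plt.plot(z, dydx)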

You mixed up the variables in the description. I assume you have a function y of the variables (x, z), so you need to calculate dy/dx and dy/dz.
You have a few options for calculating the derivative, including symbolic calculation (using SymPy) or a straightforward finite-difference calculation (prone to numerical errors). See this: How do I compute derivative using Numpy?
But you cannot plot this derivative if you calculate it at a single point (x=0, z=0), because the result is then a float, not a function. To make the plot you want, calculate the general symbolic derivative dydx and plot it against z as you suggested. To get the result at the point (0,0), just evaluate dydx(0,0).
Btw, dydz = (1-x)*z**(-x) and dydx = -ln(z)*z**(1-x).
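As a hedged sketch of the finite-difference option at the boundary: since x=0 is the edge of the plotted range, a one-sided (forward) difference avoids sampling x<0. The helper name slope_at_x0 and the step h are assumptions for illustration, and z starts at 1 here only so that the analytic check -ln(z)*z is defined everywhere.

```python
import numpy as np

def y(z, x):
    return z ** (1 - x)

def slope_at_x0(z, h=1e-6):
    # Second-order one-sided (forward) difference for dy/dx at x = 0.
    # It only samples x = 0, h and 2h, i.e. points inside the domain.
    return (-3 * y(z, 0.0) + 4 * y(z, h) - y(z, 2 * h)) / (2 * h)

z = np.linspace(1, 30, 30)      # assumption: start at 1 so log(z) is finite
numeric = slope_at_x0(z)        # numerical slope at x = 0 for each z
analytic = -np.log(z) * z       # dydx = -ln(z)*z**(1-x) evaluated at x = 0
```

Plotting the result with plt.plot(z, numeric) gives the slope at x=0 against z.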


Manually recover the original function from numpy rfft

I have performed a numpy.fft.rfft on a function to obtain the Fourier coefficients. Since the docs do not seem to contain the exact formula used, I have been assuming a formula found in a textbook of mine:
S(x) = a_0/2 + SUM(real(a_n) * cos(nx) + imag(a_n) * sin(nx))
where imag(a_n) is the imaginary part of the n_th element of the Fourier coefficients.
To translate this into python-speak, I have implemented the following:
def fourier(freqs, X):
    # input the fourier frequencies from np.fft.rfft, and arbitrary X
    const_term = np.repeat(np.real(freqs[0])/2, X.shape[0]).reshape(-1,1)
    # this is the "n" part of the inside of the trig terms
    trig_terms = np.tile(np.arange(1,len(freqs)), (X.shape[0],1))
    sin_terms = np.imag(freqs[1:])*np.sin(np.einsum('i,ij->ij', X, trig_terms))
    cos_terms = np.real(freqs[1:])*np.cos(np.einsum('i,ij->ij', X, trig_terms))
    return np.concatenate((const_term, sin_terms, cos_terms), axis=1)
This should give me an [X.shape[0], 2*freqs.shape[0] - 1] array, containing at entry i,j the i_th element of X evaluated at the j_th term of the Fourier decomposition (where the j_th term is a sin term for odd j).
By summing this array over the axis of Fourier terms, I should obtain the function evaluated at the i_th term in X:
import numpy as np
import matplotlib.pyplot as plt
X = np.linspace(-1,1,50)
y = X*(X-0.8)*(X+1)
freqs = np.fft.rfft(y)
reconstructed_y = np.sum(fourier(freqs, X), axis=1)
plt.plot(X, y)  # original function
plt.plot(X, reconstructed_y, c='r')
In any case, the red line should be basically on top of the blue line. Something has gone wrong either in my assumptions about what numpy.fft.rfft returns, or in my specific implementation, but I am having a hard time tracking down the bug. Can anyone shed some light on what I've done wrong here?
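One likely source of the mismatch is the convention: np.fft.rfft computes the unnormalized DFT A_k = sum_m a_m * exp(-2*pi*i*m*k/n) over the sample index m, not a Fourier series in x. Inverting it by hand therefore needs a 1/n factor, a factor of 2 on every term except k=0 (and the Nyquist term when n is even), a minus sign on the sine terms, and angles 2*pi*m*k/n. The sketch below is one way to write that out; the function name manual_irfft is hypothetical, and it reproduces the samples at the original grid points only.

```python
import numpy as np

def manual_irfft(coeffs, n):
    # Rebuild the n real samples from the output of np.fft.rfft.
    m = np.arange(n)                      # sample index
    k = np.arange(len(coeffs))            # frequency index
    weights = np.full(len(coeffs), 2.0)   # each k > 0 also stands for -k
    weights[0] = 1.0                      # the constant term appears once
    if n % 2 == 0:
        weights[-1] = 1.0                 # so does the Nyquist term
    angles = 2 * np.pi * np.outer(m, k) / n
    terms = weights * (coeffs.real * np.cos(angles)
                       - coeffs.imag * np.sin(angles))
    return terms.sum(axis=1) / n

X = np.linspace(-1, 1, 50)
y = X * (X - 0.8) * (X + 1)
coeffs = np.fft.rfft(y)
reconstructed_y = manual_irfft(coeffs, len(y))
```

With this convention the reconstructed values should coincide with y at the sample points, which is also what np.fft.irfft(coeffs, n=len(y)) returns.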

Invert interpolation to give the variable associated with a desired interpolation function value

I am trying to invert an interpolated function using scipy's interpolate function. Let's say I create an interpolated function,
import scipy.interpolate as interpolate
interpolatedfunction = interpolate.interp1d(xvariable,data,kind='cubic')
Is there some function that can find x when I specify a:
interpolatedfunction(x) == a
In other words, "I want my interpolated function to equal a; what is the value of xvariable such that my function is equal to a?"
I appreciate I can do this with some numerical scheme, but is there a more straightforward method? What if the interpolated function is multivalued in xvariable?
There are dedicated methods for finding roots of cubic splines. The simplest to use is the .roots() method of InterpolatedUnivariateSpline object:
spl = InterpolatedUnivariateSpline(x, y)
roots = spl.roots()
This finds all of the roots instead of just one, as generic solvers (fsolve, brentq, newton, bisect, etc) do.
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
x = np.arange(20)
y = np.cos(np.arange(20))
spl = InterpolatedUnivariateSpline(x, y)
spl.roots()
outputs array([ 1.56669456, 4.71145244, 7.85321627, 10.99554642, 14.13792756, 17.28271674])
However, you want to equate the spline to some arbitrary number a, rather than 0. One option is to rebuild the spline (you can't just subtract a from it):
solutions = InterpolatedUnivariateSpline(x, y - a).roots()
Note that none of this will work with the function returned by interp1d; it does not have roots method. For that function, using generic methods like fsolve is an option, but you will only get one root at a time from it. In any case, why use interp1d for cubic splines when there are more powerful ways to do the same kind of interpolation?
Non-object-oriented way
Instead of rebuilding the spline after subtracting a from data, one can directly subtract a from spline coefficients. This requires us to drop down to non-object-oriented interpolation methods. Specifically, sproot takes in a tck tuple prepared by splrep, as follows:
from scipy.interpolate import splrep, sproot
tck = splrep(x, y, k=3, s=0)
tck_mod = (tck[0], tck[1] - a, tck[2])
solutions = sproot(tck_mod)
I'm not sure if messing with tck is worth the gain here, as it's possible that the bulk of computation time will be in root-finding anyway. But it's good to have alternatives.
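For completeness, here is a small self-contained sketch of the rebuild-the-spline approach with a concrete target value. The value a = 0.5 and the sample data are assumptions chosen for illustration; the check at the end evaluates the original spline at the solutions found.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.arange(20)
y = np.cos(x)
a = 0.5  # assumed target value for the interpolated function

# Spline through the original data (default k=3, i.e. cubic).
spl = InterpolatedUnivariateSpline(x, y)

# Spline through the shifted data; its roots are where spl(x) == a.
solutions = InterpolatedUnivariateSpline(x, y - a).roots()

# How far the original spline is from a at each solution.
residual = spl(solutions) - a
```

All solutions inside the data range come back in one call, which is the main advantage over a generic one-root-at-a-time solver.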
After creating an interpolated function interp_fn, you can find the value of x where interp_fn(x) == a by the roots of the function
interp_fn2 = lambda x: interp_fn(x) - a
There are a number of options for finding the roots in scipy.optimize. For instance, to use Newton's method with the initial value starting at 10:
from scipy import optimize
optimize.newton(interp_fn2, 10)
Actual example
Create an interpolated function and then find the roots where fn(x) == 5
import numpy as np
from scipy import interpolate, optimize
x = np.arange(10)
y = 1 + 6*np.arange(10) - np.arange(10)**2
y2 = 5*np.ones_like(x)
# create the interpolated function, and then the offset
# function used to find the roots
interp_fn = interpolate.interp1d(x, y, 'quadratic')
interp_fn2 = lambda x: interp_fn(x)-5
# to find the roots, we need to supply a starting value
# because there are more than 1 root in our range, we need
# to supply multiple starting values. They should be
# fairly close to the actual root
root1, root2 = optimize.newton(interp_fn2, 1), optimize.newton(interp_fn2, 5)
root1, root2
# returns:
(0.76393202250021064, 5.2360679774997898)
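If good starting values are not known in advance, one possible variation (a sketch, not part of the answer above) is to scan a grid for sign changes of interp_fn(x) - a and hand each bracket to scipy.optimize.brentq. The grid size of 201 and the names shifted, grid and roots are assumptions; roots that touch zero without a sign change between grid points would be missed.

```python
import numpy as np
from scipy import interpolate, optimize

x = np.arange(10)
y = 1 + 6 * x - x**2
a = 5  # target value

interp_fn = interpolate.interp1d(x, y, kind='quadratic')

def shifted(t):
    # Scalar version of interp_fn(t) - a for the root finder.
    return float(interp_fn(t)) - a

# Evaluate on a grid and look for sign changes between neighbours.
grid = np.linspace(x[0], x[-1], 201)
vals = interp_fn(grid) - a

roots = []
for lo, hi, f_lo, f_hi in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
    if f_lo == 0:
        roots.append(lo)                                # grid point is a root
    elif f_lo * f_hi < 0:
        roots.append(optimize.brentq(shifted, lo, hi))  # bracketed root
```

For this data the list contains the same two roots as the Newton example, without supplying starting guesses.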
If your data are strictly monotonic you might also try the following:
inversefunction = interpolate.interp1d(data, xvariable, kind='cubic')
Mentioning another option because I found this page in a google search and the other option works for my simple use case. Hopefully it'll be of use to someone.
If the function you're interpolating is very simple and always has a 1:1 relationship between y and x, then you can simply take your data, swap x and y when you pass it into interp1d, and then call the interpolation function in that direction.
Adapting code from
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
x = np.arange(0, 10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y)
xnew = np.arange(0, 9, 0.1)
ynew = f(xnew)
plt.plot(x, y, 'o', xnew, ynew, '-')
Once x and y have been swapped, the new interpolation function (f below) maps a y value a to the x value where it occurs, so you can simply call f(a).
f = interpolate.interp1d(y, x)
xnew = np.arange(np.exp(-9/3), np.exp(0), 0.01)
ynew = f(xnew)
plt.plot(y, x, 'o', xnew, ynew, '-')
Of course, if the function ever has multiple x values for a given y value (like sine or a parabola) then this will not work because it will no longer be a 1:1 function from x to y, and the above answers are necessary. This is just a simplification in a limited use case.

scipy -- how to integrate a linearly interpolated function?

I have a function that is an interpolation of a relatively large data set. I use linear interpolation (interp1d), so there are a lot of non-smooth, sharp points. The quad function from SciPy gives warnings because of the sharp points. How can I do the integration without the warnings?
Thank you!
Thanks for all the answers. Here I summarize the solutions in case some others run into the same problem:
Just like what @Stelios did, use points to avoid the warnings and to get a more accurate result.
In practice the number of points is usually larger than the default limit (limit=50) of quad, so I chose quad(f_interp, a, b, limit=2*p.shape[0], points=p) to avoid all those warnings.
If a and b are not the start and end points of the data set x, the points p can be chosen by p = x[(x >= a) & (x <= b)]
quad accepts an optional argument, called points. According to the documentation:
points : (sequence of floats,ints), optional
A sequence of break points in the bounded integration interval where
local difficulties of the integrand may occur (e.g., singularities,
discontinuities). The sequence does not have to be sorted.
In your case, the "difficult" points are exactly the x-coordinates of the data points. Here is an example:
import numpy as np
from scipy.integrate import quad
# generate random data set
x = np.arange(0,10)
y = np.random.rand(10)
# construct a linear interpolation function of the data set
f_interp = lambda xx: np.interp(xx, x, y)
Here is a plot of the data points and f_interp:
Now calling quad as
quad(f_interp, 0, 9)
returns a series of warnings along with
(4.89770017785734, 1.3762838395159349e-05)
If you provide the points argument, i.e.,
quad(f_interp,0,9, points = x)
it issues no warnings and the result is
(4.8977001778573435, 5.437539505167948e-14)
which also implies a much greater accuracy of the result compared to the previous call.
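The same idea works when the integration limits fall between data points. The sketch below is an illustration under assumptions: the limits a = 2.5 and b = 7.2, the seeded random data and the variable names are made up for the example, and only the breakpoints strictly inside (a, b) are passed as points. Because the integrand is piecewise linear, the exact answer is a trapezoid sum over those breakpoints, which gives an independent check.

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
x = np.arange(0, 10)
y = rng.random(10)

def f_interp(xx):
    # Linear interpolation of the data set.
    return np.interp(xx, x, y)

a, b = 2.5, 7.2                      # assumed limits inside the data range
p = x[(x > a) & (x < b)]             # kinks strictly inside (a, b)

value, err = quad(f_interp, a, b, points=p, limit=max(50, 2 * len(p)))

# Exact integral of the piecewise-linear function: trapezoids between
# consecutive breakpoints, including the two limits.
nodes = np.concatenate(([a], p, [b]))
heights = f_interp(nodes)
exact = np.sum(0.5 * (heights[1:] + heights[:-1]) * np.diff(nodes))
```

If only the integral of the linear interpolant is needed, the trapezoid sum alone is enough and quad is not required at all.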
Instead of interp1d, you could use scipy.interpolate.InterpolatedUnivariateSpline. That interpolator has the method integral(a, b) that computes the definite integral.
Here's an example:
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
import matplotlib.pyplot as plt
# Create some test data.
x = np.linspace(0, np.pi, 21)
y = np.sin(1.5*x) + np.random.laplace(scale=0.35, size=len(x))**3
# Create the interpolator. Use k=1 for linear interpolation.
finterp = InterpolatedUnivariateSpline(x, y, k=1)
# Create a finer mesh of points on which to compute the integral.
xx = np.linspace(x[0], x[-1], 5*len(x))
# Use the interpolator to compute the integral from 0 to t for each
# t in xx.
qq = [finterp.integral(0, t) for t in xx]
# Plot stuff
p = plt.plot(x, y, '.', label='data')
plt.plot(x, y, '-', color=p[0].get_color(), label='linear interpolation')
plt.plot(xx, qq, label='integral of linear interpolation')
plt.legend(framealpha=1, shadow=True)
The plot:

Scipy Curve_Fit return value explained

Below is an example of using curve_fit from SciPy with a linear equation. My understanding of curve fitting in general is that it takes a set of scattered points and creates a curve that shows the "best fit" to those data points. My question is about what scipy's curve_fit returns:
"Optimal values for the parameters so that the sum of the squared error of f(xdata, *popt) - ydata is minimized".
What exactly do these two values mean in simple English? Thanks!
import numpy as np
from scipy.optimize import curve_fit
# Creating a function to model and create data
def func(x, a, b):
    return a * x + b
# Generating clean data
x = np.linspace(0, 10, 100)
y = func(x, 1, 2)
# Adding noise to the data
yn = y + 0.9 * np.random.normal(size=len(x))
# Executing curve_fit on noisy data
popt, pcov = curve_fit(func, x, yn)
# popt returns the best fit values for parameters of
# the given model (func).
You're asking SciPy to tell you the "best" line through a set of pairs of points (x, y).
Here's the equation of a straight line:
y = a*x + b
The slope of the line is a; the y-intercept is b.
You have two parameters, a and b, so you only need two equations to solve for two unknowns. Two points define a line, right?
So what happens when you have more than two points? You can't go through all the points. How do you choose the slope and intercept to give you the "best" line?
One way to define "best" is to calculate the slope and intercept that minimize the sum of the squared differences between each y value and the predicted y at that x on the line:
error = sum[(y(i) - (a*x(i) + b))^2]
It's an easy exercise if you know calculus: take the first derivatives of error w.r.t. a and b and set them equal to zero. You'll have two equations with two unknowns, a and b. You solve them to get the coefficients for the "best" line.
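To connect this to the two return values, here is a sketch that solves those two equations in closed form and compares the result with curve_fit. The seed and the names a_hat, b_hat and perr are assumptions for illustration. popt holds the fitted parameters (here the slope a and intercept b), and pcov is their estimated covariance matrix, so the square roots of its diagonal are commonly read as one-standard-deviation uncertainties on a and b.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b):
    return a * x + b

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
yn = func(x, 1, 2) + 0.9 * rng.normal(size=len(x))

# Closed-form least squares for a straight line: the result of setting
# d(error)/da = 0 and d(error)/db = 0.
x_mean, y_mean = x.mean(), yn.mean()
a_hat = np.sum((x - x_mean) * (yn - y_mean)) / np.sum((x - x_mean) ** 2)
b_hat = y_mean - a_hat * x_mean

# curve_fit minimizes the same sum of squared errors numerically.
popt, pcov = curve_fit(func, x, yn)

# Estimated standard errors of a and b.
perr = np.sqrt(np.diag(pcov))
```

For a straight line both routes should agree to within the optimizer's tolerance; curve_fit becomes necessary when the model is not linear in its parameters.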

Peak curvature in Scipy spline

How can I find the peak curvature of a spline fitted using scipy? (Actually, peak second differential would be enough)
I have calculated the tck values as follows, using my 1d xs and ys vectors:
tck = splrep(xs, ys, s=0)
I know I can evaluate the second differential at any x of my choice:
ddy = splev([x], tck, 2)
So I could loop over many values of x, calculate the curvature and take the maximum. But I would prefer to interpret the values in tck to get the coefficients of the individual cubic functions, and thus calculate the peak curvature directly. However, tck appears rather opaque - how can I extract the cubic function coefficients from it?
Just use the der keyword argument on splev function:
ddy = splev(X, tck, der=2)
and preferably don't loop over many values of x. Instead make an array X containing every value you want to evaluate, so that you get back an array of values instead of individual values that you would have to put in a sequence anyway.
Also, it is extremely advisable to PLOT your results as a way to debug them. If the plots make sense, things are most likely working as you expect (and if not, they surely are NOT).
EDIT: in case the interpolation using X gives just an approximate value and you want the TRUE maximum, you can use parabolic interpolation of the three points that define the maximum (the local interpolated maximum and its neighbors), considering the spline is locally smooth:
def parabolic_interpolation(p1, p2, p3):
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    denom = (x1-x2)*(x1-x3)*(x2-x3)
    a = (x3*(y2-y1)+x2*(y1-y3)+x1*(y3-y2))/denom
    b = (x3*x3*(y1-y2)+x2*x2*(y3-y1)+x1*x1*(y2-y3))/denom
    c = (x2*x3*(x2-x3)*y1+x3*x1*(x3-x1)*y2+x1*x2*(x1-x2)*y3)/denom
    xv = -b/(2*a)
    yv = c-b**2/(4*a)
    return (xv, yv) # coordinates of the vertex
Hope this helps!
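A possible shortcut for the peak second derivative specifically (a sketch under the assumption of a cubic spline, k=3, as splrep produces by default): the second derivative of a cubic spline is piecewise linear, so its extreme values can only occur at the knots. Evaluating splev with der=2 at the knots in tck[0] is therefore enough. The sample data below is made up for illustration. The last line uses scipy.interpolate.PPoly.from_spline to expose the per-interval cubic coefficients the question asks about; this covers the second derivative only, not the full curvature formula.

```python
import numpy as np
from scipy.interpolate import splrep, splev, PPoly

# Assumed sample data standing in for xs and ys.
xs = np.linspace(0, 2 * np.pi, 25)
ys = np.sin(xs)
tck = splrep(xs, ys, s=0)

# The knot vector repeats the end points; keep each knot once.
knots = np.unique(tck[0])

# Second derivative at the knots; the peak must be one of these values.
ddy_at_knots = splev(knots, tck, der=2)
i_peak = np.argmax(np.abs(ddy_at_knots))
x_peak, ddy_peak = knots[i_peak], ddy_at_knots[i_peak]

# Piecewise-polynomial form: pp.c[j, i] is the coefficient of
# (x - pp.x[i])**(3 - j) on the interval starting at pp.x[i].
pp = PPoly.from_spline(tck)
```

A dense evaluation of splev(X, tck, der=2) should never exceed abs(ddy_peak), which makes a convenient sanity check when plotting.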
