Constrain specific values in Scipy curve fitting - python

This may be quite a basic question, but a quick search did not solve it.
I have some experimental data that I need to fit with an equation like
a * exp(-x/t)
When more components are needed, the expression is
a * exp(-x/t1) + b * exp(-x/t2) + ... + n * exp(-x/tn)
for n components.
Right now I have the following code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
x = np.array([0.0001, 0.0004, 0.0006, 0.0008, 0.001, 0.0015, 0.002, 0.004, 0.006, 0.008, 0.01, 0.05, 0.1, 0.2, 0.5, 0.6, 0.8, 1, 1.5, 2, 4, 6, 8])
y1= np.array([5176350.00, 5144208.69, 4998297.04, 4787100.79, 4555731.93, 4030741.17, 3637802.79, 2949911.45, 2816472.26, 2831962.09, 2833262.53, 2815205.34, 2610685.14, 3581566.94, 1820610.74, 2100882.80, 1762737.50, 1558251.40, 997259.21, 977892.00, 518709.91, 309594.88, 186184.52])
y2 = np.array([441983.26, 423371.31, 399370.82, 390603.58, 378351.08, 356511.93, 349582.29, 346425.39, 351191.31, 329363.40, 325154.86, 352906.21, 333150.81, 301613.81, 94043.05, 100885.77, 86193.40, 75548.26, 27958.11, 20262.68, 27945.10])
def fitcurve(x, a, b, t1, t2):
    # sum of two exponential decays
    return a * np.exp(-x / t1) + b * np.exp(-x / t2)

# `y` was never defined above; y1 is used here because it has the same length as x
y = y1
popt, pcov = curve_fit(fitcurve, x, y)
print('a = ', popt[0], 'b = ', popt[1], 't1 = ', popt[2], 't2 = ', popt[3])
plt.plot(x, y, 'bo')
plt.plot(x, fitcurve(x, *popt))
One important constraint is that a + b + ... + n must equal 1, since each coefficient is the fraction contributed by its component. Ideally, I want to fit and plot 1, 2, 3 and 4 components and see which gives the best fit.

I am afraid that your data cannot be fitted with a simple sum of exponential functions. Did you plot the points to see the shape of the curve?
This looks more like a logistic-type function (though not exactly logistic) than a sum of exponentials.
I could give some advice on fitting a sum of exponentials (even with a condition on the sum of the coefficients), but this would be of no use with your data. Of course, if you have other data suited to a sum of exponentials, I would be pleased to show how to proceed.

I am not going into the model-fitting procedure, but what you can do is accept a variable number of parameters through *args and then try fitting with different numbers of exponentials. You can use NumPy broadcasting to achieve this.
EDIT: you have to take care with the number of elements passed through *args. Only even counts work right now (half amplitudes, half decay rates). I leave it to you to handle that part (trivial).
Target
We want to fit $$\sum_{i=1}^{N} a_i \exp(-b_i x)$$ for a variable number of terms $N$.
Implementation:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize, ndimage
x = np.array([0.0001, 0.0004, 0.0006, 0.0008, 0.0010, 0.0015, 0.0020, 0.0040, 0.0060, 0.0080, 0.0100, 0.0500, 0.1000, 0.2000, 0.5000, 0.6000, 0.8000, 1.0000, 1.5000, 2.0000, 4.0000, 6.0000, 8.0000, 10.0000])
y = np.array([416312.6500, 387276.6400, 364153.7600, 350981.7000, 336813.8800, 314992.6100, 310430.4600, 318255.1700, 318487.1700, 291768.9700, 276617.3000, 305250.2100, 272001.3500, 260540.5600, 173677.1900, 155821.5500, 151502.9700, 83559.9000, 256097.3600, 20761.8400, 1.0000, 1.0000, 1.0000, 1.0000])
# variable args fit
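def fitcurve(x, *args):
    # first half of args: amplitudes a_i; second half: decay rates b_i
    args = np.array(args)
    half = len(args) // 2
    # broadcast x (as a column) against the parameter vectors
    y = args[:half] * np.exp(-x[:, None] * args[half:])
    return y.sum(-1)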
# data seems to contain outlier?
# y = ndimage.median_filter(y, 5)
popt, pcov = optimize.curve_fit(fitcurve, x, y,
    bounds=(0, np.inf),
    p0=np.ones(6),  # set variable size: 6 parameters = 3 exponentials
    maxfev=1000,
)
fig, ax = plt.subplots()
ax.plot(x,y, 'bo')
# ax.set_yscale('log')
ax.set_xscale('symlog')
ax.plot(x,fitcurve(x, *popt))
fig.show()
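One possible way to impose the question's condition that the amplitudes sum to 1 is to fit only n-1 amplitudes and compute the last one as the remainder. The following is a minimal sketch under that assumption, not a method taken from the thread: the helper name make_model, the synthetic data and the starting values are all made up for illustration, and the data are assumed to be normalised so that y(0) = 1.

```python
import numpy as np
from scipy.optimize import curve_fit

def make_model(n):
    """Build f(x, a_1..a_{n-1}, t_1..t_n) where a_n = 1 - sum(a_1..a_{n-1})."""
    def model(x, *params):
        amps = np.array(params[:n - 1])
        taus = np.array(params[n - 1:])
        amps = np.append(amps, 1.0 - amps.sum())  # enforce sum(amps) == 1
        # broadcast x (column) against the n components, then sum them
        return (amps * np.exp(-x[:, None] / taus)).sum(axis=1)
    return model

# Synthetic, noise-free data (hypothetical values, not the data from the question)
x = np.linspace(0.0, 10.0, 200)
y = 0.7 * np.exp(-x / 0.5) + 0.3 * np.exp(-x / 4.0)

n = 2
p0 = [0.5, 1.0, 3.0]                      # a_1, t_1, t_2
lower = [0.0] * (n - 1) + [1e-6] * n      # free amplitudes in [0, 1], taus > 0
upper = [1.0] * (n - 1) + [np.inf] * n
popt, pcov = curve_fit(make_model(n), x, y, p0=p0, bounds=(lower, upper))

amps = np.append(popt[:n - 1], 1.0 - popt[:n - 1].sum())
taus = popt[n - 1:]
print("amplitudes:", amps, "time constants:", taus)
```

Changing n (and the length of p0 and the bounds) gives the 1, 2, 3 or 4 component variants. Note that the bounds only constrain the free amplitudes; for n > 2 the remainder amplitude can still become negative, which this sketch does not prevent.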

Related

Fitting a curve to some datapoints

The fitted curve doesn't fit the data points (xH_data, nH_data) as expected. Does anyone know what the issue might be?
from scipy.optimize import curve_fit
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7, 2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11, 2.81, 0.665, 0.313, 0.090, 0.044, 0.029])*1.0e6
plt.plot(xH_data, nH_data)
plt.yscale("log")
plt.xscale("log")
def eTemp(x, A, a, B):
    n = B*(A+x)**a
    return n
parameters, covariance = curve_fit(eTemp, xH_data, nH_data, maxfev=200000)
fit_A = parameters[0]
fit_a = parameters[1]
fit_B = parameters[2]
print(fit_A)
print(fit_a)
print(fit_B)
r = np.logspace(0, 0.7, 1000)
ne = fit_B *(fit_A + r)**(fit_a)
plt.plot(r, ne)
plt.yscale("log")
plt.xscale("log")
Thanks in advance for the help.
Ok, here is a different approach. As usual, the main problem is finding initial guesses for the non-linear fit (for details, check this). Here, they are obtained from an integral relation satisfied by the fit function y( x ) = a ( x - c )^p, namely
int( y ) = ( x - c ) / ( p + 1 ) y + d = x y / ( p + 1 ) - c y / ( p + 1 ) + d
This means we can get c and p from a linear fit of int( y ) against x y, y and a constant. Once those are known, a follows from a simple linear fit. It turns out that these guesses are already quite good. Nevertheless, they are used as initial values for a non-linear fit that provides the final result. In detail, it goes like this:
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import cumulative_trapezoid as cumtrapz  # cumtrapz was renamed in newer SciPy releases
from scipy.optimize import curve_fit
xHdata = np.array(
[
1.0, 1.03, 1.06, 1.1, 1.2, 1.3, 1.5,
1.7, 2.0, 2.6, 3.0, 4.0, 5.0, 6.0
]
)
nHdata = np.array(
[
403.0, 316.0, 235.0, 160.0, 70.8, 37.6,
14.8, 7.11, 2.81, 0.665, 0.313, 0.090, 0.044, 0.029
]
) * 1.0e6
def fit_func( x, a, c, p ):
    out = a * ( x - c )**p
    return out
### fitting the non-linear parameters as part of an integro-equation
### this is the standard matrix formulation of a linear fit
Sy = cumtrapz( nHdata, x=xHdata, initial=0 ) ## int( y )
VMXT = np.array( [ xHdata * nHdata , nHdata, np.ones( len( nHdata ) ) ] ) ## ( x y, y, d )
VMX = VMXT.transpose()
A = np.dot( VMXT, VMX )
SV = np.dot( VMXT, Sy )
sol = np.linalg.solve( A , SV )
print ( sol )
pF = 1 / sol[0] - 1
print( pF )
cF = -sol[1] * ( pF + 1 )
print( cF )
### making a linear fit on the scale
### the short version of the matrix form if only one factor is calculated
fk = fit_func( xHdata, 1, cF, pF )
aF = np.dot( nHdata, fk ) / np.dot( fk, fk )
print( aF )
#### using these guesses as input for a final non-linear fit
sol, cov = curve_fit(fit_func, xHdata, nHdata, p0=( aF, cF, pF ) )
print( sol )
print( cov )
### plotting
xth = np.linspace( 1, 6, 125 )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.scatter( xHdata, nHdata )
ax.plot( xth, fit_func( xth, aF, cF, pF ), ls=':' )
ax.plot( xth, fit_func( xth, *sol ) )
plt.show()
Providing:
[-3.82334284e-01 2.51613126e-01 5.41522867e+07]
-3.6155122388787175
0.6580972107001803
8504146.59883185
[ 5.32486242e+07 2.44780953e-01 -7.24897172e+00]
[[ 1.03198712e+16 -2.71798924e+07 -2.37545914e+08]
[-2.71798924e+07 7.16072922e-02 6.26461373e-01]
[-2.37545914e+08 6.26461373e-01 5.49910325e+00]]
(note the high correlation from a to c and p)
I know of two things that might help you:
1. Provide the p0 input parameter to curve_fit with a set of appropriate starting parameters for the function. That can keep the algorithm from running wild.
2. Change the function you are fitting so that it returns np.log(n), and then fit it to np.log(nH_data). As it is now, the penalty for missing the first data points is far larger than for missing the last ones, because the first values are about 10^4 times larger. Thus, the first data points become "more important" to the algorithm. Taking the logarithm puts them on a comparable scale, so that the points are weighted more equally.
Go ahead and play around with it. I managed a pretty good fit with these parameters:
[-7.21450545e-01 -3.36131028e+00 5.97293632e+06]
I think you're nearly there, just need to fit on a log scale and throw in a decent guess. To make the guess you just need to throw in a plot like
plt.figure()
plt.plot(np.log(xH_data), np.log(nH_data))
and you'll see it's nearly linear. So your B will be the exponentiated intercept (i.e. exp(20ish)) and a is the approximate slope (-5ish). A is a weird one: does it have some physical meaning, or did you just throw it in there? If there's no physical meaning, I'd say get rid of it.
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
xH_data = np.array([1., 1.03, 1.06, 1.1, 1.2, 1.3, 1.5, 1.7, 2., 2.6, 3., 4., 5., 6.])
nH_data = np.array([403., 316., 235., 160., 70.8, 37.6, 14.8, 7.11, 2.81, 0.665, 0.313, 0.090, 0.044, 0.029])*1.0e6
def eTemp(x, A, a, B):
    logn = np.log(B*(x + A)**a)
    return logn
parameters, covariance = curve_fit(eTemp, xH_data, np.log(nH_data), p0=[np.exp(0.1), -5, np.exp(20)], maxfev=200000)
fit_A = parameters[0]
fit_a = parameters[1]
fit_B = parameters[2]
print(fit_A)
print(fit_a)
print(fit_B)
r = np.logspace(0, 0.7, 1000)
ne = np.exp(eTemp(r, fit_A, fit_a, fit_B))
plt.plot(xH_data, nH_data)
plt.plot(r, ne)
plt.yscale("log")
plt.xscale("log")
There is a problem with your fit equation. If A is less than -1, then A + x is negative at the start of your fit range (x = 1), and raising a negative number to a non-integer power a gives a complex value (NaN in NumPy's real arithmetic). For this reason you need to add constraints and an initial set of parameters to your curve_fit call, for example:
parameters, covariance = curve_fit(eTemp, xH_data, nH_data, method='dogbox', p0 = [100, -3.3, 10E8], bounds=((-0.9, -10, 0), (200, -1, 10e9)), maxfev=200000)
Bounds are not supported by the 'lm' method. With bounds, use 'trf' (which curve_fit picks by default when bounds are given) or 'dogbox', as here.
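To illustrate how bounds and the method argument interact, here is a small sketch. It uses synthetic, noise-free data generated from made-up parameters (A=0.5, a=-3, B=10) rather than the data from the question, and the name power_law is only for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, A, a, B):
    return B * (A + x) ** a

# Synthetic data from hypothetical parameters, for illustration only
x = np.linspace(1.0, 6.0, 30)
y = power_law(x, 0.5, -3.0, 10.0)

lower = (-0.9, -10.0, 0.0)   # A > -1 keeps A + x positive for x >= 1
upper = (200.0, -1.0, 1e3)
p0 = (0.3, -2.5, 8.0)        # the initial guess must lie inside the bounds

# With bounds and no explicit method, curve_fit selects 'trf'
popt, pcov = curve_fit(power_law, x, y, p0=p0,
                       bounds=(lower, upper), maxfev=5000)
print(popt)

# 'lm' refuses bounded problems with a ValueError
try:
    curve_fit(power_law, x, y, p0=p0, bounds=(lower, upper), method='lm')
    lm_rejected = False
except ValueError:
    lm_rejected = True
print("lm rejected bounds:", lm_rejected)
```

The fitted parameters always stay inside the given box, which is what prevents A + x from going negative during the optimisation.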

Unexpected behaviour when plotting 3D scenes with mayavi

I want to plot a sphere with latitude lines in 3D using mayavi. But I don't want the latitudes at equidistant angles; instead they should be arranged according to this: https://en.wikipedia.org/wiki/Spherical_segment
This should result in spherical segments which all have the same surface area.
So far... Let's take theta to be the polar angle and phi to be the azimuthal angle. Then I have the following code:
import numpy as np
from mayavi import mlab
## Create a sphere
r = 1.0
pi = np.pi
cos = np.cos
sin = np.sin
arccos=np.arccos
phi, theta = np.mgrid[-0.5*pi:0.5*pi:101j, 0:1*pi:101j]
x = r*sin(phi)*cos(theta)
y = r*sin(phi)*sin(theta)
z = r*cos(phi)
## Basic settings mlab
mlab.figure(1, bgcolor=(1, 1, 1), fgcolor=(0, 0, 0), size=(500, 500))
mlab.clf()
mlab.mesh(x , y , z, color=(0.9,0.,0.), opacity=0.3)
phi1=np.linspace(0, 2 * np.pi, 100)
theta1=arccos(np.linspace(0,1,11))
for i in range(len(theta1)):
    x_pol = np.cos(phi1) * np.cos(theta1[i])
    y_pol = np.sin(phi1) * np.cos(theta1[i])
    z_pol = np.ones_like(phi1) * np.sin(theta1[i])
    mlab.plot3d(x_pol, y_pol, z_pol, color=(0,0,0), opacity=0.2, tube_radius=None)
mlab.show()
The result is shown in image0 below.
As you can see, the segments are not arranged correctly. So I changed the order in theta1:
theta1=arccos(np.linspace(1,0,11))
The result is shown in image1 below. As you can see, the arrangement of the segments didn't change.
So, why is that? Spacing the values from 0...1 should give a different result than spacing them from 1...0. But actually it doesn't?!?
Does anyone have a clue what I did wrong?
Thanks
image0
image1
The ranges have the same values. The segments are the same, but in reversed order.
See the values of theta:
In [1]: np.flip(np.linspace(0,1,11), 0), np.linspace(1,0,11)
Out[1]:
(array([ 1. , 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0. ]),
array([ 1. , 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0. ]))
Thanks for your reply. I am not sure, whether I got your point. In the first case theta looks like this:
In [1]: np.arccos(np.linspace(1,0,11))
Out [1]:
array([0. , 0.45102681, 0.64350111, 0.79539883, 0.92729522,
1.04719755, 1.15927948, 1.26610367, 1.36943841, 1.47062891,
1.57079633])
In the second case it looks like:
In [1]: np.arccos(np.linspace(0,1,11))
Out [1]:
array([1.57079633, 1.47062891, 1.36943841, 1.26610367, 1.15927948,
1.04719755, 0.92729522, 0.79539883, 0.64350111, 0.45102681,
0. ])
So to me, it seems correct.
Ok,
sometimes it takes quite a time for me ^^
I figured out, what I did wrong. I simply changed
np.arccos(np.linspace(0,1,11))
to
np.pi/2 - np.arccos(np.linspace(0,1,11))
which produces the correct output.
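A short numerical check of why this works, as a sketch that needs only NumPy: the plotting loop above places each ring at height z = sin(theta1), and zones of equal area on a sphere need equally spaced heights (the zone area is 2*pi*r*h, per the linked article). The variable names below are chosen just for this check.

```python
import numpy as np

u = np.linspace(0, 1, 11)

# Original choice: z = sin(arccos(u)) = sqrt(1 - u**2), not equally spaced
z_original = np.sin(np.arccos(u))

# Corrected choice: pi/2 - arccos(u) = arcsin(u), so z = u, equally spaced
z_fixed = np.sin(np.pi / 2 - np.arccos(u))

# Zone areas on the unit sphere between neighbouring rings: 2 * pi * h
areas_original = 2 * np.pi * np.abs(np.diff(z_original))
areas_fixed = 2 * np.pi * np.diff(z_fixed)

print(np.round(areas_original, 3))  # unequal areas
print(np.round(areas_fixed, 3))     # all equal
```

Reversing the order of linspace only reverses the order in which the same rings are drawn, which is why swapping 0...1 for 1...0 changed nothing.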
correct Image
Well, sometimes you (I) don't see the forest for the trees... ^^
Greetings...

python numpy.fft.rfft: why are the outputs very different with and without NFFT

I am trying to understand the meaning of NFFT in numpy.fft.rfft, but I am confused about why the outputs are so different depending on whether NFFT is given. Please see the example below.
>>> numpy.fft.rfft([0, 1, 0, 0, 4.3, 3, 599], 8)
array([ 607.3 +0.j , -5.71421356+600.41421356j,
-594.7 -4.j , -2.88578644-597.58578644j,
599.3 +0.j ])
>>> numpy.fft.rfft([0, 1, 0, 0, 4.3, 3, 599])
array([ 607.3 +0.j , 369.55215218+472.32571033j,
-133.53446083+578.34336489j, -539.66769135+261.30917157j])
The FFT is an efficient implementation of the Discrete Fourier Transform (DFT), which is a discrete function of frequency. It is also related to the Discrete-Time Fourier Transform (DTFT), itself a continuous function of frequency. More specifically, the DFT corresponds exactly to the DTFT evaluated at the discrete frequencies of the DFT.
In other words, when computing a Discrete Fourier Transform with numpy.fft.rfft, you are essentially sampling the DTFT function at discrete frequency points. You can see this by plotting transforms of different lengths on the same graph with the following:
import numpy as np
import matplotlib.pyplot as plt
x = [0, 1, 0, 0, 4.3, 3, 599]
# Compute the DTFT at a sufficiently large number of points using the explicit formula
N = 2048
f = np.linspace(0, 0.5, N)
dtft = np.zeros(len(f), dtype=np.complex128)
for n in range(0,len(x)):
    dtft += x[n] * np.exp(-1j*2*np.pi*f*n)
# Compute the FFT without NFFT argument (NFFT defaults to the length of the input)
y1 = np.fft.rfft(x)
f1 = np.fft.rfftfreq(len(x))
# Compute the FFT with NFFT argument
N2 = 8
y2 = np.fft.rfft(x,N2)
f2 = np.fft.rfftfreq(N2)
# Plot results
plt.figure(1)
plt.subplot(2,1,1)
plt.plot(f, np.abs(dtft), label='DTFT')
plt.plot(f1, np.abs(y1), 'C1x', label='FFT N=7')
plt.plot(f2, np.abs(y2), 'C2s', label='FFT N=8')
plt.title('Magnitude')
plt.legend(loc='upper right')
plt.subplot(2,1,2)
plt.plot(f, np.angle(dtft), label='DTFT')
plt.plot(f1, np.angle(y1), 'C1x', label='FFT N=7')
plt.plot(f2, np.angle(y2), 'C2s', label='FFT N=8')
plt.title('Phase')
plt.legend(loc='upper right')
plt.show()
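The same point can be checked numerically without a plot. The sketch below, with a helper dtft written only for this check, confirms that passing n=8 is the same as zero-padding the 7-sample input to length 8, and that both transforms are samples of one DTFT taken at different frequency grids.

```python
import numpy as np

x = np.array([0, 1, 0, 0, 4.3, 3, 599])
nfft = 8

# rfft(x, n) with n > len(x) zero-pads the input to length n
padded = np.concatenate([x, np.zeros(nfft - len(x))])
via_n = np.fft.rfft(x, nfft)
via_pad = np.fft.rfft(padded)

def dtft(signal, freqs):
    """Evaluate the DTFT of `signal` at normalised frequencies `freqs`."""
    n = np.arange(len(signal))
    return np.exp(-2j * np.pi * np.outer(freqs, n)) @ signal

# Each FFT equals the DTFT sampled on its own grid (k/7 versus k/8)
same7 = np.allclose(np.fft.rfft(x), dtft(x, np.fft.rfftfreq(len(x))))
same8 = np.allclose(via_n, dtft(x, np.fft.rfftfreq(nfft)))
print(np.allclose(via_n, via_pad), same7, same8)
```

Only the bin at frequency 0 is shared by both grids, which is why the first output value (607.3) is the only one that agrees between the two calls in the question.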

Pixelwise 2D Radiometric Calibration

I have 3 images, each with a mean filter applied:
I0, just the noise image, taken with the cap on;
I20, an image showing only a 20% reflectance target;
I90, an image showing only a 90% reflectance target in every pixel.
So rather than looping over each pixel and using a polynomial fit (https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html),
where X = [I0(i), I20(i), I90(i)] and Y = [0, 0.2, 0.9],
and then applying polyfit to get the parameters for each pixel,
is there a way to feed an X(i,3) and Y(i,3) into polyfit or something similar to get the same result, but faster?
Thanks
Ben
If your goal is to vectorize polyfit then yes, this can be done but requires rewriting np.polyfit manually. Fortunately, it can be built on top of np.linalg.lstsq and the polynomial design matrix provided by np.vander. All in all, the routine looks like the following:
import numpy as np
def fit_many(x, y, order=2):
    '''
    arguments:
        x: [N]
        y: [N x S]
    where:
        N - # of measurements per pixel
        S - # pixels
    `order` is the number of polynomial coefficients (degree + 1)
    returns [`order` x S]
    '''
    A = np.vander(x, N=order)
    return np.linalg.lstsq(A, y, rcond=None)[0]
It can be used as follows:
# measurement x values. I suppose those are your reflectances?
x = np.array([0, 1, 2])
y = np.array([ # a row per pixel
[-1, 0.2, 0.9],
[-.9, 0.1, 1.2],
]).T
params = fit_many(x, y)
import matplotlib.pyplot as plt
poly1 = np.poly1d(params[:, 0])
poly2 = np.poly1d(params[:, 1])
plt.plot(x, y[:, 0], 'bo')
plt.plot(x, poly1(x), 'b-')
plt.plot(x, y[:, 1], 'ro')
plt.plot(x, poly2(x), 'r-')
plt.show()
Keep in mind np.linalg.lstsq doesn't allow for dimensions higher than two, so you will have to reshape your 2d image into flattened versions, fit and convert back.
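The reshape step could look like the sketch below. It follows the convention of the answer above (one common x vector, per-pixel y values), which is an assumption and differs from the X/Y roles in the question. The function name fit_image_stack, the 4 x 5 image size and the random gain and offset maps are all hypothetical.

```python
import numpy as np

def fit_image_stack(x, stack, n_coef=2):
    """Fit one polynomial per pixel.

    x:     [N] common measurement values
    stack: [N x H x W] one image per measurement
    returns [n_coef x H x W], highest power first
    """
    n, h, w = stack.shape
    A = np.vander(x, N=n_coef)
    flat = stack.reshape(n, h * w)            # flatten pixels into columns
    coefs, *_ = np.linalg.lstsq(A, flat, rcond=None)
    return coefs.reshape(n_coef, h, w)        # restore the image layout

# Synthetic example: every pixel has its own linear response (made-up values)
x = np.array([0.0, 0.2, 0.9])
rng = np.random.default_rng(0)
gain = rng.uniform(0.5, 2.0, size=(4, 5))
offset = rng.uniform(-0.1, 0.1, size=(4, 5))
stack = gain * x[:, None, None] + offset      # shape (3, 4, 5)

coefs = fit_image_stack(x, stack)
print(coefs.shape)
```

Here coefs[0] holds the per-pixel slope and coefs[1] the per-pixel intercept, so the whole image is fitted in a single lstsq call.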

fitting data with numpy

I have the following data:
>>> x
array([ 3.08, 3.1 , 3.12, 3.14, 3.16, 3.18, 3.2 , 3.22, 3.24,
3.26, 3.28, 3.3 , 3.32, 3.34, 3.36, 3.38, 3.4 , 3.42,
3.44, 3.46, 3.48, 3.5 , 3.52, 3.54, 3.56, 3.58, 3.6 ,
3.62, 3.64, 3.66, 3.68])
>>> y
array([ 0.000857, 0.001182, 0.001619, 0.002113, 0.002702, 0.003351,
0.004062, 0.004754, 0.00546 , 0.006183, 0.006816, 0.007362,
0.007844, 0.008207, 0.008474, 0.008541, 0.008539, 0.008445,
0.008251, 0.007974, 0.007608, 0.007193, 0.006752, 0.006269,
0.005799, 0.005302, 0.004822, 0.004339, 0.00391 , 0.003481,
0.003095])
Now, I want to fit these data with, say, a 4 degree polynomial. So I do:
>>> coefs = np.polynomial.polynomial.polyfit(x, y, 4)
>>> ffit = np.poly1d(coefs)
Now I create a new grid for x values to evaluate the fitting function ffit:
>>> x_new = np.linspace(x[0], x[-1], num=len(x)*10)
When I do all the plotting (data set and fitting curve) with the command:
>>> fig1 = plt.figure()
>>> ax1 = fig1.add_subplot(111)
>>> ax1.scatter(x, y, facecolors='None')
>>> ax1.plot(x_new, ffit(x_new))
>>> plt.show()
I get the following:
fitting_data.png
What I expect is the fitting function to fit correctly (at least near the maximum value of the data). What am I doing wrong?
Unfortunately, np.polynomial.polynomial.polyfit returns the coefficients in the opposite order to the one np.polyfit and np.polyval (or np.poly1d, which you used) expect. To illustrate:
In [40]: np.polynomial.polynomial.polyfit(x, y, 4)
Out[40]:
array([ 84.29340848, -100.53595376, 44.83281408, -8.85931101,
0.65459882])
In [41]: np.polyfit(x, y, 4)
Out[41]:
array([ 0.65459882, -8.859311 , 44.83281407, -100.53595375,
84.29340846])
In general: np.polynomial.polynomial.polyfit returns coefficients [A, B, C] to A + Bx + Cx^2 + ..., while np.polyfit returns: ... + Ax^2 + Bx + C.
So if you want to use this combination of functions, you must reverse the order of coefficients, as in:
ffit = np.polyval(coefs[::-1], x_new)
However, the documentation states clearly to avoid np.polyfit, np.polyval, and np.poly1d, and instead to use only the new(er) package.
You're safest to use only the polynomial package:
import numpy.polynomial.polynomial as poly
coefs = poly.polyfit(x, y, 4)
ffit = poly.polyval(x_new, coefs)
plt.plot(x_new, ffit)
Or, to create the polynomial function:
ffit = poly.Polynomial(coefs) # instead of np.poly1d
plt.plot(x_new, ffit(x_new))
Note that you can use the Polynomial class directly to do the fitting and return a Polynomial instance.
from numpy.polynomial import Polynomial
p = Polynomial.fit(x, y, 4)
plt.plot(*p.linspace())
p uses scaled and shifted x values for numerical stability. If you need the usual form of the coefficients, you will need to follow with
pnormal = p.convert(domain=(-1, 1))
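The three conventions can be compared side by side on a small synthetic example. The data below (an exact quadratic, y = 1 + 2x + 3x^2) are made up for illustration and are not the data from the question.

```python
import numpy as np
import numpy.polynomial.polynomial as poly
from numpy.polynomial import Polynomial

# Hypothetical data lying exactly on y = 1 + 2x + 3x^2
x = np.linspace(0, 2, 21)
y = 1 + 2 * x + 3 * x**2

low_first = poly.polyfit(x, y, 2)    # constant term first
high_first = np.polyfit(x, y, 2)     # highest power first

p = Polynomial.fit(x, y, 2)          # coefficients in the scaled domain
unscaled = p.convert().coef          # coefficients in plain x, constant first

print(low_first)
print(high_first)
print(p.coef)
print(unscaled)
```

Because Polynomial.fit maps the data range [0, 2] onto [-1, 1], p.coef describes the polynomial in the shifted variable and does not match the plain coefficients until convert() is applied.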
