Orthogonal regression fitting in scipy least squares method - python

The leastsq method in the SciPy library fits a curve to some data. The method assumes that the Y values depend on some X argument, and it minimizes the distance between the curve and the data points along the Y axis only (dy).
But what if I need to minimize the distance along both axes (dy and dx)?
Is there a way to implement this calculation?
Here is a code sample that uses the one-axis calculation:
import numpy as np
from scipy.optimize import leastsq
xData = [some data...]
yData = [some data...]
def mFunc(p, x, y):
    return y - (p[0]*x**p[1]) # it takes into account only the y axis
# without full_output, leastsq returns the fitted parameters and an integer status flag
plsq, ier = leastsq(mFunc, [1,1], args=(xData,yData))
print(plsq)
I recently tried the scipy.odr library, and it returns proper results only for a linear function. For other functions such as y=a*x^b it returns wrong results. This is how I use it:
from scipy.odr import Model, Data, ODR
def f(p, x):
    return p[0]*x**p[1]
myModel = Model(f)
myData = Data(xData, yData)
myOdr = ODR(myData, myModel, beta0=[1,1])
myOdr.set_job(fit_type=0) # with fit_type=2 it returns the same as leastsq
out = myOdr.run()
out.pprint()
This returns wrong results, and for some input data they are not even close to the real values.
Maybe there is some special way of using it. What am I doing wrong?

I've found the solution. SciPy's ODRPACK wrapper works normally, but it needs a good initial guess to give correct results. So I divided the process into two steps.
First step: find the initial guess using the ordinary least squares method.
Second step: substitute this initial guess into ODR as the beta0 parameter.
It works very well, at an acceptable speed.
Thank you guys, your advice directed me to the right solution.
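The two steps described above could be sketched roughly as follows. The synthetic data (y = 2*x^1.5 with small noise on both axes), the random seed, and the helper names power_law and y_residuals are assumptions made only for illustration; they are not from the original post.

```python
import numpy as np
from scipy.optimize import leastsq
from scipy.odr import ODR, Model, Data

# Synthetic data (assumed for illustration): y = 2 * x**1.5, small noise on both axes
rng = np.random.default_rng(0)
x_true = np.linspace(1.0, 10.0, 50)
x_data = x_true + rng.normal(scale=0.02, size=x_true.size)
y_data = 2.0 * x_true**1.5 + rng.normal(scale=0.02, size=x_true.size)

def power_law(p, x):
    return p[0] * x**p[1]

def y_residuals(p, x, y):
    # vertical distances only, as in the leastsq example
    return y - power_law(p, x)

# Step 1: ordinary least squares gives the initial guess
p_ols, ier = leastsq(y_residuals, [1.0, 1.0], args=(x_data, y_data))

# Step 2: orthogonal distance regression started from the OLS estimate
odr = ODR(Data(x_data, y_data), Model(power_law), beta0=p_ols)
odr.set_job(fit_type=0)  # 0 selects explicit ODR
out = odr.run()
print(p_ols, out.beta)
```

With real data, xData and yData would replace the synthetic arrays; converting them with np.asarray first avoids problems when they are plain Python lists.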

scipy.odr implements Orthogonal Distance Regression. See the docstring and the documentation for instructions on basic use.

If you are able to invert the function described by p, you can include the x residual, x - pinverted(y), in mFunc and combine both residuals as sqrt(a^2+b^2), so (pseudo code)
return sqrt( (y - (p[0]*x**p[1]))^2 + (x - pinverted(y))^2 )
for example for
y=kx+m p=[m,k]
pinv=[-m/k,1/k]
return sqrt( (y - (p[0]+x*p[1]))^2 + (x - (pinv[0]+y*pinv[1]))^2)
But what you ask for is in some cases problematic. For example, if a polynomial (or your x^j) curve has a minimum ym at y(m) and you have a point x,y lower than ym, what kind of value do you want to return? There's not always a solution.

You can use the onls package in R.


Mixed partial derivative w.r.t. tensor in Pytorch

Question:
Is there any working method to calculate gradient of (non-scalar) tensor function?
Example
Given n by n symmetric matrices X, Y and matrix function Z(X, Y) = torch.mm(X.mm(X), Y) calculate d(dZ/dX)/dY.
Expected answer
d(dZ/dX)/dY = d(2*XY)/dY = 2*X
Attempts
Because torch's .backward() works only for scalar variables I've tried to calculate derivative by applying torch.autograd.grad() to each element of tensor Z, but this approach is not correct, because it gives d(X^2)/dX = X + 2*D where D is a diagonal matrix with diagonal values of X. For me it's a bit weird that torch has an ability to build a computational graph, but can't track tensor through it as a variable to get tensor derivative.
Edit
Question was not very clear, so I decided to give more details.
My aim is to get partial derivative of loss function, which involves two matrices as variables. It looks like that:
loss = torch.linalg.norm(my_formula(X, Y) , ord='fro')
And I need to find
1. d^2(loss)/d(Y^2)
2. d/dX[d(loss)/dY]
Torch is capable of calculating 1. by using .backward() twice, but it's problematic to find 2., because torch.autograd.grad() expects a scalar input and not a tensor.
TL;DR
For function f which takes a matrix and gives a scalar:
Find first order derivative, let's name it dX
Take trace: Tr(dX)
To get mixed partial derivative just use the trace from above: d/dY[df/dX] = d/dY[Tr(df/dX)]
Intro
At the moment of posting the question I was not really that good at theory of matrix derivatives, but now I know much more all thanks to this Yandex ml book (unfortunately, I didn't find the english equivalent). This is an attempt to give a full answer to my question.
Basic Theory
Forgive me, Lord, for ugly representation of latex
Let's say you have a function which takes a matrix X and returns its squared Frobenius norm: f(X) = ||X||_F^2
It is a well-known fact that: ||X||_F^2 = Tr(X X^T)
Let's define the derivative as shown in the same book: the differential D[f] at X_0 is the part of f(X_0 + H) - f(X_0) that is linear in the increment H
We are ready to find df(X)/dX:
df(X)/dX = dTr(X X^T)/dX =
(using Trace's feature)
= Tr(d/dX[X X^T]) = Tr(dX/dX X^T + X d[X^T]/dX ) =
(then we should use the definition of derivative from above)
= Tr(HX^T + XH^T) = Tr(HX^T) + Tr(XH^T) =
(now the main trick is to get all matrices H on the right side and get something like
Tr(g(X) H) or Tr(g(X) H^T), where g(X) will be the derivative we are looking for)
= Tr(HX^T) + Tr(XH^T) = Tr(XH^T) + Tr(XH^T) = Tr(2*XH^T)
That means: df(X)/dX = 2X
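As a quick sanity check of this result, the derivative can be compared against central finite differences. This is only a sketch; the helper numerical_gradient, the 3 by 3 size, and the random seed are assumptions for illustration.

```python
import numpy as np

def f(X):
    # squared Frobenius norm, f(X) = Tr(X X^T)
    return np.trace(X @ X.T)

def numerical_gradient(func, X, eps=1e-6):
    # central finite differences, perturbing one matrix entry at a time
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            H = np.zeros_like(X)
            H[i, j] = eps
            G[i, j] = (func(X + H) - func(X - H)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 3))
G = numerical_gradient(f, X)
# True when the numerical gradient agrees with 2X
print(np.allclose(G, 2 * X, atol=1e-6))
```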
Second order derivative
Now, after we found out how to get matrix derivatives, let's try to find second order derivative of the same function f(X):
d/dX[df(X)/dX] = d/dX[Tr(2XH_1^T)] = Tr(d/dX[2XH_1^T]) =
= Tr(2I H_2 H_1^T)
We found out that d/dX[df(X)/dX] = 2I where I stands for Identity matrix. But how will it help us to find derivatives in Pytorch?
Trace is the trick
As we can see from the formulas, both first and second order derivatives have Trace inside them, but when we take first order derivative we just instantly get matrix as a result. To get a higher order derivative we just need to take the derivative of trace of first order derivative:
d/dY[df/dX] = d/dY[Tr(df/dX)]
The thing is I was using JAX autograd library when this trick came to my mind, so the code with a function f(X,Y) will look like this:
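import jax.numpy as jnp
from jax import grad
def scalarized_dy(X, Y):
    # first-order derivative w.r.t. Y, reduced to a scalar by the trace
    dY = grad(f, argnums=1)(X, Y)
    return jnp.trace(dY)
dYX = grad(scalarized_dy, argnums=0)(X, Y)
dYY = grad(scalarized_dy, argnums=1)(X, Y)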
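In the case of Pytorch I guess we need to look after the tensors' gradients (let loss be a function with X and Y as arguments, both created with requires_grad=True):
loss = f(X, Y)
loss.backward(create_graph = True)
dX = torch.trace(X.grad)
# .backward() accumulates into .grad, so clear the first-order gradients first
X.grad = None
Y.grad = None
dX.backward()
dXX = X.grad
dXY = Y.grad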
Epilogue
I thought that the question itself is in some way interesting. Also, it took me several months to figure things out, so I decided to give my current point of view on this problem. I will not mark my answer as correct yet in hope that I will get some kind of feedback or, perhaps, even better answers or ideas.

Solving two coupled second order boundary value problems

I have solved a single second order differential equation with two boundary conditions using solve_bvp. However, now I am trying to solve a system of two second order differential equations:
U'' + a*B' = 0
B'' + b*U' = 0
with the boundary conditions U(+/-0.5) = +/-0.01 and B(+/-0.5) = 0. I have split this into a system of first-order ordinary differential equations and I am trying to use solve_bvp to solve them numerically. However, I am just getting arrays full of zeros for my solution. I believe I am implementing the boundary conditions incorrectly. It is not clear to me from the documentation how to handle more than two equations. My attempt is below:
import numpy as np
from scipy.integrate import solve_bvp
import matplotlib.pyplot as plt
%matplotlib inline
alpha = 1E-8
zeta = 8E-3
C_k = 0.05
sigma = 0.01
def fun(x, y):
    return np.vstack((y[1],-((alpha)/(C_k*sigma))*y[2],y[2], -(1/(C_k*zeta))*y[1]))
def bc(ya, yb):
    return np.array([ya[0]+0.001, yb[0]-0.001,ya[0]-0, yb[0]-0])
x = np.linspace(-0.5, 0.5, 5000)
y = np.zeros((4, x.size))
print(y)
sol = solve_bvp(fun, bc, x, y)
print(sol)
In my question I have just relabeled a and b, but they're just parameters that I input. I have the analytic solution for this set of equations so I know one exists that is non-trivial. Any help would be greatly appreciated.
It is usually very helpful to state at least once, in a comment or by assigning to specifically named variables, how you want to compose the state vector.
By the form of the derivative return vector, I would think you intend
U, U', B, B'
which means that U=y[0], U'=y[1] and B=y[2],B'=y[3], so that your derivatives vector should correctly be
return y[1], -((alpha)/(C_k*sigma))*y[3], y[3], -(1/(C_k*zeta))*y[1]
and the boundary conditions
return ya[0]+0.001, yb[0]-0.001, ya[2]-0, yb[2]-0
In particular, your boundary conditions should make the algorithm fail in the first step because of a singular Jacobian. Always check the .success and .message fields of the solution structure.
Note that by default the tolerance of solve_bvp is 1e-3 (the tol argument) and the number of nodes is limited to 1000 (the max_nodes argument).
Setting the initial number of nodes to 50 (5000 is far too many; the solver refines where necessary) and the tolerance to 1e-6, I get solution plots that visibly satisfy the boundary conditions.
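Putting the corrections together, a complete script might look like the sketch below. It assumes the state ordering U, U', B, B' and the boundary value 0.001 used in the question's code (the question text mentions 0.01); the short names a and b are introduced here only for readability.

```python
import numpy as np
from scipy.integrate import solve_bvp

alpha, zeta, C_k, sigma = 1e-8, 8e-3, 0.05, 0.01
a = alpha / (C_k * sigma)   # coefficient of B' in U'' + a*B' = 0
b = 1.0 / (C_k * zeta)      # coefficient of U' in B'' + b*U' = 0

def fun(x, y):
    # state vector: y[0] = U, y[1] = U', y[2] = B, y[3] = B'
    return np.vstack((y[1], -a * y[3], y[3], -b * y[1]))

def bc(ya, yb):
    # U(-0.5) = -0.001, U(0.5) = 0.001, B(-0.5) = 0, B(0.5) = 0
    return np.array([ya[0] + 0.001, yb[0] - 0.001, ya[2], yb[2]])

x = np.linspace(-0.5, 0.5, 50)   # coarse initial mesh, refined by the solver
y0 = np.zeros((4, x.size))       # initial guess
sol = solve_bvp(fun, bc, x, y0, tol=1e-6)

# always check whether the solver reports success
print(sol.success, sol.message)
# U at the two boundaries and at the midpoint
print(sol.sol(np.array([-0.5, 0.0, 0.5]))[0])
```

The interpolant sol.sol can be evaluated on a fine grid for plotting, for example plt.plot(t, sol.sol(t)[0]) for U and plt.plot(t, sol.sol(t)[2]) for B.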

Optimization of two vectors in python

I need to optimize over two vectors x and y. The objective function f(x,y) depends on both vectors, and x and y are also related by a - x/y = 0. Is there a well-known method to solve this in Python?
Well, your question is general; it would be great if you provided more details. But I grabbed a code snippet from here, which you can edit. scipy has a module, optimize, with several methods for optimizing functions.
import numpy as np
from scipy.optimize import minimize
def f(params):
    # minimize passes all variables as a single array
    x, y = params
    return 10 - x/y
# initial values
x0 = 1.3
y0 = 0.5
res = minimize(f, [x0, y0], method='...')
print(res.x)
If you provide more info like what algorithm you want to use, I can provide more precise code.
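For the constrained case in the question, one possible approach is SLSQP with an equality constraint. The sketch below is only an illustration: the vector length n, the constant a, and the quadratic objective are placeholders, not values from the question. Both vectors are packed into one array because minimize optimizes over a single vector.

```python
import numpy as np
from scipy.optimize import minimize

n = 3      # length of each vector (hypothetical)
a = 2.0    # constant in the constraint a - x/y = 0 (hypothetical value)

def objective(z):
    # z packs both vectors: z[:n] is x, z[n:] is y
    x, y = z[:n], z[n:]
    # placeholder objective, to be replaced by the real f(x, y)
    return np.sum((x - 1.0)**2 + (y - 2.0)**2)

def constraint(z):
    x, y = z[:n], z[n:]
    # a - x/y = 0 rewritten as a*y - x = 0, which avoids dividing by y
    return a * y - x

z0 = np.concatenate([np.full(n, 1.3), np.full(n, 0.5)])
res = minimize(objective, z0, method='SLSQP',
               constraints=[{'type': 'eq', 'fun': constraint}])
x_opt, y_opt = res.x[:n], res.x[n:]
print(res.success, x_opt, y_opt)
```

Rewriting the constraint as a*y - x = 0 is equivalent only where y is nonzero, so a solution with a zero entry in y would need a separate check.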

Exponential fit with the least squares Python

I have a very specific task, where I need to find the slope of my exponential function.
I have two arrays, one denoting the wavelength range between 400 and 750 nm, the other the absorption spectrum. x = wavelengths, y = absorption.
My fit function should look something like that:
y_mod = float(a_440) * np.exp(-S*(x - 440.))
where S is the slope and in the image equals 0.016, which should be in the range of S values I should get (+/- 0.003). a_440 is the reference absorption at 440 nm, x is the wavelength.
Modelled vs. original plot:
I would like to know how to define my function in order to get an exponential fit (not on log transformed quantities) of it without guessing beforehand what the S value is.
What I've tried so far was to define the function this way:
def func(x, a, b):
    return a * np.exp(-b * (x-440))
And it gives pretty nice matches (fitted vs. original plot).
What I'm not sure about is whether this approach is correct or whether I should do it differently.
How would one also use the least squares or the absolute differences in y approaches for minimization in order to remove the effect of outliers?
Is it possible to also add random noise to the data and recompute the fit?
Your situation is the same as the one described in the documentation for scipy's curve_fit.
The problem you're running into is that your definition of the function accepts only one argument when it should receive three: x (the independent variable where the function is evaluated), plus a_440 and S.
Cleaning it up a bit, the function should look more like this.
def func(x, A, S):
    return A*np.exp(-S*(x-440.))
You might run into a warning about the covariance matrix. You can solve that by giving curve_fit a decent starting point as a list through the argument p0. For example, in this case p0=[1,0.01], and the fitting call would look like the following
curve_fit(func, x, y, p0=[1,0.01])
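To address the questions about noise and outliers, a rough sketch follows. The synthetic spectrum (A = 0.5, S = 0.016), the noise level, the two artificial outliers, and the choice of the soft_l1 loss with f_scale=0.01 are all assumptions made for illustration, not values from the question.

```python
import numpy as np
from scipy.optimize import curve_fit, least_squares

def func(x, A, S):
    return A * np.exp(-S * (x - 440.0))

# Synthetic spectrum (assumed values): A = 0.5, S = 0.016, plus Gaussian noise
rng = np.random.default_rng(1)
x = np.linspace(400.0, 750.0, 100)
y = func(x, 0.5, 0.016) + rng.normal(scale=0.005, size=x.size)

# Ordinary least squares fit on the noisy data
popt, pcov = curve_fit(func, x, y, p0=[1, 0.01])

# Add two artificial outliers, then refit with a robust loss that
# penalizes large residuals roughly linearly instead of quadratically
y_out = y.copy()
y_out[[60, 80]] += 0.2
res = least_squares(lambda p: func(x, *p) - y_out, x0=[1, 0.01],
                    loss='soft_l1', f_scale=0.01)

print(popt[1], res.x[1])  # fitted slope S from each approach
```

Repeating the fit with different random seeds gives a feel for how much the fitted S varies with the noise.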

Curvefitting optimization error when fitting piecewise linear function

I have some data in two arrays, which appears to have a break in it. I want my code to figure out where the break is by fitting a piecewise function with scipy. Here is what I have:
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
x = np.array([7228,7620,7730,7901,8139,8370,8448,8737,8824,9089,9233,9321,9509,9568,9642,9756,9915,10601,10942], dtype=float)
y = np.array([.874,.893,.8905,.8916,.9095,.9142,.9109,.9185,.9169,.9251,.9290,.9304,.9467,.9378,0.9464,0.9508,0.9583,0.9857,0.9975], dtype=float)
def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])
p , e = optimize.curve_fit(piecewise_linear, x, y)
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))
My issue is that when I run this, I get a warning: "OptimizeWarning: Covariance of the parameters could not be estimated category=OptimizeWarning)". I'm not sure how to get around this. Is there maybe a way to feed initial parameters into this function to help it converge, or something similar?
Note, I do realize that the other way I could be getting this to work is interpolating and finding the second derivative of my data. I've already done this, but because my data is not evenly spaced/ the y axis data has some error in it I am interested in getting it to work this way as well for statistical purposes. So, to be clear, what I want here are the parameters for the two lines (slope/intercept), and the inflection point. (Ideally I would love to get an error too on these too, but not sure if that's possible with this method.) Thanks in advance!
The code works perfectly fine, only the initial values are causing problems.
By default curve_fit starts with all parameters set to 1. Thus, x0 starts way out of range of the x in your data and the optimizer cannot compute a sensible gradient.
This small modification will fix the issue:
# make sure initial x0 and y0 are in range of the data
p0 = [np.mean(x), np.mean(y), 1, 1]
p , e = optimize.curve_fit(piecewise_linear, x, y, p0) # set initial parameter estimates
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))
print(p) # [ 9.32099947e+03 9.32965835e-01 2.58225121e-05 4.05400820e-05]
print(np.diag(e)) # [ 4.56978067e+04 5.52060368e-05 3.88418404e-12 7.05010755e-12]
Your software probably uses an iterative method starting from an initial guess. The initial guess is generally the weak point of such methods.
If you want to overcome this kind of difficulty, use a non-iterative method that does not require an initial guess. If the fitting criterion of the non-iterative method is not convenient for you, still use the non-iterative method first to obtain a first solution. Then use a classical iterative method, starting from that solution.
For example, the next result is obtained thanks to the very simple algorithm (not iterative, no initial guess) which is given on pp. 12-13 of the paper: https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf
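As a rough illustration of getting a starting point without any initial guess, the sketch below scans candidate breakpoints and solves a linear least-squares problem for each. This is not the algorithm from the linked paper; it is a simple stand-in technique, and the function name fit_broken_line and the synthetic data are assumptions.

```python
import numpy as np

def fit_broken_line(x, y):
    """Scan candidate breakpoints; for each, solve a linear least-squares problem."""
    best = None
    # interior data points serve as candidate breakpoints
    for c in x[1:-1]:
        # continuous two-segment line: y = b0 + b1*x + b2*max(x - c, 0)
        basis = np.column_stack([np.ones_like(x), x, np.maximum(x - c, 0.0)])
        coef = np.linalg.lstsq(basis, y, rcond=None)[0]
        sse = np.sum((basis @ coef - y) ** 2)
        if best is None or sse < best[0]:
            best = (sse, c, coef)
    _, c, (b0, b1, b2) = best
    # convert to the (x0, y0, k1, k2) parameters used by piecewise_linear
    return c, b0 + b1 * c, b1, b1 + b2

# synthetic check (assumed data): break at x = 4, slopes 0.5 and 2.0
x = np.linspace(0.0, 10.0, 101)
y = 1.0 + 0.5 * x + 1.5 * np.maximum(x - 4.0, 0.0)
x0, y0, k1, k2 = fit_broken_line(x, y)
print(x0, y0, k1, k2)
```

The returned tuple could then be passed as p0 to optimize.curve_fit(piecewise_linear, x, y, p0) to refine the breakpoint between data points.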
