I'm trying to calculate the value of the variables D and theta for which the function E attains its minimum value.
E = sqrt((x1 - D*cos(theta))^2 + (y1 - D*sin(theta))^2) + sqrt((x2 - 2*D*cos(theta))^2 + (y2 - 2*D*sin(theta))^2) + sqrt((x3 - 3*D*cos(theta))^2 + (y3 - 3*D*sin(theta))^2).
Here, (x1,y1), (x2,y2), (x3,y3) are known. I calculate the partial derivatives of E with respect to D and theta and set them to zero. That gives 2 equations in 2 unknowns, so in principle the system should be exactly solvable. The only issue is that it is highly non-linear, so an analytical solution is out of the question. I'm using Sympy to compute the partial derivatives and generate the equations, which I then pass to least_squares from scipy.optimize. I do get a solution for D and theta, but it doesn't make any physical sense. Furthermore, the cost value reported by least_squares is ~17, so my solution is not very reliable, right? Could someone help me out here? Here's the code:
import numpy as np
import sympy as sym
from scipy.optimize import least_squares as ls

D, theta = sym.symbols("D, theta")
x1, x2, x3 = 9.0, 22.0, 24.0
y1, y2, y3 = 14.0, 14.0, 14.0

# objective: sum of distances between (xi, yi) and the points i*D*(cos(theta), sin(theta))
E = ((x1-D*sym.cos(theta))**2 + (y1-D*sym.sin(theta))**2)**0.5 + ((x2-2*D*sym.cos(theta))**2 + (y2-2*D*sym.sin(theta))**2)**0.5 + ((x3-3*D*sym.cos(theta))**2 + (y3-3*D*sym.sin(theta))**2)**0.5

# symbolic gradient, turned into a numeric function of (D, theta)
gradient = sym.derive_by_array(E, (D, theta))
grad = sym.lambdify((D, theta), gradient)

# theta should be between -22.5 and 22.5 degrees; D should be between 0 and 12 but ideally not 0
A = ls(lambda v: grad(v[0], v[1]), (8, 0),
       bounds=([0, -22.5*np.pi/180], [12, 22.5*np.pi/180]))
D2, theta2 = A.x  # 0.04561884938833529 -0.3926990816987241
I should also mention that calculating D and theta is part of a larger problem that involves fitting a line through (x1,y1), (x2,y2), (x3,y3), and (13,2). D is the distance between (13,2) and the point on the fitted line closest to (x1,y1), 2D is similarly the distance between (13,2) and the point closest to (x2,y2), and so on. This analysis has to be done for every gridpoint of a lat-lon grid of size (21,69). Alternative suggestions for solving this problem are also welcome. Thanks in advance!
You are looking to approximate the sequence 9, 22, 24 with an equidistant sequence like 11,18,25. This does not have a good fit, so a large residual value is reasonable.
Doing the least-squares sum manually gives
(a-d-9)^2 + (a-22)^2 + (a+d-24)^2 = 3*a^2 + 2*d^2 - 110*a - 30*d + const,
so the optimum is at a = 55/3 = 18+1/3 and d = 15/2 = 7+1/2.
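A quick numerical check of those values (a sketch; it just feeds the same quadratic model into NumPy's least-squares solver):

import numpy as np

# model: the k-th value is a + (k-2)*d, i.e. an equidistant sequence centred at a
A = np.array([[1, -1],
              [1,  0],
              [1,  1]], dtype=float)
b = np.array([9.0, 22.0, 24.0])
(a, d), *_ = np.linalg.lstsq(A, b, rcond=None)
print(a, d)   # 18.333..., 7.5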
On a second glance, you are trying to minimize a sum of norms, that is, the sum of distances from points on a line parallel to the x-axis to equidistant points on a line through the origin. You can do that without any least-squares optimization of the gradient (which is a strange idea anyway; why not just find the zero of the gradient with fsolve?):
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fmin, minimize

x = np.array([ 9.0, 22.0, 24.0])
y = np.array([14.0, 14.0, 14.0])
K = 1 + np.arange(len(x))

# sum of distances between (x[k], y[k]) and the k-th equidistant point (k*X, k*Y)
target = lambda X, Y: sum(np.hypot(x - K*X, y - K*Y))

#U = fmin(lambda u: target(*u), [4, 4])
U = minimize(lambda u: target(*u), [4, 4])
print(U)

X, Y = U.x
print(X, Y, np.hypot(X, Y), np.arctan2(Y, X))

K = np.arange(1 + len(x))   # include the origin for plotting
plt.plot(x, y, 'o', ms=8); plt.plot(K*X, K*Y, '-s', ms=4); plt.grid(); plt.show()
with the results
fmin:
-----
Optimization terminated successfully.
Current function value: 16.987937
Iterations: 49
Function evaluations: 92
minimize:
---------
fun: 16.987921401556633
hess_inv: array([[2.72893764e-08, 6.01803467e-08],
[6.01803467e-08, 1.53257534e-07]])
jac: array([ 1.29665709, -0.08849001])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 740
nit: 29
njev: 182
status: 2
success: False
x: array([8. , 4.66666667])
X: 8.000000, Y: 4.666667,
D: 9.261629, theta: 0.5280744
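As mentioned above, one could also look for a zero of the gradient directly with fsolve instead of running it through least_squares. A minimal sketch, reusing the lambdified gradient grad from the question (note that fsolve has no bounds, so the starting point matters and the result should be checked against the physical constraints afterwards):

from scipy.optimize import fsolve

sol, info, ier, msg = fsolve(lambda v: grad(v[0], v[1]), x0=[8.0, 0.5], full_output=True)
D_opt, theta_opt = sol
print(D_opt, theta_opt, msg)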
I would like to get an optimal solution for the following equation set:
x_w * 1010 + x_m * d_m = 1017
x_w + x_m = 1
my code is as follows:
from scipy.optimize import minimize
import numpy as np

def f1(p):
    x_w, x_m, d_m = p
    return (x_w*1010 + x_m*d_m) - 1017.7

def f2(p):
    x_w, x_m, d_m = p
    return x_w + x_m - 1

bounds = [(0, 1), (0, 1), (1000, 10000)]
x0 = np.array([0.5, 0.5, 1500])
res = minimize(lambda p: f1(p) + f2(p), x0=x0, bounds=bounds)
However, all I get back (res.x) are the initial values (x0).
How do I make it work? Is there a better approach? There are just these two equations for the three variables.
In general, you can't solve the equation system by minimizing f1(p) + f2(p), since the minimum of this objective is not a solution of the equation system. Instead, you have to minimize the sum of squared errors of the equations, i.e. you minimize f1(p)**2 + f2(p)**2:
minimize(lambda p: f1(p)**2 + f2(p)**2, x0=x0, bounds=bounds)
Alternatively, you could use scipy.optimize.fsolve, which unfortunately doesn't support bounds.
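Another option (a sketch, not part of the original answer) is scipy.optimize.least_squares, which minimizes the sum of squared residuals directly and, unlike fsolve, does support bounds:

import numpy as np
from scipy.optimize import least_squares

# reuse f1 and f2 from the question; the residual vector is [f1(p), f2(p)]
res = least_squares(lambda p: [f1(p), f2(p)], x0=np.array([0.5, 0.5, 1500]),
                    bounds=([0, 0, 1000], [1, 1, 10000]))
print(res.x)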
I'm looking for a way to analyze two cubic splines and find the point where they come the closest to each other. I've seen a lot of solutions and posts but I've been unable to implement the methods suggested. I know that the closest point will be one of the end-points of the two curves or a point where the first derivative of both curves is equal. Checking the end points is easy. Finding the points where the first derivatives match is hard.
Given:
Curve 0 is B(t) (red)
Curve 1 is C(s) (blue)
A candidate for closest point is where:
B'(t) = C'(s)
The first derivative of each curve is the quadratic B'(t) = 3*(1-t)^2*a + 6*(1-t)*t*b + 3*t^2*c, where the a, b, c coefficients are formed from the control points of the curves:
a=P1-P0
b=P2-P1
c=P3-P2
Taking the 4 control points for each cubic spline, I can get each curve's parametric sections into a matrix form that can be expressed with NumPy using the following Python code:
import numpy as np

def test_closest_points():
    # Control points for the two cubic splines.
    spline_0 = [(1,28), (58,93), (113,95), (239,32)]
    spline_1 = [(58, 241), (26,76), (225,83), (211,205)]
    first_derivative_matrix = np.array([[3, -6, 3], [-6, 6, 0], [3, 0, 0]])

    # a, b, c coefficients: differences of consecutive control points
    spline_0_x_A = spline_0[1][0] - spline_0[0][0]
    spline_0_x_B = spline_0[2][0] - spline_0[1][0]
    spline_0_x_C = spline_0[3][0] - spline_0[2][0]
    spline_0_y_A = spline_0[1][1] - spline_0[0][1]
    spline_0_y_B = spline_0[2][1] - spline_0[1][1]
    spline_0_y_C = spline_0[3][1] - spline_0[2][1]
    spline_1_x_A = spline_1[1][0] - spline_1[0][0]
    spline_1_x_B = spline_1[2][0] - spline_1[1][0]
    spline_1_x_C = spline_1[3][0] - spline_1[2][0]
    spline_1_y_A = spline_1[1][1] - spline_1[0][1]
    spline_1_y_B = spline_1[2][1] - spline_1[1][1]
    spline_1_y_C = spline_1[3][1] - spline_1[2][1]

    spline_0_first_derivative_x_coefficients = np.array([[spline_0_x_A], [spline_0_x_B], [spline_0_x_C]])
    spline_0_first_derivative_y_coefficients = np.array([[spline_0_y_A], [spline_0_y_B], [spline_0_y_C]])
    spline_1_first_derivative_x_coefficients = np.array([[spline_1_x_A], [spline_1_x_B], [spline_1_x_C]])
    spline_1_first_derivative_y_coefficients = np.array([[spline_1_y_A], [spline_1_y_B], [spline_1_y_C]])

    # Show all the matrix values
    print('first_derivative_matrix:')
    print(first_derivative_matrix)
    print()
    print('spline_0_first_derivative_x_coefficients:')
    print(spline_0_first_derivative_x_coefficients)
    print()
    print('spline_0_first_derivative_y_coefficients:')
    print(spline_0_first_derivative_y_coefficients)
    print()
    print('spline_1_first_derivative_x_coefficients:')
    print(spline_1_first_derivative_x_coefficients)
    print()
    print('spline_1_first_derivative_y_coefficients:')
    print(spline_1_first_derivative_y_coefficients)
    print()

    # Now taking B(t) as spline_0 and C(s) as spline_1, I need to find the
    # values of t and s where B'(t) = C'(s)
This post has some good high-level advice but I'm unsure how to implement a solution in python that can find the correct values for t and s that have matching first derivatives (slopes). The B'(t) - C'(s) = 0 problem seems like a matter of finding roots. Any advice on how to do it with python and Numpy would be greatly appreciated.
Using NumPy suggests that the problem should be solved numerically. Without loss of generality we can assume that 0 < s <= 1 and 0 < t <= 1. You can use the SciPy package to solve the problem numerically, e.g.
from scipy.optimize import minimize
import numpy as np

def B(t):
    """Assumed for simplicity: 0 < t <= 1"""
    return np.sin(6.28 * t), np.cos(6.28 * t)

def C(s):
    """0 < s <= 1"""
    return 10 + np.sin(3.14 * s), 10 + np.cos(3.14 * s)

def Q(x):
    """Squared distance between B(x[0]) and C(x[1]), to be minimized"""
    b = B(x[0])
    c = C(x[1])
    return (b[0] - c[0]) ** 2 + (b[1] - c[1]) ** 2

res = minimize(Q, (0.5, 0.5))
print("B-Point: ", B(res.x[0]))
print("C-Point: ", C(res.x[1]))
B-Point: (0.7071067518175205, 0.7071068105555733)
C-Point: (9.292893243165555, 9.29289319446135)
This is an example for two circles (a circle and an arc); the same approach will likely work with splines.
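For completeness, here is a minimal sketch of the same idea applied to the cubic Bézier curves from the question; the bezier helper and the bounds keeping t and s inside [0, 1] are my additions, not part of the original answer:

import numpy as np
from scipy.optimize import minimize

def bezier(points, t):
    # cubic Bezier curve defined by four control points
    p0, p1, p2, p3 = [np.asarray(p, dtype=float) for p in points]
    return ((1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1
            + 3 * (1 - t) * t**2 * p2 + t**3 * p3)

spline_0 = [(1, 28), (58, 93), (113, 95), (239, 32)]
spline_1 = [(58, 241), (26, 76), (225, 83), (211, 205)]

def Q(v):
    # squared distance between B(v[0]) and C(v[1])
    diff = bezier(spline_0, v[0]) - bezier(spline_1, v[1])
    return diff @ diff

res = minimize(Q, x0=(0.5, 0.5), bounds=[(0, 1), (0, 1)])
print(res.x, np.sqrt(res.fun))   # parameters of the closest points and their distance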
Your assumption that B'(t) = C'(s) is too strong.
Derivatives have direction and magnitude: the directions must coincide at the candidate points, but the magnitudes may differ.
To find the points with matching slopes and the closest distance, you can solve the following equation system (of high degree, unfortunately):
yb'(t) * xc'(u) - yc'(u) * xb'(t) = 0   //vector product of (anti)collinear vectors is zero
((xb(t) - xc(u))^2 + (yb(t) - yc(u))^2)' = 0   //derivative of the squared distance
You can also use the function fmin:
import numpy as np
import matplotlib.pylab as plt
from scipy.optimize import fmin

def BCubic(t, P0, P1, P2, P3):
    a = P1 - P0
    b = P2 - P1
    c = P3 - P2
    return a*3*(1-t)**2 + b*6*(1-t)*t + c*3*t**2

def B(t):
    return BCubic(t, 4, 2, 3, 1)

def C(t):
    return BCubic(t, 1, 4, 3, 4)

def f(t):
    # L1 or Manhattan distance
    return abs(B(t) - C(t))

init = 0 # 2
tmin = fmin(f, np.array([init]))
#Optimization terminated successfully.
#         Current function value: 2.750000
#         Iterations: 23
#         Function evaluations: 46
print(tmin)
# [0.5833125]
tmin = tmin[0]

t = np.linspace(0, 2, 100)
plt.plot(t, B(t), label='B')
plt.plot(t, C(t), label='C')
plt.plot(t, abs(B(t)-C(t)), label='|B-C|')
plt.plot(tmin, B(tmin), 'r.', markersize=12, label='min')
plt.axvline(x=tmin, linestyle='--', color='k')
plt.legend()
plt.show()
I am trying to find the minimum of a natural cubic spline. I have written the following code to find the natural cubic spline. (I have been given test data and have confirmed this method is correct.) Now I can not figure out how to find the minimum of this function.
This is the data
xdata = np.linspace(0.25, 2, 8)
ydata = 10**(-12) * np.array([1,2,1,2,3,1,1,2])
This is the function
import scipy as sp
import numpy as np
import math
from numpy.linalg import inv
from scipy.optimize import fmin_slsqp
from scipy.optimize import minimize, rosen, rosen_der
def phi(x, xd, yd):
    n = len(xd)
    h = np.array(xd[1:n] - xd[0:n-1])
    f = np.divide(yd[1:n] - yd[0:(n-1)], h)
    q = [0]*(n-2)
    for i in range(n-2):
        q[i] = 3*(f[i+1] - f[i])
    A = np.zeros(((n-2), (n-2)))
    # define A for j = 0
    A[0,0] = 2*(h[0] + h[1])
    A[0,1] = h[1]
    # define A for j = n-2
    A[-1,-2] = h[-2]
    A[-1,-1] = 2*(h[-2] + h[-1])
    # define A for the rows in the middle
    for j in range(1, (n-3)):
        A[j,j-1] = h[j]
        A[j,j] = 2*(h[j] + h[j+1])
        A[j,j+1] = h[j+1]
    Ainv = inv(A)
    B = Ainv.dot(q)
    b = n*[0]
    b[1:(n-1)] = B
    # now we find a, b, c and d
    a = [0]*(n-1)
    c = [0]*(n-1)
    d = [0]*(n-1)
    s = [0]*(n-1)
    for r in range(n-1):
        a[r] = 1/(3*h[r]) * (b[r+1] - b[r])
        c[r] = f[r] - h[r]*((2*b[r] + b[r+1])/3)
        d[r] = yd[r]
    #solution 1 start
    for m in range(n-1):
        if xd[m] <= x <= xd[m+1]:
            s = a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m]
    return(s)
    #solution 1 end
I want to find the minimum on the domain of my xdata, so fmin didn't work since you cannot define bounds there. I tried both fmin_slsqp and minimize. They are not compatible with the phi function I wrote, so I rewrote phi(x, xd, yd) and added an extra variable such that phi becomes phi(x, xd, yd, m). Here m indicates in which piece of the spline we are evaluating the solution (from x_m to x_(m+1)). In the code we replaced #solution 1 with the following:
# solution 2 start
return(a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m])
# solution 2 end
To find the minimum in a domain x_m to x_(m+1) we use the following code: (we use an instance where m=0, so x from 0.25 to 0.5. The initial guess is 0.3)
fmin_slsqp(phi, x0 = 0.3, bounds=([(0.25,0.5)]), args=(xdata, ydata, 0))
What I would then do (I know it's crude) is iterate this with a for loop to find the minimum on all subdomains and then take the overall minimum. However, fmin_slsqp constantly returns the initial guess as the minimum, so something is wrong and I do not know how to fix it. If you could help me, that would be greatly appreciated. Thanks for reading this far.
When I plot your function phi and the data you feed in, I see that its range is of the order of 1e-12. However, fmin_slsqp is unable to handle that level of precision and fails to find any change in your objective.
The solution I propose is to scale the return value of your objective up by that order of magnitude, like so:
return(s*1e12)
Then you get good results.
>>> sol = fmin_slsqp(phi, x0=0.3, bounds=([(0.25, 0.5)]), args=(xdata, ydata))
>>> print(sol)
Optimization terminated successfully. (Exit mode 0)
Current function value: 1.0
Iterations: 2
Function evaluations: 6
Gradient evaluations: 2
[ 0.25]
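The loop over all subintervals that the question describes could then look like this (a sketch, assuming phi has been modified to return s*1e12 as above):

import numpy as np
from scipy.optimize import fmin_slsqp

candidates = []
for m in range(len(xdata) - 1):
    x0 = 0.5*(xdata[m] + xdata[m+1])   # start in the middle of the subinterval
    xm = fmin_slsqp(phi, x0=x0, bounds=[(xdata[m], xdata[m+1])],
                    args=(xdata, ydata), iprint=0)[0]
    candidates.append((phi(xm, xdata, ydata), xm))

best_scaled, best_x = min(candidates)   # smallest spline value and its location
print(best_x, best_scaled*1e-12)        # undo the 1e12 scaling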
I have a logarithmic function being fit to a set of points
import numpy as np
from scipy.optimize import curve_fit

def log_func(x, a, b):
    return a * np.log(x) + b

popt, pcov = curve_fit(log_func, x, yn)
This results in the plot shown in the attached "Plotted Curve" figure.
However, the system has constraints that the range should be fixed between 0 and 100. I've specifically passed points at those bounds (i.e. x = np.array([3200 ... other points ... 42000 ]) and y = np.array([0 ... other points ... 100 ])), but obviously the fitted curve does not necessarily pass through those values.
I've read that I can add bounds to the parameters (so a and b here), but is there a way to constrain the output by specifically forcing the curve through two endpoints? Or, alternatively, do I have to introduce some sort of extreme penalization to the function to obtain parameters that will force a result between 0 and 100?
You could formulate the curve fitting problem as a constrained optimization problem and use scipy.optimize.minimize to solve it. Considering a data set for which the optimal a should be positive, it follows that the requirement on the range of the fitted function is equivalent to the constraints a*np.log(3200)+b>=0 and a*np.log(42000)+b<=100.
One can proceed as follows (I used a simple data set).
import numpy as np
from scipy.optimize import minimize

x = np.array([3200, 14500, 42000])
yn = np.array([0, 78, 100])

def LS_obj(p):
    # least-squares objective: sum of squared residuals of the log fit
    a, b = p
    return ((log_func(x, a, b) - yn)**2).sum()

# range constraints: a*log(3200) + b >= 0 and a*log(42000) + b <= 100
cons = ({'type': 'ineq', 'fun': lambda p: p[0] * np.log(3200) + p[1]},
        {'type': 'ineq', 'fun': lambda p: 100 - p[0] * np.log(42000) - p[1]})

p0 = [10, -100]   # initial estimate
sol = minimize(LS_obj, p0, constraints=cons)
print(sol.x)      # optimal parameters
# [ 36.1955 -285.316 ]
The following figure compares the curve_fit and minimize solutions. As expected, the minimize solution is within the required range.
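A quick sanity check (a sketch, using sol from the snippet above) that the returned parameters respect the range constraints at the two endpoints:

a, b = sol.x
print(a * np.log(3200) + b)    # should be >= 0
print(a * np.log(42000) + b)   # should be <= 100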
I have a set of simulation data where I would like to find the lowest slope in n dimensions. The spacing of the data is constant along each dimension, but not all the same (I could change that for the sake of simplicity).
I can live with some numerical inaccuracy, especially towards the edges. I would strongly prefer not to generate a spline and use its derivative; working on the raw values would be sufficient.
It is possible to calculate the first derivative with numpy using the numpy.gradient() function.
import numpy as np
data = np.random.rand(30,50,40,20)
first_derivative = np.gradient(data)
# second_derivative = ??? <--- there be kudos (:
This is a comment regarding the Laplace operator versus the Hessian matrix; it is not a question any more, but is meant to help future readers understand the difference.
I use a 2D function as a test case to determine the 'flattest' area below a threshold. The following pictures show the difference in results between using the minimum of second_derivative_abs = np.abs(laplace(data)) and the minimum of the following:
second_derivative_abs = np.zeros(data.shape)
hess = hessian(data)
# based on the function description; would [-1] be more appropriate?
for i in hess[0]:   # calculate a norm
    for j in i[0]:
        second_derivative_abs += j*j
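An alternative way to accumulate all squared Hessian entries into a single number per grid point (my suggestion, not part of the original comment) is to sum over the two leading axes of the array returned by the hessian() function from the answer below:

hess = hessian(data)                                  # shape (ndim, ndim) + data.shape
second_derivative_abs = (hess**2).sum(axis=(0, 1))    # squared Frobenius norm at each grid point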
The color scale depicts the functions values, the arrows depict the first derivative (gradient), the red dot the point closest to zero and the red line the threshold.
The generator function for the data was ( 1-np.exp(-10*xi**2 - yi**2) )/100.0 with xi, yi being generated with np.meshgrid.
Laplace: (figure)
Hessian: (figure)
The second derivatives are given by the Hessian matrix. Here is a Python implementation for ND arrays that consists in applying np.gradient twice and storing the output appropriately:
import numpy as np

def hessian(x):
    """
    Calculate the hessian matrix with finite differences.
    Parameters:
        - x : ndarray
    Returns:
        an array of shape (x.ndim, x.ndim) + x.shape
        where array[i, j, ...] corresponds to the second derivative x_ij
    """
    x_grad = np.gradient(x)
    hessian = np.empty((x.ndim, x.ndim) + x.shape, dtype=x.dtype)
    for k, grad_k in enumerate(x_grad):
        # iterate over dimensions and
        # apply gradient again to every component of the first derivative
        tmp_grad = np.gradient(grad_k)
        for l, grad_kl in enumerate(tmp_grad):
            hessian[k, l, ...] = grad_kl
    return hessian

x = np.random.randn(100, 100, 100)
hessian(x)
Note that if you are only interested in the magnitude of the second derivatives, you could use the Laplace operator implemented by scipy.ndimage.filters.laplace, which is the trace (sum of diagonal elements) of the Hessian matrix.
Taking the smallest element of the Hessian matrix could be used to estimate the lowest slope in any spatial direction.
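A minimal sketch of that Laplace route on random data, just to show the call (the argmin-based "flattest point" criterion is my illustration, not part of the original answer):

import numpy as np
from scipy.ndimage import laplace

data = np.random.rand(30, 50, 40, 20)
lap = laplace(data)                  # trace of the Hessian: sum of the pure second derivatives
flattest = np.unravel_index(np.argmin(np.abs(lap)), data.shape)
print(flattest)                      # grid point with the smallest absolute Laplacian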
Slopes, Hessians and Laplacians are related, but are 3 different things.
Start with 2d: a function( x, y ) of 2 variables, e.g. a height map of a range of hills,
slopes aka gradients are direction vectors, a direction and length at each point x y.
This can be given by 2 numbers dx dy in cartesian coordinates,
or an angle θ and length sqrt( dx^2 + dy^2 ) in polar coordinates.
Over a whole range of hills, we get a vector field.
Hessians describe curvature near x y, e.g. a paraboloid or a saddle,
with 4 numbers: dxx dxy dyx dyy.
a Laplacian is 1 number, dxx + dyy, at each point x y.
Over a range of hills, we get a scalar field.
(Functions or hills with Laplacian = 0 are particularly smooth.)
Slopes are linear fits and Hessians quadratic fits, for tiny steps h near a point xy:
f(xy + h) ~ f(xy)
+ slope . h -- dot product, linear in both slope and h
+ h' H h / 2 -- quadratic in h
Here xy, slope and h are vectors of 2 numbers, and H is a matrix of 4 numbers dxx dxy dyx dyy.
N-d is similar: slopes are direction vectors of N numbers,
Hessians are matrices of N^2 numbers, and Laplacians 1 number, at each point.
(You might find better answers over on math.stackexchange.)
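To make the expansion above concrete, here is a small numerical check (a sketch; the example function, point and step are mine):

import numpy as np

f = lambda v: np.sin(v[0]) * np.cos(v[1])
xy = np.array([0.7, -0.3])
h = np.array([1e-2, -2e-2])

# analytic slope (gradient) and Hessian of f at xy
slope = np.array([np.cos(xy[0]) * np.cos(xy[1]),
                  -np.sin(xy[0]) * np.sin(xy[1])])
H = np.array([[-np.sin(xy[0]) * np.cos(xy[1]), -np.cos(xy[0]) * np.sin(xy[1])],
              [-np.cos(xy[0]) * np.sin(xy[1]), -np.sin(xy[0]) * np.cos(xy[1])]])

taylor = f(xy) + slope @ h + 0.5 * h @ H @ h
print(f(xy + h), taylor)   # the two values agree to about |h|^3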
You can see the Hessian matrix as a gradient of the gradient: you apply the gradient a second time to each component of the first gradient. The Wikipedia page defining the Hessian matrix makes it clear that it is a gradient of the gradient. Here is a Python implementation defining the gradient and then the Hessian:
import numpy as np

# Gradient function (forward finite differences)
def gradient_f(x, f):
    assert (x.shape[0] >= x.shape[1]), "the vector should be a column vector"
    x = x.astype(float)
    N = x.shape[0]
    gradient = []
    for i in range(N):
        eps = abs(x[i]) * np.finfo(np.float32).eps
        xx0 = 1. * x[i]
        f0 = f(x)
        x[i] = x[i] + eps
        f1 = f(x)
        gradient.append(np.asarray(f1 - f0).item() / eps)   # .item() replaces the removed np.asscalar
        x[i] = xx0
    return np.array(gradient).reshape(x.shape)

# Hessian matrix (finite differences of the gradient)
def hessian(x, the_func):
    N = x.shape[0]
    hessian = np.zeros((N, N))
    gd_0 = gradient_f(x, the_func)
    eps = np.linalg.norm(gd_0) * np.finfo(np.float32).eps
    for i in range(N):
        xx0 = 1. * x[i]
        x[i] = xx0 + eps
        gd_1 = gradient_f(x, the_func)
        hessian[:, i] = ((gd_1 - gd_0) / eps).reshape(x.shape[0])
        x[i] = xx0
    return hessian
As a test, the Hessian matrix of (x^2 + y^2) is 2 * I_2 where I_2 is the identity matrix of dimension 2
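A quick way to run that test (a sketch, using the two functions above; note that x has to be passed as a column vector):

func = lambda v: v[0, 0]**2 + v[1, 0]**2    # f(x, y) = x^2 + y^2
x = np.array([[1.0], [1.0]])                # column vector, as required by gradient_f
print(hessian(x, func))                     # close to 2 * I_2, up to finite-difference error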
hessians = np.asarray(np.gradient(np.gradient(f(X, Y))))
hessians[1:]  # the leading entry differentiates along the stacking axis; the remaining entries approximate the Hessian components
This worked for a 3-d function f.