How to implement the following formula for derivatives in python?

How to implement the following formula for derivatives in python? - python

I'm trying to implement the following formula in python for X and Y points
I have tried following approach
def f(c):
"""This function computes the curvature of the leaf."""
tt = c
n = (tt[0]*tt[3] - tt[1]*tt[2])
d = (tt[0]**2 + tt[1]**2)
k = n/d
R = 1/k # Radius of Curvature
return R
There is something incorrect as it is not giving me correct result. I think I'm making some mistake while computing derivatives in first two lines. How can I fix that?
Here are some of the points which are in a data frame:
pts = pd.DataFrame({'x': x, 'y': y})
x y
0.089631 97.710199
0.089831 97.904541
0.090030 98.099313
0.090229 98.294513
0.090428 98.490142
0.090627 98.686200
0.090827 98.882687
0.091026 99.079602
0.091225 99.276947
0.091424 99.474720
0.091623 99.672922
0.091822 99.871553
0.092022 100.070613
0.092221 100.270102
0.092420 100.470020
0.092619 100.670366
0.092818 100.871142
0.093017 101.072346
0.093217 101.273979
0.093416 101.476041
0.093615 101.678532
0.093814 101.881451
0.094013 102.084800
0.094213 102.288577
pts_x = np.gradient(x_c, t) # first derivatives
pts_y = np.gradient(y_c, t)
pts_xx = np.gradient(pts_x, t) # second derivatives
pts_yy = np.gradient(pts_y, t)
After getting the derivatives I am putting the derivatives x_prim, x_prim_prim, y_prim, y_prim_prim in another dataframe using the following code:
d = pd.DataFrame({'x_prim': pts_x, 'y_prim': pts_y, 'x_prim_prim': pts_xx, 'y_prim_prim':pts_yy})
after having everything in the data frame I am calling function for each row of the data frame to get curvature at that point using following code:
# Getting the curvature at each point
for i in range(len(d)):
temp = d.iloc[i]
c_temp = f(temp)
curv.append(c_temp)

You do not specify exactly what the structure of the parameter pts is. But it seems that it is a two-dimensional array where each row has two values x and y and the rows are the points in your curve. That itself is problematic, since the documentation is not quite clear on what exactly is returned in such a case.
But you clearly are not getting the derivatives of x or y. If you supply only one array to np.gradient then numpy assumes that the points are evenly spaced with a distance of one. But that is probably not the case. The meaning of x' in your formula is the derivative of x with respect to t, the parameter variable for the curve (which is separate from the parameters to the computer functions). But you never supply the values of t to numpy. The values of t must be the second parameter passed to the gradient function.
So to get your derivatives, split the x, y, and t values into separate one-dimensional arrays--lets call them x and y and t. Then get your first and second derivatives with
pts_x = np.gradient(x, t) # first derivatives
pts_y = np.gradient(y, t)
pts_xx = np.gradient(pts_x, t) # second derivatives
pts_yy = np.gradient(pts_y, t)
Then continue from there. You no longer need the t values to calculate the curvatures, which is the point of the formula you are using. Note that gradient is not really designed to calculate the second derivatives, and it absolutely should not be used to calculate third or higher-order derivatives. More complex formulas are needed for those. Numpy's gradient uses "second order accurate central differences" which are pretty good for the first derivative, poor for the second derivative, and worthless for higher-order derivatives.

I think your problem is that x and y are arrays of double values.
The array x is the independent variable; I'd expect it to be sorted into ascending order. If I evaluate y[i], I expect to get the value of the curve at x[i].
When you call that numpy function you get an array of derivative values that are the same shape as the (x, y) arrays. If there are n pairs from (x, y), then
y'[i] gives the value of the first derivative of y w.r.t. x at x[i];
y''[i] gives the value of the second derivative of y w.r.t. x at x[i].
The curvature k will also be an array with n points:
k[i] = abs(x'[i]*y''[i] -y'[i]*x''[i])/(x'[i]**2 + y'[i]**2)**1.5
Think of x and y as both being functions of a parameter t. x' = dx/dt, etc. This means curvature k is also a function of that parameter t.
I like to have a well understood closed form solution available when I program a solution.
y(x) = sin(x) for 0 <= x <= pi
y'(x) = cos(x)
y''(x) = -sin(x)
k = sin(x)/(1+(cos(x))**2)**1.5
Now you have a nice formula for curvature as a function of x.
If you want to parameterize it, use
x(t) = pi*t for 0 <= t <= 1
x'(t) = pi
x''(t) = 0
See if you can plot those and make your Python solution match it.

Related

Mixed partial dervative w.r.t. tensor in Pytorch

Question:
Is there any working method to calculate gradient of (non-scalar) tensor function?
Example
Given n by n symmetric matrices X, Y and matrix function Z(X, Y) = torch.mm(X.mm(X), Y) calculate d(dZ/dX)/dY.
Expected answer
d(dZ/dX)/dY = d(2*XY)/dY = 2*X
Attempts
Because torch's .backward() works only for scalar variables I've tried to calculate derivative by applying torch.autograd.grad() to each element of tensor Z, but this approach is not correct, because it gives d(X^2)/dX = X + 2*D where D is a diagonal matrix with diagonal values of X. For me it's a bit weird that torch has an ability to build a computational graph, but can't track tensor through it as a variable to get tensor derivative.
Edit
Question was not very clear, so I decided to give more details.
My aim is to get partial derivative of loss function, which involves two matrices as variables. It looks like that:
loss = torch.linalg.norm(my_formula(X, Y) , ord='fro')
And I need to find
d^2(loss)/d(Y^2)
d/dX[d(loss)/dY]
Torch is capable of calculating 1. by using .backward() two times, but it's problematic to find 2. because torch.autograd.grad() expects scalar input and not the tensor

TL;DR
For function f which takes a matrix and gives a scalar:
Find first order derivative, let's name it dX
Take trace: Tr(dX)
To get mixed partial derivative just use the trace from above: d/dY[df/dX] = d/dY[Tr(df/dX)]
Intro
At the moment of posting the question I was not really that good at theory of matrix derivatives, but now I know much more all thanks to this Yandex ml book (unfortunately, I didn't find the english equivalent). This is an attempt to give a full answer to my question.
Basic Theory
Forgive me, Lord, for ugly representation of latex
Let's say you have a function which takes matrix X and returns it's squared Frobenius norm: f(X) = ||X||_F^2
It is a well-known fact that: ||X||_F^2 = Tr(X X^T)
Let's define derivative as shown in same book: D[f] at X_0 = f(X + H) - f(X)
We are ready to find dg(X)/dX:
df(X)/dX = dTr(X X^T)/dX =
(using Trace's feature)
= Tr(d/dX[X X^T]) = Tr(dX/dX X^T + X d[X^T]/dX ) =
(then we should use the definition of derivative from above)
= Tr(HX^T + XH^T) = Tr(HX^T) + Tr(XH^T) =
(now the main trick is to get all matrices H on the right side and get something like
Tr(g(X) H) or Tr(g(X) H^T), where g(X) will be the derivative we are looking for)
= Tr(HX^T) + Tr(XH^T) = Tr(XH^T) + Tr(XH^T) = Tr(2*XH^T)
That means: df(X)/dX = 2X
Second order derivative
Now, after we found out how to get matrix derivatives, let's try to find second order derivative of the same function f(X):
d/dX[df(X)/dX] = d/dX[Tr(2XH_1^T)] = Tr(d/dX[2XH_1^T]) =
= Tr(2I H_2 H_1^T)
We found out that d/dX[df(X)/dX] = 2I where I stands for Identity matrix. But how will it help us to find derivatives in Pytorch?
Trace is the trick
As we can see from the formulas, both first and second order derivatives have Trace inside them, but when we take first order derivative we just instantly get matrix as a result. To get a higher order derivative we just need to take the derivative of trace of first order derivative:
d/dY[df/dX] = d/dY[Tr(df/dX)]
The thing is I was using JAX autograd library when this trick came to my mind, so the code with a function f(X,Y) will look like this:
def scalarized_dy(X, Y):
dY = grad(f, argnums=1)(X, Y)
return jnp.trace(dY)
dYX = grad(scalarized_dy, argnums=0)(X, Y)
dYY = grad(scalarized_dy, argnums=1)(X, Y)
In case of Pytorch I guess we will need to look after tensors' gradients (let loss be a function with X and Y as arguments):
loss = f(X, Y)
loss.backward(create_graph = True)
dX = torch.trace(X.grad)
dX.backward()
dXX = X.grad
dXY = Y.grad
Epilogue
I thought that the question itself is in some way interesting. Also, it took me several months to figure things out, so I decided to give my current point of view on this problem. I will not mark my answer as correct yet in hope that I will get some kind of feedback or, perhaps, even better answers or ideas.

Using a forloop to solve coupled differential equations in python

I am trying to solve a set of differential equations, but I have been having difficulty making this work. My differential equations contain an "i" subscript that represents numbers from 1 to n. I tried implementing a forloop as follows, but I have been getting this index error (the error message is below). I have tried changing the initial conditions (y0) and other values, but nothing seems to work. In this code, I am using solve_ivp. The code is as follows:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.integrate import solve_ivp
def testmodel(t, y):
X = y[0]
Y = y[1]
J = y[2]
Q = y[3]
a = 3
S = 0.4
K = 0.8
L = 2.3
n = 100
for i in range(1,n+1):
dXdt[i] = K**a+(Q[i]**a) - S*X[i]
dYdt[i] = (K*X[i])-(L*Y[i])
dJdt[i] = S*Y[i]-(K*Q[i])
dQdt[i] = K*X[i]/L+J[i]
return dXdt, dYdt, dJdt, dQdt
t_span= np.array([0, 120])
times = np.linspace(t_span[0], t_span[1], 1000)
y0 = 0,0,0,0
soln = solve_ivp(testmodel, t_span, y0, t_eval=times,
vectorized=True)
t = soln.t
X = soln.y[0]
Y = soln.y[1]
J = soln.y[2]
Q = soln.y[3]
plt.plot(t, X,linewidth=2, color='red')
plt.show()
The error I get is
IndexError Traceback (most recent call last)
<ipython-input-107-3a0cfa6e42ed> in testmodel(t, y)
15 n = 100
16 for i in range(1,n+1):
--> 17 dXdt[i] = K**a+(Q[i]**a) - S*X[i]
IndexError: index 1 is out of bounds for axis 0 with size 1
I have scattered the web for a solution to this, but I have been unable to apply any solution to this problem. I am not sure what I am doing wrong and what to actually change.
I have tried to remove the "vectorized=True" argument, but then I get an error that states I cannot index scalar variables. This is confusing because I do not think these values should be scalar. How do I resolve this problem, my ultimate goal is to plot these differential equations. Thank you in advance.

It is nice that you provide the standard solver with a vectorized ODE function for multi-point evalutions. But the default method is the explicit RK45, and explicit methods do not use Jacobi matrices. So there is no need for multi-point evaluations for difference quotients for the partial derivatives.
In essence, the coordinate arrays always have size 1, as the evaluation is at a single point, so for instance Q is an array of length 1, the only valid index is 0. Remember, in all "true" programming languages, array indices start at 0. It is only some CAS script languages that use the "more mathematical" 1 as index start. (Setting n=100 and ignoring the length of the arrays provided by the solver is wrong as well.)
You can avoid all that and shorten your routine by taking into account that the standard arithmetic operations are applied element-wise for numpy arrays, so
def testmodel(t, y):
X,Y,J,Q = y
a = 3; S = 0.4; K = 0.8; L = 2.3
dXdt = K**a + Q**a - S*X
dYdt = K*X - L*Y
dJdt = S*Y - K*Q
dQdt = K*X/L + J
return dXdt, dYdt, dJdt, dQdt
Modifying your code for multiple compartments with the same dynamic
You need to pass the solver a flat vector of the state. The first design decision is how the compartments and their components are arranged in the flat vector. One variant that is most compatible with the existing code is to cluster the same components together. Then in the ODE function the first operation is to separate out these clusters.
X,Y,J,Q = y.reshape([4,-1])
This splits the input vector into 4 pieces of equal length. At the end you need to reverse this split so that the derivatives are again in a flat vector.
return np.concatenate([dXdt, dYdt, dJdt, dQdt])
Everything else remains the same. Apart from the initial vector, which needs to have 4 segments of length N containing the data for the compartments. Here that could just be
y0 = np.zeros(4*N)
If the initial data is from any other source, and given in records per compartment, you might have to transpose the resulting array before flattening it.
Note that this construction is not vectorized, so leave that option unset in its default False.
For uniform interaction patterns like in a circle I recommend the use of numpy.roll to continue to avoid the use of explicit loops. For an interaction pattern that looks like a network one can use connectivity matrices and masks like in Using python built-in functions for coupled ODEs

Calculating wind gradient du_dx, dv_dy using np.gradient

I am trying to calculate the wind gradient given u-wind and v-wind. The u and v values have a 3d-array with the following shape:
u(122,9,9) such that u(time,latitude,longitude). The same applies for v.
I have also calculated the dx and dy values (in 2-d array for both lat and lon direction)
The sample of my code is as below at time 0 for example:
dudx = np.gradient(u[0,0,:], dx[0,0], edge_order=2)
dvdy = np.gradient(v[0,:,0], dy[0,0], edge_order=2)
I can then sum dudx and dvdy to get the gradient. I have a data that has already calculated the divergence, and upon comparing my calculation with the divergence data, i expected the values to be the same, but they're not. I can't seem to figure out where i went wrong besides using the np.gradient function incorrectly.
I would like to know if my methods above to calculate the gradient of u and v winds are correct.
Cheers.
Edit
The full code i am using to calculate the wind gradient is as below:
dqu_dx = np.zeros((122,9,9))
dqv_dy = np.zeros((122,9,9))
for i in range(122):
for j in range(9):
for k in range(9):
dqu_dx[i,j,:] = np.gradient(dqu_18hr[i,j,:], dx[0,k], edge_order=2)
dqv_dy[i,:,k] = np.gradient(dqv_18hr[i,:,k], dy[j,0], edge_order=2)

Unfortunately I can't comment on your question to ask for explanations because I don't have enough reputation, so I am forced to make some assumptions. Feel free to correct me if I am wrong.
I will assume that dqu_18hr and dqv_18hr are arrays storing the value of two different functions, u(t, y, x) and v(t, y, x). If I understand correctly you want to calculate du/dx and dv/dy.
I don't know what are the dx and dy values that you store in the arrays, also because you define them as 2D-arrays but use them as 1D-arrays. I will assume that dx and dy are coordinates of the points at which you computed u and v, and that the grid they produce is regular.
A first problem with your code is that you are passing a single scalar number as the second argument of np.gradient. When this is done, numpy assumes that this is the distance between points. However, this distance changes at every iteration. I can think of a quite convoluted case for which the definition of dx is such that this gives the correct result, but generally this is a mistake.
Another problem with the code is that it doesn't take advantage of numpy vectorization, using explicitly three for loops. This is extremely inefficient computationally.
I would suggest you the following code:
x = dx[0, :] # or whatever is the correct definition
y = dy[:, 0] # not enough info in the post to understand it
a = np.gradient(dqu_18hr, x, axis=2, edge_order=2)
b = np.gradient(dqv_18hr, y, axis=1, edge_order=2)
Please also notice that in your code x is associated to the axis 2 and y to the axis 1, which is absolutely legit but unusual so you might want to check if that's a mistake.

Derivatives blow up in python

I am trying to find higher order derivatives of a dataset (x,y). x and y are 1D arrays of length N.
Let's say I generate them as :
xder0=np.linspace(0,10,1000)
yder0=np.sin(xder0)
I define the derivative function which takes in 2 array (x,y) and returns (x1, y1) where y1 is the derivative calculated at each index as : (y[i+1]-y[i])/(x[i+1]-x[i]). x1 is just the mean of x[i+1] and x[i]
Here is the function that does it:
def deriv(x,y):
delx =np.zeros((len(x)-1), dtype=np.longdouble)
ydiff=np.zeros((len(x)-1), dtype=np.longdouble)
for i in range(len(x)-1):
delx[i] =(x[i+1]+x[i])/2.0
ydiff[i] =(y[i+1]-y[i])/(x[i+1]-x[i])
return delx, ydiff
Now to calculate the first derivative, I call this function as:
xder1, yder1 = deriv(xder0, yder0)
Similarly for second derivative, I call this function giving first derivatives as input:
xder2, yder2 = deriv(xder1, yder1)
And it goes on:
xder3, yder3 = deriv(xder2, yder2)
xder4, yder4 = deriv(xder3, yder3)
xder5, yder5 = deriv(xder4, yder4)
xder6, yder6 = deriv(xder5, yder5)
xder7, yder7 = deriv(xder6, yder6)
xder8, yder8 = deriv(xder7, yder7)
xder9, yder9 = deriv(xder8, yder8)
Something peculiar happens after I reach order 7. The 7th order becomes very noisy! Earlier derivatives are all either sine or cos functions as expected. However 7th order is a noisy sine. And hence all derivatives after that blow up.
Any idea what is going on?

This is a well known stability issue with numerical interpolation using equally-spaced points. Read the answers at http://math.stackexchange.com.
To overcome this problem you have to use non-equally-spaced points, like the roots of Lagendre polynomial. The instability occurs due to the unavailability of information at the boundaries, thus more concentration of points at the boundaries is required, as per the roots of say Lagendre polynomials or others with similar properties such as Chebyshev polynomial.

Reverse output of polyfit numpy

I have used numpy's polyfit and obtained a very good fit (using a 7th order polynomial) for two arrays, x and y. My relationship is thus;
y(x) = p[0]* x^7 + p[1]*x^6 + p[2]*x^5 + p[3]*x^4 + p[4]*x^3 + p[5]*x^2 + p[6]*x^1 + p[7]
where p is the polynomial array output by polyfit.
Is there a way to reverse this method easily, so I have a solution in the form of,
x(y) = p[0]*y^n + p[1]*y^n-1 + .... + p[n]*y^0

No there is no easy way in general. Closed form-solutions for arbitrary polynomials are not available for polynomials of the seventh order.
Doing the fit in the reverse direction is possible, but only on monotonically varying regions of the original polynomial. If the original polynomial has minima or maxima on the domain you are interested in, then even though y is a function of x, x cannot be a function of y because there is no 1-to-1 relation between them.
If you are (i) OK with redoing the fitting procedure, and (ii) OK with working piecewise on single monotonic regions of your fit at a time, then you could do something like this:
-
import numpy as np
# generate a random coefficient vector a
degree = 1
a = 2 * np.random.random(degree+1) - 1
# an assumed true polynomial y(x)
def y_of_x(x, coeff_vector):
"""
Evaluate a polynomial with coeff_vector and degree len(coeff_vector)-1 using Horner's method.
Coefficients are ordered by increasing degree, from the constant term at coeff_vector[0],
to the linear term at coeff_vector[1], to the n-th degree term at coeff_vector[n]
"""
coeff_rev = coeff_vector[::-1]
b = 0
for a in coeff_rev:
b = b * x + a
return b
# generate some data
my_x = np.arange(-1, 1, 0.01)
my_y = y_of_x(my_x, a)
# verify that polyfit in the "traditional" direction gives the correct result
# [::-1] b/c polyfit returns coeffs in backwards order rel. to y_of_x()
p_test = np.polyfit(my_x, my_y, deg=degree)[::-1]
print p_test, a
# fit the data using polyfit but with y as the independent var, x as the dependent var
p = np.polyfit(my_y, my_x, deg=degree)[::-1]
# define x as a function of y
def x_of_y(yy, a):
return y_of_x(yy, a)
# compare results
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(my_x, my_y, '-b', x_of_y(my_y, p), my_y, '-r')
Note: this code does not check for monotonicity but simply assumes it.
By playing around with the value of degree, you should see that see the code only works well for all random values of a when degree=1. It occasionally does OK for other degrees, but not when there are lots of minima / maxima. It never does perfectly for degree > 1 because approximating parabolas with square-root functions doesn't always work, etc.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.