Invert a 2D --> 2D non-linear function - python

I'm sorry if this is an unusual way to write a question here, since the scope of it seems quite large (to me). I'd be happy to be directed to pre-defined packages that already do what I need; indeed I hope there is (should be?) a standardised solution to my question. I'm still learning Python through a project I'm doing, and I feel I'm slightly weak on certain points...
Ok, so here goes: I would like to invert a 2D --> 2D function in Python, but none of my efforts have succeeded yet.
Let's say I have two relationships in a (non-linear) systems of equations, so
a = f(x,y)
b = g(x,y)
where both f and g are continuous and invertible, and x and y have a certain pre-defined rectangular domain. a and b also have their own rectangular domain, but it is different from that of x and y.
Some extra info on f and g: One of the functions will be linear; let's call this f. So, a = f(x,y) = q*x + p*y + r (where q, p and r are known constants). In Python terms, I guess you would write a[i, j] = q*x[i] + p*y[j] + r. The other function, g, has no analytic expression but looks similar to k*sin(x) + l*sin(y), for x and y between 0 and pi/2.
Moreover, the overall "mother-function" that I wish to make a 3D surface plot of, takes a and b as arguments. Calling the mother function M, we then have that M = M(a,b) = M(f(x,y),g(x,y)). So far so good.
The essence of the problem is that I need to first choose a pair (a,b) on the "mother-grid", then find the corresponding pair (x,y) that gives rise to this particular (a,b). f and g do not have any analytically "simple" inverses however, and I need to find these numerically.
So the basic question is: "given a[i] and b[j] as two sorted lists, and given the x[ii] and y[jj] that are used to obtain each a and b, how do I find the two inverse functions x = inv1(a,b) and y = inv2(a,b)?"
PS. I have tried the "cheap way" of circumventing this problem by first choosing an (x,y) pair, calculating a tentative (A,B) pair and then interpolating it into the pre-defined (a,b) mesh as best I could. However, since the (x,y) mesh and the (a,b) mesh are (quite) different, the corresponding "fitting error" always makes the end result come out jagged and messy (I have a control scenario where I know what the end result should look like, before doing my own cases). This is because I am essentially forcing the values of A and B onto the height of the M function at position (a,b), if that makes sense. I've tried averaging and smoothing "cheats" on this, but it is still not passable imo. Hence, I really need to choose an (a,b) pair FIRST, and only then find the relevant (x,y) pair.
Note: Some parameters in the M-function depend directly on x and y, hence the need for knowing the exact values of x and y.

Thanks for the additional information! I think you can solve this partly analytically and handle the rest numerically.
Starting with your final result (a, b):
First solve against a to find your x-y line, e.g.:
a = qx + py + r
y = (qx + r - a) / -p
Since you said it is monotonically increasing, I just solve for y for simplicity.
Next, plug this into your non-analytic g using a simple binary search across x:
def invert(a, b):
    def get_y(x):
        return (Q * x + R - a) / -P
    def g_constrained(x):
        return g(x, get_y(x))
    # scan x over [min_x, max_x] for the value where g_constrained(x) == b
    x = binary_search(g_constrained, b, min_x, max_x, guess_x)
    y = get_y(x)
    return x, y
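For reference, here is a minimal runnable sketch of the same idea, with scipy.optimize.brentq standing in for the hand-rolled binary_search; it assumes q, p, r are the known constants of f, that g is callable, and that g_constrained(x) - b changes sign somewhere on [min_x, max_x]:

from scipy import optimize

def invert_brentq(a, b, g, q, p, r, min_x, max_x):
    # y as a function of x along the line a = q*x + p*y + r
    def get_y(x):
        return (q * x + r - a) / -p
    # the root of this residual is the x where g(x, y(x)) == b
    def residual(x):
        return g(x, get_y(x)) - b
    x = optimize.brentq(residual, min_x, max_x)
    return x, get_y(x)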
Note that your system is not guaranteed to have a single solution in general: consider a level curve of g that looks like an arc; since f is a line, you can have two intersections. You will need to decide what you want to do with this information.
Previous concerns
I am suspicious of your claim that a = f(x,y) is continuous and invertible.
Put succinctly: if your function z = f(x, y) doesn't intersect the plane z = K in exactly one point for every K, it is not invertible.
For a detailed example:
Consider some point, and 4 points around it - here I use (0,0) and unit length, for convenience.
z = f(0,0)
a = f(-1,0)
b = f(1,0)
p = f(0,-1)
q = f(0,1)
If f provides a scalar value (or anything where x < y and y < z implies x < z), then we have a problem.
Since f is continuous and invertible, either a < z < b or b < z < a, and likewise for p and q. So f-inv(z + eps), for small eps, will map to two different values: one on the segment (-1, 0) -> (1, 0) and one on the segment (0, -1) -> (0, 1).

Related

Python: How can I test for a relatively straight line in a series of cubic lines?

I have a collection of curved lines, representing the third degree polynomial line of best fit for some datasets.
I want to identify the relatively flat lines and filter those plots out for further analysis.
For example, I want to filter subplots 20935, 21004, 21010, 18761 and 21037.
How can I do this, with a list of floats as input for these lines?
(using Python 3.8, NumPy, math, matplotlib in an anaconda env)
If you have a list of xs and their respective ys, you can compute the slope at each point and check whether the slope stays (approximately) constant.
threshold = 0.001  # add your precision here; zero demands a perfectly straight line

is_straight_line = True
slope = (y[1] - y[0]) / (x[1] - x[0])
for i, (xval, yval) in enumerate(zip(x[2:], y[2:])):
    # slope of the segment from the previous point (index i + 1) to this one (index i + 2)
    s = (yval - y[i + 1]) / (xval - x[i + 1])
    if abs(s - slope) > threshold:
        is_straight_line = False
        break
print(is_straight_line)
If you need the computation to be efficient, you should consider using NumPy instead, as in the sketch below.
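A vectorized sketch of the same check, assuming x and y are 1-D NumPy arrays of equal length:

import numpy as np

slopes = np.diff(y) / np.diff(x)   # slope of every consecutive segment
is_straight_line = bool(np.all(np.abs(slopes - slopes[0]) <= threshold))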
Knowledge of first-year calculus is assumed. There's a geometric property called "curvature" that basically determines how much a shape bends at a certain point (really the inverse of the radius of the osculating circle at that point).
Using the standard curvature formula for a graph, k = |y''| / (1 + y'**2)**1.5, we can develop a formula for a cubic function with coefficients [a, b, c, d] at a point x.
def cubic_curvature(a, b, c, d, x):
    # curvature of y = a*x**3 + b*x**2 + c*x + d at the point x
    k = abs(6*a*x + 2*b) / (1 + (3*a*x**2 + 2*b*x + c)**2) ** 1.5
    return k
More general algorithms can be created for any polynomial, possibly with assistance from the sympy library depending on your needs.
With this in mind, you can set some threshold for curvature that determines whether the cubic is "straight" enough given its coefficients (numpy.polyfit or scipy can give you these from a list of points) and the x-value to be evaluated at (try the median independent variable).
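A hypothetical end-to-end sketch (xs and ys stand in for one dataset; the flatness threshold is problem-specific):

import numpy as np

a, b, c, d = np.polyfit(xs, ys, deg=3)          # cubic fit coefficients
k = cubic_curvature(a, b, c, d, np.median(xs))  # curvature at the median x
is_flat = k < 0.1                               # tune this threshold to taste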

How to implement the following formula for derivatives in python?

I'm trying to implement the following formula (the curvature of a parametric curve) in Python for X and Y points:
I have tried following approach
def f(c):
    """This function computes the curvature of the leaf."""
    tt = c
    n = (tt[0]*tt[3] - tt[1]*tt[2])
    d = (tt[0]**2 + tt[1]**2)
    k = n/d
    R = 1/k  # Radius of Curvature
    return R
There is something incorrect, as it is not giving me the correct result. I think I'm making some mistake while computing the derivatives in the first two lines. How can I fix that?
Here are some of the points which are in a data frame:
pts = pd.DataFrame({'x': x, 'y': y})
x y
0.089631 97.710199
0.089831 97.904541
0.090030 98.099313
0.090229 98.294513
0.090428 98.490142
0.090627 98.686200
0.090827 98.882687
0.091026 99.079602
0.091225 99.276947
0.091424 99.474720
0.091623 99.672922
0.091822 99.871553
0.092022 100.070613
0.092221 100.270102
0.092420 100.470020
0.092619 100.670366
0.092818 100.871142
0.093017 101.072346
0.093217 101.273979
0.093416 101.476041
0.093615 101.678532
0.093814 101.881451
0.094013 102.084800
0.094213 102.288577
pts_x = np.gradient(x_c, t) # first derivatives
pts_y = np.gradient(y_c, t)
pts_xx = np.gradient(pts_x, t) # second derivatives
pts_yy = np.gradient(pts_y, t)
After getting the derivatives, I put x_prim, x_prim_prim, y_prim, y_prim_prim in another dataframe using the following code:
d = pd.DataFrame({'x_prim': pts_x, 'y_prim': pts_y, 'x_prim_prim': pts_xx, 'y_prim_prim':pts_yy})
After having everything in the data frame, I call the function for each row to get the curvature at that point, using the following code:
# Getting the curvature at each point
curv = []
for i in range(len(d)):
    temp = d.iloc[i]
    c_temp = f(temp)
    curv.append(c_temp)
You do not specify exactly what the structure of the parameter pts is. But it seems that it is a two-dimensional array where each row has two values x and y and the rows are the points in your curve. That itself is problematic, since the documentation is not quite clear on what exactly is returned in such a case.
But you clearly are not getting the derivatives of x or y. If you supply only one array to np.gradient then numpy assumes that the points are evenly spaced with a distance of one. But that is probably not the case. The meaning of x' in your formula is the derivative of x with respect to t, the parameter variable for the curve (which is separate from the parameters of the Python functions). But you never supply the values of t to numpy. The values of t must be the second parameter passed to the gradient function.
So to get your derivatives, split the x, y, and t values into separate one-dimensional arrays; let's call them x, y and t. Then get your first and second derivatives with
pts_x = np.gradient(x, t) # first derivatives
pts_y = np.gradient(y, t)
pts_xx = np.gradient(pts_x, t) # second derivatives
pts_yy = np.gradient(pts_y, t)
Then continue from there. You no longer need the t values to calculate the curvatures, which is the point of the formula you are using. Note that gradient is not really designed to calculate the second derivatives, and it absolutely should not be used to calculate third or higher-order derivatives. More complex formulas are needed for those. Numpy's gradient uses "second order accurate central differences" which are pretty good for the first derivative, poor for the second derivative, and worthless for higher-order derivatives.
I think your problem is that x and y are arrays of double values.
The array x is the independent variable; I'd expect it to be sorted into ascending order. If I evaluate y[i], I expect to get the value of the curve at x[i].
When you call that numpy function you get an array of derivative values that are the same shape as the (x, y) arrays. If there are n pairs from (x, y), then
y'[i] gives the value of the first derivative of y w.r.t. x at x[i];
y''[i] gives the value of the second derivative of y w.r.t. x at x[i].
The curvature k will also be an array with n points:
k[i] = abs(x'[i]*y''[i] - y'[i]*x''[i]) / (x'[i]**2 + y'[i]**2)**1.5
Think of x and y as both being functions of a parameter t. x' = dx/dt, etc. This means curvature k is also a function of that parameter t.
I like to have a well understood closed form solution available when I program a solution.
y(x) = sin(x) for 0 <= x <= pi
y'(x) = cos(x)
y''(x) = -sin(x)
k = sin(x)/(1+(cos(x))**2)**1.5
Now you have a nice formula for curvature as a function of x.
If you want to parameterize it, use
x(t) = pi*t for 0 <= t <= 1
x'(t) = pi
x''(t) = 0
See if you can plot those and make your Python solution match it.
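Here is a minimal sketch of that check, pairing the parametrization above with the np.gradient pipeline (accuracy degrades near the endpoints, where np.gradient falls back to one-sided differences):

import numpy as np

t = np.linspace(0, 1, 1000)
x = np.pi * t                # x(t) = pi*t
y = np.sin(x)                # y(x) = sin(x)

xp, yp = np.gradient(x, t), np.gradient(y, t)      # first derivatives
xpp, ypp = np.gradient(xp, t), np.gradient(yp, t)  # second derivatives

k_num = np.abs(xp*ypp - yp*xpp) / (xp**2 + yp**2)**1.5
k_exact = np.sin(x) / (1 + np.cos(x)**2)**1.5
print(np.abs(k_num - k_exact).max())               # should be small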

Check whether coordinates are in a certain region on a coordinate system

I have a coordinate system with a certain amount of regions, similar to this one:
The difference in my case is however, that all regions are uniquely numbered, are all of the same size and there are 16 of them (so each quadrant would have 4 slices of exactly the same size).
I also have a set of tuples (two dimensional coordinates), which are all between (-1,-1) and (1,1). I'd now like to check into which region (i.e. 1 to 16) they'd land if mapped onto the coordinate system.
As a total beginner, I have no idea on how to tackle this, but here is my approach so far:
Make all the dividing lines functions and check, for each point, whether it lies above or below them. Ignore points on the decision boundary.
For example: Quadrant 1 has four regions. From the x-axis to the y-axis (counter-clockwise) let's call them a, b, c and d.
a would be the region between the x-axis and f1(x) = 0.3333x (red)
b between f1 and f2, f2(x) = x (yellow)
c between f2 and f3, f3(x) = 3x (blue)
d between f3 and the y-axis
As code:
def a(p):
    x, y = p
    if y > 0 and y < 0.3333*x:
        return "a"
    else:
        return b(p)

def b(p):
    x, y = p
    if y > 0.3333*x and y < x:
        return "b"
    else:
        return c(p)

def c(p):
    x, y = p
    if y > x and y < 3*x:
        return "c"
    else:
        return d(p)

def d(p):
    x, y = p
    if y > 3*x and x > 0:
        return "d"
Note: for readability's sake I just wrote "x" and "y" for the tuple's respective coordinates (unpacked at the top of each function) instead of p[0] and p[1] every time. Also, as stated above, I'm assuming that no items lie directly on the dividing lines, so those are ignored.
Now, that is a possible solution, but I feel like there's almost certainly a more efficient one.
Since you're working between (-1,-1) and (1,1) coordinates and dividing the cartesian plane equally, it becomes natural to use trigonometric functions. Thinking of the unit circle, which spans 2*pi radians, you are dividing it into n equal parts (in this case n = 16), so each slice covers (2*pi)/16 = pi/8 radians. Now imagine an arbitrary point (x, y) connected to the origin (0, 0); it forms an angle with the x-axis. To find this angle you just need to calculate the arc-tangent of y/x. Then you just need to check which angular section it falls in.
Here is a sketch:
And to directly map to the interval you can use the bisect module:
import bisect
from math import atan2, pi

def find_section(x, y):
    # create the 16 angular intervals
    sections = [2 * pi * i / 16 for i in range(1, 17)]
    # find the angle of the point, measured from the positive x-axis
    angle = atan2(y, x)
    # atan2 returns values in (-pi, pi]; shift the lower half-plane to [pi, 2*pi)
    if y < 0:
        angle += 2 * pi
    # map the angle onto its section index
    return bisect.bisect_left(sections, angle)
Usage:
In [1]: find_section(0.4, 0.2)
Out[1]: 1
In [2]: find_section(0.8, 0.2)
Out[2]: 0
Shapely is a python library that can help you with typical cartesian geometry, but as far as I know it doesn't have an easy way of extending its Line objects indefinitely based on a function.
If you're ok with that, then you can check if any Point is in any Polygon using the Polygon.contains(Point) pattern, as shown here: https://shapely.readthedocs.io/en/stable/manual.html#object.contains
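As a minimal sketch of that pattern, with a hypothetical wedge standing in for one of your 16 regions (clipped to the unit square, since Shapely polygons must be bounded):

from shapely.geometry import Point, Polygon

# region "a" of quadrant 1: between the x-axis and the line y = 0.3333*x
region_a = Polygon([(0, 0), (1, 0), (1, 0.3333)])
print(region_a.contains(Point(0.5, 0.1)))  # True
print(region_a.contains(Point(0.1, 0.5)))  # False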

Minimizing a function with simplex using constraint in python

I currently have a function that takes a lot of parameters (let's call them
*lot, a, b), does a lot of stuff and returns three floats (x, y, z).
What I'm trying to do is to minimize x by varying a and b while putting constraints on a, b, y and z (a and b are limited by a min and a max value, while y and z just have a min value).
If I just explore my search area, I can see that the surface formed by all the x values is in fact a bowl, so my result is linear (or am I that wrong?). If my result is linear, I can just use a simplex to find the minimum allowed by my constraints (or am I wrong here?).
Now my problem: in Python, using optimize.minimize from scipy, it seems that I can't use constraints with simplex (except maybe min and max on a and b, but that's not the biggest issue here). So did I completely misunderstand the minimize function? Do I really need something other than simplex, or is there another solution besides the optimize function?
Here is some kind of code snippet that can emulate my function:
# coding=utf-8

def bowl(a, b):
    y = a*b
    z = a/b
    x = (a*a)/(y*y) + (b*b)/(z*z)
    return x, y, z

def surface(min_a, max_a, min_b, max_b):
    for a in range(min_a, max_a):
        buffer = list()
        for b in range(min_b, max_b):
            buffer.append(bowl(a, b))
        print(buffer)

surface(1, 10, 1, 10)
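For reference, a hedged sketch of how scipy.optimize.minimize accepts bounds and inequality constraints when a constraint-aware method such as SLSQP or COBYLA is chosen (Nelder-Mead, the simplex method, ignores general constraints); min_y and min_z are hypothetical problem-specific minimums:

from scipy import optimize

min_y, min_z = 0.5, 0.5

def objective(v):
    return bowl(v[0], v[1])[0]  # minimize x

constraints = [
    {'type': 'ineq', 'fun': lambda v: bowl(v[0], v[1])[1] - min_y},  # y >= min_y
    {'type': 'ineq', 'fun': lambda v: bowl(v[0], v[1])[2] - min_z},  # z >= min_z
]
result = optimize.minimize(objective, x0=[2.0, 2.0],
                           bounds=[(1, 10), (1, 10)],
                           constraints=constraints, method='SLSQP')
print(result.x, result.fun)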

Find intersection of A(x) and B(y) in complex plane plus corr. x and y

Suppose I have the following problem:
I have a complex function A(x) and a complex function B(y). I know these functions cross in the complex plane. I would like to find out the corresponding x and y of this intersection point, numerically ( and/or graphically). What is the most clever way of doing that?
This is my starting point:
import matplotlib.pyplot as plt
import numpy as np
from numpy import sqrt, pi
x = np.linspace(1, 10, 10000)
y = np.linspace(1, 60, 10000)
def A_(x):
    return -1/( 8/(pi*x)*sqrt(1-(1/x)**2) - 1j*(8/(pi*x**2)) )
A = np.vectorize(A_)

def B_(y):
    return 3/(1j*y*(1+1j*y))
B = np.vectorize(B_)
real_A = np.real(A(x))
imag_A = np.imag(A(x))
real_B = np.real(B(y))
imag_B = np.imag(B(y))
plt.plot(real_A, imag_A, color='blue')
plt.plot(real_B, imag_B, color='red')
plt.show()
I don't have to plot it necessarily. I just need x_intersection and y_intersection (with some error that depends on x and y).
Thanks a lot in advance!
EDIT:
I should have used different variable names. To clarify what I need:
x and y are numpy arrays, and I need the index of the intersection point in each array, plus the corresponding x and y values (which, again, are not the intersection point itself, but values of the arrays x and y).
Here I find the minimum of the distance between the two curves. Also, I cleaned up your code a bit (eg, vectorize wasn't doing anything useful).
import matplotlib.pyplot as plt
import numpy as np
from numpy import sqrt, pi
from scipy import optimize
def A(x):
    return -1/( 8/(pi*x)*sqrt(1-(1/x)**2) - 1j*(8/(pi*x**2)) )

def B(y):
    return 3/(1j*y*(1+1j*y))

# The next three lines find the intersection
def dist(x):
    return abs(A(x[0])-B(x[1]))

sln = optimize.minimize(dist, [1, 1])
# plotting everything....
a0, b0 = A(sln.x[0]), B(sln.x[1])
x = np.linspace(1, 10, 10000)
y = np.linspace(1, 60, 10000)
a, b = A(x), B(y)
plt.plot(a.real, a.imag, color='blue')
plt.plot(b.real, b.imag, color='red')
plt.plot(a0.real, a0.imag, "ob")
plt.plot(b0.real, b0.imag, "xr")
plt.show()
The specific x and y values at the intersection point are sln.x[0] and sln.x[1], since A(sln.x[0])=B(sln.x[1]). If you need the index, as you also mention in your edit, I'd use, for example, numpy.searchsorted(x, sln.x[0]), to find where the values from the fit would insert into your x and y arrays.
I think what's a bit tricky with this problem is that the space for graphing where the intersection is (ie, the complex plane) does not show the input space, but one has to optimize over the input space. It's useful for visualizing the solution, then, to plot the distance between the curves over the input space. That can be done like this:
X, Y = np.meshgrid(x, y)  # grid over the input space
data = dist((X, Y))
fig, ax = plt.subplots()
im = ax.imshow(data, cmap=plt.cm.afmhot, interpolation='none',
               extent=[min(x), max(x), min(y), max(y)], origin="lower")
cbar = fig.colorbar(im)
plt.plot(sln.x[0], sln.x[1], "xw")
plt.title("abs(A(x)-B(y))")
From this it seems much clearer how optimize.minimize is working: it just rolls down the slope to find the minimum distance, which is zero in this case. But still, there's no obvious single visualization that one can use to see the whole problem.
For other intersections one has to dig a bit more. That is, #emma asked about other roots in the comments, and there I mentioned that there's no generally reliable way to find all roots to arbitrary equations, but here's how I'd go about looking for other roots. Here I won't lay out the complete program, but just list the changes and plots as I go along.
First, it's obvious for the domain shown in my first plot that there's only one intersection, and that there are no intersections in the region to the left. The only place there could be another intersection is to the right, but for that I'll need to allow the sqrt in the def of A to take a negative argument without throwing an exception. An easy way to do this is to add 0j to the argument of the sqrt, like this: sqrt(1+0j-(1/x)**2). Then the plot with the intersection becomes
I plotted this over a broader range (x=np.linspace(-10, 10, 10000) and y=np.linspace(-400, 400, 10000)) and the above is the zoom of the only place where anything interesting is going on. This shows the intersection found above, plus the point where it looks like the two curves might touch (where the red curve, B, comes to a point nearly meeting the blue curve A going upward), so that's the new interesting thing, and the thing I'll look for.
A bit of playing around with limits, etc, show that B is coming to a point asymptotically, and the equation of B is obvious that it will go to 0 + 0j for large +/- y, so that's about all there is to say for B.
It's difficult to understand A from the above plot, so I'll look at the real and imaginary parts independently:
So it's not a crazy looking function, and the jumping between Re=const and Im=const is just the nature of sqrt(1 - x**-2), which is purely imaginary for abs(x)<1 and purely real for abs(x)>1.
It's pretty clear now that the other time the curves are equal is at y = +/- inf and x = 0. And a quick look at the equations shows that A(0) = 0+0j and B(+/- inf) = 0+0j, so this is another intersection point (though since it occurs at B(+/- inf), it's sort-of ambiguous whether it would be called an intersection).
So that's about it. One other point to mention is that if these didn't have such an easy analytic solution, like it wasn't clear what B was at inf, etc, one could also graph/minimize, etc, by looking at B(1/y), and then go from there, using the same tools as above to deal with the infinity. So using:
def dist2(x):
    return abs(A(x[0])-B(1./x[1]))
Here the min on the right is the one initially found, and the zero, now at x=-0 and 1./y=0, is the other one (which, again, isn't interesting enough to apply an optimizer to here, but it could be interesting in other equations).
Of course, it's also possible to estimate this by just finding the minimum of the data that goes into the above graph, like this:
X, Y = np.meshgrid(x, y)
data = dist((X, Y))
r = np.unravel_index(data.argmin(), data.shape)
print(x[r[1]], y[r[0]])
# 2.06306306306 1.8008008008  # the minimize approach gave 2.05973231 1.80069353
But this is only approximate (to the resolution of data) and involved many more calculations (1M compared to a few hundred). I only post this because I think it might be what the OP originally had in mind.
Briefly, two analytic solutions are derived for the roots of the problem. The first solution removes the parametric representation of x and solves for the roots directly in the (u, v) plane, where for example A(x): u(x) + i v(x) gives v(u) = f(u). The second solution uses a polar representation, e.g. A(x) is given by r(x) exp(i theta(x)), and offers a better understanding of the behavior of the square root as x passes through unity towards zero. Possible solutions occurring at the singular points are explored. Finally, a bisection root-finding algorithm is constructed as a Python iterator to invert certain solutions. Summarizing, the one real root can be found as a solution to either of the equations derived below, and gives:
x0 = -2.059732
y0 = +1.800694
A(x0) = B(y0) = -0.707131 - 0.392670 i
As in most problems there are a number of ways to proceed. One can use a "black box" and hopefully find the root they are looking for. Sometimes an answer is all that is desired, and with a little understanding of the functions this may be an adequate way forward. Unfortunately, it is often true that such an approach will provide less insight about the problem than others.
For example, algorithms find it difficult locating roots in the global space. Local roots may be found with other roots lying close by and yet undiscovered. Consequently, the question arises: "Are all the roots accounted for?" A more complete understanding of the functions, e.g. asymptotic behaviors, branch cuts, singular points, can provide the global perspective to better answer this, as well as other important questions.
So another possible solution would be building one's own "black box." A simple bisection routine might be a starting point. Robust if the root lies in the initial interval and fairly efficient. This encourages us to look at the global behavior of the functions. As the code is structured and debugged the various functions are explored, new insights are gained, and the algorithm has become a tool towards a more complete solution to the problem. Perhaps, with some patience, a closed-form solution can be found. A Python iterator is constructed and listed below implementing a bisection root finding algorithm.
Begin by putting the functions A(x) and B(x) in a more standard form:
C(x) = u(x) + i v(x)
and here the complex number i is brought out of the denominator and into the numerator, casting the problem into the form of functions of a complex variable. The new representation simplifies the original functions considerably. The real and imaginary parts are now clearly separated. An interesting graph is to plot A(x) and B(x) in the 3-dimensional space (u, v, x) and then visualize the projection into the u-v plane.
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
s = np.linspace(a, b, 1000)  # a, b: the parameter range of interest
# f, g stand for the complex-valued A and B; the third axis is the parameter itself
ax.plot(f(s).real, f(s).imag, s, color='blue')
ax.plot(g(s).real, g(s).imag, s, color='red')
# projections into the u-v plane
ax.plot(f(s).real, f(s).imag, 0, color='black')
ax.plot(g(s).real, g(s).imag, 0, color='black')
The question arises: "Can the parametric representation be replaced so that a relationship such as,
A(x): u(x) + i v(x) gives v(u) = f(u)
is obtained?" This will provide A(x) as a function v(u) = f(u) in the u-v plane. Then, if for
B(x): u(x) + i v(x) gives v(u) = g(u)
a similar relationship can be found, the solutions can be set equal to one another,
f(u) = g(u)
and the root(s) computed. In fact, it is convenient to look for a solution in the square of the above equation. The worst case is that an algorithm will have to be built to find the root, but at this point the behavior of our functions is better understood. For example, if f(u) and g(u) are polynomials of degree n then it is known that there are n roots. The best case is that a closed-form solution might be a reward for our determination.
Here is more detail to the solution. For A(x) the following is derived:
and v(u) = f(u) is just v(u) = constant. Similarly for B(x) a slightly more complex form is required:
Look at the function g(u) for B(x). It is imaginary if u > 0, but the root must be real since f(u) is real. This means that u must be less than 0, and there is both a positive and a negative real branch to the square root. The sign of f(u) then allows one to pick the negative branch as the solution for the root. So the fact that the solution must be real is determined by the sign of u, and the fact that the real root is negative specifies which branch of the square root to choose.
In the following plot both the real (u < 0) and complex (u > 0) solutions are shown.
The camera looks toward the origin in the back corner, where the red and blue curves meet. The z-axis is the magnitude of f(u) and g(u). The x and y axes are the real/complex values of u respectively. The blue curves are the real solution with (3 - |u|). The red curves are the complex solution with (3 + |u|). The two sets meet at u = 0. The black curve is f(u) equal to (-pi/8).
There is a divergence in g(u) at |u| = 3 and this is associated with x = 0. It is far removed from the solution and will not be considered further.
To obtain the roots to f = g it is easier to square f(u) and equate the two functions. When the function g(u) is squared the branches of the square root are lost, much like squaring the solutions for x**2 = 4. In the end the appropriate root will be chosen by the sign of f(u) and so this is not an issue.
So by looking at the dependence of A and B, with respect to the parametric variable x, a representation for these functions was obtained where v is a function of u and the roots found. A simpler representation can be obtained if the term involving c in the square root is ignored.
The answer gives all the roots to be found. A cubic equation has three roots (counting multiplicity), and at least one is guaranteed to be real. The other two may be complex or real. In this case the real root has been found and the other two roots are complex. Interestingly, as c changes these two complex roots may move into the real plane.
In the above figure the x-axis is u and the y axis is the evaluated cubic equation with constant c. The blue curve has c as (pi/8) squared. The red curve uses a larger and negative value for c, and has been translated upwards for purposes of demonstration. For the blue curve there is an inflection point near (0, 0.5), while the red curve has a maximum at (-0.9, 2.5) and a minimum at (0.9, -0.3).
The intersection of the cubic with the black line represents the roots given by: u**3 + c u + 3c = 0. For the blue curve there is one intersection and one real root with two roots in the complex plane. For the red curve there are three intersections, and hence 3 roots. As we change the value of the constant c (blue to red) the one real root undergoes a "pitchfork" bifurcation, and the two roots in the complex plane move into the real space.
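As a quick numerical check of that cubic, with c = (pi/8)**2 as used for the blue curve:

import numpy as np

c = (np.pi / 8) ** 2
roots = np.roots([1, 0, c, 3*c])  # coefficients of u**3 + c*u + 3*c
print(roots)                      # one real root near -0.707, plus a complex-conjugate pair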
Although the root (u_t, v_t) has been located, obtaining the value for x requires that (u, v) be inverted. In the present example this is a trivial matter, but if not, a bisection routine can be used to avoid the algebraic difficulties.
The parametric representation also leads to a solution for the real root, and it rounds out the analysis with an independent verification of the first result. Second, it answers the question about what happens at the singularity in the square root. Third, it gives a greater understanding of the multiplicity of roots.
The steps are: (1) convert A(x) and B(x) into polar form, (2) equate the modulus and argument giving two equations in two unknowns, (3) make a substitution for z = x**2. Converting A(x) to polar form:
Absolute value bars are not indicated, and it should be understood that the moduli r(x) and s(x) are positive definite as their names imply. For B(x):
The two equations in two unknowns:
Finally, the cubic solution is sketched out here where the substitution z = x**2 has been made:
The solution for z = x**2 gives x directly, which allows one to substitute into both A(x) and B(x). This is an exact solution if all terms are maintained in the cubic solution, and there is no error in x0, y0, A(x0), or B(x0). A simpler representation can be found by considering terms proportional to 1/d as small.
Before leaving the polar representation consider the two singular points where: (1) abs(x) = 1, and (2) x = 0. A complicating factor is that the arguments behave something like 1/x instead of x. It is worthwhile to look at a plot of the arctan(a) and then ask yourself how that changes if its argument is replaced by 1/a. The following graphs will then look less foreign.
Consider the polar representation of B(x). As x approaches 0 the modulus and argument tend toward infinity, i.e. the point is infinitely far from the origin and lies along the y-axis. Approaching 0 from the negative direction the point lies along the negative y-axis with varphi = (-pi/2), while approaching from the other direction the point lies along the positive y-axis with varphi = (+pi/2).
A somewhat more complicated behavior is exhibited by A(x). A(x) is even in x since the modulus is positive definite and the argument involves only x**2. There is a symmetry across the y-axis that allows us to only consider the x > 0 plane.
At x = 1 the modulus is just (pi/8), and as x continues to approach 0 so does r(x). The behavior of the argument is more complex. As x approaches unity from large positive values, the argument diverges towards +inf and so theta approaches (+pi/2). As x passes through 1 the argument becomes complex. At x equal to 0 the argument has reached its minimum value of -i. For complex arguments the arctan is given by arctan(z) = (1/(2i)) ln((1 + iz)/(1 - iz)).
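A quick sanity check of that identity (z here is just an arbitrary test value away from the branch cuts on the imaginary axis):

import cmath

z = 0.3 - 0.8j
lhs = cmath.atan(z)
rhs = (1 / 2j) * cmath.log((1 + 1j*z) / (1 - 1j*z))
print(abs(lhs - rhs))  # ~0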
The following are plots of the arguments for A(x) and B(x). The x-axis is the value of x, and the y-axis is the value of the angle in units of pi. In the first plot theta is shown in blue curves, and as x approaches 1 the angle approaches (+pi/2). Theta is real because abs(x) >= 1, and notice it is symmetric across the y-axis. The black curve is varphi and as x approaches 0 it approaches plus or minus (pi/2). Notice it is an odd function in x.
In the second plot A(x) is shown where abs(x) < 1 and the argument becomes complex. Near x = 1, theta is equal to (+pi/2), the blue curve, minus a small imaginary part, the red curve. As x approaches zero, theta is equal to (+pi/2) minus a large imaginary part. At x equal to 0 the argument is equal to -i and theta = (+pi/2) minus an infinite imaginary part, i.e. ln(0) = -inf.
The values for x0 and y0 are determined by the set of equations that equate modulus and argument of A(x) and B(x), and there are no other roots. If x0 = 0 was a root, then it would fall out of these equations. The same holds for x0 = 1. In fact, if one uses approximations in the argument of A(x) about these points, and then substitutes into the equation for the modulus, the equality cannot be maintained there.
Here is another perspective: consider the set of equations where x is assumed large and call it x_inf. The equation for the argument then gives x_inf = y_inf, where 1 is neglected with respect to x_inf squared. Upon substitution into the second equation a cubic is obtained in x_inf. Will this give the correct answer? Yes, if x0 is actually large, and in this case you might get away with it since x0 is approximately 2. The difference between the sqrt(4) and the sqrt(5) is around 10%. But does this mean that x_inf = 100 is a solution? No it does not: x_inf is only a solution if it equals x0.
The initial reason for examining the problem in the first place was to find a context for building a root-finding bisection routine as a Python iterator. This can be used to find any of the roots discussed here, and looks something like this:
class Bisection:
    def __init__(self, a, b, func, max_iter):
        self.max_iter = max_iter
        self.count_iter = 0
        self.a = a
        self.b = b
        self.func = func
        fa = func(self.a)
        fb = func(self.b)
        if fa*fb >= 0.0:
            raise ValueError

    def __iter__(self):
        self.x1 = self.a
        self.x2 = self.b
        self.xmid = self.x1 + ((self.x2 - self.x1)/2.0)
        return self

    def __next__(self):
        f1 = self.func(self.x1)
        f2 = self.func(self.x2)
        error = abs(f1 - f2)
        fmid = self.func(self.xmid)
        if fmid == 0.0:
            return self.xmid
        if f1*fmid < 0:
            self.x2 = self.xmid
        else:
            self.x1 = self.xmid
        self.xmid = self.x1 + ((self.x2 - self.x1)/2.0)
        f1 = self.func(self.x1)
        fmid = self.func(self.xmid)
        self.count_iter += 1
        if self.count_iter >= self.max_iter:
            raise StopIteration
        return self.xmid
The routine does only a minimal amount in the way of catching exceptions and was used to find x for the given solution in the u-v plane. The arguments a and b give the lower and upper brackets for the root to be found. The argument func is the function whose root is to be found; this might look like: u0 - B(x).real. The argument max_iter tells the iterator to stop after a given number of bisections has been attempted.
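A hypothetical usage sketch, with B as defined in the question and u0 the real part of the root found above (the bracket [1, 3] assumes the sign change of the residual lies inside it):

u0 = -0.707131
finder = Bisection(1.0, 3.0, lambda y: u0 - B(y).real, max_iter=60)
for ymid in finder:
    pass             # each ymid is the current midpoint; stops after max_iter bisections
print(ymid)          # ~1.800694, matching y0 above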
