An algorithm that I'm trying to implement requires finding the roots of a 10th-degree polynomial, which I created with sympy. It looks like this:
import sympy
import numpy as np

z = sympy.symbols('z')  # the polynomial's variable
det = sympy.Poly(1.3339507303385e-16*z**10 + 6.75390067076469e-14*z**9 + 7.18791227134351e-12*z**8 + 2.27504286959352e-10*z**7 + 2.37058998324426e-8*z**6 + 1.63916629439745e-6*z**5 + 3.0608041245671e-5*z**4 + 4.83564348562906e-8*z**3 + 2.0248073519853e-5*z**2 - 4.73126177166988e-7*z + 1.1495883206077e-6)
For finding the roots of the polynomial, I use the following code:
coefflist = det.coeffs()
solutions = np.roots(coefflist)
print(coefflist)
[1.33395073033850e-16, 6.75390067076469e-14, 7.18791227134351e-12, 2.27504286959352e-10, 2.37058998324426e-8, 1.63916629439745e-6, 3.06080412456710e-5, 4.83564348562906e-8, 2.02480735198530e-5, -4.73126177166988e-7, 1.14958832060770e-6]
print(solutions)
[-3.70378229e+02+0.00000000e+00j -1.18366138e+02+0.00000000e+00j
2.71097137e+01+5.77011644e+01j 2.71097137e+01-5.77011644e+01j
-3.59084863e+01+1.44819591e-02j -3.59084863e+01-1.44819591e-02j
2.60969082e-03+7.73805425e-01j 2.60969082e-03-7.73805425e-01j
1.42936329e-02+2.49877948e-01j 1.42936329e-02-2.49877948e-01j]
However, when I substitute a root for z, let's say the first one, the result is not zero but some number:
print(det.subs(z,solutions[0]))
-1.80384169514123e-6
I would have expected that the result isn't exactly the integer 0, but 1e-6 is pretty bad (it should be close to zero, right?). Is there a mistake in my code? Is this inaccuracy normal? Any thoughts/suggestions would be helpful. Is there a more accurate alternative for computing the roots of a 10th-degree polynomial?
You do not need sympy; the methods in numpy are completely sufficient. Define the polynomial by its list of coefficients and compute the roots:
p=[1.33395073033850e-16, 6.75390067076469e-14, 7.18791227134351e-12, 2.27504286959352e-10, 2.37058998324426e-8, 1.63916629439745e-6, 3.06080412456710e-5, 4.83564348562906e-8, 2.02480735198530e-5, -4.73126177166988e-7, 1.14958832060770e-6]
sol = np.roots(p); sol
giving the result
array([ -3.70378229e+02 +0.00000000e+00j, -1.18366138e+02 +0.00000000e+00j,
2.71097137e+01 +5.77011644e+01j, 2.71097137e+01 -5.77011644e+01j,
-3.59084863e+01 +1.44819592e-02j, -3.59084863e+01 -1.44819592e-02j,
2.60969082e-03 +7.73805425e-01j, 2.60969082e-03 -7.73805425e-01j,
1.42936329e-02 +2.49877948e-01j, 1.42936329e-02 -2.49877948e-01j])
and evaluate the polynomial at these approximate roots:
np.polyval(p,sol)
giving the array
array([ 2.28604877e-06 +0.00000000e+00j, 1.30435230e-10 +0.00000000e+00j,
1.05461854e-11 -7.56043461e-12j, 1.05461854e-11 +7.56043461e-12j,
-3.98439686e-14 +6.84489332e-17j, -3.98439686e-14 -6.84489332e-17j,
1.18584613e-20 +1.59976730e-21j, 1.18584613e-20 -1.59976730e-21j,
6.35274710e-22 +1.74700545e-21j, 6.35274710e-22 -1.74700545e-21j])
Obviously, evaluating a polynomial close to a root involves lots of catastrophic cancellation; that is, the intermediate terms are large and of opposite sign and cancel out, but their errors are proportional to their original sizes. To get an estimate of the combined error size, replace the polynomial coefficients and the evaluation points with their absolute values:
np.polyval(np.abs(p),np.abs(sol))
resulting in
array([ 1.81750423e+10, 8.40363409e+05,
8.08166359e+03, 8.08166359e+03,
2.44160616e+02, 2.44160616e+02,
2.50963696e-05, 2.50963696e-05,
2.65889696e-06, 2.65889696e-06])
In the case of the first root, this scale multiplied by the machine epsilon gives an error estimate of 1e+10 * 1e-16 = 1e-6, which means that the value at the root is as good as zero within the limits of double-precision floating point.
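A quick way to check this for all roots at once (a sketch, reusing the coefficient list p and roots sol from above): divide the residual at each root by this error scale; each ratio should come out as a small multiple of the machine epsilon.
import numpy as np

p = [1.33395073033850e-16, 6.75390067076469e-14, 7.18791227134351e-12,
     2.27504286959352e-10, 2.37058998324426e-8, 1.63916629439745e-6,
     3.06080412456710e-5, 4.83564348562906e-8, 2.02480735198530e-5,
     -4.73126177166988e-7, 1.14958832060770e-6]
sol = np.roots(p)

# residual at each computed root, scaled by the cancellation estimate polyval(|p|, |root|)
rel = np.abs(np.polyval(p, sol)) / np.polyval(np.abs(p), np.abs(sol))
print(rel)   # every entry is within a small multiple of np.finfo(float).eps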
I am trying to solve this system, but I get an error.
I have to set y3d = 0 because y3' = 0 in the system of equations, but when I do this the program can't solve it. If I set y3d = y[3], the program runs.
The system of equations that I have to solve is:
dy1/dx = y2
dy2/dx = -y3*y1
dy3/dx = 0
dy4/dx = y1**2 + y2**2
with boundary conditions y1(0) = y1(1) = 0 and y4(0) = 0, y4(1) = 1.
Can scipy handle this?
import numpy as np
from scipy.integrate import solve_bvp
import matplotlib.pyplot as plt
def eqts(x,y):
    y1d = y[1]
    y2d = -y[2]*y[0]
    y3d = 0
    y4d = y[0]**2 + y[1]**2
    return np.vstack((y1d,y2d,y3d,y4d))

def bc(ya,yb):
    return np.array([ya[0],yb[3],ya[0],yb[3]-1])
x = np.linspace(0,1,10)
y= np.zeros((4,x.size))
y[2,:]=1
sol=solve_bvp(eqts,bc,x,y)
Unfortunately, I get the following error message:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 10 and the array at index 2 has size 1
Well, first of all, in your script your boundary conditions are overdetermined. Nowhere is it said that y3(0) = 0 or y3(1) = 0. In fact, y3(t) is a constant, but it is not zero; if you impose the condition y3(t) = 0, things will not work at all. On top of that, this system looks non-linear (quadratic) but is actually a linear system, and you can solve it explicitly without Python. If I am not mistaken, the only way you can have a solution is when y3 > 0, which gives you
y1(t) = B * sin(k*pi*t)
y2(t) = k*pi*B * cos(k*pi*t)
y3(t) = k^2*pi^2
y4(t) = t + (k^2*pi^2 - 1) * B^2 * sin(2*k*pi*t) / (4*k*pi)
where B = sqrt( 2 / (k^2*pi^2 + 1) ), so that y4(1) = 1,
and k is an arbitrary non-zero integer
or at least something along those lines.
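A quick numerical sanity check of these formulas is possible (a sketch for k = 1, using the B given above): the boundary values should come out as y1(0) = y1(1) = 0, y4(0) = 0, y4(1) = 1, and a finite-difference derivative of y4 should match y1**2 + y2**2.
import numpy as np

k = 1
B = np.sqrt(2.0 / (k**2 * np.pi**2 + 1))
t = np.linspace(0, 1, 2001)
y1 = B * np.sin(k*np.pi*t)
y2 = k*np.pi*B * np.cos(k*np.pi*t)
y4 = t + (k**2*np.pi**2 - 1) * B**2 * np.sin(2*k*np.pi*t) / (4*k*np.pi)

print(y1[0], y1[-1], y4[0], y4[-1])   # boundary values: ~0, ~0, 0, ~1

# y4' should equal y1**2 + y2**2; compare a centered finite difference to the right-hand side
dy4 = np.gradient(y4, t)
print(np.max(np.abs(dy4[1:-1] - (y1**2 + y2**2)[1:-1])))   # small, limited only by the finite-difference step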
The main issue is here
y1d=y[1]
y2d=-y[2]*y[0]
y3d=0
y4d=y[0]**2+y[1]**2
In your implementation y1d, y2d, and y4d are all vectors, but y3d is a scalar!
You may use y3d = np.zeros_like(y1d).
solve_bvp requires the returned right-hand side to have the same shape as y; see https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_bvp.html?highlight=solve_bvp#scipy.integrate.solve_bvp
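A minimally corrected right-hand side along those lines could look like this (only the shape mismatch is fixed; the boundary conditions are a separate issue, and np is numpy as imported in your script):
def eqts(x, y):
    y1d = y[1]
    y2d = -y[2] * y[0]
    y3d = np.zeros_like(y1d)    # constant derivative, broadcast to the same length as the other components
    y4d = y[0]**2 + y[1]**2
    return np.vstack((y1d, y2d, y3d, y4d))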
In scipy's solve_bvp there is the possibility to treat constant components as parameters that are to be fitted. Thus one can define the system in dimension 3 as
def eqn(t,x,p): return x[1], -p[0]*x[0],x[0]**2+x[1]**2
def bc(x0,x1,p): return x0[0], x1[0], x0[2], x1[2]-1
t_init = np.linspace(0,1,6)
x_init = np.zeros([3,len(t_init)])
x_init[0,1]=1; # break the symmetry that gives a singular Jacobian
Another variant to get a general initial guess would be to fill it with random noise.
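For example (the noise amplitude 0.01 is an arbitrary choice):
x_init = 0.01 * np.random.randn(3, len(t_init))   # small random noise as a generic initial guess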
To get different solutions, the initial conditions need to be different; one point of influence is the square of the frequency. Setting it close to the expected values indeed gives different solutions.
for w0 in np.arange(1,10)*3:
    res = solve_bvp(eqn,bc,t_init,x_init,p=[w0**2])
    print(res.message,res.p[0])
This gives as output
The algorithm converged to the desired accuracy. 9.869869348810965
The algorithm converged to the desired accuracy. 39.47947747757308
The algorithm converged to the desired accuracy. 88.82805487260174
The algorithm converged to the desired accuracy. 88.8280548726001
The algorithm converged to the desired accuracy. 157.91618155379464
The algorithm converged to the desired accuracy. 157.91618155379678
The algorithm converged to the desired accuracy. 355.31352482772894
The algorithm converged to the desired accuracy. 355.3135248277307
The algorithm converged to the desired accuracy. 157.91618163528014
As one can see, at the higher frequencies the given initial frequency gets balanced against the lower-frequency behavior of the initial functions. This is a general problem: if not forced to stay orthogonal to the lower-frequency solutions, the solver tends towards the smoother, lower frequencies.
Adding plot commands like plt.plot(res.x, res.y[0]) shows the expected sinusoidal solutions.
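A sketch of such plots using the dense interpolant res.sol that solve_bvp returns (the grid size and label format are arbitrary choices; eqn, bc, t_init, x_init as defined above):
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_bvp

x_plot = np.linspace(0, 1, 200)
for w0 in np.arange(1, 10) * 3:
    res = solve_bvp(eqn, bc, t_init, x_init, p=[w0**2])
    if res.success:
        # component 0 of the interpolated solution is y1
        plt.plot(x_plot, res.sol(x_plot)[0], label=f"y3 = {res.p[0]:.1f}")
plt.legend()
plt.show()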
I'm trying to use numpy to find the roots of some polynomials, but I am getting some erroneous results:
>> poly = np.polynomial.Polynomial([4.383930e+00, 2.277144e+14, -7.008406e+25, -4.258004e+16])
>> roots = poly.roots()
>> roots
array([-1.64593692e+09, -1.91391398e-14, 3.26830022e-12])
>> poly(roots)
array([-3.74803539e+23, -7.99360578e-15, -1.89182003e-13])
What is up with the false root -1.64593692e+09 which results in -3.74803539e+23? This is clearly not a root.
Is this the result of floating-point errors, or something else?
And more importantly: is there a way to get around it? Perhaps something I can tweak, or a different function I can use? Any help is much appreciated.
I found this and this previous question, which seemed to be related, but after reading them and the answers/comments I don't think they are the same problem.
First of all, computing the roots of a polynomial is a classically ill-conditioned problem, meaning (roughly) that no matter what algorithm you use to solve it, small changes in the coefficients of many polynomials can lead to huge changes in their roots. That means we should be a little careful not to place an extraordinary amount of faith in root-finding results in general, and perhaps that we shouldn't be too surprised when a root finder gives weird results. There's a pretty good example on Wikipedia, Wilkinson's polynomial, that shows how things can go wrong.
In this instance, the coefficients of the polynomial of interest are of such different magnitudes that it's not surprising that the results seem poor. But consider this: if our original polynomial is p and it has a root x, then p(x) = 0, but also c*p(x) = 0 for any constant c. In other words, we can scale the coefficients without changing the roots, so what happens if we normalize the polynomial by dividing by the coefficient of largest magnitude, 7e25?
Original polynomial: p(x) = 4.4 + 2.3e+14*x - 7.0e25*x**2 - 4.3e16*x**3
Scaled polynomial: p(x) = 6.3e-26 + 3.2e-12*x - x**2 - 6.1e-10*x**3
So for this polynomial, the largest coefficient ~7e25 is so huge that the smallest coefficient ~4.4 is essentially negligible. That should give us a hint that what counts as zero in a root finding iteration isn't what we would normally consider "small."
The short answer is that the root calculated by NumPy isn't perfect, but it is an estimate of an actual root. Here's some code to convince us.
>>> import numpy as np
>>> coefs = np.array([4.383930e+00, 2.277144e+14, -7.008406e+25, -4.258004e+16])
>>> coefs_normed = coefs / np.abs(coefs).max()
>>> coefs_normed
array([ 6.25524549e-26, 3.24916108e-12, -1.00000000e+00, -6.07556697e-10])
>>> poly = np.polynomial.Polynomial(coefs)
>>> roots = poly.roots()
>>> roots
array([-1.64593692e+09, -1.91391398e-14, 3.26830022e-12])
>>> poly(roots)
array([-3.74803539e+23, 8.43769499e-14, -1.89182003e-13])
>>> poly_normed = np.polynomial.Polynomial(coefs_normed)
>>> roots_normed = poly_normed.roots()
>>> roots_normed
array([-1.64593692e+09, -1.91391398e-14, 3.26830022e-12])
>>> poly_normed(roots_normed)
array([-5.34791419e-03, 1.20534089e-39, -2.11221641e-39])
Now, -5e-03 is not very close to machine epsilon, but that should convince us that maybe the calculated root isn't quite as bad as it seemed at first.
A final point: the np.polynomial.Polynomial class has domain and window arguments that determine how it does its computations. Since polynomials get absolutely huge as their argument tends to +infinity or -infinity, it's unrealistic to expect accurate calculations for a value around 10^9.
The root appears real:
x = np.linspace(-2e9, 1000, 10000)
plt.plot(x, poly(x))
The problem is that the scale of the data is very large: -3e23 is tiny compared to, say, 6e43. The discrepancy is caused by roundoff error. Third-order polynomials have an analytical solution, but it's not going to be numerically stable when your domain is on the order of 1e9.
You can try to use the domain and window parameters to introduce some numerical stability. For example, a common choice of domain is something that envelops your entire dataset. You would have to adjust the coefficients to compensate, since those values are usually used for fitting data.
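One hand-rolled way to do that adjustment (a sketch that sidesteps the domain/window machinery): substitute x = s*u for a scale s near the magnitude of the large root, which multiplies the coefficient of degree k by s**k, find the roots in u, and map them back.
import numpy as np

coefs = np.array([4.383930e+00, 2.277144e+14, -7.008406e+25, -4.258004e+16])

s = 1e9                                        # scale chosen near the magnitude of the large root
scaled = coefs * s ** np.arange(len(coefs))    # coefficient of u**k picks up a factor s**k
u_roots = np.polynomial.Polynomial(scaled).roots()
x_roots = s * u_roots                          # map the roots back to the original variable
print(x_roots)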
When creating a line of best fit with numpy's polyfit, you can specify the parameter full to be True. This returns 4 extra values, apart from the coefficients. What do these values mean, and what do they tell me about how well the function fits my data?
https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html
What I'm doing is:
bestFit = np.polyfit(x_data, y_data, deg=1, full=True)
and I get the result:
(array([ 0.00062008, 0.00328837]), array([ 0.00323329]), 2, array([ 1.30236506, 0.55122159]), 1.1102230246251565e-15)
The documentation says that the four extra pieces of information are: residuals, rank, singular_values, and rcond.
Edit:
I am looking for a further explanation of how rcond and singular_values describe goodness of fit.
Thank you!
how rcond and singular_values describe goodness of fit
Short answer: they don't.
They do not describe how well the polynomial fits the data; that is what the residuals are for. They describe how numerically robust the computation of that polynomial was.
rcond
The value of rcond is not really about the quality of the fit; it describes the process by which the fit was obtained, namely a least-squares solution of a linear system. Most of the time the user of polyfit does not provide this parameter, so a suitable value is picked by polyfit itself. This value is then returned to the user for their information.
rcond is used for truncation in ill-conditioned matrices. The least-squares solver does two things:
1. Finds x that minimizes the norm of the residuals Ax - b.
2. If multiple x achieve this minimum, returns the x with the smallest norm among those.
The second clause occurs when some changes of x do not affect the right-hand side at all. But since floating-point computations are imperfect, what usually happens is that some changes of x affect the right-hand side very little. And this is where rcond is used to decide when "very little" should be considered "zero up to noise".
For example, consider the system
x1 = 1
x1 + 0.0000000001 * x2 = 2
This one can be solved exactly: x1 = 1 and x2 = 10000000000. But... that tiny coefficient (which, in reality, came from some matrix manipulations) has some numeric error in it; for all we know, it could be negative, or zero. Should we let it have such a huge influence on the solution?
So, in such a situation the matrix (specifically its singular values) gets truncated at level rcond. This leaves
x1 = 1
x1 = 2
for which the least-squares solution is x1 = 1.5, x2 = 0. Note that this solution is robust: no huge numbers from tiny fluctuations of coefficients.
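Here is a small numpy illustration of that truncation, with the system above written in matrix form (the value rcond=1e-8 is an arbitrary threshold chosen for the demonstration):
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1e-10]])
b = np.array([1.0, 2.0])

# no truncation beyond machine precision: the tiny coefficient drives x2 up to ~1e10
print(np.linalg.lstsq(A, b, rcond=None)[0])

# singular values below rcond * (largest singular value) are treated as zero,
# giving the robust minimum-norm solution, approximately [1.5, 0]
print(np.linalg.lstsq(A, b, rcond=1e-8)[0])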
Singular values
When one solves a linear system Ax = b in the least squares sense, the singular values of A determine how numerically tricky this is. Specifically, large disparity between largest and smallest singular values is problematic: such systems are ill-conditioned. An example is
0.835*x1 + 0.667*x2 = 0.168
0.333*x1 + 0.266*x2 = 0.067
The exact solution is (1, -1). But if the right hand side is changed from 0.067 to 0.066, the solution is (-666, 834) -- totally different. The problem is that the singular values of A are (roughly) 1 and 1e-6; this magnifies any changes on the right by the factor of 1e6.
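This is easy to reproduce numerically (a small check of the numbers above):
import numpy as np

A = np.array([[0.835, 0.667],
              [0.333, 0.266]])
print(np.linalg.solve(A, [0.168, 0.067]))    # close to ( 1, -1)
print(np.linalg.solve(A, [0.168, 0.066]))    # close to (-666, 834)
print(np.linalg.svd(A, compute_uv=False))    # the two singular values differ by about six orders of magnitude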
Unfortunately, polynomial fitting often results in ill-conditioned matrices. For example, fitting a polynomial of degree 24 to 25 equally spaced data points is inadvisable.
import numpy as np
x = np.arange(25)
np.polyfit(x, x, 24, full=True)
The singular values are
array([4.68696731e+00, 1.55044718e+00, 7.17264545e-01, 3.14298605e-01,
1.16528492e-01, 3.84141241e-02, 1.15530672e-02, 3.20120674e-03,
8.20608411e-04, 1.94870760e-04, 4.28461687e-05, 8.70404409e-06,
1.62785983e-06, 2.78844775e-07, 4.34463936e-08, 6.10212689e-09,
7.63709211e-10, 8.39231664e-11, 7.94539407e-12, 6.32326226e-13,
4.09332903e-14, 2.05501534e-15, 7.55397827e-17, 4.81104905e-18,
8.98275758e-20])
of which, with the default value of rcond (5.55e-15 here), four get truncated to 0.
The difference in magnitude between smallest and largest singular values indicates that perturbing the y-values by numbers of size 1e-15 can result in changes of about 1 to the coefficients. (Not every perturbation will do that, just some that happen to align with a singular vector for a small singular value).
Rank
The effective rank is just the number of singular values above the rcond threshold. In the above example it's 21. This means that even though the fit is for 25 points, and we get a polynomial with 25 coefficients, there are only 21 degrees of freedom in the solution.
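To see this directly from the polyfit output (a sketch; the count of 21 refers to the run above):
import numpy as np

x = np.arange(25)
coeffs, residuals, rank, sing_vals, rcond = np.polyfit(x, x, 24, full=True)
print(rank)                                          # effective rank of the design matrix (21 in the run above)
print(np.sum(sing_vals > rcond * sing_vals.max()))   # number of singular values above the cutoff, the same count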
I've implemented the mutual information formula in python using pandas and numpy
import numpy as np

def mutual_info(p):
    p_x = p.sum(axis=1)
    p_y = p.sum(axis=0)
    I = 0.0
    for i_y in p.index:
        for i_x in p.columns:
            I += (p.ix[i_y,i_x]*np.log2(p.ix[i_y,i_x]/(p_x[i_y]*p[i_x]))).values[0]
    return I
However, if a cell in p has a zero probability, then np.log2(p.ix[i_y,i_x]/(p_x[i_y]*p[i_x])) is negative infinity, and the whole expression is multiplied by zero and returns NaN.
What is the right way to work around that?
For various theoretical and practical reasons (e.g., see Competitive Distribution Estimation: Why is Good-Turing Good), you might consider never using a zero probability with the log loss measure.
So, say, if you have a probability vector p, then, for some small scalar α > 0, you would use α*1 + (1 - α)*p (where the 1 here denotes the uniform vector). Unfortunately, there are no general guidelines for choosing α, and you'll have to assess this further down the calculation.
For the Kullback-Leibler distance, you would of course apply this to each of the inputs.
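A minimal sketch of that smoothing (the function name and the default alpha are my own; as noted above, alpha has to be tuned for the application). Applying it to the joint distribution before computing the marginals keeps every cell strictly positive, so the log no longer produces -inf:
import numpy as np

def smooth(p, alpha=1e-3):
    # mix a probability array with the uniform distribution so that no entry is exactly zero
    p = np.asarray(p, dtype=float)
    uniform = np.full_like(p, 1.0 / p.size)
    return alpha * uniform + (1.0 - alpha) * p
For the pandas DataFrame in the question this could be applied as, for example, pd.DataFrame(smooth(p.values), index=p.index, columns=p.columns).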
First I find the eigenvalues of a (4000x4000) matrix by using numpy.linalg.eigvalsh. Then, I change the boundary conditions, expecting only a minor change in the eigenvalues.
Subtracting the eigenvalues is vulnerable to floating point errors, so I've used some relative tolerance.
Now say I have an eigenvalue A = 1.0001e-10 and another B = 1.0050e-10. According to my humble knowledge of floating-point arithmetic, A - B != 0. The problem is that these numbers come from linear algebra calculations involving many orders of magnitude; other eigenvalues might, for example, be of order 1.
The question is: what is the precision of eigenvalues calculated using numpy.linalg.eigvalsh? Is this precision relative to the value (A * eps), or is it relative to the largest eigenvalue? Or perhaps relative to the elements of the original matrix?
For example, this matrix:
1 1e-20
1e-20 3
gives the same eigenvalues as this:
1 1e-5
1e-5 3
I'm not sure if Lapack is used underneath eigvalsh, but this might be of interest:
Lapack error bounds for the Symmetric/Unsymmetric eigenproblem:
http://www.netlib.org/lapack/lug/node89.html
http://www.netlib.org/lapack/lug/node91.html
First, the solver is not exact. Second, your example matrices are poorly conditioned: the diagonal elements are orders of magnitude larger than the off-diagonal ones. This is always going to cause numerical issues.
From simple algebra, the determinant of the second matrix is (1 * 3) - (1e-5 * 1e-5) = 3 - 1e-10. You can already see that the precision problem involves the square of your smallest element, i.e., twice as many orders of magnitude down. (The same applies to the eigenvalues.) Even though linalg uses double precision, because the solver is approximate you get the same answer. If you change the small values to 1e-3 you start seeing a difference, because then the perturbation is on the order of the numerical approximation.
This specific problem has been asked before. You can see in this answer how to use sympy to solve the eigenvalues with arbitrary precision.