I know that similar precision questions have been asked here; however, I am reading the code of a project that does an exact equality comparison among floats, and it is puzzling me.
Assume that x1 and x2 are of type numpy.ndarray and of dtype np.float32. These two variables have been computed by the same code executed on the same data but x1 has been computed by one machine and x2 by another (this is done on an AWS cluster which communicates with MPI).
Then the values are compared as follows
numpy.array_equal(x1, x2)
Hence, exact equality (no tolerance) is crucial for this program to work, and it seems to work fine. This is confusing me. How can one compare two np.float32 values computed on different machines and face no precision issues? When can these two (or more) floats be equal?
The arithmetic specified by IEEE-754 is deterministic given certain constraints discussed in its clause 11 (2008 version), including suitable rules for expression evaluation (such as unambiguous translation from expressions in a programming language to IEEE-754 operations, such as a+b+c must give (a+b)+c, not a+(b+c)).
If parallelism is not used or is constructed suitably, such as always partitioning a job into the same pieces and combining their results in the same way regardless of order of completion of computations, then obtaining identical results is not surprising.
Some factors that prevent reproducibility include varying parallelism, using different math libraries (with different implementations of functions such as pow), and using languages that are not strict about floating-point evaluation (such as permitting, but not requiring, extra precision).
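To make the point about evaluation order concrete, here is a small sketch run on a single machine. The values and the pipeline function are made up for illustration; it only shows that float32 addition is not associative, while repeating the same operations in the same order reproduces the same bits.

```python
import numpy as np

a, b, c = np.float32(1e8), np.float32(-1e8), np.float32(1.0)

# Same three numbers, two groupings, two different float32 results.
left = (a + b) + c    # a + b is exactly 0, so the result is 1.0
right = a + (b + c)   # b + c rounds back to -1e8 in float32, so the result is 0.0


def pipeline(x):
    # Hypothetical stand-in for "the same code executed on the same data".
    return np.sqrt(x) * np.float32(3.0) + np.float32(0.1)


data = np.arange(1, 1001, dtype=np.float32)
x1 = pipeline(data)
x2 = pipeline(data.copy())

# Identical operations in an identical order give identical results.
same = np.array_equal(x1, x2)
print(left, right, same)
```

If a reduction were split differently across workers from one run to the next, the grouping would change just as it does between left and right, and exact equality could fail.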
EDIT: Original post too vague. I am looking for an algorithm to solve a large-system, solvable, linear IVP that can handle very small floating point values. Solving for the eigenvectors and eigenvalues is impossible with numpy.linalg.eig() as the returned values are complex and should not be, it does not support numpy.float128 either, and the matrix is not symmetric so numpy.linalg.eigh() won't work. Sympy could do it given an infinite amount of time, but after running it for 5 hours I gave up. scipy.integrate.solve_ivp() works with implicit methods (have tried Radau and BDF), but the output is wildly wrong. Are there any libraries, methods, algorithms, or solutions for working with this many, very small numbers?
Feel free to ignore the rest of this.
I have a 150x150 sparse (~500 nonzero entries of 22500) matrix representing a system of first order, linear differential equations. I'm attempting to find the eigenvalues and eigenvectors of this matrix to construct a function that serves as the analytical solution to the system so that I can just give it a time and it will give me values for each variable. I've used this method in the past for similar 40x40 matrices, and it's much (tens, in some cases hundreds of times) faster than scipy.integrate.solve_ivp() and also makes post model analysis much easier as I can find maximum values and maximum rates of change using scipy.optimize.fmin() or evaluate my function at inf to see where things settle if left long enough.
This time around, however, numpy.linalg.eig() doesn't seem to like my matrix and is giving me complex values, which I know are wrong because I'm modeling a physical system that can't have complex rates of growth or decay (or sinusoidal solutions), much less complex values for its variables. I believe this to be a stiffness or floating point rounding problem where the underlying LAPACK algorithm is unable to handle either the very small values (smallest is ~3e-14, and most nonzero values are of similar scale) or disparity between some values (largest is ~4000, but values greater than 1 only show up a handful of times).
I have seen suggestions for similar users' problems to use sympy to solve for the eigenvalues, but when it hadn't solved my matrix after 5 hours I figured it wasn't a viable solution for my large system. I've also seen suggestions to use numpy.real_if_close() to remove the imaginary portions of the complex values, but I'm not sure this is a good solution either; several eigenvalues from numpy.linalg.eig() are 0, which is a sign of error to me, but additionally almost all the real portions are of the same scale as the imaginary portions (exceedingly small), which makes me question their validity as well. My matrix is real, but unfortunately not symmetric, so numpy.linalg.eigh() is not viable either.
I'm at a point where I may just run scipy.integrate.solve_ivp() for an arbitrarily long time (a few thousand hours) which will probably take a long time to compute, and then use scipy.optimize.curve_fit() to approximate the analytical solutions I want, since I have a good idea of their forms. This isn't ideal as it makes my program much slower, and I'm also not even sure it will work with the stiffness and rounding problems I've encountered with numpy.linalg.eig(); I suspect Radau or BDF would be able to navigate the stiffness, but not the rounding.
Anybody have any ideas? Any other algorithms for finding eigenvalues that could handle this? Can numpy.linalg.eig() work with numpy.float128 instead of numpy.float64 or would even that extra precision not help?
I'm happy to provide additional details upon request. I'm open to changing languages if needed.
As mentioned in the comment chain above, the best solution for this is to use a matrix exponential, which is a lot simpler (and apparently less error-prone) than diagonalizing your system with eigenvectors and eigenvalues.
For my case I used scipy.sparse.linalg.expm() since my system is sparse. It's fast, accurate, and simple. My only complaint is the loss of evaluation at infinity, but it's easy enough to work around.
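For anyone who wants to see the shape of this, here is a minimal sketch on a made-up 2x2 system (not my actual 150x150 matrix). It assumes the system is x'(t) = A x(t) with x(0) = x0, so that x(t) = expm(A t) x0; the matrix, the initial state, and the helper name state_at are all illustrative.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import expm

# Toy non-symmetric system: x1' = -2*x1 + x2, x2' = -x2.
A = csc_matrix(np.array([[-2.0, 1.0],
                         [0.0, -1.0]]))
x0 = np.array([0.0, 1.0])


def state_at(t):
    # x(t) = expm(A * t) @ x0 for a constant-coefficient linear system.
    return expm(A * t) @ x0


t = 1.5
numeric = state_at(t)

# Closed-form solution of the toy system, used only as a check.
exact = np.array([np.exp(-t) - np.exp(-2.0 * t), np.exp(-t)])
print(numeric, exact)
```

The function can be evaluated at any finite t, which is why evaluation at infinity has to be approximated with a large t or worked out separately.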
I am trying to solve an equation that can include truncations in Python with a numerical approach. I am wondering what the best library and approach would be? Following is more detail about the problem:
The equation changes every time. From a human perspective, the equations should be pretty simple; they include common operators such as +,-,*,/, and they also sometimes have truncation functions (truncate to integer) or limit functions (limit the value in parenthesis between two provided bounds) or (rarely) multiple variables. A couple of examples (with these being separate examples and not a system of equations) would be:
TRUNCATE(VAR_1 + 300) - 50.4 = 200
(VAR_2 + VAR_3)*3 = 35
LIMIT(3,5)(VAR_4) = 8
VAR_5 = 34
(This is not exactly what the equations look like, because I am writing them in postfix notation, but I have a calculator to determine their value with provided input values.)
All I need for these equations is some value for each variable that would solve each equation; I do not need to know every solution.
Some additional things to note about this is a) these variables all have maximum and minimum values, b) while perfection would be nice, occasional errors are acceptable, and c) some of the variables are integers, which I expect really complicates things. Right now, I'm handling this very sloppily but also mostly acceptably for my case by rounding the integer values to the nearest int.
In an attempt to solve this problem, I tried solving analytically with Sympy (which as you might expect didn't work on truncations and was difficult to implement), and I also tried using Scipy minimize as follows:
minimize(minimization, x0, method = 'SLSQP', constraints = cons, tol = 1e-3, options={'ftol': 1e-3, 'disp':True, 'maxiter': 100, "eps":.1}, args = (x_vals, postfix, const_values, value))
This one got stuck on truncations, presumably because it didn't know what direction to move, unless I set the step to 1, which decreased accuracy. For some reason, it also didn't seem to follow the ftol, because it would give acceptable answers within the tolerance but would just keep going to the iteration limit.
I am considering using something that does random walks like the "Markov Chain Monte Carlo" method, but I really don't know much about this and was eager to hear other thoughts.
I ended up solving the problem two slightly different ways. Both of them used the Powell solver as suggested by joni in the comments on the original question, and for both of them I had to multiply the output of the function that gets passed to the "fun" parameter (a function that I named minimize) by 100, because I could never get the tolerance adjusted in the solver function call.
When the equation had only one variable, I removed the truncation from the minimize function. This worked for my purposes because the equations I was using were (generally) being truncated so that they would equal an integer value. So, when the equation output is an integer and there is only one variable, I believe the correct solution will be obtained by just pretending the truncation function does not exist in the solver (though remember to be wary of floating point math). (And if any numbers outside of the truncation are not integers, the equation may not have a solution anyways.)
In cases with multiple variables, my solution was to a) include the truncation function in the minimize function and b) round the x values suggested by the solver as I planned to round them in the end (ex. round them to an integer if they were an integer value).
Anyways, this solution worked for the problem defined above, but it otherwise has some limitations. It is not guaranteed to always find the correct output, especially the second part. Another approach people with this problem may wish to consider would be some sort of integer programming, if they have linear equations.
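Here is a rough sketch of the single-variable case, not my actual code. The equation TRUNCATE(v + 300) - 50 = 200, the bounds, the squared residual, and the function names are all made up for illustration, and v is assumed to be an integer variable that gets rounded at the end.

```python
import math

import numpy as np
from scipy.optimize import minimize

TARGET = 200.0


def residual_without_truncation(v):
    # Hypothetical equation TRUNCATE(v + 300) - 50 = 200 with the truncation dropped.
    return (v + 300.0) - 50.0 - TARGET


def objective(x):
    v = np.atleast_1d(x)[0]
    # Scaled by 100, mirroring the workaround for the tolerance described above.
    return 100.0 * residual_without_truncation(v) ** 2


result = minimize(objective, x0=[0.0], method="Powell", bounds=[(-100.0, 100.0)])
v = float(np.atleast_1d(result.x)[0])

# Round the integer variable, then check it against the original truncated equation.
v_int = round(v)
solved = math.trunc(v_int + 300) - 50 == TARGET
print(v, v_int, solved)
```

Rounding before the final check matters here: a solver output slightly below -50 would otherwise truncate to 249 instead of 250.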
I am implementing a numerical evaluation of some analytical expressions which involve factors like exp(1i*arg(z) / 2), where z is in principle a complex number, which sometimes happens to be almost real (i.e. to floating point precision, e.g. 4.440892098500626e-16j).
I have implemented my computations in Python and C++ and find that the results sometimes disagree: the small imaginary parts of the "almost real" numbers differ slightly in sign, and then the branch cut behaviour of arg(z) (i.e. arg(-1+0j) = pi, but arg(-1-0j) = -pi) significantly changes the result … I was wondering if there is any commonly used protocol to mitigate these issues?
Many thanks in advance.
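For reference, the behaviour can be reproduced in a few lines of Python. The helper snap_to_real and its rel_tol threshold are hypothetical; this is just one possible convention (treat tiny imaginary parts as +0.0 before taking the argument), not an established protocol.

```python
import cmath
import math


def snap_to_real(z, rel_tol=1e-12):
    """Hypothetical helper: replace a negligible imaginary part by +0.0."""
    if abs(z.imag) <= rel_tol * max(1.0, abs(z.real)):
        return complex(z.real, 0.0)
    return z


# The sign of a zero imaginary part selects the side of the branch cut.
phase_above = cmath.phase(complex(-1.0, 0.0))    # pi
phase_below = cmath.phase(complex(-1.0, -0.0))   # -pi

# An "almost real" value whose rounding noise happens to be negative.
noisy = complex(-1.0, -4.440892098500626e-16)
raw_factor = cmath.exp(1j * cmath.phase(noisy) / 2)                     # close to -1j
snapped_factor = cmath.exp(1j * cmath.phase(snap_to_real(noisy)) / 2)   # close to +1j
print(phase_above, phase_below, raw_factor, snapped_factor)
```

Applying the same snapping rule in both the Python and the C++ code would at least make the two implementations pick the same side of the cut.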
Suppose both x and y are very small numbers, but I know that the true value of x / y is reasonable.
What is the best way to compute x/y?
In particular, I have been doing np.exp(np.log(x) - np.log(y)) instead, but I'm not sure if that would make a difference at all?
Python uses the floating-point features of the hardware it runs on, according to Python documentation. On most common machines today, that is IEEE-754 arithmetic or something near it. That Python documentation is not explicit about rounding mode but mentions in passing that the result of a sample division is the nearest representable value, so presumably Python uses round-to-nearest-ties-to-even mode. (“Round-to-nearest” for short. If two representable values are equally close in binary floating-point, the one with a zero in the low bit of its significand is produced.)
In IEEE-754 arithmetic in round-to-nearest mode, the result of a division is the representable value nearest to the exact mathematical value. Since you say the mathematical value of x/y is reasonable, it is in the normal range of representable values (not below it, in the subnormal range, where precision suffers, and not above it, where results are rounded to infinity). In the normal range, results of elementary operations will be accurate within the normal precision of the format.
However, since x and y are “very small numbers,” we may be concerned that they are subnormal and have a loss of precision already in them, before division is performed. In the IEEE-754 basic 64-bit binary format, numbers below 2^-1022 (about 2.22507×10^-308) are subnormal. If x and y are smaller than that, then they have already suffered a loss of precision, and no method can produce a correct quotient from them except by happenstance. Taking the logarithms to calculate the quotient will not help.
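As a quick check of the subnormal threshold, the following sketch (assuming a platform where Python floats are IEEE-754 binary64) compares how accurately a normal and a subnormal decimal value are stored. The helper relative_error and the two sample values are chosen only for illustration.

```python
import sys
from fractions import Fraction

# Smallest positive normal double; anything below it is subnormal.
smallest_normal = sys.float_info.min
is_two_pow_minus_1022 = smallest_normal == 2.0 ** -1022


def relative_error(stored, exact):
    # Exact rational arithmetic, so the measurement itself adds no rounding.
    return abs(Fraction(stored) - exact) / exact


# 1e-300 is in the normal range; 1e-320 is subnormal.
normal_err = relative_error(1e-300, Fraction(1, 10**300))
subnormal_err = relative_error(1e-320, Fraction(1, 10**320))

half_ulp = Fraction(1, 2**53)
print(float(normal_err), float(subnormal_err), float(half_ulp))
```

The normal value is stored to within the usual relative bound of 2^-53, while the subnormal one is already off by far more than that before any division takes place.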
If the machine you are running on happens not to be using IEEE-754, it is still likely that computing x/y directly will produce a better result than np.exp(np.log(x)-np.log(y)). The former is a single operation computing a basic function in hardware that was likely reasonably designed. The latter is several operations computing complicated functions in software that is difficult to make accurate using common hardware operations.
There is a fair amount of unease and distrust of floating-point operations. Lack of knowledge seems to lead to people being afraid of them. But what should be understood here is that elementary floating-point operations are very well defined and are accurate in normal ranges. The actual problems with floating-point computing arise from accumulating rounding errors over sequences of operations, from the inherent mathematics that compounds errors, and from incorrect expectations about results. What this means is that there is no need to worry about the accuracy of a single division. Rather, it is the overall use of floating-point that should be kept in mind. (Your question could be better answered if it presented more context, illuminating why this division is important, how x and y have been produced from prior data, and what the overall goal is.)
Note
A not uncommon deviation from IEEE-754 is to flush subnormal values to zero. If you have some x and some y that are subnormal, some implementations might flush them to zero before performing operations on them. However, this is more common in SIMD code than in normal scalar programming. And, if it were occurring, it would prevent you from evaluating np.log(x) and np.log(y) anyway, as subnormal values would be flushed to zero in those as well. So we can likely dismiss this possibility.
Division, like other IEEE-754-specified operations, is computed at infinite precision and then (with ordinary rounding rules) rounded to the closest representable float. The result of calculating x/y will almost certainly be a lot more accurate than the result of calculating np.exp(np.log(x) - np.log(y)) (and is guaranteed not to be less accurate).
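A small sketch of this comparison, using Python floats and the math module in place of the NumPy calls, and assuming IEEE-754 doubles. The values of x and y are arbitrary small numbers in the normal range; the exact quotient of the stored values is computed with rational arithmetic as a reference.

```python
import math
from fractions import Fraction

x, y = 3e-200, 7e-200

direct = x / y
via_logs = math.exp(math.log(x) - math.log(y))

# Exact quotient of the two stored doubles, as a rational number.
exact = Fraction(x) / Fraction(y)

direct_err = abs(Fraction(direct) - exact)
logs_err = abs(Fraction(via_logs) - exact)
print(float(direct_err), float(logs_err))
```

Since the direct quotient is the representable value nearest to the exact one, no other double, including the one produced by the log route, can be closer.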
I am comparing the numerical results of C++ and Python computations. In C++, I make use of LAPACK's sgels function to compute the coefficients of a linear regression problem. In Python, I use Numpy's linalg.lstsq function for a similar task.
What is the mathematical difference between the methods used by sgels and linalg.lstsq?
What is the expected error (e.g. 6 significant digits) when comparing the results (i.e. the regression coefficients) numerically?
FYI: I am by no means a C++ or Python expert, which makes it difficult to understand what is going on inside the functions.
Taking a look at the source of numpy, in the file linalg.py, lstsq relies on LAPACK's zgelsd() for complex and dgelsd() for real. Here are the differences to sgels():
dgelsd() works in double precision while sgels() works in single precision (float), so there is a difference in precision.
dgels() makes use of the QR factorization of the matrix A and assumes that A has full rank. The condition number of the matrix must be reasonable to get a meaningful result. See this course for the logic of the method. On the other hand, dgelsd() makes use of the singular value decomposition of A. In particular, A can be rank-deficient, and small singular values are discarded depending on the additional argument rcond or on machine precision. Notice that, at the time of writing, numpy's default value for rcond is -1: negative values refer to machine precision. See this course for the logic.
According to the LAPACK benchmark, one can expect dgels() to be about 5 times faster than dgelsd().
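To illustrate the rank-deficient case handled by the SVD-based routine, here is a minimal sketch using numpy.linalg.lstsq on a made-up matrix whose two columns are identical. Passing rcond=None asks numpy for a machine-precision-based cutoff.

```python
import numpy as np

# Two identical columns, so A has rank 1 and the coefficients are not unique.
A = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
b = np.array([2.0, 4.0, 6.0])

coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# The negligible singular value is discarded and the minimum-norm solution is returned.
print(coeffs, rank, singular_values)
```

A QR-based routine that assumes full rank, such as sgels() or dgels(), is not designed for this kind of input.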
You may see significant differences between the results of sgels() and dgelsd() if the matrix is ill-conditioned. Indeed, there is a bound on the error of the linear regression which depends on the algorithm and on the value of rcond that is used. See the LAPACK user guide, "Error Bounds for Linear Least Squares Problems", for estimates of the errors and "Further Details: Error Bounds for Linear Least Squares Problems" for technical details.
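As a very rough rule of thumb (a simplification of the bounds in the LAPACK guide, valid only when the residual is small), the relative error of the coefficients scales like the condition number of A times the machine epsilon of the working precision. The sketch below evaluates that scale for a made-up quadratic-fit design matrix.

```python
import numpy as np

# Hypothetical design matrix for fitting 1, t and t**2 on 50 sample points.
t = np.linspace(0.0, 1.0, 50)
A = np.column_stack([np.ones_like(t), t, t**2])

cond = np.linalg.cond(A)

# Rough relative-error scale: condition number times machine epsilon.
error_scale = {
    "float32": cond * np.finfo(np.float32).eps,
    "float64": cond * np.finfo(np.float64).eps,
}
print(cond, error_scale)
```

If the float32 scale is around 1e-6, agreement to about 6 significant digits between sgels() and numpy's lstsq is about as much as can be expected; a larger condition number leaves fewer matching digits.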
In conclusion, sgels() and dgels() can be used if the measurements in b are accurate and easily related to the explanatory variables. For instance, if sensors are placed at the exits of exhaust pipes, it's easy to guess which motors are running. But sometimes the linear link between the source and the measurements is not precisely known (uncertainty in the terms of A), or discriminating between polluters on the basis of measurements becomes more difficult (some polluters are far from the set of sensors and A is ill-conditioned). In this kind of situation, dgelsd() and tuning the rcond argument can help. Whenever in doubt, use dgelsd() and estimate the error on the estimated x according to LAPACK's user guide.