I am trying to solve an equation that can include truncations in Python with a numerical approach. I am wondering what the best library and approach would be? Following is more detail about the problem:
The equation changes every time. From a human perspective, the equations should be pretty simple; they include common operators such as +,-,*,/, and they also sometimes have truncation functions (truncate to integer) or limit functions (limit the value in parenthesis between two provided bounds) or (rarely) multiple variables. A couple of examples (with these being separate examples and not a system of equations) would be:
TRUNCATE(VAR_1 + 300) - 50.4 = 200
(VAR_2 + VAR_3)*3 = 35
LIMIT(3,5)(VAR_4) = 8
VAR_5 = 34
(This is not exactly what the equations look like, because I am writing them in postfix notation, but I have a calculator to determine their value with provided input values.)
All I need for these equations is some value for each variable that would solve each equation; I do not need to know every solution.
Some additional things to note: a) these variables all have maximum and minimum values; b) while perfection would be nice, occasional errors are acceptable; and c) some of the variables are integers, which I expect really complicates things. Right now, I'm handling this very sloppily but mostly acceptably for my case by rounding the integer values to the nearest int.
In an attempt to solve this problem, I tried solving analytically with Sympy (which as you might expect didn't work on truncations and was difficult to implement), and I also tried using Scipy minimize as follows:
minimize(minimization, x0, method='SLSQP', constraints=cons, tol=1e-3, options={'ftol': 1e-3, 'disp': True, 'maxiter': 100, 'eps': 0.1}, args=(x_vals, postfix, const_values, value))
This one got stuck on truncations, presumably because it didn't know which direction to move, unless I set the step to 1, which decreased accuracy. For some reason, it also didn't seem to respect ftol: it would give acceptable answers within the tolerance but would just keep going until the iteration limit.
I am considering using something that does random walks like the "Markov Chain Monte Carlo" method, but I really don't know much about this and was eager to hear other thoughts.
I ended up solving the problem in two slightly different ways. Both of them used the Powell solver, as suggested by joni in the comments on the original question, and for both of them I had to multiply the output of the function passed to the "fun" parameter (a function that I named minimize) by 100, because I could never get the tolerance adjusted in the solver function call.
When the equation had only one variable, I removed the truncation from the minimize function. This worked for my purposes because the reason the equations I was using were being truncated was (generally) so they would equal an integer value. So, when the equation output is an integer and there is only one variable, I believe the correct solution will be obtained by just pretending the truncation function does not exist in the solver (though remember to be wary of floating point math). (And if any numbers outside of the truncation are non-integers, the equation may not have a solution anyway.)
In cases with multiple variables, my solution was to a) include the truncation function in the minimize function and b) round the x values suggested by the solver as I planned to round them in the end (ex. round them to an integer if they were an integer value).
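For illustration, here is a minimal sketch of that multi-variable approach. The equation TRUNCATE(v1 + v2) * 3 = 36 and the integer mask are made up for the example; my real equations come from the postfix calculator mentioned in the question:

import numpy as np
from scipy.optimize import minimize

IS_INTEGER = [False, True]  # made-up example: the second variable is an integer

def objective(x):
    # Round integer variables the same way they will be rounded at the end.
    v = [round(xi) if is_int else xi for xi, is_int in zip(x, IS_INTEGER)]
    lhs = np.trunc(v[0] + v[1]) * 3   # keep the truncation in the objective
    return abs(lhs - 36) * 100        # scaled by 100, as described above

result = minimize(objective, x0=[0.0, 0.0], method='Powell')
print(result.x, objective(result.x))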
Anyways, this solution worked for the problem defined above, but it otherwise has some limitations. It is not guaranteed to always find the correct output, especially with the second approach. Another approach people with this problem may wish to consider would be some sort of integer programming, if they have linear equations.
EDIT: The original post was too vague. I am looking for an algorithm to solve a large, solvable, linear IVP system that can handle very small floating point values. Solving for the eigenvectors and eigenvalues is impossible with numpy.linalg.eig(): the returned values are complex when they should not be, it does not support numpy.float128, and the matrix is not symmetric, so numpy.linalg.eigh() won't work either. Sympy could do it given an infinite amount of time, but after running it for 5 hours I gave up. scipy.integrate.solve_ivp() works with implicit methods (I have tried Radau and BDF), but the output is wildly wrong. Are there any libraries, methods, algorithms, or solutions for working with this many very small numbers?
Feel free to ignore the rest of this.
I have a 150x150 sparse (~500 nonzero entries of 22500) matrix representing a system of first order, linear differential equations. I'm attempting to find the eigenvalues and eigenvectors of this matrix to construct a function that serves as the analytical solution to the system so that I can just give it a time and it will give me values for each variable. I've used this method in the past for similar 40x40 matrices, and it's much (tens, in some cases hundreds of times) faster than scipy.integrate.solve_ivp() and also makes post model analysis much easier as I can find maximum values and maximum rates of change using scipy.optimize.fmin() or evaluate my function at inf to see where things settle if left long enough.
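For concreteness, the method I've been using looks roughly like this, on a toy 2x2 stand-in for my actual 150x150 matrix (the matrix and initial condition here are made up):

import numpy as np

# Toy stand-in for the system x'(t) = A @ x(t)
A = np.array([[-1.0, 0.5],
              [ 0.0, -2.0]])
x0 = np.array([1.0, 1.0])

lam, V = np.linalg.eig(A)
c = np.linalg.solve(V, x0)  # coordinates of x0 in the eigenbasis

def x_at(t):
    # x(t) = V @ diag(exp(lam * t)) @ c
    return (V * np.exp(lam * t)) @ c

print(x_at(1.0))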
This time around, however, numpy.linalg.eig() doesn't seem to like my matrix and is giving me complex values, which I know are wrong because I'm modeling a physical system that can't have complex rates of growth or decay (or sinusoidal solutions), much less complex values for its variables. I believe this to be a stiffness or floating point rounding problem where the underlying LAPACK algorithm is unable to handle either the very small values (the smallest is ~3e-14, and most nonzero values are of similar scale) or the disparity between some values (the largest is ~4000, but values greater than 1 only show up a handful of times).
I have seen suggestions for similar users' problems to use sympy to solve for the eigenvalues, but when it hadn't solved my matrix after 5 hours I figured it wasn't a viable solution for my large system. I've also seen suggestions to use numpy.real_if_close() to remove the imaginary portions of the complex values, but I'm not sure this is a good solution either; several eigenvalues from numpy.linalg.eig() are 0, which is a sign of error to me, but additionally almost all the real portions are of the same scale as the imaginary portions (exceedingly small), which makes me question their validity as well. My matrix is real, but unfortunately not symmetric, so numpy.linalg.eigh() is not viable either.
I'm at a point where I may just run scipy.integrate.solve_ivp() for an arbitrarily long time (a few thousand hours) which will probably take a long time to compute, and then use scipy.optimize.curve_fit() to approximate the analytical solutions I want, since I have a good idea of their forms. This isn't ideal as it makes my program much slower, and I'm also not even sure it will work with the stiffness and rounding problems I've encountered with numpy.linalg.eig(); I suspect Radau or BDF would be able to navigate the stiffness, but not the rounding.
Anybody have any ideas? Any other algorithms for finding eigenvalues that could handle this? Can numpy.linalg.eig() work with numpy.float128 instead of numpy.float64 or would even that extra precision not help?
I'm happy to provide additional details upon request. I'm open to changing languages if needed.
As mentioned in the comment chain above, the best solution for this is to use a matrix exponential, which is a lot simpler (and apparently less error-prone) than diagonalizing your system with eigenvectors and eigenvalues.
For my case I used scipy.sparse.linalg.expm() since my system is sparse. It's fast, accurate, and simple. My only complaint is the loss of evaluation at infinity, but it's easy enough to work around.
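A rough sketch of the approach, with a toy matrix and initial state standing in for my real system:

import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import expm

# Toy sparse stand-in for the 150x150 system x'(t) = A @ x(t)
A = csc_matrix(np.array([[-1.0, 0.5],
                         [ 0.0, -2.0]]))
x0 = np.array([1.0, 1.0])

def x_at(t):
    # x(t) = expm(A * t) @ x0; no eigendecomposition needed
    return expm(A * t) @ x0

print(x_at(1.0))

For larger systems, scipy.sparse.linalg.expm_multiply(A * t, x0) computes the same product without forming the full matrix exponential, which may be worth trying.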
I have no idea how or where to start; I need some vocabulary or terms to get going with my research, so let's ask the community.
Problem: I have a value X that is the final answer of the equation. I have a range of values, let's say 10 (A-J), that will be used in the equation. Using simple operations (+, -, *, /, parentheses, ^), I want to form an equation that uses some or all of the values A-J and evaluates to X.
Example: A^2+B*C+(D+E+F+G)*J+30 = X
Input: the final value X, the values that may be used in the equation, and a number Z, meaning I want to use at least Z of the given values in the equation (in the example, Z = 8)
Output: the equation that solves it with the given values
Can this be turned into a Python script, for example? Is it possible at all to calculate this way? What are the terms that describe this kind of calculation?
If I understand the question correctly:
The algorithm you are looking for outputs a mathematical function, along with the specific values that, when applied to the function, give you the input value x, for an arbitrary value x.
In general I believe this is not possible. It may be possible from a non-deterministic point of view, where you try to guess values, but that usually isn't feasible from an algorithmic or computational standpoint.
Let's first limit the function to one variable. Finding a function that gives you a value x, for some value a, i.e. f(a) = x, is the same as asking for f(a) - x = 0.
Limiting the operations to +,-,*, we see that f(a) is a polynomial.
Limiting the problem in this way relates the solution to algebraic numbers and constructible numbers. The more general mathematical theory that explains the properties of these numbers and functions is called Galois Theory.
It is possible to find the polynomial in a, given the input value x, if and only if x is algebraic. You can produce a simple algorithm that takes powers of the irrational part of x until that power is an integer (or a rational number). Note that even this algorithm would need to take some sort of measurement error into account, because sqrt(2) = 1.41421356237... continues for an infinite number of decimal places, and the computer can only keep track of a finite number of them. For example:
def sqrt(x):
    return x**(1/2)

if __name__ == "__main__":
    num = sqrt(2)
    print(num)
    print(num**2)
Will output:
1.4142135623730951
2.0000000000000004
So the output of such a simple algorithm will never be 100% correct. This is possible for a human to do, but not a computer. You may want to look into the field of symbolic computation, but the algorithm for solving even part of your problem will not be easy to turn into a script.
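For completeness, a tolerance-based sketch of the power-taking idea (the helper name find_integer_power and the cutoffs max_power and eps are arbitrary choices):

def find_integer_power(x, max_power=10, eps=1e-9):
    # Raise x to successive powers; stop when the result is within eps
    # of an integer, allowing for the floating point error shown above.
    for n in range(1, max_power + 1):
        p = x ** n
        if abs(p - round(p)) < eps:
            return n, round(p)
    return None

print(find_integer_power(2 ** 0.5))  # (2, 2), since sqrt(2)**2 ~ 2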
If you are okay with not solving this problem 100% of the time, you may want to look into linear approximations and non-linear approximations.
The reason why I believe this is not possible in general, even without measurement error, is that adding the operations (^, /) may not result in a polynomial ring, which is the basis of solving problems relating to algebraic numbers.
Introducing extra variables b, c, ..., n, ... to f, such that f(a, b, c, ..., n, ...) = x, would also restrict which functions would satisfy the properties of a ring.
Regardless, this problem is difficult, even when only considering polynomials of one variable. Googling the terms in bold may help provide additional insight into your problem. Sorry if some of this was already obvious for you and in any case I hope this helps!
I am comparing the numerical results of C++ and Python computations. In C++, I make use of LAPACK's sgels function to compute the coefficients of a linear regression problem. In Python, I use Numpy's linalg.lstsq function for a similar task.
What is the mathematical difference between the methods used by sgels and linalg.lstsq?
What is the expected error (e.g. 6 significant digits) when comparing the results (i.e. the regression coefficients) numerically?
FYI: I am by no means a C++ or Python expert, which makes it difficult to understand what is going on inside the functions.
Taking a look at the source of numpy, in the file linalg.py, lstsq relies on LAPACK's zgelsd() for complex and dgelsd() for real. Here are the differences to sgels():
dgelsd() is for double while sgels() is for float. There is a difference in precision...
dgels() makes use of the QR factorization of the matrix A and assumes that A has full rank. The condition number of the matrix must be reasonable to get a significant result. See this course for the logic of the method. On the other hand, dgelsd() makes use of the singular value decomposition of A. In particular, A can be rank-deficient, and small singular values are discarded depending on the additional argument rcond or on machine precision. Notice that numpy's default value for rcond is -1: negative values refer to machine precision. See this course for the logic.
According to LAPACK's benchmark, one can expect dgels() to be about 5 times faster than dgelsd().
You may see significant differences between the results of sgels() and dgelsd() if the matrix is ill-conditioned. Indeed, there is a bound on the error of the linear regression which depends on the algorithm and the value of rcond that is used. See the LAPACK user guide, Error Bounds for Linear Least Squares Problems, for estimates of the errors, and Further Details: Error Bounds for Linear Least Squares Problems for technical details.
As a conclusion, sgels() and dgels() can be used if the measurements in b are accurate and easily related to the explanatory variables. For instance, if sensors are placed at the exits of exhaust pipes, it's easy to guess which motors are running. But sometimes the linear link between the source and the measurements is not precisely known (uncertainty on the terms of A), or discriminating polluters on the basis of measurements becomes more difficult (some polluters are far from the set of sensors and A is ill-conditioned). In that kind of situation, dgelsd() and tuning the rcond argument can help. Whenever in doubt, use dgelsd() and estimate the error on the estimated x according to LAPACK's user guide.
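To see the effect of rcond in practice, here is a small experiment with a nearly rank-deficient matrix (the data is made up purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 4))
A[:, 3] = A[:, 0] + 1e-10 * rng.standard_normal(100)  # nearly dependent column
b = A @ np.array([1.0, 2.0, 3.0, 4.0]) + 1e-6 * rng.standard_normal(100)

# rcond=-1: machine-precision cutoff keeps the tiny singular value.
x_machine, *_ = np.linalg.lstsq(A, b, rcond=-1)
# A larger rcond discards it, as dgelsd() does, changing the coefficients.
x_trimmed, *_ = np.linalg.lstsq(A, b, rcond=1e-8)
print(x_machine)
print(x_trimmed)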
I have a follow up question to the post written a couple days ago, thank you for the previous feedback:
Finding complex roots from set of non-linear equations in python
I have gotten the set of non-linear equations set up in Python now so that fsolve will handle the real and imaginary parts independently. However, there are still problems with Python's fsolve converging to the correct solution. I have exactly the same inputs that are used in Matlab, and after double-checking, the set of equations is exactly the same as well. Matlab, no matter how I set the initial values, will always converge to the correct solution. With Python, however, every initial condition produces a different result, and never the correct one. After a fraction of a second, the following warning appears:
/opt/local/Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/site-packages/scipy/optimize/minpack.py:227:
RuntimeWarning: The iteration is not making good progress, as measured by the
improvement from the last ten iterations.
warnings.warn(msg, RuntimeWarning)
I was wondering if there are some known differences between the fsolve in python and Matlab, and if there are some known methods to optimize the performance in python.
Thank you very much
I don't think that you should rely on the fact that the names are the same. I see from your other question that you are specifying that Matlab's fsolve use the 'levenberg-marquardt' algorithm rather than the default. Python's scipy.optimize.fsolve uses MINPACK's hybrd algorithms. Levenberg-Marquardt finds roots approximately by minimizing the sum of squares of the function and is quite robust. It is not a true root-finding method like the default 'trust-region-dogleg' algorithm. I don't know how the hybrd schemes work, but they claim to be a modification of Powell's method.
If you want something similar to what you're doing in Matlab, I'd look for an optimization scheme that implements Levenberg-Marquardt, such as scipy.optimize.root, which you were also using in your previous question. Is there a reason why you're not using that?
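Something along these lines, with a made-up system standing in for your real/imaginary-split equations:

import numpy as np
from scipy.optimize import root

def equations(v):
    # Stand-in system; replace with the real/imaginary parts of your equations.
    x, y = v
    return [x**2 - y - 1.0, x - y**2 + 0.5]

# method='lm' uses MINPACK's Levenberg-Marquardt, matching the algorithm
# you selected in Matlab more closely than fsolve's hybrd does.
sol = root(equations, x0=[1.0, 1.0], method='lm')
print(sol.x, sol.success)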
I am trying to numerically integrate an arbitrary function (known when I write the code) in my program. I am using Python 2.5.2 along with SciPy's numerical integration package. In order to get a feel for it, I decided to try integrating sin(x) and observed this behavior:
>>> from math import pi
>>> from scipy.integrate import quad
>>> from math import sin
>>> def integrand(x):
... return sin(x)
...
>>> quad(integrand, -pi, pi)
(0.0, 4.3998892617846002e-14)
>>> quad(integrand, 0, 2*pi)
(2.2579473462709165e-16, 4.3998892617846002e-14)
I find this behavior odd because:
1. In ordinary integration, integrating over the full cycle gives zero.
2. In numerical integration, this isn't necessarily the case, because you may just be approximating the total area under the curve.
In any case, whether 1 is true or 2 is true, I find the behavior inconsistent: either both integrations (-pi to pi and 0 to 2*pi) should return 0.0 (the first value in the tuple is the result and the second is the error) or both should return 2.257...
Can someone please explain why this is happening? Is this really an inconsistency? Can someone also tell me if I am missing something really basic about numerical methods?
In any case, in my final application, I plan to use the above method to find the arc length of a function. If someone has experience in this area, please advise me on the best policy for doing this in Python.
Edit
Note
I already have the first differential values at all points in the range stored in an array.
Current error is tolerable.
End note
I have read the Wikipedia article on this. As Dimitry has pointed out, I will be integrating sqrt(1 + diff(f(x), x)^2) to get the arc length. What I wanted to ask was: is there a better approximation / best practice / faster way to do this? If more context is needed, I'll post it separately / post context here, as you wish.
The quad function comes from an old Fortran library (QUADPACK). It works by judging, from the flatness and slope of the function it is integrating, how to adapt the step size it uses for numerical integration in order to maximize efficiency. What this means is that you may get slightly different answers from one region to the next, even if they're analytically the same.
Without a doubt, both integrations should return zero. Returning something on the order of 1e-16 is pretty close to zero! The slight differences are due to the way quad rolls over sin and changes its step sizes. For your planned task, quad will be all you need.
EDIT:
For what you're doing I think quad is fine. It is fast and pretty accurate. My final statement is use it with confidence unless you find something that really has gone quite awry. If it doesn't return a nonsensical answer then it is probably working just fine. No worries.
I think it is probably machine precision since both answers are effectively zero.
If you want an answer from the horse's mouth, I would post this question on the scipy discussion board.
I would say that a number O(10^-14) is effectively zero. What's your tolerance?
It might be that the algorithm underlying quad isn't the best. You might try another method for integration and see if that improves things. A 5th order Runge-Kutta can be a very nice general purpose technique.
It could be just the nature of floating point numbers: "What Every Computer Scientist Should Know About Floating Point Arithmetic".
This output seems correct to me, since you have an absolute error estimate here. The integral of sin(x) should indeed be zero over a full period (any interval of length 2*pi), in both ordinary and numerical integration, and your results are close to that value.
To evaluate the arc length you should calculate the integral of sqrt(1 + diff(f(x), x)^2), where diff(f(x), x) is the derivative of f(x). See also Arc length.
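For example, the arc length of f(x) = sin(x) over one period, so diff(f(x), x) = cos(x):

from math import pi, cos, sqrt
from scipy.integrate import quad

def arc_length_integrand(x):
    # sqrt(1 + f'(x)^2) with f(x) = sin(x), so f'(x) = cos(x)
    return sqrt(1.0 + cos(x)**2)

length, error = quad(arc_length_integrand, 0, 2 * pi)
print(length, error)  # roughly 7.64, with a small error estimate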
0.0 == 2.3e-16 (absolute error tolerance 4.4e-14)
Both answers are the same and correct i.e., zero within the given tolerance.
The difference comes from the fact that sin(x) = -sin(-x) exactly, even in finite precision, whereas finite precision only gives sin(x) ~ sin(x + 2*pi) approximately. Sure, it would be nice if quad were smart enough to figure this out, but it really has no way of knowing a priori that the integrals over the two intervals you give are equivalent or that the first is a better result.
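You can check both facts directly (the exact oddness is typical libm behavior rather than a formal guarantee):

from math import sin, pi

# Oddness typically holds exactly in floating point:
print(sin(0.5) + sin(-0.5))        # 0.0
# Periodicity does not, because 2*pi is not exactly representable:
print(sin(0.5) - sin(0.5 + 2*pi))  # small nonzero value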