Python - solve polynomial for y - python

I'm taking in a function (e.g. y = x**2) and need to solve for x. I know I can painstakingly solve this manually, but I'm trying to find a library method to use instead. I've browsed numpy, scipy and sympy, but can't seem to find what I'm looking for. Currently I'm making a lambda out of the function, so it would be nice if I could keep that format for the method, but it's not necessary.
Thanks!

If you are looking for numerical solutions (i.e. you are only interested in the numbers, not the symbolic closed-form solutions), then there are a few options for you in the scipy.optimize module. For simple polynomials, scipy.optimize.newton is a pretty good start, and you can take it from there.
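For the polynomial case specifically, numpy.roots is another numerical option. The following is only a minimal sketch for the y = x**2 example from the question; the helper name solve_quadratic_for_x is made up for illustration, and it swaps in an eigenvalue-based polynomial root finder rather than the newton routine mentioned above.

```python
import numpy as np

def solve_quadratic_for_x(y):
    """Return the real roots x of x**2 - y = 0, sorted ascending."""
    # Coefficients of x**2 + 0*x - y, highest power first.
    coefficients = [1.0, 0.0, -float(y)]
    roots = np.roots(coefficients)
    # Keep only the roots whose imaginary part is numerically zero.
    real_roots = roots[np.isclose(roots.imag, 0.0)].real
    return np.sort(real_roots)

roots_for_4 = solve_quadratic_for_x(4.0)
print(roots_for_4)
```

For a negative y the filter leaves an empty array, since both roots are then purely imaginary.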
For symbolic solutions (that is, to get from y = x**2 to x = +/- sqrt(y)), SymPy's solver gives you roughly what you need. The whole SymPy package is directed at doing symbolic manipulation.
Here is an example using the Python interpreter to solve the equation mentioned in the question. You will need to make sure that the SymPy package is installed, then:
>>> from sympy import * # we are importing everything for ease of use
>>> x = Symbol("x")
>>> y = Symbol("y") # create the two variables
>>> equation = Eq(x ** 2, y) # create the equation
>>> solve(equation, x)
[y**(1/2), -y**(1/2)]
As you see the basics are fairly workable, even as an interactive algebra system. Not nearly as nice as Mathematica, but then again, it is free and you can incorporate it into your own programs. Make sure to read the Gotchas and Pitfalls section of the SymPy documentation on how to encode the appropriate equations.
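Since the question asks to keep working with lambdas, one possible follow-up is to turn each symbolic solution back into a plain callable with sympy.lambdify. This is a sketch under the assumption that the equation is the y = x**2 example; the name inverse_branches is illustrative only.

```python
import sympy as sp

x, y = sp.symbols("x y")

# Solve x**2 = y for x; this yields one expression per branch.
solutions = sp.solve(sp.Eq(x**2, y), x)

# Turn each symbolic branch into an ordinary numeric function of y.
inverse_branches = [sp.lambdify(y, solution, "math") for solution in solutions]

values = sorted(branch(4.0) for branch in inverse_branches)
print(values)
```

Each element of inverse_branches behaves like the lambda the question starts from, only for the inverse relation.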
If all you need is a quick and dirty solution to an equation, then there is always Wolfram Alpha.

Use Newton-Raphson via scipy.optimize.newton. It finds roots of an equation, i.e., values of x for which f(x) = 0. In the example, you can cast the problem as looking for a root of the function f(x) = x² - y. If you supply a lambda that computes y, you can provide a general solution thus:
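from scipy.optimize import newton

def inverse(f, f_prime=None):
    def solve(y):
        # maxiter must be an integer, so use 10**6 rather than the float 1E6
        return newton(lambda x: f(x) - y, 1, fprime=f_prime, tol=1e-10, maxiter=10**6)
    return solve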
Using this function is quite simple:
>>> sqrt = inverse(lambda x: x**2)
>>> sqrt(2)
1.4142135623730951
>>> import math
>>> math.sqrt(2)
1.4142135623730951
Depending on the input function, you may need to tune the parameters to newton(). The current version uses a starting guess of 1, a tolerance of 10^-10 and a maximum iteration count of 10^6.
For an additional speed-up, you can supply the derivative of the function in question:
>>> sqrt = inverse(lambda x: x**2, lambda x: 2*x)
In fact, without the derivative, newton() actually uses the secant method instead of Newton-Raphson, because Newton-Raphson proper relies on knowing the derivative.
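The starting guess also determines which root the iteration converges to, which matters for a function like x**2 that has two inverse branches. Below is a small sketch of that effect; inverse_from is a hypothetical variant of the helper above that takes the starting guess as a parameter, and the tolerance and iteration limit are arbitrary choices.

```python
from scipy.optimize import newton

def inverse_from(f, x0, f_prime=None):
    """Build a numerical inverse of f that starts the iteration at x0."""
    def solve(y):
        return newton(lambda x: f(x) - y, x0, fprime=f_prime, tol=1e-10, maxiter=1000)
    return solve

# Same function and derivative, different starting guesses.
positive_sqrt = inverse_from(lambda x: x**2, 1.0, lambda x: 2 * x)
negative_sqrt = inverse_from(lambda x: x**2, -1.0, lambda x: 2 * x)

print(positive_sqrt(2), negative_sqrt(2))
```

Starting at 1 lands on the positive square root, starting at -1 on the negative one.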

Check out SymPy, specifically the solver.

Related

How to find out how SymPy solved an equation?

When I solve an equation using SymPy, it would be nice to learn what it actually did to find the solutions.
The following is a simple test case for answers. We know it should be using the quadratic equation, or some equivalent approach.
>>> from sympy import *
>>> x = Symbol('x', real=True)
>>> solve(2*x**2 + 12*x + 12)
[-3 - sqrt(3), -3 + sqrt(3)]
Is there some way of filtering through a record of the execution stack, or something else, to extract which steps were useful to SymPy in solving a problem? I imagine SymPy performs some sort of tree-like search over steps, and a record of the stack calls would have to be pruned/simplified somehow.

Add a z3 constraint, such that the value of a z3 variable equals to the return value of some function

I have a Python function that takes a real number and returns a string, e.g.
def fun(x):
    if x == 0.5:
        return "Hello"
    else:
        return "Bye"
I start a z3 solver:
from z3 import *
r = Real('r')
s = Solver()
Now I want to tell the solver to find a value of r such that fun(r) returns "Hello". I am facing the following problems:
If r = Real('r'), then I cannot call fun(r), because r is a z3 variable
I cannot say s.add(StringVal("Hello") == StringVal(fun(r))) as a constraint, because 1) r is a z3 variable, 2) the interpretation of r is not known yet. Hence, I have no model from which I could extract a value and pass it to the fun.
Is it possible to implement what I want?
There's no out-of-the-box way to do this in z3, unless you're willing to modify the definition of fun so z3 can understand it. Like this:
from z3 import *
def fun(x):
    return If(x == 0.5, StringVal("Hello"), StringVal("Bye"))
r = Real('r')
s = Solver()
s.add(fun(r) == StringVal("Hello"))
print(s.check())
print(s.model())
This prints:
sat
[r = 1/2]
So, we changed the function fun to use the z3 idioms that correspond to what you wanted to write. A few points:
Obviously, not every Python construct will be easy to translate in this way. While you can express most every construct, the complexity of the hand-transformation will get higher as you work on more complicated functions.
So far as I know, there's no automatic way to do this translation for you. However, take a look at this blog post for some ideas on how to do so via the disassembler.
PyExZ3 is a now-defunct symbolic simulator for Python, using z3 as the backend solver. While the project is no longer active, you might be able to resurrect the source-code or get some ideas from it.
You have used Real to model the parameter x. Z3's Real values are true mathematical reals. (Strictly speaking, they are algebraic reals, but that's beside the point for now.) Python, however, doesn't have those; instead it uses double-precision floats. But luckily, z3 understands floats, so for a more realistic encoding use the Float64 sort. See here for details.

Limits involving the cumulative distribution function of a normal variable

I'm working through some exercises on improper integrals and I've stumbled across an issue I can't resolve. I'm attempting to use the limit() function on the following problem:
Here N(x) is the cumulative distribution function of the standard normal variable.
The limit() function so far hasn't caused any problems, including on problems which require L'Hôpital's rule to be applied. However, I'm struggling to compute the correct answer for this particular problem and can't work out why. The following code yields an incorrect answer:
from sympy import *
x, y = symbols('x y')
init_printing(use_unicode=False) # pretty-print the answers using ASCII rather than unicode characters
cum_distribution = (1/sqrt(2*pi)*(integrate(exp(-y**2/2), (y, -oo, x))))
func = (cum_distribution -(1/2)-(x/sqrt(2*pi)))/(x**3)
limit(func, x, 0)
If I apply L'Hôpital's rule, I get the correct answer:
l_hopital = diff((cum_distribution -(1/2)-(x/sqrt(2*pi))), x)/diff(x**3, x)
limit(l_hopital, x, 0)
I looked through the limit() function source code and my understanding is that L'Hôpital's rule isn't applied? In this case, can this problem be solved using the limit() function without applying this rule?
At present, a limit involving the function erf (known as the error function, related to normal CDF) can only be evaluated when the argument of erf tends to positive infinity. Limits at other places are either not evaluated, or evaluated incorrectly. (Related PR). This includes the limit
limit(-(sqrt(2)*x - sqrt(pi)*erf(sqrt(2)*x/2))/(2*sqrt(pi)*x**3), x, 0)
which returns unevaluated (though I would not call this incorrect). As a workaround, you can compute the Taylor series of this function with one term (the constant term), which gives the correct value of the limit:
series(func, x, 0, 1).removeO()
returns -sqrt(2)/(12*sqrt(pi)).
As in calculus practice, L'Hopital's rule is inferior to power series techniques when it comes to algorithmic computations, and SymPy relies primarily on the latter. The algorithm it uses is devised and explained in On Computing Limits in a Symbolic Manipulation System by Dominik Gruntz.
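The series argument can also be checked by hand. The sketch below expands erf directly instead of going through the integral, using the standard identity N(x) = 1/2 + erf(x/sqrt(2))/2; the variable names are illustrative, and this is a cross-check rather than a description of what limit() does internally.

```python
import math
import sympy as sp

x = sp.symbols("x")

# Taylor expansion of erf(x/sqrt(2)) around 0, keeping terms below x**5.
erf_series = sp.series(sp.erf(x / sp.sqrt(2)), x, 0, 5).removeO()

# Normal CDF written via erf: N(x) = 1/2 + erf(x/sqrt(2))/2.
cdf_series = sp.Rational(1, 2) + erf_series / 2

# Numerator of the limit: the constant and linear terms cancel.
numerator = sp.expand(cdf_series - sp.Rational(1, 2) - x / sp.sqrt(2 * sp.pi))
limit_value = sp.simplify(numerator / x**3)
print(limit_value)
```

Only the cubic term of the expansion survives, and its coefficient is the value of the limit.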

RuntimeError in solving equation using SymPy

I have an equation to solve. The equation can be described by the formula above. N and S are constants, for example N = 201 and S = 0.5. I use sympy in python to solve it. The python script is given as follows:
from sympy import *
x=Symbol('x')
print(solve((((1-x)/200)**(1-x)) * x**x - 2**(-0.5), x))
However, there is a RuntimeError: maximum recursion depth exceeded in __instancecheck__
I have also tried to use Mathematica, and it can output a result of 0.963
http://www.wolframalpha.com/input/?i=(((1-x)%2F200)+(1-x))*+xx+-+2**(-0.5)+%3D+0
Any suggestion is welcome. Thanks.
Assuming that you don't want a symbolic solution, just a value you can work with (like WA's 0.964), you can use mpmath for this. I'm not sure if it's actually possible to express the solution in radicals; WA certainly didn't even try. You should already have mpmath installed, as SymPy lists it as a dependency:
Requires: mpmath
Specifically, mpmath.findroot seems to do what you want. It takes an actual callable Python object which is the function to find a root of, and a starting value for x. It also accepts some more parameters such as the minimum error tol and the solver to use which you could play around with, although they don't really seem necessary. You could quite simply use it like this:
import mpmath
f = lambda x: (((1-x)/200) **(1-x))* x**x - 2**(-0.5)
print(mpmath.findroot(f, 1))
I just used 1 as a starting value - you could probably think of a better one. Judging by the shape of your graph, there's only one root to be found and it can be approached quite easily, without much need for fancy solvers, so this should suffice. Also, considering that "mpmath is a Python library for arbitrary-precision floating-point arithmetic", you should be able to get a very high precision answer from this if you wished. It has the output of
(0.963904761592753 + 0.0j)
This is actually an mpmath complex or mpc object,
mpc(real='0.96390476159275343', imag='0.0')
If you know it will have an imaginary value of 0, you can just use either of the following methods:
In [6]: abs(mpmath.mpc(23, 0))
Out[6]: mpf('23.0')
In [7]: mpmath.mpc(23, 0).real
Out[7]: mpf('23.0')
to "extract" a single float in the format of an mpf.
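An alternative that avoids the complex result altogether is to bracket the root and use a solver that never leaves the bracket. The sketch below swaps in scipy.optimize.brentq instead of mpmath.findroot; the bracket [0.9, 0.99] is an assumption chosen by evaluating the function by hand, and it keeps (1-x)/200 positive so every evaluation stays real.

```python
from scipy.optimize import brentq

def f(x):
    # Left-hand side minus right-hand side of the equation from the question.
    return (((1 - x) / 200) ** (1 - x)) * x ** x - 2 ** (-0.5)

# f is negative at 0.9 and positive at 0.99, so a root lies in between.
root = brentq(f, 0.9, 0.99)
print(root)
```

The result is an ordinary Python float, so no conversion from mpc is needed.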

Best way to write a Python function that integrates a gaussian?

In attempting to use scipy's quad method to integrate a gaussian (let's say there's a gaussian method named gauss), I was having problems passing the needed parameters to gauss and leaving quad to do the integration over the correct variable. Does anyone have a good example of how to use quad w/ a multidimensional function?
But this led me to a more grand question about the best way to integrate a gaussian in general. I didn't find a gaussian integrate in scipy (to my surprise). My plan was to write a simple gaussian function and pass it to quad (or maybe now a fixed width integrator). What would you do?
Edit: Fixed-width meaning something like trapz that uses a fixed dx to calculate areas under a curve.
What I've come to so far is a method make_gauss that returns a lambda function that can then go into quad. This way I can make a normal function with the average and variance I need before integrating.
import numpy
from scipy.integrate import quad
def make_gauss(N, sigma, mu):
    return (lambda x: N/(sigma * (2*numpy.pi)**.5) *
            numpy.e ** (-(x-mu)**2/(2 * sigma**2)))
quad(make_gauss(N=10, sigma=2, mu=0), -numpy.inf, numpy.inf)
When I tried passing a general gaussian function (that needs to be called with x, N, mu, and sigma) and filling in some of the values using quad like
quad(gen_gauss, -inf, inf, (10,2,0))
the parameters 10, 2, and 0 did NOT necessarily match N=10, sigma=2, mu=0, which prompted the more extended definition.
The erf(z) in scipy.special would require me to define exactly what t is initially, but it is nice to know it is there.
Okay, you appear to be pretty confused about several things. Let's start at the beginning: you mentioned a "multidimensional function", but then go on to discuss the usual one-variable Gaussian curve. This is not a multidimensional function: when you integrate it, you only integrate one variable (x). The distinction is important to make, because there is a monster called a "multivariate Gaussian distribution" which is a true multidimensional function and, if integrated, requires integrating over two or more variables (which uses the expensive Monte Carlo technique I mentioned before). But you seem to just be talking about the regular one-variable Gaussian, which is much easier to work with, integrate, and all that.
The one-variable Gaussian distribution has two parameters, sigma and mu, and is a function of a single variable we'll denote x. You also appear to be carrying around a normalization parameter n (which is useful in a couple of applications). Normalization parameters are usually not included in calculations, since you can just tack them back on at the end (remember, integration is a linear operator: int(n*f(x), x) = n*int(f(x), x) ). But we can carry it around if you like; the notation I like for a normal distribution is then
N(x | mu, sigma, n) := (n/(sigma*sqrt(2*pi))) * exp((-(x-mu)^2)/(2*sigma^2))
(read that as "the normal distribution of x given sigma, mu, and n is given by...") So far, so good; this matches the function you've got. Notice that the only true variable here is x: the other three parameters are fixed for any particular Gaussian.
Now for a mathematical fact: it is provably true that all Gaussian curves have the same shape, they're just shifted around a little bit. So we can work with N(x|0,1,1), called the "standard normal distribution", and just translate our results back to the general Gaussian curve. So if you have the integral of N(x|0,1,1), you can trivially calculate the integral of any Gaussian. This integral appears so frequently that it has a special name: the error function erf. Because of some old conventions, it's not exactly erf; there are a couple additive and multiplicative factors also being carried around.
If Phi(z) = integral(N(x|0,1,1), -inf, z); that is, Phi(z) is the integral of the standard normal distribution from minus infinity up to z, then it's true by the definition of the error function that
Phi(z) = 0.5 + 0.5 * erf(z / sqrt(2)).
Likewise, if Phi(z | mu, sigma, n) = integral( N(x | mu, sigma, n), -inf, z); that is, Phi(z | mu, sigma, n) is the integral of the normal distribution given parameters mu, sigma, and n from minus infinity up to z, then it's true by the definition of the error function that
Phi(z | mu, sigma, n) = (n/2) * (1 + erf((z - mu) / (sigma * sqrt(2)))).
Take a look at the Wikipedia article on the normal CDF if you want more detail or a proof of this fact.
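The closed form can be checked against direct numerical integration. This is a minimal sketch using math.erf and scipy.integrate.quad; the function names gauss and gauss_cdf and the sample parameters are made up for illustration.

```python
import math
from scipy.integrate import quad

def gauss(x, mu, sigma, n):
    """N(x | mu, sigma, n): a Gaussian density scaled by n."""
    return n / (sigma * math.sqrt(2 * math.pi)) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

def gauss_cdf(z, mu=0.0, sigma=1.0, n=1.0):
    """Phi(z | mu, sigma, n) expressed through the error function."""
    return 0.5 * n * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

# Compare the erf-based formula with brute-force integration up to z = 1.
closed_form = gauss_cdf(1.0, mu=0.0, sigma=2.0, n=10.0)
numeric, _ = quad(gauss, -math.inf, 1.0, args=(0.0, 2.0, 10.0))
print(closed_form, numeric)
```

Both numbers should agree to within the error estimate that quad reports.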
Okay, that should be enough background explanation. Back to your (edited) post. You say "The erf(z) in scipy.special would require me to define exactly what t is initially". I have no idea what you mean by this; where does t (time?) enter into this at all? Hopefully the explanation above has demystified the error function a bit and it's clearer now as to why the error function is the right function for the job.
Your Python code is OK, but I would prefer a closure over a lambda:
import math

def make_gauss(N, sigma, mu):
    k = N / (sigma * math.sqrt(2*math.pi))
    s = -1.0 / (2 * sigma * sigma)
    def f(x):
        return k * math.exp(s * (x - mu)*(x - mu))
    return f
Using a closure enables precomputation of constants k and s, so the returned function will need to do less work each time it's called (which can be important if you're integrating it, which means it'll be called many times). Also, I have avoided any use of the exponentiation operator **, which is slower than just writing the squaring out, and hoisted the divide out of the inner loop and replaced it with a multiply. I haven't looked at all at their implementation in Python, but from my last time tuning an inner loop for pure speed using raw x87 assembly, I seem to remember that adds, subtracts, or multiplies take about 4 CPU cycles each, divides about 36, and exponentiation about 200. That was a couple years ago, so take those numbers with a grain of salt; still, it illustrates their relative complexity. As well, calculating exp(x) the brute-force way is a very bad idea; there are tricks you can take when writing a good implementation of exp(x) that make it significantly faster and more accurate than a general a**b style exponentiation.
I've never used the numpy version of the constants pi and e; I've always stuck with the plain old math module's versions. I don't know why you might prefer either one.
I'm not sure what you're going for with the quad() call. quad(gen_gauss, -inf, inf, (10,2,0)) ought to integrate a renormalized Gaussian from minus infinity to plus infinity, and should always spit out 10 (your normalization factor), since the Gaussian integrates to 1 over the real line. Any answer far from 10 (I wouldn't expect exactly 10 since quad() is only an approximation, after all) means something is screwed up somewhere... hard to say what's screwed up without knowing the actual return value and possibly the inner workings of quad().
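One likely source of the mismatch is argument order: quad passes the args tuple positionally after x, so the tuple has to follow the function's own signature. The question never shows gen_gauss, so the signature below (x, N, sigma, mu) is an assumption; the sketch compares positional passing with keyword binding via functools.partial.

```python
import math
from functools import partial
from scipy.integrate import quad

def gen_gauss(x, N, sigma, mu):
    """Gaussian scaled by N; this parameter order is assumed, not taken from the question."""
    return N / (sigma * math.sqrt(2 * math.pi)) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

# Positional: quad calls gen_gauss(x, 10, 2, 0), so the tuple must match the signature.
area_positional, _ = quad(gen_gauss, -math.inf, math.inf, args=(10, 2, 0))

# Keyword binding removes the dependence on argument order.
area_keyword, _ = quad(partial(gen_gauss, N=10, sigma=2, mu=0), -math.inf, math.inf)

print(area_positional, area_keyword)
```

Both calls should come out very close to 10, the normalization factor.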
Hopefully that has demystified some of the confusion, and explained why the error function is the right answer to your problem, as well as how to do it all yourself if you're curious. If any of my explanation wasn't clear, I suggest taking a quick look at Wikipedia first; if you still have questions, don't hesitate to ask.
scipy ships with the "error function", aka Gaussian integral:
import scipy.special
help(scipy.special.erf)
The gaussian distribution is also called a normal distribution. The cdf function in the scipy norm module does what you want.
from scipy.stats import norm
print(norm.cdf(0.0))  # prints 0.5
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html#scipy.stats.norm
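For a Gaussian that is not the standard one, norm.cdf accepts loc (the mean) and scale (the standard deviation), and the integral over a finite interval is the difference of two CDF values. A short sketch with arbitrary example parameters:

```python
from scipy.stats import norm

mu, sigma = 0.0, 2.0

# Integral of the density from -infinity up to 1.
below_one = norm.cdf(1.0, loc=mu, scale=sigma)

# Integral over [-2, 2], i.e. within one standard deviation of the mean.
between = norm.cdf(2.0, loc=mu, scale=sigma) - norm.cdf(-2.0, loc=mu, scale=sigma)

print(below_one, between)
```

Multiplying either result by a normalization factor N gives the integral of the scaled Gaussian from the question.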
Why not just always do your integration from -infinity to +infinity, so that you always know the answer? (joking!)
My guess is that the only reason that there's not already a canned Gaussian function in SciPy is that it's a trivial function to write. Your suggestion about writing your own function and passing it to quad to integrate sounds excellent. It uses the accepted SciPy tool for doing this, it's minimal code effort for you, and it's very readable for other people even if they've never seen SciPy.
What exactly do you mean by a fixed-width integrator? Do you mean using a different algorithm than whatever QUADPACK is using?
Edit: For completeness, here's something like what I'd try for a Gaussian with the mean of 0 and standard deviation of 1 from 0 to +infinity:
from scipy.integrate import quad
from math import pi, exp, inf
mean = 0
sd = 1
quad(lambda x: 1 / (sd * (2 * pi) ** 0.5) * exp((x - mean) ** 2 / (-2 * sd ** 2)), 0, inf)
That's a little ugly because the Gaussian function is a little long, but still pretty trivial to write.
I assume you're handling multivariate Gaussians; if so, SciPy already has the function you're looking for: it's called MVNDIST ("MultiVariate Normal DISTribution"). The SciPy documentation is, as ever, terrible, so I can't even find where the function is buried, but it's in there somewhere. The documentation is easily the worst part of SciPy, and has frustrated me to no end in the past.
Single-variable Gaussians just use the good old error function, of which many implementations are available.
As for attacking the problem in general, yes, as James Thompson mentions, you just want to write your own gaussian distribution function and feed it to quad(). If you can avoid the generalized integration, though, it's a good idea to do so -- specialized integration techniques for a particular function (like MVNDIST uses) are going to be much faster than a standard Monte Carlo multidimensional integration, which can be extremely slow for high accuracy.
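In more recent SciPy releases, a specialized multivariate normal CDF is available as scipy.stats.multivariate_normal. The sketch below uses that interface rather than the MVNDIST routine named above, with a two-dimensional standard normal as an arbitrary example.

```python
from scipy.stats import multivariate_normal

# Two independent standard normal components.
dist = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]])

# Probability that both components are at most 0 (the lower-left quadrant).
p_lower_left = dist.cdf([0.0, 0.0])
print(p_lower_left)
```

For independent components the quadrant probability is 0.5 * 0.5, so the result should be close to 0.25, up to the tolerance of the numerical integration.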
