Which programming language or a library is able to process infinite series (like geometric or harmonic)? It perhaps must have a database of some well-known series and automatically give proper values in case of convergence, and maybe generate an exception in case of divergence.
For example, in Python it could look like:
sum = 0
sign = -1.0
for i in range(1,Infinity,2):
sign = -sign
sum += sign / i
then, sum must be math.pi/4 without doing any computations in the loop (because it's a well-known sum).
Most functional languages which evaluate lazily can simulate the processing of infinite series. Of course, on a finite computer it is not possible to process infinite series, as I am sure you are aware. Off the top of my head, I guess Mathematica can do most of what you might want, I suspect that Maple can too, maybe Sage and other computer-algebra systems and I'd be surprised if you can't find a Haskell implementation that suits you.
EDIT to clarify for OP: I do not propose generating infinite loops. Lazy evaluation allows you to write programs (or functions) which simulate infinite series, programs which themselves are finite in time and space. With such languages you can determine many of the properties, such as convergence, of the simulated infinite series with considerable accuracy and some degree of certainty. Try Mathematica or, if you don't have access to it, try Wolfram Alpha to see what one system can do for you.
One place to look might be the Wikipedia category of Computer Algebra Systems.
There are two tools available in Haskell for this beyond simply supporting infinite lists.
First there is a module that supports looking up sequences in OEIS. This can be applied to the first few terms of your series and can help you identify a series for which you don't know the closed form, etc. The other is the 'CReal' library of computable reals. If you have the ability to generate an ever improving bound on your value (i.e. by summing over the prefix, you can declare that as a computable real number which admits a partial ordering, etc. In many ways this gives you a value that you can use like the sum above.
However in general computing the equality of two streams requires an oracle for the halting problem, so no language will do what you want in full generality, though some computer algebra systems like Mathematica can try.
Maxima can calculate some infinite sums, but in this particular case it doesn't seem to find the answer :-s
(%i1) sum((-1)^k/(2*k), k, 1, inf), simpsum;
inf
==== k
\ (- 1)
> ------
/ k
====
k = 1
(%o1) ------------
2
but for example, those work:
(%i2) sum(1/(k^2), k, 1, inf), simpsum;
2
%pi
(%o2) ----
6
(%i3) sum((1/2^k), k, 1, inf), simpsum;
(%o3) 1
You can solve the series problem in Sage (a free Python-based math software system) exactly as follows:
sage: k = var('k'); sum((-1)^k/(2*k+1), k, 1, infinity)
1/4*pi - 1
Behind the scenes, this is really using Maxima (a component of Sage).
For Python check out SymPy - clone of Mathematica and Matlab.
There is also a heavier Python-based math-processing tool called Sage.
You need something that can do a symbolic computation like Mathematica.
You can also consider quering wolframaplha: sum((-1)^i*1/i, i, 1 , inf)
There is a library called mpmath(python), a module of sympy, which provides the series support for sympy( I believe it also backs sage).
More specifically, all of the series stuff can be found here: Series documentation
The C++ iRRAM library performs real arithmetic exactly. Among other things it can compute limits exactly using the limit function. The homepage for iRRAM is here. Check out the limit function in the documentation. Note that I'm not talking about arbitrary precision arithmetic. This is exact arithmetic, for a sensible definition of exact. Here's their code to compute e exactly, pulled from the example on their web site:
//---------------------------------------------------------------------
// Compute an approximation to e=2.71.. up to an error of 2^p
REAL e_approx (int p)
{
if ( p >= 2 ) return 0;
REAL y=1,z=2;
int i=2;
while ( !bound(y,p-1) ) {
y=y/i;
z=z+y;
i+=1;
}
return z;
};
//---------------------------------------------------------------------
// Compute the exact value of e=2.71..
REAL e()
{
return limit(e_approx);
};
Clojure and Haskell off the top of my head.
Sorry I couldn't find a better link to haskell's sequences, if someone else has it, please let me know and I'll update.
Just install sympy on your computer. Then do the following code:
from sympy.abc import i, k, m, n, x
from sympy import Sum, factorial, oo, IndexedBase, Function
Sum((-1)**k/(2*k+1), (k, 0, oo)).doit()
Result will be: pi/4
I have worked in couple of Huge Data Series for Research purpose.
I used Matlab for that. I didn't know it can/can't process Infinite Series.
But I think there is a possibility.
U can try :)
This can be done in for instance sympy and sage (among open source alternatives) In the following, a few examples using sympy:
In [10]: summation(1/k**2,(k,1,oo))
Out[10]:
2
π
──
6
In [11]: summation(1/k**4, (k,1,oo))
Out[11]:
4
π
──
90
In [12]: summation( (-1)**k/k, (k,1,oo))
Out[12]: -log(2)
In [13]: summation( (-1)**(k+1)/k, (k,1,oo))
Out[13]: log(2)
Behind the scenes, this is using the theory for hypergeometric series, a nice introduction is the book "A=B" by Marko Petkovˇeks, Herbert S. Wilf
and Doron Zeilberger which you can find by googling. ¿What is a hypergeometric series?
Everybody knows what an geometric series is: $X_1, x_2, x_3, \dots, x_k, \dots $ is geometric if the contecutive terms ratio $x_{k+1}/x_k$ is constant. It is hypergeometric if the consecutive terms ratio is a rational function in $k$! sympy can handle basically all infinite sums where this last condition is fulfilled, but only very few others.
Related
I am trying to implement an R algortihm dealing with non-negative ODE Systems. I need something like ode45 in MATLAB to define states which have to be none-negative.
I discussed about that already 3 years ago but with no real solution. deSolve is still not the way to go. I found some python code which looks very promising. Maybe this is possible in R as well. In the end I have to define a function wraper, as functools in python. What it does is pretty simple. Here is the code the of the python wraper:
def wrap(f):
#wraps(f)
def wrapper(t, y, *args, **kwargs):
low = y < 0
y = np.maximum(y, np.ones(np.shape(y))*0)
result = f(t, y, *args, **kwargs)
result[too_low] = np.maximum(result[low], np.ones(low.sum())*0)
return result
return wrapper
return wrap
I mean in python this is straight forward. The wraper will be used in each step of the integration called by
solver = scipy.integrate.odeint(f, y0)
solution = solver.solve()
Is the same possible in R? I know there is a functools package and functools function, as well. But I have no clue if this really works. Can I use events in deSolve for that?
I am working now on this project for 5 years and I am out of ideas. I used an MATLAB, C++ and Python interface but all this is to slow, I need it in R. Thank you very much for your help!
deSolve does not support automatic non-negativity constraints for good reasons. We had such questions several times in the past, but it turned out in all these cases, that the reason of the negative value was an incomplete model specification. The typical case is that something is exported from an empty pool. Because unwanted negative values are usually an indicator of an inadequate model specification, we do (currently) not consider to add a "non-negative" constraint in the future.
Example: in the following equation, X can become negative by model design:
dX/dt = -k
whereas the following cannot:
dX/dt = -k * X
If you need a linear decrease "most of the time" that reduces to zero shortly before X becomes zero, you can use a Monod-type safeguard (or something similar):
dX/dt = -k*X / (k2 + X)
The selection of k2 is relatively uncritical. It should be small enough not to influence the overall behavior and not too small, compared to the numerical accuracy of the solver.
Another method to avoid negative values is to work in log-transformed space. Here are some related threads:
https://stat.ethz.ch/pipermail/r-sig-dynamic-models/2010q2/000028.html
https://stat.ethz.ch/pipermail/r-sig-dynamic-models/2013q3/000222.html
https://stat.ethz.ch/pipermail/r-sig-dynamic-models/2016/000437.html
In addition, it is of course also possible to write an own wrapper in R.
Hope it helps
I have a equation to solve. The equation can be described as the formula above. N and S are constants, for example N = 201 and S = 0.5. I use sympy in python to solve it. The python script is given as following:
from sympy import *
x=Symbol('x')
print solve( (((1-x)/200) **(1-x))* x**x - 2**(-0.5), x)
However, there is a RuntimeError: maximum recursion depth exceeded in __instancecheck__
I have also tried to use Mathematica, and it can output a result of 0.963
http://www.wolframalpha.com/input/?i=(((1-x)%2F200)+(1-x))*+xx+-+2**(-0.5)+%3D+0
Any suggestion is welcome. Thanks.
Assuming that you don't want a symbolic solution, just a value you can work with (like WA's 0.964), you can use mpmath for this. I'm not sure if it's actually possible to express the solution in radicals - WA certainly didn't even try. You should already have it installed as SymPy
Requires: mpmath
Specifically, mpmath.findroot seems to do what you want. It takes an actual callable Python object which is the function to find a root of, and a starting value for x. It also accepts some more parameters such as the minimum error tol and the solver to use which you could play around with, although they don't really seem necessary. You could quite simply use it like this:
import mpmath
f = lambda x: (((1-x)/200) **(1-x))* x**x - 2**(-0.5)
print mpmath.findroot(f, 1)
I just used 1 as a starting value - you could probably think of a better one. Judging by the shape of your graph, there's only one root to be found and it can be approached quite easily, without much need for fancy solvers, so this should suffice. Also, considering that "mpmath is a Python library for arbitrary-precision floating-point arithmetic", you should be able to get a very high precision answer from this if you wished. It has the output of
(0.963904761592753 + 0.0j)
This is actually an mpmath complex or mpc object,
mpc(real='0.96390476159275343', imag='0.0')
If you know it will have an imaginary value of 0, you can just use either of the following methods:
In [6]: abs(mpmath.mpc(23, 0))
Out[6]: mpf('23.0')
In [7]: mpmath.mpc(23, 0).real
Out[7]: mpf('23.0')
to "extract" a single float in the format of an mpf.
I am currently implementing this simple code trying to find the n-th element of the Fibonacci sequence using Python 2.7:
import numpy as np
def fib(n):
F = np.empty(n+2)
F[1] = 1
F[0] = 0
for i in range(2,n+1):
F[i]=F[i-1]+F[i-2]
return int(F[n])
This works fine for F < 79, but after that I get wrong numbers. For example, according to wolfram alpha F79 should be equal to 14472334024676221, but fib(100) gives me 14472334024676220. I think this could be caused by the way python deals with integers, but I have no idea what exactly the problem is. Any help is greatly appreciated!
the default data type for a numpy array is depending on architecture a 64 (or 32) bit int.
pure python would let you have arbitrarily long integers; numpy does not.
so it's more the way numpy deals with integers; pure python would do just fine.
Python will deal with integers perfectly fine here. Indeed, that is the beauty of python. numpy, on the other hand, introduces ugliness and just happens to be completely unnecessary, and will likely slow you down. Your implementation will also require much more space. Python allows you to write beautiful, readable code. Here is Raymond Hettinger's canonical implementation of iterative fibonacci in Python:
def fib(n):
x, y = 0, 1
for _ in range(n):
x, y = y, x + y
return x
That is O(n) time and constant space. It is beautiful, readable, and succinct. It will also give you the correct integer as long as you have memory to store the number on your machine. Learn to use numpy when it is the appropriate tool, and as importantly, learn to not use it when it is inappropriate.
Unless you want to generate a list with all the fibonacci numbers until Fn, there is no need to use a list, numpy or anything else like that, a simple loop and 2 variables will be enough as you only really need to know the 2 previous values
def fib(n):
Fk, Fk1 = 0, 1
for _ in range(n):
Fk, Fk1 = Fk1, Fk+Fk1
return Fk
of course, there is better ways to do it using the mathematical properties of the Fibonacci numbers, with those we know that there is a matrix that give us the right result
import numpy
def fib_matrix(n):
mat = numpy.matrix( [[1,1],[1,0]], dtype=object) ** n
return mat[0,1]
to which I assume they have an optimized matrix exponentiation making it more efficient that the previous method.
Using the properties of the underlying Lucas sequence is possible to do it without the matriz, and equally as efficient as exponentiation by squaring and with the same number of variables as the other, but that is a little harder to understand at first glance unlike the first example because alongside the second example it require more mathematical.
The close form, the one with the golden ratio, will give you the result even faster, but that have the risk of being inaccurate because the use of floating point arithmetic.
As an additional word to the previous answer by hiro protagonist, note that if using Numpy is a requirement, you can solve very easely your issue by replacing:
F = np.empty(n+2)
with
F = np.empty(n+2, dtype=object)
but it will not do anything more than transferring back the computation to pure Python.
This question already has answers here:
Which is faster in Python: x**.5 or math.sqrt(x)?
(15 answers)
Closed 9 years ago.
In my field it's very common to square some numbers, operate them together, and take the square root of the result. This is done in pythagorean theorem, and the RMS calculation, for example.
In numpy, I have done the following:
result = numpy.sqrt(numpy.sum(numpy.pow(some_vector, 2)))
And in pure python something like this would be expected:
result = math.sqrt(math.pow(A, 2) + math.pow(B,2)) # example with two dimensions.
However, I have been using this pure python form, since I find it much more compact, import-independent, and seemingly equivalent:
result = (A**2 + B**2)**0.5 # two dimensions
result = (A**2 + B**2 + C**2 + D**2)**0.5
I have heard some people argue that the ** operator is sort of a hack, and that squaring a number by exponentiating it by 0.5 is not so readable. But what I'd like to ask is if:
"Is there any COMPUTATIONAL reason to prefer the former two alternatives over the third one(s)?"
Thanks for reading!
math.sqrt is the C implementation of square root and is therefore different from using the ** operator which implements Python's built-in pow function. Thus, using math.sqrt actually gives a different answer than using the ** operator and there is indeed a computational reason to prefer numpy or math module implementation over the built-in. Specifically the sqrt functions are probably implemented in the most efficient way possible whereas ** operates over a large number of bases and exponents and is probably unoptimized for the specific case of square root. On the other hand, the built-in pow function handles a few extra cases like "complex numbers, unbounded integer powers, and modular exponentiation".
See this Stack Overflow question for more information on the difference between ** and math.sqrt.
In terms of which is more "Pythonic", I think we need to discuss the very definition of that word. From the official Python glossary, it states that a piece of code or idea is Pythonic if it "closely follows the most common idioms of the Python language, rather than implementing code using concepts common to other languages." In every single other language I can think of, there is some math module with basic square root functions. However there are languages that lack a power operator like ** e.g. C++. So ** is probably more Pythonic, but whether or not it's objectively better depends on the use case.
Even in base Python you can do the computation in generic form
result = sum(x**2 for x in some_vector) ** 0.5
x ** 2 is surely not an hack and the computation performed is the same (I checked with cpython source code). I actually find it more readable (and readability counts).
Using instead x ** 0.5 to take the square root doesn't do the exact same computations as math.sqrt as the former (probably) is computed using logarithms and the latter (probably) using the specific numeric instruction of the math processor.
I often use x ** 0.5 simply because I don't want to add math just for that. I'd expect however a specific instruction for the square root to work better (more accurately) than a multi-step operation with logarithms.
In attempting to use scipy's quad method to integrate a gaussian (lets say there's a gaussian method named gauss), I was having problems passing needed parameters to gauss and leaving quad to do the integration over the correct variable. Does anyone have a good example of how to use quad w/ a multidimensional function?
But this led me to a more grand question about the best way to integrate a gaussian in general. I didn't find a gaussian integrate in scipy (to my surprise). My plan was to write a simple gaussian function and pass it to quad (or maybe now a fixed width integrator). What would you do?
Edit: Fixed-width meaning something like trapz that uses a fixed dx to calculate areas under a curve.
What I've come to so far is a method make___gauss that returns a lambda function that can then go into quad. This way I can make a normal function with the average and variance I need before integrating.
def make_gauss(N, sigma, mu):
return (lambda x: N/(sigma * (2*numpy.pi)**.5) *
numpy.e ** (-(x-mu)**2/(2 * sigma**2)))
quad(make_gauss(N=10, sigma=2, mu=0), -inf, inf)
When I tried passing a general gaussian function (that needs to be called with x, N, mu, and sigma) and filling in some of the values using quad like
quad(gen_gauss, -inf, inf, (10,2,0))
the parameters 10, 2, and 0 did NOT necessarily match N=10, sigma=2, mu=0, which prompted the more extended definition.
The erf(z) in scipy.special would require me to define exactly what t is initially, but it nice to know it is there.
Okay, you appear to be pretty confused about several things. Let's start at the beginning: you mentioned a "multidimensional function", but then go on to discuss the usual one-variable Gaussian curve. This is not a multidimensional function: when you integrate it, you only integrate one variable (x). The distinction is important to make, because there is a monster called a "multivariate Gaussian distribution" which is a true multidimensional function and, if integrated, requires integrating over two or more variables (which uses the expensive Monte Carlo technique I mentioned before). But you seem to just be talking about the regular one-variable Gaussian, which is much easier to work with, integrate, and all that.
The one-variable Gaussian distribution has two parameters, sigma and mu, and is a function of a single variable we'll denote x. You also appear to be carrying around a normalization parameter n (which is useful in a couple of applications). Normalization parameters are usually not included in calculations, since you can just tack them back on at the end (remember, integration is a linear operator: int(n*f(x), x) = n*int(f(x), x) ). But we can carry it around if you like; the notation I like for a normal distribution is then
N(x | mu, sigma, n) := (n/(sigma*sqrt(2*pi))) * exp((-(x-mu)^2)/(2*sigma^2))
(read that as "the normal distribution of x given sigma, mu, and n is given by...") So far, so good; this matches the function you've got. Notice that the only true variable here is x: the other three parameters are fixed for any particular Gaussian.
Now for a mathematical fact: it is provably true that all Gaussian curves have the same shape, they're just shifted around a little bit. So we can work with N(x|0,1,1), called the "standard normal distribution", and just translate our results back to the general Gaussian curve. So if you have the integral of N(x|0,1,1), you can trivially calculate the integral of any Gaussian. This integral appears so frequently that it has a special name: the error function erf. Because of some old conventions, it's not exactly erf; there are a couple additive and multiplicative factors also being carried around.
If Phi(z) = integral(N(x|0,1,1), -inf, z); that is, Phi(z) is the integral of the standard normal distribution from minus infinity up to z, then it's true by the definition of the error function that
Phi(z) = 0.5 + 0.5 * erf(z / sqrt(2)).
Likewise, if Phi(z | mu, sigma, n) = integral( N(x|sigma, mu, n), -inf, z); that is, Phi(z | mu, sigma, n) is the integral of the normal distribution given parameters mu, sigma, and n from minus infinity up to z, then it's true by the definition of the error function that
Phi(z | mu, sigma, n) = (n/2) * (1 + erf((x - mu) / (sigma * sqrt(2)))).
Take a look at the Wikipedia article on the normal CDF if you want more detail or a proof of this fact.
Okay, that should be enough background explanation. Back to your (edited) post. You say "The erf(z) in scipy.special would require me to define exactly what t is initially". I have no idea what you mean by this; where does t (time?) enter into this at all? Hopefully the explanation above has demystified the error function a bit and it's clearer now as to why the error function is the right function for the job.
Your Python code is OK, but I would prefer a closure over a lambda:
def make_gauss(N, sigma, mu):
k = N / (sigma * math.sqrt(2*math.pi))
s = -1.0 / (2 * sigma * sigma)
def f(x):
return k * math.exp(s * (x - mu)*(x - mu))
return f
Using a closure enables precomputation of constants k and s, so the returned function will need to do less work each time it's called (which can be important if you're integrating it, which means it'll be called many times). Also, I have avoided any use of the exponentiation operator **, which is slower than just writing the squaring out, and hoisted the divide out of the inner loop and replaced it with a multiply. I haven't looked at all at their implementation in Python, but from my last time tuning an inner loop for pure speed using raw x87 assembly, I seem to remember that adds, subtracts, or multiplies take about 4 CPU cycles each, divides about 36, and exponentiation about 200. That was a couple years ago, so take those numbers with a grain of salt; still, it illustrates their relative complexity. As well, calculating exp(x) the brute-force way is a very bad idea; there are tricks you can take when writing a good implementation of exp(x) that make it significantly faster and more accurate than a general a**b style exponentiation.
I've never used the numpy version of the constants pi and e; I've always stuck with the plain old math module's versions. I don't know why you might prefer either one.
I'm not sure what you're going for with the quad() call. quad(gen_gauss, -inf, inf, (10,2,0)) ought to integrate a renormalized Gaussian from minus infinity to plus infinity, and should always spit out 10 (your normalization factor), since the Gaussian integrates to 1 over the real line. Any answer far from 10 (I wouldn't expect exactly 10 since quad() is only an approximation, after all) means something is screwed up somewhere... hard to say what's screwed up without knowing the actual return value and possibly the inner workings of quad().
Hopefully that has demystified some of the confusion, and explained why the error function is the right answer to your problem, as well as how to do it all yourself if you're curious. If any of my explanation wasn't clear, I suggest taking a quick look at Wikipedia first; if you still have questions, don't hesitate to ask.
scipy ships with the "error function", aka Gaussian integral:
import scipy.special
help(scipy.special.erf)
The gaussian distribution is also called a normal distribution. The cdf function in the scipy norm module does what you want.
from scipy.stats import norm
print norm.cdf(0.0)
>>>0.5
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html#scipy.stats.norm
Why not just always do your integration from -infinity to +infinity, so that you always know the answer? (joking!)
My guess is that the only reason that there's not already a canned Gaussian function in SciPy is that it's a trivial function to write. Your suggestion about writing your own function and passing it to quad to integrate sounds excellent. It uses the accepted SciPy tool for doing this, it's minimal code effort for you, and it's very readable for other people even if they've never seen SciPy.
What exactly do you mean by a fixed-width integrator? Do you mean using a different algorithm than whatever QUADPACK is using?
Edit: For completeness, here's something like what I'd try for a Gaussian with the mean of 0 and standard deviation of 1 from 0 to +infinity:
from scipy.integrate import quad
from math import pi, exp
mean = 0
sd = 1
quad(lambda x: 1 / ( sd * ( 2 * pi ) ** 0.5 ) * exp( x ** 2 / (-2 * sd ** 2) ), 0, inf )
That's a little ugly because the Gaussian function is a little long, but still pretty trivial to write.
I assume you're handling multivariate Gaussians; if so, SciPy already has the function you're looking for: it's called MVNDIST ("MultiVariate Normal DISTribution). The SciPy documentation is, as ever, terrible, so I can't even find where the function is buried, but it's in there somewhere. The documentation is easily the worst part of SciPy, and has frustrated me to no end in the past.
Single-variable Gaussians just use the good old error function, of which many implementations are available.
As for attacking the problem in general, yes, as James Thompson mentions, you just want to write your own gaussian distribution function and feed it to quad(). If you can avoid the generalized integration, though, it's a good idea to do so -- specialized integration techniques for a particular function (like MVNDIST uses) are going to be much faster than a standard Monte Carlo multidimensional integration, which can be extremely slow for high accuracy.