I need to perform double integration using MCMC method. I have already done it using romberg and doublequad integrations with correct results. I need to also use MCMC integration to compare the results. I found it difficult to understand PyMC.
The outline is this: I have some timeseries data and I need to find out which distribution fits it. I have a set of equations that tells me what to do that involves Double Integration.
Hoping for some guidance.
I'd suggest you start with a simple direct sampling MC and do a trivial 2D integral for which you can obtain the answer by paper and pencil. Then move on to a MCMC for the same integral.
Related
I'm trying to perform a GP regression with linear operators as described in for example this paper by Särkkä: https://users.aalto.fi/~ssarkka/pub/spde.pdf In this example we can see from equation (8) that I need a different kernel function for the four covariance blocks (of training and test data) in the complete covariance matrix.
This is definitely possible and valid, but I would like to include this in a kernel definition of (preferably) GPflow, or GPytorch, GPy or the like.
However, in the documentation for kernel design in Gpflow, the only possibility is to define a covariance function that acts on all covariance blocks. In principle, the method above should be straight-forward to add myself (the kernel function expressions can be derived analytically), but I don't see any way of incorporating the 'heterogeneous' kernel functions into the regression or kernel classes. I tried to consult other packages such as Gpytorch and Gpy, but again, the kernel design does not seem to allow this.
Maybe I'm missing something here, maybe I'm not familiar enough with the underlying implementation to asses this, but if someone has done this before or sees the (what should be reasonably straight-forward?) implementation possibility, I would be happy to find out.
Thank you very much in advance for your answer!
Kind regards
This should be reasonably straightforward, though requires building a custom kernel. Basically, you need a kernel that can know for each input what the linear operator for the corresponding output is (whether this is a function observation/identity operator, integral observation, derivative observation, etc). You can achieve this by including an extra column in your input matrix X, similar to how it's done for the gpflow.kernels.Coregion kernel (see this notebook). You would need to then need to define a new kernel with K and K_diag methods that for each linear operator type find the corresponding rows in the input matrix, and pass it to the appropriate covariance function (using tf.dynamic_partition and tf.dynamic_stitch, this is used in a very similar way in GPflow's SwitchedLikelihood class).
The full implementation would probably take half a day or so, which is beyond what I can do here, but I hope this is a useful starting pointer, and you're very welcome to join the GPflow slack (invite link in the GPflow README) and discuss it in more detail there!
I have two questions.
1) I have an array like [1,2,3,4,5,5,3,1]. and I don't know which distributions it is. Can I use scipy.stats to calculate pmf,cdf automatically?
2)scipy.stats is just like a library of distributions? If I want to analysis data, I have to find one distributions or define one? I need to manually calculate some of data, like pmf. Am I understanding correctly?
Well, scipy.stats is not a library for telling you the distribution of data and calculating pmf and cdf automatically. Its a library for easing your tasks while estimating the probabily distribution. You have to explore your data and find which distribution which fits your data with least error ,which is the ultimate task and scipy.stats helps you achieving this.... you don't have to reinvent the wheel as they say by writing all the mathematical functions again and again.
Well, to answer your question in the comment, lets suppose you have a dataset, to get an insight and get a starting point for your analysis, what you'll do is plot the data in a historgram (which also shows distribution of data), now you can plot different distributions on the same plot using scipy.stats to get a feel of bestfit....
Check out this answer, it might help ya...
https://stats.stackexchange.com/questions/132652/how-to-determine-which-distribution-fits-my-data-best
So, I have the following data I've plotted in Python.
The data is input for a forcing term in a system of differential equations I am working with. Thus, I need to fit a continuous function to this data so I will not have to deal with stability issues that could come with discontinuities of a step-wise function. Unfortunately, it's a pretty large data set.
I am trying to end up with a fitted function that is possible and not too tedious to translate into Stan, the language that I am coding the differential equations in, so was preferring something in piece-wise polynomial form with a maximum of just a few pieces that I can manually code.
I started off with polyfit from numpy, which was not very good. Using UnivariateSpline from scipy gave me a decent fit, but it did not give me something that looked tractable for translation into Stan. Hence, I was looking for suggestions into other fits I could try that would return functions that are more easily translatable into other languages? Looking at the shape of my data, is there a periodic spline fit that could be useful?
The UnivariateSpline object has get_knots and get_coeffs methods. They give you the knots and coefficients of the fit in the b-spline basis.
An alternative, equivalent, way is to use splrep for fitting (and splev for evaluations).
To convert to a piecewise polynomial representation, use PPoly.from_spline (check the docs for the latter for the exact format)
If what you want is a Fourier space representation, you can use leastsq or least_squares. It'd be essential to provide sensible starting values for NLSQ fit parameters. At least I'd start from e.g. max-to-max distance estimate for the period and max-to-min estimate for the amplitude.
As always with non-linear fitting, YMMV, however.
From the direction field, it seems that a fit involving the sum of or composition of multiple sinusoidal functions might be it.
Ex: sin(cos(2x)), sin(x)+2cos(x), etc.
I would use Wolfram Alpha, Mathematica, or Matlab to create direction fields.
Following from this question, is there a way to use any method other than MLE (maximum-likelihood estimation) for fitting a continuous distribution in scipy? I think that my data may be resulting in the MLE method diverging, so I want to try using the method of moments instead, but I can't find out how to do it in scipy. Specifically, I'm expecting to find something like
scipy.stats.genextreme.fit(data, method=method_of_moments)
Does anyone know if this is possible, and if so how to do it?
Few things to mention:
1) scipy does not have support for GMM. There is some support for GMM via statsmodels (http://statsmodels.sourceforge.net/stable/gmm.html), you can also access many R routines via Rpy2 (and R is bound to have every flavour of GMM ever invented): http://rpy.sourceforge.net/rpy2/doc-2.1/html/index.html
2) Regarding stability of convergence, if this is the issue, then probably your problem is not with the objective being maximised (eg. likelihood, as opposed to a generalised moment) but with the optimiser. Gradient optimisers can be really fussy (or rather, the problems we give them are not really suited for gradient optimisation, leading to poor convergence).
If statsmodels and Rpy do not give you the routine you need, it is perhaps a good idea to write out your moment computation out verbose, and see how you can maximise it yourself - perhaps a custom-made little tool would work well for you?
I'm trying to call upon the famous multilateration algorithm in order to pinpoint a radiation emission source given a set of arrival times for various detectors. I have the necessary data, but I'm still having trouble implementing this calculation; I am relatively new with Python.
I know that, if I were to do this by hand, I would use matrices and carry out elementary row operations in order to find my 3 unknowns (x,y,z), but I'm not sure how to code this. Is there a way to have Python implement ERO, or is there a better way to carry out my computation?
Depending on your needs, you could try:
NumPy if your interested in numerical solutions. As far as I remember, it could solve linear equations. Don't know how it deals with non-linear resolution.
SymPy for symbolic math. It solves symbolically linear equations ... according to their main page.
The two above are "generic" math packages. I doubt you will find (easily) any dedicated (and maintained) library for your specific need. Their was already a question on that topic here: Multilateration of GPS Coordinates