How can I fit my data to an asymptotic power law curve or an exponential approach curve in R or Python?
My data essentially shows that the y-value increases continuously, but the delta (the size of each increase) shrinks as x grows.
Any help will be much appreciated.
Using Python, if you have numpy and scipy installed, you could use curve_fit from the scipy.optimize module. It takes a user-defined function and x- as well as y-values (x_values and y_values in the code), and returns the optimized parameters and the covariance of the parameters.
import numpy
import scipy.optimize

# model: unconstrained exponential y = a * exp(b * x)
def exponential(x, a, b):
    return a * numpy.exp(b * x)

fit_data, covariance = scipy.optimize.curve_fit(exponential, x_values, y_values, p0=(1., 1.))
This answer assumes you have your data as a one-dimensional numpy array. You could easily convert your data into one, though.
The last argument contains starting values for the optimization. If you don't supply them, curve_fit may have trouble determining the number of parameters.
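The question also mentions an "exponential approach" curve, i.e. y rising toward a ceiling with a shrinking delta. For that shape a saturating model often fits better than an unbounded exponential. A minimal sketch, assuming the same x_values and y_values as above:

import numpy
from scipy.optimize import curve_fit

# saturating "exponential approach" model: rises toward the asymptote a
def approach(x, a, b):
    return a * (1.0 - numpy.exp(-b * x))

params, covariance = curve_fit(approach, x_values, y_values, p0=(1., 1.))

Here a is the asymptote the curve approaches and b controls how quickly it gets there.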
In MATLAB, binofit returns the maximum likelihood estimate of the success probability of a binomial distribution, along with confidence intervals.
statsmodels.stats.proportion.proportion_confint returns confidence intervals as well, but I couldn't find a function for the maximum likelihood estimate of the binomial success probability. Is there a Python function you can suggest that is equivalent to MATLAB's binofit?
I think the function you suggested is good enough. I ran some tests comparing MATLAB's binofit with Python's statsmodels.stats.proportion.proportion_confint. The test was empirical: roughly 100K experiments comparing [phat,pci] = binofit(x,n,alpha) with min_conf,max_conf = proportion_confint(x,n,alpha=alpha,method='beta').
The RMSE between the confidence interval limits from MATLAB and Python stays below 5e-6 for values of x and n between 0 and 10000. Tested with alpha=0.05 and 0.01.
I know this is not a strict demonstration, but for my project I decided to treat the two estimates of the confidence intervals as equivalent.
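For the point estimate itself, the maximum likelihood estimate of a binomial success probability is simply x/n, so a binofit-like helper is easy to put together. A minimal sketch (the wrapper and its name are mine, not a library function), based on the comparison above:

from statsmodels.stats.proportion import proportion_confint

def binofit_py(x, n, alpha=0.05):
    """Rough Python equivalent of MATLAB's binofit."""
    phat = x / n  # the binomial MLE is just the observed proportion
    # method='beta' gives the Clopper-Pearson interval, the one that
    # matched MATLAB's output in the tests above
    pci = proportion_confint(x, n, alpha=alpha, method='beta')
    return phat, pci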
Try using one of these two libraries: statsmodels or scipy.
I do not know if it is exactly what you're looking for, but I hope you find it useful still.
I have been trying to find a way to fit some of my columns (which contain user click data) to a Poisson distribution in Python. These columns (e.g., click_website_1, click_website_2) may contain values ranging from 1 to the thousands. I am trying to do this because it is recommended by some resources:
We recommend that count data should not be analysed by
log-transforming it, but instead models based on Poisson and negative
binomial distributions should be used.
I found some methods in scipy and numpy, but those seem to generate random numbers that follow a Poisson distribution. What I am interested in, however, is fitting my own data to a Poisson distribution. Any library suggestions for doing this in Python?
Here is a quick way to check whether your data follows a Poisson distribution: plot it under the assumption that it follows a Poisson distribution with rate parameter lambda = data.mean().
import numpy as np
from scipy.special import factorial  # scipy.misc.factorial was removed in newer scipy

def poisson(k, lamb):
    """Poisson pmf; lamb is the fitted rate parameter."""
    return (lamb**k / factorial(k)) * np.exp(-lamb)

# collect the clicks, since we are going to need them later
clicks = df["click_website_1"]
Here we use the pmf of the Poisson distribution. Now let's do some modeling: from the data (click_website_1) we'll estimate the Poisson parameter using the MLE, which turns out to be just the mean.
lamb = clicks.mean()  # MLE of the Poisson rate parameter

# plot the pmf, using lamb as an estimate for `lambda`;
# sort the counts first so the curve is monotone in k
clicks.sort_values().apply(poisson, args=(lamb,)).plot()
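To go one step beyond eyeballing the plotted curve, you could compare the empirical frequencies against the fitted pmf directly. A minimal sketch, assuming clicks holds non-negative integer counts and reusing the poisson function from above:

import numpy as np

counts = clicks.to_numpy().astype(int)
k = np.arange(counts.max() + 1)

# observed relative frequency of each count value
empirical = np.bincount(counts, minlength=k.size) / counts.size

# fitted Poisson pmf evaluated on the same support
fitted = poisson(k, counts.mean())

Large gaps between empirical and fitted would suggest the Poisson assumption does not hold; overdispersed click data often fits a negative binomial better, as the recommendation quoted in the question hints.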
I wanted to apply one of the minimization methods within scipy.optimize.minimize to a function which may not always provide smooth derivatives. I've gotten comfortable with the Nelder-Mead implementation of the Simplex method, but it does not appear to accept the bounds argument: (...,bounds=[xmin, xmax],...). Reading the documentation, it seems only the L-BFGS-B, TNC and SLSQP methods accept bounds, and all three of those are based in some way on Newton's method and will either calculate a numerical derivative or accept one.
I don't know the exact term, but I'm looking for a 'Simplex-like' or derivative-free method in scipy that accepts bounds but is also forgiving of functions that do not provide a smooth derivative (one example being staircase-like behavior). For now I'm working in 1-D; later I may add dimensions, but that's not critical right now.
I would give lmfit a try (http://cars9.uchicago.edu/software/python/lmfit/).
While not part of scipy, it is built on top of it and offers bounded minimization. I use it for curve fitting and parameter extraction. Nevertheless, I can't say how it would perform on your specific function.
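A minimal sketch of what a bounded, derivative-free fit could look like with lmfit (the objective function and bound values are placeholders, and the exact API may differ between lmfit versions):

from lmfit import Parameters, minimize

def objective(params):
    x = params['x'].value
    return bumpy_function(x)  # hypothetical non-smooth 1-D objective

params = Parameters()
# bounds are attached to the parameter itself rather than to the solver
params.add('x', value=0.5, min=0.0, max=1.0)

# 'nelder' selects the derivative-free Nelder-Mead simplex method
result = minimize(objective, params, method='nelder')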
I'm new to Python. I have a DataFrame which contains annual records from 1959 to 2009. Could you please tell me how to use it to predict values for, say, 2010 to 2012?
I'd appreciate any help!
First of all, plot your data and look at it. You will then have a feeling for what's going on, and a subjective prediction of your own.
If your data seems to be completely random, without any obvious trends, calculate its average and use that as a first-guess prediction. (For fully random data, that is what linear regression will give you as well.)
You can then use linear regression, either with pandas' ols regression tools or numpy's polyfit. Make sure you plot your data together with the regression line to actually see how well your prediction is doing.
And don't expect miracles from this method. Complicated processes are much harder to predict than a straight line, and 50-year-long processes, whatever they are, are usually complicated enough.
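A minimal sketch of the polyfit route, assuming the DataFrame is indexed by year and "value" is a placeholder for your own column name:

import numpy as np

years = df.index.to_numpy(dtype=float)   # 1959 .. 2009
values = df["value"].to_numpy()          # hypothetical column name

# fit a straight line (degree-1 polynomial) to the history
slope, intercept = np.polyfit(years, values, 1)

# extrapolate the trend to 2010-2012
future = np.arange(2010, 2013)
forecast = slope * future + intercept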
The linked plot (http://db.tt/9SG85XFK) shows a pandas DataFrame with a 'timestamp' index and two variables (plotted as blue and green curves).
I would like to extract the subsets of that DataFrame over which the blue variable is more or less constant (standard deviation below a specific value, perhaps?).
For the attached plot that would extract 3 different subsets, roughly (41000:41170, 41180:41315, and 41320:41580).
Is there a clean way to do this? I could do it through a loop, but ... not sure it's the right way.
Thanks,
N
You probably want the rolling standard deviation: pd.rolling_std in older pandas, or Series.rolling(...).std() in current versions.
Specify the width of the window over which you want to check the standard deviation (say, 100 data points), choose a threshold for the standard deviation (say, 10), and do:
import pandas as pd

s = pd.Series(...)  # however you load your data

# rolling standard deviation over a 100-point window
std = s.rolling(100).std()

# keep only the points whose local variation is below the threshold
selected = s[std < 10]
And you will get all the data points whose standard deviation over the surrounding 100-point window is less than 10.
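To get the separate subsets the question asks for, rather than one flattened selection, you could additionally split the mask into contiguous runs. A minimal sketch building on s and std from above:

# True where the curve is locally flat; the NaNs from the warm-up
# window compare as False and are excluded automatically
mask = std < 10

# the run id increments every time the mask flips, so each flat
# stretch of the curve shares a single id
run_id = (mask != mask.shift()).cumsum()

# collect the flat stretches as a list of Series
segments = [grp for (is_flat, _), grp in s.groupby([mask, run_id]) if is_flat]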