In MATLAB, binofit returns the maximum likelihood estimate of the success probability of a binomial distribution, together with confidence intervals.
statsmodels.stats.proportion.proportion_confint returns confidence intervals as well, but I couldn't find a function for the maximum likelihood estimate of the binomial success probability. Is there a Python function you can suggest as an equivalent of MATLAB's binofit?
I think the function you suggested is good enough. I ran some tests comparing MATLAB's binofit with Python's statsmodels.stats.proportion.proportion_confint. The test was empirical: about 100K experiments comparing [phat,pci] = binofit(x,n,alpha) with min_conf,max_conf = proportion_confint(x,n,alpha=alpha,method='beta').
The RMSE between the confidence interval limits from MATLAB and Python is below 5e-6 for values of x and n between 0 and 10000, tested with alpha=0.05 and 0.01.
I know this is not a strict demonstration, but for my project I decided to treat the two confidence interval estimates as equivalent.
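As for the maximum likelihood estimate itself, here is a minimal sketch of a binofit-style helper (the wrapper name binofit_py and the example numbers are my own): the MLE of the binomial success probability is simply x/n, and proportion_confint with method='beta' gives the interval used in the comparison above.

from statsmodels.stats.proportion import proportion_confint

def binofit_py(x, n, alpha=0.05):
    """Return (phat, (ci_low, ci_high)), mirroring MATLAB's binofit."""
    phat = x / n  # MLE of the binomial success probability
    ci_low, ci_high = proportion_confint(x, n, alpha=alpha, method='beta')
    return phat, (ci_low, ci_high)

phat, pci = binofit_py(7, 20)  # e.g. 7 successes in 20 trials
print(phat, pci)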
Try using one of these two libraries: statsmodels or scipy.
I do not know if it is exactly what you're looking for, but I hope you find it useful nonetheless.
My goal is to display a non-decreasing stepwise curve F. For each x-axis point, computing the corresponding F(x) is computationally expensive, so discretizing the x-axis and computing F(x) at every point may take a lot of time.
My idea is to compute the curve by dichotomy.
Starting with 0 and the end-point of the x-axis (say 100).
If F(0)=F(100), then F is constant.
Else F is not constant and I compute F(50).
If F(50) = F(0), then F is constant on [0,50] and I compute F(75) and so on.
Else if F(50) = F(100), then F is constant on [50,100] and I compute F(25) and so on.
Else I compute F(25) and F(75) and so on.
Is there any Python library which would be useful to implement such an algorithm?
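For illustration, a minimal sketch of the dichotomy described above in plain Python; the step function F, the interval [0, 100], and the stopping tolerance are placeholders to adapt.

def sample_step_function(F, lo, hi, tol=1.0):
    """Recursively find intervals on which the non-decreasing step function F
    is constant, evaluating F as few times as possible."""
    cache = {}
    def f(x):
        if x not in cache:
            cache[x] = F(x)      # F is expensive, so memoize every evaluation
        return cache[x]
    segments = []
    def recurse(a, b):
        if f(a) == f(b):
            segments.append((a, b, f(a)))   # F is constant on [a, b]
        elif b - a <= tol:
            segments.append((a, b, f(a)))   # jump located to within tol
        else:
            mid = (a + b) / 2
            recurse(a, mid)
            recurse(mid, b)
    recurse(lo, hi)
    return segments

# Example with a hypothetical step function on [0, 100]:
segments = sample_step_function(lambda x: int(x // 30), 0, 100)
print(segments)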
I have a system which is configurable. It has 3 parameters
Parameter1 - can vary between [0, 2^30]
Parameter2 - can vary between [0, 2^30]
Parameter3 - can vary between [0, 2^12]
I have Python code which, when given a set of valid numbers corresponding to (parameter1, parameter2, parameter3), configures the system and, after a few minutes, returns a number, say a score, for that configuration.
Suppose the aim is to maximize the score. Is there a Python library that can intelligently generate parameter values within these constraints?
Thanks in advance
This is a bounded optimization problem. Depending on how the "system" is structured, different optimization strategies will perform differently well.
My suggestion is to look at either scipy.optimize.minimize or a genetic algorithm. I know there are multiple GA implementations in Python, but I haven't tried them.
Since a single evaluation takes on the order of minutes, you are looking at long optimization times though.
If you can compute the gradient of your score function, and if it is well-behaved (i.e. convex), then scipy.optimize.minimize is probably your best bet. Otherwise you might have better luck with a GA.
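As a minimal sketch of the derivative-free, bounded route, here is how scipy.optimize.differential_evolution (a GA-like evolutionary method) could be wired up; evaluate_system is a hypothetical placeholder for the real configuration-and-score call.

import numpy as np
from scipy.optimize import differential_evolution

def evaluate_system(p1, p2, p3):
    # Hypothetical stand-in: replace with the real call that configures the
    # system and returns its score after a few minutes.
    return -((p1 - 1000) ** 2 + (p2 - 2000) ** 2 + (p3 - 100) ** 2)

def objective(params):
    p1, p2, p3 = np.round(params).astype(np.int64)  # parameters are integers
    return -evaluate_system(p1, p2, p3)             # negate: we minimize

bounds = [(0, 2 ** 30), (0, 2 ** 30), (0, 2 ** 12)]

# Keep the population and iteration budget small, since every real
# evaluation takes minutes.
result = differential_evolution(objective, bounds, maxiter=20, popsize=8, tol=1e-3)
print(result.x, -result.fun)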
I want to forecast upcoming total users on a daily basis in Python using a machine learning algorithm. Check the pattern below:
Looking at this graph, does anyone know which forecasting method in Python I should use to predict future values?
Thanks!
If you have no additional data except the user counts over time that you have shown, the only thing you can do is try to find a function of time that approximates the plot well (ordinary curve fitting). I suppose that's not what you want.
To make a prediction (which can be done with approaches other than machine learning), you need other data that is somehow correlated with the data you want to predict.
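For completeness, a minimal sketch of the ordinary curve fitting mentioned above with scipy.optimize.curve_fit; the daily series and the linear trend function are purely illustrative assumptions.

import numpy as np
from scipy.optimize import curve_fit

days = np.arange(30)                                  # hypothetical: 30 days of history
users = 50 + 3.2 * days + np.random.normal(0, 5, size=days.size)

def trend(t, a, b):
    """Simple linear trend; swap in any function that matches the observed pattern."""
    return a + b * t

params, _ = curve_fit(trend, days, users)      # fit a, b to the history
forecast = trend(np.arange(30, 37), *params)   # extrapolate the next 7 days
print(forecast)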
I have been trying to find a way to fit some of my columns (which contain user click data) to a Poisson distribution in Python. These columns (e.g., click_website_1, click_website_2) may contain values ranging from 1 to thousands. I am trying to do this because it is recommended by some resources:
We recommend that count data should not be analysed by
log-transforming it, but instead models based on Poisson and negative
binomial distributions should be used.
I found some methods in scipy and numpy, but they seem to generate random numbers that follow a Poisson distribution. What I am interested in is fitting my own data to a Poisson distribution. Any library suggestions for doing this in Python?
Here is a quick way to check whether your data follows a Poisson distribution: plot the Poisson pmf under the assumption that the rate parameter is lambda = data.mean().
import numpy as np
from scipy.special import factorial  # scipy.misc.factorial has been removed

def poisson(k, lamb):
    """Poisson pmf; lamb is the fitted rate parameter."""
    return (lamb**k / factorial(k)) * np.exp(-lamb)

# Collect the clicks column since we are going to need it later.
clicks = df["click_website_1"]
Here we use the pmf of the Poisson distribution.
Now let's do some modeling: from the data (click_website_1) we'll estimate the Poisson parameter using the MLE, which turns out to be just the mean.
lamb = clicks.mean()

# Plot the pmf using lamb as an estimate for `lambda`.
# Let's sort the values in the column first.
clicks.sort_values().apply(poisson, args=(lamb,)).plot()
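To go a step further, a hedged sketch of comparing the empirical click distribution against the fitted pmf on the same axes (the column name and plotting choices are assumptions):

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson as poisson_dist

counts = clicks.value_counts(normalize=True).sort_index()   # empirical pmf of the data
k = np.arange(counts.index.min(), counts.index.max() + 1)

plt.bar(counts.index, counts.values, alpha=0.5, label="data")
plt.plot(k, poisson_dist.pmf(k, clicks.mean()), "r-", label="Poisson fit (lambda = mean)")
plt.legend()
plt.show()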
I'm very confused as to what np.exp() actually does. In the documentation it says that it: "Calculates the exponential of all elements in the input array." I'm confused as to what exactly this means. Could someone give me more information to what it actually does?
The exponential function is e^x where e is a mathematical constant called Euler's number, approximately 2.718281. This value has a close mathematical relationship with pi and the slope of the curve e^x is equal to its value at every point. np.exp() calculates e^x for each value of x in your input array.
It calculates e^x for each x in your list, where e is Euler's number (approximately 2.718). In other words, np.exp(range(5)) is similar to [math.e**x for x in range(5)].
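A quick check of that equivalence (purely illustrative):

import math
import numpy as np

print(np.exp(range(5)))
print([math.e ** x for x in range(5)])
# Both give [1.0, 2.71828..., 7.38905..., 20.08553..., 54.59815...]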
exp(x) = e^x, where e = 2.718281 (approx.).
In Python we can use the exp function from numpy (docs):
import numpy as np

ar = np.array([1, 2, 3])
ar = np.exp(ar)
print(ar)
outputs:
[ 2.71828183 7.3890561 20.08553692]