Gaussian fit in Python plot - python

I am trying to fit Gaussian function to my Python plot. I have attached the code here. Any corrections would be appreciated!
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import math
import random
from numpy import genfromtxt
data= genfromtxt ('PVC_Cs137.txt')
plt.xlim(0,2500)
plt.ylim(0,30000)
plt.xlabel("Channel number")
plt.ylabel("Counts")
x = data[:,0]
y = data[:,1]
n = len(x)
mean = sum(x*y)/n
sigma = sum(y*(x-mean)**2)/n
def gaus(x,a,x0,sigma):
return a*exp(-(x-x0)**2/(2*sigma**2))
popt,pcov = curve_fit(gaus,x,y,p0=[1,mean,sigma])
plt.plot(x,gaus(x,*popt))
plt.show()
And here is the link to my file:
https://www.dropbox.com/s/hrqjr2jgfsjs55x/PVC_Cs137.txt?dl=0

There are two problems with your approach. One is related to programming. The gauss fit function has to work with a numpy array. math functions can't provide this functionality, they work with scalars. Therefore your fit functions should look like this
def gauss(x, a, x0, sigma):
return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))
This produces with the right mean/sigma combination a Gauss curve like this
And now we look at the distribution of the values from your file:
This doesn't even vaguely look like a Gauss curve. No wonder that the fit function doesn't converge.
Actually there is a third problem, your calculation of mean/sigma is wrong, but since you can't fit your data to a Gaussian distribution, we can neglect this problem for now.

Related

Writing lognormal function in python

I'm trying to write an inverse lognormal function in python:
import numpy as np
import scipy.stats as sp
from scipy.optimize import curve_fit
def lognorm1(x,s,scale):
ANS = sp.lognorm(s,scale=scale).ppf(x)
return ANS
curve_fit(lognorm1,x,y)
I have no troubles fitting the curve, however the scale paramater is the exponential of what LOGNORM.INV function is on excel. I know I can just log the scale parameter at the end, but is there anyway to rewrite the function so that I don't have to do this everytime?
Indeed, SciPy documentation says
A common parametrization for a lognormal random variable Y is in terms of the mean, mu, and standard deviation, sigma, of the unique normally distributed random variable X such that exp(X) = Y. This parametrization corresponds to setting s = sigma and scale = exp(mu).
So let's set it as such:
def lognorm1(x, mu, sigma):
ANS = sp.lognorm(s=sigma, scale=exp(mu)).ppf(x)
return ANS
curve_fit(lognorm1, x, y)
Now the parameters returned by curve_fit have the meaning of the mean and standard deviation of the underlying normal distribution.

How to convert Uniform normality variables in python?

assign x~U(a,b)
get a uniform distribution array:
x_U=uniform(a,b,1000)
There is a normality distribution:
y~N(μ,σ)
I want get array y_N which is correspondently related to x_U elements.
How to carry out in python? It looks like easy in matlab. Such as this link explainnation.
The code as follows is Normality convert to Uniform:
from numpy.random import *
import matplotlib.pyplot as plt
a = normal(25,5.4,1000)
hist_N = plt.hist(a,bins=20,normed=True)
a_cum = np.cumsum(a)
hist_U = plt.hist(a_cum,bins=20,normed=True)
a_cum is uniform correspondent related to a elements
Generating unifrom random number will be applied to Monto-Carlo simulation.But the original parameter is normality distribution.So it needs convertion. My purpose is to inverse above coding process.
If I follow the link in your question, it tells me what to do. I am not sure that the erfinv does, but this code seems to convert a random uniform array to a gaussian shaped array:
import matplotlib.pyplot as plt
import numpy as np
from scipy.special import erfinv
X = np.random.uniform(0,1,1000)
Gauss = lambda x, mu, sigma: mu + np.sqrt(2)*sigma*erfinv(2*X-1)
plt.hist(Gauss(X, 1, 0.2), bins = 20)
plt.show()
Gauss is here a function, made with the lambda statement, which basically works the same as defining a function with def. The function I used is the one that was in your link.
The gaussian shape looks like
and the uniform shape like
.

Fitting a Gaussian to a set of x,y data

Firstly this is an assignment I've been set so I'm only after pointers, and I am restricted to using the following libraries, NumPy, SciPy and MatPlotLib.
We have been given a txt file which includes x and y data for a resonance experiment and have to fit both a gaussian and lorentzian fit. I'm working on the gaussian fit at the minute and have tried following the code laid out in a previous question as a basis for my own code. (Gaussian fit for Python)
from numpy import *
from matplotlib import *
import matplotlib.pyplot as plt
import pylab
from scipy.optimize import curve_fit
energy, intensity = numpy.loadtxt('resonance_data.txt', unpack=True)
n = size(energy)
mean = 30.7
sigma = 10
intensity0 = 45
def gaus(energy, intensity0, energy0, sigma):
return intensity0 * exp(-(energy - energy0)**2 / (sigma**2))
popt, pcov = curve_fit(gaus, energy, intensity, p0=[45, mean, sigma])
plt.plot(energy, intensity, 'o')
plt.xlabel('Energy/eV')
plt.ylabel('Intensity')
plt.title('Plot of Intensity against Energy')
plt.plot(energy, gaus(energy, *popt))
plt.show()
Which returns the following graph
If I keep the expressions for mean and sigma, as in the url posted the curve fit is a horizontal line, so I'm guessing the problem lies in the curve fit not converging or something.
Looks like your data skews heavily to the left, why Gaussian? Not Boltzmann, Log-Normal, or anything else?
Much of these are already implemented in scipy.stats. See scipy.stats.cauchy for lorentzian and scipy.stats.normal gaussian. An example:
import scipy.stats as ss
A=ss.norm.rvs(0, 5, size=(100)) #Generate a random variable of 100 elements, with expected mean=0, std=5
ss.norm.fit_loc_scale(A) #fit both the mean and std
(-0.13053732553697531, 5.163322485150271) #your number will vary.
And I think you don't need the intensity0 parameter, it is just going to be 1/sigma/srqt(2*pi), because the density function has to sum up to 1.

How to make this matplotlib plot less noisy?

How can I plot the following noisy data with a smooth, continuous line without considering each individual value? I would like to only show the behavior in a nicer way, without caring about noisy and extreme values. This is the code I am using:
import numpy
import sys
import matplotlib.pyplot as plt
from scipy.interpolate import spline
dataset = numpy.genfromtxt(fname='data', delimiter=",")
dic = {}
for d in dataset:
dic[d[0]] = d[1]
plt.plot(range(len(dic)), dic.values(),linestyle='-', linewidth=2)
plt.savefig('plot.png')
plt.show()
In a previous answer, I was introduced to the Savitzky Golay filter, a particular type of low-pass filter, well adapted for data smoothing. How "smooth" you want your resulting curve to be is a matter of preference, and this can be adjusted by both the window-size and the order of the interpolating polynomial. Using the cookbook example for sg_filter:
import numpy as np
import sg_filter
import matplotlib.pyplot as plt
# Generate some sample data similar to your post
X = np.arange(1,1000,1)
Y = np.log(X**3) + 10*np.random.random(X.shape)
Y2 = sg_filter.savitzky_golay(Y, 101, 3)
plt.plot(X,Y,linestyle='-', linewidth=2,alpha=.5)
plt.plot(X,Y2,color='r')
plt.show()
There is more than one way to do it!
Here I show how to reduce noise using a variety of techniques:
Moving average
LOWESS regression
Low pass filter
Interpolation
Sticking with #Hooked example data for consistency:
import numpy as np
import matplotlib.pyplot as plt
X = np.arange(1, 1000, 1)
Y = np.log(X ** 3) + 10 * np.random.random(X.shape)
plt.plot(X, Y, alpha = .5)
plt.show()
Moving average
Sometimes all you need is a moving average.
For example, using pandas with a window size of 100:
import pandas as pd
df = pd.DataFrame(Y, X)
df_mva = df.rolling(100).mean() # moving average with a window size of 100
df_mva.plot(legend = False);
You will probably have to try several window sizes with your data. Note that the first 100 values of df_mva will be NaN but these can be removed with the dropna method.
Usage details for the pandas rolling function.
LOWESS regression
I've used LOWESS (Locally Weighted Scatterplot Smoothing) successfully to remove noise from repeated measures datasets. More information on local regression methods, including LOWESS and LOESS, here. It's a simple method with only one parameter to tune which in my experience gives good results.
Here is how to apply the LOWESS technique using the statsmodels implementation:
import statsmodels.api as sm
y_lowess = sm.nonparametric.lowess(Y, X, frac = 0.3) # 30 % lowess smoothing
plt.plot(y_lowess[:, 0], y_lowess[:, 1]) # some noise removed
plt.show()
It may be necessary to vary the frac parameter, which is the fraction of the data used when estimating each y value. Increase the frac value to increase the amount of smoothing. The frac value must be between 0 and 1.
Further details on statsmodels lowess usage.
Low pass filter
Scipy provides a set of low pass filters which may be appropriate.
After application of the lfiter:
from scipy.signal import lfilter
n = 50 # larger n gives smoother curves
b = [1.0 / n] * n # numerator coefficients
a = 1 # denominator coefficient
y_lf = lfilter(b, a, Y)
plt.plot(X, y_lf)
plt.show()
Check scipy lfilter documentation for implementation details regarding how numerator and denominator coefficients are used in the difference equations.
There are other filters in the scipy.signal package.
Interpolation
Finally, here is an example of radial basis function interpolation:
from scipy.interpolate import Rbf
rbf = Rbf(X, Y, function = 'multiquadric', smooth = 500)
y_rbf = rbf(X)
plt.plot(X, y_rbf)
plt.show()
Smoother approximation can be achieved by increasing the smooth parameter. Alternative function parameters to consider include 'cubic' and 'thin_plate'. When considering the function value, I usually try 'thin_plate' first followed by 'cubic'; however both 'thin_plate' and 'cubic' seemed to struggle with the noise in this dataset.
Check other Rbf options in the scipy docs. Scipy provides other univariate and multivariate interpolation techniques (see this tutorial).

Having trouble while using scipy.integrate.odeint with python

I was trying to use odeint to solve a problem. My code is as below:
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
eta=1.24e-9/2
def fun(x):
f=1.05e-8*eta*x**(1.5)*np.exp(13.6/x)
return (np.sqrt(1.+4*f)-1)/2./f
x=np.arange(0,1,0.001)
y=odeint(fun,x,0)[0]
plt.plot(x,y)
plt.plot(x,x)
plt.show()
It the two curves are the same, which is obviously wrong. If I plot the function, it will looks like a step function, which is very very small before about 0.3 and exponentially goes to 1. Can you help me figure out what's wrong with it? Thank you!
There are several problems with your code, most of which you might be able to solve yourself if you read the docstring for odeint more carefully.
To get you started, the following is a simple example of solving a scalar differential equation with odeint. Instead of trying to understand (and possibly debug) your function, I'll use a very simple equation. I'll solve the equation dy/dt = a * y, with initial condition y(0) = 100. Once you have this example working, you can modify fun to solve your problem.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def fun(y, t, a):
"""Define the right-hand side of equation dy/dt = a*y"""
f = a * y
return f
# Initial condition
y0 = 100.0
# Times at which the solution is to be computed.
t = np.linspace(0, 1, 51)
# Parameter value to use in `fun`.
a = -2.5
# Solve the equation.
y = odeint(fun, y0, t, args=(a,))
# Plot the solution. `odeint` is generally used to solve a system
# of equations, so it returns an array with shape (len(t), len(y0)).
# In this case, len(y0) is 1, so y[:,0] gives us the solution.
plt.plot(t, y[:,0])
plt.xlabel('t')
plt.ylabel('y')
plt.show()
Here's the plot:
More complicated examples of the use of odeint can be found in the SciPy Cookbook (scroll down to the bullet labeled "Ordinary differential equations").

Categories