I'm trying to calculate the area under a Gaussian curve. I managed to fit my data, but I can't work out how to integrate the fitted function.
`
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as mpl
# Let's create a function to model and create data
def func(x, a, x0, sigma):
    return a*np.exp(-(x-x0)**2/(4*sigma**2))
# Generating clean data
x = dados.col1
y = dados.col2
# Adding noise to the data
yn = y + 0.2 * np.random.normal(size=len(x))
# Plot out the current state of the data and model
fig = mpl.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, c='k', label='Function')
ax.scatter(x, yn)
# Executing curve_fit on noisy data
popt, pcov = curve_fit(func, x, yn)
#popt returns the best fit values for parameters of the given model (func)
print (popt)
ym = func(x, popt[0], popt[1], popt[2])
ax.plot(x, ym, c='r', label='Best fit')
ax.legend()
fig.savefig('model_fit.png')
`
I would like to compute the area under this fitted function.
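One way to get the area, sketched under assumptions: once `curve_fit` has returned `popt`, the fitted model is an ordinary function, so it can be integrated either in closed form or numerically with `scipy.integrate.quad`. For the model above (with `4*sigma**2` in the exponent) the integral over the whole real line works out to `2*a*|sigma|*sqrt(pi)`. The parameter values below are hypothetical stand-ins for the real `popt`, and the finite limits are only an example.

```python
import numpy as np
from scipy.integrate import quad

def func(x, a, x0, sigma):
    # Same model as in the question (note the 4*sigma**2 in the exponent)
    return a * np.exp(-(x - x0)**2 / (4 * sigma**2))

# Hypothetical fitted parameters standing in for popt from curve_fit
popt = np.array([2.0, 1.0, 0.5])
a, x0, sigma = popt

# Closed form of the integral over the whole real line for this model
area_exact = 2 * a * abs(sigma) * np.sqrt(np.pi)

# Numerical integration of the same function over the whole real line
area_quad, abs_err = quad(func, -np.inf, np.inf, args=tuple(popt))

# Area over a finite window only (example limits, e.g. the range of the data)
area_window, _ = quad(func, 0.0, 2.0, args=tuple(popt))

print(area_exact, area_quad, area_window)
```

If only the area over the measured range is needed, the finite-limit call is probably the more relevant one; the closed form is mainly a sanity check.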
I am looking for a way to display measured data and interpolate a map from it.
I originally asked the question in this post:
Link
I have now found a way with Matlab to interpolate the area as I would like (see the following photo).
Matlab fit normalized
fit([x,y],z,'linearinterp','normalize','on');
I would now like to know how to find an equivalent way in Python to get the same area.
The raw data is stored here:
Data
I tried the following with scipy.curve_fit:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.optimize import curve_fit
plt.close('all')
def func(data, a, alpha, beta):
    x = data[0]
    y = data[1]
    return a * (x**alpha) * (y**beta)
df = pd.read_csv("Data21.csv")
df.sort_values(by=['x', 'y'], inplace=True)
mat = df.to_numpy()
x = mat[:,0]
y = mat[:,1]
z = mat[:,2]
# curve Fit
initial_parameters = [1.0,1.0,1.0]
popt, pcov = curve_fit(func, [x,y], z, initial_parameters)
xModel = np.linspace(min(x), max(x), 100)
yModel = np.linspace(min(y), max(y), 100)
XI, YI = np.meshgrid(xModel, yModel)
ZI = func(np.array([XI, YI]), *popt)
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.set_title('Surface Fit')
ax.plot_surface(XI, YI, ZI, cmap='jet')
ax.scatter(x, y, np.transpose(z), marker='o', c='black', s=2)
The result can be seen here.
Python curve_fit
I know I am using the wrong fit function and would have to change it, but I do not know how. Can anyone help me with this?
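A possible direction, offered as a sketch rather than a confirmed equivalent: Matlab's 'linearinterp' is an interpolant, not a parametric model, so `curve_fit` may not be the right tool here. `scipy.interpolate.LinearNDInterpolator` does piecewise-linear interpolation on scattered points, and its `rescale=True` option rescales the coordinates before triangulating, which roughly plays the role of 'normalize','on'. The data below is synthetic and only stands in for the x, y, z columns of Data21.csv.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)

# Synthetic scattered samples standing in for the columns of Data21.csv
x = rng.uniform(0.0, 10.0, 200)
y = rng.uniform(0.0, 1000.0, 200)   # deliberately on a very different scale
z = 2.0 * x + 0.01 * y

# Piecewise-linear interpolant on the scattered (x, y) points.
# rescale=True rescales the points to the unit cube before triangulation.
interp = LinearNDInterpolator(np.column_stack([x, y]), z, rescale=True)

# Evaluate on a regular grid, as in the question
xModel = np.linspace(x.min(), x.max(), 100)
yModel = np.linspace(y.min(), y.max(), 100)
XI, YI = np.meshgrid(xModel, yModel)
ZI = interp(XI, YI)   # NaN outside the convex hull of the data

print(ZI.shape, np.isnan(ZI).sum())
```

`ZI` can then be passed to `ax.plot_surface(XI, YI, ZI, cmap='jet')` in place of the `curve_fit` result. Grid points outside the convex hull of the measurements come back as NaN.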
I'm trying to plot a histogram in python by importing data from an excel file.
Also, the histogram needs to be fitted with an exponential function.
How can I do this plotting and fitting procedure?
For plotting, just pass your data to plt.hist:
import random
import matplotlib.pyplot as plt
# data for test
data = [random.randint(1,20) for i in range(20)]
n, x, _ = plt.hist(data)
bin_centers = 0.5*(x[1:]+x[:-1])
plt.plot(bin_centers,n);
For fitting, you can take the bin centers and fit the counts with curve_fit:
import numpy as np
from scipy.optimize import curve_fit
# some exponential function
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
popt, pcov = curve_fit(func, bin_centers, n, bounds=(0, [3., 1., 0.5]))
# bounds are variable, so you can change them as you wish
plt.plot(bin_centers, n, label='data')
plt.plot(bin_centers, func(bin_centers, *popt), label='fit')
plt.legend()
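The same steps as a self-contained sketch that can be run without a plot window. Everything here is an assumption for illustration: the sample is drawn from an exponential distribution instead of being read from your Excel file (with real data you would presumably load the column first, for example with `pandas.read_excel`), the histogram is normalised with `density=True`, and the model has no constant offset.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Synthetic sample standing in for the column read from the Excel file
data = rng.exponential(scale=2.0, size=5000)

# Histogram without plotting; density=True normalises the bin heights
counts, edges = np.histogram(data, bins=30, range=(0.0, 10.0), density=True)
bin_centers = 0.5 * (edges[1:] + edges[:-1])

def func(x, a, b):
    # Plain decaying exponential, no offset term
    return a * np.exp(-b * x)

# Fit the bin heights at the bin centers
popt, pcov = curve_fit(func, bin_centers, counts, p0=[1.0, 1.0])
a_fit, b_fit = popt

print("a = %.3f, b = %.3f" % (a_fit, b_fit))
```

For a sample with scale 2.0 the decay rate `b_fit` should come out near 0.5. Plotting works as above, with `plt.plot(bin_centers, func(bin_centers, *popt))`.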
For the following Python script, when I add bounds to the curve_fit call, the resulting fit is completely different and visibly wrong, even though the fitted parameter lies within the bounds both with and without them. Why would this happen?
Here's a link to the data: https://drive.google.com/file/d/0Bwb0PrDn9o3KZ0lOa1FVZldjV0k/view?usp=sharing
import numpy as np
import matplotlib.pyplot as plt
from numpy import loadtxt, sqrt
from scipy.optimize import curve_fit #for least squares curve fit
from scipy import special #for erfc function
plt.rcParams.update({'font.family': "Times New Roman"})
plt.rcParams.update({'font.size': 12})
filename = 'Cr3.csv'
C_b = 17 #base concentration
t_hours = 451
t = t_hours * 3600 #451 hours = 1623600 seconds
data = loadtxt(filename, delimiter=',')
xdata = data[:, 0] #positions
ydata = data[:, 1] #concentration
corr = data[0, 2] #the correction value is manually measured in imagej
xdata = xdata - corr
def func(x, D):
    return C_b/2 * special.erfc(x/(2 * sqrt(D * t))/1e6) #correction for um to m
fig = plt.figure()
plt.plot(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, p0=1e-16)#, bounds=(0,1))
perr = np.sqrt(np.diag(pcov))
plt.plot(xdata, func(xdata, *popt), 'r-',
         label='fit: D = %.2e' % tuple(popt))#, z = %5.3f
plt.xlabel('x (μm)')
plt.ylabel('Cr (wt%)')
plt.legend()
plt.show()
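One plausible explanation, not verified against the linked data: when bounds are passed, `curve_fit` switches its default method from 'lm' to 'trf', and a parameter of order 1e-16 sits very far from the order-one scale that optimizers generally handle best. A common workaround is to fit a rescaled parameter instead. The sketch below uses synthetic data and a hypothetical `D_true`; the name `func_scaled` and the factor 1e-16 are illustrative choices, not part of the original script.

```python
import numpy as np
from scipy import special
from scipy.optimize import curve_fit

C_b = 17          # base concentration, as in the question
t = 451 * 3600    # 451 hours in seconds

def func_scaled(x, D16):
    # D16 is the diffusion coefficient in units of 1e-16, so it is of order one
    D = D16 * 1e-16
    return C_b / 2 * special.erfc(x / (2 * np.sqrt(D * t)) / 1e6)

# Synthetic profile standing in for Cr3.csv (positions in micrometres)
rng = np.random.default_rng(0)
xdata = np.linspace(-100.0, 100.0, 201)
D_true = 3e-16    # hypothetical value, for illustration only
ydata = func_scaled(xdata, D_true / 1e-16) + 0.1 * rng.normal(size=xdata.size)

# Bounded fit on the rescaled parameter
popt, pcov = curve_fit(func_scaled, xdata, ydata, p0=[1.0], bounds=(1e-3, 1e3))
D_fit = popt[0] * 1e-16

print("D = %.2e" % D_fit)
```

With the parameter rescaled, the bounded and unbounded fits should agree. If they still differ on the real data, the cause is probably something else.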
I would like to find and plot a function f that represents a curve fitted to a set of points, x and y, that I already know.
After some research I started experimenting with scipy.optimize and curve_fit, but in the reference guide I found that curve_fit needs a model function to fit the data, and that it assumes ydata = f(xdata, *params) + eps.
So my question is this: what do I have to change in my code to use curve_fit, or any other library, to find the function of the curve from my set points? (Note: I want to know the function itself as well, so that I can integrate it later for my project and plot it.) I know that it is going to be a decaying exponential function, but I don't know the exact parameters. This is what I tried in my program:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
xdata = np.array([0.2, 0.5, 0.8, 1])
ydata = np.array([6, 1, 0.5, 0.2])
plt.plot(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata)
plt.plot(xdata, func(xdata, *popt), 'r-', label='fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
I am currently developing this project on a Raspberry Pi, if that changes anything. I would like to use the least squares method, since it is precise, but any other method that works well is welcome.
Again, this is based on the reference guide of the scipy library. Also, I get the following graph, which is not even a curve: Graph and curve based on set points
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
#c is a constant so taking the derivative makes it go to zero
def deriv(x, a, b, c):
    return -a * b * np.exp(-b * x)
#Integrating gives you another c coefficient (offset) let's call it c1 and set it equal to zero by default
def integ(x, a, b, c, c1 = 0):
    return -a/b * np.exp(-b * x) + c*x + c1
#There are only 4 (x,y) points here
xdata = np.array([0.2, 0.5, 0.8, 1])
ydata = np.array([6, 1, 0.5, 0.2])
#curve_fit already uses "non-linear least squares to fit a function, f, to data"
popt, pcov = curve_fit(func, xdata, ydata)
a,b,c = popt #these are the optimal parameters for fitting your 4 data points
#Now get more x values to plot the curve along so it looks like a curve
step = 0.01
fit_xs = np.arange(min(xdata),max(xdata),step)
#Plot the results
plt.plot(xdata, ydata, 'bx', label='data')
plt.plot(fit_xs, func(fit_xs,a,b,c), 'r-', label='fit')
plt.plot(fit_xs, deriv(fit_xs,a,b,c), 'g-', label='deriv')
plt.plot(fit_xs, integ(fit_xs,a,b,c), 'm-', label='integ')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
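If a single number for the area between two x values is needed rather than the antiderivative curve, the difference of `integ` at the two limits gives it directly, and `scipy.integrate.quad` can serve as a cross-check. This is a minimal sketch: the parameter values and the limits are hypothetical stand-ins for the fitted `popt` and for whatever range the project actually needs.

```python
import numpy as np
from scipy.integrate import quad

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

def integ(x, a, b, c, c1=0):
    # Antiderivative of func with integration constant c1
    return -a / b * np.exp(-b * x) + c * x + c1

# Hypothetical parameters standing in for the fitted popt
a, b, c = 10.0, 3.0, 0.1

# Example integration limits (here the range of the four data points)
lo, hi = 0.2, 1.0

# Definite integral from the antiderivative; c1 cancels in the difference
area_closed = integ(hi, a, b, c) - integ(lo, a, b, c)

# Numerical cross-check
area_quad, abs_err = quad(func, lo, hi, args=(a, b, c))

print(area_closed, area_quad)
```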
I'm trying to use interpolation in scipy. Here is my code:
from Constants import LOWER_LAT, LOWER_LONG, UPPER_LAT, UPPER_LONG, GRID_RESOLUTION
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm
from cmath import sin
from scipy.signal.windows import cosine
from scipy import interpolate
from scipy.interpolate import RectBivariateSpline
import numpy
from numpy import meshgrid
#===============================================================
y_range = GRID_RESOLUTION
delta = (UPPER_LAT - LOWER_LAT)/float(GRID_RESOLUTION)
x_range = int((UPPER_LONG - LOWER_LONG)/delta) + 1
x = numpy.linspace(0,x_range-1,x_range)
y = numpy.linspace(0,y_range-1,y_range)
X,Y = meshgrid(x,y)
Z = numpy.zeros((y.size, x.size))
base_val = 0
# fill values for Z
with open('map.txt', 'r') as fp:  # text mode, so each line is a str
    for line in fp:
        parts = line[:-1].split("\t")
        tup = parts[0]
        tup = tup[:-1]
        tup = tup[1:]
        yx = tup.strip().replace(" ","").split(",")
        y_val = int(yx[0])
        x_val = int(yx[1])
        h_val = int(parts[-1])
        for i in range(y_range):
            tx = X[i]
            ty = Y[i]
            tz = Z[i]
            for j in range(x_range):
                if (int(tx[j])==x_val) and (int(ty[j])==y_val):
                    tz[j] = h_val + base_val
Z = numpy.array(Z)
# spline = RectBivariateSpline(y, x, Z)
# Z2 = spline(y, x)
f = interpolate.interp2d(x, y, Z,'cubic')
Z2 = f(x,y)
# Plot here
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z2, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
ax.set_xlabel('X')
ax.set_xlim(0, 50)
ax.set_ylabel('Y')
ax.set_ylim(0, 50)
ax.set_zlabel('Z')
# ax.set_zlim(0, 1000)
plt.show()
Here are some constants from the top of the above code:
LOWER_LAT = 32.5098
LOWER_LONG = -84.7485
UPPER_LAT = 47.5617
UPPER_LONG = -69.1699
GRID_RESOLUTION = 50
My code creates the 1D arrays x and y, and then builds the grid with meshgrid. The values in Z are filled from a text file that you can find here. Each line in the text file has the format (y_value,x_value) z_value.
After creating the grid and interpolating the function, I plot it. However, the figure I obtain is the same as the figure I got without interpolation. Concretely, these two lines produce the same figure:
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
ax.plot_surface(X, Y, Z2, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
In the lines above, Z2 holds the values returned by the interpolation function, and Z holds the original values.
How can I make the interpolation work?
Here is the figure.
I think you are confusing smoothing with interpolation.
In both cases you are fitting a function that yields a continuous approximation of your input data. However, in the case of interpolation the interpolant is constrained to pass exactly through your input points, whereas with smoothing this constraint is relaxed.
In your example above, you have performed interpolation rather than smoothing. Since you are evaluating your interpolant on exactly the same grid of input points as your original data, Z2 is guaranteed to be almost exactly the same as Z. The point of doing interpolation is that you can evaluate approximate z-values at a different set of x- and y-values (e.g. a more finely spaced grid).
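To make that concrete, here is a small sketch with a synthetic grid standing in for your map data (the sizes and the test surface are made up for illustration). Evaluating the spline on the original grid reproduces Z, while evaluating it on a finer grid gives new in-between values:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Synthetic coarse grid standing in for the map data
x = np.linspace(0, 9, 10)
y = np.linspace(0, 7, 8)
X, Y = np.meshgrid(x, y)
Z = np.sin(X / 2.0) * np.cos(Y / 3.0)    # shape (y.size, x.size)

# Default s=0 means pure interpolation (no smoothing)
spline = RectBivariateSpline(y, x, Z)

# Same grid in, (almost) same values out
Z_same = spline(y, x)

# A finer grid is where interpolation actually produces new values
x_fine = np.linspace(x.min(), x.max(), 50)
y_fine = np.linspace(y.min(), y.max(), 40)
Z_fine = spline(y_fine, x_fine)

print(np.allclose(Z_same, Z), Z_fine.shape)
```

To plot `Z_fine`, build the matching mesh with `np.meshgrid(x_fine, y_fine)` and pass that to `plot_surface` instead of the original X and Y.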
If you want to perform smoothing rather than interpolation, you could try passing a non-zero value as the s= argument to RectBivariateSpline, e.g.:
spline = RectBivariateSpline(y, x, Z, s=5E7)
Z2 = spline(y, x)
fig, ax = plt.subplots(1, 2, sharex=True, sharey=True,
                       subplot_kw={'projection':'3d'})
ax[0].plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
ax[1].plot_surface(X, Y, Z2, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
ax[0].set_title('Original')
ax[1].set_title('Smoothed')
fig.tight_layout()
plt.show()