I write simple code using interpolation of sin function, nearest method. My question is it's that code it's correct? It seems to me that the function should consist of straight lines. Curved lines appear on the generated graph.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import math
# Original "data set" --- 21 random numbers between 0 and 1.
x0 = np.arange(9)
y0 = [math.sin(i) for i in x0]
plt.plot(x0, y0, 'o', label='Data')
plt.grid(linestyle="-", color=(0.7, 0.8, 1.0))
x = np.linspace(0, 8, len(x0)*2)
# Available options for interp1d
options = ('linear', 'nearest')
f = interp1d(x0, y0, kind='nearest') # interpolation function
plt.plot(x, f(x), label='nearest') # plot of interpolated data
plt.legend()
plt.show()
EDIT:
I woudl like to impelment own interpolation algorithm, I try to divide sum of 2 values by 2
lst = list(x0)
for i, val in enumerate(lst):
lst[i] = lst[i] + lst[i+1] / 2
x0 = tuple(lst)
plt.plot(x0, y0, label='nearest')
But it's not working correctly
The problem is that the green line is drawn as a connected graph between all the points, and you have too few points. Maybe you have misunderstood how np.linspace works. If you increase the number of points, (and change to plot only the points instead as connected lines) you will get a result that looks much more like you probably expect:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import math
# Original "data set" --- 21 random numbers between 0 and 1.
x0 = np.arange(9)
y0 = [math.sin(i) for i in x0]
plt.plot(x0, y0, 'o', label='Data')
plt.grid(linestyle="-", color=(0.7, 0.8, 1.0))
x = np.linspace(0, 8, 1000)
# Available options for interp1d
options = ('linear', 'nearest')
f = interp1d(x0, y0, kind='nearest') # interpolation function
plt.plot(x, f(x), '.', label='nearest') # plot of interpolated data
plt.legend()
plt.show()
Related
I'm currently working on a lab report for Brownian Motion using this PDF equation with the intent of evaluating D:
Brownian PDF equation
And I am trying to curve_fit it to a histogram. However, whenever I plot my curve_fits, it's a line and does not appear correctly on the histogram.
Example Histogram with bad curve_fit
And here is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
# Variables
eta = 1e-3
ra = 0.95e-6
T = 296.5
t = 0.5
# Random data
r = np.array(np.random.rayleigh(0.5e-6, 500))
# Histogram
plt.hist(r, bins=10, density=True, label='Counts')
# Curve fit
x,y = np.histogram(r, bins=10, density=True)
x = x[2:]
y = y[2:]
bin_width = y[1] - y[2]
print(bin_width)
bin_centers = (y[1:] + y[:-1])/2
err = x*0 + 0.03
def f(r, a):
return (((1e-6)3*np.pi*r*eta*ra)/(a*T*t))*np.exp(((-3*(1e-6 * r)**2)*eta*ra*np.pi)/(a*T*t))
print(x) # these are flipped for some reason
print(y)
plt.plot(bin_centers, x, label='Fitting this', color='red')
popt, pcov = optimize.curve_fit(f, bin_centers, x, p0 = (1.38e-23), sigma=err, maxfev=1000)
plt.plot(y, f(y, popt), label='PDF', color='orange')
print(popt)
plt.title('Distance vs Counts')
plt.ylabel('Counts')
plt.xlabel('Distance in micrometers')
plt.legend()
Is the issue with my curve_fit? Or is there an underlying issue I'm missing?
EDIT: I broke down D to get the Boltzmann constant as a in the function, which is why there are more numbers in f than the equation above. D and Gamma.
I've tried messing with the initial conditions and plotting the function with 1.38e-23 instead of popt, but that does this (the purple line). This tells me something is wrong with the equation for f, but no issues jump out to me when I look at it. Am I missing something?
EDIT 2: I changed the function to this to simplify it and match the numpy.random.rayleigh() distribution:
def f(r, a):
return ((r)/(a))*np.exp((-1*(r)**2)/(2*a))
But this doesn't resolve the issue that the curve_fit is a line with a positive slope instead of anything remotely what I'm interested in. Now I am more confused as to what the issue is.
There are a few things here. I don't think x and y were ever flipped, or at least when I assumed they weren't, everything seemed to work fine. I also cleaned up a few parts of the code, for example, I'm not sure why you call two different histograms; and I think there may have been problems handling the single element tuple of parameters. Also, for curve fitting, the initial parameter guess often needs to be in the ballpark, so I changed that too.
Here's a version that works for me:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
# Random data
r = np.array(np.random.rayleigh(0.5e-6, 500))
# Histogram
hist_values, bin_edges, patches = plt.hist(r, bins=10, density=True, label='Counts')
bin_centers = (bin_edges[1:] + bin_edges[:-1])/2
x = bin_centers[2:] # not necessary, and I'm not sure why the OP did this, but I'm doing this here because OP does
y = hist_values[2:]
def f(r, a):
return (r/(a*a))*np.exp((-1*(r**2))/(2*a*a))
plt.plot(x, y, label='Fitting this', color='red')
err = x*0 + 0.03
popt, pcov = optimize.curve_fit(f, x, y, p0 = (1.38e-6,), sigma=err, maxfev=1000)
plt.plot(x, f(x, *popt), label='PDF', color='orange')
plt.title('Distance vs Counts')
plt.ylabel('Counts')
plt.xlabel('Distance in Meters') # Motion seems to be in micron range, but calculation and plot has been done in meters
plt.legend()
I have written a code that reads in my data file and plots it and then fits it and finds the peaks however I have 6 peaks and the code is only currently fitting 2 of the peaks and isn't returning any data on them by code is as follows:
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks
data = np.genfromtxt("C:\\Users\\lenovo laptop\\practice_data_ll16ame1.dat", skip_header = 15)
x = data[: , 0]
y = data[: , 1]
plt.plot(x,y)
plt.show()
def func(x, *params):
y = np.zeros_like(x)
for i in range(0, len(params), 3):
ctr = params[i]
amp = params[i+1]
wid = params[i+2]
y = y + amp * np.exp( -((x - ctr)/wid)**2)
return y
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
guess += [60+80*i, 46000, 25]
popt, pcov = curve_fit(func, x, y, p0=guess)
fit = func(x, *popt)
plt.plot(x, y)
plt.plot(x, fit , 'r-')
plt.show()
When I looked at the plot of your custom function, it was clear that the majority of points were in a more-or-less horizontal line, so the function wouldn't fit well to your peaks. Because there is no noise and the peaks are so prominent, you just need to pass the y values and a threshold to the find_peaks function.
By implementing find_peaks instead of your custom function, you get the following code:
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks
data = np.genfromtxt("C:\\Users\\lenovo laptop\\practice_data_ll16ame1.dat", skip_header = 15)
x = data[: , 0]
y = data[: , 1]
points = find_peaks(y, height = 100)
plt.plot(x, y)
for i in points[0]:
plt.scatter(x[i], y[i])
plt.show()
Find_peaks returns a tuple consisting of two things:
1. The index of the peaks ( points[0] in the code above)
2. The height of each peak (points[1])
The code yields the following plot, which I believe is what you want:
I am getting a horrible fit when I am trying to fit a parabola to this data.
I am initially making a histogram of the data which is the position of an object and then plotting the negative log values of the histogram bin counts to the position using a parabola fit.
the code I am using is this:
time,pos=postime()
plt.plot(time, pos)
poslen=len(pos)
plt.xlabel('Time')
plt.ylabel('Positions')
plt.show()
n,bins,patches = plt.hist(pos,bins=100)
n=n.tolist()
plt.show()
l=len(bins)
s=len(n)
posx=[]
i=0
j=0
pbin=[]
sig=[]
while j < (l-1):
pbin.append((bins[j]+bins[j+1])/2)
j=j+1
while i < s:
if n[i]==0:
pbin[i]=0
else:
sig.append(np.power(1/n[i],2))
n[i]=n[i]/poslen
n[i]=np.log(n[i])
n[i]=n[i]*(-1)
i=i+1
n[:]=[y for y in n if y != 0]
pbin[:]=[y for y in pbin if y != 0]
from scipy.optimize import curve_fit
def parabola(x, a , b):
return a * (np.power(x,2)) + b
popt, pcov = curve_fit(parabola, pbin, n)
print popt
plt.plot(pbin,n)
plt.plot(pbin, parabola(pbin, *popt), 'r-')
I am not sure why you are computing the histogram... But here is a working example which does not require histogram computation.
import numpy as np
from scipy.optimize import curve_fit
from matplotlib import pyplot
time_ = np.arange(-5, 5, 0.1)
pos = time_**2 + np.random.rand(len(time_))*5
def parabola(x, a, b):
return a * (np.power(x, 2)) + b
popt, pcov = curve_fit(parabola, time_, pos)
yfit = parabola(time_, *popt)
pyplot.plot(time_, pos, 'o')
pyplot.plot(time_, yfit)
Also, if your time_ vector is not uniformly sampled, and you want it to be uniformly sampled for the fit, you can do: fittime_ = np.linsapce(np.min(time_), np.max(time_)) and then yfit = parabola(fittime_, *popt).
You can also use matrix inversion.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-5,5,100)
Y = (np.power(x,2) + np.random.normal(0,1,x.shape)).reshape(-1,1)
X = np.c_[np.ones(x.shape), x, np.power(x,2)]
A = np.linalg.inv(X.transpose().dot(X)).dot(X.transpose().dot(Y))
Yp = X.dot(A)
fig = plt.figure()
ax = fig.add_subplot()
plt.plot(x,Y,'o',alpha=0.5)
plt.plot(x,Yp)
plt.show()
The matrix form is
X*A=Y
A=(Xt*X)-1*Xt*Y
You can have a better idea here if needed. It does not always work out and you may want to apply some form of regularization.
I want to get 2d and 3d plots as shown below.
The equation of the curve is given.
How can we do so in python?
I know there may be duplicates but at the time of posting
I could not fine any useful posts.
My initial attempt is like this:
# Imports
import numpy as np
import matplotlib.pyplot as plt
# to plot the surface rho = b*cosh(z/b) with rho^2 = r^2 + b^2
z = np.arange(-3, 3, 0.01)
rho = np.cosh(z) # take constant b = 1
plt.plot(rho,z)
plt.show()
Some related links are following:
Rotate around z-axis only in plotly
The 3d-plot should look like this:
Ok so I think you are really asking to revolve a 2d curve around an axis to create a surface. I come from a CAD background so that is how i explain things.
and I am not the greatest at math so forgive any clunky terminology. Unfortunately you have to do the rest of the math to get all the points for the mesh.
Heres your code:
#import for 3d
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt
change arange to linspace which captures the endpoint otherwise arange will be missing the 3.0 at the end of the array:
z = np.linspace(-3, 3, 600)
rho = np.cosh(z) # take constant b = 1
since rho is your radius at every z height we need to calculate x,y points around that radius. and before that we have to figure out at what positions on that radius to get x,y co-ordinates:
#steps around circle from 0 to 2*pi(360degrees)
#reshape at the end is to be able to use np.dot properly
revolve_steps = np.linspace(0, np.pi*2, 600).reshape(1,600)
the Trig way of getting points around a circle is:
x = r*cos(theta)
y = r*sin(theta)
for you r is your rho, and theta is revolve_steps
by using np.dot to do matrix multiplication you get a 2d array back where the rows of x's and y's will correspond to the z's
theta = revolve_steps
#convert rho to a column vector
rho_column = rho.reshape(600,1)
x = rho_column.dot(np.cos(theta))
y = rho_column.dot(np.sin(theta))
# expand z into a 2d array that matches dimensions of x and y arrays..
# i used np.meshgrid
zs, rs = np.meshgrid(z, rho)
#plotting
fig, ax = plt.subplots(subplot_kw=dict(projection='3d'))
fig.tight_layout(pad = 0.0)
#transpose zs or you get a helix not a revolve.
# you could add rstride = int or cstride = int kwargs to control the mesh density
ax.plot_surface(x, y, zs.T, color = 'white', shade = False)
#view orientation
ax.elev = 30 #30 degrees for a typical isometric view
ax.azim = 30
#turn off the axes to closely mimic picture in original question
ax.set_axis_off()
plt.show()
#ps 600x600x600 pts takes a bit of time to render
I am not sure if it's been fixed in latest version of matplotlib but the setting the aspect ratio of 3d plots with:
ax.set_aspect('equal')
has not worked very well. you can find solutions at this stack overflow question
Only rotate the axis, in this case x
import numpy as np
import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d.axes3d as axes3d
np.seterr(divide='ignore', invalid='ignore')
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = np.linspace(-3, 3, 60)
rho = np.cosh(x)
v = np.linspace(0, 2*np.pi, 60)
X, V = np.meshgrid(x, v)
Y = np.cosh(X) * np.cos(V)
Z = np.cosh(X) * np.sin(V)
ax.set_xlabel('eje X')
ax.set_ylabel('eje Y')
ax.set_zlabel('eje Z')
ax.plot_surface(X, Y, Z, cmap='YlGnBu_r')
plt.plot(x, rho, 'or') #Muestra la curva que se va a rotar
plt.show()
The result:
Most pyplot examples out there use linear data, but what if data is scattered?
x = 3,7,9
y = 1,4,5
z = 20,3,7
better meshgrid for contourf
xi = np.linspace(min(x)-1, max(x)+1, 9)
yi = np.linspace(min(y)-1, max(y)+1, 9)
X, Y = np.meshgrid(xi, yi)
Now "z" data got to be interpolated onto the meshgrid.
numpy.interp does little help here, while both linear and nn interpolaton of
zi = matplotlib.mlab.griddata(x,y,z,xi,yi,interp="linear")
returns rather strange results
scipy.interpolate.griddata cubic from second answer below needs something else to return data rather than nils
With custom levels data expected be looking something like this
This is what happens:
Although contour requires grid data, we can caste scatter data to a grid and then using masked arrays mask out the blank regions. I simulate this in the code below, by creating a random array, then using this to mask a test dataset (shown at bottom). The bulk of the code is taken from this matplotlib demo page.
import matplotlib
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
matplotlib.rcParams['xtick.direction'] = 'out'
matplotlib.rcParams['ytick.direction'] = 'out'
delta = 0.025
x = np.arange(-3.0, 3.0, delta)
y = np.arange(-2.0, 2.0, delta)
X, Y = np.meshgrid(x, y)
Z1 = mlab.bivariate_normal(X, Y, 1.0, 1.0, 0.0, 0.0)
Z2 = mlab.bivariate_normal(X, Y, 1.5, 0.5, 1, 1)
# difference of Gaussians
Z = 10.0 * (Z2 - Z1)
from numpy.random import *
import numpy.ma as ma
J = random_sample(X.shape)
mask = J > 0.7
X = ma.masked_array(X, mask=mask)
Y = ma.masked_array(Y, mask=mask)
Z = ma.masked_array(Z, mask=mask)
plt.figure()
CS = plt.contour(X, Y, Z, 20)
plt.clabel(CS, inline=1, fontsize=10)
plt.title('Simplest default with labels')
plt.savefig('cat.png')
plt.show()
countourf will only work with a grid of data. If you're data is scattered, then you'll need to create an interpolated grid matching your data, like this: (note you'll need scipy to perform the interpolation)
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
import numpy.ma as ma
from numpy.random import uniform, seed
# your data
x = [3,7,9]
y = [1,4,5]
z = [20,3,7]
# define grid.
xi = np.linspace(0,10,300)
yi = np.linspace(0,6,300)
# grid the data.
zi = griddata((x, y), z, (xi[None,:], yi[:,None]), method='cubic')
# contour the gridded data, plotting dots at the randomly spaced data points.
CS = plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
CS = plt.contourf(xi,yi,zi,15,cmap=plt.cm.jet)
plt.colorbar() # draw colorbar
# plot data points.
plt.scatter(x,y,marker='o',c='b',s=5)
plt.xlim(min(x),max(x))
plt.ylim(min(y),max(y))
plt.title('griddata test (%d points)' % len(x))
plt.show()
See here for the origin of that code.