3D-plot of the error function in a linear regression

I would like to visually plot a 3D graph of the error function calculated for a given slope and y-intercept for a linear regression.
This graph will be used to illustrate a gradient descent application.
Let’s suppose we want to model a set of points with a line. To do this we’ll use the standard y=mx+b line equation where m is the line’s slope and b is the line’s y-intercept. To find the best line for our data, we need to find the best set of slope m and y-intercept b values.
A standard approach to solving this type of problem is to define an error function (also called a cost function) that measures how “good” a given line is. This function takes an (m,b) pair and returns an error value based on how well the line fits the data. To compute this error for a given line, we iterate through each (x,y) point in the data set and sum the squared distances between each point’s y value and the candidate line’s y value (computed as mx+b), then divide by the number of points. It’s conventional to square this distance to ensure that it is positive and to make our error function differentiable. In Python, computing the error for a given line looks like:
# y = mx + b
# m is slope, b is y-intercept
def computeErrorForLineGivenPoints(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))
Since the error function depends on two parameters (m and b), we can visualize it as a two-dimensional surface.
Now my question: how can we plot such a 3D graph using Python?
Here is skeleton code to build a 3D plot. This snippet is unrelated to the question's context, but it shows the basics of building a 3D plot.
For my example I would need the x-axis to be the slope, the y-axis to be the y-intercept, and the z-axis to be the error.
Can someone help me build an example of such a graph?
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import random
def fun(x, y):
    return x**2 + y
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = y = np.arange(-3.0, 3.0, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array([fun(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
ax.plot_surface(X, Y, Z)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
plt.show()
The above code produces the following plot, which is very similar to what I am looking for.

Simply replace fun with computeErrorForLineGivenPoints:
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import collections
def error(m, b, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))
x = y = np.arange(-3.0, 3.0, 0.05)
Point = collections.namedtuple('Point', ['x', 'y'])
m, b = 3, 2
noise = np.random.random(x.size)
points = [Point(xp, m*xp+b+err) for xp,err in zip(x, noise)]
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ms = np.linspace(2.0, 4.0, 10)
bs = np.linspace(1.5, 2.5, 10)
M, B = np.meshgrid(ms, bs)
zs = np.array([error(mp, bp, points)
               for mp, bp in zip(np.ravel(M), np.ravel(B))])
Z = zs.reshape(M.shape)
ax.plot_surface(M, B, Z, rstride=1, cstride=1, color='b', alpha=0.5)
ax.set_xlabel('m')
ax.set_ylabel('b')
ax.set_zlabel('error')
plt.show()
yields
Tip: I renamed computeErrorForLineGivenPoints as error. Generally, there is no need to name a function compute... since almost all functions compute something. You also do not need to specify "GivenPoints" since the function signature shows that points is an argument. If you have other error functions or variables in your program, line_error or total_error might be a better name for this function.
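As a possible alternative to the list comprehension above, the same error surface can be sketched with NumPy broadcasting. The helper name error_surface, the plain xs/ys arrays (instead of Point tuples) and the fixed random seed are assumptions made for this illustration, not part of the original answer.

```python
import numpy as np

def error_surface(ms, bs, xs, ys):
    """Mean squared error of y = m*x + b for every (m, b) pair on a grid."""
    M, B = np.meshgrid(ms, bs)              # both have shape (len(bs), len(ms))
    # Add a trailing axis to the grid so it broadcasts against the data points.
    residuals = ys - (M[..., None] * xs + B[..., None])
    Z = np.mean(residuals ** 2, axis=-1)    # average over the data points
    return M, B, Z

rng = np.random.default_rng(0)              # seed chosen only for reproducibility
xs = np.arange(-3.0, 3.0, 0.05)
ys = 3 * xs + 2 + rng.random(xs.size)       # noisy line with m = 3, b = 2
ms = np.linspace(2.0, 4.0, 10)
bs = np.linspace(1.5, 2.5, 10)
M, B, Z = error_surface(ms, bs, xs, ys)
```

M, B and Z can then be passed to ax.plot_surface(M, B, Z, ...) exactly as in the answer above.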

Related

Rayleigh distribution Curve_fit on python

I'm currently working on a lab report for Brownian Motion using this PDF equation with the intent of evaluating D:
Brownian PDF equation
And I am trying to curve_fit it to a histogram. However, whenever I plot my curve_fits, it's a line and does not appear correctly on the histogram.
Example Histogram with bad curve_fit
And here is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
# Variables
eta = 1e-3
ra = 0.95e-6
T = 296.5
t = 0.5
# Random data
r = np.array(np.random.rayleigh(0.5e-6, 500))
# Histogram
plt.hist(r, bins=10, density=True, label='Counts')
# Curve fit
x,y = np.histogram(r, bins=10, density=True)
x = x[2:]
y = y[2:]
bin_width = y[1] - y[2]
print(bin_width)
bin_centers = (y[1:] + y[:-1])/2
err = x*0 + 0.03
def f(r, a):
    return (((1e-6)3*np.pi*r*eta*ra)/(a*T*t))*np.exp(((-3*(1e-6 * r)**2)*eta*ra*np.pi)/(a*T*t))
print(x) # these are flipped for some reason
print(y)
plt.plot(bin_centers, x, label='Fitting this', color='red')
popt, pcov = optimize.curve_fit(f, bin_centers, x, p0 = (1.38e-23), sigma=err, maxfev=1000)
plt.plot(y, f(y, popt), label='PDF', color='orange')
print(popt)
plt.title('Distance vs Counts')
plt.ylabel('Counts')
plt.xlabel('Distance in micrometers')
plt.legend()
Is the issue with my curve_fit? Or is there an underlying issue I'm missing?
EDIT: I broke down D to get the Boltzmann constant as a in the function, which is why there are more numbers in f than the equation above. D and Gamma.
I've tried messing with the initial conditions and plotting the function with 1.38e-23 instead of popt, but that does this (the purple line). This tells me something is wrong with the equation for f, but no issues jump out to me when I look at it. Am I missing something?
EDIT 2: I changed the function to this to simplify it and match the numpy.random.rayleigh() distribution:
def f(r, a):
    return ((r)/(a))*np.exp((-1*(r)**2)/(2*a))
But this doesn't resolve the issue that the curve_fit is a line with a positive slope instead of anything remotely what I'm interested in. Now I am more confused as to what the issue is.

There are a few things here. I don't think x and y were ever flipped, or at least when I assumed they weren't, everything seemed to work fine. I also cleaned up a few parts of the code, for example, I'm not sure why you call two different histograms; and I think there may have been problems handling the single element tuple of parameters. Also, for curve fitting, the initial parameter guess often needs to be in the ballpark, so I changed that too.
Here's a version that works for me:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
# Random data
r = np.array(np.random.rayleigh(0.5e-6, 500))
# Histogram
hist_values, bin_edges, patches = plt.hist(r, bins=10, density=True, label='Counts')
bin_centers = (bin_edges[1:] + bin_edges[:-1])/2
x = bin_centers[2:] # not necessary, and I'm not sure why the OP did this, but I'm doing this here because OP does
y = hist_values[2:]
def f(r, a):
    return (r/(a*a))*np.exp((-1*(r**2))/(2*a*a))
plt.plot(x, y, label='Fitting this', color='red')
err = x*0 + 0.03
popt, pcov = optimize.curve_fit(f, x, y, p0 = (1.38e-6,), sigma=err, maxfev=1000)
plt.plot(x, f(x, *popt), label='PDF', color='orange')
plt.title('Distance vs Counts')
plt.ylabel('Counts')
plt.xlabel('Distance in Meters') # Motion seems to be in micron range, but calculation and plot has been done in meters
plt.legend()
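One way to make the initial guess less delicate is to rescale the data so that the numbers are of order one. The following is only a sketch under that assumption: it draws from the same Rayleigh distribution but expressed in micrometers (scale 0.5 instead of 0.5e-6), uses a fixed seed, and skips the plotting. The names rayleigh_pdf, r_um and sigma_fit are made up for this illustration.

```python
import numpy as np
from scipy import optimize

def rayleigh_pdf(r, sigma):
    """Rayleigh density with scale parameter sigma."""
    return (r / sigma**2) * np.exp(-r**2 / (2 * sigma**2))

rng = np.random.default_rng(0)          # seed chosen only for reproducibility
r_um = rng.rayleigh(0.5, 500)           # distances in micrometers (assumed rescaling)

# Histogram as a density, evaluated at the bin centers
hist_values, bin_edges = np.histogram(r_um, bins=10, density=True)
bin_centers = (bin_edges[1:] + bin_edges[:-1]) / 2

# With data of order one, a rough guess such as 1.0 is already in the ballpark
popt, pcov = optimize.curve_fit(rayleigh_pdf, bin_centers, hist_values, p0=(1.0,))
sigma_fit = abs(popt[0])                # the pdf depends only on sigma**2
```

The fitted sigma_fit is in micrometers; multiplying by 1e-6 converts it back to meters.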

How to make 3D model of heat equation in Python?

Given:
and
We have formula:
I made a 3D model, but I don't know how to impose a boundary condition such as u(0,t) = 0 at x = 0.
import math
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
def u(x,t,n):
    for i in range(1,n):
        alpha=((6*(-1)**i-30)/(i**2*np.pi**2))
        e=np.exp((-(np.pi**2)*(i**2)*t))
        sin=np.sin((i*np.pi*x)/3)
        u=alpha*e*sin
    return u
N=20
L = 4 # length
att = 20 # iteration
x = np.linspace(0, L ,N) #x-array
t = np.linspace(0, L, N) #t-array
X, Y = np.meshgrid(x, t)
Z = u(X, Y, att)
fig = plt.figure(figsize = (10,10))
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z, rstride=10, cstride=1000)
plt.show()
My 3D model:

It would help if you actually computed a sum in the partial Fourier sum calculation, at the moment you just return the last term of that sum.
def u(x,t,n):
    u = 0*x
    for i in range(1,n):
        alpha=((6*(-1)**i-30)/(i**2*np.pi**2))
        e=np.exp((-(np.pi**2)*(i**2)*t))
        sin=np.sin((i*np.pi*x)/3)
        u+=alpha*e*sin
    return u
Are you sure about the Fourier coefficients? The number 30 in it is for me somewhat suspicious. Also the frequency seems strange, the continuation of u(x,0) should be an odd rectangular wave of period 8. Notice, it is a=3 but L=4.
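To see that the summed series respects u(0,t) = 0, the partial sum can be evaluated on a small grid and the x = 0 column inspected. This is a sketch only: the coefficients are copied unchanged from the question (the doubts raised above still apply), and the function name partial_sum and the grid sizes are assumptions for the illustration.

```python
import numpy as np

def partial_sum(x, t, n):
    """Sum the terms i = 1 .. n-1 of the series from the question."""
    x, t = np.broadcast_arrays(np.asarray(x, dtype=float),
                               np.asarray(t, dtype=float))
    total = np.zeros(x.shape)
    for i in range(1, n):
        alpha = (6 * (-1) ** i - 30) / (i ** 2 * np.pi ** 2)
        decay = np.exp(-(np.pi ** 2) * (i ** 2) * t)
        total += alpha * decay * np.sin(i * np.pi * x / 3)
    return total

x = np.linspace(0, 3, 31)       # interval [0, 3], matching the sin(i*pi*x/3) terms
t = np.linspace(0, 1, 11)
X, T = np.meshgrid(x, t)
Z = partial_sum(X, T, 20)
```

Every term contains sin(i*pi*x/3), which vanishes at x = 0 and x = 3, so the first and last columns of Z are zero (up to rounding) without any extra condition in the code.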

Creating a 3D surface plot with matplotlib in python

I am trying to plot a 3D surface but I am having some trouble because the documentation for matplotlib does not appear to be very thorough and is lacking in examples. Anyways the program I have written is to solve the Heat Equation Numerically via Method of Finite Differences. Here is my code:
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
# Parameters
L = 1 # length of the rod
T = 10 # terminal time
N = 40 # spatial values
M = 1600 # time values/hops; (M ~ N^2)
s = 0.25 # s := k * ( (dt) / (dx)^2 )
# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N
x = np.arange(x_init, x_end, dx)
x[0] = x_init
# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M
t = np.arange(t_init, t_end, dt)
t[0] = t_init
# time-vector
for m in range(0, M):
    t[m] = m * dt
# spatial-vector
for j in range(0, N):
    x[j] = j * dx
# definition of the solution u(x,t) to u_t = k * u_xx
u = np.zeros((N, M+1)) # array to store values of the solution
# Finite Difference Scheme:
u[:,0] = x * (x - 1) #initial condition
for m in range(0, M):
    for j in range(1, N-1):
        if j == 1:
            u[j-1,m] = 0 # Boundary condition
        elif j == N-1:
            u[j+1,m] = 0 # Boundary Condition
        else:
            u[j,m+1] = u[j,m] + s * ( u[j+1,m] -
                                      2 * u[j,m] + u[j-1,m] )
This is what I have written to try and plot a 3D surface graph:
# for 3D graph
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
surf = ax.plot_surface(x, t, u, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0, antialiased=False)
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
I am getting this error when I run the code to plot the graph: "ValueError: shape mismatch: two or more arrays have incompatible dimensions on axis 1."
Please, any and all help is very greatly appreciated. I think the error comes up because I defined u to be an N x (M+1) matrix, but that is necessary to make the original program run. I am unsure of how to correct this so the graph plots properly. Thanks!

Use this code (look at the comments):
# plot 3d surface
# create a meshgrid of (x,t) points
# T and X are 2-d arrays
T, X = np.meshgrid(t,x)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Use X and T arrays to plot u
# the shapes of X, T and u must be the same
# but the shape of u is [40,1601], so I skip its last column while plotting
surf = ax.plot_surface(X, T, u[:,:1600], rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0, antialiased=False)
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
Result:
Regarding the remark that "the documentation for matplotlib does not appear to be very thorough and is lacking in examples", see the mplot3d examples:
http://matplotlib.org/examples/mplot3d/index.html

It's helpful to print out the shapes of the variables x, t, and u:
x.shape == (40,)
t.shape == (1600,)
u.shape == (40, 1601)
So there are two problems here.
The first one is that x and t are 1-dimensional, even though they need to be 2-dimensional.
And the second one is that u has one more element than t in the second dimension.
You can fix both by running
t, x = np.meshgrid(t, x)
u = u[:,:-1]
before creating the 3d plot.
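The shape bookkeeping can be checked on a small stand-in before running the full problem. The sizes 4 and 6 below are arbitrary placeholders for the 40 and 1600 above, and the name u_plot is made up for this sketch.

```python
import numpy as np

N, M = 4, 6                          # small stand-ins for N = 40, M = 1600
x = np.arange(N) * (1.0 / N)         # spatial points
t = np.arange(M) * (10.0 / M)        # time points
u = np.zeros((N, M + 1))             # same layout as the solution array

T, X = np.meshgrid(t, x)             # both 2-D with shape (N, M)
u_plot = u[:, :-1]                   # drop the extra time column

print(X.shape, T.shape, u_plot.shape)
```

Note that np.meshgrid(t, x) returns arrays whose rows follow x and whose columns follow t, which is why it matches the (N, M) layout of u.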

Python & Matplotlib: How to create a meshgrid to plot surf?

I want to plot a probability density function z=f(x,y).
I found code for plotting a surface in "Color matplotlib plot_surface command with surface gradient".
But I don't know how to convert the z values into a grid so that I can plot them.
The example code with my modifications is below.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import mixture
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
%matplotlib inline
n_samples = 1000
# generate random sample, two components
np.random.seed(0)
shifted_gaussian = np.random.randn(n_samples, 2) + np.array([20, 5])
sample = shifted_gaussian
# fit a Gaussian Mixture Model with two components
clf = mixture.GMM(n_components=3, covariance_type='full')
clf.fit(sample)
# Plot it
fig = plt.figure()
ax = fig.gca(projection='3d')
X = np.arange(-5, 5, .25)
Y = np.arange(-5, 5, .25)
X, Y = np.meshgrid(X, Y)
## In example Code, the z is generate by grid
# R = np.sqrt(X**2 + Y**2)
# Z = np.sin(R)
# In my case,
# for each point [x,y], the probability value is
# z = clf.score([x,y])
# but How can I generate a grid Z?
Gx, Gy = np.gradient(Z) # gradients with respect to x and y
G = (Gx**2+Gy**2)**.5 # gradient magnitude
N = G/G.max() # normalize 0..1
surf = ax.plot_surface(
    X, Y, Z, rstride=1, cstride=1,
    facecolors=cm.jet(N),
    linewidth=0, antialiased=False, shade=False)
plt.show()
The original approach generates z directly from the mesh. But in my case the fitted model cannot return results in a grid-like form, so the problem is: how can I generate grid-style z values and plot them?

If I understand correctly, you basically have a function z that takes two scalar values x, y in a list and returns another scalar z_val. In other words z_val = z([x,y]), right?
If that's the case, then you could do the following (note that this is written with readability in mind rather than efficiency):
from itertools import product
X = np.arange(15) # or whatever values for x
Y = np.arange(5) # or whatever values for y
N, M = len(X), len(Y)
Z = np.zeros((N, M))
for i, (x,y) in enumerate(product(X,Y)):
    Z[np.unravel_index(i, (N,M))] = z([x,y])
If you want to use plot_surface, then follow that with this:
X, Y = np.meshgrid(X, Y)
ax.plot_surface(X, Y, Z.T)
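A variant that avoids the transpose is to build the mesh first, evaluate the function at every mesh point, and reshape the results back to the mesh shape. In this sketch the function z is a hypothetical stand-in (a Gaussian bump) for whatever scalar score the fitted model returns; it is not the model from the question.

```python
import numpy as np

def z(point):
    """Hypothetical scalar function of one [x, y] pair (stand-in for the model score)."""
    x, y = point
    return np.exp(-((x - 1.0) ** 2 + (y - 2.0) ** 2) / 2.0)

xs = np.arange(-5, 5, 0.25)
ys = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(xs, ys)                         # shape (len(ys), len(xs))

# One row per grid point, in the same order as X.ravel() / Y.ravel()
points = np.column_stack([X.ravel(), Y.ravel()])   # shape (n_points, 2)
Z = np.array([z(p) for p in points]).reshape(X.shape)
```

Because Z is reshaped to X.shape, it can be passed as ax.plot_surface(X, Y, Z) directly. If the fitted model can score a whole (n_points, 2) array in one call, the list comprehension could be replaced by that call; this is not checked here.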

Plotting surface of implicitly defined volume

Having a volume implicitly defined by
x*y*z <= 1
for
-5 <= x <= 5
-5 <= y <= 5
-5 <= z <= 5
how would I go about plotting its outer surface using available Python modules, preferably mayavi?
I am aware of the function mlab.mesh, but I don't understand its input. It requires three 2D arrays, that I don't understand how to create having the above information.
EDIT:
Maybe my problem lies with an insufficient understanding of the meshgrid() function or the mgrid class of numpy. I see that I have to use them in some way, but I do not completely grasp their purpose or what such a grid represents.
EDIT:
I arrived at this:
import numpy as np
from mayavi import mlab
x, y, z = np.ogrid[-5:5:200j, -5:5:200j, -5:5:200j]
s = x*y*z
src = mlab.pipeline.scalar_field(s)
mlab.pipeline.iso_surface(src, contours=[1., ],)
mlab.show()
This results in an isosurface (for x*y*z=1) of a volume though, which is not quite what I was looking for. What I am looking for is basically a method to draw an arbitrary surface, like a "polygon in 3d" if there is such a thing.
I created the following code, which plots a surface (works with mayavi, too). I would need to modify this code to my particular problem, but to do that I need to understand why and how a 3d surface is defined by three 2d-arrays? What do these arrays (x, y and z) represent?
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D
phi, theta = np.mgrid[0:np.pi:11j, 0:2*np.pi:11j]
x = np.sin(phi) * np.cos(theta)
y = np.sin(phi) * np.sin(theta)
z = np.cos(phi)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(x,y,z)
fig.show()

The outer surface, implicitly defined by
x*y*z = 1,
cannot be defined explicitly globally. To see this, consider x and y given, then:
z = 1/(x*y),
which is not defined for x = 0 or y = 0. Therefore, you can only define your surface locally for domains that do not include the singularity, e.g. for the domain
0 < x <= 5
0 < y <= 5
z is indeed defined (a hyperbolic surface). Similarly, you need to plot the surfaces for the other domains, until you have patched together
-5 <= x <= 5
-5 <= y <= 5
Note that your surface is not defined for x = 0 or y = 0, i.e. along the coordinate axes of the (x, y) domain, so you cannot patch your surfaces together to get a globally defined surface.
Using numpy and matplotlib, you can plot one of these surfaces as follows (adapted from http://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html#surface-plots):
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X = np.arange(0.25, 5, 0.25)
Y = np.arange(0.25, 5, 0.25)
X, Y = np.meshgrid(X, Y)
Z = 1/(X*Y)
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
ax.set_zlim(0, 10)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
I'm not familiar with mayavi, but I would assume that creating the meshes with numpy would work the same.
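The patching described above could be sketched as one mesh per quadrant of the (x, y) domain. The helper name quadrant_patch, the starting offset 0.25 and the choice to mark points with |z| > 5 as NaN are assumptions made for this illustration.

```python
import numpy as np

def quadrant_patch(sign_x, sign_y, n=20, limit=5.0):
    """Mesh of z = 1/(x*y) on one quadrant, staying away from x = 0 and y = 0."""
    xs = sign_x * np.linspace(0.25, limit, n)
    ys = sign_y * np.linspace(0.25, limit, n)
    X, Y = np.meshgrid(xs, ys)
    Z = 1.0 / (X * Y)
    Z[np.abs(Z) > limit] = np.nan    # outside the box -5 <= z <= 5
    return X, Y, Z

# Four separate patches, one per sign combination of x and y
patches = [quadrant_patch(sx, sy) for sx in (1, -1) for sy in (1, -1)]
```

Each (X, Y, Z) triple can be passed to ax.plot_surface on the same axes. How NaN entries are rendered may depend on the Matplotlib version, so clipping with ax.set_zlim as above is a possible alternative.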

The test case in the Mayavi docs where the function test_mesh() is defined is capable of producing a sphere. This is done by replacing
r = sin(m0*phi)**m1 + cos(m2*phi)**m3 + sin(m4*theta)**m5 + cos(m6*theta)**m7
with r = 1.0 say.
However, your problem is you need to understand that the equations you are writing define a volume when you want to draw a sphere. You need to reformulate them to give a parametric equation of a sphere. This is essentially what is done in the above example, but it may be worth your while to try it yourself. As a hint consider the equation of a circle and extend it.
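As for why a surface is given by three 2D arrays: the array indices [i, j] run over a grid in parameter space, and x[i, j], y[i, j], z[i, j] are the coordinates of the surface point belonging to that grid node. Neighbouring entries are joined to form the mesh. A minimal sketch for a sphere, reusing the parametrization from the question (the variable radius is an addition for the illustration):

```python
import numpy as np

radius = 1.0
# Parameter grid: phi is the polar angle, theta the azimuth; both have shape (11, 11)
phi, theta = np.mgrid[0:np.pi:11j, 0:2 * np.pi:11j]

# Coordinates of the surface point for each (phi, theta) grid node
x = radius * np.sin(phi) * np.cos(theta)
y = radius * np.sin(phi) * np.sin(theta)
z = radius * np.cos(phi)
```

Every node satisfies x**2 + y**2 + z**2 == radius**2, so the three arrays describe points on the sphere rather than the volume inside it.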
