Scipy curve fitting plots multiple fitted graphs instead of one [duplicate] - python

I'm trying to fit a second order polynomial to raw data and output the results using Matplotlib. There are about a million points in the data set that I'm trying to fit. It is supposed to be simple, with many examples available around the web. However for some reason I cannot get it right.
I get the following warning message:
RankWarning: Polyfit may be poorly conditioned
This is my output:
This is output using Excel:
See below for my code. What am I missing??
xData = df['X']
yData = df['Y']
xTitle = 'X'
yTitle = 'Y'
title = ''
minX = 100
maxX = 300
minY = 500
maxY = 2200
title_font = {'fontname':'Arial', 'size':'30', 'color':'black', 'weight':'normal',
'verticalalignment':'bottom'} # Bottom vertical alignment for more space
axis_font = {'fontname':'Arial', 'size':'18'}
#Poly fit
# calculate polynomial
z = np.polyfit(xData, yData, 2)
f = np.poly1d(z)
print(f)
# calculate new x's and y's
x_new = xData
y_new = f(x_new)
#Plot
plt.scatter(xData, yData,c='#002776',edgecolors='none')
plt.plot(x_new,y_new,c='#C60C30')
plt.ylim([minY,maxY])
plt.xlim([minX,maxX])
plt.xlabel(xTitle,**axis_font)
plt.ylabel(yTitle,**axis_font)
plt.title(title,**title_font)
plt.show()

The array passed to plot must be sorted by x, because plt.plot connects the points in the order they are given. Here is a comparison between plotting a sorted and an unsorted array. The plot in the unsorted case looks completely distorted; the fitted function, however, is of course the same.
The code below prints the same fitted polynomial in both cases:
-3.496 x^2 + 2.18 x + 17.26
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)
x = (np.random.normal(size=300)+1)
fo = lambda x: -3*x**2+ 1.*x +20.
f = lambda x: fo(x) + (np.random.normal(size=len(x))-0.5)*4
y = f(x)
fig, (ax, ax2) = plt.subplots(1,2, figsize=(6,3))
ax.scatter(x,y)
ax2.scatter(x,y)
def fit(ax, x,y, sort=True):
    z = np.polyfit(x, y, 2)
    fit = np.poly1d(z)
    print(fit)
    ax.set_title("unsorted")
    if sort:
        x = np.sort(x)
        ax.set_title("sorted")
    ax.plot(x, fo(x), label="original func", color="k", alpha=0.6)
    ax.plot(x, fit(x), label="fit func", color="C3", alpha=1, lw=2.5 )
    ax.legend()
fit(ax, x,y, sort=False)
fit(ax2, x,y, sort=True)
plt.show()

The problem is probably using a power basis for data that is displaced some distance from zero along the x axis. If you use the Polynomial class from numpy.polynomial it will scale and shift the data before the fit, which will help, and also keep track of the scale and shift used. Note that if you want the coefficients in the normal form you will need to convert to that form.
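As a minimal sketch of this suggestion (the synthetic data and variable names are assumptions for illustration, not the asker's data), Polynomial.fit maps x onto a scaled window before fitting, and convert() gives the coefficients back in the ordinary unscaled power basis, ordered from lowest to highest degree:

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical data: x lies far from zero, similar to the 100..300 range in the question.
rng = np.random.default_rng(0)
x = rng.uniform(100, 300, size=1000)
y = 0.02 * x**2 - 3.0 * x + 700 + rng.normal(scale=5.0, size=x.size)

# Fit a degree-2 polynomial; the class shifts and scales x internally.
p = Polynomial.fit(x, y, deg=2)

# Coefficients in the plain power basis: c0 + c1*x + c2*x**2.
coefs = p.convert().coef

# Evaluate on sorted x values so that a line plot draws a single curve.
xs = np.linspace(x.min(), x.max(), 200)
ys = p(xs)
```

Plotting xs against ys with plt.plot then draws one curve, since xs is already in increasing order.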

Related

Plotting a heatmap with interpolation in Python using excel file

I need to plot a HEATMAP in python using x, y, z data from the excel file.
All the values of z are 1 except at (x=5,y=5). The plot should be red at point (5,5) and blue elsewhere. But I am getting false alarms which need to be removed. The COLORMAP I have used is 'jet'
X=[0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9]
Y=[0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9]
Z=[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
Code I have used is:
import matplotlib.pyplot as plt
import numpy as np
from numpy import ravel
from scipy.interpolate import interp2d
import pandas as pd
import matplotlib as mpl
excel_data_df = pd.read_excel('test.xlsx')
X= excel_data_df['x'].tolist()
Y= excel_data_df['y'].tolist()
Z= excel_data_df['z'].tolist()
x_list = np.array(X)
y_list = np.array(Y)
z_list = np.array(Z)
# f will be a function with two arguments (x and y coordinates),
# but those can be array_like structures too, in which case the
# result will be a matrix representing the values in the grid
# specified by those arguments
f = interp2d(x_list,y_list,z_list,kind="linear")
x_coords = np.arange(min(x_list),max(x_list))
y_coords = np.arange(min(y_list),max(y_list))
z= f(x_coords,y_coords)
fig = plt.imshow(z,
extent=[min(x_list),max(x_list),min(y_list),max(y_list)],
origin="lower", interpolation='bicubic', cmap= 'jet', aspect='auto')
# Show the positions of the sample points, just to have some reference
fig.axes.set_autoscale_on(False)
#plt.scatter(x_list,y_list,400, facecolors='none')
plt.xlabel('X Values', fontsize = 15, va="center")
plt.ylabel('Y Values', fontsize = 15,va="center")
plt.title('Heatmap', fontsize = 20)
plt.tight_layout()
plt.show()
For your ease you can also use the X, Y, Z arrays instead of reading excel file.
The result that I am getting is:
Here you can see dark blue regions at (5,0) and (0,5). These are the FALSE ALARMS I am getting and I need to REMOVE these.
I am probably doing some beginner's mistake. Grateful to anyone who points it out. Regards
There are at least three problems in your example:
x_coords and y_coords are not properly resampled;
the interpolated z does not fill in the whole grid, leading to incorrect output;
the output is then forced onto the original grid (extent), which adds to the confusion.
Leading to the following interpolated results:
On top of this, you have applied extra smoothing with imshow.
Let's create your artificial input:
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate  # needed by the interpolation examples further down
x = np.arange(0, 11)
y = np.arange(0, 11)
X, Y = np.meshgrid(x, y)
Z = np.ones(X.shape)
Z[5,5] = 9
Depending on how you want to proceed, you can simply let imshow smooth your signal by interpolation:
fig, axe = plt.subplots()
axe.imshow(Z, origin="lower", cmap="jet", interpolation='bicubic')
And you are done, simple and efficient!
If you aim to do it by yourself, then choose the interpolant that suits you best and resample on a grid with a higher resolution:
interpolant = interpolate.interp2d(x, y, Z.ravel(), kind="linear")
xlin = np.linspace(0, 10, 101)
ylin = np.linspace(0, 10, 101)
zhat = interpolant(xlin, ylin)
fig, axe = plt.subplots()
axe.imshow(zhat, origin="lower", cmap="jet")
Have a closer look at the scipy.interpolate module to pick the interpolant that best suits your needs. Note that not all methods expose the same interface for passing parameters. You may need to reshape your data to use other objects.
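As one illustration of a different interface, here is a hedged sketch using RegularGridInterpolator, which may be worth considering because interp2d is marked as deprecated in recent SciPy releases. It is an alternative to the calls used in this answer, not the answer's own method. It takes the grid axes as a tuple and a list of query points rather than two 1D coordinate arrays:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Same artificial input as above: ones everywhere, 9 at (5, 5).
x = np.arange(0, 11, dtype=float)
y = np.arange(0, 11, dtype=float)
Z = np.ones((y.size, x.size))
Z[5, 5] = 9

# Z is indexed as Z[row, col] = Z[y, x], so the axes are passed in (y, x) order.
interp = RegularGridInterpolator((y, x), Z, method="linear")

# Resample on a finer grid; query points are rows of (y, x) pairs.
xlin = np.linspace(0, 10, 101)
ylin = np.linspace(0, 10, 101)
Ylin, Xlin = np.meshgrid(ylin, xlin, indexing="ij")
points = np.column_stack([Ylin.ravel(), Xlin.ravel()])
zhat = interp(points).reshape(Ylin.shape)
```

The resulting zhat can be passed to imshow with origin="lower" in the same way as the other examples.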
MCVE
Here is a complete example using the trial data generated above. Just bind it to your excel columns:
# Flatten trial data to meet your requirement:
x = X.ravel()
y = Y.ravel()
z = Z.ravel()
# Resampling on a square grid with the given resolution:
resolution = 11
xlin = np.linspace(x.min(), x.max(), resolution)
ylin = np.linspace(y.min(), y.max(), resolution)
Xlin, Ylin = np.meshgrid(xlin, ylin)
# Nearest-neighbour multi-dimensional interpolation:
interpolant = interpolate.NearestNDInterpolator(list(zip(x, y)), z)
Zhat = interpolant(Xlin.ravel(), Ylin.ravel()).reshape(Xlin.shape)
# Render and interpolate again if necessary:
fig, axe = plt.subplots()
axe.imshow(Zhat, origin="lower", cmap="jet", interpolation='bicubic')
Which renders as expected:

How to plot a mathematical equation in python

I have a mathematical function
y = x^3 + sin(x) which I calculated using the formula below
np.random.seed(10)
x = np.random.random(20)
def calculate(x):
    cube_x = np.power(x,3)
    sin_x = np.sin(x)
    y = cube_x + sin_x
    return y
and I created a plot for the above equation
fig = plt.figure(figsize = (14, 8))
##Plot y = x^3 + sin(x)
y = calculate(x)
##plt.plot(x, y, 'b', label = '$x^3$ + $\sin$ $(x)$')
# Add features to our figure
plt.legend()
plt.grid(True, linestyle =':')
plt.xlim([0, 2])
plt.ylim([0, 2])
plt.title("Plot of y = $x^3$ + $\sin$ $(x)$ ")
plt.xlabel('x-axis')
plt.ylabel('y-axis')
# Show plot
plt.show()
I am not sure the above graph is correct. I would appreciate help checking whether I am getting the desired graph for this function.
You should sort your random array in order to generate the plot correctly. You can use:
x = np.sort(np.random.random(20))
You can also use plt.scatter() instead of plt.plot(), so you don't have to sort the x array.
Like JMA said, you should sort x first. If you had plotted your original data as a scatter, it would look fine:
However, if you are in a situation where you cannot sort your input data before evaluating the function y, you can use np.argsort. Say you already have x and y computed and need to sort both based on the order of x alone (e.g. y is not monotonic); you would use the following lines.
idx = np.argsort(x)
x, y = x[idx], y[idx]
and your plot would look like:
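Putting the pieces together, a minimal end-to-end sketch (reusing the question's calculate function and seed; the names x_sorted and y_sorted are introduced here for illustration) shows that reordering both arrays with the same index keeps every y paired with its x:

```python
import numpy as np

def calculate(x):
    # y = x^3 + sin(x), as in the question
    return np.power(x, 3) + np.sin(x)

np.random.seed(10)
x = np.random.random(20)   # random, therefore unsorted
y = calculate(x)           # evaluated before any sorting

# One index array reorders both x and y consistently.
idx = np.argsort(x)
x_sorted, y_sorted = x[idx], y[idx]
```

Calling plt.plot(x_sorted, y_sorted) then connects the points from left to right instead of jumping back and forth.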

How to find the tangent of a curve with an available dataset for curvature?

I have a dataset for curvature and I need to find the tangent to the curve. My code is as follows, but unfortunately I am not getting the required result:
chData = efficient.get('Car.Road.y')
fittedParameters = (np.gradient(chData_m_5[:],1)) # 999 values
plt.plot(chData[1:]) # originally 1000 values
plt.plot(fittedParameters)
plt.show()
The output is:
Edit 1:
I made the following changes to the code to get the tangent to the curve, but unfortunately the result is a bit far from the curve. Could you point out what causes this and how to fix it? Thank you!
fig, ax1 = plt.subplots()
chData_m = efficient.get('Car.Road.y')
x_fit = chData_m.timestamps
y_fit = chData_m.samples
fittedParameters = np.polyfit(x_fit[:],y_fit[:],1)
f = plt.figure(figsize=(800/100.0, 600/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
axes.plot(x_fit, y_fit, 'D')
# create data for the fitted equation plot
xModel = np.linspace(min(x_fit), max(x_fit))
yModel = np.polyval(fittedParameters, xModel)
# now the model as a line plot
axes.plot(xModel, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
# polynomial derivative from numpy
deriv = np.polyder(fittedParameters)
# for plotting
minX = min(x_fit)
maxX = max(x_fit)
# value of derivative (slope) at a specific X value, so
# that a straight line tangent can be plotted at the point
# you might place this code in a loop to animate
pointVal = 10.0 # example X value
y_value_at_point = np.polyval(fittedParameters, pointVal)
slope_at_point = np.polyval(deriv, pointVal)
ylow = (minX - pointVal) * slope_at_point + y_value_at_point
yhigh = (maxX - pointVal) * slope_at_point + y_value_at_point
# now the tangent as a line plot
axes.plot([minX, maxX], [ylow, yhigh])
plt.show()
plt.close('all') # clean up after using pyplot
And the output is:
Most likely just a scaling problem that we can address by creating a twin axis for the gradient that is scaled independently of the original data. To be on the safe side, we also provide the x-values to np.gradient in case they are not evenly spaced.
import matplotlib.pyplot as plt
import numpy as np
fig, ax1 = plt.subplots()
def func(x, a=0, b=100, c=1, n=3.5):
    return a + (b/(1+(c/x)**n))
x_fit = np.linspace(0.1, 70, 100)
y_fit = func(x_fit, 1, 2, 15, 2.4)
tang = np.gradient(y_fit, x_fit)
ax1.plot(x_fit, y_fit, c="blue", label="data")
ax1.legend()
ax1.set_ylabel("data")
ax2 = ax1.twinx()
ax2.plot(x_fit, tang, c="red", label="gradient")
ax2.legend()
ax2.set_ylabel("gradient")
plt.show()
Sample output:
The figure if we plotted it in the same graph:
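If what is wanted is a straight tangent line at one particular point rather than the gradient curve, a possible sketch (assuming the same test function as above; the index i0 is a hypothetical choice) takes the slope from np.gradient at that point and uses the point-slope form y = y0 + m0 * (x - x0):

```python
import numpy as np

def func(x, a=0, b=100, c=1, n=3.5):
    return a + (b / (1 + (c / x) ** n))

x_fit = np.linspace(0.1, 70, 100)
y_fit = func(x_fit, 1, 2, 15, 2.4)

# Numerical derivative dy/dx at every sample; passing x_fit handles uneven spacing.
slope = np.gradient(y_fit, x_fit)

# Hypothetical point of tangency, chosen by index.
i0 = 20
x0, y0, m0 = x_fit[i0], y_fit[i0], slope[i0]

# Tangent line through (x0, y0) with slope m0.
tangent = y0 + m0 * (x_fit - x0)
```

Because the tangent is expressed in the units of the data, it can be drawn on the same axis as the curve with ax1.plot(x_fit, tangent) and touches the curve at x0.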

Numpy way to sort out a messy array for plotting

I have the data of a plot in two arrays that are stored in an unsorted way, so the plot jumps from one place to another discontinuously:
I have tried one example of finding the closest point in a 2D array:
import numpy as np
def distance(pt_1, pt_2):
    pt_1 = np.array((pt_1[0], pt_1[1]))
    pt_2 = np.array((pt_2[0], pt_2[1]))
    return np.linalg.norm(pt_1-pt_2)
def closest_node(node, nodes):
    nodes = np.asarray(nodes)
    dist_2 = np.sum((nodes - node)**2, axis=1)
    return np.argmin(dist_2)
a = []
for x in range(50000):
    a.append((np.random.randint(0,1000),np.random.randint(0,1000)))
some_pt = (1, 2)
closest_node(some_pt, a)
Can I use it somehow to "clean" my data? (in the above code, a can be my data)
Exemplary data from my calculations is:
array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
After using radial_sort_line (of Joe Kington) I have received the following plot:
This is actually a problem that's tougher than you might think in general.
In your exact case, you might be able to get away with sorting by the y-values. It's hard to tell for sure from the plot.
Therefore, a better approach for somewhat circular shapes like this is to do a radial sort.
For example, let's generate some data somewhat similar to yours:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(.2, 1.6 * np.pi)
x, y = np.cos(t), np.sin(t)
# Shuffle the points...
i = np.arange(t.size)
np.random.shuffle(i)
x, y = x[i], y[i]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, now let's try to undo that shuffle by using a radial sort. We'll use the centroid of the points as the center and calculate the angle to each point, then sort by that angle:
x0, y0 = x.mean(), y.mean()
angle = np.arctan2(y - y0, x - x0)
idx = angle.argsort()
x, y = x[idx], y[idx]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, pretty close! If we were working with a closed polygon, we'd be done.
However, we have one problem -- This closes the wrong gap. We'd rather have the angle start at the position of the largest gap in the line.
Therefore, we'll need to calculate the gap to each adjacent point on our new line and re-do the sort based on a new starting angle:
dx = np.diff(np.append(x, x[-1]))
dy = np.diff(np.append(y, y[-1]))
max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
x = np.append(x[max_gap:], x[:max_gap])
y = np.append(y[max_gap:], y[:max_gap])
Which results in:
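One case the gap computation above does not measure is the distance between the last and the first point of the sorted line. The following is a hedged variant, not the answer's own code, that includes this wrap-around gap by using np.roll; the function name split_at_largest_gap is made up for illustration:

```python
import numpy as np

def split_at_largest_gap(x, y):
    """Rotate an ordered ring of points so the line starts after its largest gap."""
    # gap[i] is the distance from point i to point i+1, wrapping from last to first.
    gap = np.hypot(np.roll(x, -1) - x, np.roll(y, -1) - y)
    start = (gap.argmax() + 1) % x.size
    return np.roll(x, -start), np.roll(y, -start)

# An open arc whose points are in order but start at an arbitrary position.
t = np.linspace(-0.8 * np.pi, 0.8 * np.pi, 30)
x, y = np.cos(t), np.sin(t)
x_rot, y_rot = np.roll(x, 7), np.roll(y, 7)

x_fixed, y_fixed = split_at_largest_gap(x_rot, y_rot)
```

If the largest gap already lies between the last and the first point, start is 0 and the arrays are returned in their existing order.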
As a complete, stand-alone example:
import numpy as np
import matplotlib.pyplot as plt
def main():
    x, y = generate_data()
    plot(x, y).set(title='Original data')
    x, y = radial_sort_line(x, y)
    plot(x, y).set(title='Sorted data')
    plt.show()
def generate_data(num=50):
    t = np.linspace(.2, 1.6 * np.pi, num)
    x, y = np.cos(t), np.sin(t)
    # Shuffle the points...
    i = np.arange(t.size)
    np.random.shuffle(i)
    x, y = x[i], y[i]
    return x, y
def radial_sort_line(x, y):
    """Sort unordered verts of an unclosed line by angle from their center."""
    # Radial sort
    x0, y0 = x.mean(), y.mean()
    angle = np.arctan2(y - y0, x - x0)
    idx = angle.argsort()
    x, y = x[idx], y[idx]
    # Split at opening in line
    dx = np.diff(np.append(x, x[-1]))
    dy = np.diff(np.append(y, y[-1]))
    max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
    x = np.append(x[max_gap:], x[:max_gap])
    y = np.append(y[max_gap:], y[:max_gap])
    return x, y
def plot(x, y):
    fig, ax = plt.subplots()
    ax.plot(x, y, color='lightblue')
    ax.margins(0.05)
    return ax
main()
Sorting the data based on their angle relative to the center, as in @JoeKington's solution, might have problems with some parts of the data:
In [1]:
import scipy.spatial as ss
import matplotlib.pyplot as plt
import numpy as np
import re
%matplotlib inline
In [2]:
data=np.array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
In [3]:
plt.plot(data[0], data[1])
plt.title('Unsorted Data')
Out[3]:
<matplotlib.text.Text at 0x10a5c0550>
See x values between 15 and 20 are not sorted correctly.
In [10]:
#Calculate the angle in degrees, mapped to (0, 360]
sort_index = np.angle(np.dot((data.T-data.mean(1)), np.array([1.0, 1.0j])), deg=True)
sort_index = np.where(sort_index>0, sort_index, sort_index+360)
#sorted the data by angle and plot them
sort_index = sort_index.argsort()
plt.plot(data[0][sort_index], data[1][sort_index])
plt.title('Data Sorted by angle relatively to the centroid')
plt.plot(data[0], data[1], 'r+')
Out[10]:
[<matplotlib.lines.Line2D at 0x10b009e10>]
We can sort the data based on a nearest neighbor approach, but since the x and y are of very different scale, the choice of distance metrics becomes an important issue. We will just try all the distance metrics available in scipy to get an idea:
In [7]:
def sort_dots(metrics, ax, start):
    dist_m = ss.distance.squareform(ss.distance.pdist(data.T, metrics))
    total_points = data.shape[1]
    points_index = set(range(total_points))
    sorted_index = []
    target = start
    ax.plot(data[0, target], data[1, target], 'o', markersize=16)
    points_index.discard(target)
    while len(points_index)>0:
        candidate = list(points_index)
        nneigbour = candidate[dist_m[target, candidate].argmin()]
        points_index.discard(nneigbour)
        points_index.discard(target)
        #print points_index, target, nneigbour
        sorted_index.append(target)
        target = nneigbour
    sorted_index.append(target)
    ax.plot(data[0][sorted_index], data[1][sorted_index])
    ax.set_title(metrics)
In [6]:
dmetrics = re.findall('pdist\(X\,\s+\'(.*)\'', ss.distance.pdist.__doc__)
In [8]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 5)
    except:
        ax.set_title(metrics + '(unsuitable)')
It looks like the standardized Euclidean and Mahalanobis metrics give the best result. Note that we chose the 6th data point (index 5) as the starting point; it is the data point with the largest y value (use argmax to get the index, of course).
In [9]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 13)
    except:
        ax.set_title(metrics + '(unsuitable)')
This is what happens if you choose the point with the maximum x value (index 13) as the starting point. It appears that the Mahalanobis metric is better than the standardized Euclidean one, as it is not affected by the starting point we choose.
If we assume that the data are 2D and that the x axis should be increasing, then you could:
sort the x axis data, e.g. x_old, and store the result in a different variable, e.g. x_new
for each element in x_new, find its index in the x_old array
re-order the elements of the y axis array according to the indices obtained in the previous step
I would do it with Python lists instead of NumPy arrays, because the list.index method is easier to work with than numpy.where.
E.g. (assuming that x_old and y_old are your existing NumPy arrays for the x and y axes respectively)
import numpy as np
x_new_tmp = x_old.tolist()
y_new_tmp = y_old.tolist()
x_new = sorted(x_new_tmp)
y_new = [y_new_tmp[x_new_tmp.index(i)] for i in x_new]
Then you can plot x_new and y_new
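The same reordering can also be sketched with np.argsort, which is swapped in here for the list.index lookup. The sample values are made up for illustration; they include a repeated x value, a case where list.index would return the first matching position twice:

```python
import numpy as np

# Hypothetical unsorted data with a duplicated x value (1.0 appears twice).
x_old = np.array([3.0, 1.0, 2.0, 1.0])
y_old = np.array([30.0, 10.0, 20.0, 11.0])

# Indices that sort x; a stable sort keeps equal x values in their original order.
order = np.argsort(x_old, kind="stable")

x_new = x_old[order]
y_new = y_old[order]
```

Each y value is used exactly once here, because the reordering is driven by positions rather than by looking up values.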

3D-plot of the error function in a linear regression

I would like to visually plot a 3D graph of the error function calculated for a given slope and y-intercept for a linear regression.
This graph will be used to illustrate a gradient descent application.
Let’s suppose we want to model a set of points with a line. To do this we’ll use the standard y=mx+b line equation where m is the line’s slope and b is the line’s y-intercept. To find the best line for our data, we need to find the best set of slope m and y-intercept b values.
A standard approach to solving this type of problem is to define an error function (also called a cost function) that measures how “good” a given line is. This function will take in a (m,b) pair and return an error value based on how well the line fits the data. To compute this error for a given line, we’ll iterate through each (x,y) point in the data set and sum the square distances between each point’s y value and the candidate line’s y value (computed at mx+b). It’s conventional to square this distance to ensure that it is positive and to make our error function differentiable. In python, computing the error for a given line will look like:
# y = mx + b
# m is slope, b is y-intercept
def computeErrorForLineGivenPoints(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))
Since the error function consists of two parameters (m and b) we can visualize it as a two-dimensional surface.
Now my question: how can we plot such a 3D graph using Python?
Here is skeleton code for building a 3D plot. This snippet is unrelated to the question's context, but it shows the basics of building a 3D plot.
For my example I would need the x-axis to be the slope, the y-axis the y-intercept and the z-axis the error.
Can someone help me build an example of such a graph?
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import random
def fun(x, y):
    return x**2 + y
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = y = np.arange(-3.0, 3.0, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array([fun(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
ax.plot_surface(X, Y, Z)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
plt.show()
The above code produces the following plot, which is very similar to what I am looking for.
Simply replace fun with computeErrorForLineGivenPoints:
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import collections
def error(m, b, points):
    totalError = 0
    for i in range(0, len(points)):
        totalError += (points[i].y - (m * points[i].x + b)) ** 2
    return totalError / float(len(points))
x = y = np.arange(-3.0, 3.0, 0.05)
Point = collections.namedtuple('Point', ['x', 'y'])
m, b = 3, 2
noise = np.random.random(x.size)
points = [Point(xp, m*xp+b+err) for xp,err in zip(x, noise)]
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ms = np.linspace(2.0, 4.0, 10)
bs = np.linspace(1.5, 2.5, 10)
M, B = np.meshgrid(ms, bs)
zs = np.array([error(mp, bp, points)
for mp, bp in zip(np.ravel(M), np.ravel(B))])
Z = zs.reshape(M.shape)
ax.plot_surface(M, B, Z, rstride=1, cstride=1, color='b', alpha=0.5)
ax.set_xlabel('m')
ax.set_ylabel('b')
ax.set_zlabel('error')
plt.show()
yields
Tip: I renamed computeErrorForLineGivenPoints as error. Generally, there is no need to name a function compute... since almost all functions compute something. You also do not need to specify "GivenPoints" since the function signature shows that points is an argument. If you have other error functions or variables in your program, line_error or total_error might be a better name for this function.
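For larger grids, the per-point Python loop can be replaced by NumPy broadcasting. The following is a sketch under the assumption that the data are held in plain x and y arrays rather than Point tuples; the function name error_surface is made up for illustration:

```python
import numpy as np

def error_surface(ms, bs, x, y):
    """Mean squared error of y = m*x + b for every (m, b) pair on the grid."""
    M, B = np.meshgrid(ms, bs)
    # Residuals have shape (len(bs), len(ms), len(x)) through broadcasting.
    resid = y - (M[..., None] * x + B[..., None])
    return M, B, np.mean(resid ** 2, axis=-1)

# Noisy samples of a line with slope 3 and intercept 2, as in the answer above.
rng = np.random.default_rng(0)
x = np.arange(-3.0, 3.0, 0.05)
y = 3 * x + 2 + rng.random(x.size)

ms = np.linspace(2.0, 4.0, 10)
bs = np.linspace(1.5, 2.5, 10)
M, B, Z = error_surface(ms, bs, x, y)
```

M, B and Z have matching shapes, so they can be passed to ax.plot_surface(M, B, Z) in the same way as in the loop-based version.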
