I'm trying to use an MLPRegressor to fit a predefined 3D function. The problem is that I can't get it to predict the correct result, and therefore the fit looks awful when plotted.
The function is the following:
import numpy as np

def threeDFunc(xin, yin):
    z = np.zeros((40, 40))
    for xIndex in range(0, 40, 1):
        for yIndex in range(0, 40, 1):
            z[xIndex, yIndex] = np.exp(-(xin[xIndex]**2 + yin[yIndex]**2) / 0.1)
    return z
xThD = np.arange(-1,1,0.05)
yThD = np.arange(-1,1,0.05)
zThD = threeDFunc(xThD, yThD)
The plot above is what it should approximate.
The red wireframe is what it actually produces.
The code looks like this:
from sklearn import neural_network

classifier = neural_network.MLPRegressor(hidden_layer_sizes=(200, 200), activation='logistic', learning_rate='adaptive')
xy = np.array((xThD.flatten(), yThD.flatten()))
classifier.fit(np.transpose(xy), zThD)
pre = classifier.predict(np.transpose(xy))

import pylab
from mpl_toolkits.mplot3d import Axes3D

fig = pylab.figure()
ax = Axes3D(fig)
X, Y = np.meshgrid(xThD, yThD)
ax.plot_wireframe(X, Y, zThD)
ax.plot_wireframe(X, Y, pre, color='red')
print(np.shape(zThD))
print(np.shape(pre))
pylab.show()
Change the activation function to the hyperbolic tangent with activation='tanh' and the solver to L-BFGS with solver='lbfgs'.
If your classifier instantiation then looks as follows, the red and blue plots should be nearly identical:
classifier = neural_network.MLPRegressor(hidden_layer_sizes=(200, 200), solver='lbfgs', activation='tanh', learning_rate='adaptive')
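After re-running the script above with this instantiation, a quick numerical sanity check (a sketch using the question's own variables) is:
pre = classifier.predict(np.transpose(xy))
# The maximum absolute error should be small; the exact value depends on the
# random weight initialisation
print(np.abs(pre - zThD).max())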
Using Python, I'm trying to plot a sine wave and a random distribution, then show where their ratio is greater than or equal to 3.
I think I'm 90% of the way there but keep getting the error message 'x and y must be the same size' when I try to plot it. I've been racking my brains but can't figure out what I'm missing.
Any help or pointers gratefully received.
import numpy as np
import math
import matplotlib.pyplot as plt
r= 2*math.pi
dev = 0.1
x = np.array(np.arange(0, r, dev))
y1 = np.array(np.sin(x))
y2 = np.array(np.random.normal(loc=0, scale=0.1, size=63))
mask = y1//y2 >= 3
fit = np.array(x[mask])
print(fit)
plt.plot(x, y1)
plt.scatter(x, fit)
plt.scatter(x, y2, marker=".")
plt.show()
Insert this line into your code, just before the point of error:
print(len(x), len(fit))
Output:
63 28
You explicitly removed elements from your sequence and then expected the two arrays to be the same size. You still have 63 x values, but now only 28 y values. Since you didn't trace the problem and explain what you intend for this scatter plot, I have no way of knowing what a "fix" might be. Perhaps make a list of points (x-y pairs), and then filter that for the appropriate y1/y2 ratio?
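A sketch of that suggestion (assuming the x, y1, y2 arrays from the question; note that // in the question is floor division, while / would give the true ratio):
# Keep only the points whose y1/y2 ratio is at least 3, as (x, y) pairs
points = [(xv, y1v) for xv, y1v, y2v in zip(x, y1, y2) if y1v / y2v >= 3]
fit_x = [p[0] for p in points]
fit_y = [p[1] for p in points]
plt.scatter(fit_x, fit_y)  # both sequences now have the same length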
Not sure if this is what you want, but this will scatter dots on the sine curve corresponding to your mask.
import numpy as np
import math
import matplotlib.pyplot as plt
r= 2*math.pi
dev = 0.1
x = np.array(np.arange(0, r, dev))
y1 = np.array(np.sin(x))
y2 = np.array(np.random.normal(loc=0, scale=0.1, size=63))
mask = y1//y2 >= 3
fit_x = np.array(x[mask])
fit_y = np.array(y1[mask])
plt.plot(x, y1)
plt.scatter(fit_x, fit_y)
plt.scatter(x, y2, marker=".")
plt.show()
In your line plt.scatter(x, fit) you are trying to scatter your x values against your fit values. However, fit is much shorter (28 elements in the output above) while x is of size 63 (as are y1 and y2, by the way; that's why those calls work).
mask is basically an array of False/True values. That means np.array(x[mask]) will only contain the values of x where the mask is True, which seems to be what you want. But you can only scatter this against something of matching length, such as np.array(np.sin(fit)); otherwise the sizes are incompatible.
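To illustrate the masking behaviour (a toy example, independent of the question's data):
import numpy as np

x = np.array([10, 20, 30, 40, 50])
mask = np.array([True, False, True, False, False])
print(x[mask])       # [10 30] -- only the elements where mask is True
print(len(x[mask]))  # 2, shorter than the original 5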
"""## Splitting the dataset into the Training set and Test set"""
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
"""## Training the Simple Linear Regression model on the Training set"""
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
"""## Predicting the Test set results"""
y_pred = regressor.predict(X_test)
"""## Visualising the Training set results"""
plt.scatter(X_train, y_train, color = 'green')
plt.plot(X_train, regressor.predict(X_train), color = 'yellow')
plt.title('Doctor visits(Training set)')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
"""## Visualising the Test set results"""
plt.scatter(X_test, y_test, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Doctor visits (Test set)')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
I want to use yellowbrick's ResidualsPlot to show the residuals of a linear regression model. From the docs, I can see that ResidualsPlot accepts a single color value for the training data:
train_color : color, default: 'b'
Residuals for training data are plotted with this color, but also given an opacity of 0.5 to ensure that the test data residuals are more visible. Can be any matplotlib color.
I would like the colors of the individual scatter points to match those in a comparable plot where I am plotting the regression and data points.
import numpy as np
from scipy.stats import linregress
import matplotlib.pyplot as plt
from yellowbrick.regressor import ResidualsPlot
from sklearn.linear_model import LinearRegression
data = np.array([[5.71032104e-01, 2.33781600e+03],
[6.28682565e-01, 2.25247200e+03],
[1.23262572e+00, 2.82244800e+03],
[7.44029502e-01, 2.49936000e+03],
[4.01478749e-01, 2.04825600e+03],
[3.46455997e-01, 2.32867200e+03],
[5.15778747e-01, 2.39268000e+03],
[4.16115498e-01, 2.20218000e+03],
[3.24103999e-01, 2.07264000e+03],
[4.29520513e-01, 1.97815200e+03],
[7.72794999e-01, 2.46278400e+03]])
x = data[:,1]
y = data[:,0]
names = np.array(['COTTONWOOD CREEK', 'EMIGRANT SUMMIT', 'GRAND TARGHEE',
'PHILLIPS BENCH', 'PINE CREEK PASS', 'SALT RIVER SUMMIT',
'SEDGWICK PEAK', 'SLUG CREEK DIVIDE', 'SOMSEN RANCH',
'WILDHORSE DIVIDE', 'WILLOW CREEK'], dtype=object)
colors = ['#a6cee3','#1f78b4','#b2df8a','#33a02c','#fb9a99','#e31a1c','#fdbf6f','#ff7f00','#cab2d6','#6a3d9a','#ffff99']
slope, intercept, r_value, p_value, std_err = linregress(x, y)
xHat = np.linspace(x.min()-300, x.max()+300, 100)
fig, (ax, ax1) = plt.subplots(nrows=2)
for name, x_, y_, color in zip(names, x, y, colors):
    ax.scatter(x_, y_, label=name, c=color)
ax.plot(xHat, xHat*slope + intercept, 'k--', marker=None)
ax.set_xlim(x.min()-200,x.max()+200)
leg = ax.legend(fontsize='x-small', loc='lower right')
ax.text(1934,1.27,'y=' + str(np.round(slope,6))+'x'+ str(np.round(intercept, 3)))
ax.text(1934,1.1, 'R$^2$ =' + str(np.round(r_value**2,4)))
linreg = LinearRegression()
vizul = ResidualsPlot(linreg, hist=False)
vizul.fit(x.reshape(-1,1) ,y.reshape(-1,1))
vizul.poof(ax=ax1)
plt.tight_layout()
Is it possible to achieve this without having to use base matplotlib for the residual plot?
Thanks.
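One possible workaround, sketched below: let yellowbrick draw the plot, then recolor the scatter points through the underlying matplotlib axes. This assumes the training-residual scatter is the first collection yellowbrick adds to the axes (an internal detail that may change between versions), so treat it as a hypothesis rather than documented API:
linreg = LinearRegression()
vizul = ResidualsPlot(linreg, hist=False)
vizul.fit(x.reshape(-1, 1), y.reshape(-1, 1))   # fit() also draws the residual points
vizul.ax.collections[0].set_facecolor(colors)   # recolor per point; assumes draw order
vizul.poof()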
I'm trying to plot the predicted mean from a Gaussian process regression as a 3-D contour. I've followed the Plot 3D Contour from an Image using extent with Matplotlib
and mplot3d example code: contour3d_demo3.py threads. Here is my code:
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
from matplotlib import cm
x_train = np.array([[0,0],[2,2],[3,3]])
y_train = np.array([[200,321,417]])
xvalues = np.array([0,1,2,3])
yvalues = np.array([0,1,2,3])
a,b = np.meshgrid(xvalues,yvalues)
positions = np.vstack([a.ravel(), b.ravel()])
x_test = (np.array(positions)).T
kernel = C(1.0, (1e-3, 1e3)) * RBF(10)
gp = GaussianProcessRegressor(kernel=kernel)
gp.fit(x_train, y_train)
y_pred_test = gp.predict(x_test)
fig = plt.figure()
ax = fig.add_subplot(projection = '3d')
x=y=np.arange(0,3,1)
X, Y = np.meshgrid(x,y)
Z = y_pred_test
cset = ax.contour(X, Y, Z, cmap=cm.coolwarm)
ax.clabel(cset, fontsize=9, inline=1)
plt.show()
After running the above code, I get a shape-related error on the console.
I want the x- and y-axes as a 2-D plane and the predicted values on the z-axis. The sample plot is as follows:
What is wrong with my code?
Thank you!
The specific error you've mentioned comes from your y_train, which might be a typo. It should be:
y_train_ : array-like, shape = (n_samples, [n_output_dims])
According to your x_train, you have 3 samples. So your y_train should have shape (3, 1) rather than (1, 3).
You also have other bugs in the plotting part:
1. add_subplot should have a position before projection='3d'.
2. Z should have the same shape as X and Y for a contour plot.
3. Because of 2, your x and y should match xvalues and yvalues.
Taken together, you might need to make the following changes:
...
y_train = np.array([200,321,417])
...
ax = fig.add_subplot(111, projection = '3d')
x=y=np.arange(0,4,1)
...
Z = y_pred_test.reshape(X.shape)
...
Just to mention two things:
The plot you will get after these changes won't match the figure you've shown. The figure in your question is a surface plot, not a contour plot. You can use ax.plot_surface to get that type of plot (see the sketch below).
I think you already know this, but just in case: your plot won't be as smooth as your sample plot, since your np.meshgrid is sparse.
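Putting the changes together with plot_surface, a minimal sketch (reusing gp, x_train, x_test, and cm from the question's script):
y_train = np.array([200, 321, 417])          # shape (3,): one target per sample
gp.fit(x_train, y_train)
y_pred_test = gp.predict(x_test)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')   # position argument added
x = y = np.arange(0, 4, 1)                   # matches xvalues and yvalues
X, Y = np.meshgrid(x, y)
Z = y_pred_test.reshape(X.shape)             # now the same shape as X and Y
ax.plot_surface(X, Y, Z, cmap=cm.coolwarm)
plt.show()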
I want to plot a probability density function z = f(x, y).
I found code to plot a surface in Color matplotlib plot_surface command with surface gradient,
but I don't know how to convert the z values into a grid so I can plot them.
The example code and my modification are below.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import mixture
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
%matplotlib inline
n_samples = 1000
# generate random sample, two components
np.random.seed(0)
shifted_gaussian = np.random.randn(n_samples, 2) + np.array([20, 5])
sample = shifted_gaussian
# fit a Gaussian Mixture Model with three components
clf = mixture.GMM(n_components=3, covariance_type='full')  # old API; GaussianMixture in sklearn >= 0.18
clf.fit(sample)
# Plot it
fig = plt.figure()
ax = fig.gca(projection='3d')
X = np.arange(-5, 5, .25)
Y = np.arange(-5, 5, .25)
X, Y = np.meshgrid(X, Y)
## In the example code, Z is generated on the grid:
# R = np.sqrt(X**2 + Y**2)
# Z = np.sin(R)
# In my case, for each point [x, y] the probability value is
# z = clf.score([x, y])
# but how can I generate a grid-like Z?
Gx, Gy = np.gradient(Z) # gradients with respect to x and y
G = (Gx**2+Gy**2)**.5 # gradient magnitude
N = G/G.max() # normalize 0..1
surf = ax.plot_surface(
    X, Y, Z, rstride=1, cstride=1,
    facecolors=cm.jet(N),
    linewidth=0, antialiased=False, shade=False)
plt.show()
The original approach generates z through the mesh. But in my case the fitted model cannot return results in a grid-like style, so the problem is: how can I generate the grid-style z values and plot them?
If I understand correctly, you basically have a function z that takes two scalar values x, y in a list and returns another scalar z_val. In other words, z_val = z([x, y]), right?
If that's the case, you could do the following (note that this is not written with efficiency in mind, but with a focus on readability):
from itertools import product
X = np.arange(15) # or whatever values for x
Y = np.arange(5) # or whatever values for y
N, M = len(X), len(Y)
Z = np.zeros((N, M))
for i, (x, y) in enumerate(product(X, Y)):
    Z[np.unravel_index(i, (N, M))] = z([x, y])
If you want to use plot_surface, then follow that with this:
X, Y = np.meshgrid(X, Y)
ax.plot_surface(X, Y, Z.T)
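If your function can also score a whole batch of points at once (as many scikit-learn models can, via score_samples for instance), the loop can be replaced by a vectorized version. A sketch, where z_batch is a hypothetical batched counterpart of z:
X, Y = np.meshgrid(np.arange(15), np.arange(5))
pts = np.column_stack([X.ravel(), Y.ravel()])  # one (x, y) pair per row
Z = z_batch(pts).reshape(X.shape)              # z_batch: hypothetical, returns one value per row
ax.plot_surface(X, Y, Z)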
So I have used scikit-learn's Gaussian mixture models (http://scikit-learn.org/stable/modules/mixture.html) to fit my data. Now I want to use the model; how can I do it? Specifically:
How can I plot the probability density distribution?
How can I calculate the mean square error of the fitting model?
Here is the code you may need:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from sklearn import mixture
import matplotlib as mpl
from matplotlib.patches import Ellipse
%matplotlib inline
n_samples = 300
# generate random sample, two components
np.random.seed(0)
shifted_gaussian = np.random.randn(n_samples, 2) + np.array([20, 5])
sample= shifted_gaussian
# fit a Gaussian Mixture Model with two components
clf = mixture.GMM(n_components=2, covariance_type='full')
clf.fit(sample)
# plot sample scatter
plt.scatter(sample[:, 0], sample[:, 1])
# 1. Plot the probability density distribution
# 2. Calculate the mean square error of the fitting model
UPDATE:
I can plot the distribution by:
x = np.linspace(-20.0, 30.0)
y = np.linspace(-20.0, 40.0)
X, Y = np.meshgrid(x, y)
XX = np.array([X.ravel(), Y.ravel()]).T
Z = -clf.score_samples(XX)[0]
Z = Z.reshape(X.shape)
CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=1000.0),
                 levels=np.logspace(0, 3, 10))
CB = plt.colorbar(CS, shrink=0.8, extend='both')
But isn't it quite strange? Is there a better way to do it? Can I plot something like this?
I think the result is reasonable if you adjust the xlim and ylim a little bit:
# plot sample scatter
plt.scatter(sample[:, 0], sample[:, 1], marker='+', alpha=0.5)
# 1. Plot the probability density distribution
# 2. Calculate the mean square error of the fitting model
x = np.linspace(-20.0, 30.0, 100)
y = np.linspace(-20.0, 40.0, 100)
X, Y = np.meshgrid(x, y)
XX = np.array([X.ravel(), Y.ravel()]).T
Z = -clf.score_samples(XX)[0]
Z = Z.reshape(X.shape)
CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=10.0),
                 levels=np.logspace(0, 1, 10))
CB = plt.colorbar(CS, shrink=0.8, extend='both')
plt.xlim((10,30))
plt.ylim((-5, 15))
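Since Z above is the negative log-likelihood, the actual probability density can be recovered with an exponential (a small follow-up sketch reusing X, Y, and Z from the answer):
density = np.exp(-Z)          # Z is -log p(x, y), so exp(-Z) is the density itself
plt.contourf(X, Y, density)
plt.colorbar()
plt.show()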