I need to plot a heatmap in Python using x, y, z data from an Excel file.
All the values of z are 1 except at (x=5, y=5). The plot should be red at (5, 5) and blue elsewhere, but I am getting false alarms which need to be removed. The colormap I have used is 'jet'.
X=[0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9]
Y=[0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9]
Z=[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
The code I have used is:
import matplotlib.pyplot as plt
import numpy as np
from numpy import ravel
from scipy.interpolate import interp2d
import pandas as pd
import matplotlib as mpl
excel_data_df = pd.read_excel('test.xlsx')
X= excel_data_df['x'].tolist()
Y= excel_data_df['y'].tolist()
Z= excel_data_df['z'].tolist()
x_list = np.array(X)
y_list = np.array(Y)
z_list = np.array(Z)
# f will be a function with two arguments (x and y coordinates),
# but those can be array_like structures too, in which case the
# result will be a matrix representing the values in the grid
# specified by those arguments
f = interp2d(x_list,y_list,z_list,kind="linear")
x_coords = np.arange(min(x_list),max(x_list))
y_coords = np.arange(min(y_list),max(y_list))
z= f(x_coords,y_coords)
fig = plt.imshow(z,
extent=[min(x_list),max(x_list),min(y_list),max(y_list)],
origin="lower", interpolation='bicubic', cmap= 'jet', aspect='auto')
# Show the positions of the sample points, just to have some reference
fig.axes.set_autoscale_on(False)
#plt.scatter(x_list,y_list,400, facecolors='none')
plt.xlabel('X Values', fontsize = 15, va="center")
plt.ylabel('Y Values', fontsize = 15,va="center")
plt.title('Heatmap', fontsize = 20)
plt.tight_layout()
plt.show()
For your convenience, you can also use the X, Y, Z arrays above instead of reading the Excel file.
The result that I am getting is:
Here you can see dark blue regions at (5,0) and (0,5). These are the false alarms I am getting, and I need to remove them.
I am probably making some beginner's mistake. I would be grateful to anyone who points it out. Regards
There are at least three problems in your example:
x_coords and y_coords are not properly resampled;
the interpolation z does not fill the whole grid, leading to incorrect output;
the output is then forced to be plotted on the original grid (extent), which adds to the confusion.
This leads to the following interpolated result:
On top of which you have applied extra smoothing with imshow.
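For the first and third points, here is a minimal sketch of the fix applied to your own variables (assuming x_list, y_list and the interpolant f from the question code):
# np.arange(min, max) stops one step before max and keeps the coarse 1-unit spacing,
# so the interpolant is evaluated on a grid that is both too coarse and too short.
x_coords = np.linspace(x_list.min(), x_list.max(), 101)  # 101 samples covering the full 0..9 range
y_coords = np.linspace(y_list.min(), y_list.max(), 101)
z = f(x_coords, y_coords)  # now fills the whole extent passed to imshow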
Let's create your artificial input:
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate
x = np.arange(0, 11)
y = np.arange(0, 11)
X, Y = np.meshgrid(x, y)
Z = np.ones(X.shape)
Z[5,5] = 9
Depending on how you want to proceed, you can simply let imshow smooth your signal by interpolation:
fig, axe = plt.subplots()
axe.imshow(Z, origin="lower", cmap="jet", interpolation='bicubic')
And you are done, simple and efficient!
If you aim to do it yourself, then choose the interpolant that suits you best and resample on a grid with a higher resolution:
interpolant = interpolate.interp2d(x, y, Z.ravel(), kind="linear")
xlin = np.linspace(0, 10, 101)
ylin = np.linspace(0, 10, 101)
zhat = interpolant(xlin, ylin)
fig, axe = plt.subplots()
axe.imshow(zhat, origin="lower", cmap="jet")
Have a deeper look at the scipy.interpolate module to pick the interpolant that best suits your needs. Notice that not all methods expose the same interface for their input parameters; you may need to reshape your data to use other objects.
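As one example of those interface differences, here is a minimal sketch with RectBivariateSpline (an alternative not used above; it assumes the x, y, Z, xlin, ylin arrays already defined), which takes the gridded 2-D value array instead of flattened coordinate lists:
from scipy.interpolate import RectBivariateSpline

# RectBivariateSpline takes the grid axes and the 2-D value array directly;
# Z from meshgrid is indexed as Z[y, x], hence the argument order.
spline = RectBivariateSpline(y, x, Z)
zhat_spline = spline(ylin, xlin)  # shape (len(ylin), len(xlin)), ready for imshow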
MCVE
Here is a complete example using the trial data generated above. Just bind it to your Excel columns:
# Flatten trial data to meet your requirement:
x = X.ravel()
y = Y.ravel()
z = Z.ravel()
# Resampling on a square grid with a given resolution:
resolution = 11
xlin = np.linspace(x.min(), x.max(), resolution)
ylin = np.linspace(y.min(), y.max(), resolution)
Xlin, Ylin = np.meshgrid(xlin, ylin)
# Nearest-neighbour multi-dimensional interpolation:
interpolant = interpolate.NearestNDInterpolator([r for r in zip(x, y)], z)
Zhat = interpolant(Xlin.ravel(), Ylin.ravel()).reshape(Xlin.shape)
# Render and interpolate again if necessary:
fig, axe = plt.subplots()
axe.imshow(Zhat, origin="lower", cmap="jet", interpolation='bicubic')
Which renders as expected:
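To bind it to your file instead of the trial data, you would only replace the flattened arrays with the Excel columns from the question, roughly like this (a sketch assuming the column names x, y, z of test.xlsx):
import pandas as pd

# Hypothetical binding to the spreadsheet from the question (columns x, y, z):
excel_data_df = pd.read_excel('test.xlsx')
x = excel_data_df['x'].to_numpy()
y = excel_data_df['y'].to_numpy()
z = excel_data_df['z'].to_numpy()
# ...then reuse the resampling and NearestNDInterpolator steps above unchanged.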
So I have used scikit-learn's Gaussian mixture models (http://scikit-learn.org/stable/modules/mixture.html) to fit my data, and now I want to use the model. How can I do it? Specifically:
How can I plot the probability density distribution?
How can I calculate the mean square error of the fitting model?
Here is the code you may need:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from sklearn import mixture
import matplotlib as mpl
from matplotlib.patches import Ellipse
%matplotlib inline
n_samples = 300
# generate a random sample: a single shifted Gaussian
np.random.seed(0)
shifted_gaussian = np.random.randn(n_samples, 2) + np.array([20, 5])
sample= shifted_gaussian
# fit a Gaussian Mixture Model with two components
clf = mixture.GMM(n_components=2, covariance_type='full')
clf.fit(sample)
# plot sample scatter
plt.scatter(sample[:, 0], sample[:, 1])
# 1. Plot the probability density distribution
# 2. Calculate the mean square error of the fitting model
UPDATE:
I can plot the distribution by:
x = np.linspace(-20.0, 30.0)
y = np.linspace(-20.0, 40.0)
X, Y = np.meshgrid(x, y)
XX = np.array([X.ravel(), Y.ravel()]).T
Z = -clf.score_samples(XX)[0]
Z = Z.reshape(X.shape)
CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=1000.0),
levels=np.logspace(0, 3, 10))
CB = plt.colorbar(CS, shrink=0.8, extend='both')
But isn't it quite strange? Is there a better way to do it? Can I plot something like this?
I think the result is reasonable, if you adjust the xlim and ylim a little bit:
# plot sample scatter
plt.scatter(sample[:, 0], sample[:, 1], marker='+', alpha=0.5)
# 1. Plot the probability density distribution
# 2. Calculate the mean square error of the fitting model
x = np.linspace(-20.0, 30.0, 100)
y = np.linspace(-20.0, 40.0, 100)
X, Y = np.meshgrid(x, y)
XX = np.array([X.ravel(), Y.ravel()]).T
Z = -clf.score_samples(XX)[0]
Z = Z.reshape(X.shape)
CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=10.0),
levels=np.logspace(0, 1, 10))
CB = plt.colorbar(CS, shrink=0.8, extend='both')
plt.xlim((10,30))
plt.ylim((-5, 15))
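If you want to show the density itself rather than the negative log-likelihood, here is a minimal sketch (assuming the clf, X, Y, XX objects above and the older mixture.GMM API used in the question, whose score_samples returns a (log-likelihood, responsibilities) tuple):
# Exponentiate the per-sample log-likelihoods to get actual density values;
# the newer GaussianMixture API returns the log-likelihoods directly from score_samples.
logprob = clf.score_samples(XX)[0]
pdf = np.exp(logprob).reshape(X.shape)
plt.contourf(X, Y, pdf, levels=10, cmap='jet')
plt.colorbar(label='probability density')
plt.xlim((10, 30))
plt.ylim((-5, 15))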
I have a point cloud of magnetization directions with azimuth (declination between 0° and 360°) and inclination between 0° and 90°. I display these points in a polar azimuthal equidistant projection (using matplotlib Basemap). That means 90° inclination points directly at the center of the plot and the declination runs clockwise.
My problem is that I also want to plot isolines around these point clouds, which should represent where the highest density of points/directions is located. What is the easiest way to do this? It would be nice to mark the isoline which encircles 50% of my data. If I am not mistaken, this would be the median.
So far I've fiddled around with gaussian_kde and the outlier detection of sklearn (1 and 2), but the results are not as expected.
Any ideas?
Edit #1:
First, gaussian_kde:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from mpl_toolkits.basemap import Basemap
m = Basemap(projection='spaeqd',boundinglat=0,lon_0=180,resolution='l',round=True)
m.drawparallels(np.arange(-80.,1.,10.),labels=[False,True,True,False])
m.drawmeridians(np.arange(-180.,181.,30.),labels=[True,False,False,True])
#data
x, y = m(m1,-m2) # m2 is negative because I want to plot in the southern hemisphere!
#set up the grid for evaluation of the KDE
yi = np.arange(0,360.1,1)
xi = np.arange(-90,1,1)
xx,yy = np.meshgrid(xi,yi)
X, Y = m(xx,yy) # to have it in my basemap projection
#setup the gaussian kde and evaluate it
#pretty much similar to the scipy.stats docs
positions = np.vstack([X.ravel(), Y.ravel()])
values = np.vstack([x, y])
kernel = stats.gaussian_kde(values)
Z = np.reshape(kernel(positions).T, X.shape)
#plot original points and probability density function
ax = plt.gca()
ax.scatter(x,y,c = 'Crimson')
TOT = ax.contour(X,Y,Z,cmap=plt.cm.Reds)
plt.show()
Then sklearn:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from mpl_toolkits.basemap import Basemap
from sklearn import svm
from sklearn.covariance import EllipticEnvelope
m = Basemap(projection='spaeqd',boundinglat=0,lon_0=180,resolution='l',round=True)
m.drawparallels(np.arange(-80.,1.,10.),labels=[False,True,True,False])
m.drawmeridians(np.arange(-180.,181.,30.),labels=[True,False,False,True])
#data
x, y = m(m1,-m2) # m2 is negative because I want to plot in the southern hemisphere!
#Similar to examples in sklearn docs
outliers_fraction = 0.5
oneclass_svm = svm.OneClassSVM(nu=0.95 * outliers_fraction + 0.05,\
kernel="rbf", gamma=0.1,verbose=True)
#set up grid
yi = np.arange(0,360.1,1)
xi = np.arange(-90,1,1)
R,T = np.meshgrid(xi,yi)
xx, yy = m(T,R)
x, y = m(m1,-m2)
#standardize data as suggested by docs
x_std = (x-x.mean())/x.std()
y_std = (y-y.mean())/y.std()
values = np.vstack([x_std, y_std])
#fit data and calculate threshold - this should mark my median, according to the value of outliers_fraction
oneclass_svm.fit(values.T)
y_pred = oneclass_svm.decision_function(values.T).ravel()
threshold = stats.scoreatpercentile(y_pred, 100 * outliers_fraction)
y_pred = y_pred > threshold
#Target vector for evaluation
TV = np.c_[xx.ravel(), yy.ravel()]
TV = (TV-TV.mean(axis=0))/TV.std(axis=0) #must be standardized as well
# evaluation - this is now shifted in the plot and does not fit my point cloud anymore, because of the standardization
Z = oneclass_svm.decision_function(TV)
Z = Z.reshape(xx.shape)
#plotting - very similar to the example in the docs
ax = plt.gca()
ax.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7), \
cmap=plt.cm.Blues_r)
ax.contour(xx, yy, Z, levels=[threshold],
linewidths=2, colors='red')
ax.contourf(xx, yy, Z, levels=[threshold, Z.max()],
colors='orange')
ax.scatter(x, y,s=30, marker='s',c = 'RoyalBlue',label = 'Mr')
plt.show()
The EllipticEnvelope works, but it is not what I want.
OK, I think I might have found a solution, but it will not work in every case. In my opinion it should fail when the data is multimodally distributed.
Nevertheless, here is my thought process:
The probability density function (PDF) is essentially the same as a continuous histogram. So I used np.percentile to calculate the lower and upper 25% percentiles of both vectors. Then I searched for the value of the PDF at these percentiles, and this should be the isoline that I want.
Of course this should also work in the polar azimuthal equidistant (or any other) projection.
Here is a little example with two gamma-distributed data sets in a crossplot:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from scipy.interpolate import LinearNDInterpolator, RegularGridInterpolator
#generate some data
x = np.random.gamma(10, 0.8, 10000)  # size must be an integer
y = np.random.gamma(4, 0.3, 10000)
#set up the data and grid for the 2D PDF
values = np.vstack([x,y])
pdf_x = np.linspace(x.min(), x.max(), 100)  # number of samples must be an integer
pdf_y = np.linspace(y.min(), y.max(), 100)
X,Y = np.meshgrid(pdf_x,pdf_y)
kernel = stats.gaussian_kde(values)
#evaluate the PDF at every grid location
positions = np.vstack([X.ravel(), Y.ravel()])
Z = np.reshape(kernel(positions).T, X.shape)
#upper and lower quartiles of x and y data
xql = np.percentile(x,25)
xqu = np.percentile(x,75)
yql = np.percentile(y,25)
yqu = np.percentile(y,75)
#set up the interpolator - I could also use RegularGridInterpolator, which should be faster (see the sketch after this block)
Interp = LinearNDInterpolator((X.flatten(),Y.flatten()),Z.flatten())
#1D example to illustrate what I mean
plt.figure()
kernel2 = stats.gaussian_kde(x)
plt.hist(x, 30, density=True)  # 'normed' was removed from matplotlib; use density=True
plt.plot(pdf_x,kernel2(pdf_x),'r--',linewidth=2)
#plot vertical lines at the upper and lower quartiles
plt.vlines(np.percentile(x,25),0,0.2,color='red')
plt.vlines(np.percentile(x,75),0,0.2,color='red')
#Scatterplot / Crossplot with PDF and 25 and 75% isolines
plt.figure()
plt.scatter(x,y)
#search for the isolines defining the upper and lower quartiles
#the lower quartile's isoline should encircle 75% of the data
levels = sorted([Interp(xql, yql), Interp(xqu, yqu)])  # contour expects increasing levels
plt.contour(X,Y,Z,levels=levels,colors='orange')
plt.show()
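And here is a minimal sketch of the RegularGridInterpolator alternative mentioned in the comment above (assuming pdf_x, pdf_y, X, Y, Z and the quartiles from the block just shown); it takes the grid axes plus the 2-D value array instead of flattened coordinates:
# RegularGridInterpolator expects the axes of the regular grid and values of shape
# (len(pdf_y), len(pdf_x)); query points are given as (y, x) pairs in the same order.
InterpRG = RegularGridInterpolator((pdf_y, pdf_x), Z)
levels_rg = sorted([float(InterpRG((yql, xql))), float(InterpRG((yqu, xqu)))])
plt.contour(X, Y, Z, levels=levels_rg, colors='orange')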
To finish up, I will give a quick example of what it looks like in a polar azimuthal equidistant projection:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from scipy.interpolate import LinearNDInterpolator
from mpl_toolkits.basemap import Basemap
#set up the coordinate projection
m = Basemap(projection='spaeqd',boundinglat=0,lon_0=180,\
resolution='l',round=True,suppress_ticks=True)
parallelGrid = np.arange(-80.,1.,10.)
meridianGrid = np.arange(-180.0,180.1,30)
m.drawparallels(parallelGrid,labels=[False,False,False,False])
m.drawmeridians(meridianGrid,labels=[False,False,False,False],labelstyle='+/-',fmt='%i')
#Found this on stackoverflow - labels it exactly how I want it
ax = plt.gca()
ax.text(0.5,1.025,'N',transform=ax.transAxes,\
horizontalalignment='center',verticalalignment='bottom',size=25)
for para in np.arange(30,360,30):
x= (1.1*0.5*np.sin(np.deg2rad(para)))+0.5
y= (1.1*0.5*np.cos(np.deg2rad(para)))+0.5
ax.text(x,y,u'%i\N{DEGREE SIGN}'%para,transform=ax.transAxes,\
horizontalalignment='center',verticalalignment='center')
#generate some data
x = np.random.randint(180,225,size=15)
y = np.random.randint(30,40,size=15)
#into projection
x,y = m(x,-y)
values = np.vstack([x,y])
pdf_x = np.arange(0,361,1)
pdf_y = np.arange(0,91,1)
#into projection
X,Y = np.meshgrid(pdf_x,pdf_y)
X,Y = m(X,-Y)
kernel = stats.gaussian_kde(values)
positions = np.vstack([X.ravel(), Y.ravel()])
Z = np.reshape(kernel(positions).T, X.shape)
xql = np.percentile(x,25)
xqu = np.percentile(x,75)
yql = np.percentile(y,25)
yqu = np.percentile(y,75)
Interp = LinearNDInterpolator((X.flatten(),Y.flatten()),Z.flatten())
ax = plt.gca()
ax.scatter(x,y)
levels = [Interp(xql,yql),Interp(xqu,yqu)]
ax.contour(X,Y,Z,levels=levels,colors='red')
plt.show()