Visualizing Prediction and Test values for comparison [duplicate] - python

This question already has answers here:
How do I equalize the scales of the x-axis and y-axis?
(5 answers)
I'd like to make it easier to compare these Prediction and Test values, and I can think of two ways to achieve that:
Scale the X and Y axes to the same scale
Plot a straight line (y=x)
I'd really like some way to either 'exclude' the outliers or 'zoom in' on the area where the points are dense, without manually removing the outliers from the dataset (so it's done automatically). Is this possible?
sns.scatterplot(x=y_pred, y=y_true)
plt.grid()
Looked around and tested plt.axis('equal') as mentioned on another question but it didn't seem quite right. Tried using plt.plot((0,0), (30,30)) to create the linear plot but it didn't show anything. Any other input on how to visualise this would be really appreciated as well. Thanks!

There are short ways to achieve everything you've suggested:
Force scaled axes with matplotlib.axes.Axes.set_aspect.
Add an infinite line with slope 1 through the origin with matplotlib.axes.Axes.axline.
Set your plot to interactive mode, so you can pan and zoom. The way to do this depends on your environment and is explained in the docs.
Best to combine them all.
import matplotlib.pyplot as plt
from numpy import random
plt.ion() # activates interactive mode in most environments
plt.scatter(random.random_sample(10), random.random_sample(10))
ax = plt.gca()
ax.axline((0, 0), slope=1)
ax.set_aspect('equal', adjustable='datalim') # force equal aspect
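For the "zoom in on the dense region" part of the question, one non-interactive option is to take the axis limits from percentiles of the data instead of from its full range. The following is only a minimal sketch under assumptions: the helper name plot_pred_vs_true, the 1st/99th percentile cut-offs, and the synthetic data are all made up for illustration. The outliers stay in the dataset and simply fall outside the visible window.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_pred_vs_true(y_pred, y_true, lower_pct=1, upper_pct=99):
    """Scatter predictions against true values, zoomed to the dense region.

    Hypothetical helper: the percentile cut-offs are illustrative defaults.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    fig, ax = plt.subplots()
    ax.scatter(y_pred, y_true, s=10)
    ax.axline((0, 0), slope=1, color="red")  # y = x reference line
    # Shared limits from percentiles of both arrays, so both axes cover
    # the same range and extreme points are left outside the view.
    both = np.concatenate([y_pred, y_true])
    lo, hi = np.percentile(both, [lower_pct, upper_pct])
    ax.set_xlim(lo, hi)
    ax.set_ylim(lo, hi)
    ax.set_aspect("equal", adjustable="box")  # keep the limits, resize the box
    ax.set_xlabel("predicted")
    ax.set_ylabel("true")
    ax.grid(True)
    return ax

# Synthetic stand-in data with a few artificial outliers.
rng = np.random.default_rng(0)
y_true = rng.normal(15, 3, 500)
y_pred = y_true + rng.normal(0, 1, 500)
y_pred[:3] = [200, -150, 300]
ax = plot_pred_vs_true(y_pred, y_true)
```

Call plt.show() afterwards if you are not in interactive mode. Widening or narrowing the percentiles controls how aggressively the view is cropped.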

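To plot the linear line, pass the x-values first and then the y-values (plt.plot((0,0), (30,30)) joins the point (0, 30) to itself, which is why nothing showed up):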
plt.plot([0,30], [0,30])
To scale x and y axis to same scale (see doc for set_aspect):
plt.xlim(0, 30)
plt.ylim(0, 30)
plt.gca().set_aspect('equal', adjustable='box')
plt.draw()
From the doc for set_aspect:
Axes.set_aspect(aspect, adjustable=None, anchor=None, share=False)
Set the aspect ratio of the axes scaling, i.e. y/x-scale
aspect='equal': same as aspect=1, i.e. same scaling for x and y.

Related

Interpretation of boxplot [duplicate]

This question already has answers here:
Why is matplotlib's notched boxplot folding back on itself?
(1 answer)
strange shape of the boxplot using matplotlib
(1 answer)
Unintended Notched Boxplot from Matplotlib, Error from Seaborn
(2 answers)
I am trying to create a box plot with matplotlib library of python. The code is given below.
import matplotlib.pyplot as plt
# corr_df is the DataFrame of correlation coefficients (not shown in the question)
fig, ax = plt.subplots(figsize=(8, 6))
bp = ax.boxplot([corr_df['bi'], corr_df['ndsi'], corr_df['dbsi'], corr_df['mbi']], patch_artist=True, notch=True, vert=True)
ax.set_title("Spearman’s correlation coefficient for Soil indices", fontsize=14)
ax.set_xlabel("Indices", fontsize=14)
ax.set_ylabel("Spearman’s correlation coefficient", fontsize=14)
colors = ['#088A08', '#FFFF00','#01DFD7', '#FF00FF', '#3A01DF']
# colour each box; zip stops after the four boxes, so the fifth colour is unused
for patch, color in zip(bp['boxes'], colors):
    patch.set_facecolor(color)
ax.grid()
ax.set_xticklabels(['bi', 'ndsi', 'dbsi', 'mbi'])
This creates an image like this :
I am not able to understand the 1st and 3rd boxplot. These two (box plots of bi and dbsi) have neck-like structures in them, which the other two boxplots don't have. What does this show? The interpretation of the boxplot as described on the web doesn't include this part.
In your example, the argument notch is set to True so according to the doc, it displays:
notch bool, default: False
Whether to draw a notched boxplot (True), or a rectangular boxplot (False). The notches represent the confidence interval (CI) around the median. The documentation for bootstrap describes how the locations of the notches are computed by default, but their locations may also be overridden by setting the conf_intervals parameter.
Specifically, the behavior (flipped appearance) you're describing is documented as follows:
Note
In cases where the values of the CI are less than the lower quartile
or greater than the upper quartile, the notches will extend beyond the
box, giving it a distinctive "flipped" appearance. This is expected
behavior and consistent with other statistical visualization packages.
You will find more details in this answer.
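To check whether a given sample will produce this flipped shape, you can compare the notch limits with the quartiles. The sketch below assumes matplotlib.cbook.boxplot_stats, which returns the statistics used to draw each box, including the keys cilo, cihi, q1 and q3. As far as I know, without bootstrapping the notch is computed as roughly median ± 1.57·IQR/√n, so small samples give wide notches. The helper name notch_outside_box and the sample data are made up for illustration.

```python
import numpy as np
from matplotlib import cbook

def notch_outside_box(data):
    """Return True if the notch (median CI) extends past either quartile."""
    stats = cbook.boxplot_stats(data)[0]  # one dict per dataset
    return bool(stats["cilo"] < stats["q1"] or stats["cihi"] > stats["q3"])

# Five points: the confidence interval is wide, so the notch leaves the box.
small = [1, 2, 3, 4, 5]
# A thousand points: the interval is narrow and stays inside the box.
rng = np.random.default_rng(0)
large = rng.normal(size=1000)

print(notch_outside_box(small), notch_outside_box(large))
```

If the check is True for your bi and dbsi columns, the neck-like shape is just the wide confidence interval of a small or spread-out sample.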

Is it possible to plot array data in imshow with a Y-axis that starts off linear but changes to non-linear steps towards the end

Firstly, a big thanks to everyone who responds to these questions. I've made it this far without having to ask a question because I find that someone before me has inevitably encountered the same issue.
However, I find myself with a question that I've not been able to locate. I would like to plot a 2D array within imshow that started off with a linear Y-axis, which I have had to offset and adjust and is now non-linear after a certain point. Is this possible?
See below for a chart and an example.
The orange line is the original Y-axis step which has a linear and regular step.
The blue line has been corrected with an offset and a varying step change towards the end. As seen it is linear up to a point before deviating to a non-linear step at the end.
I am using extent to set the bounds of the axes and as I understand it imshow will plot the data with a regular and linear step between the start and end points. I would like to fix the new (blue) Y-axis reference to the data to be plotted so that the data is presented at the correct position with respect to the Y-Axis value.
As an example I have the following code:
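import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
testData = np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])
x_axisTest = [1,2,3,4]
y_axisTest = [2,4,8,12]
fig, ax1 = plt.subplots()
# extent is [left, right, bottom, top]; imshow spaces the rows evenly between these bounds
pcm = ax1.imshow(testData, interpolation='nearest', cmap=cm.jet, origin='upper',
                 aspect='auto', # vmin = 20, vmax = 60,
                 extent=[x_axisTest[0], x_axisTest[3], y_axisTest[3], y_axisTest[0]])
plt.show()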
As seen the data is linearly plotted even though the Y-axis step changes from 2 (2,4...) to 4 (...8, 12). What I would like is the data to be interpreted or stretched/compressed between Y-axis values 4 to 12 based on the new step value.
I've been looking into resampling the data which is maybe the preferred option but again I'm not sure how best to apply this and ensure I keep the Y-axis matched with the data. My concern is that I may also shift the linear portion of the data. I would appreciate a nudge in the right direction.
Thank you in advance for your assistance.
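One possible direction, offered as a sketch rather than a tested answer: imshow always assumes evenly spaced rows, whereas pcolormesh accepts explicit coordinates for every row and column, so it can draw the data against an uneven Y-axis without resampling. The example assumes the Y values mark the centre of each row (shading="nearest"); the helper name plot_nonuniform_rows is made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_nonuniform_rows(data, x_centers, y_centers):
    """Draw `data` with each row centred on its own (possibly uneven) Y value."""
    fig, ax = plt.subplots()
    # shading="nearest" treats the coordinates as cell centres and places
    # the cell edges halfway between neighbouring centres.
    mesh = ax.pcolormesh(x_centers, y_centers, data, shading="nearest", cmap="jet")
    ax.invert_yaxis()  # first row on top, similar to imshow(origin="upper")
    fig.colorbar(mesh, ax=ax)
    return ax, mesh

test_data = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]])
x_axis_test = [1, 2, 3, 4]
y_axis_test = [2, 4, 8, 12]  # step changes from 2 to 4
ax, mesh = plot_nonuniform_rows(test_data, x_axis_test, y_axis_test)
```

Rows in the linear part keep their original height, while the rows at Y = 8 and Y = 12 are drawn taller to match the larger step. If your Y values are cell edges rather than centres, you would instead pass one more Y value than there are rows and use shading="flat".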

Add labels ONLY to SELECTED data points in seaborn scatter plot

I have created a seaborn scatter plot and added a trendline to it. I have some datapoints that fall very far away from the trendline (see the ones highlighted in yellow) so I'd like to add data labels only to these points, NOT to all the datapoints in the graph.
Does anyone know what's the best way to do this?
So far I've found answers to "how to add labels to ALL data points" (see this link) but this is not my case.
In the accepted answer to the question that you reference you can see that the way they add labels to all data points is by looping over the data points and calling .text(x, y, string) on the axes. You can find the documentation for this method here (seaborn is implemented on top of matplotlib). You'll have to call this method for the selected points.
In your specific case I don't know exactly what formula you want to use to find your outliers but to literally get the ones beyond the limits of the yellow rectangle that you've drawn you could try the following:
for x,y in zip(xarr, yarr):
    if x < 5 and y > 5.5:
        ax.text(x+0.01, y, 'outlier', horizontalalignment='left', size='medium', color='black')
Where xarr is your x-values, yarr your y-values and ax the returned axes from your call to seaborn.
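If you would rather pick the points by their distance from the trendline than by a hand-drawn rectangle, a sketch along these lines may help. It is an assumption on my part that a straight-line fit and a cut-off of two standard deviations of the residuals suit your data; the helper name label_far_points and the sample data are made up for illustration. It uses plain matplotlib, and the Axes returned by seaborn can be passed in the same way.

```python
import numpy as np
import matplotlib.pyplot as plt

def label_far_points(ax, x, y, n_std=2.0):
    """Label only the points whose residual from a linear fit is large.

    Hypothetical helper: the n_std threshold is an illustrative choice.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # least-squares trendline
    residuals = y - (slope * x + intercept)     # vertical distance to the line
    far = np.abs(residuals) > n_std * residuals.std()
    for xi, yi in zip(x[far], y[far]):
        ax.text(xi + 0.1, yi, f"({xi:g}, {yi:g})",
                horizontalalignment="left", size="medium", color="black")
    return far

# Sample data: a straight line with two points pushed away from it.
x = np.arange(20)
y = 2.0 * x + 1.0
y[5] += 15
y[14] -= 15

fig, ax = plt.subplots()
ax.scatter(x, y)
far = label_far_points(ax, x, y)
```

The returned boolean mask tells you which points were labelled, so you can reuse it to colour or export them.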

Increasing each subplot's size and adjusting their width, matplotlib [duplicate]

This question already has answers here:
How do I change the size of figures drawn with Matplotlib?
(14 answers)
I have a dataset from scikit-learn, fetch_lfw_people.
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_lfw_people
faces = fetch_lfw_people()  # downloads the dataset on first use
# plotting faces
fig, ax = plt.subplots(3,5)
# fig.subplots_adjust(wspace=2)
for enumerate_counter, axi in enumerate(ax.flat):
    axi.imshow(faces.images[enumerate_counter], cmap='bone')
    axi.set(xticks=[], yticks=[],xlabel=faces.target_names[faces.target[enumerate_counter]])
While showing the images with subplots and labelling each image with the proper name, I want to increase the size of each image and also space them far enough apart that the names do not overlap.
I've tried
fig.subplots_adjust(wspace=2)
which separates the images so that the names no longer overlap, but the images get smaller.
Is there any way I could resolve this?
I will give some examples with sample numbers that may lead you in the right direction; figsize is the figure's (width, height) in inches:
plt.figure(figsize=(20,10))
OR
fig, ax = plt.subplots(figsize=(20, 10))
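Applied to a 3x5 grid like the one in the question, that could look like the sketch below. Random arrays stand in for faces.images so the example runs without downloading the dataset, and the helper name show_image_grid, the figure size, and the placeholder labels are made up for illustration. Using constrained_layout=True is my own suggestion for reserving space for the x-labels, not something the question requires.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_image_grid(images, labels, nrows=3, ncols=5, figsize=(15, 9)):
    """Show images in a grid on a figure large enough for readable labels."""
    # A larger figsize enlarges every subplot; constrained_layout reserves
    # room for the x-labels between the rows.
    fig, axes = plt.subplots(nrows, ncols, figsize=figsize, constrained_layout=True)
    for image, label, ax in zip(images, labels, axes.flat):
        ax.imshow(image, cmap="bone")
        ax.set(xticks=[], yticks=[], xlabel=label)
    return fig, axes

# Random arrays stand in for faces.images; the names are placeholders.
rng = np.random.default_rng(0)
images = rng.random((15, 62, 47))
labels = [f"person {i}" for i in range(15)]
fig, axes = show_image_grid(images, labels)
```

If long names still collide horizontally, increasing the figure width or reducing the label font size are the next things to try.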

Pyplot doesn't use the full space on 2D plots when setting equal ratio

I'm plotting some 2D fields using matplotlib and the fields have to be seen with equal aspect ratio. But when I set the aspect ratio I find that there are unnecessary blank spaces. Please consider the following example:
from matplotlib import pyplot as plt
import numpy as np
x=np.arange(100)
y=np.arange(100)
Y, X = np.meshgrid(y,x)
Z = X + Y
plt.contourf(X, Y, Z)
#plt.axes().set_aspect('equal', 'datalim')
plt.tight_layout()
plt.colorbar()
plt.grid()
plt.show()
If I run that command I get this figure:
However, let's say I uncomment the line that sets the equal ratio, so that I include this:
plt.axes().set_aspect('equal', 'datalim')
I get the following output:
Which is a very poor use of space. I can't make the actual plot take better advantage of the figure space no matter how hard I try (I don't have that much knowledge of pyplot).
Is there a way to expand the actual data part of the equal-ratio plot so that I have less white space?
Thank you.
The issue you're having is caused by "datalim", which asks the axes to apply the usual limits you would expect from a normal line or scatter plot, e.g. the use of 5% margin on each side of the shown data.
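I do not see any reason to use "datalim" here, so you may just leave it out (written with plt.gca() because in recent Matplotlib versions plt.axes() adds a new Axes instead of returning the current one),
plt.gca().set_aspect('equal')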
and get a plot with equal aspect and no white space around.
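Put together with the example from the question, a minimal sketch could look like this. It uses the object-oriented interface so that the aspect is set on the same Axes that holds the contour plot; the function name contour_equal_aspect is made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def contour_equal_aspect(n=100):
    """Filled contour plot with equal data scaling and tight limits."""
    x = np.arange(n)
    y = np.arange(n)
    Y, X = np.meshgrid(y, x)
    Z = X + Y
    fig, ax = plt.subplots()
    cs = ax.contourf(X, Y, Z)
    # Default adjustable="box": the Axes box is resized to get equal
    # scaling, while the data limits stay as they are.
    ax.set_aspect("equal")
    fig.colorbar(cs, ax=ax)
    ax.grid(True)
    fig.tight_layout()
    return fig, ax

fig, ax = contour_equal_aspect()
```

Any remaining white space then comes from the figure shape itself, so choosing a figsize close to the data's proportions (plus room for the colorbar) should reduce it further.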
