I'm trying to plot via:
g = sns.jointplot(x = etas, y = vs, marginal_kws=dict(bins=100), space = 0)
g.ax_joint.set_xscale('log')
g.ax_joint.set_yscale('log')
g.ax_joint.set_xlim(0.01)
g.ax_joint.set_ylim(0.01)
g.ax_joint.set_xlabel(r'$\eta$')
g.ax_joint.set_ylabel("V")
plt.savefig("simple_scatter_plot_Seanborn.png",figsize=(8,8), dpi=150)
Which leaves me with the following image:
This is not what I want. Why are the histograms filled at the end? There are no data points there so I don't get it...
You're setting a log scale on the matplotlib axes, but by the time you do that, seaborn has already computed the histogram. So the bins, which have equal width in linear space, appear to have different widths on the log axis; the lowest bin covers a narrow range of actual values, but it takes up a lot of space on the log-scaled axis.
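The effect can be checked numerically. The following is a minimal sketch with a made-up data range (0.01 to 100, not the asker's data): it measures how much of a log axis each equal-width linear bin covers, and compares that with log-spaced bins built with np.logspace.

```python
import numpy as np

# Hypothetical data range, chosen only for illustration.
lo, hi = 0.01, 100.0

# Ten equal-width bins in linear space, as a default histogram would use.
linear_edges = np.linspace(lo, hi, 11)
# Width of each bin as it is drawn on a log10 axis (in decades).
linear_widths_on_log_axis = np.diff(np.log10(linear_edges))

# Ten bins that have equal width in log space instead.
log_edges = np.logspace(np.log10(lo), np.log10(hi), 11)
log_widths_on_log_axis = np.diff(np.log10(log_edges))

# Fraction of the whole log axis covered by the first linear bin alone.
first_bin_share = linear_widths_on_log_axis[0] / linear_widths_on_log_axis.sum()
print(first_bin_share)
print(log_widths_on_log_axis)
```

With these assumed limits, the first linear bin spans most of the log axis, while the log-spaced bins all have the same drawn width.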
Tested in python 3.10, matplotlib 3.5.1, seaborn 0.11.2
Solution: pass log_scale=True to the histograms:
import seaborn as sns
# test dataset
planets = sns.load_dataset('planets')
g = sns.jointplot(data=planets, x="orbital_period", y="distance", marginal_kws=dict(log_scale=True))
Compare this with setting the scale after the plot is created, without using marginal_kws=dict(log_scale=True):
g = sns.jointplot(data=planets, x="orbital_period", y="distance")
g.ax_joint.set_xscale('log')
g.ax_joint.set_yscale('log')
Related
A simple call to plotly's figure_factory routine to create a scatter matrix:
import pandas as pd
import numpy as np
from plotly import figure_factory
df = pd.DataFrame(np.random.randn(40,3))
fig = figure_factory.create_scatterplotmatrix(df, diag='histogram')
fig.show()
yields
My questions are:
How can I specify a single color for all the plots?
How can I set the axes ranges for each of the three variables on the scatter plot?
Is there a way to create a density (normalized) version of the histogram?
Is there a way to include the correlation coefficient (say, computed from df.corr()) in the upper right corner of the non-diagonal plots?
For the first question, update the marker color attribute of each trace in the generated graph data to get a single color. For the second, change the axis ranges by updating the generated layout in the same way; only the x-axes are modified here, so use the same technique for the y-axes if necessary. For the third, switch the histograms to a normalized version by setting histnorm, as in the example specification in the reference. If that does not give the normalization you need, I believe it is possible to replace the trace data with values obtained from np.histogram() or similar. For the fourth, the values obtained with df.corr() are added as annotations, and each subplot is addressed by its axis name.
import pandas as pd
import numpy as np
from plotly import figure_factory
import plotly.express as px  # required for px.histogram below
np.random.seed(20220529)
df = pd.DataFrame(np.random.randn(40,3))
density = px.histogram(df, x=[0,1,2], histnorm='probability density')
df_corr = df.corr()
fig = figure_factory.create_scatterplotmatrix(df, diag='histogram', height=600, width=600)
# 1.How can I specify a single color for all the plots?
for i in range(9):
    fig.data[i]['marker']['color'] = 'blue'
# 2.How can I set the axes ranges for each of the three variables on the scatter plot?
for axes in ['xaxis2','xaxis3','xaxis4','xaxis6','xaxis7']:
    fig.layout[axes]['range']=(-4,4)
# 3.Is there a way to create a density (normalized) version of the histogram?
fig['data'][0]['histnorm'] = 'probability density'
fig['data'][4]['histnorm'] = 'probability density'
fig['data'][8]['histnorm'] = 'probability density'
# 4.Is there a way to include the correlation coefficient (say, computed from df.corr())
# in the upper right corner of the non-diagonal plots?
for r,x,y in zip(df_corr.values.flatten(),
                 ['x1','x2','x3','x4','x5','x6','x7','x8','x9'],
                 ['y1','y2','y3','y4','y5','y6','y7','y8','y9']):
    if r == 1.0:
        # diagonal subplot (a variable correlated with itself): no annotation
        pass
    else:
        fig.add_annotation(x=3.3, y=2, xref=x, yref=y, showarrow=False, text='R:'+str(round(r,2)))
fig.show()
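The hard-coded indices in the code above (traces 0, 4, 8 and the references 'x1' to 'x9') can be derived for any number of variables. This is only a sketch: it assumes the traces are stored in row-major order with 1-based axis numbering, which is what the indices used above suggest, and the helper names are hypothetical rather than part of plotly.

```python
def diagonal_trace_indices(n_vars):
    """Indices of the diagonal (histogram) traces in an n_vars x n_vars grid,
    assuming row-major trace order."""
    return [i * n_vars + i for i in range(n_vars)]


def off_diagonal_axis_refs(n_vars):
    """(row, col, xref, yref) for every off-diagonal subplot,
    assuming axes are numbered 1..n_vars**2 in row-major order."""
    refs = []
    for row in range(n_vars):
        for col in range(n_vars):
            if row == col:
                continue  # skip the diagonal by position, not by r == 1.0
            k = row * n_vars + col + 1
            refs.append((row, col, f"x{k}", f"y{k}"))
    return refs


print(diagonal_trace_indices(3))
print(off_diagonal_axis_refs(3))
```

Selecting the off-diagonal cells by position avoids comparing floating-point correlations with 1.0; the correlation for each cell can then be read as df_corr.iloc[row, col].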
I'd like to generate a single figure that has two y axes: Count (from the histogram) and Density (from the KDE).
I want to use sns.displot in Seaborn >= v 0.11.
import seaborn as sns
df = sns.load_dataset('tips')
# graph 1: This should be the Y-Axis on the left side of the figure
sns.displot(df['total_bill'], kind='hist', bins=10)
# graph 2: This should be the Y-axis on the right side of the figure
sns.displot(df['total_bill'], kind='kde')
The code I've written generates two separate graphs. I could use a facet grid for two separate graphs, but I want to be more concise and combine the two y-axes into a single figure sharing the same x-axis.
displot() is a figure-level function, which can create multiple subplots inside a figure. As such, you don't have control over individual axes.
To create combined plots, you can use the underlying axes-level functions: histplot() and kdeplot() for Seaborn v.0.11. These functions accept an ax= parameter. twinx() creates a second y-axis.
import matplotlib.pyplot as plt
import seaborn as sns
df = sns.load_dataset('tips')
fig, ax = plt.subplots()
sns.histplot(df['total_bill'], bins=10, ax=ax)
ax2 = ax.twinx()
sns.kdeplot(df['total_bill'], ax=ax2)
plt.tight_layout()
plt.show()
Edit:
As mentioned in the comments, the y-axes aren't aligned. The left axis only tells something about the histogram. E.g. the highest bin having height 68 means that there are exactly 68 total bills between 12.618 and 17.392. The right axis only tells something about the kde. E.g. a y-value of 0.043 for x=20 would mean there is about 4.3 % probability that the total bill would be between 19.5 and 20.5.
To align both similar to sns.histplot(..., kde=True), the area of the histogram can be calculated (bin width times number of data values) and used as a scaling factor. Such scaling would make the area of the histogram and the area below the kde curve equal when measured in pixels:
num_bins = 10
bin_width = (df['total_bill'].max() - df['total_bill'].min()) / num_bins
hist_area = len(df) * bin_width
ax2.set_ylim(ymax=ax.get_ylim()[1] / hist_area)
Note that the right axis would be more similar to a percentage if the histogram used a bin width that is a power of ten (e.g. sns.histplot(..., bins=np.arange(0, df['total_bill'].max()+10, 10))). Which bins are most suitable depends strongly on how you want to interpret your data.
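The scaling factor can be checked without plotting. The following is a minimal sketch using synthetic values (a gamma sample standing in for total_bill, which is an assumption made so the example needs no download): multiplying a density-normalized histogram by the histogram area gives back the counts.

```python
import numpy as np

# Synthetic stand-in for df['total_bill']; the distribution is an assumption.
rng = np.random.default_rng(0)
values = rng.gamma(shape=4.0, scale=5.0, size=244)

num_bins = 10
counts, edges = np.histogram(values, bins=num_bins)
bin_width = edges[1] - edges[0]

# Area of the count histogram: number of data values times the bin width.
hist_area = len(values) * bin_width

# The same histogram normalized so that its area is 1, like a KDE curve.
density, _ = np.histogram(values, bins=edges, density=True)

# Scaling the density by the histogram area recovers the counts.
print(np.allclose(density * hist_area, counts))
```

This is why dividing the upper limit of the count axis by hist_area puts both axes on a comparable scale.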
I want to create a 2D joint plot with the following data, and from what I've read Seaborn is the best solution for this.
I have completed the desired 1D line plot, and have attempted to create the joint plot in Seaborn by putting the equations for each plot on the respective axes.
I expect the plot on the x axis to look similar to the plot I created using matplotlib, so the jointplot should have some vertical lines through the circular region.
However, the Seaborn output on the x axis appears to have smoothed out many of the desired data points, giving a smooth curve.
From what I've read, Seaborn may not fit my needs for this kind of data. I also tried using a matrix, but it did not seem to work with Seaborn.
This is the code I used
#imported as required
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
#Set limits for y values (x - axis)
ymin=-.6
ymax= .6
#Set up an array of angle values between defined y values in mm
angle = np.linspace(np.deg2rad(ymin), np.deg2rad(ymax), 1000)
# Define known values
L = 480
a = 0.09
d = 0.4
lam = 670e-6
# Calculate values for position y, alpha and beta
y = np.tan(angle)*L
alpha = (np.pi*a/lam)*np.sin(angle)
beta = (np.pi*d/lam)*np.sin(angle)
I = ((np.sin(alpha)/alpha)**2)*((np.cos(beta))**2)
# Plot the graph of intensity versus displacement
plt.plot(y, I)
import seaborn as sns
p = ((np.sin(alpha)/alpha)**2)*((np.cos(beta))**2) # Interference term and decaying term
q = (np.sin(alpha)/alpha)**2 # Decaying term
sns.jointplot(x=p, y=q, kind='kde',marginal_kws=dict(bw=0.6),bw=0.8)
plt.show()
You might recognize this as the famous double-slit experiment.
These are the outputs. Note the smooth plot on Seaborn x axis
edit: I have used JointGrid as follows to plot on the axes in an attempt to solve the problem
g = sns.JointGrid(x=p, y=q)
g.plot_joint(sns.kdeplot)
g.plot_marginals(sns.kdeplot)
I am not familiar with Seaborn syntax, so this simple snippet is all I could get to give an output, which had the same problem as my initial attempt.
I have one question about matplotlib and contourf.
I am using the latest version of matplotlib with Python 3.7. Basically, I have two matrices that I want to plot on the same contour plot, but using different colormaps. One important aspect is that, for instance, if matrixA and matrixB both have shape=(10,10), then the positions in which matrixA is non-zero are the positions in which matrixB is zero, and vice versa.
In other words, I want to plot two different masks in different colors.
Thanks for your time.
Edited:
I add an example here
import numpy
import matplotlib.pyplot as plt
matrixA=numpy.random.randn(10,10).reshape(100,)
matrixB=numpy.random.randn(10,10).reshape(100,)
mask=numpy.random.uniform(10,10)
mask=mask.reshape(100,)
indexA=numpy.where(mask[mask>0.5])[0]
indexB=numpy.where(mask[mask<=0.5])[0]
matrixA_masked=numpy.zeros(100,)
matrixB_masked=numpy.zeros(100,)
matrixA_masked[indexA]=matrixA[indexA]
matrixB_masked[indexB]=matrixB[indexB]
matrixA_masked=matrixA_masked.reshape(100,100)
matrixB_masked=matrixB_masked.reshape(100,100)
x=numpy.linspace(0,10,1)
X,Y = numpy.meshgrid(x,x)
plt.contourf(X,Y,matrixA_masked,colormap='gray')
plt.contourf(X,Y,matrixB_masked,colormap='winter')
plt.show()
What I want is to use different colormaps within the same plot. For instance, one part of the plot is assigned to matrixA with one colormap (and is 0 where matrixB takes over), and likewise matrixB gets a different colormap.
In other words, each part of the contourf plot corresponds to one matrix. I am plotting decision surfaces of machine learning models.
I stumbled into some errors in your code so I have created my own dataset.
To have two colormaps on one plot you need to open a figure and define the axes:
import numpy
import matplotlib.pyplot as plt
matrixA=numpy.linspace(1,20,100)
matrixA[matrixA >= 10] = numpy.nan
matrixA_2 = numpy.reshape(matrixA,[50,2])
matrixB=numpy.linspace(1,20,100)
matrixB[matrixB <= 10] = numpy.nan
matrixB_2 = numpy.reshape(matrixB,[50,2])
fig,ax = plt.subplots()
a = ax.contourf(matrixA_2,cmap='copper',alpha=0.5,zorder=0)
fig.colorbar(a,ax=ax,orientation='vertical')
b=ax.contourf(matrixB_2,cmap='cool',alpha=0.5,zorder=1)
fig.colorbar(b,ax=ax,orientation='horizontal')
plt.show()
You'll also see I've changed the alpha and zorder
I hope this helps.
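For the masked case described in the question, the two inputs can be built with NaN in complementary positions, since contourf leaves NaN cells undrawn in the example above. The following is a minimal sketch with random data; the 0.5 threshold and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
matrix_a = rng.standard_normal((10, 10))
matrix_b = rng.standard_normal((10, 10))

# Boolean mask: True where matrix_a should be shown, False for matrix_b.
mask = rng.uniform(0.0, 1.0, size=(10, 10)) > 0.5

# Keep each matrix only in its own region and set the rest to NaN.
matrix_a_masked = np.where(mask, matrix_a, np.nan)
matrix_b_masked = np.where(~mask, matrix_b, np.nan)

# Every cell is valid in exactly one of the two arrays.
print(np.all(np.isnan(matrix_a_masked) ^ np.isnan(matrix_b_masked)))
```

Each masked array can then be passed to its own ax.contourf call with a different cmap, as in the code above.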
Hi all, I am trying to produce the following type of plot using seaborn with a different data set. The problem is that when a histogram is used, I cannot name the bins (like 2-2.5, 2.5-3, etc.), even though it provides kernel curves. Bar plots don't have a function to draw the normal curve like in the picture. The image seems to have been made with the SPSS statistical package, which I have little knowledge of.
Following is the closest thing I can get (I have attached the code)
df = pd.DataFrame({'cat': ['1-1.5', '1.5-2', '2-2.5','2.5-3','3-3.5','3.5-4','4-4.5','4.5-5'],'val': [0,0,1,7,7,33,17,10]})
ax = sns.barplot(y = 'val', x = 'cat',
data = df)
ax.set(xlabel='Categories', ylabel='Frequency')
plt.show()
So the problem is of course that you don't have the original data, but data that has already been binned. One could reverse this binning and start with an array of raw data, placing every value at the midpoint of its bin. Then perform the histogramming again and use sns.distplot which, by default, shows a KDE plot as well. (In seaborn 0.11 and later, distplot is deprecated in favor of histplot and displot.)
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
cat = ['1-1.5', '1.5-2', '2-2.5','2.5-3','3-3.5','3.5-4','4-4.5','4.5-5']
val = [0,0,1,7,7,33,17,10]
data = []
for i in range(len(cat)):
    # repeat the midpoint of bin i as many times as its count
    data.extend([1.25+i*0.5]*val[i])
bins = np.arange(1,5.5, 0.5)
ax = sns.distplot(data, bins=bins, hist_kws= dict(edgecolor="k"))
ax.set(xlabel='Categories', ylabel='Frequency')
ax.set_xticks(bins[:-1]+0.25)
ax.set_xticklabels(cat)
plt.show()
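The un-binning loop above can also be written with np.repeat. This is a small sketch of the same idea, with a check that histogramming the reconstructed values gives back the original counts; the variable names are chosen for illustration.

```python
import numpy as np

val = [0, 0, 1, 7, 7, 33, 17, 10]

# Midpoints of the bins 1-1.5, 1.5-2, ..., 4.5-5.
midpoints = 1.25 + 0.5 * np.arange(len(val))

# Repeat each midpoint as many times as its bin count.
data = np.repeat(midpoints, val)

# Re-binning with the original edges reproduces the counts.
bins = np.arange(1, 5.5, 0.5)
counts, _ = np.histogram(data, bins=bins)
print(counts)
```

Note that the reconstructed values all sit at bin midpoints, so any KDE drawn from them is only an approximation of the one the raw data would give.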
Use the bw keyword argument to the KDE function to set the smoothness of the curve. E.g. sns.distplot(data, bins=bins, kde_kws=dict(bw=0.5), hist_kws= dict(edgecolor="k")) where bw=0.5 produces
Also try bw=0.1, bw=0.25, bw=0.35 and bw=2 to see the differences.