Seaborn: plot a histogram and a distribution on the same plot - python

I want to compare two distributions: one from real data (a histogram of cases as a function of date) and one from a prediction model (plotted as a density curve).
I have two pieces of code, one for each distribution.
Only the KDE, without the histogram:
ax=sns.displot(PLT2['DATE'],kind="kde")
plt.xticks(rotation=90, fontsize=10)
ax.set(xlim=(datetime.date(2013, 1, 1), datetime.date(2013, 12, 31)))
Histogram from the real data:
ax=sns.displot(df['DATE'].sort_values(),stat="density")
plt.xticks(rotation=90, fontsize=10)
plt.show()
I want to show these two on the same plot. I tried this code, but it produces two separate plots:
ax1=sns.displot(df_2013['DATE'].sort_values(),stat="density")
ax2=sns.displot(PLT2['DATE'],kind="kde")
plt.xticks(rotation=90, fontsize=10)
ax1.set(xlim=(datetime.date(2013, 1, 1), datetime.date(2013, 12, 31)))
ax2.set(xlim=(datetime.date(2013, 1, 1), datetime.date(2013, 12, 31)))
plt.show()
thanks for helping

You need to define the figure first and add subplots to it.
with sns.axes_style("whitegrid"):
    fig = plt.figure(figsize=(15,10))
    ax1 = fig.add_subplot(211)
    ax2 = fig.add_subplot(212)
    ax1.plot(df1['Column1'])
    ax2.plot(df2['Column1'])
This way you can also plot on the same axes and overlay your plots if you want to, or you can align the plots under each other in separate subplots.
Here is an example of superimposed distplots that might be useful to you.
fig = plt.figure(figsize=(7.5,7.5))
ax1 = fig.add_subplot(211)
sns.distplot(Fnormal,ax = ax1, label="Normal Distribution from Filter");
sns.distplot(FilteredReturns, ax =ax1, label="Filtered Returns");
ax1.set_title('Comparison of Filtered Returns and Normal Distribution')
ax1.legend()

Related

Is there a matplotlib function in Python for forcing all subplots inside different figures to have the same x and y axis length?

I'm testing out different ways of displaying figures. I have one figure which is made up of 12 subplots split into two columns. Something like...
fig, ax = plt.subplots(6, 2, figsize= (20,26))
I have another code which splits the 12 subplots into 3 different figures based on categorical data. Something like
figA, ax = plt.subplots(5, 1, figsize= (10,23))
figB, ax = plt.subplots(3, 1, figsize= (10,17))
fig2, ax = plt.subplots(4, 1, figsize= (10,20))
Is there a way to ensure all the subplots in every figure have the same x and y axis length?
The answer turns out to be simple. Use a variable that is scaled by the number of plots in the figure. That way, a figure with more plots gets a larger figsize while the individual plots stay roughly the same size. Something like...
ps = 5 #indicates plot size
figA, ax = plt.subplots(5, 1, figsize= (10, 5*ps))
figB, ax = plt.subplots(3, 1, figsize= (10, 3*ps))
fig2, ax = plt.subplots(4, 1, figsize= (10, 4*ps))
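The scaling idea above can be checked by measuring the resulting axes. The sketch below is a minimal illustration; the helper names `make_column` and `axes_height_inches` are hypothetical, and the default matplotlib subplot margins and spacing are assumed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt

PLOT_HEIGHT = 5  # inches of figure height allotted per subplot row (assumed value)

def make_column(n_rows, width=10):
    """Create an n_rows x 1 figure whose height grows with the row count."""
    fig, axes = plt.subplots(n_rows, 1, figsize=(width, n_rows * PLOT_HEIGHT))
    return fig, axes

def axes_height_inches(fig, ax):
    """Height of one axes in inches: its figure fraction times the figure height."""
    return ax.get_position().height * fig.get_figheight()

heights = {}
for n in (5, 3, 4):
    fig, axes = make_column(n)
    heights[n] = axes_height_inches(fig, axes[0])
print(heights)
```

The heights come out close but not identical, because the figure margins and the gaps between rows are fractions of the figure height and so scale with it as well.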

How to plot two series with very different scales in python

I'm a beginner in python. I have to plot two graphs in the same plot. One of my graphs is velocity, which ranges between (-1,1), and the other one is groundwater, which ranges between (10,12). When I use the following code, the graphs become very small.
ax1 = plt.subplot(111)
ax2 = ax1.twinx()
df=pd.read_excel ('final-all-filters-0.6.xlsx')
df['Date']=pd.to_datetime(df['Date'])
date = df['Date']
gwl = df['gwl']
v =df['v']
plt.plot(date,gwl, color='deepskyblue',linewidth=2)
plt.plot(date,v, color='black',linewidth=2)
ax1.grid(axis='y')
ax1.xaxis.set_major_locator(matplotlib.dates.YearLocator())
ax1.xaxis.set_minor_locator(matplotlib.dates.MonthLocator((1,3,5,7,9,11)))
ax1.xaxis.set_major_formatter(matplotlib.dates.DateFormatter("\n%Y"))
ax1.xaxis.set_minor_formatter(matplotlib.dates.DateFormatter("%b"))
ax1.grid(which='minor', alpha=0.3, linestyle='--')
ax1.grid(which='major', alpha=2)
for spine in ax1.spines.values():
    spine.set_edgecolor('gray')
ax1.tick_params(axis='x', which='both', colors='gray')
ax1.tick_params(axis='y', colors='gray')
ax1.set_ylabel('v', color='g')
ax2.set_ylabel('GWL', color='b')
plt.show()
But when I add ax1.set_ylim(-1, 1) and ax2.set_ylim(10, 12) to my code, one of the graphs disappears!
I think it does plot the black graph, but it's out of range. You can check that by adding 11 or so to the values of the black plot.
Maybe you can try using ax2.set_yticks(np.arange(-1, 1, 0.5)) instead of set_ylim and/or using ax2.autoscale(enable=True, axis='y')
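Another option is to plot each series on its own axes. plt.plot draws on the current axes, so in the question both series land on the same axes and share one y-scale; limiting that axes to one range pushes the other series out of view. A minimal sketch, with made-up monthly data standing in for the Excel columns 'Date', 'v' and 'gwl':

```python
import datetime
import math
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt

# Hypothetical monthly data standing in for the Excel columns.
dates = [datetime.date(2015 + i // 12, i % 12 + 1, 1) for i in range(36)]
v = [0.8 * math.sin(i / 3) for i in range(36)]         # stays within (-1, 1)
gwl = [11 + 0.7 * math.cos(i / 5) for i in range(36)]  # stays within (10, 12)

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# Call plot on each axes explicitly instead of using plt.plot.
ax1.plot(dates, v, color="black", linewidth=2)
ax2.plot(dates, gwl, color="deepskyblue", linewidth=2)

# Each axes now has its own y-range, so neither series is clipped.
ax1.set_ylim(-1, 1)
ax2.set_ylim(10, 12)
ax1.set_ylabel("v", color="black")
ax2.set_ylabel("GWL", color="deepskyblue")
```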

Add grid and change size in barh python

I am using matplotlib to draw horizontal bar plots. I want to add a grid and change the size of the plot to avoid overlapping labels. My code looks like this:
baseline = [0.5745,0.5282,0.4923,0.5077,0.5487,0.5385,0.5231]
low = [0.2653,0.3878,0.3673,0.5510,0.2245,0.5714,0.3265]
high = [0.5102,0.5102,0.3673,0.3877,0.5306,0.4286,0.49]
index = ['Bagging','Decision tree','Gaussian Naive Bayes','Logistic regression','Random forest','SVM','k-NN']
df = pd.DataFrame({'Baseline': baseline,'ttd lower than median': low,'ttd higher than median': high}, index=index)
plt.figure(figsize = (6,12))
ax.yaxis.grid(color='gray', linestyle='dashed')
ax = df.plot.barh()
and the resulting plot looks like this:
However, it doesn't show the grid, and "plt.figure(figsize = (6,12))" doesn't seem to work. How can I fix this? Thanks in advance!
Specify the location of the legend by using plt.legend
Making the figure larger won't necessarily make the legend fit better
Show the grid by using plt.grid()
plt.figure(figsize = (6,12)) didn't work, because the dataframe plot wasn't using this axes.
fig, ax = plt.subplots(figsize=(7, 12))
ax.yaxis.grid(color='gray', linestyle='dashed')
df.plot.barh(ax=ax) # ax=ax lets the dataframe plot use the subplot axes
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left') # place the legend outside
plt.grid() # show the grid
Alternatively, pass figsize directly to df.plot.barh, which creates its own figure and returns the axes:
p = df.plot.barh(figsize=(7, 8))
p.yaxis.grid(color='gray', linestyle='dashed')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

Stripplot and lineplot weird result

When I use lineplot or stripplot on its own, it works well. But when I use both, the median line is shifted; I don't understand why! Thank you for your help.
sns.lineplot(x='quality', y='alcohol', data=df, estimator=np.median, err_style=None)
sns.stripplot(x='quality', y='alcohol', data=df, jitter=True, color='red', alpha=0.2, edgecolor='none')
[Figures: stripplot; lineplot+stripplot; lineplot]
What is happening here is that stripplot is a categorical plot: it draws the categories at integer positions 0 to n-1 and relabels those x ticks with the category values (3, 4, ...). lineplot treats x as numeric and draws at the actual values, so its point for x=3 lands at position 3, which the stripplot has labelled 6. Hence the offset.
One way to correct this is to create an x axis with a predefined range and then plot both charts on this predefined scale; see the example below:
import seaborn as sns
import matplotlib.pyplot as plt
x = [3,4,5,6,7,8]
y = [10, 12, 15, 18, 19, 26]
#First axes creates the error in graphing
fig, ax = plt.subplots(1,2)
sns.lineplot(x=x,y=y, ax=ax[0])
sns.stripplot(x=x, y=y, ax=ax[0])
#Second axes shows correction
xplot = range(len(x))
sns.lineplot(x=xplot,y=y, ax=ax[1])
sns.stripplot(x=x, y=y, ax=ax[1])

Plotting multiple series on a line/bar graph with pandas

I'm trying to make a plot of a line and bar on the same graph. I'm close, but I can't solve a few items. Here's what I have so far...
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.DataFrame({'Value1': np.arange(80, 180, 1),
'Value2': np.arange(1.5, .5, -0.01)},
index=np.arange(10, 110, 1))
fig, ax = plt.subplots(figsize=(10, 10))
data['Value1'].plot(ax=ax)
ax2 = ax.twinx()
data['Value2'].plot(kind='bar', ax=ax2, color='y', ylim=(0, 3))
So the problems I have with this graph are...
The x-ticks look awful. If I only do a line graph, the x-ticks look fine. As soon as I add the twinx axis, however, the major/minor tick logic gets dropped. How can I keep that?
My x-axis is numeric. Note that the line intercepts the x-axis at the value "10" (it's hard to see, but that's what's going on). I presume this is because the line's x-axis is supposed to begin at "10" and the bar's x-axis begins at 10 as well, but the value and the label get confused, so the line's x-axis gets pushed over to the label "20".
What's the best way to do this?
The bar plot and the line plot use different x-coordinate ranges; consider using two x-axes.
You can try saving the xticks and xticklabels after data['Value1'].plot(ax=ax) and setting them back after data['Value2'].plot(kind='bar', ax=ax2, color='y', ylim=(0, 3)):
data['Value1'].plot(ax=ax)
xticks = ax.get_xticks()
xlabels = [x.get_text() for x in ax.get_xticklabels()]
ax2 = ax.twinx()
data['Value2'].plot(kind='bar', ax=ax2, color='y', ylim=(0, 3))
ax.set_xticks(xticks)
ax.set_xticklabels(xlabels)
plt.show()
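A different route, sketched below as an assumption rather than as part of the answer above, is to draw the bars with matplotlib's Axes.bar instead of the pandas bar plot. The pandas bar plot places bars at positions 0 to n-1 and only labels them with the index, whereas Axes.bar places each bar at its numeric x value, so the line and the bars share the same x coordinates and the default numeric tick locator stays in effect. The arrays approximate the question's DataFrame.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Approximately the question's data, kept as plain arrays.
x = np.arange(10, 110, 1)             # the DataFrame index
value1 = np.arange(80, 180, 1)
value2 = np.linspace(1.5, 0.51, 100)  # decreasing from 1.5 in steps of 0.01

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(x, value1)

ax2 = ax.twinx()
# Axes.bar puts each bar at its numeric x value, so it lines up with the line plot.
ax2.bar(x, value2, color="y", alpha=0.5)
ax2.set_ylim(0, 3)
```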
