How can I create a distplot from a countplot?
plt.rcdefaults()
%config InlineBackend.figure_format='retina'
sns.set_style('darkgrid')
ax = sns.countplot(x='Age',hue='Gender',data=df,edgecolor="None")
ax.tick_params(bottom=False, left=False)
ax.set_axisbelow(True)
for rect in ax.patches:
    x = rect.get_x() + rect.get_width() / 2.
    y = rect.get_height()
    try:
        ax.annotate("{}".format(int(y)), (x, y), ha='center', va='bottom', clip_on=True)
    except ValueError:
        # int() fails on NaN heights (empty Age/Gender combinations); skip those bars
        pass
ax.set_xlabel('Age', color='green')
ax.set_ylabel('Count', color='green')
ax.set_title('Countplot for Age(Gender)', color='tomato',weight='bold')
plt.legend(title='Gender', fontsize='large', loc='upper right').get_frame().set_facecolor('white')
plt.tight_layout()
plt.savefig('files\\Countplot_for_Age(Gender).jpg')
I want a distplot for the two genders, either in the same plot or separately.
Any suggestions or help would be highly appreciated.
The x-axis of a countplot is categorical: it puts one bar for each encountered age, skipping bars when there are no rows for a certain age (21 and 23 in the example). Internally the bars are numbered as 0, 1, 2, ...
The y-axis is the count, which is proportional to the number of rows.
For a distplot, the x-axis shows the ages themselves, and the y-axis is a probability density, whose values are usually quite small (the area under the curve is normalized to 1).
So, as both the x-axis and the y-axis are different, it is better to use separate subplots.
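The difference between the two y-axes can be checked numerically. The following is a minimal sketch using only NumPy, with made-up integer ages (the `ages` array is an assumption, not the asker's data): the countplot-style values sum to the number of rows, while the density-style values integrate to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
ages = rng.integers(15, 25, size=60)  # hypothetical integer ages, 15..24

# countplot-style y values: raw number of rows per distinct age
values, counts = np.unique(ages, return_counts=True)

# distplot-style y values: a density over unit-wide bins centered on each age
edges = np.arange(ages.min() - 0.5, ages.max() + 1.5)
density, _ = np.histogram(ages, bins=edges, density=True)
area = np.sum(density * np.diff(edges))

print("sum of counts:", counts.sum())  # equals the number of rows
print("area under density:", area)     # normalized to 1
```

Because the density values are fractions of the total, they end up on a very different scale from the counts, which is why a shared y-axis is awkward.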
A distplot can be generated directly from the given data. Passing the same ax results in two distplots in the same subplot. A distplot is a combination of a histogram and a kdeplot. If the histogram isn't needed, hist=False leaves it out, or the kdeplot can be called directly. The shade=True option adds shading to the plot.
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
NF = 50
NM = 10
df = pd.DataFrame({'Age': np.concatenate([np.random.randint(13, 20, NF) + np.random.randint(2, 7, NF),
                                          np.random.randint(15, 23, NM)]),
                   'Gender': np.repeat(['female', 'male'], (NF, NM))})
df['Age'] = df['Age'].where((df['Age'] != 21) & (df['Age'] != 23), 20)
sns.set_style('darkgrid')
fig, axs = plt.subplots(ncols=2, figsize=(12, 4))
ax = sns.countplot(x='Age', hue='Gender', data=df, edgecolor="None", ax=axs[0])
ax.tick_params(bottom=False, left=False)
ax.set_axisbelow(True)
for rect in ax.patches:
    x = rect.get_x() + rect.get_width() / 2.
    y = rect.get_height()
    ax.annotate(f"{y:.0f}", (x, y), ha='center', va='bottom', clip_on=True)
ax.set_xlabel('Age', color='green')
ax.set_ylabel('Count', color='green')
ax.set_title('Countplot for Age(Gender)', color='tomato', weight='bold')
ax.legend(title='Gender', fontsize='large', loc='upper right').get_frame().set_facecolor('white')
for gender in ('female', 'male'):
    # ax2 = sns.kdeplot(df[df['Gender'] == gender]['Age'], shade=True, ax=axs[1], label=gender)
    ax2 = sns.distplot(df[df['Gender'] == gender]['Age'], hist=False, kde_kws={'shade': True}, ax=axs[1], label=gender)
ax2.set_axisbelow(True)
ax2.set_xlabel('Age', color='green')
ax2.set_ylabel('probability distribution', color='green')
ax2.set_title('Distplot for Age(Gender)', color='tomato', weight='bold')
ax2.legend(title='Gender', fontsize='large', loc='upper right').get_frame().set_facecolor('white')
plt.tight_layout()
plt.show()
I am using a secondary y-axis and a cmap color, but when I plot them together the color bar overlaps my plot.
Here is my code:
fig,ax1=plt.subplots()
ax1 = df_Combine.plot.scatter('Parameter2', 'NPV (MM €)', marker='s', s=500, ylim=(-10,60), c='Lifetime1 (a)', colormap='jet_r', vmin=0, vmax=25, ax=ax1)
ax1.axhline(0, color='k')
plt.xticks(rotation=90)
ax2 = ax1.twinx()
ax2.plot(df_Combine_min_select1["CumEnergy1 (kWH)"])
plt.show()
And here is my plot:
Can anyone help me solve this issue?
Thank you
When you let pandas create the colorbar automatically, you have no positioning options. Instead, you can create the colorbar in a separate step and pass the pad= parameter to set a wider gap. By default, pad is 0.05, meaning 5% of the width of the subplot.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
fig, ax1 = plt.subplots()
df_Combine = pd.DataFrame({'Parameter2': np.random.rand(10) * 10,
                           'NPV (MM €)': np.random.rand(10),
                           'Lifetime1 (a)': np.random.rand(10) * 25,
                           })
ax1 = df_Combine.plot.scatter('Parameter2', 'NPV (MM €)', marker='s', s=500, ylim=(-10, 60), c='Lifetime1 (a)',
                              colormap='jet_r', vmin=0, vmax=25, ax=ax1, colorbar=False)
plt.colorbar(ax1.collections[0], ax=ax1, pad=0.1)
ax2 = ax1.twinx()
ax2.plot(np.random.rand(10))
plt.show()
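To see what pad does, the gap between the plot and the colorbar can be measured directly. This is a small sketch in plain matplotlib with random data; the helper name `colorbar_gap` is made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def colorbar_gap(pad):
    """Return the horizontal gap (in figure fractions) between a scatter
    plot's axes and its colorbar for the given pad value."""
    rng = np.random.default_rng(0)
    fig, ax = plt.subplots()
    points = ax.scatter(rng.random(10), rng.random(10), c=rng.random(10) * 25,
                        cmap='jet_r', vmin=0, vmax=25, marker='s')
    cbar = fig.colorbar(points, ax=ax, pad=pad)
    # distance from the right edge of the plot to the left edge of the colorbar
    gap = cbar.ax.get_position().x0 - ax.get_position().x1
    plt.close(fig)
    return gap


default_gap = colorbar_gap(0.05)
wide_gap = colorbar_gap(0.2)
print(f"pad=0.05 -> gap {default_gap:.3f}, pad=0.2 -> gap {wide_gap:.3f}")
```

A larger pad leaves more room for the tick labels of the twin axis, at the cost of a narrower main plot.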
I have tried a number of different things to fix my chart, from zorder on the plots to plt.rcParams.
I feel that this is such a simple problem, but I just don't know where I have gone wrong. As you can see, the bottom annotation in cyan is unreadable and mashed together with the y label.
Ideally, the annotation would sit over the y label so that the text inside the annotation is readable.
If possible, I would like the annotation to sit on top and still overlay the y label, something like this:
Any help on this would be greatly appreciated.
ax = df.plot(x=df.columns[0], y=df.columns[1], legend=False, zorder=0, linewidth=1)
y1 =df.loc[:, df.columns[2]].tail(1)
y2= df.loc[:, df.columns[1]].tail(1)
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
print(colors)
for var in (y1, y2):
    plt.annotate('%0.2f' % var.max(), xy=(1, var.max()), zorder=1, xytext=(8, 0),
                 xycoords=('axes fraction', 'data'),
                 textcoords='offset points',
                 bbox=dict(boxstyle="round", fc=colors[0], ec=colors[0],))
ax2 = ax.twinx()
df.plot(x=df.columns[0], y=df.columns[2], ax=ax2, legend=False, color='#fa8174', zorder=0,linewidth=1)
ax.figure.legend(prop=subtitle_font)
ax.grid(True, color="white",alpha=0.2)
pack = [df.columns[1], df.columns[2], freq[0]]
plt.text(0.01, 0.95,'{0} v {1} - ({2})'.format(df.columns[1], df.columns[2], freq[0]),
horizontalalignment='left',
verticalalignment='center',
transform = ax.transAxes,
zorder=10,
fontproperties=subtitle_font)
ax.text(0.01,0.02,"Sources: FRED, Quandl, #Paul92s",
color="white",fontsize=10,
horizontalalignment='left',
transform = ax.transAxes,
verticalalignment='center',
zorder=20,
fontproperties=subtitle_font)
ax.xaxis.set_major_locator(matplotlib.dates.YearLocator())
ax.xaxis.set_minor_locator(matplotlib.dates.MonthLocator((4,7,10)))
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter("%Y"))
ax.xaxis.set_minor_formatter(ticker.NullFormatter()) # matplotlib.dates.DateFormatter("%m")
plt.setp(ax.get_xticklabels(), rotation=0, ha="center", zorder=-1)
plt.setp(ax2.get_yticklabels(), rotation=0, zorder=-1)
plt.setp(ax.get_yticklabels(), rotation=0, zorder=-1)
plt.gcf().set_size_inches(14,7)
ax.set_xlabel('Data as of; {0}'.format(df['Date'].max().strftime("%B %d, %Y")), fontproperties=subtitle_font)
y1 =df.loc[:, df.columns[2]].tail(1)
y2= df.loc[:, df.columns[1]].tail(1)
for var in (y1, y2):
    plt.annotate('%0.2f' % var.max(), xy=(1, var.max()), zorder=1, xytext=(8, 0),
                 xycoords=('axes fraction', 'data'),
                 textcoords='offset points',
                 bbox=dict(boxstyle="round", fc="#fa8174", ec="#fa8174"))
plt.title('{0}'.format("FRED Velocity of M2 Money Stock v Trade Weighted U.S. Dollar Index: Broad"),fontproperties=heading_font)
ax.texts.append(ax.texts.pop())
ax.set_facecolor('#181818')
ax.figure.set_facecolor('#181818')
plt.rcParams['axes.axisbelow'] = True
I couldn't figure out why zorder doesn't work, but you can directly set the bbox style of the tick labels:
import matplotlib.pyplot as plt
import numpy as np
from numpy.random import rand
import matplotlib.patches as mpatches
fig, ax = plt.subplots(1, 1)
ax.plot(rand(100), '^', color='r')
for label in ax.get_xticklabels():
    label.set_bbox(dict(facecolor='orange'))
ax1 = ax.twinx()
ax1.plot(rand(100), 'o', color='b')
index_to_add_bbox = [2, 4]
ax1_labels = ax1.get_yticklabels()
for i in index_to_add_bbox:
    ax1_labels[i].set_bbox(dict(boxstyle='Circle', facecolor='orange'))
plt.show()
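One possible explanation for the zorder behaviour, offered as an assumption rather than a confirmed diagnosis of the question's code: zorder is compared among the artists of a single Axes, while separate Axes are drawn one after another. A twin Axes is created later, so its tick labels are painted after everything on the first Axes, including annotations. A minimal sketch with synthetic data that places the annotation on the twin Axes instead:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
ax.plot(rng.random(50).cumsum(), color='tab:blue')

ax2 = ax.twinx()                      # created later, so drawn after ax
series2 = rng.random(50).cumsum()     # synthetic stand-in for the second column
ax2.plot(series2, color='#fa8174')

# Annotate on ax2, the Axes that owns the right-hand tick labels, so the
# annotation and those labels are ordered by zorder within the same Axes.
last = series2[-1]
ann = ax2.annotate(f'{last:0.2f}', xy=(1, last), xytext=(8, 0),
                   xycoords=('axes fraction', 'data'),
                   textcoords='offset points',
                   bbox=dict(boxstyle='round', fc='#fa8174', ec='#fa8174'))
# call plt.show() or fig.savefig(...) to view the result
```

Text artists default to a higher zorder than the axis, so here the annotation box should be drawn over the tick labels without setting zorder explicitly.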
This chart almost looks good, but this is probably not the way to model it in matplotlib. How can I have two horizontal bars that extend to the left and right of a vertical line at an x-point to show the change between the two datasets, e.g. SDR from 0.7 to 0.25?
Currently I patch things together with '$-$' markers, which gives misaligned legends that I am not able to place properly. If I change the figsize, the markers become misaligned from the vertical bar at their x-point, e.g. SDR.
How can I model this kind of chart properly?
layer0 = np.random.random(10)
fig, ax = plt.subplots(1,1, figsize=(15/2,1.5*2.5),)
ind = np.arange(10, dtype=np.float64)*1#cordx
ax.plot(ind[0::2]+0.05, layer0[0::2]-0.04, ls='None', marker='$-$', markersize=40)
ax.plot(ind[1::2]-0.15, layer0[1::2]-0.04, ls='None', marker='$-$', markersize=40)
ax.set_ylim(0,1.05)
ax.set_yticks(np.arange(0, 1.1, step=0.1))
ax.set_xticks(ind[0::2]+0.5)
ax.set_xticklabels( ('SDR', 'SSR', 'SCR', 'RCR', 'GUR') )
plt.grid(True)
plt.grid(color='black', which='major', axis='y', linestyle='--', lw=0.2)
plt.show()
Alternatively, you can use a horizontal bar chart, barh, which is more intuitive in this case. The key parameter here is left, which shifts the horizontal bars to the left or right.
The following is a complete answer:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(2)
layer0 = np.random.random(10)
fig, ax = plt.subplots(1,1, figsize=(15/2,1.5*2.5),)
n = 10
width = 0.5
ind = np.arange(n, dtype=np.float64)*1#cordx
ax.barh(layer0[0::2], [width]*int(n/2), height=0.01, left = ind[0::2])
ax.barh(layer0[1::2], [width]*int(n/2), height=0.01, left = ind[0::2]+width)
ax.set_ylim(0,1.05)
ax.set_yticks(np.arange(0, 1.1, step=0.1))
ax.set_xticks(ind[0::2]+0.5)
ax.set_xticklabels( ('SDR', 'SSR', 'SCR', 'RCR', 'GUR') )
plt.grid(True)
plt.grid(color='black', which='major', axis='y', linestyle='--', lw=0.2)
plt.show()
Up until now I hadn't thought of bar charts with a bottom offset, which seems to work:
layer0 = np.random.random(10)
fig, ax = plt.subplots(1,1, figsize=(15/1.3,1.5*2.5),)# sharey=True)
ind = np.arange(10, dtype=np.float64)*1#cordx
height=0.03
width=0.8
ax.bar(ind[0::2]-width/2, height, width=width, bottom=layer0[0::2]-height)
ax.bar(ind[0::2]+width/2, height, width=width, bottom=layer0[1::2]-height)
ax.set_ylim(-0.,1.05)
plt.grid(color='black', which='major', axis='x', linestyle='-', lw=0.8)
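Another option, not taken from the answers above, is to draw the short segments with ax.hlines, which works in data coordinates and so stays attached to the vertical line when the figure size changes. This is a sketch only; the legend labels and the choice of which half of layer0 goes on which side are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(2)
layer0 = np.random.random(10)
names = ('SDR', 'SSR', 'SCR', 'RCR', 'GUR')
centers = np.arange(len(names), dtype=float)  # x position of each vertical line
half = 0.4                                    # length of each horizontal segment

fig, ax = plt.subplots(figsize=(7.5, 3.75))
# first dataset: segments ending at the vertical line (left side)
first = ax.hlines(layer0[0::2], centers - half, centers,
                  colors='tab:blue', linewidth=3, label='dataset 1')
# second dataset: segments starting at the vertical line (right side)
second = ax.hlines(layer0[1::2], centers, centers + half,
                   colors='tab:orange', linewidth=3, label='dataset 2')
ax.vlines(centers, 0, 1.05, colors='black', linewidth=0.8)

ax.set_ylim(0, 1.05)
ax.set_yticks(np.arange(0, 1.1, step=0.1))
ax.set_xticks(centers)
ax.set_xticklabels(names)
ax.grid(color='black', which='major', axis='y', linestyle='--', lw=0.2)
ax.legend(loc='upper right')
# call plt.show() or fig.savefig(...) to view the result
```

Each hlines call produces one legend entry, which avoids the misaligned marker legends described in the question.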
I have two graphs that share the same x-axis but have different y-axis scalings.
The plot with regular axes shows the data with a trend line depicting a decay, while the plot with semi-log y scaling shows the accuracy of the fit.
fig1 = plt.figure(figsize=(15,6))
ax1 = fig1.add_subplot(111)
# Plot of the decay model
ax1.plot(FreqTime1,DecayCount1, '.', color='mediumaquamarine')
# Plot of the optimized fit
ax1.plot(x1, y1M, '-k',
         label=r'Fitting Function: $f(t) = %.3f e^{%.3f t} %+.3f$' % (aR1, kR1, bR1))
ax1.set_xlabel('Time (sec)')
ax1.set_ylabel('Count')
ax1.set_title('Run 1 of Cesium-137 Decay')
# Allows me to change scales
# ax1.set_yscale('log')
ax1.legend(bbox_to_anchor=(1.0, 1.0), prop={'size':15}, fancybox=True, shadow=True)
Now I'm trying to figure out how to place both close together, like the examples at this link:
http://matplotlib.org/examples/pylab_examples/subplots_demo.html
In particular, this one
Looking at the code for the example, I'm a bit confused about how to implement three things:
1) Scaling the axes differently
2) Keeping the figure size the same for the exponential decay graph, but giving the line graph a smaller y size and the same x size.
For example:
3) Making the label of the function appear only in the decay graph.
Any help would be most appreciated.
Look at the code and comments in it:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import gridspec
# Simple data to display in various forms
x = np.linspace(0, 2 * np.pi, 400)
y = np.sin(x ** 2)
fig = plt.figure()
# set height ratios for subplots
gs = gridspec.GridSpec(2, 1, height_ratios=[2, 1])
# the first subplot
ax0 = plt.subplot(gs[0])
# log scale for axis Y of the first subplot
ax0.set_yscale("log")
line0, = ax0.plot(x, y, color='r')
# the second subplot
# shared axis X
ax1 = plt.subplot(gs[1], sharex = ax0)
line1, = ax1.plot(x, y, color='b', linestyle='--')
plt.setp(ax0.get_xticklabels(), visible=False)
# remove last tick label for the second subplot
yticks = ax1.yaxis.get_major_ticks()
yticks[-1].label1.set_visible(False)
# put legend on first subplot
ax0.legend((line0, line1), ('red line', 'blue line'), loc='lower left')
# remove vertical gap between subplots
plt.subplots_adjust(hspace=.0)
plt.show()
Here is my solution:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2 * np.pi, 400)
y = np.sin(x ** 2)
fig, (ax1,ax2) = plt.subplots(nrows=2, sharex=True, subplot_kw=dict(frameon=False)) # frameon=False removes frames
plt.subplots_adjust(hspace=.0)
ax1.grid()
ax2.grid()
ax1.plot(x, y, color='r')
ax2.plot(x, y, color='b', linestyle='--')
One more option is seaborn.FacetGrid but this requires Seaborn and Pandas libraries.
Here are some adaptations showing how the code could add a combined legend when plotting a pandas dataframe. ax=ax0 can be used to plot on a given ax and ax0.get_legend_handles_labels() gets the information for the legend.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
dates = pd.date_range('20210101', periods=100, freq='D')
df0 = pd.DataFrame({'x': np.random.normal(0.1, 1, 100).cumsum(),
'y': np.random.normal(0.3, 1, 100).cumsum()}, index=dates)
df1 = pd.DataFrame({'z': np.random.normal(0.2, 1, 100).cumsum()}, index=dates)
fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True, gridspec_kw={'height_ratios': [2, 1], 'hspace': 0})
df0.plot(ax=ax0, color=['dodgerblue', 'crimson'], legend=False)
df1.plot(ax=ax1, color='limegreen', legend=False)
# put legend on first subplot
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
# remove last tick label for the second subplot
yticks = ax1.get_yticklabels()
yticks[-1].set_visible(False)
plt.tight_layout()
plt.show()
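Applied to the three points in the decay question, the same ideas might look like the sketch below. The data are synthetic and the parameters a, k and b are made-up stand-ins for the asker's fitted values; no fitting is performed.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 600, 200)
a, k, b = 900.0, -0.01, 50.0          # hypothetical fit parameters
model = a * np.exp(k * t) + b
counts = rng.poisson(model)           # synthetic noisy counts

# 2) the top panel is twice as tall as the bottom one; both share the x-axis
fig, (ax_top, ax_bottom) = plt.subplots(
    nrows=2, sharex=True, figsize=(15, 6),
    gridspec_kw={'height_ratios': [2, 1], 'hspace': 0})

# linear y-axis: data and model, with the function label
ax_top.plot(t, counts, '.', color='mediumaquamarine')
ax_top.plot(t, model, '-k',
            label=r'$f(t) = %.3f e^{%.3f t} %+.3f$' % (a, k, b))
ax_top.set_ylabel('Count')
ax_top.set_title('Synthetic decay example')
ax_top.legend()                       # 3) legend only on the decay panel

# 1) same data on a logarithmic y-axis, without a label or legend
ax_bottom.plot(t, counts, '.', color='mediumaquamarine')
ax_bottom.plot(t, model, '-k')
ax_bottom.set_yscale('log')
ax_bottom.set_xlabel('Time (sec)')
ax_bottom.set_ylabel('Count (log)')
# call plt.show() or fig.savefig(...) to view the result
```

Since set_yscale is called per Axes, sharing the x-axis does not force the two panels onto the same y scaling.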
How do we draw a horizontal average line for a histogram using matplotlib?
Right now, I'm able to draw the histogram without any issues.
Here is the code I'm using:
## necessary variables
ind = np.arange(N) # the x locations for the groups
width = 0.2 # the width of the bars
plt.tick_params(axis='both', which='major', labelsize=30)
plt.tick_params(axis='both', which='minor', labelsize=30)
ax2 = ax.twinx()
## the bars
rects1 = ax.bar(ind, PAAE1, width,
color='0.2',
error_kw=dict(elinewidth=2,ecolor='red'),
label='PAAE1')
rects2 = ax.bar(ind+width, PAAE2, width,
color='0.3',
error_kw=dict(elinewidth=2,ecolor='black'),
label='PAAE2')
rects3 = ax2.bar(ind+width+width, AAE1, width,
color='0.4',
error_kw=dict(elinewidth=2,ecolor='red'),
label='AAE1')
rects4 = ax2.bar(ind+3*width, AAE2, width,
color='0.5',
error_kw=dict(elinewidth=2,ecolor='black'),
label='AAE2')
maxi = max(dataset[2])
maxi1 = max(dataset[4])
f_max = max(maxi, maxi1)
lns = [rects1,rects2,rects3,rects4]
labs = [l.get_label() for l in lns]
ax.legend(lns, labs, loc='upper center', ncol=4)
# axes and labels
ax.set_xlim(-width,len(ind)+width)
ax.set_ylim(0, 100)
ax.set_ylabel('PAAE', fontsize=25)
ax2.set_ylim(0, f_max+500)
ax2.set_ylabel('AAE (mW)', fontsize=25)
xTickMarks = dataset[0]
ax.set_xticks(ind+width)
xtickNames = ax.set_xticklabels(xTickMarks)
plt.setp(xtickNames, rotation=90, fontsize=25)
I want to plot the average line for PAAE 1, 2 and AAE 1, 2.
What should I be using to plot the average line?
If you'd like a vertical line to denote the mean, use axvline(x_value). This will place a vertical line that always spans the full (or specified fraction of) y-axis. There's also axhline for horizontal lines.
In other words, you might have something like this:
ax.axvline(data1.mean(), color='blue', linewidth=2)
ax.axvline(data2.mean(), color='green', linewidth=2)
As a more complete, but unnecessarily complex example (most of this is nicely annotating the means with curved arrows):
import numpy as np
import matplotlib.pyplot as plt
data1 = np.random.normal(0, 1, 1000)
data2 = np.random.normal(-2, 1.5, 1000)
fig, ax = plt.subplots()
bins = np.linspace(-10, 5, 50)
ax.hist(data1, bins=bins, color='blue', label='Dataset 1',
alpha=0.5, histtype='stepfilled')
ax.hist(data2, bins=bins, color='green', label='Dataset 2',
alpha=0.5, histtype='stepfilled')
ax.axvline(data1.mean(), color='blue', linewidth=2)
ax.axvline(data2.mean(), color='green', linewidth=2)
# Add arrows annotating the means:
for dat, xoff in zip([data1, data2], [15, -15]):
    x0 = dat.mean()
    align = 'left' if xoff > 0 else 'right'
    ax.annotate('Mean: {:0.2f}'.format(x0), xy=(x0, 1), xytext=(xoff, 15),
                xycoords=('data', 'axes fraction'), textcoords='offset points',
                horizontalalignment=align, verticalalignment='center',
                arrowprops=dict(arrowstyle='-|>', fc='black', shrinkA=0, shrinkB=0,
                                connectionstyle='angle,angleA=0,angleB=90,rad=10'),
                )
ax.legend(loc='upper left')
ax.margins(0.05)
plt.show()
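For the bar chart in the question, where the averages are heights rather than x positions, the horizontal counterpart axhline is probably the closer fit. A minimal sketch with made-up values standing in for PAAE1 and PAAE2:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_groups = 6
paae1 = rng.uniform(20, 90, n_groups)   # hypothetical stand-in for PAAE1
paae2 = rng.uniform(20, 90, n_groups)   # hypothetical stand-in for PAAE2
ind = np.arange(n_groups)               # the x locations for the groups
width = 0.2                             # the width of the bars

fig, ax = plt.subplots()
ax.bar(ind, paae1, width, color='0.2', label='PAAE1')
ax.bar(ind + width, paae2, width, color='0.5', label='PAAE2')

# one horizontal line per series at the height of its mean
mean_lines = [
    ax.axhline(paae1.mean(), color='0.2', linestyle='--', label='mean PAAE1'),
    ax.axhline(paae2.mean(), color='0.5', linestyle='--', label='mean PAAE2'),
]
ax.set_ylim(0, 100)
ax.set_ylabel('PAAE')
ax.legend(loc='upper center', ncol=4)
# call plt.show() or fig.savefig(...) to view the result
```

For series plotted on a twin axis (AAE on ax2 in the question), the line would be added with ax2.axhline so that it uses that axis's scale.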