Overlay two separate histograms in Python

I have two separate dataframes that I plotted as histograms, and I want to know how to overlay them so that, for each category on the x axis, each dataframe gets a bar in a different color. This is the code I have for the separate bar graphs:
df1.plot.bar(x='brand', y='desc')
df2.groupby(['brand']).count()['desc'].plot(kind='bar')
I tried this code:
previous = df1.plot.bar(x='brand', y='desc')
current= df2.groupby(['brand']).count()['desc'].plot(kind='bar')
bins = np.linspace(1, 4)
plt.hist(x, bins, alpha=0.9,normed=1, label='Previous')
plt.hist(y, bins, alpha=0.5, normed=0,label='Current')
plt.legend(loc='upper right')
plt.show()
This code does not overlay the graphs properly. The problem is that dataframe 2 doesn't have numeric values, so I need to use the count method. I appreciate the help!

You might have to use axes objects in matplotlib. In simple terms, you create a figure with some axes object associated with it, then you can call hist from it. Here's one way you can do it:
fig, ax = plt.subplots(1, 1)
# density replaces the normed argument, which was removed in Matplotlib 3.1
ax.hist(x, bins, alpha=0.9, density=True, label='Previous')
ax.hist(y, bins, alpha=0.5, density=False, label='Current')
ax.legend(loc='upper right')
plt.show()

Make use of seaborn's histogram with several variables. In your case it would be:
import seaborn as sns
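# pass the numeric values themselves, not the Axes object returned by df1.plot.bar
previous = df1['desc']
current = df2.groupby(['brand']).count()['desc']
# distplot is deprecated since seaborn 0.11; sns.histplot is its replacement
sns.distplot(previous, color="skyblue", label="Previous")
sns.distplot(current, color="red", label="Current")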
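Since the goal is one bar per brand for each dataframe, a grouped bar chart may be a closer fit than a histogram. The following is only a sketch under assumptions: the sample data is made up, df1 is assumed to hold one numeric desc value per brand, df2 is assumed to need counting, and the name combined is introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the dataframes in the question
df1 = pd.DataFrame({"brand": ["A", "B", "C"], "desc": [3, 1, 2]})
df2 = pd.DataFrame({"brand": ["A", "A", "B", "C", "C", "C"],
                    "desc": ["x", "y", "z", "u", "v", "w"]})

# df1 already has a number per brand; df2 has text, so count rows per brand
previous = df1.set_index("brand")["desc"]
current = df2.groupby("brand")["desc"].count()

# Align both series on brand; brands missing from one side get 0
combined = pd.DataFrame({"Previous": previous, "Current": current}).fillna(0)

fig, ax = plt.subplots()
combined.plot.bar(ax=ax, color=["skyblue", "red"])  # one color per dataframe
ax.set_ylabel("count")
ax.legend(loc="upper right")
fig.tight_layout()
```

In an interactive session, plt.show() would display the figure; each brand then gets two side-by-side bars, one per dataframe.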

Related

How do I make two subplots with different scales in matplotlib, python?

I want to make two (sub)plots in one figure: the first should have a log-log scale and the second a linear-log scale. How do I do that?
The following code doesn't work.
figure, (ax1,ax2) = plt.subplots(1, 2)
plt.xscale("log")
plt.yscale("log")
ax1.plot(indices,pi_singal,linestyle='-')
plt.xscale("log")
plt.yscale("linear")
ax2.plot(indices,max_n_for_f)
plt.show()
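One likely reason the code above misbehaves is that plt.xscale and plt.yscale act on the current axes (typically the last one created, here ax2), so ax1 is never changed. A sketch of the per-axes alternative follows; the data is invented for illustration and pi_signal stands in for the array in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data standing in for the arrays in the question
indices = np.arange(1, 101)
pi_signal = indices ** 2.0
max_n_for_f = np.log(indices)

fig, (ax1, ax2) = plt.subplots(1, 2)

# Set the scales on each Axes object instead of through plt.*
ax1.set_xscale("log")
ax1.set_yscale("log")
ax1.plot(indices, pi_signal, linestyle="-")

ax2.set_xscale("log")
ax2.set_yscale("linear")
ax2.plot(indices, max_n_for_f)

fig.tight_layout()
```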

Plotting two pandas series together one appears flat

I am practicing with Python Pandas plotting functions and I am trying to plot the content of two series extracted from the same dataframe into one plot.
When I plot the two series individually, the result is correct. However, when I plot them together, the one I plot second appears flat in the picture.
Here is my code:
# dailyFlow and smooth are created in the same way from the same dataframe
dailyFlow = pd.Series(dataFrame...
smooth = pd.Series(dataFrame...
# lower the noise in the signal with standard deviation = 6
smooth = smooth.resample('D').sum().rolling(31, center=True, win_type='gaussian').sum(std=6)
dailyFlow.plot(style ='-b')
plt.legend(loc = 'upper right')
plt.show()
smooth.plot(style ='-r')
plt.legend(loc = 'upper right')
plt.show()
plt.figure(figsize=(12,5))
smooth.plot(style ='-r')
dailyFlow.plot(style ='-b')
plt.legend(loc = 'upper right')
plt.show()
Here is the output of my function:
I already tried using the parameter secondary_y=True in the second plot, but then I lose the information on the second line in the legend and the scaling between the two plots is wrong.
Many sources on the Internet seem to suggest that plotting the two series like I am doing should be correct, but then why is the third plot incorrect?
Thank you very much for your help.
For the data you have, the 3rd plot is correct. Look at the scale of the y axis on your two plots: one goes up to 70,000 and the other to 60,000,000.
I suspect what you actually want is a .rolling(...).mean() which should have a range comparable to your original data.
If you would like to make both plots bigger, you could try something like this:
fig, ax1 = plt.subplots()
ax1.set_ylim([0, 75000])
# plot first graph
ax2 = ax1.twinx() # second axes that shares the same x-axis
ax2.set_ylim([0, 60000000])
#plot the second graph
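The difference between a rolling sum and a rolling mean can be checked on made-up data. This sketch assumes an unweighted 31-day window rather than the Gaussian window from the question (which needs SciPy), and the series below is invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical daily series with values around 1000
idx = pd.date_range("2020-01-01", periods=120, freq="D")
daily = pd.Series(1000 + 200 * np.sin(np.arange(120) / 10.0), index=idx)

# A rolling sum adds up 31 values, so its scale is roughly 31x the data
rolled_sum = daily.rolling(31, center=True).sum()
# A rolling mean stays in the same range as the original data
rolled_mean = daily.rolling(31, center=True).mean()

fig, ax = plt.subplots(figsize=(12, 5))
daily.plot(ax=ax, style="-b", label="daily")
rolled_mean.plot(ax=ax, style="-r", label="rolling mean")
ax.legend(loc="upper right")
```

Plotting rolled_sum on the same axes instead of rolled_mean would squash the daily line toward the bottom, which matches the flat appearance described in the question.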

Plotting a pandas dataframe as stacked barchart with matplotlib. How to get rid of the overlapping?

I am plotting different columns of a year of hourly data from a Pandas dataframe.
I have written the following function to plot:
def plotResults(dfA):
    fig, (ax1,ax2) = plt.subplots(2, sharex=True, sharey=False)
    ax1.plot(dfA.index, dfA['LP1'], color='blue',alpha=0.7, label='LP1')
    ax1.bar(dfA.index, dfA['b'], color='green',alpha=0.1, rwidth=1, label='b')
    ax1.bar(dfA.index, dfA['p'].fillna(0), color='red',alpha=0.1, rwidth=1, label='p', bottom=dfA['b'])
    ax1.legend(loc='best')
    ax2.plot(dfA.index, dfA['Residual'], color='red',alpha=0.7, label='LP1')
    ax2.legend(loc='best')
    plt.show()
The bars for the bar charts are overlapping for some reason that I do not understand. I have been trying with width = 1.0/(len(dfA.index)) but then the bars get extremely narrow.
How can I set up the bars so they do not overlap and cover one hour (which is the periodicity of dfA)?
There should be gaps between the red bars in the upper graph. They only have values for some hours.
I haven't found a solution for the overlapping bars in the bar chart, but I have found a workaround that solves the problem for me:
using a stackplot instead of a bar chart.
def plotResults(dfA):
    fig, (ax1,ax2) = plt.subplots(2, sharex=True, sharey=False)
    # Upper Chart
    # Linechart
    ax1.plot(dfA.index, dfA['LP1'], color='blue',alpha=0.7, label='LP1')
    # stackplot
    ax1.stackplot(dfA.index,dfA['b'],dfA['p'], label=['b', 'p'], colors=['green','red'] ) #stackplot labels do not show in Legend
    ax1.plot([], [], color='red', label='p', linewidth=10) #dummy plots only to show labels in the legend
    ax1.plot([], [], color='green', label='b', linewidth=10) #dummy plots only to show labels in the legend
    ax1.legend(loc='best')
    # Lower Chart - Residuals
    ax2.plot(dfA.index, dfA['Residual'], color='red',alpha=0.7, label='Residual')
    ax2.legend(loc='best')
    plt.show()
Using the stackplot is not an answer to the initial issue, but it solves the problem for me.
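For the original bar-chart question, one possible explanation is that matplotlib measures a date axis in days, so a bar width of 1 spans a whole day and hourly bars overlap. A sketch assuming that interpretation, with invented hourly data and a width of 1/24 of a day, could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical hourly data; 'p' only has values for some hours
idx = pd.date_range("2021-01-01", periods=48, freq=pd.Timedelta(hours=1))
dfA = pd.DataFrame({"b": np.full(48, 2.0), "p": np.nan}, index=idx)
dfA.loc[dfA.index[::6], "p"] = 1.0  # every sixth hour gets a value

hour = 1.0 / 24.0  # date axes are measured in days, so one hour is 1/24

fig, ax = plt.subplots()
bars_b = ax.bar(dfA.index, dfA["b"], width=hour, align="edge",
                color="green", alpha=0.3, label="b")
# Stack 'p' on top of 'b'; hours without a value contribute zero height
bars_p = ax.bar(dfA.index, dfA["p"].fillna(0), width=hour, align="edge",
                bottom=dfA["b"], color="red", alpha=0.3, label="p")
ax.legend(loc="best")
```

With align="edge", each bar starts at its timestamp and covers exactly the following hour, so neighbouring bars touch without overlapping.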

Two seaborn distplots on the same axis

I am trying to figure a nice way to plot two distplots (from seaborn) on the same axis. It is not coming out as pretty as I want since the histogram bars are covering each other. And I don't want to use countplot or barplot simply because they don't look as pretty. Naturally if there is no other way I shall do it in that fashion, but distplot looks very good. But, as said, the bars are now covering each other (see pic).
Thus is there any way to fit two distplot frequency bars onto one bin so that they do not overlap? Or placing the counts on top of each other? Basically I want to do this in seaborn:
Any ideas to clean it up are most welcome. Thanks.
MWE:
sns.set_context("paper",font_scale=2)
sns.set_style("white")
rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(7,7),sharey=True)
sns.despine(left=True)
mats=dict()
mats[0]=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats[1]=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(max(set(mats[0])),max(set(mats[1])))
binsize = np.arange(0,N+1,1)
B=['Thing1','Thing2']
for i in range(len(B)):
    ax = sns.distplot(mats[i],
                      kde=False,
                      label=B[i],
                      bins=binsize)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
ax.legend()
plt.show()
As @mwaskom has said, seaborn wraps matplotlib plotting functions (for the most part) to deliver more complex and nicer-looking charts.
What you are looking for is "simple enough" to get it done with matplotlib:
sns.set_context("paper", font_scale=2)
sns.set_style("white")
plt.rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(4,4))
sns.despine(left=True)
# mats=dict()
mats0=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats1=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(mats0 + mats1)
# binsize = np.arange(0,N+1,1)
binsize = N
B=['Thing1','Thing2']
ax.hist([mats0, mats1], binsize, histtype='bar',
align='mid', label=B, alpha=0.4)#, rwidth=0.6)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
# ax.set_xlim(0,N+1)
ax.legend()
plt.show()
Which yields:
You can uncomment ax.set_xlim(0,N+1) to give more space around this histogram.
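As a variation, explicit bin edges centered on the integer values keep each pair of bars over its own tick. This is a sketch, not part of the original answer; the name edges is introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

mats0 = [1, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3]
mats1 = [3, 3, 3, 3, 3, 4, 4, 4, 5, 6, 1, 1, 2, 3, 4, 5, 5, 5]
N = max(mats0 + mats1)
B = ["Thing1", "Thing2"]

# Edges at 0.5, 1.5, ..., N + 0.5 so every integer sits in the middle of a bin
edges = np.arange(0.5, N + 1.5, 1.0)

fig, ax = plt.subplots(figsize=(4, 4))
# Passing a list of datasets makes hist draw the bars side by side per bin
counts, _, _ = ax.hist([mats0, mats1], bins=edges, histtype="bar",
                       label=B, alpha=0.4)
ax.set_xticks(range(1, N + 1))
ax.set_xlabel("My label")
ax.legend()
```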

Matplotlib: putting together figure, xaxis, minor_locator, major_locator

I am trying to draw a very basic plot that combines several parameters. This is how far I have come. Unfortunately, the documentation and its examples do not cover my issue:
fig=plt.figure(figsize=(50,18), dpi=60)
dax_timeseries_xts.plot(color="blue", linewidth=1.0, linestyle="-", label='DAX')
# dax_timeseries_xts is a XTS with dates as index
ax.xaxis.set_minor_locator(dates.WeekdayLocator(byweekday=(1),interval=1))
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n%a'))
ax.xaxis.grid(True, which="minor")
ax.yaxis.grid()
ax.xaxis.set_major_locator(dates.MonthLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter('\n\n\n%b\n%Y'))
plt.tight_layout()
plt.show()
Where do I create the "ax" in order to make this work?
Or maybe I am not efficiently putting the arguments listed above together to create my chart?
fig, ax_f = plt.subplots(nrows=1, ncols=1)
will give you the axes object (named ax_f here; use it wherever your code refers to ax).
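Put together, the pieces from the question might look like the sketch below. It assumes the series can be drawn with ax.plot directly (rather than through the pandas plot method), and the date range and values are invented stand-ins for dax_timeseries_xts.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for dax_timeseries_xts: a series with a date index
idx = pd.date_range("2021-01-01", periods=90, freq="D")
dax = pd.Series(np.linspace(13000, 15000, len(idx)), index=idx)

# Create the figure and the axes in one call, then draw on the axes
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(dax.index, dax.values, color="blue", linewidth=1.0,
        linestyle="-", label="DAX")

# Minor ticks on every Tuesday (weekday 1), labelled with day and weekday name
ax.xaxis.set_minor_locator(mdates.WeekdayLocator(byweekday=1, interval=1))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%d\n%a"))
ax.xaxis.grid(True, which="minor")
ax.yaxis.grid(True)

# Major ticks at the start of each month, pushed below the minor labels
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("\n\n\n%b\n%Y"))

ax.legend()
fig.tight_layout()
```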
