I have a dataframe with dates as the index and float columns, filled mostly with NaN and a few floats.
I am plotting this dataframe using:
fig, ax = plt.subplots()
plot(df2[11][2:], linestyle='dashed',linewidth=2,label='xx')
ax.set(xlabel='xx', ylabel='xx', title='xx')
ax.grid()
ax.legend()
The plot window opens, but no data appears. If I use markers instead of a line, the data points do appear.
What should I correct to plot my graphs as lines?
Edit: Thanks, it worked like this:
s1 = np.isfinite(df2[11][2:])
fig, ax = plt.subplots()
plot(df2.index[2:][s1],df2[11][2:].values[s1], linestyle='-',linewidth=2,label='xx')
ax.set(xlabel='xx', ylabel='xx',title='xx')
ax.grid()
ax.legend()
Try
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot(df2[11][2:], linestyle='dashed',linewidth=2,label='xx')
plt.gca().set(xlabel='xx', ylabel='xx', title='xx')  # pyplot has no set(); call it on the current Axes
plt.grid()
plt.legend()
plt.show()
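In your case, matplotlib won't draw a line between points separated by NaNs, so a valid point whose neighbours are both NaN produces no visible segment (a marker still shows). You can mask the NaNs or drop them. Have a look at the question linked below; it has some solutions for drawing lines that skip NaNs.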
matplotlib: drawing lines between points ignoring missing data
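As a rough illustration of the "get rid of them" option, here is a minimal sketch using `Series.dropna()`. The series below is made-up stand-in data for `df2[11]`, not the asker's dataframe.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2[11]: a date-indexed series that is mostly NaN.
index = pd.date_range("2020-01-01", periods=10, freq="D")
values = [np.nan, 1.0, np.nan, np.nan, 2.5, np.nan, 1.5, np.nan, np.nan, 3.0]
series = pd.Series(values, index=index)

# Dropping the NaNs leaves only the real observations, so consecutive
# valid points become neighbours and a line can be drawn between them.
valid = series.dropna()

fig, ax = plt.subplots()
(line,) = ax.plot(valid.index, valid.values,
                  linestyle="dashed", linewidth=2, label="xx")
ax.set(xlabel="xx", ylabel="xx", title="xx")
ax.grid()
ax.legend()
```

Call `plt.show()` afterwards to display the figure. Note that this connects points across the gaps, which may or may not be what you want for sparse data.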
I'm trying to plot two datasets into one plot with matplotlib. One of the two plots is misaligned by 1 on the x-axis.
This MWE pretty much sums up the problem. What do I have to adjust to bring the box-plot further to the left?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
plotData = pd.DataFrame(np.random.rand(25, 9), columns=titles)
failureRates = pd.DataFrame(np.random.rand(9, 1), index=titles)
color = {'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue',
'caps': 'Gray'}
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
plotData.plot.box(ax=ax1, color=color, sym='+')
failureRates.plot(ax=ax2, color='b', legend=False)
ax1.set_ylabel('Seconds')
ax2.set_ylabel('Failure Rate in %')
plt.xlim(-0.7, 8.7)
ax1.set_xticks(range(len(titles)))
ax1.set_xticklabels(titles)
fig.tight_layout()
fig.show()
Actual result: note that there are only 8 box plots instead of 9, and that they start at index 1.
The issue is a mismatch between how box() and plot() work: box() starts at x-position 1, while plot() depends on the index of the dataframe (which defaults to starting at 0). Only 8 boxes are visible because the 9th is cut off by plt.xlim(-0.7, 8.7). There are several easy ways to fix this. As @Sheldore's answer indicates, you can explicitly set the positions for the boxplot. Another way is to change the index of the failureRates dataframe to start at 1 when constructing it, i.e.
failureRates = pd.DataFrame(np.random.rand(9, 1), index=range(1, len(titles)+1))
Note that you need not specify the xticks or the xlim for the question's MCVE, but you may need to for your complete code.
You can specify the positions on the x-axis where you want to have the box plots. Since you have 9 boxes, use the following:
plotData.plot.box(ax=ax1, color=color, sym='+', positions=range(9))
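To see the alignment without relying on the screenshot, here is a small sketch of the same idea. It calls matplotlib's `Axes.boxplot` directly instead of the pandas wrapper (my substitution, so that the created artists can be inspected), and the data is random filler.

```python
import matplotlib.pyplot as plt
import numpy as np

titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
rng = np.random.default_rng(0)
box_data = rng.random((25, len(titles)))  # one column per box
failure_rates = rng.random(len(titles))
positions = np.arange(len(titles))        # 0..8, shared by both plots

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# boxplot() defaults to positions 1..N; passing positions explicitly
# puts the boxes at 0..N-1, matching the line plot's x-values.
boxes = ax1.boxplot(box_data, positions=positions, sym="+")
(line,) = ax2.plot(positions, failure_rates, color="b")

ax1.set_xticks(positions)
ax1.set_xticklabels(titles)
ax1.set_ylabel("Seconds")
ax2.set_ylabel("Failure Rate in %")

# x-centre of each median line, to confirm where the boxes ended up
box_centres = [np.mean(m.get_xdata()) for m in boxes["medians"]]
```

Here `box_centres` should come out equal to `positions`, i.e. each box sits on the same x-value as the corresponding point of the line.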
I might be doing something wrong but I'm struggling to achieve the below:
# plot bars and lines in the same figure, sharing both x and y axes.
# df is some DataFrame with multiple columns
_, ax = plt.subplots()
df[col1].plot(kind='bar', ax=ax)
df[col2].plot(ax=ax, marker='o', ls='-')
ax.legend(loc='best')
I expected to see a chart with both bars and a line. However, what I end up with is only the line for df[col2]; the bars from df[col1] are not on the chart. Whatever was plotted before df[col2] seems to have been overwritten.
I got around this with:
df[col1].plot(kind='bar', ax=ax, label=bar_labels)
ax.plot(df[col2], marker='o', ls='-', label=line_labels)
ax.legend(loc='best')
However, this isn't perfect, as I had to pass label arguments explicitly; otherwise the legend would not include an entry for df[col2].
Does anyone have a more elegant solution to make both the bars and the line show up?
** Edit **
Thanks to @DizietAsahi, I found out that this is a problem with DatetimeIndex as x-values. Filed the following at Pandas:
https://github.com/pydata/pandas/issues/10761#issuecomment-128671523
I wonder if your problem is related to the hold state of your plot...
This works:
df = pd.DataFrame(np.random.random_sample((10,2)), columns=['col1', 'col2'])
fig, ax = plt.subplots()
plt.hold(True)
df['col1'].plot(kind='bar', ax=ax)
df['col2'].plot(ax=ax, marker='o', ls='-')
ax.legend(loc='best')
This shows only the line and not the bar plot:
df = pd.DataFrame(np.random.random_sample((10,2)), columns=['col1', 'col2'])
fig, ax = plt.subplots()
plt.hold(False)
df['col1'].plot(kind='bar', ax=ax)
df['col2'].plot(ax=ax, marker='o', ls='-')
ax.legend(loc='best')
Thanks to @DizietAsahi, I found out that this is a problem with DatetimeIndex as x-values. Integer values work with @DizietAsahi's code above.
Filed the following at Pandas:
https://github.com/pydata/pandas/issues/10761#issuecomment-128671523
I have the same problem when using a DatetimeIndex for my x-values. I couldn't get the hack from GitHub to work, using Jupyter Notebook 4.2.0 with Python 2.7.11. I'm grouping two columns by month to plot as bars, then overlaying the raw values as a line graph.
My solution is to use matplotlib directly to plot both the bar charts and the line; this was the only way I could get it to work.
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
%matplotlib inline
df = pd.read_excel('data.xlsx', index_col='Date')
fig, ax = plt.subplots(figsize=(15,10))
ax.hold(True)
g = df.groupby(pd.TimeGrouper("M"))
width=10
ax.bar(g.sum().index, g['Money Out'].sum(),width=width, color='r',label='Money Out')
ax.bar(g.sum().index, g['Money in'].sum(),width=-width,color='g', label='Money In')
ax.plot(df[['Balance']], color='black', lw=1, ls='-',label='Balance')
ax.legend(loc='best')
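For anyone trying this on newer library versions: as far as I know, `ax.hold` and `pd.TimeGrouper` were removed in later matplotlib and pandas releases. Below is a sketch of the same approach under that assumption, using `pd.Grouper` and no hold call. The data is synthetic and the column names are placeholders rather than the contents of the spreadsheet above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: 90 days of daily data (Jan-Mar 2023)
dates = pd.date_range("2023-01-01", periods=90, freq="D", name="Date")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"Money Out": rng.random(90) * 50, "Money In": rng.random(90) * 60},
    index=dates,
)
df["Balance"] = (df["Money In"] - df["Money Out"]).cumsum()

# pd.Grouper(freq="MS") groups by calendar month, labelled by the month start
monthly = df.groupby(pd.Grouper(freq="MS"))[["Money Out", "Money In"]].sum()

fig, ax = plt.subplots(figsize=(15, 10))
width = 10  # bar width in days on a date axis
# align="edge" with +width / -width puts the two bars side by side
ax.bar(monthly.index, monthly["Money Out"], width=width, align="edge",
       color="r", label="Money Out")
ax.bar(monthly.index, monthly["Money In"], width=-width, align="edge",
       color="g", label="Money In")
ax.plot(df.index, df["Balance"], color="black", lw=1, ls="-", label="Balance")
ax.legend(loc="best")
```

Because everything goes through matplotlib's own date handling, the bars and the line share the same x-axis units, which is the point of bypassing the pandas plotting wrappers here.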
I have run into a plotting issue I don't understand. The code below should draw a straight line, fill the area above the line with a colour, and plot several scattered dots in it. That all works, but if I combine scatter with either the line or fill_between, I cannot set the plot limits. The plot area is much larger than it needs to be.
So how do I set the plot limits?
from matplotlib import pyplot as plt
import numpy as np
x = np.linspace(0,160,100)
MCSample = np.random.normal(112,10,1000)
YSample = np.random.normal(100,2.41,1000)
y_limit = max(160, np.max(YSample))
fig, ax = plt.subplots(1, 1)
ax.plot(x,x, label="Limit State function")
ax.scatter(MCSample,YSample, marker='.', color='b', alpha=0.5)
ax.fill_between(x,y_limit,x, alpha=0.1, color='r')
ax.set_xlim=(0,160)
ax.set_ylim=(0,y_limit)
plt.show()
I'm using Python 3.5.1 and Matplotlib 1.5.1.
In your code you are assigning the tuple (0, 160) to ax.set_xlim instead of calling it.
All you have to do to make your code work is get rid of the equals signs, as shown below:
ax.set_xlim(0,160)
ax.set_ylim(0,y_limit) # no equals sign on these 2 lines
Now you are calling the methods, which applies those limits to the graph, rather than overwriting the methods with tuples.
I am trying to figure out a nice way to plot two distplots (from seaborn) on the same axis. It is not coming out as pretty as I want, since the histogram bars cover each other. I don't want to use countplot or barplot simply because they don't look as pretty. Naturally, if there is no other way I will do it that way, but distplot looks very good. As said, though, the bars currently cover each other (see pic).
Is there any way to fit the frequency bars of two distplots into one bin so that they do not overlap? Or to stack the counts on top of each other? Basically I want to do this in seaborn:
Any ideas to clean it up are most welcome. Thanks.
MWE:
sns.set_context("paper",font_scale=2)
sns.set_style("white")
rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(7,7),sharey=True)
sns.despine(left=True)
mats=dict()
mats[0]=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats[1]=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(max(set(mats[0])),max(set(mats[1])))
binsize = np.arange(0,N+1,1)
B=['Thing1','Thing2']
for i in range(len(B)):
    ax = sns.distplot(mats[i],
                      kde=False,
                      label=B[i],
                      bins=binsize)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
ax.legend()
plt.show()
As @mwaskom has said, seaborn wraps matplotlib plotting functions (for the most part) to deliver more complex and nicer-looking charts.
What you are looking for is "simple enough" to do with matplotlib directly:
sns.set_context("paper", font_scale=2)
sns.set_style("white")
plt.rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(4,4))
sns.despine(left=True)
# mats=dict()
mats0=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats1=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(mats0 + mats1)
# binsize = np.arange(0,N+1,1)
binsize = N
B=['Thing1','Thing2']
ax.hist([mats0, mats1], binsize, histtype='bar',
align='mid', label=B, alpha=0.4)#, rwidth=0.6)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
# ax.set_xlim(0,N+1)
ax.legend()
plt.show()
You can uncomment ax.set_xlim(0,N+1) to give more space around this histogram.
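A possible variation, not part of the code above: since the data are integers, you could pass explicit bin edges at the half-integers so that each pair of side-by-side bars is centred on its integer value. A minimal sketch, leaving out the seaborn styling:

```python
import matplotlib.pyplot as plt
import numpy as np

mats0 = [1, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3]
mats1 = [3, 3, 3, 3, 3, 4, 4, 4, 5, 6, 1, 1, 2, 3, 4, 5, 5, 5]
N = max(mats0 + mats1)

# Bin edges at 0.5, 1.5, ..., N + 0.5 so every integer value sits mid-bin
edges = np.arange(0.5, N + 1.5, 1.0)

fig, ax = plt.subplots(figsize=(4, 4))
# Passing a list of datasets makes hist() draw the bars side by side per bin
counts, _, _ = ax.hist([mats0, mats1], bins=edges, histtype="bar",
                       label=["Thing1", "Thing2"], alpha=0.4)
ax.set_xticks(range(1, N + 1))
ax.set_xlabel("My label")
ax.legend()
```

`counts` holds one row of per-bin counts for each dataset, which is handy if you want to annotate the bars.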
I'm trying to use Python and Matplotlib to plot a number of different data sets. I'm using twinx to have one data set plotted on the primary axis and another on the secondary axis. I would like to have two separate legends for these data sets.
In my current solution, the data from the secondary axis is being plotted over the top of the legend for the primary axis, while data from the primary axis is not being plotted over the secondary axis legend.
I have generated a simplified version based on the example here: http://matplotlib.org/users/legend_guide.html
Here is what I have so far:
import matplotlib.pyplot as plt
import pylab
fig, ax1 = plt.subplots()
fig.set_size_inches(18/1.5, 10/1.5)
ax2 = ax1.twinx()
ax1.plot([1,2,3], label="Line 1", linestyle='--')
ax2.plot([3,2,1], label="Line 2", linewidth=4)
ax1.legend(loc=2, borderaxespad=1.)
ax2.legend(loc=1, borderaxespad=1.)
pylab.savefig('test.png',bbox_inches='tight', dpi=300, facecolor='w', edgecolor='k')
With the result being the following plot:
As shown in the plot, the data from ax2 is being plotted over the ax1 legend and I would like the legend to be over the top of the data. What am I missing here?
Thanks for the help.
You could replace your legend setting lines with these:
ax1.legend(loc=1, borderaxespad=1.).set_zorder(2)
ax2.legend(loc=2, borderaxespad=1.).set_zorder(2)
And it should do the trick.
Note that the locations have changed to correspond to the lines, and that .set_zorder() is called on each legend after it is created.
The higher the zorder value, the higher the layer the artist is painted on.
The trick is to draw your first legend, remove it, and then redraw it on the second axis with add_artist():
legend_1 = ax1.legend(loc=2, borderaxespad=1.)
legend_1.remove()
ax2.legend(loc=1, borderaxespad=1.)
ax2.add_artist(legend_1)
Credit to @ImportanceOfBeingErnest:
https://github.com/matplotlib/matplotlib/issues/3706#issuecomment-378407795
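Put together with the plotting code from the question, a self-contained sketch of this approach could look like the following. The `legend_2` name is mine; the figure sizing and `savefig` call from the question are left out.

```python
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot([1, 2, 3], label="Line 1", linestyle="--")
ax2.plot([3, 2, 1], label="Line 2", linewidth=4)

# Build the first legend from ax1's handles, then detach it from ax1 ...
legend_1 = ax1.legend(loc=2, borderaxespad=1.0)
legend_1.remove()

# ... and re-attach it to ax2, which is drawn after ax1, so both legends
# end up on top of the data from both axes.
legend_2 = ax2.legend(loc=1, borderaxespad=1.0)
ax2.add_artist(legend_1)
```

After this, ax1 no longer owns a legend and both legends are children of ax2, so the zorder within ax2 decides what is drawn on top.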