I am trying to plot a bar chart in Python. Here is my code and an image of my bar chart. The problems I am facing are:
I want to write the name of each category on the x-axis as CAT1, CAT2, CAT3, CAT4. Right now it's printing 0, 1, 2 on the x-axis.
I want to change the purple color of the bar chart.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame([['CAT1',9,3,24,46,76], ['CAT2', 48,90,42,56,68], ['CAT3', 31,24,28,11,90],
['CAT4', 76,85,16,65,91]],
columns=['metric', 'A', 'B', 'C', 'D', 'E'])
df.plot(
kind='bar',
stacked=False
)
plt.legend(labels=['A', 'B', 'C', 'D', 'E'], ncol=4, loc='center', fontsize=15, bbox_to_anchor=(0.5, 1.06))
plt.show()
By default, the pandas plotting wrapper (which draws with matplotlib) uses the index of your DataFrame for the x-axis labels.
I suggest adding the following line to make the metric column the index, so that the labels are added automatically.
df = df.set_index('metric')
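Putting this together, here is a minimal sketch that also touches the color question. It assumes the color keyword of DataFrame.plot is given one color per column; the hex values are arbitrary placeholders of mine, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([['CAT1', 9, 3, 24, 46, 76], ['CAT2', 48, 90, 42, 56, 68],
                   ['CAT3', 31, 24, 28, 11, 90], ['CAT4', 76, 85, 16, 65, 91]],
                  columns=['metric', 'A', 'B', 'C', 'D', 'E'])

# Use the 'metric' column as the index so its values become the x tick labels.
df = df.set_index('metric')

# One color per column (A..E); these hex codes are arbitrary example values.
bar_colors = ['#1b9e77', '#d95f02', '#7570b3', '#e7298a', '#66a61e']
ax = df.plot(kind='bar', stacked=False, color=bar_colors)
ax.legend(ncol=5, loc='center', bbox_to_anchor=(0.5, 1.06))
```

With an interactive backend, finish with plt.show() as in the question.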
Here is my example, I can't get different bar colors defined.... for some reason all are red.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# initialize a dataframe with index and column names
idf = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6]), ('C', [10,
20, 30]), ('D', [14, 15, 16])], orient='index', columns=['x', 'y', 'z'])
# Plot the clustermap which will be a figure by itself
cax = sns.clustermap(idf, col_cluster=False, row_cluster=True)
# Get the column dendrogram axis
cax_col_dend_ax = cax.ax_col_dendrogram.axes
# Plot the bar plot on the column dendrogram axis
idf.iloc[0,:].plot(kind='bar', ax=cax_col_dend_ax, color = ['r', 'g', 'b'])
# Show the plot
plt.show()
Your code works fine for me, although I get a FutureWarning: from_items is deprecated. The warning comes from pandas rather than from Python itself, so the code was probably written for an older version and you might want to upgrade. Nevertheless, you can still change the colors as follows:
import matplotlib as mpl
# Your code here
ax1 = idf.iloc[0,:].plot.bar(ax=cax_col_dend_ax)
colors = ['r', 'g', 'b']
bars = [r for r in ax1.get_children() if isinstance(r, mpl.patches.Rectangle)]
for i, bar in enumerate(bars[0:3]):
    bar.set_color(colors[i])
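If your pandas version no longer provides from_items, a sketch along these lines may help. It assumes DataFrame.from_dict with orient='index' and columns as the replacement constructor, and it colors the bars through ax.patches on a plain axes. The seaborn clustermap part is left out only to keep the sketch self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question, built without the deprecated from_items.
idf = pd.DataFrame.from_dict(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [10, 20, 30], 'D': [14, 15, 16]},
    orient='index', columns=['x', 'y', 'z'])

fig, ax = plt.subplots()
idf.iloc[0, :].plot(kind='bar', ax=ax)

# On a fresh axes, ax.patches holds exactly the bar rectangles, in plotting order.
colors = ['r', 'g', 'b']
for bar, color in zip(ax.patches, colors):
    bar.set_color(color)
```

On an axes that already contains other rectangles, the isinstance filter from the answer above is the safer choice.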
I'm trying to create a graphic with three stacked bar charts, like so:
I actually have two questions:
1) I'm trying to use the 'position' parameter of the pandas DataFrame plot, but the bars still overlap. Is there an alternative other than reducing the width of the bars?
2) The three bars have three categories in common (B, C, D). How can I have a legend that contains only the six distinct categories?
My DataFrame is:
          A         B         C         D         E         F
0  0.108858  0.265929  0.537369  2.183963  1.353575  2.938775
1  0.375641  0.198720  0.266806  0.409179  0.286645  0.636405
2  1.179256  0.808986  0.171202  0.946194  0.506783  2.121366
3  1.510399  1.218619  0.307752  0.819865  1.283067  0.213556
And my test code is:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(abs(np.random.randn(4, 6)), columns=list('ABCDEF'))
print(df)
colors1 = ['#a50f15', '#9ecae1', '#6baed6', '#3182bd']
colors2 = ['#fb6a4a', '#9ecae1', '#6baed6', '#3182bd']
colors3 = ['#fcbba1', '#9ecae1', '#6baed6', '#3182bd']
fig, ax = plt.subplots(figsize=(8,6))
df.plot(ax=ax, y=['A', 'B', 'C', 'D'], kind='bar', stacked=True, width=0.15, color=colors1, position=0)
df.plot(ax=ax, y=['E', 'B', 'C', 'D'], kind='bar', stacked=True, width=0.15, color=colors2, position=0.5)
df.plot(ax=ax, y=['F', 'B', 'C', 'D'], kind='bar', stacked=True, width=0.15, color=colors3, position=1)
ax.legend(ncol=3)
plt.tight_layout()
plt.show()
I have created a bar plot using plotly. An xticklabel sits under each bar. Is it possible to shift the xticklabels a bit to the right or the left, or even to the middle between two ticks?
import plotly
import pandas as pd
from plotly.graph_objs import *
json_file = {'y': [0, 1, 2, 3, 1]}
df = pd.DataFrame(json_file, index=['a', 'b', 'c', 'd', 'e'])
trace1 = Bar(
x=df.index,
y=df['y'])
layout = Layout(
xaxis=XAxis(
ticks=df.index,
tickvals=df.index))
data = Data([trace1])
fig = Figure(data=data, layout=layout)
plotly.offline.plot(fig)
A part of the result is the following:
Is there a way to place b between the two bars?
I'm trying to plot a dataframe to a few subplots using pandas and matplotlib.pyplot. But I want to have the two columns use different y axes and have those shared between all subplots.
Currently my code is:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig = plt.figure(figsize=(6,4))
for i, l in enumerate(['A','B','C','D']):
    if i == 0:
        sub1 = fig.add_subplot(141+i)
    else:
        sub1 = fig.add_subplot(141+i, sharey=sub1)
    df.loc[l].plot(kind='bar', ax=sub1)
This produces:
This plots the 4 graphs side by side, which is what I want, but both columns use the same y-axis. I'd like the 'Count' column to use a common y-axis on the left and the 'Value' column to use a common secondary y-axis on the right.
Can anybody suggest a way to do this? My attempts thus far have led to each graph having its own independent y-axis.
To create a secondary y axis, you can use twinax = ax.twinx(). One can then join those twin axes via the join method of an axes Grouper, twinax.get_shared_y_axes().join(twinax1, twinax2). See this question for more details.
The next problem is then to get the two different bar plots next to each other. Since I don't think there is a way to do this using the pandas plotting wrappers, one can use a matplotlib bar plot, which allows specifying the bar position quantitatively. The positions of the left bars are then shifted by the bar width.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig, axes = plt.subplots(ncols=len(df.index.levels[0]), figsize=(6,4), sharey=True)
twinaxes = []
for i, l in enumerate(df.index.levels[0]):
    axes[i].bar(df["Count"].loc[l].index.values-0.4,df["Count"].loc[l], width=0.4, align="edge" )
    ax2 = axes[i].twinx()
    twinaxes.append(ax2)
    ax2.bar(df["Value"].loc[l].index.values,df["Value"].loc[l], width=0.4, align="edge", color="C3" )
    ax2.set_xticks(df["Value"].loc[l].index.values)
    ax2.set_xlabel("Rank")
[twinaxes[0].get_shared_y_axes().join(twinaxes[0], ax) for ax in twinaxes[1:]]
[ax.tick_params(labelright=False) for ax in twinaxes[:-1]]
axes[0].set_ylabel("Count")
axes[0].yaxis.label.set_color('C0')
axes[0].tick_params(axis='y', colors='C0')
twinaxes[-1].set_ylabel("Value")
twinaxes[-1].yaxis.label.set_color('C3')
twinaxes[-1].tick_params(axis='y', colors='C3')
twinaxes[0].relim()
twinaxes[0].autoscale_view()
plt.show()
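A note for newer Matplotlib releases: the join call on get_shared_y_axes() was deprecated in Matplotlib 3.6 and no longer works in later versions. The following is a sketch of the same idea using Axes.sharey instead, assuming Matplotlib 3.3 or later. The variable names areas, sub and ranks are my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'Area': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D'],
                   'Rank': [1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4],
                   'Count': [156, 65, 152, 70, 114, 110, 195, 92, 44, 179, 129, 76],
                   'Value': [630, 426, 312, 191, 374, 109, 194, 708, 236, 806, 168, 812]})
df = df.set_index(['Area', 'Rank'])

areas = df.index.levels[0]
fig, axes = plt.subplots(ncols=len(areas), figsize=(6, 4), sharey=True)
twinaxes = [ax.twinx() for ax in axes]

# Link every secondary axis to the first one (replaces the Grouper join).
for tw in twinaxes[1:]:
    tw.sharey(twinaxes[0])

for ax, tw, area in zip(axes, twinaxes, areas):
    sub = df.loc[area]          # rows of one area, indexed by Rank
    ranks = sub.index.values
    # Left bars are shifted by one bar width so the two series sit side by side.
    ax.bar(ranks - 0.4, sub["Count"], width=0.4, align="edge")
    tw.bar(ranks, sub["Value"], width=0.4, align="edge", color="C3")
    ax.set_xticks(ranks)
    ax.set_xlabel("Rank")

# Only the right-most secondary axis keeps its tick labels.
for tw in twinaxes[:-1]:
    tw.tick_params(labelright=False)

axes[0].set_ylabel("Count")
twinaxes[-1].set_ylabel("Value")
```

Because the secondary axes are linked before plotting, the relim() and autoscale_view() calls should not be needed here.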
I'm running Pandas 0.16.2 and Matplotlib 1.4.3. I have this issue coloring the median of the boxplot generated by the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
fig, ax = plt.subplots()
medianprops = dict(linestyle='-', linewidth=2, color='blue')
bp = df.boxplot(medianprops=medianprops)
plt.show()
That returns:
It appears that the color setting is not read. When I change only the linestyle and linewidth settings, the plot reacts correctly.
medianprops = dict(linestyle='-.', linewidth=5, color='blue')
Can anyone reproduce it?
Looking at the code for DataFrame.boxplot(), there is some special code to handle the colors of the different elements, and it supersedes the kwargs passed to matplotlib's boxplot. In theory, there seems to be a way to pass a color= argument containing a dictionary with the keys 'boxes', 'whiskers', 'medians', 'caps', but I can't seem to get it to work when calling boxplot() directly.
However, this seems to work:
df.plot(kind='box', color={'medians': 'blue'},
medianprops={'linestyle': '--', 'linewidth': 5})
see Pandas Boxplot Examples
Actually the following workaround works well, returning a dict from the boxplot command:
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
fig, ax = plt.subplots()
bp = df.boxplot(return_type='dict')
and then assign colors and linewidth directly to the medians with:
[[item.set_color('r') for item in bp[key]['medians']] for key in bp.keys()]
[[item.set_linewidth(0.8) for item in bp[key]['medians']] for key in bp.keys()]
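In the recent pandas versions I am aware of, when no by= grouping is used, return_type='dict' gives a single dict keyed by element name ('medians', 'boxes', and so on), so the extra bp[key] level of indexing would not apply there. A minimal sketch under that assumption, not tested against Pandas 0.16.2:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
fig, ax = plt.subplots()

# Without by=, the returned dict maps element names to lists of Line2D objects.
bp = df.boxplot(ax=ax, return_type='dict')

# Style the median lines directly, one Line2D per column.
for median in bp['medians']:
    median.set_color('blue')
    median.set_linewidth(2)
```

Plain loops are used instead of list comprehensions because the calls are made only for their side effects.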