matplotlib bar chart with data frame row names as legend

I am trying to set the legend of a bar plot from the values of a pandas DataFrame. I searched and could not find a solution; I did use another snippet from SO to annotate the bars. The generated plot shows the bars of the series in different colors, as I want, and even displays each bar's value. In Excel, for example, a legend can list the individual series values. I am trying to get that functionality here.
Here's a MWE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
import seaborn, itertools
seaborn.set()
def flip(items, ncol):
    return itertools.chain(*[items[i::ncol] for i in range(ncol)])
def annotateBars(row, ax=ax):
    if row['A'] < 0.2:
        color = 'black'
        vertalign = 'bottom'
        vertpad = 0.02
    else:
        color = 'white'
        vertalign = 'top'
        vertpad = -0.02
    ax.text(row.name, row['A'] + vertpad, "{:.4f}%".format(row['A']),
            zorder=10, rotation=90, color=color,
            horizontalalignment='center',
            verticalalignment=vertalign,
            fontsize=14, weight='heavy')
labels1=["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
width = 0.75
my_colors = 'gbkymc'
arr1 = np.random.random((1, 5))
arr1_ind = np.arange((arr1.shape[1]))
df_arr1 = pd.DataFrame(zip(*arr1), index = arr1_ind, columns = ['A'])
ax = df_arr1.plot(kind='bar', width = 0.85, alpha = 0.5, color = my_colors)
# plt.xticks(arr1_ind+width/4, arr1_ind)
ax.set_xticks(arr1_ind)
ax.set_xticklabels([labels1[i] for i in arr1_ind])
hndls, lbls = ax.get_legend_handles_labels()
plt.legend(flip(hndls, 2), flip(labels1, 2), loc='best', ncol=2)
junk = df_arr1.apply(annotateBars, ax=ax, axis=1)
plt.tick_params(
    axis='x',           # changes apply to the x-axis
    which='both',       # both major and minor ticks are affected
    bottom='off',       # ticks along the bottom edge are off
    top='off',          # ticks along the top edge are off
    labelbottom='off')  # labels along the bottom edge are off
plt.tight_layout()
plt.show()

It sounds like you're wanting the legend to have one item per color.
Right now, you're only creating a single artist (a single call to bar), so the legend will only have one entry.
As a quick example of doing something similar to what you want:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({
    'value': np.random.random(5),
    'label': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'color': ['g', 'b', 'k', 'y', 'm']})
fig, ax = plt.subplots()
# Plot each bar separately and give it a label.
for index, row in df.iterrows():
    ax.bar([index], [row['value']], color=row['color'], label=row['label'],
           alpha=0.5, align='center')
ax.legend(loc='best', frameon=False)
# More reasonable limits for a vertical bar plot...
ax.margins(0.05)
ax.set_ylim(bottom=0)
# Styling similar to your example...
ax.patch.set_facecolor('0.9')
ax.grid(color='white', linestyle='-')
ax.set(axisbelow=True, xticklabels=[])
plt.show()
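If you would rather keep a single bar call (such as the df.plot(kind='bar') in the question), another option is to hand the legend one proxy artist per bar. The following is only a minimal sketch of that idea, not the method used above: the data, colors, and variable names are made up for illustration, and the Agg backend is selected purely so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import numpy as np

labels = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
colors = ['g', 'b', 'k', 'y', 'm']
values = np.random.default_rng(0).random(len(labels))  # illustrative data

fig, ax = plt.subplots()
# A single call to bar creates one container, so on its own it gives
# the legend at most one entry.
ax.bar(np.arange(len(values)), values, color=colors, alpha=0.5)

# Build one proxy patch per bar. These patches are never added to the axes;
# they only tell the legend which color belongs to which label.
handles = [Patch(facecolor=c, alpha=0.5, label=l) for c, l in zip(colors, labels)]
legend = ax.legend(handles=handles, loc='best', ncol=2, frameon=False)

legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

The trade-off is that the legend is decoupled from the drawn bars, so the colors and labels have to be kept in sync by hand.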

Related

How to customize seaborn boxplot with specific color sequence when boxplots have hue

I want to make boxplots with hues but I want to color code it so that each specific X string is a certain color with the hue just being a lighter color. I am able to do a boxplot without a hue. When I incorporate the hue, I get the second boxplot which loses the colors. Can someone help me customize the colors for the figure that contains the hue?
Essentially, it's what the answer to this question does, but with boxplots.
This is my code:
first boxplot
order=['Ash1','E1A','FUS','p53']
colors=['gold','teal','darkorange','royalblue']
color_dict=dict(zip(order,colors))
fig,ax=plt.subplots(figsize=(25,15))
bp=sns.boxplot(data=df_idrs, x=df_idrs["construct"], y=df_idrs['Norm_Ef_IDR/Ef_GS'],ax=ax,palette=color_dict)
sns.stripplot(ax=ax,y='Norm_Ef_IDR/Ef_GS', x='construct', data=df_idrs,palette=color_dict,
jitter=1, marker='o', alpha=0.4,edgecolor='black',linewidth=1, dodge=True)
ax.axhline(y=1,linestyle="--",color='black',linewidth=2)
plt.legend(loc='upper left', bbox_to_anchor=(1.03, 1))
second boxplot
order=['Ash1','E1A','FUS','p53']
colors=['gold','teal','darkorange','royalblue']
color_dict=dict(zip(order,colors))
fig,ax=plt.subplots(figsize=(25,15))
bp=sns.boxplot(data=df_idrs, x=df_idrs["construct"], y=df_idrs['Norm_Ef_IDR/Ef_GS'],ax=ax, hue=df_idrs["location"])
sns.stripplot(y='Norm_Ef_IDR/Ef_GS', x='construct', data=df_idrs, hue=df_idrs["location"],
jitter=1, marker='o', alpha=0.4,edgecolor='black',linewidth=1, dodge=True)
ax.axhline(y=1,linestyle="--",color='black',linewidth=2)
plt.legend(loc='upper left', bbox_to_anchor=(1.03, 1))
The only change was replacing palette with hue. I have seen many examples on here, but I am unable to get them to work. Using the second code, I have tried the following:
Nothing happens for this one.
for ind, bp in enumerate(ax.findobj(PolyCollection)):
    rgb = to_rgb(colors[ind // 2])
    if ind % 2 != 0:
        rgb = 0.5 + 0.5 * np.array(rgb)  # make whiter
    bp.set_facecolor(rgb)
I get index out of range for the following one.
for i in range(0,4):
    mybox = bp.artists[i]
    mybox.set_facecolor(color_dict[order[i]])

Matplotlib stores the boxes in ax.patches, but there are also 2 dummy patches (used to construct the legend) that need to be filtered away. The dots of the stripplot are stored in ax.collections. There are also 2 dummy collections for the legend, but as those come at the end, they don't form a problem.
Some remarks:
sns.boxplot returns the subplot on which it was drawn; as it is called with ax=ax it will return that same ax
Setting jitter=1 in the stripplot will smear the dots over a width of 1. Since 1 is the distance between the x positions and the boxes are only 0.4 wide, the dots of neighboring boxes overlap. To avoid clutter, the code below uses jitter=0.4.
Here is some example code starting from dummy test data:
from matplotlib import pyplot as plt
from matplotlib.legend_handler import HandlerTuple
from matplotlib.patches import PathPatch
from matplotlib.colors import to_rgb
import seaborn as sns
import pandas as pd
import numpy as np
np.random.seed(20230215)
order = ['Ash1', 'E1A', 'FUS', 'p53']
colors = ['gold', 'teal', 'darkorange', 'royalblue']
hue_order = ['A', 'B']
df_idrs = pd.DataFrame({'construct': np.repeat(order, 200),
'Norm_Ef_IDR/Ef_GS': (np.random.normal(0.03, 1, 800).cumsum() + 10) / 15,
'location': np.tile(np.repeat(hue_order, 100), 4)})
fig, ax = plt.subplots(figsize=(12, 5))
sns.boxplot(data=df_idrs, x=df_idrs['construct'], y=df_idrs['Norm_Ef_IDR/Ef_GS'], hue='location',
            order=order, hue_order=hue_order, ax=ax)
box_colors = [f + (1 - f) * np.array(to_rgb(c))  # whiten colors depending on hue
              for c in colors for f in np.linspace(0, 0.5, len(hue_order))]
box_patches = [p for p in ax.patches if isinstance(p, PathPatch)]
for patch, color in zip(box_patches, box_colors):
    patch.set_facecolor(color)
sns.stripplot(y='Norm_Ef_IDR/Ef_GS', x='construct', data=df_idrs, hue=df_idrs['location'],
              jitter=0.4, marker='o', alpha=0.4, edgecolor='black', linewidth=1, dodge=True, ax=ax)
for collection, color in zip(ax.collections, box_colors):
    collection.set_facecolor(color)
ax.axhline(y=1, linestyle='--', color='black', linewidth=2)
handles = [tuple(box_patches[i::len(hue_order)]) for i in range(len(hue_order))]
ax.legend(handles=handles, labels=hue_order, title='hue category',
          handlelength=4, handler_map={tuple: HandlerTuple(ndivide=None, pad=0)},
          loc='upper left', bbox_to_anchor=(1.01, 1))
plt.tight_layout()
plt.show()
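To see what the box_colors expression produces without drawing anything, the whitening step can be run on its own. This is a small sketch reusing the colors and hue_order from the code above; the fractions name and the printing loop are additions for illustration only.

```python
import numpy as np
from matplotlib.colors import to_rgb

colors = ['gold', 'teal', 'darkorange', 'royalblue']
hue_order = ['A', 'B']

# f = 0 keeps the base color; f = 0.5 moves it halfway towards white (1, 1, 1).
fractions = np.linspace(0, 0.5, len(hue_order))
box_colors = [f + (1 - f) * np.array(to_rgb(c)) for c in colors for f in fractions]

# The list is grouped by base color, with the hue level varying fastest,
# so consecutive entries are the plain and the lightened shade of one color.
for i, rgb in enumerate(box_colors):
    base = colors[i // len(hue_order)]
    hue = hue_order[i % len(hue_order)]
    print(base, hue, np.round(rgb, 2))
```

Because every channel is mixed with 1, the result always stays inside the valid 0 to 1 range, whatever the base color is.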

Multiple label positions for same axis in Matplotlib

I have a long bar chart with lots of bars, and I want to make it easier to read from the axis labels to the bars.
Suppose I have the following graph:
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(1,-1,20)
x = np.arange(0,20)
labels = [f'Test {i}' for i in x]
fig, ax = plt.subplots(figsize=(12,8))
sns.barplot(y = y, x = x, ax=ax )
ax.set_xticklabels(labels, rotation=90)
which provides me the following:
All I know is how to change the label position globally across the chart. How can I center the x-axis at y=0 and set each label's position based on a condition (in this case, whether the bar is above or below 0)? What I want to achieve is:
Thanks in advance =)

You could remove the existing x-ticks and place texts manually:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
y = np.linspace(1,-1,20)
x = np.arange(0,20)
labels = [f'Test {i}' for i in x]
fig, ax = plt.subplots(figsize=(12,8))
sns.barplot(y = y, x = x, ax=ax )
ax.set_xticks([]) # remove existing ticks
for i, (label, height) in enumerate(zip(labels, y)):
    ax.text(i, 0, ' '+ label+' ', rotation=90, ha='center', va='top' if height>0 else 'bottom' )
ax.axhline(0, color='black')  # draw a new x-axis
for spine in ['top', 'right', 'bottom']:
    ax.spines[spine].set_visible(False)  # optionally hide spines
plt.show()
Here is another approach, reusing the imports and data from above; I'm not sure whether it is "more pythonic". The steps are:
move the existing xaxis to y=0
set the tick marks in both directions
put the ticks behind the bars
prepend some spaces to the labels to move them away from the axis
realign the tick labels depending on the bar value
fig, ax = plt.subplots(figsize=(12, 8))
sns.barplot(y=y, x=x, ax=ax)
ax.spines['bottom'].set_position('zero')
for spine in ['top', 'right']:
    ax.spines[spine].set_visible(False)
ax.set_xticklabels([' ' + label for label in labels], rotation=90)
for tick, height in zip(ax.get_xticklabels(), y):
    tick.set_va('top' if height > 0 else 'bottom')
ax.tick_params(axis='x', direction='inout')
ax.set_axisbelow(True) # ticks behind the bars
plt.show()

How to annotate a bar plot and add a custom legend

I am trying to draw a bar chart that looks like the one below, but I am not sure how to put a percentage value on top of each column or how to add a legend on the right side. My code snippet is below. It works, but the percentage values and the legend are missing.
import matplotlib.pyplot as plt; plt.rcdefaults()
import numpy as np
import matplotlib.pyplot as plt
objects = ('18-25', '26-30', '31-40', '40-50')
y_pos = np.arange(len(objects))
performance = [13, 18, 16, 3]
width = 0.35 # the width of the bars
plt.bar(y_pos, performance, align='center', alpha=0.5, color=('red', 'green', 'blue', 'yellow'))
plt.xticks(y_pos, objects)
plt.ylabel('%User', fontsize=16)
plt.title('Age of Respondents', fontsize=20)
width = 0.35
plt.show()

The legend colors were slightly different from the plot colors because of alpha=0.5, which has been removed.
imagecolorpicker.com was used to select the correct blue and green.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
color = ('red', '#00b050', '#00b0f0', 'yellow')
objects = ('18-25', '26-30', '31-40', '40-50')
y_pos = np.arange(len(objects))
performance = [13, 18, 16, 3]
width = 0.35 # the width of the bars
plt.bar(y_pos, performance, align='center', color=color)
plt.xticks(y_pos, objects)
plt.ylim(0, 20) # this adds a little space at the top of the plot, to compensate for the annotation
plt.ylabel('%User', fontsize=16)
plt.title('Age of Respondents', fontsize=20)
# map age groups to colors
cmap = dict(zip(objects, color))
# create the rectangles for the legend
patches = [Patch(color=v, label=k) for k, v in cmap.items()]
# add the legend
plt.legend(title='Number of Trips', handles=patches, bbox_to_anchor=(1.04, 0.5), loc='center left', borderaxespad=0, fontsize=15, frameon=False)
# add the annotations
for y, x in zip(performance, y_pos):
    plt.annotate(f'{y}%\n', xy=(x, y), ha='center', va='center')
plt.show()
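On Matplotlib 3.4 or newer, Axes.bar_label can place the value labels without an explicit loop. The following is a minimal sketch under that version assumption, reusing the data from the code above; the fmt string and the padding value are illustrative choices, and the Agg backend is set only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

objects = ('18-25', '26-30', '31-40', '40-50')
performance = [13, 18, 16, 3]
color = ('red', '#00b050', '#00b0f0', 'yellow')
x_pos = np.arange(len(objects))

fig, ax = plt.subplots()
bars = ax.bar(x_pos, performance, color=color)
ax.set_xticks(x_pos)
ax.set_xticklabels(objects)
ax.set_ylim(0, 20)  # leave room above the tallest bar for its label

# bar_label reads each bar's height and formats it with the fmt string;
# '%%' is a literal percent sign, padding is the gap in points above the bar.
texts = ax.bar_label(bars, fmt='%d%%', padding=3)
print([t.get_text() for t in texts])
```

The returned list holds one text object per bar, so font size or color can still be adjusted per label afterwards.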
Annotation Resources - from matplotlib v3.4.2
Adding value labels on a matplotlib bar chart
How to annotate each segment of a stacked bar chart
Stacked Bar Chart with Centered Labels
How to plot and annotate multiple data columns in a seaborn barplot
How to annotate a seaborn barplot with the aggregated value
stack bar plot in matplotlib and add label to each section
How to add multiple annotations to a barplot
How to plot and annotate a grouped bar chart
How to plot a horizontal stacked bar with annotations

How do I plot percentage labels for a horizontal bar graph in Python?

Can someone please help me plot x axis labels in percentages given the following code of my horizontal bar chart?
Finding it difficult to find as I want a more simplistic chart without x axis labels and ticks.
# Plot the figure size
plt.figure(figsize= (8,6))
# Plot the normalized value counts for the question as a horizontal bar chart.
ax1 = df[q1].value_counts(normalize=True).sort_values().plot(kind="barh", color='#fd6a02', width=0.75, zorder=2)
# Draw faint vertical grid lines and send them behind the bars
vals = ax1.get_xticks()
for tick in vals:
    ax1.axvline(x=tick, linestyle='dashed', alpha=0.4, color = '#d3d3d3', zorder=1)
# Totals to produce a composition ratio
total_percent = df[q1].value_counts(normalize=True) *100
# Remove borders
ax1.spines['right'].set_visible(False)
ax1.spines['top'].set_visible(False)
ax1.spines['left'].set_visible(False)
ax1.spines['bottom'].set_visible(False)
# Set the title of the graph inline with the Y axis labels.
ax1.set_title(q1, weight='bold', size=14, loc = 'left', pad=20, x = -0.16)
# ax.text(x,y,text,color)
for i,val in enumerate(total):
    ax1.text(val - 1.5, i, str("{:.2%}".format(total_percent), color="w", fontsize=10, zorder=3)
# Create axis labels
plt.xlabel("Ratio of Responses", labelpad=20, weight='bold', size=12)
Each time I get a EOF error. Can someone help?

This isn't based on your code; instead, I have adapted an example from the official Matplotlib documentation.
The labels are added by calling ax.text() once per bar inside a loop.
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
plt.rcdefaults()
fig, ax = plt.subplots()
# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
ax.barh(y_pos, performance, align='center')
ax.set_yticks(y_pos)
ax.set_yticklabels(people)
ax.invert_yaxis() # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')
# Totals to produce a composition ratio
total = sum(performance)
# ax.text(x,y,text,color)
for i,val in enumerate(performance):
    ax.text(val - 1.5, i, str("{:.2%}".format(val/total)), color="w", fontsize=10)
plt.show()
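The same loop carries over to a value_counts(normalize=True) series like the one in the question. Note that the ax1.text(...) line in the question never closes its str( call, which is the likely cause of the EOF error. The sketch below is an illustration under assumptions rather than a drop-in fix: a made-up answers series stands in for df[q1], the label offsets are arbitrary, and the Agg backend is set only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical survey answers standing in for df[q1].
answers = pd.Series(['Yes', 'No', 'Yes', 'Maybe', 'Yes', 'No', 'Yes', 'Yes'])
ratios = answers.value_counts(normalize=True).sort_values()

fig, ax = plt.subplots(figsize=(8, 6))
ratios.plot(kind='barh', color='#fd6a02', width=0.75, ax=ax)

# Bar i is drawn at y = i and its length equals the ratio, so the label
# is placed at the end of each bar and formatted as a percentage.
percent_labels = []
for i, ratio in enumerate(ratios):
    label = f"{ratio:.2%}"
    percent_labels.append(label)
    ax.text(ratio, i, ' ' + label, va='center', ha='left', fontsize=10)

ax.xaxis.set_visible(False)  # hide x ticks and tick labels
for side in ('top', 'right', 'bottom', 'left'):
    ax.spines[side].set_visible(False)
print(percent_labels)
```

Since the ratios are already fractions of 1, the .2% format does the multiplication by 100, and no separate total_percent series is needed.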

Python - dual y axis chart, align zero

I'm trying to create a horizontal bar chart, with dual x axes. The 2 axes are very different in scale, 1 set goes from something like -5 to 15 (positive and negative value), the other set is more like 100 to 500 (all positive values).
When I plot this, I'd like to align the 2 axes so zero shows at the same position, and only the negative values are to the left of this. Currently the set with all positive values starts at the far left, and the set with positive and negative starts in the middle of the overall plot.
I found the align_yaxis example, but I'm struggling to align the x axes.
Matplotlib bar charts: Aligning two different y axes to zero
Here is an example of what I'm working on with simple test data. Any ideas/suggestions? thanks
import pandas as pd
import matplotlib.pyplot as plt
d = {'col1':['Test 1','Test 2','Test 3','Test 4'],'col 2':[1.4,-3,1.3,5],'Col3':[900,750,878,920]}
df = pd.DataFrame(data=d)
fig = plt.figure() # Create matplotlib figure
ax = fig.add_subplot(111) # Create matplotlib axes
ax2 = ax.twiny() # Create another axes that shares the same y-axis as ax.
width = 0.4
df['col 2'].plot(kind='barh', color='darkblue', ax=ax, width=width, position=1,fontsize =4, figsize=(3.0, 5.0))
df['Col3'].plot(kind='barh', color='orange', ax=ax2, width=width, position=0, fontsize =4, figsize=(3.0, 5.0))
ax.set_yticklabels(df.col1)
ax.set_xlabel('Positive and Neg',color='darkblue')
ax2.set_xlabel('Positive Only',color='orange')
ax.invert_yaxis()
plt.show()

I followed the link from the question and eventually ended up at this answer: https://stackoverflow.com/a/10482477/5907969
The answer has a function to align the y-axes and I have modified the same to align x-axes as follows:
def align_xaxis(ax1, v1, ax2, v2):
    """adjust ax2 xlimit so that v2 in ax2 is aligned to v1 in ax1"""
    x1, _ = ax1.transData.transform((v1, 0))
    x2, _ = ax2.transData.transform((v2, 0))
    inv = ax2.transData.inverted()
    dx, _ = inv.transform((0, 0)) - inv.transform((x1-x2, 0))
    minx, maxx = ax2.get_xlim()
    ax2.set_xlim(minx+dx, maxx+dx)
And then use it within the code as follows:
import pandas as pd
import matplotlib.pyplot as plt
d = {'col1':['Test 1','Test 2','Test 3','Test 4'],'col 2':[1.4,-3,1.3,5],'Col3':[900,750,878,920]}
df = pd.DataFrame(data=d)
fig = plt.figure() # Create matplotlib figure
ax = fig.add_subplot(111) # Create matplotlib axes
ax2 = ax.twiny() # Create another axes that shares the same y-axis as ax.
width = 0.4
df['col 2'].plot(kind='barh', color='darkblue', ax=ax, width=width, position=1,fontsize =4, figsize=(3.0, 5.0))
df['Col3'].plot(kind='barh', color='orange', ax=ax2, width=width, position=0, fontsize =4, figsize=(3.0, 5.0))
ax.set_yticklabels(df.col1)
ax.set_xlabel('Positive and Neg',color='darkblue')
ax2.set_xlabel('Positive Only',color='orange')
align_xaxis(ax,0,ax2,0)
ax.invert_yaxis()
plt.show()
This will give you what you're looking for
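One way to check that the two zeros really coincide is to compare their display (pixel) coordinates before and after the call. The sketch below does that with plain Matplotlib bars instead of pandas; the zero_pixel helper and the bar offsets are made up for illustration, the Agg backend is set only so it runs without a display, and the figure is drawn once first so that the autoscaled limits are settled before anything is measured.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def align_xaxis(ax1, v1, ax2, v2):
    """Shift ax2's x-limits so that v2 on ax2 lines up with v1 on ax1."""
    x1, _ = ax1.transData.transform((v1, 0))
    x2, _ = ax2.transData.transform((v2, 0))
    inv = ax2.transData.inverted()
    dx, _ = inv.transform((0, 0)) - inv.transform((x1 - x2, 0))
    minx, maxx = ax2.get_xlim()
    ax2.set_xlim(minx + dx, maxx + dx)

def zero_pixel(axes):
    """Hypothetical helper: horizontal display coordinate of x = 0."""
    return axes.transData.transform((0, 0))[0]

fig, ax = plt.subplots()
ax2 = ax.twiny()  # second x-axis sharing the same y-axis
y = np.arange(4)
ax.barh(y - 0.2, [1.4, -3, 1.3, 5], height=0.4, color='darkblue')
ax2.barh(y + 0.2, [900, 750, 878, 920], height=0.4, color='orange')

fig.canvas.draw()  # settle the autoscaled limits before measuring
before = abs(zero_pixel(ax) - zero_pixel(ax2))
align_xaxis(ax, 0, ax2, 0)
after = abs(zero_pixel(ax) - zero_pixel(ax2))
print(f"zero offset before: {before:.1f} px, after: {after:.1f} px")
```

Note that the function only shifts the limits of the second axis; its range keeps the same width, so part of the orange bars can end up outside the view if the shift is large.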
