I have two lists: one holds names, the other values. I want the values on the y axis and the names on the x axis. The names are too long to fit on the axis, so I want to put them inside the bars, as in the picture, except that the bars should be vertical.
In the picture, the city names play the role of my nameslist.
My input is:
mylist=[289.657,461.509,456.257]
nameslist=['Bacillus subtilis','Caenorhabditis elegans','Arabidopsis thaliana']
My code:
import matplotlib.pyplot as plt
fig = plt.figure()
width = 0.35
ax = fig.add_axes([1,1,1,1])
ax.bar(nameslist,mylist,width)
ax.set_ylabel('Average protein length')
ax.set_xlabel('Names')
ax.set_title('Average protein length by bacteria')
Any help appreciated!
ax.text places text at a given x and y position. To fit inside a vertical bar, the text should be rotated by 90 degrees. The text can either start at the top of the bar or be anchored at the bottom, with the vertical alignment set to 'top' or 'bottom' respectively. Choose a font size that fits the image and a text color that contrasts sufficiently with the bar color. A leading space in the string adds some padding.
Alternatively, there is ax.annotate with more options for positioning and decorating.
from matplotlib import pyplot as plt
import numpy as np
mylist = [289.657, 461.509, 456.257]
nameslist = ['Bacillus subtilis', 'Caenorhabditis elegans', 'Arabidopsis thaliana']
fig, ax = plt.subplots()
width = 0.35
ax.bar(nameslist, mylist, width, color='darkorchid')
for i, (name, height) in enumerate(zip(nameslist, mylist)):
    ax.text(i, height, ' ' + name, color='seashell',
            ha='center', va='top', rotation=-90, fontsize=18)
ax.set_ylabel('Average protein length')
ax.set_title('Average protein length by bacteria')
ax.set_xticks([]) # remove the xticks, as the labels are now inside the bars
plt.show()
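The ax.annotate alternative mentioned above could look roughly like the sketch below. It anchors each name at the bottom of its bar and uses an offset in points for the padding instead of a leading space. The Agg backend, the 5-point offset and the font size are assumptions chosen for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

mylist = [289.657, 461.509, 456.257]
nameslist = ['Bacillus subtilis', 'Caenorhabditis elegans', 'Arabidopsis thaliana']

fig, ax = plt.subplots()
bars = ax.bar(range(len(mylist)), mylist, 0.35, color='darkorchid')

labels = []
for bar, name in zip(bars, nameslist):
    x_center = bar.get_x() + bar.get_width() / 2
    # Anchor at the bottom of the bar (y=0) and shift the text 5 points upward,
    # so the padding does not depend on the data scale.
    labels.append(ax.annotate(name, xy=(x_center, 0), xytext=(0, 5),
                              textcoords='offset points', rotation=90,
                              ha='center', va='bottom',
                              color='seashell', fontsize=14))

ax.set_ylabel('Average protein length')
ax.set_xticks([])  # the names are inside the bars, so no tick labels are needed
fig.canvas.draw()
```

With rotation=90 and va='bottom' the text reads upward from the base of the bar; the rotation=-90 and va='top' combination from the answer reads downward from the top instead.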
My plot function creates horizontal bars per year for data with different size. I have to change the figure size for each set of subplots.
I need to place my two legends on lower center of each figure below the x axis label. The positions need to vary depending on the figure size and remain consistent. So for all produced figures, the legends would look like this figure.
Find a snippet of my dataframe here. I have tried to simplify the code as much as I could and I know the plot is missing some element, but I just want to get to my question's answer, not to create a perfect plot here. I understand probably I need to create a variable for my anchor bounding box but I don't know how. Here is my code:
def plot_bars(data, ax):
    """ Plots a single chart of work plan for a specific routeid
    data: dataframe with section length and year
    Returns: None"""
    # use the function argument, not the global df
    ax.barh(data['year'], data['sec_len'], left=data['sec_begin'])
    ax.set_yticklabels('')
def plot_fig(df):
    # Draw the plots
    ax_set = df[['routeid','num_bars']].drop_duplicates('routeid')
    route_set = ax_set['routeid'].values
    h_ratios = ax_set['num_bars'].values
    len_ratio = h_ratios.sum()/BARS_PER_PAGE # Global constant set to 40 based on experience
    fig, axes = plt.subplots(len(route_set), 1, squeeze=False, sharex=True
                             , gridspec_kw={'height_ratios':h_ratios}
                             , figsize=(10.25,7.5*len_ratio))
    for i, r in enumerate(route_set):
        plot_bars(df[df['routeid']==r], axes[i,0])
    plt.xlabel('Section length')
    ## legends
    fig.legend(labels=['Legend2'], loc=8, bbox_to_anchor=(0.5, -0.45))
    fig.legend(labels=['Legend1'], loc=8, bbox_to_anchor=(0.5, -0.3))
    ## Title
    fig.suptitle('title', fontsize=16, y=1)
    fig.subplots_adjust(hspace=0, top=1-0.03/len_ratio)
for df in df_list:
    plot_fig(df)
The problem is when the figure size changes, the legends move as in these pictures:
I think the problem boils down to getting the correct position relative to the xlabel, so you are right that you need to calculate bbox_to_anchor from the position of the xlabel and the height/width of the axes. Something like this:
import matplotlib.pyplot as plt
fig, (ax, ax1) = plt.subplots(nrows=2, figsize=(5, 4), gridspec_kw={'height_ratios':[4, 1]})
ax.plot(range(10), range(10), label="myLabel")
ax.set_xlabel("xlabel")
x, y = ax.xaxis.get_label().get_position() # position of xlabel
h, w = ax.bbox.height, ax.bbox.width # height and width of the Axes
leg_pos = [x + 0 / w, y - 55 / h] # this needs to be adjusted according to your needs
fig.legend(loc="lower center", bbox_to_anchor=leg_pos, bbox_transform=ax.transAxes)
plt.show()
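The idea above can also be wrapped in a small helper that measures the xlabel after a draw and converts a fixed offset in points into figure coordinates, so the gap stays the same for every figure size. This is only a sketch: the helper name legend_below_xlabel, the pad_pts parameter and the Agg backend are assumptions introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def legend_below_xlabel(fig, ax, pad_pts=6, **legend_kw):
    """Hypothetical helper: put a figure legend pad_pts points below the xlabel of ax."""
    fig.canvas.draw()  # text extents are only reliable after a draw
    renderer = fig.canvas.get_renderer()
    label_box = ax.xaxis.get_label().get_window_extent(renderer)  # in pixels
    pad_px = pad_pts * fig.dpi / 72  # points -> pixels
    # Convert the pixel position just below the label into figure fractions.
    _, y_fig = fig.transFigure.inverted().transform((0, label_box.y0 - pad_px))
    return fig.legend(loc="upper center", bbox_to_anchor=(0.5, y_fig),
                      borderaxespad=0, **legend_kw)

gaps = []  # pixel gap between xlabel and legend for each figure size
for figsize in [(5, 4), (5, 8)]:
    fig, ax = plt.subplots(figsize=figsize)
    fig.subplots_adjust(bottom=0.3)  # leave room for the legend inside the figure
    ax.plot(range(10), range(10), label="myLabel")
    ax.set_xlabel("xlabel")
    leg = legend_below_xlabel(fig, ax)
    fig.canvas.draw()
    renderer = fig.canvas.get_renderer()
    label_y0 = ax.xaxis.get_label().get_window_extent(renderer).y0
    legend_y1 = leg.get_window_extent(renderer).y1
    gaps.append(label_y0 - legend_y1)
```

For a second legend, the same helper could be called with a larger pad_pts so that the two legends stack below each other.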
I am trying to create a stacked bar chart in PyCharm.
I am using matplotlib to explore its full potential for simple data visualization.
My original code produces a grouped bar chart that displays cycle time for different teams. The data comes from a dataframe. The chart also includes an autolabel function that prints the height of each bar (a continuous variable).
I am trying to convert this grouped bar chart into a stacked bar chart. The code below needs to be improved in two respects:
Labels have too many decimals: this did not occur for the grouped bar chart, and neither the CSV file nor the derived dataframe was altered. I am struggling to understand if and where to use round. I guess the issue is related to the autolabel function, because the data type is float (I need to see at least 1 decimal).
Data labels are displaced: the autolabel function was written for separate bars, where the labels always sat at the distance I wanted (based on the vertical offset). I have not figured out how to center the labels inside each segment instead (in my example, the value for funnel time is at the height of squad time, and vice versa). Presumably the issue is that the starting height of each segment is defined beforehand (see the bottom value of rects3 in the code), but I don't know how to reflect this in my autolabel function.
The question is: what exactly in the code must be altered to have the cycle time values centered?
The code (my notes to you are the plain sentences between the code lines):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
'''PART 1 - Preprocess data -----------------------------------------------'''
#Directory or link of my CSV. This can be used also if you want to use API.
csv1 = r"C:\Users\AndreaPaviglianiti\Downloads\CT_Plot_DF.csv"
#Create and read dataframe. This is just to check the DF before plotting
df = pd.read_csv(csv1, sep=',', engine= 'python')
print(df, '\n')
#Extract columns as lists
squads = df['Squad_Name'].astype('str') #for our horizontal axis
funnel = df['Funnel_Time'].astype('float')
squadt = df['Squad_Time'].astype('float')
wait = df['Waiting_Time'].astype('float')
Here I tried to set the rounding but without success
'''PART 2 - Create the Bar Plot / Chart ----------------------------------'''
x = np.arange(len(squads)) #our labels on x will be the squads' names
width = 0.2 # the width of the bars. The bigger value, the larger bars
distance = 0.2
#Create objects that will be used as subplots (fig and ax).
#Each "rects" is the visualization of a yn value. The first width is the distance between X values,
# the second is the real width of bars.
fig, ax = plt.subplots()
rects1 = ax.bar(x, funnel, width, color='red', label='Funnel Time')
rects2 = ax.bar(x, squadt, width, color='green', bottom=funnel, label='Squad Time')
rects3 = ax.bar(x, wait, width, bottom=funnel+squadt, color='purple', label='Waiting Time')
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Mean Cycle Time (h)')
ax.set_xlabel('\n Squads')
ax.set_title("Squad's Cycle Time Comparison in Dec-2020 \n (in mean Hours)")
ax.set_xticks(x)
ax.set_xticklabels(squads)
ax.legend()
fig.align_xlabels() #align labels to columns
# The function to display values above the bars
def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width()/2, height),
                    xytext=(0, 3), # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
Here I tried setting xytext="center" but got an error. Am I supposed to use coordinates only, or is there another way to move the label from the top of the bar to its center?
#We will label only the most recent information. To label both add to the code "autolabel(rects1)"
autolabel(rects1)
autolabel(rects2)
autolabel(rects3)
fig.tight_layout()
'''PART 3 - Execute -------------------------------------------------------'''
plt.show()
Thank you for the help!
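One possible way to address both points, sketched under assumptions: each stacked segment starts at rect.get_y() (its bottom value), so the middle of the segment is rect.get_y() + rect.get_height()/2, and a format string such as '{:.1f}' limits the label to one decimal. The data values and the name autolabel_center below are made up for illustration; they stand in for the CSV columns and the autolabel function in the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the Squad_Name, Funnel_Time, Squad_Time and Waiting_Time columns
squads = ["A", "B", "C"]
funnel = np.array([10.123, 12.456, 8.789])
squadt = np.array([20.5, 18.25, 22.75])
wait = np.array([5.0, 7.333, 6.667])

x = np.arange(len(squads))
width = 0.2
fig, ax = plt.subplots()
rects1 = ax.bar(x, funnel, width, color='red', label='Funnel Time')
rects2 = ax.bar(x, squadt, width, bottom=funnel, color='green', label='Squad Time')
rects3 = ax.bar(x, wait, width, bottom=funnel + squadt, color='purple', label='Waiting Time')

def autolabel_center(ax, rects):
    """Write each segment's height, rounded to one decimal, at the segment's center."""
    texts = []
    for rect in rects:
        height = rect.get_height()
        y_mid = rect.get_y() + height / 2  # get_y() is the bottom of the stacked segment
        texts.append(ax.annotate('{:.1f}'.format(height),
                                 xy=(rect.get_x() + rect.get_width() / 2, y_mid),
                                 ha='center', va='center'))
    return texts

labels = [autolabel_center(ax, rects) for rects in (rects1, rects2, rects3)]
ax.set_xticks(x)
ax.set_xticklabels(squads)
ax.legend()
fig.canvas.draw()
```

In matplotlib 3.4 and later, ax.bar_label(rects, label_type='center', fmt='%.1f') should give a similar result with less code.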
I am trying to display a count plot using seaborn, but the width of the bars is very high and the plot doesn't look nice. To counter it I change the width of the plot using the following code snippet:
sns.set()
fig,ax = plt.subplots(figsize=(10,4))
sns.countplot(x=imdb_data["label"],ax=ax)
for patch in ax.patches:
    height = patch.get_height()
    width = patch.get_width()
    new_height = height*0.8
    new_width = width*0.4
    patch.set_height(new_height)
    patch.set_width(new_width)
    x = patch.get_x()
    ax.text(x = x+new_width/2.,y= new_height+4,s = height,ha="center")
ax.legend(labels=("Negative","Positive"),loc='lower right')
plt.show()
But after doing so, the x-tick labels are shifted, and the plot looks like the attached image.
How should I change the width so that the x-tick locations also change automatically to match the new bar width? Also, the legend is not displayed properly. I used the snippet below to add the legend:
plt.legend(labels=['Positive','Negative'],loc='lower right')
Please help me out.
To keep the bar centered, you also need to shift its x position by half the difference between the old and the new width. Changing the height doesn't seem to be a good idea, as the bars then no longer match the values on the y-axis. If the main reason to change the height is to make space for the text, it would be easier to change the y limits, e.g. via ax.margins(). Aligning the text vertically with 'bottom' makes the arbitrary offset for the y position unnecessary.
The labels for the legend can be set via looping through the patches and setting the labels one by one. As the x-axis already has different positions for each bar, it might be better to leave out the legend and change the x tick labels?
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
sns.set()
imdb_data = pd.DataFrame({"label": np.random.randint(0, 2, 7500)})
fig, ax = plt.subplots(figsize=(10, 4))
sns.countplot(x=imdb_data["label"], ax=ax)
for patch, label in zip(ax.patches, ["Negative", "Positive"]):
    height = patch.get_height()
    width = patch.get_width()
    new_width = width * 0.4
    patch.set_width(new_width)
    patch.set_label(label)
    x = patch.get_x()
    patch.set_x(x + (width - new_width) / 2)
    ax.text(x=x + width/2, y=height, s=height, ha='center', va='bottom')
ax.legend(loc='lower right')
ax.margins(y=0.1)
plt.tight_layout()
plt.show()
PS: To change the x tick labels, so they can be used instead of the legend, add
ax.set_xticklabels(['negative', 'positive'])
and leave out the ax.legend() and patch.set_label(label) lines.
I have been having trouble finding a way to display a 3-element list in the form of a table. What I actually care about is drawing the table. I would like to draw a 1-by-3 table for each ylabel in a plot.
Below is what I have so far. If I can get each Table instance to show up, I will have what I want. Right now a reference to a table appears and I'm not sure why. If you look at the center left, where the references appear, you can see one 1-by-3 table.
Is it possible using matplotlib to generate a new table for each ylabel? The table info is directly related to each row in the bar graph, so it's important that I have a way that they line up.
The number of rows in the bar graph is dynamic, so creating 1 table for the whole figure and trying to dynamically line up the rows with the corresponding bar graph is a difficult problem.
# initialize figure
fig = plt.figure()
gs = gridspec.GridSpec(1, 2, width_ratios=[2, 1])
fig.set_size_inches(18.5, 10.5)
ax = fig.add_subplot(gs[0])
#excluded bar graph code
# create ylabels
for row in range(1,len(data)):
    ylabel = [str(data[row][0]),str(data[row][1]),str(data[row][2])]
    ylabels.append(ylabel)
#attempting to put each ylabel in a 1by3 table to display
pos = np.arange(0.5,10.5,0.5)
axTables = [None] * len(ylabels)
for x in range(0,len(ylabels)):
    axTables[x] = fig.add_subplot(gs[0])
    ylabels[x] = axTables[x].table(cellText=[ylabels[x]], loc='left')
locsy, labelsy = plt.yticks(pos,ylabels)
First, yticks expects text as input; it cannot handle other objects. Second, a table needs to sit within an axes.
So to get a table at the position of a tick label, the idea is to create an axes at the position of each y tick label. One option is mpl_toolkits.axes_grid1.inset_locator.inset_axes. The difficulty is that this axes needs to be positioned in data coordinates along the y axis, and in figure (or pixel) coordinates in the horizontal direction. A blended transform can be used for this. inset_axes accepts the width and height either as absolute measures (in inches) or as relative ones, which is handy because we can set the width of the axes to 100% of the bounding box while the height remains an absolute value (we don't want the axes height to depend on the data coordinates!).
In the following, a function ax_at_posy creates such an axes.
The table then sits tight inside the axes, such that all columns have the same width. One still needs to make sure the same font size is used throughout.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
import matplotlib.transforms as mtrans
# General helper function to create an axes at the position of a yticklabel
def ax_at_posy(y, ax=None, width=0.3, leftspace=0.08, height=0.2):
    ax = ax or plt.gca()
    trans = mtrans.blended_transform_factory(ax.figure.transFigure, ax.transData)
    axins = inset_axes(ax, "100%", height,
                       bbox_to_anchor=(leftspace, y, width-leftspace, 0.05),
                       bbox_transform=trans, loc="center right", borderpad=0.8)
    axins.tick_params(left=False, bottom=False, labelleft=False, labelbottom=False)
    axins.axis("off")
    return axins
fig, ax = plt.subplots()
fig.subplots_adjust(left=0.4)
ax.scatter(np.random.rand(30), np.random.randint(7, size=30), c=np.random.rand(30))
get_data = lambda i: "".join(np.random.choice(list("abcdefgxyzt0"), size=i+2))
data = np.vectorize(get_data)(np.random.randint(2,6,size=(7,3)))
for i, row in enumerate(data):
    axi = ax_at_posy(i, ax=ax, width=0.4)
    tab = axi.table(cellText=[list(row)], loc='center', bbox=(0,0,1,1))
    tab.auto_set_font_size(False)
    tab.set_fontsize(9)
    plt.setp(tab.get_celld().values(), linewidth=0.72)
plt.show()
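To see in isolation what the blended transform used in ax_at_posy does, here is a minimal sketch. It places plain text (instead of tables) at a fixed figure fraction horizontally and at data positions vertically, and then compares the blended result with the two underlying transforms. The Agg backend and the "row" labels are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.transforms as mtrans

fig, ax = plt.subplots()
fig.subplots_adjust(left=0.4)
ax.set_ylim(-0.5, 6.5)

# x is interpreted as a figure fraction, y as a data coordinate
trans = mtrans.blended_transform_factory(fig.transFigure, ax.transData)

for y in range(7):
    # each label sits at 5% of the figure width, level with y in data units
    ax.text(0.05, y, "row {}".format(y), transform=trans, va="center")

fig.canvas.draw()
# The blended transform takes its x from transFigure and its y from transData
px_blend = trans.transform((0.05, 3))
px_x = fig.transFigure.transform((0.05, 0))[0]
px_y = ax.transData.transform((0, 3))[1]
```

Because the y coordinate follows the data, the labels stay aligned with their rows when the y limits change, while their horizontal position stays fixed relative to the figure.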
I have for instance the following line drawn in matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(2,1,1) # two rows, one column, first plot
# This should be a straight line which spans the y axis
# from 0 to 50
line, = ax.plot([0]*50, range(50), color='blue', lw=2)
line2, = ax.plot([10]*100, range(100), color='blue', lw=2)
How can I find out how many pixels tall that straight line is in the y direction?
Note: I have several of these lines with gaps in between and I would like to put text next to them, however, if there are too many lines, I would need to know how much text I can add, that is the reason why I need the height of the line.
For instance in the attached photo, there is a blue line on the right hand side which is roughly 160 pixels in height. In a height of 160 pixels (with the font I am using) I can fit in roughly 8 lines of text as the height of the text is roughly 12 pixels in height.
How can I get the information on how tall the line is in pixels? Or is there a better way to lay the text out?
To obtain the height of a line in pixels, you can use its bounding box. To make sure the bounding box corresponds to the line as drawn on the canvas, you first need to draw the canvas. The bounding box is then obtained via line2.get_window_extent(). The difference between the upper end of the bounding box (y1) and the lower end (y0) is the number of pixels you are looking for.
fig.canvas.draw()
bbox = line2.get_window_extent(fig.canvas.get_renderer())
# at this point you can get the line height:
print("Lineheight in pixels: ", bbox.y1 - bbox.y0)
To draw text within the y-extent of the line, the following may be useful. Given a font size in points, e.g. fontsize = 12, you can calculate its size in pixels and from that the number of text lines that fit into the pixel range determined above. A blended transform, where x is in data units and y is in pixels, lets you specify the x coordinate in data units (here x=8) and the y coordinate in pixels, calculated from the extent of the line.
import matplotlib.pyplot as plt
import matplotlib.transforms as transforms
fig = plt.figure()
ax = fig.add_subplot(2,1,1)
line, = ax.plot([0]*50, range(50), color='blue', lw=2)
line2, = ax.plot([10]*100, range(100), color='blue', lw=2)
fig.canvas.draw()
bbox = line2.get_window_extent(fig.canvas.get_renderer())
# at this point you can get the line height:
print("Lineheight in pixels: ", bbox.y1 - bbox.y0)
#To draw text
fontsize=12 #pt
# fontsize in pixels:
fs_pixels = fontsize*fig.dpi/72.
#number of possible texts to draw:
n = (bbox.y1 - bbox.y0)/fs_pixels
# create transformation where x is in data units and y in pixels
trans = transforms.blended_transform_factory(ax.transData, transforms.IdentityTransform())
for i in range(int(n)):
    ax.text(8.,bbox.y0+(i+1)*fs_pixels, "Text", va="top", transform=trans)
plt.show()