Apologies, I am rather unskilled with programming and with Stack Overflow too. I am drawing bar plots of some data and have managed to add percentages beside the bars using ax.annotate. However, for the bar with the most responses, part of the percentage label always ends up outside the figure box, as in the image below. I have tried different ideas but none of them fixed this. I am looking for suggestions on how to fix it.
Here is my code
from matplotlib import pyplot as plt
import seaborn as sns
def plot_barplot(df):
    plt.rcParams.update({'font.size': 18})
    sns.set(font_scale=2)
    if (len(df) > 1):
        fig = plt.figure(figsize=(12,10))
        ax = sns.barplot(x='count', y=df.columns[0], data=df, color='blue')
    else:
        fig = plt.figure(figsize=(5,7))
        ax = sns.barplot(x=df.columns[0], y='count', data=df, color='blue')
    fig.set_tight_layout(True)
    plt.rcParams.update({'font.size': 14})
    total = df['count'].sum()
    for p in ax.patches:
        percentage ='{:.2f}%'.format(100 * p.get_width()/total)
        print(percentage)
        x = p.get_x() + p.get_width() + 0.02
        y = p.get_y() + p.get_height()/2
        ax.annotate(percentage, (x, y))
Dataframe looks like this
I would suggest increasing the axes' margins (in the x direction in this case). The margin is the space between the maximum of your data and the end of the axis scale. You will have to play around with the value depending on your needs, but a value of 0.1 or 0.2 looks like it should be enough.
add:
plt.rcParams.update({'axes.xmargin': 0.2})
to the top of your function
full code:
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
def plot_barplot(df):
    plt.rcParams.update({'font.size': 18})
    plt.rcParams.update({'axes.xmargin': 0.1})
    sns.set(font_scale=2)
    if (len(df) > 1):
        fig = plt.figure(figsize=(12, 10))
        ax = sns.barplot(x='count', y=df.columns[0], data=df, color='blue')
    else:
        fig = plt.figure(figsize=(5, 7))
        ax = sns.barplot(x=df.columns[0], y='count', data=df, color='blue')
    fig.set_tight_layout(True)
    plt.rcParams.update({'font.size': 14})
    total = df['count'].sum()
    for p in ax.patches:
        percentage = '{:.2f}%'.format(100 * p.get_width() / total)
        print(percentage)
        x = p.get_x() + p.get_width() + 0.02
        y = p.get_y() + p.get_height() / 2
        ax.annotate(percentage, (x, y))
df = pd.DataFrame({'question': ['Agree', 'Strongly agree'], 'count': [200, 400]})
plot_barplot(df)
plt.show()
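As an alternative to changing axes.xmargin globally through rcParams, the margin can also be set on a single axes with ax.margins. The sketch below is only an illustration under a few assumptions that are not part of the answer above: it uses plain matplotlib barh instead of seaborn, toy counts matching the example DataFrame, and a label offset of 5 points.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Toy data mirroring the example DataFrame (assumed values)
labels = ["Agree", "Strongly agree"]
counts = [200, 400]
total = sum(counts)

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.barh(labels, counts, color="blue")

# Pad the x-axis by 20% of the data range so the labels have room
ax.margins(x=0.2)

for bar in bars:
    percentage = "{:.2f}%".format(100 * bar.get_width() / total)
    y = bar.get_y() + bar.get_height() / 2
    # Put the text 5 points to the right of the end of the bar
    ax.annotate(percentage, (bar.get_width(), y),
                xytext=(5, 0), textcoords="offset points", va="center")

fig.tight_layout()
x_right = ax.get_xlim()[1]  # right axis limit after the margin is applied
```

Because the margin is applied per axes, other figures created in the same session keep their default limits.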
So I have this plot here:
What I want to do is colour every second element of the y-axis, for example in blue, and the rest in red.
Here is the result I want to get:
and here is the code I got:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
mpl.rcParams['toolbar'] = 'None'
plt.style.use('fivethirtyeight')
result_7_s = amount * s_7_days
result_14_s = amount * s_14_days
result_21_s = amount * s_21_days
result_7_fc = amount * fc_7_days
result_14_fc = amount * fc_14_days
result_21_fc = amount * fc_21_days
final_y = np.array([int(result_7_s), int(result_14_s),
                    int(result_21_s), int(result_7_fc),
                    int(result_14_fc), int(result_21_fc)])
fig, ax = plt.subplots(num = 'Test')
x = np.array([7, 14, 21])
plt.xticks(ticks = x, labels = x)
plt.yticks(ticks = final_y, labels = final_y)
plt.title(f'Prices for {amount} people')
plt.xlabel('Days')
plt.ylabel('Price')
plt.tight_layout()
ax.bar(x - 0.5, final_y[:3], width=1, color='#444444', label='Standard')
ax.bar(x + 0.5, final_y[3:], width=1, color='#e5ae38', label='First Class')
ax.tick_params(axis='y', colors = 'blue') # <-------
ax.yaxis.set_major_formatter('{x}$')
plt.legend()
plt.savefig('result.png')
plt.show()
Iterate over the tick labels to apply the desired color to each one of them:
for n, tick_label in enumerate(ax.yaxis.get_ticklabels()):
    tick_label.set_color("red" if n%2 else "blue")
Here is the solution I came up with:
for i in range(0, 3):
    plt.gca().get_yticklabels()[i].set_color('blue')
for i in range(3, 6):
    plt.gca().get_yticklabels()[i].set_color('red')
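The tick-label loop from the first answer can be tried on a small self-contained figure. This is a minimal sketch with hypothetical prices (the variables from the question are not defined in the thread), and it assumes the y ticks are fixed with set_yticks so that there is one label per value.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

prices = [100, 200, 300, 400]  # hypothetical values, one per bar

fig, ax = plt.subplots()
ax.bar([1, 2, 3, 4], prices)
ax.set_yticks(prices)  # fixed ticks: one tick label per price

# Even-indexed labels become blue, odd-indexed labels become red
for n, tick_label in enumerate(ax.yaxis.get_ticklabels()):
    tick_label.set_color("red" if n % 2 else "blue")

colors = [t.get_color() for t in ax.yaxis.get_ticklabels()]
```

The same loop can be switched to the "first half blue, second half red" scheme of the second solution by testing n < len(labels) // 2 instead of n % 2.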
I'm not sure if I am using the wrong data or if there is an edit I need to make that I am not seeing. It would be nice if someone could take a look at the code. The problem is that the yerr of the first bar is at x=0, while in the image it is somewhere around 2.5.
Does someone know what I did wrong or forgot to edit?
the end result should be:
my code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
y_raw = np.random.randn(1000).cumsum() + 15
x_raw = np.linspace(0, 24, y_raw.size)
x_pos = x_raw.reshape(-1, 100).min(axis=1)
y_avg = y_raw.reshape(-1, 100).mean(axis=1)
y_err = y_raw.reshape(-1, 100).ptp(axis=1)
bar_width = x_pos[1] - x_pos[0]
x_pred = np.linspace(0, 30)
y_max_pred = y_avg[0] + y_err[0] + 2.3 * x_pred
y_min_pred = y_avg[0] - y_err[0] + 1.2 * x_pred
barcolor, linecolor, fillcolor = 'wheat', 'salmon', 'lightblue'
fig, axes = fig, ax = plt.subplots()
axes.set_title(label="Future Projection of Attitudes", fontsize=15)
plt.xlabel('Minutes since class began', fontsize=12)
plt.ylabel('Snarkiness (snark units)', fontsize=12)
fig.set_size_inches(8, 6, forward=True)
axes.fill_between(x_pred, y_min_pred, y_max_pred ,color='lightblue')
axes.plot(x_raw, y_raw, color='salmon')
vert_bars = axes.bar(x_pos, y_avg, yerr=y_err, color='wheat', width = bar_width, edgecolor='grey',error_kw=dict(lw=1, capsize=5, capthick=1, ecolor='gray'))
axes.set(xlim=[0, 30], ylim=[0,100])
plt.show()
yerr is meant to be the distance between the mean and the min/max. Right now you are passing the full difference between max and min. Dividing it by 2 gives a better approximation. To obtain the exact values, you could calculate the lower and upper distances explicitly (see code example).
Further, by default, the bars are centered on their x-positions. You can use align='edge' to left-align them (as x_pos is calculated as the minimum of the range the bar represents). You could also set clip_on=False in error_kw to make sure the error bars are never clipped by the axes.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
y_raw = np.random.randn(1000).cumsum() + 15
x_raw = np.linspace(0, 24, y_raw.size)
x_pos = x_raw.reshape(-1, 100).min(axis=1)
y_avg = y_raw.reshape(-1, 100).mean(axis=1)
y_min = y_raw.reshape(-1, 100).min(axis=1)
y_max = y_raw.reshape(-1, 100).max(axis=1)
bar_width = x_pos[1] - x_pos[0]
x_pred = np.linspace(0, 30)
y_err = y_max - y_min  # full max-min range, as used for the prediction band in the question
y_max_pred = y_avg[0] + y_err[0] + 2.3 * x_pred
y_min_pred = y_avg[0] - y_err[0] + 1.2 * x_pred
barcolor, linecolor, fillcolor = 'wheat', 'salmon', 'lightblue'
fig, ax = plt.subplots(figsize=(8, 6))
ax.set_title(label="Future Projection of Attitudes", fontsize=15)
ax.set_xlabel('Minutes since class began', fontsize=12)
ax.set_ylabel('Snarkiness (snark units)', fontsize=12)
ax.fill_between(x_pred, y_min_pred, y_max_pred, color='lightblue')
ax.plot(x_raw, y_raw, color='salmon')
vert_bars = ax.bar(x_pos, y_avg, yerr=(y_avg - y_min, y_max - y_avg),
                   color='wheat', width=bar_width, edgecolor='grey', align='edge',
                   error_kw=dict(lw=1, capsize=5, capthick=1, ecolor='grey', clip_on=False))
ax.set(xlim=[0, 30], ylim=[0, 100])
plt.tight_layout()
plt.show()
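The two points of the answer can also be checked numerically. The sketch below is an illustration only: it uses a different random generator than the question, and the group width of 2.4 is an assumed value. It shows that the asymmetric errors reach exactly from the mean to the min and max, and that align='edge' puts the left side of each bar at its x-position.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
y_raw = rng.standard_normal(1000).cumsum() + 15
blocks = y_raw.reshape(-1, 100)  # 10 groups of 100 samples each

y_avg = blocks.mean(axis=1)
y_min = blocks.min(axis=1)
y_max = blocks.max(axis=1)

# Asymmetric errors: distance from the mean down to the min and up to the max
lower = y_avg - y_min
upper = y_max - y_avg

width = 2.4                    # assumed width of each group on the x-axis
x_pos = np.arange(10) * width  # left edge of the range each bar represents

fig, ax = plt.subplots()
centered = ax.bar(x_pos, y_avg, width=width)  # default align='center'
edged = ax.bar(x_pos, y_avg, width=width, align="edge",
               yerr=(lower, upper))

left_centered = centered.patches[0].get_x()  # half a bar width left of x_pos[0]
left_edged = edged.patches[0].get_x()        # exactly x_pos[0]
```

Passing the full max-min range as a single yerr would instead draw each error bar twice as long as the data actually spread on average.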
I want to visualize my data with seaborn and add text to the bars. This is my code:
# barplot price by body-style
fig, ax = plt.subplots(figsize = (12,8))
g = data[['body-style','price']].groupby(by = 'body-style').sum().reset_index().sort_values(by='price')
x = g['body-style']
y = g['price']
ok = sns.barplot(x,y, ci = None)
ax.set_title('Price By Body Style')
def autolabel(rects):
    for idx,rect in enumerate(ok):
        height = rect.get_height()
        g.text(rect.get_x() + rect.get_width()/2., 0.2*height,
               g['price'].unique().tolist()[idx],
               ha='center', va='bottom', rotation=90)
autolabel(ok)
but I get an error:
You need a few changes:
As you already created the ax, you need sns.barplot(..., ax=ax).
autolabel() needs to be called with the list of bars as argument. With seaborn you get this list via ax.patches.
for idx,rect in enumerate(ok): shouldn't use ok but rects.
You can't use g.text. g is a dataframe and doesn't have a .text function. You need ax.text.
Using g['price'].unique().tolist()[idx] as the text to print doesn't have any relationship with the plotted bars. You could use height instead.
Here is some test code with toy data:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
fig, ax = plt.subplots(figsize=(12, 8))
# With the real data:
# g = data[['body-style','price']].groupby(by = 'body-style').sum().reset_index().sort_values(by='price')
# x = g['body-style']
# y = g['price']
x = list('abcdefghij')  # toy data
y = np.random.randint(20, 100, len(x))
sns.barplot(x=x, y=y, ci=None, ax=ax)
ax.set_title('Price By Body Style')
def autolabel(rects):
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width() / 2., 0.2 * height,
                height,
                ha='center', va='bottom', rotation=90, color='white')
autolabel(ax.patches)
plt.show()
PS: You can change the fontsize of the text via a parameter to ax.text: ax.text(..., fontsize=14).
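A slightly more reusable variant passes the axes and the font size in as arguments, so the function does not depend on a global ax. This is a minimal sketch with plain matplotlib bars and toy values; the extra parameters and the integer formatting of the label are assumptions for illustration, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

def autolabel(ax, rects, fontsize=14):
    """Write each bar's height inside the bar, rotated by 90 degrees."""
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width() / 2, 0.2 * height,
                f"{height:.0f}",  # label formatted without decimals
                ha="center", va="bottom", rotation=90,
                color="white", fontsize=fontsize)

fig, ax = plt.subplots()
ax.bar(["a", "b", "c"], [30, 45, 60])  # toy categories and values
autolabel(ax, ax.patches)

label_texts = [t.get_text() for t in ax.texts]
```

With seaborn the call stays the same, since the bars drawn by sns.barplot(..., ax=ax) are also available through ax.patches.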
I have an issue with customizing the legend of my plot. I did a lot of customizing but couldn't get my head around this one. I want the symbols (not the labels) to be equally spaced in the legend. As you can see in the example, the space between the circles in the legend gets smaller as the circles get bigger.
Any ideas?
Also, how can I add a color bar (in addition to the size), with smaller circles being light red (for example) and bigger circles being blue (for example)?
Here is my code so far:
import pandas as pd
import matplotlib.pyplot as plt
from vega_datasets import data as vega_data
gap = pd.read_json(vega_data.gapminder.url)
df = gap.loc[gap['year'] == 2000]
fig, ax = plt.subplots(1, 1,figsize=[14,12])
ax=ax.scatter(df['life_expect'], df['fertility'],
              s = df['pop']/100000,alpha=0.7, edgecolor="black",cmap="viridis")
plt.xlabel("X")
plt.ylabel("Y");
kw = dict(prop="sizes", num=6, color="lightgrey", markeredgecolor='black',markeredgewidth=2)
plt.legend(*ax.legend_elements(**kw),bbox_to_anchor=(1, 0),frameon=False,
           loc="lower left",markerscale=1,ncol=1,borderpad=2,labelspacing=4,handletextpad=2)
plt.grid()
plt.show()
It's a bit tricky, but you could measure the legend elements and reposition them so that the distance in between is constant. Because the positions are computed in pixels, the plot can't be resized afterwards.
I tested the code inside PyCharm with the 'Qt5Agg' backend, and in a Jupyter notebook with both %matplotlib inline and %matplotlib notebook. I'm not sure whether it would work well in all environments.
Note that ax.scatter doesn't return an ax (contrary to e.g. sns.scatterplot) but a PathCollection holding the created scatter dots.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.transforms import IdentityTransform
from vega_datasets import data as vega_data
gap = pd.read_json(vega_data.gapminder.url)
df = gap.loc[gap['year'] == 2000]
fig, ax = plt.subplots(1, 1, figsize=[14, 12])
fig.subplots_adjust(right=0.8)
scat = ax.scatter(df['life_expect'], df['fertility'],
                  s=df['pop'] / 100000, alpha=0.7, edgecolor="black", cmap="viridis")
plt.xlabel("X")
plt.ylabel("Y")
x = 1.1
y = 0.1
is_first = True
kw = dict(prop="sizes", num=6, color="lightgrey", markeredgecolor='black', markeredgewidth=2)
handles, labels = scat.legend_elements(**kw)
inverted_transData = ax.transData.inverted()
for handle, label in zip(handles[::-1], labels[::-1]):
    plt.setp(handle, clip_on=False)
    for _ in range(1 if is_first else 2):
        plt.setp(handle, transform=ax.transAxes)
        if is_first:
            xd, yd = x, y
        else:
            xd, yd = inverted_transData.transform((x, y))
        handle.set_xdata([xd])
        handle.set_ydata([yd])
        ax.add_artist(handle)
        bbox = handle.get_window_extent(fig.canvas.get_renderer())
        y += y - bbox.y0 + 15  # 15 pixels in between
    x = (bbox.x0 + bbox.x1) / 2
    if is_first:
        xd_text, _ = inverted_transData.transform((bbox.x1+10, y))
    ax.text(xd_text, yd, label, transform=ax.transAxes, ha='left', va='center')
    y = bbox.y1
    is_first = False
plt.show()
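The remark about the return value of ax.scatter, and the second part of the question about a color bar, can be illustrated on toy data. This is a hedged sketch, not the answerer's method: the data values are made up, the "coolwarm_r" colormap is just one choice that runs from red for small values to blue for large ones, and the legend here uses the standard spacing rather than the pixel-based repositioning above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection

# Made-up data standing in for life expectancy, fertility and population
x = np.array([50, 60, 70, 75, 80])
y = np.array([6.0, 4.5, 3.0, 2.2, 1.6])
sizes = np.array([20, 80, 180, 320, 500])

fig, ax = plt.subplots()
# scatter returns a PathCollection; keep the axes in its own variable
scat = ax.scatter(x, y, s=sizes, c=sizes, cmap="coolwarm_r",
                  alpha=0.7, edgecolor="black")

# Size legend built from the collection itself
handles, labels = scat.legend_elements(prop="sizes", num=4, color="lightgrey")
ax.legend(handles, labels, loc="lower left", labelspacing=2, frameon=False)

# Color bar driven by the same values that set the marker sizes
cbar = fig.colorbar(scat, ax=ax, label="size")
```

Keeping the return value in scat instead of overwriting ax means that both ax.legend and fig.colorbar remain available afterwards.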
I have the following code:
from mpl_toolkits.axisartist.axislines import SubplotZero
from matplotlib.transforms import BlendedGenericTransform
import matplotlib.pyplot as plt
import numpy
if 1:
    fig = plt.figure(1)
    ax = SubplotZero(fig, 111)
    fig.add_subplot(ax)
    ax.axhline(linewidth=1.7, color="black")
    ax.axvline(linewidth=1.7, color="black")
    plt.xticks([1])
    plt.yticks([])
    ax.text(0, 1.05, 'y', transform=BlendedGenericTransform(ax.transData, ax.transAxes), ha='center')
    ax.text(1.05, 0, 'x', transform=BlendedGenericTransform(ax.transAxes, ax.transData), va='center')
    for direction in ["xzero", "yzero"]:
        ax.axis[direction].set_axisline_style("-|>")
        ax.axis[direction].set_visible(True)
    for direction in ["left", "right", "bottom", "top"]:
        ax.axis[direction].set_visible(False)
    x = numpy.linspace(-1, 1, 10000)
    ax.plot(x, numpy.tan(2*(x - numpy.pi/2)), linewidth=1.2, color="black")
    plt.ylim(-5, 5)
    plt.savefig('graph.png')
which produces this graph:
As you can see, not only is the tan graph sketched, but a portion of line is added to join the asymptotic regions of the tan graph, where an asymptote would normally be.
Is there some built-in way to skip that section? Or will I have to graph separate disjoint domains of tan that are bounded by asymptotes (if you get what I mean)?
Something you could try: set a finite threshold and modify your function to provide non-finite values after those points. Practical code modification:
yy = numpy.tan(2*(x - numpy.pi/2))
threshold = 10000
yy[yy>threshold] = numpy.inf
yy[yy<-threshold] = numpy.inf
ax.plot(x, yy, linewidth=1.2, color="black")
Results in:
This code creates a figure and one subplot for the tangent function. NaNs are inserted where cos(x) approaches 0 (NaN means "Not a Number"; NaN values are neither plotted nor connected).
matplot-fmt-pi, created by k-donn (https://pypi.org/project/matplot-fmt-pi/), is used to change the formatter so that the x ticks and labels correspond to multiples of π/8 in fractional format.
Plot formatting (grid, legend, limits, axis) is performed as commented.
import matplotlib.pyplot as plt
import numpy as np
from matplot_fmt_pi import MultiplePi
fig, ax = plt.subplots() # creates a figure and one subplot
x = np.linspace(-2 * np.pi, 2 * np.pi, 1000)
y = np.tan(x)
y[np.abs(np.cos(x)) <= np.abs(np.sin(x[1]-x[0]))] = np.nan
# This operation inserts a NaN where cos(x) is reaching 0
# NaN means "Not a Number" and NaNs are not plotted or connected
ax.plot(x, y, lw=2, color="blue", label='Tangent')
# Set up grid, legend, and limits
ax.grid(True)
ax.axhline(0, color='black', lw=.75)
ax.axvline(0, color='black', lw=.75)
ax.set_title("Trigonometric Functions")
ax.legend(frameon=False) # remove the legend frame
# axis formatting
ax.set_xlim(-2 * np.pi, 2 * np.pi)
pi_manager = MultiplePi(8) # number= ticks between 0 - pi
ax.xaxis.set_major_locator(pi_manager.locator())
ax.xaxis.set_major_formatter(pi_manager.formatter())
plt.ylim(top=10) # y axis limit values
plt.ylim(bottom=-10)
y_ticks = np.arange(-10, 10, 1)
plt.yticks(y_ticks)
plt.show()
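A different way to find the break points, named plainly here as a swapped-in technique rather than either answer's method, is to look for places where the sampled values drop: tan is increasing on every branch, so a decrease between two neighbouring samples can only come from crossing an asymptote. The sketch below assumes the function and sampling from the question and uses NumPy only.

```python
import numpy as np

x = np.linspace(-1, 1, 10000)
y = np.tan(2 * (x - np.pi / 2))  # same curve as in the question

# On each branch tan increases, so a negative step marks an asymptote crossing
jumps = np.where(np.diff(y) < 0)[0]

# Put a NaN on the first sample after each jump; matplotlib does not draw
# line segments that touch a NaN, so the connecting line disappears
y_broken = y.copy()
y_broken[jumps + 1] = np.nan

asymptotes = x[jumps]  # last sample before each asymptote
```

Plotting x against y_broken with ax.plot then draws each branch separately. This relies on the function being monotonic between asymptotes, so it would not carry over unchanged to something like 1/x**2.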