I really don't understand what's going wrong with this... I've looked through what is pretty simple data several times and have restarted the kernel (running on Jupyter Notebook) and nothing seems to be solving it.
Here's the data frame I have (sorry I know the numbers look a bit silly, this is a really sparse dataset over a long time period, original is reindexed for 20 years):
DATE NODP NVP VP VDP
03/08/2002 0.083623 0.10400659 0.81235517 1.52458E-05
14/09/2003 0.24669167 0.24806379 0.5052293 1.52458E-05
26/07/2005 0.15553726 0.13324796 0.7111538 0.000060983
20/05/2006 0 0.23 0.315 0.455
05/06/2007 0.21280034 0.29139224 0.49579217 1.52458E-05
21/02/2010 0 0.55502195 0.4449628 1.52458E-05
09/04/2011 0.09531311 0.17514162 0.72954527 0
14/02/2012 0.19213217 0.12866237 0.67920546 0
17/01/2014 0.12438848 0.10297326 0.77263826 0
24/02/2017 0.01541347 0.09897548 0.88561105 0
Note that all of the rows add up to 1! I have triple, quadruple checked this...XD
I am trying to produce a stacked bar chart of this data, with the following code, which seems to have worked perfectly for everything else I have been using it for:
NODP = df['NODP']
NVP = df['NVP']
VDP = df['VDP']
VP = df['VP']
ind = np.arange(len(df.index))
width = 5.0
p1 = plt.bar(ind, NODP, width, label = 'NODP', bottom=NVP, color= 'grey')
p2 = plt.bar(ind, NVP, width, label = 'NVP', bottom=VDP, color= 'tan')
p3 = plt.bar(ind, VDP, width, label = 'VDP', bottom=VP, color= 'darkorange')
p4 = plt.bar(ind, VP, width, label = 'VP', color= 'darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change',x=0.06,y=0.8)
plt.xticks(np.arange(min(ind), max(ind)+1, 6.0), labels=xlabels) #the xticks were cumbersome so not included in this example code
plt.legend()
Which gives me the following plot:
As is evident, 1) NODP is not showing up at all, and 2) the remainder of them are being plotted with the wrong proportions...
I really don't understand what is wrong, it should be really simple, right?! I'm sorry if it is really simple, it's probably right under my nose. Any ideas greatly appreciated!
If you want to create stacked bars this way (so standard matplotlib without using pandas or seaborn for plotting), the bottom needs to be the sum of all the lower bars.
Here is an example with the given data.
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
columns = ['DATE', 'NODP', 'NVP', 'VP', 'VDP']
data = [['03/08/2002', 0.083623, 0.10400659, 0.81235517, 1.52458E-05],
['14/09/2003', 0.24669167, 0.24806379, 0.5052293, 1.52458E-05],
['26/07/2005', 0.15553726, 0.13324796, 0.7111538, 0.000060983],
['20/05/2006', 0, 0.23, 0.315, 0.455],
['05/06/2007', 0.21280034, 0.29139224, 0.49579217, 1.52458E-05],
['21/02/2010', 0, 0.55502195, 0.4449628, 1.52458E-05],
['09/04/2011', 0.09531311, 0.17514162, 0.72954527, 0],
['14/02/2012', 0.19213217, 0.12866237, 0.67920546, 0],
['17/01/2014', 0.12438848, 0.10297326, 0.77263826, 0],
['24/02/2017', 0.01541347, 0.09897548, 0.88561105, 0]]
df = pd.DataFrame(data=data, columns=columns)
ind = pd.to_datetime(df.DATE, dayfirst=True)  # the dates are written day/month/year
NODP = df.NODP.to_numpy()
NVP = df.NVP.to_numpy()
VP = df.VP.to_numpy()
VDP = df.VDP.to_numpy()
width = 120
p1 = plt.bar(ind, NODP, width, label='NODP', bottom=NVP+VDP+VP, color='grey')
p2 = plt.bar(ind, NVP, width, label='NVP', bottom=VDP+VP, color='tan')
p3 = plt.bar(ind, VDP, width, label='VDP', bottom=VP, color='darkorange')
p4 = plt.bar(ind, VP, width, label='VP', color='darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change')
plt.yticks(np.arange(0, 1.001, 0.1))
plt.legend()
plt.show()
Note that in this case the x-axis is measured in days, and each bar is located at its date. This shows the relative spacing between the dates, in case that is important. If it isn't, the x-positions can be chosen equidistant and labeled via the dates column.
To do so with standard matplotlib, the following lines would change:
ind = range(len(df))
width = 0.8
plt.xticks(ind, df.DATE, rotation=20)
plt.tight_layout() # needed to show the full labels of the x-axis
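To make the "bottom is the sum of all the lower bars" rule concrete, here is a minimal sketch that computes every bottom in one go with np.cumsum. The `layers` array and the choice of two sample rows (20/05/2006 and 24/02/2017) are only for illustration and are not part of the code above.

```python
import numpy as np

# One row per layer in stacking order (bottom to top), one column per date.
# Values are taken from the 20/05/2006 and 24/02/2017 rows of the question.
layers = np.array([
    [0.315, 0.88561105],       # VP   (lowest bar)
    [0.455, 0.0],              # VDP
    [0.23, 0.09897548],        # NVP
    [0.0, 0.01541347],         # NODP (top bar)
])

# The bottom of layer k is the sum of layers 0..k-1:
# the running total minus the layer itself.
bottoms = np.cumsum(layers, axis=0) - layers
tops = bottoms + layers

print(bottoms)
print(tops[-1])  # the top of the highest bar should be 1 for every date
```

Each row of `bottoms` can then be passed as the `bottom=` argument of the matching plt.bar call.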
Plot the dataframe
# using your data above
df.DATE = pd.to_datetime(df.DATE, dayfirst=True)  # the dates are written day/month/year
df.set_index('DATE', inplace=True)
ax = df.plot(stacked=True, kind='bar', figsize=(12, 8))
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
# sets the tick labels so time isn't included
ax.xaxis.set_major_formatter(plt.FixedFormatter(df.index.to_series().dt.strftime("%Y-%m-%d")))
plt.show()
Add labels for clarity
By adding the following code before plt.show(), you can add text annotations to the bars:
# .patches is everything inside of the chart
for rect in ax.patches:
    # Find where everything is located
    height = rect.get_height()
    width = rect.get_width()
    x = rect.get_x()
    y = rect.get_y()
    # The height of the bar is the data value and can be used as the label
    label_text = f'{height:.2f}'
    label_x = x + width - 0.125
    label_y = y + height / 2
    # don't include a label if the value is effectively 0
    if height > 0.001:
        ax.text(label_x, label_y, label_text, ha='right', va='center', fontsize=8)
plt.show()
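As a possible alternative to looping over the patches by hand, here is a rough sketch using Axes.bar_label. It assumes matplotlib 3.4 or newer (where bar_label is available), and the numbers are made-up, rounded stand-ins for two rows of the data rather than the real data frame.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, rounded sample values: two bars, four stacked layers.
values = {
    "VP": np.array([0.81, 0.32]),
    "VDP": np.array([0.0, 0.45]),
    "NVP": np.array([0.10, 0.23]),
    "NODP": np.array([0.09, 0.0]),
}

fig, ax = plt.subplots()
bottom = np.zeros(2)
label_texts = []
for name, heights in values.items():
    container = ax.bar(range(2), heights, bottom=bottom, label=name)
    # An empty string suppresses the label for values that are effectively 0
    labels = [f"{h:.2f}" if h > 0.001 else "" for h in heights]
    label_texts.append(labels)
    ax.bar_label(container, labels=labels, label_type="center", fontsize=8)
    bottom += heights  # the next layer starts where this one ends
ax.legend()
```

With a pandas plot, the same bar_label call should work on each entry of ax.containers.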
The following sample code will generate the donut chart I'll use as my example:
import matplotlib.pyplot as plt
%matplotlib inline
# Following should supposedly set the font correctly:
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['font.sans-serif'] = ['Muli'] + plt.rcParams['font.sans-serif']
plt.rcParams['font.weight'] = 'extra bold'
size_of_groups=[12,11,30,0.3]
colors = ['#a1daaa','#bbbbb4','#444511','#1afff2']
import matplotlib as mpl
mpl.rcParams['text.color'] = '#273859'
# Create a pieplot
my_pie,texts,_ = plt.pie(size_of_groups,radius = 1.2,colors=colors,autopct="%.1f%%",
textprops = {'color':'w',
'size':15 #, 'weight':"extra bold"
}, pctdistance=0.75, labeldistance=0.7) #pctdistance and labeldistance change label positions.
labels=['High','Low','Normal','NA']
plt.legend(my_pie,labels,loc='lower center',ncol=2,bbox_to_anchor=(0.5, -0.2))
plt.setp(my_pie, width=0.6, edgecolor='white')
fig1 = plt.gcf()
fig1.show()
The above outputs this:
Mostly, this is great. Finally I got a nice looking donut chart!
But there is just one last thing to finesse - when the portion of the donut chart is very small (like the 0.6%), I need the labels to be moved out of the chart, and possibly colored black instead.
I managed to do something similar for bar charts using plt.text, but I don't think that will be feasible with pie charts at all. I figure someone has definitely solved a similar problem before, but I can't readily find any decent solutions.
Here is a way to move all percent-texts for patches smaller than some given amount (5 degrees in the code example). Note that this will fail when multiple small pieces are close to each other, since their moved texts can then overlap.
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
size_of_groups = [12, 11, 30, 0.3]
colors = ['#a1daaa', '#bbbbb4', '#444511', '#1afff2']
my_pie, texts, pct_txts = plt.pie(size_of_groups, radius=1.2, colors=colors, autopct="%.1f%%",
textprops={'color': 'w', 'size': 15}, pctdistance=0.75,
labeldistance=0.7)
labels = ['High', 'Low', 'Normal', 'NA']
plt.legend(my_pie, labels, loc='lower center', ncol=2, bbox_to_anchor=(0.5, -0.2))
plt.setp(my_pie, width=0.6, edgecolor='white')
for patch, txt in zip(my_pie, pct_txts):
    if (patch.theta2 - patch.theta1) <= 5:
        # the angle at which the text is normally located
        angle = (patch.theta2 + patch.theta1) / 2.
        # new distance to the pie center
        x = patch.r * 1.2 * np.cos(angle * np.pi / 180)
        y = patch.r * 1.2 * np.sin(angle * np.pi / 180)
        # move text to new position
        txt.set_position((x, y))
        txt.set_color('black')
plt.tight_layout()
plt.show()
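The geometry used in that loop can be pulled out into two small helpers, which makes it easier to check by hand. This is only a sketch: the function names `is_small` and `outside_label_position` are made up for illustration, and the angles are in degrees, as with patch.theta1 and patch.theta2.

```python
import numpy as np

def is_small(theta1, theta2, min_degrees=5.0):
    """True if the wedge spans at most `min_degrees` degrees."""
    return (theta2 - theta1) <= min_degrees

def outside_label_position(theta1, theta2, radius, factor=1.2):
    """Return (x, y) at `factor * radius` along the wedge's mid-angle."""
    mid = np.deg2rad((theta1 + theta2) / 2.0)
    return radius * factor * np.cos(mid), radius * factor * np.sin(mid)

# The 0.3 slice out of a total of 53.3 spans roughly 2 degrees,
# so it falls under the 5 degree threshold.
span = 0.3 / 53.3 * 360
print(span, is_small(0.0, span))

# A wedge centred on 90 degrees gets its label straight above the centre.
print(outside_label_position(85, 95, radius=1.0))
```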
I attempted a solution by adapting ImportanceOfBeingErnest's answer to a different problem, given here. For some reason the percentage sign is not displayed on my system, but you can figure that out.
rad = 1.2 # Define a radius variable for later use
my_pie, texts, autotexts = plt.pie(size_of_groups, radius=rad, colors=colors, autopct="%.1f%%",
pctdistance=0.75, labeldistance=0.7, textprops={'color':'white', 'size':20})
# Rest of the code
cx, cy = 0, 0 # Center of the pie chart
for t in autotexts:
    x, y = t.get_position()
    text = t.get_text()
    if float(text.strip('%')) < 1: # Here 1 is the target threshold percentage
        angle = np.arctan2(y-cy, x-cx)
        xt, yt = 1.1*rad*np.cos(angle)+cx, 1.1*rad*np.sin(angle)+cy
        t.set_color("k")
        t.set_position((xt,yt))
I have a two-dimensional array that I want to plot using bokeh's bokeh.plotting.figure.Figure.image. It works wonderfully.
Now, I want to add a legend using the colors used for the image. I don't find any example for my case. The legend that I'd like to achieve is similar to the picture.
from bokeh.models import LinearColorMapper, ColorBar
from bokeh.plotting import figure, show
plot = figure(x_range=(0,1), y_range=(0,1), toolbar_location="right")
color_mapper = LinearColorMapper(palette="YlGn9", low=-1, high=1, nan_color="white")
plot.image(image=[ndvi], color_mapper=color_mapper,dh=[1.0], dw=[1.0], x=[0], y=[0])
color_bar = ColorBar(color_mapper=color_mapper,label_standoff=12, border_line_color=None, location=(0,0))
plot.add_layout(color_bar, 'right')
Additionally, I'd like to have some custom color boundaries, with non-fixed intervals. Here is an example how it would be done with matplotlib:
cmap = colors.ListedColormap(['#27821f', '#3fa336', '#6ce362','#ffffff','#e063a8' ,'#cc3b8b','#9e008c','#59044f'])
bounds = [-1000, -500, -100, 0, 50, 100, 300, 500, 10000000]
norm = colors.BoundaryNorm(bounds, cmap.N)
fig, ax = plt.subplots()
ax.imshow(data, cmap=cmap, norm=norm)
You can choose the red-yellow-green palette. In bokeh the name is 'RdYlGn5', where the digit at the end tells how many colors are needed. To use it in a legend, you'd need to import RdYlGn5 from bokeh.palettes.
For creating the legend, I only know of employing some dummy glyphs as in the code below.
I updated my example with the new requirements of setting custom bounds with non-fixed intervals. This post offers some guidance. Basically, the idea is to use a larger colormap with repeated colors. Such a format doesn't fit for general types of boundaries, but it fits yours, at least when the lowest and highest bound are interpreted to be infinite.
I also tried to layout the legend with some custom spaces to get all labels aligned. A background color is chosen to contrast with the legend entries.
There is a colorbar to verify how the colormap bounds work internally. After verification, you may leave it out. The example image has values from -1000 to 1000 to show how the values outside the strict colormap limits are handled.
Here is an example with dummy data:
from bokeh.models import LinearColorMapper, Legend, LegendItem, ColorBar, SingleIntervalTicker
from bokeh.plotting import figure, show
import numpy as np
x, y = np.meshgrid(np.linspace(0, 10, 1000), np.linspace(0, 10, 1000))
z = 1000*np.sin(x + np.cos(y))
plot = figure(x_range=(0, 1), y_range=(0, 1), toolbar_location="right")
base_colors = ['#27821f', '#3fa336', '#6ce362','#ffffff','#e063a8' ,'#cc3b8b','#9e008c','#59044f']
bounds = [-1000, -500, -100, 0, 50, 100, 300, 500, 10000000]
low = -600
high = 600
bound_colors = []
j = 0
for i in range(low, high, 50):
    if i >= bounds[j+1]:
        j += 1
    bound_colors.append(base_colors[j])
color_mapper = LinearColorMapper(palette=bound_colors, low=low, high=high, nan_color="white")
plot.image(image=[z], color_mapper=color_mapper, dh=[1.0], dw=[1.0], x=[0], y=[0])
# these are dummy glyphs to help draw the legend
dummy_for_legend = [plot.line(x=[1, 1], y=[1, 1], line_width=15, color=c, name='dummy_for_legend')
for c in base_colors]
legend_labels = [f' < {bounds[1]}'] + \
[('' if l < 0 else ' ' if l < 10 else ' ' if l < 100 else ' ')
+ f'{l} ‒ {h}' for l, h in zip(bounds[1:], bounds[2:-1])] + \
[f' > {bounds[-2]}']
legend1 = Legend(title="NDVI", background_fill_color='gold',
items=[LegendItem(label=lab, renderers=[gly]) for lab, gly in zip(legend_labels, dummy_for_legend) ])
plot.add_layout(legend1)
color_bar = ColorBar(color_mapper=color_mapper, label_standoff=12, border_line_color=None, location=(0, 0),
ticker=SingleIntervalTicker(interval=50))
plot.add_layout(color_bar)
show(plot)
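The "larger colormap with repeated colors" trick can be isolated in a small pure-Python function, which makes it easy to see how many palette slots each color gets. This is a sketch under the same assumptions as the example above: the helper name `expand_palette` is made up, and every bound between low and high must sit on the step grid (here multiples of 50) for the bins to line up exactly.

```python
def expand_palette(base_colors, bounds, low, high, step):
    """Repeat colors so an evenly spaced mapper over [low, high) mimics unequal bins."""
    palette = []
    j = 0
    for value in range(low, high, step):
        # move on to the bin that contains `value`
        while j + 1 < len(base_colors) and value >= bounds[j + 1]:
            j += 1
        palette.append(base_colors[j])
    return palette

base_colors = ['#27821f', '#3fa336', '#6ce362', '#ffffff',
               '#e063a8', '#cc3b8b', '#9e008c', '#59044f']
bounds = [-1000, -500, -100, 0, 50, 100, 300, 500, 10000000]

palette = expand_palette(base_colors, bounds, low=-600, high=600, step=50)
print(len(palette))
print([palette.count(c) for c in base_colors])
```

The resulting list is what gets passed as palette= to LinearColorMapper; a color covering a wide interval simply appears more often.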
I have this image from Matplotlib :
I would like to write for each category (cat i with i in [1-10] in the figure) the highest value and its corresponding legend on the graphic.
Below you can find visually what I would like to achieve :
The thing is, I don't know whether this is possible given the way matplotlib draws the bars.
Basically, this is the part of the code for drawing multiple bars :
# create plot
fig, ax = plt.subplots(figsize = (9,9))
index = np.arange(len_category)
if multiple:
    bar_width = 0.3
else:
    bar_width = 1.5
opacity = 1.0
#test_array contains test1 and test2
cmap = get_cmap(len(test_array))
for i in range(len(test_array)):
    count = count + 1
    current_label = test_array[i]
    rects = plt.bar(index-0.2+bar_width*i, score_array[i], bar_width, alpha=opacity, color=np.random.rand(3,1),label=current_label )
plt.xlabel('Categories')
plt.ylabel('Scores')
plt.title('Scores by Categories')
plt.xticks(index + bar_width, categories_array)
plt.legend()
plt.tight_layout()
plt.show()
and this is the part I have added in order to do what I would like to achieve. But it searches the max across all the bars in the graphics. For example, the max of test1 will be in cat10 and the max of test2 will be cat2. Instead, I would like to have the max for each category.
for i in range(len(test_array)):
    count = count + 1
    current_label = test_array[i]
    rects = plt.bar(index-0.2+bar_width*i, score_array[i], bar_width,alpha=opacity,color=np.random.rand(3,1),label=current_label )
    max_score_current = max(score_array[i])
    list_rect = list()
    max_height = 0
    #The id of the rectangle who get the highest score
    max_idx = 0
    for idx,rect in enumerate(rects):
        list_rect.append(rect)
        height = rect.get_height()
        if height > max_height:
            max_height = height
            max_idx = idx
    highest_rect = list_rect[max_idx]
    plt.text(highest_rect.get_x() + highest_rect.get_width()/2.0, max_height, str(test_array[i]),color='blue', fontweight='bold')
    del list_rect[:]
Do you have an idea about how I can achieve that ?
Thank you
It is usually better to keep data generation and visualization separate. Instead of looping through the bars themselves, just get the necessary data prior to plotting. This makes everything much simpler.
So first create a list of labels to use and then loop over the positions to annotate them. In the code below the labels are created by mapping the argmax of a column array to the test set via a dictionary.
import numpy as np
import matplotlib.pyplot as plt
test1 = [6,4,5,8,3]
test2 = [4,5,3,4,6]
labeldic = {0:"test1", 1:"test2"}
a = np.c_[test1,test2]
maxi = np.max(a, axis=1)
l = ["{} {}".format(i,labeldic[j]) for i,j in zip(maxi, np.argmax(a, axis=1))]
for i in range(a.shape[1]):
    plt.bar(np.arange(a.shape[0])+(i-1)*0.3, a[:,i], width=0.3, align="edge",
            label = labeldic[i])
for i in range(a.shape[0]):
    plt.annotate(l[i], xy=(i,maxi[i]), xytext=(0,10),
                 textcoords="offset points", ha="center")
plt.margins(y=0.2)
plt.legend()
plt.show()
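The label-building step does not depend on matplotlib at all, so it can be checked on its own. Here is a minimal sketch using the same sample numbers; the helper name `top_labels` is made up, and it assumes one row per test and one column per category.

```python
import numpy as np

def top_labels(scores, names):
    """For each category (column), return '<max value> <name of the winning series>'."""
    scores = np.asarray(scores)          # shape: (n_series, n_categories)
    winners = scores.argmax(axis=0)      # index of the best series per category
    best = scores.max(axis=0)            # best value per category
    return [f"{b} {names[w]}" for b, w in zip(best, winners)]

labels = top_labels([[6, 4, 5, 8, 3],
                     [4, 5, 3, 4, 6]], ["test1", "test2"])
print(labels)
```

Because it takes a list of names rather than a fixed dictionary, the same function should also cover more than two test series.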
From your question it is not entirely clear what you want to achieve, but assuming that you want the relative height of each bar in one group printed above that bar, here is one way to achieve that:
from matplotlib import pyplot as plt
import numpy as np
score_array = np.random.rand(2,10)
index = np.arange(score_array.shape[1])
test_array=['test1','test2']
opacity = 1
bar_width = 0.25
for i,label in enumerate(test_array):
    rects = plt.bar(index-0.2+bar_width*i, score_array[i], bar_width,alpha=opacity,label=label)
    heights = [r.get_height() for r in rects]
    print(heights)
    rel_heights = [h/max(heights) for h in heights]
    idx = heights.index(max(heights))
    for r, h, rh in zip(rects, heights, rel_heights):
        plt.text(r.get_x() + r.get_width()/2.0, h, '{:.2}'.format(rh), color='b', fontweight ='bold', ha='center')
plt.show()
The result looks like this:
I am plotting a piechart with matplotlib using the following code:
ax = axes([0.1, 0.1, 0.6, 0.6])
labels = 'Twice Daily', 'Daily', '3-4 times per week', 'Once per week','Occasionally'
fracs = [20,50,10,10,10]
explode=(0, 0, 0, 0,0.1)
patches, texts, autotexts = ax.pie(fracs, labels=labels, explode = explode,
autopct='%1.1f%%', shadow =True)
proptease = fm.FontProperties()
proptease.set_size('xx-small')
setp(autotexts, fontproperties=proptease)
setp(texts, fontproperties=proptease)
rcParams['legend.fontsize'] = 7.0
savefig("pie1")
This produces the following piechart.
However, I want to start the pie-chart with the first wedge on top, the only solution I could find for this was using this code
However on using this as below,
from pylab import *
from matplotlib import font_manager as fm
from matplotlib.transforms import Affine2D
from matplotlib.patches import Circle, Wedge, Polygon
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
labels = 'Twice Daily', 'Daily', '3-4 times per week', 'Once per week','Occasionally'
fracs = [20,50,10,10,10]
wedges, plt_labels = ax.pie(fracs, labels=labels)
ax.axis('equal')
starting_angle = 90
rotation = Affine2D().rotate(np.radians(starting_angle))
for wedge, label in zip(wedges, plt_labels):
    label.set_position(rotation.transform(label.get_position()))
    if label._x > 0:
        label.set_horizontalalignment('left')
    else:
        label.set_horizontalalignment('right')
    wedge._path = wedge._path.transformed(rotation)
plt.savefig("pie2")
This produces the following pie chart
However, this does not print the fracs on the wedges as in the earlier pie chart. I have tried a few different things, but I am not able to preserve the fracs. How can I start the first wedge at noon and display the fracs on the wedges as well??
Ordinarily I wouldn't recommend changing the source of a tool, but it's hacky to fix this outside and easy inside. So here's what I'd do if you needed this to work Right Now(tm), and sometimes you do..
In the file matplotlib/axes.py, change the declaration of the pie function to
def pie(self, x, explode=None, labels=None, colors=None,
autopct=None, pctdistance=0.6, shadow=False,
labeldistance=1.1, start_angle=None):
i.e. simply add start_angle=None to the end of the arguments.
Then add the five lines bracketed by "# addition".
for frac, label, expl in cbook.safezip(x,labels, explode):
    x, y = center
    theta2 = theta1 + frac
    thetam = 2*math.pi*0.5*(theta1+theta2)
    # addition begins here
    if start_angle is not None and i == 0:
        dtheta = (thetam - start_angle)/(2*math.pi)
        theta1 -= dtheta
        theta2 -= dtheta
        thetam = start_angle
    # addition ends here
    x += expl*math.cos(thetam)
    y += expl*math.sin(thetam)
Then if start_angle is None, nothing happens, but if start_angle has a value, then that's the location that the first slice (in this case the 20%) is centred on. For example,
patches, texts, autotexts = ax.pie(fracs, labels=labels, explode = explode,
autopct='%1.1f%%', shadow =True, start_angle=0.75*pi)
produces
Note that in general you should avoid doing this, patching the source I mean, but there are times in the past when I've been on deadline and simply wanted something Now(tm), so there you go..