stacked Bar charts in Bokeh - python

I am drawing bar charts with Bokeh (http://docs.bokeh.org/en/latest/docs/user_guide.html). It is an amazing tool, but I think it is still a little immature. I have a stacked bar chart with 30 categories on the x axis and 40 classes for each category. I cannot find a function that lets me change the colors (the current colors are very hard to tell apart) or align the legend to the top. Alternatively, an information box that opens when someone hovers over a color would be helpful, but I have very little idea whether that can be done.
http://docs.bokeh.org/en/latest/docs/user_guide/charts.html#bar
My example is similar to this one except that I have many variables.
Any suggestions?
UPDATE:
I tried the solution below myself, but there seems to be a problem with Bar(): it is not recognized.
from collections import OrderedDict
import numpy as np
import bokeh.plotting as bp
from bokeh.models import HoverTool
data24 = OrderedDict()
for i in range(10):
    data24[i] = np.random.randint(2, size=10)
figut = bp.figure(tools="reset, hover")
s1 = figut.Bar(data24, stacked=True, color=colors)
s1.select(dict(type=HoverTool)).tooltips = {"x": "$index"}
Running it I get:
AttributeError: 'Figure' object has no attribute 'Bar'
Here are the bar colors that I am getting. There is no way to distinguish between colors.

I've had a dig in the Bokeh source code, and it seems that bokeh.charts.Bar will accept some keyword arguments. These can be properties of the Builder class, which include the palette property. You should therefore be able to pass palette as an argument to Bar.
Bar(...,palette=['red','green','blue'],...)
Just tested this out by modifying the example that bokeh provides:
from collections import OrderedDict
import pandas as pd
from bokeh.charts import Bar, output_file, show
from bokeh.sampledata.olympics2014 import data
df = pd.io.json.json_normalize(data['data'])
# filter by countries with at least one medal and sort
df = df[df['medals.total'] > 0]
df = df.sort_values("medals.total", ascending=False)
# get the countries and we group the data by medal type
countries = df.abbr.values.tolist()
gold = df['medals.gold'].astype(float).values
silver = df['medals.silver'].astype(float).values
bronze = df['medals.bronze'].astype(float).values
# build a dict containing the grouped data
medals = OrderedDict(bronze=bronze, silver=silver, gold=gold)
output_file("stacked_bar.html")
bar = Bar(
    medals, countries, title="Stacked bars", stacked=True,
    palette=['brown', 'silver', 'gold'])
show(bar)

Both the original question and the other answer are very out of date. The bokeh.charts API was deprecated and removed years ago. For stacked bar charts in modern Bokeh, see the section on Handling Categorical Data in the user guide.
Here is a complete example:
from bokeh.io import show, output_file
from bokeh.plotting import figure
output_file("stacked.html")
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]
colors = ["#c9d9d3", "#718dbf", "#e84d60"]
data = {'fruits' : fruits,
        '2015' : [2, 1, 4, 3, 2, 4],
        '2016' : [5, 3, 4, 2, 4, 6],
        '2017' : [3, 2, 4, 4, 5, 3]}
# Bokeh 3.x keyword names: height (formerly plot_height) and
# legend_label (formerly legend=[value(x) for x in years])
p = figure(x_range=fruits, height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="")
p.vbar_stack(years, x='fruits', width=0.9, color=colors, source=data,
             legend_label=years)
p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None
p.axis.minor_tick_line_color = None
p.outline_line_color = None
p.legend.location = "top_left"
p.legend.orientation = "horizontal"
show(p)
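For the many-class case in the original question (30 categories, 40 stacked classes), here is a rough sketch of one way to get distinguishable colors plus a hover box. The helper names distinct_colors and make_stacked_chart are made up for this illustration, the data is random, and the code assumes the Bokeh 3.x keyword name height. Bokeh names each stacked renderer after its column, which is what the $name and @$name tooltip fields rely on.

```python
import colorsys
import random


def distinct_colors(n):
    """Return n hex colors with evenly spaced hues (hypothetical helper)."""
    colors = []
    for i in range(n):
        # Alternate brightness so neighbouring hues are easier to tell apart.
        value = 0.95 if i % 2 == 0 else 0.65
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.75, value)
        colors.append("#%02x%02x%02x" % (round(r * 255), round(g * 255), round(b * 255)))
    return colors


def make_stacked_chart(categories, classes, data):
    """Build a stacked bar chart with one color per class and a hover tool."""
    # Imported here so distinct_colors() stays usable without Bokeh installed.
    from bokeh.plotting import figure

    p = figure(x_range=categories, height=400, title="Stacked bars",
               tools="hover", tooltips="$name @category: @$name")
    p.vbar_stack(classes, x="category", width=0.9,
                 color=distinct_colors(len(classes)), source=data)
    return p


# Made-up data: 30 categories on the x axis, 40 classes per category.
categories = ["c%d" % i for i in range(30)]
classes = ["class%d" % i for i in range(40)]
data = {"category": categories}
for name in classes:
    data[name] = [random.randint(0, 5) for _ in categories]
```

Calling show(make_stacked_chart(categories, classes, data)), with show imported from bokeh.io, would render the chart. With 40 classes a legend gets crowded, so this sketch relies on the hover box instead.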

Related

bokeh hbar example with categorical values = white box

I've got code for a simple hbar plot, stripped down to what I think is the minimum, but it shows up as a white box. (I can get a simple line plot example working, so I know the headers are set up correctly.)
from bokeh.embed import components
from bokeh.plotting import figure
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
p = figure(plot_height=250, title="Fruit counts",
           toolbar_location=None, tools="")
p.hbar(y=fruits, right=counts)
data, div = components(p)
The error in the console is "[Bokeh] could not set initial ranges"
If someone could point me to the documentation about anything needing to be added that would be helpful.
As you are working with categorical data, you need to assign a FactorRange to your y_range. This is done either by passing y_range=FactorRange(factors=fruits) to figure() or by its shorthand version, y_range=fruits.
The following example shows the figure correctly:
from bokeh.embed import components
from bokeh.plotting import figure, show
from bokeh.models import FactorRange
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
p = figure(y_range=FactorRange(factors=fruits), plot_height=250, title="Fruit counts",
           toolbar_location=None, tools="")
p.hbar(y=fruits, right=counts)
show(p)
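Since the factors passed to the range also fix the order of the bars, the shorthand form can be used to sort them as well. A hedged sketch: sorted_factors and make_hbar are names invented here, and the height keyword assumes Bokeh 3.x (older releases used plot_height, as in the code above).

```python
def sorted_factors(labels, values):
    """Return labels ordered by value, smallest first (ties alphabetical)."""
    pairs = sorted(zip(values, labels))
    return [label for _, label in pairs]


def make_hbar(labels, values):
    """Horizontal bar chart whose categorical y axis follows the sorted order."""
    # Imported here so sorted_factors() can be used without Bokeh installed.
    from bokeh.plotting import figure

    # Passing a list of strings as y_range creates the FactorRange implicitly.
    p = figure(y_range=sorted_factors(labels, values), height=250,
               title="Fruit counts", toolbar_location=None, tools="")
    p.hbar(y=labels, right=values, height=0.9)
    return p


fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
order = sorted_factors(fruits, counts)
```

show(make_hbar(fruits, counts)) would then draw the bars in that order.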

Include Bokeh Tooltips in stacked bar chart

I have the code below. Would anyone be able to let me know how to include tooltips for the bar chart below.
from bokeh.core.properties import value
from bokeh.io import show, output_file
from bokeh.plotting import figure
output_file("stacked.html")
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]
colors = ["#c9d9d3", "#718dbf", "#e84d60"]
data = {'fruits' : fruits,
        '2015' : [2, 1, 4, 3, 2, 4],
        '2016' : [5, 3, 4, 2, 4, 6],
        '2017' : [3, 2, 4, 4, 5, 3]}
p = figure(x_range=fruits, plot_height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="")
p.vbar_stack(years, x='fruits', width=0.9, color=colors, source=data,
             legend=[value(x) for x in years])
p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None
p.axis.minor_tick_line_color = None
p.outline_line_color = None
p.legend.location = "top_left"
p.legend.orientation = "horizontal"
show(p)
Thanks
Michael
You want tooltips to indicate the value by year:
tooltips = [
    ("fruit", "@fruits"),
    ("2015:", "@2015"),
    ("2016:", "@2016"),
    ("2017:", "@2017"),
]
p = figure(x_range=fruits, plot_height=300, title="Fruit Counts by Year",
           tooltips=tooltips,
           toolbar_location="right", tools="")
You can add a hover tool by specifying "hover" in the tools list and adding tooltips to it. There are two kinds of tooltip fields: those beginning with "@", which display values from the data source, and those beginning with "$", which correspond to values intrinsic to the plot, such as the coordinates of the mouse in data or screen space. Hover tools work well in combination with a ColumnDataSource, so take a look at that as well. More information on hover tools can be found in the Bokeh documentation.
Adding a hover tool to your plot can be done by changing these lines:
tooltips = [
    ("fruit", "@fruits"),
    ("x, y", "$x,$y"),
]
p = figure(x_range=fruits, plot_height=300, title="Fruit Counts by Year",
           toolbar_location="right", tools=["hover"], tooltips=tooltips)
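The per-column tooltip rows can also be built from the list of stacked columns rather than typed out by hand. A small sketch, where stack_tooltips is a made-up helper name; it wraps each column name in braces (for example @{2015}), a form Bokeh accepts for column names, including ones that contain spaces.

```python
def stack_tooltips(label, key_column, stack_columns):
    """Return a tooltips list: one row for the category, one per stacked column."""
    rows = [(label, "@{%s}" % key_column)]
    for column in stack_columns:
        # "@{name}" reads the value from the data source column called name.
        rows.append((column, "@{%s}" % column))
    return rows


years = ["2015", "2016", "2017"]
tooltips = stack_tooltips("fruit", "fruits", years)
# tooltips == [("fruit", "@{fruits}"), ("2015", "@{2015}"),
#              ("2016", "@{2016}"), ("2017", "@{2017}")]
```

The resulting list is passed to figure(...) through the tooltips argument, exactly like the hand-written list.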

Bokeh Position Legend outside plot area for stacked vbar

I have a stacked vbar chart in Bokeh, a simplified version of which can be reproduced with:
from bokeh.plotting import figure
from bokeh.io import show
months = ['JAN', 'FEB', 'MAR']
categories = ["cat1", "cat2", "cat3"]
data = {"month" : months,
        "cat1" : [1, 4, 12],
        "cat2" : [2, 5, 3],
        "cat3" : [5, 6, 1]}
colors = ["#c9d9d3", "#718dbf", "#e84d60"]
p = figure(x_range=months, plot_height=250, title="Categories by month",
           toolbar_location=None)
p.vbar_stack(categories, x='month', width=0.9, color=colors, source=data)
show(p)
I want to add a legend to the chart, but my real chart has a lot of categories in the stacks and therefore the legend would be very large, so I want it to be outside the plot area to the right.
There's an SO answer which explains how to add a legend outside of the plot area, but in the example given each rendered glyph is assigned to a variable, which is then labelled and added to a Legend object. I understand how to do that, but I believe the vbar_stack method creates multiple glyphs in a single call, so I don't know how to label these and add them to a separate Legend object placed outside the chart area.
Alternatively, is there a simpler way to use the legend argument when calling vbar_stack and then locate the legend outside the chart area?
Any help much appreciated.
For anyone interested, I have now fixed this using simple indexing of the vbar_stack glyphs. Solution below:
from bokeh.plotting import figure
from bokeh.io import show
from bokeh.models import Legend
months = ['JAN', 'FEB', 'MAR']
categories = ["cat1", "cat2", "cat3"]
data = {"month" : months,
        "cat1" : [1, 4, 12],
        "cat2" : [2, 5, 3],
        "cat3" : [5, 6, 1]}
colors = ["#c9d9d3", "#718dbf", "#e84d60"]
p = figure(x_range=months, plot_height=250, title="Categories by month",
           toolbar_location=None)
v = p.vbar_stack(categories, x='month', width=0.9, color=colors, source=data)
legend = Legend(items=[
    ("cat1", [v[0]]),
    ("cat2", [v[1]]),
    ("cat3", [v[2]]),
], location=(0, -30))
p.add_layout(legend, 'right')
show(p)
Thanks Toby Petty for your answer.
I have slightly improved your code so that it automatically grabs the categories from the source data and assigns colors. I thought this might be handy, as the categories are often not explicitly stored in a variable and have to be taken from the data.
from bokeh.plotting import figure
from bokeh.io import show
from bokeh.models import Legend
from bokeh.palettes import brewer
months = ['JAN', 'FEB', 'MAR']
data = {"month" : months,
        "cat1" : [1, 4, 12],
        "cat2" : [2, 5, 3],
        "cat3" : [5, 6, 1],
        "cat4" : [8, 2, 1],
        "cat5" : [1, 1, 3]}
categories = list(data.keys())
categories.remove('month')
colors = brewer['YlGnBu'][len(categories)]
p = figure(x_range=months, plot_height=250, title="Categories by month",
           toolbar_location=None)
v = p.vbar_stack(categories, x='month', width=0.9, color=colors, source=data)
legend = Legend(items=[(x, [v[i]]) for i, x in enumerate(categories)], location=(0, -30))
p.add_layout(legend, 'right')
show(p)

Bar chart showing count of each category month-wise using Bokeh

I have data as shown below:
From this, I need to display the count of IDs in each class per year_month_id. Since I have 12 months, there will be 12 sub-divisions, each showing the count of IDs within each class.
Something like the image below is what I am looking for.
Now the examples in Bokeh use ColumnDataSource and dictionary mapping, but how do I do this for my dataset.
Can someone please help me with this?
Below is the expected output in tabular and chart format.
I believe the pandas Python package would come in handy for preparing your data for plotting. It's useful for manipulating table-like data structures.
Here is how I went about your problem:
from pandas import DataFrame
from bokeh.io import show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis5
# Your sample data
df = DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 1],
                'year_month_id': [201612, 201612, 201612, 201612, 201612, 201612, 201612, 201612, 201612, 201701],
                'class': ['A', 'D', 'B', 'other', 'other', 'other', 'A', 'other', 'A', 'B']
                })
# Get counts of groups of 'class' and fill in 'year_month_id' column
df2 = DataFrame({'count': df.groupby(["year_month_id", "class"]).size()}).reset_index()
df2 now looks like this:
# Create new column to make plotting easier
df2['class-date'] = df2['class'] + "-" + df2['year_month_id'].map(str)
# x and y axes
class_date = df2['class-date'].tolist()
count = df2['count'].tolist()
# Bokeh's mapping of column names and data lists
source = ColumnDataSource(data=dict(class_date=class_date, count=count, color=Viridis5))
# Bokeh's convenience function for creating a Figure object
p = figure(x_range=class_date, y_range=(0, 5), plot_height=350, title="Counts",
           toolbar_location=None, tools="")
# Render and show the vbar plot
p.vbar(x='class_date', top='count', width=0.9, color='color', source=source)
show(p)
So the Bokeh plot looks like this:
Of course you can alter it to suit your needs. The first thing I thought of was making the top of the y_range variable so it could accommodate data better, though I have not tried it myself.
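If the goal is one bar per month with the classes stacked inside it, the counts can be reshaped into the column layout that vbar_stack takes as its source. The sketch below uses only the standard library for the counting step instead of pandas; stacked_counts is a made-up helper name, and the rows repeat the sample data above with the month ids written as strings.

```python
from collections import Counter


def stacked_counts(rows):
    """Count rows per (month, class) and return (months, classes, data).

    data has one 'month' column plus one count column per class, which is
    the shape that figure.vbar_stack() takes as its source.
    """
    counts = Counter(rows)
    months = sorted({month for month, _ in rows})
    classes = sorted({cls for _, cls in rows})
    data = {"month": months}
    for cls in classes:
        # Counter returns 0 for (month, class) pairs that never occur.
        data[cls] = [counts[(month, cls)] for month in months]
    return months, classes, data


# (year_month_id, class) pairs from the sample DataFrame above, as strings.
rows = [("201612", "A"), ("201612", "D"), ("201612", "B"), ("201612", "other"),
        ("201612", "other"), ("201612", "other"), ("201612", "A"),
        ("201612", "other"), ("201612", "A"), ("201701", "B")]
months, classes, data = stacked_counts(rows)
# Tallest stack, useful for sizing y_range instead of hard-coding (0, 5).
y_max = max(sum(data[c][i] for c in classes) for i in range(len(months)))
```

From there, figure(x_range=months, y_range=(0, y_max + 1)) followed by p.vbar_stack(classes, x='month', width=0.9, color=..., source=data) would draw one stacked bar per month, with one color supplied per class.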

How can I create a Python iplot graph whose colors change with value?

Here is part of my data.
I counted my data:
count_interests = interests.count()
then made a graph:
count_interests.iplot(kind = 'bar', xTitle='Interests', yTitle='Number of Person', colors='Red')
I have tried many times to find a function that changes a column's color with its value, so that bigger and smaller columns have different colors.
I know there are colorscale and color options, and I tried them many times, but I couldn't find a way. Does anyone know of such a function?
You could define a function which returns a color for each value and then pass the colors for each bar in a list.
import pandas as pd
import plotly
def color(val, median, std):
    if val > median + std:
        return 'darkgreen'
    if val < median - std:
        return 'darkred'
    return 'orange'
df = pd.DataFrame({'cinema': [1, 2, 5, 3, 3, None],
                   'theatre': [3, 0, 8, 4, 0, 4],
                   'wine': [3, 2, 5, None, 1, None],
                   'beer': [4, 8, 2, None, None, None]})
med = df.count().median()
std = df.count().std()
colors = [color(i, med, std) for i in df.count()]
fig = plotly.graph_objs.Bar(x=df.columns,
                            y=df.count(),
                            marker=dict(color=colors))
plotly.offline.plot([fig])
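Where a smooth gradient is preferred to three fixed buckets, each count can be mapped to a color by linear interpolation. This is only a sketch with made-up names (lerp_color, value_colors) and an arbitrary dark red to dark green ramp; the resulting list can be passed as marker=dict(color=...) just like colors above.

```python
def lerp_color(t, start=(139, 0, 0), end=(0, 100, 0)):
    """Blend two RGB colors; t=0 gives start, t=1 gives end."""
    t = min(max(t, 0.0), 1.0)
    channels = [round(s + (e - s) * t) for s, e in zip(start, end)]
    return "#%02x%02x%02x" % tuple(channels)


def value_colors(values):
    """Map each value to a hex color between the smallest and largest value."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return [lerp_color((v - lo) / span) for v in values]


# Non-null counts per column of the DataFrame above: cinema, theatre, wine, beer.
counts = [5, 6, 4, 3]
colors = value_colors(counts)
```

The smallest count gets the start color and the largest gets the end color, with everything else blended in between.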
The bars can also be colored either by pivoting the rows into columns with pd.pivot_table() or by creating a separate trace for each bar. Here, each column is aggregated with sum() as an example. Code below:
# Import libraries (intended for a Jupyter notebook)
import pandas as pd
import cufflinks as cf
import plotly.graph_objs as go
import plotly.offline as pyo
from plotly.offline import init_notebook_mode
# Enable offline plotting inside the notebook
init_notebook_mode(connected=True)
cf.go_offline()
# Create dataframe
INT_M_PUB = [0,0,0,0,0,1,0,0,0,0]
INT_M_CINEMA = [1,1,1,0,0,0,0,0,0,1]
INT_M_THEATRE = [1,0,1,0,0,1,0,1,0,1]
INT_M_GYM = [0,0,0,0,0,1,0,0,0,1]
INT_M_ENTERTAIN = [0,0,1,1,0,1,0,1,0,1]
INT_M_EATOUT = [0,1,1,0,0,1,0,0,1,1]
INT_M_WINE = [0,0,0,0,0,1,0,0,0,1]
interests = pd.DataFrame({'INT_M_PUB':INT_M_PUB, 'INT_M_CINEMA':INT_M_CINEMA, 'INT_M_THEATRE':INT_M_THEATRE,
                          'INT_M_GYM':INT_M_GYM, 'INT_M_ENTERTAIN':INT_M_ENTERTAIN, 'INT_M_EATOUT':INT_M_EATOUT,
                          'INT_M_WINE':INT_M_WINE
                          })
interests.head(2)
dfm = interests.sum().reset_index().rename(columns={'index':'interests', 0:'value'})
dfm
# Re-creating the plot similar to that in question (note: y-axis scales are different)
df = dfm.copy()
col_list = df.columns
df.iplot(kind = 'bar', x='interests', y='value', xTitle='Interests', yTitle='Number of Person', title='These bars need to be colored', color='red')
# Color plots by creating traces
# Initialize an empty list named data to collect one trace per bar
data = []
for name, val in zip(df['interests'], df['value']):
    # One trace per interest, so that each bar gets its own color
    trace = go.Bar(
        x=[name],
        y=[val],
        name=name
    )
    data.append(trace)
layout = go.Layout(
    barmode='group',
    title='Interests',
    xaxis=dict(title='Interests'),
    yaxis=dict(title='Number of Person')
)
fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='grouped-bar')
# Creating plot by pivoting the table
df = pd.pivot_table(dfm, values='value', columns='interests')
df.iplot(kind = 'bar',xTitle='Interests', yTitle='Number of Person')
