I have a piece of sample Python code that draws a waterfall chart using the Bokeh library.
It looks great and works well in Jupyter, but when I use it in Power BI I get an error saying that no image was created.
The code calls show(p), which seems to open an Internet Explorer page when I run it in Power BI.
I tried a matplotlib example, which saves the figure with:
my_plot.get_figure().savefig("waterfall.png",dpi=200,bbox_inches='tight')
Is there something similar for Bokeh?
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.models.formatters import NumeralTickFormatter
import pandas as pd
#output_notebook()
# Create the initial dataframe
index = ['sales','returns','credit fees','rebates','late charges','shipping']
data = {'amount': [350000,-30000,-7500,-25000,95000,-7000]}
df = pd.DataFrame(data=data,index=index)
# Determine the total net value by adding the start and all additional transactions
net = df['amount'].sum()
df['running_total'] = df['amount'].cumsum()
df['y_start'] = df['running_total'] - df['amount']
# Where do we want to place the label?
df['label_pos'] = df['running_total']
df_net = pd.DataFrame.from_records([(net, net, 0, net)],
                                   columns=['amount', 'running_total', 'y_start', 'label_pos'],
                                   index=["net"])
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
df = pd.concat([df, df_net])
df['color'] = 'grey'
df.loc[df.amount < 0, 'color'] = 'red'
df.loc[df.amount > 0, 'color'] = 'green'
df.loc[df.amount > 300000, 'color'] = 'blue'
df.loc[df.amount < 0, 'label_pos'] = df.label_pos - 10000
df["bar_label"] = df["amount"].map('{:,.0f}'.format)
TOOLS = "box_zoom,reset,save"
source = ColumnDataSource(df)
p = figure(tools=TOOLS, x_range=list(df.index), y_range=(0, net+40000),
plot_width=800, title = "Sales Waterfall")
p.segment(x0='index', y0='y_start', x1="index", y1='running_total',
source=source, color="color", line_width=55)
p.grid.grid_line_alpha=0.3
p.yaxis[0].formatter = NumeralTickFormatter(format="($ 0 a)")
p.xaxis.axis_label = "Transactions"
labels = LabelSet(x='index', y='label_pos', text='bar_label',
text_font_size="8pt", level='glyph',
x_offset=-20, y_offset=0, source=source)
p.add_layout(labels)
show(p)
There is a chapter of the User's Guide dedicated to Exporting Plots:
from bokeh.io import export_png
export_png(plot, filename="plot.png")
Note that you will need to have the necessary optional dependencies (PhantomJS and selenium) installed.
I'm struggling to get a Bokeh map to display. The cell runs for about 50 s but shows nothing. I can get a blank map to display, but nothing else I have tried has worked.
Jupyter version 6.4.12 run through Anaconda 2.3.2
import pandas as pd
import numpy as np
from bokeh.plotting import figure, show, output_notebook
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
from bokeh.models import ColumnDataSource, LinearColorMapper, ColorBar, NumeralTickFormatter
from bokeh.palettes import PRGn, RdYlGn
from bokeh.transform import linear_cmap, factor_cmap
from bokeh.layouts import row, column
from bokeh.resources import INLINE
pd.set_option('display.max_columns', None)
output_notebook(INLINE)
I have Lat & Lon coordinates in my dataset, which I discovered I need to convert to mercator coordinates.
# Define function to switch from lat/long to mercator coordinates
def x_coord(x, y):
    lat = x
    lon = y
    r_major = 6378137.000
    x = r_major * np.radians(lon)
    scale = x/lon
    y = 180.0/np.pi * np.log(np.tan(np.pi/4.0 + lat * (np.pi/180.0)/2.0)) * scale
    return (x, y)
# Define coord as tuple (lat,long)
df['coordinates'] = list(zip(df['LATITUDE'], df['LONGITUDE']))
# Obtain list of mercator coordinates
mercators = [x_coord(x, y) for x, y in df['coordinates'] ]
# Create mercator column in our df
df['mercator'] = mercators
# Split that column out into two separate columns - mercator_x and mercator_y
df[['mercator_x', 'mercator_y']] = df['mercator'].apply(pd.Series)
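For comparison, the same conversion can be sketched without the intermediate scale = x/lon step, which divides by zero when a longitude is exactly 0. This is only a minimal sketch using the standard library; the name lat_lon_to_mercator is an illustrative assumption and does not appear in the original post.

```python
import math

R_MAJOR = 6378137.0  # same Earth radius constant (metres) as r_major above


def lat_lon_to_mercator(lat, lon):
    """Convert latitude/longitude in degrees to Web Mercator x/y in metres.

    Hypothetical helper: algebraically the same formula as x_coord above,
    but with the scale factor folded in so lon == 0 is handled.
    """
    x = R_MAJOR * math.radians(lon)
    y = R_MAJOR * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y


# Usage: note the (lat, lon) argument order, matching x_coord(x=lat, y=lon)
x, y = lat_lon_to_mercator(5.043978, 118.715237)
```

The formula is undefined at the poles (latitude of exactly 90 or -90 degrees), so such rows would need to be filtered out first.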
From there, this is my code cell for the plot:
tile = get_provider('CARTODBPOSITRON')
source = ColumnDataSource(data = df)
palette = PRGn[11]
color_mapper = linear_cmap(field_name = 'FIRE_SIZE', palette = palette,
low=df['FIRE_SIZE'].min(), high = df['FIRE_SIZE'].max())
tooltips = [('Fire Year', '@FIRE_YEAR'),('State','@STATE')]
p = figure(title = 'Fire Locations',
x_axis_type = 'mercator',
y_axis_type = 'mercator',
x_axis_label = 'Longitude',
y_axis_label = 'Latitude',
tooltips = tooltips)
p.add_tile(tile)
p.circle(x = 'mercator_x',
y = 'mercator_y',
color = color_mapper,
size = 10,
fill_alpha = 0.7,
source = source)
color_bar = ColorBar(color_mapper = color_mapper['transform'],
                     formatter = NumeralTickFormatter(format='0.0[0000]'),
                     label_standoff = 13, width = 8, location = (0,0))
p.add_layout(color_bar, 'right')
show(p)
The cell runs, but nothing shows. There are no errors. I confirmed that I can get a plot to display using this code:
#Test
tile = get_provider('CARTODBPOSITRON')
p = figure(x_range = (-2000000, 2000000),
y_range = (1000000, 7000000),
x_axis_type = 'mercator',
y_axis_type = 'mercator')
p.add_tile(tile)
show(p)
This is a large dataset, with 2,303,566 entries. I have checked that I have no null values in any of the columns that I am using, as well as verifying the correct data types (lat/lon are float64).
Returning to answer my own question here. After more testing based on helpful comments from @mosc9575 and @bigreddot, I determined that the size of my dataset is the reason Bokeh fails to display the map. I tried a single point first, and then a small slice of my dataframe, and the map displayed just fine in both cases.
I hope this is helpful to someone else at some point!
Thanks to everyone who assisted.
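One way to run that kind of test is to plot only a random subset of the rows. The sketch below uses plain Python; the helper name sample_rows, the cap of 5,000 points and the toy rows are all assumptions for illustration. With a pandas dataframe, df.sample(n=..., random_state=...) does the same job.

```python
import random


def sample_rows(rows, max_points, seed=0):
    """Return at most max_points rows chosen uniformly at random.

    Hypothetical helper: a fixed seed keeps the subset reproducible
    between notebook runs.
    """
    if len(rows) <= max_points:
        return list(rows)
    return random.Random(seed).sample(rows, max_points)


# Toy stand-in for (mercator_x, mercator_y) pairs; not the real dataset
rows = [(i, i * 0.5) for i in range(100_000)]
subset = sample_rows(rows, 5_000)
```

The subset can then be passed to ColumnDataSource in place of the full data to check whether the map renders at that size.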
I currently have a gmap displaying GPS points. Is there a way to colour-code the points by the month in which they were recorded? I have looked around online but am struggling to implement this in my own code. My dataset consists of GPS points collected throughout 2017, with a localDate index (in datetime format), and a longitude and latitude:
2017-11-12 |5.043978|118.715237
Bokeh and gmap code:
def plot(lat, lng, zoom=10, map_type='roadmap'):
    gmap_options = GMapOptions(lat=lat, lng=lng,
                               map_type=map_type, zoom=zoom)
    # the tools are defined below:
    hover = HoverTool(
        tooltips = [
            # @price refers to the price column
            # in the ColumnDataSource.
            ('Date', '@{Local Date}{%c}'),
            ('Lat', '@Lat'),
            ('Lon', '@Lon'),
        ],
        formatters={'@{Local Date}': 'datetime'}
    )
    # below we replaced 'hover' (the default hover tool),
    # by our custom hover tool
    p = gmap(api_key, gmap_options, title='Malaysia',
             width=bokeh_width, height=bokeh_height,
             tools=[hover, 'reset', 'wheel_zoom', 'pan'])
    source = ColumnDataSource(day2017Averageddf)
    center = p.circle('Lon', 'Lat', size=4, alpha=0.5,
                      color='yellow', source=source)
    show(p)
    return p
p = plot(Lat, Lon, map_type='satellite')
The basic idea is to pass the colors to the color keyword of p.circle(). You are using a single color, but you could also build a list of colors of the correct length using your own logic, or you could use a mapper.
The code below is copied from the Bokeh documentation on mappers.
from bokeh.models import ColorBar, ColumnDataSource
from bokeh.palettes import Spectral6
from bokeh.plotting import figure, output_notebook, show
from bokeh.transform import linear_cmap
output_notebook()
x = [1,2,3,4,5,7,8,9,10]
y = [1,2,3,4,5,7,8,9,10]
#Use the field name of the column source
mapper = linear_cmap(field_name='y', palette=Spectral6 ,low=min(y) ,high=max(y))
source = ColumnDataSource(dict(x=x,y=y))
p = figure(width=300, height=300, title="Linear Color Map Based on Y")
p.circle(x='x', y='y', line_color=mapper,color=mapper, fill_alpha=1, size=12, source=source)
color_bar = ColorBar(color_mapper=mapper['transform'], width=8)
p.add_layout(color_bar, 'right')
show(p)
To come back to your problem: if the values in your Local Date column are of type pd.Timestamp, you can create a "month" column with this line
day2017Averageddf["month"] = day2017Averageddf["Local Date"].dt.month
and use it as the field for the mapper.
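As a plain-Python illustration of the "list of colors" option mentioned above, the sketch below assigns one color per month. The palette values, the name month_color and the sample dates are assumptions chosen for illustration; none of them come from the question's data or from a Bokeh palette.

```python
from datetime import date

# Twelve arbitrary, distinct hex colours, one per calendar month (assumed palette)
MONTH_COLORS = [
    "#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a",
    "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94",
]


def month_color(d):
    """Return the colour assigned to the month of date d (hypothetical helper)."""
    return MONTH_COLORS[d.month - 1]


# Made-up sample dates standing in for the dataframe's date values
dates = [date(2017, 11, 12), date(2017, 1, 3), date(2017, 11, 30)]
colors = [month_color(d) for d in dates]
```

A list like this can be stored as an extra column in the data source, and that column's name passed as the color argument of p.circle().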
I've studied the post "How do I link the CrossHairTool in bokeh over several plots?".
I used the function written by Hamid Fadishei in June 2020 in that post, but I cannot get the CrosshairTool to display correctly across several plots.
In my implementation, the crosshair displays only within the plot hovered over. I am currently using Bokeh version 2.1.1 with Python Anaconda version 3.7.6 using the Python extension in VSCode version 1.48. I am not familiar with Javascript, so any help to debug my code to correctly display the crosshair across the two plots will be welcomed.
My code:
# Importing libraries:
import pandas as pd
import random
from datetime import datetime, timedelta
from bokeh.models import CustomJS, CrosshairTool, ColumnDataSource, DatetimeTickFormatter, HoverTool
from bokeh.layouts import gridplot
from bokeh.plotting import figure, output_file, show
# Function written by Hamid Fadishei to enable a linked crosshair within gridplot:
def add_vlinked_crosshairs(figs):
    js_leave = ''
    js_move = 'if(cb_obj.x >= fig.x_range.start && cb_obj.x <= fig.x_range.end &&\n'
    js_move += 'cb_obj.y >= fig.y_range.start && cb_obj.y <= fig.y_range.end){\n'
    for i in range(len(figs)-1):
        js_move += '\t\t\tother%d.spans.height.computed_location = cb_obj.sx\n' % i
    js_move += '}else{\n'
    for i in range(len(figs)-1):
        js_move += '\t\t\tother%d.spans.height.computed_location = null\n' % i
        js_leave += '\t\t\tother%d.spans.height.computed_location = null\n' % i
    js_move += '}'
    crosses = [CrosshairTool() for fig in figs]
    for i, fig in enumerate(figs):
        fig.add_tools(crosses[i])
        args = {'fig': fig}
        k = 0
        for j in range(len(figs)):
            if i != j:
                args['other%d'%k] = crosses[j]
                k += 1
        fig.js_on_event('mousemove', CustomJS(args=args, code=js_move))
        fig.js_on_event('mouseleave', CustomJS(args=args, code=js_leave))
# Create dataframe consisting of 5 random numbers within column A and B as a function of an arbitrary time range:
startDate = datetime(2020,5,1)
timeStep = timedelta(minutes = 5)
df = pd.DataFrame({
"Date": [startDate + (i * timeStep) for i in range(5)],
"A": [random.randrange(1, 50, 1) for i in range(5)],
"B": [random.randrange(1, 50, 1) for i in range(5)]})
# Generate output file as html file:
output_file("test_linked_crosshair.html", title='Results')
# Define selection tools within gridplot:
select_tools = ["xpan", "xwheel_zoom", "box_zoom", "reset", "save"]
sample = ColumnDataSource(df)
# Define figures:
fig_1 = figure(plot_height=250,
plot_width=800,
x_axis_type="datetime",
x_axis_label='Time',
y_axis_label='A',
toolbar_location='right',
tools=select_tools)
fig_1.line(x='Date', y='A',
source=sample,
color='blue',
line_width=1)
fig_2 = figure(plot_height=250,
plot_width=800,
x_range=fig_1.x_range,
x_axis_type="datetime",
x_axis_label='Time',
y_axis_label='B',
toolbar_location='right',
tools=select_tools)
fig_2.line(x='Date', y='B',
source=sample,
color='red',
line_width=1)
# Define hover tool for showing timestep and value of crosshair on graph:
fig_1.add_tools(HoverTool(tooltips=[('','@Date{%F,%H:%M}'),
                                    ('','@A{0.00 a}')],
                          formatters={'@Date':'datetime'},mode='vline'))
fig_2.add_tools(HoverTool(tooltips=[('','@Date{%F,%H:%M}'),
                                    ('','@B{0.00 a}')],
                          formatters={'@Date':'datetime'},mode='vline'))
# Calling function to enable linked crosshairs within gridplot:
add_vlinked_crosshairs([fig_1, fig_2])
# Generate gridplot:
p = gridplot([[fig_1], [fig_2]])
show(p)
Here's a solution that works as of Bokeh 2.2.1: Just use the same crosshair tool object for all the plots that need it linked. Like so:
import numpy as np
from bokeh.plotting import figure, show
from bokeh.layouts import gridplot
from bokeh.models import CrosshairTool
plots = [figure() for i in range(6)]
[plot.line(np.arange(10), np.random.random(10)) for plot in plots]
linked_crosshair = CrosshairTool(dimensions="both")
for plot in plots:
plot.add_tools(linked_crosshair)
show(gridplot(children=[plot for plot in plots], ncols=3))
I am trying to plot RPI, CPI and CPIH on one chart with a HoverTool showing the value of each when you pan over a given area of the chart.
I initially tried adding each line separately using line(), which partly worked. However, the HoverTool only works correctly when you hover over the individual lines.
I have tried using multi_line() like:
combined_inflation_metrics = 'combined_inflation_metrics.csv'
df_combined_inflation_metrics = pd.read_csv(combined_inflation_metrics)
combined_source = ColumnDataSource(df_combined_inflation_metrics)
l.multi_line(xs=['Date','Date','Date'],ys=['RPI', 'CPI', 'CPIH'], source=combined_source)
#l.multi_line(xs=[['Date'],['Date'],['Date']],ys=[['RPI'], ['CPI'], ['CPIH']], source=combined_source)
show(l)
However, this is throwing the following:
RuntimeError:
Supplying a user-defined data source AND iterable values to glyph methods is
not possibe. Either:
Pass all data directly as literals:
p.circe(x=a_list, y=an_array, ...)
Or, put all data in a ColumnDataSource and pass column names:
source = ColumnDataSource(data=dict(x=a_list, y=an_array))
p.circe(x='x', y='y', source=source, ...)
But I am not too sure why this is?
Update:
I figured out a workaround by adding all of the values to each of the data sources. It works, but it doesn't feel efficient, and I would still like to know how to do this properly.
Edit - Code request:
from bokeh.plotting import figure, output_file, show
from bokeh.models import NumeralTickFormatter, DatetimeTickFormatter, ColumnDataSource, HoverTool, CrosshairTool, SaveTool, PanTool
import pandas as pd
import os
os.chdir(r'path')
#output_file('Inflation.html', title='Inflation')
RPI = 'RPI.csv'
CPI = 'CPI.csv'
CPIH = 'CPIH.csv'
df_RPI = pd.read_csv(RPI)
df_CPI = pd.read_csv(CPI)
df_CPIH = pd.read_csv(CPIH)
def to_date_time(data_frame, data_series):
    data_frame[data_series] = data_frame[data_series].astype('datetime64[ns]')
to_date_time(df_RPI, 'Date')
to_date_time(df_CPI, 'Date')
to_date_time(df_CPIH, 'Date')
RPI_source = ColumnDataSource(df_RPI)
CPI_source = ColumnDataSource(df_CPI)
CPIH_source = ColumnDataSource(df_CPIH)
l = figure(title="Historic Inflation Metrics", logo=None)
l.plot_width = 1200
l.xaxis[0].formatter=DatetimeTickFormatter(
days=["%d %B %Y"],
months=["%d %B %Y"],
years=["%d %B %Y"],
)
glyph_1 = l.line('Date','RPI',source=RPI_source, legend='TYPE', color='red')
glyph_2 = l.line('Date','CPI',source=CPI_source, legend='TYPE', color='blue')
glyph_3 = l.line('Date','CPIH',source=CPIH_source, legend='TYPE', color='gold')
hover = HoverTool(renderers=[glyph_1],
                  tooltips=[ ("Date","@Date{%F}"),
                             ("RPI","@RPI"),
                             ("CPI","@CPI"),
                             ("CPIH","@CPIH")],
                  formatters={"Date": "datetime"},
                  mode='vline'
                 )
l.tools = [SaveTool(), PanTool(), hover, CrosshairTool()]
show(l)
The hover tool looks up the data to show in the ColumnDataSource. Because you created a separate ColumnDataSource for each line and restricted the hover tool to the first line (glyph_1), it can only look up data in that line's data source.
The general solution is to create only one ColumnDataSource and reuse it for each line:
df_RPI = pd.read_csv(RPI)
df_CPI = pd.read_csv(CPI)
df_CPIH = pd.read_csv(CPIH)
df = df_RPI.merge(df_CPI, on="Date")
df = df.merge(df_CPIH, on="Date")
source = ColumnDataSource(df)
l = figure(title="Historic Inflation Metrics", logo=None)
glyph_1 = l.line('Date','RPI',source=source, legend='RPI', color='red')
l.line('Date','CPI',source=source, legend='CPI', color='blue')
l.line('Date','CPIH',source=source, legend='CPIH', color='gold')
hover = HoverTool(renderers=[glyph_1],
                  tooltips=[ ("Date","@Date{%F}"),
                             ("RPI","@RPI"),
                             ("CPI","@CPI"),
                             ("CPIH","@CPIH")],
                  formatters={"Date": "datetime"},
                  mode='vline'
                 )
show(l)
This is of course only possible if all your dataframes can be merged into one, i.e. the measurement time points are the same. If they are not, I do not know a good method to do what you want other than resampling or interpolating.
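To illustrate why the time points need to line up, here is a small plain-Python sketch of an inner join on dates, which is what DataFrame.merge does by default. The function name merge_on_date and all of the values are made up for illustration; they are not real inflation figures.

```python
def merge_on_date(*series):
    """Inner-join several {date: value} mappings on the dates they all share.

    Hypothetical helper mirroring the default (inner) behaviour of a merge.
    """
    shared = set(series[0])
    for s in series[1:]:
        shared &= set(s)
    return {d: tuple(s[d] for s in series) for d in sorted(shared)}


# Made-up values, keyed by month, purely for illustration
rpi = {"2018-01": 4.0, "2018-02": 3.6, "2018-03": 3.3}
cpi = {"2018-01": 3.0, "2018-02": 2.7}
cpih = {"2018-01": 2.7, "2018-02": 2.5, "2018-04": 2.2}

merged = merge_on_date(rpi, cpi, cpih)
```

Only the two months present in every series survive the join; the rows for "2018-03" and "2018-04" are dropped, which is the data loss the caveat above refers to.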
I am creating an interactive world map with Bokeh (Bokeh server). The countries are represented by patches and should be selectable with the TapTool. However, some countries consist of several patches. When one patch of a country is clicked, the entire country, i.e. all of its patches, should appear as selected.
I can achieve this with the following code. However, there is a visible time lag between the selection of the patch I click on and the selection of the other patches that belong to the same country. Is there a more efficient or easier way to achieve this?
from bokeh.models import ColumnDataSource, Patches
from bokeh.plotting import figure
from bokeh.layouts import row
from bokeh.io import curdoc
import pandas as pd
from bokeh.models.selections import Selection
x = [[5,2,4], [3,5,6], [6,9,7], [8,7,6]]
y = [[5,3,2], [6,5,8], [3,1,6], [1,2,1]]
country = ['A', 'A', 'B', 'B']
id = [0,1,2,3]
df = pd.DataFrame(data=dict(x=x, y=y, country=country, id=id))
source = ColumnDataSource(df)
p = figure(tools="tap")
renderer = p.patches('x', 'y', source=source)
def my_tap_handler(attr,old,new):
    indices = source.selected.indices
    # Only expand a single tapped patch; an empty or already expanded selection is left alone
    if len(indices) == 1:
        country_name = source.data['country'][indices[0]]
        country_indices = df['id'][df['country'] == country_name]
        new_indices = list(country_indices)
        source.selected = Selection(indices=new_indices)
renderer.data_source.on_change("selected", my_tap_handler)
curdoc().add_root(row(p, width=800))
Run in terminal: bokeh serve filename.py --show
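The index-expansion step in the handler can also be sketched on its own with a lookup table built once up front, so that each tap does no dataframe filtering. This is only an illustration of that one step under the toy data above; the names indices_by_country and expand_selection are assumptions, and the sketch does not remove the server round trip, which may well be the main source of the visible lag.

```python
from collections import defaultdict

country = ['A', 'A', 'B', 'B']  # same toy data as in the question

# Build the country -> patch indices table once, outside the callback
indices_by_country = defaultdict(list)
for i, name in enumerate(country):
    indices_by_country[name].append(i)


def expand_selection(selected):
    """Return the indices of every patch that shares a country with the selection.

    Hypothetical helper: takes a list of selected patch indices and
    returns a sorted list covering the whole of each affected country.
    """
    wanted = {country[i] for i in selected}
    return sorted(i for name in wanted for i in indices_by_country[name])
```

Inside the tap handler, the result of expand_selection(source.selected.indices) would take the place of the new_indices list.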