Color points in scatter plot of Bokeh - python

I have the following simple pandas.DataFrame:
df = pd.DataFrame(
{
"journey": ['ch1', 'ch2', 'ch2', 'ch1'],
"cat": ['a', 'b', 'a', 'c'],
"kpi1": [1,2,3,4],
"kpi2": [4,3,2,1]
}
)
Which I plot as follows:
import bokeh.plotting as bpl
import bokeh.models as bmo
bpl.output_notebook()
source = bpl.ColumnDataSource.from_df(df)
hover = bmo.HoverTool(
    tooltips=[
        ("index", "$index"),
        ('journey', '@journey'),
        ("Cat", '@cat')
    ]
)
p = bpl.figure(tools=[hover])
p.scatter(
'kpi1',
'kpi2', source=source)
bpl.show(p) # open a browser
I cannot work out how to color-code the dots by the cat column. Ultimately, I want the first and third points to share one color, and the second and fourth points to each have a different color.
How can I achieve this using Bokeh?

Here's a way that avoids manual mapping to some extent. I recently stumbled on bokeh.palettes in one GitHub issue and on CategoricalColorMapper in another; this approach combines them. The Bokeh documentation has the full list of available palettes and the CategoricalColorMapper details.
I had issues getting this to work directly on a pd.DataFrame, and also found it didn't work using your from_df() call. The docs show passing a DataFrame directly to ColumnDataSource, and that worked for me.
import pandas as pd
import bokeh.plotting as bpl
import bokeh.models as bmo
from bokeh.palettes import d3
bpl.output_notebook()
df = pd.DataFrame(
{
"journey": ['ch1', 'ch2', 'ch2', 'ch1'],
"cat": ['a', 'b', 'a', 'c'],
"kpi1": [1,2,3,4],
"kpi2": [4,3,2,1]
}
)
source = bpl.ColumnDataSource(df)
# use whatever palette you want...
palette = d3['Category10'][len(df['cat'].unique())]
color_map = bmo.CategoricalColorMapper(factors=df['cat'].unique(),
palette=palette)
# create figure and plot
p = bpl.figure()
p.scatter(x='kpi1', y='kpi2',
color={'field': 'cat', 'transform': color_map},
legend='cat', source=source)
bpl.show(p)

For the sake of completeness, here is the adapted code using the low-level plotting API:
import pandas as pd
import bokeh.plotting as bpl
import bokeh.models as bmo
bpl.output_notebook()
df = pd.DataFrame(
{
"journey": ['ch1', 'ch2', 'ch2', 'ch1'],
"cat": ['a', 'b', 'a', 'c'],
"kpi1": [1,2,3,4],
"kpi2": [4,3,2,1],
"color": ['blue', 'red', 'blue', 'green']
}
)
df
source = bpl.ColumnDataSource.from_df(df)
hover = bmo.HoverTool(
    tooltips=[
        ('journey', '@journey'),
        ("Cat", '@cat')
    ]
)
p = bpl.figure(tools=[hover])
p.scatter(
'kpi1',
'kpi2', source=source, color='color')
bpl.show(p)
Note that the colors are "hard-coded" into the data.
Here is the alternative using the high-level chart API:
import pandas as pd
import bokeh.plotting as bpl
import bokeh.charts as bch
bpl.output_notebook()
df = pd.DataFrame(
{
"journey": ['ch1', 'ch2', 'ch2', 'ch1'],
"cat": ['a', 'b', 'a', 'c'],
"kpi1": [1,2,3,4],
"kpi2": [4,3,2,1]
}
)
tooltips=[
    ('journey', '@journey'),
    ("Cat", '@cat')
]
scatter = bch.Scatter(df, x='kpi1', y='kpi2',
                      color='cat',
                      legend="top_right",
                      tooltips=tooltips
                      )
bch.show(scatter)

You could use the higher-level Scatter chart, or provide a color column to the ColumnDataSource and reference it in your call: p.scatter(..., color='color_column_label').
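The color-column suggestion can be sketched in plain Python. This is a minimal illustration, not Bokeh API: `add_color_column` and `PALETTE` are hypothetical names, and the palette values are arbitrary. It assumes the data is held as a dict of equal-length lists, which is one of the forms a ColumnDataSource can be built from.

```python
# Hypothetical palette; any list of CSS color names or hex strings would do.
PALETTE = ["blue", "red", "green", "orange", "purple"]


def add_color_column(data, category_key, color_key="color", palette=PALETTE):
    """Return a copy of `data` (a dict of equal-length lists) with a color column.

    Each distinct category gets the next palette color, in order of first
    appearance, so equal categories always share a color.
    """
    mapping = {}
    for value in data[category_key]:
        if value not in mapping:
            if len(mapping) >= len(palette):
                raise ValueError("palette has fewer colors than categories")
            mapping[value] = palette[len(mapping)]
    result = dict(data)  # shallow copy, so the input dict is left untouched
    result[color_key] = [mapping[value] for value in data[category_key]]
    return result


data = {
    "cat": ["a", "b", "a", "c"],
    "kpi1": [1, 2, 3, 4],
    "kpi2": [4, 3, 2, 1],
}
colored = add_color_column(data, "cat")
print(colored["color"])
```

The resulting dict could then be wrapped in a ColumnDataSource and the new column referenced with color='color' in p.scatter, as in the hard-coded example above.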

Related

Set individual wedge hatching for pandas pie chart

I am trying to make pie charts where some of the wedges have hatching and some of them don't, based on their content. The data consists of questions and yes/no/in progress answers, as shown below in the MWE.
import pandas as pd
import matplotlib.pyplot as plt
raw_data = {'Q1': ['IP', 'IP', 'Y/IP', 'Y', 'IP'],
'Q2': ['Y', 'Y', 'Y', 'Y', 'N/IP'],
'Q3': ['N/A', 'IP', 'Y/IP', 'N', 'N']}
df = pd.DataFrame(raw_data, columns = ['Q1', 'Q2', 'Q3'])
df= df.astype('string')
colors={'Y':'green',
'Y/IP':'greenyellow',
'IP':'orange',
'N/IP':'gold',
'N':'red',
'N/A':'grey'
}
for i in df.columns:
    pie = df[i].value_counts().plot.pie(colors=[colors[v] for v in df[i].value_counts().keys()])
    fig = pie.get_figure()
    fig.savefig("D:/windows/"+i+"test.png")
    fig.clf()
However, instead of greenyellow and gold I am trying to make the wedges green with yellow hatching, and yellow with red hatching, like so (note the below image does not match the data from the MWE):
I had a look online and am aware I will likely have to split the pie(s) into individual wedges but can't seem to get that to work alongside the pandas value counts. Any help would be massively appreciated. Thanks!
This snippet shows how to add hatching in custom colors to a pie chart. You can take the result of pandas value_counts() (a Series) and use it with the snippet I have provided.
I have added the hatch color as a second entry for each label in the color dictionary:
import matplotlib.pyplot as plt
colors={'Y' :['green', 'lime'],
'IP': ['orange', 'red'],
'N' : ['red', 'cyan']}
labels=['Y', 'N', 'IP']
wedges, _ = plt.pie(x=[1, 2, 3], labels=labels)
for pie_wedge in wedges:
    pie_wedge.set_edgecolor(colors[pie_wedge.get_label()][1])
    pie_wedge.set_facecolor(colors[pie_wedge.get_label()][0])
    pie_wedge.set_hatch('/')
plt.legend(wedges, labels, loc="best")
plt.show()
The result looks like so:

How to add annotation to heatmap cells?

This is a follow-up question of this one.
I would like to add text to the cells in the heatmap. I thought I could use LabelSet as described here. However, unfortunately, I don't see any labels when I run the following code:
import pandas as pd
from bokeh.io import show
from bokeh.models import (CategoricalColorMapper, LinearColorMapper,
BasicTicker, PrintfTickFormatter, ColorBar,
ColumnDataSource, LabelSet)
from bokeh.plotting import figure
from bokeh.palettes import all_palettes
from bokeh.transform import transform
df = pd.DataFrame({
'row': list('xxxxxxyyyyyyzzzzzz'),
'column': list('aabbccaabbccaabbcc'),
'content': ['c1', 'c2', 'c3', 'c1', 'c2', 'c3'] * 3,
'amount': list('123212123212123212')})
df = df.drop_duplicates(subset=['row', 'column'])
source = ColumnDataSource(df)
rows = df['row'].unique()
columns = df['column'].unique()
content = df['content'].unique()
colors = all_palettes['Viridis'][max(len(content), 3)]
mapper = CategoricalColorMapper(palette=colors, factors=content)
TOOLS = "hover,save,pan,box_zoom,reset,wheel_zoom"
p = figure(title="My great heatmap",
           x_range=rows, y_range=columns,
           x_axis_location="above", plot_width=600, plot_height=400,
           tools=TOOLS, toolbar_location='below',
           tooltips=[('cell content', '@content'), ('amount', '@amount')])
p.grid.grid_line_color = None
p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_text_font_size = "5pt"
p.axis.major_label_standoff = 0
p.rect(x="row", y="column", width=1, height=1,
source=source,
fill_color=transform('content', mapper))
labels = LabelSet(x='row', y='column', text='content', level='glyph',
x_offset=1, y_offset=1, source=source,
render_mode='canvas')
p.add_layout(labels)
show(p)
I see the heatmap, but no labels. How can I display the text?
There are five render levels: "image, underlay, glyph, annotation, overlay". The level of p.rect is glyph. If you don't set the level argument of LabelSet, its level is annotation, which is drawn on top of the glyph level.
Interestingly, the OP's code worked for me. I came here because I had the same problem, and it turned out that the annotation data has to be strings. After converting the respective column in the ColumnDataSource to strings, my annotations (numbers) showed up in the heatmap.
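A minimal sketch of that conversion, assuming the data is held as a plain dict of lists before it is handed to ColumnDataSource. `with_string_column` and the column name `amount_label` are hypothetical, not part of Bokeh.

```python
def with_string_column(data, source_key, label_key):
    """Return a copy of `data` with `label_key` holding str() of each value in `source_key`."""
    result = dict(data)  # shallow copy, so the input dict is left untouched
    result[label_key] = [str(value) for value in data[source_key]]
    return result


# Made-up sample data with a numeric column that should be shown as text.
data = {"row": ["x", "x", "y"], "column": ["a", "b", "a"], "amount": [1, 2, 3]}
labelled = with_string_column(data, "amount", "amount_label")
print(labelled["amount_label"])
```

The new column name would then be passed to LabelSet as text='amount_label'. With a pandas DataFrame, df['amount'].astype(str) does the same conversion.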

Bokeh: how to add legend to patches glyph with GeoJSONDataSource and CategoricalColorMapper?

I'm trying to add a legend to a Bokeh patches figure, but I end up with only one legend item (and with the wrong label).
I have a shape file with polygons. Each polygon has an attribute called 'category', which can take the values 'A', 'B', 'C', 'D' and 'E'. I convert the shape file to geojson and subsequently create a Bokeh patches figure, using CategoricalColorMapper to add a colour to each polygon depending on the 'category' it is in. Now I want the legend to show the five category options and their respective colours.
Here's my code:
import geopandas as gpd
from bokeh.io import show, output_notebook, output_file, export_png
from bokeh.models import GeoJSONDataSource, CategoricalColorMapper, Legend, LegendItem
from bokeh.plotting import figure, reset_output
from bokeh.transform import factor_cmap
import selenium
import numpy as np
gdf = gpd.GeoDataFrame.from_file("test.shp")
gdf_json = gdf.to_json()
source_shape = GeoJSONDataSource(geojson=gdf_json)
cmap = CategoricalColorMapper(palette=["black", "purple", "pink", "brown", "blue"], factors=['A','B','C','D', 'E'])
p = figure(height=500, match_aspect=True,
h_symmetry=False, v_symmetry=False, min_border=0)
p.patches('xs', 'ys', source=source_shape, fill_color={'field': 'category', 'transform': cmap},
line_color='black', line_width=0.5, legend='category')
export_png(p, filename="map.png")
However, the output I get is as follows:
The legend shows only one item, with the label 'category' rather than the actual category names. How can I fix this such that the legend shows all 5 categories with their labels (A,B,C,D,E)?
This code does what you want; however, I think it could be easier to manipulate the GeoDataFrame directly instead of converting to JSON. This code is compatible with Bokeh v1.0.4.
from bokeh.models import GeoJSONDataSource, CategoricalColorMapper
from bokeh.plotting import figure, show
from bokeh.io import export_png
import geopandas as gpd
import random
import json
gdf = gpd.GeoDataFrame.from_file("Judete/Judete.shp")
gdf_json = gdf.to_json()
gjson = json.loads(gdf_json)
categories = ['A', 'B', 'C', 'D', 'E']
# assign a random category to every feature (demo data only)
for item in gjson['features']:
    item['properties']['category'] = random.choice(categories)
# build one empty FeatureCollection per category
source_shapes = {}
for category in categories:
    source_shapes[category] = {"type": "FeatureCollection", "features": []}
# sort each feature into the collection for its category
for item in gjson['features']:
    source_shapes[item['properties']['category']]['features'].append(item)
p = figure(match_aspect = True, min_border = 0,
           h_symmetry = False, v_symmetry = False,
           x_axis_location = None, y_axis_location = None)
cmap = CategoricalColorMapper(palette = ["orange", "purple", "pink", "brown", "blue"],
                              factors = ['A', 'B', 'C', 'D', 'E'])
# one patches glyph per category, so each category gets its own legend item
for category in categories:
    source_shape = GeoJSONDataSource(geojson = json.dumps(source_shapes[category]))
    p.patches('xs', 'ys', fill_color = {'field': 'category', 'transform': cmap},
              line_color = 'black', line_width = 0.5,
              legend = category, source = source_shape,)
p.legend.click_policy = 'hide'
show(p) # export_png(p, filename = "map.png")
It seems that the legend does not currently work with GeoJSONDataSource: there is an open issue, "Legend not working with GeoJSONDataSource" (#5904), that is still unresolved.
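The per-category split used in the workaround above can be factored into a small standard-library helper. This is only a sketch: `split_features_by_property` is a hypothetical name, and the sample features are made up for illustration. Unlike the code above, it only creates a group for property values that actually occur in the data.

```python
import json


def split_features_by_property(geojson_text, prop):
    """Group the features of a GeoJSON FeatureCollection by one property value.

    Returns a dict mapping each property value to the JSON text of a
    FeatureCollection containing only the features with that value.
    """
    collection = json.loads(geojson_text)
    groups = {}
    for feature in collection["features"]:
        key = feature["properties"][prop]
        group = groups.setdefault(key, {"type": "FeatureCollection", "features": []})
        group["features"].append(feature)
    return {key: json.dumps(group) for key, group in groups.items()}


# Tiny hand-written input: three point features, made up for illustration.
example = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"category": "A"},
         "geometry": {"type": "Point", "coordinates": [0, 0]}},
        {"type": "Feature", "properties": {"category": "B"},
         "geometry": {"type": "Point", "coordinates": [1, 1]}},
        {"type": "Feature", "properties": {"category": "A"},
         "geometry": {"type": "Point", "coordinates": [2, 2]}},
    ],
})
by_category = split_features_by_property(example, "category")
print(sorted(by_category))
```

Each returned string could then be passed as GeoJSONDataSource(geojson=...) inside the per-category loop, one patches call per key.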

How to use a colorscale palette with plotly and python?

I am trying to change the colors of a stacked bar chart that I draw in Python with plotly and cufflinks (the cufflinks library lets you draw charts directly from a dataframe, which is super useful).
Let's take the following figure (I use jupyter notebook):
import numpy as np
import pandas as pd
import plotly.plotly as py
import cufflinks as cf
cf.set_config_file(offline=True, world_readable=True, theme='white')
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
df.iplot(kind='bar', barmode='stack')
How do you implement a new color palette using the above code? I would like to use the 'Viridis' color palette. I haven't found a way to modify the colors of the graph, or to use a color palette to automatically give each stack of the bar chart a different color. Does anyone know how to do it?
Many thanks for your help,
import plotly.graph_objs as go
# foo and bar stand for your own x and y data
trace0 = go.Scatter(
    x = foo,
    y = bar,
    name = 'baz',
    line = dict(
        color = ('rgb(6, 12, 24)'),
        width = 4)
)
This allows you to change the color of the line, or you could use
colors = ['rgb(67,67,67)', 'rgb(115,115,115)', 'rgb(49,130,189)', 'rgb(189,189,189)']
for separate lines of a graph. To use the specified color gradient, try
data = [
    go.Scatter(
        y=[1, 1, 1, 1, 1],
        marker=dict(
            size=12,
            cmax=4,
            cmin=0,
            color=[0, 1, 2, 3, 4],
            colorbar=dict(
                title='Colorbar'
            ),
            colorscale='Viridis'
        ),
        mode='markers')
]
Found an answer to my problem:
import numpy as np
import pandas as pd
import plotly.offline as pyo
import cufflinks as cf
from bokeh.palettes import viridis
cf.set_config_file(offline=True, world_readable=True, theme='white')
colors = viridis(4)
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
fig = df.iplot(kind='bar', barmode='stack',asFigure = True)
# recolor each trace (and its outline) with one palette color
for i,color in enumerate(colors):
    fig['data'][i]['marker'].update({'color': color})
    fig['data'][i]['marker']['line'].update({'color': color})
pyo.iplot(fig)
To build upon the answer of @Peslier53:
You can specify colors or a colorscale directly within df.iplot():
import numpy as np
import pandas as pd
import plotly.plotly as py
import cufflinks as cf
from bokeh.palettes import viridis
cf.set_config_file(offline=True, world_readable=True, theme='white')
colors = viridis(4)
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
df.iplot(kind='bar', barmode='stack', colors = colors)
This saves you some lines of code and makes plotting very convenient.
It also works with any list of colors, so you can use custom colors too. What is accepted depends on the graph type: heat maps, for example, need a color gradient instead of a color list.
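As an illustration of a custom color list, here is a small sketch that builds evenly spaced colors between two endpoints. `interpolate_colors` is a hypothetical helper, the endpoint values are arbitrary, and it assumes the plotting call accepts 'rgb(r,g,b)' strings, the format used earlier in this thread.

```python
def interpolate_colors(start, end, n):
    """Return n 'rgb(r,g,b)' strings evenly spaced from start to end (inclusive).

    `start` and `end` are (r, g, b) tuples of integers in the range 0-255.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return ["rgb({},{},{})".format(*start)]
    colors = []
    for i in range(n):
        t = i / (n - 1)  # position between start (0.0) and end (1.0)
        channels = [round(s + (e - s) * t) for s, e in zip(start, end)]
        colors.append("rgb({},{},{})".format(*channels))
    return colors


colors = interpolate_colors((0, 0, 0), (200, 100, 50), 3)
print(colors)
```

A list built this way, with one entry per column, could be passed as the colors argument in place of viridis(4).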

How to shift the tick labels between two ticks python plotly

I have created a bar plot using plotly. An xticklabel is under each bar. Is it possible to shift the xticklabels a bit to the right or the left, or even to the middle between two ticks?
import plotly
import pandas as pd
from plotly.graph_objs import *
json_file = {'y': [0, 1, 2, 3, 1]}
df = pd.DataFrame(json_file, index=['a', 'b', 'c', 'd', 'e'])
trace1 = Bar(
    x=df.index,
    y=df['y'])
layout = Layout(
    xaxis=XAxis(
        ticks=df.index,
        tickvals=df.index))
data = Data([trace1])
fig = Figure(data=data, layout=layout)
plotly.offline.plot(fig)
A part of the result is the following:
Is there a way to place b between the two bars?
