I have a DataFrame which holds the data I want to plot in a scatter plot.
The DataFrame has far more information than only the columns needed for the scatter x and y data.
I want to show the additional data on hover (which is not the problem), but also, when I tap-select one data point in the scatter plot, the additional data in the other columns of the ColumnDataSource should be plotted in a bar plot.
My main problem is getting the bar plot to accept the data stored in the one selected row of the ColumnDataSource.
Everything I have seen uses column based data to feed it to the bar plot.
I was halfway through a workaround in which I take the selected row of the ColumnDataSource, transform it back to a DataFrame, transpose it (so it is column-based), and then convert it back to a ColumnDataSource, but this cannot be what the creators of Bokeh intended, right?
I stripped my problem down to a minimal code snippet:
import pandas as pd
from bokeh.events import Tap
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, TapTool
from bokeh.plotting import figure

df = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6],
                   "y": [6, 5, 4, 3, 2, 1],
                   "cat1": [11, 12, 13, 14, 15, 16],
                   "cat2": [100, 99, 98, 97, 96, 95]})
SRC = ColumnDataSource(df)

def Plot(doc):

    def callback(event):
        # Indices of the rows currently selected in the scatter plot
        SELECTED = SRC.selected.indices
        bplot = make_bPlot(SELECTED)

    def make_bPlot(selected):
        # Here is my question:
        # How to feed the row-wise data of the SRC to the barplot?
        b = figure(x_range=["cat1", "cat2"])
        b.vbar(x=["cat1", "cat2"], top=["cat1", "cat2"], source=SRC)
        return b

    TOOLTIPS = [
        ("x", "@x"),
        ("y", "@y"),
        ("Category 1", "@cat1"),
        ("Category 2", "@cat2")]
    TOOLS = "pan,wheel_zoom,zoom_in,zoom_out,box_zoom,reset,tap"
    cplot = figure(tools=TOOLS, tooltips=TOOLTIPS)
    cplot.circle("x", "y", source=SRC)
    bplot = make_bPlot(None)  # init
    taptool = cplot.select(type=TapTool)
    cplot.on_event(Tap, callback)
    layout = column(cplot, bplot)
    doc.add_root(layout)
Thanks in advance.
I got my answer from the Bokeh Discourse Forum:
https://discourse.bokeh.org/t/tap-on-scatter-to-show-additional-data-in-bar-plot/6939
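The linked thread is not reproduced here. As a minimal sketch of one common way to approach this (an assumption on my part, not necessarily the answer given in the thread): keep a second, column-based data source for the bar plot and overwrite its data whenever the selection changes. The helper `row_to_bar_data` below is hypothetical and works on a plain dict shaped like `ColumnDataSource.data`, so it runs without Bokeh.

```python
def row_to_bar_data(data, index, categories):
    """Turn one row of column-oriented data into bar-plot columns.

    `data` maps column names to sequences (the shape of ColumnDataSource.data),
    `index` is the selected row, and `categories` are the columns to show as bars.
    """
    return {
        "category": list(categories),
        "value": [data[name][index] for name in categories],
    }


# Same numbers as the DataFrame in the question.
data = {
    "x": [1, 2, 3, 4, 5, 6],
    "y": [6, 5, 4, 3, 2, 1],
    "cat1": [11, 12, 13, 14, 15, 16],
    "cat2": [100, 99, 98, 97, 96, 95],
}

# Row 2 selected: one bar per category column.
bar_data = row_to_bar_data(data, 2, ["cat1", "cat2"])
print(bar_data)  # {'category': ['cat1', 'cat2'], 'value': [13, 98]}
```

In the Bokeh app, a separate ColumnDataSource (called `bar_src` here, a made-up name) could back `b.vbar(x="category", top="value", source=bar_src)`, and the callback would assign `bar_src.data = row_to_bar_data(SRC.data, SRC.selected.indices[0], ["cat1", "cat2"])` whenever the selection is non-empty. Listening with `SRC.selected.on_change("indices", ...)` instead of the Tap event is a common alternative.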
I have a plot which uses US states to map symbols. I currently assign symbols using the "state" column in my dataframe so that I can select particular states of interest by clicking or double clicking on the Plotly Express legend. This part is working fine. However, the symbol mapping I'm using also communicates information about territory, e.g. triangle-down means poor coverage in that state and many states will share this symbol. I would like to add another legend that shows what each shape means. How can I do this in Plotly Express? Alternatively, is there a way to display symbols in a footnote? I could also give the symbol definitions there.
The goal is to display that circle=Medium coverage, triangle-down=poor coverage, etc. in addition to the individual state legend I already have. If the legend is clickable such that I can select entire groups based on the symbol shape that would be the best possible outcome.
Thank you for any tips!
I tried using html and footnotes to display the symbols but it did not work.
As noted in the comment, this can be achieved with additional traces on different axes.
I have simulated some data that matches what is implied in the image and comments.
From the scatter figure, extract how symbols and colors have been assigned to states.
Build another scatter that is effectively a legend.
import pandas as pd
import numpy as np
import plotly.express as px
df_s = pd.read_html(
"https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States"
)[1].iloc[:, 0:2]
df_s.columns = ["name", "state"]
# generate a dataframe that matches structure in image and question
df = pd.DataFrame(
{"activity_month": pd.date_range("1-jan-2020", "today", freq="W")}
).assign(
value=lambda d: np.random.uniform(0, 1, len(d)),
state=lambda d: np.random.choice(df_s["state"], len(d)),
)
# straight forward scatter
fig = px.scatter(df, x="activity_month", y="value", symbol="state", color="state")
# extract out how symbols and colors have been assigned to states
df_symbol = pd.DataFrame(
[
{"symbol": t.marker.symbol, "state": t.name, "color": t.marker.color}
for t in fig.data
]
).assign(y=lambda d: d.index//20, x=lambda d: d.index%20)
# build a figure that is effectively the legend
fig_legend = px.scatter(
df_symbol,
x="x",
y="y",
symbol="symbol",
color="state",
text="state",
color_discrete_sequence=df_symbol["color"]
).update_traces(textposition="middle right", showlegend=False, xaxis="x2", yaxis="y2")
# insert legend into scatter and format axes
fig.add_traces(fig_legend.data).update_layout(
yaxis_domain=[.15, 1],
yaxis2={"domain": [0, .15], "matches": None, "visible": False},
xaxis2={"visible":False},
xaxis={"position":0, "anchor":"free"},
showlegend=False
)
I'm trying to make an animated plotly graph within R where both axes' min and max change during the animation. Someone recently asked this exact question for python, and it was solved by using layout update and looping through the "frames."
Python solution here: Is there a way to dynamically change a plotly animation axis scale per frame?
Here is the python code.
import plotly.express as px
df = px.data.gapminder()
df = df[(df['continent'] == 'Asia') & (df['year'].isin([1997, 2002, 2007]))]
scales = [2002]
fig = px.scatter(df, x="gdpPercap", y="lifeExp", animation_frame="year", animation_group="country",
size="pop", color="continent", hover_name="country",
log_x=True, size_max=55, range_x=[100,100000], range_y=[25,90])
yranges = {2002:[0, 200]}
for f in fig.frames:
    if int(f.name) in yranges.keys():
        f.layout.update(yaxis_range = yranges[int(f.name)])
fig.show()
I've been looking through the R examples and the R plotly object, and I can't figure out... how would I apply this solution to R?
There's a solution using Shiny here, but the animation is much less smooth, it doesn't have the slider, and I don't know anything about shiny, so I'd rather avoid it if I could: https://community.plotly.com/t/what-is-the-most-performant-way-to-update-a-graph-with-new-data/639/6?u=fabern
In case it's useful, I wrote up my code to start recreating the example in R, as far as I could get. Maybe I need to be using Plotly Express or Dash or some such?
library("plotly")
library("gapminder")
library('data.table')
data(package = "gapminder")
df <- as.data.table(as.data.frame(gapminder))
df <- df[continent=='Asia' & year %in% c(1997, 2002, 2007)]
graph_animated <- plot_ly(df, x = ~gdpPercap, y = ~lifeExp, frame = ~year, size = ~pop,
type = 'scatter', color = ~continent)
The easiest way to do this is to build the plot and then add the change. You have to build first, in order to see the frames you want to modify.
# build
plt <- plotly_build(graph_animated)
# modify the layout for the second frame
plt$x$frames[[2]]$layout = list(yaxis = list(range = c(0, 200)))
After that, you can visualize it as you would any other plot.
I have several columns (e.g., Column, y1, y2, y3, ...) that I need to plot against column "X" in a scatter plot in Altair. I have included a dropdown to select among the "y" columns; however, the plot fails to change according to the selection. How can I make the y-axis selection responsive? Here is the code:
# CHART 1
input_dropdown = alt.binding_select(options = \
np.array(df.drop(["Student IDs", "Average Marks"],
axis = 1).columns),
name = "Module")
selection = alt.selection_single(bind = input_dropdown)
# plot the first chart
chart1 = alt.Chart(df).mark_point().encode(
x = "Average Marks",
y = "CSE103"
).add_selection(
selection)
chart1
This is not directly supported in Vega-Lite. You can add a thumbs-up and subscribe to this issue to find out when/if it is implemented: https://github.com/vega/vega-lite/issues/7365.
In the meantime, you could work around it using the same approach as in "Altair heatmap with dropdown variable selector", where the data frame is first melted (but you can't dynamically change the axis title).
import altair as alt
from vega_datasets import data
df = data.cars().melt(id_vars=['Origin', 'Name', 'Year', 'Horsepower'])
dropdown_options = df['variable'].drop_duplicates().tolist()
dropdown = alt.binding_select(options=dropdown_options, name='X-axis column ')
selection = alt.selection_single(
fields=['variable'],
init={'variable': dropdown_options[0]},
bind=dropdown
)
alt.Chart(df).mark_circle().encode(
x=alt.X('value:Q', title=''),
y='Horsepower',
color='Origin',
).add_selection(
selection
).transform_filter(
selection
)
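To make the melt step concrete, here is a plain-Python sketch of the wide-to-long reshaping and of the filtering that the dropdown then performs. The column names and values are made up for illustration, and `melt_rows` is a hypothetical helper, not a pandas or Altair function.

```python
def melt_rows(rows, id_vars):
    """Reshape wide rows into long rows with 'variable' and 'value' keys."""
    long_rows = []
    for row in rows:
        # Columns that stay fixed and are repeated on every long row.
        ids = {key: row[key] for key in id_vars}
        for key, value in row.items():
            if key not in id_vars:
                long_rows.append({**ids, "variable": key, "value": value})
    return long_rows


# Made-up wide data: one row per student, one column per module.
wide = [
    {"Average Marks": 70, "CSE103": 65, "CSE104": 75},
    {"Average Marks": 55, "CSE103": 50, "CSE104": 60},
]

long_rows = melt_rows(wide, id_vars=["Average Marks"])

# Choosing "CSE104" in the dropdown corresponds to keeping only these rows.
selected = [r for r in long_rows if r["variable"] == "CSE104"]
print(selected)
```

Because every module now lives in the same `value` column, the chart can keep a fixed encoding (`value:Q`) and let the selection filter on `variable`, which is why the axis title cannot follow the dropdown.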
I'm plotting a line graph in Altair (4.1.0) and would like to use direct labeling (annotations) instead of a regular legend.
As such, the text mark for each line (say, time series) should appear only once and at the right-most point of the x-axis (as opposed to this scatter plot example labeling every data point).
While I'm able to use pandas to manipulate the data to get the desired results, I think it would be more elegant to have a pure-Altair implementation, but I can't seem to get it right.
For example, given the following data:
import numpy as np
import pandas as pd
import altair as alt
np.random.seed(10)
n = 20  # number of time points; assumed value, the snippet leaves n undefined
time = pd.date_range(start="10/21/2020", end="10/22/2020", periods=n)
data = pd.concat([
pd.DataFrame({
"time": time,
"group": "One",
"value": np.random.normal(10, 2, n)}),
pd.DataFrame({
"time": time,
"group": "Two",
"value": np.random.normal(5, 2, n)}).iloc[:-1]
], ignore_index=True)
I can generate a satisfactory result using pandas to create a subset that includes the last time-point for each group:
lines = alt.Chart(data).mark_line(
point=True
).encode(
x="time:T",
y="value:Q",
color=alt.Color("group:N", legend=None), # Remove legend
)
text_data = data.loc[data.groupby('group')['time'].idxmax()] # Subset the data for text positions
labels = alt.Chart(text_data).mark_text(
# some adjustments
).encode(
x="time:T",
y="value:Q",
color="group:N",
text="group:N"
)
chart = lines + labels
However, if I try to use the main data and add Altair aggregations, for example using x=max(time) or explicit transform_aggregate(), I either get text annotations on all points or none at all (respectively).
Is there a better way to obtain the above result?
You can do this using an argmax aggregate in the y encoding. For example, your labels layer might look like this:
labels = alt.Chart(data).mark_text(
align='left', dx=5
).encode(
x='max(time):T',
y=alt.Y('value:Q', aggregate={'argmax': 'time'}),
text='group:N',
color='group:N',
)
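For intuition, the argmax aggregate roughly picks, within each group, the row where time is largest and reports that row's value. A plain-Python sketch of that selection on made-up numbers (the helper name `last_point_per_group` is hypothetical):

```python
def last_point_per_group(rows):
    """For each group, keep the row with the largest 'time'."""
    latest = {}
    for row in rows:
        group = row["group"]
        if group not in latest or row["time"] > latest[group]["time"]:
            latest[group] = row
    return latest


# Made-up data; group "Two" is one point shorter, as in the question.
rows = [
    {"time": 1, "group": "One", "value": 10.5},
    {"time": 2, "group": "One", "value": 9.0},
    {"time": 3, "group": "One", "value": 11.2},
    {"time": 1, "group": "Two", "value": 5.1},
    {"time": 2, "group": "Two", "value": 4.3},
]

labels = last_point_per_group(rows)
print(labels["One"]["value"], labels["Two"]["value"])  # 11.2 4.3
```

Each group gets its label at its own last time point, so a series that ends early is still labeled at the end of its line.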
I am trying to generate 2 plots in Altair that share the same selection.
I would like to plot scatter and bar charts of population (y) vs. age (x). I am using the Altair built-in dataset population. The population is the sum of the people column in this dataset. The dataset has columns for year, people, age and sex. I can get the total population using sum(people) and plot this as y against age. For the bar chart, I can similarly plot sum(people) versus age and color using the sex column.
I am trying to set up a brush/selection between these 2 plots so that I can highlight points in the scatter plot and have the bar plot update simultaneously to reflect that selection. However, I am stuck with the following problem.
I am using the layered bar graph example from the Altair documentation for the example.
Here is the code
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
df = data.population.url
scatter = alt.Chart(df).mark_point().encode(
alt.X('age:O', axis=alt.Axis(title='')),
y='sum(people)',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(
selection=interval
)
bar = alt.Chart(df).mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('sum(people)', stack=None),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(height=100, width=400)
scatter & bar
I have modified the code in the documentation example. I am first creating a scatter plot and then using the color based on the selection. Then I define a bar plot of the same 2 columns and again use the selection to specify the color. Here is the output
Now, I would like to drag a box across the top (scatter) plot to select some points and simultaneously the bottom (bar) chart should update based on the selection. When I drag in the top plot to make my selection, this happens
Problems
After dragging to make a selection in the top plot, the colors (inside and outside the selection) in both plots are changed to lightgrey. I expected that, in both plots, the points inside the selection/brush would be highlighted and those outside would be lightgrey.
How can I get a selection that is highlighted in both the top and bottom plots simultaneously?
EDIT
I want this behaviour, where a brush/selection in one plot is simultaneously highlighted in a 2nd (linked) plot.
Package versions:
Python = 3.6
Altair = 2.2
Jupyter = 5.6
To trigger a selection on an aggregated value, the best approach is to use an aggregate transform to define that quantity so that it is available to the entire chart.
Here is an example:
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
base = alt.Chart(data.population.url).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
scatter = base.mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = base.mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
scatter & bar
Note that I took away the filtering by the interval selection on the lower plot, because that's not the behavior you described.
Based on (and adapting) the answer by @jakevdp above, I tried something similar to this example from the Interactive Charts section of the docs gallery.
Instead of using a base object, I used the vconcat function, which joins the Chart instances and passes the transform and data to the vconcat object. Here is the approach
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
scatter = alt.Chart().mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = alt.Chart().mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
alt.vconcat(scatter, bar,
data=data.population.url
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
This approach appears to give the same functionality as @jakevdp's answer, i.e., a selection can be made in the scatter (top) plot and this will be reflected in the bar chart (bottom), as required.