Interactive plot selection in Altair does not hi-light points - python

I am trying to generate 2 plots in Altair that share the same selection.
I would like to plot scatter and bar charts of population (y) vs Age (x). I am using the Altair built-in dataset population. The population is the sum of the people column in this dataset. The dataset has columns for year, people, age and sex. I can get total populate using sum(people) and plot this as y against age. For the bar chart, I can plot similarly sum(people) versus age and color using the sex column.
I am trying to set up a brush/selection between these 2 plots so that I can hilight in the scatter plot and simultaneously the bar plot is updated to reflect that selection. However, I am stuck with the following problem
I am using the layered bar graph example from the Altair documentation for the example.
Here is the code
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
df = data.population.url
scatter = alt.Chart(df).mark_point().encode(
alt.X('age:O', axis=alt.Axis(title='')),
y='sum(people)',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(
selection=interval
)
bar = alt.Chart(df).mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('sum(people)', stack=None),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(height=100, width=400)
scatter & bar
I have modified the code in the documentation example. I am first creating a scatter plot and then using the color based on the selection. Then I define a bar plot of the same 2 columns and again use the selection to specify the color. Here is the output
Now, I would like to drag a box across the top (scatter) plot to select some points and simultaneously the bottom (bar) chart should update based on the selection. When I drag in the top plot to make my selection, this happens
Problems
After dragging to make a selection in the top plot, the colors (inside and outside the selection) in both plots are changed to lightgrey. I expected, in both plots, inside the selection/brush to be hilighted but outside should be lightgrey.
How can I get a selection that is hilighted in both the top and bottom plots simultaneously?
EDIT
I want this behaviour, where a brush/selection in one plot is simultaneously hilighted in a 2nd (linked) plot.
Package versions:
Python = 3.6
Altair = 2.2
Jupyter = 5.6

To trigger a selection on an aggregated value, the best approach is to use an aggregate transform to define that quantity so that it is available to the entire chart.
Here is an example:
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
base = alt.Chart(data.population.url).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
scatter = base.mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = base.mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
scatter & bar
Note that I took away the filtering by the interval selection on the lower plot, because that's not the behavior you described.

Based on (and adapting) the answer by #jakevdp above, I tried something similar to this example from the Interactive Charts section of the docs gallery.
Instead of using a base object, I used the vconcat function, which joins the Chart instances and passes the transform and data to the vconcat object. Here is the approach
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
scatter = alt.Chart().mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = alt.Chart().mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
alt.vconcat(scatter, bar,
data=data.population.url
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
This approach appears to give the same functionality as #jakevdp's answer.i.e. a selection can be made to the scatter (top) plot and this will be reflected in the bar chart (bottom), as required.

Related

how to create a grouped bar chart in streamlit

how to create grouped bar chart in Streamlit
I have tried st.altair(chart) method to get the answer but still it shows the stacked bar chart instead of grouped bar chart
You should be able to create the bar chart with Altair and then just pass it to Streamlit:
chart = alt.Chart(prediction_table2, title='Simulated (attainable) and predicted yield ').mark_bar(
opacity=1,
).encode(
column = alt.Column('date:O', spacing = 5, header = alt.Header(labelOrient = "bottom")),
x =alt.X('variable', sort = ["Actual_FAO", "Predicted", "Simulated"], axis=None),
y =alt.Y('value:Q'),
color= alt.Color('variable')
).configure_view(stroke='transparent')
st.altair_chart(chart)

Make dropdown selection responsive for y axis Altair python

I have several columns (eg, Column, y1, y2, y3..) that I need to relate to column "X" on a scatter plot in Altair. I have included a dropdown combo box to make the selection between the "y" columns however the plots fail to change according to the selection. How can I make the y-axis selection responsive? Here is the code
# CHART 1
input_dropdown = alt.binding_select(options = \
np.array(df.drop(["Student IDs", "Average Marks"],
axis = 1).columns),
name = "Module")
selection = alt.selection_single(bind = input_dropdown)
# plot the first chart
chart1 = alt.Chart(df).mark_point().encode(
x = "Average Marks",
y = "CSE103"
).add_selection(
selection)
chart1
This is not directly supported in Vega-Lite, you can add your thumbs up and subscribe to this issue to find out when/if it is implemented https://github.com/vega/vega-lite/issues/7365.
In the meantime, you could workaround it using the same approach as in Altair heatmap with dropdown variable selector, where the data frame is first melted (but you can't dynamically change the axis title).
import altair as alt
from vega_datasets import data
df = data.cars().melt(id_vars=['Origin', 'Name', 'Year', 'Horsepower'])
dropdown_options = df['variable'].drop_duplicates().tolist()
dropdown = alt.binding_select(options=dropdown_options, name='X-axis column ')
selection = alt.selection_single(
fields=['variable'],
init={'variable': dropdown_options[0]},
bind=dropdown
)
alt.Chart(df).mark_circle().encode(
x=alt.X('value:Q', title=''),
y='Horsepower',
color='Origin',
).add_selection(
selection
).transform_filter(
selection
)

How to color lines on mouseover in a bump chart using Altair Viz?

The goal is to highlight the entire line when hovering anywhere (not just at the data points) on the line.
Imports:
from IPython.display import display
import pandas as pd
import altair as alt
Data:
data = '{"Date":{"5":1560643200000,"18":1560643200000,"22":1560643200000,"24":1560643200000,"59":1560643200000,"65":1561248000000,"78":1561248000000,"82":1561248000000,"84":1561248000000,"119":1561248000000,"125":1561852800000,"138":1561852800000,"142":1561852800000,"144":1561852800000,"179":1561852800000,"185":1562457600000,"198":1562457600000,"202":1562457600000,"204":1562457600000,"239":1562457600000,"245":1563062400000,"258":1563062400000,"262":1563062400000,"264":1563062400000,"299":1563062400000,"305":1563667200000,"318":1563667200000,"322":1563667200000,"324":1563667200000,"359":1563667200000,"365":1564272000000,"378":1564272000000,"382":1564272000000,"384":1564272000000,"419":1564272000000,"425":1564876800000,"438":1564876800000,"442":1564876800000,"444":1564876800000,"479":1564876800000,"485":1565481600000,"498":1565481600000,"502":1565481600000,"504":1565481600000,"539":1565481600000,"545":1566086400000,"558":1566086400000,"562":1566086400000,"564":1566086400000,"599":1566086400000,"605":1566691200000,"618":1566691200000,"622":1566691200000,"624":1566691200000,"659":1566691200000,"665":1567296000000,"678":1567296000000,"682":1567296000000,"684":1567296000000,"719":1567296000000,"725":1567900800000,"738":1567900800000,"742":1567900800000,"744":1567900800000,"779":1567900800000,"785":1568505600000,"798":1568505600000,"802":1568505600000,"804":1568505600000,"839":1568505600000,"845":1569110400000,"858":1569110400000,"862":1569110400000,"864":1569110400000,"899":1569110400000,"905":1569715200000,"918":1569715200000,"922":1569715200000,"924":1569715200000,"959":1569715200000,"965":1570320000000,"978":1570320000000,"982":1570320000000,"984":1570320000000,"1019":1570320000000,"1025":1570924800000,"1038":1570924800000,"1042":1570924800000,"1044":1570924800000,"1079":1570924800000,"1085":1571529600000,"1098":1571529600000,"1102":1571529600000,"1104":1571529600000,"1139":1571529600000,"1145":1572134400000,"1158":1572134400000,"1162":1572134400000,"1164":1572134400000,"1199":1572134400000,"1205":1572739200000,"1218":1572739200000,"1222":1572739200000,"1224":1572739200000,"1259":1572739200000,"1265":1573344000000,"1278":1573344000000,"1282":1573344000000,"1284":1573344000000,"1319":1573344000000,"1325":1573948800000,"1338":1573948800000,"1342":1573948800000,"1344":1573948800000,"1379":1573948800000,"1385":1574553600000,"1398":1574553600000,"1402":1574553600000,"1404":1574553600000,"1439":1574553600000,"1445":1575158400000,"1458":1575158400000,"1462":1575158400000,"1464":1575158400000,"1499":1575158400000,"1505":1575763200000,"1518":1575763200000,"1522":1575763200000,"1524":1575763200000,"1559":1575763200000,"1565":1576368000000,"1578":1576368000000,"1582":1576368000000,"1584":1576368000000,"1619":1576368000000},"Store":{"5":"store1","18":"store2","22":"store3","24":"store4","59":"store5","65":"store1","78":"store2","82":"store3","84":"store4","119":"store5","125":"store1","138":"store2","142":"store3","144":"store4","179":"store5","185":"store1","198":"store2","202":"store3","204":"store4","239":"store5","245":"store1","258":"store2","262":"store3","264":"store4","299":"store5","305":"store1","318":"store2","322":"store3","324":"store4","359":"store5","365":"store1","378":"store2","382":"store3","384":"store4","419":"store5","425":"store1","438":"store2","442":"store3","444":"store4","479":"store5","485":"store1","498":"store2","502":"store3","504":"store4","539":"store5","545":"store1","558":"store2","562":"store3","564":"store4","599":"store5","605":"store1","618":"store2","622":"store3","624":"store4","659":"store5","665":"store1","678":"store2","682":"store3","684":"store4","719":"store5","725":"store1","738":"store2","742":"store3","744":"store4","779":"store5","785":"store1","798":"store2","802":"store3","804":"store4","839":"store5","845":"store1","858":"store2","862":"store3","864":"store4","899":"store5","905":"store1","918":"store2","922":"store3","924":"store4","959":"store5","965":"store1","978":"store2","982":"store3","984":"store4","1019":"store5","1025":"store1","1038":"store2","1042":"store3","1044":"store4","1079":"store5","1085":"store1","1098":"store2","1102":"store3","1104":"store4","1139":"store5","1145":"store1","1158":"store2","1162":"store3","1164":"store4","1199":"store5","1205":"store1","1218":"store2","1222":"store3","1224":"store4","1259":"store5","1265":"store1","1278":"store2","1282":"store3","1284":"store4","1319":"store5","1325":"store1","1338":"store2","1342":"store3","1344":"store4","1379":"store5","1385":"store1","1398":"store2","1402":"store3","1404":"store4","1439":"store5","1445":"store1","1458":"store2","1462":"store3","1464":"store4","1499":"store5","1505":"store1","1518":"store2","1522":"store3","1524":"store4","1559":"store5","1565":"store1","1578":"store2","1582":"store3","1584":"store4","1619":"store5"},"Rank":{"5":1.0,"18":1.0,"22":1.0,"24":1.0,"59":1.0,"65":2.0,"78":2.0,"82":2.0,"84":2.0,"119":2.0,"125":2.0,"138":2.0,"142":2.0,"144":2.0,"179":2.0,"185":2.0,"198":2.0,"202":2.0,"204":2.0,"239":2.0,"245":2.0,"258":2.0,"262":2.0,"264":2.0,"299":2.0,"305":2.0,"318":2.0,"322":2.0,"324":2.0,"359":2.0,"365":2.0,"378":2.0,"382":2.0,"384":1.0,"419":2.0,"425":3.0,"438":1.0,"442":3.0,"444":2.0,"479":3.0,"485":4.0,"498":1.0,"502":4.0,"504":3.0,"539":4.0,"545":4.0,"558":1.0,"562":3.0,"564":3.0,"599":4.0,"605":5.0,"618":1.0,"622":2.0,"624":4.0,"659":5.0,"665":6.0,"678":1.0,"682":2.0,"684":5.0,"719":6.0,"725":7.0,"738":1.0,"742":2.0,"744":5.0,"779":7.0,"785":8.0,"798":1.0,"802":2.0,"804":6.0,"839":8.0,"845":8.0,"858":1.0,"862":2.0,"864":5.0,"899":8.0,"905":8.0,"918":1.0,"922":2.0,"924":4.0,"959":8.0,"965":8.0,"978":1.0,"982":2.0,"984":4.0,"1019":8.0,"1025":10.0,"1038":1.0,"1042":2.0,"1044":5.0,"1079":10.0,"1085":10.0,"1098":1.0,"1102":2.0,"1104":5.0,"1139":10.0,"1145":11.0,"1158":1.0,"1162":2.0,"1164":5.0,"1199":11.0,"1205":12.0,"1218":1.0,"1222":2.0,"1224":6.0,"1259":12.0,"1265":13.0,"1278":2.0,"1282":1.0,"1284":7.0,"1319":13.0,"1325":13.0,"1338":2.0,"1342":1.0,"1344":6.0,"1379":13.0,"1385":14.0,"1398":2.0,"1402":1.0,"1404":6.0,"1439":14.0,"1445":3.0,"1458":2.0,"1462":1.0,"1464":6.0,"1499":8.0,"1505":3.0,"1518":2.0,"1522":1.0,"1524":6.0,"1559":4.0,"1565":3.0,"1578":2.0,"1582":1.0,"1584":8.0,"1619":5.0}}'
Dataframe:
df_slim = pd.read_json(data)
Chart:
highlight = alt.selection(type='single', on='mouseover',
fields=['Store'], nearest=True, empty="none")
chart = alt.Chart(df_slim).mark_line().encode(
x='Date',
y='Rank',
#color='Store',
strokeDash='Store',
color=alt.condition(highlight, 'Store', alt.value("lightgray")),
tooltip=['Rank','Store']
).properties(
width=800,
height=600,
title='Bump Chart: Store Ranking'
).configure_title(
fontSize=30,
font='Courier',
anchor='start',
color='gray'
).add_selection(
highlight
)
display(chart)
Output:
Any help? Not sure what went wrong here.
Good question! It turns out this is one of the current limitations of Vega-Lite. I found this note in the VL docs on Nearest Value
The nearest transform is not supported for continuous mark types (i.e., line and area). For these mark types, consider layering a discrete mark type (e.g., point) with a 0-value opacity
So for your example I would do something like this
highlight = alt.selection(type='single', on='mouseover',
fields=['Store'], nearest=True, empty="none")
chart = alt.Chart(df_slim).mark_line().encode(
x='Date',
y='Rank',
#color='Store',
strokeDash='Store',
color=alt.condition(highlight, 'Store', alt.value("lightgray")),
tooltip=['Rank','Store']
)
points = alt.Chart(df_slim).mark_point(opacity=0).encode(
x='Date',
y='Rank',).add_selection(
highlight
)
chart + points

Adding legend to layerd chart in altair

Consider the following example:
import altair as alt
from vega_datasets import data
df = data.seattle_weather()
temp_max = alt.Chart(df).mark_line(color='blue').encode(
x='yearmonth(date):T',
y='max(temp_max)',
)
temp_min = alt.Chart(df).mark_line(color='red').encode(
x='yearmonth(date):T',
y='max(temp_min)',
)
temp_max + temp_min
In the resulting chart, I would like to add a legend that shows, that the blue line shows the maximum temperature and the red line the minimum temperature. What would be the easiest way to achieve this?
I saw (e.g. in the solution to this question: Labelling Layered Charts in Altair (Python)) that altair only adds a legend if in the encoding, you set the color or size or so, usually with a categorical column, but that is not possible here because I'm plotting the whole column and the label should be the column name (which is now shown in the y-axis label).
I would do a fold transform such that the variables could be encoded correctly.
import altair as alt
from vega_datasets import data
df = data.seattle_weather()
alt.Chart(df).mark_line().transform_fold(
fold=['temp_max', 'temp_min'],
as_=['variable', 'value']
).encode(
x='yearmonth(date):T',
y='max(value):Q',
color='variable:N'
)
If you layer two charts with the same columns and tell them to color by the same one, the legend will appear. Don't know is this helps but..
For example, i had:
Range, Amount, Type
0_5, 3, 'Private'
5_10, 5, 'Private'
Range, Amount, Type
0_5, 3, 'Public'
5_10, 5, 'Public'
and I charted both with 'color = 'Type'' and said alt.layer(chart1, chart2) and it showed me a proper legend

Plotly subplot represent same y-axis name with same color and single legend

I am trying to create a plot for two categories in a subplot. 1st column represent category FF and 2nd column represent category RF in the subplot.
The x-axis is always time and y-axis is remaining columns. In other words, it is a plot with one column vs rest.
1st category and 2nd category always have same column names just only the values differs.
I tried to generate the plot in a for loop but the problem is plotly treats each column name as distinct and thereby it represents the lines in different color for y-axis with same name. As a consequence, in legend also an entry is created.
For example, in first row Time vs price2010 I want both subplot FF and RF to be represented in same color (say blue) and a single entry in legend.
I tried adding legendgroup in go.Scatter but it doesn't help.
import pandas as pd
from pandas import DataFrame
from plotly import tools
from plotly.offline import init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from plotly.subplots import make_subplots
CarA = {'Time': [10,20,30,40 ],
'Price2010': [22000,26000,27000,35000],
'Price2011': [23000,27000,28000,36000],
'Price2012': [24000,28000,29000,37000],
'Price2013': [25000,29000,30000,38000],
'Price2014': [26000,30000,31000,39000],
'Price2015': [27000,31000,32000,40000],
'Price2016': [28000,32000,33000,41000]
}
ff = DataFrame(CarA)
CarB = {'Time': [8,18,28,38 ],
'Price2010': [19000,20000,21000,22000],
'Price2011': [20000,21000,22000,23000],
'Price2012': [21000,22000,23000,24000],
'Price2013': [22000,23000,24000,25000],
'Price2014': [23000,24000,25000,26000],
'Price2015': [24000,25000,26000,27000],
'Price2016': [25000,26000,27000,28000]
}
rf = DataFrame(CarB)
Type = {
'FF' : ff,
'RF' : rf
}
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.3/len(ff.columns))
labels = ff.columns[1:]
for indexC, (cat, values) in enumerate(Type.items()):
for indexP, params in enumerate(values.columns[1:]):
trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params)
fig.append_trace(trace,indexP+1, indexC+1)
fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
fig.update_layout(height=2024, width=1024,title_text="Car Analysis")
iplot(fig)
It might not be a good solution, but so far I can able to come up only with this hack.
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.2/len(ff.columns))
labels = ff.columns[1:]
colors = [ '#a60000', '#f29979', '#d98d36', '#735c00', '#778c23', '#185900', '#00a66f']
legend = True
for indexC, (cat, values) in enumerate(Type.items()):
for indexP, params in enumerate(values.columns[1:]):
trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params, showlegend=legend, marker=dict(
color=colors[indexP]))
fig.append_trace(trace,indexP+1, indexC+1)
fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
fig.update_layout(height=1068, width=1024,title_text="Car Analysis")
legend = False
If you combine your data into a single tidy data frame, you can use a simple Plotly Express call to make the chart: px.line() with color, facet_row and facet_col

Categories