I'm trying to use mark_text to add stacked text labels to a stacked bar chart. I would like to label each bar segment with its 'Time' value. Is it possible to place text marks in the corresponding stack of a stacked bar chart?
Here's how I create the bar and text charts:
bar = alt.Chart(df_pivot, title = {'text' :'How do people spend their time?', 'subtitle' : 'Average of minutes per day from time-use diaries for people between 15 and 64'}).mark_bar().transform_calculate(
filtered="datum.Category == 'Paid work'"
).transform_joinaggregate(sort_val="sum(filtered)", groupby=["Country"]
).encode(
x=alt.X('Time', stack='zero'),
y=alt.Y('Country', sort=alt.SortField('sort_val', order='descending')),
color=alt.Color('Category:N', sort=CatOrder),
order=alt.Order('color_Category_sort_index:Q'),
tooltip=['Country', 'Category', 'Time']
).interactive()
bar
text = alt.Chart(df_pivot).mark_text(align='center', baseline='middle', color='black').transform_calculate(
filtered="datum.Category == 'Paid work'"
).transform_joinaggregate(sort_val="sum(filtered)", groupby=["Country"]
).encode(
x=alt.X('Time:Q', stack='zero'),
y=alt.Y('Country', sort=alt.SortField('sort_val', order='descending')),
detail='Category:N',
text=alt.Text('Time:Q', format='.0f')
)
bar + text
Issues:
The text is not in its proper stack, and the order of the text is also wrong.
The Y sorting is reset, so the bars are no longer sorted as expected.
I don't understand why I have these issues. I'm new to this platform; the source code is in my notebook: https://www.kaggle.com/interphuoc0101/times-use. Thanks a lot.
Your bar chart specifies a stack order:
order=alt.Order('color_Category_sort_index:Q'),
You should add a matching order encoding to your text layer to ensure the text appears in the same order.
Here is an example of how you can use order in both charts:
import altair as alt
from vega_datasets import data
source = data.barley()
bars = alt.Chart(source).mark_bar().encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
color=alt.Color('site'),
# barley has no 'Category' column, so order the stack by the color field 'site'
order=alt.Order('site:N', sort='ascending'),
)
text = alt.Chart(source).mark_text(dx=-15, dy=3, color='white').encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
detail='site:N',
text=alt.Text('sum(yield):Q', format='.1f'),
# same order encoding as the bars, so each label lands in its own segment
order=alt.Order('site:N', sort='ascending')
)
bars + text
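To see why both layers need the same order encoding, here is a minimal plain-Python sketch of how stack extents depend on the order of the segments. It is only an illustration, not Altair's or Vega-Lite's implementation; the function stack_extents, the category names, and the numbers are all hypothetical.

```python
def stack_extents(segments, order_key):
    """Stack segments from zero in the given order; return {name: (start, end)}."""
    extents = {}
    offset = 0.0
    for name, value in sorted(segments, key=order_key):
        extents[name] = (offset, offset + value)
        offset += value
    return extents

# Hypothetical 'Time' values for a single bar (one Country), keyed by Category.
segments = [("Sleep", 500.0), ("Paid work", 300.0), ("Leisure", 200.0)]
custom_order = {"Paid work": 0, "Sleep": 1, "Leisure": 2}

# Bars stacked with a custom order.
bars = stack_extents(segments, lambda s: custom_order[s[0]])
# Text layer using the same order: extents match the bars.
text_same = stack_extents(segments, lambda s: custom_order[s[0]])
# Text layer with a different (here alphabetical) order: extents differ.
text_other = stack_extents(segments, lambda s: s[0])

for name in custom_order:
    print(name, bars[name], text_same[name], text_other[name])
```

When the two layers are stacked in different orders, the label for a category is computed against a different extent than the bar segment it is supposed to annotate, which is why the labels appear in the wrong stack.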
how to create grouped bar chart in Streamlit
I have tried the st.altair_chart(chart) method, but it still shows a stacked bar chart instead of a grouped bar chart.
You should be able to create the bar chart with Altair and then just pass it to Streamlit:
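import altair as alt
import streamlit as st
# prediction_table2 is the asker's long-format DataFrame with 'date', 'variable', and 'value' columns
chart = alt.Chart(prediction_table2, title='Simulated (attainable) and predicted yield').mark_bar(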
opacity=1,
).encode(
column = alt.Column('date:O', spacing = 5, header = alt.Header(labelOrient = "bottom")),
x =alt.X('variable', sort = ["Actual_FAO", "Predicted", "Simulated"], axis=None),
y =alt.Y('value:Q'),
color= alt.Color('variable')
).configure_view(stroke='transparent')
st.altair_chart(chart)
Here is the current code for my visualization and the chart it produces:
base = alt.Chart(cs_data).mark_bar().encode(x=alt.X("PROGRAM:N", axis=alt.Axis(title='University/Credit Level', labels=False)),
y=alt.Y('MD_EARN_WNE:Q', axis=alt.Axis(title='Median Graduate Salary')),
).properties(
width=480,
height=320
)
credit_labels = base.mark_text(align='left', baseline='middle', angle=270, dx=3, color='black').encode(
text='CREDDESC:O'
)
chart = base.mark_bar().encode(
color=alt.Color("INSTNM", title="University")
)
final = alt.layer(chart, credit_labels, data=cs_data)
final
https://i.stack.imgur.com/pvFcC.png
As you can see, the text seems to be misaligned for the two orange bars. They are actually aligned to where the bar should end, but an extra bit gets added to the bar.
If I remove the color encoding this goes away:
base = alt.Chart(cs_data).mark_bar().encode(x=alt.X("PROGRAM:N", axis=alt.Axis(title='University/Credit Level', labels=False)),
y=alt.Y('MD_EARN_WNE:Q', axis=alt.Axis(title='Median Graduate Salary')),
).properties(
width=480,
height=320
)
credit_labels = base.mark_text(align='left', baseline='middle', angle=270, dx=3, color='black').encode(
text='CREDDESC:O'
)
chart = base.mark_bar().encode(
)
final = alt.layer(chart, credit_labels, data=cs_data)
final
https://i.stack.imgur.com/MNVBY.png
What's going on here?
It seems you are overplotting for these two bars. With the color encoding, the bars get stacked; without it, they are just overlaid on each other. You can also see that the text is bolder there, because two labels are overlapping.
One fix would be to use y=alt.Y('sum(MD_EARN_WNE):Q') to sum the overlapping bars.
It would also be good to look at the data and find the dataframe column that is causing this grouping, so you understand exactly what you are plotting.
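As a rough illustration of the difference between stacked and overlaid bars, the plain-Python sketch below finds x values that occur more than once and compares the resulting bar heights. The rows are made-up stand-ins for cs_data (only the column names come from the question), and the code only mimics the arithmetic, not Altair itself.

```python
from collections import defaultdict

# Hypothetical rows: two of them share the same PROGRAM value.
rows = [
    {"PROGRAM": "A / Bachelor", "INSTNM": "Univ A", "MD_EARN_WNE": 60000},
    {"PROGRAM": "B / Bachelor", "INSTNM": "Univ B", "MD_EARN_WNE": 50000},
    {"PROGRAM": "B / Bachelor", "INSTNM": "Univ B", "MD_EARN_WNE": 20000},
]

# Collect the y values that land on each x position.
values_by_program = defaultdict(list)
for row in rows:
    values_by_program[row["PROGRAM"]].append(row["MD_EARN_WNE"])

# Keep only the x positions that receive more than one bar.
duplicates = {k: v for k, v in values_by_program.items() if len(v) > 1}

heights = {}
for program, values in duplicates.items():
    heights[program] = {
        "overlaid": max(values),  # bars drawn on top of each other from zero
        "stacked": sum(values),   # bars placed end to end
    }
    print(program, heights[program], "unstacked label positions:", values)
```

In this toy data the stacked bar ends at 70000 while the unstacked labels sit at 50000 and 20000, which matches the symptom of labels appearing below the top of the bar.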
I have several columns (e.g., Column, y1, y2, y3, ...) that I need to plot against column "X" in an Altair scatter plot. I have included a dropdown combo box to select between the "y" columns; however, the plot does not change according to the selection. How can I make the y-axis selection responsive? Here is the code:
# CHART 1
input_dropdown = alt.binding_select(options = \
np.array(df.drop(["Student IDs", "Average Marks"],
axis = 1).columns),
name = "Module")
selection = alt.selection_single(bind = input_dropdown)
# plot the first chart
chart1 = alt.Chart(df).mark_point().encode(
x = "Average Marks",
y = "CSE103"
).add_selection(
selection)
chart1
This is not directly supported in Vega-Lite. You can add a thumbs up and subscribe to this issue to find out when or if it is implemented: https://github.com/vega/vega-lite/issues/7365.
In the meantime, you could work around it using the same approach as in "Altair heatmap with dropdown variable selector", where the data frame is first melted (note that you can't dynamically change the axis title this way).
import altair as alt
from vega_datasets import data
df = data.cars().melt(id_vars=['Origin', 'Name', 'Year', 'Horsepower'])
dropdown_options = df['variable'].drop_duplicates().tolist()
dropdown = alt.binding_select(options=dropdown_options, name='X-axis column ')
selection = alt.selection_single(
fields=['variable'],
init={'variable': dropdown_options[0]},
bind=dropdown
)
alt.Chart(df).mark_circle().encode(
x=alt.X('value:Q', title=''),
y='Horsepower',
color='Origin',
).add_selection(
selection
).transform_filter(
selection
)
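For readers unfamiliar with melting, the following plain-Python sketch shows roughly what the reshaping and the selection filter do to the data. It is not pandas' or Altair's implementation; the melt helper and the two example rows are hypothetical, with column names borrowed from the cars dataset.

```python
def melt(rows, id_vars):
    """Turn wide rows into long rows with 'variable' and 'value' keys."""
    long_rows = []
    for row in rows:
        ids = {key: row[key] for key in id_vars}
        for column, value in row.items():
            if column not in id_vars:
                long_rows.append({**ids, "variable": column, "value": value})
    return long_rows

# Hypothetical wide-format rows.
wide = [
    {"Name": "car a", "Horsepower": 130, "Weight_in_lbs": 3504, "Acceleration": 12.0},
    {"Name": "car b", "Horsepower": 165, "Weight_in_lbs": 3693, "Acceleration": 11.5},
]

long_rows = melt(wide, id_vars=["Name", "Horsepower"])

# Unique variable names in first-seen order: these become the dropdown options.
dropdown_options = list(dict.fromkeys(r["variable"] for r in long_rows))

# Picking an option keeps only the rows for that variable,
# which is the effect of filtering the chart by the selection.
selected = dropdown_options[0]
visible = [r for r in long_rows if r["variable"] == selected]
print(dropdown_options)
print(visible)
```

Because every candidate column ends up in the single 'value' field, the chart's encoding never changes; only the rows that pass the filter do.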
This is the code I am using to build the bar chart itself:
bar = alt.Chart(selected_publication_total_subs_df).transform_joinaggregate(
TotalSubCnt='sum(subscriber_cnt)'
).transform_calculate(
PercentOfTotal='datum.subscriber_cnt / datum.TotalSubCnt'
).mark_bar().encode(
x=alt.X('yearmonthdate(as_of_date):O', axis=alt.Axis(title='', labelFontSize=12, labelPadding=10, labelAngle=-45)),
y=alt.Y('sum(subscriber_cnt)', stack='normalize', axis=alt.Axis(format='%', title='% of Subscriber Count' ,labelFontSize=12, labelPadding=10, titleFontSize=16, titlePadding=20)),
color=alt.Color('status'),
opacity=alt.condition(legend_selection, alt.value(1), alt.value(0.1))
).properties(width=1000,height=400).add_selection(legend_selection)
It is resulting in this visual:
I am trying to generate 2 plots in Altair that share the same selection.
I would like to plot scatter and bar charts of population (y) vs. age (x). I am using the built-in population dataset. The population is the sum of the people column in this dataset. The dataset has columns for year, people, age and sex. I can get the total population using sum(people) and plot this as y against age. For the bar chart, I can similarly plot sum(people) versus age and color using the sex column.
I am trying to set up a brush/selection between these two plots so that I can highlight points in the scatter plot and have the bar plot update simultaneously to reflect that selection. However, I am stuck with the following problem.
I am using the layered bar graph example from the Altair documentation for the example.
Here is the code
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
df = data.population.url
scatter = alt.Chart(df).mark_point().encode(
alt.X('age:O', axis=alt.Axis(title='')),
y='sum(people)',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(
selection=interval
)
bar = alt.Chart(df).mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('sum(people)', stack=None),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(height=100, width=400)
scatter & bar
I have modified the code in the documentation example. I am first creating a scatter plot and then using the color based on the selection. Then I define a bar plot of the same 2 columns and again use the selection to specify the color. Here is the output
Now, I would like to drag a box across the top (scatter) plot to select some points and simultaneously the bottom (bar) chart should update based on the selection. When I drag in the top plot to make my selection, this happens
Problems
After dragging to make a selection in the top plot, the colors (inside and outside the selection) in both plots change to lightgrey. I expected, in both plots, the marks inside the selection/brush to be highlighted and the marks outside to be lightgrey.
How can I get a selection that is highlighted in both the top and bottom plots simultaneously?
EDIT
I want this behaviour, where a brush/selection in one plot is simultaneously highlighted in a second (linked) plot.
Package versions:
Python = 3.6
Altair = 2.2
Jupyter = 5.6
To trigger a selection on an aggregated value, the best approach is to use an aggregate transform to define that quantity so that it is available to the entire chart.
Here is an example:
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
base = alt.Chart(data.population.url).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
scatter = base.mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = base.mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
scatter & bar
Note that I took away the filtering by the interval selection on the lower plot, because that's not the behavior you described.
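One way to picture what the filter, calculate, and aggregate transforms produce is the plain-Python sketch below. It is only an approximation under stated assumptions: the raw rows are invented, and the in_brush helper is a hypothetical stand-in for an interval selection on x and y, not how Vega-Lite evaluates selections internally.

```python
from collections import defaultdict

# Hypothetical raw rows with the same columns as the population dataset.
raw = [
    {"year": 2000, "age": 0, "sex": 1, "people": 100},
    {"year": 2000, "age": 0, "sex": 1, "people": 50},
    {"year": 2000, "age": 0, "sex": 2, "people": 120},
    {"year": 2000, "age": 5, "sex": 1, "people": 90},
    {"year": 1990, "age": 5, "sex": 1, "people": 999},
]

# Filter to year 2000, relabel sex, then sum people per (age, sex) group.
totals = defaultdict(int)
for row in raw:
    if row["year"] == 2000:
        sex = "Female" if row["sex"] == 2 else "Male"
        totals[(row["age"], sex)] += row["people"]

# After aggregation each row carries a named 'population' field.
aggregated = [
    {"age": age, "sex": sex, "population": population}
    for (age, sex), population in totals.items()
]

def in_brush(row, ages, pop_range):
    """Hypothetical brush test: age is selected and population is within range."""
    low, high = pop_range
    return row["age"] in ages and low <= row["population"] <= high

selected = [r for r in aggregated if in_brush(r, ages={0}, pop_range=(100, 200))]
print(aggregated)
print(selected)
```

Since both charts are built from the same aggregated rows, a condition expressed on age and population can be evaluated identically for the points and for the bars.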
Based on (and adapting) the answer by @jakevdp above, I tried something similar to this example from the Interactive Charts section of the docs gallery.
Instead of using a base object, I used the vconcat function, which joins the Chart instances and passes the transform and data to the vconcat object. Here is the approach
import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])
scatter = alt.Chart().mark_point().encode(
alt.X('age:O', title=''),
y='population:Q',
color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).properties(
selection=interval
)
bar = alt.Chart().mark_bar(opacity=0.7).encode(
alt.X('age:O', scale=alt.Scale(rangeStep=17)),
alt.Y('population:Q'),
color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).properties(height=100, width=400)
alt.vconcat(scatter, bar,
data=data.population.url
).transform_filter(
filter = datum.year == 2000
).transform_calculate(
"sex", if_(datum.sex == 2, 'Female', 'Male')
).transform_aggregate(
population="sum(people)",
groupby=['age', 'sex']
)
This approach appears to give the same functionality as @jakevdp's answer, i.e., a selection can be made in the scatter (top) plot and it will be reflected in the bar chart (bottom), as required.