Use Holoviz Panel Dropdown value to query dataframe - python

I am trying to use a Holoviz Panel dropdown widget value to query a dataframe. The dataframe however does not reflect the change in the dropdown value. I added a markdown widget to check if the change in the dropdown value is being captured - It seems to be. However, I can't figure out how to update the dataframe. I am a complete beginner to programming, just trying to learn. Any help is appreciated.
import pandas as pd
import panel as pn
pn.extension()
# Dataframe
df = pd.DataFrame({'CcyPair':['EUR/USD', 'AUD/USD' ,'USD/JPY'],
                   'Requester':['Client1', 'Client2' ,'Client3'],
                   'Provider':['LP1', 'LP2' ,'LP3']})
# Dropdown
a2 = pn.widgets.Select(options=list(df.Provider.unique()))
# Query dataframe based on value in Provider dropdown
def query(x=a2):
    y = pn.widgets.DataFrame(df[(df.Provider==x)])
    return y
# Test Markdown Panel to check if the dropdown change returns value
s = pn.pane.Markdown(object='')
# Register watcher and define callback
w = a2.param.watch(callback, ['value'], onlychanged=False)
def callback(*events):
    print(events)
    for event in events:
        if event.name == 'value':
            df1 = query(event.new)
            s.object = event.new
# Display Output
pn.Column(query, s)

Inspired by the self-answer, the following code produces a select box containing the list of providers and a dataframe filtered on that selection. It was tested on Panel version 0.13.1.
Note that the watch=True suggestion in the self-answer wasn't necessary.
import pandas as pd
import panel as pn
pn.extension()
# Dataframe
df = pd.DataFrame({
    'CcyPair': ['EUR/USD', 'AUD/USD', 'USD/JPY'],
    'Requester': ['Client1', 'Client2', 'Client3'],
    'Provider': ['LP1', 'LP2', 'LP3']
})
# Dropdown
providers = list(df.Provider.unique())
select_widget = pn.widgets.Select(options=providers)
# Query dataframe based on value in Provider dropdown
@pn.depends(select_widget)
def query(x):
    filtered_df = pn.widgets.DataFrame(df[df.Provider==x])
    return filtered_df
# Display Output
pn.Column(select_widget, query)

Figured it out: I just needed to add @pn.depends above my query function. Once I added @pn.depends(a2, watch=True), the dataframe was filtered based on the a2 input. The callback and watcher were unnecessary.
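The filtering step can also be checked on its own, without Panel, to confirm that the boolean mask is right before wiring it to a widget. This is a minimal sketch, assuming a hypothetical helper named filter_by_provider that is not part of the original code:

```python
import pandas as pd

df = pd.DataFrame({
    'CcyPair': ['EUR/USD', 'AUD/USD', 'USD/JPY'],
    'Requester': ['Client1', 'Client2', 'Client3'],
    'Provider': ['LP1', 'LP2', 'LP3'],
})

def filter_by_provider(frame, provider):
    """Return only the rows whose Provider column equals `provider`.

    Hypothetical helper for illustration; `provider` must be a plain
    value such as 'LP2', not the widget object itself.
    """
    return frame[frame['Provider'] == provider]

result = filter_by_provider(df, 'LP2')
print(result)
```

Inside Panel, the value to pass is the widget's value (or the argument supplied by the depends decorator), not the widget itself.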

Related

Understanding streamlit data flow and how to submit form in a sequential way

Below is a simple reproducible example that works to illustrate the problem in its simple form. You can jump to the code and expected behaviour as the problem description can be long.
The main concept
There are 3 dataframes stored in a list, and a form on the sidebar shows the supplier_name and po_number from the relevant dataframe. When the user clicks the Next button, the information inside the supplier_name and po_number text_input will be saved (in this example, they basically got printed out on top of the sidebar).
Problem
This app works well when the user doesn't change anything inside the text_input, but it breaks as soon as the user changes something. For example, when I change the po_number to somethingrandom, the saved information is not somethingrandom but P123 from the first dataframe.
What's more, if the value from the next dataframe is the same as in the first dataframe, the edited value inside the text_input carries over to the next display. For example, because the first and second dataframes' supplier names are both S1, if I change the supplier name to S10 and then click Next, the supplier_name is still S10 on the second dataframe, when it should be S1. But if the supplier name differs in the next dataframe, the text_input does update.
Justification
If you are struggling to understand why I want to do this: the original use is for the sidebar input area to show information extracted from each PDF. When the user confirms the information is all correct, they click Next to review the next PDF. If something is wrong, they can change the value inside the text_input and then click Next; the changed value will be recorded, and for the next PDF the inputs should reflect what was extracted from that PDF. I did this quite simply in R Shiny, but can't figure out how the data flow works in Streamlit. Please help.
Reproducible Example
import streamlit as st
import pandas as pd
# 3 dataframes that are stored in a list
data1 = {
    "supplier_name": ["S1"],
    "po_number": ["P123"],
}
data2 = {
    "supplier_name": ["S1"],
    "po_number": ["P124"],
}
data3 = {
    "supplier_name": ["S2"],
    "po_number": ["P125"],
}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
list1 = [df1, df2, df3]
# initiate a page session state, every time next button is clicked
# it will go to the next dataframe in the list
if 'page' not in st.session_state:
    st.session_state.page = 0
def next_page():
    st.sidebar.write(f"Submitted! supplier_name: {supplier_name} po_number: {po_number}")
    st.session_state.page += 1
supplier_name_value = list1[st.session_state.page]["supplier_name"][0]
po_number_value = list1[st.session_state.page]["po_number"][0]
# main area
list1[st.session_state.page]
# sidebar form
with st.sidebar.form("form"):
    supplier_name = st.text_input(label="Supplier Name", value=supplier_name_value)
    po_number = st.text_input(label="PO Number", value=po_number_value)
    next_button = st.form_submit_button("Next", on_click=next_page)
Expected behaviour
The dataframe's values are extracted into the sidebar input area. The user can change the inputs if they wish, then click Next, and the values inside the input areas are saved. When the app moves to the next dataframe, the text inputs are refreshed with values from that dataframe, and the process repeats.
I'm not totally sure what you're going for, but after some messing around, the only way I was able to achieve this sort of sequential form submission handling is with st.experimental_rerun(). I hate to resort to that since it may be removed any time, so hopefully there's a better way.
Without experimental_rerun(), forms take two submits to actually update state. I wasn't able to find a "correct" way to achieve an immediate update to support the expected behavior.
Here's my attempt:
import pandas as pd # 1.5.1
import streamlit as st # 1.18.1
def initialize_state():
    data = [
        {
            "supplier_name": ["S1"],
            "po_number": ["P123"],
        },
        {
            "supplier_name": ["S1"],
            "po_number": ["P124"],
        },
        {
            "supplier_name": ["S2"],
            "po_number": ["P125"],
        },
    ]
    state.dfs = state.get("dfs", [pd.DataFrame(x) for x in data])
    first_vals = [{x: df[x][0] for x in df.columns} for df in state.dfs]
    state.selections = state.get("selections", first_vals)
    state.pages_expanded = state.get("pages_expanded", 0)
    state.current_page = state.get("current_page", 0)
    state.just_modified_page = state.get("just_modified_page", -1)
def handle_submit(i):
    st.session_state.selections[i] = {
        "supplier_name": state.new_supplier_name,
        "po_number": state.new_po_number,
    }
    state.current_page = i
    state.just_modified_page = i
    if i < len(state.dfs) - 1 and state.pages_expanded == i:
        state.pages_expanded += 1
    st.experimental_rerun()
def render_form(i):
    with st.sidebar.form(key=f"form-{i}"):
        supplier_name = state.selections[i]["supplier_name"]
        po_number = state.selections[i]["po_number"]
        if i == state.just_modified_page:
            st.sidebar.write(
                f"Submitted! supplier_name: {supplier_name} "
                f"po_number: {po_number}"
            )
            state.just_modified_page = -1
        state.new_supplier_name = st.text_input(
            label="Supplier Name",
            value=supplier_name,
        )
        state.new_po_number = st.text_input(
            label="PO Number",
            value=po_number,
        )
        if st.form_submit_button("Next"):
            handle_submit(i)
state = st.session_state
initialize_state()
for i in range(state.pages_expanded + 1):
    render_form(i)
# debug
st.write("state.pages_expanded", state.pages_expanded)
st.write("state.current_page", state.current_page)
st.write("state.just_modified_page", state.just_modified_page)
st.write("state.dfs[state.current_page]", state.dfs[state.current_page])
st.write("state.selections", state.selections)
I'm assuming you want to keep track of the user's selections, but not actually modify the dataframes. If you do want to modify the dataframes, that's simpler: replace state.selections with actual writes to dfs by index and column:
# ...
def handle_submit(i):
    st.session_state.dfs[i]["supplier_name"] = state.new_supplier_name
    st.session_state.dfs[i]["po_number"] = state.new_po_number
    #st.session_state.selections[i] = {
    #    "supplier_name": state.new_supplier_name,
    #    "po_number": state.new_po_number,
    #}
# ...
def render_form(i):
    with st.sidebar.form(key=f"form-{i}"):
        supplier_name = state.dfs[i]["supplier_name"][0]
        po_number = state.dfs[i]["po_number"][0]
        #supplier_name = state.selections[i]["supplier_name"]
        #po_number = state.selections[i]["po_number"]
# ...
Now, it's possible to make this 100% dynamic, but I hardcoded supplier_name and po_number to avoid premature generalization that you may not need. If you do want to generalize, use df.columns throughout the code, as initialize_state already does.
I'm not sure I quite understand what you're trying to accomplish, but it seems like you're never updating the supplier name in list1 after the user updates the name via the text input widget.
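The expected behaviour boils down to one state transition per click: save the current (possibly edited) inputs, advance the page, then reset the inputs from the next record. Here is a minimal sketch in plain Python, assuming a dict stands in for st.session_state; new_state, submit, and the inputs/saved keys are hypothetical names, not Streamlit API:

```python
# Records standing in for the three dataframes in the question.
records = [
    {"supplier_name": "S1", "po_number": "P123"},
    {"supplier_name": "S1", "po_number": "P124"},
    {"supplier_name": "S2", "po_number": "P125"},
]

def new_state():
    """Build the initial state; a plain dict stands in for st.session_state."""
    return {"page": 0, "saved": [], "inputs": dict(records[0])}

def submit(state):
    """Hypothetical 'Next' handler: save the edited inputs, then advance."""
    # Save whatever is currently in the inputs, including user edits.
    state["saved"].append(dict(state["inputs"]))
    # Move to the next record, if there is one.
    if state["page"] < len(records) - 1:
        state["page"] += 1
        # Reset the inputs from the next record so old edits do not carry over.
        state["inputs"] = dict(records[state["page"]])

state = new_state()
state["inputs"]["supplier_name"] = "S10"  # the user edits the first form
submit(state)
print(state["saved"])
print(state["inputs"])
```

In a Streamlit app, the same transition would typically read and write st.session_state; that wiring is not shown here.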

KeyError warning from pandas dataframe inside plotly dash chained callback

I have multiple dropdowns that I'd like to populate depending on what the user chooses in the previous dropdown. I populate the first dropdown with:
schools_requests = requests.get("http://ipwhatever:portwhatever/list_all_schools")
schools_data = schools_requests.json()
df = pd.DataFrame(schools_data)
nome = df['nome'].tolist()
It gives me the names of the schools I got listed. I then send it (nome) to the first dropdown like this:
html.Label('Escola'),
dcc.Dropdown(
    options = nome,
    id = "escola",
)
The first callback works fine and it's the one down below:
@callback(
    Output('id_school', 'children'),
    Input('escola','value')
)
def find_id_school(school_name):
    all_schools = requests.get(
        "http://ipwhatever:portwhatever/list_all_schools")
    all_schools_data = all_schools.json()
    for element in all_schools_data:
        if school_name == element['nome']:
            id_school = element['id_escola']
            return id_school
It basically searches for the corresponding school id given the name of the school the user chose in the first dropdown and stores this id in a hidden html.Div.
Now comes the second callback, where I use pandas and don't understand why it's different from the first time.
@callback(
    Output('ano', 'options'),
    Input('id_school', 'children')
)
def render_grade_from_school(chosen_id):
    grade = requests.get(
        "http://ipwhatever:portwhatever/grade?school_id="+str(chosen_id))
    grade_data = grade.json()
    indices = list(range(0,len(grade_data)))
    df = pd.DataFrame(grade_data, index=indices)
    ano = df['serie'].tolist()
    return ano
So it takes the school id, requests the grades from another endpoint and basically does the same thing as the first time I used pandas in the code.
The only difference is the index argument. pandas started complaining about the lack of an index, so I check the length of the list of JSON objects, generate a list of indices like [0,1,2,...] and pass it as an argument to the dataframe. That stopped the complaint.
But now I get a KeyError: 'serie'. The traceback points to return self._engine.get_loc(casted_key) as the source, and I don't know why. Still, the dropdown 'ano' (grade) updates correctly and shows the grades. But the warning never goes away.
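The post does not show what the endpoint returns when chosen_id is empty. One possible cause, not confirmed by the source, is that the callback first fires before a school is selected and the response is not a list of records containing 'serie'. Below is a defensive sketch in plain Python, assuming a hypothetical helper extract_series and made-up sample payloads:

```python
def extract_series(grade_data):
    """Return the 'serie' values from a decoded JSON response.

    Hypothetical helper: returns an empty list instead of raising
    KeyError when the payload is not a list of records with 'serie'.
    """
    if not isinstance(grade_data, list):
        return []
    return [row['serie'] for row in grade_data
            if isinstance(row, dict) and 'serie' in row]

# Made-up payloads for illustration only.
good = [{'serie': '1A'}, {'serie': '2B'}]
bad = {'detail': 'not found'}

print(extract_series(good))
print(extract_series(bad))
```

Returning an empty list would leave the 'ano' dropdown without options until a valid school id arrives.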

converting links data from list to series changes the links inside the final series (I don't want it clickable just the data frame to be perfect) [duplicate]

print('http://google.com') outputs a clickable url.
How do I get clickable URLs for pd.DataFrame(['http://google.com', 'http://duckduckgo.com']) ?
If you want to apply URL formatting only to a single column, you can use:
data = [dict(name='Google', url='http://www.google.com'),
        dict(name='Stackoverflow', url='http://stackoverflow.com')]
df = pd.DataFrame(data)
def make_clickable(val):
    # target _blank to open new window
    return '<a target="_blank" href="{}">{}</a>'.format(val, val)
df.style.format({'url': make_clickable})
(PS: Unfortunately, I didn't have enough reputation to post this as a comment to @Abdou's post)
Try using pd.DataFrame.style.format for this:
df = pd.DataFrame(['http://google.com', 'http://duckduckgo.com'])
def make_clickable(val):
    return '<a href="{}">{}</a>'.format(val, val)
df.style.format(make_clickable)
I hope this proves useful.
@shantanuo: not enough reputation to comment.
How about the following?
def make_clickable(url, name):
    return '<a href="{}">{}</a>'.format(url, name)
df['name'] = df.apply(lambda x: make_clickable(x['url'], x['name']), axis=1)
I found this at How to Create a Clickable Link(s) in Pandas DataFrame and JupyterLab which solved my problem:
HTML(df.to_html(render_links=True, escape=False))
from IPython.display import display, HTML
import pandas as pd
# create a table with a url column
df = pd.DataFrame({"url": ["http://google.com", "http://duckduckgo.com"]})
# create the column clickable_url based on the url column
df["clickable_url"] = df.apply(lambda row: "<a href='{}' target='_blank'>{}</a>".format(row.url, row.url.split("/")[2]), axis=1)
# display the table as HTML. Note, only the clickable_url is being selected here
display(HTML(df[["clickable_url"]].to_html(escape=False)))
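Because escape=False turns off the HTML escaping in to_html, any value placed in the table is rendered as raw HTML. This minimal sketch escapes the URL and label before building the anchor, using the standard-library html.escape; the name make_safe_link is hypothetical:

```python
from html import escape

def make_safe_link(url, text=None):
    """Build an anchor tag, escaping both the href and the link text."""
    label = url if text is None else text
    return '<a target="_blank" href="{}">{}</a>'.format(
        escape(url, quote=True), escape(label))

print(make_safe_link('http://google.com'))
print(make_safe_link('http://example.com/?a=1&b=2', 'Example <site>'))
```

The result can be stored in a column and rendered with to_html(escape=False) in the same way as the examples above.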

Python Streamlit - filter pandas dataframe without rerun entire script

I have the following code:
import streamlit as st
import pandas as pd
#define data
d = {'id': ['a', 'b', 'c'], 'data': [3, 4,6]}
df = pd.DataFrame(data=d)
#create sidebar input
with st.sidebar.form("my_form"):
    a = st.slider('sidebar for testing', 5, 10, 9)
    calculate = st.form_submit_button('Calculate')
if calculate:
    df['result'] = df['data'] + a
    st.write(df)
#no issues up to this point. When I move the slider in 10 the output in 16 stays on the web page
########debug############
# I am trying to select an 'id' from the dropdown and use that to filter df, but when I select a value from the dropdown,
# the script runs again and the output disappears
filter = st.selectbox('filter data', df['id'].unique())
st.write(df[df['id'] == filter])
I would like to filter the Pandas dataframe using a drop down menu to select the id I am interested in, but when I use the drop down the code reruns.
Any idea how I can solve this?
PS: I also tried enclosing the entire computation in a function and adding the @st.cache decorator, but without success. I would appreciate it if anyone could show me how it's done.
I was able to get this behavior by not using the submit button. Streamlit reruns the script from top to bottom any time there's user input, so the form submit resets as well.
d = {'id': ['a', 'b', 'c'], 'data': [3, 4, 6]}
df = pd.DataFrame(data=d)
a = st.slider('sidebar for testing', 5, 10, 9)
df['result'] = df['data'] + a
st.write(df)
# Now this will show the filtered row in the dataframe as you change the inputs
filter = st.selectbox('filter data', df['id'].unique())
st.write(df[df['id'] == filter])
For more complicated workflows, I'd refactor this and cache data that gets loaded in, but for filtering your dataframe, this should work.
Streamlit always re-runs the code on each user submission. You can, however, solve this with st.session_state, which allows sharing state between reruns. Its API is a lot like a standard Python dictionary.
Here is your example with st.session_state:
import streamlit as st
import pandas as pd
#define data
d = {'id': ['a', 'b', 'c'], 'data': [3, 4,6]}
df = pd.DataFrame(data=d)
#create sidebar input
with st.sidebar.form("my_form"):
    a = st.slider('sidebar for testing', 5, 10, 9)
    calculate = st.form_submit_button('Calculate')
# Initialization
if 'button_pressed' not in st.session_state:
    st.session_state['button_pressed'] = False
# Changes if calculated button is pressed
if calculate:
    st.session_state['button_pressed'] = True
# Conditional on session_state instead
if st.session_state['button_pressed']:
    df['result'] = df['data'] + a
    st.write(df)
#no issues up to this point. When I move the slider in 10 the output in 16 stays on the web page
########debug############
# I am trying to select an 'id' from the dropdown and use that to filter df, but when I select a value from the dropdown,
# the script runs again and the output disappears
filter = st.selectbox('filter data', df['id'].unique())
st.write(df[df['id'] == filter])
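The reason the flag helps is that a submit button reports True only on the rerun triggered by its click, while st.session_state survives later reruns. Here is a plain-Python sketch of that idea, assuming a dict stands in for st.session_state and a hypothetical run_script function stands in for one top-to-bottom rerun:

```python
def run_script(session_state, calculate_clicked):
    """Simulate one rerun; return True if the result table would be shown."""
    # Initialise the flag once per session.
    if 'button_pressed' not in session_state:
        session_state['button_pressed'] = False
    # The click is only visible during the rerun it triggers.
    if calculate_clicked:
        session_state['button_pressed'] = True
    # Deciding on the stored flag keeps the output across later reruns.
    return session_state['button_pressed']

session = {}
first = run_script(session, calculate_clicked=False)   # initial load
second = run_script(session, calculate_clicked=True)   # user clicks Calculate
third = run_script(session, calculate_clicked=False)   # user changes the selectbox
print(first, second, third)
```

Checking calculate directly would make the third run return False, which matches the disappearing output described in the question.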

Jupyter Notebook Widgets: Create dependent dropdowns

I want to create 2 dropdown widgets in my Jupyter Notebook. The dropdown content is taken from a dataframe.
Let's say I have a pandas dataframe consisting of 3 categorical variables 'a', 'b', 'c'. 'a' has 3 subtypes 'a1','a2' and 'a3'. 'b' and 'c' are similar to a in the sense that they also have their own subtypes. I want to create 2 dropdown widgets: the first dropdown widget will have ['a','b','c'], and the second dropdown widget will display subtypes depending on what variable the user selects for the first widget.
I honestly have no idea how to do this. I'll try to write out some code for this:
import pandas as pd
from IPython.display import *
import ipywidgets as widgets
from ipywidgets import *
# Create the dataframe
df = pd.DataFrame([['a1','a2','a3'],
                   ['b1','b2','b3'],
                   ['c1','c2','c3']], index = ['a','b','c']).transpose()
# Widgets
widget1 = Dropdown(options = ['a','b','c'])
display(widget1)
widget2 = Dropdown(???????)
display(widget2)
And depending on what I select for the two dropdown widgets, I want some function executed.
Any help is appreciated.
I found out how to do this. I hope this helps for anyone else who's also looking to do the same thing.
x_widget = Dropdown(options = ['a','b','c'])
y_widget = Dropdown()
# Define a function that updates the content of y based on what we select for x
def update(*args):
    y_widget.options = df[x_widget.value].unique().tolist()
x_widget.observe(update)
# Some function you want executed; interact passes the widget values as x and y
def random_function(x, y):
    ...
interact(random_function,
         x = x_widget,
         y = y_widget);
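The update callback relies on one lookup: the unique values of the column named by the first dropdown. That lookup can be sketched and checked without any widgets; options_for is a hypothetical helper name:

```python
import pandas as pd

df = pd.DataFrame([['a1', 'a2', 'a3'],
                   ['b1', 'b2', 'b3'],
                   ['c1', 'c2', 'c3']], index=['a', 'b', 'c']).transpose()

def options_for(frame, column):
    """Return the unique values of `column` as a plain list."""
    return frame[column].unique().tolist()

print(options_for(df, 'b'))
```

The returned list is what the callback assigns to y_widget.options.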
