I have a Dash application that lets the user filter a pandas dataframe, which results in a graph. They can also download a CSV of the filtered dataframe. This is accomplished by passing arguments to the URL, retrieving the arguments using flask.request.args, refiltering the database, and writing the output to a file.
While working on this solution, I added print statements to help me track variable values. Although the download link is working with the desired result, I came across some behavior that I don't fully understand. I think this may have something to do with @app.server.route and when/how it is executed.
For example:
The print statements are not always executed; they run only some of the time.
They seem to have a higher rate of execution once I apply filters, rather than clicking the download link with no filters applied.
After applying a filter, clicking the download link, and confirming that it caused the print statements to execute, reloading the app and applying the same filter may result in the print statements not executing.
The download link always performs as intended, but I do not understand how the dataframe is being filtered and written via @app.server.route('/download_csv'), while also skipping over the print statements.
UPDATE
I have produced an MRE for Jupyter notebooks. Please note that you must pip install jupyter-dash for this code to execute.
Some notes:
I ran three tests where I would click the download CSV link 10x each.
In the first two tests, the print statements executed 8/10 times.
In the final test, they executed 3/10 times.
In the third test, I cleared the age dropdown and performed most of the tests with it as 'Null.' Sometimes print statements will execute and return 'Null', however most times there was no execution.
The MRE is below:
Update 2
After waiting 24 hours and running the code again, all of the lines in @app.server.route seem to be executing. That is, until I click the download button quickly after changing filters. When this happens, I can get the print statements to not execute. Even though the print statements are skipped, the other lines still run. Some guesses as to what is going on:
When the print statements don't execute, it seems that a previous version of the code is being executed. Perhaps it is stored in some temporary memory location?
It seems that restarting the system or long periods of inactivity cause the current version of the code to become the default when print statements don't execute.
print statements seem to execute less frequently after quick filter changes and download requests.
import io
import flask
import numpy as np
import pandas as pd
import plotly.express as px
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
# Load Data
df = pd.DataFrame({'id':np.arange(100), 'name':['cat', 'dog', 'mouse', 'bird', 'rabbit']*20, 'age':np.random.randint(1,30,size=100)})
# Build App
app = JupyterDash(__name__)
app.layout = html.Div([
    html.H1("JupyterDash Demo"),
    dcc.Graph(id='graph'),
    html.Label([
        "names",
        dcc.Dropdown(
            id='names', clearable=False,
            value='cat', options=[
                {'label': names, 'value': names}
                for names in df.name.unique()
            ])
    ]),
    html.Label([
        "age",
        dcc.Dropdown(
            id='age', clearable=True,
            options=[
                {'label': att, 'value': att}
                for att in df.age.unique()
            ])
    ]),
    html.Br(),
    html.A(
        "Download CSV",
        id="download_csv",
        href="#",
        className="btn btn-outline-secondary btn-sm"
    )
])
# Define callback to update graph
@app.callback(
    [Output('graph', 'figure'),
     Output('download_csv', 'href')],
    [Input("names", "value"),
     Input('age', 'value')]
)
def update_figure(names, age):
    if not names:
        names = 'Null'
        fil_df = df
    else:
        fil_df = df[df['name'].isin([names])]
        fig = px.bar(
            fil_df, x='id', y='age',
            title="Animals"
        )
    if not age:
        age = 'Null'
        fil_df = fil_df
    else:
        fil_df = fil_df[(fil_df['age'].isin([age]))]
        fig = px.bar(
            fil_df, x='id', y='age', title="Animals"
        )
    return fig, "/download_csv?value={}/{}".format(names, age)
app.run_server(mode='inline')

@app.server.route('/download_csv')
def download_csv():
    value = flask.request.args.get('value')
    value = value.split('/')
    selected_1 = value[0]
    selected_2 = value[1]
    print(selected_1)
    print(selected_2)
    str_io = io.StringIO()
    df.to_csv(str_io)
    mem = io.BytesIO()
    mem.write(str_io.getvalue().encode('utf-8'))
    mem.seek(0)
    str_io.close()
    return flask.send_file(mem,
                           mimetype='text/csv',
                           attachment_filename='downloadFile.csv',
                           as_attachment=True)
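The mechanism described in the question (pack the filter values into the download URL, read them back in the route, refilter, write a CSV) can be sketched without Dash or Flask. This is only a standard-library sketch under stated assumptions: `parse_filters`, `filter_rows`, and `rows_to_csv` are hypothetical helper names, rows are plain dicts rather than a pandas dataframe, and 'Null' is the placeholder the callback above writes into the URL for an empty dropdown.

```python
import csv
import io
from urllib.parse import parse_qs, urlparse


def parse_filters(url):
    """Split the 'value' query argument ("name/age") into its two parts."""
    query = parse_qs(urlparse(url).query)
    name, age = query["value"][0].split("/")
    # 'Null' is the placeholder the callback writes for an empty dropdown.
    return (None if name == "Null" else name,
            None if age == "Null" else int(age))


def filter_rows(rows, name, age):
    """Keep the rows that match every filter that is actually set."""
    return [
        row for row in rows
        if (name is None or row["name"] == name)
        and (age is None or row["age"] == age)
    ]


def rows_to_csv(rows, fieldnames):
    """Serialize the rows to CSV text, as a download route would."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data standing in for the dataframe.
rows = [
    {"id": 0, "name": "cat", "age": 3},
    {"id": 1, "name": "dog", "age": 3},
    {"id": 2, "name": "cat", "age": 7},
]
name, age = parse_filters("/download_csv?value=cat/Null")
csv_text = rows_to_csv(filter_rows(rows, name, age), ["id", "name", "age"])
```

Keeping the parsing and filtering in plain functions like these makes it possible to check the filter logic on its own, separately from whether the route's print statements show up in the notebook output.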
Related
I'm making a multi-page dash application that I plan to host on a server using Gunicorn and Nginx. It will access a PostgreSQL database on an external server over the network.
The data on one of the pages is obtained by a query to the database and should be updated every 30 seconds. I trigger the @callback update through dcc.Interval.
My code (simplified version):
from dash import Dash, html, dash_table, dcc, Input, Output, callback
import dash_bootstrap_components as dbc
from flask import Flask
import pandas as pd
from random import random
server = Flask(__name__)
app = Dash(__name__, server=server, suppress_callback_exceptions=True, external_stylesheets=[dbc.themes.BOOTSTRAP])
app.layout = html.Div([
    dcc.Interval(
        id='interval-component-time',
        interval=1000,
        n_intervals=0
    ),
    html.Br(),
    html.H6(id='time_update'),
    dcc.Interval(
        id='interval-component-table',
        interval=1/2*60000,
        n_intervals=0
    ),
    html.Br(),
    html.H6(id='table_update')
])
@callback(
    Output('time_update', 'children'),
    Input('interval-component-time', 'n_intervals')
)
def time_update(n_intervals):
    time_show = 30
    text = "Next update in {} sec".format(time_show - (n_intervals % 30))
    return text

@callback(
    Output('table_update', 'children'),
    Input('interval-component-table', 'n_intervals')
)
def data_update(n_intervals):
    # here in a separate file a query is made to the database and a dataframe is returned
    # now here is a simplified receipt df
    col = ["Col1", "Col2", "Col3"]
    data = [[random(), random(), random()]]
    df = pd.DataFrame(data, columns=col)
    return dash_table.DataTable(df.to_dict('records'),
                                style_cell={'text-align': 'center', 'margin-bottom': '0'},
                                style_table={'width':'500px'})

if __name__ == '__main__':
    server.run(port=5000, debug=True)
Locally, everything works fine for me. The load on the database is small: one such request loads 1 of 8 processors to 30% for 3 seconds.
But if you open my application in several browser windows, each window fetches the same data with its own query to the database at a different time, so with two windows the load doubles. I am worried that with more than 10 people connected, my database server will not cope or will freeze badly, and the database on it must work without delay and must not go down.
Question:
Is it possible to synchronize the page refresh across different connections? That is, so that the data is updated at the same time for all users, using only one query to the database.
I studied everything about the callback in the documentation and did not find an answer.
Solution
Thanks for the advice, @Epsi95! I studied the Dash Performance page and added this to my code:
from flask_caching import Cache

cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory',
    'CACHE_THRESHOLD': 50
})

@cache.memoize(timeout=30)
def query_data():
    # here I make a query to the database and save the result in a dataframe
    return df

def dataframe():
    df = query_data()
    return df
And in the @callback function I make a call to the dataframe() function.
Everything works the way I needed it. Thank you!
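To illustrate why memoizing with a 30-second timeout means several viewers share a single query, here is a minimal standard-library sketch of a time-based cache. It is not how Flask-Caching is implemented; `ttl_cache`, the fake clock, and the counter are hypothetical names used only for the demonstration. Unlike the filesystem-backed cache in the solution above, this in-memory version is not shared between worker processes.

```python
import time


def ttl_cache(timeout, clock=time.monotonic):
    """Cache a zero-argument function's result for `timeout` seconds."""
    def decorator(func):
        state = {"value": None, "expires": None}

        def wrapper():
            now = clock()
            # Recompute only when nothing is cached yet or the entry expired.
            if state["expires"] is None or now >= state["expires"]:
                state["value"] = func()
                state["expires"] = now + timeout
            return state["value"]
        return wrapper
    return decorator


fake_now = [0.0]      # hypothetical controllable clock for the demo
query_count = [0]     # counts how often the "database" is really hit


@ttl_cache(timeout=30, clock=lambda: fake_now[0])
def query_data():
    # Stand-in for the real database query.
    query_count[0] += 1
    return {"query_number": query_count[0]}


first = query_data()   # runs the "query"
second = query_data()  # served from the cache, no new query
fake_now[0] = 31.0     # pretend 31 seconds have passed
third = query_data()   # cache expired, runs the "query" again
```

Every caller within the same 30-second window gets the cached result, so the number of database queries depends on elapsed time rather than on how many browser windows are open.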
I have the following problem:
I'm making two SQL requests (at the same time) to the same database, and the second request depends on the result of the first:
e.g.
def get_alldata(timestart):
    statement = f""" SELECT * FROM "xxx"."xxx" WHERE timestart>=toDateTime('{timestart}')"""
    # note that I already defined my client from clickhouse and it works
    df = client.query_dataframe(statement)
    return df

# now the second
# note that I get the id, time and line from the table (df) of the first
# function (I'm working with Dash Plotly), so I saved df in a Store and
# used it for the second function
def second_request(id, time, line):
    statement = f'''SELECT student,class FROM "xxx"."xxx" WHERE id IN ({id}) AND time IN ({time}) AND line IN ({line})'''
    df = client.query_dataframe(statement)
    return df
The first function works in my app, but when it comes to the second I get this error:
Error on socket shutdown: [WinError 10038] An operation related to an object that is not a socket
and also AttributeError: 'NoneType' object has no attribute 'close'
I searched online for information about sockets, but I didn't find anything related to my issue.
Does anyone have an idea?
I call the functions in two Dash callbacks:
@app.callback(Output('id', 'options'),
              Output('time', 'options'),
              Output('line', 'options'),
              [Input('loaded_data', 'data'),
               ]
              )
def get_first_fuction(data):
    df = get_alldata(data)  # then I used it for a dropdown for the three outputs
    .
    .

@app.callback(
    Output('graph', 'children'),
    [Input('id', 'value'),
     Input('time', 'value'),
     Input('line', 'value'),
     ]
)
def second_function(id,time,line):
    df = second_request(id,time,line)
    # and I use it for a plot
    # I get the error by this df
Maybe you close the cursor and/or the connection between the calls to those functions? The functions themselves seem OK, but could you show the whole program, or at least the part where you actually call the functions?
I am trying to build a dashboard using Dash. I keep getting this error when I go to the default website http://127.0.0.1:8050/ and I get TypeError: cannot convert 'NoneType' object. Check the image for the error. My code does not have any mistakes and I was able to run it before and the dashboard would work perfectly. Can someone please help me? Here is the code:
import dash # (version 1.12.0) pip install dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import datetime
from datetime import date
app = dash.Dash(__name__)
# App layout
app.layout = html.Div([
    html.H1("Snodas SWE/SD For January", style={'text-align': 'center'}),
    dcc.DatePickerSingle(
        id='my-date-picker-single',
        min_date_allowed=date(2020, 1, 1),
        max_date_allowed=date(2020, 12, 30),
        initial_visible_month=date(2020, 1, 1),
        date=date(2020, 1, 1)
    ),
    html.Div(id='output-container-date-picker-single'),
    dcc.Checklist(
        options=[
            {'label': 'SWE', 'value': 'SWE'},
            {'label': 'SD', 'value': 'SD'}
        ],
        labelStyle={'display': 'inline-block'}
    ),
    html.Iframe(id='map', srcDoc=open('map1.html', 'r').read(), width='100%', height='1000')
])
@app.callback(
    Output('map', 'srcDoc'),
    Input('my-date-picker-single', 'date'))
def update_output(date):
    return open('map_swe_sd_{}.html'.format(str(date)), 'r').read()

if __name__ == "__main__":
    app.run_server(debug = True)
Error Message
Try this:
@app.callback(
    Output('map', 'srcDoc'),
    Input('my-date-picker-single', 'date'))
def update_output(date):
    if not date:
        raise dash.exceptions.PreventUpdate
    return open('map_swe_sd_{}.html'.format(str(date)), 'r').read()
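The guard in the answer above covers an empty date. A related case, offered here only as an assumption about what could go wrong, is a date for which no map file exists. The sketch below puts both checks in one plain function using the standard library; `map_path_for` and its `directory` parameter are hypothetical, and the date is assumed to arrive as a string such as '2020-01-01'.

```python
import tempfile
from pathlib import Path


def map_path_for(date, directory="."):
    """Return the map file for a date string, or None if it cannot be used."""
    if not date:
        return None  # the picker was cleared, nothing to load
    path = Path(directory) / "map_swe_sd_{}.html".format(date)
    # Only hand back a path that really exists on disk.
    return path if path.is_file() else None


# Demo in a throwaway directory containing a single map file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "map_swe_sd_2020-01-01.html").write_text("<html></html>")
    found = map_path_for("2020-01-01", tmp)
    found_name = found.name if found else None
    missing = map_path_for("2020-01-02", tmp)
    cleared = map_path_for(None, tmp)
```

In the callback, a `None` result would be the point at which to raise `dash.exceptions.PreventUpdate` instead of calling `open()`.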
I am new to all of this, so I hope I do this right and according to the conventional rules:
I had the same problem when I simply tried to do the dash tutorial on the dash website.
There was something wrong with my installation of dash or with my Python environment (I know that "something wrong" is not really an identification of the problem). It seems like this is similar in your case, since the tracebacks point to packages, right? After creating and activating a new Python environment and installing dash there, everything worked.
I hope this might help you as well.
I have an editable Dash table. I want to keep the table sorted by a column, so that when the user enters data, the table is re-sorted right away. I achieved this as shown on the page https://dash.plotly.com/datatable/callbacks. The sort is already set when the page loads. I got stuck on the last step, where I want to hide the sort option from the user. Is that possible?
Example on the image. I want to delete the arrows marked yellow, but keep sort by column 'pop'.
edited code example from https://dash.plotly.com/datatable/callbacks:
import dash
from dash.dependencies import Input, Output
import dash_table
import pandas as pd
app = dash.Dash(__name__)
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/gapminder2007.csv')
PAGE_SIZE = 5
app.layout = dash_table.DataTable(
    id='table-multicol-sorting',
    columns=[
        {"name": i, "id": i} for i in sorted(df.columns)
    ],
    data=df.to_dict('records'),
    page_size=PAGE_SIZE,
    sort_action='native',
    sort_mode='multi',
    sort_as_null=['', 'No'],
    sort_by=[{'column_id': 'pop', 'direction': 'asc'}],
    editable=True,
)

if __name__ == '__main__':
    app.run_server(debug=True)
You can target the sort element and hide it using css like this:
span.column-header--sort {
    display: none;
}
So you can put that code in a css file in your assets directory for example. See the documentation here for more information about ways to include styles in a dash app.
I was able to do it by setting sort_action='none' in Dash v1.16.2.
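If the re-sort after an edit ends up being done in a callback rather than natively (an assumption about the setup, not something the answers above require), the sorting step itself can be sketched in plain Python. The rows are assumed to look like `df.to_dict('records')` and `sort_by` uses the same list-of-dicts shape as in the layout above; `apply_sort` is a hypothetical helper, not part of Dash.

```python
def apply_sort(rows, sort_by):
    """Sort row dicts by a sort_by list whose first entry is most significant."""
    result = list(rows)  # work on a copy, leave the input untouched
    # Python's sort is stable, so sorting by the least significant column
    # first and the most significant column last gives a multi-column sort.
    for spec in reversed(sort_by):
        result.sort(
            key=lambda row, col=spec["column_id"]: row[col],
            reverse=spec["direction"] == "desc",
        )
    return result


# Hypothetical sample rows.
rows = [
    {"country": "B", "pop": 30},
    {"country": "A", "pop": 10},
    {"country": "C", "pop": 20},
]
sorted_rows = apply_sort(rows, [{"column_id": "pop", "direction": "asc"}])
```

Returning a new list keeps the original records intact, which is convenient when the same data also feeds other components.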
I have a Dash DataTable with several pages (pagination = True) because it has over 5000 records, and the page loads very slowly when I use, for example, page_size=5000. I wrote a function that, when a row is clicked, returns the contents of that row with more details from the sqlite database I’m using.
Here is the base code:
@app.callback(
    Output('popup-container', 'children'),
    [Input('table', 'selected_cells'),
     Input('table', 'derived_virtual_data'),
     Input('my-dropdown', 'value'),
     ],
)
def update_popup(sel, current_table, value):
    if sel:
        column_name = sel[0]['column_id']
        row_num = sel[0]['row']
        valor = current_table[row_num][column_name]
        print(valor)
        print(current_table)
Where ‘popup-container’ is a Div, ‘table’ is the dashTable and ‘my-dropdown’ is a dropdown that doesn’t interfere. Sel is a variable that gets the selected_cell row.
This function works fine for the first page of the table. However, when I change pages, the current_table variable still returns only the first page of the table. How can I make current_table return the data of the current page?
I think this is what you need. The docs go over how to handle pagination on the back end. Not only does it help you speed things up in the browser by loading only what you need, but you can get the page_current prop as an input for your callback.
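The core of back-end pagination is slicing the full data by page number and page size. Here is a minimal sketch of that arithmetic, assuming `page_current` is zero-based and that the selected row index is relative to the visible page; `page_slice` and `absolute_row` are hypothetical helper names.

```python
def page_slice(records, page_current, page_size):
    """Return only the records that belong on the requested page."""
    start = page_current * page_size
    return records[start:start + page_size]


def absolute_row(page_current, page_size, row_on_page):
    """Translate a row index within the page into an index in the full data."""
    return page_current * page_size + row_on_page


# Hypothetical data: 12 records shown 5 per page, so page 2 holds the last 2.
records = [{"id": i} for i in range(12)]
page = page_slice(records, page_current=2, page_size=5)
index = absolute_row(page_current=2, page_size=5, row_on_page=1)
```

With the page number available in the callback, the clicked row can be looked up either in the page slice directly or in the full data through the absolute index.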