Dash dcc.upload component for large file - python

I am developing a Dash application with a file upload feature. The files are large (at least about 100 MB), so I have set max_size=-1 (no file size limit) to support them.
Below is the code:
dcc.Upload(
    id="upload_dataset",
    children=html.Div(
        [
            "Drag and Drop or ",
            html.A(
                "Select File",
                style={
                    "font-weight": "bold",
                },
                title="Click to select file.",
            ),
        ]
    ),
    multiple=False,
    max_size=-1,
)
The uploaded files are saved on the server side. The dcc.Upload component has a contents attribute that holds the entire file as a base64-encoded string. While researching, I learned that this contents string is also held in browser memory before the data is sent to the server.
Problem: for small files, holding contents in browser memory may be fine. With large files, the browser may crash and the app may freeze.
Is there any way to bypass this default behavior? I would like to send the file in chunks or as a stream.
How can I achieve this in Dash, using dcc.Upload or any other way?
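To put a number on the concern above, here is a minimal sketch of how much larger a file becomes once it is base64-encoded into contents. The helper name base64_size is only illustrative, and the 100 MiB figure is an assumed file size taken from the question.

```python
import base64

def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)

# Check the formula against the real encoder on a small payload.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == base64_size(len(sample))

raw = 100 * 1024 * 1024  # an assumed 100 MiB file
encoded = base64_size(raw)
print(f"raw: {raw / 2**20:.0f} MiB, base64: {encoded / 2**20:.1f} MiB")
```

Base64 turns every 3 bytes into 4 characters, so the encoded string is roughly a third larger than the file, and the browser has to hold that string in memory on top of anything else it keeps.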

You can use the dash-uploader library. It allows you to directly transfer data from the browser to the server hard drive, so you don't face any file size issues.
This library is a hobby project of the maintainer, so it might not be the most production-worthy library. That said, I tested it today and it seems stable enough; I even got it working with a Dash app that runs on AWS Lambda.
Visit the more extensive documentation to get started with the library.
Here is a short code example to get you started with a local version.
requirements.txt
install with pip install -r requirements.txt
dash==2.8.1
dash-uploader==0.7.0a1
packaging==21.3
app.py
Copy code in file app.py and run the file. It runs as-is.
import pprint
from pathlib import Path
import os
import uuid

import dash_uploader as du
import dash
from dash import Output, html

app = dash.Dash(__name__)

UPLOAD_FOLDER_ROOT = Path("./tmp") / "uploads"
du.configure_upload(
    app,
    str(UPLOAD_FOLDER_ROOT),
    use_upload_id=True,
)


def get_upload_component(id):
    return du.Upload(
        id=id,
        max_file_size=50,  # 50 MB
        chunk_size=4,  # 4 MB
        filetypes=["csv", "json", "txt", "xlsx", "xls", "png"],
        upload_id=uuid.uuid1(),  # Unique session id
    )


def get_app_layout():
    return html.Div(
        [
            html.H1("Demo"),
            html.Div(
                children=[
                    get_upload_component("upload_data"),
                    html.Div(
                        id="upload_output",
                    ),
                ],
                style={  # wrapper div style
                    "textAlign": "center",
                    "width": "600px",
                    "padding": "10px",
                    "display": "inline-block",
                },
            ),
        ],
        style={
            "textAlign": "center",
        },
    )


# get_app_layout is a function
# This way we can use unique session id's as upload_id's
app.layout = get_app_layout


@du.callback(
    output=Output("upload_output", "children"),
    id="upload_data",
)
def callback_on_completion(status: du.UploadStatus):
    """Has some print statements to get you started in understanding the status
    object and how to access the file location of your uploaded file."""
    pprint.pprint(status.__dict__)
    print(f"Contents of {UPLOAD_FOLDER_ROOT}:\n{os.listdir(UPLOAD_FOLDER_ROOT)}")
    upload_id_folder = Path(status.uploaded_files[0]).parent
    print(f"Current upload_id: {upload_id_folder.name}")
    print(
        f"Contents of subfolder {upload_id_folder.name}:\n{os.listdir(upload_id_folder)}"
    )
    return html.Ul([html.Li(str(x)) for x in status.uploaded_files])


if __name__ == "__main__":
    app.run(debug=True)
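The callback above reads the upload_id from the parent folder of each uploaded file. As a small sketch of that path layout, assuming files end up at root/<upload_id>/<filename> as the callback implies (the helper split_upload_path and the example values are hypothetical):

```python
from pathlib import Path

UPLOAD_FOLDER_ROOT = Path("./tmp") / "uploads"

def split_upload_path(uploaded_file):
    """Return (upload_id, filename) for a path shaped like root/<upload_id>/<filename>."""
    path = Path(uploaded_file)
    return path.parent.name, path.name

# Hypothetical values, only for illustration.
example = UPLOAD_FOLDER_ROOT / "1b2c3d" / "data.csv"
upload_id, filename = split_upload_path(example)
print(upload_id, filename)
```

This uses only pathlib, so it can be tried without starting the Dash app.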

Related

Serve TailwindCSS with django_plotly_dash

I have a Dash app in Django being served via django-plotly-dash and I'm using Tailwind for the styling across the site. Tailwind seems to be working everywhere except for the Dash app, where it is kind of working, but seems to be overwritten by the Bootstrap at some points.
I can see the Tailwind styling without any issues if I run the Dash app on its own, but not when embedded in Django.
Here's the view inside Django (and the code for this basic example):
And here it is (with garish colors to see the difference) while running Dash and Tailwind without Django:
Some of the Tailwind styling is being applied, such as the container mx-auto bit of the Dash layout, but others (e.g. coloring) are being dropped.
Here's the code for the Dash app, which is split into layout.py, callbacks.py, and dashboard.py:
layout.py:
from dash import dcc, html

layout = html.Div(
    className="bg-green-100 container mx-auto my-auto px-15 py-5",
    children=[
        html.Div(
            className="bg-red-100 py-5",
            children=[
                dcc.Dropdown(
                    id="symbol-input",
                    options=[
                        {"label": "Apple", "value": "AAPL"},
                        {"label": "Tesla", "value": "TSLA"},
                        {"label": "Meta", "value": "META"},
                        {"label": "Amazon", "value": "AMZN"}
                    ],
                    searchable=True,
                    value="AAPL",
                )
            ]),
        html.Div(
            className="max-w-full shadow-2xl rounded-lg border-3",
            id="price-chart"
        )
    ]
)
callbacks.py:
from dash import dcc, html
from dash.dependencies import Input, Output
import yfinance as yf
import plotly.express as px


def register_callbacks(app):
    @app.callback(
        Output("price-chart", "children"),
        Input("symbol-input", "value"),
    )
    def get_data(symbol):
        df = yf.Ticker(symbol).history()
        fig = px.line(
            x=df.index,
            y=df.Close,
            title=f"Price for {symbol}",
            labels={
                "x": "Date",
                "y": "Price ($)",
            }
        )
        return dcc.Graph(
            id="price-chart-1",
            figure=fig
        )
dashboard.py:
from django_plotly_dash import DjangoDash
from .layout import layout
from .callbacks import register_callbacks
app = DjangoDash("Dashboard")
app.css.append_css({"external_url": "/static/css/output.css"})
app.layout = layout
register_callbacks(app)
The Tailwind CSS is in /static/css/output.css and is linked as the stylesheet in the base.html. To ensure it's working correctly in Django, I put a simple homepage up and copied code from Tailwind's site to confirm that it works. Again, it's partially coming through in the Dash app, but seems to get overwritten.
After viewing your repository, I think the problem is not that the Bootstrap CSS overrides Tailwind's; the problem is that the classes you defined are simply not scanned by Tailwind CSS. I'm going to assume that you generate output.css using this command:
> npx tailwindcss -i ./static/css/input.css -o ./static/css/output.css --watch
If that's how you generate the CSS, then I can see what's going on. It's because your tailwind.config.js file looks like this:
...
content: [
    "./static/css/*.html",
    "./templates/*.html",
    "./static/css/*.js",
],
...
You said that the container and mx-auto classes are applied but the color classes (e.g. bg-green-100, bg-red-100) are not. That's because container and mx-auto appear in files matched by "./static/css/*.html", "./templates/*.html", or "./static/css/*.js", while bg-green-100 and bg-red-100 appear only in a file outside those directories (apps\dashboard\layout.py).
The easiest fix is to add the directories that contain your class names to the content list in tailwind.config.js, e.g.:
/** @type {import('tailwindcss').Config} */
module.exports = {
    content: [
        "./static/css/*.html",
        "./templates/*.html",
        "./static/css/*.js",
        "./apps/**" // add this line
    ],
    theme: {
        extend: {},
    },
    plugins: [],
}
This tells Tailwind to scan every file in the ./apps directory and its subdirectories for class names during the build. Don't forget to run the Tailwind CLI command (the one mentioned earlier) every time you run the server, though.
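To check which files a given class name actually appears in, a rough diagnostic can be sketched in Python. This is not how Tailwind itself scans content; it is only a plain substring search, the helper files_mentioning is hypothetical, and the glob lists mirror the content entries above with pathlib syntax ("apps/**/*" stands in for "./apps/**").

```python
from pathlib import Path

def files_mentioning(class_name, root, patterns):
    """Return files under root matching any glob pattern whose text contains class_name."""
    hits = set()
    for pattern in patterns:
        for path in Path(root).glob(pattern):
            if path.is_file() and class_name in path.read_text(errors="ignore"):
                hits.add(path)
    return sorted(hits)

if __name__ == "__main__":
    # Run from the project root. Paths are relative to the current directory.
    old_globs = ["static/css/*.html", "templates/*.html", "static/css/*.js"]
    new_globs = old_globs + ["apps/**/*"]
    for name in ("container", "bg-green-100"):
        print(name, "old:", files_mentioning(name, ".", old_globs))
        print(name, "new:", files_mentioning(name, ".", new_globs))
```

If a class shows up only under the new glob list, it was outside the directories Tailwind had been told to scan.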

Consistent update dash page across connections

I'm making a multi-page dash application that I plan to host on a server using Gunicorn and Nginx. It will access a PostgreSQL database on an external server over the network.
The data on one of the pages is obtained by a query to the database and should be updated every 30 seconds. I trigger the update with a @callback driven by dcc.Interval.
My code (simplified version):
from dash import Dash, html, dash_table, dcc, Input, Output, callback
import dash_bootstrap_components as dbc
from flask import Flask
import pandas as pd
from random import random

server = Flask(__name__)
app = Dash(__name__, server=server, suppress_callback_exceptions=True, external_stylesheets=[dbc.themes.BOOTSTRAP])

app.layout = html.Div([
    dcc.Interval(
        id='interval-component-time',
        interval=1000,
        n_intervals=0
    ),
    html.Br(),
    html.H6(id='time_update'),
    dcc.Interval(
        id='interval-component-table',
        interval=1/2*60000,  # 30 seconds, in milliseconds
        n_intervals=0
    ),
    html.Br(),
    html.H6(id='table_update')
])


@callback(
    Output('time_update', 'children'),
    Input('interval-component-time', 'n_intervals')
)
def time_update(n_intervals):
    time_show = 30
    text = "Next update in {} sec".format(time_show - (n_intervals % 30))
    return text


@callback(
    Output('table_update', 'children'),
    Input('interval-component-table', 'n_intervals')
)
def data_update(n_intervals):
    # here in a separate file a query is made to the database and a dataframe is returned
    # now here is a simplified receipt df
    col = ["Col1", "Col2", "Col3"]
    data = [[random(), random(), random()]]
    df = pd.DataFrame(data, columns=col)
    return dash_table.DataTable(df.to_dict('records'),
                                style_cell={'text-align': 'center', 'margin-bottom': '0'},
                                style_table={'width': '500px'})


if __name__ == '__main__':
    server.run(port=5000, debug=True)
Locally, everything works fine: the load on the database is small, and one such query loads 1 of 8 processors at about 30% for 3 seconds.
But if I open the application in several browser windows, each window gets its data from its own query to the database, issued at a different time, so the load doubles. I am worried that with more than 10 users connected, my database server will not cope or will freeze badly, and the database on it must keep working without delays or crashes.
Question:
Is it possible to synchronize the page refresh across connections? That is, so that the data is updated at the same time for all users, using only one query to the database.
I studied everything about callbacks in the documentation and did not find an answer.
Solution
Thanks for the advice, @Epsi95! I studied the Dash Performance page and added this to my code:
from flask_caching import Cache

cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory',
    'CACHE_THRESHOLD': 50
})


@cache.memoize(timeout=30)
def query_data():
    # here I make a query to the database and save the result in a dataframe
    return df


def dataframe():
    df = query_data()
    return df
And in the @callback function I call the dataframe() function.
Everything works the way I needed it. Thank you!
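To illustrate why memoizing with a 30-second timeout collapses many requests into one query, here is a minimal standard-library sketch. It is a stand-in for the idea, not Flask-Caching's implementation: the ttl_cached decorator, the query_count counter, and the fake query result are all hypothetical.

```python
import time

def ttl_cached(timeout):
    """Cache a zero-argument function's result for `timeout` seconds."""
    def decorator(func):
        state = {"value": None, "expires": None}

        def wrapper():
            now = time.monotonic()
            # Recompute only on the first call or once the cached value has expired.
            if state["expires"] is None or now >= state["expires"]:
                state["value"] = func()
                state["expires"] = now + timeout
            return state["value"]

        return wrapper
    return decorator

query_count = 0

@ttl_cached(timeout=30)
def query_data():
    global query_count
    query_count += 1  # stands in for the real database query
    return {"rows": query_count}

# Ten "users" asking within the same 30-second window share one query result.
results = [query_data() for _ in range(10)]
print(query_count)
```

Note that this sketch keeps the cache in the memory of a single process. With several Gunicorn workers, each process would hold its own copy, which is one reason a shared backend such as the filesystem cache used above is the more suitable choice.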

Dash Uploader will not work with files above 1Mb

This uploader is for .xlsx files. It doesn't work with files above 1.0 MB, but works fine with smaller files.
I have max_file_size set well above this.
Running this locally, I can upload a file of any size without a problem; it is only an issue with the version deployed here:
link to Elastic Beanstalk app
import dash_uploader as du
import dash
from dash import html

app = dash.Dash(__name__)
#application = app.server
du.configure_upload(app, r'')

app.layout = html.Div([
    du.Upload(
        text='Drag and Drop Here',
        text_completed='Successful Upload of ',
        id='upload',
        max_file_size=18000,
        max_files=1,
        filetypes=['xlsx'],
        upload_id='uploader_id'
    ),
])

if __name__ == '__main__':
    #application.run_server(port=8080)
    app.run_server(debug=True)

Dash testing dcc.upload with dash.testing

When writing production-ready code, we want to be able to test our web app automatically every time we update the code. Dash for Python allows this through dash.testing. However, in my app I upload an Excel file using the dcc.Upload() component.
How do I write a test that can send the upload link to this component?
The dcc.Upload component does not allow you to put an id on the input element that stores the upload link.
It is easy to work around this by inspecting the upload button/field that you have created with the browser's developer tools. In the Elements tab, look for the line that contains "<input type=file ... >".
Right-click it and choose Copy XPath; it should give you a relative path like //*[@id="upload-data"]/div/input
The test case would look like this:
from dash.testing.application_runners import import_app
from selenium.webdriver.common.by import By


def test_xxxx001_upload(dash_duo):
    # get app from app.py
    app = import_app("src.app")
    dash_duo.start_server(app)
    # find the element that contains the input link, using the web driver
    # (find_element_by_xpath was removed in Selenium 4; find_element(By.XPATH, ...) replaces it)
    element = dash_duo.driver.find_element(By.XPATH, '//*[@id="upload-data"]/div/input')
    element.send_keys("C:\\path\\to\\testData.xlsx")
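The hard-coded Windows path in the test above only works on one machine. A minimal sketch of one alternative, assuming the test data file is stored next to the test module, is to build the path at run time; a file input is normally given an absolute path to a file that exists. The helper name absolute_fixture_path is hypothetical.

```python
from pathlib import Path

def absolute_fixture_path(directory, name):
    """Return the absolute path of an existing test data file as a string."""
    path = (Path(directory) / name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"test data file not found: {path}")
    return str(path)

# Hypothetical usage inside the test, assuming testData.xlsx sits beside the test module:
# element.send_keys(absolute_fixture_path(Path(__file__).parent, "testData.xlsx"))
```

Failing early with FileNotFoundError makes a missing data file easier to tell apart from a problem in the upload callback.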
Folder structure:
myapp
--src
  --app.py
  --server.py
  --run.py
--tests
  --test_app
The dcc.Upload component is used to create the upload button:
import dash_core_components as dcc
import dash_html_components as html

html.Div(
    id="file-drop",
    children=[
        dcc.Upload(
            id="upload-data",
            children=html.Div(
                ["Drag and Drop or ", html.A("Select File"),],
                id="select-file",
            ),
            multiple=False,
        ),
        html.Div(id="output-data-upload"),
    ],
)

Call Local CSS files in Dash App

I am attempting to run the Dash Vanguard demo app while hosting the 4 css files locally. I have successfully been able to use a workaround and locally host a single css file in Dash, but have not been able to simultaneously call all 4.
This is the current Vanguard dash app with the css files externally hosted:
external_css = [
    "https://cdnjs.cloudflare.com/ajax/libs/normalize/7.0.0/normalize.min.css",
    "https://cdnjs.cloudflare.com/ajax/libs/skeleton/2.0.4/skeleton.min.css",
    "//fonts.googleapis.com/css?family=Raleway:400,300,600",
    "https://codepen.io/bcd/pen/KQrXdb.css",
    "https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css",
]

for css in external_css:
    app.css.append_css({"external_url": css})
My attempt at hosting css files locally:
app.scripts.config.serve_locally = True
app.css.config.serve_locally = True
....
app.layout = html.Div([
    html.Link(href='/assets/skeleton.min.css', rel='stylesheet'),
    html.Link(href='/assets/skelly.css', rel='stylesheet'),
    html.Link(href='/assets/normalize.min.css', rel='stylesheet'),
    html.Link(href='/assets/font.css', rel='stylesheet'),
    dcc.Location(id='url', refresh=False),
    html.Div(id='page-content')
])
....
@app.server.route('/assets/<path:path>')
def static_file(path):
    static_folder = os.path.join(os.getcwd(), 'assets')
    return send_from_directory(static_folder, path)
The app currently loads without any styling. Not sure why it won't load even one of the css files.
I had the same issue loading local files. The problem was in the @app.server.route. I changed it to:
@app.server.route('/static/<path>')
and it worked.
Edit: Starting with Dash 0.22, you just need to put the CSS file in an assets folder. See the docs.
I'm currently having the same issue so if you find an answer please add it here!... I don't have a solution but here is the research I've done in case you haven't seen any of these:
https://github.com/plotly/dash/pull/171
https://dash.plot.ly/external-resources
https://github.com/plotly/dash-recipes/blob/master/dash-local-css-link.py
