I would like a Dash dashboard to extract data from powerpoint .pptx files, a deployment constraint is that we can't read or write files to a directory so I would like to stream the input file straight into python-pptx's Presentation function.
here is a small reprex:
from flask import Flask, send_from_directory
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
from pptx import Presentation
server = Flask(__name__)
app = dash.Dash(server=server)
app.layout = html.Div(
[
html.H1("File Browser"),
html.H2("Upload"),
dcc.Upload(
id="upload-data",
children=html.Div(
["Drag and drop or click to select a file to upload."]
),
multiple=True,
),
html.H2("Shape List"),
html.Ul(id="shape-list"),
],
style={"max-width": "500px"},
)
#app.callback(
Output("shape-list", "children"),
[Input("upload-data", "filename"), Input("upload-data", "contents")],
)
def update_output(uploaded_filenames, uploaded_file_contents):
shape_text = []
if uploaded_filenames is not None and uploaded_file_contents is not None:
for name, data in zip(uploaded_filenames, uploaded_file_contents):
prs = Presentation(data)
shape_text += [shape.text for shape in prs.slides[0].shapes]
return [html.Li(txt) for txt in shape_text]
if __name__ == "__main__":
app.run_server(debug=True, port=8888)
which gives error:
pptx.exc.PackageNotFoundError: Package not found at 'data:application/vnd.openxmlformats-officedocument.presentationml.presentation;base64...
I tried a few different attempts at encoding/decoding the input with StringIO and BytesIO but couldn't get it into a working format.
The data property contains also content type, which should be separated before doing the base64 decoding. Hence the parsing code should be along the lines of,
content_type, content_string = data.split(',')
prs = Presentation(BytesIO(base64.b64decode(content_string)))
Replacing
prs = Presentation(data)
in you example with that parsing code, I am able to parse a pptx file as intended.
Related
This uploader is for .xlsx files. It doesn't work with files above 1.0Mb, but works fine with files smaller than this.
I have max_file_size set well above this.
Running this locally, I can upload any size file without a problem - it is only an issue with the version that is deployed here:
link to Elastic Beanstalk app
import dash_uploader as du
import dash
from dash import html
app = dash.Dash(__name__)
#application = app.server
du.configure_upload(app, r'')
app.layout = html.Div([
du.Upload(
text='Drag and Drop Here',
text_completed='Successful Upload of ',
id='upload',
max_file_size=18000,
max_files=1,
filetypes=['xlsx'],
upload_id= 'uploader_id'
),
])
if __name__ == '__main__':
#application.run_server(port=8080)
app.run_server(debug=True)
I've tried to embed a local html file into a basic Dash App.
I used the code in this link and replaced the path with my local relative path (the dash app is in the same folder as the html local page)
html.Iframe(src="random_example.html",
style={"height": "1067px", "width": "100%"})
but this is the result I get:
You could put the html file in the assets folder and reference it like this:
import dash
import dash_html_components as html
app = dash.Dash(__name__)
app.layout = html.Div(
children=[
html.Iframe(
src="assets/random_example.html",
style={"height": "1067px", "width": "100%"},
)
]
)
if __name__ == "__main__":
app.run_server(debug=True)
I've run into a strange behaviour when using the Dash library and I apologies in advance as I don't think I can't make an example which is easy to reproduce, so instead I will include all the information I think is relevant.
I was trying to create a small dashboard which loaded some data and then plotted it in a bar chart. The code I used for this is below (it is adapted from the Docs):
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
import pandas as pd
import os.path as path
def find_unique_id(data, first_name, second_name, season):
name = first_name + '_' + second_name
data_filt = data[data['season'] == season]
data_filt = data_filt[data_filt['name'] == name]
print(data_filt['unique_id'])
return data_filt['unique_id'].iloc[0]
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
# assume you have a "long-form" data frame
# see https://plotly.com/python/px-arguments/ for more options
path_data = path.join(path.dirname(path.dirname(path.abspath(__file__))), 'Processed', 'player_database.csv')
data = pd.read_csv(path_data)
unique_id = []
unique_id.append(find_unique_id(data, 'Harry', 'Kane', 2020))
unique_id.append(find_unique_id(data, 'Mohamed', 'Salah', 2020))
unique_id.append(find_unique_id(data, 'Heung-Min', 'Son', 2020))
fig = px.bar(data[data['unique_id'].isin(unique_id)], x="round", y="points_mean3", color="name", barmode="group")
app.layout = html.Div(children=[
html.H1(children='Hello Dash'),
html.Div(children='''
Dash: A web application framework for Python.
'''),
dcc.Graph(
id='example-graph',
figure=fig
)
])
if __name__ == '__main__':
app.run_server(debug=True)
The player_database.csv is loaded and detailed in the picture (notice the file sizes):
However now whenever I import any Dash related module even in a different file, it changes all the files in this folder, and appears to role them back to an earlier state. For example when I run the following code, you can see the change in file size happening:
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
This alters all the files in this directory (again as shown by the file size):
Does anyone know why this might be happening, has Dash got some kind of cache somewhere which loads the files it was working with from some previous state? I've tried googling and can't find anything on this.
Thanks
When writing production ready code we want to be able to automatically test our webapp everytime we update the code. Dash for python allows this through dash.testing. However in my app I upload an excel file utilizing the dcc.Upload() component.
How do I write a test that can send the upload link to this component?
The dcc.Upload component does not allow you to put an id on the that stores the upload link.
It is easy to work around this by inspecting the upload button/field that you have created with web developer tools. look for the line that contains "<input type=file ... >". in the elements tab.
Right click it and press copy xpath and it should give you a relative path like //*[#id="upload-data"]/div/input
The test case would look like this
from dash.testing.application_runners import import_app
def test_xxxx001_upload(dash_duo):
# get app from app.py
app = import_app("src.app")
dash_duo.start_server(app)
# find element that contains input link. Utilize the web driver to get the element
element = dash_duo.driver.find_element_by_xpath('//*[#id="upload-data"]/div/input')
element.send_keys("C:\\path\\to\\testData.xlsx")
folder structure
myapp
--src
--app.py
--server.py
--run.py
--tests
--test_app
the use of the dcc.Upload component to create an upload button
import dash_core_components as dcc
import dash_html_components as html
html.Div(
id="file-drop",
children=[
dcc.Upload(
id="upload-data",
children=html.Div(
["Drag and Drop or ", html.A("Select File"),],
id="select-file",
),
multiple=False,
),
html.Div(id="output-data-upload"),
],
)
Since the version 0.41.0 of dash, the following code is on error :
import dash
from dash_table_experiments import DataTable
app = dash.Dash() app.layout = DataTable( id='datatable', rows=[{'V'+dash.__version__: i} for i in range(5)] )
app.run_server(debug=True)
whereas version 0.40.0 displays the table properly.
Would any one know what has changed ?
Thanks for your help ;
From the README file on the dash-table-experiements github page:
dash-table-experiments
DEPRECATED
If you are looking for a first class data table component for Dash head on over to https://github.com/plotly/dash-table and check out the documentation https://dash.plot.ly/datatable.
The component is no longer supported. The dash-table component works really well, though. Here is a basic example:
import dash
import numpy as np
import pandas
from dash_table import DataTable
app = dash.Dash()
df = pandas.DataFrame(np.arange(30).reshape(5, 6))
app.layout = DataTable(id='datatable',
columns=[{"name": i, "id": i} for i in df.columns],
data=df.to_dict(orient='records'))
if __name__ == '__main__':
app.run_server(debug=True)