Uploading and Downloading Files with Flask - python

I'm trying to write a really simply webapp with PythonAnywhere and Flask that has lets the user upload a text file, generates a csv file, then lets the user download the csv file. It doesn't have to be fancy, it only has to work. I have already written the program for generating the csv from a txt file on the drive.
Right now, my function opens the file on the drive with:
with open(INPUTFILE, "r") as fname:
and writes the csv with:
with open(OUTPUTFILE, 'w') as fname:
with INPUTFILE and OUTPUTFILE being filename strings.
Would it be better for me to handle the files as objects, returned by the flask/html somehow?
I don't know how to do this. How should I structure this program? How many HTML Templates do I need? I would prefer to work on the files wihthout saving them anywhere but if I have to save them to the PythonAnywhere directory, I could. How can I do that?

PythonAnywhere dev here. This is a good question about Flask and web development in general rather than specific to our system, so I'll try to give a generic answer without anything specific to us :-)
There are a few things that I'd need to know to give a definitive answer to your question, so I'll start by listing the assumptions I'm making -- leave me a comment if I'm wrong with any of them and I'll update the answer appropriately.
I'm assuming that the files you're uploading aren't huge and can fit into a reasonable amount of memory -- let's say, smaller than a megabyte.
I'm assuming that the program that you've already written to generate the CSV from the text file is in Python, and that it has (or, perhaps more likely, could be easily changed to have) a function that takes a string containing the contents of the text file, and returns the contents that need to be written into the CSV.
If both of those are the case, then the best way to structure your Flask app would be to handle everything inside Flask. A code sample is worth a thousand words, so here's a simple one I put together that allows the user to upload a text file, runs it through a function called transform (which is where the function from your conversion program would slot in -- mine just replaces = with , throughout the file), and sends the results back to the browser. There's a live version of this app on PythonAnywhere here.
from flask import Flask, make_response, request
app = Flask(__name__)
def transform(text_file_contents):
return text_file_contents.replace("=", ",")
#app.route('/')
def form():
return """
<html>
<body>
<h1>Transform a file demo</h1>
<form action="/transform" method="post" enctype="multipart/form-data">
<input type="file" name="data_file" />
<input type="submit" />
</form>
</body>
</html>
"""
#app.route('/transform', methods=["POST"])
def transform_view():
request_file = request.files['data_file']
if not request_file:
return "No file"
file_contents = request_file.stream.read().decode("utf-8")
result = transform(file_contents)
response = make_response(result)
response.headers["Content-Disposition"] = "attachment; filename=result.csv"
return response
Regarding your other questions:
Templates: I didn't use a template for this example, because I wanted it all to fit into a single piece of code. If I were doing it properly then I'd put the stuff that's generated by the form view into a template, but that's all.
Can you do it by writing to files -- yes you can, and the uploaded file can be saved by using the save(filename) method on the file object that I'm using the stream property of. But if your files are pretty small (as per my assumption above) then it probably makes more sense to process them in-memory like the code above does.
I hope that all helps, and if you have any questions then just leave a comment.

Better to add
response.headers["Cache-Control"] = "must-revalidate"
response.headers["Pragma"] = "must-revalidate"
response.headers["Content-type"] = "application/csv"
If you don't add the content type, FF 48.0 reported it as html and opened Save dialog once for HTML and then for CSV. If you don't add Cache-Control your result may get cached, and if you serve active content this is not what you want. If you use must-revalidate with no age, it will effectively serve as no-cache - see here and here for an explanation.

Related

Can't open and read content of an uploaded zip file with FastAPI

I am currently developing a little backend project for myself with the Python Framework FastAPI. I made an endpoint, where the user should be able to upload 2 files, while the first one is a zip-file (which contains X .xmls) and the latter a normal .xml file.
The code is as follows:
#router.post("/sendxmlinzip/")
def create_upload_files_with_zip(files: List[UploadFile] = File(...)):
if not len(files) == 2:
raise Httpex.EXPECTEDTWOFILES
my_file = files[0].file
zfile = zipfile.ZipFile(my_file, 'r')
filelist = []
for finfo in zfile.infolist():
print(finfo)
ifile = zfile.open(finfo)
line_list = ifile.readlines()
print(line_list)
This should print the content of the files, that are in the .zip file, but it raises the Exception
AttributeError: 'SpooledTemporaryFile' object has no attribute 'seekable'
In the row ifile = zfile.open(finfo)
Upon approximately 3 days research with a lot of trial and error involved, trying to use different functions such as .read() or .extract(), I gave up. Because the python docs literally state, that this should be possible in this way...
For you, who do not know about FastAPI, it's a backend fw for Restful Webservices and is using the starlette datastructure for UploadFile. Please forgive me, if I have overseen something VERY obvious, but I literally tried to check every corner, that may have been the possible cause of the error such as:
Check, whether another implementation is possible
Check, that the .zip file is correct
Check, that I attach the correct file (lol)
Debug to see, whether the actual data, that comes to the backend is indeed the .zip file
This is a known Python bug:
SpooledTemporaryFile does not fully satisfy the abstract for IOBase.
Namely, seekable, readable, and writable are missing.
This was discovered when seeking a SpooledTemporaryFile-backed lzma
file.
As #larsks suggested in his comment, I would try writing the contents of the spooled file to a new TemporaryFile, and then operate on that. As long as your files aren't too large, that should work just as well.
This is my workaround
with zipfile.ZipFile(io.BytesIO(file.read()), 'r') as zip:

Read-in Files from Flask request module

I am trying to read-in a file from a Python request, form data. All I want to do is read-in the incoming file in the request body, parse the contents and return the contents as a json body. I see many examples out there like: if 'filename' in request.files:, however this never works for me. I know that the file does in fact live within the ImmutableMultiDict type. Here is my working code example:
if 'my_file.xls' in request.files:
# do something
else:
# return error
if 'file' in request.files:
This is looking for the field name 'file' which corresponds to the name attribute you set in the form:
<input type='file' name='file'>
You then need to do something like this to assign the FileStorage object to the variable mem:
mem = request.files['file']
See my recent answer for more details of how and why.
You can then access the filename itself with:
mem.filename # should give back 'my_file.xls'
To actually read the stream data:
mem.read()
The official flask docs have further info on this, and how to save to disk with secure_filename() etc. Probably worth a read.
All I want to do is read-in the incoming file in the request body, parse the contents and return the contents as a json body.
If you actually want to read the contents of that Excel file, then you'll need to use a library which has compatibility for this such as xlrd. this answer demonstrates how to open a workbook, passing it as a stream. Note that they have used fileobj as the variable name, instead of mem.

LibreOffice/other method of filling template .txt file for import into Zim Wiki

I am using the application Zim Wiki(cross-platform, FOSS), which I am using to keep a personal wiki with lots of data coming from tables, copy and pasting, my own writing, and downloading and attaching .png and .html files for viewing and offline access.
The data that is not written or pasted can be stored in tables in the form of names, url addresses, and the names and locations of images and other attachments.
To insert into zim, I can use the front end with WSIWYG, or to make the skeleton of each entry, I could modify a template text entry. If I do this, nothing matters except for the location and identity of each character in each line.
By supplying the text in this image:
DandelionDemo source text,
--I can make this entry for Dandelion:
DandelionDemo Wiki.
So, I can generate and name the Wiki entry in Zim, which creates the .txt file for me, and inserts the time stamp and title, so, the template for this type of entry without the pasted fields would be:
**Full Scientific Name: **[[|]]**[syn]**
**Common Name(s): **
===== =====
**USDA PLANTS entry for Code:** [[https://plants.usda.gov/core/profile?symbol=|]] **- CalPhotos available images for:** [[https://calphotos.berkeley.edu/cgi/img_query?query_src=photos_index&where-taxon=|]]
**---**
**From - Wikipedia **[[wp?]] **- **[[/Docs/Plants/]]
{{/Docs/Plants/?height=364}}{{/Docs/Plants/?height=364}}
**()** //,// [[|(source)]]
**()** //// [[|(source)]]
**Wikipedia Intro: **////
---
So the first line with content, after the 31st character(which is a tab), you paste "http... {etc}. Then the procedure would insert "Taraxacum officinale... {etc}" after the "|", or what was the 32nd character, and so on. This data could be from "table1" and "table2", or combining the tables to make an un-normalized "table1table2", where each row could be converted to text or a .csv or I don't know, what do you think?
Is there a way, in LibreOffice to do this? I have used LibreOffice Base to generate a "book" form that populated fields, but it was much less complex data, without wiki liking and drag-and-drop pasting of images and attachments. So maybe the answer is to go simpler? The tables are not currently part of a registered database, but I could do that, once I have decided on the method of doing this.
I am ultimately looking for a "way", hopefully an "easy" way. However, that may not be in LibreOffice. If not, I know that I could do this in Python, but I haven't learned much about Python yet. If it involves a language, that is the first and only one I don't know that I will invest in learning for this project. If you know a "way" to do this in Python, let me know, and my first project and way of framing my study process will be in learning the methods that you share.
If you know of some other Linux GUI, I am definitely interested, but only in active free and open source builds that involve minimal/no compiling. I know the basics of SQL and DBMS's. In the past, have gotten Microsoft SQL server lite to work, but not DBeaver, yet. If you know of a CLI way also let me know, but I am a self-taught outdoors-loving Linux newb and mostly know about how to tweak little settings in programs, how to use moderately easy programs like ImageMagick, and I have built a few Lamp stacks for Drupal and Wordpress (no BASH etc...).
Thank you very much!
Ok, since you want to learn some python, let me propose you a way to do it this. First you need a template engine -like jinja2 (there are many others)-, a data source in our example a .csv file, -could be other like a db- and finally some code that reads the csv line by line and mix the content with the template.
Sample CSV file:
1;sample
2;dandelion
3;just for fun
Sample template:
**Full Scientific Name: **[[|]]**[syn]**
**Common Name(s): *{{name}}*
===== =====
USDA PLANTS entry for Code: *{{symbol}}*
---
Sample code:
#!/usr/bin/env/python
#
# Using the file system load
#
# We now assume we have a file in the same dir as this one called
# test_template.ziim
#
from jinja2 import Environment, FileSystemLoader
import os
import csv
# Capture our current directory
THIS_DIR = os.path.dirname(os.path.abspath(__file__))
def print_zim_doc():
# Create the jinja2 environment.
# Notice the use of trim_blocks, which greatly helps control whitespace.
j2_env = Environment(loader=FileSystemLoader(THIS_DIR),
trim_blocks=True)
template = j2_env.get_template('test_template.zim')
with open('example.csv') as File:
reader = csv.reader(File, delimiter=';')
for row in reader:
result = template.render(
symbol=row[0]
, name=row[1]
)
# to save the results
with open(row[0]+".txt", "wt") as fh:
fh.write(result)
fh.close()
if __name__ == '__main__':
print_zim_doc()
The code is pretty simple, reads the template located in the same folder as the python code, opens the csv file (also located in the same place), iterates over each line of the csv and renders the template using the values of the csv columns to fill the {{var_name}} in the template, finally saves the rendered result in a new file named as one of the csv column values. This sample will generate 3 files (1.txt, 2.txt, 3.txt). From here you can extend and improve the code to get your desired results.

Bottle file upload and process

I am using Bottle for uploading rather large files. The idea is that when the file is uploaded, the web app run (and forget) a system command with the uploaded file-path as an argument. Except for starting the system command with the correct file-path as an argument I do not need to save the file, but I need to be certain that the file will be available until the process completes the processing.
I use the exact code described here:
http://bottlepy.org/docs/dev/tutorial.html#post-form-data-and-file-uploads
My questions are:
Do bottle store uploaded file in memory or on a specific place on the disk (or perhaps like Flask, a bit of both)?
Will the uploaded file be directly available to other tools without .read() and then manually saving the bytes to a specified file on disk?
What would be the best way to start the system command with the file as an argument? Is it possible to just pass the path to an existing file directly?
Ok, let's break this down.
The full code is:
HTML:
<form action="/upload" method="post" enctype="multipart/form-data">
<input type="text" name="name" />
<input type="file" name="data" />
</form>
PYTHON CODE:
from bottle import route, request
#route('/upload', method='POST')
def do_upload():
name = request.forms.name
data = request.files.data
if name and data and data.file:
raw = data.file.read() # This is dangerous for big files
filename = data.filename
return "Hello %s! You uploaded %s (%d bytes)." % (name, filename, len(raw))
return "You missed a field."
(From the doc's you provided)
So, first of all, we can see that we first pull the information from the name and the data in the html form, and assign them to the variables name and data. Thats pretty straight forward. However, next we assign the variable raw to data.file.read(). This is basically taking all of the file uploaded into the variable raw. This being said, the entire file is in memory, which is why they put "This is dangerous for big files" as a comment next to the line.
This being said, if you wanted to save the file out to disk, you could do so (but be careful) using something like:
with open(filename,'w') as open_file:
open_file.write(data.file.read())
As for your other questions:
1."What would be the best way to start the system command with the file as an argument? Is it possible to just pass the path to an existing file directly?"
You should see the subprocess module, specifically Popen: http://docs.python.org/2/library/subprocess.html#popen-constructor
2."Will the uploaded file be directly available to other tools without .read() and then manually saving the bytes to a specified file on disk?"
Yes, you can pass the file data around without saving it to disk, however, be warned that memory consumption is something to watch. However, if these "tools" are not in python, you may be dealing with pipes or subprocesses to pass the data to these "tools".
with open(filename,'w') as open_file:
open_file.write(data.file.read())
dont work
you can use
data = request.files.data
data.save(Path,overwrite=True)
The file will be handled by the routine you use. That means your read handles the connection (the file should not be there, according to wsgi spec)
with open(filename, "wb") as file:
Data = data.file.read()
if type(Data) == bytes: file.write(Data)
elif type(Data) == str: file.write(Data.encode("utf-8"))
Easy :D

Difference between opening a file and using a module

Below, I print out the html for a form defined externally. Is there a difference in the way each string is retrieved and used in foo.py, other than the syntax? If so, in what circumstances would one method be preferred over the other? For example, would I be better off defining a number of html files in a module as strings and access them that way, as opposed keeping them in separate .html files and using open over and over?
mod.py
form = """\
<form type="POST" action="test.py">
Enter something:<input type="text" name="somethign">
</form>
"""
form.html
<form type="POST" action="test.py">
Enter something:<input type="text" name="something">
</form>
foo.py
import mod
print mod.form
with open('form.html', 'r') as form:
print form.read()
Having .html files is better. Sure, you will have some overhead opening a file, reading its content and then closing it, but you'll have many advantages:
.html file can be edited by any person who knows HTML syntax.
.html files can be edited without restarting your program, it is very useful for services.
You can eliminate open/read/close overhead by introducing some caching technique.
It's a lot easier for designers to edit discrete HTML files than to deal with HTML embedded in code.

Categories