I have a form with an input tag and a submit button:
<input type="file" name="filename" size="25">
I have a python file that handles the post:
def post(self):
The file that I'm receiving in the form is an .xml file. In the python post function I want to send that 'foo.xml' to another function that is going to validate it (using minixsv).
My question is how do I retrieve the file? I tried:
form = cgi.FieldStorage()
inputfile = form.getvalue('filename')
but this puts the content in inputfile. I don't have a 'foo.xml' file per se that I can pass to the minixsv function, which requests an .xml file, not the text...
Update: I found a function that accepts text instead of an input file, thanks anyway.
Oftentimes, there's also a function to extract XML from a string. For example, minidom has parseString, and lxml etree.XML.
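As a minimal sketch of the parse-from-a-string route, assuming Python 3 and only the standard library (the sample document and the name xml_text are made up for illustration):

```python
from xml.dom import minidom

# Hypothetical upload content; in the handler this would be the string
# returned by form.getvalue('filename').
xml_text = "<order id='42'><item>widget</item></order>"

# parseString builds a DOM tree directly from the text, no file needed.
document = minidom.parseString(xml_text)
root = document.documentElement

root_name = root.tagName
order_id = root.getAttribute("id")
item_text = root.getElementsByTagName("item")[0].firstChild.data
```

Whether a given validator offers an equivalent string-based entry point depends on the library, so check its documentation before assuming one exists.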
If you have the content, you can make a file-like object with StringIO:
from StringIO import StringIO
content = form.getvalue('filename')
fileh = StringIO(content)
# You can now call fileh.read, or iterate over it
If you must have a file on the disk, use tempfile.mkstemp:
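import os
import tempfile
content = form.getvalue('filename')
# mkstemp returns an OS-level file descriptor (an int) and the path
fd, tmpfn = tempfile.mkstemp()
tmpf = os.fdopen(fd, 'wb') # Wrap the descriptor in a file object
tmpf.write(content)
tmpf.close()
# Now, give tmpfn to the function expecting a filename
os.unlink(tmpfn) # Finally, delete the file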
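The write-then-delete pattern can be wrapped so the cleanup happens even when the consumer raises. This is only a sketch, assuming Python 3; call_with_temp_file and read_back are hypothetical names, and read_back stands in for any function that insists on a filename:

```python
import os
import tempfile

def call_with_temp_file(content, func):
    """Write bytes to a temp file, call func(path), then delete the file.

    Hypothetical helper for illustration; func stands in for any
    function that insists on a filename.
    """
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        # mkstemp returns a raw descriptor; fdopen turns it into a file object.
        with os.fdopen(fd, "wb") as handle:
            handle.write(content)
        return func(path)
    finally:
        os.unlink(path)  # clean up even if func raises

def read_back(path):
    # Stand-in consumer: returns the path it was given and the bytes found there.
    with open(path, "rb") as handle:
        return path, handle.read()

used_path, data = call_with_temp_file(b"<root/>", read_back)
```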
This probably won't be the best answer, but why not consider using StringIO on your inputfile variable, and passing the StringIO object as the file handle to your minixsv function? Alternatively, why not open an actual new file handle for foo.xml, save the contents of inputfile to it (i.e., via open), and then pass foo.xml to your minixsv function?
I would like to read all the files in a zip file of a specific type sent to a flask server via a form post request without having to store the zip file on disk.
First, here is the code that serves the form used to upload the zip file:
from flask import Flask, request
app = Flask(__name__)

@app.route("/", methods=["GET"])
def page_name_get():
    return """<form action="." method="post" enctype="multipart/form-data">
    <input type="file" accept="application/zip" name="data_zip_file" required>
    <button type="submit">Send zip file!</button>
    </form>"""

# Call this last, after every route has been defined
app.run()
This is what the POST request handler should look like:
import zipfile

@app.route("/", methods=["POST"])
def page_name_post():
    file = request.files['data_zip_file']
    file_like_object = file.stream._file
    zipfile_ob = zipfile.ZipFile(file_like_object)
    file_names = zipfile_ob.namelist()
    # Filter names to only include the filetype that you want:
    file_names = [file_name for file_name in file_names if file_name.endswith(".txt")]
    files = [(zipfile_ob.open(name).read(), name) for name in file_names]
    return str(files)
Now I will go over this line by line
file = request.files['data_zip_file'] First, you need to get the file object from the request. This is an instance of the werkzeug.datastructures.FileStorage class.
file_like_object = file.stream._file Here you first take the stream attribute of the werkzeug.datastructures.FileStorage; this is the input stream of the file. This will return an instance of tempfile.SpooledTemporaryFile, a class used for temporary files. From that instance, you take the ._file attribute. This will return an instance of tempfile._TemporaryFileWrapper, which is enough like an io.BytesIO to be understood by the zipfile.ZipFile class. Note that ._file is a private attribute, so this step relies on an implementation detail.
zipfile_ob = zipfile.ZipFile(file_like_object) Here you create the zipfile.ZipFile object.
Now you should be able to do pretty much everything you would want to do with the zip. To select a file from the zip use the zipfile_ob.open() method and pass in the path to the file you want to open.
To get those paths we use file_names = zipfile_ob.namelist(). This will return a list of strings with the paths of all the files and directories in the zip.
You can then filter those names with file_names = [file_name for file_name in file_names if file_name.endswith(".txt")]
All those paths you want are now in file_names. Then you can extract the data of those files using the open function.
files = [(zipfile_ob.open(name).read(),name) for name in file_names]
In the example given I keep the paths to the file in the final list but if you don't want that you can use:
files = [zipfile_ob.open(name).read() for name in file_names] or use some other way to go over the files. If you want more info about the files, there is also the infolist() method, which can be used instead of namelist(). It returns a list of ZipInfo objects instead of a list of plain strings, and these objects hold some more data about each file.
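The same filtering can be tried without a running server. The sketch below assumes Python 3 and uses an in-memory io.BytesIO archive as a stand-in for the upload; read_members is a hypothetical helper name, and it uses infolist() rather than namelist():

```python
import io
import zipfile

def read_members(file_like, suffix=".txt"):
    """Return (name, bytes) pairs for archive members ending in suffix."""
    with zipfile.ZipFile(file_like) as archive:
        return [
            (info.filename, archive.read(info))
            for info in archive.infolist()
            if info.filename.endswith(suffix)
        ]

# Build a small archive in memory to stand in for the uploaded file.
upload = io.BytesIO()
with zipfile.ZipFile(upload, "w") as archive:
    archive.writestr("notes.txt", "hello")
    archive.writestr("image.png", b"\x89PNG")
    archive.writestr("docs/readme.txt", "nested")
upload.seek(0)

members = read_members(upload)
```

zipfile.ZipFile only needs a seekable binary file-like object, so in a Flask handler the helper could presumably be handed the upload's stream directly. That is an assumption here and was not tested against Flask.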
I'm trying to open a file in GAE that was retrieved using urlfetch().
Here's what I have so far:
from google.appengine.api import urlfetch
result = urlfetch.fetch('http://example.com/test.txt')
data = result.content
## f = open(...) <- what goes in here?
This might seem strange but there's a very similar function in the BlobStore that can write data to a blobfile:
f = files.blobstore.create(mime_type='txt', _blobinfo_uploaded_filename='test')
with files.open(f, 'a') as data:
data.write(result.content)
How can I write data into an arbitrary file object?
Edit: Should've been more clear; I'm trying to urlfetch any file and open result.content in a file object. So it might be a .doc instead of a .txt
You can use the StringIO module to emulate a file object using the contents of your string.
from google.appengine.api import urlfetch
from StringIO import StringIO
result = urlfetch.fetch('http://example.com/test.txt')
f = StringIO(result.content)
You can then read() from the f object or use other file object methods like seek(), readline(), etc.
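A short sketch of those file methods, assuming Python 3, where io.BytesIO plays the role of StringIO for byte payloads (the sample bytes stand in for result.content):

```python
import io

# Stand-in for result.content; any bytes payload works, text or binary.
content = b"first line\nsecond line\n"

f = io.BytesIO(content)
first = f.readline()      # reads up to and including the first newline
rest = f.read()           # reads everything after the current position
f.seek(0)                 # rewind, exactly as with a file on disk
everything = f.read()
```

Because the object holds raw bytes, the same approach applies to a .doc or any other binary download.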
You do not have to open a file. You have already received the text data in data = result.content.
So I've been playing around with raw WSGI, cgi.FieldStorage and file uploads. And I just can't understand how it deals with file uploads.
At first it seemed that it just stores the whole file in memory. And I thought hm, that should be easy to test - a big file should clog up the memory!.. And it didn't. Still, when I request the file, it's a string, not an iterator, file object or anything.
I've tried reading the cgi module's source and found some things about temporary files, but it returns a freaking string, not a file(-like) object! So... how does it fscking work?!
Here's the code I've used:
import cgi
from wsgiref.simple_server import make_server

def app(environ,start_response):
    start_response('200 OK',[('Content-Type','text/html')])
    output = """
    <form action="" method="post" enctype="multipart/form-data">
    <input type="file" name="failas" />
    <input type="submit" value="Varom" />
    </form>
    """
    fs = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ)
    f = fs.getfirst('failas')
    print type(f)
    return output

if __name__ == '__main__' :
    httpd = make_server('',8000,app)
    print 'Serving'
    httpd.serve_forever()
Thanks in advance! :)
Inspecting the cgi module description, there is a paragraph discussing how to handle file uploads.
If a field represents an uploaded file, accessing the value via the value attribute or the getvalue() method reads the entire file in memory as a string. This may not be what you want. You can test for an uploaded file by testing either the filename attribute or the file attribute. You can then read the data at leisure from the file attribute:
fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1
Regarding your example, getfirst() is just a version of getvalue().
Try replacing
f = fs.getfirst('failas')
with
f = fs['failas'].file
This will return a file-like object that is readable "at leisure".
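Reading "at leisure" can also mean reading in fixed-size chunks, so that memory use stays bounded however large the upload is. A sketch assuming Python 3, with io.BytesIO standing in for fs['failas'].file and count_lines as a hypothetical helper:

```python
import io

def count_lines(fileobj, chunk_size=64 * 1024):
    """Count newline characters by reading fixed-size chunks."""
    total = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        total += chunk.count(b"\n")
    return total

# Tiny chunk size only to exercise the loop; real code would keep the default.
upload = io.BytesIO(b"a\nb\nc\n")
line_count = count_lines(upload, chunk_size=2)
```

Note that this counts newline characters, so a final line without a trailing newline is not included.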
The best way is NOT to read the file into memory (or even one line at a time, as gimel suggested).
You can use inheritance: derive a class from FieldStorage and override its make_file method. make_file is called when the FieldStorage field is a file upload.
For your reference, default make_file looks like this:
def make_file(self, binary=None):
    """Overridable: return a readable & writable file.

    The file will be used as follows:
    - data is written to it
    - seek(0)
    - data is read from it

    The 'binary' argument is unused -- the file is always opened
    in binary mode.

    This version opens a temporary file for reading and writing,
    and immediately deletes (unlinks) it. The trick (on Unix!) is
    that the file can still be used, but it can't be opened by
    another process, and it will automatically be deleted when it
    is closed or when the current process terminates.

    If you want a more permanent file, you derive a class which
    overrides this method. If you want a visible temporary file
    that is nevertheless automatically deleted when the script
    terminates, try defining a __del__ method in a derived class
    which unlinks the temporary files you have created.
    """
    import tempfile
    return tempfile.TemporaryFile("w+b")
Rather than creating a temporary file, create a permanent file wherever you want.
Using an answer by @hasanatkazmi (utilized in a Twisted app) I got something like:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# -*- indent: 4 spc -*-
import sys
import cgi
import tempfile
class PredictableStorage(cgi.FieldStorage):
    def __init__(self, *args, **kwargs):
        self.path = kwargs.pop('path', None)
        cgi.FieldStorage.__init__(self, *args, **kwargs)

    def make_file(self, binary=None):
        if not self.path:
            file = tempfile.NamedTemporaryFile("w+b", delete=False)
            self.path = file.name
            return file
        return open(self.path, 'w+b')
Be warned that the file is not always created by the cgi module. According to these cgi.py lines, it will only be created if the content exceeds 1000 bytes:
if self.__file.tell() + len(line) > 1000:
    self.file = self.make_file('')
So, you have to check if the file was actually created with a query to a custom class' path field like so:
if file_field.path:
    # Using an already created file...
    pass
else:
    # Creating a temporary named file to store the content.
    import tempfile
    with tempfile.NamedTemporaryFile("w+b", delete=False) as f:
        f.write(file_field.value)
        # You can save the 'f.name' field for later usage.
If the Content-Length is also set for the field, which seems to be rare, the file should also be created by cgi.
That's it. This way you can store the file predictably, decreasing the memory usage footprint of your app.
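The "memory first, disk only past a threshold" behaviour described above can be illustrated in isolation. This is a sketch of the idea, assuming Python 3, and not cgi's actual implementation; ThresholdBuffer is a hypothetical class and the 1000-byte default simply mirrors the figure quoted from cgi.py:

```python
import io
import tempfile

class ThresholdBuffer:
    """Hypothetical helper: keep small payloads in memory, spill large ones to disk."""

    def __init__(self, threshold=1000):
        self.threshold = threshold
        self.file = io.BytesIO()
        self.on_disk = False

    def write(self, chunk):
        # Switch to a real temporary file the first time the payload
        # would grow past the threshold, copying what was buffered so far.
        if not self.on_disk and self.file.tell() + len(chunk) > self.threshold:
            disk_file = tempfile.TemporaryFile("w+b")
            disk_file.write(self.file.getvalue())
            self.file = disk_file
            self.on_disk = True
        self.file.write(chunk)

small = ThresholdBuffer()
small.write(b"x" * 10)

big = ThresholdBuffer()
big.write(b"x" * 600)
big.write(b"y" * 600)
```

The standard library's tempfile.SpooledTemporaryFile offers comparable behaviour through its max_size argument, which may be simpler than a hand-written class.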
Hello
My error occurs while generating a zip file. Can you tell me what I should do?
main.py", line 2289, in get
buf=zipf.read(2048)
NameError: global name 'zipf' is not defined
The complete code is as follows:
def addFile(self,zipstream,url,fname):
    # get the contents
    result = urlfetch.fetch(url)
    # store the contents in a stream
    f=StringIO.StringIO(result.content)
    length = result.headers['Content-Length']
    f.seek(0)
    # write the contents to the zip file
    while True:
        buff = f.read(int(length))
        if buff=="":break
        zipstream.writestr(fname,buff)
    return zipstream

def get(self):
    self.response.headers["Cache-Control"] = "public,max-age=%s" % 86400
    start=datetime.datetime.now()-timedelta(days=20)
    count = int(self.request.get('count')) if not self.request.get('count')=='' else 1000
    from google.appengine.api import memcache
    memcache_key = "ads"
    data = memcache.get(memcache_key)
    if data is None:
        a= Ad.all().filter("modified >", start).filter("url IN", ['www.koolbusiness.com']).filter("published =", True).order("-modified").fetch(count)
        memcache.set("ads", a)
    else:
        a = data
    dispatch='templates/kml.html'
    template_values = {'a': a , 'request':self.request,}
    path = os.path.join(os.path.dirname(__file__), dispatch)
    output = template.render(path, template_values)
    self.response.headers['Content-Length'] = len(output)
    zipstream=StringIO.StringIO()
    file = zipfile.ZipFile(zipstream,"w")
    url = 'http://www.koolbusiness.com/list.kml'
    # repeat this for every URL that should be added to the zipfile
    file =self.addFile(file,url,"list.kml")
    # we have finished with the zip so package it up and write the directory
    file.close()
    zipstream.seek(0)
    # create and return the output stream
    self.response.headers['Content-Type'] ='application/zip'
    self.response.headers['Content-Disposition'] = 'attachment; filename="list.kmz"'
    while True:
        buf=zipf.read(2048)
        if buf=="": break
        self.response.out.write(buf)
That is probably zipstream and not zipf. So replace that with zipstream and it might work.
I don't see where you declare zipf.
zipfile? Senthil Kumaran is probably right with zipstream since you seek(0) on zipstream before the while loop to read chunks of the mystery variable.
edit:
Almost certainly the variable is zipstream.
zipfile docs:
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
Open a ZIP file, where file can be either a path to a file (a string) or a file-like object. The mode parameter should be 'r' to read an existing file, 'w' to truncate and write a new file, or 'a' to append to an existing file. If mode is 'a' and file refers to an existing ZIP file, then additional files are added to it. If file does not refer to a ZIP file, then a new ZIP archive is appended to the file. This is meant for adding a ZIP archive to another file (such as python.exe).
your code:
zipstream=StringIO.StringIO()
creates a file-like object using StringIO, which is essentially a "memory file" (read more in the docs)
file = zipfile.ZipFile(zipstream,"w")
opens the zipfile with the zipstream file-like object in 'w' mode
url = 'http://www.koolbusiness.com/list.kml'
# repeat this for every URL that should be added to the zipfile
file =self.addFile(file,url,"list.kml")
# we have finished with the zip so package it up and write the directory
file.close()
uses the addFile method to retrieve and write the retrieved data to the file-like object and returns it. The variables are slightly confusing because you pass a zipfile to the addFile method which aliases as zipstream (confusing because we are using zipstream as a StringIO file-like object). Anyways, the zipfile is returned, and closed to make sure everything is "written".
It was written to our "memory file", which we now seek to index 0
zipstream.seek(0)
and after doing some header stuff, we finally reach the while loop that will read our "memory-file" in chunks
while True:
    buf=zipstream.read(2048)
    if buf=="": break
    self.response.out.write(buf)
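The whole flow (write a zip into a memory file, rewind, read it back in chunks) can be sketched without App Engine. This assumes Python 3, where io.BytesIO replaces StringIO for binary data; build_zip and iter_chunks are hypothetical helpers, and the KML payload is a placeholder rather than fetched content:

```python
import io
import zipfile

def build_zip(entries):
    """Return a BytesIO holding a zip made from {name: bytes} entries."""
    stream = io.BytesIO()
    with zipfile.ZipFile(stream, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    # Leaving the with-block closes the archive and writes its directory.
    stream.seek(0)
    return stream

def iter_chunks(stream, size=2048):
    """Yield successive chunks until the stream is exhausted."""
    while True:
        buf = stream.read(size)
        if not buf:
            break
        yield buf

zipstream = build_zip({"list.kml": b"<kml></kml>"})
chunks = list(iter_chunks(zipstream))
```

In a request handler each chunk would be passed to the response's write method instead of being collected in a list.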
You need to declare:
global zipf
right after your
def get(self):
line. you are modifying a global variable, and this is the only way python knows what you are doing.
I'm trying to return a CSV from an action in my webapp, and give the user a prompt to download the file or open it from a spreadsheet app. I can get the CSV to spit out onto the screen, but how do I change the type of the file so that the browser recognizes that this isn't supposed to be displayed as HTML? Can I use the csv module for this?
import csv
def results_csv(self):
    data = ['895', '898', '897']
    return data
To tell the browser the type of content you're giving it, you need to set the Content-type header to 'text/csv'. In your Pylons function, the following should do the job:
response.headers['Content-type'] = 'text/csv'
PAG is correct, but furthermore if you want to suggest a name for the downloaded file you can also set response.headers['Content-disposition'] = 'attachment; filename=suggest.csv'
Yes, you can use the csv module for this:
import csv
from cStringIO import StringIO
...
def results_csv(self):
    response.headers['Content-Type'] = 'text/csv'
    s = StringIO()
    writer = csv.writer(s)
    writer.writerow(['header', 'header', 'header'])
    writer.writerow([123, 456, 789])
    return s.getvalue()
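A framework-independent sketch of the same idea, assuming Python 3 (io.StringIO instead of cStringIO). rows_to_csv is a hypothetical helper, and the headers dictionary only collects the values suggested in the answers above; how they are attached to a response depends on the framework:

```python
import csv
import io

def rows_to_csv(rows):
    """Serialize an iterable of rows to a CSV string."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()

body = rows_to_csv([["header", "header", "header"], [123, 456, 789]])

# Headers a response would need so the browser offers a download.
headers = {
    "Content-Type": "text/csv",
    "Content-Disposition": "attachment; filename=suggest.csv",
}
```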