Uploading CSV to Flask for background processing - python

I'm looking to use Flask to host a single-page website that would allow users to upload a CSV that would be parsed and put into a database. All of the database shenanigans are complete (through SQLAlchemy in another Python script), and I've got everything worked out once a script has access to the CSV; I just need help getting it there.
Here's the scenario:
1. User directs browser at URL (probably something like
http://xxx.xxx.xxx.xxx/upload/)
2. User chooses CSV to upload
3. User presses upload
4. File is uploaded and processed, but user is sent to a thank you page while our
script is still working on the CSV (so that their disconnect doesn't cause the
script to abort).
It's totally cool if the CSV is left on the server (in fact, it's probably preferred since we'd have a backup in case processing went awry)
I think what I want is a daemon that listens on a socket, but I'm not really experienced with this and don't know where to start getting it configured or setting up Flask.
If you think some framework other than Flask would be easier, definitely let me know, I'm not tied to Flask, I've just read that it's pretty easy to set up!
Thank you very much!!

Here is a (very slightly simplified) example of handling file uploading in web.py based on a cookbook example (the Flask example, which I have less experience with, looks even easier):
import web
urls = ('/', 'Upload')
class Upload:
    def GET(self):
        web.header("Content-Type", "text/html; charset=utf-8")
        return """
<form method="POST" enctype="multipart/form-data" action="">
<input type="file" name="myfile" />
<br/>
<input type="submit" />
</form>
"""
    def POST(self):
        x = web.input(myfile={})
        filedir = '/uploads'  # change this to the directory you want to store the file in.
        if 'myfile' in x:  # to check if the file-object is created
            filepath = x.myfile.filename.replace('\\', '/')  # replaces the windows-style slashes with linux ones.
            filename = filepath.split('/')[-1]  # splits the path and keeps the last part (the filename with extension)
            fout = open(filedir + '/' + filename, 'wb')  # creates the file where the uploaded file should be stored
            fout.write(x.myfile.file.read())  # writes the uploaded file to the newly created file.
            fout.close()  # closes the file, upload complete.
        raise web.seeother('/')
if __name__ == "__main__":
    app = web.application(urls, globals())
    app.run()
This renders an upload form, and then (on POST) reads the uploaded file and saves it to a designated path.
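The "thank you page while the script is still working" part of the question can be sketched independently of the web framework. This is a minimal sketch using only the standard library: the upload handler saves the file, puts its path on a queue, and returns immediately, while a worker thread does the parsing. The names `jobs`, `results`, `process_csv`, and `handle_upload` are hypothetical, and `results` is only a stand-in for the real database code.

```python
import csv
import queue
import threading

# Paths of uploaded CSV files waiting to be processed.
jobs = queue.Queue()
# Hypothetical stand-in for the database: file path -> number of rows parsed.
results = {}

def process_csv(path):
    """Parse one saved CSV; replace the body with the real database code."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
    results[path] = len(rows)

def worker():
    """Run forever, processing one queued file at a time."""
    while True:
        path = jobs.get()
        try:
            process_csv(path)
        except Exception as exc:  # keep the worker alive if one file is bad
            results[path] = exc
        finally:
            jobs.task_done()

# daemon=True lets the interpreter exit even though the loop never returns.
threading.Thread(target=worker, daemon=True).start()

def handle_upload(saved_path):
    """What the upload handler would do after saving the file to disk."""
    jobs.put(saved_path)  # returns immediately; parsing happens in the worker
    return "thank-you page"
```

Because the file is already on disk before the job is queued, a crash or restart loses only the queue entry, not the upload. An in-process thread is the simplest option; if the server runs several processes, each one gets its own queue, and a dedicated task queue such as Celery may be a better fit.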

Related

How to make Bottle refuse too big uploaded files?

Here is a way to accept upload with Bottle:
<form action="/upload" method="post" enctype="multipart/form-data">
<input type="file" name="file" />
</form>
and
from bottle import route, request
@route('/upload', method='POST')
def do_upload():
    myfile = request.files.get('file')
    size = len(myfile.read())  # oops the file is already read anyway!
    if size > 1024*1024:  # 1 MB
        return "File too big"
However, with this technique a 500 MB file would be read anyway, before noticing it's a "too big file".
Question: how to prevent a Bottle server to even accept a too big uploaded file, without having to read it first (and waste bandwidth/memory!)?
If not possible with Bottle only, how to do it with Apache + mod_wsgi (I currently use this)?
Because you are using Apache, you can add to the Apache configuration the LimitRequestBody directive and specify the limit. The request will be rejected before it even gets to your Python code.
https://httpd.apache.org/docs/2.4/mod/core.html#limitrequestbody
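If changing the Apache configuration is not an option, a similar check can be approximated in Python. The following is a rough sketch, not a replacement for LimitRequestBody: a small WSGI wrapper (the name `limit_body` is hypothetical) that looks at the Content-Length header and answers with 413 before the application reads the body.

```python
def limit_body(app, max_bytes):
    """Wrap a WSGI app and reject requests whose declared body is too large."""
    def wrapped(environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0  # malformed header: treat as "no declared length"
        if length > max_bytes:
            body = b"File too big"
            start_response("413 Request Entity Too Large", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]  # the request body is never read
        return app(environ, start_response)
    return wrapped
```

With Bottle under mod_wsgi this would be used along the lines of `application = limit_body(bottle.default_app(), 1024 * 1024)`. It only trusts the declared length, so a request sent without a Content-Length header passes through unchecked; the Apache directive is the more robust guard.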

Play a downloaded video with flask [duplicate]

I have a simple flask server. I downloaded, using pafy, a video from a youtube link provided by the user.
@app.route('/')
def download():
    return render_template('basic.html')
The basic.html template has a form that submits an action to download:
<form action="download_vid" method="post">
Link: <input type="text" name="download_path"><br>
<input type="submit" value="Submit">
</form>
I have another end point, /download_vid that looks like this.
@app.route('/download_vid', methods=['POST'])
def download_vid():
    url = request.form['download_path']
    v = pafy.new(url)
    s = v.allstreams[len(v.allstreams)-1]
    filename = s.download("static/test.mp4")
    return redirect(url_for('done'))
The desired link is indeed downloaded as a .mp4 file in my static folder. I can watch it and I can also use it as a source for a <video> tag in an HTML file, if I open it locally.
@app.route('/done')
def done():
    return app.send_static_file('test.mp4')
From what I understand, 'send_static_file' serves files from the static directory. However, I get a 404 error when I run the server, even though the video is clearly there.
I have also tried a different version for done():
@app.route('/done')
def done():
    return render_template('vid.html')
Here, vid.html resides in templates and has a hard-coded path to static/test.mp4. It is loaded after the download is complete. I do not get a 404 error in this case, but the <video> tag doesn't do anything, it's just gray. If I open vid.html locally (double click on it), it works and shows the video.
Can you please help me understand what is going on?
What I want to achieve is this:
Take an input from the user [ Done ]
Use that input to download a video [ Done ]
Serve that video back to the user [ ??? ]
I think you have something going on with file paths or file permissions.
Is the video being downloaded into static directory?
Is the static directory in the same directory, along with your main.py file?
Does your flask app have permissions to read the file?
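I think the reason your file did not load in the HTML template is that you referenced it with the relative path static/test.mp4. The browser resolves a relative path against the URL of the page, so if the page is served as /done/ the video is requested from /done/static/test.mp4. Referencing it as /static/test.mp4 avoids this.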
Instead of trying to push the file using Flask, you can redirect to the actual media file.
@app.route('/done')
def done():
    return redirect('/static/test.mp4')
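How a browser resolves the video path can be checked with the standard library. This small sketch uses `urllib.parse.urljoin`, which follows the same resolution rules; the host and port are assumptions for illustration, and `resolve` is a hypothetical helper name.

```python
from urllib.parse import urljoin

def resolve(page_url, src):
    """Return the absolute URL a browser would request for src on page_url."""
    return urljoin(page_url, src)

# Relative path, page URL without a trailing slash.
relative_from_done = resolve("http://localhost:5000/done", "static/test.mp4")
# → "http://localhost:5000/static/test.mp4"

# Relative path, page URL with a trailing slash.
relative_from_done_slash = resolve("http://localhost:5000/done/", "static/test.mp4")
# → "http://localhost:5000/done/static/test.mp4"

# A leading slash always resolves against the site root.
absolute_from_anywhere = resolve("http://localhost:5000/done/", "/static/test.mp4")
# → "http://localhost:5000/static/test.mp4"
```

Since the result of a relative path depends on the exact page URL, writing the path with a leading slash in the template is the safer choice.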

Uploading Files in webapp2/GAE

I need to upload and process a CSV file from a form in a Google App Engine application based on webapp2 (Python). I understand I could use the Blobstore to temporarily store the file, but I am curious to know if there is a way to process the file without having to store it at all.
If you need to upload a file via webapp2 with an HTML form, the first thing you need to do is change the HTML form's enctype attribute to multipart/form-data, so the code snippet looks like:
<form action="/emails" class="form-horizontal" enctype="multipart/form-data" method="post">
<input multiple id="file" name="attachments" type="file">
</form>
In python code, you can read the file directly via request.POST, here's a sample code snippet:
class UploadHandler(BaseHandler):
    def post(self):
        attachments = self.request.POST.getall('attachments')
        _attachments = [{'content': f.file.read(),
                         'filename': f.filename} for f in attachments]
The content of uploaded files is in self.request.POST in your handler, so you can get that content (assuming e.g the field for the uploaded file is named 'foo') with e.g
content = self.request.POST.multi['foo'].file.read()
So now you have the content as a string -- process it as you wish. This does of course assume the thing will fit in memory (no multi-megabyte uploads!-)...
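Once the content is in memory, it can be handed to the csv module without touching the disk. The following is a sketch written for Python 3 (the answers above date from Python 2), assuming the upload is UTF-8 text with a header row; `parse_csv_bytes` is a hypothetical helper name.

```python
import csv
import io

def parse_csv_bytes(content, encoding="utf-8"):
    """Parse uploaded CSV bytes in memory and return a list of row dicts."""
    text = content.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines inside quoted fields itself.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)
```

For example, `parse_csv_bytes(b"name,age\r\nAda,36\r\n")` yields one row with the keys `name` and `age`. As noted above, this only makes sense for uploads small enough to fit in memory.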

Python / CGI - Upload file attempt returns an empty page

I searched about 50 related pages but have never seen a problem similar to mine. When I press the submit button, it calls the script, but the script returns an empty page and no file is uploaded. There is no typing error in my code; I checked it several times, and I really need this code running for my project. What might be the problem? I am running Apache under Ubuntu and my code is:
html code:
<html><body>
<form enctype="multipart/form-data" action="save_file.py" method="post">
<p>File: <input type="file" name="file"></p>
<p><input type="submit" value="Upload"></p>
</form>
</body></html>
python code:
#!/usr/bin/env python
import cgi, os
import cgitb; cgitb.enable()
try: #windows needs stdio set for binary mode
    import msvcrt
    msvcrt.setmode (0, os.O_BINARY)
    msvcrt.setmode (1, os.O_BINARY)
except ImportError:
    pass
form = cgi.FieldStorage()
#nested FieldStorage instance holds the file
fileitem = form['file']
#if file is uploaded
if fileitem.filename:
    #strip leading path from filename to avoid directory based attacks
    fn = os.path.basename(fileitem.filename)
    open('/files' + fn, 'wb').write(fileitem.file.read())
    message = 'The file "' + fn + '" was uploaded successfully'
else:
    message = 'No file was uploaded'
print """\
Content-Type: text/html\n
<html><body>
<p>%s</p>
</body></html>
""" % (message,)
I just tested your script, with a few small corrections to the paths to make it work for me locally. With the paths set correctly, and permissions set properly, this code does work fine.
Here are the things to make sure of:
1. In your HTML file's form properties, make sure you are pointing to the Python script that lives in a cgi-bin: action="/cgi-bin/save_file.py". For me, I have a cgi-bin at the root of my web server, and I placed the Python script there. It will not work if you are running the script from a standard document location on the web server.
2. Make sure your save_file.py has executable permissions: chmod 755 save_file.py
3. In your save_file.py, ensure that you are building a valid path to open the file for saving. I made mine absolute just for testing purposes, but something like this: open(os.path.join('/path/to/upload/files', fn), 'wb')
With those points set correctly, you should not have any problems.
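The path problem is easy to miss: '/files' + fn has no separator, so an upload named report.csv would be written to /filesreport.csv. A minimal sketch of building the target path, where `upload_path` is a hypothetical helper and the directory is whatever you choose:

```python
import os

def upload_path(upload_dir, client_filename):
    """Build the save path for an upload, keeping only the bare file name."""
    # Browsers on Windows may send a full path with backslashes.
    name = client_filename.replace("\\", "/")
    # Drop any directory part so the client cannot pick the target directory.
    name = os.path.basename(name)
    if name in ("", ".", ".."):
        raise ValueError("no usable filename in %r" % client_filename)
    # os.path.join inserts the separator that plain concatenation misses.
    return os.path.join(upload_dir, name)
```

In the script above this would replace the `'/files' + fn` expression, for example `open(upload_path('/path/to/upload/files', fileitem.filename), 'wb')`.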

How do I display a website with html-forms locally using python and collect the user input?

I am a behavioral scientist and usually collect data by letting participants do some tasks on a computer and record their responses (I write the programs using the pyglet wrapper PsychoPy). That is, the program runs locally and the data is stored locally.
Now I would like to know if there is a way to use Python to display a (local) website with html-forms to the user and collect the input (locally). The reason for this idea is that currently whenever I want to display checkboxes, radiobuttons, or input fields I use wxPython. This works quite well, but programming and layouting in wxPython is kind of cumbersome and I would prefer html with forms.
A requirement would be that it would need to run without any borders, address field, menu bar, and so on. The reason is that I need it in a kind of fullscreen mode (I currently open a non-fullscreen pyglet window in the size of the screen to hide the desktop) so that participants can do nothing but work on the forms.
So I am looking for a way to (a) display HTML pages including HTML forms above a pyglet window with no menu bar whatsoever, (b) collect the input when the Ok button is clicked (i.e., the form is sent), (c) control what is presented before and after viewing this page, and (d) have all of this happen locally!
My idea would be that the data is collected when participants hit the "Send away" button in the following example pic and the next page is displayed.
Update: I use windows (XP or 7).
This is a solution using Qt Webkit for rendering HTML. The default navigation request handler is wrapped by a function that checks for submitted form requests. The form uses the "get" method, so the data is included in the url of the request and can be retrieved that way. The original request is declined and you can change the content of the displayed web page as you wish.
from PyQt4 import QtGui, QtWebKit
app = QtGui.QApplication([])
view = QtWebKit.QWebView()
# intercept form submits
class MyWebPage(QtWebKit.QWebPage):
    def acceptNavigationRequest(self, frame, req, nav_type):
        if nav_type == QtWebKit.QWebPage.NavigationTypeFormSubmitted:
            text = "<br/>\n".join(["%s: %s" % pair for pair in req.url().queryItems()])
            view.setHtml(text)
            return False
        else:
            return super(MyWebPage, self).acceptNavigationRequest(frame, req, nav_type)
view.setPage(MyWebPage())
# setup the html form
html = """
<form action="" method="get">
Like it?
<input type="radio" name="like" value="yes"/> Yes
<input type="radio" name="like" value="no" /> No
<br/><input type="text" name="text" value="Hello" />
<input type="submit" name="submit" value="Send"/>
</form>
"""
view.setHtml(html)
# run the application
view.show()
app.exec_()
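Because the form uses the "get" method, the submitted values travel in the query string of the URL, and that part can be illustrated without Qt. This is a small standard-library sketch for Python 3; the URL is an assumed example and `form_fields` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlsplit

def form_fields(url):
    """Extract the submitted form fields from the URL of a GET request."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps text inputs that were left empty.
    return dict(parse_qsl(query, keep_blank_values=True))
```

Calling it on a URL ending in `?like=yes&text=Hello&submit=Send` gives a dict with the keys `like`, `text`, and `submit`. Note that converting to a dict keeps only the last value when a field name repeats, which matters for groups of checkboxes sharing one name.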
As AdamKG mentioned, using a webframework would be a good choice. Since Django and similar might be an overkill here, using a micro webframework like 'flask' or 'bottle' would be a great choice.
This link demonstrates via step by step instruction how to make a simple form via a To-DO application. It assumes zero previous knowledge.
You can run it only locally also.
You want a simple solution, so just write an HTTP server and run your simple page.
Using Python's BaseHTTPServer module, here is a web server in about 15 lines:
import BaseHTTPServer
class WebRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/foo':
            self.send_response(200)
            self.end_headers()  # finish the header block so the client gets a complete response
            self.do_something()
        else:
            self.send_error(404)
    def do_something(self):
        print 'hello world'
server = BaseHTTPServer.HTTPServer(('',80), WebRequestHandler)
server.serve_forever()
Easy enough, but I suggest using a web framework. They are easy too.
For example, web.py. Here is what you want in about 50 lines of code:
Install web.py.
Make a directory with two files:
./
|-- app.py
`-- templates
    `-- index.html
index.html
$def with (form, ret)
<html>
<head>
<title> another site </title>
</head>
<body>
<h1> hello, this is a web.py page </h1>
<form action="" method="post">
$:form.render()
</form>
<h2>$:ret</h2>
</body>
</html>
app.py logic file:
import web
### Url mappings
urls = (
    '/', 'Index',
)
### Templates
render = web.template.render('templates')
class Index:
    form = web.form.Form(
        web.form.Textbox('fav_name', web.form.notnull, description="Favorite Name:"),
        web.form.Textbox('cur_name', web.form.notnull, description="Current Name:"),
        web.form.Button('Send Away'),
    )
    def GET(self):
        """ Show page """
        form = self.form()
        return render.index(form, "")
    def POST(self):
        """ handle button clicked """
        form = self.form()
        if not form.validates():
            return render.index(form, "INPUT ERROR")
        # save data by ur method, or do some task
        #pyglet.save_data(form.d.fav_name, form.d.cur_name)
        #pyglet.draw(some_pic)
        #os.system(some_cmd)
        form = self.form()
        return render.index(form, "YOUR DATA SAVED")
app = web.application(urls, globals())
if __name__ == '__main__':
    app.run()
run this server in your windows:
python app.py 9999
open browser: http://127.0.0.1:9999/
By the way, if your data is only strings, you can save it from web.py using sqlite.
My suggestion would be:
1. Use some Python server such as SimpleHTTPServer. It is needed because the submit button on forms sends the information to a server. There you should manage the received info in some way.
2. Have your browser configured with one of those kiosk extensions, which disallow even the use of Alt+F4. An example would be the Open Kiosk extension for Firefox.
3. Optionally, if you are comfortable with scripts in general, you could create a script which, when executed, would at the same time run the Python server AND open your HTML file in the browser. That would greatly ease your setup work for every subject in your group.
EDIT: I've read you need the pyglet over the browser window. That could be included in the script of step 3, using "always on top" option and absolute positioning of the pyglet (I can tell this would probably be simpler on Linux, which could be run from persistent LiveUSB - just a thought!)
EDIT (regarding the posted comment):
I think the most reliable option for output would be to disk (file or database) instead of RAM (a running Python object); then you read the info from the file afterwards. In case of a surprise (system hang, power failure), the already-entered data would still be there.
The only (and most important) part I don't know HOW to do is to handle the content of the form's "submit" on the server side. Probably some server-side script file (PHP, Python) should be created and left on the server root, so the server would receive an HTTP request containing the info and send the info to the script, which then handles the processing and file/database storage activities.
This might be of your interest:
"The POST request method is used when the client needs to send data to the server as part of the request, such as when uploading a file or submitting a completed form." (from wikipedia on "POST(HTTP)" ENTRY)
In another link, some thoughts on using SimpleHTTPServer itself for handling POST requests:
http://islascruz.org/html/index.php/blog/show/Python%3A-Simple-HTTP-Server-on-python..html
Hope this helps.
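The server-side handling of the form's submit can be sketched with the standard library alone. This is a minimal Python 3 sketch using `http.server` (the successor of BaseHTTPServer and SimpleHTTPServer), assuming the form is sent with method="post" and the default URL encoding. The names `DATA_FILE`, `parse_form`, `append_row`, `FormHandler`, and `serve` are hypothetical, and each submission is appended to a CSV file on disk as suggested above.

```python
import csv
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

DATA_FILE = "responses.csv"  # hypothetical output file

def parse_form(body):
    """Decode an application/x-www-form-urlencoded body into a flat dict."""
    fields = parse_qs(body.decode("utf-8"), keep_blank_values=True)
    # parse_qs returns a list per field; keep the first value of each.
    return {name: values[0] for name, values in fields.items()}

def append_row(path, fields):
    """Append one submission to a CSV file, values ordered by field name."""
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow([fields[name] for name in sorted(fields)])

class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The body length is announced in the Content-Length header.
        length = int(self.headers.get("Content-Length", 0))
        fields = parse_form(self.rfile.read(length))
        append_row(DATA_FILE, fields)  # on disk before the reply is sent
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"saved")

def serve(port=8000):
    """Block forever, serving on localhost only."""
    HTTPServer(("127.0.0.1", port), FormHandler).serve_forever()
```

Calling `serve()` starts the server; the form's action would then point at `http://127.0.0.1:8000/`. The sketch writes no header row and serves no HTML itself, so the page with the form has to come from elsewhere or from an added `do_GET` method.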
The reason for this idea is that currently whenever I want to display
checkboxes, radiobuttons, or input fields I use wxPython. This works
quite well, but programming and layouting in wxPython is kind of
cumbersome and I would prefer html with forms.
You can combine the ease of HTML and still create native Windows applications using Flex with a Python backend.
If you are averse to Flex, a slightly more involved but still native Windows application generator is Camelot.
Edit
Instead of typing it out again - I would suggest the django + flex + pyamf article on Adobe that explains it all with screenshots as well. You can replace django with flask or bottle as they are more lightweight, however the PyAMF library provides native support for django which is why it was used in the example.
PyAMF provides Action Message Format (a binary protocol to exchange object with the flash runtime) support for Python.
