Upload using threading Python Flask - python

I would like to upload multiple files using a thread. This way the files can upload in the background and not make the user wait.
Here is my simplified code:
In app.py:
from file_upload import upload_process
from flask import render_template, request

@app.route('/complete', methods=['POST'])
def complete():
    id = 5  # for simplified example
    upload_process(id)  # My thread
    ...
    return render_template('complete.html')
In file_upload.py:
from threading import Thread
from flask import request

def upload_process(id):
    thr = Thread(target=upload_files, args=[id])
    thr.start()

def upload_files(id):
    # Runs in the new thread, where no request context is active.
    file_1 = request.files['file_1']
    file_2 = request.files['file_2']
    file_3 = request.files['file_3']
    newFiles = FileStorage(id=id, file_1=file_1.read(),
                           file_2=file_2.read(), file_3=file_3.read())
    db.session.add(newFiles)
    db.session.commit()
I get the error:
RuntimeError: Working outside of request context.
This typically means that you attempted to use functionality that needed an active HTTP request. Consult the documentation on testing for information about how to avoid this problem.
How would I get the request to work within the upload_files function?
(Without threading, the files upload correctly.)
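One possible way around the error, sketched under assumptions: `request` is only usable while the view function is handling the request, so the file bytes can be read there and handed to the thread as plain data. The `store` dict below is a hypothetical stand-in for the `FileStorage` model and `db.session`; with Flask-SQLAlchemy the real write would probably also need an application context inside the thread, which this sketch does not show.

```python
import threading


def upload_files(record_id, payloads, store):
    """Runs in the worker thread and touches only plain Python objects."""
    # `store` is a hypothetical stand-in for the database write.
    store[record_id] = dict(payloads)


def upload_process(record_id, payloads, store):
    """Start the background thread and return it so callers may join it."""
    thr = threading.Thread(target=upload_files, args=(record_id, payloads, store))
    thr.start()
    return thr


# Inside the Flask view the bytes would be read while the request is active:
#     payloads = {name: f.read() for name, f in request.files.items()}
# Here literal bytes stand in for the three uploaded files.
payloads = {"file_1": b"aaa", "file_2": b"bbb", "file_3": b"ccc"}
saved = {}
worker = upload_process(5, payloads, saved)
worker.join()  # only so the example finishes deterministically
```

Reading every file into memory before returning keeps the thread independent of Flask, at the cost of holding the full upload in RAM until the thread has stored it.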

Related

Correct way of passing object instance between modules

SOLVED: It turns out the problem comes from the interaction between gunicorn's preloading and forking and apscheduler. See comment.
Background
I am writing a simple Flask API that runs a periodic background query against a SQL database using apscheduler, then serves incoming REST requests with Flask. The API performs different aggregations based on the incoming request.
I have a data class object that has methods for 1) querying/updating, 2) responding to aggregation requests. The problem arises when somehow the flask resource seems to be stuck at an older version of the data while the logs show that the query/update method was called properly.
Code so far
I broke my app down into modules as follows:
app/
├── app.py
└── apis
    ├── __init__.py
    └── model1.py
Data model file
In model1.py, I define the data class and the API endpoints with a flask-restplus namespace, and initialize the data object:
import logging

import pandas as pd
from flask_restplus import Namespace, Resource

api = Namespace('sales')

@api.route('/check')
class check_sales(Resource):
    def post(self):
        req = api.payload
        result = data.get_sales(**req)
        return result, 200

class sales_today():
    def __init__(self):
        self.data = None
        self.update()

    def update(self):
        # some logging here
        self.data = self.check_sql()
        logging.debug("Last Order: %s" % str(self.data.sales_time.max()))

    def check_sql(self):
        query = """
        SELECT region, store, item, sales_count, MAX(UtcTimeStamp) as sales_time FROM db GROUP BY 1,2,3
        """
        sales = pd.read_gbq(query)
        return sales

    def get_sales(self, **kwargs):
        '''
        kwargs here is a dict where we filter and sum
        '''
        # Start from an all-True mask once, then narrow it for each string filter.
        mask = pd.Series(True, index=self.data.index)
        for arg_name in kwargs:
            if type(kwargs[arg_name]) is str:
                arg_value = kwargs[arg_name].split(',')
                mask = mask & (self.data[arg_name].isin(arg_value))
        result = {k: v for k, v in kwargs.items()}
        # Sum the matching rows, as the docstring describes.
        result['count'] = int(self.data.loc[mask]['sales_count'].sum())
        result['last_updated'] = str(self.data.sales_time.max())
        return result

data = sales_today()
Module init file
In __init__.py inside app/apis I pass the data object instance as well as the api namespace.
from .model1 import api as ns_model1
from .model1 import data as data_model1

def add_apins(api):
    api.add_namespace(ns_model1, path='/model1')
Main app file
In the main app.py file I set up the scheduler to refresh the data every 5 minutes with apscheduler. I then serve this app with gunicorn.
import atexit

from apscheduler.schedulers.background import BackgroundScheduler
from flask import Flask
from flask_restplus import Resource, Api
from werkzeug.serving import run_simple

from apis import add_apins
from apis import data_model1

# parameters
port = 8888
poll_freq = '0-59/5'

# flask app
main_app = Flask(__name__)
api = Api()
add_apins(api)
api.init_app(main_app)

# background scheduler
sched = BackgroundScheduler()
sched.add_job(data_model1.update, 'cron', minute=poll_freq)
sched.start()
atexit.register(lambda: sched.shutdown(wait=False))

if __name__ == "__main__":
    # serve(application, host='0.0.0.0', port=port) # ssl_context="adhoc" for https testing locally
    run_simple(application=main_app, hostname='0.0.0.0', port=port, use_debugger=True)
Expectation and issues
Since the query is refreshed every 5 minutes, I would expect that whenever I query the /check endpoint, the last_updated value in the response payload matches the latest value in the logs (the logging.debug line in the update() method). However, the responses show a last_updated value equal to the one from when the app first started.
I have confirmed that the data in the DB is up to date, and from the logs I have also confirmed that the update() method runs every 5 minutes and shows the latest timestamp.
I also noticed that the app runs fine with python app.py in Windows, but when running the app with gunicorn it starts exhibiting this weird behaviour.
I am quite puzzled as to where things go wrong. Could it be scoping? Or am I passing the instance between modules wrongly?
Thank you so much for your time and help. Any ideas would be much appreciated.
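A plausible reading of the SOLVED note, illustrated with a Unix-only sketch: if gunicorn preloads the app, app.py is imported once in the master process, so the scheduler thread starts there. Workers are then forked, a forked child does not carry over the parent's other threads, and each worker keeps the copy of `data` it had at fork time. The `state` dict and the pipe handshake below are hypothetical stand-ins, not gunicorn or apscheduler code.

```python
import os

# Module-level state, like the `data` object created at import time.
state = {"last_updated": "startup"}


def worker_view_after_master_refresh():
    """Fork, refresh the state in the parent, and report what the child sees."""
    go_r, go_w = os.pipe()          # parent -> child: "refresh is done"
    reply_r, reply_w = os.pipe()    # child -> parent: the child's view
    pid = os.fork()
    if pid == 0:
        # Child: stands in for a gunicorn worker that serves requests.
        os.read(go_r, 1)            # block until the parent has refreshed
        os.write(reply_w, state["last_updated"].encode())
        os._exit(0)
    # Parent: stands in for the master, where the scheduler thread keeps running.
    state["last_updated"] = "refreshed"
    os.write(go_w, b"x")
    seen = os.read(reply_r, 64).decode()
    os.waitpid(pid, 0)
    for fd in (go_r, go_w, reply_r, reply_w):
        os.close(fd)
    return seen


worker_view = worker_view_after_master_refresh()
master_view = state["last_updated"]
```

The parent logs the refreshed value while the child still answers with the value from fork time, which matches the symptom of fresh logs and stale responses. It would also be consistent with the app behaving correctly under `python app.py`, where there is a single process.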

How to use flask context with concurrent.futures.ThreadPoolExecutor

I'm trying to make multiple requests asynchronously and collect the responses. I'm using concurrent.futures for this, but my functions use current_app from Flask and I always get this error:
RuntimeError: Working outside of application context.
I don't know how to resolve this. Can anyone please help?
Below is my code:
run.py:
import concurrent.futures
from flask import current_app
from http_calls import get_price, get_items

def init():
    with current_app._get_current_object().test_request_context():
        with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
            futs = []
            futs.append(executor.submit(get_price))
            futs.append(executor.submit(get_items))
            print([fut.result() for fut in concurrent.futures.as_completed(futs)])

init()
http_calls.py:
import requests
from flask import current_app

def get_price():
    url = current_app.config['get_price_url']
    return requests.get(url).json()

def get_items():
    url = current_app.config['get_items_url']
    return requests.get(url).json()
I was running into similar issues around using concurrent.futures with Flask. I wrote Flask-Executor as a Flask-friendly wrapper for concurrent.futures to solve this problem. It may be an easier way for you to work with these two together.
You should import your Flask instance in your script. Use current_app under the app context.
import concurrent.futures
from your_application import your_app # or create_app function to return a Flask instance
from flask import current_app
from http_calls import get_price, get_items

def init():
    with your_app.app_context():
        with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
            ...
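One caveat worth keeping in mind: a Flask application context belongs to the thread that pushed it, so the pool's worker threads may still not see `current_app`. A minimal sketch of one way to sidestep that is to read the configuration in the calling thread and pass plain strings to the workers. The names `fetch_all`, `fake_config`, and `fake_fetch` are hypothetical and exist only for this illustration.

```python
import concurrent.futures


def fetch_all(config, fetch, keys=("get_price_url", "get_items_url")):
    """Resolve URLs in the calling thread, then fetch them in worker threads."""
    # Read the configuration here, where the application context is available.
    urls = {key: config[key] for key in keys}
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(urls)) as executor:
        # Workers receive plain strings and never touch current_app.
        futures = {executor.submit(fetch, url): key for key, url in urls.items()}
        return {
            futures[fut]: fut.result()
            for fut in concurrent.futures.as_completed(futures)
        }


# Hypothetical stand-ins: in the real script `config` would be current_app.config
# (read inside `with your_app.app_context():`) and `fetch` would be something
# like `lambda url: requests.get(url).json()`.
fake_config = {
    "get_price_url": "https://example.invalid/price",
    "get_items_url": "https://example.invalid/items",
}


def fake_fetch(url):
    return {"url": url}


results = fetch_all(fake_config, fake_fetch)
```

Keeping the worker functions free of Flask globals also makes them easier to test, since they can be called with any mapping and any fetch function.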

How to run python script results on Flask

I have created an app.py and index.html file. My problem is that I want to execute a Python script with the input I gathered from POST when submit is clicked, and then display the script output on the same or a different HTML page. I used CGI and Flask. I do not fully know how to proceed. I researched online, but couldn't find anything very helpful. Any help would be appreciated.
Here is my code.
from flask import Flask, render_template, request, redirect

app = Flask(__name__)

@app.route("/")
def main():
    return render_template('index.html')

@app.route("/src_code/main.py", methods = ['POST'])
def run_app():
    id = request.form['id']
    name = request.form['name']
    url = request.form['url']
    if not id or not name or not url:
        return render_template('index.html')
    else:
        # execute the python script.
        pass

if __name__ == "__main__":
    app.run()
EDIT:
I have used the following code to import my function. In the end, though, I received an error when I clicked the submit button on index.html.
script_analyze = Analyzer()
result = script_analyze.main()
return render_template('results.html', data=result)
AttributeError: 'WSGIRequestHandler' object has no attribute 'environ'
I am unsure why this attribute error is raised.
Since you want to execute another Python script... If you are able to import the other script then you can just use something like the following to call it and store the results - assuming the other script is a value-returning function.
from othermodule import function_to_run
...
# where you want to call it
result = function_to_run()
Then you can use render_template as others have said, passing this result as the data to the template (or simply return the result if it's already in the format you want to output with Flask).
Does that work, or is the script you want to run something that this wouldn't work for? Let us know more about the script if it's an issue.
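If the script cannot be imported as a function, a possible alternative is to run it as a separate process and capture what it prints. This is only a sketch: `run_script` is a hypothetical helper, and the inline one-liner stands in for the real script file.

```python
import subprocess
import sys


def run_script(args, timeout=30):
    """Run a Python script in a child process and return its standard output."""
    completed = subprocess.run(
        [sys.executable, *args],
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # decode the output to str
        timeout=timeout,
        check=True,            # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# A one-line inline script stands in for the real file; with a file on disk the
# call would look like run_script(["analyzer.py", name, url]) (hypothetical name).
inline_script = "import sys; print(sys.argv[1].upper())"
output = run_script(["-c", inline_script, "hello"])
```

The returned string could then be passed to render_template in the same way as the imported function's result. Passing the form values as separate list items, rather than building a shell command string, avoids shell quoting problems.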

How to use correctly importlib in a flask controller?

I am trying to load a module according to some settings. I have found a working solution but I need a confirmation from an advanced python developer that this solution is the best performance wise as the API endpoint which will use it will be under heavy load.
The idea is to change the working of an endpoint based on parameters from the user and other systems configuration. I am loading the correct handler class based on these settings. The goal is to be able to easily create new handlers without having to modify the code calling the handlers.
This is a working example :
./run.py :
from flask import Flask, abort
import importlib
import handlers

app = Flask(__name__)

@app.route('/')
def api_endpoint():
    try:
        endpoint = "simple" # Custom logic to choose the right handler
        handlerClass = getattr(importlib.import_module('.'+str(endpoint), 'handlers'), 'Handler')
        handler = handlerClass()
    except Exception as e:
        print(e)
        abort(404)
    print(handlerClass, handler, handler.value, handler.name())
    # Handler processing. Not yet implemented
    return "Hello World"

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8080, debug=True)
One "simple" handler example. A handler is a module that needs to define a Handler class:
./handlers/simple.py :
import os

class Handler:
    def __init__(self):
        self.value = os.urandom(5)

    def name(self):
        return "simple"
If I understand correctly, the import is done on each query to the endpoint. It means IO in the filesystem with lookup for the modules, ...
Is it the correct/"pythonic" way to implement this strategy ?
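On the cost of importing per request: importlib.import_module only loads a module from disk the first time; later calls return the cached module object from sys.modules. The sketch below shows that behaviour and adds an optional cache for the attribute lookup. `load_attr` is a hypothetical helper, and the standard-library json module stands in for a handler module; for the layout in the question the call would be `load_attr('.simple', 'Handler', 'handlers')`.

```python
import importlib
import sys
from functools import lru_cache


@lru_cache(maxsize=None)
def load_attr(module_name, attr_name, package=None):
    """Import a module once and cache the attribute looked up on it."""
    module = importlib.import_module(module_name, package)
    return getattr(module, attr_name)


# The standard-library json module stands in for a handler module here.
first = importlib.import_module("json")
second = importlib.import_module("json")   # served from sys.modules
decoder_cls = load_attr("json", "JSONDecoder")
decoder_cls_again = load_attr("json", "JSONDecoder")  # served from the lru_cache
```

Because the lookup is cached either way, the per-request overhead after the first call is small. A new handler instance is still created per request, as in the original code.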
Question moved to codereview. Thanks all for your help : https://codereview.stackexchange.com/questions/96533/extension-pattern-in-a-flask-controller-using-importlib
I am closing this thread.

How to get the return value (like Ajax) using task queue on Google App Engine

I can use a task queue to change the database value, but how can I get the return value like Ajax using task queue?
This is my code:
from google.appengine.api.labs import taskqueue
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
from google.appengine.ext.webapp.util import run_wsgi_app
import os

class Counter(db.Model):
    count = db.IntegerProperty(indexed=False)

class BaseRequestHandler(webapp.RequestHandler):
    def render_template(self, filename, template_values={}):
        values={
        }
        template_values.update(values)
        path = os.path.join(os.path.dirname(__file__), 'templates', filename)
        self.response.out.write(template.render(path, template_values))

class CounterHandler(BaseRequestHandler):
    def get(self):
        self.render_template('counters.html',{'counters': Counter.all()})

    def post(self):
        key = self.request.get('key')
        # Add the task to the default queue.
        for loop in range(0,1):
            a=taskqueue.add(url='/worker', params={'key': key})
        #self.redirect('/')
        self.response.out.write(a)

class CounterWorker(webapp.RequestHandler):
    def post(self): # should run at most 1/s
        key = self.request.get('key')
        def txn():
            counter = Counter.get_by_key_name(key)
            if counter is None:
                counter = Counter(key_name=key, count=1)
            else:
                counter.count += 1
            counter.put()
        db.run_in_transaction(txn)
        self.response.out.write('sss')#used for get by task queue

def main():
    run_wsgi_app(webapp.WSGIApplication([
        ('/', CounterHandler),
        ('/worker', CounterWorker),
    ]))

if __name__ == '__main__':
    main()
How can I show the 'sss'?
The current Task Queue API doesn't support processing return values or sending them back to the point of origin. Your appengine process isn't long-lived enough for that programming paradigm to work.
In your example, it looks like what you want is something like this:
1. Create the task.
2. Return AJAX code that polls a task-status handler.
3. The task runs and updates the datastore with a return value.
4. The task-status URL returns the updated value.
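The polling flow listed above can be sketched without App Engine at all. In this illustration a dict stands in for the datastore and a thread stands in for the task queue worker; `create_task`, `worker`, and `task_status` are hypothetical names, not part of the Task Queue API.

```python
import threading
import uuid

results = {}                      # stands in for the datastore
results_lock = threading.Lock()


def worker(task_id, key):
    """The background task: do the work, then record a return value."""
    value = "processed:" + key
    with results_lock:
        results[task_id] = value


def create_task(key):
    """Start the task and hand back an id the client can poll with."""
    task_id = uuid.uuid4().hex
    thread = threading.Thread(target=worker, args=(task_id, key))
    thread.start()
    return task_id, thread


def task_status(task_id):
    """What a task-status handler would return to the polling client."""
    with results_lock:
        if task_id in results:
            return {"status": "done", "value": results[task_id]}
    return {"status": "pending"}


task_id, thread = create_task("counter-a")
thread.join()  # a real client would call task_status repeatedly instead
final = task_status(task_id)
```

The client never waits on the task itself; it only asks the status handler until the stored value appears.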
Alternatively, if you don't want to return the 'sss' to the client but instead need it for further processing, you'll need to split your method into multiple parts. The first part creates the task and then exits. At the end of the task's process, it adds a new task itself to call back into the second part with the return value.
