I have built a small web application with Flask that uses Fabric to call shell scripts on remote servers.
The shell script may take a long time to complete. I send a POST request from my browser and wait for the results.
Fabric displays the output in real time on the Flask console, but Flask returns the values to the browser only after the remote script has completed.
How can I make Flask print those real-time values on my browser screen?
My flask piece:
@app.route("/abc/execute", methods=['POST'])
def execute_me():
    value = request.json['value']
    result = fabric_call(value)
    result = formations(result)
    return json.dumps(result)
My fabric piece:
def fabric_call(value):
    with settings(host_string='my server', user='user',password='passwd',warn_only=True):
        proc = run(my shell script)
    return json.dumps(proc)
Update
I tried streaming as well, but it didn't work. The output reaches my curl POST only after the script has finished executing. What am I missing?
@app.route("/abc/execute", methods=['POST'])
def execute_me():
    value = request.json['value']
    def generate():
        for row in formations(fabric_call(value)):
            yield row + '\n'
    return Response(generate(), mimetype="text/event-stream")
First of all, you need to make sure your data source (formations()) is actually a generator that yields data when available. Right now it very much looks like it runs the command and only returns a value once it has completely finished.
Also, in case you are using AJAX to call your endpoint remember that you cannot use e.g. jQuery's $.ajax(); you need to use XHR directly and poll for new data instead of using the onreadystatechange event since you want data once it's available and not only when the request finished.
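To illustrate the first point, here is a minimal sketch of a data source that really is a generator: it yields each line of a command's output as soon as the line is produced, rather than collecting everything first. It uses the standard-library subprocess module on a local command instead of fabric (an assumption, made only to keep the example self-contained), and stream_command is a hypothetical helper name that does not appear in the question.

```python
import subprocess
import sys


def stream_command(cmd):
    """Yield each line of the command's output as soon as it is available."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    try:
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for the long-running script.
    demo = [sys.executable, "-u", "-c", "print('step 1'); print('step 2')"]
    for chunk in stream_command(demo):
        print(chunk, end="")
```

In a Flask view, a generator like this could be passed straight to Response(...), so each line is sent as it arrives. Whether the remote command itself flushes its output line by line is a separate matter that this sketch cannot control.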
Related
I am working on a Python flask app, and the main method start() calls an external API (third_party_api_wrapper()). That external API has an associated webhook (webhook()) that receives the output of that external API call (note that the output that webhook() receives is actually different from the response returned in the third_party_wrapper())
The main method start() needs the result of webhook(). How do I make start() wait for webhook() to be executed? And how do we pass the returned value of webhook() back to start()?
Here is a minimal code snippet to capture the scenario.
@app.route('/webhook', methods=['POST'])
def webhook():
    return "webhook method has executed"

# this method has a webhook that calls webhook() after this method has executed
def third_party_api_wrapper():
    url = 'https://api.thirdparty.com'
    response = requests.post(url)
    return response

# this is the main entry point
@app.route('/start', methods=['POST'])
def start():
    third_party_api_wrapper()
    # The rest of this code depends on the output of webhook().
    # How do we wait until webhook() is called, and how do we access the returned value?
The answer to this question really depends on how you plan on running your app in production. It's much simpler if we make the assumption that you only plan to have a single instance of your app running at once (as opposed to multiple behind a load balancer, for example), so I'll make that assumption first to give you a place to start, and comment on a more "production-ready" solution afterwards.
A big thing to keep in mind when writing a web application is that you have to understand how you want the outside world to interact with your app. Do you expect to have the /start endpoint called only once at the beginning of your app's lifetime, or is this a generic endpoint that may start any number of background processes that you want the caller of each to wait for? Or, do you want the behavior where any caller after the first one will wait for the same process to complete as the first one? I can't answer these questions for you, it depends on the use-case you're trying to implement. I'll give you a relatively simple solution that you should be able to modify to fulfill any of the ones I mentioned though.
This solution will use the Event class from the threading standard library module; I added some comments to clarify which parts you may have to change depending on the specifics of the API you're calling and stuff like that.
import threading
import uuid
from typing import Any

import requests
from flask import Flask, Response, request

# The base URL for your app, if you're running it locally this should be fine
# however external providers can't communicate with your `localhost` so you'll
# need to change this for your app to work end-to-end.
BASE_URL = "http://localhost:5000"

app = Flask(__name__)


class ThirdPartyProcessManager:
    def __init__(self) -> None:
        self.events = {}
        self.values = {}

    def wait_for_request(self, request_id: str) -> Any:
        event = threading.Event()
        actual_event = self.events.setdefault(request_id, event)
        if actual_event is not event:
            raise ValueError(f"Request {request_id} already exists.")
        # Block until finish_request() is called for this request ID
        event.wait()
        return self.values.pop(request_id)

    def finish_request(self, request_id: str, value: Any) -> None:
        event = self.events.pop(request_id, None)
        if event is None:
            raise ValueError(f"Request {request_id} does not exist.")
        self.values[request_id] = value
        event.set()


MANAGER = ThirdPartyProcessManager()


# This is assuming that you can specify the callback URL per-request, otherwise
# you may have to get the request ID from the body of the request or something
@app.route('/webhook/<request_id>', methods=['POST'])
def webhook(request_id: str) -> str:
    MANAGER.finish_request(request_id, request.json)
    return "webhook method has executed"


# Somehow in here you need to create or generate a unique identifier for this
# request--this may come from the third-party provider, or you can generate one
# yourself. There are two main paths I see here:
# - If you can specify the callback/webhook URL in each call, you can just pass them
#   <base>/webhook/<request_id> and use that to identify which request is being
#   responded to in the webhook.
# - If the provider gives you a request ID, you can return it from this function
#   then retrieve it from the request body in the webhook route
# For now, I'll assume the first situation but you should be able to implement the second
# with minimal changes
def third_party_api_wrapper() -> str:
    request_id = uuid.uuid4().hex
    url = 'https://api.thirdparty.com'
    # Just an example, I don't know how the third party API you're working with works
    response = requests.post(
        url,
        json={"callback_url": f"{BASE_URL}/webhook/{request_id}"}
    )
    # NOTE: unrelated to the problem at hand, you should always check for errors
    # in HTTP responses. This method is an easy way provided by requests to raise
    # for non-success status codes.
    response.raise_for_status()
    return request_id


@app.route('/start', methods=['POST'])
def start() -> Any:
    request_id = third_party_api_wrapper()
    result = MANAGER.wait_for_request(request_id)
    return result
If you want to run the example fully locally to test it, do the following:
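Comment out the requests.post(...) call and the response.raise_for_status() line in third_party_api_wrapper(), which actually make the external API call
Add a print statement in start(), right after request_id = third_party_api_wrapper(), so that you can get the ID of the "in flight" request. E.g. print("Request ID", request_id)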
In one terminal, run the app by pasting the above code into an app.py file and running flask run in that directory.
In another terminal, start the process via:
curl -XPOST http://localhost:5000/start
Copy the request ID that will be logged in the first terminal that's running the server.
In a third terminal, complete the process by calling the webhook:
curl -XPOST http://localhost:5000/webhook/<your_request_id> -H Content-Type:application/json -d '{"foo":"bar"}'
You should see {"foo":"bar"} as the response in the second terminal that made the /start request.
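The same hand-off can also be exercised without HTTP at all. The sketch below is a simplified, standard-library-only variant of the manager idea above; PendingResults and its method names are hypothetical and not part of the answer's code. It differs in two ways that are assumptions on my part about what might be useful: the event is registered before the external call would be made (so a very fast webhook still finds it), and waiting takes a timeout so a missing webhook does not block forever.

```python
import threading


class PendingResults:
    """Hand a value from one thread (the webhook) to another (the waiter)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}
        self._values = {}

    def register(self, request_id):
        # Call this BEFORE triggering the third party, so that an early
        # webhook still finds the entry.
        with self._lock:
            if request_id in self._events:
                raise ValueError(f"Request {request_id} already exists.")
            self._events[request_id] = threading.Event()

    def wait(self, request_id, timeout=None):
        event = self._events[request_id]
        event.wait(timeout)
        with self._lock:
            self._events.pop(request_id, None)
            if request_id not in self._values:
                raise TimeoutError(f"No result for {request_id}.")
            return self._values.pop(request_id)

    def finish(self, request_id, value):
        with self._lock:
            event = self._events.get(request_id)
            if event is None:
                raise ValueError(f"Request {request_id} does not exist.")
            self._values[request_id] = value
        event.set()


if __name__ == "__main__":
    pending = PendingResults()
    pending.register("abc")
    # Simulate the webhook arriving from another thread shortly afterwards.
    threading.Timer(0.1, pending.finish, args=("abc", {"foo": "bar"})).start()
    print(pending.wait("abc", timeout=5))
```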
I hope that's enough to help you get started w/ whatever problem you're trying to solve.
There are a couple of design-y comments I have based on the information provided as well:
As I mentioned before, this will not work if you have more than one instance of the app running at once. This works by storing the state of in-flight requests in a global state inside your python process, so if you have more than one process, they won't all be working and modifying the same state. If you need to run more than one instance of your process, I would use a similar approach with some database backend to store the shared state (assuming your requests are pretty short-lived, Redis might be a good choice here, but once again it'll depend on exactly what you're trying to do).
Even if you do only have one instance of the app running, flask is capable of being run in a variety of different server contexts--for example, the server might be using threads (the default), greenlets via gevent or a similar library, or multiple processes, or maybe some other approach entirely in order to handle multiple requests concurrently. If you're using an approach that creates multiple processes, you should be able to use the utilities provided by the multiprocessing module to implement the same approach as I've given above.
This approach probably will work just fine for something where the difference in time between the API call and the webhook response is small (on the order of a couple of seconds at most I'd say), but you should be wary of using this approach for something where the difference in time can be quite large. If the connection between the client and your server fails, they'll have to make another request and run the long-running process that your third party is completing for you again. Some proxies and load balancers may also have time out behavior that could terminate the request after a certain amount of time even if nothing goes wrong in the connection between your server and the client making a request to it. An alternative approach would be for your /start endpoint to return quickly and give the client a request_id that they could poll for updates. As an example, AWS Athena's API is structured like this--there is a StartQueryExecution method, and separate GetQueryExecution and GetQueryResults methods that the client makes requests to check the status of a query and retrieve the results respectively (there are also other methods like StopQueryExecution and GetQueryRuntimeStatistics available as well). You can check out the documentation here.
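The start-then-poll alternative from the last point can be sketched with nothing but the standard library. JobStore and its method names are hypothetical; the comments indicate which HTTP handler would call each method. Like the Event-based approach, this in-memory version only works with a single process.

```python
import threading
import uuid


class JobStore:
    """In-memory job registry for a start-then-poll API (single process only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def start(self):
        """What a /start handler would do: create a job, return its ID at once."""
        request_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[request_id] = {"status": "pending", "result": None}
        return request_id

    def complete(self, request_id, result):
        """What the webhook handler would do when the third party calls back."""
        with self._lock:
            job = self._jobs[request_id]
            job["status"] = "done"
            job["result"] = result

    def status(self, request_id):
        """What a polling endpoint would return; None means the ID is unknown."""
        with self._lock:
            job = self._jobs.get(request_id)
            return dict(job) if job is not None else None


if __name__ == "__main__":
    store = JobStore()
    rid = store.start()
    print(store.status(rid))   # pending
    store.complete(rid, {"foo": "bar"})
    print(store.status(rid))   # done, with the result attached
```

The client keeps the request ID and polls until the status changes, so no HTTP connection has to stay open while the third party works.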
I know that's a lot of info, but I hope it helps. Happy to update the answer w/ more specific info if you'll provide some more details about your use-case.
I am fairly new to flask and therefore I have a question regarding the logic behind my code and if my thought process makes sense.
I have an old python script, that takes a lot of data, processes it and then produces a plot with matplotlib. This works fine.
Now I want to build a web application, where a user selects specific input parameters, clicks the submit button, my server checks (with the help of sqlite) if there already exists a plot/the data to make the plot with those parameters, if yes then make it downloadable by the user.
If this plot/the data does not exist, my flask application calls my python script and the new plot/data for it will get created, uploaded to sqlite and then the user can download it from the web application.
As you can see in my code so far the python script is an external one and my plan is to not include it in my views from my flask app, does this make sense? Should I call an external script here or just copy the code directly to my views? (I wanted to avoid it so far since it's a pretty big script)
My logic so far looks like this:
# this part works well so far, I get the user input here and redirect it to the specific page
@app.route('/plot', methods = ['GET','POST'])
def get_plot():
    if request.method == 'POST':
        input1 = request.form['input1']
        input2 = request.form['input2']
        input3 = request.form['input3']
        return redirect('plot/{}/{}/{}/'.format(input1, input2, input3))
    else:
        return render_template('plot.html')

# here it gets a bit tricky for me
@app.route('/plot/<input1>/<input2>/<input3>/')
def create_plot(input1='1', input2='2', input3='3'):
    try:
        db = get_db(DB_PLOT)
        cur = db.execute('SELECT * FROM {} WHERE param1 = {} AND param2={}'.format(input1, input2, input3)) # get all the data from the table
    except:
        return "Plot not found !"
        # CALL THE EXTERNAL PYTHON SCRIPT HERE?
    cur = db.execute('SELECT * FROM {} WHERE param1 = {} AND param2={}'.format(input1, input2, input3)) # python script should have updated the database, so i can call the data here
    data = cur.fetchall()
    return render_template('show_plot.html', data=data)
Furthermore, I have another question:
As said my python script, which I've only ever used so far on its own, takes raw data, manipulates it and then creates a plot with matplotlib.
When I want to implement it in my web application, should I still create the plot with the python script, upload the plot to sqlite and then get the image from sqlite with the web application OR should I just upload the manipulated data, then download this data from sqlite and create the plot with flask?
In the end, I want to make it possible for the user to download the plot as .jpg and .pdf file.
Thank you so much!
You can use the subprocess module to achieve this. Depending on how long your script takes to generate the plot, you should consider returning a page that asks the user to refresh after some time until the plot is available (you could also auto-refresh periodically using JavaScript). A view function that takes too long can be a problem because the number of threads Flask uses to handle requests is limited; this could make your application unavailable if too many users are generating plots at the same time.
Returning before calling your script won't work obviously.
return "Plot not found !"
# CALL THE EXTERNAL PYTHON SCRIPT HERE?
You also need to insert an entry in the database before starting to work on the plot generation so no other user can run your script again with the same parameters.
Your except case (when the plot is not in the database already) would look something like this (probably vulnerable to SQL injection):
# write entry to the database with parameters but without plot
db.execute('INSERT INTO {} (param1, param2, param3) VALUES ({}, {}, {})'.format(table_name, input1, input2, input3))
# start a separate process which does its calculations and
# updates the row we just inserted with the plot when finished
# (requires: from subprocess import Popen)
p = Popen(['/path/to/script.py', input1, input2, input3], stdin=None, stdout=None, stderr=None, close_fds=True)
return "Plot is being generated ..., please refresh page until the plot is available"
The script generating the plot would then update the entry as follows (probably vulnerable to SQL injection):
db.execute('UPDATE {} SET plot_blob = {} WHERE param1 = {} AND param2 = {} AND param3 = {}'.format(table_name, binary_image_data, input1, input2, input3))
NOTES:
You can build the URL with url_for which might be neater:
return redirect(url_for('create_plot', input1=input1, input2=input2, input3=input3))
Your SQL query is probably vulnerable to SQL injection, although the library you're using might handle this. I don't know which one it is.
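Since the question mentions sqlite, the injection concern can be addressed with sqlite3's `?` placeholders, which pass values separately from the SQL text. The sketch below is only an illustration: the table name "plots", the ALLOWED_TABLES whitelist, and the helper names are all hypothetical, and the column names are taken from the snippets above. Placeholders cannot stand in for a table name, so that part is checked against the whitelist instead.

```python
import sqlite3

ALLOWED_TABLES = {"plots"}  # hypothetical whitelist of valid table names


def _check_table(table):
    # Identifiers cannot be bound with "?", so validate them explicitly.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"Unknown table: {table!r}")


def find_plot(db, table, param1, param2, param3):
    """Return the matching row, or None if no entry exists yet."""
    _check_table(table)
    cur = db.execute(
        f"SELECT plot_blob FROM {table} "
        "WHERE param1 = ? AND param2 = ? AND param3 = ?",
        (param1, param2, param3),
    )
    return cur.fetchone()


def reserve_plot(db, table, param1, param2, param3):
    """Insert a placeholder row (no plot yet) before starting the script."""
    _check_table(table)
    db.execute(
        f"INSERT INTO {table} (param1, param2, param3) VALUES (?, ?, ?)",
        (param1, param2, param3),
    )
    db.commit()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE plots (param1, param2, param3, plot_blob)")
    reserve_plot(db, "plots", "1", "2", "3")
    print(find_plot(db, "plots", "1", "2", "3"))
```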
I am trying to set up an acceptance test harness for a Flask app, and I am currently struggling to wait for the app to start before making calls.
The following construct works fine:
class SpinUpTests(unittest.TestCase):
    def tearDown(self):
        super().tearDown()
        self.stubby_server.kill()
        self.stubby_server.communicate()

    def test_given_not_yet_running_when_created_without_config_then_started_on_default_port(self):
        self.not_yet_running(5000)
        self.stubby_server = subprocess.Popen(['python', '../../app/StubbyServer.py'], stdout=subprocess.PIPE)
        time.sleep(1)#<--- I would like to get rid of this
        self.then_started_on_port(5000)
I would like to wait on stdout for:
Running on http://127.0.0.1:[port]/ (Press CTRL+C to quit)
I tried
for line in self.stubby_server.stdout.readline()
but readline() never finishes, even though I already see the output in the test output window.
Any ideas how I can wait for the flask app to start without having to use an explicit sleep()?
The retry package can help with this problem. You specify the call you want to retry, the exception you want to retry on, and timing parameters that control how the retries are spaced. It's pretty well documented.
Here is an example of how I solved this in one of the projects I was working on here
Here is the snippet of code that will help you, in case that link does not work:
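import subprocess

import requests
from requests.exceptions import RequestException
from retry.api import retry_call

@classmethod
def _start_app_locally(cls):
    subprocess.Popen(["fake-ubersmith"])
    # keep calling the status endpoint until it stops raising RequestException
    retry_call(
        requests.get,
        fargs=["{}/status".format(cls.endpoint)],
        exceptions=RequestException,
        delay=1
    )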
As you can see I just tried to hit my endpoint with a get using requests (the fargs are the arguments passed to requests.get as you can see it calls back the method you pass to retry_call), and based on the RequestException I was expecting, I would retry with a 1 second delay.
Finally, "fake-ubersmith" is the command that will run your server, which is ultimately your similar command of: 'python', '../../app/StubbyServer.py'
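If adding a dependency is not desirable, the same "poll until it answers" idea can be sketched with the standard library alone. This is a swapped-in technique, not the retry package: wait_for_port is a hypothetical helper that repeatedly tries to open a TCP connection until the server accepts one or a deadline passes. It only shows that something is listening on the port, not that the app is fully initialised.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); wait briefly and try again.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Port 5000 matches the default port used in the question's test.
    print(wait_for_port("127.0.0.1", 5000, timeout=2.0))
```

In the test above, a call such as wait_for_port("127.0.0.1", 5000) could take the place of time.sleep(1), with the test failing if it returns False.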
I am playing around with a small project to get a better understanding of web technologies.
One requirement is that if multiple clients have access to my site and one makes a change, all the others should be notified. From what I have gathered, Server-Sent Events seem to do what I want.
However when I open my site in both Firefox and Chrome and try to send an event, only one of the browsers gets it. If I send an event again only one of the browsers gets the new event, usually the browser that did not get event number one.
Here are the relevant code snippets.
Client:
console.log("setting sse handlers")
viewEventSource = new EventSource("{{ url_for('viewEventRequest') }}");
viewEventSource.onmessage = handleViewEvent;
function handleViewEvent(event){
    console.log("called handle view event")
    console.log(event);
}
Server:
@app.route('/3-3-3/view-event')
def view_event_request():
    return Response(handle_view_event(), mimetype='text/event-stream')

def handle_view_event():
    while True:
        for message in pubsub_view.listen():
            if message['type'] == 'message':
                data = 'retry: 1\n'
                data += 'data: ' + message['data'] + '\n\n'
                return data

@app.route('/3-3-3/test')
def test():
    red.publish('view-event', "This is a test message")
    return make_response("success", 200)
My question is, how do I get the event send to all connected clients and not just one?
Here are some gists that may help (I've been meaning to release something like 'flask-sse' based on 'django-sse'):
https://gist.github.com/3680055
https://gist.github.com/3687523
also useful - https://github.com/jkbr/chat/blob/master/app.py
The first thing I notice about your code is that 'handle_view_event' is not a generator.
Though it is in a 'while' loop, the use of 'return' will always exit the function the first time we return data; a function can only return once. I think you want it to be 'yield' instead.
In any case, the above links should give you an example of a working setup.
As Anarov says, websockets and socket.io are also an option, but SSE should work anyway. I think socket.io supports using SSE if ws are not needed.
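The return-versus-yield point can be shown without Redis or Flask. In this sketch, event_stream is a generator that yields one SSE-formatted chunk per pub/sub item of type 'message'. Both function names are hypothetical, the messages are plain dicts standing in for what pubsub.listen() produces, and the payload is assumed to be a string. The retry value of 1000 ms is an arbitrary choice, not taken from the question.

```python
def format_sse(data, retry_ms=None):
    """Format one Server-Sent Events message (terminated by a blank line)."""
    lines = []
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    # Multi-line payloads need one "data:" field per line.
    lines.extend(f"data: {part}" for part in str(data).split("\n"))
    return "\n".join(lines) + "\n\n"


def event_stream(messages):
    """Yield an SSE chunk for every item of type 'message'."""
    for message in messages:
        if message["type"] == "message":
            # yield keeps the generator alive for the next message,
            # whereas return would end the stream after the first one.
            yield format_sse(message["data"], retry_ms=1000)


if __name__ == "__main__":
    fake_messages = [
        {"type": "subscribe", "data": 1},
        {"type": "message", "data": "This is a test message"},
    ]
    for chunk in event_stream(fake_messages):
        print(repr(chunk))
```

A generator like this can be handed to Response(..., mimetype='text/event-stream'), and it keeps producing chunks for as long as the message source does.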
Using: Django with Python
Overall objective: Call a function which processes video conversion (internally makes a curl command to the media server) and should immediately return back to the user.
Using message queue would be an overkill for the app.
So I decided to use threads. I have written a class that overrides the __init__ and run methods and calls the curl command:
class process_video(Thread):
    def __init__ (self,video_id,video_title,fileURI):
        Thread.__init__(self)
        self.video_id = video_id
        self.video_title = video_title
        self.fileURI = fileURI
        self.status =-1

    def run(self):
        logging.debug("FileURi" + self.fileURI)
        curlCmd = "curl --data-urlencode \"fileURI=%s\" %s/finalize"% (self.fileURI, settings.MEDIA_ROOT)
        logging.debug("Command to be executed" + str(curlCmd))
        #p = subprocess.call(str(curlCmd), shell=True)
        output_media_server,error = subprocess.Popen(curlCmd,stdout = subprocess.PIPE).communicate()
        logging.debug("value returned from media server:")
        logging.debug(output_media_server)
And I instantiate this class from another function called createVideo,
like this: success = process_video(video_id, video_title, fileURI)
Problem:
The user gets redirected back to the other view from the createVideo and the processVideo gets called, however for some reason the created thread (process_video) doesn't wait for the output from the media server.
I wouldn't rely on threads being executed correctly within web applications. Depending on the web server's MPM, the process that executes the request might get killed after a request is done (I guess).
I'd recommend to make the media server request synchronously but let the media server return immediately after it started the encoding without problems (if you have control over its source code). Then a background process (or cron) could poll for the result regularly. This is only one solution - you should provide more information about your infrastructure (e.g. do you control the media server?).
Also check the duplicates in another question's comments for some answers about using task queues in such a scenario.
BTW I assume that no exception occurs in the background thread?!
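On the question of exceptions in the background thread, two things in the posted code may be worth checking. The post does not show the actual error, so this is only a guess. A Thread subclass runs its run() method only after start() is called, and on POSIX systems Popen given a single command string without shell=True treats the whole string as the program name. The sketch below, with a hypothetical CommandThread class and a harmless Python command instead of curl, shows a variant that avoids both.

```python
import subprocess
import sys
import threading


class CommandThread(threading.Thread):
    """Run a command in the background and keep its output and exit status."""

    def __init__(self, command):
        super().__init__()
        self.command = command
        self.output = None
        self.status = -1

    def run(self):
        # An argument list avoids needing shell=True for a command string.
        proc = subprocess.Popen(
            self.command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        self.output, _ = proc.communicate()  # blocks until the command exits
        self.status = proc.returncode


if __name__ == "__main__":
    worker = CommandThread([sys.executable, "-c", "print('converted')"])
    worker.start()  # without start(), run() is never executed
    worker.join()   # only for the demo; a view would return without joining
    print(worker.status, worker.output)
```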
Here is what I did to get around the issue I was facing.
I used django-piston to create an API that calls process_video with the parameters passed as GET; I was getting a 403 CSRF error when I tried to send the parameters as POST.
From the createVideo function I then called the API like this:
cmd = "curl \"%s/api/process_video/?video_id=%s&fileURI=%s&video_title=%s\" > /dev/null 2>&1 &" %(settings.SITE_URL, str(video_id),urllib.quote(fileURI),urllib.quote(video_title))
and this worked.
I feel it would have been better if I could have got the session_id and POST parameters to work. I'm not sure how to get past the CSRF check.