I have two functions, A and B, in Flask:
#app.route("/")
def index():
session["var"] = 0
return render_template("index.html")
#app.route("/A")
def A():
session["var"] += 1
return {}
#app.route("/B")
def B():
i = 0
while(i < 300):
time.sleep(1)
print(session["var"])
i += 1
return {}
On the client side, I call function A, wait for its response, and then call function B without waiting for its response.
A_response = await GET("A");
B_response = GET("B");
The problem is that if I perform the operation above twice on the client side, the server output is always 1 for the first call of B and 2 for the second call of B. What I want to achieve is to get 2 as the output for both calls of B (of course, after I have called both).
What I tried to do:
I set session as a global variable on the server side instead of importing it from the Flask package. It worked locally with a single process, but it didn't work in the cloud; the problem also got worse, as function A no longer recognized session["var"].
I thought of using a database, as that would solve the problem I described here, but I don't like this solution because I would have to deal with other problems that flask.session has already solved for me, such as:
It adds complexity to identifying a session between the server and the browser: I would need to recognize each user uniquely and fetch their information from the database somehow.
I would have to clear the database every now and then, and it's not clear to me when I should do it. Intuitively, the session should be cleared from the database when the user closes the browser (which is what happens with flask.session), but I don't know exactly how to implement that.
If there is a way to make flask.session solve the described problem without replacing it with a database, that would be great!
I think the best solution for your problem is to use a database to handle sessions.
As for the issues you mentioned with using a database for sessions:
You can generate a unique random ID in the index function, which will identify your user.
You could use a WebSocket (see e.g. https://flask-socketio.readthedocs.io/en/latest/#connection-events). You can register a disconnect handler function and clear the session there.
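Here is a minimal sketch of those two ideas together, assuming Flask-SocketIO; the in-memory user_state dict stands in for whatever database table you end up using, and the route bodies are illustrative only:

import uuid
from flask import Flask, session
from flask_socketio import SocketIO

app = Flask(__name__)
app.secret_key = "change-me"
socketio = SocketIO(app)

# Stand-in for the database table that would hold per-user session data.
user_state = {}

@app.route("/")
def index():
    # Unique random ID identifying this user; the cookie only carries the ID,
    # while the actual data lives server-side.
    user_id = uuid.uuid4().hex
    session["user_id"] = user_id
    user_state[user_id] = {"var": 0}
    return "ok"

@socketio.on("disconnect")
def on_disconnect():
    # Runs when the browser's socket goes away (e.g. the tab is closed),
    # so this is a reasonable place to drop the server-side session data.
    user_state.pop(session.get("user_id"), None)

if __name__ == "__main__":
    socketio.run(app)

With server-side storage like this, concurrent requests read and write the same record instead of each carrying its own copy of the cookie data, which is why both calls to B could then observe the updated value.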
Related
I am working on a Python Flask app, and the main method start() calls an external API (third_party_api_wrapper()). That external API has an associated webhook (webhook()) that receives the output of that external API call (note that the output webhook() receives is actually different from the response returned by third_party_api_wrapper()).
The main method start() needs the result of webhook(). How do I make start() wait for webhook() to be executed? And how do we pass the returned value of webhook() back to start()?
Here is a minimal code snippet to capture the scenario.
import requests
from flask import Flask

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    return "webhook method has executed"

# this method has a webhook that calls webhook() after this method has executed
def third_party_api_wrapper():
    url = 'https://api.thirdparty.com'
    response = requests.post(url)
    return response

# this is the main entry point
@app.route('/start', methods=['POST'])
def start():
    third_party_api_wrapper()
    # The rest of this code depends on the output of webhook().
    # How do we wait until webhook() is called, and how do we access the returned value?
The answer to this question really depends on how you plan on running your app in production. It's much simpler if we make the assumption that you only plan to have a single instance of your app running at once (as opposed to multiple behind a load balancer, for example), so I'll make that assumption first to give you a place to start, and comment on a more "production-ready" solution afterwards.
A big thing to keep in mind when writing a web application is that you have to understand how you want the outside world to interact with your app. Do you expect to have the /start endpoint called only once at the beginning of your app's lifetime, or is this a generic endpoint that may start any number of background processes that you want the caller of each to wait for? Or, do you want the behavior where any caller after the first one will wait for the same process to complete as the first one? I can't answer these questions for you, it depends on the use-case you're trying to implement. I'll give you a relatively simple solution that you should be able to modify to fulfill any of the ones I mentioned though.
This solution will use the Event class from the threading standard library module; I added some comments to clarify which parts you may have to change depending on the specifics of the API you're calling and stuff like that.
import threading
import uuid
from typing import Any

import requests
from flask import Flask, Response, request

# The base URL for your app; if you're running it locally this should be fine,
# however external providers can't communicate with your `localhost`, so you'll
# need to change this for your app to work end-to-end.
BASE_URL = "http://localhost:5000"

app = Flask(__name__)


class ThirdPartyProcessManager:
    def __init__(self) -> None:
        self.events = {}
        self.values = {}

    def wait_for_request(self, request_id: str) -> Any:
        event = threading.Event()
        actual_event = self.events.setdefault(request_id, event)
        if actual_event is not event:
            raise ValueError(f"Request {request_id} already exists.")
        event.wait()
        return self.values.pop(request_id)

    def finish_request(self, request_id: str, value: Any) -> None:
        event = self.events.pop(request_id, None)
        if event is None:
            raise ValueError(f"Request {request_id} does not exist.")
        self.values[request_id] = value
        event.set()


MANAGER = ThirdPartyProcessManager()


# This is assuming that you can specify the callback URL per-request; otherwise
# you may have to get the request ID from the body of the request or something.
@app.route('/webhook/<request_id>', methods=['POST'])
def webhook(request_id: str) -> Response:
    MANAGER.finish_request(request_id, request.json)
    return "webhook method has executed"


# Somewhere in here you need to create or generate a unique identifier for this
# request--this may come from the third-party provider, or you can generate one
# yourself. There are two main paths I see here:
# - If you can specify the callback/webhook URL in each call, you can just pass them
#   <base>/webhook/<request_id> and use that to identify which request is being
#   responded to in the webhook.
# - If the provider gives you a request ID, you can return it from this function
#   and then retrieve it from the request body in the webhook route.
# For now, I'll assume the first situation, but you should be able to implement the
# second with minimal changes.
def third_party_api_wrapper() -> str:
    request_id = uuid.uuid4().hex
    url = 'https://api.thirdparty.com'
    # Just an example, I don't know how the third-party API you're working with works
    response = requests.post(
        url,
        json={"callback_url": f"{BASE_URL}/webhook/{request_id}"}
    )
    # NOTE: unrelated to the problem at hand, but you should always check for errors
    # in HTTP responses. This method is an easy way provided by requests to raise
    # for non-success status codes.
    response.raise_for_status()
    return request_id


@app.route('/start', methods=['POST'])
def start() -> Response:
    request_id = third_party_api_wrapper()
    result = MANAGER.wait_for_request(request_id)
    return result
If you want to run the example fully locally to test it, do the following:
Comment out the part of third_party_api_wrapper that actually makes the external API call (the requests.post(...) call and response.raise_for_status()), leaving the request ID generation and the return in place.
Add a print statement after the request ID is generated, so that you can get the ID of the "in flight" request, e.g. print("Request ID", request_id).
In one terminal, run the app by pasting the above code into an app.py file and running flask run in that directory.
In another terminal, start the process via:
curl -XPOST http://localhost:5000/start
Copy the request ID that will be logged in the first terminal that's running the server.
In a third terminal, complete the process by calling the webhook:
curl -XPOST http://localhost:5000/webhook/<your_request_id> -H Content-Type:application/json -d '{"foo":"bar"}'
You should see {"foo":"bar"} as the response in the second terminal that made the /start request.
I hope that's enough to help you get started w/ whatever problem you're trying to solve.
There are a couple of design-y comments I have based on the information provided as well:
As I mentioned before, this will not work if you have more than one instance of the app running at once. This works by storing the state of in-flight requests in global state inside your Python process, so if you have more than one process, they won't all be working with and modifying the same state. If you need to run more than one instance of your process, I would use a similar approach with some database backend to store the shared state (assuming your requests are pretty short-lived, Redis might be a good choice here, but once again it'll depend on exactly what you're trying to do); a rough sketch of that idea follows below.
Even if you do only have one instance of the app running, flask is capable of being run in a variety of different server contexts--for example, the server might be using threads (the default), greenlets via gevent or a similar library, or multiple processes, or maybe some other approach entirely in order to handle multiple requests concurrently. If you're using an approach that creates multiple processes, you should be able to use the utilities provided by the multiprocessing module to implement the same approach as I've given above.
This approach will probably work just fine for something where the difference in time between the API call and the webhook response is small (on the order of a couple of seconds at most, I'd say), but you should be wary of using it where that difference can be quite large. If the connection between the client and your server fails, they'll have to make another request and run the long-running process that your third party is completing for you all over again. Some proxies and load balancers may also have timeout behavior that terminates the request after a certain amount of time, even if nothing goes wrong in the connection between your server and the client making a request to it. An alternative approach would be for your /start endpoint to return quickly and give the client a request_id that they could poll for updates (see the second sketch below). As an example, AWS Athena's API is structured like this--there is a StartQueryExecution method, and separate GetQueryExecution and GetQueryResults methods that the client calls to check the status of a query and retrieve the results respectively (there are also other methods like StopQueryExecution and GetQueryRuntimeStatistics available as well). You can check out the documentation here.
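To make the shared-state point concrete, here is a rough sketch of what a Redis-backed version of the waiting logic could look like. This is an assumption-heavy illustration, not part of the original code: the key naming, the 300-second timeout, and the use of a blocking list pop are arbitrary choices.

import json
import redis

r = redis.Redis()

def wait_for_request(request_id: str, timeout: int = 300):
    # Block until the webhook pushes a value onto this request's list;
    # this works no matter which app instance handles the webhook call.
    item = r.blpop(f"webhook-result:{request_id}", timeout=timeout)
    if item is None:
        raise TimeoutError(f"No webhook received for request {request_id}")
    _key, value = item
    return json.loads(value)

def finish_request(request_id: str, value) -> None:
    # Called from the webhook route; any instance can publish the result.
    r.rpush(f"webhook-result:{request_id}", json.dumps(value))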
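And here is a sketch of the polling-style alternative, where /start returns immediately and the client checks back later. The /status route, the in-memory RESULTS dict, and the response shapes are illustrative only (these routes would replace the /start and /webhook routes from the code above); in a real deployment you would back RESULTS with a shared store as discussed earlier.

RESULTS = {}  # request_id -> webhook payload, once it has arrived

@app.route('/start', methods=['POST'])
def start():
    request_id = third_party_api_wrapper()
    # Return right away; the client polls /status/<request_id> for the outcome.
    return {"request_id": request_id, "status": "PENDING"}

@app.route('/status/<request_id>', methods=['GET'])
def status(request_id: str):
    if request_id not in RESULTS:
        return {"request_id": request_id, "status": "PENDING"}
    return {"request_id": request_id, "status": "DONE", "result": RESULTS[request_id]}

@app.route('/webhook/<request_id>', methods=['POST'])
def webhook(request_id: str):
    RESULTS[request_id] = request.json
    return "webhook method has executed"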
I know that's a lot of info, but I hope it helps. Happy to update the answer w/ more specific info if you'll provide some more details about your use-case.
Has anyone gotten parallel tests to work in Django with Elasticsearch? If so, can you share what configuration changes were required to make it happen?
I've tried just about everything I can think of to make it work, including the solution outlined here. Taking inspiration from how Django itself handles the parallel DBs, I have currently created a custom ParallelTestSuite that overrides init_worker to iterate through each index/doctype and change the index names, roughly as follows:
_worker_id = 0

def _elastic_search_init_worker(counter):
    global _worker_id

    with counter.get_lock():
        counter.value += 1
        _worker_id = counter.value

    for alias in connections:
        connection = connections[alias]
        settings_dict = connection.creation.get_test_db_clone_settings(_worker_id)
        # connection.settings_dict must be updated in place for changes to be
        # reflected in django.db.connections. If the following line assigned
        # connection.settings_dict = settings_dict, new threads would connect
        # to the default database instead of the appropriate clone.
        connection.settings_dict.update(settings_dict)
        connection.close()

    ### Everything above this is from the Django version of this function ###

    # Update index names in doctypes
    for doc in registry.get_documents():
        doc._doc_type.index += f"_{_worker_id}"

    # Update index names for indexes and create new indexes
    for index in registry.get_indices():
        index._name += f"_{_worker_id}"
        index.delete(ignore=[404])
        index.create()

    print(f"Started thread # {_worker_id}")
This seems to generally work, however, there's some weirdness that happens seemingly randomly (i.e. running the test suite again doesn't reliably reproduce the issue and/or the error messages change). The following are the various errors I've gotten and it seems to randomly fail on one of them each test run:
A 404 raised when trying to create the index in the function above (I've confirmed that the 404 comes back from the PUT request; however, the Elasticsearch server logs say the index was created without issue)
A 500 when trying to create the index, although this one hasn't happened in a while, so I think it was fixed by something else
Query responses sometimes do not have an items dictionary value inside the _process_bulk_chunk function from the elasticsearch library
I'm thinking that there's something weird going on at the connection layer (like somehow the connections between Django test runner processes are getting the responses mixed up?) but I'm at a loss as to how that would be even possible since Django uses multiprocessing to parallelize the tests and thus they are each running in their own process. Is it somehow possible that the spun-off processes are still trying to use the connection pool of the original process or something? I'm really at a loss of other things to try from here and would greatly appreciate some hints or even just confirmation that this is in fact possible to do.
I'm thinking that there's something weird going on at the connection layer (like somehow the connections between Django test runner processes are getting the responses mixed up?) but I'm at a loss as to how that would be even possible since Django uses multiprocessing to parallelize the tests and thus they are each running in their own process. Is it somehow possible that the spun-off processes are still trying to use the connection pool of the original process or something?
This is exactly what is happening. From the Elasticsearch DSL docs:
Since we use persistent connections throughout the client it means that the client doesn’t tolerate fork very well. If your application calls for multiple processes make sure you create a fresh client after call to fork. Note that Python’s multiprocessing module uses fork to create new processes on POSIX systems.
What I observed happening is that the responses get very weirdly interleaved with a seemingly random client that may have started the request. So a request to index a document might end up with the response to an index-creation request, and the two have very different attributes on them.
The fix is to ensure that each test worker has its own Elasticsearch client. This can be done by creating worker-specific connection aliases and then overwriting the current connection aliases (via the private attribute _using) with the worker-specific ones. Below is a modified version of the code you posted with that change.
_worker_id = 0

def _elastic_search_init_worker(counter):
    global _worker_id

    with counter.get_lock():
        counter.value += 1
        _worker_id = counter.value

    for alias in connections:
        connection = connections[alias]
        settings_dict = connection.creation.get_test_db_clone_settings(_worker_id)
        # connection.settings_dict must be updated in place for changes to be
        # reflected in django.db.connections. If the following line assigned
        # connection.settings_dict = settings_dict, new threads would connect
        # to the default database instead of the appropriate clone.
        connection.settings_dict.update(settings_dict)
        connection.close()

    ### Everything above this is from the Django version of this function ###

    from elasticsearch_dsl.connections import connections

    # each worker needs its own connection to elasticsearch, the ElasticsearchClient uses
    # global connection objects that do not play nice otherwise
    worker_connection_postfix = f"_worker_{_worker_id}"
    for alias in connections:
        connections.configure(**{alias + worker_connection_postfix: settings.ELASTICSEARCH_DSL["default"]})

    # Update index names in doctypes
    for doc in registry.get_documents():
        doc._doc_type.index += f"_{_worker_id}"
        # Use the worker-specific connection
        doc._doc_type._using = doc._doc_type._using + worker_connection_postfix

    # Update index names for indexes and create new indexes
    for index in registry.get_indices():
        index._name += f"_{_worker_id}"
        index._using = index._using + worker_connection_postfix
        index.delete(ignore=[404])
        index.create()

    print(f"Started thread # {_worker_id}")
I've been working on a web app for a while, and this is the first time I've noticed this problem. I think it could be related to how SQLAlchemy sessions are handled, so some clarification in simple terms would be helpful.
My configuration for working with Flask-SQLAlchemy is:
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy(app)
My problem: db.session.commit() sometimes doesn't save changes. I wrote some Flask endpoints which are reached via front-end requests from the user's browser.
In this particular case, I'm editing a hotel "Booking" object, altering its "Rooms" column, which is a Text field.
The function does the following:
1 - Query the Booking object for the dates in the request
2 - Edit the Rooms column of this Booking object
3 - Commit the changes with db.session.commit()
4 - If the user has X functionality active, make some checks by calling a second function:
4.1 - This function makes some checks, then queries and edits another object in the database, different from the "Booking" object I edited previously.
4.2 - At the end of this secondary function I call db.session.commit() (note: these changes always get saved correctly in the database)
4.3 - Return the results to the previous function
5 - Return the results to the front end. Just before this return, I print Booking.Rooms to make sure it looks as it should, and it does. I even tried making a second commit after the print but before the return. But after this, sometimes Booking.Rooms is updated as expected and other times it isn't. I noticed that if I repeat the action several times it eventually works, but since the intermediate function (described in point 4) always saves its changes correctly, this causes an inconsistency in the data and drives me mad, because if I repeat the action after the point-4 procedure has already run correctly, I can't repeat the mod-Rooms action.
So I'm now really confused about whether this is something I don't understand about Flask sessions. From what I understand, each new request to Flask gets an isolated session, right?
I mean, if two concurrent users are storing changes in the database, a db.session.commit() from one of them won't commit the changes from the other one, right?
In the same way, if I call db.session.commit() in one request, those changes are stored in the database, and if after that (in the same request) I keep modifying things, it's like another session, right? The committed changes are already safely stored, and I can still use the previous objects for further modifications?
Anyway, none of this should be a problem, because after the commit() I print Booking.Rooms and it looks as expected... And sometimes it gets stored correctly and sometimes it doesn't...
Also note: when I return this result to the client, the client immediately makes a second request to the server for the updated Booking data, and that data comes back without the expected changes committed... I suppose Flask handled the commit() before it got the second request (otherwise it wouldn't have returned the result earlier...).
Could this be a limitation of the Flask development server, which can't correctly handle many requests, and that wouldn't happen when deployed with gunicorn?
Any hint or clarification about sessions would be nice, because this is pretty strange behaviour, especially that it sometimes works and sometimes doesn't...
As requested, here is the code. I know it's not possible to reproduce; there is a lot of setup behind it and it would need a lot of data to work as intended under the same circumstances as my case, but it should give an overview of what the functions look like and where the commits I mentioned above are. Any idea of where the problem could be is very helpful.
# Main function hit by the frontend
@users.route('/url_endpoint1', methods=['POST'], strict_slashes=False)
@login_required
def add_room_to_booking_Api():
    try:
        bookingData = request.get_json()
        roomURL = bookingData["roomSafeURL"]
        targetBooking = bookingData["bookingURL"]
        startDate = bookingData["checkInDate"]
        endDate = bookingData["checkOutDate"]
        roomPrices = bookingData["roomPrices"]
        booking = Bookings.query.filter_by(SafeURL=targetBooking).first()
        alojamiento = Alojamientos.query.filter_by(id=booking.CodigoAlojamiento).first()  # owner of the booking
        room = Rooms.query.filter_by(SafeURL=roomURL).first()
        roomsInBooking = ast.literal_eval(booking.Habitaciones)  # I know, should be json.loads() and json.dumps() for better performance probably...
        # if the room is available for the given days, add it to the booking
        if CheckIfRoomIsAvailableForBooking(alojamiento.id, room, startDate, endDate, booking) == "OK":
            roomsInBooking.append({"id": room.id, "Prices": roomPrices, "Guests": []})  # add the new room to the Rooms column of the booking
            booking.Habitaciones = str(roomsInBooking)  # save the new rooms data
            print(booking.Habitaciones)  # check changes applied
            room.ReservaAsociada = booking.id  # associate booking and room
            for ocupante in room.Ocupantes:  # associate people in the room with the booking
                ocupante.Reserva = booking.id
            # db.session.refresh(booking)  # test I made to check if something changed, but it didn't work
            if some_X_function() == True:  # if the user has some functionality enabled
                # db.session.begin()  # another test which didn't work
                RType = WuBook_Rooms.query.filter_by(ParentType=room.Tipo).first()
                RType = [RType]  # convert to list because I reuse the function in cases with multiple types
                resultAdd = function4(RType, booking.Entrada.replace(hour=0, minute=0, second=0), booking.Salida.replace(hour=0, minute=0, second=0))
                if resultAdd["resultado"] == True:  # "resultado": error, "casos": casos
                    return (jsonify({"resultado": "Error", "mensaje": resultAdd["casos"]}))
            print(booking.Habitaciones)  # here I still get the expected result
            db.session.commit()
            # I get this return with everything correct in my frontend, but it is not really stored in the database
            return jsonify({"resultado": "Ok", "mensaje": "Room " + str(room.Identificador) + " added to the booking"})
        else:
            return (jsonify({"resultado": "Error", "mensaje": "Room " + str(room.Identificador) + " not available to book in target dates"}))
    except Exception as e:
        # some error handling which is not getting hit now
        db.session.rollback()
        print(e, ": en linea", lineno())
        excepcion = str((''.join(traceback.TracebackException.from_exception(e).format()).replace("\n", "</br>"), "</br>Excepcion emitida en la línea: ", lineno()))
        sendExceptionEmail(excepcion, current_user)
        return (jsonify({"resultado": "Error", "mensaje": "Error"}))
# function from point 4
def function4(RType, startDate, endDate):
    delta = endDate - startDate
    print(startDate, endDate)
    print(delta)
    for ind_type in RType:
        calendarUpdated = json.loads(ind_type.updated_availability_by_date)
        calendarUpdatedBackup = calendarUpdated
        casos = {}
        actualizar = False
        error = False
        for i in range(delta.days):
            day = (startDate + timedelta(days=i))
            print(day, i)
            diaString = day.strftime("%d-%m-%Y")
            if day >= datetime.now() or diaString == datetime.now().strftime("%d-%m-%Y"):  # only care about present and future dates
                disponibilidadLocal = calendarUpdated[diaString]["local"]
                yaReservadas = calendarUpdated[diaString]["local_booked"]
                disponiblesChannel = calendarUpdated[diaString]["avail"]
                # adjust availability data
                if somecondition == True:
                    actualizar = True
                    casos.update({diaString: "Happened X"})
                else:
                    actualizar = False
                    casos.update({diaString: "Happened Y"})
                    error = "Error"
        if actualizar == True:  # this part of the code is hit normally and the changes are stored correctly
            ind_type.updated_availability_by_date = json.dumps(calendarUpdated)
            wubookproperty = WuBook_Properties.query.filter_by(id=ind_type.PropertyCode).first()
            wubookproperty.SyncPending = True
            ind_type.AvailUpdatePending = True
        elif actualizar == False:  # some error occurred, revert changes
            ind_type.updated_availability_by_date = json.dumps(calendarUpdatedBackup)
    db.session.commit()  # this commit persists
    return {"resultado": error, "casos": casos}  # return to the main function with all these changes stored
I realized nothing was wrong at the session level; it was my fault, in another client-side function that sends a request to update the same data that had just been updated, but with the old data... So in fact, the data was being saved correctly in the database, only to be overwritten a few milliseconds later. It was just a missing return statement in a JavaScript file that would have avoided this outcome...
I have created a small web app with Flask which uses Fabric to call shell scripts that live on remote servers.
The shell script might take a long time to complete. I send the POST request from my browser and wait for the results.
Fabric displays the real-time output in the flask run console, but Flask only returns the values to the browser after the remote script has finished.
How can I make Flask print those real-time values in my browser?
My flask piece:
#app.route("/abc/execute", methods=['POST'])
def execute_me():
value = request.json['value']
result = fabric_call(value)
result = formations(result)
return json.dumps(result)
My fabric piece:
def fabric_call(value):
    with settings(host_string='my server', user='user', password='passwd', warn_only=True):
        proc = run(my shell script)
    return json.dumps(proc)
Update
I tried streaming as well, but it didn't work. The output is shown in my curl POST only after the script has completely executed. What am I missing?
#app.route("/abc/execute", methods=['POST'])
def execute_me():
value = request.json['value']
def generate():
for row in formations(fabric_call(value)):
yield row + '\n'
return Response(generate(), mimetype="text/event-stream")
First of all, you need to make sure your data source (formations()) is actually a generator that yields data as it becomes available. Right now it very much looks like it runs the command and only returns a value once it has completely finished; a sketch of the streaming shape is below.
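Here is a minimal sketch of that "yield as you go" idea. It uses a local subprocess as a stand-in for the remote Fabric call, so the command, route name, and helper function are made up for illustration:

import subprocess
from flask import Flask, Response

app = Flask(__name__)

def stream_command(cmd):
    # Start the process and yield each line of output as soon as it appears,
    # instead of waiting for the whole command to finish.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        yield line
    proc.wait()

@app.route("/stream-demo")
def stream_demo():
    return Response(stream_command(["ping", "-c", "5", "localhost"]),
                    mimetype="text/event-stream")

Your fabric_call()/formations() pair would need to be reshaped along these lines so that rows are yielded while the remote script is still running.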
Also, in case you are using AJAX to call your endpoint, remember that you cannot use e.g. jQuery's $.ajax(); you need to use XHR directly and poll for new data instead of relying on the onreadystatechange event, since you want the data as soon as it's available and not only when the request has finished.
I am playing around with a small project to get a better understanding of web technologies.
One requirement is that if multiple clients have my site open and one makes a change, all the others should be notified. From what I have gathered, Server-Sent Events seem to do what I want.
However, when I open my site in both Firefox and Chrome and try to send an event, only one of the browsers gets it. If I send an event again, only one of the browsers gets the new event, usually the browser that did not get event number one.
Here are the relevant code snippets.
Client:
console.log("setting sse handlers")
viewEventSource = new EventSource("{{ url_for('viewEventRequest') }}");
viewEventSource.onmessage = handleViewEvent;
function handleViewEvent(event){
console.log("called handle view event")
console.log(event);
}
Server:
@app.route('/3-3-3/view-event')
def view_event_request():
    return Response(handle_view_event(), mimetype='text/event-stream')

def handle_view_event():
    while True:
        for message in pubsub_view.listen():
            if message['type'] == 'message':
                data = 'retry: 1\n'
                data += 'data: ' + message['data'] + '\n\n'
                return data

@app.route('/3-3-3/test')
def test():
    red.publish('view-event', "This is a test message")
    return make_response("success", 200)
My question is: how do I get the event sent to all connected clients, and not just one?
Here are some gists that may help (I've been meaning to release something like 'flask-sse' based on 'django-sse'):
https://gist.github.com/3680055
https://gist.github.com/3687523
also useful - https://github.com/jkbr/chat/blob/master/app.py
The first thing I notice about your code is that handle_view_event is not a generator.
Though it is in a while loop, the use of return will always exit the function the first time data is returned; a function can only return once. I think you want it to be yield instead, something like the sketch below.
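As a rough sketch (assuming red is the redis-py client and 'view-event' the channel from your snippet), the handler could look like this; giving each connected client its own pubsub subscription inside the generator also means every client receives every published message, rather than the clients taking turns consuming events from a single shared subscription:

def handle_view_event():
    # One subscription per connected client, created when the SSE request starts.
    pubsub = red.pubsub()
    pubsub.subscribe('view-event')
    for message in pubsub.listen():
        if message['type'] == 'message':
            data = 'retry: 1\n'
            data += 'data: ' + message['data'] + '\n\n'
            yield data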
In any case, the above links should give you an example of a working setup.
As Anarov says, WebSockets and socket.io are also an option, but SSE should work anyway. I think socket.io supports using SSE if WebSockets are not needed.