I have an async HTTP web service using FastAPI. I am running multiple instances of the same service on the same server, each on a different port, and I have an nginx server in front so I can utilise them all. I have a particular resource I need to protect so that only one client accesses it at a time.
@app.get("/do_something")
async def do_something():
    critical_section_here()
I tried to protect this critical section using a file lock like this:
@app.get("/do_something")
async def do_something():
    with FileLock("dosomething.lock"):
        critical_section()
This prevents multiple processes from entering the critical section at the same time. But what I found is that it will actually deadlock. Consider the following sequence of events:
Client 1 connects to port 8000 and enters the critical section.
While client 1 is still using the resource, client 2 is routed to the same port 8000 and tries to acquire the file lock. It cannot, so it keeps trying, and this blocks the event loop of that process: client 1 never gets to run again and so can never release the file lock. This means not only is this process deadlocked, every other server instance waiting on the lock ends up blocked as well.
Is there a way for me to coordinate these processes so that only one of them accesses the critical section? I thought about adding a timeout to the file lock, but I really don't want to reject users; I just want them to wait until it's their turn to enter the critical section.
You can try something like this:
import asyncio
import fcntl
from contextlib import asynccontextmanager
from fastapi import FastAPI

app = FastAPI()

def acquire_lock():
    f = open("/tmp/test.lock", "w")
    fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the exclusive lock is acquired
    return f

@asynccontextmanager
async def lock():
    loop = asyncio.get_running_loop()
    # run the blocking flock() call in a worker thread so the event loop stays free
    f = await loop.run_in_executor(None, acquire_lock)
    try:
        yield
    finally:
        f.close()  # closing the file releases the lock

@app.get("/test/")
async def test():
    async with lock():
        print("Enter critical section")
        await asyncio.sleep(5)
        print("End critical section")
It will basically serialize all your requests. Because the blocking flock() call runs in a worker thread via run_in_executor, the event loop stays free while a request waits for the lock, so the deadlock described in the question cannot occur.
You could use aioredlock.
It allows you to create distributed locks between workers (processes). For more information about its usage, follow the link above.
The redlock algorithm is a distributed lock implementation for Redis. There are many implementations of it in several languages. In this case, this is the asyncio-compatible implementation for Python 3.5+.
Example of usage:
from aioredlock import Aioredlock, LockError

# Create the lock manager first (Aioredlock() with no arguments assumes a
# Redis instance on localhost:6379):
lock_manager = Aioredlock()

# Or you can use the lock as async context manager:
try:
    async with await lock_manager.lock("resource_name") as lock:
        assert lock.valid is True
        # Do your stuff having the lock
        await lock.extend()  # alias for lock_manager.extend(lock)
        # Do more stuff having the lock
    assert lock.valid is False  # lock will be released by context manager
except LockError:
    print('Lock not acquired')
    raise
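To tie this back to the original endpoint, a rough sketch could look like the following. The retry settings and resource name are assumptions: a generous retry_count makes waiting clients keep retrying until it's their turn instead of failing fast.

from aioredlock import Aioredlock, LockError
from fastapi import FastAPI, HTTPException

app = FastAPI()
# Assumed settings: local Redis, and many short retries so callers wait for their turn
lock_manager = Aioredlock(retry_count=600, retry_delay_min=0.1, retry_delay_max=0.3)

@app.get("/do_something")
async def do_something():
    try:
        async with await lock_manager.lock("do_something_resource", lock_timeout=60) as lock:
            critical_section()  # the protected code from the question
    except LockError:
        # still couldn't acquire the lock after all retries
        raise HTTPException(status_code=503, detail="Resource is busy, try again later")
    return {"status": "done"}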
I'm designing an automated test suite to simulate a client which logs in to the backend via a REST API and then opens a websocket communication. I have to test different features over REST and websocket.
Currently I'm performing each websocket test like this:
-The client logs in
-The ws communication starts
-It sends a ws message and awaits a response
-It checks the response schema and asserts the result
-The ws connection is closed
-The test ends
My problem:
When I run multiple websocket tests as I described above, they open and close the websocket communication several times, and my test client ends up being "blacklisted" because of the irregular behaviour; from then on it is not able to reconnect via ws for a considerable time period.
My question:
How do I open a websocket connection and keep it open and active across all my tests?
I'm using pytest with the "requests" module for API calls and the "websockets" module for ws communication.
I've tried to split the Python process into two sub-processes with the "multiprocessing" module, but I'm quite lost here because I'm not yet able to make the pytest process communicate with the websocket process to send the messages and retrieve the responses.
My websocket connection logic is the following code:
async def websocket_connection(device: Device, cmd_list: list[WebsocketMsg] = None):
    init_cmd = WsInitCommand(device)
    cmd_list.insert(0, init_cmd)

    async def wait_for_correct_response(ws_connection, msj_id: str) -> dict:
        response_received = False
        ws_response: dict = {}
        while not response_received:
            ws_response = json.loads(await ws_connection.recv())
            if 'id' in ws_response and ws_response['id'] == msj_id:
                response_received = True
        return ws_response

    async with websockets.connect(init_cmd.url, subprotocols=init_cmd.sub_protocols) as websocket:
        for cmd in cmd_list:
            await websocket.send(str(cmd.message))
            msg_response: dict = await wait_for_correct_response(websocket, cmd.msg_id)
    return True
Use a pytest fixture with session scope to share a singleton websocket connection across tests: https://docs.pytest.org/en/stable/reference/fixtures.html#higher-scoped-fixtures-are-executed-first.
Skip multiprocessing and splitting into two processes; it would add complexity and be tricky to implement.
UPD: answering the comment with this example
import pytest

class WS:
    pass

@pytest.fixture(scope="session")
def websocket_conn():
    ws = WS()
    # establish connection
    print("This is the same object for each test", id(ws))
    yield ws
    # do some cleanup, close connection
    del ws

def test_something(websocket_conn):
    print("Receiving data with", websocket_conn)
    websocket_conn.state = "modified"
    print("Still the same", id(websocket_conn))

def test_something_else(websocket_conn):
    print("Sending data with", websocket_conn)
    print("State preserved:", websocket_conn.state)
    print("Still the same", id(websocket_conn))
tests/test_ws.py::test_something This is the same object for each test 4498209520
Receiving data with <tests.test_ws.WS object at 0x10c1d3af0>
Still the same 4498209520
PASSED
tests/test_ws.py::test_something_else Sending data with <tests.test_ws.WS object at 0x10c1d3af0>
State preserved: modified
Still the same 4498209520
PASSED
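For a real connection, a rough adaptation could look like the sketch below. It assumes pytest-asyncio (a recent version supporting loop_scope, so the session-scoped async fixture and all tests share one event loop) and a placeholder URL:

import pytest_asyncio
import websockets

# loop_scope="session" assumes pytest-asyncio >= 0.24; older versions need a
# session-scoped event_loop fixture override instead
@pytest_asyncio.fixture(scope="session", loop_scope="session")
async def websocket_conn():
    # URL and subprotocols are placeholders; reuse the values from WsInitCommand here
    async with websockets.connect("wss://example.com/ws") as ws:
        yield ws  # the same open connection is handed to every test in the session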
Use case
The client microservice, which calls /do_something, has a timeout of 60 seconds in the request/post() call. This timeout is fixed and can't be changed. So if /do_something takes 10 minutes, it is wasting CPU resources: the client microservice stops waiting for the response after 60 seconds, while /do_something keeps computing for the full 10 minutes, which increases cost. We have a limited budget.
The current code looks like this:
import time
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 120)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    response = some_func(text="hello world")
    return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
Desired Solution
Here /do_something should stop processing the current request after 60 seconds and wait for the next request to process.
If execution of the endpoint is force-stopped after 60 seconds, we should be able to log it with a custom message.
This should not kill the service and should work with multithreading/multiprocessing.
I tried this, but when the timeout happens the server gets killed.
Any solution to fix this?
import logging
import time
import timeout_decorator
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

@timeout_decorator.timeout(seconds=2, timeout_exception=StopIteration, use_signals=False)
def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 30)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    try:
        response = some_func(text="hello world")
    except StopIteration:
        logging.warning('Stopped </do_something> endpoint due to timeout!')
    else:
        logging.info('Completed </do_something> endpoint')
        return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
This answer is not about improving CPU time, as you mentioned in the comments section, but rather explains what would happen if you defined an endpoint with normal def or async def, and provides solutions for when you run blocking operations inside an endpoint.
You are asking how to stop the processing of a request after a while, in order to process further requests. It does not really make sense to start processing a request and then (60 seconds later) stop it as if it never happened, wasting server resources all that time while other requests wait. Instead, you should leave request handling to the FastAPI framework itself. When you define an endpoint with async def, it is run on the main thread (in the event loop); that is, the server processes such requests sequentially, as long as there is no await call inside the endpoint (just as in your case). The keyword await passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine and tells the event loop to let something else run until the awaited task completes (and has returned the result data). The await keyword only works within an async function.
Since you perform a heavy CPU-bound operation inside your async def endpoint (by calling your some_func() function) and never give up control for other requests to run in the event loop (e.g., by awaiting some coroutine), the server will be blocked and wait for that request to be fully processed before moving on to the next one(s). Have a look at this answer for more details.
Solutions
One solution would be to define your endpoint with normal def instead of async def. In brief, when you declare an endpoint with normal def in FastAPI, it is run in an external threadpool that is then awaited, instead of being called directly (which would block the server); hence, FastAPI still works asynchronously.
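For instance, a minimal sketch of this approach, reusing some_func and app from the question's code:

@app.get("/do_something")
def do_something():
    # plain `def`: FastAPI runs this endpoint in its threadpool,
    # so the blocking call below does not block the event loop
    response = some_func(text="hello world")
    return {"response": response}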
Another solution, as described in this answer, is to keep the async def definition and run the CPU-bound operation in a separate thread and await it, using Starlette's run_in_threadpool(), thus ensuring that the main thread (event loop), where coroutines are run, does not get blocked. As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports sequence arguments and keyword arguments". Example:
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, text='Hello world')
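In context, the endpoint from the question could then be sketched as follows (some_func stands in for the CPU-bound work):

from fastapi.concurrency import run_in_threadpool

@app.get("/do_something")
async def do_something():
    # some_func runs in a worker thread; the event loop stays free while we await it
    response = await run_in_threadpool(some_func, text="hello world")
    return {"response": response}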
Since this is about a CPU-bound operation, it would be preferable to run it in a separate process, using ProcessPoolExecutor, as described in the link provided above. In this case, this can be integrated with asyncio, in order to await the process to finish its work and return the result(s). Note that, as described in the link above, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc.; essentially, your code must be under if __name__ == '__main__'. Example:
import asyncio
import concurrent.futures
from functools import partial

loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
About Request Timeout
With regards to the recent update on your question about the client having a fixed 60s request timeout; if you are not behind a proxy such as Nginx that would allow you to set the request timeout, and/or you are not using gunicorn, which would also allow you to adjust the request timeout, you could use a middleware, as suggested here, to set a timeout for all incoming requests. The suggested middleware (example is given below) uses asyncio's .wait_for() function, which waits for an awaitable function/coroutine to complete with a timeout. If a timeout occurs, it cancels the task and raises asyncio.TimeoutError.
Regarding your comment below:
My requirement is not unblocking next request...
Again, please read carefully the first part of this answer to understand that if you define your endpoint with async def but do not await some coroutine inside it, and instead perform a CPU-bound task (as you already do), it will block the server until it is completed (and even the approach below won't work as expected). That's like saying you would like FastAPI to process one request at a time; in that case, there is no reason to use an ASGI framework such as FastAPI, which takes advantage of the async/await syntax (i.e., processing requests asynchronously) in order to provide fast performance. Hence, you either need to drop the async definition from your endpoint (as mentioned earlier), or, preferably, run your synchronous CPU-bound task using ProcessPoolExecutor, as described earlier.
Also, your comment in some_func():
Some computationally heavy function whose execution time depends on
input text size
indicates that instead of (or along with) setting a request timeout, you could check the length of the input text (using a dependency function, for instance) and raise an HTTPException if the text's length exceeds some pre-defined value that is known beforehand to require more than 60s of processing. That way, your system won't waste resources trying to perform a task which you already know will not complete in time.
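A rough sketch of such a dependency (the threshold value and status code are assumptions):

from fastapi import Depends, HTTPException

MAX_TEXT_LENGTH = 10_000  # hypothetical cutoff known to need more than 60s to process

async def check_text_length(text: str = 'hello world'):
    if len(text) > MAX_TEXT_LENGTH:
        raise HTTPException(status_code=413, detail='Text too long to process within the time limit')
    return text

@app.get('/do_something')
async def do_something(text: str = Depends(check_text_length)):
    ...  # proceed with the validated text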
Working Example
import time
import uvicorn
import asyncio
import concurrent.futures
from functools import partial
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from starlette.status import HTTP_504_GATEWAY_TIMEOUT

REQUEST_TIMEOUT = 2  # adjust timeout as desired

app = FastAPI()

@app.middleware('http')
async def timeout_middleware(request: Request, call_next):
    try:
        return await asyncio.wait_for(call_next(request), timeout=REQUEST_TIMEOUT)
    except asyncio.TimeoutError:
        return JSONResponse({'detail': 'Request exceeded the time limit for processing'},
                            status_code=HTTP_504_GATEWAY_TIMEOUT)

def cpu_bound_task(text):
    time.sleep(5)
    return text

@app.get('/')
async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as pool:
        res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
    return {'response': res}

if __name__ == '__main__':
    uvicorn.run(app)
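One design note: the example above creates a new ProcessPoolExecutor on every request, which is relatively expensive. A sketch of an alternative (an assumption, not part of the original answer) creates the pool once at startup and reuses it, here via FastAPI's startup/shutdown hooks:

import asyncio
import concurrent.futures
from functools import partial
from fastapi import FastAPI

app = FastAPI()
pool = None  # created once at startup, shared by all requests

@app.on_event('startup')
def create_pool():
    global pool
    pool = concurrent.futures.ProcessPoolExecutor()

@app.on_event('shutdown')
def shutdown_pool():
    pool.shutdown()

@app.get('/')
async def main():
    loop = asyncio.get_running_loop()
    # cpu_bound_task as defined in the working example above
    res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
    return {'response': res}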
I have a FastAPI application which, on several different occasions, needs to call external APIs. I use httpx.AsyncClient for these calls. The point is that I don't fully understand how I should use it.
From httpx's documentation, I should use context managers:
import httpx

async def foo():
    """
    I need to call foo quite often from different
    parts of my application
    """
    async with httpx.AsyncClient() as aclient:
        # make some http requests, e.g.,
        await aclient.get("http://example.it")
However, I understand that this way a new client is spawned each time I call foo(), which is precisely what we want to avoid by using a client in the first place.
I suppose an alternative would be to have some global client defined somewhere, and just import it whenever I need it, like so:
import httpx

aclient = httpx.AsyncClient()

async def bar():
    # make some http requests using the global aclient, e.g.,
    await aclient.get("http://example.it")
This second option looks somewhat fishy, though, as nobody is taking care of closing the session and the like.
So the question is: how do I properly (re)use httpx.AsyncClient() within a FastAPI application?
You can have a global client that is closed in the FastAPI shutdown event.
import logging
import httpx
from fastapi import FastAPI

logging.basicConfig(level=logging.INFO, format="%(levelname)-9s %(asctime)s - %(name)s - %(message)s")
LOGGER = logging.getLogger(__name__)

class HTTPXClientWrapper:

    async_client = None

    def start(self):
        """Instantiate the client. Call from the FastAPI startup hook."""
        self.async_client = httpx.AsyncClient()
        LOGGER.info(f'httpx AsyncClient instantiated. Id {id(self.async_client)}')

    async def stop(self):
        """Gracefully shut down. Call from the FastAPI shutdown hook."""
        LOGGER.info(f'httpx async_client.is_closed: {self.async_client.is_closed} - Now close it. Id (will be unchanged): {id(self.async_client)}')
        await self.async_client.aclose()
        LOGGER.info(f'httpx async_client.is_closed: {self.async_client.is_closed}. Id (will be unchanged): {id(self.async_client)}')
        self.async_client = None
        LOGGER.info('httpx AsyncClient closed')

    def __call__(self):
        """Calling the instantiated HTTPXClientWrapper returns the wrapped singleton."""
        # Ensure we don't use it if not started / running
        assert self.async_client is not None
        LOGGER.info(f'httpx async_client.is_closed: {self.async_client.is_closed}. Id (will be unchanged): {id(self.async_client)}')
        return self.async_client

httpx_client_wrapper = HTTPXClientWrapper()
app = FastAPI()

@app.get('/test-call-external')
async def call_external_api(url: str = 'https://stackoverflow.com'):
    async_client = httpx_client_wrapper()
    res = await async_client.get(url)
    result = res.text
    return {
        'result': result,
        'status': res.status_code
    }

@app.on_event("startup")
async def startup_event():
    httpx_client_wrapper.start()

@app.on_event("shutdown")
async def shutdown_event():
    await httpx_client_wrapper.stop()

if __name__ == '__main__':
    import uvicorn
    LOGGER.info('starting...')
    uvicorn.run(f"{__name__}:app", host="127.0.0.1", port=8000)
Note - this answer was inspired by a similar answer I saw elsewhere a long time ago for aiohttp, I can't find the reference but thanks to whoever that was!
EDIT
I've added uvicorn bootstrapping in the example so that it's now fully functional. I've also added logging to show what happens on startup and shutdown, and you can visit localhost:8000/docs to trigger the endpoint and see (via the logs) what goes on.
The reason for calling the start() method from the startup hook is that by the time the hook is called the eventloop has already started, so we know we will be instantiating the httpx client in an async context.
Also I was missing the async on the stop() method, and had a self.async_client = None instead of just async_client = None, so I have fixed those errors in the example.
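As a side note, recent FastAPI versions deprecate the on_event hooks in favor of a lifespan handler; a minimal sketch of the same pattern with lifespan:

from contextlib import asynccontextmanager
import httpx
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # create the shared client when the app starts...
    app.state.client = httpx.AsyncClient()
    yield
    # ...and close it cleanly on shutdown
    await app.state.client.aclose()

app = FastAPI(lifespan=lifespan)

@app.get('/test-call-external')
async def call_external_api(url: str = 'https://stackoverflow.com'):
    res = await app.state.client.get(url)
    return {'result': res.text, 'status': res.status_code}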
The answer to this question depends on how you structure your FastAPI application and how you manage your dependencies. One possible way to use httpx.AsyncClient() is to create a custom dependency function that returns an instance of the client and closes it when the request is finished. For example:
from fastapi import FastAPI, Depends
import httpx

app = FastAPI()

async def get_client():
    # create a new client for each request
    async with httpx.AsyncClient() as client:
        # yield the client to the endpoint function
        yield client
        # close the client when the request is done

@app.get("/foo")
async def foo(client: httpx.AsyncClient = Depends(get_client)):
    # use the client to make some http requests, e.g.,
    response = await client.get("http://example.it")
    return response.json()
This way, you don't need to create a global client or worry about closing it manually. FastAPI will handle the dependency injection and the context management for you. You can also use the same dependency function for other endpoints that need to use the client.
Alternatively, you can create a global client and close it when the application shuts down. For example:
from fastapi import FastAPI
import httpx
import asyncio
import atexit

app = FastAPI()

# create a global client
client = httpx.AsyncClient()

# register a function to close the client when the app exits;
# client.aclose() is a coroutine, so it must be run on a loop rather than just called
atexit.register(lambda: asyncio.run(client.aclose()))

@app.get("/bar")
async def bar():
    # use the global client to make some http requests, e.g.,
    response = await client.get("http://example.it")
    return response.json()
This way, you don't need to create a new client for each request, but you do need to make sure that the client is closed properly when the application stops. You can use the atexit module to register a function that will be called when the app exits (note that client.aclose() is a coroutine, so it is wrapped in asyncio.run() above), or you can use other methods such as signal handlers or FastAPI's shutdown event hooks.
Both methods have their pros and cons, and you should choose the one that suits your needs and preferences. You can also check out the FastAPI documentation on dependencies and testing for more examples and best practices.
I'm developing an API (using Sanic) which is a gateway to IB (Interactive Brokers), using ib_insync.
This API exposes endpoints to place a new order and to get live positions, and it is also in charge of updating order statuses in a DB using ib_insync's events.
My question is: is it possible to have the API connect to IB only once, when it goes up, and reuse the same connection for all requests?
I'm currently connecting to IB using connectAsync on every request, and while this works, the API will not receive events when it is not currently handling a request.
This is one endpoint's code, for reference:
@app.post("/order/<symbol>")
async def post_order(request, symbol):
    order = jsonpickle.decode(request.body)
    with await IB().connectAsync("127.0.0.1", 7496, clientId=100) as ib:
        ib.orderStatusEvent += onOrderStatus
        ib.errorEvent += onTWSError
        ib.newOrderEvent += onNewOrderEvent
        contract = await ib.qualifyContractsAsync(contract)
        trade = ib.placeOrder(contract[0], order)
        return text(trade.order.orderId)
So I wish to not use the with statement, and just use a global ib connection.
When I'm connecting on the module init (using connectAsync), every later call that is async, like qualifyContractsAsync, just hangs. Debugging showed me that it hangs on asyncio.gather which means I'm doing something wrong with the event loops.
I'm not familiar with this particular connection, but yes it should be possible. Presumably the with statement is opening and closing the connection.
Instead, open and close it with a listener.
Docs re: listeners
@app.before_server_start
async def connect(app, _):
    app.ctx.foo = await Foobar()

@app.route("/")
async def handler(request):
    await request.app.ctx.foo.do_something()

@app.after_server_stop
async def close(app, _):
    app.ctx.foo.close()
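Applied to the question's setup, a rough, untested sketch (handler names, port, and clientId are taken from the question; the contract construction is left elided as in the original):

import jsonpickle
from ib_insync import IB
from sanic.response import text

@app.before_server_start
async def connect_ib(app, _):
    ib = IB()
    await ib.connectAsync("127.0.0.1", 7496, clientId=100)
    ib.orderStatusEvent += onOrderStatus  # event handlers from the question
    ib.errorEvent += onTWSError
    ib.newOrderEvent += onNewOrderEvent
    app.ctx.ib = ib  # one connection, reused by every request

@app.post("/order/<symbol>")
async def post_order(request, symbol):
    ib = request.app.ctx.ib
    order = jsonpickle.decode(request.body)
    contract = await ib.qualifyContractsAsync(contract)  # contract built as in the question
    trade = ib.placeOrder(contract[0], order)
    return text(str(trade.order.orderId))

@app.after_server_stop
async def disconnect_ib(app, _):
    app.ctx.ib.disconnect()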
I'm trying to create a Python-based CLI that communicates with a web service via websockets. One issue that I'm encountering is that requests made by the CLI to the web service intermittently fail to get processed. Looking at the logs from the web service, I can see that the problem is caused by the fact that frequently these requests are being made at the same time (or even after) the socket has closed:
2016-09-13 13:28:10,930 [22 ] INFO DeviceBridge - Device bridge has opened
2016-09-13 13:28:11,936 [21 ] DEBUG DeviceBridge - Device bridge has received message
2016-09-13 13:28:11,937 [21 ] DEBUG DeviceBridge - Device bridge has received valid message
2016-09-13 13:28:11,937 [21 ] WARN DeviceBridge - Unable to process request: {"value": false, "path": "testcube.pwms[0].enabled", "op": "replace"}
2016-09-13 13:28:11,936 [5 ] DEBUG DeviceBridge - Device bridge has closed
In my CLI I define a class CommunicationService that is responsible for handling all direct communication with the web service. Internally, it uses the websockets package to handle communication, which itself is built on top of asyncio.
CommunicationService contains the following method for sending requests:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    asyncio.ensure_future(self._ws.send(request))
...where _ws is a websocket opened earlier in another method:
self._ws = await websockets.connect(websocket_address)
What I want is to be able to await the future returned by asyncio.ensure_future and, if necessary, sleep for a short while after in order to give the web service time to process the request before the websocket is closed.
However, since send_request is a synchronous method, it can't simply await these futures. Making it asynchronous would be pointless as there would be nothing to await the coroutine object it returned. I also can't use loop.run_until_complete as the loop is already running by the time it is invoked.
I found someone describing a problem very similar to the one I have at mail.python.org. The solution that was posted in that thread was to make the function return the coroutine object in the case the loop was already running:
def aio_map(coro, iterable, loop=None):
    if loop is None:
        loop = asyncio.get_event_loop()
    coroutines = map(coro, iterable)
    coros = asyncio.gather(*coroutines, return_exceptions=True, loop=loop)
    if loop.is_running():
        return coros
    else:
        return loop.run_until_complete(coros)
This is not possible for me, as I'm working with PyRx (Python implementation of the reactive framework) and send_request is only called as a subscriber of an Rx observable, which means the return value gets discarded and is not available to my code:
class AnonymousObserver(ObserverBase):
    ...
    def _on_next_core(self, value):
        self._next(value)
On a side note, I'm not sure if this is some sort of problem with asyncio that's commonly come across or whether I'm just not getting it, but I'm finding it pretty frustrating to use. In C# (for instance), all I would need to do is probably something like the following:
void SendRequest(string request)
{
    this.ws.Send(request).Wait();
    // Task.Delay(500).Wait(); // Uncomment if necessary
}
Meanwhile, asyncio's version of "wait" unhelpfully just returns another coroutine that I'm forced to discard.
Update
I've found a way around this issue that seems to work. I have an asynchronous callback that gets executed after the command has executed and before the CLI terminates, so I just changed it from this...
async def after_command():
    await comms.stop()
...to this:
async def after_command():
    await asyncio.sleep(0.25)  # Allow time for communication
    await comms.stop()
I'd still be happy to receive any answers to this problem for future reference, though. I might not be able to rely on workarounds like this in other situations, and I still think it would be better practice to have the delay executed inside send_request so that clients of CommunicationService do not have to concern themselves with timing issues.
In regards to Vincent's question:
Does your loop run in a different thread, or is send_request called by some callback?
Everything runs in the same thread - it's called by a callback. What happens is that I define all my commands to use asynchronous callbacks, and when executed some of them will try to send a request to the web service. Since they're asynchronous, they don't do this until they're executed via a call to loop.run_until_complete at the top level of the CLI - which means the loop is running by the time they're mid-way through execution and making this request (via an indirect call to send_request).
Update 2
Here's a solution based on Vincent's proposal of adding a "done" callback.
A new boolean field _busy is added to CommunicationService to represent if comms activity is occurring or not.
CommunicationService.send_request is modified to set _busy to True before sending the request, and a done callback on the task resets _busy once the send completes:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))

    def callback(_):
        self._busy = False

    self._busy = True
    asyncio.ensure_future(self._ws.send(request)).add_done_callback(callback)
CommunicationService.stop is now implemented to wait for this flag to be set false before progressing:
async def stop(self) -> None:
    """
    Terminate communications with TestCube Web Service.
    """
    if self._listen_task is None or self._ws is None:
        return

    # Wait for comms activity to stop.
    while self._busy:
        await asyncio.sleep(0.1)

    # Allow short delay after final request is processed.
    await asyncio.sleep(0.1)

    self._listen_task.cancel()
    # Note: newer Python versions require tasks here, not bare coroutines,
    # e.g. wrap the close() coroutine with asyncio.ensure_future().
    await asyncio.wait([self._listen_task, self._ws.close()])
    self._listen_task = None
    self._ws = None
    logger.info('Terminated connection to TestCube Web Service')
This seems to work too, and at least this way all communication timing logic is encapsulated within the CommunicationService class as it should be.
Update 3
Nicer solution based on Vincent's proposal.
Instead of self._busy we have self._send_request_tasks = [].
New send_request implementation:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    task = asyncio.ensure_future(self._ws.send(request))
    self._send_request_tasks.append(task)
New stop implementation:
async def stop(self) -> None:
    if self._listen_task is None or self._ws is None:
        return

    # Wait for comms activity to stop.
    if self._send_request_tasks:
        await asyncio.wait(self._send_request_tasks)
    ...
You could use a set of tasks:
self._send_request_tasks = set()
Schedule the tasks using ensure_future and clean up using add_done_callback:
def send_request(self, request: str) -> None:
    task = asyncio.ensure_future(self._ws.send(request))
    self._send_request_tasks.add(task)
    task.add_done_callback(self._send_request_tasks.remove)
And wait for the set of tasks to complete:
async def stop(self):
    if self._send_request_tasks:
        await asyncio.wait(self._send_request_tasks)
Given that you're not inside an asynchronous function, you can use the yield from keyword to effectively implement await yourself. Note, however, that adding yield from turns send_request into a generator function, so the caller has to drive the returned generator; only then will the following code block until the future completes:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    future = asyncio.ensure_future(self._ws.send(request))
    yield from future.__await__()