Catch Firebase 503 error exception not caught by my try statement - python

I'm using the Firebase Realtime Database listener to listen to changes on a database path.
My program recently crashed because of the following 503 error that seems to be raised by the underlying requests library:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/threading.py", line 917, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.7/threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.7/site-packages/firebase_admin/db.py", line 123, in _start_listen
for sse_event in self._sse:
File "/usr/local/lib/python3.7/site-packages/firebase_admin/_sseclient.py", line 128, in __next__
self._connect()
File "/usr/local/lib/python3.7/site-packages/firebase_admin/_sseclient.py", line 112, in _connect
self.resp.raise_for_status()
File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: https://database_url...
My listener initialization is wrapped in a try statement, so I'm unsure why this wasn't caught, swallowed and retried as I expected:
def init_listener():
    try:
        listener = firebase_admin.db.reference(db_path).listen(handle_change)
    except Exception as e:
        time.sleep(1)  # Retry in one second.
        init_listener()
I'd like to handle future 503 errors, but I'm not sure how to go about doing this.
Additionally, I'm using except Exception as e above for demo/debugging purposes, but I'm also not sure if requests.exceptions.HTTPError will be specific enough to catch only 500 errors (though I don't know what other errors can be raised).

From the firebase_admin reference docs:
This API is based on the event streaming support available in the
Firebase REST API. Each call to listen() starts a new HTTP connection
and a background thread. This is an experimental feature.
The key here is that this all runs in a background thread. Therefore, wrapping the call to listen() in a try/except will not catch exceptions thrown in the thread. There is no simple way to catch the exceptions happening in the background thread.
To solve your issue, you will probably need to know more about why the database is returning an HTTP 503 status. Or you will need to switch to some other firebase_admin API that will allow you to catch and ignore these exceptions.
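The behaviour described above can be reproduced without Firebase. The sketch below is only an illustration: failing_worker, start_worker and record_thread_exception are hypothetical names, and it assumes Python 3.8 or newer, where threading.excepthook exists (the traceback in the question comes from Python 3.7). It shows that a try/except around the code that starts a thread never sees an exception raised inside that thread, while a process-wide hook can at least observe it.

```python
import threading

caught = []


def record_thread_exception(args):
    # args is a threading.ExceptHookArgs with exc_type, exc_value,
    # exc_traceback and thread attributes.
    caught.append((args.thread.name, args.exc_type.__name__, str(args.exc_value)))


# Installed process-wide; it is called for uncaught exceptions in any thread
# started through threading.Thread.
threading.excepthook = record_thread_exception


def failing_worker():
    # Stand-in for the background listener thread hitting an HTTP 503.
    raise RuntimeError("503 Server Error: Service Unavailable")


def start_worker():
    try:
        worker = threading.Thread(target=failing_worker, name="listener")
        worker.start()
        worker.join()
        return "no exception reached the caller"
    except Exception:
        # Never reached: the exception stays inside the worker thread.
        return "caught in caller"


result = start_worker()
print(result)
print(caught)
```

From such a hook you could set a flag or log the failure and decide in the main thread whether to call listen() again. Whether restarting the listener this way is safe for firebase_admin is an assumption that would need checking against the library.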

Related

Django Channels Redis: Exception inside application: Lock is not acquired

We run a fully loaded multi-tenant Django application with thousands of WebSockets using Daphne/Channels. It had been running fine for a few months when suddenly tenants were all calling the support line about the application running slow or outright hanging. We narrowed it down to WebSockets, as HTTP REST API hits came through fast and error free.
None of the application logs or OS logs indicate some issue, so only thing to go on is the exception noted below. It happened over and over again here and there throughout 2 days.
I don't expect any deep debugging help, just some off-the-cuff advice on possibilities.
AWS Linux 1
Python 3.6.4
Elasticache Redis 5.0
channels==2.4.0
channels-redis==2.4.2
daphne==2.5.0
Django==2.2.13
Split configuration HTTP served by uwsgi, daphne serves asgi, Nginx
May 10 08:08:16 prod-b-web1: [pid 15053] [version 119.5.10.5086] [tenant_id -] [domain_name -] [pathname /opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/daphne/server.py] [lineno 288] [priority ERROR] [funcname application_checker] [request_path -] [request_method -] [request_data -] [request_user -] [request_stack -] Exception inside application: Lock is not acquired.
Traceback (most recent call last):
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 435, in receive
real_channel
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 484, in receive_single
await self.receive_clean_locks.acquire(channel_key)
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 152, in acquire
return await self.locks[channel].acquire()
File "/opt/python3.6/lib/python3.6/asyncio/locks.py", line 176, in acquire
yield from fut
concurrent.futures._base.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/sessions.py", line 183, in __call__
return await self.inner(receive, self.send)
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/middleware.py", line 41, in coroutine_call
await inner_instance(receive, send)
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/consumer.py", line 59, in __call__
[receive, self.channel_receive], self.dispatch
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/utils.py", line 58, in await_many_dispatch
await task
File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 447, in receive
self.receive_lock.release()
File "/opt/python3.6/lib/python3.6/asyncio/locks.py", line 201, in release
raise RuntimeError('Lock is not acquired.')
RuntimeError: Lock is not acquired.
First, let's have a look at the source of the RuntimeError: Lock is not acquired. error. As given by the traceback, the release() method in the file /opt/python3.6/lib/python3.6/asyncio/locks.py is defined like so:
def release(self):
    """Release a lock.
    When the lock is locked, reset it to unlocked, and return.
    If any other coroutines are blocked waiting for the lock to become
    unlocked, allow exactly one of them to proceed.
    When invoked on an unlocked lock, a RuntimeError is raised.
    There is no return value.
    """
    if self._locked:
        self._locked = False
        self._wake_up_first()
    else:
        raise RuntimeError('Lock is not acquired.')
A primitive lock is a synchronization primitive that is not owned by a particular thread when locked.
When attempting to release an unlocked lock by calling the release() method, the RuntimeError will be raised, as the method should only be called in the locked state. The state changes to unlocked when called in the locked state.
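As a quick check of that rule, here is a minimal sketch using only the standard library. release_twice is a hypothetical name, and asyncio.run assumes Python 3.7 or newer (the question runs 3.6, where you would use the event loop directly).

```python
import asyncio


async def release_twice():
    lock = asyncio.Lock()
    await lock.acquire()
    lock.release()      # fine: the lock was held
    try:
        lock.release()  # the lock is already unlocked
    except RuntimeError as exc:
        return str(exc)
    return None


message = asyncio.run(release_twice())
print(message)
```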
Now for the previous error raised in the acquire() method in the same file, the acquire() method is defined like so:
async def acquire(self):
    """Acquire a lock.
    This method blocks until the lock is unlocked, then sets it to
    locked and returns True.
    """
    if (not self._locked and (self._waiters is None or
            all(w.cancelled() for w in self._waiters))):
        self._locked = True
        return True
    if self._waiters is None:
        self._waiters = collections.deque()
    fut = self._loop.create_future()
    self._waiters.append(fut)
    # Finally block should be called before the CancelledError
    # handling as we don't want CancelledError to call
    # _wake_up_first() and attempt to wake up itself.
    try:
        try:
            await fut
        finally:
            self._waiters.remove(fut)
    except exceptions.CancelledError:
        if not self._locked:
            self._wake_up_first()
        raise
    self._locked = True
    return True
So in order for the concurrent.futures._base.CancelledError error you're getting to be raised, the await fut must've caused the issue.
To fix it, you can have a look at the question "Awaiting an asyncio.Future raises concurrent.futures._base.CancelledError instead of waiting for a value/exception to be set".
Basically, you might have an awaitable in your code that you didn't await. By not awaiting it, you never handed control back to the event loop, and by not storing it, you let it be cleaned up immediately, which cancels it (and all of the awaitables it controlled).
Simply make sure you await the results of the awaitables in your code, finding any you missed.
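To see how the two exceptions in the traceback can chain together, here is a simplified sketch. It is not the channels_redis code: worker, busy_lock and own_lock are hypothetical names, and asyncio.run assumes Python 3.7 or newer. A task is cancelled while it waits on one lock, and its cleanup then releases a second lock it never acquired, which produces the same "CancelledError, then RuntimeError: Lock is not acquired." pair.

```python
import asyncio


async def worker(busy_lock, own_lock):
    try:
        await busy_lock.acquire()  # blocks here; the cancellation arrives at this await
        await own_lock.acquire()   # never reached
    finally:
        # Cleanup runs even though own_lock was never acquired.
        own_lock.release()


async def main():
    busy_lock = asyncio.Lock()
    own_lock = asyncio.Lock()
    await busy_lock.acquire()      # keep the first lock held so worker must wait

    task = asyncio.ensure_future(worker(busy_lock, own_lock))
    await asyncio.sleep(0)         # let worker start waiting on busy_lock
    task.cancel()

    try:
        await task
    except RuntimeError as exc:
        # The CancelledError is attached as the context of the RuntimeError.
        return str(exc), type(exc.__context__).__name__
    return None


outcome = asyncio.run(main())
print(outcome)
```

If this is what happens in your application, the RuntimeError is a symptom and the interesting question is what cancelled the task in the first place.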

Python FastAPI: how to create a general exception handler to stop the app from crashing on unhandled exceptions?

I want to capture all unhandled exceptions in a FastAPI app run using uvicorn, log them, save the request information, and let the application continue. I seem to have all of that working except the last bit:
@app.exception_handler(Exception)
async def general_exception_handler(request: APIRequest, exception) -> JSONResponse:
    ...
It runs to completion, and then the app shows
2021-05-20 11:45:45,261.261Z | ERROR | uvicorn.error | Exception in ASGI application
Traceback (most recent call last):
File "/Users/rhaven/code/projectblue-api/venv/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in
...
File "/Users/rhaven/code/projectblue-api/venv/lib/python3.8/site-packages/fastapi/routing.py", line 149, in run_endpoint_function
return await dependant.call(**values)
File "./app/main.py", line 236, in internal_testing
raise Exception("test exception from blue-api")
How do I eat the exception once I've handled it?
Cheers

Catching Firebase 504 gateway timeout

I'm building a simple IOT device (with a Raspberry Pi Zero) which pulls data from Firebase Realtime Database every 1 second and checks for updates.
However, after a certain time (not sure exactly how much but somewhere between 1 hour and 3 hours) the program exits with a 504 Server Error: Gateway Time-out message.
I couldn't work out exactly why this is happening. I tried to recreate the error by disconnecting the Pi from the internet, but I did not get this message. Instead, the program simply paused on a ref.get() line and automatically resumed once the connection was back.
This device is meant to be always on, so ideally if I get some kind of error, I would like to restart the program / reinitiate the connection / reboot the Pi. Is there a way to achieve something like this?
It seems like the message is actually generated by the firebase_admin package.
Here is the error message:
Traceback (most recent call last):
File "/home/pi/.local/lib/python3.7/site-packages/firebase_admin/db.py", line 944, in request
return super(_Client, self).request(method, url, **kwargs)
File "/home/pi/.local/lib/python3.7/site-packages/firebase_admin/_http_client.py", line 105, in request
resp.raise_for_status()
File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://someFirebaseProject.firebaseio.com/someRef/subSomeRef/payload.json
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/pi/Desktop/project/main.py", line 94, in <module>
lastUpdate = ref.get()['lastUpdate']
File "/home/pi/.local/lib/python3.7/site-packages/firebase_admin/db.py", line 223, in get
return self._client.body('get', self._add_suffix(), params=params)
File "/home/pi/.local/lib/python3.7/site-packages/firebase_admin/_http_client.py", line 117, in body
resp = self.request(method, url, **kwargs)
File "/home/pi/.local/lib/python3.7/site-packages/firebase_admin/db.py", line 946, in request
raise _Client.handle_rtdb_error(error)
firebase_admin.exceptions.UnknownError: Internal server error.
>>>
To reboot the whole Raspberry Pi, you can just run a shell command:
import os
os.system("sudo reboot")
I've had this problem too and usually feel safer with that, but there are obvious downsides. I'd first try resetting the Wi-Fi connection or network interface in a similar way.
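Since the traceback in the question shows the error surfacing from the ref.get() call in the main script, a less drastic option to try before rebooting is to catch it there and retry. The sketch below is a generic helper, not part of firebase_admin: call_with_retry, flaky_fetch and all parameter names are hypothetical, and the failing call is simulated so the example runs on its own.

```python
import time


def call_with_retry(fetch, retryable=(Exception,), attempts=5, delay=1.0,
                    sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retryable as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)  # wait a little longer each time
    raise last_error


# Simulated flaky call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("504 Server Error: Gateway Time-out")
    return {"lastUpdate": 42}


delays = []  # record the waits instead of actually sleeping
payload = call_with_retry(flaky_fetch, retryable=(ConnectionError,),
                          sleep=delays.append)
print(payload, delays)
```

In the real program you would pass ref.get as fetch and, assuming the exception hierarchy matches the traceback, something like firebase_admin.exceptions.FirebaseError as the retryable type. The reboot can then be kept as a last resort for when every attempt fails.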

Google PubSub python client returning StatusCode.UNAVAILABLE

I am trying to establish a long running Pull subscription to a Google Cloud PubSub topic.
I am using code very similar to the example given in the documentation here, i.e.:
def receive_messages(project, subscription_name):
    """Receives messages from a pull subscription."""
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(
        project, subscription_name)
    def callback(message):
        print('Received message: {}'.format(message))
        message.ack()
    subscriber.subscribe(subscription_path, callback=callback)
    # The subscriber is non-blocking, so we must keep the main thread from
    # exiting to allow it to process messages in the background.
    print('Listening for messages on {}'.format(subscription_path))
    while True:
        time.sleep(60)
The problem is that I'm receiving the following traceback sometimes:
Exception in thread Consumer helper: consume bidirectional stream:
Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/path/to/google/cloud/pubsub_v1/subscriber/_consumer.py", line 248, in _blocking_consume
self._policy.on_exception(exc)
File "/path/to/google/cloud/pubsub_v1/subscriber/policy/thread.py", line 135, in on_exception
raise exception
File "/path/to/google/cloud/pubsub_v1/subscriber/_consumer.py", line 234, in _blocking_consume
for response in response_generator:
File "/path/to/grpc/_channel.py", line 348, in __next__
return self._next()
File "/path/to/grpc/_channel.py", line 342, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, The service was unable to fulfill your request. Please try again. [code=8a75])>
I saw that this was referenced in another question, but here I am asking how to handle it properly in Python. I have tried to wrap the request in a try/except, but it seems to run in the background and I am not able to retry when that error occurs.
A somewhat hacky approach that is working for me is a custom policy_class. The default one has an on_exception function that ignores DEADLINE_EXCEEDED. You can make a class that inherits the default and also ignores UNAVAILABLE. Mine looks like this:
from google.cloud import pubsub
from google.cloud.pubsub_v1.subscriber.policy import thread
import grpc
class AvailablePolicy(thread.Policy):
    def on_exception(self, exception):
        """The parent ignores DEADLINE_EXCEEDED. Let's also ignore UNAVAILABLE.
        I'm not sure what triggers that error, but if you ignore it, your
        subscriber seems to work just fine. It's probably an intermittent
        thing and it reconnects later if you just give it a chance.
        """
        # If this is UNAVAILABLE, then we want to retry.
        # That entails just returning None.
        unavailable = grpc.StatusCode.UNAVAILABLE
        if getattr(exception, 'code', lambda: None)() == unavailable:
            return
        # For anything else, fallback on super.
        super(AvailablePolicy, self).on_exception(exception)
subscriber = pubsub.SubscriberClient(policy_class=AvailablePolicy)
# Continue to set up as normal.
It looks a lot like the original on_exception; it just ignores a different error. If you want, you can add some logging whenever the exception is thrown and verify that everything still works. Future messages will still come through.
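The getattr(exception, 'code', lambda: None)() line is the part that decides which errors get ignored, so here it is in isolation. This sketch uses hypothetical stand-ins (StatusCode, FakeRpcError, should_ignore) instead of grpc so that it runs without any dependencies. It shows that the check is true only for exceptions whose code() returns UNAVAILABLE, and that exceptions with no code() method fall through safely.

```python
import enum


class StatusCode(enum.Enum):
    """Hypothetical stand-in for grpc.StatusCode."""
    UNAVAILABLE = "unavailable"
    INTERNAL = "internal"


class FakeRpcError(Exception):
    """Hypothetical stand-in for a gRPC error that exposes code()."""

    def __init__(self, status):
        super().__init__(status.name)
        self._status = status

    def code(self):
        return self._status


def should_ignore(exception):
    # Exceptions without a code() method get a lambda returning None instead,
    # so the comparison is simply False for them.
    return getattr(exception, "code", lambda: None)() == StatusCode.UNAVAILABLE


decisions = [
    should_ignore(FakeRpcError(StatusCode.UNAVAILABLE)),
    should_ignore(FakeRpcError(StatusCode.INTERNAL)),
    should_ignore(ValueError("no code() here")),
]
print(decisions)
```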

Handling Non-SSL Traffic in Python/Tornado

I have a webservice running in python 2.7.10 / Tornado that uses SSL. This service throws an error when a non-SSL call comes through (http://...).
I don't want my service to be accessible when SSL is not used, but I'd like to handle it in a cleaner fashion.
Here is my main code that works great over SSL:
if __name__ == "__main__":
    tornado.options.parse_command_line()
    #does not work on 2.7.6
    ssl_ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ssl_ctx.load_cert_chain("...crt.pem","...key.pem")
    ssl_ctx.load_verify_locations("...CA.crt.pem")
    http_server = tornado.httpserver.HTTPServer(application, ssl_options=ssl_ctx, decompress_request=True)
    http_server.listen(options.port)
    mainloop = tornado.ioloop.IOLoop.instance()
    print("Main Server started on port XXXX")
    mainloop.start()
and here is the error when I hit that server with http://... instead of https://...:
[E 151027 20:45:57 http1connection:700] Uncaught exception
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/tornado/http1connection.py", line 691, in _server_request_loop
ret = yield conn.read_response(request_delegate)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 807, in run
value = future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 209, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 810, in run
yielded = self.gen.throw(*sys.exc_info())
File "/usr/local/lib/python2.7/dist-packages/tornado/http1connection.py", line 166, in _read_message
quiet_exceptions=iostream.StreamClosedError)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 807, in run
value = future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 209, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
SSLError: [SSL: HTTP_REQUEST] http request (_ssl.c:590)
Any ideas how I should handle that exception?
And what would the standards-conformant response be when I catch a non-SSL call to an SSL-only API?
UPDATE
This API runs on a specific port e.g. https://example.com:1234/. I want to inform a user who is trying to connect without SSL, e.g. http://example.com:1234/ that what they are doing is incorrect by returning an error message or status code. As it is the uncaught exception returns a 500, which they could interpret as a programming error on my part. Any ideas?
There's an excellent discussion about that in this Tornado issue, where the Tornado maintainer says:
If you have both HTTP and HTTPS in the same tornado process, you must be running two separate HTTPServers (of course such a feature should not be tied to whether SSL is handled at the tornado level, since you could be terminating SSL in a proxy, but since your question stipulated that SSL was enabled in tornado let's focus on this case first). You could simply give the HTTP server a different Application, one that just does this redirect.
So the best solution is to run a second HTTPServer that listens on port 80 and does not have the ssl_options parameter set.
UPDATE
A request to https://example.com/some/path will go to port 443, where you must have an HTTPServer configured to handle https traffic; while a request to http://example.com/some/path will go to port 80, where you must have another instance of HTTPServer without ssl options, and this is where you must return the custom response code you want. That shouldn't raise any error.
