GoneException when calling post_to_connection on AWS lambda and API gateway - python

I want to send a message to a websocket client when it connects to the server on AWS lambda and API gateway. Currently, I use wscat as a client. Since the response 'connected' is not shown on the wscat console when I connect to the server, I added post_to_connection to send a message 'hello world' to the client. However, it raises GoneException.
An error occurred (GoneException) when calling the PostToConnection operation
How can I solve this problem and send some message to wscat when connecting to the server?
My python code is below. I use Python 3.8.5.
import os
import boto3
import botocore
dynamodb = boto3.resource('dynamodb')
connections = dynamodb.Table(os.environ['TABLE_NAME'])
def lambda_handler(event, context):
    domain_name = event.get('requestContext', {}).get('domainName')
    stage = event.get('requestContext', {}).get('stage')
    connection_id = event.get('requestContext', {}).get('connectionId')
    result = connections.put_item(Item={'id': connection_id})
    apigw_management = boto3.client('apigatewaymanagementapi',
                                    endpoint_url=f"https://{domain_name}/{stage}")
    ret = "hello world"
    try:
        _ = apigw_management.post_to_connection(ConnectionId=connection_id,
                                                Data=ret)
    except botocore.exceptions.ClientError as e:
        print(e)
        return {'statusCode': 500,
                'body': 'something went wrong'}
    return {'statusCode': 200,
            'body': 'connected'}

Self-answer: you cannot post_to_connection to the connection itself in onconnect.

I have found that the GoneException can occur when the client that initiated the websocket has disconnected from the socket and its connectionId can no longer be found. Is there something causing the originating client to disconnect from the socket before it can receive your return message?
My use case is different but I am basically using a DB to check the state of a connection before replying to it, and not using the request context to do that. This error's appearance was reduced by writing connectionIds to DynamoDB on connect, and deleting them from the table upon disconnect events. Messaging now writes to connectionIds in the table instead of the id in the request context. Most messages go through but some errors are still emitted when the client leaves the socket but does not emit a proper disconnect event which leaves orphans in the table. The next step is to enforce item deletes when irregular disconnections occur. Involving a DB may be overkill for your situation, just sharing what helped me make progress on the GoneException error.
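The cleanup step described above can be sketched roughly as follows. This is a minimal illustration, not the code I actually run: `send_to_all`, `FakeGone`, `fake_send` and the in-memory `table` are hypothetical stand-ins so the example runs without AWS. With boto3, `send` would wrap `post_to_connection`, `forget` would wrap a DynamoDB `delete_item` on the connection id, and `gone_error` would be the client's `exceptions.GoneException`.

```python
def send_to_all(connection_ids, send, forget, gone_error):
    """Send to every stored connection and drop the ones that are gone.

    send(connection_id) posts the message, forget(connection_id) deletes the
    stored id, and gone_error is the exception type meaning "connection gone".
    """
    delivered, removed = [], []
    for connection_id in connection_ids:
        try:
            send(connection_id)
            delivered.append(connection_id)
        except gone_error:
            # Orphaned id: the client left without a proper disconnect event.
            forget(connection_id)
            removed.append(connection_id)
    return delivered, removed


# Hypothetical stand-ins for API Gateway and DynamoDB.
class FakeGone(Exception):
    pass

table = {"a1", "b2", "c3"}   # ids written on $connect
alive = {"a1", "c3"}         # b2 vanished without a disconnect event

def fake_send(connection_id):
    if connection_id not in alive:
        raise FakeGone(connection_id)

delivered, removed = send_to_all(sorted(table), fake_send, table.discard, FakeGone)
print(delivered, removed, sorted(table))
```

Deleting the id at the moment the send fails means orphans are cleaned up lazily, without waiting for a disconnect event that may never arrive.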

We need to post to connection after connecting (i.e. when the routeKey is not $connect)
routeKey = event.get('requestContext', {}).get('routeKey')
print(routeKey) # for debugging
if routeKey != '$connect': # if we have defined multiple route keys we can choose the right one here
    apigw_management.post_to_connection(ConnectionId=connection_id, Data=ret)

@nemy's answer is correct, but it doesn't explain the reason, so I want to add that.
First of all, what is GoneException, or error 410 Gone?
A 410 Gone error occurs when a user tries to access an asset which no longer exists on the requested server. In order for a request to return a 410 Gone status, the resource must also have no forwarding address and be considered to be gone permanently.
You can find more details about GoneException in this article.
Here, the GoneException means that the connection we are trying to post to doesn't exist, which fits this scenario perfectly, because the connection between client and server has not been established yet. The way API Gateway WebSocket APIs work is that you request an endpoint (route), and that route invokes a Lambda function (in our case the ConnectionLambdaFunction for the $connect route).
Only if the Lambda function resolves with statusCode: 200 will API Gateway allow the connection to be established. So until we return statusCode: 200 from our Lambda function we are not connected and the server does not know about us, which is why a post call made before the return statement throws an error.
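To make the ordering concrete, here is a minimal sketch of a handler that returns 200 on $connect without posting and only posts on later routes. The names `handle`, `make_post` and `fake_post` are hypothetical, and the `$default` route is just an assumed example of a route that runs after the connection exists. boto3 is imported lazily so the local check at the bottom runs without AWS.

```python
def handle(event, post):
    """Route a WebSocket event; post(connection_id, data) sends to a client."""
    ctx = event.get("requestContext", {})
    route_key = ctx.get("routeKey")
    connection_id = ctx.get("connectionId")

    if route_key == "$connect":
        # The connection only exists once this handler has returned 200,
        # so nothing is posted here.
        return {"statusCode": 200, "body": "connected"}
    if route_key == "$disconnect":
        # The connection is already closed, so posting would fail as well.
        return {"statusCode": 200, "body": "disconnected"}

    # Any other route runs on an established connection, so posting works.
    post(connection_id, "hello world")
    return {"statusCode": 200, "body": "sent"}


def make_post(event):
    """Build post() from the event using boto3 (imported only when needed)."""
    import boto3
    ctx = event["requestContext"]
    client = boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=f"https://{ctx['domainName']}/{ctx['stage']}",
    )
    return lambda connection_id, data: client.post_to_connection(
        ConnectionId=connection_id, Data=data.encode("utf-8")
    )


def lambda_handler(event, context):
    return handle(event, make_post(event))


# Local check with a fake post(), no AWS needed.
sent = []
fake_post = lambda cid, data: sent.append((cid, data))
on_connect = handle(
    {"requestContext": {"routeKey": "$connect", "connectionId": "abc"}}, fake_post
)
on_message = handle(
    {"requestContext": {"routeKey": "$default", "connectionId": "abc"}}, fake_post
)
print(on_connect, on_message, sent)
```

With wscat, this means the greeting only appears after the client sends its first message, since that is the first invocation that runs on an open connection.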

Related

Access Azure EventHub with WebSocket and proxy

I'm trying to access Azure EventHub, but my network requires a proxy and allows connections only over HTTPS (port 443).
Based on https://learn.microsoft.com/en-us/python/api/azure-eventhub/azure.eventhub.aio.eventhubproducerclient?view=azure-python
I added the proxy configuration and the TransportType.AmqpOverWebsocket parameter, and my producer looks like this:
async def run():
    producer = EventHubProducerClient.from_connection_string(
        "Endpoint=sb://my_eh.servicebus.windows.net/;SharedAccessKeyName=eh-sender;SharedAccessKey=MFGf5MX6Mdummykey=",
        eventhub_name="my_eh",
        auth_timeout=180,
        http_proxy=HTTP_PROXY,
        transport_type=TransportType.AmqpOverWebsocket,
    )
and I get an error:
  File "/usr/local/lib64/python3.9/site-packages/uamqp/authentication/cbs_auth_async.py", line 74, in create_authenticator_async
    raise errors.AMQPConnectionError(
uamqp.errors.AMQPConnectionError: Unable to open authentication session on connection b'EHProducer-a1cc5f12-96a1-4c29-ae54-70aafacd3097'.
Please confirm target hostname exists: b'my_eh.servicebus.windows.net'
I don't know what might be the issue.
Might it be related to this one? https://github.com/Azure/azure-event-hubs-c/issues/50#issuecomment-501437753
You should be able to set up a proxy that the SDK uses to access EventHub. Here is a sample that shows you how to set the HTTP_PROXY dictionary with the proxy information. Behind the scenes, when a proxy is passed in, the connection automatically goes over websockets.
As @BrunoLucasAzure suggested, checking the ports on the proxy itself is a good idea, because based on the error message it looks like the request made it past the proxy and can't resolve the endpoint.
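For reference, a minimal sketch of building the HTTP_PROXY dictionary from a proxy URL. The helper `proxy_settings` and the host `proxy.example.com` are hypothetical. The key names (`proxy_hostname`, `proxy_port`, and optionally `username` and `password`) are what I understand the azure-eventhub documentation to expect for `http_proxy`, so please verify them against the SDK version you use.

```python
from urllib.parse import urlsplit


def proxy_settings(proxy_url):
    """Turn a proxy URL into the dict shape passed as http_proxy."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError("proxy URL needs both a host and a port")
    settings = {"proxy_hostname": parts.hostname, "proxy_port": parts.port}
    if parts.username:
        # Credentials are only added when the URL actually carries them.
        settings["username"] = parts.username
        settings["password"] = parts.password or ""
    return settings


HTTP_PROXY = proxy_settings("http://proxy.example.com:8080")
AUTH_PROXY = proxy_settings("http://alice:secret@proxy.example.com:3128")
print(HTTP_PROXY)
print(AUTH_PROXY)
```

Note that the port ends up as an integer rather than a string, which is easy to get wrong when the value comes from an environment variable.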

Troubleshooting "Connection reset by peer"

I have a web api (Flask) and a mobile app (Xamarin.Forms) that uses it. Everything had been working well for a few months until a week ago. Suddenly mobile app clients started throwing "Connection reset by peer" when trying to access the web api. The exact exception message is:
Read error: ssl=0xd6269d18: I/O error during system call, Connection reset by peer
It seems to happen randomly - sometimes everything works well, sometimes not.
On the clients' side, requests are made using the System.Net.Http.HttpClient:
public async Task<string> GetClientData(string token)
{
    HttpResponseMessage response =
        await httpClient.GetAsync(server_url + Resx.Urls.get_route + "?token=" + token);
    return await response.Content.ReadAsStringAsync();
}
The server is hosted on Heroku. Even when clients throw exceptions, the Heroku logs show that the requests were correctly handled (status 200).
An example route of my server:
@app.route('/get', methods=['GET'])
def get():
    clients = get_users() # database call
    token = request.args.get('token')
    for c in clients:
        if c.token == token:
            return c.to_json()
    abort(400) # no client with such token
I wrote a short python script that tries to use the api the same way my mobile app does, and it seems that the problem does not occur there.
Where should I look for the solution? Is it more likely that something is wrong with the Flask server, or is it a problem with the mobile app?

HTTP REST Gateway to AMQP Request-Response, Without Web Sockets Or Polling

I've struggled for two days to understand how REST API Gateways should return GET requests to browsers when the backend service runs on AMQP (without using Web Sockets or polling).
I have successfully done RPC between AMQP services (with RabbitMQ's reply_to and correlation_id), but with a Flask HTTP request waiting I'm still lost.
gateway.py - Response Handler Inside The HTTP Handler, Times out
def products_get():
    def handler(ch=None, method=None, properties=None, body=None):
        if body:
            return body
        return False
    return_queue = 'products.get.return'
    broker.channel.queue_declare(return_queue)
    broker.channel.basic_consume(handler, return_queue)
    broker.publish(exchange='', routing_key='products.get', body='Request data', properties=pika.BasicProperties(reply_to=return_queue))
    now = time.time() # for timeout. Not having this returns 'no content' immediately
    while time.time() < now + 1:
        if handler():
            return handler()
    return 'Time out'
POST/PUT can simply send the AMQP message, return 200/201/202 immediately, and let the service work at its own pace. A separate REST interface just for GET requests seems implausible, but I don't know the other options.
Regards
I think what you're asking is "how to perform asynchronous GET requests", and I reckon the answer is: you can't, and you should not. It's bad practice or bad design, and it does not scale.
Why are you trying to get your GET response payload from AMQP?
If the payload (the content of the response) can be pulled from some DB, just pull it from there. That's called a synchronous request.
If the payload must be processed in some backend, send it away and don't have the requester wait for a response. You could assign some ID and have the requester ask again later (or collect a callback URL from the requester and have your backend POST the response once it's ready, which is a less common design).
EDIT:
So, given that you have to work with an AMQP-backed backend, I would do something a little more elaborate: spawn a thread or a process in your front end that constantly consumes from AMQP and stores the results locally or in some DB, and serve GET results based on the data you stored. If the data isn't available yet, just return 404. Ideally you'll need to reshape your API: split it into "post" requests (which trigger work at the backend) and "get" requests (which return the results if they're available).
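A minimal sketch of that background-consumer idea, using only the standard library. A `queue.Queue` stands in for the AMQP reply queue here, and the names `replies`, `results`, `consume_forever` and `get_result` are hypothetical. A real version would run a pika consumer in the thread, and `get_result` would be called from the Flask GET handler.

```python
import queue
import threading

results = {}                 # correlation id -> payload
results_lock = threading.Lock()
replies = queue.Queue()      # stand-in for the AMQP reply queue


def consume_forever():
    """Background worker: move replies into the local store."""
    while True:
        item = replies.get()
        if item is None:     # sentinel used only to stop this sketch
            break
        correlation_id, body = item
        with results_lock:
            results[correlation_id] = body


def get_result(correlation_id):
    """What a GET handler would return: (status, body)."""
    with results_lock:
        if correlation_id in results:
            return 200, results[correlation_id]
    return 404, "not ready"


worker = threading.Thread(target=consume_forever, daemon=True)
worker.start()

before = get_result("req-1")             # nothing has arrived yet
replies.put(("req-1", "product list"))   # the backend replies
replies.put(None)
worker.join()                            # wait until the reply is stored
after = get_result("req-1")
print(before, after)
```

The GET handler never blocks on the broker. It only reads what the consumer has already stored, which is why the request returns immediately in both cases.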

Health check failed when service is still running

I'm using a Google health check to send requests to my Flask client to make sure my service is alive.
The same route in the Flask client sends requests to two more Flask clients to make sure those two are also alive.
For some reason the request sometimes fails while the service is still running.
I tried to figure out why, but there is nothing in my services' logs that indicates that something happened, and in most cases it works fine.
This is my code:
#GET /health_check//
def get(self):
    try:
        for service in INTERNAL_SERVICES_HEALTH_CHECKS:
            client = getattr(all_clients, service + '_client')
            response = client.get('g_health_check')
    except Exception as e:
        sentry_client.captureMessage('health check failed for ' + env + ' environment. error log:' + repr(e))
        return output_json({'I\'m Not fine!': False}, requests.codes.server_error)
    return output_json({'I\'m fine!': True}, requests.codes.ok)
If anyone has any suggestions I will be happy to try and fix it.

Flask JSON request is None

I'm working on my first Flask app (version 0.10.1), and also my first Python (version 3.5) app. One of its pieces needs to work like this:
Submit a form
Run a Celery task (which makes some third-party API calls)
When the Celery task's API calls complete, send a JSON post to another URL in the app
Get that JSON data and update a database record with it
Here's the relevant part of the Celery task:
if not response['errors']: # response comes from the Salesforce API call
    # do something to notify that the task was finished successfully
    message = {'flask_id' : flask_id, 'sf_id' : response['id']}
    message = json.dumps(message)
    print('call endpoint now and update it')
    res = requests.post('http://0.0.0.0:5000/transaction_result/', json=message)
And here's the endpoint it calls:
@app.route('/transaction_result/', methods=['POST'])
def transaction_result():
    result = jsonify(request.get_json(force=True))
    print(result.flask_id)
    return result.flask_id
So far I'm just trying to get the data and print the ID, and I'll worry about the database after that.
The error I get though is this: requests.exceptions.ConnectionError: None: Max retries exceeded with url: /transaction_result/ (Caused by None)
My reading indicates that my data might not be coming over as JSON, hence the Force=True on the result, but even this doesn't seem to work. I've also tried doing the same request in CocoaRestClient, with a Content-Type header of application/json, and I get the same result.
Because both of these attempts break, I can't tell if my issue is in the request or in the attempt to parse the response.
First of all, request.get_json(force=True) returns the parsed object (or None if silent=True and parsing fails). jsonify goes the other way: it serializes an object to JSON and wraps it in a response, so result has no flask_id attribute and result.flask_id cannot work. However, even after removing the redundant jsonify call, you'll have to change result.flask_id to result['flask_id'].
So, eventually the code should look like this:
@app.route('/transaction_result/', methods=['POST'])
def transaction_result():
    result = request.get_json()
    return result['flask_id']
And you are absolutely right to use a REST client to test the route. It greatly simplifies testing by reducing the number of parts involved. One well-known problem when sending requests from a Flask app to the same app is running the app under the development server with only one thread. In that case the internal request is always blocked, because the current thread is serving the outer request and cannot handle the internal one. However, since you are sending the request from a Celery task, that is unlikely to be your scenario.
UPD: Finally, the last cause was the IP address 0.0.0.0. Changing it to the real one solved the problem.
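One more thing worth checking in the Celery task: the message is passed through json.dumps and then handed to the json= argument, which, as far as I know, serializes its value again. The sketch below imitates that with the standard json module only. It does not use requests or Flask, and the sample values are made up. It suggests the endpoint would receive a JSON string rather than an object, in which case result['flask_id'] would fail.

```python
import json

message = {"flask_id": 7, "sf_id": "001xx"}   # made-up sample values

# Body sent when the dict itself is passed as json=message:
body_ok = json.dumps(message)
# Body sent when an already dumped string is passed as json=json.dumps(message):
body_double = json.dumps(json.dumps(message))

# What parsing each body gives back on the receiving side:
parsed_ok = json.loads(body_ok)
parsed_double = json.loads(body_double)

print(type(parsed_ok).__name__, parsed_ok["flask_id"])
print(type(parsed_double).__name__)
```

If that is what happens in your setup, dropping the explicit json.dumps call and passing the dict directly to json= should be enough.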
