I have a Flask web app in which I want to keep a persistent connection to an AWS Neptune graph database. This connection is established as follows:
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
neptune_endpt = 'db-instance-x.xxxxxxxxxx.xx-xxxxx-x.neptune.amazonaws.com'
remoteConn = DriverRemoteConnection(f'wss://{neptune_endpt}:8182/gremlin','g')
self.g = traversal().withRemote(remoteConn)
The issue I'm facing is that the connection automatically drops off if left idle, and I cannot find a way to detect if the connection has dropped off (so that I can reconnect by using the code snippet above).
I have seen this similar issue: "Gremlin server withRemote connection closed - how to reconnect automatically?", but that question has no solution either, and neither does this other similar question.
I've tried the following two solutions (both of which did not work):
I set up my web app behind four Gunicorn workers with a timeout of 100 seconds, hoping that worker restarts would take care of the Gremlin timeouts.
I tried catching exceptions to detect if the connection has dropped off. Every time I use self.g to do some traversal on my graph, I try to "refresh" the connection, by which I mean this:
def _refresh_neptune(self):
    try:
        self.g = traversal().withRemote(self.conn)
    except:
        self.conn = DriverRemoteConnection(f'wss://{neptune_endpt}:8182/gremlin', 'g')
        self.g = traversal().withRemote(self.conn)
Here self.conn was initialized as:
self.conn = DriverRemoteConnection(f'wss://{neptune_endpt}:8182/gremlin','g')
Is there any way to get around this connection error?
Thanks
Update: Added the error message below:
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/process/traversal.py
", line 58, in toList
return list(iter(self))
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/process/traversal.py
", line 48, in __next__
self.traversal_strategies.apply_strategies(self)
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/process/traversal.py
", line 573, in apply_strategies
traversal_strategy.apply(traversal)
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/driver/remote_connec
tion.py", line 149, in apply
remote_traversal = self.remote_connection.submit(traversal.bytecode)
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/driver/driver_remote
_connection.py", line 56, in submit
results = result_set.all().result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/driver/resultset.py"
, line 90, in cb
f.result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/driver/connection.py
", line 83, in _receive
status_code = self._protocol.data_received(data, self._results)
File "/home/ubuntu/.virtualenvs/rundev/lib/python3.6/site-packages/gremlin_python/driver/protocol.py",
line 81, in data_received
'message': 'Server disconnected - please try to reconnect', 'attributes': {}})
gremlin_python.driver.protocol.GremlinServerError: 500: Server disconnected - please try to reconnect
I am not sure that this is the best way to solve this, but I'm also using gremlin-python and Neptune and I've had the same issue. I worked around it by implementing a Transport that you can provide to DriverRemoteConnection.
DriverRemoteConnection(
    url=endpoint,
    traversal_source=self._traversal_source,
    transport_factory=Transport
)
gremlin-python returns connections to the pool on exception, and the exception raised when a connection is closed is GremlinServerError, which is also raised for other errors:
gremlin_python/driver/connection.py#L69
gremlin_python/driver/protocol.py#L80
The custom transport is the same as gremlin-python's TornadoTransport but the read and write methods are extended to:
Reopen closed connections, if the web socket client is closed
Raise a StreamClosedError, if the web socket client returns None from read_message
Dead connections that are added back to the pool can then be reopened, and you can handle the StreamClosedError to apply some retry logic. I did it by overriding the submit and submitAsync methods in DriverRemoteConnection (see the retry sketch after the transport code below), but you could catch and retry anywhere.
from gremlin_python.driver.transport import AbstractBaseTransport
from tornado import httpclient, ioloop, websocket
from tornado.iostream import StreamClosedError


class Transport(AbstractBaseTransport):
    def __init__(self):
        self._ws = None
        self._loop = ioloop.IOLoop(make_current=False)
        self._url = None

        # Because the transport will try to reopen the underlying ws connection,
        # track if the close() method has been called to prevent the transport
        # from reopening.
        self._explicit_closed = True

    @property
    def closed(self):
        return not self._ws.protocol

    def connect(self, url, headers=None):
        self._explicit_closed = False

        # Set the endpoint URL
        self._url = httpclient.HTTPRequest(url, headers=headers) if headers else url

        # Open the connection
        self._connect()

    def write(self, message):
        # Before writing, try to ensure that the connection is open.
        if self.closed:
            self._connect()

        self._loop.run_sync(lambda: self._ws.write_message(message, binary=True))

    def read(self):
        result = self._loop.run_sync(self._ws.read_message)

        # If the read call returns None, the stream has closed.
        if result is None:
            self._ws.close()  # Ensure we close the stream
            raise StreamClosedError()

        return result

    def close(self):
        self._ws.close()
        self._loop.close()
        self._explicit_closed = True

    def _connect(self):
        # If close() was called explicitly on the transport, don't allow
        # subsequent calls to write() to reopen the connection.
        if self._explicit_closed:
            raise TransportClosedError(
                "Transport has been closed and can not be reopened."
            )

        # Check if the ws is closed; if it is not, close it.
        if self._ws and not self.closed:
            self._ws.close()

        # Open the ws connection
        self._ws = self._loop.run_sync(
            lambda: websocket.websocket_connect(url=self._url)
        )


class TransportClosedError(Exception):
    pass
This will work with gremlin-python's connection pooling as well.
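For the retry part mentioned above, here is a rough sketch of what overriding submit in DriverRemoteConnection could look like. It assumes the StreamClosedError raised by the transport above, and the method name matches the gremlin-python version I was using; treat it as an outline rather than a drop-in implementation.

from tornado.iostream import StreamClosedError
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection


class RetryingRemoteConnection(DriverRemoteConnection):
    MAX_RETRIES = 2  # illustrative value; tune for your workload

    def submit(self, bytecode):
        # The transport raises StreamClosedError when it finds the socket dead;
        # the next attempt reopens the underlying ws and replays the traversal.
        last_error = None
        for _ in range(self.MAX_RETRIES + 1):
            try:
                return super(RetryingRemoteConnection, self).submit(bytecode)
            except StreamClosedError as e:
                last_error = e
        raise last_error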
If you don't need pooling, an alternate approach is to set the pool size to 1 and implement some form of keep-alive, as discussed in TINKERPOP-2352.
It looks like the web socket ping/keep-alive in gremlin-python is not implemented yet (TINKERPOP-1886).
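Until that lands, a do-it-yourself keep-alive can be as simple as a background thread that runs a cheap traversal on an interval so the socket never sits idle. A minimal sketch, assuming g is the traversal source from the question; the 60-second interval and the plain daemon thread are my own choices, not anything gremlin-python provides, so check that it plays nicely with how your app shares the client across threads.

import threading
import time


def start_keepalive(g, interval=60):
    def ping():
        while True:
            time.sleep(interval)
            try:
                # Trivial traversal that touches the connection without hitting the graph.
                g.inject(0).iterate()
            except Exception:
                pass  # let the normal reconnect/retry path handle failures

    t = threading.Thread(target=ping, daemon=True)
    t.start()
    return t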
Related
I'm trying to use a Redis cache with my Python code. The code below works fine and sets the keys perfectly, but I want to set a timeout for when it's not able to connect to Redis or the port is not open.
Unfortunately, I could not find any documentation on how to pass a timeout in the connection parameters.
Following is my code.
from flask import Flask, render_template
from flask_caching import Cache

app = Flask(__name__, static_url_path='/static')

config = {
    "DEBUG": True,
    "CACHE_TYPE": "redis",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "inventory",
    "CACHE_REDIS_HOST": "localhost",
    "CACHE_REDIS_PORT": "6379",
    "CACHE_REDIS_URL": 'redis://localhost:6379'
}

cache = Cache(app, config=config)
socket_timeout = 5

@app.route('/')
@cache.memoize()
def dev():
    # some code
    return render_template("index.html", data=json_data, columns=columns)
When it's not able to connect, it waits for a long time and then throws the following error:
Traceback (most recent call last):
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/flask_caching/__init__.py", line 771, in decorated_function
f, *args, **kwargs
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/flask_caching/__init__.py", line 565, in make_cache_key
f, args=args, timeout=_timeout, forced_update=forced_update
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/flask_caching/__init__.py", line 524, in _memoize_version
version_data_list = list(self.cache.get_many(*fetch_keys))
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/flask_caching/backends/rediscache.py", line 101, in get_many
return [self.load_object(x) for x in self._read_clients.mget(keys)]
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/redis/client.py", line 1329, in mget
return self.execute_command('MGET', *args, **options)
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/redis/client.py", line 772, in execute_command
connection = pool.get_connection(command_name, **options)
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/redis/connection.py", line 994, in get_connection
connection.connect()
File "/Users/amjad/.virtualenvs/inventory/lib/python3.7/site-packages/redis/connection.py", line 497, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 60 connecting to localhost:6379. Operation timed out.
Thanks in advance.
This question is fairly old, but I came across this exact problem just now and found a solution. Leaving it here for posterity and future readers.
According to the documentation at https://flask-caching.readthedocs.io/en/latest/index.html, the CACHE_TYPE parameter:
Specifies which type of caching object to use. This is an import string that will be imported and instantiated. It is assumed that the import object is a function that will return a cache object that adheres to the cache API.
So make a modified version of their redis function, found in flask_caching.backends.cache like so:
def redis_with_timeout(app, config, args, kwargs):
    try:
        from redis import from_url as redis_from_url
    except ImportError:
        raise RuntimeError("no redis module found")

    # [... extra lines skipped for brevity ...]

    # kwargs set here are passed through to the underlying Redis client
    kwargs["socket_connect_timeout"] = 0.5
    kwargs["socket_timeout"] = 0.5

    return RedisCache(*args, **kwargs)
And use it instead of the default redis like so:
CACHE_TYPE = 'path.to.redis_with_timeout'
And the library will use that one instead, with the custom kwargs passed into the underlying Redis client. Hope that helps.
From the latest documentation, there is a CACHE_OPTIONS config value that is passed to almost every type of cache backend as keyword arguments:
Entries in CACHE_OPTIONS are passed to the redis client as **kwargs
We can simply pass additional settings like this:
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)

config = {
    "CACHE_TYPE": "redis",
    ...
    "CACHE_REDIS_HOST": "localhost",
    "CACHE_REDIS_PORT": "6379",
    "CACHE_REDIS_URL": 'redis://localhost:6379',
    "CACHE_OPTIONS": {
        "socket_connect_timeout": 5,  # connection timeout in seconds
        "socket_timeout": 5,          # send/recv timeout in seconds
    }
}

cache = Cache(app, config=config)
I'm using the requests library to get a lot of webpages from somewhere. Here's the pertinent code:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

response = requests.Session()
retries = Retry(total=5, backoff_factor=.1)
response.mount('http://', HTTPAdapter(max_retries=retries))
response = response.get(url)
After a while it just hangs/freezes (never on the same webpage) while getting the page. Here's the traceback when I interrupt it:
File "/Users/Student/Hockey/Scrape/html_pbp.py", line 21, in get_pbp
response = r.read().decode('utf-8')
File "/anaconda/lib/python3.6/http/client.py", line 456, in read
return self._readall_chunked()
File "/anaconda/lib/python3.6/http/client.py", line 566, in _readall_chunked
value.append(self._safe_read(chunk_left))
File "/anaconda/lib/python3.6/http/client.py", line 612, in _safe_read
chunk = self.fp.read(min(amt, MAXAMOUNT))
File "/anaconda/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
KeyboardInterrupt
Does anybody know what could be causing it? Or (more importantly) does anybody know a way to stop it if it takes more than a certain amount of time so that I could try again?
Seems like setting a (read) timeout might help you.
Something along the lines of:
response = response.get(url, timeout=5)
(This will set both connect and read timeout to 5 seconds.)
In requests, unfortunately, neither connect nor read timeouts are set by default, even though the docs say it's good to set one:
Most requests to external servers should have a timeout attached, in case the server is not responding in a timely manner. By default, requests do not time out unless a timeout value is set explicitly. Without a timeout, your code may hang for minutes or more.
Just for completeness, the connect timeout is the number of seconds requests will wait for your client to establish a connection to a remote machine, and the read timeout is the number of seconds the client will wait between bytes sent from the server.
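If you want different values for the two, requests also accepts a (connect, read) tuple instead of a single number. Using the names from your snippet (where response is still the Session at this point):

# A single number sets both timeouts; a tuple lets you split them.
response = response.get(url, timeout=(3.05, 27))  # 3.05s to connect, up to 27s between bytes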
Patching the documented "send" function will fix this for all requests - even in many dependent libraries and SDKs. When patching libs, be sure to patch supported/documented functions, otherwise you may wind up silently losing the effect of your patch.
import requests

DEFAULT_TIMEOUT = 180

old_send = requests.Session.send

def new_send(*args, **kwargs):
    if kwargs.get("timeout", None) is None:
        kwargs["timeout"] = DEFAULT_TIMEOUT
    return old_send(*args, **kwargs)

requests.Session.send = new_send
The effects of not having any timeout are quite severe, and the use of a default timeout can almost never break anything - because TCP itself has timeouts as well.
On Windows the default TCP timeout is 240 seconds, and the TCP RFC recommends a minimum of 100 seconds for RTO*retry. Somewhere in that range is a safe default.
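With the patch above in place, code that never specifies a timeout now inherits the default, while explicit values still win; for example (the URL is just a placeholder):

requests.get("https://example.com")              # falls back to DEFAULT_TIMEOUT (180s)
requests.get("https://example.com", timeout=5)   # an explicit timeout is left untouched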
To set timeout globally instead of specifying in every request:
import os

import requests
from requests.adapters import TimeoutSauce

REQUESTS_TIMEOUT_SECONDS = float(os.getenv("REQUESTS_TIMEOUT_SECONDS", 5))

class CustomTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        if kwargs["connect"] is None:
            kwargs["connect"] = REQUESTS_TIMEOUT_SECONDS
        if kwargs["read"] is None:
            kwargs["read"] = REQUESTS_TIMEOUT_SECONDS
        super().__init__(*args, **kwargs)

# Set it globally, instead of specifying ``timeout=..`` kwarg on each call.
requests.adapters.TimeoutSauce = CustomTimeout

sess = requests.Session()
sess.get(...)
sess.post(...)
I've been trying to integrate event streaming into my Flask application for the past few days, with good results in local testing but worse results when running the application with uWSGI on my server. My code is basically built upon the example from flask. I'm using python 3.4.2.
The problem
When running the app on my uWSGI server, it raises gevent.hub.LoopExit: 'This operation would block forever'. whenever a client tries connecting to the /streaming endpoint. My assumption is that this is caused by calling get() on an empty queue indefinitely.
Full traceback:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/werkzeug/wsgi.py", line 691, in __next__
return self._next()
File "/usr/lib/python3/dist-packages/werkzeug/wrappers.py", line 81, in _iter_encoded
for item in iterable:
File "./voting/__init__.py", line 49, in gen
result = queue.get(block=True)
File "/usr/local/lib/python3.4/dist-packages/gevent/queue.py", line 284, in get
return self.__get_or_peek(self._get, block, timeout)
File "/usr/local/lib/python3.4/dist-packages/gevent/queue.py", line 261, in __get_or_peek
result = waiter.get()
File "/usr/local/lib/python3.4/dist-packages/gevent/hub.py", line 878, in get
return self.hub.switch()
File "/usr/local/lib/python3.4/dist-packages/gevent/hub.py", line 609, in switch
return greenlet.switch(self)
gevent.hub.LoopExit: ('This operation would block forever', <Hub at 0x7f717f40f5a0 epoll default pending=0 ref=0 fileno=6>)
My code
The /streaming endpoint:
#app.route("/streaming", methods=["GET", "OPTIONS"])
def streaming():
def gen():
queue = Queue()
subscriptions.add_subscription(session_id, queue)
try:
while True:
result = queue.get() # Where the Exception is raised
ev = ServerSentEvent(json.dumps(result["data"]), result["type"])
yield ev.encode()
except GeneratorExit: # TODO Need a better method to detect disconnecting
subscriptions.remove_subscription(session_id, queue)
return Response(gen(), mimetype="text/event-stream")
Adding an event to the queue:
def notify():
    msg = {"type": "users", "data": db_get_all_registered(session_id)}
    subscriptions.add_item(session_id, msg)  # Adds the item to the relevant queues.

gevent.spawn(notify)
As previously said, it runs fine locally with werkzeug:
from app import app
from gevent.wsgi import WSGIServer
from werkzeug.debug import DebuggedApplication
a = DebuggedApplication(app, evalex=True)
server = WSGIServer(("", 5000), a)
server.serve_forever()
What I've tried
Monkey-patching with monkey.patch_all().
Switching from Queue to JoinableQueue.
gevent.sleep(0) in combination with Queue.get().
That exception basically means that there are no other greenlets running in that loop/thread to switch to. So when the greenlet goes to block (queue.get()), the hub has nowhere else to go, nothing else to do.
The same code would work in gevent's WSGIServer because the server itself is a greenlet that's running the socket.accept loop, so there's always another greenlet to switch to. But apparently uwsgi doesn't work that way.
The way to fix this is to arrange for there to be other greenlets running. For example, instead of spawning a greenlet to notify on demand, arrange for such a greenlet to already be running and blocking on its own queue.
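A rough sketch of that arrangement, reusing the names from the question (subscriptions, db_get_all_registered and session_id are the asker's objects): one long-lived notifier greenlet is started at startup and blocks on its own queue, so the hub always has another greenlet to switch to, and request handlers only enqueue work.

import gevent
from gevent.queue import Queue

notify_queue = Queue()

def notifier():
    # Runs for the life of the process; this is the greenlet that blocks,
    # instead of spawning a new one per request.
    while True:
        session_id = notify_queue.get()
        msg = {"type": "users", "data": db_get_all_registered(session_id)}
        subscriptions.add_item(session_id, msg)

gevent.spawn(notifier)  # started once, at import/startup time

# Request handlers then just enqueue work instead of spawning:
# notify_queue.put(session_id)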
I'm trying to create a duplicate of a django app I wrote, hosted on a different VPS provider with up-to-date everything (Ubuntu 16.04, django 1.9.5, python 3.5). It was successfully deployed using the previous version of everything in the stack (Ubuntu 15.1, django 1.9.4, python 3.4).
I've got a problem with the WSGI content, which I narrowed down to this obscure error when running the development server ./manage.py runserver 0.0.0.0:8000 (below is a GET request to /login, but the error is the same with a fake URL not matched in urls.py):
[04/May/2016 09:33:54] "GET /login HTTP/1.1" 200 0
Traceback (most recent call last):
File "/usr/lib/python3.5/wsgiref/handlers.py", line 138, in run
self.finish_response()
File "/usr/lib/python3.5/wsgiref/handlers.py", line 180, in finish_response
self.write(data)
File "/usr/lib/python3.5/wsgiref/handlers.py", line 266, in write
"write() argument must be a bytes instance"
AssertionError: write() argument must be a bytes instance
I gather that this would be an encoding error, but why does it occur, and how can I fix it? Looking at the source of handlers.py I can't see why finish_response data would have incorrect encoding. I've copied the three functions referenced in the error (with the relevant lines marked) for convenience:
def run(self, application):
    """Invoke the application"""
    # Note to self: don't move the close()! Asynchronous servers shouldn't
    # call close() from finish_response(), so if you close() anywhere but
    # the double-error branch here, you'll break asynchronous servers by
    # prematurely closing. Async servers must return from 'run()' without
    # closing if there might still be output to iterate over.
    try:
        self.setup_environ()
        self.result = application(self.environ, self.start_response)
        ###line 138###
        self.finish_response()
    except:
        try:
            self.handle_error()
        except:
            # If we get an error handling an error, just give up already!
            self.close()
            raise  # ...and let the actual server figure it out.

def finish_response(self):
    """Send any iterable data, then close self and the iterable

    Subclasses intended for use in asynchronous servers will
    want to redefine this method, such that it sets up callbacks
    in the event loop to iterate over the data, and to call
    'self.close()' once the response is finished.
    """
    try:
        if not self.result_is_file() or not self.sendfile():
            for data in self.result:
                ###line 180###
                self.write(data)
            self.finish_content()
    finally:
        self.close()

def write(self, data):
    """'write()' callable as specified by PEP 3333"""
    assert type(data) is bytes, \
        ###line 266###
        "write() argument must be a bytes instance"
    if not self.status:
        raise AssertionError("write() before start_response()")
    elif not self.headers_sent:
        # Before the first output, send the stored headers
        self.bytes_sent = len(data)  # make sure we know content-length
        self.send_headers()
    else:
        self.bytes_sent += len(data)
    # XXX check Content-Length and truncate if too many bytes written?
    self._write(data)
    self._flush()
I had (maybe) the same issue, and it was not related to the inner wsgiref code: your response must be a list. If it is not a list, that very error is triggered. Here is an example from the wsgiref docs:
from wsgiref.util import setup_testing_defaults
from wsgiref.simple_server import make_server

# A relatively simple WSGI application. It's going to print out the
# environment dictionary after being updated by setup_testing_defaults
def simple_app(environ, start_response):
    setup_testing_defaults(environ)

    status = '200 OK'
    headers = [('Content-type', 'text/plain; charset=utf-8')]

    start_response(status, headers)

    # return b'fail'  <- triggers: write() argument must be a bytes instance
    return [b'win']

with make_server('', 8000, simple_app) as httpd:
    print("Serving on port 8000...")
    httpd.serve_forever()
I've got a Pyro4 daemon going which I would like to have return a connection to LDAP (instantiated by the python-ldap module). The code is short and simple, but I run into an error with (I believe) serialization of the connection object upon my attempt to return the connection to the client script.
import os

import ldap
import Pyro4

class LDAPDaemon(object):
    def get_ldap_connection(self):
        conn = ldap.initialize("ldap://ds1")
        conn.simple_bind_s("cn=Directory Manager", "abc123")
        return conn

daemon = Pyro4.Daemon(unixsocket="/tmp/ldap_unix.sock")
os.system("chmod 700 /tmp/ldap_unix.sock")
uri = daemon.register(LDAPDaemon(), "LDAPDaemon")
daemon.requestLoop()
Then in my driver script, I have the following (assume uri is known, cut all that out for brevity's sake):
with Pyro4.Proxy(uri) as ldap_daemon:
    conn = ldap_daemon.get_ldap_connection()
This results in the following error:
Traceback (most recent call last):
File "./tester.py", line 14, in <module>
conn = ldap_daemon.get_ldap_connection()
File "/opt/csw/lib/python2.6/site-packages/Pyro4/core.py", line 160, in __call__
return self.__send(self.__name, args, kwargs)
File "/opt/csw/lib/python2.6/site-packages/Pyro4/core.py", line 318, in _pyroInvoke
raise data
AttributeError: __class__
I tried changing the Pyro4 configuration to accept different serializers, i.e.:
Pyro4.config.SERIALIZERS_ACCEPTED = set(['json', 'marshal', 'serpent', 'pickle'])
but that didn't change anything.
Please ignore the glaring security holes as this was dumbed down to the most basic code to produce the error.
You guessed right. The LDAPObject is not serializable.
Arguments passed to a remote object and the return values of its methods are serialized and then sent through a socket; objects that are not serializable will cause errors. You should consider User's comment: create a proxy for the connection instead of sending it to the other process, or find a way to serialize it.
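A minimal sketch of that proxy idea: keep the LDAPObject inside the daemon and expose only methods whose arguments and return values are plain, serializable types. The search method and its parameters here are illustrative, not part of your existing API.

import ldap
import Pyro4

class LDAPDaemon(object):
    def __init__(self):
        self._conn = None

    def _connection(self):
        # Lazily create and reuse the connection inside the daemon process;
        # the connection object itself never crosses the Pyro boundary.
        if self._conn is None:
            self._conn = ldap.initialize("ldap://ds1")
            self._conn.simple_bind_s("cn=Directory Manager", "abc123")
        return self._conn

    def search(self, base_dn, filterstr="(objectClass=*)"):
        # search_s returns a list of (dn, attrs) tuples -- plain data Pyro can serialize.
        return self._connection().search_s(base_dn, ldap.SCOPE_SUBTREE, filterstr)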