While researching some strange issues with my Python web application (in particular, issues regarding MongoDB connectivity), I noticed something on the official PyMongo documentation page. My web application uses Flask, but this shouldn't influence the issue I'm facing.
The PyMongo driver does connection pooling, but it also throws an exception (AutoReconnect) when a connection is stale and a reconnect is due.
It states that (regarding the AutoReconnect exception):
In order to auto-reconnect you must handle this exception, recognizing
that the operation which caused it has not necessarily succeeded.
Future operations will attempt to open a new connection to the
database (and will continue to raise this exception until the first
successful connection is made).
I have noticed that this actually happens constantly (and it doesn't seem to be an error). Connections are closed by the MongoDB server after what seems like several minutes of inactivity, and need to be recreated by the web application.
What I don't understand it why the PyMongo driver throws an error when it reconnects (which the user of the driver needs to handle themselves), instead of doing it transparently. (There could even be an option a user could set so that AutoReconnect exceptions do get thrown, but wouldn't a sensible default be that these exceptions don't get thrown at all, and the connections are recreated seamlessly?)
I have never encountered this behavior using other database systems, which is why I'm a bit confused.
It's also worth mentioning that my web application's MongoDB connections never fail when connecting to my local development MongoDB server (I assume it would have something to do with the fact that it's a local connection, and that the connection is done through a UNIX socket instead of a network socket, but I could be wrong).
You're misunderstanding AutoReconnect. It is raised when the driver attempts to communicate with the server (to send a command or other operation) and a network failure or similar problem occurs. The name of the exception is meant to communicate that you do not have to create a new instance of MongoClient, the existing client will attempt to reconnect automatically when your application tries the next operation. If the same problem occurs, AutoReconnect is raised again.
I suspect the reason you are seeing sockets timeout (and AutoReconnect being raised) is that there is a load balancer between the server and your application that closes connections after some period of inactivity. For example, this apparently happens on Microsoft's Azure platform after 13 minutes of no activity on a socket. You might be able to fix this by using the socketKeepAlive option, added in PyMongo 2.8. Note that you will also have to set the keepalive interval on your application server to an appropriate value (the default on Linux is 2 hours). See here for more information.
Related
I wonder how does Postgres sever determine to close a DB connection, if I forgot at the Python source code side.
Does the Postgres server send a ping to the source code? From my understanding, this is not possible.
PostgreSQL indeed does something like that, although it is not a ping.
PostgreSQL uses a TCP feature called keepalive. Once enabled for a socket, the operating system kernel will regularly send keepalive messages to the other party (the peer), and if it doesn't get an answer after a couple of tries, it closes the connection.
The default timeouts for keepalive are pretty long, in the vicinity of two hours. You can configure the settings in PostgreSQL, see the documentation for details.
The default values and possible values vary according to the operating system used.
There is a similar feature available for the client side, but it is less useful and not enabled by default.
When your script quits your connection will close and the server will clean it up accordingly. Likewise, it's often the case in garbage collected languages like Python that when you stop using the connection and it falls out of scope it will be closed and cleaned up.
It is possible to write code that never releases these resources properly, that just perpetually creates new handles, something that can be problematic if you don't have something server-side that handles killing these after some period of idle time. Postgres doesn't do this by default, though it can be configured to, but MySQL does.
In short Postgres will keep a database connection open until you kill it either explicitly, such as via a close call, or implicitly, such as the handle falling out of scope and being deleted by the garbage collector.
I'm uploading hundreds of millions of items to my database via a REST API from a cloud server on Heroku to a database in AWS EC2. I'm using Python and I am constantly seeing the following INFO log message in the logs.
[requests.packages.urllib3.connectionpool] [INFO] Resetting dropped connection: <hostname>
This "resetting of the dropped connection" seems to take many seconds (sometimes 30+ sec) before my code continues to execute again.
Firstly what exactly is happening here and why?
Secondly is there a way to stop the connection from dropping so that I am able to upload data faster?
Thanks for your help.
Andrew.
Requests uses Keep-Alive by default. Resetting dropped connection, from my understanding, means a connection that should be alive was dropped somehow. Possible reasons are:
Server doesn't support Keep-Alive.
There's no data transfer in established connections for a while, so server drops connections.
See https://stackoverflow.com/a/25239947/2142577 for more details.
The problem is really that the server has closed the connection even though the client has requested it be kept alive.
This is not necessarily because the server doesn't support keepalives, but could be that the server is configured to only allow a certain number of requests on a connection. This could be done to help spread out requests on different servers, but I think this practice is/was common as a practical defence against poorly written code that operates in the server (eg. PHP) that doesn't clean up after itself after serving a request (perhaps due to an error condition etc.)
If you think this is the case for you and you'd like to not see these logs (which are logged at INFO level), then you can add the following to quieten that part of the logging:
# Really don't need to hear about connections being brought up again after server has closed it
logging.getLogger("requests.packages.urllib3.connectionpool").setLevel(logging.WARNING)
This is common practice for services that expose RESTful APIs to avoid abuse (or DoS).
If you're stressing their API they'll drop your connection.
Try getting your script to sleep a bit every once in a while to avoid the drop.
I have code that writes files to s3. The code was working fine
conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(BUCKET, validate=False)
k = Key(bucket)
k.key = self.filekey
k.set_metadata('Content-Type', 'text/javascript')
k.set_contents_from_string(json.dumps(self.output))
k.set_acl(FILE_ACL)
This was working just fine. Then I noticed I wasn't closing my connection so I added this line at the end:
conn.close()
Now, the file writes as before but, I'm seeing this error in my logs now
S3Connection instance has no attribute '_cache', unable to write file
Anyone see what I'm doing wrong here or know what's causing this? I noticed that none of the tutorials on boto show people closing connections but I know you should close your connections for IO operations as a general rule...
EDIT
A note about this, when I comment out conn.close() the error disappears
I can't find that error message in the latest boto source code, so unfortunately I can't tell you what caused it. Recently, we had problems when we were NOT calling conn.close(), so there definitely is at least one case where you must close the connection. Here's my understanding of what's going on:
S3Connection (well, its parent class) handles almost all connectivity details transparently, and you shouldn't have to think about closing resource, reconnecting, etc.. This is why most tutorials and docs don't mention closing resources. In fact, I only know of one situation where you should close resources explicitly, which I describe at the bottom. Read on!
Under the covers, boto uses httplib. This client library supports HTTP 1.1 Keep-Alive, so it can and should keep the socket open so that it can perform multiple requests over the same connection.
AWS will close your connection (socket) for two reasons:
According to the boto source code, "AWS starts timing things out after three minutes." Presumably "things" means "idle connections."
According to Best Practices for Using Amazon S3, "S3 will accept up to 100 requests before it closes a connection (resulting in 'connection reset')."
Fortunately, boto works around the first case by recycling stale connections well before three minutes are up. Unfortunately, boto doesn't handle the second case quite so transparently:
When AWS closes a connection, your end of the connection goes into CLOSE_WAIT, which means that the socket is waiting for the application to execute close(). S3Connection handles connectivity details so transparently that you cannot actually do this directly! It's best to prevent it from happening in the first place.
So, circling back to the original question of when you need to close explicitly, if your application runs for a long time, keeps a reference to (reuses) a boto connection for a long time, and makes many boto S3 requests over that connection (thus triggering a "connection reset" on the socket by AWS), then you may find that more and more sockets are in CLOSE_WAIT. You can check for this condition on linux by calling netstat | grep CLOSE_WAIT. To prevent this, make an explicit call to boto's connection.close before you've made 100 requests. We make hundreds of thousands of S3 requests in a long running process, and we call connection.close after every, say, 80 requests.
After accessing my web app using:
- Python 2.7
- the Bottle micro framework v. 0.10.6
- Apache 2.2.22
- mod_wsgi
- on Ubuntu Server 12.04 64bit; I'm receiving this error after several hours:
OperationalError: (2006, 'MySQL server has gone away')
I'm using MySQL - the native one included in Python. It usually happens when I don't access the server. I've tried closing all the connections, which I do, using this:
cursor.close()
db.close()
where db is the standard MySQLdb.Connection() call.
The my.cnf file looks something like this:
key_buffer = 16M
max_allowed_packet = 128M
thread_stack = 192K
thread_cache_size = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
#max_connections = 100
#table_cache = 64
#thread_concurrency = 10
It is the default configuration file except max_allowed_packet is 128M instead of 16M.
The queries to the database are quite simple, at most they retrieve approximately 100 records.
Can anyone help me fix this? One idea I did have was use try/except but I'm not sure if that would actually work.
Thanks in advance,
Jamie
Update: try/except calls didn't work.
This is MySQL error, not Python's.
The list of possible causes and possible solutions is here: MySQL 5.5 Reference Manual: C.5.2.9. MySQL server has gone away.
Possible causes include:
You tried to run a query after closing the connection to the server. This indicates a logic error in the application that should be corrected.
A client application running on a different host does not have the necessary privileges to connect to the MySQL server from that host.
You have encountered a timeout on the server side and the automatic reconnection in the client is disabled (the reconnect flag in the MYSQL structure is equal to 0).
You can also get these errors if you send a query to the server that is incorrect or too large. If mysqld receives a packet that is too large or out of order, it assumes that something has gone wrong with the client and closes the connection. If you need big queries (for example, if you are working with big BLOB columns), you can increase the query limit by setting the server's max_allowed_packet variable, which has a default value of 1MB. You may also need to increase the maximum packet size on the client end. More information on setting the packet size is given in Section C.5.2.10, “Packet too large”.
You also get a lost connection if you are sending a packet 16MB or larger if your client is older than 4.0.8 and your server is 4.0.8 and above, or the other way around.
and so on...
In other words, there are plenty of possible causes. Go through that list and check every possible cause.
Make sure you are not trying to commit to a closed MySqldb object
An answer to a (very closely related) question has been posted here: https://stackoverflow.com/a/982873/209532
It relates directly to the MySQLdb driver (MySQL-python (unmaintained) and mysqlclient (maintained fork)), but the approach is the the same for other driver the does not support automatic reconnect.
For me this was fixed using
MySQLdb.connect("127.0.0.1","root","","db" )
instead of
MySQLdb.connect("localhost","root","","db" )
and then
df.to_sql('df',sql_cnxn,flavor='mysql',if_exists='replace', chunksize=100)
I'm doing something fairly outside of my comfort zone here, so hopefully I'm just doing something stupid.
I have an Amazon EC2 instance which I'm using to run a specialized database, which is controlled through a webapp inside of Tomcat that provides a REST API. On the same server, I'm running a Python script that uses the Requests library to make hundreds of thousands of simple queries to the database (I don't think it's possible to consolidate the queries, though I am going to try that next.)
The problem: after running the script for a bit, I suddenly get a broken pipe error on my SSH terminal. When I try to log back in with SSH, I keep getting "operation timed out" errors. So I can't even log back in to terminate the Python process and instead have to reboot the EC2 instance (which is a huge pain, especially since I'm using ephemeral storage)
My theory is that each time requests makes a REST call, it activates a pair of ports between Python and Tomcat, but that it never closes the ports when it's done. So python keeps trying to grab more and more ports and eventually either somehow grabs away and locks the SSH port (booting me off), or it just uses all the ports and that causes the network system to crap out somehow (as I said, I'm out of my depth.)
I also tried using httplib2, and was getting a similar problem.
Any ideas? If my port theory is correct, is there a way to force requests to surrender the port when it's done? Or otherwise is there at least a way to tell Ubuntu to keep the SSH port off-limits so that I can at least log back in and terminate the process?
Or is there some sort of best practice to using Python to make lots and lots of very simple REST calls?
Edit:
Solved...do:
s = requests.session()
s.config['keep_alive'] = False
Before making the request to force Requests to release connections when it's done.
My speculation:
https://github.com/kennethreitz/requests/blob/develop/requests/models.py#L539 sets conn to connectionpool.connection_from_url(url)
That leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L562, which leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L167.
This eventually leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L185:
def _new_conn(self):
"""
Return a fresh :class:`httplib.HTTPConnection`.
"""
self.num_connections += 1
log.info("Starting new HTTP connection (%d): %s" %
(self.num_connections, self.host))
return HTTPConnection(host=self.host, port=self.port)
I would suggest hooking a handler up to that logger, and listening for lines that match that one. That would let you see how many connections are being created.
Figured it out...Requests has a default 'Keep Alive' policy on connections which you have to explicitly override by doing
s = requests.session()
s.config['keep_alive'] = False
before you make a request.
From the doc:
"""
Keep-Alive
Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!
Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set prefetch to True or read the content property of the Response object.
If you’d like to disable keep-alive, you can simply set the keep_alive configuration to False:
s = requests.session()
s.config['keep_alive'] = False
"""
There may be a subtle bug in Requests here because I WAS reading the .text and .content properties and it was still not releasing the connections. But explicitly passing 'keep alive' as false fixed the problem.