I want to write a Python module to abstract away database transactions for my application. My question is whether I need to call connect() and close() for every transaction. In code:
import sqlite3

# Can I put connect() here?
conn = sqlite3.connect('db.py')

def insert(args):
    # Or should I put it here?
    conn = sqlite3.connect('db.py')

    # Perform the transaction.
    c = conn.cursor()
    c.execute(''' insert args ''')
    conn.commit()

    # Do I close the connection here?
    conn.close()

# Or can I close the connection whenever the application restarts (ideally, very rarely)?
conn.close()
I don't have much experience with databases, so I'd appreciate an explanation of why one method is preferred over the other.
You can use the same connection repeatedly. You can also wrap the connection (and the cursor) with contextlib.closing so that you don't need to explicitly call close on either. (Note that sqlite3's built-in connection context manager only commits or rolls back the enclosing transaction; it does not close the connection, and sqlite3 cursors are not context managers by themselves.)
import sqlite3
from contextlib import closing

def insert(conn, args):
    with closing(conn.cursor()) as c:
        c.execute(...)
    conn.commit()

with closing(sqlite3.connect('db.py')) as conn:
    insert(conn, ...)
    insert(conn, ...)
    insert(conn, ...)
There's no reason to close the connection to the database, and re-opening the connection each time can be expensive. (For example, you may need to establish a TCP session to connect to a remote database.)
Using a single connection will be faster, and operationally should be fine.
Use the atexit module if you want to ensure the closing eventually happens (even if your program is terminated by an exception). Specifically, import atexit at the start of your program, and call atexit.register(conn.close) right after you connect. Note that there are no parentheses after close: you want to register the function to be called at program exit (whether normal or via an exception), not to call it immediately.
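For concreteness, here is a minimal sketch of that registration, reusing the conn from the question:

import atexit
import sqlite3

conn = sqlite3.connect('db.py')
# Register the bound method itself (no parentheses): atexit will call
# conn.close() when the interpreter exits, normally or via an exception.
atexit.register(conn.close)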
Unfortunately, if Python should crash due, e.g., to an error in a C-coded module that Python can't catch, or a kill -9, etc., the registered exit function(s) may end up not being called. Fortunately, in this case it shouldn't hurt anyway (besides being, one hopes, a rare and extreme occurrence).
psycopg2 provides some example code for using PostgreSQL's notification facilities:
import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect(DSN)
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

curs = conn.cursor()
curs.execute("LISTEN test;")

print("Waiting for notifications on channel 'test'")
while True:
    if select.select([conn], [], [], 5) == ([], [], []):
        print("Timeout")
    else:
        conn.poll()
        while conn.notifies:
            notify = conn.notifies.pop(0)
            print("Got NOTIFY:", notify.pid, notify.channel, notify.payload)
What is the select.select([conn],[],[],5) == ([],[],[]) code doing?
The documentation of select.select suggests that we are trying to make sure that the socket that underlies the conn is "ready for reading".
What does it mean for a psycopg2 connection to be "ready for reading"?
conn.poll() is a nonblocking command. If there are no notifications (or other comms) available, it will still return immediately, just without having done anything. When done in a tight loop, this would be busy waiting and that is bad. It will burn an entire CPU doing nothing, and may even destabilize the whole system.
The select means "wait politely for there to be something to read on the connection (implying poll will probably have something to do), or for 5 seconds, whichever occurs first." If you instead wanted to wait indefinitely for a message to show up, just remove the ,5 from the select call. If you wanted to wait for something interesting to happen on the first of several handles, you would put all of them in the select list, not just conn.
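To make that concrete, here is a small sketch of the multiple-handles case; conn1 and conn2 are assumed to be psycopg2 connections that have each already executed a LISTEN, and omitting the timeout argument makes select block indefinitely:

import select

conns = [conn1, conn2]  # assumed psycopg2 connections, each already LISTENing
# With no timeout argument, select.select blocks until at least one
# underlying socket is readable, so poll() will have something to do.
readable, _, _ = select.select(conns, [], [])
for c in readable:
    c.poll()
    while c.notifies:
        notify = c.notifies.pop(0)
        print("Got NOTIFY:", notify.pid, notify.channel, notify.payload)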
Here is some sample code I'd like to run:
for i in range(1, 2000):
    db = create_engine('mysql://root@localhost/test_database')
    conn = db.connect()
    # some simple data operations
    conn.close()
    db.dispose()
Is there a way of running this without getting "Too many connections" errors from MySQL?
I already know I could manage the connections differently or use a connection pool. I'd just like to understand how to properly close a connection from SQLAlchemy.
Here's how to write that code correctly:
db = create_engine('mysql://root@localhost/test_database')

for i in range(1, 2000):
    conn = db.connect()
    # some simple data operations
    conn.close()

db.dispose()
That is, the Engine is a factory for connections as well as a pool of connections, not the connection itself. When you say conn.close(), the connection is returned to the connection pool within the Engine, not actually closed.
If you do want the connection to be actually closed, that is, not pooled, disable pooling via NullPool:
from sqlalchemy.pool import NullPool
db = create_engine('mysql://root@localhost/test_database', poolclass=NullPool)
With the above Engine configuration, each call to conn.close() will close the underlying DBAPI connection.
If OTOH you actually want to connect to different databases on each call, that is, your hardcoded "localhost/test_database" is just an example and you actually have lots of different databases, then the approach using dispose() is fine; it will close out every connection that is not checked out from the pool.
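As a hypothetical sketch of that many-databases case (the database_urls list is illustrative, not from the original):

# One short-lived Engine per target database; dispose() afterwards so no
# pooled connections linger once we move on to the next database.
for url in database_urls:  # e.g. many 'mysql://user@host/dbname' strings
    db = create_engine(url)
    conn = db.connect()
    # some simple data operations
    conn.close()
    db.dispose()  # closes every connection still held by the pool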
In all of the above cases, the important thing is that the Connection object is closed via close(). If you're using any kind of "connectionless" execution, that is engine.execute() or statement.execute(), the ResultProxy object returned from that execute call should be fully read, or otherwise explicitly closed via close(). A Connection or ResultProxy that's still open will prohibit the NullPool or dispose() approaches from closing every last connection.
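A minimal sketch of that last point, assuming SQLAlchemy 1.x (where Engine.execute() and ResultProxy exist) and a hypothetical table name, with db being the Engine created above:

# "Connectionless" execution: db.execute() checks a connection out of the
# pool behind the scenes; it is returned only when the ResultProxy is
# exhausted or closed.
result = db.execute("SELECT * FROM some_table")
for row in result:
    pass  # fully reading the ResultProxy releases the connection
# ...or release it explicitly without reading everything:
result = db.execute("SELECT * FROM some_table")
result.close()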
I was trying to figure out how to disconnect from the database for an unrelated problem (you must disconnect before forking). You need to invalidate the connection in the connection pool too.
In your example:
for i in range(1, 2000):
    db = create_engine('mysql://root@localhost/test_database')
    conn = db.connect()
    # some simple data operations
    # session.close() if needed
    conn.invalidate()
    db.dispose()
I use this pattern:
from sqlalchemy import create_engine, text

engine = create_engine('...')
with engine.connect() as conn:
    conn.execute(text("CREATE SCHEMA IF NOT EXISTS ..."))
engine.dispose()
In my case this always works and I am able to close. Using invalidate() before close() does the trick; otherwise close() alone did not release the connection:
conn = engine.raw_connection()
conn.get_warnings = True
curSql = xx_tmpsql  # SQL string defined elsewhere
cur = conn.cursor()  # the cursor was missing in the original snippet
myresults = cur.execute(curSql, multi=True)
print("Warnings: #####")
print(cur.fetchwarnings())
for curresult in myresults:
    print(curresult)
    if curresult.with_rows:
        print(curresult.column_names)
        print(curresult.fetchall())
    else:
        print("no rows returned")
cur.close()
conn.invalidate()
conn.close()
engine.dispose()
Being new to both Python and sqlite, I've been playing around with them both recently trying to figure things out. In particular with sqlite I've learned how to open/close/commit data to a db. But now I'm trying to clean things up a bit so that I can open/close the db via function calls. For instance, I'd like to do something like:
def open_db():
    conn = sqlite3.connect("path")
    c = conn.cursor()

def close_db():
    c.close()
    conn.close()

def create_db():
    open_db()
    c.execute("CREATE STUFF")
    close_db()
Then when I run the program, before I query or write to the table, I could do something like:
open_db()
c.execute('SELECT * DO STUFF')
# OR
c.execute('DELETE * DO OTHER STUFF')
conn.commit()
close_db()
I've read about context managers but I'm not sure I entirely understand what's going on with them. What would be the easiest way to clean up how I open/close my DB connections so I'm not always having to retype the cursor boilerplate?
This is because the connection (and cursor) you define is local to the open_db function. Change it as follows:
def open_db():
    conn = sqlite3.connect("path")
    return conn.cursor()
and then
c = open_db()
c.execute('SELECT * DO STUFF')
It should be noted that writing functions like this purely as a learning exercise might be OK, but generally it's not very useful to write a thin wrapper around a database connectivity API.
I don't know that there is an easy way. As already suggested, if you make the name of a database cursor or connection local to a function then these will be lost upon exit from that function. The answer might be to write code using the contextlib module (which is included with the Python distribution, and documented in the help file); I wouldn't call that easy. The documentation for sqlite3 does mention that connection objects can be used as context managers; I suspect you've already noticed that. I also see that there's some sort of context manager for MySQL but I haven't used it.
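For illustration, here is a minimal sketch of the contextlib approach; the db_cursor() helper name and the table are hypothetical, not part of sqlite3:

import sqlite3
from contextlib import contextmanager

@contextmanager
def db_cursor(path):
    # Open the connection, hand a cursor to the with-block, then commit
    # and close even if the block raises.
    conn = sqlite3.connect(path)
    try:
        yield conn.cursor()
        conn.commit()
    finally:
        conn.close()

with db_cursor("path") as c:
    c.execute("CREATE TABLE IF NOT EXISTS stuff (id INTEGER)")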
I am constructing a model that does large parts of its calculations in a Postgresql database (for performance reasons). It looks somewhat like this:
def sql_func1(conn):
    # prepare some data, crunch some numbers, etc.
    curs = conn.cursor()
    curs.execute("SOME SQL COMMAND")
    conn.commit()  # commit() belongs to the connection, not the cursor
    curs.close()
if __name__ == "__main__":
    connection = psycopg2.connect(dbname='name', user='user', password='pass', host='localhost', port=1234)
    sql_func1(connection)
    sql_func2(connection)
    sql_func3(connection)
    connection.close()
The script uses around 30 individual functions like sql_func1. Obviously it is a little awkward to manage the connection and cursor in each function all the time, so I started using a decorator as described here. Now I can simply wrap sql_func1 with a decorator @db_connect and pass the connection from there. However, that means I am opening and closing the connection all the time, which is not good practice either. The psycopg2 FAQ says:
Creating a connection can be slow (think of SSL over TCP) so the best practice is to create a single connection and keep it open as long as required. It is also good practice to rollback or commit frequently (even after a single SELECT statement) to make sure the backend is never left "idle in transaction". See also psycopg2.pool for lightweight connection pooling.
Could you please give me some insight into what would be the ideal practice in my case? Should I rather use a decorator that passes the cursor object instead of the connection? If so, please provide a code sample for the decorator. As I am rather new to programming, please also let me know if you think my overall approach is wrong.
What about storing the connection in a global variable without closing it in the finally block? Something like this (following the example you linked):
cnn = None

def with_connection(f):
    def with_connection_(*args, **kwargs):
        global cnn
        if not cnn:
            cnn = psycopg2.connect(DSN)
        try:
            rv = f(cnn, *args, **kwargs)
        except Exception:
            cnn.rollback()
            raise
        else:
            cnn.commit()  # or maybe not
        return rv
    return with_connection_
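And since the question asked about passing a cursor instead, here is a sketch of the same idea in that variant, reusing the cnn and DSN from above; the with_cursor name is illustrative:

def with_cursor(f):
    def with_cursor_(*args, **kwargs):
        global cnn
        if not cnn:
            cnn = psycopg2.connect(DSN)
        curs = cnn.cursor()
        try:
            rv = f(curs, *args, **kwargs)
        except Exception:
            cnn.rollback()
            raise
        else:
            cnn.commit()
        finally:
            curs.close()  # cursors are cheap; one per call is fine
        return rv
    return with_cursor_

@with_cursor
def sql_func1(curs):
    curs.execute("SOME SQL COMMAND")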
I have a Python script that needs to update some database information. So, in my init() method, I start the connection. But when I call the update method, the script does not give me any answer; it seems to be stuck in an infinite loop.
def update(self, id, newDescription):
    try:
        sql = """UPDATE table SET table.new_string=:1 WHERE table.id=:2"""
        con = self.connection.cursor()
        con.execute(sql, (newDescription, id))
        con.close()
    except Exception as e:
        self.errors += [str(e)]
What I've tried so far:
I changed the query, just to see if the connection was alright. When I did that (I used 'SELECT info FROM table'), the script worked. I thought my query was wrong, but when I execute it in the SQL Developer program, it runs fine. What could be happening?
Thanks
You are forgetting to call commit. The apparent hang is most likely your UPDATE waiting on a row lock held by an uncommitted transaction in another session (for example, one you started in SQL Developer).
I'm not sure exactly how to do it in a Python script, but I think you need to call commit before closing the connection. Otherwise Oracle rolls back your transaction.
Try adding self.connection.commit() before close(). Note that in your code con is a cursor, and in cx_Oracle commit() lives on the connection object, not the cursor.
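A minimal sketch of the method with that fix applied, otherwise identical to the original:

def update(self, id, newDescription):
    try:
        sql = """UPDATE table SET table.new_string=:1 WHERE table.id=:2"""
        con = self.connection.cursor()
        con.execute(sql, (newDescription, id))
        self.connection.commit()  # commit on the connection before closing the cursor
        con.close()
    except Exception as e:
        self.errors += [str(e)]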