I'm using MySQLdb to connect to MySQL using python. My tables are all InnoDB and I'm using transactions.
I'm struggling to come up with a way to 'share' transactions across functions. Consider the following pseudocode:
def foo():
    db = connect()
    cur = db.cursor()
    try:
        cur.execute(...)
        db.commit()
    except:
        db.rollback()

def bar():
    db = connect()
    cur = db.cursor()
    try:
        cur.execute(...)
        foo()  # note this call
        db.commit()
    except:
        db.rollback()
At some points in my code, I need to call foo() and at some points I need to call bar(). What's the best practice here? How would I tell the call to foo() to commit() if called outside bar() but not inside bar()? This is obviously more complex if there are multiple threads calling foo() and bar() and the calls to connect() don't return the same connection object.
UPDATE
I found a solution which works for me. I've wrapped connect() to increment a counter each time it is called. Calling commit() decrements that counter, and the real commit only happens once the counter reaches 0. If the counter is still above 0 after the decrement, no commit happens. You therefore get this:
def foo():
    db = connect()  # internal counter = 1
    ...
    db.commit()  # internal counter = 0, so commit

def bar():
    db = connect()  # internal counter = 1
    ...
    foo()  # internal counter goes to 2, then to 1 when commit() is called, so no commit happens
    db.commit()  # internal counter = 0, so commit
You can take advantage of Python's default function arguments in this case:
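def foo(cur=None):
    # A cursor passed in by the caller means foo is part of a larger transaction.
    inside_larger_transaction = cur is not None
    if not inside_larger_transaction:
        db = connect()
        cur = db.cursor()
    try:
        cur.execute(...)
        if not inside_larger_transaction:
            db.commit()
    except:
        if inside_larger_transaction:
            raise  # let the owner of the transaction roll back
        db.rollback()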
So, if bar is calling foo, it just passes in the cursor object as a parameter.
Note that I don't see much sense in creating a different cursor object for each small function. You should either write your functions as methods of an object that has a cursor attribute, or always pass the cursor explicitly (in this case, use another named parameter to indicate whether the current function is part of a larger transaction or not).
Another option is to create a context-manager class to make your commits, and encapsulate all transactions within it. None of your functions should then commit; you would keep both the commit and rollback calls in the __exit__ method of this object.
class Transaction(object):
    def __enter__(self):
        self.db = connect()
        cursor = self.db.cursor()
        return cursor

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.db.commit()
        else:
            self.db.rollback()
And just use it like this:
def foo(cur):
    cur.execute(...)

def bar(cur):
    cur.execute(...)
    foo(cur)

with Transaction() as cursor:
    foo(cursor)

with Transaction() as cursor:
    bar(cursor)
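For a version that can be run as-is, here is a minimal sketch of the same pattern. It assumes sqlite3 with a throwaway database file instead of MySQL; DB_PATH, the table t, and the simulated failure are made up for the demonstration. Unlike the class above, this variant also closes the connection in __exit__.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway database file, used only for this demonstration
DB_PATH = os.path.join(tempfile.mkdtemp(), "demo.db")


def connect():
    return sqlite3.connect(DB_PATH)


class Transaction:
    def __enter__(self):
        self.db = connect()
        return self.db.cursor()

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            if exc_type is None:
                self.db.commit()
            else:
                self.db.rollback()
        finally:
            self.db.close()
        return False  # let any exception propagate to the caller


def foo(cur):
    cur.execute("INSERT INTO t VALUES ('foo')")


def bar(cur):
    cur.execute("INSERT INTO t VALUES ('bar')")
    foo(cur)  # same cursor, so same transaction


with Transaction() as cur:
    cur.execute("CREATE TABLE t (v TEXT)")

with Transaction() as cur:
    bar(cur)  # both inserts are committed together

try:
    with Transaction() as cur:
        foo(cur)
        raise RuntimeError("simulated failure")  # this insert is rolled back
except RuntimeError:
    pass

with Transaction() as cur:
    rows = cur.execute("SELECT v FROM t ORDER BY v").fetchall()
```

Neither foo nor bar commits here; the with block is the only place where a transaction ends.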
The cleanest way IMO is to pass the connection object to foo and bar.
Declare your connections outside the functions and pass them to the functions as arguments:
foo(cur, conn)
bar(cur, conn)
I am currently working on a huge project which constantly executes queries. My problem is that my old code always created a new database connection and cursor, which slowed things down immensely. So I thought it was time to write a new database class, which looks like this at the moment:
import mysql.connector

class Database(object):
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = object.__new__(cls)
            try:
                connection = Database._instance.connection = mysql.connector.connect(host="127.0.0.1", user="root", password="", database="db_test")
                cursor = Database._instance.cursor = connection.cursor()
            except Exception as error:
                print("Error: Connection not established {}".format(error))
            else:
                print("Connection established")
        return cls._instance

    def __init__(self):
        self.connection = self._instance.connection
        self.cursor = self._instance.cursor

    # Do database stuff here
The queries will use the class like so:
def foo():
    with Database() as cursor:
        cursor.execute("STATEMENT")
I am not absolutely sure whether this creates the connection only once, regardless of how often the class is instantiated. Maybe someone knows how to initialize a connection only once and how to make use of it in the class afterwards, or knows whether my solution is correct. I am thankful for any help!
Explanation
The keyword here is clearly class variables. Taking a look at the official documentation, we can see that class variables, unlike instance variables, are shared by all instances of the class, regardless of how many instances exist.
Generally speaking, instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class:
So let us assume you have multiple instances of the class. The class itself is defined as below.
class Dog:
    kind = "canine"  # class variable shared by all instances

    def __init__(self, name):
        self.name = name  # instance variable unique to each instance
In order to better understand the differences between class variables and instance variables, I would like to include a small example here:
>>> d = Dog("Fido")
>>> e = Dog("Buddy")
>>> d.kind # shared by all dogs
"canine"
>>> e.kind # shared by all dogs
"canine"
>>> d.name # unique to d
"Fido"
>>> e.name # unique to e
"Buddy"
Solution
Now that we know that class variables are shared by all instances of the class, we can simply define the connection and cursor like shown below.
class Database(object):
    connection = None
    cursor = None

    def __init__(self):
        if Database.connection is None:
            try:
                Database.connection = mysql.connector.connect(host="127.0.0.1", user="root", password="", database="db_test")
                Database.cursor = Database.connection.cursor()
            except Exception as error:
                print("Error: Connection not established {}".format(error))
            else:
                print("Connection established")

        self.connection = Database.connection
        self.cursor = Database.cursor
As a result, the connection to the database is created once at the beginning and can then be used by every further instance.
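To check that behaviour without a MySQL server, here is a small sketch of the same idea. It assumes sqlite3 with an in-memory database as a stand-in for mysql.connector, and the names a, b and same exist only for the demonstration.

```python
import sqlite3


class Database:
    connection = None  # class attribute, shared by every instance

    def __init__(self):
        # Only the first instantiation opens a connection
        if Database.connection is None:
            Database.connection = sqlite3.connect(":memory:")
        self.connection = Database.connection


a = Database()
b = Database()
same = a.connection is b.connection  # both instances hold the same object
```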
Kind of like this. It's a cheap way of using a global.
class Database(object):
    connection = None

    def __init__(self):
        if not Database.connection:
            Database.connection = mysql.connector.connect(host="127.0.0.1", user="root", password="", database="db_test")

    def query(self, sql):
        cursor = Database.connection.cursor()
        cursor.execute(sql)

    # Do database stuff here
This also works, and you are guaranteed to always have a single instance of the database:
import random

def singleton(class_):
    instances = {}

    def get_instance(*args, **kwargs):
        if class_ not in instances:
            instances[class_] = class_(*args, **kwargs)
        return instances[class_]

    return get_instance

@singleton
class SingletonDatabase:
    def __init__(self) -> None:
        print('Initializing singleton database connection... ', random.randint(1, 100))
The reason you have to do all this is that if you just create a connection once and leave it at that, you will end up trying to use a connection which has been dropped. So you create a connection and attach it to your app. Then, whenever you get a new request, check in a before-request hook whether the connection still exists; if not, recreate the connection and proceed.
On create_app:
def create_app(self):
    if not app.config.get('connection_created'):
        app.database_connection = Database()
        app.config['connection_created'] = True
On run app:
@app.before_request
def check_database_connection():
    if not app.config.get('connection_created') or not app.database_connection:
        app.database_connection = Database()
        app.config['connection_created'] = True
This will ensure that your application always runs with an active connection and that it gets created only once per app. If the connection is dropped on any subsequent call, it gets recreated.
I have a project written in Python 2.7 where the main program needs frequent access to a sqlite3 db for writing logs, measurement results, getting settings,...
At the moment I have a db module with functions such as add_log(), get_setting(), and each function in there basically looks like:
def add_log(logtext):
    try:
        db = sqlite3.connect(database_location)
    except sqlite3.DatabaseError as e:
        db.close()  # try to gracefully close the db
        return("ERROR (ADD_LOG): While opening db: {}".format(e))
    try:
        with db:  # using context manager to automatically commit or roll back changes.
            # when using the context manager, the execute function of the db should be used instead of the cursor
            db.execute("insert into logs(level, source, log) values (?, ?, ?)", (level, source, logtext))
    except sqlite3.DatabaseError as e:
        return("ERROR (ADD_LOG): While adding log to db: {}".format(e))
    return "OK"
(some additional code and comments removed).
It seems I should write a class that extends the base sqlite connection object, so that the connection is created only once (at the beginning of the main program) and the object then contains functionality such as:
class Db(sqlite3.Connection):
    def __init__(self, db_location = database_location):
        try:
            self = sqlite3.connect(db_location)
            return self
        except sqlite3.DatabaseError as e:
            self.close()  # try to gracefully close the db

    def add_log(self, logtext):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)", (level, source, logtext))
It seems this should be fairly straightforward, but I can't seem to get it working.
It seems there is some useful advice here:
Python: How to successfully inherit Sqlite3.Cursor and add my customized method but I can't seem to understand how to use a similar construct for my purpose.
You are not that far away.
First of all, a class initializer cannot return anything but None (emphasis mine):
Because __new__() and __init__() work together in constructing objects (__new__() to create it, and __init__() to customise it), no non-None value may be returned by __init__(); doing so will cause a TypeError to be raised at runtime.
Second, you overwrite the current instance self of your Db object with a sqlite3.Connection object right in the initializer. That makes subclassing SQLite's connection object a bit pointless.
You just need to fix your __init__ method to make this work:
class Db(sqlite3.Connection):
    # If you didn't use the default argument, you could omit overriding __init__ altogether
    def __init__(self, database=database_location, **kwargs):
        super(Db, self).__init__(database=database, **kwargs)

    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)", (level, source, logtext))
That lets you use instances of your class as context managers:
with Db() as db:
    print [i for i in db.execute("SELECT * FROM logs")]
    db.add_log("I LAUNCHED THAT PUG INTO SPACE!", 42, "Right there")
Maurice Meyer said in the comments of the question that methods such as execute() are cursor methods and, per the DB-API 2.0 specs, that's correct.
However, sqlite3's connection objects offer a few shortcuts to cursor methods:
This is a nonstandard shortcut that creates an intermediate cursor object by calling the cursor method, then calls the cursor’s execute method with the parameters given.
To expand on the discussion in the comments:
The remark about the default argument in my code example above was targeted at the requirement to override sqlite3.Connection's __init__ method.
The __init__ in the class Db is only needed to define the default value database_location on the database argument for the sqlite3.Connection initializer.
If you were willing to pass such a value upon every instantiation of that class, your custom connection class could look like this, and still work the same way, except for that argument:
class Db(sqlite3.Connection):
    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)", (level, source, logtext))
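As a runnable illustration of such a subclass, here is a short Python 3 sketch. It is not taken from the answer: it assumes an in-memory database and a made-up logs table, and it builds the instance through the factory parameter of sqlite3.connect() rather than by calling the class directly.

```python
import sqlite3


class Db(sqlite3.Connection):
    def add_log(self, logtext, level, source):
        self.execute(
            "INSERT INTO logs(level, source, log) VALUES (?, ?, ?)",
            (level, source, logtext),
        )


# factory= tells sqlite3.connect() to build a Db instead of a plain Connection
db = sqlite3.connect(":memory:", factory=Db)
db.execute("CREATE TABLE logs (level INTEGER, source TEXT, log TEXT)")

with db:  # commits on success, rolls back on an exception
    db.add_log("hello", 1, "demo")

rows = db.execute("SELECT level, source, log FROM logs").fetchall()
db.close()
```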
However, the __init__ method has nothing to do with the context manager protocol as defined in PEP 343.
When it comes to classes, this protocol requires implementing the magic methods __enter__ and __exit__.
The sqlite3.Connection does something along these lines:
class Connection:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_val is None:
            self.commit()
        else:
            self.rollback()
Note: The sqlite3.Connection is provided by a C module, hence does not have a Python class definition. The above reflects what the methods would roughly look like if it did.
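The commit-or-rollback behaviour described above can be observed with a few lines of Python 3. This is just a sketch under simple assumptions: an in-memory database, a made-up one-column logs table, and a RuntimeError raised on purpose to trigger the rollback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (log TEXT)")

with conn:  # no exception, so __exit__ commits
    conn.execute("INSERT INTO logs VALUES ('kept')")

try:
    with conn:  # exception, so __exit__ rolls back
        conn.execute("INSERT INTO logs VALUES ('discarded')")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The connection is still open after both with blocks
rows = conn.execute("SELECT log FROM logs").fetchall()
conn.close()
```

Note that leaving the with block ends the transaction but does not close the connection, which is why the final SELECT still works.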
Let's say you don't want to keep the same connection open all the time, but rather have a dedicated connection per transaction while maintaining the general interface of the Db class above.
You could do something like this:
# Keep this to have your custom methods available
class Connection(sqlite3.Connection):
    def add_log(self, level, source, log):
        self.execute("INSERT INTO logs(level, source, log) VALUES (?, ?, ?)",
                     (level, source, log))

class DBM:
    def __init__(self, database=database_location):
        self._database = database
        self._conn = None

    def __enter__(self):
        return self._connection()

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Decide whether to commit or roll back
        if exc_val:
            self._connection().rollback()
        else:
            self._connection().commit()
        # close connection
        try:
            self._conn.close()
        except AttributeError:
            pass
        finally:
            self._conn = None

    def _connection(self):
        if self._conn is None:
            # Instantiate your custom sqlite3.Connection
            self._conn = Connection(self._database)
        return self._conn

    # add shortcuts to connection methods as seen fit
    def execute(self, sql, parameters=()):
        with self as temp:
            result = temp.execute(sql, parameters).fetchall()
        return result

    def add_log(self, level, source, log):
        with self as temp:
            temp.add_log(level, source, log)
This can be used in a context and by calling methods on the instance:
db = DBM(database_location)

with db as temp:
    print [i for i in temp.execute("SELECT * FROM logs")]
    temp.add_log(1, "foo", "I MADE MASHED POTATOES")

# The methods execute and add_log are only available from
# the outside because the shortcuts have been added to DBM
print [i for i in db.execute("SELECT * FROM logs")]
db.add_log(1, "foo", "I MADE MASHED POTATOES")
For further reading on context managers refer to the official documentation. I'll also recommend Jeff Knupp's nice introduction. Also, the aforementioned PEP 343 is worth having a look at for the technical specification and rationale behind that protocol.
In the PyMySQL library, in cursors.py the following functions are called:
def __enter__(self):
    return self

def __exit__(self, *exc_info):
    del exc_info
    self.close()
That means that if I use the cursor class in a with statement, the cursor should be closed as soon as I leave the nested block. Why does it remain set instead?
db = pymysql.connect(config)
with pymysql.cursors.Cursor(db) as cursor:
    print(cursor)
print(cursor)
Also:
db = pymysql.connect(config)
with db.cursor() as cursor:
    print(cursor)
print(cursor)
Both forms print the cursor object twice (once inside the with statement and once outside it). Am I doing something wrong?
Closing a cursor doesn't null out the cursor, just detaches it from the database. Try printing cursor.connection instead.
Also, I think you're expecting the "with" keyword to delete the object in question, but it's really just syntactic sugar around the enter and exit functions.
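A tiny sketch makes this visible without any database. The Resource class below is hypothetical and only mimics the __enter__ and __exit__ pair quoted in the question: the name bound by "as" still refers to the same object after the block, and only its state has changed.

```python
class Resource:
    """Hypothetical stand-in for a cursor: records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


with Resource() as res:
    closed_inside = res.closed  # still open inside the block

closed_after = res.closed  # the name "res" is still bound, but the object is closed
```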
Instead of using:
import sqlite3
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute(...)
c.close()
would it be possible to use the Pythonic idiom:
with conn.cursor() as c:
    c.execute(...)
It doesn't seem to work:
AttributeError: __exit__
Note: it's important to close a cursor because of this.
You can use contextlib.closing:
import sqlite3
from contextlib import closing
conn = sqlite3.connect(':memory:')
with closing(conn.cursor()) as cursor:
    cursor.execute(...)
This works because closing(object) automatically calls the close() method of the passed in object after the with block.
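To confirm that the cursor really is closed afterwards, a quick sketch like the following can help. It assumes an in-memory database and a trivial SELECT 1 query; the value and reusable names exist only for the demonstration.

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(":memory:")

with closing(conn.cursor()) as cursor:
    cursor.execute("SELECT 1")
    value = cursor.fetchone()[0]

# After the block the cursor has been closed, so using it again fails
try:
    cursor.execute("SELECT 1")
    reusable = True
except sqlite3.ProgrammingError:
    reusable = False

conn.close()
```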
A simpler alternative would be to use the connection object with the context manager, as specified in the docs.
with con:
    con.execute(...)
If you insist on working with the cursor (because reasons), then why not make your own wrapper class?
class SafeCursor:
    def __init__(self, connection):
        self.con = connection

    def __enter__(self):
        self.cursor = self.con.cursor()
        return self.cursor

    def __exit__(self, typ, value, traceback):
        self.cursor.close()
You'll then call your class like this:
with SafeCursor(conn) as c:
    c.execute(...)
Adding to sudormrfbin's post. I've recently experienced an issue where an INSERT statement wasn't committing to the database. Turns out I was missing the with context manager for just the Connection object.
Also, it is a good practice to always close the Cursor object as well, as mentioned in this post.
Therefore, use two contextlib.closing() methods, each within a with context manager:
import contextlib
import sqlite3

# Auto-closes the Connection object
with contextlib.closing(sqlite3.connect("path_to_db_file")) as conn:
    # Auto-commit to the database
    with conn:
        # Auto-close the Cursor object
        with contextlib.closing(conn.cursor()) as cursor:
            # Execute method(s)
            cursor.execute(""" SQL statements here """)
Below is a database pooling example. I don't understand the following:
Why does the getcursor function use "yield"?
What is a context manager?
from psycopg2.pool import SimpleConnectionPool
from contextlib import contextmanager

dbConnection = "dbname='dbname' user='postgres' host='localhost' password='postgres'"

# pool keeps at least 1 and at most 10 connections
connectionpool = SimpleConnectionPool(1, 10, dsn=dbConnection)

@contextmanager
def getcursor():
    con = connectionpool.getconn()
    try:
        yield con.cursor()
    finally:
        connectionpool.putconn(con)

def main_work():
    try:
        # with here will take care of put connection when its done
        with getcursor() as cur:
            cur.execute("select * from \"TableName\"")
            result_set = cur.fetchall()
    except Exception as e:
        print("error in executing with exception: {}".format(e))
Both of your questions are related. In Python, context managers are what is being used whenever you see a with statement. Classically, they're written like this.
class getcursor(object):
    def __enter__(self):
        # keep the connection on self so that __exit__ can return it to the pool
        self.con = connectionpool.getconn()
        return self.con.cursor()

    def __exit__(self, *args):
        connectionpool.putconn(self.con)
Now when you use a context manager, it calls the __enter__ method on the with statement and the __exit__ method when it exits the context. Think of it like this.
cursor = getcursor()
with cursor as cur:  # calls cursor.__enter__()
    cur.execute("select * from \"TableName\"")
    result_set = cur.fetchall()
# We're now exiting the context so it calls `cursor.__exit__()`
# with some exception info if relevant
x = 1
The @contextmanager decorator is some sugar to make creating a context manager easier. Basically, it uses the yield statement to give execution back to the caller. Everything up to and including the yield statement acts as the __enter__ method, and everything after it is effectively the __exit__ method.
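The equivalence between the two styles can be checked with a small self-contained sketch. It does not use psycopg2: FakePool is a hypothetical stand-in that only records calls, and borrowed and Borrowed are made-up names for the generator-based and class-based variants.

```python
from contextlib import contextmanager


class FakePool:
    """Hypothetical stand-in for a connection pool; it only records calls."""

    def __init__(self):
        self.events = []

    def getconn(self):
        self.events.append("getconn")
        return object()

    def putconn(self, con):
        self.events.append("putconn")


pool = FakePool()


@contextmanager
def borrowed():
    con = pool.getconn()   # runs on entry, like __enter__
    try:
        yield con          # value bound by "as"; the with body runs here
    finally:
        pool.putconn(con)  # runs on exit, like __exit__, even on error


class Borrowed:
    def __enter__(self):
        self.con = pool.getconn()
        return self.con

    def __exit__(self, *exc_info):
        pool.putconn(self.con)


with borrowed():
    pool.events.append("body")

with Borrowed():
    pool.events.append("body")
```

Both with blocks leave the same sequence of events in pool.events: the connection is taken, the body runs, and the connection is handed back.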