I have a wrapper class that I subclass for the many database connections we use. I want something that behaves similarly to try/finally, but still gives me the flexibility and subclassing potential of a class. Here is my base class; I would like to replace the __del__ method because I have read that there is no guarantee of when it will run.
import logging
from typing import Any, MutableMapping, Optional, Tuple

logger = logging.getLogger(__name__)


class DatabaseConnBase:
    def __init__(self, conn_func, conn_info: MutableMapping):
        self._conn = None
        self.conn_func = conn_func
        self.conn_info = conn_info

    def _initialize_connection(self):
        self._conn = self.conn_func(**self.conn_info)

    @property
    def conn(self):
        if not self._conn:
            self._initialize_connection()
        return self._conn

    @property
    def cursor(self):
        if not self._conn:
            self._initialize_connection()
        return self._conn.cursor()

    def __del__(self):
        if self._conn:
            self._conn.close()

    def commit(self):
        if not self._conn:
            raise AttributeError(
                'The connection has not been initialized, unable to commit '
                'transaction.'
            )
        self._conn.commit()

    def execute(
        self,
        query_or_stmt: str,
        verbose: bool = True,
        has_res: bool = False,
        auto_commit: bool = False
    ) -> Optional[Tuple[Any]]:
        """
        Creates a new cursor object and executes the query/statement. If
        `has_res` is `True`, returns the list of tuple results.

        :param query_or_stmt: The query or statement to run.
        :param verbose: If `True`, logs the statement or query.
        :param has_res: Whether or not results should be returned.
        :param auto_commit: Immediately commits the changes to the database
                            after the execute is performed.
        :return: If `has_res` is `True`, a list of tuples.
        """
        cur = self.cursor
        if verbose:
            logger.info(f'Using {cur}')
            logger.info(f'Executing:\n{query_or_stmt}')

        cur.execute(query_or_stmt)
        if auto_commit:
            logger.info('Committing transaction...')
            self.commit()

        if has_res:
            logger.info('Returning results...')
            return cur.fetchall()
As per the top answer on this related question: What is the __del__ method, How to call it?
The __del__ method will be called when your object is garbage collected, but as you noted, there are no guarantees on how long after destruction of references to the object this will happen.
In your case, since you check for the existence of the connection before attempting to close it, your __del__ method is safe to be called multiple times, so you could simply call it explicitly before you destroy the reference to your object.
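Another option, if you want try/finally-like guarantees while keeping the subclassing potential, is to add the context-manager protocol to the base class yourself. Here is a minimal sketch; the `conn_func`/`conn_info` names come from the question's class, while the `close()` helper and the wiring around it are my assumptions:

```python
class DatabaseConnBase:
    """Minimal sketch of the wrapper with context-manager support.

    Assumes conn_func/conn_info work as in the original class;
    close() is safe to call more than once.
    """
    def __init__(self, conn_func, conn_info):
        self._conn = None
        self.conn_func = conn_func
        self.conn_info = conn_info

    def close(self):
        # Idempotent: closing twice is a no-op.
        if self._conn:
            self._conn.close()
            self._conn = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs even if the body raised, just like try/finally.
        self.close()
        return False  # do not suppress exceptions
```

Subclasses then get `with MySubclassConn(...) as db: ...` for free, and can still call `close()` explicitly outside a `with` block.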
Related
class Tokenizer:
    def __init__(self):
        self.name = 'MyTokenizer'
        self.tokenizer = Language.create_tokenizer(nlp)

    def __call__(self, text):
        if text:
            with CoreClient(timeout=60000) as client:
                doc = client.annotate(text, output_format='json')
        else:
            doc = Document("")
        ...
The question I have is with the creation of 'CoreClient', which makes an HTTP request to a server. The current code, introduced by "with ... as client", ensures that the client is cleaned up once 'client.annotate' is done. However, the problem is that the object 'client' has to be created for every 'text' that is processed. To avoid this, I would rather create the object in the __init__ method:
self.client = CoreClient(timeout=60000)
But then:
1) How do I destroy the 'client' after all requests have been completed? OR
2) Is the current way of creating a CoreClient for each request OK? The creation of the object is heavy, requiring a lot of initialization.
EDIT:
def __enter__(self):
    self.start()
    return self

def start(self):
    if self.start_cmd:
        if self.be_quiet:
            # Issue #26: subprocess.DEVNULL isn't supported in python 2.7.
            stderr = open(os.devnull, 'w')
        else:
            stderr = self.stderr

        print(f"Starting server with command: {' '.join(self.start_cmd)}")
        self.server = subprocess.Popen(self.start_cmd,
                                       stderr=stderr,
                                       stdout=stderr)
To make it clearer, I added the implementation of the __enter__ method. It seems to simply return the object 'self'.
You only need to create the instance of CoreClient once. The with statement just ensures that the __enter__ and __exit__ methods of that instance are called before and after the body of the with statement; you don't need to create a new instance each time.
class Tokenizer:
    def __init__(self):
        self.name = 'MyTokenizer'
        self.tokenizer = Language.create_tokenizer(nlp)
        self.client = CoreClient(timeout=60000)  # Create client here

    def __call__(self, text):
        if text:
            with self.client:
                doc = self.client.annotate(text, output_format='json')
        else:
            doc = Document("")
It appears that __enter__ and __exit__ together spin up and tear down a new server each time the CoreClient instance is used as a context manager.
The client will be collected when the Tokenizer instance gets collected. However, unless you are in an active with statement, the CoreClient instance isn't doing anything.
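To illustrate the point with a toy stand-in (DemoClient below is a made-up class, not CoreClient): entering the same instance repeatedly just re-invokes `__enter__`/`__exit__`; no new object is constructed per request.

```python
class DemoClient:
    """Toy stand-in for CoreClient: constructed once, entered many times."""
    def __init__(self):
        self.created = 1   # expensive setup would happen here, once
        self.entered = 0

    def __enter__(self):
        self.entered += 1  # e.g. start the server
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass               # e.g. stop the server

client = DemoClient()      # one instance...
for _ in range(3):
    with client:           # ...reused for every request
        pass

print(client.created, client.entered)  # 1 3
```

The `__init__` cost is paid once; only the (comparatively cheap or, here, empty) enter/exit work repeats per request.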
In this case I wouldn't worry about it, because when the reference count goes to zero, Python will take care of it. Also, del does not actually delete an object. It might, but it might not; del decrements the reference count of an object.
Take this for example:
In [1]: class Test:
...: def __del__(self):
...: print('deleted')
...:
In [2]: t = Test()
In [3]: del t
deleted
In [4]: t = Test()
In [5]: t1 = t
In [6]: del t # Nothing gets printed here because t1 still exists
In [7]: del t1 # reference count goes to 0 and now gets printed
deleted
This is why I think you should just let Python handle the destruction of your objects. Python keeps track of objects' reference counts and knows when they are no longer needed, so let it take care of that stuff for you.
I have a project written in Python 2.7 where the main program needs frequent access to a sqlite3 db for writing logs, measurement results, getting settings,...
At the moment I have a db module with functions such as add_log(), get_setting(), and each function in there basically looks like:
def add_log(logtext):
    try:
        db = sqlite3.connect(database_location)
    except sqlite3.DatabaseError as e:
        db.close()  # try to gracefully close the db
        return("ERROR (ADD_LOG): While opening db: {}".format(e))

    try:
        # using the context manager to automatically commit or roll back changes.
        # when using the context manager, the execute function of the db should
        # be used instead of the cursor
        with db:
            db.execute("insert into logs(level, source, log) values (?, ?, ?)",
                       (level, source, logtext))
    except sqlite3.DatabaseError as e:
        return("ERROR (ADD_LOG): While adding log to db: {}".format(e))

    return "OK"
(some additional code and comments removed).
It seems I should write a class that extends the base sqlite3 connection object, so that the connection is created only once (at the beginning of the main program), and the object then contains functionality such as:
class Db(sqlite3.Connection):
    def __init__(self, db_location=database_location):
        try:
            self = sqlite3.connect(db_location)
            return self
        except sqlite3.DatabaseError as e:
            self.close()  # try to gracefully close the db

    def add_log(self, logtext):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
It seems this should be fairly straightforward, but I can't seem to get it working.
It seems there is some useful advice here: Python: How to successfully inherit Sqlite3.Cursor and add my customized method, but I can't understand how to use a similar construct for my purpose.
You are not that far away.
First of all, a class initializer cannot return anything but None (emphasis mine):
Because __new__() and __init__() work together in constructing objects (__new__() to create it, and __init__() to customise it), no non-None value may be returned by __init__(); doing so will cause a TypeError to be raised at runtime.
Second, you overwrite the current instance self of your Db object with a sqlite3.Connection object right in the initializer. That makes subclassing SQLite's connection object a bit pointless.
You just need to fix your __init__ method to make this work:
class Db(sqlite3.Connection):
    # If you didn't use the default argument, you could omit
    # overriding __init__ altogether
    def __init__(self, database=database_location, **kwargs):
        super(Db, self).__init__(database=database, **kwargs)

    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
That lets you use instances of your class as context managers:
with Db() as db:
    print [i for i in db.execute("SELECT * FROM logs")]
    db.add_log("I LAUNCHED THAT PUG INTO SPACE!", 42, "Right there")
Maurice Meyer said in the comments of the question that methods such as execute() are cursor methods and, per the DB-API 2.0 specs, that's correct.
However, sqlite3's connection objects offer a few shortcuts to cursor methods:
This is a nonstandard shortcut that creates an intermediate cursor object by calling the cursor method, then calls the cursor’s execute method with the parameters given.
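The shortcut can be seen directly with an in-memory database; `Connection.execute` creates the intermediate cursor for you and returns it, so the call can be chained with `fetchall()`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (level INTEGER, source TEXT, log TEXT)")

# The shortcut: the connection creates an intermediate cursor, calls its
# execute() with the given parameters, and returns that cursor.
conn.execute("INSERT INTO logs VALUES (?, ?, ?)", (1, "app", "hello"))
rows = conn.execute("SELECT level, source, log FROM logs").fetchall()
print(rows)  # [(1, 'app', 'hello')]
conn.close()
```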
To expand on the discussion in the comments:
The remark about the default argument in my code example above was targeted at the requirement to override sqlite3.Connection's __init__ method.
The __init__ in the class Db is only needed to define the default value database_location on the database argument for the sqlite3.Connection initializer.
If you were willing to pass such a value upon every instantiation of that class, your custom connection class could look like this, and still work the same way, except for that argument:
class Db(sqlite3.Connection):
    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
However, the __init__ method has nothing to do with the context manager protocol as defined in PEP 343.
When it comes to classes, this protocol requires implementing the magic methods __enter__ and __exit__.
The sqlite3.Connection does something along these lines:
class Connection:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_val is None:
            self.commit()
        else:
            self.rollback()
Note: The sqlite3.Connection is provided by a C module, hence does not have a Python class definition. The above reflects what the methods would roughly look like if it did.
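That commit-on-success, rollback-on-exception behavior is easy to verify with an in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

with conn:                     # no exception -> commit()
    conn.execute("INSERT INTO t VALUES (1)")

try:
    with conn:                 # exception -> rollback()
        conn.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("boom")
except RuntimeError:
    pass

result = conn.execute("SELECT x FROM t").fetchall()
print(result)  # [(1,)] -- the second insert was rolled back
conn.close()
```

Note that the connection's context manager only manages the transaction; it does not close the connection on exit.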
Let's say you don't want to keep the same connection open all the time, but rather have a dedicated connection per transaction while maintaining the general interface of the Db class above.
You could do something like this:
# Keep this to have your custom methods available
class Connection(sqlite3.Connection):
    def add_log(self, level, source, log):
        self.execute("INSERT INTO logs(level, source, log) VALUES (?, ?, ?)",
                     (level, source, log))


class DBM:
    def __init__(self, database=database_location):
        self._database = database
        self._conn = None

    def __enter__(self):
        return self._connection()

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Decide whether to commit or roll back
        if exc_val:
            self._connection().rollback()
        else:
            self._connection().commit()
        # close connection
        try:
            self._conn.close()
        except AttributeError:
            pass
        finally:
            self._conn = None

    def _connection(self):
        if self._conn is None:
            # Instantiate your custom sqlite3.Connection
            self._conn = Connection(self._database)
        return self._conn

    # add shortcuts to connection methods as seen fit
    def execute(self, sql, parameters=()):
        with self as temp:
            result = temp.execute(sql, parameters).fetchall()
        return result

    def add_log(self, level, source, log):
        with self as temp:
            temp.add_log(level, source, log)
This can be used in a context and by calling methods on the instance:
db = DBM(database_location)
with db as temp:
    print [i for i in temp.execute("SELECT * FROM logs")]
    temp.add_log(1, "foo", "I MADE MASHED POTATOES")

# The methods execute and add_log are only available from
# the outside because the shortcuts have been added to DBM
print [i for i in db.execute("SELECT * FROM logs")]
db.add_log(1, "foo", "I MADE MASHED POTATOES")
For further reading on context managers refer to the official documentation. I'll also recommend Jeff Knupp's nice introduction. Also, the aforementioned PEP 343 is worth having a look at for the technical specification and rationale behind that protocol.
I'm trying to figure out how to chain class methods to improve a utility class I've been writing - for reasons I'd prefer not to get into :)
Now suppose I wanted to chain class methods on a class instance (in this case for setting the cursor), e.g.:
# initialize the class instance
db = CRUD(table='users', public_fields=['name', 'username', 'email'])

# the desired interface: class_instance.cursor(<cursor>).method(...)
with sql.read_pool.cursor() as c:
    db.cursor(c).get(target='username', where="omarlittle")
The part that's confusing is that I would prefer the cursor not to persist as a class attribute after .get(...) has been called and returned; I'd like to require that .cursor(cursor) must be called first.
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query.format(fields=self.fields, table=self.table, target=target,
                     where=where)
        self.cursor.execute(query)

    def cursor(self, cursor):
        pass  # this is where I get lost.
If I understand what you're asking, what you want is for the cursor method to return some object with a get method that works as desired. There's no reason the object it returns has to be self; it can instead return an instance of some cursor type.
That instance could have a back-reference to self, or it could get its own copy of whatever internals are needed to be a cursor, or it could be a wrapper around an underlying object from your low-level database library that knows how to be a cursor.
If you look at the DB API 2.0 spec, or implementations of it like the stdlib's sqlite3, that's exactly how they do it: A Database or Connection object (the thing you get from the top-level connect function) has a cursor method that returns a Cursor object, and that Cursor object has an execute method.
So:
class CRUDCursor(object):
    def __init__(self, c, crud):
        self.crud = crud
        self.cursor = however_you_get_an_actual_sql_cursor(c)

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query = query.format(fields=self.crud.fields(), table=self.crud.table,
                             target=target, where=where)
        self.cursor.execute(query)
        # you may want this to return something as well?


class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    # no get method

    def cursor(self, cursor):
        return CRUDCursor(cursor, self)
However, there still seems to be a major problem with your example. Normally, after you execute a SELECT statement on a cursor, you want to fetch the rows from that cursor. You're not keeping the cursor object around in your "user" code, and you explicitly don't want the CRUD object to keep its cursor around, so… how do you expect to do that? Maybe get is supposed to return self.cursor.fetchall() at the end or something?
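For reference, the stdlib pattern described above (top-level connect → `cursor()` → `execute()` → fetch) looks like this with sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # top-level connect() -> Connection
cur = conn.cursor()                 # Connection.cursor() -> Cursor
cur.execute("SELECT 40 + 2")        # Cursor.execute()
row = cur.fetchone()                # fetch results from the same cursor
print(row)  # (42,)
conn.close()
```

The cursor returned by `cursor()` is exactly the kind of intermediate object the `CRUDCursor` sketch mimics: it carries the query state so the connection object doesn't have to.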
I'm trying to implement a wrapper around a redis database that does some bookkeeping, and I thought about using descriptors. I have an object with a bunch of fields: frames, failures, etc., and I need to be able to get, set, and increment the field as needed. I've tried to implement an Int-Like descriptor:
class IntType(object):
    def __get__(self, instance, owner):
        # issue a GET database command
        return db.get(my_val)

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)

    def increment(self, instance, count):
        # issue an INCRBY database command
        db.hincrby(instance.name, count)


class Stream:
    _prefix = 'stream'
    frames = IntType()
    failures = IntType()
    uuid = StringType()

s = Stream()
s.frames.increment(1)  # 'float' object has no attribute 'increment'
It seems like I can't access the increment() method in my descriptor. I can't have increment defined on the object that __get__ returns, as that would require an additional db query when all I want to do is increment! I also don't want increment() on the Stream class, because later on, when I want additional fields like strings or sets in Stream, I'd need to type-check the heck out of everything.
Does this work?
class Stream:
    _prefix = 'stream'

    def __init__(self):
        self.frames = IntType()
        self.failures = IntType()
        self.uuid = StringType()
Why not define the magic method __iadd__ as well as __get__ and __set__? This will allow you to do normal addition with assignment on the class. It also means you can treat the increment separately from the get function and thereby minimise the database accesses.
So change:
def increment(self, instance, count):
    # issue an INCRBY database command
    db.hincrby(instance.name, count)
to:
def __iadd__(self, other):
    # your code goes here
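One way this could fit together (a sketch, not the poster's actual setup: a plain dict stands in for the redis client, the `IntProxy` wrapper and `__set_name__` wiring are my assumptions) is to have `__get__` return a lightweight proxy whose `__iadd__` issues the single INCRBY, and have `__set__` ignore the proxy handed back by the `+=` machinery:

```python
# In-memory stand-in for the redis client; real code would call
# db.get / db.set / db.hincrby on a redis connection instead.
db = {}

class IntProxy:
    """Lightweight value returned by __get__; += issues one INCRBY."""
    def __init__(self, key):
        self.key = key

    def __iadd__(self, count):
        db[self.key] = db.get(self.key, 0) + count  # one INCRBY, no GET
        return self           # handed back to __set__ unchanged

    def value(self):
        return db.get(self.key, 0)                  # one GET

class IntType:
    def __set_name__(self, owner, name):
        self.name = name      # field name, e.g. 'frames' (Python 3.6+)

    def __get__(self, instance, owner):
        return IntProxy(self.name)

    def __set__(self, instance, val):
        if isinstance(val, IntProxy):
            return            # came from +=, already written by __iadd__
        db[self.name] = val   # one SET

class Stream:
    frames = IntType()

s = Stream()
s.frames = 10                 # SET
s.frames += 5                 # single INCRBY
print(s.frames.value())       # 15
```

`s.frames += 5` desugars to `s.frames = s.frames.__iadd__(5)`, so the `isinstance` check in `__set__` is what keeps the write-back from costing a second command.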
Try this:
class IntType(object):
    def __get__(self, instance, owner):
        name = instance.name

        class IntValue():
            def increment(self, count):
                # issue an INCRBY database command
                db.hincrby(name, count)

            def getValue(self):
                # issue a GET database command
                return db.get(name)

        return IntValue()

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)
I'm looking to encapsulate logic for database transactions into a with block; wrapping the code in a transaction and handling various exceptions (locking issues). This is simple enough, however I'd like to also have the block encapsulate the retrying of the code block following certain exceptions. I can't see a way to package this up neatly into the context manager.
Is it possible to repeat the code within a with statement?
I'd like to use it as simply as this, which is really neat.
def do_work():
    ...
    # This is ideal!
    with transaction(retries=3):
        # Atomic DB statements
        ...
    ...
I'm currently handling this with a decorator, but I'd prefer to offer the context manager (or in fact both), so I can choose to wrap a few lines of code in the with block instead of an inline function wrapped in a decorator, which is what I do at the moment:
def do_work():
    ...
    # This is not ideal!
    @transaction(retries=3)
    def _perform_in_transaction():
        # Atomic DB statements
        ...
    _perform_in_transaction()
    ...
Is it possible to repeat the code within a with statement?
No.
As pointed out earlier in that mailing list thread, you can reduce a bit of duplication by making the decorator call the passed function:
def do_work():
    ...
    # This is not ideal!
    @transaction(retries=3)
    def _perform_in_transaction():
        # Atomic DB statements
        ...
    # called implicitly
    ...
The way that occurs to me to do this is just to implement a standard database transaction context manager, but allow it to take a retries argument in the constructor. Then I'd just wrap that up in your method implementations. Something like this:
class transaction(object):
    def __init__(self, retries=0):
        self.retries = retries

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, traceback):
        pass

    # Implementation...
    def execute(self, query):
        err = None
        for _ in range(self.retries):
            try:
                return self._cursor.execute(query)
            except Exception as e:
                err = e  # probably ought to save all errors, but hey
        raise err

with transaction(retries=3) as cursor:
    cursor.execute('BLAH')
As decorators are just functions themselves, you could do the following:
with transaction(_perform_in_transaction, retries=3) as _perf:
    _perf()
For the details, you'd need to implement transaction() as a factory that returns an object whose __call__() invokes the original method and repeats it up to retries number of times on failure; __enter__() and __exit__() would be defined as normal for database transaction context managers.
You could alternatively set up transaction() such that it itself executes the passed method up to retries number of times, which would probably require about the same amount of work as implementing the context manager but would mean actual usage would be reduced to just transaction(_perform_in_transaction, retries=3) (which is, in fact, equivalent to the decorator example delnan provided).
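A minimal sketch of that factory (my reading of the suggestion; the enter/exit methods are no-ops here, where a real implementation would begin and commit/roll back the transaction):

```python
class transaction:
    """Wraps a callable; calling the instance retries it on failure.
    Also usable as a context manager (enter/exit are stubs here)."""
    def __init__(self, func, retries=1):
        self.func = func
        self.retries = retries

    def __enter__(self):
        return self            # real code would begin the transaction

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False           # real code would commit or roll back

    def __call__(self, *args, **kwargs):
        err = None
        for _ in range(self.retries):
            try:
                return self.func(*args, **kwargs)
            except Exception as e:
                err = e
        raise err

calls = []
def _perform_in_transaction():
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("retry me")

with transaction(_perform_in_transaction, retries=3) as _perf:
    _perf()                    # fails twice, succeeds on the third try

print(len(calls))  # 3
```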
While I agree it can't be done with a context manager... it can be done with two context managers!
The result is a little awkward, and I am not sure whether I approve of my own code yet, but this is what it looks like as the client:
with RetryManager(retries=3) as rm:
    while rm:
        with rm.protect:
            print("Attempt #%d of %d" % (rm.attempt_count, rm.max_retries))
            # Atomic DB statements
There is an explicit while loop still, and not one, but two, with statements, which leaves a little too much opportunity for mistakes for my liking.
Here's the code:
class RetryManager(object):
    """ Context manager that counts attempts to run statements without
        exceptions being raised.
        - returns True when there should be more attempts
    """

    class _RetryProtector(object):
        """ Context manager that only raises exceptions if its parent
            RetryManager has given up."""
        def __init__(self, retry_manager):
            self._retry_manager = retry_manager

        def __enter__(self):
            self._retry_manager._note_try()
            return self

        def __exit__(self, exc_type, exc_val, traceback):
            if exc_type is None:
                self._retry_manager._note_success()
            else:
                # This would be a good place to implement sleep between
                # retries.
                pass
            # Suppress exception if the retry manager is still alive.
            return self._retry_manager.is_still_trying()

    def __init__(self, retries=1):
        self.max_retries = retries
        self.attempt_count = 0  # Note: 1-based.
        self._success = False
        self.protect = RetryManager._RetryProtector(self)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, traceback):
        pass

    def _note_try(self):
        self.attempt_count += 1

    def _note_success(self):
        self._success = True

    def is_still_trying(self):
        return not self._success and self.attempt_count < self.max_retries

    def __bool__(self):
        return self.is_still_trying()
Bonus: I know you don't want to separate your work off into separate functions wrapped with decorators... but if you were happy with that, the redo package from Mozilla offers decorators to do that, so you don't have to roll your own. There is even a context manager that effectively acts as a temporary decorator for your function, but it still relies on your retryable code being factored out into a single function.
This question is a few years old but after reading the answers I decided to give this a shot.
This solution requires the use of a "helper" class, but I think it does provide an interface with retries configured through a context manager.
class Client:
    def _request(self):
        # do request stuff
        print("tried")
        raise Exception()

    def request(self):
        retry = getattr(self, "_retry", None)
        if not retry:
            return self._request()
        else:
            for n in range(retry.tries):
                try:
                    return self._request()
                except Exception:
                    retry.attempts += 1


class Retry:
    def __init__(self, client, tries=1):
        self.client = client
        self.tries = tries
        self.attempts = 0

    def __enter__(self):
        self.client._retry = self

    def __exit__(self, *exc):
        print(f"Tried {self.attempts} times")
        del self.client._retry
>>> client = Client()
>>> with Retry(client, tries=3):
...     # will try 3 times
...     response = client.request()
tried
tried
tried
Tried 3 times