I have a project written in Python 2.7 where the main program needs frequent access to an sqlite3 database for writing logs and measurement results, getting settings, and so on.
At the moment I have a db module with functions such as add_log(), get_setting(), and each function in there basically looks like:
def add_log(logtext):
    try:
        db = sqlite3.connect(database_location)
    except sqlite3.DatabaseError as e:
        db.close()  # try to gracefully close the db
        return "ERROR (ADD_LOG): While opening db: {}".format(e)
    try:
        # Use the connection as a context manager to automatically commit
        # or roll back changes. When using the context manager, the
        # execute function of the db is used instead of the cursor.
        with db:
            db.execute("insert into logs(level, source, log) values (?, ?, ?)",
                       (level, source, logtext))
    except sqlite3.DatabaseError as e:
        return "ERROR (ADD_LOG): While adding log to db: {}".format(e)
    return "OK"
(some additional code and comments removed).
It seems I should write a class that extends the base sqlite3 connection object, so that the connection is created only once (at the beginning of the main program) and the object then contains functionality such as:
class Db(sqlite3.Connection):
    def __init__(self, db_location=database_location):
        try:
            self = sqlite3.connect(db_location)
            return self
        except sqlite3.DatabaseError as e:
            self.close()  # try to gracefully close the db

    def add_log(self, logtext):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
It seems this should be fairly straightforward, but I can't seem to get it working.
It seems there is some useful advice here:
Python: How to successfully inherit Sqlite3.Cursor and add my customized method, but I can't seem to understand how to use a similar construct for my purpose.
You are not that far away.
First of all, a class initializer cannot return anything but None (emphasis mine):
Because __new__() and __init__() work together in constructing objects (__new__() to create it, and __init__() to customise it), no non-None value may be returned by __init__(); doing so will cause a TypeError to be raised at runtime.
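To see that concretely, here is a minimal illustrative example (the class name is made up):

```python
class BadInit(object):
    def __init__(self):
        # Returning a non-None value from __init__ raises a TypeError
        # at instantiation time.
        return 42

raised = False
try:
    BadInit()
except TypeError as e:
    raised = True
    print("caught:", e)
```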
Second, you overwrite the current instance self of your Db object with a sqlite3.Connection object right in the initializer. That makes subclassing SQLite's connection object a bit pointless.
You just need to fix your __init__ method to make this work:
class Db(sqlite3.Connection):
    # If you didn't use the default argument, you could omit overriding
    # __init__ altogether
    def __init__(self, database=database_location, **kwargs):
        super(Db, self).__init__(database=database, **kwargs)

    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
That lets you use instances of your class as context managers:
with Db() as db:
    print [i for i in db.execute("SELECT * FROM logs")]
    db.add_log("I LAUNCHED THAT PUG INTO SPACE!", 42, "Right there")
Maurice Meyer said in the comments of the question that methods such as execute() are cursor methods and, per the DB-API 2.0 specs, that's correct.
However, sqlite3's connection objects offer a few shortcuts to cursor methods:
This is a nonstandard shortcut that creates an intermediate cursor object by calling the cursor method, then calls the cursor’s execute method with the parameters given.
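A quick illustration of that shortcut with an in-memory database (Python 3 print syntax here, unlike the Python 2 snippets above; table layout matches the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (level INTEGER, source TEXT, log TEXT)")

# Connection.execute() creates an intermediate cursor behind the scenes,
# calls that cursor's execute() with the given parameters, and returns it.
conn.execute("INSERT INTO logs VALUES (?, ?, ?)", (1, "main", "hello"))
rows = conn.execute("SELECT level, source, log FROM logs").fetchall()
print(rows)  # [(1, 'main', 'hello')]
conn.close()
```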
To expand on the discussion in the comments:
The remark about the default argument in my code example above was targeted at the requirement to override sqlite3.Connection's __init__ method.
The __init__ in the class Db is only needed to define the default value database_location on the database argument for the sqlite3.Connection initializer.
If you were willing to pass such a value upon every instantiation of that class, your custom connection class could look like this, and still work the same way, except for that argument:
class Db(sqlite3.Connection):
    def add_log(self, logtext, level, source):
        self.execute("insert into logs(level, source, log) values (?, ?, ?)",
                     (level, source, logtext))
However, the __init__ method has nothing to do with the context manager protocol as defined in PEP 343.
When it comes to classes, this protocol requires implementing the magic methods __enter__ and __exit__.
The sqlite3.Connection does something along these lines:
class Connection:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_val is None:
            self.commit()
        else:
            self.rollback()
Note: The sqlite3.Connection is provided by a C module, hence does not have a Python class definition. The above reflects what the methods would roughly look like if it did.
Let's say you don't want to keep the same connection open all the time, but rather have a dedicated connection per transaction while maintaining the general interface of the Db class above.
You could do something like this:
# Keep this to have your custom methods available
class Connection(sqlite3.Connection):
    def add_log(self, level, source, log):
        self.execute("INSERT INTO logs(level, source, log) VALUES (?, ?, ?)",
                     (level, source, log))


class DBM:
    def __init__(self, database=database_location):
        self._database = database
        self._conn = None

    def __enter__(self):
        return self._connection()

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Decide whether to commit or roll back
        if exc_val:
            self._connection().rollback()
        else:
            self._connection().commit()
        # close connection
        try:
            self._conn.close()
        except AttributeError:
            pass
        finally:
            self._conn = None

    def _connection(self):
        if self._conn is None:
            # Instantiate your custom sqlite3.Connection
            self._conn = Connection(self._database)
        return self._conn

    # add shortcuts to connection methods as seen fit
    def execute(self, sql, parameters=()):
        with self as temp:
            result = temp.execute(sql, parameters).fetchall()
        return result

    def add_log(self, level, source, log):
        with self as temp:
            temp.add_log(level, source, log)
This can be used in a context and by calling methods on the instance:
db = DBM(database_location)

with db as temp:
    print [i for i in temp.execute("SELECT * FROM logs")]
    temp.add_log(1, "foo", "I MADE MASHED POTATOES")

# The methods execute and add_log are only available from
# the outside because the shortcuts have been added to DBM
print [i for i in db.execute("SELECT * FROM logs")]
db.add_log(1, "foo", "I MADE MASHED POTATOES")
For further reading on context managers refer to the official documentation. I'll also recommend Jeff Knupp's nice introduction. Also, the aforementioned PEP 343 is worth having a look at for the technical specification and rationale behind that protocol.
Related
I'll try to show all the functions that are being used here:
def main():
    peers = [*list with many peerIDs*]
    asyncio.run(async_peerImport.importPeers(peers))


async def importPeers(peers):
    # divideList is just a function I made to split lists into lists of smaller lists
    dividedPeers = divideList(peers, 250)
    for peers in dividedPeers:
        await asyncio.gather(*[importPeer(peerID) for peerID in peers])


async def importPeer(peerID):
    fetchPeerDataTask = asyncio.create_task(async_requests.fetchPeerData(peerID))
    getTorStatusTask = asyncio.create_task(async_requests.fetchPeerData(peerID))
    peerData = await fetchPeerDataTask
    torStatus = await getTorStatusTask
    if peerData is not None and torStatus is not None:
        db.upsertPeer(peerID, torStatus, peerData)  # the function in question
        print("peer imported:", peerID)
class db:
    def upsertPeer(self, peerID, torStatus, peerData):
        try:
            sql = "INSERT INTO peers (peerID, torStatus, peerData) VALUES (%s, %s, %s);"
            self.cursor.execute(sql, [peerID, torStatus, json.dumps(peerData)])
            print("peer inserted")
        except psycopg2.errors.UniqueViolation:
            sql = "UPDATE peers SET torStatus = %s, peerData = %s WHERE peerID = %s;"
            self.cursor.execute(sql, [torStatus, json.dumps(peerData), peerID])
            print("peer updated")
        finally:
            self.connection.commit()
Hopefully you can see what should be happening? If it is more difficult than I thought then tell me and I'll add a bunch of comments.
I don't see a reason for this not to work, but when I run it, I get this error:
File "c:\path\async_peerImport.py", line 25, in importPeer
    db.upsertPeer(peerID, torStatus, peerData)
TypeError: upsertPeer() missing 1 required positional argument: 'peerData'
I tried adding some random object into the first position and moved all the other arguments up (a total of 4), and it said that object has no attribute 'connection'. I have also run upsertPeer() on its own and it does work with those 3 arguments. I am completely lost here. Am I doing something wrong?
Again, if anything I have tried to explain doesn't make sense just tell me and I'll try better.
Thanks.
This bit:
class db:
    def upsertPeer(self, peerID, torStatus, peerData):
        # code here
defines an instance method of a class. In order to access it, you need to access it through an instance of the db class, which might look like
my_db = db()
my_db.upsertPeer(peerId, torStatus, peerData)
In which case the value of self is my_db and it is implicitly passed without any intervention from you.
If you are attempting to make a class method, which can be used in the way your code does, try this:
class db:
    @classmethod
    def upsertPeer(cls, peerID, torStatus, peerData):
        # code here
Note that there's still an implicit first argument, but it's the db class object, not an instance of it.
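A runnable sketch of that difference (the method body is a stand-in for the real SQL code):

```python
class db:
    @classmethod
    def upsertPeer(cls, peerID, torStatus, peerData):
        # cls (the class object itself) is passed implicitly, so calling
        # db.upsertPeer(...) with three arguments works without an instance.
        return (cls.__name__, peerID, torStatus, peerData)

# No instance needed; the class is the implicit first argument.
result = db.upsertPeer(42, "up", {"addr": "1.2.3.4"})
print(result)  # ('db', 42, 'up', {'addr': '1.2.3.4'})
```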
I have a wrapper class that I subclass out to the many database connections we use. I want something that will behave similarly to the try/finally functionality, but still give me the flexibility and subclassing potential of a class. Here is my base class, I would like to replace the __del__ method because I had read somewhere that there is no guarantee when this will run.
import logging
from typing import Any, MutableMapping, Optional, Tuple

logger = logging.getLogger(__name__)


class DatabaseConnBase:
    def __init__(self, conn_func, conn_info: MutableMapping):
        self._conn = None
        self.conn_func = conn_func
        self.conn_info = conn_info

    def _initialize_connection(self):
        self._conn = self.conn_func(**self.conn_info)

    @property
    def conn(self):
        if not self._conn:
            self._initialize_connection()
        return self._conn

    @property
    def cursor(self):
        if not self._conn:
            self._initialize_connection()
        return self._conn.cursor()

    def __del__(self):
        if self._conn:
            self._conn.close()

    def commit(self):
        if not self._conn:
            raise AttributeError(
                'The connection has not been initialized, unable to commit '
                'transaction.'
            )
        self._conn.commit()

    def execute(
        self,
        query_or_stmt: str,
        verbose: bool = True,
        has_res: bool = False,
        auto_commit: bool = False
    ) -> Optional[Tuple[Any]]:
        """
        Creates a new cursor object, and executes the query/statement. If
        `has_res` is `True`, then it returns the list of tuple results.

        :param query_or_stmt: The query or statement to run.
        :param verbose: If `True`, prints out the statement or query to STDOUT.
        :param has_res: Whether or not results should be returned.
        :param auto_commit: Immediately commits the changes to the database
                            after the execute is performed.

        :return: If `has_res` is `True`, then a list of tuples.
        """
        cur = self.cursor
        if verbose:
            logger.info(f'Using {cur}')
            logger.info(f'Executing:\n{query_or_stmt}')

        cur.execute(query_or_stmt)

        if auto_commit:
            logger.info('Committing transaction...')
            self.commit()

        if has_res:
            logger.info('Returning results...')
            return cur.fetchall()
As per the top answer on this related question: What is the __del__ method, How to call it?
The __del__ method will be called when your object is garbage collected, but as you noted, there are no guarantees on how long after destruction of references to the object this will happen.
In your case, since you check for the existence of the connection before attempting to close it, your __del__ method is safe to be called multiple times, so you could simply call it explicitly before you destroy the reference to your object.
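For deterministic cleanup, another option is to add the context-manager protocol to the wrapper, so the connection is closed exactly when the with-block exits rather than whenever garbage collection happens. A minimal sketch, keeping the same lazy `_conn`/`conn_func` attributes as above (sqlite3 is used here only as a stand-in connection factory):

```python
import sqlite3

class DatabaseConnBase:
    def __init__(self, conn_func, conn_info):
        self._conn = None
        self.conn_func = conn_func
        self.conn_info = conn_info

    @property
    def conn(self):
        # Lazily initialize the connection, as in the original class
        if not self._conn:
            self._conn = self.conn_func(**self.conn_info)
        return self._conn

    def close(self):
        # Safe to call multiple times, just like the __del__ above
        if self._conn:
            self._conn.close()
            self._conn = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()  # runs exactly when the with-block exits

# The connection is closed as soon as the block ends, not at GC time
with DatabaseConnBase(sqlite3.connect, {"database": ":memory:"}) as db:
    row = db.conn.execute("SELECT 1").fetchone()
    print(row)  # (1,)
```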
How can I remove the repetition (DRY) and make code like this more Pythonic? I know I can make the insertion a function, but I don't like that: many classes will end up with too many functions, because code like this, with different implementations, is everywhere.
The injected db_conn is managed by Nameko's (microservice framework) DependencyProvider; the closing of the connection is done by the DependencyProvider when a worker has already finished its job.
I also made the db_conn object compatible with with statement. I close the connection in the __exit__ method.
Here is the current code.
class CommandChatBot(BaseChatBot):
    def __init__(self, db_conn=None):
        self.db_conn = None
        if db_conn:
            self.db_conn = db_conn

    def add_interaction(self, question, answer, recipient):
        if self.db_conn:
            self.db_conn.use_or_create_db(db=recipient)
            return self.db_conn.insert(
                table=schemas.commandchatbot.TABLE,
                document={
                    'question': question,
                    'answer': answer
                }
            )
        else:
            with db_connection_factory() as conn:
                conn.use_or_create_db(db=recipient)
                return conn.insert(
                    table=schemas.commandchatbot.TABLE,
                    document={
                        'question': question,
                        'answer': answer
                    }
                )
The code above is also inefficient when a dependency is not injected, because each function must instantiate its own db_conn object. I'm thinking about something like a with statement but for the whole class; is that possible?
Here's the closing of the connection when a dependency is a subclass of DependencyProvider: get_dependency() is called when a microservice dispatches a new worker, and worker_teardown() is called on worker deletion, i.e. once all the code inside the microservice has been executed.
class DbConnectionProvider(DependencyProvider):
    def __init__(self):
        self.db_conn = db_connection_factory()

    def get_dependency(self, worker_ctx):
        return self.db_conn

    def worker_teardown(self, worker_ctx):
        self.db_conn.close_conn()
One of the objects generated by db_connection_factory():
class RethinkDbAdapter(DatabaseAdapter):
    def __init__(self, db=None, create_db_if_not_exist=True):
        uri = build_nosqldatabase_uri(return_dict=True)
        self.raw_connection = r.connect(
            host=uri['host'],
            port=uri['port']
        )

    ...........

    # __enter__ and __exit__ allow the use of the 'with' statement
    def __enter__(self):
        return self

    # Close connection when the 'with' statement goes out of scope
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.raw_connection.close()

    def close_conn(self):
        """
        Only used in Nameko DependencyProvider.
        """
        return self.raw_connection.close()

    .........
It looks like you could encapsulate the following code in another method that you can call from either the if or the else block:
insert(
    table=schemas.commandchatbot.TABLE,
    document={
        'question': question,
        'answer': answer
    }
)
I'm trying to figure out how to chain class methods to improve a utility class I've been writing, for reasons I'd prefer not to get into :)
Now suppose I wanted to chain class methods on a class instance (in this case for setting the cursor), e.g.:
# initialize the class instance
db = CRUD(table='users', public_fields=['name', 'username', 'email'])

# the desired interface: class_instance.cursor(<cursor>).method(...)
with sql.read_pool.cursor() as c:
    db.cursor(c).get(target='username', where="omarlittle")
The part that's confusing is that I would prefer the cursor not to persist as a class attribute after .get(...) has been called and returned; I'd like to require that .cursor(cursor) must be called first.
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query.format(fields=self.fields, table=self.table, target=target,
                     where=where)
        self.cursor.execute(query)

    def cursor(self, cursor):
        pass  # this is where I get lost.
If I understand what you're asking, what you want is for the cursor method to return some object with a get method that works as desired. There's no reason the object it returns has to be self; it can instead return an instance of some cursor type.
That instance could have a back-reference to self, or it could get its own copy of whatever internals are needed to be a cursor, or it could be a wrapper around an underlying object from your low-level database library that knows how to be a cursor.
If you look at the DB API 2.0 spec, or implementations of it like the stdlib's sqlite3, that's exactly how they do it: A Database or Connection object (the thing you get from the top-level connect function) has a cursor method that returns a Cursor object, and that Cursor object has an execute method.
So:
class CRUDCursor(object):
    def __init__(self, c, crud):
        self.crud = crud
        self.cursor = however_you_get_an_actual_sql_cursor(c)

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query = query.format(fields=self.crud.fields(), table=self.crud.table,
                             target=target, where=where)
        self.cursor.execute(query)
        # you may want this to return something as well?


class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    # no get method

    def cursor(self, cursor):
        return CRUDCursor(cursor, self)
However, there still seems to be a major problem with your example. Normally, after you execute a SELECT statement on a cursor, you want to fetch the rows from that cursor. You're not keeping the cursor object around in your "user" code, and you explicitly don't want the CRUD object to keep its cursor around, so… how do you expect to do that? Maybe get is supposed to return self.cursor.fetchall() at the end or something?
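Putting the pieces together as a runnable sketch, with get returning the fetched rows. This uses sqlite3 and a bound parameter for the WHERE value instead of pure string formatting, to sidestep the injection issue the question acknowledges; the sample data is invented:

```python
import sqlite3

class CRUDCursor(object):
    def __init__(self, cursor, crud):
        self.crud = crud
        self.cursor = cursor

    def get(self, target, where):
        # Field and table names come from trusted class configuration;
        # the looked-up value is passed as a bound parameter.
        query = "SELECT {fields} FROM {table} WHERE {target} = ?".format(
            fields=self.crud.fields(), table=self.crud.table, target=target)
        self.cursor.execute(query, (where,))
        return self.cursor.fetchall()

class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join(self.public_fields)

    def cursor(self, cursor):
        # Returns a throwaway wrapper, so no cursor persists on CRUD
        return CRUDCursor(cursor, self)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, username TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('Omar', 'omarlittle', 'o@example.com')")

db = CRUD(table='users', public_fields=['name', 'username', 'email'])
rows = db.cursor(conn.cursor()).get(target='username', where='omarlittle')
print(rows)  # [('Omar', 'omarlittle', 'o@example.com')]
```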
I'm using MySQLdb to connect to MySQL using python. My tables are all InnoDB and I'm using transactions.
I'm struggling to come up with a way to 'share' transactions across functions. Consider the following pseudocode:
def foo():
    db = connect()
    cur = db.cursor()
    try:
        cur.execute(...)
        db.commit()
    except:
        db.rollback()

def bar():
    db = connect()
    cur = db.cursor()
    try:
        cur.execute(...)
        foo()  # note this call
        db.commit()
    except:
        db.rollback()
At some points in my code, I need to call foo() and at some points I need to call bar(). What's the best practice here? How would I tell the call to foo() to commit() if called outside bar() but not inside bar()? This is obviously more complex if there are multiple threads calling foo() and bar() and the calls to connect() don't return the same connection object.
UPDATE
I found a solution which works for me. I've wrapped connect() so that it increments a counter when called, and commit() decrements that counter; a real commit only happens when the counter reaches zero. You therefore get this:
def foo():
    db = connect()  # internal counter = 1
    ...
    db.commit()     # internal counter = 0, so commit

def bar():
    db = connect()  # internal counter = 1
    ...
    foo()           # counter goes to 2, then back to 1 when foo's
                    # commit() is called, so no actual commit happens
    db.commit()     # internal counter = 0, so commit
You can take advantage of Python's default function arguments in this case:
def foo(cur=None):
    inside_larger_transaction = True
    if cur is None:
        # No cursor was passed in, so this call manages its own transaction
        db = connect()
        cur = db.cursor()
        inside_larger_transaction = False
    try:
        cur.execute(...)
        if not inside_larger_transaction:
            db.commit()
    except:
        if not inside_larger_transaction:
            db.rollback()
        raise
So, if bar is calling foo, it just passes in the cursor object as a parameter.
Note that I don't see much sense in creating a separate cursor object for each small function. You should either write your several functions as methods of an object and have a cursor attribute, or always pass the cursor explicitly (in that case, use another named parameter to indicate whether the current function is part of a larger transaction or not).
Another option is to create a context-manager class to make your commits and encapsulate all transactions within it. None of your functions would then commit the transaction themselves; both the commit and rollback calls live in the __exit__ method of this object.
class Transaction(object):
    def __enter__(self):
        self.db = connect()
        cursor = self.db.cursor()
        return cursor

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.db.commit()
        else:
            self.db.rollback()
And just use it like this:
def foo(cur):
    cur.execute(...)

def bar(cur):
    cur.execute(...)
    foo(cur)

with Transaction() as cursor:
    foo(cursor)

with Transaction() as cursor:
    bar(cursor)
The cleanest way IMO is to pass the connection object to foo and bar.
Declare your connections outside the functions and pass them to the functions as arguments:

foo(cur, conn)
bar(cur, conn)