We have a little bit of a complicated setup:
In our normal code, we connect manually to a mysql db. We're doing this because I guess the connections django normally uses are not threadsafe? So we let django make the connection, extract the information from it, and then use a mysqldb connection to do the actual querying.
Our code is largely an update process, so we have autocommit turned off to save time.
For ease of creating test data, I created django models that represent the tables, and use them to create rows to test on. So I have functions like:
def make_thing(**overrides):
fields = deepcopy(DEFAULT_THING)
fields.update(overrides)
s = Thing(**fields)
s.save()
transaction.commit(using='ourdb')
reset_queries()
return s
However, it doesn't seem to actually be committing! After I make an object, I later have code that executes raw sql against the mysqldb connection:
def get_information(self, value):
print self.api.rawSql("select count(*) from thing")[0][0]
query = 'select info from thing where column = %s' % value
return self.api.rawSql(query)[0][0]
This print statement prints 0! Why?
Also, if I turn autocommit off, I get
TransactionManagementError: This is forbidden when an 'atomic' block is active.
when we try to alter the autocommit level later.
EDIT: I also just tried https://groups.google.com/forum/#!topic/django-users/4lzsQAWYwG0, which did not help.
EDIT2: I checked from a shell against the database--the commit is working, it's just not getting picked up. I've tried setting the transaction isolation level but it isn't helping. I should add that a function further up from get_information uses this decorator:
def single_transaction(fn):
from django.db import transaction
from django.db import connection
def wrapper(*args, **kwargs):
prior_autocommit = transaction.get_autocommit()
transaction.set_autocommit(False)
connection.cursor().execute('set transaction isolation level read committed')
connection.cursor().execute("SELECT ##session.tx_isolation")
try:
result = fn(*args, **kwargs)
transaction.commit()
return result
finally:
transaction.set_autocommit(prior_autocommit)
django.db.reset_queries()
gc.collect()
wrapper.__name__ = fn.__name__
return wrapper
Related
I have a program that uses a couple sql databases to store data. I have a class that manages the various sql functions, such as getting a value, an entire table or just updating a value. All of the processes work fine until I run a function that uses UPDATE. I execute an UPDATE command and try to commit the change and the database is always locked. Every function I have in my custom sql class has
cursor.close
database.close
So there shouldn't be any issue with the database connection still being open. Am I missing something in this syntax that is not connecting to the database correctly? I used the extra print statements in an attempt to find out where the problem is occurring, so those can be ignored.
import sqlite3 as db
import os
databaseName = "site"
class MassDb:
def __init__(self,databaseName):
super(MassDb, self).__init__()
print("Current Directory: ",os.getcwd())
self.databaseName = databaseName
def updateValue(self, location, metric, input_value):
print("OPEN CONNECTION UPDATE - running updateValue: ",location, metric, input_value)
if self.databaseName == "site":
try:
siteConn = db.connect("site_data.db")
siteCursor = siteConn.cursor()
siteCursor.execute("UPDATE sites SET " + metric + " = ? WHERE LOCATI ON = ?", (input_value, location))
siteConn.commit()
except:
print("UPDATE FAILED")
finally:
siteCursor.close
siteConn.close
elif self.databaseName == "comp":
try:
compConn = db.connect("comp_data.db")
compCursor = compConn.cursor()
compCursor.execute("UPDATE competitors SET " + metric + " = ? WHERE NAME = ?", (input_value, location))
compConn.commit()
except:
print("UPDATE FAILED")
finally:
compCursor.close
compConn.close
print("CLOSED CONNECTION UPDATE - Update Connection Closed")
else:
print("Update Error")
MassDb("site").updateValue("Location", "CURRENT_SCORE", "100")
As #roganjosh commented, my problem was that I wasn't properly closing the database. If
commit()
is used, there's no need to close the database. However,
cursor.close()
and
conn.close()
need to be written as such. Leaving off the parentheses would be as though an attribute is being referenced, rather than a method. In order to execute the close method, the () must be present. Seems obvious now, but I wasn't aware at the time. Hopefully this can help someone else if they too run across this.
Additionally, using a context manager works and eliminates the need to use close()
with conn:
#do stuff here
commit()
I need to test a function with different parameters, and the most proper way for this seems to be using the with self.subTest(...) context manager.
However, the function writes something to the db, and it ends up in an inconsistent state. I can delete the things I write, but it would be cleaner if I could recreate the whole db completely. Is there a way to do that?
Not sure how to recreate the database in self.subTest() but I have another technique I am currently using which might be of interest to you. You can use fixtures to create a "snapshot" of your database which will basically be copied in a second database used only for testing purposes. I currently use this method to test code on a big project I'm working on at work.
I'll post some example code to give you an idea of what this will look like in practice, but you might have to do some extra research to tailor the code to your needs (I've added links to guide you).
The process is rather straighforward. You would be creating a copy of your database with only the data needed by using fixtures, which will be stored in a .yaml file and accessed only by your test unit.
Here is what the process would look like:
List item you want to copy to your test database to populate it using fixtures. This will only create a db with the needed data instead of stupidly copying the entire db. It will be stored in a .yaml file.
generate.py
django.setup()
stdout = sys.stdout
conf = [
{
'file': 'myfile.yaml',
'models': [
dict(model='your.model', pks='your, primary, keys'),
dict(model='your.model', pks='your, primary, keys')
]
}
]
for fixture in conf:
print('Processing: %s' % fixture['file'])
with open(fixture['file'], 'w') as f:
sys.stdout = FixtureAnonymiser(f)
for model in fixture['models']:
call_command('dumpdata', model.pop('model'), format='yaml',indent=4, **model)
sys.stdout.flush()
sys.stdout = stdout
In your test unit, import your generated .yaml file as a fixture and your test will automatically use this the data from the fixture to carry out the tests, keeping your main database untouched.
test_class.py
from django.test import TestCase
class classTest(TestCase):
fixtures = ('myfile.yaml',)
def setUp(self):
"""setup tests cases"""
# create the object you want to test here, which will use data from the fixtures
def test_function(self):
self.assertEqual(True,True)
# write your test here
You can read up more here:
Django
YAML
If you have any questions because things are unclear just ask, I'd be happy to help you out.
Maybe my solution will help someone
I used transactions to roll back to the database state that I had at the start of the test.
I use Eric Cousineau's decorator function to parametrizing tests
More about database transactions at django documentation page
import functools
from django.db import transaction
from django.test import TransactionTestCase
from django.contrib.auth import get_user_model
User = get_user_model()
def sub_test(param_list):
"""Decorates a test case to run it as a set of subtests."""
def decorator(f):
#functools.wraps(f)
def wrapped(self):
for param in param_list:
with self.subTest(**param):
f(self, **param)
return wrapped
return decorator
class MyTestCase(TransactionTestCase):
#sub_test([
dict(email="new#user.com", password='12345678'),
dict(email="new#user.com", password='password'),
])
def test_passwords(self, email, password):
# open a transaction
with transaction.atomic():
# Creates a new savepoint. Returns the savepoint ID (sid).
sid = transaction.savepoint()
# create user and check, if there only one with this email in DB
user = User.objects.create(email=email, password=password)
self.assertEqual(User.objects.filter(email=user.email).count(), 1)
# Rolls back the transaction to savepoint sid.
transaction.savepoint_rollback(sid)
I am using the SQL Alchemy core version 1.1 and I can't seem to be able to get transaction working inside my falcon (python) application. I have followed the documentation to my knowledge correctly.
edit: database postgresql -> psycopg2cffi
def __init__(self, *args, **kwargs):
self.__conn_url__ = settings.get_db_url()
self.db_engine = create_engine(self.__conn_url__)
self.db_engine.echo = False
self.metadata = MetaData(self.db_engine)
self.connection = self.db_engine.connect()
self.organization_types_table = Table('organization_types', self.metadata, autoload=True)
self.organization_type_names_table = Table('organization_type_names', self.metadata, autoload=True)
def post(self, json_data):
transaction = self.connection.begin()
print(transaction)
try:
results = self.organization_types_table.insert().\
values(date_created=datetime.datetime.now()).\
execute()
organization_types_id = results.inserted_primary_key
results = self.organization_type_names_table.insert().\
values(organization_types_id=organization_types_id[0],
lang=json_data['lang'],
name=json_data['name']).\
execute()
transaction.commit()
print("I didn't rollback")
return results
except:
transaction.rollback()
print("I rollback :D")
raise
If I run this code multiple times inserting the same objects. It should only work for the first time due to a index constraint not allowing duplications.
The results I will get according to my print() statements:
Transaction object -> I didn't rollback
Transaction object -> I rollback :D
Transaction object -> I rollback :D
If I look inside my tables I can clearly see that organization_types_table contains 3 records while organization_type_names contains 1. Why isn't organization_types_table not rolling back according to the transaction?
You need to do
self.connection.execute(query)
instead of
query.execute()
query.execute() executes query on the engine, which acquires a new connection, instead of using the one on which you have a transaction open.
I am running in to the dreaded MySQL Commands out of Sync when using a custom DB library and celery.
The library is as follows:
import pymysql
import pymysql.cursors
from furl import furl
from flask import current_app
class LegacyDB:
"""Db
Legacy Database connectivity library
"""
def __init__(self,app):
with app.app_context():
self.rc = current_app.config['RAVEN']
self.logger = current_app.logger
self.data = {}
# setup Mysql
try:
uri = furl(current_app.config['DBCX'])
self.dbcx = pymysql.connect(
host=uri.host,
user=uri.username,
passwd=uri.password,
db=str(uri.path.segments[0]),
port=int(uri.port),
cursorclass=pymysql.cursors.DictCursor
)
except:
self.rc.captureException()
def query(self, sql, params = None, TTL=36):
# INPUT 1 : SQL query
# INPUT 2 : Parameters
# INPUT 3 : Time To Live
# OUTPUT : Array of result
# check that we're still connected to the
# database before we fire off the query
try:
db_cursor = self.dbcx.cursor()
if params:
self.logger.debug("%s : %s" % (sql, params))
db_cursor.execute(sql,params)
self.dbcx.commit()
else:
self.logger.debug("%s" % sql)
db_cursor.execute(sql)
self.data = db_cursor.fetchall()
if self.data == None:
self.data = {}
db_cursor.close()
except Exception as ex:
if ex[0] == "2006":
db_cursor.close()
self.connect()
db_cursor = self.dbcx.cursor()
if params:
db_cursor.execute(sql,params)
self.dbcx.commit()
else:
db_cursor.execute(sql)
self.data = db_cursor.fetchall()
db_cursor.close()
else:
self.rc.captureException()
return self.data
The purpose of the library is to work alongside SQLAlchemy whilst I migrate a legacy database schema from a C++-based system to a Python based system.
All configuration is done via a Flask application and the app.config['DBCX'] value reads the same as a SQLAlchemy String ("mysql://user:pass#host:port/dbname") allowing me to easily switch over in future.
I have a number of tasks that run "INSERT" statements via celery, all of which utilise this library. As you can imagine, the main reason for running Celery is so that I can increase throughput on this application, however I seem to be hitting an issue with the threading in my library or the application as after a while (around 500 processed messages) I see the following in the logs:
Stacktrace (most recent call last):
File "legacy/legacydb.py", line 49, in query
self.dbcx.commit()
File "pymysql/connections.py", line 662, in commit
self._read_ok_packet()
File "pymysql/connections.py", line 643, in _read_ok_packet
raise OperationalError(2014, "Command Out of Sync")
I'm obviously doing something wrong to hit this error, however it doesn't seem to matter whether MySQL has autocommit enabled/disabled or where I place my connection.commit() call.
If I leave out the connection.commit() then I don't get anything inserted into the database.
I've recently moved from mysqldb to pymysql and the occurrences appear to be lower, however given that these are simple "insert" commands and not a complicated select (there aren't even any foreign key constraints on this database!) I'm struggling to work out where the issue is.
As things stand at present, I am unable to use executemany as I cannot prepare the statements in advance (I am pulling data from a "firehose" message queue and storing it locally for later processing).
First of all, make sure that the celery thingamajig uses its own connection(s) since
>>> pymysql.threadsafety
1
Which means: "threads may share the module but not connections".
Is the init called once, or per-worker? If only once, you need to move the initialisation.
How about lazily initialising the connection in a thread-local variable the first time query is called?
Using Flask, I'm curious to know if SQLAlchemy is still the best way to go for querying my database with raw SQL (direct SELECT x FROM table WHERE ...) instead of using the ORM or if there is an simpler yet powerful alternative ?
Thank for your reply.
I use SQLAlchemy for direct queries all the time.
Primary advantage: it gives you the best protection against SQL injection attacks. SQLAlchemy does the Right Thing whatever parameters you throw at it.
I find it works wonders for adjusting the generated SQL based on conditions as well. Displaying a result set with multiple filter controls above it? Just build your query in a set of if/elif/else constructs and you know your SQL will be golden still.
Here is an excerpt from some live code (older SA version, so syntax could differ a little):
# Pull start and end dates from form
# ...
# Build a constraint if `start` and / or `end` have been set.
created = None
if start and end:
created = sa.sql.between(msg.c.create_time_stamp,
start.replace(hour=0, minute=0, second=0),
end.replace(hour=23, minute=59, second=59))
elif start:
created = (msg.c.create_time_stamp >=
start.replace(hour=0, minute=0, second=0))
elif end:
created = (msg.c.create_time_stamp <=
end.replace(hour=23, minute=59, second=59))
# More complex `from_` object built here, elided for example
# [...]
# Final query build
query = sa.select([unit.c.eli_uid], from_obj=[from_])
query = query.column(count(msg.c.id).label('sent'))
query = query.where(current_store)
if created:
query = query.where(created)
The code where this comes from is a lot more complex, but I wanted to highlight the date range code here. If I had to build the SQL using string formatting, I'd probably have introduced a SQL injection hole somewhere as it is much easier to forget to quote values.
After I worked on a small project of mine, I decided to try to just use MySQLDB, without SQL Alchemy.
It works fine and it's quite easy to use, here's an example (I created a small class that handles all the work to the database)
import MySQLdb
from MySQLdb.cursors import DictCursor
class DatabaseBridge():
def __init__(self, *args, **kwargs):
kwargs['cursorclass'] = DictCursor
self.cnx = MySQLdb.connect (**kwargs)
self.cnx.autocommit(True)
self.cursor = self.cnx.cursor()
def query_all(self, query, *args):
self.cursor.execute(query, *args)
return self.cursor.fetchall()
def find_unique(self, query, *args):
rows = self.query_all(query, *args);
if len(rows) == 1:
return rows[0]
return None
def execute(self, query, params):
self.cursor.execute(query, params)
return self.cursor.rowcount
def get_last_id(self):
return self.cnx.insert_id()
def close(self):
self.cursor.close()
self.cnx.close()
database = DatabaseBridge(**{
'user': 'user',
'passwd': 'password',
'db': 'my_db'
})
rows = database.query_all("SELECT id, name, email FROM users WHERE is_active = %s AND project = %s", (1, "My First Project"))
(It's a dumb example).
It works like a charm BUT you have to take these into consideration :
Multithreading is not supported ! It's ok if you don't work with multiprocessing from Python.
You won't have all the advantages of SQLAlchemy (Database to Class (model) wrapper, Query generation (select, where, order_by, etc)). This is the key point on how you want to work with your database.
But on the other hand, and like SQLAlchemy, there is protections agains't SQL injection attacks :
A basic query would be like this :
cursor.execute("SELECT * FROM users WHERE data = %s" % "Some value") # THIS IS DANGEROUS
But you should do :
cursor.execute("SELECT * FROM users WHERE data = %s", ("Some value",)) # This is secure!
Saw the difference ? Read again ;)
The difference is that I replaced %, by , : We pass the arguments as ... arguments to the execute, and these are escaped. When using %, arguments aren't escaped, enabling SQL Injection attacks!
The final word here is that it depends on your usage and what you plan to do with your project. For me, SQLAlchemy was on overkill (it's a basic shell script !), so MysqlDB was perfect.