I need to test a function with different parameters, and the most appropriate way to do this seems to be the with self.subTest(...) context manager.
However, the function writes something to the db, which ends up in an inconsistent state. I can delete the things I write, but it would be cleaner if I could recreate the whole db completely. Is there a way to do that?
Not sure how to recreate the database in self.subTest(), but I have another technique I'm currently using that might be of interest to you. You can use fixtures to create a "snapshot" of your database, which is essentially copied into a second database used only for testing purposes. I currently use this method to test code on a big project I'm working on at work.
I'll post some example code to give you an idea of what this looks like in practice, but you might have to do some extra research to tailor the code to your needs (I've added links to guide you).
The process is rather straightforward. You create a copy of your database containing only the data you need by using fixtures, which are stored in a .yaml file and accessed only by your test case.
Here is what the process would look like:
List the items you want to copy into your test database to populate it using fixtures. This creates a db containing only the needed data instead of blindly copying the entire db. It will be stored in a .yaml file.
generate.py
import sys

import django
from django.core.management import call_command

django.setup()
stdout = sys.stdout  # keep a reference so we can restore it later

# Each entry describes one fixture file and the models (and primary keys) to dump into it.
conf = [
    {
        'file': 'myfile.yaml',
        'models': [
            dict(model='your.model', pks='your, primary, keys'),
            dict(model='your.model', pks='your, primary, keys'),
        ]
    }
]

for fixture in conf:
    print('Processing: %s' % fixture['file'])
    with open(fixture['file'], 'w') as f:
        # Redirect stdout through FixtureAnonymiser so the dumpdata output
        # lands in the fixture file.
        sys.stdout = FixtureAnonymiser(f)

        for model in fixture['models']:
            call_command('dumpdata', model.pop('model'), format='yaml', indent=4, **model)
            sys.stdout.flush()

    sys.stdout = stdout
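FixtureAnonymiser is a custom helper that isn't shown above; from the way it wraps the file and replaces sys.stdout, it is just a file-like object that presumably scrubs sensitive values before writing. A minimal pass-through stand-in, purely so the snippet runs end to end, could look like this:
# Minimal pass-through stand-in for FixtureAnonymiser, just to make the snippet
# self-contained. The real class presumably rewrites sensitive values (names,
# emails, ...) in anonymise() before they reach the fixture file.
class FixtureAnonymiser(object):

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def anonymise(self, text):
        return text  # no-op here; put your scrubbing logic in this hook

    def write(self, text):
        self.fileobj.write(self.anonymise(text))

    def flush(self):
        self.fileobj.flush()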
In your test case, declare the generated .yaml file as a fixture, and your tests will automatically use the data from the fixture, keeping your main database untouched.
test_class.py
from django.test import TestCase


class classTest(TestCase):

    fixtures = ('myfile.yaml',)

    def setUp(self):
        """Set up the test cases."""
        # create the object you want to test here, which will use data from the fixtures

    def test_function(self):
        self.assertEqual(True, True)
        # write your test here
You can read up more here:
Django
YAML
If you have any questions because things are unclear just ask, I'd be happy to help you out.
Maybe my solution will help someone.
I used transactions to roll back to the database state that I had at the start of the test.
I use Eric Cousineau's decorator function to parametrize tests.
More about database transactions can be found on the Django documentation page.
import functools

from django.db import transaction
from django.test import TransactionTestCase
from django.contrib.auth import get_user_model

User = get_user_model()


def sub_test(param_list):
    """Decorates a test case to run it as a set of subtests."""

    def decorator(f):

        @functools.wraps(f)
        def wrapped(self):
            for param in param_list:
                with self.subTest(**param):
                    f(self, **param)

        return wrapped

    return decorator


class MyTestCase(TransactionTestCase):

    @sub_test([
        dict(email="new@user.com", password='12345678'),
        dict(email="new@user.com", password='password'),
    ])
    def test_passwords(self, email, password):
        # open a transaction
        with transaction.atomic():
            # Creates a new savepoint. Returns the savepoint ID (sid).
            sid = transaction.savepoint()
            # create user and check if there is only one with this email in the DB
            user = User.objects.create(email=email, password=password)
            self.assertEqual(User.objects.filter(email=user.email).count(), 1)
            # Rolls back the transaction to savepoint sid.
            transaction.savepoint_rollback(sid)
I inherited a FastAPI project that uses PostgreSQL and pytest-postgresql to provide pytest fixtures to test against PostgreSQL. Out of curiosity I placed some breakpoint() statements in several places after row model creation and commits, but before any cleanup. With breakpoint() holding the process, I then looked at the database server to see if I could find the data that was entered with pytest-postgresql. I could find nothing. Where would this data be?
In my conftest.py file, I have the following for pytest-postgresql setup.
from pytest_postgresql import factories
...

postgresql_proc = factories.postgresql_proc(
    host="localhost",
    user="REDACTED",
    port="5432",
    password="REDACTED",
)

pg_fixture = factories.postgresql("postgresql_proc", db_name="REDACTED")


@pytest.fixture(scope="function")
def db_session(pg_fixture):
    """
    A session object to a non persistent db.
    Will clean up the database after each test run, in its cleanup stage
    """
    sqlalchemy_uri = (
        f"postgresql://{pg_fixture.info.user}:{pg_fixture.info.password}@"
        f"{pg_fixture.info.host}:{pg_fixture.info.port}"
        f"/{pg_fixture.info.dbname}"
    )
    engine = get_engine(sqlalchemy_uri)
    models.base.Base.metadata.create_all(engine)  # CREATES VARIOUS MODELS
    Session = sessionmaker(bind=engine)
    yield Session()
    models.base.Base.metadata.drop_all(engine)
The tests work, so I assume it's set up correctly. And by work I mean they pass when they should and fail when they should not. But for the life of me I cannot understand where pytest-postgresql is putting the row data inserted during model creation in the test setup.
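A test that uses this fixture would look roughly like the sketch below (the User model and its email column are illustrative, not from the actual project; the point is that rows only exist between create_all() and the drop_all() teardown):
# Hypothetical test using the db_session fixture above; models.User and its
# "email" column are illustrative assumptions, not from the original project.
def test_user_row_is_written(db_session):
    db_session.add(models.User(email="someone@example.com"))
    db_session.commit()  # committed into the fixture-managed database

    assert db_session.query(models.User).count() == 1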
I have a Python project that performs a JSON validation against a specific schema.
It will run as a Transform step in GCP Dataflow, so it's very important that all dependencies are gathered before the run to avoid downloading the same file again and again.
The schema is placed in a separate Git repository.
The nature of the Transformer is that you receive a single record in your class and you work with it. The typical flow is that you load the JSON schema, validate the record against it, and then do stuff with the invalid and the valid records. Loading the schema this way means that I download it from the repo for every record, and there could be hundreds of thousands of them.
The code gets "cloned" onto the workers, which then work more or less independently.
Inspired by the way Python loads the requirements at the beginning (one single time) and uses them as imports, I thought I could add the repository (where the JSON schema lives) as a Python requirement, and then simply use it in my Python code. But of course, it's a JSON file, not a Python module to be imported. How can it work?
An example would be something like:
requirements.txt
git+git://github.com/path/to/json/schema#41b95ec
dataflow_transformer.py
import apache_beam as beam
import the_downloaded_schema
from jsonschema import validate


class Verifier(beam.DoFn):

    def process(self, record: dict):
        validate(instance=record, schema=the_downloaded_schema)
        # ... more stuff
        yield record


class Transformer(beam.PTransform):

    def expand(self, record):
        return (
            record
            | "Verify Schema" >> beam.ParDo(Verifier())
        )
You can load the json schema once and use it as a side input.
An example:
import json

import apache_beam as beam
import requests

json_current = 'https://covidtracking.com/api/v1/states/current.json'


def get_json_schema(url):
    # Fetch and parse the schema once, at pipeline-construction time.
    with requests.Session() as session:
        schema = json.loads(session.get(url).text)
    return schema


def feed_schema(data, schema):
    # Pair each record with the schema provided as a side input.
    yield {'record': data, 'schema': schema}


schema_json = get_json_schema(json_current)

with beam.Pipeline() as p:
    schema = p | 'Schema' >> beam.Create([schema_json])
    data = p | 'Data' >> beam.Create(range(10))
    data_with_schema = data | beam.FlatMap(feed_schema, schema=beam.pvalue.AsSingleton(schema))
    # Now do your schema validation
Just to demonstrate what the data_with_schema PCollection looks like:
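Each element pairs one record with the full schema. One way to peek at it, assuming the pipeline above, is to add a print step; the output looks roughly like {'record': 0, 'schema': {...}}, {'record': 1, 'schema': {...}}, and so on:
# Assuming the data_with_schema PCollection built above (add this inside the pipeline):
# each element is a dict like {'record': <n>, 'schema': <the full schema dict>}.
data_with_schema | "Peek" >> beam.Map(print)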
Why don't you just use a class for loading your resources that uses a cache in order to prevent double loading? Something along the lines of:
import os


class JsonLoader:
    def __init__(self):
        self.cache = set()

    def load(self, filename):
        # abspath normalises the path so the same file isn't cached twice under different names
        filename = os.path.abspath(filename)
        if filename not in self.cache:
            self._load_json(filename)
            self.cache.add(filename)

    def _load_json(self, filename):
        ...
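A rough usage sketch follows, assuming _load_json() parses the file and the loader exposes the parsed schema through some accessor (the get() method and the schema path are hypothetical):
from jsonschema import validate

# Hypothetical usage: one loader per worker process, so each schema file is
# read and parsed at most once no matter how many records pass through.
loader = JsonLoader()


def verify(record, schema_path="record_schema.json"):
    loader.load(schema_path)          # cached: a no-op after the first call for this path
    schema = loader.get(schema_path)  # assumed accessor returning the parsed schema
    validate(instance=record, schema=schema)
    return record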
Can you please help me figure out what I did wrong? I have the following unit test for a Python lambda:
class Tests(unittest.TestCase):
    def setUp(self):
        ...  # some setup

    @mock.patch('functions.tested_class.requests.get')
    @mock.patch('functions.helper_class.get_auth_token')
    def test_tested_class(self, mock_auth, mock_get):
        mock_get.side_effect = [self.mock_response]
        mock_auth.return_value = "some id token"
        response = get_xml(self.event, None)
        self.assertEqual(response['statusCode'], 200)
The problem is that when I run this code, I get the following error for get_auth_token:
Invalid URL '': No schema supplied. Perhaps you meant http://?
I debugged it, and it doesn't look like I patched it correctly. The Authorization helper file is in the same folder "functions" as the tested class.
EDIT:
In the tested_class I was importing get_auth_token like this:
from functions import helper_class
from functions.helper_class import get_auth_token
...

def get_xml(event, context):
    ...
    response_token = get_auth_token()
After changing it to this, it started to work fine:
import functions.helper_class
...

def get_xml(event, context):
    ...
    response_token = functions.helper_class.get_auth_token()
I still don't fully understand why, though.
In your first scenario,
in tested_class.py, get_auth_token is imported with
from functions.helper_class import get_auth_token
The patch target should be exactly the get_auth_token name in tested_class:
@mock.patch('functions.tested_class.get_auth_token')
Second scenario
With the following usage
response_token = functions.helper_class.get_auth_token()
the only way to patch it is this:
@mock.patch('functions.helper_class.get_auth_token')
Alternative
With an import like this in tested_class
from functions import helper_class
helper_class.get_auth_token()
the patch could be like this:
@mock.patch('functions.tested_class.helper_class.get_auth_token')
patch() works by (temporarily) changing the object that a name points to with another one. There can be many names pointing to any individual object, so for patching to work, you must ensure that you patch the name used by the system under test.
The basic principle is that you patch where an object is looked up, which is not necessarily the same place as where it is defined.
The Python documentation has a very good example: where to patch.
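Putting the first scenario together, a minimal test sketch might look like this (the empty event dict and the final assertion are placeholders; in the question the event is built in setUp):
from unittest import TestCase, mock

from functions.tested_class import get_xml


class GetXmlTest(TestCase):

    # Patch names where they are looked up (in functions.tested_class),
    # not where they are defined (functions.helper_class).
    @mock.patch('functions.tested_class.requests.get')
    @mock.patch('functions.tested_class.get_auth_token')
    def test_get_xml_uses_patched_token(self, mock_auth, mock_get):
        mock_auth.return_value = "some id token"
        mock_get.return_value = mock.Mock(status_code=200)
        get_xml({}, None)  # placeholder event; the question builds it in setUp
        mock_auth.assert_called_once()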
We have a little bit of a complicated setup:
In our normal code, we connect manually to a MySQL db. We're doing this because, I guess, the connections Django normally uses are not thread-safe? So we let Django make the connection, extract the information from it, and then use a MySQLdb connection to do the actual querying.
Our code is largely an update process, so we have autocommit turned off to save time.
For ease of creating test data, I created Django models that represent the tables and use them to create rows to test on. So I have functions like:
def make_thing(**overrides):
    fields = deepcopy(DEFAULT_THING)
    fields.update(overrides)
    s = Thing(**fields)
    s.save()
    transaction.commit(using='ourdb')
    reset_queries()
    return s
However, it doesn't seem to actually be committing! After I make an object, I later have code that executes raw sql against the mysqldb connection:
def get_information(self, value):
    print self.api.rawSql("select count(*) from thing")[0][0]
    query = 'select info from thing where column = %s' % value
    return self.api.rawSql(query)[0][0]
This print statement prints 0! Why?
Also, if I turn autocommit off, I get
TransactionManagementError: This is forbidden when an 'atomic' block is active.
when we try to alter the autocommit level later.
EDIT: I also just tried https://groups.google.com/forum/#!topic/django-users/4lzsQAWYwG0, which did not help.
EDIT2: I checked from a shell against the database--the commit is working, it's just not getting picked up. I've tried setting the transaction isolation level but it isn't helping. I should add that a function further up from get_information uses this decorator:
import gc

import django.db


def single_transaction(fn):
    from django.db import transaction
    from django.db import connection

    def wrapper(*args, **kwargs):
        prior_autocommit = transaction.get_autocommit()
        transaction.set_autocommit(False)
        connection.cursor().execute('set transaction isolation level read committed')
        connection.cursor().execute("SELECT @@session.tx_isolation")
        try:
            result = fn(*args, **kwargs)
            transaction.commit()
            return result
        finally:
            transaction.set_autocommit(prior_autocommit)
            django.db.reset_queries()
            gc.collect()

    wrapper.__name__ = fn.__name__
    return wrapper
Is there a way to detect unused templates in a Django project?
Before Django 1.3, that would have been possible with a simple string-matching function like this one. But since 1.3, there are generic class based views that automatically generate a template_name, if you don't override it (e.g. DetailView).
Also, if you override 3rd party module templates, those templates aren't used anywhere directly in your views.
Maybe it could be done by crawling all URL definitions, loading the corresponding views and getting the template_name from them?
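A rough sketch of that crawling idea might look like this (it only catches class-based views that set template_name explicitly, so auto-generated names like DetailView's and templates used outside views would still be missed; the helper name is made up):
# Rough sketch: walk the URLconf and collect template_name from class-based views.
# This misses function-based views, get_template_names() overrides, auto-generated
# names (e.g. DetailView), includes/extends, and templates used outside views.
from django.urls import URLPattern, URLResolver, get_resolver


def collect_template_names(patterns=None, found=None):
    patterns = get_resolver().url_patterns if patterns is None else patterns
    found = set() if found is None else found
    for entry in patterns:
        if isinstance(entry, URLResolver):
            # Recurse into included URLconfs.
            collect_template_names(entry.url_patterns, found)
        elif isinstance(entry, URLPattern):
            view_class = getattr(entry.callback, 'view_class', None)  # set by View.as_view()
            template_name = getattr(view_class, 'template_name', None)
            if template_name:
                found.add(template_name)
    return found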
I was curious whether you could do this by monkey patching/decorating get_template instead. I think you can, though you have to find all the template-loading functions (I have two in my example below).
I used wrapt when I noticed it went beyond just loader.get_template, but it seems to do the trick just fine. Of course, keep this 50000 km away from prod, but...
Note that I am driving this with unittest and nosetests, so if you have full branch coverage of your template-using Python code, you should be able to catch most templates (assuming I didn't miss any get_template-type functions).
in settings.py
This is the "brains" to patch get_template & co.
import wrapt
import django.template.loader
import django.template.engine


def wrapper(wrapped, instance, args, kwargs):
    # concatenate the args vector into a string
    # print "\n\n\n\n%s\nI am a wrapper \nusage:%s\n%s\n\n\n\n\n" % ("*"*80, usage, "*"*80)
    try:
        return wrapped(*args, **kwargs)
    finally:
        usage = ",".join([unicode(arg) for arg in args if arg])
        track_usage(usage)

# You have to wrap whatever is loading templates:
# imported django module + class/method/function path of what needs to be
# wrapped within that module. Comment these 2 lines out and you are back to
# normal.
wrapt.wrap_function_wrapper(django.template.loader, 'get_template', wrapper)
wrapt.wrap_function_wrapper(django.template.engine, 'Engine.find_template', wrapper)
See safely-applying-monkey-patches-in-python for more details on wrapt. It's actually easier to use than its docs suggest; decorators make my brain hurt.
Also, to track which Django functions were doing the actual loads, I misspelled some template names on purpose in the code and in the templates, ran the unit tests and looked at the stack traces for missing-template exceptions.
This is my rather badly written function, which adds to a set and dumps it to a JSON file:
import json


def track_usage(usage):
    fnp_usage = "./usage.json"
    try:
        with open(fnp_usage, "r") as fi:
            data = fi.read()
        # read the set of used templates from the json file
        j_data = json.loads(data)
        s_used_file = set(j_data.get("li_used"))
    except IOError:
        s_used_file = set()
        j_data = dict()
    s_used_file.add(usage)
    # convert the set back to a list for json compatibility
    j_data["li_used"] = list(s_used_file)
    with open(fnp_usage, "w") as fo:
        json.dump(j_data, fo)
and the output (with a script to format it):
import sys
import json

fnp_usage = sys.argv[1]

with open(fnp_usage, "r") as fi:
    data = fi.read()

# read the set of used templates from the json file
j_data = json.loads(data)
li_used_file = j_data.get("li_used")
li_used_file.sort()

print "\n\nused templates:"
for t in li_used_file:
    print(t)
Wrapping the 2 functions above seems to have caught {% extends %}, {% include %} and straight get_template calls, as well as list-type templates used by class-based views. It even caught my dynamically generated templates, which aren't even on the file system but get loaded with a custom loader.
used templates:
bootstrap/display_form.html
bootstrap/errors.html
bootstrap/field.html
bootstrap/layout/baseinput.html
bootstrap/layout/checkboxselectmultiple.html
bootstrap/layout/field_errors.html
bootstrap/layout/field_errors_block.html
bootstrap/layout/help_text.html
bootstrap/layout/help_text_and_errors.html
bootstrap/layout/radioselect.html
bootstrap/whole_uni_form.html
django_tables2/table.html
dynamic_template:db:testdb:name:pssecurity/directive.PrimaryDetails.json
uni_form/layout/div.html
uni_form/layout/fieldset.html
websec/__base.html
websec/__full12.html
websec/__l_right_sidebar.html
websec/bootstrapped_home.html
websec/changedb.html
websec/login.html
websec/requirejs_config.html
websec/topnav.html
websec/user_msg.html
It's not possible to detect unused templates for certain, even in the absence of generic views, because you can always write code like this:
get_template(any_code_you_like()).render(context)
So even prior to Django 1.3 the django-unused-templates application you linked to could only have worked for projects that respected some kind of discipline about the use of templates. (For example, always having a string literal as the template argument to functions like get_template and render_to_response.)
Loading all the views wouldn't be sufficient either: a view may use different templates under different circumstances:
def my_view(request):
    if request.user.is_authenticated():
        return render(request, 'template1.html')
    else:
        return render(request, 'template2.html')
And of course templates may not be used by views at all, but by other parts of the system (for example, e-mail messages).
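For example, an e-mail helper might render a template that no view ever references (the template name and context below are made up for illustration):
# Hypothetical example: a template used only for e-mail, never referenced by a view.
from django.core.mail import send_mail
from django.template.loader import render_to_string


def send_welcome_email(user):
    body = render_to_string('emails/welcome.txt', {'user': user})
    send_mail('Welcome!', body, 'noreply@example.com', [user.email])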