I have a file with a bunch of data common between several projects. The data needs to be loaded into the Django database. The file doesn't change that much, so loading it once on server start is sufficient. Since the file is shared between multiple projects, I do not have full control over the format, so I cannot convert this into a fixture or something.
I tried loading it in ready(), but then I run into a problem when creating a new database or migrating an existing database, since apparently ready() is called before migrations are complete and I get errors from using models that do not have underlying tables. I tried to set it in class_prepared signal handler, but the loading process uses more than one model, so I cannot really be sure all required model classes are prepared. Also it seems that ready() is not called when running tests, so unit tests fail because the data is missing. What is the right place to do something like this?
It seems that what I am looking for doesn't exist. Django trusts the user to deal with migrations and such and doesn't check the database on load, so there is no place in the system where you can load some data on startup and be sure that it can actually be loaded. What I ended up doing is loading the data in ready(), but doing a sanity check first: call MyModel.objects.exists() in a try/except block and return if it raises. This is not ideal, but I haven't found any other way.
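For reference, a minimal sketch of what that ends up looking like; MyModel and the load_shared_data() helper are placeholders for whatever your data file actually needs:

# myapp/apps.py -- a sketch of the ready() approach described above
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Imported here, after the app registry is fully populated.
        from .models import MyModel
        from .data_loader import load_shared_data  # hypothetical helper module

        try:
            # Sanity check: on a fresh database the table may not exist yet.
            MyModel.objects.exists()
        except Exception:
            return
        load_shared_data()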
Related
I'm facing a problem that I've run out of ideas for how to resolve.
I need to test data that is returned by a direct query against the database.
During execution of a TestCase, django.db.connection.cursor() returns data from the main database, not from the test one, which contains the fixtures prepared for this test.
I've tried using both TestCase and TransactionTestCase.
I've tried debugging and checking variable values, and found that the connection definitely points at the test database.
Do you know why it's returning data from the main database? Is there any case in which Django copies data from the main database to the one created for test purposes?
I'm using: Python 3.6.5, Django 2.1, pytest 4.6.3, pytest-django 3.5
Thanks in advance for any support.
Follow-up
Dears,
The problem occurs only when performing a custom raw SQL query directly inside a TestCase. Retrieving objects through standard Django QuerySets works fine.
Do you have any idea why this particular way of retrieving data from the database does not work during test execution?
I found an answer in the Django documentation:
If your code attempts to access the database when its modules are compiled, this will occur before the test database is set up, with potentially unexpected results.
Still, if you know of any way to keep production data from affecting tests while raw SQL queries are performed, I would love to hear it.
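One pattern that seems to avoid the problem, assuming the quoted caveat is the culprit: build and execute the raw query inside the test method (or setUp), never at module import time, so it runs after the test database and its fixtures exist. A sketch with placeholder names:

# Not taken from the original project; fixture and table names are made up.
from django.db import connection
from django.test import TestCase

class RawQueryTest(TestCase):
    fixtures = ['my_fixture.json']  # hypothetical fixture name

    def test_raw_query_sees_fixture_data(self):
        # The cursor is opened inside the test, against the test database.
        with connection.cursor() as cursor:
            cursor.execute("SELECT COUNT(*) FROM myapp_mymodel")  # hypothetical table
            (count,) = cursor.fetchone()
        self.assertGreater(count, 0)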
I'm using the Django Python framework and MySQL as the DBMS.
In the screenshot below, I'm creating the new_survey_draft object using SurveyDraft.objects.create() as shown, assuming that it should create a new row in the surveydraft DB table. But as also shown in the screenshot, after debugging my code, the new_survey_draft object was created with id=pk=270, while the DB table shown in the other window to the right doesn't have a new row with id=270.
Even with a breakpoint set in publish_survey_draft(), which is called after the object is instantiated, I called SurveyDraft.objects.get(pk=270) and it returned the object, yet there was still no row with id=270 in the DB table.
And finally, after resuming execution and returning from all the functions, the row was successfully added to the DB table with id=270.
I'm wondering what's happening behind the scenes. Is it possible that Django stores data in objects without persisting it to the DB in real time, and only persists everything together at some later point in the execution?
I've been stuck on this for hours and couldn't find anything helpful online, so I'd really appreciate any advice on the issue.
After digging deep into this issue, I found that there is a concept called atomic requests, enabled in my Django project by setting ATOMIC_REQUESTS to True in settings.py under the DATABASES dictionary, as explained here:
It works like this. Before calling a view function, Django starts a transaction. If the response is produced without problems, Django commits the transaction. If the view produces an exception, Django rolls back the transaction.
That's why the changes were not persisting in the database while I was debugging my code with breakpoints: the changes are only committed to the DB once a successful response is returned.
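For illustration, this is roughly what the relevant settings entry looks like (database name and the view are placeholders), together with the decorator Django provides for opting a single view out of the per-request transaction:

# settings.py -- illustrative only
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',            # placeholder
        'ATOMIC_REQUESTS': True,   # wrap every request in a transaction
    }
}

# views.py -- opt a single view out of the per-request transaction
from django.db import transaction

@transaction.non_atomic_requests
def my_view(request):
    ...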
We have a test tool that produces test results in a directory structure like the following:
test-results/${TEST-ID}/exec_${YYYYMMDD}_${HHMMSS}/
Inside each exec folder, there are several files like CSVs, HTML reports, charts, etc. The structure is always the same, and for simplicity's sake we don't use a database.
Now I would like to use Django to build a simple website for displaying these test results. Think of a reporting website, with some basic functionality like comparing test executions against each other.
From reading The Tutorial, I understand that in a Django app I should define my data in models.py using classes that extend django.db.models.Model, and later work with the API (e.g. object.save(), object.delete(), etc.) while the framework takes care of the database operations.
My data is a set of test results, which lives on a file system, not in a database.
That said, I would like to keep the data abstraction in models.py (i.e. to keep the MVC abstraction). The Django app only needs to read data, e.g.:
TestResult.objects.all() would load all TestResults from the test-results directory
TestResult.objects.filter(test_id=1) would return all TestResults for TEST-ID 1
and so on.
Updating data is not necessary; the app only reads data from the file system and displays it.
Can I achieve this behaviour using Django?
My current assumption is that I have to write the abstraction layer somewhere (extend the Model class and override certain methods?), but I'm not sure this is the best/correct approach.
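For illustration, this is roughly the kind of read-only abstraction I have in mind, assuming the directory layout above (the path constant and field names are placeholders):

# models.py -- a plain Python class, not a django.db.models.Model
import os
from dataclasses import dataclass

TEST_RESULTS_DIR = '/path/to/test-results'  # placeholder; a setting in practice

@dataclass
class TestResult:
    test_id: str
    execution: str  # e.g. 'exec_20240101_120000'
    path: str

    @classmethod
    def all(cls):
        # Walk test-results/${TEST-ID}/exec_*/ and build one object per execution.
        results = []
        for test_id in os.listdir(TEST_RESULTS_DIR):
            test_dir = os.path.join(TEST_RESULTS_DIR, test_id)
            if not os.path.isdir(test_dir):
                continue
            for execution in sorted(os.listdir(test_dir)):
                results.append(cls(test_id, execution, os.path.join(test_dir, execution)))
        return results

    @classmethod
    def filter(cls, test_id):
        return [r for r in cls.all() if r.test_id == test_id]

Would something like this be considered idiomatic, or is there a better-supported way to plug a non-database backend into the model layer?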
I have a Django model, TestModel, backed by an SQL database.
Whenever I do
TestModel.objects.all()
I seem to get the same results if I run it multiple times from the same process. I tested that by manually deleting a row (without using ANY of the Django primitives) from the table the model is built on; the query still returns the same results, even though there should obviously be fewer objects after the delete.
Is there some caching mechanism at work, so that Django is not going to the database every time I want to retrieve the objects?
If there is, is there a way I can force Django to go to the database on each query, preferably without writing raw SQL?
I should also mention that after restarting the process the model once again returns the correct objects and I no longer see the deleted ones, but if I delete some more rows the issue occurs again.
This is because your database isolation level is REPEATABLE READ. In a Django shell, all queries are enclosed in a single transaction.
Edited
You can try in your shell:
from django.db import transaction

# Note: transaction.autocommit() is from older Django versions (pre-1.6);
# from 1.6 onwards autocommit is already the default outside atomic blocks.
with transaction.autocommit():
    t = TestModel.objects.all()
    ...
Sounds like a db transaction issue. If you're keeping a shell session open while you separately go into the database itself and modify data, the transaction that's open in the shell won't see the changes because of isolation. You'll need to exit and reload the shell to get a new transaction before you can see them.
Note that in production, transactions are tied to the request/response cycle so this won't be a significant issue.
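If exiting and reloading the shell is inconvenient, one lighter-weight trick (assuming the stale reads really do come from a long-lived transaction snapshot held by the shell) is to close the connection so the next query opens a fresh one:

from django.db import connection

connection.close()  # drop the current connection and its transaction snapshot
fresh = TestModel.objects.all()  # the next query starts a new transaction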
I need to populate my database with a bunch of dummy entries (around 200+) so that I can test the admin interface I've made, and I was wondering if there is a better way to do it. I spent the better part of yesterday trying to fill it in by hand (i.e. by wrapping calls like my_model(title="asdfasdf", field2="laksdj", ...) in a bunch of for x in range(0, 200): loops) and gave up because it didn't work the way I expected it to. I think this is what I need to use, but don't you need to have (existing) data in the database for this to work?
Check out this app:
https://github.com/aerosol/django-dilla/
Let's say you wrote your blog application (oh yeah, your favorite!) in Django. Unit tests went fine, and everything runs extremely fast, even those ORM-generated ultra-long queries. You've added several categorized posts and it's still stable as a rock. You're quite sure the app is efficient and ready for live deployment. Right? Wrong.
You can use fixtures for this purpose, and the loaddata management command.
One approach is to do it like this (a short sketch follows these steps).
Prepare your test database.
Use dumpdata to create a JSON export of the database.
Put this in the fixtures directory of your application.
Write your unit tests to load this "fixture": https://docs.djangoproject.com/en/2.2/topics/testing/tools/#django.test.TransactionTestCase.fixtures
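A hedged sketch of what those steps look like in practice (app, model and fixture names are placeholders):

# Steps 1-3: export the prepared database into a fixture file, e.g.
#   python manage.py dumpdata myapp --indent 2 > myapp/fixtures/test_data.json

# Step 4: myapp/tests.py
from django.test import TestCase
from myapp.models import MyModel

class MyModelTest(TestCase):
    fixtures = ['test_data.json']  # loaded into the test DB before each test

    def test_fixture_is_loaded(self):
        self.assertTrue(MyModel.objects.exists())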
Django fixtures provide a mechanism for importing data on syncdb. However, doing this initial data propagation is often easier via Python code. The technique you outline should work, either via syncdb or a management command. For instance, via syncdb, in my_app/management.py:
from django.db.models import signals
from my_app.models import MyModel

def init_data(sender, **kwargs):
    for i in range(1000):
        MyModel(number=i).save()

signals.post_syncdb.connect(init_data)
Or, in a management command in myapp/management/commands/my_command.py:
from django.core.management.base import BaseCommand, CommandError
from myapp.models import MyModel

class Command(BaseCommand):  # Django looks for a class named Command in the module
    def handle(self, *args, **options):
        if len(args) > 0:
            raise CommandError('need exactly zero arguments')
        for i in range(1000):
            MyModel(number=i).save()
You can then export this data to a fixture, or continue importing using the management command. If you choose to continue to use the syncdb signal, you'll want to conditionally run the init_data function to prevent the data getting imported on subsequent syncdb calls. When a fixture isn't sufficient, I personally like to do both: create a management command to import data, but have the first syncdb invocation do the import automatically. That way, deployment is more automated but I can still easily make modifications to the initial data and re-run the import.
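A minimal sketch of that conditional guard, reusing the MyModel example from above:

def init_data(sender, **kwargs):
    # Only populate on the first syncdb; skip if the data is already there.
    if MyModel.objects.exists():
        return
    for i in range(1000):
        MyModel(number=i).save()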
I'm not sure why you require any serialization. As long as you have set up your Django settings.py file to point to your test database, populating a test database should be nothing more than saving models.
for x in range(0, 200):
    m = my_model(title=random_title(), field2=random_string(), ...)
    m.save()
There are better ways to do this, but if you want a quick test set, this is the way to go.
The app recommended by the accepted answer is no longer being maintained; however, django-seed can be used as a replacement:
https://github.com/brobin/django-seed
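A short usage sketch based on django-seed's README (treat the exact API as an assumption and check the project docs, since it may change between versions; the model is a placeholder):

from django_seed import Seed
from myapp.models import MyModel  # placeholder model

seeder = Seed.seeder()
seeder.add_entity(MyModel, 200)   # queue 200 fake instances
inserted_pks = seeder.execute()   # insert them and return the primary keys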
I would recommend django-autofixtures to you. I tried both django_seed and django-autofixtures, but django_seed has a lot of issues with unique keys.
django-autofixtures takes care of unique, primary key and other DB constraints while filling up the database.