Unit tests with an unmanaged external read-only database - python

I'm working on a project which involves a huge external dataset (~490 GB) loaded in an external database (MS SQL through django-pyodbc-azure). I've generated the Django models marked managed = False in their Meta. In my application this works fine, but I can't seem to figure out how to run my unit tests. I can think of two approaches: mocking the data in a test database, and giving the unit tests (and CI) read-only access to the production dataset. Both options are acceptable, but I can't figure out either of them:
Option 1: Mocked data
Because my models are marked managed=False, there are no migrations, and as a result, the test runner cannot create the corresponding tables in the test database.
Option 2: Live data
django-pyodbc-azure will attempt to create a test database, which fails because it has a read-only connection. Also I suspect that even if it were allowed to do so, the resulting database would be missing the required tables.
How can I run my unit tests? Installing additional packages or reconfiguring the database is acceptable. My setup uses Django 1.9 with PostgreSQL for the main DB.

After a day of staring at my screen, I found a solution:
I removed managed = False from the models and generated migrations. To prevent actual migrations from running against the production database, I used my database router to block them (return False in allow_migrate for the appropriate app and database).
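For reference, a minimal sketch of such a router (the app label and database alias are assumptions, not from the original setup):

# Sketch: block migrations for the external app against the external database;
# "external_data" and "external"/"default" are assumed names.
class ExternalDataRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == "external_data":
            # Only ever migrate these models into the default (test) database,
            # never into the external read-only database.
            return db == "default"
        return None  # no opinion about other apps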
In my settings I detect whether unit tests are being run, and in that case simply don't define the database router or the external database. With the migrations present, the test runner can create the tables and the unit tests run.
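A rough sketch of that settings logic (the test-detection check, backend names and aliases are assumptions):

# settings.py sketch: skip the external database and router when running tests.
import sys

TESTING = len(sys.argv) > 1 and sys.argv[1] == "test"

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "main",
    },
}

if not TESTING:
    DATABASES["external"] = {
        "ENGINE": "sql_server.pyodbc",  # django-pyodbc-azure backend
        "NAME": "huge_external_dataset",
        "HOST": "mssql.example.com",  # placeholder host
    }
    DATABASE_ROUTERS = ["myproject.routers.ExternalDataRouter"]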

Related

How can I write Django database model instances during startup of the Django project?

Is there a way to write database model instances with initial values during the startup phase of the Django project?
I'm not interested in migrations that require invoking the command line before starting the Django project. The changes to the database shall be persistent with respect to the database change history.
In my case the system is in the IIoT context:
an edge backend,
an MQTT broker,
potentially several end-device agents.
The operating state of the overall application consists of the operating state of the edge backend, the MQTT broker and the end devices.
The single source of truth about the opstate shall be the database. This means only the edge backend needs to know about the opstate.
Initially after every start/restart of the Django project the operating state of all sub-components shall be unknown. One database model instance (singleton) holds the operating state of the overall distributed system via one-to-one references.
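For illustration, the kind of model structure described above might look like this (all names and field choices are assumptions):

from django.db import models

class ComponentState(models.Model):
    # operating state of one sub-component; "unknown" after every restart
    state = models.CharField(
        max_length=16,
        choices=[("unknown", "unknown"), ("up", "up"), ("down", "down")],
        default="unknown",
    )

class SystemState(models.Model):
    # singleton row holding the overall opstate via one-to-one references;
    # end-device agents could reference this row with foreign keys
    edge_backend = models.OneToOneField(ComponentState, on_delete=models.CASCADE, related_name="+")
    mqtt_broker = models.OneToOneField(ComponentState, on_delete=models.CASCADE, related_name="+")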
Why not use a fixture?
django-admin loaddata fixture [fixture ...]
or
python manage.py loaddata <fixture> [fixture ...]
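If invoking the command line separately is the concern, loaddata can also be called programmatically at startup. This goes beyond the answer above, and all names here are assumptions; a hedged sketch:

# apps.py sketch: load an initial-data fixture when the app starts.
# Note: Django discourages database access in ready(); treat this only as a sketch.
from django.apps import AppConfig
from django.core.management import call_command

class EdgeBackendConfig(AppConfig):
    name = "edge_backend"  # assumed app name

    def ready(self):
        # "initial_opstate" is a hypothetical fixture in edge_backend/fixtures/
        call_command("loaddata", "initial_opstate")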
Edit:
Ok so you make 2 points:
There should be a history logged in the database.
Signals should not be triggered.
point 1:
Since loaddata does not use the model's save() method to create and save the instances, I don't know how it handles the history. Maybe someone else can weigh in here?
point 2:
As for signals, this is how you prevent them from firing when loading fixtures, as explained in the official Django documentation:
When fixture files are processed, the data is saved to the database as is. Model defined save() methods are not called, and any pre_save or post_save signals will be called with raw=True since the instance only contains attributes that are local to the model. You may, for example, want to disable handlers that access related fields that aren’t present during fixture loading and would otherwise raise an exception:
from django.db.models.signals import post_save
from .models import MyModel

def my_handler(**kwargs):
    # disable the handler during fixture loading
    if kwargs['raw']:
        return
    ...

post_save.connect(my_handler, sender=MyModel)
(from the Django documentation on fixtures)

Using web2py for a user-facing CRUD frontend

I was asked to port an Access database to MySQL and provide a simple web frontend for the users. The DB consists of 8-10 tables and stores data about client consulting (client, consultant, topic, hours, ...). I need to provide a web interface for our consultants, where they insert all this information during a session into a predefined mask/form.
My initial thought was to port the Access DB to MySQL, which I have done, and then use the web2py framework to build a user interface with login, data entry, browsing/scrolling through the cases and pulling reports.
web2py with user management, a few sample views & controllers and the MySQL DB is running. I added the DB to the DAL in web2py, but then I noticed that with web2py it is mandatory to define every table again in web2py for it to be able to communicate with the SQL server. While struggling to successfully run the extract_mysql_models.py script to export the structure of the already existing SQL DB for use in web2py, my concerns about web2py are accumulating.
This double/redundant way of talking to my DB strikes me as odd, and web2py does not support Python 3.
Is web2py the correct way to fulfill my task, or is there a better way? Thank you very much for listening/helping out.
This double/redundant way of talking to my DB strikes me as odd, and web2py does not support Python 3.
Any abstraction you want to use to communicate with your database (whether it be the web2py DAL, the Django ORM, SQLAlchemy, etc.) will have to have some knowledge of the database schema in order to construct queries.
Even if you programmatically generated all the SQL statements yourself without use of an ORM/DAL, your code would still have to have some knowledge of the database structure (i.e., somewhere you have to specify names of tables and fields, etc.).
For existing databases, we aim to automate this process via introspection of the database schema, which is the purpose of the extract_mysql_models.py script. If that script isn't working, you should report an issue on GitHub and/or open a thread on the web2py Google Group.
Also, note that when creating a new database, web2py helps you avoid redundant specification of the schema by handling migrations (including table creation) for you -- so you specify the schema only in web2py, and the DAL will automatically create the tables in the database (of course, this is optional).
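As a fallback while the extraction script is being sorted out, tables can also be defined by hand against the existing schema with migrations disabled. A minimal sketch (connection string, table and field names are assumptions):

# models/db.py sketch (in a web2py model file DAL and Field are already in scope;
# outside web2py they come from the standalone pydal package).
from pydal import DAL, Field

# migrate_enabled=False / migrate=False: the tables already exist in MySQL,
# so the DAL must not try to create or alter them.
db = DAL('mysql://user:password@localhost/consulting', migrate_enabled=False)

db.define_table('consultation',
    Field('client', 'string'),
    Field('consultant', 'string'),
    Field('topic', 'string'),
    Field('hours', 'double'),
    migrate=False,
)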

Django App - Getting Initial Rows into Database

I have a Django application and I have a few rows that I want to be in the application from the beginning of time. An example is a System Settings table that has some settings; they should be set up with any DB instance that is constructed.
In the past, I had handled this by writing a migration script manually that inserted the records. However, when I run my tests and the database is created and deleted, these scripts are not run again and the database is empty. The tests assume that the migrations contain only schema migrations, so they don't need to run them again, but that is not the case. This has led me to think that maybe my migrations shouldn't be data migrations and I should rethink the process? I am not sure what to do.
You could maybe use fixtures in your tests to provide initial data for models.
https://docs.djangoproject.com/en/1.8/howto/initial-data/
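A small sketch of how that looks in a test (fixture file and model names are assumptions):

from django.test import TestCase
from myapp.models import SystemSetting  # hypothetical model

class SystemSettingsTests(TestCase):
    # Django loads these fixtures into the test database before each test.
    fixtures = ["system_settings.json"]  # hypothetical fixture in myapp/fixtures/

    def test_default_settings_are_present(self):
        self.assertTrue(SystemSetting.objects.exists())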

Django postgres methods for external database

I am working on a Django app for investors, currently using one database and 3 installed apps configured in settings.py.
I am about to integrate a new feature for which every broker will register their IP in our app, so that we can manually replicate our Postgres database onto their server (everything is the same except the 'HOST' in the database settings). The broker will then send GET and POST requests to our server from their server.
I need to switch the database based on the incoming request. I think I could connect to their Postgres database dynamically by looking at the request and processing raw SQL queries, but my requirement is to use Django's Postgres methods for processing without configuring each database in the settings file.
If configuring databases in settings is the only way, how can I switch databases efficiently for every request, and how many databases can be connected in a single Django app?
I believe if you want to use Django methods (and not simply use raw SQL queries and parse them) you will have to use the settings.py approach and define all your databases there.
https://docs.djangoproject.com/en/1.7/topics/db/multi-db/
In short, you define a database and can manually choose it in your code via (as per the docs):
Author.objects.using(database_name_variable).filter(...)
An alternative would be to look at using a REST API (like Tastypie) to make calls to different Django instances, each connected to its own database.
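A rough sketch of the settings-based approach (database names, hosts and credentials are placeholders):

# settings.py sketch: one entry per broker replica; only the HOST differs per broker.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "investors",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "localhost",
    },
    "broker_acme": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "investors",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "203.0.113.10",  # the broker's registered IP
    },
}

# In a view, the alias can then be chosen per request, e.g.:
#     Author.objects.using("broker_acme").filter(...)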

Django database routing with transactions

Referring to the example in Django documentation for multiple databases in one application,
https://docs.djangoproject.com/en/dev/topics/db/multi-db/#an-example
" It also doesn’t consider the interaction of transactions with the database utilization strategy. "
How do I handle the interaction stated above?
The scenario is this:
I am using PostgreSQL as my database. I have set up a replica and want all reads of the "auth" tables to go to the replica. Following the documentation I wrote a database router. Now whenever I try to log in to my application, it throws the following error.
DatabaseError: cannot execute UPDATE in a read-only transaction.
This happens when Django tries to save the "last_login" time: in the same view it first fetches the record from the replica, and then tries to update the last_login time. Since this happens in one transaction, the same database is used, i.e. the replica.
How do I handle this?
Thoughts?
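For reference, the kind of read-routing described above looks roughly like this (the "replica" alias is an assumption):

# Router sketch: send reads of the auth app to the replica.
class AuthReplicaRouter:
    def db_for_read(self, model, **hints):
        if model._meta.app_label == "auth":
            return "replica"
        return None  # no opinion for other apps; falls through to default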
