Django 1.10 seed database without using fixtures

Django 1.10 seed database without using fixtures - python

So I've looked at the documentation, as well as this SO question, and the django-seed package, but none of these seem to fit what I'm trying to do.
Basically, I want to programmatically seed my Games model from an external API, but all the information I can find seems to be reliant on generating a fixture first, which seems like an unnecessary step.
For example, in Ruby/Rails you can write directly to seed.rb and seed the database in anyway that's desired.
If a similar functionality available in Django, or do I need to generate the fixture first from the API, and then import it?

You can use a data migration for this. First create an empty migration for your app:
$ python manage.py makemigrations yourappname --empty
In your empty migration, create a function to load your data and add a migrations.RunPython operation. Here's a modified version of the one from the Django documentation on migrations:
from __future__ import unicode_literals
from django.db import migrations
def stream_from_api():
...
def load_data(apps, schema_editor):
# We can't import the Person model directly as it may be a newer
# version than this migration expects. We use the historical version.
Person = apps.get_model('yourappname', 'Person')
for item in stream_from_api():
person = Person(first=item['first'], last=item['last'], age=item['age'])
person.save()
class Migration(migrations.Migration):
dependencies = [('yourappname', '0009_something')]
operations = [migrations.RunPython(load_data)]
If you have a lot of simple data, you might benefit from the bulk-creation methods:
from __future__ import unicode_literals
from django.db import migrations
def stream_from_api():
...
def load_data(apps, schema_editor):
# We can't import the Person model directly as it may be a newer
# version than this migration expects. We use the historical version.
Person = apps.get_model('yourappname', 'Person')
def stream_people():
for item in stream_from_api():
yield Person(first=item['first'], last=item['last'], age=item['age'])
# Adjust (or remove) the batch size depending on your needs.
# You won't be able to use this method if your objects depend on one-another
Person.objects.bulk_create(stream_people(), batch_size=10000)
class Migration(migrations.Migration):
dependencies = [('yourappname', '0009_something')]
operations = [migrations.RunPython(load_data)]
Migrations have the added benefit of being automatically enclosed in a transaction, so you can stop the migration at any time and it won't leave your database in an inconsistent state.

Would it work for you to write some class method on the Games model that creates the data? Presumably this method query the external API, package up the Games() objects as a list called games and then use a Games.objects.bulk_create(games) to insert it into the database.

Related

Django 3.1 - "OperationalError: no such table" when using an ORM Class Model before making or applying a migration

TL;DR If I have a Django model for which I create a migration, I can run the migration and create the table in the DB, then I can proceed to reference the model in the code. Now, when I merge the code into master, it will both contain the migration files and the code that references the Django model. When I pull this onto a new machine (e.g. staging/prod), I will want to run the migrations in this new environment so as to have an updated DB. correct? This operation seems impossible though because the code that references the model will want the table to already exist even to just perform the migration itself, otherwise the migration command fails. How to solve this issue?
The Problem
Whenever I create a new model class inside products/models.py (products is the name of my Django app), I cannot reference such model (e.g. Product model) in any method or class before having made and run the migrations. Otherwise, if I reference the new model in the code and then I run any of the following commands/procedures, I receive the OperationalError: no such table error:
python manage.py runserver
python manage.py makemigrations [appname]
python manage.py migrate
python manage.py showmigrations
Delete all existing migrations and try to regenerate them with python manage.py makemigrations [appname]
Delete current db.sqlite3 file; delete all existing migrations and try to regenerate them with python manage.py makemigrations [appname]
The error looks like this:
...
File "/Users/my_user/Work/django_proj_1/config/urls.py", line 21, in <module>
path('', include('products.urls')),
...
File "/Users/my_user/Work/django_proj_1/products/urls.py", line 1, in <module>
from products.api_views import ProductList3
File "/Users/my_user/Work/django_proj_1/products/api_views.py", line 7, in <module>
class ProductList3(ListAPIView):
File "/Users/my_user/Work/django_proj_1/products/api_views.py", line 8, in ProductList3
queryset = get_products()
File "/Users/my_user/Work/django_proj_1/products/data_layer/product_data_layer.py", line 4, in get_products
all_prods = list(Product.objects.values())
...
File "/Users/my_user/.local/share/virtualenvs/django_proj_1-a2O6RBaf/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such table: products_product
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Real Question
This behavior may seem normal, but if you think about it, how would a developer make a release in production where a new model is created and also used in the code simultaneously (given that running migrations will fail if you referenced the created model into the code)?
I found similar questions (which I linked below), but none of them seems actually relevant for a real-world large scale production project. The only 'solution' in the similar questions is to:
comment out the code (all functions in the layer/chain) that end up interacting with the new model that does not exist in the DB yet
(make and) run the migrations
revert the comments
This seems pretty manual and not feasible for real-world production applications. I can understand if the issue was for local development, in which I can
create a model,
quickly make and run the migrations
and then start coding and use that model.
However, how is it possible to go about this in a real-world production scenario? By this I mean, how is it possible to release a single PR of code where I create a model as well as use the model in the code and make sure I can successfully run the migration on the production machine upon release of such code?
More Context of my problem
Let's assume I have the following app in Django called products with, among others, the following files:
...
|_products/ # one of the Django apps in the project
|
|_ models.py # where I defined the Product model
|_ data_layer/ # a python module to separate all DB logic
| |_ __init__.py
| |_ product_data_layer.py # file where I put the function that references the Product model
|_ api_views.py # file where I call a function from the data layer
|_ urls.py # file where I create an API endpoint that references api_views.py
|
...
Code inside the above files:
# products/urls.py content
from products.api_views import ProductList3 # THORWS ERROR
from django.urls import path
urlpatterns = [
path('api/v1/products', ProductList3)
]
# products/api_views.py content
from rest_framework.generics import ListAPIView
from products.serializers import GenericProductSerializer
from products.data_layer.product_data_layer import get_products
class ProductList3(ListAPIView): # THORWS ERROR
queryset = get_products() # THORWS ERROR
serializer_class = GenericProductSerializer
# products/data_layer/product_data_layer.py content
from products.models import Product
def get_products():
all_prods = list(Product.objects.values()) # THORWS ERROR (main problem)
# all the logic I need
return all_prods
# products/models.py content
from django.db import models
class Product(models.Model):
product_name = models.CharField(max_length=100)
product_description = models.TextField('Main Product Description')
def __str__(self):
return self.product_name
Similar questions
The following questions are similar questions that I looked at, but besides some dev-process solutions, I found no substantial answer on how to tackle this issue once for all.
django migration no such table 1 - unclear as to how not to execute code on import
django migration no such table 2 - no answers as of 18/03/2021
django migration no such table 3 - does not work in my case

You write in your class declaration:
queryset = get_products()
And this function has this line:
all_prods = list(Product.objects.values())
What happens here? Anything written directly inside a class is executed when the class is being created. If it were a method of the class then that would not be evaluated. So get_products() is called when the class is being created. Next list(Product.objects.values()) is called where calling list on the queryset forces the query to be evaluated. Now obviously you are making migrations and the tables don't exist yet giving you an error.
Firstly this code of yours is bad design even if there are changes to the table Product your view will still only display the products that were gotten in the first query made when the server was started up. This is obviously unintended and not what you want to do.
What is the solution? Obviously your query should reside somewhere where it is actually meant to be. Mostly in Django with classes that get a queryset there is a method named get_queryset that does this task, so you should be overriding that:
class ProductList3(ListAPIView):
serializer_class = GenericProductSerializer
def get_queryset(self):
return get_products() # More sensible to write it here.
# If concerned about overriding something you can use the below which would be equivalent
# self.queryset = get_products()
# return super().get_queryset()

Getting data from multiple databases with same tablenames in django

I need to get data from different imported MySQL-databases in Django
(Django 1.11.7, Python 3.5.2).
I run manage.py inspectdb --database '<db>' and then use the models in django.
Until now, I only had to access tables with different names. For this purpose, I used the using keyword in the queryset to specify the appropriate database and then concatenated the result, like this:
from ..models.db1 import Members
from ..models.db2 import Actor
context['db1_data'] = Members.objects.using('db1').filter...
context['db2_data'] = Actor.objects.using('db1').filter...
context["member_list"] = list(chain(
context["db1_data"],
context["db2_data"],
))
return context
Now I have the problem that there are tables with the same model names in two databases. I get the following error when using the above-mentioned method (I substituted the names):
RuntimeError: Conflicting '<table-name>' models in application '<app>': <class '<app>.<subfolder>.models.<db1>.<table-name>'> and <class '<app>.<subfolder>.models.<db2>.<table-name>'>.
I already tried importing the model with a different name, like this:
from ..models.db3 import Members as OtherMembers
but the error still comes up.
Shouldn't from ..models.db1 and from ..models.db2 be clear enough for Django to spot the difference between the two models?
One option would probably be to rename the models themselves, but that would mean to rename every database model with the same names. Since I will use many more databases in the future, this is not an option for me.
I tried from models import db1, db2 and then db1.Members etc., which still raises the error.
I read about the meta db_table = 'dbname.tablename'-option, but since the model is auto-generated through inspectdb, it already has something like this on every class:
class MyModel(models.Model):
<models>
class Meta:
managed = False
db_table = 'my_model'
As described before, the other database has the exact same model and hence the same Meta classes. I can't and don't want to change every Meta class.
EDIT:
My project structure looks like this:
app
-> admin.py
-> ...
-> models.py
-> views.py
subfolder
-> models
-> db1.py
-> db2.py
-> views
-> db1.py
-> db2.py

Assuming that you have set up your multiple databases correctly:
Have you tried to add a Custom Router?
If not follow the example given on the documentation link.
Have you tried to use a Custom Manager for your models?
Create a manager for each model, like this:
class YourModelManagerX(models.Manager):
def get_queryset(self, *args, **kwargs):
return super().get_queryset(*args, **kwargs).using('your_db_X')
And then add it to your appropriate model as the objects field:
class YourModel(models.Model):
...
fields
...
objects = YourManagerX()
class Meta:
managed = False
You may need to try both at once.

If db1.Members and db3.Members have the same definition, you do not have to redeclare the Members class separately for each database.
Models.py
...
class Members(models.Model): # Only declared once ever!
....
then,
from Models import Members
context['db1_data'] = Members.objects.using('db1').filter...
context['db3_data'] = Members.objects.using('db3').filter...
... # continue processing
Shouldn't from ..models.db1 and from ..models.db2 be clear enough for django to spot the difference between the two models?
Django Models are not database-specific, more like schema-specific, so if you have the same table in two different databases, one class extending model.Models suffices. Then, when you attempt to retrieve the objects, either specify the database with using(), or using routers, which you can read about in the Django docs https://docs.djangoproject.com/en/2.0/topics/db/multi-db/#an-example

How do I create new objects in django data migration?

I have a new django project with no migrations before, therefore I want to create my first migration which inserts some rows into a table. This is what I have yet:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import migrations, models
def create_types(apps, schema_editor):
Type = apps.get_model("core", "Type")
type = Type()
class Migration(migrations.Migration):
dependencies = [
]
operations = [
migrations.RunPython(create_types)
]
I have an application core which has model named Type, and I want to create few types and set attributes for them, and all of them save into database. How to create a new object of Type model here?
Also other question, if this is very first migrator can I leave the dependencies empty? The file I created with this tutorial https://docs.djangoproject.com/en/1.8/topics/migrations/#data-migrations named it 0001_initial.py
EDIT: solved, creating model like this works, my overseight.
Type = apps.get_model("core", "Type")
type = Type()
type.prop = '..'
type.save()

The pattern I follow is to start with an initial migration:
> python manage.py makemigrations
This creates the migrations to build your table.
Next, I create an empty migration for the data:
> python manage.py makemigrations --empty core
Then I edit the newly created migration file with the data migration using the RunPython function. The pattern in the docs is a little different compared to the other answer here:
https://docs.djangoproject.com/en/1.11/ref/migration-operations/
def create_types(apps, schema_editor):
db_alias = schema_editor.connection.alias
Type = apps.get_model("core", "Type")
Type.objects.using(db_alias).create(property=value)
class Migration(migrations.Migration):
dependencies = [
('core', '0001_initial'),
]
operations = [
migrations.RunPython(create_types),
]

Firstly, this can't be the first migration, since you need one to actually create your Type table. You at least need to depend on that.
Secondly, once you have the Type model, you treat it just as you would in normal Django code; instantiate it, set its attributes, save it, whatever.

How (and when/where) init Django cache?

I'm trying to use the Django LocMemCache for storing a few simple values but I'm not sure how to initialize the cache when Django starts. Django 1.7 come with Applications, allowing to run some code in AppConfig.ready(), that place would be perfect to initialize the cache, but according to Django 1.7 documentation:
" ... Although you can access model classes as described above, avoid
interacting with the database in your ready() implementation."
So, let say I want to store some DB queries from my model:
x = MyModel.objects.count()
y = MyModel.(a really expensive query)
How and when should I init the cache? Is there a recommended "best practice" for doing that?
Currently, I have just added the following cache.py to my application, but I'm not sure if
my code hits the database once (i.e. the first request) and then uses the cached value before (the following requests).
# cache.py
from django.core.cache import caches
from .models import MyModel
class Cache(object):
def __init__(self):
self.__count = MyModel.objects.count()
self.cache = caches['cache-storage']
#property
def total_count(self):
return self.cache.get('total_count', self.__count)
Then I use the cached values in this way:
# view.py
from .cache import Cache
cache = Cache()
...
(some view)
counter = cache.total_count

That tip about using databases in ready() is generally good advice but isn't always going to work out. Some applications need to access the database based on a schedule instead of a view. The only way I've seen to implement that is from within ready().
This does create an issue because all manage.py commands will run the code in ready(). This means that the application would be accessing the database with your runtime code while running migrations or creating super users which isn't ideal.
I avoid this issue by adding a sys.argv check in ready().
import sys
from django.apps import AppConfig
class MyApp(AppConfig):
name = 'MyApp'
def ready(self):
if 'runserver' in sys.argv:
# your code here

Writing good tests for Django applications

I've never written any tests in my life, but I'd like to start writing tests for my Django projects. I've read some articles about tests and decided to try to write some tests for an extremely simple Django app or a start.
The app has two views (a list view, and a detail view) and a model with four fields:
class News(models.Model):
title = models.CharField(max_length=250)
content = models.TextField()
pub_date = models.DateTimeField(default=datetime.datetime.now)
slug = models.SlugField(unique=True)
I would like to show you my tests.py file and ask:
Does it make sense?
Am I even testing for the right things?
Are there best practices I'm not following, and you could point me to?
my tests.py (it contains 11 tests):
# -*- coding: utf-8 -*-
from django.test import TestCase
from django.test.client import Client
from django.core.urlresolvers import reverse
import datetime
from someproject.myapp.models import News
class viewTest(TestCase):
def setUp(self):
self.test_title = u'Test title: bąrekść'
self.test_content = u'This is a content 156'
self.test_slug = u'test-title-bareksc'
self.test_pub_date = datetime.datetime.today()
self.test_item = News.objects.create(
title=self.test_title,
content=self.test_content,
slug=self.test_slug,
pub_date=self.test_pub_date,
)
client = Client()
self.response_detail = client.get(self.test_item.get_absolute_url())
self.response_index = client.get(reverse('the-list-view'))
def test_detail_status_code(self):
"""
HTTP status code for the detail view
"""
self.failUnlessEqual(self.response_detail.status_code, 200)
def test_list_status_code(self):
"""
HTTP status code for the list view
"""
self.failUnlessEqual(self.response_index.status_code, 200)
def test_list_numer_of_items(self):
self.failUnlessEqual(len(self.response_index.context['object_list']), 1)
def test_detail_title(self):
self.failUnlessEqual(self.response_detail.context['object'].title, self.test_title)
def test_list_title(self):
self.failUnlessEqual(self.response_index.context['object_list'][0].title, self.test_title)
def test_detail_content(self):
self.failUnlessEqual(self.response_detail.context['object'].content, self.test_content)
def test_list_content(self):
self.failUnlessEqual(self.response_index.context['object_list'][0].content, self.test_content)
def test_detail_slug(self):
self.failUnlessEqual(self.response_detail.context['object'].slug, self.test_slug)
def test_list_slug(self):
self.failUnlessEqual(self.response_index.context['object_list'][0].slug, self.test_slug)
def test_detail_template(self):
self.assertContains(self.response_detail, self.test_title)
self.assertContains(self.response_detail, self.test_content)
def test_list_template(self):
self.assertContains(self.response_index, self.test_title)

I am not perfect in testing but a few thoughts:
Basically you should test every function, method, class, whatever, that you have written by yourself.
This implies that you don't have to test functions, classes, etc. which the framework provides.
That said, a quick check of your test functions:
test_detail_status_code and test_list_status_code:
Ok to check whether you have configured the routing properly or not. Even more important when you provide your own implementation of get_absolute_url().
test_list_numer_of_items:
Ok if a certain number of items should be returned by the view. Not necessary if the number is not important (i.e. arbitrary).
test_detail_template and test_list_template:
Ok to check whether template variables are correctly set.
All the other functions: Not necessary.
What your are basically testing here is whether the ORM worked properly, whether lists work as expected and whether object properties can be accessed (or not). As long as you don't change e.g. the save() method of a model and/or provide your custom logic, I would not test this. You should trust the framework developers that this works properly.
You only should have to test what you have (over)written.
The model classes are maybe a special case. You basically have to test them, as I said, if you provide custom logic. But you should also test them against your requirements. E.g. it could be that a field is not allowed to be null (or that it has to be a certain datatype, like integer). So you should test that storing an object fails, if it has a null value in this field.
This does not test the ORM for correctly following your specification but test that the specification still fulfills your requirements. It might be that you change the model and change some settings (by chance or because you forgot about the requirements).
But you don't have to test e.g. methods like save() or wether you can access a property.
Of course when you use buggy third party code... well things can be different. But as Django uses the test framework itself to verify that everything is working, I would assume it is working.
To sum up:
Test against your requirements, test your own code.
This is only my point of view. Maybe others have better proposals.

Break your tests into two completely separate kinds.
Model tests. Put these in your models.py file with your model. These tests will exercise the methods in your model classes. You can do simple CRUD (Create, Retrieve, Update, Delete) to simply prove that your model works. Don't test every attribute. Do test field defaults and save() rules if you're curious.
For your example, create a TestNews class that creates, gets, updates and deletes a News item. Be sure to test the default date results. This class should be short and to the point. You can, if your application requires it, test various kinds of filter processing. Your unit test code can (and should) provide examples of the "right" way to filter News.
UI Tests. Put these in a separate tests.py file. These tests will test the view functions and templates.
Name the TestCase with the "condition" you're creating. "TestNotLoggedIn". "TestLoggedIn". "TestNoValidThis". "TestNotAllowedToDoThat". Your setUp will do the login and any other steps required to establish the required condition.
Name each test method with the action and result. "test_get_noquery_should_list", "test_post_should_validate_with_errors", "test_get_query_should_detail".

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.