Django BigInteger auto-increment field as primary key? - python

I'm currently building a project which involves a lot of collective intelligence. Every user visiting the web site gets a unique profile created, and their data is later used to calculate best matches for themselves and other users.
By default, Django creates an INT(11) id field for a model's primary key. I'm concerned that this will overflow very quickly (i.e. ~2.4b devices visiting the page without a prior cookie set up). How can I change it to be represented as BIGINT in MySQL and long() inside Django itself?
I've found I could do the following (http://docs.djangoproject.com/en/dev/ref/models/fields/#bigintegerfield):
class MyProfile(models.Model):
    id = models.BigIntegerField(primary_key=True)
But is there a way to make it autoincrement, like usual id fields? Additionally, can I make it unsigned so that I get more space to fill in?
Thanks!

Django has had a built-in BigAutoField since Django 1.10:
https://docs.djangoproject.com/en/1.10/ref/models/fields/#bigautofield
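For a rough sense of the headroom involved, the sketch below computes the signed and unsigned limits of 32-bit and 64-bit integer columns in plain Python. The helper name int_limits and the visits-per-second figure are made up for illustration; the only facts relied on are the two's-complement ranges.

```python
def int_limits(bits):
    """Return (signed_max, unsigned_max) for an integer column of the given width."""
    return 2 ** (bits - 1) - 1, 2 ** bits - 1


int32_signed, int32_unsigned = int_limits(32)
int64_signed, int64_unsigned = int_limits(64)

print(int32_signed)    # 2147483647
print(int32_unsigned)  # 4294967295
print(int64_signed)    # 9223372036854775807

# Hypothetical load: 1,000 new profiles per second, around the clock.
rows_per_year = 1000 * 60 * 60 * 24 * 365
years_to_fill_int32 = int32_signed / rows_per_year
years_to_fill_int64 = int64_signed / rows_per_year
```

Going unsigned only doubles the range, while moving from 32 to 64 bits multiplies it by roughly four billion, which is why the wider column matters far more than the sign.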

Inspired by lfagundes but with a small but important correction:
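from django.db.models import fields
from south.modelsinspector import add_introspection_rules
class BigAutoField(fields.AutoField):
    def db_type(self, connection):  # pylint: disable=W0621
        # Only MySQL needs the explicit column type; other backends use the default.
        if 'mysql' in connection.__class__.__module__:
            return 'bigint AUTO_INCREMENT'
        return super(BigAutoField, self).db_type(connection)
add_introspection_rules([], [r"^a\.b\.c\.BigAutoField"])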
Notice instead of extending BigIntegerField, I am extending AutoField. This is an important distinction. With AutoField, Django will retrieve the AUTO INCREMENTed id from the database, whereas BigInteger will not.
One concern when changing from BigIntegerField to AutoField was the casting of the data to an int in AutoField.
Notice from Django's AutoField:
def to_python(self, value):
    if value is None:
        return value
    try:
        return int(value)
    except (TypeError, ValueError):
        msg = self.error_messages['invalid'] % str(value)
        raise exceptions.ValidationError(msg)
and
def get_prep_value(self, value):
    if value is None:
        return None
    return int(value)
It turns out this is OK, as verified in a python shell:
>>> l2 = 99999999999999999999999999999
>>> type(l2)
<type 'long'>
>>> int(l2)
99999999999999999999999999999L
>>> type(l2)
<type 'long'>
>>> type(int(l2))
<type 'long'>
In other words, casting to an int will not truncate the number, nor will it change the underlying type.
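The session above is from Python 2, where int and long were separate types. In Python 3 there is only int, which is arbitrary-precision, so the same check looks roughly like the sketch below (the variable names are illustrative):

```python
# Python 3: int is arbitrary-precision, so the cast neither truncates nor changes type.
big = 99999999999999999999999999999
as_int = int(big)

# The largest signed 64-bit value, parsed from a string.
from_string = int("9223372036854775807")

print(type(as_int).__name__)       # int
print(as_int == big)               # True
print(from_string == 2 ** 63 - 1)  # True
```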

NOTE: This answer was modified according to Larry's code. The previous solution extended fields.BigIntegerField, but it is better to extend fields.AutoField.
I had the same problem and solved it with the following code:
from django.db.models import fields
from south.modelsinspector import add_introspection_rules
class BigAutoField(fields.AutoField):
    def db_type(self, connection):
        if 'mysql' in connection.__class__.__module__:
            return 'bigint AUTO_INCREMENT'
        return super(BigAutoField, self).db_type(connection)
add_introspection_rules([], [r"^MYAPP\.fields\.BigAutoField"])
Apparently this is working fine with south migrations.

You could alter the table afterwards. That may be a better solution.

Since Django 3.2 the type of implicit primary key can be controlled with the DEFAULT_AUTO_FIELD setting (documentation). So, there is no need anymore to override primary keys in all your models.
# This setting will change all implicitly added primary keys to BigAutoField
DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField'
Note that starting with Django 3.2 new projects are generated with DEFAULT_AUTO_FIELD set to BigAutoField (release notes).

As stated before you could alter the table afterwards. That is a good solution.
To do that without forgetting, you can create a management module under your application package and use the post_syncdb signal.
https://docs.djangoproject.com/en/dev/ref/signals/#post-syncdb
This can cause django-admin.py flush to fail. But it is still the best alternative I know.

I had the same problem. It looks like there is no support for BigInteger auto fields in Django.
I tried to create a custom BigIntegerAutoField, but I ran into a problem with the South migration system (South couldn't create a sequence for my field).
After trying a couple of different approaches, I decided to follow Matthew's advice and alter the table (e.g. ALTER TABLE table_name ALTER COLUMN id TYPE bigint; in PostgreSQL).
It would be great to have a solution supported by both Django (like a built-in BigIntegerAutoField) and South.

Related

ValueError: Cannot assign "1": "RecipeRequirements.ingredient" must be a "Ingredient" instance [duplicate]

Is there a way to set foreign key relationship using the integer id of a model? This would be for optimization purposes.
For example, suppose I have an Employee model:
class Employee(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    type = models.ForeignKey('EmployeeType')
and
class EmployeeType(models.Model):
    type = models.CharField(max_length=100)
I want the flexibility of having unlimited employee types, but in the deployed application there will likely be only a single type so I'm wondering if there is a way to hardcode the id and set the relationship this way. This way I can avoid a db call to get the EmployeeType object first.
Yep:
employee = Employee(first_name="Name", last_name="Name")
employee.type_id = 4
employee.save()
ForeignKey fields store their value in an attribute with _id at the end, which you can access directly to avoid visiting the database.
The _id version of a ForeignKey is a particularly useful aspect of Django, one that everyone should know and use from time to time when appropriate.
Caveat: [ < Django 2.1 ]
@RuneKaagaard points out that employee.type is not accurate afterwards in Django versions before 2.1, even after calling employee.save() (it holds its old value). Using it would of course defeat the purpose of the above optimisation, but I would prefer an accidental extra query to being incorrect. So be careful: only use this when you are finished working on your instance (e.g. employee).
Note: As @humcat points out below, the bug is fixed in Django 2.1.
An alternative that uses create to create the object and save it to the database in one line:
employee = Employee.objects.create(first_name='first', last_name='last', type_id=4)

Get Python type of Django's model field?

How can I get corresponding Python type of a Django model's field class ?
from django.db import models
class MyModel(models.Model):
    value = models.DecimalField()
type(MyModel._meta.get_field('value'))  # <class 'django.db.models.fields.DecimalField'>
I'm looking for a way to get the corresponding Python type for the field's value: decimal.Decimal in this case.
Any idea ?
p.s. I've attempted to work around this with field's default attribute, but it probably won't work in all cases where field has no default value defined.
I don't think you can decide the actual python type programmatically there. Part of this is due to python's dynamic type. If you look at the doc for converting values to python objects, there is no hard predefined type for a field: you can write a custom field that returns object in different types depending on the database value. The doc of model fields specifies what Python type corresponds to each field type, so you can do this "statically".
But why would you need to know the Python types in advance in order to serialize them? The serialize modules are supposed to do this for you, just throw them the objects you need to serialize. Python is a dynamically typed language.
An ugly alternative is to check the field's repr():
if 'DecimalField' in repr(model._meta.get_field(fieldname)):
    return decimal.Decimal
else:
    ...
However, you have to do this for all types separately.
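The "static" approach mentioned above can be sketched as a lookup table keyed by field class name, walking the class hierarchy so that subclasses of a known field resolve too. The table is a small hand-written assumption based on the field reference, not something Django exposes, and python_type_for is a hypothetical helper. Stand-in classes are used so the snippet runs without a Django project; with real models you would pass MyModel._meta.get_field('value') instead.

```python
import datetime
import decimal

# Hand-maintained mapping (an assumption, deliberately incomplete).
FIELD_PYTHON_TYPES = {
    "DecimalField": decimal.Decimal,
    "IntegerField": int,
    "FloatField": float,
    "CharField": str,
    "BooleanField": bool,
    "DateTimeField": datetime.datetime,
    "DateField": datetime.date,
}


def python_type_for(field):
    """Return the Python type for a field instance, or None if it is unknown."""
    for cls in type(field).__mro__:
        if cls.__name__ in FIELD_PYTHON_TYPES:
            return FIELD_PYTHON_TYPES[cls.__name__]
    return None


# Stand-ins for django.db.models field classes, used only for this demo.
class DecimalField:
    pass


class MoneyField(DecimalField):  # hypothetical custom subclass
    pass


print(python_type_for(DecimalField()))  # <class 'decimal.Decimal'>
print(python_type_for(MoneyField()))    # <class 'decimal.Decimal'>
print(python_type_for(object()))        # None
```

Custom fields that return different types depending on the stored value still cannot be covered this way, which is the limitation the first answer describes.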

How to get a document by its primary key in mongoengine?

I'm porting an application from App Engine's ndb to mongoengine. ndb provides the Model.get_by_id method, and I'd like to implement this in terms of mongoengine. So how do you get a document by its automatically generated id, or by whatever field has primary_key set to True?
You can use with_id():
class MyDocument(Document):
    ...
    @classmethod
    def get_by_id(cls, id):
        return cls.objects.with_id(id)
This will return the document instance if it exists or None if it doesn't.
Check out http://docs.mongoengine.org/guide/querying.html
Answer is simple:
Model.objects(id='your-id')
I presume that you know the name of the primary key field.
Use with_id. It's specialized for that purpose.
Model.objects.with_id('your-id')
It returns None if no object is found.
But make sure no filter has already been applied to the queryset, because in that case with_id raises InvalidQueryError.

How can you keep the Django ORM from making mistakes when you pass the wrong kind of object?

We found this while testing, one machine was setup with MyISAM as the default engine and one was set with InnoDB as the default engine. We have code similar to the following
class StudyManager(models.Manager):
    def scored(self, school=None, student=None):
        qset = self.all()  # a manager has no .objects attribute of its own
        if school:
            qset = qset.filter(school=school)
        if student:
            qset = qset.filter(student=student)
        return qset.order_by('something')
The problem code looked like this:
print Study.objects.scored(student).count()
which meant that the "student" was being treated as a school. This got through testing with MyISAM because student.id == school.id: MyISAM can't do a rollback, so the tables get completely re-created for each test (resetting the autoincrement id field). InnoDB caught these errors because rollback evidently does not reset the autoincrement fields.
Problem is, during testing, there could be many other errors that are going uncaught due to duck typing since all models have an id field. I'm worried about the id's on objects lining up (in production or in testing) and that causing problems/failing to find the bugs.
I could add asserts like so:
class StudyManager(models.Manager):
    def scored(self, school=None, student=None):
        qset = self.all()  # a manager has no .objects attribute of its own
        if school:
            assert isinstance(school, School)
            qset = qset.filter(school=school)
        if student:
            assert isinstance(student, Student)
            qset = qset.filter(student=student)
        return qset.order_by('something')
But this looks nasty, and is a lot of work (to go back and retrofit). It's also slower in debug mode.
I've thought about the idea that the id field for the models could be coerced into model_id (student_id for Student, school_id for School) so that schools would not have a student_id, this would only involve specifying the primary key field, but django has a shortcut for that in .pk so I'm guessing that might not help in all cases.
Is there a more elegant solution to catching this kind of bug? Being an old C++ hand, I kind of miss type safety.
This is an aspect of Python and has nothing to do with Django per se.
By defining default values for function parameters you do not eliminate the concept of positional arguments; you simply make it possible to not specify all parameters when invoking the function. @mVChr is correct in saying that you need to get in the habit of using the parameter name(s) when you call the routine, particularly when there is inherent ambiguity in just what it is being called with.
You might also consider having two separate routines whose names quite clearly identify their expected parameter types.
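In Python 3 the habit of naming arguments can be enforced by the function signature itself: parameters declared after a bare * are keyword-only, so a positional call fails immediately instead of silently binding a student to school. The sketch below uses a plain function that returns the chosen filters rather than a real manager method; the string values are placeholders.

```python
def scored(*, school=None, student=None):
    """Keyword-only sketch: callers must spell out which filter they mean."""
    filters = {}
    if school is not None:
        filters["school"] = school
    if student is not None:
        filters["student"] = student
    return filters


print(scored(student="alice"))  # {'student': 'alice'}

try:
    scored("alice")  # positional call: ambiguous, so Python rejects it
except TypeError as exc:
    print("rejected:", exc)
```

On a manager the signature would read def scored(self, *, school=None, student=None), and the problem call Study.objects.scored(student) would raise TypeError regardless of which ids happen to line up.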

Django ForeignKey which does not require referential integrity?

I'd like to set up a ForeignKey field in a django model which points to another table some of the time. But I want it to be okay to insert an id into this field which refers to an entry in the other table which might not be there. So if the row exists in the other table, I'd like to get all the benefits of the ForeignKey relationship. But if not, I'd like this treated as just a number.
Is this possible? Is this what Generic relations are for?
This question was asked a long time ago, but for newcomers there is now a built in way to handle this by setting db_constraint=False on your ForeignKey:
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.db_constraint
customer = models.ForeignKey('Customer', db_constraint=False)
or, if you want it to be nullable as well as not enforcing referential integrity:
customer = models.ForeignKey('Customer', null=True, blank=True, db_constraint=False)
We use this in cases where we cannot guarantee that the relations will get created in the right order.
I'm new to Django, so I don't know if it provides what you want out-of-the-box. I thought of something like this:
from django.db import models
class YourModel(models.Model):
    my_fk = models.PositiveIntegerField()
    def set_fk_obj(self, obj):
        # Store only the id; no database-level constraint is involved.
        self.my_fk = obj.id
    def get_fk_obj(self):
        if self.my_fk is None:
            return None
        try:
            return YourFkModel.objects.get(pk=self.my_fk)
        except YourFkModel.DoesNotExist:
            return None
I don't know if you use the contrib admin app. Using PositiveIntegerField instead of ForeignKey the field would be rendered with a text field on the admin site.
This is probably as simple as declaring a ForeignKey and creating the column without actually declaring it as a FOREIGN KEY. That way, you'll get o.obj_id, o.obj will work if the object exists, and--I think--raise an exception if you try to load an object that doesn't actually exist (probably DoesNotExist).
However, I don't think there's any way to make syncdb do this for you. I found syncdb to be limiting to the point of being useless, so I bypass it entirely and create the schema with my own code. You can use syncdb to create the database, then alter the table directly, eg. ALTER TABLE tablename DROP CONSTRAINT fk_constraint_name.
You also inherently lose ON DELETE CASCADE and all referential integrity checking, of course.
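At the SQL level the idea looks like the following sketch, which uses the standard-library sqlite3 module as a stand-in database rather than Django. The table and column names are invented for the example; the point is only that a plain integer column with no FOREIGN KEY clause accepts ids that have no matching row, and a LEFT JOIN then yields NULL for the missing side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    -- customer_id is a plain integer: no REFERENCES clause, so nothing is enforced.
    CREATE TABLE purchase (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer (id, name) VALUES (1, 'Ada');
    INSERT INTO purchase (id, customer_id) VALUES (10, 1);
    INSERT INTO purchase (id, customer_id) VALUES (11, 999);
    """
)

# Purchase 11 points at a customer id that does not exist.
rows = conn.execute(
    """
    SELECT purchase.id, purchase.customer_id, customer.name
    FROM purchase LEFT JOIN customer ON customer.id = purchase.customer_id
    ORDER BY purchase.id
    """
).fetchall()

print(rows)  # [(10, 1, 'Ada'), (11, 999, None)]
conn.close()
```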
To apply the solution by @Glenn Maynard via South, generate an empty South migration:
python manage.py schemamigration myapp name_of_migration --empty
Edit the migration file then run it:
def forwards(self, orm):
    db.delete_foreign_key('table_name', 'field_name')
def backwards(self, orm):
    sql = db.foreign_key_sql('table_name', 'field_name', 'foreign_table_name', 'foreign_field_name')
    db.execute(sql)
Source article
(Note: It might help if you explain why you want this. There might be a better way to approach the underlying problem.)
Is this possible?
Not with ForeignKey alone, because you're overloading the column values with two different meanings, without a reliable way of distinguishing them. (For example, what would happen if a new entry in the target table is created with a primary key matching old entries in the referencing table? What would happen to these old referencing entries when the new target entry is deleted?)
The usual ad hoc solution to this problem is to define a "type" or "tag" column alongside the foreign key, to distinguish the different meanings (but see below).
Is this what Generic relations are for?
Yes, partly.
GenericForeignKey is just a Django convenience helper for the pattern above; it pairs a foreign key with a type tag that identifies which table/model it refers to (using the model's associated ContentType; see contenttypes)
Example:
class Foo(models.Model):
    other_type = models.ForeignKey('contenttypes.ContentType', null=True)
    other_id = models.PositiveIntegerField()
    # Optional accessor, not a stored column
    other = generic.GenericForeignKey('other_type', 'other_id')
This will allow you to use other like a ForeignKey, to refer to instances of your other model. (In the background, GenericForeignKey gets and sets other_type and other_id for you.)
To represent a number that isn't a reference, you would set other_type to None, and just use other_id directly. In this case, trying to access other will always return None, instead of raising DoesNotExist (or returning an unintended object, due to id collision).
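The type-tag pattern itself can be illustrated without Django. The sketch below is a plain-Python stand-in, not GenericForeignKey's implementation: TABLES, Ref, and resolve are hypothetical names, and dictionaries play the role of database tables.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in "tables": type tag -> {primary key -> row}.
TABLES = {
    "customer": {1: "Ada", 2: "Grace"},
    "supplier": {1: "Acme"},
}


@dataclass
class Ref:
    other_type: Optional[str]  # which table other_id points into; None means "just a number"
    other_id: int


def resolve(ref):
    """Return the referenced row, or None for plain numbers and missing rows."""
    if ref.other_type is None:
        return None
    return TABLES.get(ref.other_type, {}).get(ref.other_id)


print(resolve(Ref("customer", 1)))   # Ada
print(resolve(Ref("supplier", 1)))   # Acme
print(resolve(Ref(None, 1)))         # None
print(resolve(Ref("customer", 42)))  # None
```

The same id resolves to different rows depending on the tag, and an untagged id never resolves at all, which is how the tag avoids the id-collision problem described above.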
columnname = models.ForeignKey('table', null=True, blank=True, db_constraint=False)
Use this in your model.
