Create Django model or update if exists - python

I want to create a model object, such as Person, if the person's id does not exist; otherwise I want to get the existing person object.
The code to create a new person is as follows:
from django.db import models

# The manager is defined first so that Person can reference it.
class PersonManager(models.Manager):
    def create_person(self, identifier):
        person = self.create(identifier=identifier)
        return person

class Person(models.Model):
    identifier = models.CharField(max_length=10)
    name = models.CharField(max_length=20)
    objects = PersonManager()
But I don't know where to check and get the existing person object.

It's unclear whether your question is asking for the get_or_create method (available from at least Django 1.3) or the update_or_create method (new in Django 1.7). It depends on how you want to update the user object.
Sample use is as follows:
# In both cases, the call will get a person object with matching
# identifier or create one if none exists; if a person is created,
# it will be created with name equal to the value in `name`.
# In this case, if the Person already exists, its existing name is preserved
person, created = Person.objects.get_or_create(
    identifier=identifier, defaults={"name": name}
)
# In this case, if the Person already exists, its name is updated
person, created = Person.objects.update_or_create(
    identifier=identifier, defaults={"name": name}
)
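To make the difference between the two calls concrete, here is a toy sketch in plain Python. It is not Django code: the dictionary `people` stands in for the table, and the two helper functions are hypothetical imitations that only mirror the (object, created) return shape and the handling of `defaults` described above.

```python
def get_or_create(table, identifier, defaults):
    """Return (row, created); an existing row is left untouched."""
    if identifier in table:
        return table[identifier], False
    table[identifier] = dict(defaults)
    return table[identifier], True


def update_or_create(table, identifier, defaults):
    """Return (row, created); an existing row is overwritten with defaults."""
    if identifier in table:
        table[identifier].update(defaults)
        return table[identifier], False
    table[identifier] = dict(defaults)
    return table[identifier], True


people = {"p1": {"name": "Alice"}}  # toy table keyed by identifier

# Existing row: get_or_create keeps the stored name.
kept, created_1 = get_or_create(people, "p1", {"name": "Bob"})
kept_name = kept["name"]

# Existing row: update_or_create replaces the stored name.
changed, created_2 = update_or_create(people, "p1", {"name": "Bob"})
changed_name = changed["name"]

# Missing row: both helpers would create it from defaults.
fresh, created_3 = get_or_create(people, "p2", {"name": "Carol"})

print(kept_name, created_1)
print(changed_name, created_2)
print(fresh["name"], created_3)
```

In both helpers the second element of the returned tuple is True only when a new entry was added, which matches how `created` is used in the answers here.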

If you're looking for the "update if exists, else create" use case, please refer to @Zags' excellent answer.
Django already has get_or_create: https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
For you it could be:
identifier = 'some identifier'
person, created = Person.objects.get_or_create(identifier=identifier)
if created:
    # A new person was created.
    pass
else:
    # person refers to the existing one.
    pass

Django has support for this; see get_or_create:
person, created = Person.objects.get_or_create(name='abc')
if created:
    # A new person object was created.
    pass
else:
    # The person object already existed.
    pass

update_or_create works well for a small number of objects, but it won't scale well over a large collection, because update_or_create always runs a SELECT first and an UPDATE afterwards. Instead, you can attempt the update directly and create only when nothing matched:
for the_bar in bars:
    updated_rows = SomeModel.objects.filter(bar=the_bar).update(foo=100)
    if not updated_rows:
        # No existing row matched, so create a new one.
        SomeModel.objects.create(bar=the_bar, foo=100)
In the best case this runs only the UPDATE query; the INSERT query runs only when the update matched zero rows. This greatly improves performance if you expect most of the rows to already exist.
It all comes down to your use case though. If you are expecting mostly inserts then perhaps the bulk_create() command could be an option.
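The update-first pattern from this answer can also be seen at the SQL level. The sketch below uses Python's built-in sqlite3 module rather than the Django ORM, so the table name `some_model` and the helper `update_or_insert` are assumptions made up for illustration. It only shows the idea: issue the UPDATE, and fall back to an INSERT when no row matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (bar INTEGER PRIMARY KEY, foo INTEGER)")
conn.executemany(
    "INSERT INTO some_model (bar, foo) VALUES (?, ?)", [(1, 0), (2, 0)]
)


def update_or_insert(conn, bar, foo):
    """Try an UPDATE first; INSERT only if no row matched.

    Returns True when a new row had to be inserted.
    """
    cur = conn.execute("UPDATE some_model SET foo = ? WHERE bar = ?", (foo, bar))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO some_model (bar, foo) VALUES (?, ?)", (bar, foo))
        return True
    return False


# Rows 1 and 2 exist already, so only bar=3 needs the second statement.
inserted = [bar for bar in (1, 2, 3) if update_or_insert(conn, bar, 100)]
rows = conn.execute("SELECT bar, foo FROM some_model ORDER BY bar").fetchall()
print(inserted)
print(rows)
```

One caveat with this pattern: without a transaction or a unique constraint, two concurrent callers could both see zero updated rows and both try to insert.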

Thought I'd add an answer since your question title looks like it is asking how to create or update, rather than get or create as described in the question body.
If you did want to create or update an object, the .save() method already has this behaviour by default, from the docs:
Django abstracts the need to use INSERT or UPDATE SQL statements.
Specifically, when you call save(), Django follows this algorithm:
If the object’s primary key attribute is set to a value that evaluates
to True (i.e., a value other than None or the empty string), Django
executes an UPDATE. If the object’s primary key attribute is not set
or if the UPDATE didn’t update anything, Django executes an INSERT.
It's worth noting that when they say 'if the UPDATE didn't update anything' they are essentially referring to the case where the id you gave the object doesn't already exist in the database.

You can also use update_or_create just like get_or_create. Here is the pattern I follow for update_or_create, assuming a model Person with id (key), name, age, and is_manager as attributes:
update_values = {"is_manager": False}
new_values = {"name": "Bob", "age": 25, "is_manager": True}
obj, created = Person.objects.update_or_create(
    identifier='id', defaults=update_values
)
if created:
    # Model instances have no update() method, so set the fields and save.
    for field, value in new_values.items():
        setattr(obj, field, value)
    obj.save()

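If one of the inputs when you create is the primary key, this will be enough:
Person.objects.get_or_create(id=1)
Because two rows with the same primary key are not allowed, this returns the existing row if there is one and creates it otherwise. Note that get_or_create does not modify an existing row; to change its fields, use update_or_create.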

This should be the answer you are looking for
EmployeeInfo.objects.update_or_create(
    # id or any primary key: value to search for
    identifier=your_id,
    # if found, update with the following; otherwise create with it
    defaults={'name': 'your_name'}
)

Related

How to update the property of an instance of a model as foreign key of another model

I am in the following situation. I have a Django model that includes a link to another class, e.g.:
class Document(models.Model):
    document = models.FileField(upload_to=user_file_path)
    origin = models.CharField(max_length=9)
    date = models.DateField()
    company = models.ForeignKey(Company, on_delete=models.CASCADE)
Company includes a property called status.
In another part of the code I am working with a doc of type Document and I want to update the status property of the company within the doc, so I proceed to:
doc.company.status = new_value
doc.save()
or
setattr(company, 'status', new_value)
company.save()
In the first case, the value of doc.company.status is updated, but if I query the company it keeps the old value, and in the second case is the other way around, the value of company.status is updated but doc.company.status keeps the old value.
I had always assumed that updating either (doc.company or company) of them had an immediate effect over the other, but now it seems that doc has a copy of company (or some sort of lazy link difficult to foresee) and both remain separate, and both of them have to be updated to change the value.
An alternative that seems to work is (saving doc.company instead of doc):
doc.company.status = new_value
doc.company.save()
The new value for status does not seem to propagate instantly to company, but it does appear once a query or operation requires it.
Could someone explain the exact relationship or the proper way of doing this, or provide a reference where I can find a proper explanation?
Thanks
Setting an object instance attribute changes only the attribute on that instance. After using save(), the value is saved in the database.
If there were previously defined variables of the same instance, they will remain the same. That is, unless you call refresh_from_db() method.
Consider the following scenario:
company_var1 = Company.objects.get(pk=13)
company_var2 = Company.objects.get(pk=13)
company_var1.status = new_value # This only changes the attribute on the model instance
print(company_var1.status) # Output: new value
print(company_var2.status) # Output: old value
company_var1.save() # This updates the database table
print(company_var1.status) # Output: new value
# Note how the company_var2 is still the same even though
# it refers to the same DB table row (pk=13).
print(company_var2.status) # Output: old value
# After doing this company_var2 will get the values from the DB
company_var2.refresh_from_db()
print(company_var1.status) # Output: new value
print(company_var2.status) # Output: new value
I hope that clears things up and you can apply it to your case.
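The same behaviour can be imitated without Django. In the toy sketch below, `FAKE_DB` and `FakeCompany` are hypothetical stand-ins (not Django classes). Each instance holds its own copy of the row, so saving through one instance changes the stored value but not the copy held by the other until that other instance reloads.

```python
# Toy stand-in for a database table: primary key -> row. Not Django code.
FAKE_DB = {13: {"status": "old"}}


class FakeCompany:
    """Hypothetical minimal imitation of a model instance."""

    def __init__(self, pk):
        self.pk = pk
        self.refresh_from_db()

    def refresh_from_db(self):
        # Copy the stored value onto this instance only.
        self.status = FAKE_DB[self.pk]["status"]

    def save(self):
        # Write this instance's value back to the shared store.
        FAKE_DB[self.pk]["status"] = self.status


a = FakeCompany(13)
b = FakeCompany(13)  # a second, independent copy of the same row

a.status = "new"
a.save()

before = b.status      # b still holds the value it loaded earlier
b.refresh_from_db()
after = b.status       # b now matches the stored value

print(before, after)
```

In the question's terms, doc.company and a separately fetched company play the roles of `a` and `b`: they describe the same row but are separate Python objects.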

How to create a new record from django query set

I have heard of a django orm hack somewhere. It goes like this
dp = SomeModel.objects.filter(type="customer").last()
dp.id = dp.id + 1 #changing id to a new one
dp.save()
The last step supposedly creates a new record provided the value of id being used doesn't exist. In case the incremented id exists, then the save method acts like the update method.
example ::
dp.version += 1 #updating some random field
dp.save() # will change the newer version of dp.id
I would like to ask Django veterans two questions for our benefit:
Is there a sure-fire way of creating a new record from an old record with the latest auto-increment pk, instead of the pk + 1 method?
Is the above method any faster or better? One advantage I see is that if I have a model with 10 fields and I want to create a new record from an old one with only 1 or 2 changes, this method saves 8 lines of code.
Thank You
last() returns the last instance with regard to the ordering of the corresponding QuerySet. This instance is not guaranteed to have the largest pk in use. Even if it did, there would be no guarantee that no other instance is created between retrieving the old highest pk and committing the new instance with pk + 1. Such an instance would be overwritten by the new clone. Hence, do not set a new pk manually; let the database handle that. As others have suggested, that is easily done by setting the pk to None:
instance.pk = None
instance.save()
Yes, set id to None.
dp = SomeModel.objects.filter(type="customer").last()
dp.id = None
dp.save()
Rather than incrementing the pk, you should set it to None. Then it will always be saved as a new record.
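As a rough illustration of why leaving the key to the database is safer than computing pk + 1 by hand, the sketch below uses Python's built-in sqlite3 module instead of the Django ORM. The table `some_model` and the helper `clone_row` are assumptions for the example. The INSERT omits the id column entirely, so the database picks the next key itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_model ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, type TEXT, version INTEGER)"
)
conn.execute("INSERT INTO some_model (type, version) VALUES ('customer', 1)")


def clone_row(conn, row_id):
    """Copy a row without naming the id column, so the database assigns the key."""
    cur = conn.execute(
        "INSERT INTO some_model (type, version) "
        "SELECT type, version FROM some_model WHERE id = ?",
        (row_id,),
    )
    return cur.lastrowid


new_id = clone_row(conn, 1)
rows = conn.execute(
    "SELECT id, type, version FROM some_model ORDER BY id"
).fetchall()
print(new_id)
print(rows)
```

Because the key is never chosen by the caller, there is no window in which another insert could claim the same value.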

Django "update_or_create" API: how to filter objects by created or updated?

So, I'm using the Django update_or_create API to build my form data. It works fine...but, once built, I need a way to check to see what profiles were actually updated or if they were created for the first time?
Just an example:
for people in peoples:
    people, updated = People.objects.update_or_create(
        id=people.person_id,
        defaults={
            'first_name': people.first_name,
        }
    )
Filtering queryset:
people = People.objects.filter(
    id__in=whatever,
)
But now I'm trying to filter the queryset by created or updated, and I don't see an API for that (e.g., a filter of sorts).
So, I would like to do something like:
updated = Person.objects.filter(updated=True, created_for_first_time=False)
and then I can write something like
if updated:
    ...  # do something
else:
    ...  # do something else
Basically, I just want to check if a profile was updated or created for the first time.
As you have shown in your question, the update_or_create method returns a tuple (obj, created), where obj is the object and created is a boolean showing whether a new object was created.
You could check the value of the boolean and build a list of the ids of the newly created objects:
new_objects = []
for people in peoples:
    obj, created = People.objects.update_or_create(...)
    if created:
        new_objects.append(obj.id)
You can then filter using that list:
new_people = People.objects.filter(id__in=new_objects)
existing_people = People.objects.exclude(id__in=new_objects)
When you call update_or_create:
person, created = People.objects.update_or_create(...)
the created return value will be True if the record was freshly created, or False if it was an existing record that got updated. If you need to act on this bit of information, it would be best to do it right here, while you have access to it.
If you need to do it later, the only way I can think of is to design a schema that supports it, i.e. have create_date and last_modify_date fields, and if those two fields are equal, you know the record has not been modified since it was created.

Django Integrity error _id may not be null, ForeignKey id assignment

I'm having issues understanding the way Django (v1.6.5) assigns the id to the different objects when saving. Taking the minimal example:
#models.py
class Book(models.Model):
    title = models.CharField(max_length=10)
class Page(models.Model):
    number = models.SmallIntegerField()
    book = models.ForeignKey(Book)
The following view throws "IntegrityError,book_id may not be NULL" when saving my_page, however I would tend to say book_id does exist since save() has been called for the book at that stage.
#view.py
my_book = Book(title="My book")
#solution1 : having my_book.save() here
my_page = Page(number = 1, book = my_book)
my_book.save()
print("book id",my_page.book.id) #book.id does exist at that point!
#solution2: my_page.book = my_book
my_page.save() #throws the IntegrityError exception
There are some easy solutions to make the code above work but I would like to know what is wrong with the first approach. Am I missing something or is it a glitch/limitation in the way Django handles ForeignKeys?
You should save the book before setting my_page.book = book.
The behaviour you're experiencing is described by ticket 10811.
I see your point, but the current behavior seems more explicit. my_book is just a Python object, and all of its attributes (including id) can change. So it seems safer to assume that the user wants the value that exists at instantiation time.
For example, the Django idiom for copying a database row involves reusing the same object to represent more than one model instance. In your case that might look like:
my_book = Book(title="My book")
my_page = Page(number=1, book=my_book)
my_book.save()
my_book.id = None
my_book.save() # copy the book to a new row with a new id
my_page.save()
So which book should my_page point to? I think the developers are right to require you to be explicit here. The solution to the associated ticket will be even more direct in that you will get a ValueError when trying to instantiate my_page if my_book hasn't yet been saved.
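A toy model may help show why the id is missing in the question's first approach. The classes below are hypothetical plain-Python stand-ins, not Django's implementation. They assume, as in the Django 1.6 setup from the question, that the foreign key value is copied from the related object once, at the moment of assignment.

```python
class FakeBook:
    """Hypothetical stand-in for a model instance that may be unsaved."""

    _next_id = 1

    def __init__(self, title):
        self.title = title
        self.id = None  # no primary key until save() is called

    def save(self):
        if self.id is None:
            self.id = FakeBook._next_id
            FakeBook._next_id += 1


class FakePage:
    """Hypothetical stand-in that snapshots the related key on assignment."""

    def __init__(self, number, book):
        self.number = number
        self.book = book
        # The key is copied once, here; later changes to book.id are not seen.
        self.book_id = book.id


book = FakeBook("My book")
page_early = FakePage(1, book)  # book not saved yet, so book_id is None
book.save()
page_late = FakePage(2, book)   # book already saved, so book_id is set

print(page_early.book.id, page_early.book_id, page_late.book_id)
```

Here page_early.book.id looks fine after the save, just like the print in the question, while the stored key page_early.book_id is still None. Creating the page after saving the book avoids the mismatch.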
If a model has a foreign key, you need to supply the whole related object in order to store a record. For instance, to save a Page with a number and a book, you need to give the book foreign key a Book object, because it is not a normal field but a foreign key to the Book model.
It can be done as follows:
my_page = Page(number=1, book=Book.objects.get(pk=id))
In conclusion, you cannot insert a record simply by giving strings to the foreign keys; you must supply the object the foreign key points to. Hope that helps.

Django deferring the foreign key lookup

I'm working on a Django project and trying to speed up the calls. I noticed that Django automatically does a second query to evaluate any foreign key relationships. For instance, if my models look like:
class Person(models.Model):
    name = models.CharField("blah")
class Address(models.Model):
    person = models.ForeignKey(Person)
Then I make:
p1 = Person.objects.create(name="Bob")
address1 = Address.objects.create(person=p1)
print(p1.id)  # let it be 1 because it is the first entry
then when I call:
Address.objects.filter(person_id=1)
I get:
Query #1: SELECT address.id, address.person_id FROM address
Query #2: SELECT person.id, person.name FROM person
I want to get rid of the second call, query #2. I have tried using defer() from the Django documentation, but that did not work (in fact it makes even more calls). values() is a possibility, but in actual practice there are many more fields I want to pull. The only thing I want is for Django not to evaluate the foreign key; I would be happy to get the person_id back, or not. Avoiding the second query would drastically reduce the runtime, especially for a call like Address.objects.all(), because Django evaluates every foreign key.
Having just seen your other question on the same issue, I'm going to guess that you have defined a __unicode__ method that references the ForeignKey field. If you query for some objects in the shell and output them, the __unicode__ method will be called, which requires a query to get the ForeignKey. The solution is to either rewrite that method so it doesn't need that reference, or - as I say in the other question - use select_related().
Next time, please provide full code, including some that actually demonstrates the problem you are having.
