Copy object instance and insert to DB using peewee creates duplicate ID - python

I have an instance of an object (with many attributes) which I want to duplicate.
I copy it using deepcopy() and then modify a couple of attributes.
Then I save the new object to the database using Python / peewee save(), but save() actually updates the original object (I assume this is because the id was copied from the original object).
(By the way, no primary key is defined in the object model.)
How do I force a save of the new object? Can I change its id?
Thanks.

Turns out that I can set the id to None (obj.id = None) which will create a new record when performing save().

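Setting the id to None (obj.id = None) works if you are using SQLite; otherwise use:
    # `_data` is peewee's internal dict of field values
    # (assumption: newer peewee versions name it `__data__` instead)
    data = obj.__dict__['_data']
    data.pop('id')
    obj.insert(data).execute()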
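
To make the cause concrete, here is a minimal, library-free sketch of the rule that an ORM's save() typically follows: UPDATE when the instance carries an id, INSERT when it does not. The `Item` dataclass and the `save` helper are hypothetical stand-ins written against the standard-library sqlite3 module; they are not peewee's implementation.

```python
import copy
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    id: Optional[int] = None  # None means "not stored yet"


def save(conn: sqlite3.Connection, item: Item) -> None:
    """Hypothetical save(): UPDATE if the item has an id, otherwise INSERT."""
    if item.id is None:
        cur = conn.execute("INSERT INTO item (name) VALUES (?)", (item.name,))
        item.id = cur.lastrowid  # remember the generated key
    else:
        conn.execute("UPDATE item SET name = ? WHERE id = ?", (item.name, item.id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

original = Item(name="original")
save(conn, original)

clone = copy.deepcopy(original)  # the copy still carries original.id
clone.name = "clone"
clone.id = None                  # clearing the id makes save() insert a new row
save(conn, clone)

rows = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
```

Without the `clone.id = None` line, the second save() would take the UPDATE branch and overwrite the original row, which is the behaviour described in the question.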

Related

Unable to insert extracted objects in SQLAlchemy ORM

I'm trying to insert/update the list of objects extracted using SQLAlchemy ORM.
    def truncate_mytable(self):
        with self.session.begin():
            current_records = self.session.query(MyTable).all()
            self.session.query(MyTable).delete()
            self.session.expunge_all()
        return current_records

    def compensate_truncate_mytable(self, objects):
        with self.session.begin():
            self.session.bulk_save_objects(objects)
But while the objects have been extracted correctly, they are not getting written to the DB.
Could it be because there are also some protected attributes inside the objects, such as <sqlalchemy.orm.state.InstanceState object at 0x11471bf70> and <ClassManager of <class 'lib.kaizen_models.models.MyTable'> at 1146673b0>? The objects' type in the list is <lib.kaizen_models.models.MyTable object at 0x11471bd00>.
(I'm writing compensation methods, following Saga pattern.)
The problem is that the objects are in a detached state, and this means bulk_save_objects will try to update rather than insert them*.
The state can be reset to transient by calling orm.make_transient on each object, after which they can be saved by bulk_save_objects.
    from sqlalchemy import orm

    def truncate_mytable(self):
        with self.session.begin():
            current_records = self.session.query(MyTable).all()
            self.session.query(MyTable).delete()
            self.session.expunge_all()
            # Reset each detached object to the transient state so it can be re-inserted
            for record in current_records:
                orm.make_transient(record)
        return current_records
Alternatively, you could merge the objects back into the session before calling bulk_save_objects, but this might reduce the performance benefits that you want to obtain from the bulk operation.
* By default bulk_save_objects' update_changed_only argument is True, and since there are no changes in the objects' attribute histories, no updates are attempted. Setting it to False will emit UPDATE statements, but will result in a StaleDataError because the UPDATE matches no rows in the empty table.

Django "update_or_create" API: how to filter objects by created or updated?

So, I'm using the Django update_or_create API to build my form data. It works fine...but, once built, I need a way to check to see what profiles were actually updated or if they were created for the first time?
Just an example:
    for people in peoples:
        people, updated = People.objects.update_or_create(
            id=people.person_id,
            defaults={
                'first_name': people.first_name,
            }
        )
Filtering queryset:
    people = People.objects.filter(
        id__in=whatever,
    )
But now I'm trying to filter the queryset by created or updated, and I don't see an API for that (e.g., a filter of some sort).
So, I would like to do something like:
updated = Person.objects.filter(updated=True, created_for_first_time=False)
and then I can write something like
    if updated:
        do this
    else:
        do this
Basically, I just want to check if a profile was updated or created for the first time.
As you have shown in your question, the update_or_create method returns a tuple (obj, created), where obj is the object and created is a boolean showing whether a new object was created.
You could check the value of that boolean and build a list of the ids of the newly created objects:
    new_objects = []
    for people in peoples:
        obj, created = People.objects.update_or_create(...)
        if created:
            new_objects.append(obj.id)
You can then filter using that list:
new_people = People.objects.filter(id__in=new_objects)
existing_people = People.objects.exclude(id__in=new_objects)
When you call update_or_create:
person, created = People.objects.update_or_create(...)
the created return value will be True if the record was freshly created, or False if it was an existing record that got updated. If you need to act on this bit of information, it would be best to do it right here, while you have access to it.
If you need to do it later, the only way I can think of is to design a schema that supports it, i.e. have create_date and last_modify_date fields, and if those two fields are equal, you know the record has not been modified since it was created.
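
As a rough illustration of that schema idea, the sketch below uses the standard-library sqlite3 module rather than Django. The `person` table layout and the timestamp values are assumptions made up for the example; only the column names create_date and last_modify_date come from the suggestion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, first_name TEXT, "
    "create_date TEXT, last_modify_date TEXT)"
)

# On creation both timestamps are set to the same value.
conn.execute("INSERT INTO person VALUES (1, 'Ann', '2024-01-01', '2024-01-01')")
conn.execute("INSERT INTO person VALUES (2, 'Bob', '2024-01-01', '2024-01-01')")

# A later update touches last_modify_date only.
conn.execute(
    "UPDATE person SET first_name = ?, last_modify_date = ? WHERE id = ?",
    ("Anna", "2024-01-02", 1),
)

# Rows whose timestamps still match were never modified after creation.
never_modified = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM person WHERE create_date = last_modify_date ORDER BY id"
    )
]
modified = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM person WHERE create_date <> last_modify_date ORDER BY id"
    )
]
```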

Python peewee save() doesn't work as expected

I'm inserting/updating objects into a MySQL database using the peewee ORM for Python. I have a model like this:
    class Person(Model):
        person_id = CharField(primary_key=True)
        name = CharField()
I create the objects/rows with a loop, and each time through the loop have a dictionary like:
pd = {"name":"Alice","person_id":"A123456"}
Then I try creating an object and saving it.
po = Person()
    for key, value in pd.items():
        setattr(po, key, value)
    po.save()
This takes a while to execute, and runs without errors, but it doesn't save anything to the database -- no records are created.
This works:
Person.create(**pd)
But also throws an error (and terminates the script) when the primary key already exists. From reading the manual, I thought save() was the function I needed -- that peewee would perform the update or insert as required.
Not sure what I need to do here -- try getting each record first? Catch errors and try updating a record if it can't be created? I'm new to peewee, and would normally just write INSERT ... ON DUPLICATE KEY UPDATE or even REPLACE.
    po.save(force_insert=True)
It's documented: http://docs.peewee-orm.com/en/latest/peewee/models.html#non-integer-primary-keys-composite-keys-and-other-tricks
I've had a chance to re-test my answer, and I think it should be replaced. Here's the pattern I can now recommend; first, use get_or_create() on the model, which will create the database row if it doesn't exist. Then, if it is not created (object is retrieved from db instead), set all the attributes from the data dictionary and save the object.
    po, created = Person.get_or_create(person_id=pd["person_id"], defaults=pd)
    if not created:
        # The row already existed, so copy the new values onto it and save
        for key in pd:
            setattr(po, key, pd[key])
        po.save()
As before, I should mention that these are two distinct transactions, so this should not be used with multi-user databases requiring a true upsert in one transaction.
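
For comparison, a single-statement upsert of the kind the question mentions (INSERT ... ON DUPLICATE KEY UPDATE in MySQL) can be sketched with the standard-library sqlite3 module. This is SQLite's own ON CONFLICT syntax, which assumes SQLite 3.24 or newer; it is shown as raw SQL, not as a peewee API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id TEXT PRIMARY KEY, name TEXT)")

# Insert the row, or update its name if the primary key already exists.
UPSERT = (
    "INSERT INTO person (person_id, name) VALUES (:person_id, :name) "
    "ON CONFLICT(person_id) DO UPDATE SET name = excluded.name"
)

conn.execute(UPSERT, {"person_id": "A123456", "name": "Alice"})   # inserts
conn.execute(UPSERT, {"person_id": "A123456", "name": "Alicia"})  # updates

rows = conn.execute("SELECT person_id, name FROM person").fetchall()
```

Because the insert-or-update decision is made inside one statement, there is no gap between a lookup and a write for another client to slip into.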
I think you might try get_or_create()? http://peewee.readthedocs.org/en/latest/peewee/querying.html#get-or-create
You may do something like:
    po = Person()
    for key, value in pd.items():
        setattr(po, key, value)
    updated = po.save()
    if not updated:
        po.save(force_insert=True)

Using Django 1.4, How can I save TimeStampedModel objects without automatically updating the modified field?

I am in the process of running a Datamigration using Django and South. I have already added a new field to my model with a Schemamigration, and now I am in the process of populating the field for all the objects of that Model.
The problem is that when I call the save() method on my objects in the datamigration, it is automatically updating the modified field that is on each object and all the objects are ending up with the same modified date. I would like to be able to preserve the modified date from before the datamigration if possible.
Currently my datamigration looks like this:
    class Migration(DataMigration):
        def forwards(self, orm):
            for w in orm.Writer.objects.all():
                w.type = 'outside'
                if w.managed_by is not None:
                    if w.managed_by.writer is not None:
                        if w.id == w.managed_by.writer.id:
                            w.type = 'client'
                w.save()
Is there a way to only save the values in the type field, without updating the modified date?
You can update your object to only change the fields that you desire by using the update() method available for your model's queryset (see https://docs.djangoproject.com/en/1.4/topics/db/queries/#updating-multiple-objects-at-once for additional details).
Though the documentation references using this feature for multiple objects, you can filter the update query to only target a single object by restricting to the PK of the object you're working with:
orm.Writer.objects.filter(pk=w.pk).update(type='client')
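
The reason this helps is that a queryset update() is sent as a direct SQL UPDATE and does not go through save(), so only the columns named in the call are written. The effect can be sketched at the SQL level with the standard-library sqlite3 module; the `writer` table and its values below are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE writer (id INTEGER PRIMARY KEY, type TEXT, modified TEXT)")
conn.execute("INSERT INTO writer VALUES (1, 'outside', '2012-05-01')")

# Only the `type` column appears in SET, so `modified` is left as it was.
conn.execute("UPDATE writer SET type = ? WHERE id = ?", ("client", 1))

row = conn.execute("SELECT type, modified FROM writer WHERE id = 1").fetchone()
```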

Create Django model or update if exists

I want to create a model object, like Person, if the person's id does not exist; otherwise I want to get the existing person object.
The code to create a new person is as follows:
    # PersonManager is defined first so that Person can reference it
    class PersonManager(models.Manager):
        def create_person(self, identifier):
            person = self.create(identifier=identifier)
            return person

    class Person(models.Model):
        identifier = models.CharField(max_length=10)
        name = models.CharField(max_length=20)
        objects = PersonManager()
But I don't know where to check and get the existing person object.
It's unclear whether your question is asking for the get_or_create method (available from at least Django 1.3) or the update_or_create method (new in Django 1.7). It depends on how you want to update the user object.
Sample use is as follows:
# In both cases, the call will get a person object with matching
# identifier or create one if none exists; if a person is created,
# it will be created with name equal to the value in `name`.
# In this case, if the Person already exists, its existing name is preserved
person, created = Person.objects.get_or_create(
    identifier=identifier, defaults={"name": name}
)
# In this case, if the Person already exists, its name is updated
person, created = Person.objects.update_or_create(
    identifier=identifier, defaults={"name": name}
)
If you're looking for the "update if exists, else create" use case, please refer to @Zags's excellent answer.
Django already has a get_or_create, https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
For you it could be:
    identifier = 'some identifier'
    person, created = Person.objects.get_or_create(identifier=identifier)
    if created:
        # a new person was created
        pass
    else:
        # person refers to the existing one
        pass
Django has support for this; check get_or_create:
    person, created = Person.objects.get_or_create(name='abc')
    if created:
        # a new person object was created
        pass
    else:
        # the person object already existed
        pass
For a small number of objects update_or_create works well, but over a large collection it won't scale well: update_or_create always runs a SELECT first and an UPDATE afterwards.
    for the_bar in bars:
        updated_rows = SomeModel.objects.filter(bar=the_bar).update(foo=100)
        if not updated_rows:
            # if not exists, create new
            SomeModel.objects.create(bar=the_bar, foo=100)
At best this runs only the UPDATE query; the INSERT query runs only if the UPDATE matched zero rows. This can greatly improve performance if you expect most of the rows to already exist.
It all comes down to your use case though. If you are expecting mostly inserts, then perhaps the bulk_create() method could be an option.
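
The update-first pattern can be seen at the SQL level in the following sketch, which uses the standard-library sqlite3 module instead of the Django ORM. The `some_model` table and the `statements` counter are assumptions added for the illustration; `cur.rowcount` plays the role of the number returned by update().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (bar TEXT PRIMARY KEY, foo INTEGER)")
conn.execute("INSERT INTO some_model VALUES ('a', 1)")  # 'a' exists, 'b' does not

statements = 0  # counts the write statements issued by the loop
for the_bar in ["a", "b"]:
    cur = conn.execute("UPDATE some_model SET foo = 100 WHERE bar = ?", (the_bar,))
    statements += 1
    if cur.rowcount == 0:
        # The UPDATE matched nothing, so the row has to be created.
        conn.execute("INSERT INTO some_model VALUES (?, 100)", (the_bar,))
        statements += 1

rows = conn.execute("SELECT bar, foo FROM some_model ORDER BY bar").fetchall()
```

The existing row costs one statement and the missing row costs two, so the saving grows with the share of rows that already exist.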
Thought I'd add an answer since your question title looks like it is asking how to create or update, rather than get or create as described in the question body.
If you did want to create or update an object, the .save() method already has this behaviour by default, from the docs:
Django abstracts the need to use INSERT or UPDATE SQL statements.
Specifically, when you call save(), Django follows this algorithm:
If the object’s primary key attribute is set to a value that evaluates
to True (i.e., a value other than None or the empty string), Django
executes an UPDATE. If the object’s primary key attribute is not set
or if the UPDATE didn’t update anything, Django executes an INSERT.
It's worth noting that when they say 'if the UPDATE didn't update anything' they are essentially referring to the case where the id you gave the object doesn't already exist in the database.
You can also use update_or_create just like get_or_create. Here is the pattern I follow for update_or_create, assuming a model Person with id (key), name, age, and is_manager as attributes:
    update_values = {"is_manager": False}
    new_values = {"name": "Bob", "age": 25, "is_manager": True}
    obj, created = Person.objects.update_or_create(identifier='id',
                                                   defaults=update_values)
    if created:
        # Model instances have no update() method, so set the fields and save
        for field, value in new_values.items():
            setattr(obj, field, value)
        obj.save()
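If one of the inputs when you create is a primary key, this will be enough:
    Person.objects.get_or_create(id=1)
If a row with that primary key already exists, get_or_create returns it instead of creating a duplicate, since two rows with the same primary key are not allowed. Note that it does not update the existing row.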
This should be the answer you are looking for:
    EmployeeInfo.objects.update_or_create(
        # id or any primary key: value to search for
        identifier=your_id,
        # if found, update with the following; otherwise create
        defaults={'name': 'your_name'}
    )
