get_or_create in Peewee - python

The paragraph titled Get or create on the peewee documentation says:
While peewee has a get_or_create() method, this should really not be
used outside of tests as it is vulnerable to a race condition. The
proper way to perform a get or create with peewee is to rely on the
database to enforce a constraint.
And then it goes on with an example that only shows the create part, not the get part.
What is the best way to perform a get or create with peewee?

Everything you are doing inside a transaction is atomic.
So as long as you are calling get_or_create() inside a transaction, that paragraph is wrong.
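The approach the quoted documentation paragraph points to, letting a database constraint decide who wins, can be sketched as "try to create, and fall back to get if the constraint fires". The sketch below uses the standard-library sqlite3 module rather than peewee itself, so it runs without extra dependencies; the users table and the get_or_create_user helper are hypothetical names chosen for illustration, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what makes the pattern safe: the database,
# not the application, rejects the second insert of the same username.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL)"
)


def get_or_create_user(conn, username):
    """Return (user_id, created). Hypothetical helper for illustration."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "INSERT INTO users (username) VALUES (?)", (username,)
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The row already exists (possibly created by a concurrent writer),
        # so fetch it instead.
        row = conn.execute(
            "SELECT id FROM users WHERE username = ?", (username,)
        ).fetchone()
        return row[0], False


first = get_or_create_user(conn, "alice")
second = get_or_create_user(conn, "alice")
```

With peewee the same shape would presumably be a create() call wrapped in a transaction, with the get() placed in the handler for the integrity error; check the peewee documentation for the exact exception class and transaction helper in your version.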

Related

When to use force_update in Django's save()?

Django official docs state that:
In some rare circumstances, it’s necessary to be able to force the save() method to perform an SQL INSERT and not fall back to doing an UPDATE. Or vice-versa: update, if possible, but not insert a new row. In these cases you can pass the force_insert=True or force_update=True parameters to the save() method. Obviously, passing both parameters is an error: you cannot both insert and update at the same time!
What are these rare circumstances?
Basically, when should one use force_update=True?

SQLAlchemy how LazyLoading works

Hi, I would like to understand how SQLAlchemy lazy loading works. Assuming I have this query:
results = (
    session.query(Parent).
    options(lazyload(Parent.children)).
    filter(Parent.id == 1).
    all()
)
for parent in results:
    logging.error(parent.children)
I want to know whether accessing parent.children in the for loop will create a new select statement, or whether the parent.children records are already cached somehow. I'm thinking about how this will affect performance. I just want the most optimized way.
1. Should I use lazy loading?
2. Will accessing per item on the loop create a new sqlalchemy
3. How do I find out if a query is being run by sqlalchemy? (I just want to find out if accessing each entry will create a select statement.)
1. Maybe.
2. Do you mean issue a new query? The answer is yes, that's the point of lazyload(). The relationship collection attribute is populated when first accessed, lazily. If on the other hand you'd wish to avoid the possible N+1 situation, you could for example use joinedload() instead in order to populate children in the same query.
3. Use logging. Pass echo=True in your engine configuration.
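To make the difference visible, here is a minimal sketch that counts the SELECT statements issued under lazyload() versus joinedload(). It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Parent/Child models, the recording listener, and the count_selects helper are made up for illustration and are not from the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import (
    Session, declarative_base, joinedload, lazyload, relationship,
)

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


# Passing echo=True here would also log every statement, as the answer suggests.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Parent(children=[Child(), Child()]), Parent(children=[Child()])])
    session.commit()

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record(conn, cursor, statement, parameters, context, executemany):
    # Collect every SQL string sent to the database.
    statements.append(statement)


def count_selects(loader):
    """Load all parents with the given loader option and touch .children."""
    with Session(engine) as session:
        statements.clear()
        parents = session.query(Parent).options(loader(Parent.children)).all()
        for parent in parents:
            len(parent.children)  # triggers a lazy load if not already populated
        return sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))


lazy_count = count_selects(lazyload)
joined_count = count_selects(joinedload)
```

In this setup lazy_count grows with the number of parents (one query for the parents plus one per parent whose children are accessed), while joined_count stays at a single query.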

Insert statement created by django ORM at bulk_create

I am kind of new to python and django.
I am using bulk_create to insert a lot of rows, and as a former DBA I would very much like to see which insert statements are being executed. I know that for queries you can use .query, but for insert statements I can't find a command.
Is there something I'm missing or is there no easy way to see it? (A regular print is fine by me.)
The easiest way is to set DEBUG = True and check connection.queries after executing the query. This stores the raw queries and the time each query takes.
from django.db import connection
MyModel.objects.bulk_create(...)
print(connection.queries[-1]['sql'])
There's more information in the docs.
A great tool to make this information easily accessible is the django-debug-toolbar.

django select_for_update and select_related on same query?

Does anyone know if you can do both the .select_for_update() and .select_related() statements in a single query? Such as:
employee = get_object_or_404(
    Employee.objects.select_for_update().select_related('company'), pk=3)
It seemed to work fine in one place in my code, but a second usage threw an "InternalError: current transaction is aborted" for a series of unit tests. Removing the .select_related and leaving just .select_for_update made the error go away, but I don't know why. I'd like to use them both to optimize my code, but if forced to choose, I'll pick select_for_update. Wondering if there's a way I can use both. Using postgres and django 1.9. Thanks!
Since Django 2.0, you can use select_for_update together with select_related even on nullable relations by using the new of=... parameter.
Using the Person example from the docs, you could do
Person.objects.select_related('hometown').select_for_update(of=('self',))
which would lock only the Person object.
You can't use select_related with foreign keys that are nullable when you are using select_for_update on the same queryset.
This will work in all cases, because select_related() with no arguments follows only non-null foreign keys:
Book.objects.select_related().select_for_update().get(name='Doors of perception')

Django models - assign id instead of object

I apologize if my question turns out to be silly, but I'm rather new to Django, and I could not find an answer anywhere.
I have the following model:
class BlackListEntry(models.Model):
    user_banned = models.ForeignKey(auth.models.User, related_name="user_banned")
    user_banning = models.ForeignKey(auth.models.User, related_name="user_banning")
Now, when I try to create an object like this:
BlackListEntry.objects.create(user_banned=int(user_id),user_banning=int(banning_id))
I get the following error:
Cannot assign "1": "BlackListEntry.user_banned" must be a "User" instance.
Of course, if I replace it with something like this:
user_banned = User.objects.get(pk=user_id)
user_banning = User.objects.get(pk=banning_id)
BlackListEntry.objects.create(user_banned=user_banned,user_banning=user_banning)
everything works fine. The question is:
Does my solution hit the database to retrieve both users, and if yes, is it possible to avoid it, just passing ids?
The answer to your question is: YES.
Django will hit the database (at least) 3 times: twice to retrieve the two User objects and a third time to save the new entry. This causes absolutely unnecessary overhead.
Just try:
BlackListEntry.objects.create(user_banned_id=int(user_id),user_banning_id=int(banning_id))
This is the default naming pattern for the FK fields generated by the Django ORM. This way you can set the ids directly and avoid the extra queries.
If you wanted to query for the already saved BlackListEntry objects, you can navigate the attributes with a double underscore, like this:
BlackListEntry.objects.filter(user_banned__id=int(user_id),user_banning__id=int(banning_id))
This is how you access properties of related objects in Django querysets: with a double underscore. You can then compare against the value of the attribute.
Though they look very similar, the two work completely differently. The first sets an attribute directly, while the second is parsed by Django, which splits it at the '__' and builds the appropriate query; the part after the double underscore is the name of an attribute on the related model.
You can always compare user_banned and user_banning with the actual User objects, instead of their ids. But there is no use for this if you don't already have those objects with you.
Hope it helps.
I do believe that when you fetch the users, it is going to hit the db...
To avoid it, you would have to write the raw SQL to do the insert using the method described here:
https://docs.djangoproject.com/en/dev/topics/db/sql/
If you decide to go that route keep in mind you are responsible for protecting yourself from sql injection attacks.
Another alternative would be to cache the user_banned and user_banning objects.
But in all likelihood, simply grabbing the users and creating the BlackListEntry won't cause you any noticeable performance problems. Caching or executing raw sql will only provide a small benefit. You're probably going to run into other issues before this becomes a problem.
