Is there a better way to deal with DoesNotExist query sets

There is probably a better way of dealing with non-existent query sets!
The problem I have with this code is that it raises an exception in the normal case, that is, when no workspace with the same name exists in the db.
Instead of handling an exception, I would like a query that returns True or False rather than raising DoesNotExist.
My inelegant code:
try:
    is_workspace_name = Workspace.objects.get(workspace_name=workspace_name, user=self.user.id)
except Workspace.DoesNotExist:
    # No workspace with this name yet, so the name is free to use
    return workspace_name
if is_workspace_name:
    raise forms.ValidationError(u'%s already exists as a workspace name! Please choose a different one!' % workspace_name)
Thanks a lot!

You can use the exists() method. Quoting the docs:
Returns True if the QuerySet contains any results, and False if not.
This tries to perform the query in the simplest and fastest way
possible, but it does execute nearly the same query as a normal
QuerySet query.
Note the phrase "the simplest and fastest way". It is cheaper to use exists() than count() because with exists() the database can stop at the first match.
if Workspace.objects.filter(workspace_name=workspace_name,
                            user=self.user.id).exists():
    raise forms.ValidationError(u'%s already exists ...!' % workspace_name)
else:
    return workspace_name
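To see the idea outside of Django, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table layout and the helper names make_db and workspace_exists are assumptions made up for this example. Django generates its own SQL for exists(), which may differ in detail.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical workspace table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE workspace (workspace_name TEXT, user_id INTEGER)")
    conn.execute("INSERT INTO workspace VALUES ('alpha', 1)")
    return conn


def workspace_exists(conn, workspace_name, user_id):
    """Return True if a matching row exists.

    EXISTS lets the database stop at the first match instead of
    counting or fetching every matching row.
    """
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM workspace"
        " WHERE workspace_name = ? AND user_id = ?)",
        (workspace_name, user_id),
    ).fetchone()
    return bool(row[0])
```

The function returns a plain boolean, so it can be used directly in an if statement without any exception handling.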

Checking for the existence of a record.
If you want to test for the existence of a record in your database, you could use Workspace.objects.filter(workspace_name = workspace_name,user = self.user.id).count().
This will return the number of records matching your conditions. This number will be 0 if there are none, which makes it readily usable in an if clause. I believe this to be the most standard and easy way to do what you need here.
## EDIT ## Actually that's false, you might want to check danihp's answer for a better solution using Queryset.exists!
A word of warning: the case of checking for existence before insertion
Be cautious when using such a construct however, especially if you plan on checking whether you have a duplicate before trying to insert a record. In such a case, the best solution is to try to create the record and see if it raises an exception.
Indeed, you could be in the following situation:
Request 1 reaches the server
Request 2 reaches the server
Check is done for request 1, no object exists.
Check is done for request 2, no object exists.
Proceed with creation in request 1.
Proceed with creation in request 2.
And... you have a duplicate. This is called a race condition, and it is a common issue when dealing with parallel code.
Long story short, you should use try, except and unique constraints when dealing with insertion.
Using get_or_create, as suggested by init3, also helps. Indeed, get_or_create is aware of this, and you'll be safe so long as unwanted duplicates would raise an IntegrityError.
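The "try to create and catch the error" approach can be sketched with the standard library's sqlite3 module. This is only an illustration under assumptions: the table, the UNIQUE constraint and the helper name create_workspace are made up for the example, and in Django the equivalent would rely on a unique constraint on the model plus catching IntegrityError.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical workspace table."""
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint is what actually prevents duplicates,
    # even when two requests race each other.
    conn.execute(
        "CREATE TABLE workspace ("
        " workspace_name TEXT NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " UNIQUE (workspace_name, user_id))"
    )
    return conn


def create_workspace(conn, workspace_name, user_id):
    """Try to insert; return True if created, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO workspace (workspace_name, user_id) VALUES (?, ?)",
                (workspace_name, user_id),
            )
    except sqlite3.IntegrityError:
        # The database rejected the duplicate; no prior check was needed.
        return False
    return True
```

Because the database enforces uniqueness, there is no window between "check" and "insert" in which a second request could slip in.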

obj, created = Workspace.objects.get_or_create(workspace_name=workspace_name, user=self.user.id)
if created:
    # everything ok
    # do something
    pass
else:
    # not ok
    # tell the user to choose a different name
    pass
Read more in the docs.

Related

Explicitly checking if an SQL INSERT operation succeeded unnecessary?

I'm using Python to talk to a Postgres DBMS using psycopg2.
Is it safe to assume that if an INSERT returns without raising an exception, then that INSERT actually did store something new in the database?
Right now, I've been checking the 'rowcount' attribute of the database cursor, and if it's 0 then that means the INSERT failed. However, I'm starting to think that this isn't necessary.

Is it safe to assume that if an INSERT returns without raising an
exception, then that INSERT actually did store something new in the
database?
No.
The affected record count will be zero if:
You ran an INSERT INTO ... SELECT ..., and the query returned no rows
You ran an INSERT INTO ... ON CONFLICT DO NOTHING, and it encountered a conflict
You have a BEFORE INSERT trigger on your table, and the trigger function returned NULL
You have a rule defined which results in no records being affected (e.g. ... DO INSTEAD NOTHING)
(... and possibly more, though nothing comes to mind.)
The common thread is that it will only affect zero records if you told it to, one way or another. Whether you want to treat any of these as a "failure" is highly dependent on your application logic.
Anything which is unequivocally a "failure" (constraint violation, serialisation failure, out of disk space...) should throw an error, so checking the record count is generally unnecessary.
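Two of the zero-row cases listed above can be reproduced with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for Postgres here, the table names are made up, and INSERT OR IGNORE is SQLite's spelling of "do nothing on conflict", so details under psycopg2 may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE staging (name TEXT)")  # deliberately left empty

# Case 1: INSERT ... SELECT where the SELECT yields no rows.
cur = conn.execute("INSERT INTO item (name) SELECT name FROM staging")
rows_from_empty_select = cur.rowcount

# Case 2: a normal insert, followed by a conflicting insert
# that has been told to do nothing on conflict.
first_insert = conn.execute("INSERT INTO item (name) VALUES ('a')").rowcount
ignored_insert = conn.execute(
    "INSERT OR IGNORE INTO item (name) VALUES ('a')"
).rowcount
```

None of these statements raises an exception, yet only first_insert reports an affected row, which matches the point that zero rows are affected only when the statement was written to allow it.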

By default, psycopg2's cursor.execute returns None for a successful insert:
cursor.execute - The method returns None. If a query was executed, the returned values can be retrieved using fetch*() methods.
http://initd.org/psycopg/docs/cursor.html
If you want to know something about the insert, an easy/efficient option is to use RETURNING (which takes the same options as a SELECT):
INSERT INTO ... RETURNING id

I found a similar question here: How to check if value is inserted successfully or not?
They seem to use the row count method to check whether the data was inserted correctly.

Better way to check if a object is present in Django ORM among Get/Filter?

Let's say I am trying to find out whether a user with primary key 20 exists. I can do this in two ways.
The first one:
try:
    user = User.objects.get(pk=20)
except User.DoesNotExist:
    handle_non_existent_user()
The other way could be:
users = User.objects.filter(pk=20)
if not users.exists():
    handle_non_existent_user()
Which method is better for checking existence?
This might be related to: What is the best way to check if data is present in django?
However, people favoured the first method there because the examples given did not keep a reference to the model queryset.
See also the answer to the following question: what is the right way to validate if an object exists in a django view without returning 404?
That answer is largely based on the fact that we do not get a reference to the object in question.

TLDR: When you are almost always sure that the object is in the db, it is better to use try:get. When there is a 50% chance that the object doesn't exist, it is better to use if:filter.exists.
It really depends on the code context. For example, there are cases when an if statement is better than try/except; see: Using try vs if in python
The same applies to your question; see: Difference between Django's filter() and get() methods. Underneath, the get method calls filter:
https://github.com/django/django/blob/stable/1.11.x/django/db/models/query.py#L366
def get(self, *args, **kwargs):
    """
    Performs the query and returns a single object matching the given
    keyword arguments.
    """
    clone = self.filter(*args, **kwargs)
    if self.query.can_filter() and not self.query.distinct_fields:
        clone = clone.order_by()
    num = len(clone)
    if num == 1:
        return clone._result_cache[0]
    if not num:
        raise self.model.DoesNotExist(
            "%s matching query does not exist." %
            self.model._meta.object_name
        )
    raise self.model.MultipleObjectsReturned(
        "get() returned more than one %s -- it returned %s!" %
        (self.model._meta.object_name, num)
    )
So when you use filter with exists, it runs almost the same code, because underneath exists does this:
def exists(self):
    if self._result_cache is None:
        return self.query.has_results(using=self.db)
    return bool(self._result_cache)
And as you can see filter.exists will execute less code and should work faster, but it doesn't return you an object.
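The relationship between the two calls can be mimicked on plain Python lists. This is a toy model only: the names filter_items, get_item and exists, and the two exception classes, are made up for illustration and are not Django's implementation.

```python
class DoesNotExist(Exception):
    """No item matched (toy analogue of Model.DoesNotExist)."""


class MultipleObjectsReturned(Exception):
    """More than one item matched."""


def _matches(item, lookups):
    """Return True if every lookup field equals the item's value."""
    return all(item.get(field) == value for field, value in lookups.items())


def filter_items(items, **lookups):
    """Return every dict in items that matches the given lookups."""
    return [item for item in items if _matches(item, lookups)]


def get_item(items, **lookups):
    """Return exactly one match, raising otherwise (built on filter_items)."""
    matches = filter_items(items, **lookups)
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise DoesNotExist("item matching query does not exist")
    raise MultipleObjectsReturned("get_item() returned %d items" % len(matches))


def exists(items, **lookups):
    """Stop at the first match; never builds the full list or an object."""
    return any(_matches(item, lookups) for item in items)
```

As in the answer above, the get-style helper does strictly more work than the exists-style one, but it is the only one that hands back the object.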

The first one is the best way in my opinion because if I ever forget to check whether the user exists or not, it will give me an error even if I don't use the try / except clause.
Also, get was made specifically to get ONE item only.

get_or_create: is it a 'get' or a 'create'?

AFAIK, peewee's Model.get_or_create() doesn't return a flag that indicates a creation, unlike django's get_or_create(). Is there a good way to check if an instance returned by get_or_create() is freshly created?
Thanks

There's a section in the docs that should hopefully be helpful: http://docs.peewee-orm.com/en/latest/peewee/querying.html#get-or-create
If the docs are lacking, please let me know and I'll be happy to improve them.

http://docs.peewee-orm.com/en/latest/peewee/api.html#Model.get_or_create
classmethod get_or_create(**kwargs)
Attempt to get the row matching the given filters. If no matching row is found, create a new row.
Parameters:
kwargs – Mapping of field-name to value.
defaults – Default values to use if creating a new row.
Returns:
Tuple of Model instance and boolean indicating if a new object was created.
It also warns you that race conditions are possible with this method, and even gives you an example without using the method:
try:
    person = Person.get(
        (Person.first_name == 'John') &
        (Person.last_name == 'Lennon'))
except Person.DoesNotExist:
    person = Person.create(
        first_name='John',
        last_name='Lennon',
        birthday=datetime.date(1940, 10, 9))

According to source code, no way to find out. Also, according to documentation, it is not recommended to use this method.
I suggest to use try/except/else clause.
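A try/except/else clause makes the "was it created?" flag explicit. The sketch below uses a hypothetical in-memory Registry class instead of a real peewee model, so every name in it is an assumption. Like the docs example, it is not safe against race conditions.

```python
class Registry:
    """Hypothetical in-memory stand-in for a model's table, keyed by name."""

    def __init__(self):
        self._rows = {}

    def get(self, name):
        return self._rows[name]  # raises KeyError when the row is missing

    def create(self, name, **fields):
        row = dict(fields, name=name)
        self._rows[name] = row
        return row


def get_or_create(registry, name, **defaults):
    """Return (row, created) using a try/except/else clause."""
    try:
        row = registry.get(name)
    except KeyError:
        # Lookup failed, so this call is the one that creates the row.
        return registry.create(name, **defaults), True
    else:
        # Lookup succeeded, so nothing was created.
        return row, False
```

The caller can then unpack the result as row, created = get_or_create(...) and branch on created.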

Proper use of exceptions in python/django orm

I'm trying to get an object: if it exists, fine; if not, try something else, and so on. Is it correct to do the following? I've heard that exceptions are expensive and should be avoided. Is that correct?
try:
    user = User.objects.get(user=id)
except ObjectDoesNotExist:
    try:
        user = User.objects.get(email=id)
    except ObjectDoesNotExist:
        try:
            # ...
            pass
        finally:
            # do the final thing
            pass

They are somewhat expensive, but certainly not too expensive to use when needed. I'd be more concerned about hammering the database with multiple queries. You can avoid both problems by getting the results for all possible fields back in one query.
from django.contrib.auth.models import User
from django.db.models import Q
user = User.objects.filter(Q(user=id) | Q(email=id) | [...])[0]
This relies on django's Q-objects, which allow you to create conditions joined together in more complex ways than the AND joins you usually get when building filters. If you aren't concerned about the possibility of getting multiple objects back, you can still use the get() method like you did in your example.
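The "one query with OR conditions" idea can be sketched in plain SQL through Python's sqlite3 module. The account table, its columns and the helper find_account are assumptions for this example. A Q-object filter would produce a comparable WHERE ... OR ... clause, though the exact SQL Django emits may differ.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO account (username, email) VALUES (?, ?)",
        [("john", "john@example.com"), ("paul", "paul@example.com")],
    )
    return conn


def find_account(conn, identifier):
    """Match the identifier against several columns in a single query.

    Returns the first matching row as a tuple, or None if nothing matches.
    """
    return conn.execute(
        "SELECT id, username, email FROM account"
        " WHERE username = ? OR email = ? LIMIT 1",
        (identifier, identifier),
    ).fetchone()
```

This replaces the chain of nested lookups with one round trip and needs no exception handling for the "not found" case.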

The cost of a try/except is explained here: cost-of-exception-handlers-in-python
I suggest catching things that should not happen or happen only rarely (real exceptions) with a try/except, and handling more common situations with conditions.
This applies especially to a case like Model.objects.get(), where the underlying SQL returns an empty result that would not raise an exception if the query were run as a filter:
users = User.objects.filter(user=id)[:1]
user = users and users[0]

.filter() vs .get() for single object? (Django)

I was having a debate on this with some colleagues. Is there a preferred way to retrieve an object in Django when you're expecting only one?
The two obvious ways are:
try:
    obj = MyModel.objects.get(id=1)
except MyModel.DoesNotExist:
    # We have no object! Do something...
    pass
And:
objs = MyModel.objects.filter(id=1)
if len(objs) == 1:
    obj = objs[0]
else:
    # We have no object! Do something...
    pass
The first method seems behaviorally more correct, but uses exceptions in control flow which may introduce some overhead. The second is more roundabout but won't ever raise an exception.
Any thoughts on which of these is preferable? Which is more efficient?

get() is provided specifically for this case. Use it.
Option 2 is almost precisely how the get() method is actually implemented in Django, so there should be no "performance" difference (and the fact that you're thinking about it indicates you're violating one of the cardinal rules of programming, namely trying to optimize code before it's even been written and profiled -- until you have the code and can run it, you don't know how it will perform, and trying to optimize before then is a path of pain).

You can install a module called django-annoying and then do this:
from annoying.functions import get_object_or_None
obj = get_object_or_None(MyModel, id=1)
if not obj:
    # the object was not found; handle the error here
    pass

Option 1 is correct. In Python an exception has equal overhead to a return. For a simplified proof you can look at this.
Option 2 is what Django is doing in the backend: get calls filter and raises an exception if no item is found or if more than one object is found.

I'm a bit late to the party, but with Django 1.6 there is the first() method on querysets.
https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.first
Returns the first object matched by the queryset, or None if there is no matching object. If the QuerySet has no ordering defined, then the queryset is automatically ordered by the primary key.
Example:
p = Article.objects.order_by('title', 'pub_date').first()
Note that first() is a convenience method, the following code sample is equivalent to the above example:
try:
    p = Article.objects.order_by('title', 'pub_date')[0]
except IndexError:
    p = None

Why do all that work? Replace 4 lines with 1 builtin shortcut. (This does its own try/except.)
from django.shortcuts import get_object_or_404
obj = get_object_or_404(MyModel, id=1)

I can't speak with any experience of Django but option #1 clearly tells the system that you are asking for 1 object, whereas the second option does not. This means that option #1 could more easily take advantage of cache or database indexes, especially where the attribute you're filtering on is not guaranteed to be unique.
Also (again, speculating) the second option may have to create some sort of results collection or iterator object since the filter() call could normally return many rows. You'd bypass this with get().
Finally, the first option is both shorter and omits the extra temporary variable - only a minor difference but every little helps.

Some more info about exceptions. If they are not raised, they cost almost nothing. Thus if you know you are probably going to have a result, use the exception, since using a conditional expression you pay the cost of checking every time, no matter what. On the other hand, they cost a bit more than a conditional expression when they are raised, so if you expect not to have a result with some frequency (say, 30% of the time, if memory serves), the conditional check turns out to be a bit cheaper.
But this is Django's ORM, and the round-trip to the database, or even a cached result, is likely to dominate the performance characteristics, so favor readability. In this case, since you expect exactly one result, use get().
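If you want to measure the trade-off yourself rather than rely on memory, a small timeit harness is enough. The sketch below uses a plain dict lookup as a stand-in for the database call, and the helper names are made up. Absolute numbers depend on your machine and Python version, so none are shown here.

```python
import timeit

data = {"present": 1}


def lookup_with_try(key):
    """Exception style: cheap when the key exists, dearer when it is raised."""
    try:
        return data[key]
    except KeyError:
        return None


def lookup_with_if(key):
    """Conditional style: pays for the membership check on every call."""
    if key in data:
        return data[key]
    return None


def measure(func, key, number=100_000):
    """Return the total seconds taken by `number` calls of func(key)."""
    return timeit.timeit(lambda: func(key), number=number)
```

Comparing measure(lookup_with_try, "present") with measure(lookup_with_try, "missing"), and doing the same for lookup_with_if, shows how the hit rate shifts the balance between the two styles.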

I've played with this problem a bit and discovered that option 2 executes two SQL queries, which is excessive for such a simple task. See my annotations:
objs = MyModel.objects.filter(id=1)  # This does not execute any SQL
if len(objs) == 1:  # This executes SELECT COUNT(*) FROM XXX WHERE filter
    obj = objs[0]  # This executes SELECT x, y, z, .. FROM XXX WHERE filter
else:
    # we have no object! do something
    pass
An equivalent version that executes a single query is:
items = [item for item in MyModel.objects.filter(id=1)]  # executes SELECT x, y, z FROM XXX WHERE filter
count = len(items)  # Does not execute any query, items is a standard list.
if count == 0:
    return None
return items[0]
By switching to this approach, I was able to substantially reduce the number of queries my application executes.
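Query counts like these are easy to check rather than assume. The sketch below counts statements with sqlite3's set_trace_callback for a hand-written "COUNT then SELECT" pattern versus a single SELECT. The table and helper names are hypothetical, and it says nothing about what Django's len() does. The get() source quoted earlier on this page reads from the result cache after calling len(), so it is worth verifying the annotations above against your own Django version.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical mymodel table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO mymodel (id, title) VALUES (1, 'first')")
    conn.commit()
    return conn


def count_then_fetch(conn, pk):
    """Two statements: a COUNT, then a SELECT for the row itself."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM mymodel WHERE id = ?", (pk,)
    ).fetchone()
    if count != 1:
        return None
    return conn.execute(
        "SELECT id, title FROM mymodel WHERE id = ?", (pk,)
    ).fetchone()


def fetch_once(conn, pk):
    """One statement: fetch the rows and inspect the resulting list."""
    rows = conn.execute(
        "SELECT id, title FROM mymodel WHERE id = ?", (pk,)
    ).fetchall()
    return rows[0] if len(rows) == 1 else None


def statements_run(conn, func, pk):
    """Count the SQL statements SQLite executes while func runs."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        func(conn, pk)
    finally:
        conn.set_trace_callback(None)
    return len(seen)
```

In a Django project, the analogous check would be to inspect the logged queries for each variant before deciding which one to keep.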

.get()
Returns the object matching the given lookup parameters, which should
be in the format described in Field lookups.
get() raises MultipleObjectsReturned if more than one object was
found. The MultipleObjectsReturned exception is an attribute of the
model class.
get() raises a DoesNotExist exception if an object wasn't found for
the given parameters. This exception is also an attribute of the model
class.
.filter()
Returns a new QuerySet containing objects that match the given lookup
parameters.
Note
use get() when you want to get a single unique object, and filter()
when you want to get all objects that match your lookup parameters.

Interesting question, but for me option #2 reeks of premature optimisation. I'm not sure which is more performant, but option #1 certainly looks and feels more pythonic to me.

I suggest a different design.
If you want to perform a function on a possible result, you could derive from QuerySet, like this: http://djangosnippets.org/snippets/734/
The result is pretty awesome, you could for example:
MyModel.objects.filter(id=1).yourFunction()
Here, filter returns either an empty queryset or a queryset with a single item. Your custom queryset functions are also chainable and reusable. If you want to perform it for all your entries: MyModel.objects.all().yourFunction().
They are also ideal to be used as actions in the admin interface:
def yourAction(self, request, queryset):
    queryset.yourFunction()

Option 1 is more elegant, but be sure to use try..except.
From my own experience I can tell you that sometimes you're sure there cannot possibly be more than one matching object in the database, and yet there will be two... (except of course when getting the object by its primary key).

Sorry to add one more take on this issue, but I am using the django paginator, and in my data admin app, the user is allowed to pick what to query on. Sometimes that is the id of a document, but otherwise it is a general query returning more than one object, i.e., a Queryset.
If the user queries the id, I can run:
Record.objects.get(pk=id)
which throws an error in django's paginator, because it is a Record and not a Queryset of Records.
I need to run:
Record.objects.filter(pk=id)
Which returns a Queryset with one item in it. Then the paginator works just fine.

".get()" can return one object:
{
"name": "John",
"age": "26",
"gender": "Male"
}
".filter()" can return a list (set) of zero or more objects:
[
{
"name": "John",
"age": "26",
"gender": "Male"
},
{
"name": "Tom",
"age": "18",
"gender": "Male"
},
{
"name": "Marry",
"age": "22",
"gender": "Female"
}
]
