Caching at QuerySet level in Django - python

I'm trying to get a queryset from the cache, but am unsure if this even has a point.
I have the following method (simplified) inside a custom queryset:
def queryset_from_cache(self, key: str=None, timeout: int=60):
# Generate a key based on the query.
if key is None:
key = self.__generate_key # ()
# If the cache has the key, return the cached object.
cached_object = cache.get(key, None)
# If the cache doesn't have the key, set the cache,
# and then return self (from DB) as cached_object
if cached_object is None:
cached_object = self
cache.set(key, cached_object , timeout=timeout)
return cached_object
The usage is basically to append it to a django QuerySet method, for example:
queryset = MyModel.objects.filter(id__range=[0,99]).queryset_from_cache()
My question:
Would usage like this work?
Or would it call MyModel.objects.filter(id__range=[0,99]) from the database no matter what?
Since normally caching would be done like this:
cached_object = cache.get(key, None)
if cached_object is None:
cached_object = MyModel.objects.filter(id__range=[0,99])
#Only now call the query
cache.set(key, cached_object , timeout=timeout)
And thus the queryset filter() method only gets called when the key is not present in the cache, as opposed to always calling it, and then trying to get it from the cache with the queryset_from_cache method.

This is a really cool idea, but I'm not sure if you can Cache full-on Objects.. I think it's only attributes
Now this having a point. Grom what I'm seeing from the limited code I've seen idk if it does have a point, unless filtering for Jane and John (and only them) is very common. Very narrow.
Maybe just try caching ALL the users or just individual Users, and only the attributes you need
Update
Yes! you are completetly correct, you can cache full on objects- how cool!
I don't think your example method of queryset = MyModel.objects.filter(id__range=[0,99]).queryset_from_cache() would work.
but you can do something similar by using Model Managers and do something like: queryset = MyModel.objects.queryset_from_cache(filterdict)
Models
Natually you can return just the qs, this is just for the example to show it actually is from the cache
from django.db import models
class MyModelManager(models.Manager):
def queryset_from_cache(self, filterdict):
from django.core.cache import cache
cachekey = 'MyModelCache'
qs = cache.get(cachekey)
if qs:
d = {
'in_cache': True,
'qs': qs
}
else:
qs = MyModel.objects.filter(**filterdict)
cache.set(cachekey, qs, 300) # 5 min cache
d = {
'in_cache': False,
'qs': qs
}
return d
class MyModel(models.Model):
name = models.CharField(max_length=200)
#
# other attributes
#
objects = MyModelManager()
Example Use
from app.models import MyModel
filterdict = {'pk__range':[0,99]}
r = MyModel.objects.queryset_from_cache(filterdict)
print(r['qs'])
While it's not exactly what you wanted, it might be close enough

Related

How to efficiently update nested objects in Django Serializer

I have been doing some research but couldn't find an effective yet simple way to update instances without making some repetitive code. Here is an example:
PartySerializer(Model.serializers)
def update(self, instance, validated_data):
instance.name = validated_data.get('name', instance.name)
instance.event_date = validated_data.get('event_date', instance.event_date)
instance.time_start = validated_data.get('time_start', instance.time_start)
instance.time_end = validated_data.get('time_end', instance.time_end)
instance.main_color = validated_data.get('main_color', instance.main_color)
instance.is_age_limited = validated_data.get('is_age_limited', instance.is_age_limited)
instance.is_open_bar = validated_data.get('is_open_bar', instance.is_open_bar)
instance.is_open_food = validated_data.get('is_open_food', instance.is_open_food)
instance.description = validated_data.get('description', instance.description)
instance.location = validated_data.get('location', instance.location)
instance.category = validated_data.get('category', instance.category)
instance.save()
Is there any cleaner and more efficient approach ?
Guys I finally found a better way of doing this. #Neeraj as i am using nested objects I need to specify an override for method update. So when I use super() I end up calling the same function I am in. #VJ Magar I am not doing a partial update in this case, all fields are present at least its key. My solution was this:
def update(self, instance, validated_data):
for key, obj in validated_data.items():
if not key == 'location':
setattr(instance, key, obj)
else:
location_serializer = LocationSerializer(instance.location, data=validated_data.get('location'))
if location_serializer.is_valid():
location_serializer.save()
instance.save()
return instance

Wagtail/Django: Query filter to return only the pages the user has acess/permissions?

I'm going through the documentation at: http://docs.wagtail.io/en/v2.7.1/reference/pages/queryset_reference.html.
Is there a filter to return only the pages the user has access to? I can only see public() and not_public().
I have some pages which privacy is set to Private (accessible to users in specific groups). And would like to exclude them from the query results.
There is no such filter in PageQuerySet. You can however create your own QuerySet that adds an authorized filter and use that. The following code comes from the Joyous events EventQuerySet and is based upon PageQuerySet.public_q and BaseViewRestriction.accept_request. It gets all the restrictions that could apply, excludes the ones that the user passes, and then filters out the pages with the remaining restrictions.
from wagtail.core.query import PageQuerySet
from wagtail.core.models import Page, PageManager, PageViewRestriction
class MyQuerySet(PageQuerySet):
def authorized_q(self, request):
PASSWORD = PageViewRestriction.PASSWORD
LOGIN = PageViewRestriction.LOGIN
GROUPS = PageViewRestriction.GROUPS
KEY = PageViewRestriction.passed_view_restrictions_session_key
restrictions = PageViewRestriction.objects.all()
passed = request.session.get(KEY, [])
if passed:
restrictions = restrictions.exclude(id__in=passed,
restriction_type=PASSWORD)
if request.user.is_authenticated:
restrictions = restrictions.exclude(restriction_type=LOGIN)
if request.user.is_superuser:
restrictions = restrictions.exclude(restriction_type=GROUPS)
else:
membership = request.user.groups.all()
if membership:
restrictions = restrictions.exclude(groups__in=membership,
restriction_type=GROUPS)
q = models.Q()
for restriction in restrictions:
q &= ~self.descendant_of_q(restriction.page, inclusive=True)
return q
def authorized(self, request):
self.request = request
if request is None:
return self
else:
return self.filter(self.authorized_q(request))
You could then set this to be your model's default QuerySet.
class MyPage(Page):
objects = PageManager.from_queryset(MyQuerySet)()
Then when filtering your MyPage objects you can say MyPage.objects.live().authorized(request).all()
Hope that is helpful. May contain bugs.

To lock or to catch IntegrityError? [duplicate]

I want to get an object from the database if it already exists (based on provided parameters) or create it if it does not.
Django's get_or_create (or source) does this. Is there an equivalent shortcut in SQLAlchemy?
I'm currently writing it out explicitly like this:
def get_or_create_instrument(session, serial_number):
instrument = session.query(Instrument).filter_by(serial_number=serial_number).first()
if instrument:
return instrument
else:
instrument = Instrument(serial_number)
session.add(instrument)
return instrument
Following the solution of #WoLpH, this is the code that worked for me (simple version):
def get_or_create(session, model, **kwargs):
instance = session.query(model).filter_by(**kwargs).first()
if instance:
return instance
else:
instance = model(**kwargs)
session.add(instance)
session.commit()
return instance
With this, I'm able to get_or_create any object of my model.
Suppose my model object is :
class Country(Base):
__tablename__ = 'countries'
id = Column(Integer, primary_key=True)
name = Column(String, unique=True)
To get or create my object I write :
myCountry = get_or_create(session, Country, name=countryName)
That's basically the way to do it, there is no shortcut readily available AFAIK.
You could generalize it ofcourse:
def get_or_create(session, model, defaults=None, **kwargs):
instance = session.query(model).filter_by(**kwargs).one_or_none()
if instance:
return instance, False
else:
params = {k: v for k, v in kwargs.items() if not isinstance(v, ClauseElement)}
params.update(defaults or {})
instance = model(**params)
try:
session.add(instance)
session.commit()
except Exception: # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
session.rollback()
instance = session.query(model).filter_by(**kwargs).one()
return instance, False
else:
return instance, True
2020 update (Python 3.9+ ONLY)
Here is a cleaner version with Python 3.9's the new dict union operator (|=)
def get_or_create(session, model, defaults=None, **kwargs):
instance = session.query(model).filter_by(**kwargs).one_or_none()
if instance:
return instance, False
else:
kwargs |= defaults or {}
instance = model(**kwargs)
try:
session.add(instance)
session.commit()
except Exception: # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
session.rollback()
instance = session.query(model).filter_by(**kwargs).one()
return instance, False
else:
return instance, True
Note:
Similar to the Django version this will catch duplicate key constraints and similar errors. If your get or create is not guaranteed to return a single result it can still result in race conditions.
To alleviate some of that issue you would need to add another one_or_none() style fetch right after the session.commit(). This still is no 100% guarantee against race conditions unless you also use a with_for_update() or serializable transaction mode.
I've been playing with this problem and have ended up with a fairly robust solution:
def get_one_or_create(session,
model,
create_method='',
create_method_kwargs=None,
**kwargs):
try:
return session.query(model).filter_by(**kwargs).one(), False
except NoResultFound:
kwargs.update(create_method_kwargs or {})
created = getattr(model, create_method, model)(**kwargs)
try:
session.add(created)
session.flush()
return created, True
except IntegrityError:
session.rollback()
return session.query(model).filter_by(**kwargs).one(), False
I just wrote a fairly expansive blog post on all the details, but a few quite ideas of why I used this.
It unpacks to a tuple that tells you if the object existed or not. This can often be useful in your workflow.
The function gives the ability to work with #classmethod decorated creator functions (and attributes specific to them).
The solution protects against Race Conditions when you have more than one process connected to the datastore.
EDIT: I've changed session.commit() to session.flush() as explained in this blog post. Note that these decisions are specific to the datastore used (Postgres in this case).
EDIT 2: I’ve updated using a {} as a default value in the function as this is typical Python gotcha. Thanks for the comment, Nigel! If your curious about this gotcha, check out this StackOverflow question and this blog post.
A modified version of erik's excellent answer
def get_one_or_create(session,
model,
create_method='',
create_method_kwargs=None,
**kwargs):
try:
return session.query(model).filter_by(**kwargs).one(), True
except NoResultFound:
kwargs.update(create_method_kwargs or {})
try:
with session.begin_nested():
created = getattr(model, create_method, model)(**kwargs)
session.add(created)
return created, False
except IntegrityError:
return session.query(model).filter_by(**kwargs).one(), True
Use a nested transaction to only roll back the addition of the new item instead of rolling back everything (See this answer to use nested transactions with SQLite)
Move create_method. If the created object has relations and it is assigned members through those relations, it is automatically added to the session. E.g. create a book, which has user_id and user as corresponding relationship, then doing book.user=<user object> inside of create_method will add book to the session. This means that create_method must be inside with to benefit from an eventual rollback. Note that begin_nested automatically triggers a flush.
Note that if using MySQL, the transaction isolation level must be set to READ COMMITTED rather than REPEATABLE READ for this to work. Django's get_or_create (and here) uses the same stratagem, see also the Django documentation.
This SQLALchemy recipe does the job nice and elegant.
The first thing to do is to define a function that is given a Session to work with, and associates a dictionary with the Session() which keeps track of current unique keys.
def _unique(session, cls, hashfunc, queryfunc, constructor, arg, kw):
cache = getattr(session, '_unique_cache', None)
if cache is None:
session._unique_cache = cache = {}
key = (cls, hashfunc(*arg, **kw))
if key in cache:
return cache[key]
else:
with session.no_autoflush:
q = session.query(cls)
q = queryfunc(q, *arg, **kw)
obj = q.first()
if not obj:
obj = constructor(*arg, **kw)
session.add(obj)
cache[key] = obj
return obj
An example of utilizing this function would be in a mixin:
class UniqueMixin(object):
#classmethod
def unique_hash(cls, *arg, **kw):
raise NotImplementedError()
#classmethod
def unique_filter(cls, query, *arg, **kw):
raise NotImplementedError()
#classmethod
def as_unique(cls, session, *arg, **kw):
return _unique(
session,
cls,
cls.unique_hash,
cls.unique_filter,
cls,
arg, kw
)
And finally creating the unique get_or_create model:
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('sqlite://', echo=True)
Session = sessionmaker(bind=engine)
class Widget(UniqueMixin, Base):
__tablename__ = 'widget'
id = Column(Integer, primary_key=True)
name = Column(String, unique=True, nullable=False)
#classmethod
def unique_hash(cls, name):
return name
#classmethod
def unique_filter(cls, query, name):
return query.filter(Widget.name == name)
Base.metadata.create_all(engine)
session = Session()
w1, w2, w3 = Widget.as_unique(session, name='w1'), \
Widget.as_unique(session, name='w2'), \
Widget.as_unique(session, name='w3')
w1b = Widget.as_unique(session, name='w1')
assert w1 is w1b
assert w2 is not w3
assert w2 is not w1
session.commit()
The recipe goes deeper into the idea and provides different approaches but I've used this one with great success.
The closest semantically is probably:
def get_or_create(model, **kwargs):
"""SqlAlchemy implementation of Django's get_or_create.
"""
session = Session()
instance = session.query(model).filter_by(**kwargs).first()
if instance:
return instance, False
else:
instance = model(**kwargs)
session.add(instance)
session.commit()
return instance, True
not sure how kosher it is to rely on a globally defined Session in sqlalchemy, but the Django version doesn't take a connection so...
The tuple returned contains the instance and a boolean indicating if the instance was created (i.e. it's False if we read the instance from the db).
Django's get_or_create is often used to make sure that global data is available, so I'm committing at the earliest point possible.
I slightly simplified #Kevin. solution to avoid wrapping the whole function in an if/else statement. This way there's only one return, which I find cleaner:
def get_or_create(session, model, **kwargs):
instance = session.query(model).filter_by(**kwargs).first()
if not instance:
instance = model(**kwargs)
session.add(instance)
return instance
There is a Python package that has #erik's solution as well as a version of update_or_create(). https://github.com/enricobarzetti/sqlalchemy_get_or_create
Depending on the isolation level you adopted, none of the above solutions would work.
The best solution I have found is a RAW SQL in the following form:
INSERT INTO table(f1, f2, unique_f3)
SELECT 'v1', 'v2', 'v3'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE f3 = 'v3')
This is transactionally safe whatever the isolation level and the degree of parallelism are.
Beware: in order to make it efficient, it would be wise to have an INDEX for the unique column.
One problem I regularly encounter is when a field has a max length (say, STRING(40)) and you'd like to perform a get or create with a string of large length, the above solutions will fail.
Building off of the above solutions, here's my approach:
from sqlalchemy import Column, String
def get_or_create(self, add=True, flush=True, commit=False, **kwargs):
"""
Get the an entity based on the kwargs or create an entity with those kwargs.
Params:
add: (default True) should the instance be added to the session?
flush: (default True) flush the instance to the session?
commit: (default False) commit the session?
kwargs: key, value pairs of parameters to lookup/create.
Ex: SocialPlatform.get_or_create(**{'name':'facebook'})
returns --> existing record or, will create a new record
---------
NOTE: I like to add this as a classmethod in the base class of my tables, so that
all data models inherit the base class --> functionality is transmitted across
all orm defined models.
"""
# Truncate values if necessary
for key, value in kwargs.items():
# Only use strings
if not isinstance(value, str):
continue
# Only use if it's a column
my_col = getattr(self.__table__.columns, key)
if not isinstance(my_col, Column):
continue
# Skip non strings again here
if not isinstance(my_col.type, String):
continue
# Get the max length
max_len = my_col.type.length
if value and max_len and len(value) > max_len:
# Update the value
value = value[:max_len]
kwargs[key] = value
# -------------------------------------------------
# Make the query...
instance = session.query(self).filter_by(**kwargs).first()
if instance:
return instance
else:
# Max length isn't accounted for here.
# The assumption is that auto-truncation will happen on the child-model
# Or directtly in the db
instance = self(**kwargs)
# You'll usually want to add to the session
if add:
session.add(instance)
# Navigate these with caution
if add and commit:
try:
session.commit()
except IntegrityError:
session.rollback()
elif add and flush:
session.flush()
return instance

Fastest Way to Create a New Object Only if it Doesn't Already Exist (SQLAlchemy) [duplicate]

I want to get an object from the database if it already exists (based on provided parameters) or create it if it does not.
Django's get_or_create (or source) does this. Is there an equivalent shortcut in SQLAlchemy?
I'm currently writing it out explicitly like this:
def get_or_create_instrument(session, serial_number):
instrument = session.query(Instrument).filter_by(serial_number=serial_number).first()
if instrument:
return instrument
else:
instrument = Instrument(serial_number)
session.add(instrument)
return instrument
Following the solution of #WoLpH, this is the code that worked for me (simple version):
def get_or_create(session, model, **kwargs):
instance = session.query(model).filter_by(**kwargs).first()
if instance:
return instance
else:
instance = model(**kwargs)
session.add(instance)
session.commit()
return instance
With this, I'm able to get_or_create any object of my model.
Suppose my model object is :
class Country(Base):
__tablename__ = 'countries'
id = Column(Integer, primary_key=True)
name = Column(String, unique=True)
To get or create my object I write :
myCountry = get_or_create(session, Country, name=countryName)
That's basically the way to do it, there is no shortcut readily available AFAIK.
You could generalize it ofcourse:
def get_or_create(session, model, defaults=None, **kwargs):
instance = session.query(model).filter_by(**kwargs).one_or_none()
if instance:
return instance, False
else:
params = {k: v for k, v in kwargs.items() if not isinstance(v, ClauseElement)}
params.update(defaults or {})
instance = model(**params)
try:
session.add(instance)
session.commit()
except Exception: # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
session.rollback()
instance = session.query(model).filter_by(**kwargs).one()
return instance, False
else:
return instance, True
2020 update (Python 3.9+ ONLY)
Here is a cleaner version with Python 3.9's the new dict union operator (|=)
def get_or_create(session, model, defaults=None, **kwargs):
instance = session.query(model).filter_by(**kwargs).one_or_none()
if instance:
return instance, False
else:
kwargs |= defaults or {}
instance = model(**kwargs)
try:
session.add(instance)
session.commit()
except Exception: # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
session.rollback()
instance = session.query(model).filter_by(**kwargs).one()
return instance, False
else:
return instance, True
Note:
Similar to the Django version this will catch duplicate key constraints and similar errors. If your get or create is not guaranteed to return a single result it can still result in race conditions.
To alleviate some of that issue you would need to add another one_or_none() style fetch right after the session.commit(). This still is no 100% guarantee against race conditions unless you also use a with_for_update() or serializable transaction mode.
I've been playing with this problem and have ended up with a fairly robust solution:
def get_one_or_create(session,
model,
create_method='',
create_method_kwargs=None,
**kwargs):
try:
return session.query(model).filter_by(**kwargs).one(), False
except NoResultFound:
kwargs.update(create_method_kwargs or {})
created = getattr(model, create_method, model)(**kwargs)
try:
session.add(created)
session.flush()
return created, True
except IntegrityError:
session.rollback()
return session.query(model).filter_by(**kwargs).one(), False
I just wrote a fairly expansive blog post on all the details, but a few quite ideas of why I used this.
It unpacks to a tuple that tells you if the object existed or not. This can often be useful in your workflow.
The function gives the ability to work with #classmethod decorated creator functions (and attributes specific to them).
The solution protects against Race Conditions when you have more than one process connected to the datastore.
EDIT: I've changed session.commit() to session.flush() as explained in this blog post. Note that these decisions are specific to the datastore used (Postgres in this case).
EDIT 2: I’ve updated using a {} as a default value in the function as this is typical Python gotcha. Thanks for the comment, Nigel! If your curious about this gotcha, check out this StackOverflow question and this blog post.
A modified version of erik's excellent answer
def get_one_or_create(session,
model,
create_method='',
create_method_kwargs=None,
**kwargs):
try:
return session.query(model).filter_by(**kwargs).one(), True
except NoResultFound:
kwargs.update(create_method_kwargs or {})
try:
with session.begin_nested():
created = getattr(model, create_method, model)(**kwargs)
session.add(created)
return created, False
except IntegrityError:
return session.query(model).filter_by(**kwargs).one(), True
Use a nested transaction to only roll back the addition of the new item instead of rolling back everything (See this answer to use nested transactions with SQLite)
Move create_method. If the created object has relations and it is assigned members through those relations, it is automatically added to the session. E.g. create a book, which has user_id and user as corresponding relationship, then doing book.user=<user object> inside of create_method will add book to the session. This means that create_method must be inside with to benefit from an eventual rollback. Note that begin_nested automatically triggers a flush.
Note that if using MySQL, the transaction isolation level must be set to READ COMMITTED rather than REPEATABLE READ for this to work. Django's get_or_create (and here) uses the same stratagem, see also the Django documentation.
This SQLALchemy recipe does the job nice and elegant.
The first thing to do is to define a function that is given a Session to work with, and associates a dictionary with the Session() which keeps track of current unique keys.
def _unique(session, cls, hashfunc, queryfunc, constructor, arg, kw):
cache = getattr(session, '_unique_cache', None)
if cache is None:
session._unique_cache = cache = {}
key = (cls, hashfunc(*arg, **kw))
if key in cache:
return cache[key]
else:
with session.no_autoflush:
q = session.query(cls)
q = queryfunc(q, *arg, **kw)
obj = q.first()
if not obj:
obj = constructor(*arg, **kw)
session.add(obj)
cache[key] = obj
return obj
An example of utilizing this function would be in a mixin:
class UniqueMixin(object):
#classmethod
def unique_hash(cls, *arg, **kw):
raise NotImplementedError()
#classmethod
def unique_filter(cls, query, *arg, **kw):
raise NotImplementedError()
#classmethod
def as_unique(cls, session, *arg, **kw):
return _unique(
session,
cls,
cls.unique_hash,
cls.unique_filter,
cls,
arg, kw
)
And finally creating the unique get_or_create model:
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('sqlite://', echo=True)
Session = sessionmaker(bind=engine)
class Widget(UniqueMixin, Base):
__tablename__ = 'widget'
id = Column(Integer, primary_key=True)
name = Column(String, unique=True, nullable=False)
#classmethod
def unique_hash(cls, name):
return name
#classmethod
def unique_filter(cls, query, name):
return query.filter(Widget.name == name)
Base.metadata.create_all(engine)
session = Session()
w1, w2, w3 = Widget.as_unique(session, name='w1'), \
Widget.as_unique(session, name='w2'), \
Widget.as_unique(session, name='w3')
w1b = Widget.as_unique(session, name='w1')
assert w1 is w1b
assert w2 is not w3
assert w2 is not w1
session.commit()
The recipe goes deeper into the idea and provides different approaches but I've used this one with great success.
The closest semantically is probably:
def get_or_create(model, **kwargs):
"""SqlAlchemy implementation of Django's get_or_create.
"""
session = Session()
instance = session.query(model).filter_by(**kwargs).first()
if instance:
return instance, False
else:
instance = model(**kwargs)
session.add(instance)
session.commit()
return instance, True
not sure how kosher it is to rely on a globally defined Session in sqlalchemy, but the Django version doesn't take a connection so...
The tuple returned contains the instance and a boolean indicating if the instance was created (i.e. it's False if we read the instance from the db).
Django's get_or_create is often used to make sure that global data is available, so I'm committing at the earliest point possible.
I slightly simplified #Kevin. solution to avoid wrapping the whole function in an if/else statement. This way there's only one return, which I find cleaner:
def get_or_create(session, model, **kwargs):
instance = session.query(model).filter_by(**kwargs).first()
if not instance:
instance = model(**kwargs)
session.add(instance)
return instance
There is a Python package that has #erik's solution as well as a version of update_or_create(). https://github.com/enricobarzetti/sqlalchemy_get_or_create
Depending on the isolation level you adopted, none of the above solutions would work.
The best solution I have found is a RAW SQL in the following form:
INSERT INTO table(f1, f2, unique_f3)
SELECT 'v1', 'v2', 'v3'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE f3 = 'v3')
This is transactionally safe whatever the isolation level and the degree of parallelism are.
Beware: in order to make it efficient, it would be wise to have an INDEX for the unique column.
One problem I regularly encounter is when a field has a max length (say, STRING(40)) and you'd like to perform a get or create with a string of large length, the above solutions will fail.
Building off of the above solutions, here's my approach:
from sqlalchemy import Column, String
def get_or_create(self, add=True, flush=True, commit=False, **kwargs):
"""
Get the an entity based on the kwargs or create an entity with those kwargs.
Params:
add: (default True) should the instance be added to the session?
flush: (default True) flush the instance to the session?
commit: (default False) commit the session?
kwargs: key, value pairs of parameters to lookup/create.
Ex: SocialPlatform.get_or_create(**{'name':'facebook'})
returns --> existing record or, will create a new record
---------
NOTE: I like to add this as a classmethod in the base class of my tables, so that
all data models inherit the base class --> functionality is transmitted across
all orm defined models.
"""
# Truncate values if necessary
for key, value in kwargs.items():
# Only use strings
if not isinstance(value, str):
continue
# Only use if it's a column
my_col = getattr(self.__table__.columns, key)
if not isinstance(my_col, Column):
continue
# Skip non strings again here
if not isinstance(my_col.type, String):
continue
# Get the max length
max_len = my_col.type.length
if value and max_len and len(value) > max_len:
# Update the value
value = value[:max_len]
kwargs[key] = value
# -------------------------------------------------
# Make the query...
instance = session.query(self).filter_by(**kwargs).first()
if instance:
return instance
else:
# Max length isn't accounted for here.
# The assumption is that auto-truncation will happen on the child-model
# Or directtly in the db
instance = self(**kwargs)
# You'll usually want to add to the session
if add:
session.add(instance)
# Navigate these with caution
if add and commit:
try:
session.commit()
except IntegrityError:
session.rollback()
elif add and flush:
session.flush()
return instance

Django: Saving old QuerySet for future comparison

I'm new with django and I'm trying to make a unit test where I want to compare a QuerySet before and after a batch editing function call.
def test_batchEditing_9(self):
reset() #reset database for test
query = Game.objects.all()
query_old = Game.objects.all()
dict_value = {'game_code' : '001'}
Utility.batchEditing(Game, query, dict_value)
query_new = Game.objects.all()
self.assertTrue(compareQuerySet(query_old, query_new))
My problem is that query_old will be updated after batchEditing is called. Therefor, both querysets will be the same.
It seem that QuerySet is bound to the current state of the database.
Is this normal?
Is there a way to unbind a QuerySet from the database?
I have tried queryset.values, list(queryset) but it still updates the value.
I'm actually thinking about iterating on the queryset and creating a list of dictionaries by myself, but I want to know if there is an easier way.
Here is batchEditing (didn't paste input validity check)
def batchEditing(model, query, values):
for item in query:
if isinstance(item, model):
for field, val in values.iteritems():
if val is not None:
setattr(item, field, val)
item.save()
Here is compareQuerySet
def compareQuerySet(object1, object2):
list_val1 = object1.values_list()
list_val2 = object2.values_list()
for i in range(len(list_val1)):
if list_val1[i] != list_val2[i]:
return False
return True
A Queryset is essentially just generating SQL and only on evaluating it, the database is hit. As far as I remember, that happens on iterating over the Queryset. For instance,
gamescache = list(Game.objects.all())
or
for g in Game.objects.all():
...
hit the database.
Following code should work:
def test_batchEditing_9(self):
reset() #reset database for test
query = Game.objects.all()
query_old = set(query)
dict_value = {'game_code' : '001'}
Utility.batchEditing(Game, query, dict_value)
query_new = set(query)
self.assertEqual(query_old, query_new)
This is because Game.objects.all() is not hitting database, but just creates object that stores query parameters.
BTW. If you will use order_by in query and order is important you can use list rather than set.

Categories