I am trying to execute a periodic task, so I am using Celery with Django 1.8, Django REST Framework, and Postgres as the database. When I try to send my object to the task I get TypeError: foreign_model_obj is not JSON serializable. How can I pass my queryset object to my task?
views.py :
class MyModelCreateApiView(generics.CreateAPIView):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer
    authentication_classes = (TokenAuthentication,)

    def create(self, request, *args, **kwargs):
        data = dict()
        data['foreign_model_id'] = kwargs['pk']
        foreign_model_obj = MyForeignModel.objects.get(id=data['foreign_model_id'])
        obj = MyModel.objects.create(**data)
        result = serialize_query(MyModel, {"id": obj.id})
        local_time = foreign_model_obj.time
        my_celery_task.apply_async([foreign_model_obj], eta=local_time)
        return Response(result)
tasks.py :
@celery_app.task(name="my_celery_task")
def my_celery_task(mymodel_obj):
    # ... updating obj attributes
    mymodel_obj.save()
You just have to send the id of your instance and retrieve the object within the task.
It's bad practice to pass the instance itself, since it can be altered in the meantime, especially since you are executing your task with a delay, as seems to be the case here.
views.py :
class MyModelCreateApiView(generics.CreateAPIView):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer
    authentication_classes = (TokenAuthentication,)

    def create(self, request, *args, **kwargs):
        data = dict()
        data['foreign_model_id'] = kwargs['pk']
        foreign_model_obj = MyForeignModel.objects.get(id=data['foreign_model_id'])
        obj = MyModel.objects.create(**data)
        result = serialize_query(MyModel, {"id": obj.id})
        local_time = foreign_model_obj.time
        my_celery_task.apply_async([foreign_model_obj.id], eta=local_time)  # send only the obj id
        return Response(result)
tasks.py :
@celery_app.task(name="my_celery_task")
def my_celery_task(mymodel_obj_id):
    my_model_obj = MyModel.objects.get(id=mymodel_obj_id)  # retrieve your object here
    # ... updating obj attributes
    my_model_obj.save()
Actually, IMHO the best way to go is to get a picklable component of the queryset, then regenerate the queryset in the task (https://docs.djangoproject.com/en/1.9/ref/models/querysets/):
import pickle

qs = MyModel.objects.filter(a__in=[1, 2, 3])  # whatever queryset you want here...
querystr = pickle.dumps(qs.query)  # pickle the queryset's underlying query
my_celery_task.apply_async([querystr], eta=local_time)  # send only the pickled string...
The task:
@celery_app.task(name="my_celery_task")
def my_celery_task(querystr):
    my_model_objs = MyModel.objects.all()
    my_model_objs.query = pickle.loads(querystr)  # restore the queryset
    # ... updating obj attributes
    item = my_model_objs[0]
This is, I think, the best approach: the query gets executed (perhaps for the first time) in the task, which prevents various timing issues, and it need not be executed in the caller, so there is no doubling up on the query.
You could change the serialization method to pickle, but it is not recommended to pass a queryset as a parameter. Quoting the Celery documentation:
Another gotcha is Django model objects. They shouldn’t be passed on as arguments to tasks. It’s almost always better to re-fetch the object from the database when the task is running instead, as using old data may lead to race conditions.
http://docs.celeryproject.org/en/latest/userguide/tasks.html
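If you do decide to switch to pickle anyway, it is just a matter of Celery configuration. A minimal sketch, assuming the Celery 3.x setting names that match the Django 1.8 stack in the question:

# settings.py
# Let workers accept pickled payloads and make tasks use pickle by default.
CELERY_ACCEPT_CONTENT = ['pickle', 'json']
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'pickle'

Keep in mind that pickle will deserialize arbitrary objects, so only use it with a broker you fully trust.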
I have the below serializer:
class OrderItemResponseSerializer(serializers.ModelSerializer):
    prepack_qty = serializers.SerializerMethodField()
    product_description = serializers.SerializerMethodField()

    class Meta:
        model = OrderItem
        fields = (
            "product",
            "qty_ordered",
            "qty_approved",
            "status",
            "prepack_qty",
            "product_description"
        )

    def get_prepack_qty(self, obj):
        return obj.product.prepack_quantity

    def get_product_description(self, obj):
        return obj.product.product_description
When I make a GET request to /orders, I make a lot of SQL queries to the database because different orders may contain the same product. How can I cache the result of the get_prepack_qty and get_product_description methods? I tried to use @cached_property this way:
class OrderItem(models.Model):
    ...

    @cached_property
    def item_product_description(self):
        return self.product.product_description
But the number of requests to the database remained the same.
Well, first of all, I should say something about what you have implemented in this piece of code below:
...
@cached_property
def item_product_description(self):
    return self.product.product_description
Using @cached_property alone doesn't cache the data for you here; you just created a property on the model for the get_product_description serializer method, and this does not reduce the volume or number of your queries to the database at all. You also need the .bind() call in your serializer method, like below:
class BlobSerializer(SerializerCacheMixin, serializers.Serializer):
    blobs = serializers.SerializerMethodField()

    def get_blobs(self, instance):
        # recursive serializer
        serializer = self.__class__(instance.results, many=True)
        serializer.bind('*', self)
        return serializer.data
But in order to cache the result of this method, as you asked in your question, there is a good project on PyPI called drf-serializer-cache that you can easily use for this purpose. For example, the following piece of code is taken from that project's documentation:
from drf_serializer_cache import SerializerCacheMixin
from rest_framework import serializers


class ResultSerializer(SerializerCacheMixin, serializers.Serializer):
    results = serializers.SerializerMethodField()

    def get_results(self, instance):
        # recursive serializer
        serializer = self.__class__(instance.results, many=True)
        serializer.bind('*', self)  # bind call is essential for efficient cache!
        return serializer.data
Also, if you want to implement it yourself in your project, looking at the implementation of the SerializerCacheMixin class in that project can help you a lot, or you can even use it directly.
You can leverage Django's cache framework to cache the result of a SerializerMethodField. It would look something like this:
from django.core.cache import cache


class MySerializer(serializers.Serializer):
    my_field = serializers.SerializerMethodField()

    def get_my_field(self, obj):
        # Get cached value
        value = cache.get("my_field_%s" % obj.pk)
        if value is None:
            # Value is not cached, compute it
            value = ...
            # Cache the value
            cache.set("my_field_%s" % obj.pk, value, 3600)
        return value
This will cache the result of the field for 1 hour (3600 seconds), using a cache key that depends on the object's primary key.
You can of course adapt the caching logic to your exact use case.
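For the /orders case above specifically, since the repeated work is per product rather than per order item, one option is to key the cache on the product's primary key so order items sharing a product share one cache entry. A sketch of how get_prepack_qty from the question could be adapted (the one-hour timeout is arbitrary):

from django.core.cache import cache

def get_prepack_qty(self, obj):
    key = "prepack_qty_%s" % obj.product_id  # keyed on the product, not the order item
    value = cache.get(key)
    if value is None:
        value = obj.product.prepack_quantity  # hits the database only on a cache miss
        cache.set(key, value, 3600)
    return value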
I have a model named Run with a manager named RunManager and with a custom save() method as follows.
class RunManager(models.Manager):
    use_for_related_fields = True

    def get_queryset(self):
        queryset = super(RunManager, self).get_queryset()
        queryset = queryset.filter(archived=False)
        return queryset

    def unfiltered_runs(self):
        queryset = super(RunManager, self).get_queryset()
        return queryset


class Run(models.Model):
    name = models.CharField(max_length=256)
    archived = models.BooleanField(default=False)
    objects = RunManager()

    def save(self, *args, **kwargs):
        # some business logic
        super(Run, self).save(*args, **kwargs)

    def archive(self):
        # Some business logic
        self.archived = True
        self.save()

    def recover_archived(self):
        # Some business logic
        self.archived = False
        self.save()
This was old code where Run.objects was used in several locations, so to hide the archived runs I am using the RunManager.
Everything was working fine, but now we want to unarchive the runs, so I added the unfiltered_runs() method, which returns all the runs, including the archived ones. But when I run the recover_archived() method I get the following error:
IntegrityError: UNIQUE constraint failed: run.id
I know the error occurs because the db is treating it as a new entry with the same id.
I know I can completely override the save method, but I want to avoid that.
So is there any way to make the save method look up the unfiltered_runs() queryset instead of the regular one?
By following @ivissani's suggestion I modified my recover_archived method as follows, and it is working flawlessly.
def recover_archived(self):
    # Some business logic
    Run.objects.unfiltered_runs().filter(pk=self.id).update(archived=False)
I have 3 models, Run, RunParameter, RunValue:
class Run(models.Model):
    start_time = models.DateTimeField(db_index=True)
    end_time = models.DateTimeField()


class RunParameter(models.Model):
    parameter = models.ForeignKey(Parameter, on_delete=models.CASCADE)


class RunValue(models.Model):
    run = models.ForeignKey(Run, on_delete=models.CASCADE)
    run_parameter = models.ForeignKey(RunParameter, on_delete=models.CASCADE)
    value = models.FloatField(default=0)

    class Meta:
        unique_together = (('run', 'run_parameter'),)
A Run can have a RunValue, which is a float value with the value's name coming from RunParameter (which is basically a table containing names), for example:
A RunValue could be AverageTime, or MaximumTemperature
A Run could then have RunValue = RunParameter:AverageTime with value X.
Another Run instance could have RunValue = RunParameter:MaximumTemperature with value Y, etc.
I created an endpoint to query my API, but I only have the RunParameter ID (because of the way you select which parameter you want to graph), not the RunValue ID directly. I basically show a list of all RunParameter instances and a list of all Run instances, because if I showed all instances of RunValue the list would be too long and confusing; instead of seeing "Maximum Temperature" you would see:
"Maximum Temperature for Run X"
"Maximum Temperature for Run Y"
"Maximum Temperature for Run Z", etc. (repeat 50+ times).
My API view looks like this:
class RunValuesDetailAPIView(RetrieveAPIView):
    queryset = RunValue.objects.all()
    serializer_class = RunValuesDetailSerializer
    permission_classes = [IsOwnerOrReadOnly]
And the serializer for that looks like this:
class RunValuesDetailSerializer(ModelSerializer):
    run = SerializerMethodField()

    class Meta:
        model = RunValue
        fields = [
            'id',
            'run',
            'run_parameter',
            'value'
        ]

    def get_run(self, obj):
        return str(obj.run)
And the URL just in case it's relevant:
url(r'^run-values/(?P<pk>\d+)/$', RunValuesDetailAPIView.as_view(), name='values_list_detail'),
Since I'm new to REST APIs, so far I've only dealt with having the ID of the model of the API view I am querying directly, never the ID of a related field. I'm not sure where to modify my queryset so I can pass it an ID and get the appropriate model instance via a related field.
At the point I make the API query, I have the Run instance ID and the RunParameter ID. I would need the queryset to be:
run_value = RunValue.objects.get(run=run_id, run_parameter_id=param_id)
While so far I've only ever had to do something like:
run_value = RunValue.objects.get(id=value_id) # I don't have this ID
If I understand correctly, you're trying to get an instance of RunValue with only the Run id and the RunParameter id, i.e. query based on related fields.
The queryset can be achieved with the following:
run_value = RunValue.objects.get(
    run__id=run_id,
    run_parameter__id=run_parameter_id
)
Providing that a RunValue instance only ever has 1 related Run and RunParameter, this will return the instance of RunValue you're after.
Let me know if that's not what you mean.
The double underscore allows you to access those related instance fields in your query.
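The same double-underscore syntax can also follow relations more than one level deep, which could be handy here since RunParameter itself points at Parameter. A small illustrative query (it assumes Parameter has a name field, which is not shown in the question):

# Hypothetical: assumes Parameter defines a 'name' field.
run_values = RunValue.objects.filter(
    run__id=run_id,
    run_parameter__parameter__name="MaximumTemperature",
)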
Well, it's pretty simple: all you have to do is override the get_object method, for example (copied from the documentation):
# view
from django.shortcuts import get_object_or_404


class RunValuesDetailAPIView(RetrieveAPIView):
    queryset = RunValue.objects.all()
    serializer_class = RunValuesDetailSerializer
    permission_classes = [IsOwnerOrReadOnly]
    lookup_fields = ["run_id", "run_parameter_id"]

    def get_object(self):
        queryset = self.get_queryset()  # Get the base queryset
        queryset = self.filter_queryset(queryset)  # Apply any filter backends
        filter = {}
        for field in self.lookup_fields:
            if self.kwargs[field]:  # Ignore empty fields.
                filter[field] = self.kwargs[field]
        obj = get_object_or_404(queryset, **filter)  # Lookup the object
        self.check_object_permissions(self.request, obj)
        return obj


# url
url(r'^run-values/(?P<run_id>\d+)/(?P<run_parameter_id>\d+)/$', RunValuesDetailAPIView.as_view(), name='values_list_detail'),
But one big thing you need to be careful about is not to have duplicate entries with the same run_id and run_parameter_id, or it will throw errors. To avoid that, either use unique_together = ['run', 'run_parameter'] or use queryset.filter(**filter).first() instead of get_object_or_404 in the view. The second option, though, will produce wrong results when duplicate entries are created.
I am using Django rest framework 2.3
I have a class like this
class Quiz():
    fields..


# A custom manager for result objects
class SavedOnceManager(models.Manager):
    def filter(self, *args, **kwargs):
        if not 'saved_once' in kwargs:
            kwargs['saved_once'] = True
        return super(SavedOnceManager, self).filter(*args, **kwargs)


class Result():
    saved_once = models.NullBooleanField(default=False, db_index=True,
                                         null=True)
    quiz = models.ForeignKey(Quiz, related_name='result_set')

    objects = SavedOnceManager()
As you can see, I have a custom manager on results, so Result.objects.filter() will only return results that have saved_once set to True.
Now my Serializers look like this:
class ResultSerializer(serializers.ModelSerializer):
    fields...


class QuizSerializer(serializers.ModelSerializer):
    results = ResultSerializer(many=True, required=False, source='result_set')
Now if I serialize my quiz it returns only results that have saved_once set to True. But for a particular use case I want the serializer to return all objects. I have read that I can do that by passing a queryset parameter (http://www.django-rest-framework.org/api-guide/relations/, in the further notes section). However, when I try this:
results = ResultSerializer(many=True, required=False, source='result_set',
                           queryset=Result.objects.filter(
                               saved_once__in=[True, False]))
I get TypeError: __init__() got an unexpected keyword argument 'queryset'
And looking at the source code of DRF (in my version at least), it does not accept a queryset argument.
Looking for some guidance on this to see if this is possible... thanks!
In my opinion, modifying filter like this is not a very good practice. It is very difficult to write a solution for you when I cannot use filter on the Result model without this extra filtering happening. I would suggest not modifying filter in this manner, and instead creating a custom manager method which lets you apply your filter in an obvious way where it is needed, e.g.:
class SavedOnceManager(models.Manager):
    def saved_once(self):
        return self.get_queryset().filter(saved_once=True)
Therefore, you can query either the saved_once rows or the unfiltered rows as you would expect:
Result.objects.all()
Result.objects.saved_once().all()
Here is one way in which you can use an additional queryset inside a serializer. However, it looks to me like this most likely will not work for you if the default manager is somehow filtering out the saved_once objects; hence, your problem lies elsewhere.
class QuizSerializer(serializers.ModelSerializer):
    results = serializers.SerializerMethodField()

    def get_results(self, obj):
        results = Result.objects.filter(id__in=obj.result_set.all())
        return ResultSerializer(results, many=True).data
Using Django REST Framework, I want to limit which values can be used in a related field in a creation.
For example consider this example (based on the filtering example on https://web.archive.org/web/20140515203013/http://www.django-rest-framework.org/api-guide/filtering.html, but changed to ListCreateAPIView):
class PurchaseList(generics.ListCreateAPIView):
    model = Purchase
    serializer_class = PurchaseSerializer

    def get_queryset(self):
        user = self.request.user
        return Purchase.objects.filter(purchaser=user)
In this example, how do I ensure that on creation the purchaser may only be equal to self.request.user, and that this is the only value populated in the dropdown in the form in the browsable API renderer?
I ended up doing something similar to what Khamaileon suggested here. Basically I modified my serializer to peek into the request, which kind of smells wrong, but it gets the job done... Here's how it looks (exemplified with the purchase example):
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
    def get_fields(self, *args, **kwargs):
        fields = super(PurchaseSerializer, self).get_fields(*args, **kwargs)
        fields['purchaser'].queryset = permitted_objects(self.context['view'].request.user, fields['purchaser'].queryset)
        return fields

    class Meta:
        model = Purchase
permitted_objects is a function which takes a user and a queryset, and returns a filtered queryset that only contains objects the user has permission to link to. This seems to work both for validation and for the browsable API dropdown fields.
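permitted_objects itself is not shown above; a minimal sketch of what it could look like for the purchase example (the body below is hypothetical and simply restricts the choices to the requesting user):

def permitted_objects(user, queryset):
    # Hypothetical implementation: only allow linking to the requesting user.
    return queryset.filter(pk=user.pk)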
Here's how I do it:
class PurchaseList(viewsets.ModelViewSet):
    ...

    def get_serializer(self, *args, **kwargs):
        serializer_class = self.get_serializer_class()
        context = self.get_serializer_context()
        return serializer_class(*args, request_user=self.request.user, context=context, **kwargs)


class PurchaseSerializer(serializers.ModelSerializer):
    ...

    def __init__(self, *args, request_user=None, **kwargs):
        super(PurchaseSerializer, self).__init__(*args, **kwargs)
        self.fields['user'].queryset = User._default_manager.filter(pk=request_user.pk)
The example link does not seem to be available anymore, but from reading the other comments, I assume that you are trying to filter the user relationship to purchases.
If I am correct, then I can say that there is now an official way to do this. Tested with django rest framework 3.10.1.
class UserPKField(serializers.PrimaryKeyRelatedField):
    def get_queryset(self):
        user = self.context['request'].user
        queryset = User.objects.filter(...)
        return queryset


class PurchaseSerializer(serializers.ModelSerializer):
    users = UserPKField(many=True)

    class Meta:
        model = Purchase
        fields = ('id', 'users')
This works as well with the browsable API.
Sources:
https://github.com/encode/django-rest-framework/issues/1985#issuecomment-328366412
https://medium.com/django-rest-framework/limit-related-data-choices-with-django-rest-framework-c54e96f5815e
I disliked the style of having to override the __init__ method in every place where I need access to user data or to the instance at runtime in order to limit the queryset, so I opted for this solution.
Here is the code inline.
from django.db.models import Manager
from django.db.models.query import QuerySet
from rest_framework import serializers


class LimitQuerySetSerializerFieldMixin:
    """
    Serializer mixin with a special `get_queryset()` method that lets you pass
    a callable for the queryset kwarg. This enables you to limit the queryset
    based on data or context available on the serializer at runtime.
    """

    def get_queryset(self):
        """
        Return the queryset for a related field. If the queryset is a callable,
        it will be called with one argument which is the field instance, and
        should return a queryset or model manager.
        """
        # noinspection PyUnresolvedReferences
        queryset = self.queryset
        if hasattr(queryset, '__call__'):
            queryset = queryset(self)
        if isinstance(queryset, (QuerySet, Manager)):
            # Ensure queryset is re-evaluated whenever used.
            # Note that actually a `Manager` class may also be used as the
            # queryset argument. This occurs on ModelSerializer fields,
            # as it allows us to generate a more expressive 'repr' output
            # for the field.
            # Eg: 'MyRelationship(queryset=ExampleModel.objects.all())'
            queryset = queryset.all()
        return queryset


class DynamicQuerysetPrimaryKeyRelatedField(LimitQuerySetSerializerFieldMixin, serializers.PrimaryKeyRelatedField):
    """Evaluates callable queryset at runtime."""
    pass


class MyModelSerializer(serializers.ModelSerializer):
    """
    MyModel serializer with a primary key related field to 'MyRelatedModel'.
    """

    def get_my_limited_queryset(self):
        root = self.root
        if root.instance is None:
            return MyRelatedModel.objects.none()
        return root.instance.related_set.all()

    my_related_model = DynamicQuerysetPrimaryKeyRelatedField(queryset=get_my_limited_queryset)

    class Meta:
        model = MyModel
The only drawback with this is that you would need to explicitly set the related serializer field instead of using the automatic field discovery provided by ModelSerializer. I would, however, expect something like this to be in rest_framework by default.
In django rest framework 3.0 the get_fields method was removed. But in a similar way you can do this in the __init__ method of the serializer:
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Purchase

    def __init__(self, *args, **kwargs):
        super(PurchaseSerializer, self).__init__(*args, **kwargs)
        if 'request' in self.context:
            self.fields['purchaser'].queryset = permitted_objects(self.context['view'].request.user, self.fields['purchaser'].queryset)
I added the if check because, if you use PurchaseSerializer as a field in another serializer on GET methods, the request will not be passed in the context.
1) First, to make sure you only allow "self.request.user" when you have an incoming HTTP POST/PUT (this assumes the property on your serializer and model is literally named "user"):
def validate_user(self, attrs, source):
    posted_user = attrs.get(source, None)
    if posted_user:
        raise serializers.ValidationError("invalid post data")
    else:
        user = self.context['request']._request.user
        if not user:
            raise serializers.ValidationError("invalid post data")
        attrs[source] = user
    return attrs
By adding the above to your model serializer you ensure that ONLY the request.user is inserted into your database.
2) About your filter above (filter purchaser=user): I would actually recommend using a custom global filter (to ensure this is filtered globally). I do something similar for a software-as-a-service app of my own, and it helps to ensure each HTTP request is filtered down (including an HTTP 404 when someone tries to look up an "object" they don't have access to see in the first place).
I recently patched this in the master branch so both list and singular views will filter this
https://github.com/tomchristie/django-rest-framework/commit/1a8f07def8094a1e34a656d83fc7bdba0efff184
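If you want to reproduce that kind of global filtering yourself, a custom filter backend is one way to do it; a rough sketch (not part of the original patch), reusing the purchaser field from the question:

from rest_framework import filters, generics

class IsPurchaserFilterBackend(filters.BaseFilterBackend):
    # Only return objects whose 'purchaser' is the requesting user.
    def filter_queryset(self, request, queryset, view):
        return queryset.filter(purchaser=request.user)

class PurchaseList(generics.ListCreateAPIView):
    queryset = Purchase.objects.all()
    serializer_class = PurchaseSerializer
    filter_backends = [IsPurchaserFilterBackend]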
3) About the API renderer: are you having your customers use this directly? If not, I would say avoid it. If you do need it, it might be possible to add a custom serializer that would help to limit the input on the front end.
Upon request from @gabn88: as you may know by now, with DRF 3.0 and above there is no easy solution.
Even if you do manage to figure out a solution, it won't be pretty and will most likely break on subsequent versions of DRF, since it will override a bunch of DRF source that will have changed by then.
I forget the exact implementation I used, but the idea is to create two fields on the serializer: one your normal serializer field (let's say a PrimaryKeyRelatedField, etc.), and the other a serializer method field, whose results will be swapped under certain conditions (such as based on the request, the request user, or whatever). This would be done in the serializer's constructor (i.e. __init__).
Your serializer method field will return the custom query that you want.
You will pop and/or swap these fields' results, so that the result of your serializer method field is assigned to the normal/default serializer field (PrimaryKeyRelatedField, etc.) accordingly. That way you always deal with that one key (your default field) while the other key remains transparent within your application.
Along with this info, all you really need is to modify this: http://www.django-rest-framework.org/api-guide/serializers/#dynamically-modifying-fields
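To make that idea a bit more concrete, here is a rough sketch of the pop/swap approach described above; the field names and the swap condition are illustrative only, since the original answer does not include code:

from django.contrib.auth.models import User
from rest_framework import serializers

class PurchaseSerializer(serializers.ModelSerializer):
    # The normal/default related field, plus a method field whose result
    # will be exposed under the same key at runtime.
    purchaser = serializers.PrimaryKeyRelatedField(queryset=User.objects.all())
    limited_purchaser = serializers.SerializerMethodField()

    class Meta:
        model = Purchase
        fields = ('id', 'purchaser', 'limited_purchaser')

    def __init__(self, *args, **kwargs):
        super(PurchaseSerializer, self).__init__(*args, **kwargs)
        request = self.context.get('request')
        if request is not None:
            # Swap: expose the method field's result under the default key,
            # so the rest of the application keeps reading 'purchaser'.
            self.fields['purchaser'] = self.fields.pop('limited_purchaser')

    def get_limited_purchaser(self, obj):
        # Return whatever request-dependent result you want here.
        return obj.purchaser_id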
I wrote a custom CustomQueryHyperlinkedRelatedField class to generalize this behavior:
from collections import OrderedDict

import six  # on older Django versions this may be: from django.utils import six
from rest_framework import serializers


class CustomQueryHyperlinkedRelatedField(serializers.HyperlinkedRelatedField):
    def __init__(self, view_name=None, **kwargs):
        self.custom_query = kwargs.pop('custom_query', None)
        super(CustomQueryHyperlinkedRelatedField, self).__init__(view_name, **kwargs)

    def get_queryset(self):
        if self.custom_query and callable(self.custom_query):
            qry = self.custom_query()(self)
        else:
            qry = super(CustomQueryHyperlinkedRelatedField, self).get_queryset()
        return qry

    @property
    def choices(self):
        qry = self.get_queryset()
        return OrderedDict([
            (
                six.text_type(self.to_representation(item)),
                six.text_type(item)
            )
            for item in qry
        ])
Usage:
class MySerializer(serializers.HyperlinkedModelSerializer):
    ....

    somefield = CustomQueryHyperlinkedRelatedField(view_name='someview-detail',
                                                   queryset=SomeModel.objects.none(),
                                                   custom_query=lambda: MySerializer.some_custom_query)

    @staticmethod
    def some_custom_query(field):
        return SomeModel.objects.filter(somefield=field.context['request'].user.email)

    ...
I did the following:
class MyModelSerializer(serializers.ModelSerializer):
    myForeignKeyFieldName = MyForeignModel.objects.all()

    def get_fields(self, *args, **kwargs):
        fields = super(MyModelSerializer, self).get_fields()
        qs = MyModel.objects.filter(room=self.instance.id)
        fields['myForeignKeyFieldName'].queryset = qs
        return fields
I looked for a solution where I can set the queryset upon creation of the field and don't have to add a separate field class. This is what I came up with:
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Purchase
        fields = ["purchaser"]

    def get_purchaser_queryset(self):
        user = self.context["request"].user
        return Purchase.objects.filter(purchaser=user)

    def get_extra_kwargs(self):
        kwargs = super().get_extra_kwargs()
        kwargs["purchaser"] = {"queryset": self.get_purchaser_queryset()}
        return kwargs
The main issue for tracking suggestions regarding this seems to be drf#1985.
Here's a re-usable generic serializer field that can be used instead of defining a custom field for every use case.
class DynamicPrimaryKeyRelatedField(serializers.PrimaryKeyRelatedField):
    """A PrimaryKeyRelatedField with ability to set queryset at runtime.

    Pass a function in the `queryset_fn` kwarg. It will be passed the serializer `context`.
    The function should return a queryset.
    """

    def __init__(self, queryset_fn=None, **kwargs):
        assert queryset_fn is not None, "The `queryset_fn` argument is required."
        self.queryset_fn = queryset_fn
        super().__init__(**kwargs)

    def get_queryset(self):
        return self.queryset_fn(context=self.context)
Usage:
class MySerializer(serializers.ModelSerializer):
    my_models = DynamicPrimaryKeyRelatedField(
        queryset_fn=lambda context: MyModel.objects.visible_to_user(context["request"].user)
    )
    # ...
Same works for serializers.SlugRelatedField.