Extended celery.schedules.schedule objects aren't passed extra arguments - python

The code is here.
I wrote an extension of the celery.schedules.schedule interface, and I can't figure out why it's getting instantiated with nothing set in the extra values I created.
When I instantiate them before passing to app.conf.CELERYBEAT_SCHEDULE they're correct. But all the ones that celery beat instantiates are incorrect.
I asked in #celery IRC chan and the only response I got was about lazy mode, but that's for celery.beat.Scheduler, not celery.schedules.schedule, so if it's relevant, I don't understand how. Do I have to extend that too, just so that it instantiates the schedules correctly?
I've tried digging into the celery code w/the debugger to figure out where these schedules are getting instantiated and I can't find it. I can see when they come back from Unpickler they are wrong, but I can't find where they get created or where they get pickled.

celery.schedules.schedule has a __reduce__ method that defines how to serialize and reconstruct the object using pickle:
https://github.com/celery/celery/blob/master/celery/schedules.py#L150-L151
When pickle serializes the object it will call:
fun, args = obj.__reduce__()
and when it reconstructs the object it will do:
obj = fun(*args)
So if you've added new state to your custom schedule subclass, passed as arguments to __init__, then you will also have to define a __reduce__ method that takes these new arguments into account:
class myschedule(schedule):
    def __init__(self, run_every=None, relative=False, nowfun=None,
                 odds=None, max_run_every=None, **kwargs):
        super(myschedule, self).__init__(
            run_every, relative, nowfun, **kwargs)
        self.odds = odds
        self.max_run_every = max_run_every
    def __reduce__(self):
        return self.__class__, (
            self.run_every, self.relative, self.nowfun,
            self.odds, self.max_run_every)
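To see the mechanism without a running Celery instance, here is a minimal sketch with stand-in classes. BaseSchedule, BrokenSchedule and FixedSchedule are hypothetical names that only mimic the shape of celery.schedules.schedule; they are not Celery code.

```python
import pickle


class BaseSchedule:
    """Stand-in for celery.schedules.schedule (hypothetical, not Celery code)."""

    def __init__(self, run_every=None, relative=False):
        self.run_every = run_every
        self.relative = relative

    def __reduce__(self):
        # pickle stores (callable, args) and later calls callable(*args).
        return self.__class__, (self.run_every, self.relative)


class BrokenSchedule(BaseSchedule):
    """Adds state but inherits the parent's __reduce__, so the state is dropped."""

    def __init__(self, run_every=None, relative=False, odds=None):
        super().__init__(run_every, relative)
        self.odds = odds


class FixedSchedule(BrokenSchedule):
    """Overrides __reduce__ so the extra argument survives the round trip."""

    def __reduce__(self):
        return self.__class__, (self.run_every, self.relative, self.odds)


broken = pickle.loads(pickle.dumps(BrokenSchedule(10, False, odds=0.5)))
fixed = pickle.loads(pickle.dumps(FixedSchedule(10, False, odds=0.5)))
print(broken.odds)  # None: rebuilt with only (run_every, relative)
print(fixed.odds)   # 0.5
```

The broken variant matches the symptom in the question: the object is correct when you build it yourself and loses its extra values once it has been through pickle.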

After a lot of time in the Python debugger, I narrowed the problem down to celery.beat.PersistentScheduler.sync() and/or shelve.sync() (which is called by the former).
When the shelve is synced, the values are getting lost. I don't know why, but I'm pretty sure it's a bug in either Celery or Shelve.
In any case, I wrote a workaround.

Related

Why does tab auto-completion in Python REPL and Jupyter notebook (or ipython) for a class evaluate all its descriptors/properties?

I am trying to implement a Python class to make it easy to explore a relatively large dataset in a Jupyter notebook, by exposing various (somewhat compute-intensive) filter methods as class attributes using the descriptor protocol. The idea was to take advantage of the laziness of descriptors to compute only when a particular attribute is accessed.
Consider the following snippet:
import time
accessed_attr = [] # I find this easier than using basic logging for jupyter/ipython
class MyProperty:
    def __init__(self,name):
        self.name=name
    def __get__(self, instance, owner):
        if instance is None:
            return self
        accessed_attr.append(f'accessed {self.name} from {instance} at {time.asctime()}')
        setattr(instance, self.name, self.name)
        return self.name # just return string
class Dummy:
    abc=MyProperty('abc')
    bce=MyProperty('bce')
    cde=MyProperty('cde')
dummy_inst = Dummy() # instantiate the Dummy class
On dummy_inst.<tab>, I assumed Jupyter would show the auto-completions abc, bce, cde among other hidden methods and not evaluate them. Printing the logging list accessed_attr shows that the __get__ methods of all three descriptors were called, which is not what I expect or want.
A hacky way I figured out was to defer the first access to the descriptor using a counter, as shown in the image below, but that has its own issues.
I tried other ways using __slots__ and modifying __dir__ to trick the kernel, but couldn't find a way around the issue.
I understand there is another way using __getattribute__, but it still doesn't seem elegant. I am puzzled that what seemed so trivial turned out to be a mystery to me. Any hints, pointers and solutions are appreciated.
Here is my Python 3.7 based environment:
{'IPython': '7.18.1',
'jedi': '0.17.2',
'jupyter': '1.0.0',
'jupyter_core': '4.6.3',
'jupyter_client': '6.1.7'}
It's unfortunately a cat-and-mouse battle. IPython used to aggressively explore attributes, which ended up being deactivated because of side effects (see, for example, why the IPCompleter.limit_to__all__ option was added), though other users then come to complain that dynamic attributes don't show up. So it's likely jedi that looks at those attributes. You can try using c.Completer.use_jedi=False to check that. If it's jedi, then you have to ask the jedi author; if not, then I'm unsure, but it's a delicate balance.
Lazy vs. exploratory is a really complicated subject in IPython. You might be able to register a custom completer (even for dict keys) that makes it easier to explore without computing, or use async/await to make sure that only calling await obj.attr triggers the computation.
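As a rough illustration of why a completer can trigger descriptors, the sketch below shows which standard-library operations evaluate a descriptor and which do not. It says nothing about what IPython or jedi do internally; Lazy and Data are hypothetical names.

```python
import inspect

calls = []


class Lazy:
    """Descriptor that records every time it is evaluated."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        calls.append(self.name)
        return self.name.upper()


class Data:
    abc = Lazy("abc")
    bce = Lazy("bce")


d = Data()

# Listing names does not run __get__.
names = [n for n in dir(d) if not n.startswith("_")]
# getattr_static fetches the raw descriptor object without running __get__.
static = inspect.getattr_static(d, "abc")
calls_before = list(calls)

# Ordinary attribute access is what runs __get__.
value = getattr(d, "abc")
```

Any tool that calls getattr on each candidate name (for example to inspect its type) will evaluate every descriptor, whereas one that sticks to dir and inspect.getattr_static will not.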

Calling super().__init__() on subclass of ForceElement causes No constructor defined

I am trying to create a custom ForceElement as follows
class FrontWheelForce(ForceElement):
    def __init__(self, plant):
        front_wheel = plant.GetBodyByName("front_wheel")
        front_wheel_node_index = front_wheel.index()
        pdb.set_trace()
        ForceElement.__init__(self, front_wheel.model_instance())
But I get the following error on the line ForceElement.__init__(self, front_wheel.model_instance()):
TypeError: FrontWheelForce: No constructor defined!
You didn't show us the parent's definition.
I'm a little surprised you didn't see this diagnostic:
TypeError: object.__init__() takes exactly one argument (the instance to initialize)
I imagine the framework you're using raises "no constructor"
as a reminder that you have some more code to implement
before using that parent class.
Please take a look at the docs here for ForceElement: "ForceElement allows modeling state and time dependent forces in a MultibodyTree model". That is, a force element that is a function of the torque on the wheel cannot be modeled as a ForceElement. I believe that what you want is a FrontWheelSystem, being a LeafSystem, that outputs the force you want to model. You can apply the external force of your model to the plant either through actuators connected to get_actuation_input_port(), or as externally applied spatial forces connected to get_applied_spatial_force_input_port().
Summarizing a few comments into the correct answer
By ekhumoro
The error message suggests the ForceElement class does not support subclassing. That is, the python bindings for drake do not wrap the __init__ method for this class - so presumably ForceElement.__init__ will raise an AttributeError.
By Eric Cousineau
this (ForceElement) is not written as a trampoline class, which is necessary for pybind11 to permit Python-subclassing of a bound C++ class
Ref:
pybind11 docs, ForceElement binding

Python __doc__ documentation on instances

I'd like to provide documentation (within my program) on certain dynamically created objects, but still fall back to using their class documentation. Setting __doc__ seems a suitable way to do so. However, I can't find many details in the Python help in this regard, are there any technical problems with providing documentation on an instance? For example:
class MyClass:
    """
    A description of the class goes here.
    """
a = MyClass()
a.__doc__ = "A description of the object"
print( MyClass.__doc__ )
print( a.__doc__ )
__doc__ is documented as a writable attribute for functions, but not for instances of user defined classes. pydoc.help(a), for example, will only consider the __doc__ defined on the type in Python versions < 3.9.
Other protocols (including future use-cases) may reasonably bypass the special attributes defined in the instance dict, too. See Special method lookup section of the datamodel documentation, specifically:
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary.
So, depending on the consumer of the attribute, what you intend to do may not be reliable. Avoid.
A safe and simple alternative is just to use a different attribute name of your own choosing for your own use-case, preferably not using the __dunder__ syntax convention which usually indicates a special name reserved for some specific use by the implementation and/or the stdlib.
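A minimal sketch of that alternative, assuming a made-up attribute name description and a made-up helper describe (neither is a Python convention), with a fallback to the class docstring:

```python
class MyClass:
    """A description of the class goes here."""


def describe(obj):
    """Return the per-instance description if set, else the class docstring."""
    return getattr(obj, "description", None) or type(obj).__doc__


a = MyClass()
a.description = "A description of the object"  # hypothetical attribute name
b = MyClass()

print(describe(a))  # A description of the object
print(describe(b))  # A description of the class goes here.
```

Because the lookup lives in your own helper, it does not depend on how help, pydoc or an IDE choose to resolve __doc__.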
There are some pretty obvious technical problems; the question is whether or not they matter for your use case.
Here are some major uses for docstrings that your idiom will not help with:
help(a): Type help(a) in an interactive terminal, and you get the docstring for MyClass, not the docstring for a.
Auto-generated documentation: Unless you write your own documentation generator, it's not going to understand that you've done anything special with your a value. Many doc generators do have some way to specify help for module and class constants, but I'm not aware of any that will recognize your idiom.
IDE help: Many IDEs will not only auto-complete an expression, but show the relevant docstring in a tooltip. They all do this statically, and without some special-case code designed around your idiom (which they're unlikely to have, given that it's an unusual idiom), they're almost certain to fetch the docstring for the class, not the object.
Here are some where it might help:
Source readability: As a human reading your source, I can tell the intent from the a.__doc__ = … right near the construction of a. Then again, I could tell the same intent just as easily from a Sphinx comment on the constant.
Debugging: pdb doesn't really do much with docstrings, but some GUI debuggers wrapped around it do, and most of them are probably going to show a.__doc__.
Custom dynamic use of docstrings: Obviously any code that you write that does something with a.__doc__ is going to get the instance docstring if you want it to, and therefore can do whatever it wants with it. However, keep in mind that if you want to define your own "protocol", you should use your own name, not one reserved for the implementation.
Notice that most of the same is true for using a descriptor for the docstring:
>>> class C:
...     @property
...     def __doc__(self):
...         return('C doc')
>>> c = C()
If you type c.__doc__, you'll get 'C doc', but help(c) will treat it as an object with no docstring.
It's worth noting that making help work is one of the reasons some dynamic proxy libraries generate new classes on the fly—that is, a proxy to underlying type Spam has some new type like _SpamProxy, instead of the same GenericProxy type used for proxies to Hams and Eggseses. The former allows help(myspam) to show dynamically-generated information about Spam. But I don't know how important a reason it is; often you already need dynamic classes to, e.g., make special method lookup work, at which point adding dynamic docstrings comes for free.
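A rough sketch of that per-type proxy idea, not taken from any particular proxy library: make_proxy_class is a hypothetical helper that builds a new class with its own docstring for each wrapped type.

```python
def make_proxy_class(target_type):
    """Build a new proxy class per wrapped type (hypothetical helper)."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        return getattr(self._target, name)

    doc = "Proxy for %s.\n\n%s" % (target_type.__name__, target_type.__doc__)
    return type(
        "_%sProxy" % target_type.__name__,
        (object,),
        {"__doc__": doc, "__init__": __init__, "__getattr__": __getattr__},
    )


class Spam:
    """Spam, spam, spam."""

    def fry(self):
        return "fried spam"


SpamProxy = make_proxy_class(Spam)
proxy = SpamProxy(Spam())
print(type(proxy).__name__)  # _SpamProxy
print(proxy.fry())           # fried spam
```

Since the docstring lives on the generated type rather than in the instance dict, tools that read the class docstring will see the per-type text.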
I think it's preferred to keep it under the class via your doc string as it will also aid any developer that works on the code. However if you are doing something dynamic that requires this setup then I don't see any reason why not. Just understand that it adds a level of indirection that makes things less clear to others.
Remember to K.I.S.S. where applicable :)
I just stumbled over this and noticed that at least with python 3.9.5 the behavior seems to have changed.
E.g. using the above example, when I call:
help(a)
I get:
Help on MyClass in module __main__:
<__main__.MyClass object>
A description of the object
Also for reference, have a look at the pydoc implementation which shows:
def _getowndoc(obj):
    """Get the documentation string for an object if it is not
    inherited from its class."""
    try:
        doc = object.__getattribute__(obj, '__doc__')
        if doc is None:
            return None
        if obj is not type:
            typedoc = type(obj).__doc__
            if isinstance(typedoc, str) and typedoc == doc:
                return None
        return doc
    except AttributeError:
        return None

Is it possible to pickle Python object by reference (by name)?

I have a situation where there's a complex object that can be referenced by unique name like package.subpackage.MYOBJECT. While it's possible to pickle this object using standard pickle algorithm, resulting data string will be very big.
I'm looking for some way to get the same pickling semantics for an object that already exist for classes and functions: Python's pickle just dumps their fully qualified names, not their code. This way, just a string like package.subpackage.MYOBJECT would be dumped, and upon unpickling the object would be imported, just as happens for functions or classes.
It seems that this task boils down to making the object aware of the variable name it's bound to, but I have no clue how to do it.
Here's short example to explain myself clearly (obvious imports are skipped).
File bigpackage/bigclasses/models.py:
class SomeInterface():
    __metaclass__ = ABCMeta
    @abstractmethod
    def operation(self):
        pass
class ImplementationA(SomeInterface):
    def operation(self):
        print "ImplementationA"
class ImplementationB(SomeInterface):
    def operation(self):
        print "ImplementationB"
IMPL_A = ImplementationA()
IMPL_B = ImplementationB()
File bigpackage/bigclasses/tasks.py:
@celery.task
def background_task(impl, somearg):
    assert isinstance(impl, SomeInterface)
    impl.operation()
    print somearg
File bigpackage/bigclasses/work.py:
from bigpackage.bigclasses.models import IMPL_A, IMPL_B
from bigpackage.bigclasses.tasks import background_task
background_task.submit(IMPL_A, "arg1")
background_task.submit(IMPL_B, "arg2")
Here I have trivial background Celery task that accept one of two available implementations of SomeInterface as an argument. Task's arguments are pickled by Celery, passed to a queue and executed on some worker server, that runs exactly the same code base. My idea is to avoid deep pickling of IMPL_A and IMPL_B and instead pass them as bigpackage.bigclasses.models.IMPL_A and bigpackage.bigclasses.models.IMPL_B correspondingly. That will help with performance and total traffic for queue server and also provide some safety against changes in IMPL_A and IMPL_B that will make them non-pickleable (for example lambda anywhere in object attributes hierarchy).
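One way to get this by-name behaviour from the standard library is pickle's persistent ID hooks. The sketch below is written for Python 3 and assumes you control the pickler and unpickler on both sides. Whether Celery's serializer can be pointed at them is not shown here. Implementation, REGISTRY, NAMES, ByNamePickler and ByNameUnpickler are hypothetical names.

```python
import io
import pickle


class Implementation:
    """Stand-in for a large object that should not be pickled by value."""

    def __init__(self, label):
        self.label = label
        self.callback = lambda: label  # lambdas cannot be pickled by value


IMPL_A = Implementation("A")
IMPL_B = Implementation("B")

# Hypothetical registry mapping a stable name to each shared object.
REGISTRY = {"models.IMPL_A": IMPL_A, "models.IMPL_B": IMPL_B}
NAMES = {id(obj): name for name, obj in REGISTRY.items()}


class ByNamePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # A non-None return value is stored instead of the object's contents.
        return NAMES.get(id(obj))


class ByNameUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        try:
            return REGISTRY[pid]
        except KeyError:
            raise pickle.UnpicklingError("unknown persistent id: %r" % (pid,))


def dumps(obj):
    buf = io.BytesIO()
    ByNamePickler(buf).dump(obj)
    return buf.getvalue()


def loads(data):
    return ByNameUnpickler(io.BytesIO(data)).load()


payload = dumps((IMPL_A, "arg1"))
impl, arg = loads(payload)
print(impl is IMPL_A, arg)
```

Only the registry name travels in the payload, so the lambda attribute never has to be pickled, and the receiving side resolves the name against its own copy of the object.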

Helper function injected on all python objects?

I'm trying to mimic methods.grep from Ruby which simply returns a list of available methods for any object (class or instance) called upon, filtered by regexp pattern passed to grep.
Very handy for investigating objects in an interactive prompt.
def methods_grep(self, pattern):
    """ returns list of object's method by a regexp pattern """
    from re import search
    return [meth_name for meth_name in dir(self)
            if search(pattern, meth_name)]
Because of a Python limitation that is not quite clear to me, it unfortunately can't simply be attached to the object class that everything inherits from:
object.mgrep = classmethod(methods_grep)
# TypeError: can't set attributes of built-in/extension type 'object'
Is there some workaround to inject it into all classes, or do I have to stick with a global function like dir?
There is a module called forbiddenfruit that enables you to patch built-in objects. It also allows you to reverse the changes. You can find it here https://pypi.python.org/pypi/forbiddenfruit/0.1.1
from forbiddenfruit import curse
curse(object, "methods_grep", classmethod(methods_grep))
Of course, using this in production code is likely a bad idea.
There is no workaround AFAIK. I find it quite annoying that you can't alter built-in classes. Personal opinion though.
One way would be to create a base object and force all your objects to inherit from it.
But I don't see the problem to be honest. You can simply use methods_grep(object, pattern), right? You don't have to insert it anywhere.
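The two suggestions above (a plain global function, and a shared base class for your own types) can be sketched together. GrepMixin and Thing are hypothetical names used only for illustration.

```python
import re


def methods_grep(obj, pattern):
    """Return the names in dir(obj) that match a regexp pattern."""
    return [name for name in dir(obj) if re.search(pattern, name)]


class GrepMixin:
    """Optional base class for your own types (hypothetical name)."""

    def mgrep(self, pattern):
        return methods_grep(self, pattern)


class Thing(GrepMixin):
    def start(self):
        pass

    def stop(self):
        pass


print(methods_grep(str, r"^is"))   # names such as 'isdigit', 'isupper', ...
print(Thing().mgrep(r"^st"))       # ['start', 'stop']
```

The global function works on any object, including built-ins, while the mixin only gives the obj.mgrep(...) spelling on classes you define yourself.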
