Python's equivalent of .NET's sealed class

Does Python have anything similar to a sealed class? I believe it's also known as a final class in Java.
In other words, in Python, can we mark a class so it can never be inherited or expanded upon? Did Python ever consider having such a feature? Why?
Disclaimers
I'm actually trying to understand why sealed classes even exist. The answer here (and in many, many, many, many, many, really many other places) did not satisfy me at all, so I'm trying to look at it from a different angle. Please avoid theoretical answers to this question and focus on the title! Or, if you insist, at least give one very good, practical example of a sealed class in C#, pointing out what would break big time if it were unsealed.
I'm no expert in either language, but I do know a bit of both. Just yesterday, while coding in C#, I learned about the existence of sealed classes. Now I'm wondering whether Python has anything equivalent to that. I believe there is a very good reason for their existence, but I'm really not getting it.

You can use a metaclass to prevent subclassing:
class Final(type):
    def __new__(cls, name, bases, classdict):
        for b in bases:
            if isinstance(b, Final):
                raise TypeError("type '{0}' is not an acceptable base type".format(b.__name__))
        return type.__new__(cls, name, bases, dict(classdict))

class Foo:
    __metaclass__ = Final

class Bar(Foo):
    pass
gives:
>>> class Bar(Foo):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in __new__
TypeError: type 'Foo' is not an acceptable base type
The __metaclass__ = Final line makes the Foo class 'sealed'.
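Note that `__metaclass__` is Python 2 syntax; on Python 3 the metaclass is passed as a keyword argument in the class statement instead. A minimal sketch of the same idea in Python 3:
class Final(type):
    def __new__(cls, name, bases, classdict):
        for b in bases:
            if isinstance(b, Final):
                raise TypeError("type '{0}' is not an acceptable base type".format(b.__name__))
        return type.__new__(cls, name, bases, dict(classdict))

class Foo(metaclass=Final):
    pass

class Bar(Foo):   # raises TypeError: type 'Foo' is not an acceptable base type
    pass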
Note that you'd use a sealed class in .NET as a performance measure; since there won't be any subclassing, methods can be addressed directly. Python method lookups work very differently, and there is no advantage or disadvantage, when it comes to method lookups, to using a metaclass like the example above.

Before we talk Python, let's talk "sealed":
I, too, have heard that the advantage of .NET sealed / Java final / C++ entirely-non-virtual classes is performance. I heard it from a .NET dev at Microsoft, so maybe it's true. If you're building a heavy-use, highly performance-sensitive app or framework, you may want to seal a handful of classes at or near the real, profiled bottleneck, particularly classes that you use within your own code.
For most software, sealing a class that other teams consume as part of a framework/library/API is kinda...weird.
Mostly because there's a simple work-around for any sealed class anyway.
I teach "Essential Test-Driven Development" courses, and in those three languages I suggest that consumers of such a sealed class wrap it in a delegating proxy with the exact same method signatures, but override-able (virtual), so devs can create test doubles for these slow, nondeterministic, or side-effect-inducing external dependencies.
[Warning: below snark intended as humor. Please read with your sense of humor subroutines activated. I do realize that there are cases where sealed/final are necessary.]
The proxy (which is not test code) effectively unseals (re-virtualizes) the class, resulting in v-table look-ups and possibly less efficient code (unless the compiler optimizer is competent enough to in-line the delegation). The advantages are that you can test your own code efficiently, saving living, breathing humans weeks of debugging time (in contrast to saving your app a few million microseconds) per month... [Disclaimer: that's just a WAG. Yeah, I know, your app is special. ;-]
So, my recommendations: (1) trust your compiler's optimizer, (2) stop creating unnecessary sealed/final/non-virtual classes that you built in order to either (a) eke out every microsecond of performance at a place that is likely not your bottleneck anyway (the keyboard, the Internet...), or (b) create some sort of misguided compile-time constraint on the "junior developers" on your team (yeah...I've seen that, too).
Oh, and (3) write the test first. ;-)
Okay, yes, there's always link-time mocking, too (e.g. TypeMock). You got me. Go ahead, seal your class. Whatevs.
Back to Python: the fact that there's a hack rather than a keyword is probably a reflection of Python's everything-is-virtual nature. It's just not "natural."
By the way, I came to this question because I had the exact same one. Working on the Python port of my ever-so-challenging and realistic legacy-code lab, I wanted to know whether Python had such an abominable keyword as sealed or final (I use them in the Java, C#, and C++ courses as a challenge to unit testing). Apparently it doesn't. Now I have to find something equally challenging about untested Python code. Hmmm...

Python does have classes that can't be extended, such as bool or NoneType:
>>> class ExtendedBool(bool):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'bool' is not an acceptable base type
However, such classes cannot be created from Python code. (In the CPython C API, they are created by not setting the Py_TPFLAGS_BASETYPE flag.)
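You can observe the flag from Python, too; a quick check, assuming CPython, where Py_TPFLAGS_BASETYPE is the bit 1 << 10:
>>> Py_TPFLAGS_BASETYPE = 1 << 10            # CPython's flag value
>>> bool(int.__flags__ & Py_TPFLAGS_BASETYPE)    # int can be subclassed
True
>>> bool(bool.__flags__ & Py_TPFLAGS_BASETYPE)   # bool cannot
False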
Python 3.6 will introduce the __init_subclass__ special method; raising an error from it will prevent creating subclasses. For older versions, a metaclass can be used.
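A minimal sketch with `__init_subclass__` (it is called on the parent class whenever a subclass is created, so raising there seals the class):
class Foo:
    def __init_subclass__(cls, **kwargs):
        # invoked at class-creation time for every subclass of Foo
        raise TypeError("type 'Foo' is not an acceptable base type")

class Bar(Foo):   # raises TypeError as the class statement executes
    pass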
Still, the most “Pythonic” way to limit usage of a class is to document how it should not be used.

Similar in purpose to a sealed class, and useful for reducing memory usage (see Usage of __slots__?), is the __slots__ attribute, which prevents monkey-patching a class. By the time the metaclass's __new__ is called it is too late to put a __slots__ into the class, so we have to put it into the namespace at the first possible point, i.e. during __prepare__. As a side effect, this raises the TypeError a little earlier. Using mcs for the isinstance comparison removes the need to hardcode the metaclass's name inside itself. The disadvantage is that all unslotted attributes are read-only; therefore, if we want to set specific attributes during initialization or later, they have to be slotted explicitly. This is feasible, e.g., by using a dynamic metaclass that takes the slots as an argument.
def Final(slots=[]):
    if "__dict__" in slots:
        raise ValueError("Having __dict__ in __slots__ breaks the purpose")

    class _Final(type):
        @classmethod
        def __prepare__(mcs, name, bases, **kwargs):
            for b in bases:
                if isinstance(b, mcs):
                    msg = "type '{0}' is not an acceptable base type"
                    raise TypeError(msg.format(b.__name__))
            namespace = {"__slots__": slots}
            return namespace

    return _Final
class Foo(metaclass=Final(slots=["_z"])):
    y = 1

    def __init__(self, z=1):
        self.z = z

    @property
    def z(self):
        return self._z

    @z.setter
    def z(self, val: int):
        if not isinstance(val, int):
            raise TypeError("Value must be an integer")
        else:
            self._z = val

    def foo(self):
        print("I am sealed against monkey patching")
where attempting to overwrite foo.foo throws AttributeError: 'Foo' object attribute 'foo' is read-only, and attempting to add foo.x throws AttributeError: 'Foo' object has no attribute 'x'. The limiting power of __slots__ would be broken by inheriting, but because Foo has the metaclass Final, you can't inherit from it. It would also be broken if __dict__ were in the slots, so we throw a ValueError in that case. To conclude, defining setters and getters for slotted properties lets you limit how the user can overwrite them.
foo = Foo()

# attributes are accessible
foo.foo()
print(foo.y)

# changing slotted attributes is possible
foo.z = 2

# overwriting unslotted attributes won't work
foo.foo = lambda: print("Guerilla patching attempt")   # raises AttributeError

# overwriting an accordingly defined property won't work
foo.z = foo.foo                                        # raises TypeError

# expanding won't work
foo.x = 1                                              # raises AttributeError

# inheriting won't work
class Bar(Foo):                                        # raises TypeError
    pass
In that regard, Foo could not be inherited or expanded upon. The disadvantage is that all attributes have to be explicitly slotted, or are limited to a read-only class variable.

Python 3.8 has that feature in the form of the typing.final decorator. Note that it is enforced by static type checkers only; the interpreter itself does not prevent overriding or subclassing at runtime:
from typing import final

class Base:
    @final
    def done(self) -> None:
        ...

class Sub(Base):
    def done(self) -> None:  # Error reported by type checker
        ...

@final
class Leaf:
    ...

class Other(Leaf):  # Error reported by type checker
    ...
See https://docs.python.org/3/library/typing.html#typing.final
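Because the decorator is advisory, subclassing still succeeds at runtime; a quick check (the __final__ introspection attribute is only set by @final on Python 3.11 and later):
>>> class Other(Leaf):   # a type checker flags this, but it runs fine
...     ...
...
>>> Leaf.__final__       # set by @final on Python 3.11+
True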

Related

Intercept magic method calls in python class

I am trying to make a class that wraps a value which will be used across multiple other objects. For computational reasons, the aim is for this wrapped value to be calculated only once, with the reference to the value passed around to its users. I don't believe this is possible in vanilla Python due to its object container model. Instead, my approach is a wrapper class that is passed around, defined as follows:
from typing import Any

class DynamicProperty():
    def __init__(self, value=None):
        # Value of the property
        self.value: Any = value

    def __repr__(self):
        # Use value's repr instead
        return repr(self.value)

    def __getattr__(self, attr):
        # Doesn't exist in wrapper, get it from the value instead
        return getattr(self.value, attr)
The following works as expected:
wrappedString = DynamicProperty("foo")
wrappedString.upper() # 'FOO'
wrappedFloat = DynamicProperty(1.5)
wrappedFloat.__add__(2) # 3.5
However, implicitly calling __add__ through normal syntax fails:
wrappedFloat + 2 # TypeError: unsupported operand type(s) for
# +: 'DynamicProperty' and 'float'
Is there a way to intercept these implicit method calls without explicitly defining magic methods for DynamicProperty to call the method on its value attribute?
Talking about "passing by reference" will only confuse you. Keep that terminology for languages where you have a choice in the matter and where it makes a difference. In Python you always pass objects around, and this passing is the equivalent of "passing by reference" for all objects: from None to int to a live asyncio network connection pool instance.
With that out of the way: the algorithm the language follows to retrieve attributes from an object is complicated and full of details; implementing __getattr__ is just the tip of the iceberg. Reading the document called "Data Model" in its entirety will give you a better grasp of all the mechanisms involved in retrieving attributes.
That said, here is how it works for "magic" or "dunder" methods (special methods with two underscores before and two after the name): when you use an operator that requires the method that implements it (like __add__ for +), the language checks the class of your object for the __add__ method, not the instance. And __getattr__ on the class can dynamically create attributes for instances of that class only.
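A quick demonstration of that asymmetry:
>>> class C: pass
...
>>> c = C()
>>> c.__add__ = lambda other: 42         # on the instance: the + operator ignores it
>>> c + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'C' and 'int'
>>> C.__add__ = lambda self, other: 42   # on the class: found
>>> c + 1
42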
But that is not the only problem: you could create a metaclass (inheriting from type) and put a __getattr__ method on that metaclass. For all querying you would do from Python, it would look like your object had the __add__ (or any other dunder method) in its class. However, for dunder methods Python does not go through the normal attribute-lookup mechanism: it "looks" directly at the class to check whether the dunder method is "physically" there. The memory structure that holds a class has a slot for each of the possible dunder methods, and each slot either refers to the corresponding method or is null (this is visible when coding in C; on the Python side, the default dir will show these methods when they exist and omit them when they don't). If the slot is not filled, Python simply says the object does not implement that operation, period.
The way to work around that with a proxy object like yours is to create a proxy class that either features the dunder methods from the class you want to wrap, or features all possible dunder methods and, upon being called, checks whether the underlying object actually implements the called method.
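A minimal sketch of the first approach, assuming a hand-picked list of dunder names to forward (the _forward helper is hypothetical, not from the question's code):
class DynamicProperty:
    def __init__(self, value=None):
        self.value = value

    def __getattr__(self, attr):
        # handles ordinary attributes; dunder lookups bypass this
        return getattr(self.value, attr)

def _forward(name):
    def method(self, *args):
        return getattr(self.value, name)(*args)
    return method

# put forwarding methods on the class itself, where operators look them up
for _name in ("__add__", "__radd__", "__sub__", "__mul__", "__eq__", "__str__"):
    setattr(DynamicProperty, _name, _forward(_name))

print(DynamicProperty(1.5) + 2)   # 3.5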
That is why "serious" code will rarely, if ever, offer true "transparent" proxy objects. There are exceptions, but from "Weakrefs", to "super()", to concurrent.futures, just to mention a few in the core language and stdlib, no one attempts a "fully working transparent proxy" - instead, the api is more like you call a ".value()" or ".result()" method on the wrapper to get to the original object itself.
However, it can be done, as I described above. I even have a small (long unmaintained) package on pypi that does that, wrapping a proxy for a future.
The code is at https://bitbucket.org/jsbueno/lelo/src/master/lelo/_lelo.py
The + operator in your case does not work because DynamicProperty does not inherit from float. See:
>>> class Foo(float):
...     pass
...
>>> Foo(1.5) + 2
3.5
So, you'll need to do some kind of dynamic inheritance:
def get_dynamic_property(instance):
    base = type(instance)

    class DynamicProperty(base):
        pass

    return DynamicProperty(instance)
wrapped_string = get_dynamic_property("foo")
print(wrapped_string.upper())
wrapped_float = get_dynamic_property(1.5)
print(wrapped_float + 2)
Output:
FOO
3.5

PyCharm: List all usages of all methods of a class

I'm aware that I can use 'Find Usages' to find what's calling a method in a class.
Is there a way of doing this for all methods on a given class (or indeed all methods in a file)?
Use case: I'm trying to refactor a god class that should almost certainly be several classes. It would be nice to be able to see which subset of the god-class methods each interacting class uses. It seems like PyCharm has done the hard part of this, but doesn't let me scale it up.
I'm using PyCharm 2016.1.2
https://intellij-support.jetbrains.com/hc/en-us/community/posts/206666319-See-all-callers-of-all-methods-of-a-class
This is possible, but you have to deal with abstraction; otherwise PyCharm doesn't know that the method in question belongs to your specific class. In other words: type hinting.
Any instance of that method being called through an abstraction layer that does not have type hinting will not be found.
Example:
# The class which has the method you're searching for.
class Inst(object):
    def mymethod(self):
        return

# Not the class you're looking for, but it too has a method of the same name.
class SomethingElse(object):
    def mymethod(self):
        return

# Option 1 -- assert hinting
def foo(inst):
    assert isinstance(inst, Inst)
    inst.mymethod()

# Option 2 -- docstring hinting
def bar(inst):
    """
    :param inst:
    :type inst: Inst
    :return:
    :rtype:
    """
    inst.mymethod()
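On Python 3, a plain function annotation gives PyCharm the same information without the assert or the docstring (a third option, not from the original answer):
# Option 3 -- function annotation (Python 3)
def baz(inst: Inst):
    inst.mymethod()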
Nowadays it would be rather easy for PyCharm to use type hints to match function calls correctly, since type hints have been part of the language since Python 3.5/3.6. Of course, partial type hints in a big codebase force some compromises when resolving the targets of method calls.
Here is an example of how type hints make it easy to do type-inference logic and resolve the correct target of a call:
def an_example():
    a: SoftagramAnalysisAction = SoftagramAnalysisAction(
        analysis_context=analysis_context,
        preprocessors=list(preprocessors),
        analyzers=list(analyzers),
        analysis_control_params=analysis_control_params)
    output = a.run()
In the example above, the local variable a is explicitly annotated with the type SoftagramAnalysisAction, which makes it clear that the run() call below targets the run method of that class (or of any of its subclasses).
The current version (2018.1) does not resolve these kinds of calls correctly, but I hope that will change in the future.

Why Is The property Decorator Only Defined For Classes?

tl;dr: How come property decorators work with class-level function definitions, but not with module-level definitions?
I was applying property decorators to some module-level functions, thinking they would allow me to invoke the methods by mere attribute lookup.
This was particularly tempting because I was defining a set of configuration functions, like get_port, get_hostname, etc., all of which could have been replaced with their simpler, more terse property counterparts: port, hostname, etc.
Thus, config.get_port() would just be the much nicer config.port
I was surprised when I found the following traceback, proving that this was not a viable option:
TypeError: int() argument must be a string or a number, not 'property'
I knew I had seen some precedent for property-like functionality at module level, as I had used it for scripting shell commands via the elegant but hacky pbs library.
The interesting hack below can be found in the pbs library source code. It enables property-like attribute lookups at module level, but it's horribly, horribly hackish.
# this is a thin wrapper around THIS module (we patch sys.modules[__name__]).
# this is in the case that the user does a "from pbs import whatever"
# in other words, they only want to import certain programs, not the whole
# system PATH worth of commands. in this case, we just proxy the
# import lookup to our Environment class
class SelfWrapper(ModuleType):
    def __init__(self, self_module):
        # this is super ugly to have to copy attributes like this,
        # but it seems to be the only way to make reload() behave
        # nicely. if i make these attributes dynamic lookups in
        # __getattr__, reload sometimes chokes in weird ways...
        for attr in ["__builtins__", "__doc__", "__name__", "__package__"]:
            setattr(self, attr, getattr(self_module, attr))

        self.self_module = self_module
        self.env = Environment(globals())

    def __getattr__(self, name):
        return self.env[name]
Below is the code for inserting this class into the import namespace. It actually patches sys.modules directly!
# we're being run as a stand-alone script, fire up a REPL
if __name__ == "__main__":
    globs = globals()
    f_globals = {}
    for k in ["__builtins__", "__doc__", "__name__", "__package__"]:
        f_globals[k] = globs[k]
    env = Environment(f_globals)
    run_repl(env)

# we're being imported from somewhere
else:
    self = sys.modules[__name__]
    sys.modules[__name__] = SelfWrapper(self)
Now that I've seen what lengths pbs has to go through, I'm left wondering why this facility of Python isn't built into the language directly. The property decorator in particular seems like a natural place to add such functionality.
Is there any particular reason or motivation why this isn't built in directly?
This is related to a combination of two factors: first, that properties are implemented using the descriptor protocol, and second that modules are always instances of a particular class rather than being instantiable classes.
This part of the descriptor protocol is implemented in object.__getattribute__ (the relevant code is PyObject_GenericGetAttr starting at line 1319). The lookup rules go like this:
1. Search through the class mro for a type dictionary that has name
2. If the first matching item is a data descriptor, call its __get__ and return its result
3. If name is in the instance dictionary, return its associated value
4. If there was a matching item from the class dictionaries and it was a non-data descriptor, call its __get__ and return the result
5. If there was a matching item from the class dictionaries, return it
6. Raise AttributeError
The key to this is at number 3 - if name is found in the instance dictionary (as it will be with modules), then its value will just be returned - it won't be tested for descriptorness, and its __get__ won't be called. This leads to this situation (using Python 3):
>>> class F:
...     def __getattribute__(self, attr):
...         print('hi')
...         return object.__getattribute__(self, attr)
...
>>> f = F()
>>> f.blah = property(lambda: 5)
>>> f.blah
hi
<property object at 0xbfa1b0>
You can see that .__getattribute__ is being invoked, but isn't treating f.blah as a descriptor.
It is likely that the reason for the rules being structured this way is an explicit tradeoff between the usefulness of allowing descriptors on instances (and, therefore, in modules) and the extra code complexity that this would lead to.
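For completeness: since the descriptor must live on a class, one workaround (available since CPython 3.5, which allows reassigning a module's __class__) is to swap the module's class for a ModuleType subclass that carries the property. A minimal sketch, with _ThisModule and port as hypothetical names:
import sys
from types import ModuleType

class _ThisModule(ModuleType):
    @property
    def port(self):
        # lives on the class, so the descriptor protocol applies
        return 8080

sys.modules[__name__].__class__ = _ThisModule
After this, importing the module and accessing config.port triggers the property.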
Properties are a feature specific to classes (new-style classes specifically), so by extension the property decorator can only be applied to class methods.
A new-style class is one that derives from object, i.e. class Foo(object):
Further info: Can modules have properties the same way that objects can?

When to inline definitions of metaclass in Python?

Today I came across a surprising definition of a metaclass in Python here, with the metaclass definition effectively inlined. The relevant part is:
class Plugin(object):
    class __metaclass__(type):
        def __init__(cls, name, bases, dict):
            type.__init__(cls, name, bases, dict)
            registry.append((name, cls))
When does it make sense to use such an inline definition?
Further Arguments:
An argument one way is that a metaclass created with this technique is not reusable elsewhere. A counter-argument is that a common pattern in using metaclasses is defining a metaclass, using it in one class, and then inheriting from that. For example, in a conservative metaclass the definition
class DeclarativeMeta(type):
    def __new__(meta, class_name, bases, new_attrs):
        cls = type.__new__(meta, class_name, bases, new_attrs)
        cls.__classinit__.im_func(cls, new_attrs)
        return cls

class Declarative(object):
    __metaclass__ = DeclarativeMeta
    def __classinit__(cls, new_attrs): pass
could have been written as
class Declarative(object):  # code not tested!
    class __metaclass__(type):
        def __new__(meta, class_name, bases, new_attrs):
            cls = type.__new__(meta, class_name, bases, new_attrs)
            cls.__classinit__.im_func(cls, new_attrs)
            return cls
    def __classinit__(cls, new_attrs): pass
Any other considerations?
Like every other form of nested class definition, a nested metaclass may be more "compact and convenient" (as long as you're OK with not reusing that metaclass except by inheritance) for many kinds of "production use", but can be somewhat inconvenient for debugging and introspection.
Basically, instead of giving the metaclass a proper, top-level name, you're going to end up with all custom metaclasses defined in a module being indistinguishable from each other based on their __module__ and __name__ attributes (which is what Python uses to form their repr if needed). Consider:
>>> class Mcl(type): pass
...
>>> class A: __metaclass__ = Mcl
...
>>> class B:
...     class __metaclass__(type): pass
...
>>> type(A)
<class '__main__.Mcl'>
>>> type(B)
<class '__main__.__metaclass__'>
IOW, if you want to examine "which type is class A" (a metaclass is the class's type, remember), you get a clear and useful answer -- it's Mcl in the main module. However, if you want to examine "which type is class B", the answer is not all that useful: it says it's __metaclass__ in the main module, but that's not even true:
>>> import __main__
>>> __main__.__metaclass__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__metaclass__'
>>>
...there is no such thing, actually; that repr is misleading and not very helpful;-).
A class's repr is essentially '%s.%s' % (c.__module__, c.__name__): a simple, useful, and consistent rule. But in many cases, such as the class statement not being unique at module scope, or not being at module scope at all (but rather within a function or class body), or not even existing (classes can of course be built without a class statement, by explicitly calling their metaclass), this can be somewhat misleading. The best solution is to avoid, as far as possible, those peculiar cases, except when substantial advantage can be obtained by using them. For example, consider:
>>> class A(object):
... def foo(self): print('first')
...
>>> x = A()
>>> class A(object):
... def foo(self): print('second')
...
>>> y = A()
>>> x.foo()
first
>>> y.foo()
second
>>> x.__class__
<class '__main__.A'>
>>> y.__class__
<class '__main__.A'>
>>> x.__class__ is y.__class__
False
With two class statements at the same scope, the second one rebinds the name (here, A), but existing instances refer to the first binding of the name by object, not by name. So both class objects remain: one accessible only through the type (or __class__ attribute) of its instances (if any; if there are none, the first object disappears). The two classes have the same name and module (and therefore the same representation), but they're distinct objects. Classes nested within class or function bodies, or created by directly calling the metaclass (including type), may cause similar confusion if debugging or introspection is ever called for.
So, nesting the metaclass is OK if you'll never need to debug or otherwise introspect that code, and it can be lived with if whoever does so understands these quirks (though it will never be as convenient as using a nice, real name, of course, just like debugging a function coded with lambda can never be as convenient as debugging one coded with def). By analogy with lambda vs. def, you can reasonably claim that an anonymous, "nested" definition is OK for metaclasses that are so utterly simple, such no-brainers, that no debugging or introspection will conceivably ever be required.
In Python 3, the "nested definition" just doesn't work -- there, a metaclass must be passed as a keyword argument to the class, as in class A(metaclass=Mcl):, so defining __metaclass__ in the body has no effect. I believe this also suggests that a nested metaclass definition in Python 2 code is probably appropriate only if you know for sure that code will never need to be ported to Python 3 (since you're making that port so much harder, and will need to de-nest the metaclass definition for the purpose) -- "throwaway" code, in other words, which won't be around in a few years when some version of Python 3 acquires huge, compelling advantages of speed, functionality, or third-party support, over Python 2.7 (the last ever version of Python 2).
Code that you expect to be throwaway, as the history of computing shows us, has an endearing habit of surprising you utterly, and being still around 20 years later (while perhaps the code you wrote around the same time "for the ages" is utterly forgotten;-). This would certainly seem to suggest avoiding nested definition of metaclasses.

A class subclass of itself. Why mutual subclassing is forbidden?

Complex question, I assume, but studying OWL opened a new perspective on life, the universe, and everything. I'm going philosophical here.
I am trying to achieve a class C which is a subclass of B, which in turn is a subclass of C. Just for fun, you know...
So here it is
>>> class A(object): pass
...
>>> class B(A): pass
...
>>> class C(B): pass
...
>>> B.__bases__
(<class '__main__.A'>,)
>>> B.__bases__ = (C,)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a __bases__ item causes an inheritance cycle
>>>
Clearly, Python is smart and forbids this. However, in OWL it is possible to define two classes to be mutual subclasses. The question is: what is the mind-boggling explanation for why this is allowed in OWL (which is not a programming language) and disallowed in programming languages?
Python doesn't allow it because there is no sensible way to do it. You could invent arbitrary rules about how to handle such a case (and perhaps some languages do), but since there is no actual gain in doing so, Python refuses to guess. Classes are required to have a stable, predictable method resolution order for a number of reasons, and so weird, unpredictable or surprising MROs are not allowed.
That said, there is a special case in Python: type and object. object is an instance of type, and type is a subclass of object. And of course, type is also an instance of type (since it's a subclass of object). This might be why OWL allows it: you need to start a class/metaclass hierarchy in some singularity, if you want everything to be an object and all objects to have a class.
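That circular relationship between type and object is easy to verify in the interpreter:
>>> isinstance(object, type)
True
>>> issubclass(type, object)
True
>>> isinstance(type, type)
True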
The MRO scheme implemented in Python (as of 2.3) forbids cyclic subclassing. Valid MROs are guaranteed to satisfy "local precedence" and "monotonicity"; cyclic subclassing would break monotonicity.
This issue is discussed in the section entitled "Bad Method Resolution Orders" of the Python 2.3 method resolution order documentation.
Part of this "disconnect" is because OWL describes an open world ontology. An Ontology has little or nothing to do with a program, other than a program can manipulate an ontology.
Trying to relate OWL concepts to programming languages is like trying to relate A Pianist and A Piano Sonata.
The sonata doesn't really have a concrete manifestation until someone is playing it, ideally a pianist, but not necessarily. Until it's being played, it's just potential relationships among notes manifested as sounds. When it's being played, some of the actual relationships will be relevant to you, the listener. Some won't be relevant to the listener.
I think the answer is: "When you construct class C, it must create an instance of class B, which must create an instance of class C," and so on. This will never end. This is forbidden in most languages (in fact, I don't know of a language that allows it).
You can only create an object with a 'reference' to another object that can initially be null.
For a semantic reasoner, if A is a subclass of B, and B is a subclass of A, then the classes can be considered equivalent. They are not the "same", but from a reasoning perspective, if I can reason an individual is (or is not) a member of the class A, I can reason the individual is (or is not) a member of the class B. The classes A and B are semantically equivalent, which is what you were able to express with OWL.
I'm sure someone can come up with an example where this makes sense. However, I guess this restriction is easier to implement and no less powerful.
E.g., let's say class A holds fields a and b, and class C holds b and c. Then the view on things from C would be: A.a, C.b, C.c, and the view from A would be: A.a, A.b, C.c.
Just moving b into a common base class is far easier to understand and implement, though.
