A class subclass of itself. Why is mutual subclassing forbidden? - python

Complex question I assume, but studying OWL opened a new perspective on life, the universe and everything. I'm going philosophical here.
I am trying to achieve a class C which is a subclass of B, which in turn is a subclass of C. Just for fun, you know...
So here it is
>>> class A(object): pass
...
>>> class B(A): pass
...
>>> class C(B): pass
...
>>> B.__bases__
(<class '__main__.A'>,)
>>> B.__bases__ = (C,)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: a __bases__ item causes an inheritance cycle
>>>
Clearly, Python is smart and forbids this. However, in OWL it is possible to define two classes to be mutual subclasses. The question is: what is the mind-boggling explanation for why this is allowed in OWL (which is not a programming language) and disallowed in programming languages?

Python doesn't allow it because there is no sensible way to do it. You could invent arbitrary rules about how to handle such a case (and perhaps some languages do), but since there is no actual gain in doing so, Python refuses to guess. Classes are required to have a stable, predictable method resolution order for a number of reasons, and so weird, unpredictable or surprising MROs are not allowed.
That said, there is a special case in Python: type and object. object is an instance of type, and type is a subclass of object. And of course, type is also an instance of type (since it's a subclass of object). This might be why OWL allows it: you need to start a class/metaclass hierarchy in some singularity, if you want everything to be an object and all objects to have a class.
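The type/object special case can be checked directly in an interpreter. A small sketch using only built-ins; note that the circularity sits in the instance-of relation, while the inheritance chain itself stays acyclic:

```python
# object is an instance of type, and type is a subclass of object.
print(isinstance(object, type))   # True
print(issubclass(type, object))   # True

# type is its own metaclass and, like everything else, an instance of object.
print(isinstance(type, type))     # True
print(isinstance(type, object))   # True

# The inheritance chains are still plain, cycle-free linearizations.
print(type.__mro__)               # (<class 'type'>, <class 'object'>)
print(object.__mro__)             # (<class 'object'>,)
```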

The MRO scheme implemented in Python (as of 2.3) forbids cyclic subclassing. Valid MROs are guaranteed to satisfy "local precedence" and "monotonicity". Cyclic subclassing would break monotonicity.
This issue is discussed in the section entitled "Bad Method Resolution Orders" of "The Python 2.3 Method Resolution Order".
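A cycle is the extreme case of an ordering that cannot be linearized. A milder case that Python also rejects is sketched below (Python 3 syntax; the class names are only for illustration): listing a base before its own subclass makes the two ordering constraints contradict each other.

```python
class A: pass
class B(A): pass

# Listing A before B asks for A to come first (local precedence), while
# inheritance requires B to come before its base A: no linearization exists.
try:
    class C(A, B): pass
except TypeError as exc:
    mro_error = str(exc)
    print(mro_error)

# A consistent ordering is accepted and yields a predictable MRO.
class D(B, A): pass
print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'A', 'object']
```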

Part of this "disconnect" is because OWL describes an open world ontology. An ontology has little or nothing to do with a program, other than that a program can manipulate an ontology.
Trying to relate OWL concepts to programming languages is like trying to relate a pianist and a piano sonata.
The sonata doesn't really have a concrete manifestation until someone is playing it -- ideally a pianist, but not necessarily. Until it's being played, it's just potential relationships among notes manifested as sounds. When it's being played, some of the actual relationships will be relevant to you, the listener. Some won't be relevant to the listener.

I think the answer is: "When you construct class C, it must create an instance of class B, which must create an instance of class C, and so on." This will never end. This is forbidden in most languages (in fact, I don't know of any language that allows it).
You can only create an object with a 'reference' to another object, and that reference can initially be null.

For a semantic reasoner, if A is a subclass of B, and B is a subclass of A, then the classes can be considered equivalent. They are not the "same", but from a reasoning perspective, if I can reason an individual is (or is not) a member of the class A, I can reason the individual is (or is not) a member of the class B. The classes A and B are semantically equivalent, which is what you were able to express with OWL.
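One way to picture the reasoner's view is to treat each class as the set of its members, so that "subclass of" becomes "subset of". The sketch below is only an analogy written in Python, not how an OWL reasoner works internally, and the individuals are made up for illustration.

```python
# Hypothetical individuals; each class is modelled as the set of its members.
A = {"alice", "bob"}
B = {"bob", "alice"}

a_sub_b = A <= B   # every member of A is a member of B
b_sub_a = B <= A   # every member of B is a member of A

# Mutual inclusion forces the two extensions to coincide.
print(a_sub_b and b_sub_a)  # True
print(A == B)               # True
print(A is B)               # False: equivalent, yet still two distinct objects
```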

I'm sure someone can come up with an example where this makes sense. However, I guess this restriction is easier and not less powerful.
E.g., let's say class A holds fields a and b. Class C holds b and c. Then the view on things from C would be: A.a, C.b, C.c and the view from A would be: A.a, A.b, C.c.
Just moving b into a common base class is far easier to understand and implement, though.
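The common-base-class alternative could look roughly like this. It is a minimal sketch: the name Common and the field values are assumptions chosen for illustration, while A, C, a, b and c follow the description above.

```python
class Common:
    def __init__(self):
        self.b = "shared b"     # the field both classes need

class A(Common):
    def __init__(self):
        super().__init__()
        self.a = "a"

class C(Common):
    def __init__(self):
        super().__init__()
        self.c = "c"

print(sorted(vars(A())))        # ['a', 'b']
print(sorted(vars(C())))        # ['b', 'c']
```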

Related

Reusing method from another class without inheritance or delegation in Python

I want to use a method from another class.
Neither inheritance nor delegation is a good choice (to my understanding) because the existing class is too complicated to override and too expensive to instantiate.
Note that modifying the existing class is not allowed (legacy project, you know).
I came up with a way:
class Old:
    def a(self):
        print('Old.a')
class Mine:
    b = Old.a
and it shows
>>> Mine().b()
Old.a
>>> Mine().b
<bound method Old.a of <__main__.Mine object at 0x...>>
It seems fine.
And I tried with some more complicated cases including property modification (like self.foo = 'bar'), everything seems okay.
My question:
What is actually happening when I define methods like that?
Will that safely do the trick for my need mentioned above?
Explanation
What's happening is that you are defining a callable class property of class Mine called b. However, this works:
m = Mine()
m.b()
But this won't:
Mine.b()
Why doesn't the second way work?
When you call a function of a class, Python expects the first argument to be the actual object upon which the function was called. When you call it on an instance, that instance is automatically passed as the self argument behind the scenes. Since we called Mine.b() without an instance, no self was passed into b().
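The difference between the two calls can be made visible in Python 3. In this sketch Old.a returns its string instead of printing it, purely so the result is easy to inspect; everything else mirrors the question.

```python
class Old:
    def a(self):
        return 'Old.a'

class Mine:
    b = Old.a  # the same function object, stored under a new name

# In Python 3, looking b up on the class yields the plain function ...
print(Mine.b is Old.a)          # True
# ... while looking it up on an instance binds that instance as self.
m = Mine()
print(m.b())                    # Old.a
print(m.b.__self__ is m)        # True

# Calling through the class supplies no self, so it must be passed by hand.
try:
    Mine.b()
except TypeError as exc:
    print("TypeError:", exc)
print(Mine.b(m))                # Old.a
```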
Will this "do the trick"?
As for whether this will do the trick, that depends.
As long as Mine can behave the same way as Old, Python won't complain. This is because the Python interpreter does not care about the "type" of self. As long as it walks like a duck and quacks like a duck, it's a duck (see duck typing). However, can you guarantee this? What if someone goes and changes the implementation of Old.a? Most of the time, as a client of another system, we have no say when the private implementation of a function changes.
A simpler solution might be to pull out the functionality you are missing into a separate module. Yes, there is some code duplication but at least you can be confident the code won't change from under you.
Ultimately, if you can guarantee the behavior of Old and Mine will be similar enough for the purposes of Old.a, python really shouldn't care.
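What "similar enough" means in practice is that the borrowing class must provide every attribute the borrowed method touches. A hedged sketch, where describe, name and Broken are hypothetical names invented for this illustration:

```python
class Old:
    def __init__(self):
        self.name = "old"      # state that Old.describe relies on
    def describe(self):
        return "I am " + self.name

class Mine:
    describe = Old.describe    # borrowed without inheriting or instantiating Old
    def __init__(self):
        self.name = "mine"     # Mine must supply what the borrowed method reads

class Broken:
    describe = Old.describe    # no 'name' attribute is ever set

print(Mine().describe())       # I am mine
try:
    Broken().describe()
except AttributeError as exc:
    print("AttributeError:", exc)
```

If a later version of Old.describe starts reading a new attribute, Mine would break in the same way Broken does here.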

How to properly get `next` to use overridden instance method `__next__`?

Consider the following snippet.
class A:
    def __next__(self):
        return 2
a = A()
print(next(a),a.__next__()) # prints "2 2" as expected
a.__next__ = lambda: 4
print(next(a),a.__next__()) # prints "2 4". I expected "4 4"
Clearly, the property __next__ is updated by the patching, but the inbuilt next function does not resolve that.
The Python 3 docs on the data model say:
For instance, if a class defines a method named __getitem__(), and x is an instance of this class, then x[i] is roughly equivalent to type(x).__getitem__(x, i).
From this, I came up with a hack as below
class A:
    def next_(self):
        return 2
    def __next__(self):
        return self.next_()
a = A()
print(next(a),a.__next__()) # 2 2
a.next_ = lambda: 4
print(next(a),a.__next__()) # 4 4
The code works, but at the expense of another layer of indirection via another next_-method.
My question is: What is the proper way to monkey-patch the __next__ instance method? What is the rationale behind this design in python?
You can't. Special methods are special: they cannot be overridden at the instance level. Period. If you want to "customize" the instance behaviour, the correct way to do it is to simply have a proper implementation instead of a bogus implementation that you swap at runtime. Change the value instead of the method.
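"Change the value instead of the method" could look like the following sketch. The attribute name value is an assumption chosen for illustration; __next__ is defined once on the class and reads per-instance state.

```python
class A:
    def __init__(self, value=2):
        self.value = value      # per-instance state instead of a per-instance method
    def __iter__(self):
        return self
    def __next__(self):
        return self.value

a = A()
first = (next(a), a.__next__())
a.value = 4                     # change the data, not the method
second = (next(a), a.__next__())
print(first, second)            # (2, 2) (4, 4)
```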
The rationale can be found in The History of Python - Adding Support for User-defined Classes at the end of following section:
Special Methods
As briefly mentioned in the last section, one of my main goals was to
keep the implementation of classes simple. In most object oriented
languages, there are a variety of special operators and methods that
only apply to classes. For example, in C++, there is a special syntax
for defining constructors and destructors that is different than the
normal syntax used to define ordinary function and methods.
I really didn't want to introduce additional syntax to handle special
operations for objects. So instead, I handled this by simply mapping
special operators to a predefined set of "special method" names such
as __init__ and __del__. By defining methods with these names, users
could supply code related to the construction and destruction of
objects.
I also used this technique to allow user classes to redefine the
behavior of Python's operators. As previously noted, Python is
implemented in C and uses tables of function pointers to implement
various capabilities of built-in objects (e.g., “get attribute”, “add”
and “call”). To allow these capabilities to be defined in user-defined
classes, I mapped the various function pointers to special method
names such as __getattr__, __add__, and __call__. There is a direct
correspondence between these names and the tables of function pointers
one has to define when implementing new Python objects in C.
In summary: types defined in C have a structure that contains pointers to special methods. Guido wanted to keep consistency with types defined in Python and so their special methods end up being used at the class level.
Could the implementation always follow the lookup order? Yes... at a huge cost in speed, since now even the C code would have to first perform a dictionary lookup on the instance to ensure whether or not a special method is defined and call that. Given that special methods are called often, especially for built-in types, it makes sense to just have a direct pointer to the function in the class. The behaviour of the python side is just consistent with this.
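The class-level lookup can be observed from Python code. In this small sketch the instance attribute is ignored by next(), whereas patching the class itself is picked up:

```python
class A:
    def __next__(self):
        return 2

a = A()
a.__next__ = lambda: 4          # stored in the instance __dict__ only

print(a.__next__())             # 4: explicit attribute access finds the instance attribute
print(next(a))                  # 2: next() goes through type(a)
print(type(a).__next__(a))      # 2: roughly what next(a) does

# Patching the class affects every instance, including calls through next().
A.__next__ = lambda self: 6
del a.__next__
print(next(a))                  # 6
```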
Python was never bright in the performance sector. Your suggested implementation would run extremely slowly, especially 20 years ago, when Python was designed on far less powerful machines and when JITs were extremely rare and not so well understood (compared to the present).

Python: Assigning custom attributes on objects [duplicate]

So, I was playing around with Python while answering this question, and I discovered that this is not valid:
o = object()
o.attr = 'hello'
due to an AttributeError: 'object' object has no attribute 'attr'. However, with any class inherited from object, it is valid:
class Sub(object):
    pass
s = Sub()
s.attr = 'hello'
Printing s.attr displays 'hello' as expected. Why is this the case? What in the Python language specification specifies that you can't assign attributes to vanilla objects?
For other workarounds, see How can I create an object and add attributes to it?.
To support arbitrary attribute assignment, an object needs a __dict__: a dict associated with the object, where arbitrary attributes can be stored. Otherwise, there's nowhere to put new attributes.
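The presence or absence of __dict__ is easy to check. A short sketch reusing the Sub class from the question:

```python
class Sub(object):
    pass

o = object()
s = Sub()

print(hasattr(o, '__dict__'))   # False: nowhere to store new attributes
print(hasattr(s, '__dict__'))   # True

s.attr = 'hello'
print(s.__dict__)               # {'attr': 'hello'}

try:
    o.attr = 'hello'
except AttributeError as exc:
    print("AttributeError:", exc)
```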
An instance of object does not carry around a __dict__ -- if it did, before the horrible circular dependence problem (since dict, like most everything else, inherits from object;-), this would saddle every object in Python with a dict, which would mean an overhead of many bytes per object that currently doesn't have or need a dict (essentially, all objects that don't have arbitrarily assignable attributes don't have or need a dict).
For example, using the excellent pympler project (you can get it via svn from here), we can do some measurements...:
>>> from pympler import asizeof
>>> asizeof.asizeof({})
144
>>> asizeof.asizeof(23)
16
You wouldn't want every int to take up 144 bytes instead of just 16, right?-)
Now, when you make a class (inheriting from whatever), things change...:
>>> class dint(int): pass
...
>>> asizeof.asizeof(dint(23))
184
...the __dict__ is now added (plus, a little more overhead) -- so a dint instance can have arbitrary attributes, but you pay quite a space cost for that flexibility.
So what if you wanted ints with just one extra attribute foobar...? It's a rare need, but Python does offer a special mechanism for the purpose...
>>> class fint(int):
...     __slots__ = 'foobar',
...     def __init__(self, x): self.foobar=x+100
...
>>> asizeof.asizeof(fint(23))
80
...not quite as tiny as an int, mind you! (or even the two ints, one the self and one the self.foobar -- the second one can be reassigned), but surely much better than a dint.
When the class has the __slots__ special attribute (a sequence of strings), then the class statement (more precisely, the default metaclass, type) does not equip every instance of that class with a __dict__ (and therefore the ability to have arbitrary attributes), just a finite, rigid set of "slots" (basically places which can each hold one reference to some object) with the given names.
In exchange for the lost flexibility, you save a lot of bytes per instance (probably meaningful only if you have zillions of instances gallivanting around, but there are use cases for that).
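The rigidity of slots can be seen without any third-party tooling. A minimal sketch, where Point and its fields are hypothetical names used only for illustration:

```python
class Point:
    __slots__ = ('x', 'y')      # only these two attribute names are allowed
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x = 10                        # slots can be reassigned
print(hasattr(p, '__dict__'))   # False

try:
    p.z = 3                     # not a declared slot
except AttributeError as exc:
    print("AttributeError:", exc)
```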
As other answerers have said, an object does not have a __dict__. object is the base class of all types, including int or str. Thus whatever is provided by object will be a burden to them as well. Even something as simple as an optional __dict__ would need an extra pointer for each value; this would waste additional 4-8 bytes of memory for each object in the system, for a very limited utility.
Instead of creating an instance of a dummy class, in Python 3.3+ you can (and should) use types.SimpleNamespace for this.
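A short sketch of types.SimpleNamespace as an attribute bag (the attribute names are arbitrary):

```python
from types import SimpleNamespace

o = SimpleNamespace()
o.attr = 'hello'
print(o.attr)                   # hello

# Attributes can also be given up front as keyword arguments.
p = SimpleNamespace(x=1, y=2)
print(p)                        # namespace(x=1, y=2)
```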
It is simply due to optimization.
Dicts are relatively large.
>>> import sys
>>> sys.getsizeof((lambda:1).__dict__)
140
Most (maybe all) classes that are defined in C do not have a dict for optimization.
If you look at the source code you will see that there are many checks to see if the object has a dict or not.
So, investigating my own question, I discovered this about the Python language: you can inherit from things like int, and you see the same behaviour:
>>> class MyInt(int):
...     pass
...
>>> x = MyInt()
>>> print x
0
>>> x.hello = 4
>>> print x.hello
4
>>> x = x + 1
>>> print x
1
>>> print x.hello
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'int' object has no attribute 'hello'
I assume the error at the end is because the add function returns an int, so I'd have to override functions like __add__ and such in order to retain my custom attributes. But this all now makes sense to me (I think), when I think of "object" like "int".
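Overriding __add__ along those lines might look like this. It is a sketch in Python 3 syntax that only handles integer operands on the right-hand side; other operators (__radd__, __sub__, and so on) would need the same treatment.

```python
class MyInt(int):
    def __add__(self, other):
        result = MyInt(int(self) + other)      # rewrap so the subclass survives
        result.__dict__.update(self.__dict__)  # carry custom attributes over
        return result

x = MyInt()
x.hello = 4
x = x + 1
print(x, type(x).__name__, x.hello)            # 1 MyInt 4
```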
https://docs.python.org/3/library/functions.html#object :
Note: object does not have a __dict__, so you can’t assign arbitrary attributes to an instance of the object class.
It's because object is a "type", not a class. In general, all classes that are defined in C extensions (like all the built in datatypes, and stuff like numpy arrays) do not allow addition of arbitrary attributes.
This is (IMO) one of the fundamental limitations with Python - you can't re-open classes. I believe the actual problem, though, is caused by the fact that classes implemented in C can't be modified at runtime... subclasses can, but not the base classes.

Python's equivalent of .Net's sealed class

Does Python have anything similar to a sealed class? I believe it's also known as a final class in Java.
In other words, in Python, can we mark a class so it can never be inherited or expanded upon? Did Python ever consider having such a feature? Why?
Disclaimers
Actually trying to understand why sealed classes even exist. Answer here (and in many, many, many, many, many, really many other places) did not satisfy me at all, so I'm trying to look from a different angle. Please, avoid theoretical answers to this question, and focus on the title! Or, if you'd insist, at least please give one very good and practical example of a sealed class in csharp, pointing what would break big time if it was unsealed.
I'm no expert in either language, but I do know a bit of both. Just yesterday while coding on csharp I got to know about the existence of sealed classes. And now I'm wondering if python has anything equivalent to that. I believe there is a very good reason for its existence, but I'm really not getting it.
You can use a metaclass to prevent subclassing:
class Final(type):
    def __new__(cls, name, bases, classdict):
        for b in bases:
            if isinstance(b, Final):
                raise TypeError("type '{0}' is not an acceptable base type".format(b.__name__))
        return type.__new__(cls, name, bases, dict(classdict))
class Foo:
    __metaclass__ = Final  # Python 2 syntax; in Python 3 write: class Foo(metaclass=Final)
class Bar(Foo):
    pass
gives:
>>> class Bar(Foo):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 5, in __new__
TypeError: type 'Foo' is not an acceptable base type
The __metaclass__ = Final line makes the Foo class 'sealed'.
Note that you'd use a sealed class in .NET as a performance measure; since there won't be any subclassing, methods can be addressed directly. Python method lookups work very differently, and there is no advantage or disadvantage, when it comes to method lookups, to using a metaclass like the above example.
Before we talk Python, let's talk "sealed":
I, too, have heard that the advantage of .Net sealed / Java final / C++ entirely-nonvirtual classes is performance. I heard it from a .Net dev at Microsoft, so maybe it's true. If you're building a heavy-use, highly-performance-sensitive app or framework, you may want to seal a handful of classes at or near the real, profiled bottleneck. Particularly classes that you are using within your own code.
For most applications of software, sealing a class that other teams consume as part of a framework/library/API is kinda...weird.
Mostly because there's a simple work-around for any sealed class, anyway.
I teach "Essential Test-Driven Development" courses, and in those three languages, I suggest consumers of such a sealed class wrap it in a delegating proxy that has the exact same method signatures, but they're override-able (virtual), so devs can create test-doubles for these slow, nondeterministic, or side-effect-inducing external dependencies.
[Warning: below snark intended as humor. Please read with your sense of humor subroutines activated. I do realize that there are cases where sealed/final are necessary.]
The proxy (which is not test code) effectively unseals (re-virtualizes) the class, resulting in v-table look-ups and possibly less efficient code (unless the compiler optimizer is competent enough to in-line the delegation). The advantages are that you can test your own code efficiently, saving living, breathing humans weeks of debugging time (in contrast to saving your app a few million microseconds) per month... [Disclaimer: that's just a WAG. Yeah, I know, your app is special. ;-]
So, my recommendations: (1) trust your compiler's optimizer, (2) stop creating unnecessary sealed/final/non-virtual classes that you built in order to either (a) eke out every microsecond of performance at a place that is likely not your bottleneck anyway (the keyboard, the Internet...), or (b) create some sort of misguided compile-time constraint on the "junior developers" on your team (yeah...I've seen that, too).
Oh, and (3) write the test first. ;-)
Okay, yes, there's always link-time mocking, too (e.g. TypeMock). You got me. Go ahead, seal your class. Whatevs.
Back to Python: The fact that there's a hack rather than a keyword is probably a reflection of the pure-virtual nature of Python. It's just not "natural."
By the way, I came to this question because I had the exact same question. Working on the Python port of my ever-so-challenging and realistic legacy-code lab, and I wanted to know if Python had such an abominable keyword as sealed or final (I use them in the Java, C#, and C++ courses as a challenge to unit testing). Apparently it doesn't. Now I have to find something equally challenging about untested Python code. Hmmm...
Python does have classes that can't be extended, such as bool or NoneType:
>>> class ExtendedBool(bool):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type 'bool' is not an acceptable base type
However, such classes cannot be created from Python code. (In the CPython C API, they are created by not setting the Py_TPFLAGS_BASETYPE flag.)
Python 3.6 introduced the __init_subclass__ special method; raising an error from it will prevent creating subclasses. For older versions, a metaclass can be used.
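A minimal sketch of the __init_subclass__ approach (Python 3.6+). The class name Sealed and the error message are assumptions for illustration; the message simply mimics the one CPython uses for bool.

```python
class Sealed:
    def __init_subclass__(cls, **kwargs):
        # Runs whenever a class statement lists Sealed as a base.
        raise TypeError("type 'Sealed' is not an acceptable base type")

try:
    class Child(Sealed):
        pass
except TypeError as exc:
    error = str(exc)
    print(error)
```

Sealed itself can still be instantiated and used normally; only the creation of subclasses is blocked.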
Still, the most “Pythonic” way to limit usage of a class is to document how it should not be used.
Similar in purpose to a sealed class, and useful for reducing memory usage (see "Usage of __slots__?"), is the __slots__ attribute, which prevents monkey patching. Because it is too late to put __slots__ into the class by the time the metaclass __new__ is called, we have to put it into the namespace at the earliest possible point, i.e. during __prepare__. Additionally, this raises the TypeError a little earlier. Using mcs for the isinstance comparison removes the need to hardcode the metaclass name inside itself. The disadvantage is that all unslotted attributes are read-only. Therefore, if we want to set specific attributes during initialization or later, they have to be slotted explicitly. This is feasible, e.g., by using a dynamic metaclass that takes slots as an argument.
def Final(slots=()):
    if "__dict__" in slots:
        raise ValueError("Having __dict__ in __slots__ breaks the purpose")
    class _Final(type):
        @classmethod
        def __prepare__(mcs, name, bases, **kwargs):
            # Refuse any base that was itself created by this metaclass.
            for b in bases:
                if isinstance(b, mcs):
                    msg = "type '{0}' is not an acceptable base type"
                    raise TypeError(msg.format(b.__name__))
            namespace = {"__slots__": slots}
            return namespace
    return _Final
class Foo(metaclass=Final(slots=["_z"])):
    y = 1
    def __init__(self, z=1):
        self.z = z
    @property
    def z(self):
        return self._z
    @z.setter
    def z(self, val: int):
        if not isinstance(val, int):
            raise TypeError("Value must be an integer")
        else:
            self._z = val
    def foo(self):
        print("I am sealed against monkey patching")
where the attempt to overwrite foo.foo will throw AttributeError: 'Foo' object attribute 'foo' is read-only and attempting to add foo.x will throw AttributeError: 'Foo' object has no attribute 'x'. The limiting power of __slots__ would be broken when inheriting, but because Foo has the metaclass Final, you can't inherit from it. It would also be broken when __dict__ is in slots, so we throw a ValueError in that case. To conclude, defining setters and getters for slotted properties makes it possible to limit how the user can overwrite them.
foo = Foo()
# attributes are accessible
foo.foo()
print(foo.y)
# changing slotted attributes is possible
foo.z = 2
# %%
# overwriting unslotted attributes won't work
foo.foo = lambda:print("Guerilla patching attempt")
# assigning an invalid value to a property with a validating setter won't work
foo.z = foo.foo
# expanding won't work
foo.x = 1
# %% inheriting won't work
class Bar(Foo):
    pass
In that regard, Foo could not be inherited or expanded upon. The disadvantage is that all attributes have to be explicitly slotted, or are limited to a read-only class variable.
Python 3.8 has that feature in the form of the typing.final decorator, which is checked by static type checkers rather than enforced at runtime:
from typing import final
class Base:
    @final
    def done(self) -> None:
        ...
class Sub(Base):
    def done(self) -> None:  # Error reported by type checker
        ...
@final
class Leaf:
    ...
class Other(Leaf):  # Error reported by type checker
    ...
See https://docs.python.org/3/library/typing.html#typing.final
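Keep in mind that typing.final is a hint for static type checkers. The short sketch below (Python 3.8+) shows that the interpreter itself does not stop the subclass from being created:

```python
from typing import final

@final
class Leaf:
    pass

# A type checker such as mypy would flag this, but the interpreter accepts it.
class Other(Leaf):
    pass

print(issubclass(Other, Leaf))  # True
```

If subclassing has to fail at runtime as well, a metaclass or __init_subclass__ check is still needed.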

Classes: how I understand them. Correct me if I'm wrong, please

I really hope this is not a question posed by millions of newbies, but my search didn't really give me a satisfying answer.
So my question is fairly simple. Are classes basically a container for functions with its own namespace? What other functions do they have besides providing a separate namespace and holding functions while making them callable as class attributes? I'm asking in a Python context.
Oh and thanks for the great help most of you have been!
More importantly than functions, class instances hold data attributes, allowing you to define new data types beyond what is built into the language; and
they support inheritance and duck typing.
For example, here's a moderately useful class. Since Python files (created with open) don't remember their own name, let's make a file class that does.
class NamedFile(object):
    def __init__(self, name):
        self._f = open(name)
        self.name = name
    def readline(self):
        return self._f.readline()
Had Python not had classes, you'd probably be working with dicts instead:
def open_file(name):
    return {"name": name, "f": open(name)}
Needless to say, calling myfile["f"].readline() all the time will cause your fingers to hurt at some point. You could of course introduce a function readline in a NamedFile module (namespace), but then you'd always have to use that exact function. By contrast, NamedFile instances can be used anywhere you need an object with a readline method, so it would be a plug-in replacement for file in many situations. That's called polymorphism, one of the biggest benefits of OO/class-based programming.
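The polymorphism point can be sketched as follows. To keep the example self-contained, this variant of NamedFile takes an already-open file-like object as a second argument (an assumption made here so that nothing has to exist on disk), and first_line is a hypothetical helper.

```python
import io

class NamedFile(object):
    def __init__(self, name, f):
        self._f = f             # any object with a readline method
        self.name = name
    def readline(self):
        return self._f.readline()

def first_line(source):
    # Works with anything that has readline(): real files, StringIO, NamedFile.
    return source.readline().rstrip("\n")

plain = io.StringIO("hello\nworld\n")
named = NamedFile("greeting.txt", io.StringIO("hello\nworld\n"))
print(first_line(plain))        # hello
print(first_line(named))        # hello
print(named.name)               # greeting.txt
```

The caller never checks the type of source; it only relies on the readline method being there.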
(Also, dict is a class, so using it violates the assumption that there are no classes :)
In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:
>>> class ObjectCreator(object):
...     pass
...
>>> my_object = ObjectCreator()
>>> print my_object
<__main__.ObjectCreator object at 0x8974f2c>
But classes are more than that in Python. Classes are objects too.
Yes, objects.
As soon as you use the keyword class, Python executes it and creates an OBJECT. The instruction:
>>> class ObjectCreator(object):
...     pass
...
creates in memory an object with the name ObjectCreator.
This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.
But still, it's an object, and therefore:
you can assign it to a variable
you can copy it
you can add attributes to it
you can pass it as a function parameter
e.g.:
>>> print ObjectCreator # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
...     print o
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print hasattr(ObjectCreator, 'new_attribute')
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print hasattr(ObjectCreator, 'new_attribute')
True
>>> print ObjectCreator.new_attribute
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print ObjectCreatorMirror.new_attribute
foo
>>> print ObjectCreatorMirror()
<__main__.ObjectCreator object at 0x8997b4c>
Classes (or objects) are used to provide encapsulation of data and operations that can be performed on that data.
They don't provide namespacing in Python per se; module imports provide the same type of stuff and a module can be entirely functional rather than object oriented.
You might gain some benefit from looking at OOP With Python, Dive into Python, Chapter 5. Objects and Object Oriented Programming or even just the Wikipedia article on object oriented programming
A class is the definition of an object. In this sense, the class provides a namespace of sorts, but that is not the true purpose of a class. The true purpose is to define what the object will 'look like' - what the object is capable of doing (methods) and what it will know (properties).
Note that my answer is intended to provide a sense of understanding on a relatively non-technical level, which is what my initial trouble was with understanding classes. I'm sure there will be many other great answers to this question; I hope this one adds to your overall understanding.
