Code first
The goal is to design OuterBase such that the following passes:
class Outer(OuterBase):
    def __init__(self, foo: str) -> None:
        self.foo = foo

    class Inner:
        outer: Outer

        def get_foo(self) -> str:
            return self.outer.foo


inner = Outer("bar").Inner()
assert inner.get_foo() == "bar"
My question is closely related to this one:
How to access outer class from an inner class?
But it is decidedly different in one relevant nuance: that question is about how to access the outer class from inside the inner class, whereas this question is about access to a specific instance of the outer class.
Question
Given an Outer and an Inner class, where the latter is defined in the body of the former, and given an instance of Outer, can we pass that instance to the Inner constructor so as to bind the Inner instance to that Outer instance?
So if we did outer = Outer() and then inner = outer.Inner(), there would then be a reference to outer in an attribute of inner.
Secondary requirements
1) Simplest possible usage (minimal boilerplate)
It would be ideal if the entire logic facilitating this binding of the instances were "hidden" in the Outer class.
There would then be some OuterBase class that a user could inherit from, and all they would have to do is define the Inner class (with the agreed-upon fixed name) and expect its outer attribute (also agreed upon) to hold a reference to an instance of the outer class.
Solutions involving decorating the inner class, explicitly passing it a metaclass, defining a special __init__ method, and so on would be considered sub-optimal.
2) Type safety (to the greatest degree possible)
The code (both of the implementation and the usage) should ideally pass mypy --strict checks and obfuscate dynamic typing as little as possible.
Hypothetical real-life use case
Say the inner class is used as a settings container for instances of the outer class, similar to how Pydantic is designed. In Pydantic (possibly for various reasons) the inner Config class is a class-wide configuration, i.e. it applies to all instances of the model (the outer class). With a setup like the one I am asking about here, the usage of Pydantic models would remain unchanged, but deviations in configuration would become possible on the instance level by binding a specific Config instance to a specific model instance.
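To make that concrete, here is a purely hypothetical sketch of what such usage could look like (the Model base class and the strict option are made up for illustration; this is not actual Pydantic API):

class User(Model):  # Model would play the role of OuterBase
    name: str

    class Config:  # the agreed-upon inner class name
        strict: bool = False


user = User(name="x")
settings = user.Config()       # a Config bound to this specific model instance
assert settings.outer is user  # per-instance deviation becomes possible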
A solution is to define a custom metaclass for OuterBase, which replaces the Inner class defined inside the body of any Outer with an instance of a special descriptor.
That descriptor holds a reference to the actual Inner class. It also retains a reference to the Outer instance from which its __get__ method is called, and its generic __call__ method returns an instance of Inner after assigning the Outer instance to that inner instance's outer attribute.
from __future__ import annotations

from typing import Any, ClassVar, Generic, Optional, Protocol, TypeVar


class InnerProtocol(Protocol):
    outer: object


T = TypeVar("T", bound=InnerProtocol)


class InnerDescriptor(Generic[T]):
    """Replacement for the actual inner class during outer class creation."""

    InnerCls: type[T]
    outer_instance: Optional[object]

    def __init__(self, inner_class: type[T]) -> None:
        self.InnerCls = inner_class
        self.outer_instance = None

    def __get__(self, instance: object, owner: type) -> InnerDescriptor[T]:
        self.outer_instance = instance
        return self

    def __call__(self, *args: object, **kwargs: object) -> T:
        if self.outer_instance is None:
            raise RuntimeError("Inner class not bound to an outer instance")
        inner_instance = self.InnerCls(*args, **kwargs)
        inner_instance.outer = self.outer_instance
        return inner_instance


class OuterMeta(type):
    """Replaces an inner class in the outer namespace with a descriptor."""

    INNER_CLS_NAME: ClassVar[str] = "Inner"

    def __new__(
        mcs,
        name: str,
        bases: tuple[type, ...],
        namespace: dict[str, Any],
        **kwargs: Any,
    ) -> OuterMeta:
        if mcs.INNER_CLS_NAME in namespace:
            namespace[mcs.INNER_CLS_NAME] = InnerDescriptor(
                namespace[mcs.INNER_CLS_NAME]
            )
        return super().__new__(mcs, name, bases, namespace, **kwargs)


class OuterBase(metaclass=OuterMeta):
    pass
Now this code passes (both runtime and static type checks):
class Outer(OuterBase):
    def __init__(self, foo: str) -> None:
        self.foo = foo

    class Inner:
        outer: Outer

        def get_foo(self) -> str:
            return self.outer.foo


inner = Outer("bar").Inner()
assert inner.get_foo() == "bar"
Ignoring type checkers, we could of course also omit the outer: Outer annotation.
If we were to instead compromise in the sense that we define some InnerBase for Inner to inherit from, we could define the outer attribute there. That would also eliminate the need for the custom InnerProtocol to make type checkers happy. But the trade-off is that users of OuterBase would always have to explicitly inherit when defining the inner class as class Inner(InnerBase): ....
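For reference, a rough sketch of that InnerBase variant might look like this (the names and the generic parameter are my own; the descriptor's TypeVar would then be bound to InnerBase instead of InnerProtocol, and OuterBase is the class defined above):

from typing import Generic, TypeVar

OuterT = TypeVar("OuterT")


class InnerBase(Generic[OuterT]):
    outer: OuterT  # assigned by the descriptor machinery


class Outer(OuterBase):
    def __init__(self, foo: str) -> None:
        self.foo = foo

    class Inner(InnerBase["Outer"]):
        def get_foo(self) -> str:
            return self.outer.foo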
Related
I have defined an abstract base class BaseRepository that acts as a collection of items with specified supertype Foo.
The convenience classmethods in BaseRepository are annotated/type hinted to work with objects of type Foo. Here is a minimal example:
from abc import ABCMeta, abstractmethod
from typing import List


class Foo(object):
    pass  # simple data holding object


class BaseRepository(object, metaclass=ABCMeta):
    # May be filled with subtypes of `Foo` later
    _items = None  # type: List[Foo]

    @classmethod
    def get_item(cls) -> Foo:
        return cls._items[0]

    @classmethod
    @abstractmethod
    def _load_items(cls) -> None:
        pass
Now there are multiple static implementations (e.g. SubRepository), each of which is supposed to work with its own type of items (like Bar), which are subclasses of the original generic type Foo.
class Bar(Foo):
    pass  # Must implement Foo in order for BaseRepository's methods to work


def load_some_bars():
    return [Bar(), Bar()]


class SubRepository(BaseRepository):
    # Inherits `get_item` from BaseRepository

    @classmethod
    def _load_items(cls) -> None:
        cls._items = load_some_bars()
The repositories are static, meaning that they are not instantiated but rather function as namespaces for proper access to items that I load from YAML configuration files. The main perk is that I can create one of these SubRepositories and simply override the deserialization method _load_items, and the resulting repository will have all the convenience methods from the base class. Since the BaseRepository methods require items with a specific interface in order to function properly, the SubRepositories must work with items that inherit from Foo.
Statically typed languages like Java or C# have the concept of generic collections, where the elements in the subclassed collections all assume a specific type.
Is the same possible with type hinting in Python?
In particular, I would like the inherited get_item method in SubRepository to be hinted as Bar with minimal effort (not override it just for the sake of type hints). Optimally, the correct return value should be linted by PyCharm.
Currently, even though SubRepository holds Bar items, my autocompletion in PyCharm only shows me members of Foo.
I read about typing.Generic and TypeVar, but I'm unsure how to use them in this case.
You're programming to an interface, so only Foo members are exposed.
from typing import get_type_hints
print(get_type_hints(SubRepository.get_item))
Output:
{'return': <class '__main__.Foo'>}
A generic collection will expose the generic type's members.
from typing import TypeVar, Generic, List, get_type_hints
from abc import ABCMeta, abstractmethod

# type variable with an upper bound
T = TypeVar('T', bound=Foo)


class BaseRepository(Generic[T], metaclass=ABCMeta):
    _items = None  # type: List[T]

    @classmethod
    def get_item(cls) -> T:
        return cls._items[0]

    @classmethod
    @abstractmethod
    def _load_items(cls) -> None:
        pass


class SubRepository(BaseRepository[Bar]):
    # Inherits `get_item` from BaseRepository

    @classmethod
    def _load_items(cls) -> None:
        cls._items = load_some_bars()
Return type
print(get_type_hints(SubRepository.get_item))
Passes the buck
{'return': ~T}
Autocompletion will now show members of Bar.
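For example, running mypy over a call site shows the resolved type (a quick check; the exact note wording may vary by mypy version):

item = SubRepository.get_item()
reveal_type(item)  # mypy: Revealed type is "Bar"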
I'm working on a decorator to implement some behaviors for an immutable class. I'd like a class to inherit from namedtuple (to have attribute immutability) and also want to add some new methods. Like this ... but correctly preventing new attributes being assigned to the new class.
When inheriting from namedtuple, you should define __new__ and set __slots__ to be an empty tuple (to maintain immutability):
from collections import namedtuple


def define_new(clz):
    def __new(cls, *args, **kwargs):
        return super(clz, cls).__new__(cls, *args, **kwargs)
    clz.__new__ = staticmethod(__new)  # delegate namedtuple.__new__ to namedtuple
    return clz


@define_new
class C(namedtuple('Foo', "a b c")):
    __slots__ = ()  # Prevent assignment of new vars

    def foo(self): return "foo"


C(1, 2, 3).x = 123  # Fails, correctly
... great. But now I'd like to move the __slots__ assignment into the decorator:
def define_new(clz):
    def __new(cls, *args, **kwargs):
        return super(clz, cls).__new__(cls, *args, **kwargs)
    # clz.__slots__ = ()
    clz.__slots__ = (123)  # just for testing
    clz.__new__ = staticmethod(__new)
    return clz


@define_new
class C(namedtuple('Foo', "a b c")):
    def foo(self): return "foo"


c = C(1, 2, 3)
print c.__slots__  # Is the (123) I assigned!
c.x = 456  # Assignment succeeds! Not immutable.
print c.__slots__  # Is still (123)
Which is a little surprising.
Why has moving the assignment of __slots__ into the decorator caused a change in behavior?
If I print C.__slots__, I get the object I assigned. Where does the x get stored?
The code doesn't work because __slots__ is not a normal class property consulted at run-time. It is a fundamental property of the class that affects the memory layout of each of its instances, and as such must be known when the class is created and remain static throughout its lifetime. While Python (presumably for backward compatibility) allows assigning to __slots__ later, the assignment has no effect on the behavior of existing or future instances.
How __slots__ is set
The value of __slots__ determined by the class author is passed to the class constructor when the class object is being created. This is done when the class statement is executed; for example:
class X:
    __slots__ = ()
The above statement is equivalent[1] to creating a class object and assigning it to X:
X = type('X', (), {'__slots__': ()})
The type object is the metaclass, the factory that creates and returns a class when called. The metaclass invocation accepts the name of the type, its superclasses, and its definition dict. Most of the contents of the definition dict can also be assigned later, but __slots__ is different: it is a directive that affects the low-level layout of the class instances and is consulted only at class creation. As you discovered, later assignment to __slots__ simply has no effect.
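A minimal demonstration of this (hypothetical example):

class X:
    __slots__ = ('a',)


X.__slots__ = ('a', 'b')  # allowed, but does not change the instance layout
x = X()
x.a = 1  # fine: 'a' was in __slots__ at class creation time
x.b = 2  # AttributeError: 'X' object has no attribute 'b'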
Setting __slots__ from the outside
To modify __slots__ so that it is picked up by Python, one must specify it when the class is being created. This can be accomplished with a metaclass, the type responsible for instantiating types. The metaclass drives the creation of the class object and it can make sure __slots__ makes its way into the class definition dict before the constructor is invoked:
from collections import namedtuple


class DefineNew(type):
    def __new__(metacls, name, bases, dct):
        def __new__(cls, *new_args, **new_kwargs):
            return super(defcls, cls).__new__(cls, *new_args, **new_kwargs)
        dct['__slots__'] = ()
        dct['__new__'] = __new__
        defcls = super().__new__(metacls, name, bases, dct)
        return defcls


class C(namedtuple('Foo', "a b c"), metaclass=DefineNew):
    def foo(self):
        return "foo"
Testing results in the expected:
>>> c = C(1, 2, 3)
>>> c.foo()
'foo'
>>> c.bar = 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'C' object has no attribute 'bar'
Metaclass mixing pitfall
Note that the C type object will itself be an instance of DefineNew - which is not surprising, since that follows from the definition of a metaclass. But this might cause an error if you ever inherit from both C and a type that specifies a metaclass other than type or DefineNew. Since we only need the metaclass to hook into class creation, but are not using it later, it is not strictly necessary for C to be created as an instance of DefineNew - we can instead make it an instance of type, just like any other class. This is achieved by changing the line:
defcls = super().__new__(metacls, name, bases, dct)
to:
defcls = type.__new__(type, name, bases, dct)
The injection of __new__ and __slots__ will remain, but C will be a most ordinary type with the default metaclass.
In conclusion...
Defining a __new__ which simply calls the superclass __new__ is always superfluous - presumably the real code will also do something different in the injected __new__, e.g. provide the default values for the namedtuple.
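For instance, here is a sketch of an injected __new__ that earns its keep by supplying default values (the defaults here are made up for illustration):

from collections import namedtuple


class DefineNewWithDefaults(type):
    def __new__(metacls, name, bases, dct):
        def __new__(cls, a=0, b=0, c=0):
            # supply defaults before delegating to namedtuple's __new__
            return super(defcls, cls).__new__(cls, a, b, c)
        dct['__slots__'] = ()
        dct['__new__'] = __new__
        defcls = super().__new__(metacls, name, bases, dct)
        return defcls


class D(namedtuple('Foo', "a b c"), metaclass=DefineNewWithDefaults):
    pass


assert D() == D(0, 0, 0)  # attribute assignment still fails as before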
[1] In the actual class definition the compiler adds a couple of additional items to the class dict, such as the module name. Those are useful, but they do not affect the class definition in the fundamental way that __slots__ does. If X had methods, their function objects would also be included in the dict, keyed by function name - automatically inserted as a side effect of executing the def statements in the class definition namespace.
__slots__ has to be present during class creation. It affects the memory layout of a class's instances, which isn't something you can just change at will. (Imagine if you already had instances of the class and you tried to reassign the class's __slots__ at that point; instances would all break.) The processing that bases the memory layout on __slots__ only happens during class creation.
Assigning __slots__ in a decorator is too late to do anything. It has to happen before class creation, in the class body or a metaclass __new__.
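If you want to keep the decorator spelling, the decorator has to build a new class (with __slots__ in the namespace at creation time) rather than patch the existing one. A rough sketch of that approach (my own, not from the question):

from collections import namedtuple


def with_slots(cls):
    # Rebuild the class so __slots__ is present at creation time.
    dct = dict(cls.__dict__)
    dct['__slots__'] = ()
    dct.pop('__dict__', None)     # drop descriptors belonging to the old class
    dct.pop('__weakref__', None)
    return type(cls.__name__, cls.__bases__, dct)


@with_slots
class C(namedtuple('Foo', 'a b c')):
    def foo(self):
        return "foo"


c = C(1, 2, 3)
# c.x = 456 would now raise AttributeError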
Also, your define_new is pointless; namedtuple.__new__ already does what you need your __new__ to do.
class A(object):
    def __init__(self, val):
        self.val = A.plus_one(val)

    @staticmethod
    def plus_one(val):
        return val + 1
We can define a static method in a class and use it anywhere. My question is: why do we have to namespace the static method even when we use it within the class itself, i.e. add A. before the method name?
My idea is: could we not simply change the mechanism so that the static method name does not have to be namespaced within the class, as in the following (although Python does not support it now):
class A(object):
    def __init__(self, val):
        self.val = plus_one(val)

    @staticmethod
    def plus_one(val):
        return val + 1
My intuition is, since we are calling from within the class, the priority of class members should be higher than global functions, and thus there won't be any ambiguity. Why does Python force us to namespace it?
Methods defined in a class are not accessible by bare name, except by code running directly in the class namespace (not inside another method). If you want to call a method, you usually have to look it up on some kind of object (either an instance, or the class itself). The different kinds of methods differ in how they behave for different kinds of lookups.
class Test(object):
    def normal_method(self, foo):
        print("normal_method", self, foo)
        # none of the method names are valid here as bare names, only via self or Test
        try:
            class_method(type(self), foo)  # raises NameError
        except NameError:
            pass
        self.class_method(foo)  # this works
        Test.class_method(foo)  # so does this

    @classmethod
    def class_method(cls, foo):
        print("class_method", cls, foo)
        # from a class method, you can call a method via the automatically provided cls
        cls.static_method(foo)

    @staticmethod
    def static_method(foo):
        print("static_method", foo)
        # a static method can only use the global class name to call other methods
        if isinstance(foo, int) and foo > 0:
            Test.static_method(foo - 1)

    # Class level code is perilous.
    # normal_method('x', 2)  # could work in Python 3 (Python 2 checks the type of self)
    #                        # but it probably won't be useful (since self is not an instance)
    # class_method('x', 2)   # won't work: a classmethod descriptor isn't directly callable
    # static_method(2)       # nor is a staticmethod


# Top level code (these are all valid)
Test().normal_method(2)  # could also be Test.normal_method(Test(), 2)
Test.class_method(2)     # could also be Test().class_method(2) (type of the instance is used)
Test.static_method(2)    # could also be Test().static_method(2) (instance is ignored)
Am I missing something, or is something like this not possible?
class Outer:
    def __init__(self, val):
        self.__val = val

    def __getVal(self):
        return self.__val

    def getInner(self):
        return self.Inner(self)

    class Inner:
        def __init__(self, outer):
            self.__outer = outer

        def getVal(self):
            return self.__outer.__getVal()


foo = Outer('foo')
inner = foo.getInner()
val = inner.getVal()
print val
I'm getting this error message:
return self.__outer.__getVal()
AttributeError: Outer instance has no attribute '_Inner__getVal'
You are trying to apply Java techniques to Python classes. Don't. Python has no privacy model like Java does. All attributes on a class and its instances are always accessible, even when using __name double-underscore names in a class (they are simply renamed to add a namespace).
As such, you don't need an inner class either, as there is no privileged access for such a class. You can just put that class outside Outer and have the exact same access levels.
You run into your error because Python renames attributes with initial double-underscore names within a class context to avoid clashing with subclasses. These are called class private because the renaming adds the class names as a namespace; this applies both to their definition and use. See the Reserved classes of identifiers section of the reference documentation:
__*
Class-private names. Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes.
All names with double underscores in Outer get renamed to _Outer prefixed, so __getVal is renamed to _Outer__getVal. The same happens to any such names in Inner, so your Inner.getVal() method will be looking for a _Inner__getVal attribute. Since Outer has no _Inner__getVal attribute, you get your error.
You could manually apply the same transformation to Inner.getVal() to 'fix' this error:
def getVal(self):
    return self.__outer._Outer__getVal()
But you are not using double-underscore names as intended anyway, so move to single underscores instead, and don't use a nested class:
class Outer:
    def __init__(self, val):
        self._val = val

    def _getVal(self):
        return self._val

    def getInner(self):
        return _Inner(self)


class _Inner:
    def __init__(self, outer):
        self._outer = outer

    def getVal(self):
        return self._outer._getVal()
I renamed Inner to _Inner to document the type is an internal implementation detail.
While we are on the subject, there really is no need to use accessors either. In Python you can switch between property objects and plain attributes at any time. There is no need to code defensively like you have to in Java, where switching between attributes and accessors carries a huge switching cost. In Python, don't use obj.getAttribute() and obj.setAttribute(val) methods. Just use obj.attribute and obj.attribute = val, and use property if you need to do more work to produce or set the value. Switch to or away from property objects at will during your development cycles.
As such, you can simplify the above further to:
class Outer(object):
    def __init__(self, val):
        self._val = val

    @property
    def inner(self):
        return _Inner(self)


class _Inner(object):
    def __init__(self, outer):
        self._outer = outer

    @property
    def val(self):
        return self._outer._val
Here outer.inner produces a new _Inner() instance as needed, and the _Inner.val property proxies to the stored self._outer reference. The user of the instance never needs to know that either attribute is handled by a property object:
>>> outer = Outer(42)
>>> print outer.inner.val
42
Note that for property to work properly in Python 2, you must use new-style classes (inherit from object to do this); on old-style classes only property getters are supported (meaning setting is not prevented either!). New-style classes are the default in Python 3.
The leading-double-underscore naming convention in Python is supported with "name mangling." This is implemented by inserting the name of the current class in the name, as you have seen.
What this means for you is that names of the form __getVal can only be accessed from within the exact same class. If you have a nested class, it will be subject to different name mangling. Thus:
class Outer:
    def foo(self):
        print(self.__bar)

    class Inner:
        def foo2(self):
            print(self.__bar)
In the two nested classes, the names will be mangled to _Outer__bar and _Inner__bar respectively.
This is not Java's notion of private. It's "lexical privacy" (akin to "lexical scope" ;-).
If you want Inner to be able to access the Outer value, you will have to provide a non-mangled API. Perhaps a single underscore: _getVal, or perhaps a public method: getVal.
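For example, a minimal sketch along those lines:

class Outer:
    def __init__(self, val):
        self._val = val  # single underscore: no mangling

    def _getVal(self):
        return self._val

    class Inner:
        def __init__(self, outer):
            self._outer = outer

        def getVal(self):
            return self._outer._getVal()  # accessible from the nested class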
Given a class MyClass(object), how can I programmatically define class member methods using some template and the MyClass.__getattr__ mechanism? I'm thinking of something along the lines of
class MyClass(object):
    def __init__(self, defaultmembers, members):
        # defaultmembers is a list
        # members is a dict
        self._defaultmembers = defaultmembers
        self._members = members
        # some magic spell creating a member function factory

    def __getattr__(self, name):
        # some more magic
        ...


def f_MemberB():
    pass


C = MyClass(defaultmembers=["MemberA"], members=dict(MemberB=f_MemberB))

C.MemberA()  # should be a valid statement
C.MemberB()  # should also be a valid statement
C.MemberC()  # should raise an AttributeError
C.MemberA should be a method automatically created from some template mechanism inside the class, and C.MemberB should be the function f_MemberB.
You don't need to redefine __getattr__ (and in fact you generally should never do that). Python is a late binding language. This means you can simply assign values to names in a class at any time (even dynamically at runtime) and they will now exist.
So, in your case, you can simply do:
class MyClass(object):
    def __init__(self, defaultmembers, members):
        # defaultmembers is a list
        # members is a dict
        for func in defaultmembers:
            setattr(self, func.__name__, func)
        for name, func in members.items():
            setattr(self, name, func)
Note that this will not actually bind the function as a method (it will not get self as its first argument). If that is what you want, you need to use MethodType from the types module, like so:
from types import MethodType


class MyClass(object):
    def __init__(self, defaultmembers, members):
        # defaultmembers is a list
        # members is a dict
        for func in defaultmembers:
            setattr(self, func.__name__, MethodType(func, self))
        for name, func in members.items():
            setattr(self, name, MethodType(func, self))
Example:
def def_member(self):
    return 1


def key_member(self):
    return 2

>>> test = MyClass([def_member], {'named_method': key_member})
>>> test.def_member()
1
>>> test.named_method()
2
You can also make the init method slightly less awkward by using *args and **kwargs, so that the example would just be test = MyClass(def_member, named_method=key_member) if you know there won't be any other arguments to the constructor of this class. A sketch of that variant is shown below.
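A rough sketch of that variant (my own illustration, reusing def_member and key_member from above):

from types import MethodType


class MyClass(object):
    def __init__(self, *defaultmembers, **members):
        for func in defaultmembers:
            setattr(self, func.__name__, MethodType(func, self))
        for name, func in members.items():
            setattr(self, name, MethodType(func, self))


test = MyClass(def_member, named_method=key_member)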
Obviously I've left out the template creation bit for the defaultmembers, since I've used passing a function, rather than simply a name, as the argument. But you should be able to see how you would expand that example to suit your needs, as the template creation part is a bit out of the scope of the original question, which is how to dynamically bind methods to classes.
An important note: this only affects the single instance of MyClass. Please alter the question if you want to affect all instances, though I would think using a mixin class would be better in that case. For completeness, a class-wide sketch follows.
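Assigning plain functions onto the class itself (rather than onto an instance) makes them ordinary methods for every instance, since the normal descriptor protocol then binds self automatically. A rough sketch:

class MyClass(object):
    pass


def def_member(self):
    return 1


# Attach at class level: all instances, present and future, see the method.
MyClass.def_member = def_member

a, b = MyClass(), MyClass()
assert a.def_member() == 1 and b.def_member() == 1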