Are methods with covariant arguments allowed or prohibited in Python? - python

from typing import Generic, TypeVar
T_co = TypeVar('T_co', covariant=True)
class CovariantClass(Generic[T_co]):
    def get_t(self, t: T_co) -> T_co:  # <--- Mypy: cannot use a covariant type variable as a parameter
        return t
see closed mypy issue
My reading of the PEPs 484 and 483 is that covariant functions are prohibited but covariant methods are allowed and mypy is "wrong" to flag it. That's to say that if the class is declared by the author as covariant then it is their responsibility to ensure that any covariant method in the class is well-behaved (e.g. by not appending to collection of covariant items).
from PEP 484:
Consider a class Employee with a subclass Manager. Now suppose we have a function with an argument annotated with List[Employee]. Should we be allowed to call this function with a variable of type List[Manager] as its argument? Many people would answer "yes, of course" without even considering the consequences. But unless we know more about the function, a type checker should reject such a call: the function might append an Employee instance to the list, which would violate the variable's type in the caller.
It turns out such an argument acts contravariantly, whereas the intuitive answer (which is correct in case the function doesn't mutate its argument!) requires the argument to act covariantly.
....
Covariance or contravariance is not a property of a type variable, but a property of a generic class defined using this variable. Variance is only applicable to generic types; generic functions do not have this property. The latter should be defined using only type variables without covariant or contravariant keyword arguments.
It then gives an example of a prohibited function with a covariant argument.
Also from PEP 483 as evidence against the argument put forward in the closed mypy issue that it is prohibited to keep python type-safe (emphasis as in the PEP):
It is possible to declare the variance for user defined generic types
PEP 484 contains a class with a covariant argument in the __init__ but there is no example I could find in the PEPs of a covariant argument in a method showing it to be explicitly either allowed or prohibited.
T_co = TypeVar('T_co', covariant=True)
class ImmutableList(Generic[T_co]):
    def __init__(self, items: Iterable[T_co]) -> None: ...
    def __iter__(self) -> Iterator[T_co]: ...
    ...
EDIT 1: simple example of where this could be useful as requested in the comments. An extra method in the above class from the PEP:
    def is_member(self, item: T_co) -> bool:
        return item in self._items
EDIT 2: another example in case the first one seems academic
    def k_nearest_neighbours(self, target_item: T_co, k: int) -> list[T_co]:
        nearest: list[T_co] = []
        distances: list[float] = [item.distance_metric(target_item)
                                  for item in self._items]
        ...
        return nearest
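A workaround that is sometimes used for the is_member case, sketched here as an assumption rather than anything the PEPs prescribe: annotate the parameter with a type that does not mention T_co at all, such as object. The typeshed stubs declare Sequence.__contains__ this way. The _items attribute and the tuple storage below are illustrative choices and are not part of the PEP example.

```python
from typing import Generic, Iterable, Iterator, Tuple, TypeVar

T_co = TypeVar("T_co", covariant=True)


class ImmutableList(Generic[T_co]):
    def __init__(self, items: Iterable[T_co]) -> None:
        # Stored as a tuple so the container cannot be mutated afterwards
        # (an assumption made for this sketch).
        self._items: Tuple[T_co, ...] = tuple(items)

    def __iter__(self) -> Iterator[T_co]:
        return iter(self._items)

    def is_member(self, item: object) -> bool:
        # Using `object` keeps T_co out of the parameter position.
        return item in self._items


numbers = ImmutableList([1, 2, 3])
found = numbers.is_member(2)
missing = numbers.is_member("two")
```

This does not help the k_nearest_neighbours case, where the method body really needs the argument to be a T_co.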

Related

how to define python generic classes [duplicate]

I have a class:
T = TypeVar('T')
class Stack(Generic[T]):
    def __init__(self) -> None:
        self.items: list[T] = []
    def push(self, item: T) -> None:
        self.items.append(item)
    def pop(self) -> T:
        return self.items.pop()
    def empty(self) -> bool:
        return not self.items
but I can also do:
T = TypeVar('T')
class Stack:
    def __init__(self) -> None:
        # Create an empty list with items of type T
        self.items: list[T] = []
    def push(self, item: T) -> None:
        self.items.append(item)
    def pop(self) -> T:
        return self.items.pop()
    def empty(self) -> bool:
        return not self.items
What is the difference between these two samples?
Which one should I use?
I tried running both, and both worked.
Type checking vs runtime
After writing this, I finally understood @Alexander's point in the first comment: whatever you write in annotations does not affect runtime, and your code is executed in the same way (sorry, I missed that you were looking at this not just from a type-checking perspective). This is a core principle of Python typing, as opposed to statically typed languages (which makes it wonderful IMO): you can always say "I don't need types here - save my time and mental health". Type annotations are used to help third-party tools, like mypy (a type checker maintained by the Python core team) and IDEs. IDEs can suggest things based on this information, and mypy checks whether your code can work if your types match reality.
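A minimal sketch of that point (the function double is a made-up example, not from the question): the annotations are stored on the function for tools to read, but the interpreter does not enforce them.

```python
def double(x: int) -> int:
    return x * 2


# A type checker would report this call, but at runtime it simply executes:
# the str is repeated instead of an int being doubled.
result = double("ab")

# The annotations are only metadata attached to the function object.
annotations = double.__annotations__
```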
Generic version
T = TypeVar('T')
class Stack(Generic[T]):
    def __init__(self) -> None:
        self.items: list[T] = []
    def push(self, item: T) -> None:
        self.items.append(item)
    def pop(self) -> T:
        return self.items.pop()
    def empty(self) -> bool:
        return not self.items
You can treat type variables like regular variables, but they are intended for "meta" usage and ignored at runtime (well, there are some runtime traces, but they exist primarily for introspection purposes). They are substituted once for every binding context (more about that below), and can be defined only once per module scope.
The code above declares a normal generic class with one type argument. Now you can say Stack[int] to refer to a stack of integers, which is great. The current definition allows either explicit typing or implicit Any parametrization:
# Explicit type
int_stack: Stack[int] = Stack()
reveal_type(int_stack) # N: revealed type is "__main__.Stack[builtins.int]"
int_stack.push(1) # ok
int_stack.push('foo') # E: Argument 1 to "push" of "Stack" has incompatible type "str"; expected "int" [arg-type]
reveal_type(int_stack.pop()) # N: revealed type is "builtins.int"
# No type results in mypy error, similar to `x = []`
any_stack = Stack() # E: need type annotation for any_stack
# But if you ignore it, the type becomes `Stack[Any]`
reveal_type(any_stack) # N: revealed type is "__main__.Stack[Any]"
any_stack.push(1) # ok
any_stack.push('foo') # ok too
reveal_type(any_stack.pop()) # N: revealed type is "Any"
To make the intended usage easier, you can allow initialization from iterable (I'm not covering the fact that you should be using collections.deque instead of list and maybe instead of this Stack class, assuming it is just a toy collection):
from collections.abc import Iterable
class Stack(Generic[T]):
    def __init__(self, items: Iterable[T] | None = None) -> None:
        # Create a list with items of type T, empty if nothing was passed
        self.items: list[T] = list(items or [])
    ...
deduced_int_stack = Stack([1])
reveal_type(deduced_int_stack) # N: revealed type is "__main__.Stack[builtins.int]"
To sum up, generic classes have some type variable bound to the class body. When you create an instance of such class, it can be parametrized with some type - it may be another type variable or some fixed type, like int or tuple[str, Callable[[], MyClass[bool]]]. Then all occurrences of T in its body (except for nested classes, which are perhaps out of "quick glance" explanation context) are replaced with this type (or Any, if it is not given and cannot be deduced). This type can be deduced iff at least one of __init__ or __new__ arguments has type referring to T (just T or, say, list[T]), and otherwise you have to specify it. Note that if you have T used in __init__ of non-generic class, it is not very cool, although currently not disallowed.
Now, if you use T in some methods of generic class, it refers to that replaced value and results in typecheck errors, if passed types are not compatible with expected.
You can play with this example here.
Working outside of generic context
However, not all usages of type variables are related to generic classes. Fortunately, you cannot declare a generic function whose type argument is specified explicitly on the calling side (like function<T> fun(x: number): int and fun<string>(0)), but there is plenty more. Let's begin with simpler examples - pure functions:
T = TypeVar('T')
def func1() -> T:
    return 1
def func2(x: T) -> int:
    return 1
def func3(x: T) -> T:
    return x
def func4(x: T, y: T) -> int:
    return 1
First function is declared to return some value of unbound type T. It obviously makes no sense, and recent mypy versions even learned to mark it as error. Your function return depends only on arguments and external state - and type variable must be present there, right? You cannot also declare global variable of type T in module scope, because T is still unbound - and thus neither func1 args nor module-scoped variables can depend on T.
Second function is more interesting. It does not cause mypy error, although still makes not very much sense: we can bind some type to T, but what is the difference between this and func2_1(x: Any) -> int: ...? We can speculate that now T can be used as annotation in function body, which can help in some corner case with type variable having upper bound, and I won't say it is impossible - but I cannot quickly construct such example, and have never seen such usage in proper context (it was always a mistake). Similar example is even explicitly referenced in PEP as valid.
The third and fourth functions are typical examples of type variables in functions. The third declares a function returning the same type as its argument.
The fourth function takes two arguments of the same type (arbitrary one). It is more useful if you have T = TypeVar('T', bound=Something) or T = TypeVar('T', str, bytes): you can concatenate two arguments of type T, but cannot - of type str | bytes, like in the below example:
T = TypeVar('T', str, bytes)
def total_length(x: T, y: T) -> int:
    return len(x + y)
The most important fact about all the examples above in this section: T does not have to be the same for different functions. You can call func3(1), then func3(['bar']) and then func4('foo', 'bar'). T is int, list[str] and str in these calls - no need to match.
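As a runnable sketch of the function-scoped case: each call below solves the type variable independently. The name S for the constrained variable is an assumption made here so that both variables can live in one module.

```python
from typing import TypeVar

T = TypeVar("T")
S = TypeVar("S", str, bytes)  # constrained: either str or bytes, never a mix


def func3(x: T) -> T:
    return x


def total_length(x: S, y: S) -> int:
    # Concatenation is fine because both arguments share one concrete type.
    return len(x + y)


a = func3(1)                     # T is solved as int for this call
b = func3(["bar"])               # T is solved as list[str] for this call
c = total_length("foo", "bar")   # S is str
d = total_length(b"ab", b"c")    # S is bytes
```

A call such as total_length("foo", b"bar") would be reported by a type checker, and it would also fail at runtime because str and bytes cannot be concatenated.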
With this in mind your second solution is clear:
T = TypeVar('T')
class Stack:
    def __init__(self) -> None:
        # Create an empty list with items of type T
        self.items: list[T] = [] # E: Type variable "__main__.T" is unbound [valid-type]
    def push(self, item: T) -> None:
        self.items.append(item)
    def pop(self) -> T: # E: A function returning TypeVar should receive at least one argument containing the same TypeVar [type-var]
        return self.items.pop()
Here is mypy issue, discussing similar case.
__init__ says that we set the attribute items to a value of type list[T], but this T is lost later (T is scoped only within __init__) - so mypy rejects the assignment.
push is ill-formed and T has no meaning here, but it does not result in an invalid typing situation, so it is not rejected (the type of the argument is erased to Any, so you can still call push with some argument).
pop is invalid, because the type checker needs to know what my_stack.pop() will return. It could say "I give up - just have your Any", and that would be perfectly valid (the PEP does not enforce this), but mypy is smarter and rejects this invalid-by-design usage.
Edge case: you can return SomeGeneric[T] with unbound T, for example, in factory functions:
def make_list() -> list[T]: ...
mylist: list[str] = make_list()
because otherwise the type argument couldn't be specified at the call site.
For a better understanding of type variables and generics in Python, I suggest you read PEP 483 and PEP 484 - usually PEPs are more like a boring standard, but these are really good as a starting point.
There are many edge cases omitted there, which still cause hot discussions in mypy team (and probably other typecheckers too) - say, type variables in staticmethods of generic classes, or binding in classmethods used as constructors - mind that they can be used on instances too. However, basically you can:
have a TypeVar bound to class (Generic or Protocol, or some Generic subclass - if you subclass Iterable[T], your class is already generic in T) - then all methods use the same T and can contain it in one or both sides
or have a method-scoped/function-scoped type variable - then it's useful if repeated in the signature more than once (not necessary "clean" - it may be parametrizing another generic)
or use type variables in generic aliases (like LongTuple = tuple[T, T, T, T] - then you can do x: LongTuple[int] = (1, 2, 3, 4))
or do something more exotic with type variables, which is probably out of scope
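The generic-alias option from the list above can be sketched as follows. The helper first is hypothetical, and typing.Tuple is used instead of the built-in tuple only so that the sketch also runs on older Python versions.

```python
from typing import Tuple, TypeVar

T = TypeVar("T")

# A generic alias: one type variable repeated four times.
LongTuple = Tuple[T, T, T, T]


def first(items: LongTuple[T]) -> T:
    # T is function-scoped here and appears on both sides of the signature.
    return items[0]


x: LongTuple[int] = (1, 2, 3, 4)
head = first(x)
```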

How to define callable attribute with covariant return type on protocol?

Usually it is understood that the return type of a callable is covariant. When defining a type with a callable attribute, I can indeed make the return type generic and covariant:
from typing import TypeVar, Callable, Generic, Sequence
from dataclasses import dataclass
R = TypeVar("R", covariant=True)
@dataclass
class Works(Generic[R]):
    call: Callable[[], R]  # returns an R *or subtype*
w: Works[Sequence] = Works(lambda: []) # okay: list is subtype of Sequence
However, the same does not work for a Protocol. When I define a Protocol for the type in the same way, MyPy rejects this – it insists the return type must be invariant.
from typing import TypeVar, Callable, Protocol
R = TypeVar("R", covariant=True)
class Fails(Protocol[R]):
    attribute: Callable[[], R]
$ python -m mypy so_testbed.py --pretty
so_testbed.py:5: error: Covariant type variable "R" used in protocol where invariant one is expected
class Fails(Protocol[R]):
^
Found 1 error in 1 file (checked 1 source file)
How can I properly define a Protocol for the concrete type that respects the covariance of R?
What you're attempting is explicitly not possible with Protocol - see the following in PEP 544:
Covariant subtyping of mutable attributes
Rejected because covariant subtyping of mutable attributes is not safe. Consider this example:
class P(Protocol):
    x: float
def f(arg: P) -> None:
    arg.x = 0.42
class C:
    x: int
c = C()
f(c)  # Would typecheck if covariant subtyping
      # of mutable attributes were allowed.
c.x >> 1  # But this fails at runtime
It was initially proposed to allow this for practical reasons, but it was subsequently rejected, since this may mask some hard to spot bugs.
Since your attribute is a mutable member - you cannot have it be covariant with regards to R.
A possible alternative is to replace attribute with a method:
class Passes(Protocol[R]):
    @property
    def attribute(self) -> Callable[[], R]:
        pass
which passes type-checking - but it's an inflexible solution.
If you have need of mutable covariant members, Protocol isn't the way to go.
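For the read-only property variant, a usage sketch might look like this. Holder and call_attribute are hypothetical names, and I would expect, but have not verified against a specific mypy version, that a plain attribute is accepted as an implementation of the read-only protocol member.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Sequence, TypeVar

R = TypeVar("R", covariant=True)


class Passes(Protocol[R]):
    @property
    def attribute(self) -> Callable[[], R]:
        ...


@dataclass
class Holder:
    # A concrete class with an ordinary attribute (hypothetical example).
    attribute: Callable[[], List[int]]


def call_attribute(obj: Passes[Sequence[int]]) -> Sequence[int]:
    # Only reads the member, which is what makes covariance in R safe.
    return obj.attribute()


values = call_attribute(Holder(lambda: [1, 2]))
```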
As @Daniel Kleinstein pointed out, you cannot parameterize the protocol type by a covariant variable because it is used for a mutable attribute.
Another alternative is to split the variables into two (covariant and invariant) and use them in two protocols (replace Callable with Protocol).
from typing import TypeVar, Callable, Protocol
R_cov = TypeVar("R_cov", covariant=True)
R_inv = TypeVar("R_inv")
class CallProto(Protocol[R_cov]):
    def __call__(self) -> R_cov: ...
class Fails(Protocol[R_inv]):
    attribute: CallProto[R_inv]

mypy - callable with derived classes gives error

from typing import Callable
class BaseClass:
    p: int
class DerivedClass(BaseClass):
    q: int
def p(q: Callable[[BaseClass], str]) -> None:
    return None
def r(derived: DerivedClass) -> str:
    return ""
p(r)
Expected behavior:
    - No error from mypy -
Actual behavior:
Argument 1 to "p" has incompatible type "Callable[[DerivedClass], str]";
expected "Callable[[BaseClass], str]"
Let's talk about type variance. Under typical subtyping rules, if we have a type DerivedClass that is a subtype of a type BaseClass, then every instance of DerivedClass is an instance of BaseClass. Simple enough, right? But now the complexity arises when we have generic type arguments.
Let's suppose that we have a class that gets a value and returns it. I don't know how it gets it; maybe it queries a database, maybe it reads the file system, maybe it just makes one up. But it gets a value.
class Getter:
    def get_value(self):
        # Some deep magic ...
        ...
Now let's assume that, when we construct the Getter, we know what type it should be querying at compile-time. We can use a type variable to annotate this.
T = TypeVar("T")
class Getter(Generic[T]):
    def get_value(self) -> T:
        ...
Now, Getter is a valid thing. We can have a Getter[int] which gets an integer and a Getter[str] which gets a string.
But here's a question. If I have a Getter[int], is that a valid Getter[object]? Surely, if I can get a value as an int, it's easy enough to upcast it, right?
my_getter_int: Getter[int] = ...
my_getter_obj: Getter[object] = my_getter_int
But the type checker won't allow this. See, Getter was declared to be invariant in its type argument. That's a fancy way of saying that, even though int is a subtype of object, Getter[int] and Getter[object] have no relationship.
But, like I said, surely they should have a relationship, right? Well, yes. If your type is only used in positive position (glossing over some details, that means roughly that it only appears as the return value of methods or as the type of read-only properties), then we can make it covariant.
T_co = TypeVar("T_co", covariant=True)
class Getter(Generic[T_co]):
    def get_value(self) -> T_co:
        ...
By convention, in Python, we denote covariant type arguments using names that end in _co. But the thing that actually makes it covariant here is the covariant=True keyword argument.
Now, with this version of Getter, Getter[int] is actually a subtype of Getter[object]. In general, if A is a subtype of B, then Getter[A] is a subtype of Getter[B]. Covariance preserves subtyping.
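A small runnable sketch of what covariance buys you. The constructor and the describe function are assumptions added for illustration; the answer leaves open where the value comes from.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Getter(Generic[T_co]):
    def __init__(self, value: T_co) -> None:
        # Stand-in for the "deep magic": just hand back a stored value.
        self._value = value

    def get_value(self) -> T_co:
        return self._value


def describe(getter: Getter[object]) -> str:
    # Only relies on what every object supports.
    return repr(getter.get_value())


int_getter: Getter[int] = Getter(42)
# Accepted by a type checker because Getter[int] is a subtype of Getter[object].
description = describe(int_getter)
```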
Okay, that's covariance. Now consider the opposite. Let's say we have a setter which sets some value in a database.
class Setter:
    def set_value(self, value):
        ...
Same assumptions as before. Suppose we know what the type is in advance. Now we write
T = TypeVar("T")
class Setter(Generic[T]):
    def set_value(self, value: T) -> None:
        ...
Okay, great. Now, if I have a value my_setter : Setter[int], is that a Setter[object]? Well, my_setter can always take an integer value, whereas a Setter[object] is guaranteed to be able to take any object. my_setter can't guarantee that, so it's actually not. If we try to make T covariant in this example, we'll get
error: Cannot use a covariant type variable as a parameter
Because it's actually not a valid relationship. In fact, in this case, we get the opposite relationship. If we have a my_setter : Setter[object], then that's a guarantee that we can pass it any object at all, so certainly we can pass it an integer, hence we have a Setter[int]. This is called contravariance.
T_contra = TypeVar("T_contra", contravariant=True)
class Setter(Generic[T_contra]):
    def set_value(self, value: T_contra) -> None:
        ...
We can make our type contravariant if it only appears in negative position, which (again, oversimplifying a bit) generally means that it appears as arguments to functions, but not as a return value. Now, Setter[object] is a subtype of Setter[int]. It's backwards. In general, if A is a subtype of B, then Setter[B] is a subtype of Setter[A]. Contravariance reverses the subtyping relationship.
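The reversed relationship can be sketched the same way. The _log list and the store_number function are assumptions for illustration, standing in for the database write.

```python
from typing import Generic, List, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)


class Setter(Generic[T_contra]):
    def __init__(self) -> None:
        # Typed as List[object] so T_contra never appears in an attribute.
        self._log: List[object] = []

    def set_value(self, value: T_contra) -> None:
        self._log.append(value)


def store_number(setter: Setter[int]) -> None:
    setter.set_value(7)


object_setter: Setter[object] = Setter()
# Accepted by a type checker because Setter[object] is a subtype of Setter[int]:
# something that can store any object can certainly store an int.
store_number(object_setter)
stored = list(object_setter._log)
```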
Now, back to your example. You have a Callable[[DerivedClass], str] and want to know if it's a valid Callable[[BaseClass], str]
Applying our principles from before, we have a type Callable[[T], S] (I'm assuming only one argument for simplicity's sake, but in reality this works in Python for any number of arguments) and want to ask if T and S are covariant, contravariant, or invariant.
Well, what is a Callable? It's a function. It has one thing we can do: call it with a T and get an S. So it's pretty clear that T is only used as an argument and S as a result. Things only used as arguments are contravariant, and those used as results are covariant, so in reality it's more correct to write
Callable[[T_contra], S_co]
Arguments to Callable are contravariant, which means that if DerivedClass is a subtype of BaseClass, then Callable[[BaseClass], str] is a subtype of Callable[[DerivedClass], str], the opposite relationship to the one you suggested. You need a function that can accept any BaseClass. A function with a BaseClass argument would suffice, and so would a function with an object argument, or any type which is a supertype of BaseClass, but subtypes are insufficient because they're too specific for your contract.
MyPy objects to your call of p with r as its argument because given only the type signatures, it can't be sure the function won't be called with a non-DerivedClass instance.
For instance, given the same type annotations, p could be implemented like this:
def p(q: Callable[[BaseClass], str]) -> None:
    obj = BaseClass()
    q(obj)
This will break p(r) if r has an implementation that depends on the derived attributes of its argument:
def r(derived: DerivedClass) -> str:
    return str(derived.q)
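Putting those two pieces together shows the failure that the check prevents. This is a sketch under a few assumptions: the function is renamed call_with_base to avoid clashing with the attribute p, and the attributes get default values so the classes can be instantiated.

```python
from typing import Callable


class BaseClass:
    p: int = 0


class DerivedClass(BaseClass):
    q: int = 1


def call_with_base(q: Callable[[BaseClass], str]) -> str:
    # Perfectly legal for this signature: pass a plain BaseClass.
    return q(BaseClass())


def r(derived: DerivedClass) -> str:
    return str(derived.q)


try:
    call_with_base(r)  # this is the call mypy rejects
    outcome = "no error"
except AttributeError:
    # BaseClass has no attribute q, so r fails at runtime.
    outcome = "AttributeError"
```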

What is the static type of self?

I want to constrain a method parameter to be of the same type as the class it's called on (see the end for an example). While trying to do that, I've come across this behaviour that I'm struggling to get my head around.
The following doesn't type check
class A:
    def foo(self) -> None:
        pass
A.foo(1)
with
error: Argument 1 to "foo" of "A" has incompatible type "int"; expected "A"
as I'd expect, since I'd have thought A.foo should only take an A. If however I add a self type
from typing import TypeVar
Self = TypeVar("Self")
class A:
    def foo(self: Self) -> None:
        pass
A.foo(1)
it does type check. I would have expected it to fail, telling me I need to pass an A not an int. This suggests to me that the type checker usually infers the type A for self, and adding a Self type overrides that, I'm guessing to object. This fits with the error
from typing import TypeVar
Self = TypeVar("Self")
class A:
    def bar(self) -> int:
        return 0
    def foo(self: Self) -> None:
        self.bar()
error: "Self" has no attribute "bar"
which I can fix if I bound as Self = TypeVar("Self", bound='A')
Am I right that this means self is not constrained, in e.g. the same way I'd expect this to be constrained in Scala?
I guess this only has an impact if I specify the type of self to be anything but the class it's defined on, intentionally or otherwise. I'm also interested to know what the impact is of overriding self to be another type, and indeed whether it even makes sense with how Python resolves and calls methods.
Context
I want to do things like
class A:
    def foo(self: Self, bar: List[Self]) -> Self:
        ...
but I was expecting Self to be constrained to be an A, and was surprised that it wasn't.
Two things:
self is only half-magic.
The self arg has the magical property that, if you call an attribute of an object as a function, and that function has self as its first arg, then the object itself will be prepended to the explicit args as the self.
I guess any good static analyzer would take as implicit that self has the class in question as its type, which is what you're seeing in your first example.
TypeVar is for polymorphism.
And I think that's what you're trying to do? In your third example, Self can be any type, depending on context. In the context of A.foo(1), Self is int, so self.bar() fails.
It may be possible to write an instance method that can be called as a static method against class non-members with parametric type restrictions, but it's probably not a good idea for any application in the wild. Just name the variable something else and declare the method to be static.
If you omit a type hint on self, the type checker will automatically assume it has whatever the type of the containing class is.
This means that:
class A:
    def foo(self) -> None: pass
...is equivalent to doing:
class A:
    def foo(self: A) -> None: pass
If you want self to be something else, you should set a custom type hint.
Regarding this code snippet:
from typing import TypeVar
Self = TypeVar("Self")
class A:
    def foo(self: Self) -> None:
        pass
A.foo(1)
Using a TypeVar only once in a function signature is either malformed or redundant, depending on your perspective.
But this is kind of unrelated to the main thrust of your question. We can repair your code snippet by instead doing:
from typing import TypeVar
Self = TypeVar("Self")
class A:
    def foo(self: Self) -> Self:
        return self
A.foo(1)
...which exhibits the same behaviors you noticed.
But regardless of which of the two code snippets we look at, I believe the type checker will indeed assume self has the same type as whatever the upper bound of Self is while type checking the body of foo. In this case, the upper bound is object, as you suspected.
We get this behavior whether or not we're doing anything fancy with self. For example, we'd get the exact same behavior by just doing:
def foo(x: Self) -> Self:
    return x
...and so forth. From the perspective of the type checker, there's nothing special about the self parameter, except that we set a default type for it if it's missing a type hint instead of just using Any.
error: "Self" has no attribute "bar"
which I can fix if I bound as Self = TypeVar("Self", bound='A')
Am I right that this means self is not constrained, in e.g. the same way I'd expect this to be constrained in Scala?
I'm unfamiliar with how this is constrained in Scala, but it is indeed the case that if you chose to override the default type of self, you are responsible for setting your own constraints and bounds as appropriate.
To put it another way, once a TypeVar is defined, its meaning won't be changed when you try using it in a function definition. This is the rule for TypeVars/functions in general. And since mostly there's nothing special about self, the same rule also applies there.
(Though type checkers such as mypy will also try doing some basic sanity checks on whatever constraints you end up picking to ensure you don't end up with a method that's impossible to call or whatever. For example, it complains if you try setting the bound of Self to int.)
Note that doing things like:
from typing import TypeVar, List
Self = TypeVar('Self', bound='A')
class A:
    def foo(self: Self, bar: List[Self]) -> Self:
        ...
class B(A): pass
x = A().foo([A(), A()])
y = B().foo([B(), B()])
reveal_type(x) # Revealed type is 'A'
reveal_type(y) # Revealed type is 'B'
...is explicitly supported by PEP 484. The mypy docs also have a few examples.
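To see the bound self type at runtime, here is a sketch in which foo has an actual body. Returning the first list element, or self if the list is empty, is an arbitrary choice made only so the method returns something.

```python
from typing import List, TypeVar

Self = TypeVar("Self", bound="A")


class A:
    def foo(self: Self, bar: List[Self]) -> Self:
        # Arbitrary body for illustration: every candidate has type Self.
        return bar[0] if bar else self


class B(A):
    pass


x = A().foo([A(), A()])
y = B().foo([B(), B()])
z = B().foo([])
```

With the bound in place, a type checker should report A.foo(1), since int does not satisfy the bound A.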

How can I define a generic covariant function in Python?

I want to define a function example that takes an argument of type Widget or anything that extends Widget and returns the same type as the argument. So if Button extends Widget, calling example(Button()) returns type Button.
I tried the following:
T_co = TypeVar('T_co', Widget, covariant=True)
def example(widget: T_co) -> T_co:
    ...
However the type checker (Pyright) ignores the covariance. Upon further research I found a note in PEP 484:
Note: Covariance or contravariance is not a property of a type variable, but a property of a generic class defined using this variable. Variance is only applicable to generic types; generic functions do not have this property. The latter should be defined using only type variables without covariant or contravariant keyword arguments.
However if I try to define a generic function without the covariant argument as specified in the note:
T_co = TypeVar('T_co', Widget)
def example(widget: T_co) -> T_co:
    ...
I can only pass values of type Widget to the function (not Button).
How can I achieve this?
I was able to find the answer in the MyPy docs. Turns out I was looking for bound, not covariant. This can be done like so:
T = TypeVar('T', bound=Widget)
def example(widget: T) -> T:
    ...
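A self-contained sketch of the bound version. The empty Widget and Button classes and the identity body of example are placeholders, since the question does not show them.

```python
from typing import TypeVar


class Widget:
    pass


class Button(Widget):
    pass


# bound=Widget: T may be Widget or any subclass, and is solved per call.
T = TypeVar("T", bound=Widget)


def example(widget: T) -> T:
    return widget


button = example(Button())   # a type checker infers Button here, not Widget
widget = example(Widget())
```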
