I'm not able to understand the use of Generic and TypeVar, and how they are related.
https://docs.python.org/3/library/typing.html#building-generic-types
The docs have this example:
from typing import Generic, TypeVar

KT = TypeVar('KT')
VT = TypeVar('VT')

class Mapping(Generic[KT, VT]):
    def __getitem__(self, key: KT) -> VT:
        ...
        # Etc.

X = TypeVar('X')
Y = TypeVar('Y')

def lookup_name(mapping: Mapping[X, Y], key: X, default: Y) -> Y:
    try:
        return mapping[key]
    except KeyError:
        return default
Type variables exist primarily for the benefit of static type
checkers. They serve as the parameters for generic types as well as
for generic function definitions.
Why can't I simply use Mapping with some existing type, like int, instead of creating X and Y?
Type variables are literally "variables for types". Similar to how regular variables allow code to apply to multiple values, type variables allow code to apply to multiple types.
At the same time, just like code is not required to apply to multiple values, it is not required to depend on multiple types. A literal value can be used instead of variables, and a literal type can be used instead of type variables – provided these are the only values/types applicable.
Since the Python language semantically only knows values – runtime types are also values – it does not have the facilities to express type variability. Namely, it cannot define, reference or scope type variables. Thus, typing represents these two concepts via concrete things:
A typing.TypeVar represents the definition and reference to a type variable.
A typing.Generic represents the scoping of types, specifically to class scope.
Notably, it is possible to use TypeVar without Generic – functions are naturally scoped – and Generic without TypeVar – scopes may use literal types.
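For illustration, a minimal sketch of both cases, using typing.Mapping as a stand-in generic class (first and m are illustrative names):
from typing import Mapping, TypeVar

T = TypeVar("T")

# TypeVar without Generic: the enclosing function is the natural scope of T
def first(items: list[T]) -> T:
    return items[0]

# a generic class used with literal types instead of type variables
m: Mapping[int, str] = {0: "zero", 1: "one"}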
Consider a function to add two things. The most naive implementation adds two literal things:
def add():
    return 5 + 12
That is valid but needlessly restricted. One would like to parameterise the two things to add – this is what regular variables are used for:
def add(a, b):
    return a + b
Now consider a function to add two typed things. The most naive implementation adds two things of literal type:
def add(a: int, b: int) -> int:
    return a + b
That is valid but needlessly restricted. One would like to parameterise the types of the two things to add – this is what type variables are used for:
T = TypeVar("T")
def add(a: T, b: T) -> T:
return a + b
Now, in the case of values we defined two variables – a and b – but in the case of types we defined one variable – a single T – used for both parameters! Just like the expression a + a means both operands are the same value, the annotation a: T, b: T means both parameters are the same type. This is because our function has a strong relation between the types but not the values.
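For illustration, a minimal sketch of how a checker solves the shared T per call:
x = add(1, 2)        # T is inferred as int, so x is int
y = add("a", "b")    # T is inferred as str, so y is str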
While type variables are automatically scoped in functions – to the function scope – this is not the case for classes: a type variable might be scoped across all methods/attributes of a class or specific to some method/attribute.
When we define a class, we may scope type variables to the class scope by adding them as parameters to the class. Notably, parameters are always variables – this applies to regular parameters just as for type parameters. It just does not make sense to parameterise a literal.
#           v value parameters of the function are "value variables"
def mapping(keys, values):
    ...

#                      v type parameters of the class are "type variables"
class Mapping(Generic[KT, VT]):
    ...
When we use a class, the scope of its parameters has already been defined. Notably, the arguments passed in may be literal or variable – this again applies to regular arguments just as for type arguments.
#       v pass in arguments via literals
mapping([0, 1, 2, 3], ['zero', 'one', 'two', 'three'])
#       v pass in arguments via variables
mapping(ks, vs)

#          v pass in arguments via literals
m: Mapping[int, str]
#          v pass in arguments via variables
m: Mapping[KT, VT]
Whether to use literals or variables and whether to scope them or not depends on the use-case. But we are free to do either as required.
The whole purpose of using Generic and TypeVar (here represented by the X and Y variables) is to make the parameters as generic as possible. int could be used instead in this case. The difference is that the static analyzer would then interpret the parameter as always being an int.
Using generics means the function accepts any type of parameter. The static analyzer, as in an IDE for instance, determines the types of the variables and the return type from the arguments provided at function call or object instantiation.
mapping: Mapping[str, int] = {"2": 2, "3": 3}
name = lookup_name(mapping, "1", 1)
In the above example type checkers will know name will always be an int relying on the type annotations. In IDEs, code completion for int methods will be shown as the 'name' variable is used.
Using specific types is ideal if that is your goal – for instance, if the function should accept only a mapping with str keys and int values and return an int, as below.
As X and Y are variables, you can choose basically any name you want.
The following example is possible:
def lookup_name(mapping: Mapping[str, int], key: str, default: int) -> int:
    try:
        return mapping[key]
    except KeyError:
        return default
The types are not generic in the above example. The key will always be str; the default variable, the value, and the return type will always be an int. It's the programmer's choice. This is not enforced by Python, though. A static type checker like mypy is needed for that.
The TypeVar could even be constrained if wanted:
import typing

X = typing.TypeVar("X", int, str)   # Accept only int and str
Y = typing.TypeVar("Y", bound=int)  # Accept only int (and its subtypes)
Note that a TypeVar with a single constraint is not allowed – typing.TypeVar("Y", int) raises TypeError – so restricting to a single type is expressed with bound= instead.
MisterMiyagi's answer offers a thorough explanation of the scope of use for TypeVar and Generic.
Related
Let us assume that we need a function that accepts two arguments of any type as long as both arguments have the same type. How would you check it statically with mypy?
If we only need the function to accept some finite amount of already known types, it is easy:
from typing import TypeVar, List, Callable

T = TypeVar('T', int, str, List[int], Callable[[], int])

def f(a: T, b: T) -> None:
    pass

f(1, 2)
f("1", "2")
f([1], [2])
f(lambda: 1, lambda: 2)
f(1, "2")  # mypy will print an error message
For this code, mypy can ensure that the arguments to f are either two ints or two strs or two lists of ints or two functions of zero arguments that return int.
But what if we don't know the types in advance? What if we need something similar to let f (a:'t) (b:'t) = () from F# and OCaml? Simply writing T = TypeVar('T') would make things like f(1, "2") valid, and this is not what we want.
What you're asking for is impossible (see below for explanation). But usually, there's no need in python to require that two arguments have precisely identical type.
In your example, int, str, List[int], Callable[[], int] don't have any common methods or attributes (other than what any two object instances have), so unless you manually check the type with isinstance, you can't really do anything with your argument that you couldn't do with object instances. Could you explain your use case?
Explanation of why you can't enforce type equality
Mypy's type system has subtyping. So when you write f(a, b), mypy only checks that the types of a and b are both subtypes of T rather than precisely equal to T.
In addition, mypy's subtyping relation is mostly pre-defined and not under the programmer's control; in particular, every type is a subtype of object. (IIUC, in OCaml the programmer needs to say explicitly which types should be in a subtyping relationship, so by default every type constraint is an equality constraint. That's why you could do what you wanted in OCaml.)
So, when you write
T = TypeVar('T')

def f(a: T, b: T) -> None: ...

f(x, y)
you are only telling mypy that the types of x and y must be subtypes of some common type T. And of course, this constraint is always (trivially) satisfied by inferring that T is object.
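A minimal sketch of that inference:
from typing import TypeVar

T = TypeVar("T")

def f(a: T, b: T) -> None: ...

f(1, "2")  # accepted: mypy solves T = object, their common supertype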
Update
To your question in the comment (is it possible to ensure that type of y is of subtype of type of x?), the answer is also no.
Even though mypy allows a type variable to be bounded from above by a specified type, that bound cannot be another type variable, so this won't work:
T = TypeVar('T')
U = TypeVar('U', bound=T, contravariant=True)  # error, T not valid here

def f(x: T, y: U) -> None: ...
from typing import Generic, TypeVar

T_co = TypeVar('T_co', covariant=True)

class CovariantClass(Generic[T_co]):
    def get_t(self, t: T_co) -> T_co:  # <--- Mypy: cannot use a covariant type variable as a parameter
        return t
see closed mypy issue
My reading of PEPs 484 and 483 is that covariant function arguments are prohibited but covariant method arguments are allowed, and mypy is "wrong" to flag it. That is to say, if the class is declared covariant by its author, then it is their responsibility to ensure that every covariant method in the class is well-behaved (e.g. by not appending to a collection of covariant items).
from PEP 484:
Consider a class Employee with a subclass Manager. Now suppose we have a function with an argument annotated with List[Employee]. Should we be allowed to call this function with a variable of type List[Manager] as its argument? Many people would answer "yes, of course" without even considering the consequences. But unless we know more about the function, a type checker should reject such a call: the function might append an Employee instance to the list, which would violate the variable's type in the caller.
It turns out such an argument acts contravariantly, whereas the intuitive answer (which is correct in case the function doesn't mutate its argument!) requires the argument to act covariantly.
....
Covariance or contravariance is not a property of a type variable, but a property of a generic class defined using this variable. Variance is only applicable to generic types; generic functions do not have this property. The latter should be defined using only type variables without covariant or contravariant keyword arguments.
It then gives an example of a prohibited function with a covariant argument.
Also from PEP 483 as evidence against the argument put forward in the closed mypy issue that it is prohibited to keep python type-safe (emphasis as in the PEP):
It is possible to declare the variance for user defined generic types
PEP 484 contains a class with a covariant argument in the __init__ but there is no example I could find in the PEPs of a covariant argument in a method showing it to be explicitly either allowed or prohibited.
T_co = TypeVar('T_co', covariant=True)

class ImmutableList(Generic[T_co]):
    def __init__(self, items: Iterable[T_co]) -> None: ...
    def __iter__(self) -> Iterator[T_co]: ...
    ...
EDIT 1: simple example of where this could be useful as requested in the comments. An extra method in the above class from the PEP:
    def is_member(self, item: T_co) -> bool:
        return item in self._items
EDIT 2: another example in case the first one seems academic
    def k_nearest_neighbours(self, target_item: T_co, k: int) -> list[T_co]:
        nearest: list[T_co] = []
        distances: list[float] = [item.distance_metric(target_item)
                                  for item in self._items]
        ...
        return nearest
from typing import Callable

class BaseClass:
    p: int

class DerivedClass(BaseClass):
    q: int

def p(q: Callable[[BaseClass], str]) -> None:
    return None

def r(derived: DerivedClass) -> str:
    return ""

p(r)
Expected behavior:
- No error from mypy -
Actual behavior:
Argument 1 to "p" has incompatible type "Callable[[DerivedClass], str]";
expected "Callable[[BaseClass], str]"
Let's talk about type variance. Under typical subtyping rules, if we have a type DerivedClass that is a subtype of a type BaseClass, then every instance of DerivedClass is an instance of BaseClass. Simple enough, right? But now the complexity arises when we have generic type arguments.
Let's suppose that we have a class that gets a value and returns it. I don't know how it gets it; maybe it queries a database, maybe it reads the file system, maybe it just makes one up. But it gets a value.
class Getter:
    def get_value(self):
        # Some deep magic ...
        ...
Now let's assume that, when we construct the Getter, we know what type it should be querying at compile-time. We can use a type variable to annotate this.
T = TypeVar("T")
class Getter(Generic[T]):
def get_value(self) -> T:
...
Now, Getter is a valid thing. We can have a Getter[int] which gets an integer and a Getter[str] which gets a string.
But here's a question. If I have a Getter[int], is that a valid Getter[object]? Surely, if I can get a value as an int, it's easy enough to upcast it, right?
my_getter_int: Getter[int] = ...
my_getter_obj: Getter[object] = my_getter_int  # mypy error: incompatible assignment
But Python won't allow this. See, Getter was declared to be invariant in its type argument. That's a fancy way of saying that, even though int is a subtype of object, Getter[int] and Getter[object] have no relationship.
But, like I said, surely they should have a relationship, right? Well, yes. If your type is only used in positive position (glossing over some details, that means roughly that it only appears as the return value of methods or as the type of read-only properties), then we can make it covariant.
T_co = TypeVar("T_co", covariant=True)
class Getter(Generic[T_co]):
def get_value(self) -> T_co:
...
By convention, in Python, we denote covariant type arguments using names that end in _co. But the thing that actually makes it covariant here is the covariant=True keyword argument.
Now, with this version of Getter, Getter[int] is actually a subtype of Getter[object]. In general, if A is a subtype of B, then Getter[A] is a subtype of Getter[B]. Covariance preserves subtyping.
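The earlier assignment now type-checks, a quick sketch:
my_getter_int: Getter[int] = ...
my_getter_obj: Getter[object] = my_getter_int  # OK: Getter[int] is a subtype of Getter[object]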
Okay, that's covariance. Now consider the opposite. Let's say we have a setter which sets some value in a database.
class Setter:
    def set_value(self, value):
        ...
Same assumptions as before. Suppose we know what the type is in advance. Now we write
T = TypeVar("T")
class Setter:
def set_value(self, value: T) -> None:
...
Okay, great. Now, if I have a value my_setter : Setter[int], is that a Setter[object]? Well, my_setter can always take an integer value, whereas a Setter[object] is guaranteed to be able to take any object. my_setter can't guarantee that, so it's actually not. If we try to make T covariant in this example, we'll get
error: Cannot use a covariant type variable as a parameter
Because it's actually not a valid relationship. In fact, in this case, we get the opposite relationship. If we have a my_setter : Setter[object], then that's a guarantee that we can pass it any object at all, so certainly we can pass it an integer, hence we have a Setter[int]. This is called contravariance.
T_contra = TypeVar("T_contra", contravariant=True)

class Setter(Generic[T_contra]):
    def set_value(self, value: T_contra) -> None:
        ...
We can make our type contravariant if it only appears in negative position, which (again, oversimplifying a bit) generally means that it appears as arguments to functions, but not as a return value. Now, Setter[object] is a subtype of Setter[int]. It's backwards. In general, if A is a subtype of B, then Setter[B] is a subtype of Setter[A]. Contravariance reverses the subtyping relationship.
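A quick sketch of the reversed relationship:
my_setter_obj: Setter[object] = ...
my_setter_int: Setter[int] = my_setter_obj  # OK: Setter[object] is a subtype of Setter[int]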
Now, back to your example. You have a Callable[[DerivedClass], str] and want to know if it's a valid Callable[[BaseClass], str].
Applying our principles from before, we have a type Callable[[T], S] (I'm assuming only one argument for simplicity's sake, but in reality this works in Python for any number of arguments) and want to ask if T and S are covariant, contravariant, or invariant.
Well, what is a Callable? It's a function. It has one thing we can do: call it with a T and get an S. So it's pretty clear that T is only used as an argument and S as a result. Things only used as arguments are contravariant, and those used as results are covariant, so in reality it's more correct to write
Callable[[T_contra], S_co]
Arguments to Callable are contravariant, which means that if DerivedClass is a subtype of BaseClass, then Callable[[BaseClass], str] is a subtype of Callable[[DerivedClass], str], the opposite relationship to the one you suggested. You need a function that can accept any BaseClass. A function with a BaseClass argument would suffice, and so would a function with an object argument, or any type which is a supertype of BaseClass, but subtypes are insufficient because they're too specific for your contract.
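Concretely, a version of r that satisfies p's contract could look like this (a sketch; r_fixed is an illustrative name):
def r_fixed(base: BaseClass) -> str:
    return str(base.p)  # uses only BaseClass attributes

p(r_fixed)  # accepted: the parameter type matches exactly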
MyPy objects to your call of p with r as its argument because given only the type signatures, it can't be sure the function won't be called with a non-DerivedClass instance.
For instance, given the same type annotations, p could be implemented like this:
def p(q: Callable[[BaseClass], str]) -> None:
    obj = BaseClass()
    q(obj)
This will break p(r) if r has an implementation that depends on the derived attributes of its argument:
def r(derived: DerivedClass) -> str:
    return str(derived.q)
A working solution, which returns integers, was given in Overload int() in Python.
However, it works only for returning int, but not float or let's say a list:
class Test:
    def __init__(self, mylist):
        self.mylist = mylist
    def __int__(self):
        return list(map(int, self.mylist))

t = Test([1.5, 6.1])
t.__int__()  # [1, 6]
int(t)
Thus t.__int__() works, but int(t) gives TypeError: __int__ returned non-int (type list).
Thus, is there a possibility to fully overwrite int, maybe with __getattribute__ or metaclass?
The __int__, __float__, ... special methods and various others do not overwrite their respective type such as int, float, etc. These methods serve as hooks that allow the types to ask for an appropriate value. The types will still enforce that a proper type is provided.
If required, one can actually overwrite int and similar on the builtins module. This can be done anywhere, and has global effect.
import builtins

# store the original ``int`` type as a default argument
def weakint(x, base=None, _real_int=builtins.int):
    """A weakly typed ``int`` whose return type may be another type"""
    if base is None:
        try:
            return type(x).__int__(x)
        except AttributeError:
            return _real_int(x)
    return _real_int(x, base)

# overwrite the original ``int`` type with the weaker one
builtins.int = weakint
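A quick usage sketch, assuming the Test class from the question is defined:
t = Test([1.5, 6.1])
print(int(t))  # [1, 6] -- the replaced int passes the list through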
Note that replacing builtin types may violate assumptions of code about these types, e.g. that type(int(x)) is int holds true. Only do so if absolutely required.
This is an example of how to replace int(...). It will break various features of int being a type, e.g. checking inheritance, unless the replacement is a carefully crafted type. A full replacement requires emulating the initial int type, e.g. via custom subclassing checks, and will not be completely possible for some builtin operations.
From the doc of __int__
Called to implement the built-in functions complex(), int() and float(). Should return a value of the appropriate type.
Here your method returns a list and not an int; this works when calling it explicitly, but not when using int(), which checks the type of what is returned by __int__.
Here is a working example of what it could be, even if the usage is not very pertinent:
class Test:
    def __init__(self, mylist):
        self.mylist = mylist
    def __int__(self):
        return int(sum(self.mylist))
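A quick usage sketch of the corrected class:
t = Test([1.5, 6.1])
int(t)  # 7 -- __int__ now returns a real int, so the built-in accepts it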
I'm new to Python annotation (type hints). I noticed that many of the class definitions in pyi files inherit from Generic[_T], and _T = TypeVar('_T').
I am confused, what does the _T mean here?
from typing import Generic, TypeVar
_T = TypeVar('_T')
class Base(Generic[_T]): pass
I recommend reading through the entire built-in typing module documentation.
typing.TypeVar
Basic Usage
Specifically, typing.TypeVar is used to specify that multiple possible types are allowed. If no specific types are specified, then any type is valid.
from typing import TypeVar
T = TypeVar('T') # <-- 'T' can be any type
A = TypeVar('A', str, int) # <-- 'A' will be either str or int
But, if T can be any type, then why create a typing.TypeVar like that, when you could just use typing.Any for the type hint?
The reason is so you can ensure that particular input and output arguments have the same type, like in the following examples.
A Dict Lookup Example
from typing import TypeVar, Dict
Key = TypeVar('Key')
Value = TypeVar('Value')
def lookup(input_dict: Dict[Key, Value], key_to_lookup: Key) -> Value:
    return input_dict[key_to_lookup]
This appears to be a trivial example at first, but these annotations require that the type of the keys in the input dictionary matches the type of the key_to_lookup argument, and that the type of the output matches the type of the values in the dict as well.
The keys and values as a whole can be different types, and they can differ from one call to the next (because Key and Value do not restrict the types), but within a given call, the keys of the dict must match the type of the lookup key, and likewise the values must match the return type.
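A short usage sketch (names are illustrative):
d = {"a": 1, "b": 2}
v = lookup(d, "a")  # Key = str, Value = int, so v is an int
lookup(d, 0)        # mypy error: an int key does not match the str keys of d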
An Addition Example
If you create a new TypeVar and limit the types to float and int:
B = TypeVar('B', float, int)
def add_x_and_y(x: B, y: B) -> B:
return x + y
This function requires that x and y either both be float or both be int, and the same type must be returned. If x and y were, say, a str and an int, the type checking would fail. (One caveat: because mypy promotes int to float, mixing an int with a float may still be accepted with B solved as float.)
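For illustration:
add_x_and_y(1, 2)      # OK: B solved as int
add_x_and_y(1.0, 2.5)  # OK: B solved as float
add_x_and_y("1", "2")  # mypy error: str is not among B's constraints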
typing.Generic
I'm a little more sketchy on this one, but the typing.Generic (links to the official docs) Abstract Base Class (ABC) allows defining a class that takes type parameters. They have a good example in the linked docs.
In this case they are creating a completely generic class. If I understand correctly, this allows using Base[AnyName] as a type hint elsewhere in the code, and then AnyName can be reused to represent the same type elsewhere within the same definition (i.e. within the same code scope).
I suppose this would be useful to avoid having to use TypeVar repeatedly: you can basically create new type parameters at will just by using the Base class as a type hint, as long as you only need them for the scope of that local definition.
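As a minimal sketch of a generic class in that style (Stack is an illustrative name, loosely modeled on the example in the typing docs):
from typing import Generic, List, TypeVar

_T = TypeVar('_T')

class Stack(Generic[_T]):
    def __init__(self) -> None:
        self._items: List[_T] = []

    def push(self, item: _T) -> None:
        self._items.append(item)

    def pop(self) -> _T:
        return self._items.pop()

s: Stack[int] = Stack()
s.push(1)
s.push("two")  # mypy error: expected int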