A working solution, which returns integers, was given in Overload int() in Python.
However, it only works for returning int, not for float or, say, a list:
class Test:
    def __init__(self, mylist):
        self.mylist = mylist
    def __int__(self):
        return list(map(int, self.mylist))

t = Test([1.5, 6.1])
t.__int__()  # [1, 6]
int(t)
So t.__int__() works, but int(t) raises TypeError: __int__ returned non-int (type list).
Is there a way to fully override int, perhaps with __getattribute__ or a metaclass?
The __int__, __float__, ... special methods and various others do not override their respective types such as int, float, etc. These methods serve as hooks that allow the types to ask for an appropriate value; the types still enforce that a value of the proper type is returned.
If required, one can actually overwrite int and similar on the builtins module. This can be done anywhere, and has global effect.
import builtins

# store the original ``int`` type as a default argument
def weakint(x, base=None, _real_int=builtins.int):
    """A weakly typed ``int`` whose return type may be another type"""
    if base is None:
        try:
            return type(x).__int__(x)
        except AttributeError:
            return _real_int(x)
    return _real_int(x, base)

# overwrite the original ``int`` type with the weaker one
builtins.int = weakint
Note that replacing builtin types may violate assumptions that other code makes about these types, e.g. that type(int(x)) is int holds true. Only do so if absolutely required.
This is an example of how to replace int(...). It will break various features of int being a type, e.g. checking inheritance, unless the replacement is a carefully crafted type. A full replacement requires emulating the initial int type, e.g. via custom subclassing checks, and will not be completely possible for some builtin operations.
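As a rough usage sketch (not part of the original answer, and assuming both the Test class from the question and the weakint replacement above are defined), the replaced builtin behaves like this:
t = Test([1.5, 6.1])
int(t)          # [1, 6] -- weakint returns whatever __int__ gives back
int("42")       # 42     -- ordinary conversions still go through the real int
int("ff", 16)   # 255    -- the base argument is forwarded to the real int
# But isinstance(1, int) now raises TypeError, because builtins.int
# is a plain function rather than a type -- one of the broken features
# mentioned above.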
From the doc of __int__
Called to implement the built-in functions complex(), int() and float(). Should return a value of the appropriate type.
Here your method returns a list and not an int; this works when calling it explicitly, but not when using int(), which checks the type of the value returned by __int__.
Here is a working example of what it could be, even if the usage is not very pertinent:
class Test:
    def __init__(self, mylist):
        self.mylist = mylist
    def __int__(self):
        return int(sum(self.mylist))
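For example (a usage note added here, not in the original answer), with this definition the explicit call and the builtin now agree:
t = Test([1.5, 6.1])
t.__int__()  # 7
int(t)       # 7, since __int__ returns a real int this time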
Related
I have a function which takes a list of a base class as argument, and I have a variable which is a list of a derived class. Using this variable as the argument gives mypy error: Argument 1 to "do_stuff" has incompatible type "List[DerivedClass]"; expected "List[BaseClass]".
from typing import List, TypedDict

class BaseClass(TypedDict):
    base_field: str

class DerivedClass(BaseClass):
    derived_field: str

def do_stuff(data: List[BaseClass]) -> None:
    pass

foo: List[DerivedClass] = [{'base_field': 'foo', 'derived_field': 'bar'}]
do_stuff(foo)
If the argument and variable are instead BaseClass and DerivedClass respectively, i.e. not lists, mypy understands that the variable can be implicitly cast to the base class. But for lists it doesn't work. How can I solve this, preferably other than with # type: ignore?
It depends on what exactly do_stuff is doing, but nine times out of ten the best solution is to use Sequence instead of List:
from typing import Sequence

def do_stuff(data: Sequence[BaseClass]) -> None:
    pass
The reason you can't use List[BaseClass] here is that do_stuff would be allowed to add BaseClass instances to data, which would in turn break foo in the caller. Sequence doesn't imply mutability, so do_stuff is not allowed (static-typing-wise) to modify a Sequence parameter, which prevents that issue. (Put differently, Sequence is covariant and List is invariant. Most mutable generics are invariant because of exactly this issue.)
If do_stuff does need to mutate data, you'll need to rethink the typing -- should it be allowed to add a BaseClass to it? If not, maybe do_stuff should take a List[DerivedClass]. If so, you need to declare foo as a List[BaseClass] to account for that possibility.
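To make the soundness issue concrete, here is a hypothetical mutating variant of do_stuff (not from the question) that shows why mypy must reject passing a List[DerivedClass] where a List[BaseClass] is expected:
from typing import List

def do_stuff_mutating(data: List[BaseClass]) -> None:
    # Perfectly legal for a List[BaseClass]...
    data.append({'base_field': 'only-a-base'})

foo: List[DerivedClass] = [{'base_field': 'foo', 'derived_field': 'bar'}]
# If this call were accepted, foo would end up containing an item without
# 'derived_field', silently breaking later uses of foo as a List[DerivedClass].
# do_stuff_mutating(foo)  # rejected by mypy for exactly this reason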
You can try using a TypeVar bound to BaseClass. In code, that looks like:
from typing import TypeVar

T = TypeVar('T', bound=BaseClass)

def do_stuff(data: list[T]) -> None:
    pass
This also makes mypy happy.
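As a small usage sketch (assuming the classes from the question), T is then solved per call while still being restricted to subtypes of BaseClass:
foo: list[DerivedClass] = [{'base_field': 'foo', 'derived_field': 'bar'}]
bar: list[BaseClass] = [{'base_field': 'baz'}]

do_stuff(foo)  # OK: T is solved as DerivedClass
do_stuff(bar)  # OK: T is solved as BaseClass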
I'm not able to understand the use of Generic and TypeVar, and how they are related.
https://docs.python.org/3/library/typing.html#building-generic-types
The docs have this example:
class Mapping(Generic[KT, VT]):
    def __getitem__(self, key: KT) -> VT:
        ...
        # Etc.

X = TypeVar('X')
Y = TypeVar('Y')

def lookup_name(mapping: Mapping[X, Y], key: X, default: Y) -> Y:
    try:
        return mapping[key]
    except KeyError:
        return default
Type variables exist primarily for the benefit of static type checkers. They serve as the parameters for generic types as well as for generic function definitions.
Why can't I simply use Mapping with some existing type, like int, instead of creating X and Y?
Type variables are literally "variables for types". Similar to how regular variables allow code to apply to multiple values, type variables allow code to apply to multiple types.
At the same time, just like code is not required to apply to multiple values, it is not required to depend on multiple types. A literal value can be used instead of variables, and a literal type can be used instead of type variables – provided these are the only values/types applicable.
Since the Python language semantically only knows values – runtime types are also values – it does not have the facilities to express type variability. Namely, it cannot define, reference or scope type variables. Thus, typing represents these two concepts via concrete things:
A typing.TypeVar represents the definition and reference to a type variable.
A typing.Generic represents the scoping of types, specifically to class scope.
Notably, it is possible to use TypeVar without Generic – functions are naturally scoped – and Generic without TypeVar – scopes may use literal types.
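As a small illustration added here (not part of the original answer), both combinations might look like this:
from typing import Generic, TypeVar

T = TypeVar("T")

# TypeVar without Generic: the function itself is the scope of T.
def first(items: list[T]) -> T:
    return items[0]

# Generic class used with a literal type argument instead of a type variable.
class Box(Generic[T]):
    def __init__(self, content: T) -> None:
        self.content = content

b: Box[int] = Box(3)  # the scope is parameterised with the literal type int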
Consider a function to add two things. The most naive implementation adds two literal things:
def add():
    return 5 + 12
That is valid but needlessly restricted. One would like to parameterise the two things to add – this is what regular variables are used for:
def add(a, b):
    return a + b
Now consider a function to add two typed things. The most naive implementations adds two things of literal type:
def add(a: int, b: int) -> int:
    return a + b
That is valid but needlessly restricted. One would like to parameterise the types of the two things to add – this is what type variables are used for:
T = TypeVar("T")
def add(a: T, b: T) -> T:
return a + b
Now, in the case of values we defined two variables – a and b – but in the case of types we defined one variable – the single T – and used it for both parameters! Just like the expression a + a means both operands are the same value, the annotation a: T, b: T means both parameters are the same type. This is because our function has a strong relation between the types but not the values.
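A quick way to see this relation is mypy's reveal_type helper (a check added here for illustration, not in the original answer):
# reveal_type is understood by mypy only; it does not exist at runtime.
reveal_type(add(3, 4))      # mypy: Revealed type is "builtins.int"
reveal_type(add("a", "b"))  # mypy: Revealed type is "builtins.str"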
While type variables are automatically scoped in functions – to the function scope – this is not the case for classes: a type variable might be scoped across all methods/attributes of a class or specific to some method/attribute.
When we define a class, we may scope type variables to the class scope by adding them as parameters to the class. Notably, parameters are always variables – this applies to regular parameters just as for type parameters. It just does not make sense to parameterise a literal.
#           v value parameters of the function are "value variables"
def mapping(keys, values):
    ...

#                     v type parameters of the class are "type variables"
class Mapping(Generic[KT, VT]):
    ...
When we use a class, the scope of its parameters has already been defined. Notably, the arguments passed in may be literal or variable – this again applies to regular arguments just as for type arguments.
#        v pass in arguments via literals
mapping([0, 1, 2, 3], ['zero', 'one', 'two', 'three'])
#        v pass in arguments via variables
mapping(ks, vs)

#          v pass in arguments via literals
m: Mapping[int, str]
#          v pass in arguments via variables
m: Mapping[KT, VT]
Whether to use literals or variables and whether to scope them or not depends on the use-case. But we are free to do either as required.
The whole purpose of using Generic and TypeVar (here represented by the X and Y variables) is to make the parameters as generic as possible. int could be used instead in this case; the difference is that the static analyzer would then interpret the parameter as always being an int.
Using generics means the function accepts any type of parameter. The static analyzer, for instance in an IDE, will determine the types of the variables and the return type from the arguments provided at the function call or object instantiation.
mapping: Mapping[str, int] = {"2": 2, "3": 3}
name = lookup_name(mapping, "1", 1)
In the above example, type checkers will know name will always be an int, relying on the type annotations. In IDEs, code completion for int methods will be shown when the name variable is used.
Using specific types is ideal if that is your goal: for instance, the function accepting only a mapping with specific key and value types, and/or returning an int in this case.
As X and Y are variables, you can basically choose any name you want.
The example below is also possible:
def lookup_name(mapping: Mapping[str, int], key: str, default: int) -> int:
    try:
        return mapping[key]
    except KeyError:
        return default
The types are not generic in the above example. The key will always be str; the default variable, the value, and the return type will always be an int. It's the programmer's choice. This is not enforced by Python, though. A static type checker like mypy is needed for that.
The type variables could even be constrained if wanted:
import typing

X = typing.TypeVar("X", int, str)   # accept only int or str
# A TypeVar needs at least two constraints; to restrict to a single type, use a bound instead:
Y = typing.TypeVar("Y", bound=int)  # accept only int (and its subtypes)
MisterMiyagi's answer offers a thorough explanation of the use and scope of TypeVar and Generic.
from typing import Callable

class BaseClass:
    p: int

class DerivedClass(BaseClass):
    q: int

def p(q: Callable[[BaseClass], str]) -> None:
    return None

def r(derived: DerivedClass) -> str:
    return ""

p(r)
Expected behavior:
No error from mypy.
Actual behavior:
error: Argument 1 to "p" has incompatible type "Callable[[DerivedClass], str]"; expected "Callable[[BaseClass], str]"
Let's talk about type variance. Under typical subtyping rules, if we have a type DerivedClass that is a subtype of a type BaseClass, then every instance of DerivedClass is an instance of BaseClass. Simple enough, right? But now the complexity arises when we have generic type arguments.
Let's suppose that we have a class that gets a value and returns it. I don't know how it gets it; maybe it queries a database, maybe it reads the file system, maybe it just makes one up. But it gets a value.
class Getter:
    def get_value(self):
        # Some deep magic ...
        ...
Now let's assume that, when we construct the Getter, we know what type it should be querying at compile-time. We can use a type variable to annotate this.
T = TypeVar("T")
class Getter(Generic[T]):
def get_value(self) -> T:
...
Now, Getter is a valid thing. We can have a Getter[int] which gets an integer and a Getter[str] which gets a string.
But here's a question. If I have a Getter[int], is that a valid Getter[object]? Surely, if I can get a value as an int, it's easy enough to upcast it, right?
my_getter_int: Getter[int] = ...
my_getter_obj: Getter[object] = my_getter_int
But Python won't allow this. See, Getter was declared to be invariant in its type argument. That's a fancy way of saying that, even though int is a subtype of object, Getter[int] and Getter[object] have no relationship.
But, like I said, surely they should have a relationship, right? Well, yes. If your type is only used in positive position (glossing over some details, that means roughly that it only appears as the return value of methods or as the type of read-only properties), then we can make it covariant.
T_co = TypeVar("T_co", covariant=True)

class Getter(Generic[T_co]):
    def get_value(self) -> T_co:
        ...
By convention, in Python, we denote covariant type arguments using names that end in _co. But the thing that actually makes it covariant here is the covariant=True keyword argument.
Now, with this version of Getter, Getter[int] is actually a subtype of Getter[object]. In general, if A is a subtype of B, then Getter[A] is a subtype of Getter[B]. Covariance preserves subtyping.
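A short sketch of that relationship (assuming the covariant Getter above):
def demo(my_getter_int: Getter[int]) -> None:
    my_getter_obj: Getter[object] = my_getter_int  # OK: Getter is covariant in T_co
    # The reverse assignment would still be rejected:
    # back_to_int: Getter[int] = my_getter_obj  # error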
Okay, that's covariance. Now consider the opposite. Let's say we have a setter which sets some value in a database.
class Setter:
    def set_value(self, value):
        ...
Same assumptions as before. Suppose we know what the type is in advance. Now we write
T = TypeVar("T")
class Setter:
def set_value(self, value: T) -> None:
...
Okay, great. Now, if I have a value my_setter : Setter[int], is that a Setter[object]? Well, my_setter can always take an integer value, whereas a Setter[object] is guaranteed to be able to take any object. my_setter can't guarantee that, so it's actually not. If we try to make T covariant in this example, we'll get
error: Cannot use a covariant type variable as a parameter
Because it's actually not a valid relationship. In fact, in this case, we get the opposite relationship. If we have a my_setter : Setter[object], then that's a guarantee that we can pass it any object at all, so certainly we can pass it an integer, hence we have a Setter[int]. This is called contravariance.
T_contra = TypeVar("T_contra", contravariant=True)

class Setter(Generic[T_contra]):
    def set_value(self, value: T_contra) -> None:
        ...
We can make our type contravariant if it only appears in negative position, which (again, oversimplifying a bit) generally means that it appears as arguments to functions, but not as a return value. Now, Setter[object] is a subtype of Setter[int]. It's backwards. In general, if A is a subtype of B, then Setter[B] is a subtype of Setter[A]. Contravariance reverses the subtyping relationship.
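And a matching sketch for the reversed relationship (assuming the contravariant Setter above):
def demo(setter_for_objects: Setter[object]) -> None:
    setter_for_ints: Setter[int] = setter_for_objects  # OK: Setter is contravariant
    # A Setter[int] could not stand in for a Setter[object],
    # because it may not accept arbitrary objects.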
Now, back to your example. You have a Callable[[DerivedClass], str] and want to know if it's a valid Callable[[BaseClass], str].
Applying our principles from before, we have a type Callable[[T], S] (I'm assuming only one argument for simplicity's sake, but in reality this works in Python for any number of arguments) and want to ask if T and S are covariant, contravariant, or invariant.
Well, what is a Callable? It's a function. It has one thing we can do: call it with a T and get an S. So it's pretty clear that T is only used as an argument and S as a result. Things only used as arguments are contravariant, and those used as results are covariant, so in reality it's more correct to write
Callable[[T_contra], S_co]
Arguments to Callable are contravariant, which means that if DerivedClass is a subtype of BaseClass, then Callable[[BaseClass], str] is a subtype of Callable[[DerivedClass], str], the opposite relationship to the one you suggested. You need a function that can accept any BaseClass. A function with a BaseClass argument would suffice, and so would a function with an object argument, or any type which is a supertype of BaseClass, but subtypes are insufficient because they're too specific for your contract.
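Concretely (a sketch using the question's classes, with helper names invented here), any function whose parameter type is BaseClass or wider satisfies p, while r does not:
def takes_base(obj: BaseClass) -> str:
    return str(obj.p)

def takes_anything(obj: object) -> str:
    return str(obj)

p(takes_base)      # OK: exactly the declared parameter type
p(takes_anything)  # OK: object is a supertype of BaseClass (contravariance)
p(r)               # rejected: DerivedClass is too specific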
MyPy objects to your call of p with r as its argument because given only the type signatures, it can't be sure the function won't be called with a non-DerivedClass instance.
For instance, given the same type annotations, p could be implemented like this:
def p(q: Callable[[BaseClass], str]) -> None:
    obj = BaseClass()
    q(obj)
This will break p(r) if r has an implementation that depends on the derived attributes of its argument:
def r(derived: DerivedClass) -> str:
    return str(derived.q)
Let's consider the two following syntax variations:
class Foo:
    x: int

    def __init__(self, an_int: int):
        self.x = an_int
And
class Foo:
    def __init__(self, an_int: int):
        self.x = an_int
Apparently the following code raises a mypy error in both cases (which is expected):
obj = Foo(3)
obj.x.title() # this is a str operation
But I really want to enforce the contract: I want to make it clear that x is an instance variable of every Foo object. So which syntax should be preferred, and why?
This is ultimately a matter of personal preference. To use the example in the other answer, doing both:
class Foo:
    x: Union[int, str]

    def __init__(self, an_int: int) -> None:
        self.x = an_int
...and doing:
class Foo:
    def __init__(self, an_int: int) -> None:
        self.x: Union[int, str] = an_int
...will be treated in the exact same way by type checkers.
The main advantage of doing the former is that it makes the types of your attributes more obvious in the cases where your constructor is complex to the point where it's difficult to trace what type inference is being performed.
This style is also consistent with how you declare and use things like dataclasses:
from dataclasses import dataclass
from typing import Union

@dataclass
class Foo:
    x: int
    y: Union[int, str]
    z: str

# You get an `__init__` for free. Mypy will check to make sure the types match.
# So this type checks:
a = Foo(1, "b", "c")

# ...but this doesn't:
b = Foo("bad", 3.14, 0)
This isn't really a pro or a con, just more of an observation that the standard library has, in some specific cases, embraced the former style.
The main disadvantage is that this style is somewhat verbose: you're forced into repeating the variable name twice (three times, if you include the __init__ parameter), and often forced into repeating the type hint twice (once in your variable annotation and once in the __init__ signature).
It also opens up a possible correctness issue in your code: mypy will never actually check to make sure you've assigned anything to your attribute! For example, the following code will happily type check despite that it crashes at runtime:
class Foo:
    x: int

    def __init__(self, x: int) -> None:
        # Whoops, I forgot to do 'self.x = x'
        pass

f = Foo(1)

# Type checks, but crashes at runtime!
print(f.x)
The latter style dodges these issues: if you forget to assign an attribute, mypy will complain that it doesn't exist when you try using it later.
The other main advantage of the latter style is that you can also get away with not adding an explicit type hint a lot of the time, especially if you're just assigning a parameter directly to a field. The type checker will infer the exact same type in those cases.
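For example (a small sketch added here), assigning the parameter directly is enough for the checker to infer the attribute type:
class Foo:
    def __init__(self, an_int: int) -> None:
        self.x = an_int  # mypy infers x as int from the parameter

f = Foo(3)
reveal_type(f.x)   # mypy: Revealed type is "builtins.int" (reveal_type is mypy-only)
# f.x = "hello"    # would be flagged: str is not assignable to int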
So given these factors, my personal preference is to:
Use dataclasses (and by proxy, the former style) if I just want a simple, record-like object with an automatically generated __init__.
Use the latter style if I either feel dataclasses are overkill or need to write a custom __init__, to decrease both verbosity and the odds of running into the "forgot-to-assign-an-attribute" bug.
Switch back to the former style if I have a sufficiently large and complex __init__ that's somewhat difficult to read. (Or better yet, just refactor my code so I can keep the __init__ simple!)
You may end up weighing these factors differently and come up with a different set of tradeoffs, of course.
One final tangent -- when you do:
class Foo:
    x: int
...you are not actually annotating a class variable. At this point, x has no value, so doesn't actually exist as a variable.
The only thing you're creating is an annotation, which is just pure metadata and distinct from the variable itself.
But if you do:
class Foo:
    x: int = 3
...then you are creating both a class variable and an annotation. Somewhat confusingly, while you may be creating a class variable/attribute (as opposed to an instance variable/attribute), mypy and other type checkers will continue assuming that the type annotation is meant to annotate specifically an instance attribute.
This inconsistency usually doesn't matter in practice, especially if you follow the general best practice of avoiding mutable default values for anything. But this may cause some surprises if you're trying to do something fancy.
If you want mypy/other type checkers to understand your annotation is a class variable annotation, you need to use the ClassVar type:
# Import this from 'typing_extensions' if you're using Python 3.7 or earlier
from typing import ClassVar

class Foo:
    x: ClassVar[int] = 3
If you ever want to use Any, Union, or Optional for an instance variable you should annotate them:
from typing import Union

class Foo:
    x: Union[int, str]

    def __init__(self, an_int: int):
        self.x = an_int

    def setx(self, a_str: str):
        self.x = a_str
Otherwise you can use whichever you think is easier to read. mypy will infer the type from __init__.
I'm new to Python annotations (type hints). I noticed that many of the class definitions in pyi files inherit from Generic[_T], where _T = TypeVar('_T').
I am confused: what does the _T mean here?
from typing import Generic, TypeVar
_T = TypeVar('_T')
class Base(Generic[_T]): pass
I recommend reading through the entire built-in typing module documentation.
typing.TypeVar
Basic Usage
Specifically, typing.TypeVar is used to specify that multiple possible types are allowed. If no specific types are specified, then any type is valid.
from typing import TypeVar
T = TypeVar('T') # <-- 'T' can be any type
A = TypeVar('A', str, int) # <-- 'A' will be either str or int
But, if T can be any type, then why create a typing.TypeVar like that, when you could just use typing.Any for the type hint?
The reason is so you can ensure that particular input and output arguments have the same type, like in the following examples.
A Dict Lookup Example
from typing import TypeVar, Dict

Key = TypeVar('Key')
Value = TypeVar('Value')

def lookup(input_dict: Dict[Key, Value], key_to_lookup: Key) -> Value:
    return input_dict[key_to_lookup]
This appears to be a trivial example at first, but these annotations require that the type of the keys in the input dictionary is the same as the type of the key_to_lookup argument, and that the type of the output matches the type of the values in the dict as well.
The keys and values as a whole could be of different types, and for any particular call to this function they could differ (because Key and Value do not restrict the types), but for a given call, the keys of the dict must match the type of the lookup key, and likewise the values must match the return type.
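For instance (a usage sketch not in the original answer), with a dict of str keys and int values the checker pins Key to str and Value to int for that call:
ages = {"alice": 31, "bob": 27}

age = lookup(ages, "alice")  # Key=str, Value=int, so age is inferred as int
# lookup(ages, 42)           # flagged: the key must match the dict's key type (str)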
An Addition Example
If you create a new TypeVar and limit the types to float and int:
B = TypeVar('B', float, int)

def add_x_and_y(x: B, y: B) -> B:
    return x + y
This function requires that x and y either both be float, or both be int, and the same type must be returned. If x were a float and y were an int, the type checking should fail.
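For example (an added illustration):
add_x_and_y(1, 2)        # OK: B is int
add_x_and_y(1.0, 2.5)    # OK: B is float
add_x_and_y("a", "b")    # rejected: str is not one of B's constraints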
typing.Generic
I'm a little more sketchy on this one, but the typing.Generic (links to the official docs) Abstract Base Class (ABC) allows setting up a class that has a defined type hint. They have a good example in the linked docs.
In this case they are creating a completely generic type class. If I understand correctly, this allows using Base[AnyName] as a type hint elsewhere in the code and then one can reuse AnyName to represent the same type elsewhere within the same definition (i.e. within the same code scope).
I suppose this would be useful to avoid having to use TypeVar repeatedly: you can basically create new TypeVars at will just by using the Base class as a type hint, as long as you only need them for the scope of that local definition.
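A small sketch of what that looks like in practice (the class body and names are invented here for illustration):
from typing import Generic, TypeVar

_T = TypeVar('_T')

class Base(Generic[_T]):
    def __init__(self, item: _T) -> None:
        self.item = item

    def get(self) -> _T:
        return self.item

int_holder: Base[int] = Base(3)     # _T is int for this instance
str_holder: Base[str] = Base("hi")  # _T is str for this one
# int_holder.get() is typed as int, str_holder.get() as str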