This isn't my exact use case, but it's similar. Suppose I want to define two typing annotations:
import numpy as np

Matrix = np.ndarray
Vector = np.ndarray
Now, I want a potential type-checker to complain when I pass a Matrix to a function that accepts a Vector:
def f(x: Vector):
    ...

m: Matrix = ...
f(m)  # Bad!
How do I mark these types as incompatible?
It appears that I can use typing.NewType to create distinct types:
from typing import NewType

A = NewType('A', int)
B = NewType('B', int)

def f(a: A):
    pass

b: B
f(b)
gives
a.py:11: error: Argument 1 to "f" has incompatible type "B"; expected "A"
Unfortunately, it doesn't work with np.ndarray until either numpy implements type hinting or NewType supports a base type of Any.
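Note that NewType only exists for the type checker; a quick sketch of its runtime behavior (the name UserId here is just for illustration):

```python
from typing import NewType

UserId = NewType('UserId', int)

# At runtime, NewType is essentially an identity function: calling it
# returns its argument unchanged, so there is no runtime checking or
# overhead. The A/B distinction above exists only in the type checker.
uid = UserId(42)
assert uid == 42
assert type(uid) is int
```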
My code is as follows:
from typing import Tuple
a: Tuple[int, int] = tuple(sorted([1, 3]))
Mypy tells me:
Incompatible types in assignment (expression has type "Tuple[int, ...]", variable has type "Tuple[int, int]")
What am I doing wrong? Why can't Mypy figure out that the sorted tuple will give back exactly two integers?
The call to sorted produces a List[int], which carries no information about length. Consequently, a tuple built from it also carries no length information. The number of elements simply is undefined by the types you use.
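To make the reasoning concrete, a small sketch:

```python
from typing import List

# sorted() always returns a plain list; its static type carries no
# length information, regardless of the input's size.
pair: List[int] = sorted([1, 3])
triple: List[int] = sorted([2, 1, 3])  # same static type, different length

# Hence tuple(...) over it can only be typed as Tuple[int, ...]
# (a tuple of unknown length), never Tuple[int, int].
t = tuple(pair)
```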
You must tell your type checker to trust you in such cases. Use # type: ignore or cast to unconditionally accept the target type as valid:
from typing import cast

# ignore mismatch by annotation
a: Tuple[int, int] = tuple(sorted([1, 3]))  # type: ignore

# ignore mismatch by cast
a = cast(Tuple[int, int], tuple(sorted([1, 3])))
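Worth knowing: cast does nothing at runtime, so it can never fail even if the value doesn't actually match the target type. A quick demonstration:

```python
from typing import Tuple, cast

t = tuple(sorted([1, 3]))
a = cast(Tuple[int, int], t)

# cast() only informs the type checker; at runtime it returns its
# argument unchanged and performs no validation whatsoever.
assert a is t
assert a == (1, 3)
```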
Alternatively, create a length-aware sort:
from typing import Tuple, TypeVar

T = TypeVar("T")  # elements must support ordering (<=)

def sort_pair(a: T, b: T) -> Tuple[T, T]:
    return (a, b) if a <= b else (b, a)
I am creating a custom container that returns an instance of itself when sliced:
from typing import Union, List

class CustomContainer:
    def __init__(self, values: List[int]):
        self.values = values

    def __getitem__(self, item: Union[int, slice]) -> Union[int, "CustomContainer"]:
        if isinstance(item, slice):
            return CustomContainer(self.values[item])
        return self.values[item]
This works but comes with the following problem:
a = CustomContainer([1, 2])
b = a[0] # is always int, but recognized as both int and CustomContainer
c = a[:] # is always CustomContainer, but recognized as both int and CustomContainer
# Non-scalable solution: Forced type hint
d: int = a[0]
e: CustomContainer = a[:]
If I change the return type of __getitem__ to only int (my original approach), then a[0] correctly shows type int, but a[:] is considered a list instead of a CustomContainer.
As far as I understand, there used to be a method in Python 2 to define how slices are created, but it was removed in Python 3.
Is there a way to give the proper type hint without having to force the type hint every time I use my container?
You want to use typing.overload, which allows you to register multiple different signatures of a function with a type checker. Functions decorated with @overload are ignored at runtime, so you'll typically just fill the body with a literal ellipsis ..., pass, or a docstring. This also means that you have to keep at least one version of the function that isn't decorated with @overload, which will be the actual function used at runtime.
If you take a look at typeshed, the repository of stub files used by most major type-checkers for checking the standard library, you'll see this is the technique they use for annotating __getitem__ methods in custom containers such as collections.UserList. In your case, you'd annotate your method like this:
from typing import overload, Union, List

class CustomContainer:
    def __init__(self, values: List[int]):
        self.values = values

    @overload
    def __getitem__(self, item: int) -> int:
        """Signature when the function is passed an int"""

    @overload
    def __getitem__(self, item: slice) -> "CustomContainer":
        """Signature when the function is passed a slice"""

    def __getitem__(self, item: Union[int, slice]) -> Union[int, "CustomContainer"]:
        """Actual runtime implementation"""
        if isinstance(item, slice):
            return CustomContainer(self.values[item])
        return self.values[item]

a = CustomContainer([1, 2])
b = a[0]
c = a[:]

reveal_type(b)
reveal_type(c)
Run it through MyPy, and it tells us:
main.py:24: note: Revealed type is "builtins.int"
main.py:25: note: Revealed type is "__main__.CustomContainer"
Further reading
The mypy docs for @overload can be found here.
def f(x, y):
    return x & 1 == 0 and y > 0

g = lambda x, y: x & 1 == 0 and y > 0
Now the same thing in Haskell:
import Data.Bits
f :: Int -> Int -> Bool
f x y = (.&.) x 1 == 0 && y > 0
That works; however, this doesn't:
g = \x y -> (.&.) x 1 == 0 && y > 0

someFunc :: IO ()
someFunc = putStrLn $ "f 5 7: " ++ ( show $ f 5 7 ) ++ "\tg 5 7: " ++ ( show $ g 5 7 )
Here's the error this gives:
• Ambiguous type variable ‘a0’ arising from the literal ‘1’
prevents the constraint ‘(Num a0)’ from being solved.
Relevant bindings include
x :: a0 (bound at src/Lib.hs:13:6)
g :: a0 -> Integer -> Bool (bound at src/Lib.hs:13:1)
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instances exist:
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Double -- Defined in ‘GHC.Float’
instance Num Float -- Defined in ‘GHC.Float’
...plus two others
...plus one instance involving out-of-scope types
(use -fprint-potential-instances to see them all)
• In the second argument of ‘(.&.)’, namely ‘1’
In the first argument of ‘(==)’, namely ‘(.&.) x 1’
In the first argument of ‘(&&)’, namely ‘(.&.) x 1 == 0’
|
13 | g = \x y -> (.&.) x 1 == 0 && y > 0
| ^
How do I get the same error in Python? That is, how do I get errors when the input doesn't match expectations?
To be specific, how do I say that a function/lambda MUST have:
arity of 5
each argument must be numerical
each argument must implement __matmul__ (@)
return bool
I know that I can roughly do this with docstrings and/or PEP 484 annotations, and with abc for classes. But what can I do for 'loose' functions in a module?
Generally you have three possible approaches:
Let the runtime produce errors for incompatible operations (e.g. 1 + 'foo' is a TypeError).
Do explicit runtime checks for certain attributes; though this is usually discouraged since Python uses duck typing a lot.
Use type annotations and a static type checker.
Your specific points:
arity of 5
Define five parameters:
def f(a, b, c, d, e): ...
each argument must be numerical
Either static type annotations:
def f(a: int, b: int, c: int, d: int, e: int): ...
And/or runtime checks:
def f(a, b, c, d, e):
    assert all(isinstance(i, int) for i in (a, b, c, d, e))

def f(a, b, c, d, e):
    if not all(isinstance(i, int) for i in (a, b, c, d, e)):
        raise TypeError
asserts are for debugging purposes and can be disabled (e.g. by running Python with -O); an explicit if..raise cannot be. Given the verbosity of this and the duck-typing philosophy, these approaches are not very Pythonic.
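The difference is easy to demonstrate: the same assert vanishes when Python runs with the -O flag:

```python
import subprocess
import sys

code = "assert False, 'boom'; print('ran')"

# Default mode: the assert fires and the process exits non-zero.
normal = subprocess.run([sys.executable, "-c", code],
                        capture_output=True, text=True)

# Optimized mode (-O): asserts are stripped out entirely, so the
# script runs to completion and reaches the print.
optimized = subprocess.run([sys.executable, "-O", "-c", code],
                           capture_output=True, text=True)

assert normal.returncode != 0
assert "ran" in optimized.stdout
```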
each argument must implement __matmul__ (@)
The most practical way is probably to let the runtime raise an error intrinsically if the passed values do not support the operation, i.e. just do:
def f(a, b):
    return a @ b  # TypeError: unsupported operand type(s) for @: ... and ...
If you want static type checking for this, you can use a typing.Protocol:
from typing import Protocol
class MatMullable(Protocol):
    def __matmul__(self, other) -> int:
        pass

def f(a: MatMullable, ...): ...
In practice you probably want to combine this with your previous "each argument must be numerical" and type hint for a type that fulfils both these requirements.
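One way to sketch that combination (the names here are hypothetical, and runtime_checkable is added so the protocol can also be tested with isinstance):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class NumericMatMul(Protocol):
    """Hypothetical protocol: 'numerical' and supports the @ operator."""
    def __matmul__(self, other): ...
    def __add__(self, other): ...

class Vec:
    def __init__(self, xs):
        self.xs = xs
    def __matmul__(self, other):
        # dot product
        return sum(a * b for a, b in zip(self.xs, other.xs))
    def __add__(self, other):
        return Vec([a + b for a, b in zip(self.xs, other.xs)])

def f(a: NumericMatMul, b: NumericMatMul) -> bool:
    return (a @ b) > 0

# runtime_checkable protocols support isinstance; note it only checks
# that the methods exist, not their signatures.
assert isinstance(Vec([1]), NumericMatMul)
assert f(Vec([1, 2]), Vec([3, 4]))  # 1*3 + 2*4 = 11 > 0
```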
return bool
def f(...) -> bool: ...
Especially given that the # operator is mostly used by 3rd party packages like numpy, in practice the most pythonic implementation of such a function is probably something along these lines:
import numpy as np
from numpy import ndarray
def f(a: ndarray, b: ndarray, c: ndarray, d: ndarray, e: ndarray) -> bool:
    return np.linalg.det(a @ b @ c @ d @ e) > 0  # or whatever
You can't directly translate the same typing expectations from Haskell to Python: Haskell is an extremely strongly, statically typed language, while Python is almost the complete opposite.
To type hint a higher order function that accepts such a function as argument, use typing.Callable:
from typing import Callable
def hof(f: Callable[[ndarray, ndarray, ndarray, ndarray, ndarray], bool]): ...
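For instance (using a two-argument Callable here for brevity, rather than the five ndarray parameters above):

```python
from typing import Callable

def hof(check: Callable[[int, int], bool]) -> bool:
    # Call the passed-in function; the Callable hint lets a static
    # checker verify arity, argument types, and return type at each
    # call site.
    return check(2, 3)

def is_even_and_positive(x: int, y: int) -> bool:
    return x & 1 == 0 and y > 0

assert hof(is_even_and_positive) is True
```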
What's the difference between the following two TypeVars?
from typing import Generic, TypeVar, Union
class A: pass
class B: pass
T = TypeVar("T", A, B)
T = TypeVar("T", bound=Union[A, B])
Here's an example of something I don't get: this passes type checking...
T = TypeVar("T", bound=Union[A, B])
class AA(A):
    pass

class X(Generic[T]):
    pass

class XA(X[A]):
    pass

class XAA(X[AA]):
    pass
...but with T = TypeVar("T", A, B), it fails with
error: Value of type variable "T" of "X" cannot be "AA"
Related: this question on the difference between Union[A, B] and TypeVar("T", A, B).
When you do T = TypeVar("T", bound=Union[A, B]), you are saying T can be bound to either Union[A, B] or any subtype of Union[A, B]. It's upper-bounded to the union.
So for example, if you had a function of type def f(x: T) -> T, it would be legal to pass in values of any of the following types:
Union[A, B] (or a union of any subtypes of A and B such as Union[A, BChild])
A (or any subtype of A)
B (or any subtype of B)
This is how generics behave in most programming languages: they let you impose a single upper bound.
But when you do T = TypeVar("T", A, B), you are basically saying T must be either upper-bounded by A or upper-bounded by B. That is, instead of establishing a single upper-bound, you get to establish multiple!
So this means while it would be legal to pass in values of either types A or B into f, it would not be legal to pass in Union[A, B] since the union is neither upper-bounded by A nor B.
So for example, suppose you had an iterable that could contain either ints or strs.
If you want this iterable to contain any arbitrary mixture of ints or strs, you only need a single upper-bound of a Union[int, str]. For example:
from typing import TypeVar, Union, List, Iterable
mix1: List[Union[int, str]] = [1, "a", 3]
mix2: List[Union[int, str]] = [4, "x", "y"]
all_ints = [1, 2, 3]
all_strs = ["a", "b", "c"]
T1 = TypeVar('T1', bound=Union[int, str])
def concat1(x: Iterable[T1], y: Iterable[T1]) -> List[T1]:
    out: List[T1] = []
    out.extend(x)
    out.extend(y)
    return out
# Type checks
a1 = concat1(mix1, mix2)
# Also type checks (though your type checker may need a hint to deduce
# you really do want a union)
a2: List[Union[int, str]] = concat1(all_ints, all_strs)
# Also type checks
a3 = concat1(all_strs, all_strs)
In contrast, if you want to enforce that the function will accept either a list of all ints or all strs but never a mixture of either, you'll need multiple upper bounds.
T2 = TypeVar('T2', int, str)
def concat2(x: Iterable[T2], y: Iterable[T2]) -> List[T2]:
    out: List[T2] = []
    out.extend(x)
    out.extend(y)
    return out
# Does NOT type check
b1 = concat2(mix1, mix2)
# Also does NOT type check
b2 = concat2(all_ints, all_strs)
# But this type checks
b3 = concat2(all_ints, all_ints)
After a bunch of reading, I believe mypy correctly raises the type-var error in the OP's question:
generics.py:31: error: Value of type variable "T" of "X" cannot be "AA"
See the below explanation.
Second Case: TypeVar("T", bound=Union[A, B])
I think @Michael0x2a's answer does a great job of describing what's happening.
First Case: TypeVar("T", A, B)
The reason boils down to the Liskov Substitution Principle (LSP), also known as behavioral subtyping. Explaining it fully is outside the scope of this answer; you will need to read up on and understand the meaning of invariance vs. covariance.
From python's typing docs for TypeVar:
By default type variables are invariant.
Based on this information, T = TypeVar("T", A, B) means the type variable T has value restrictions of classes A and B, but because it's invariant, it only accepts those two (and not any child classes of A or B).
Thus, when passed AA, mypy correctly raises a type-var error.
You might then say: well, doesn't AA properly match behavioral subtyping of A? And in my opinion, you would be correct.
Why? Because one can properly substitute A with AA, and the behavior of the program would be unchanged.
However, because mypy is a static type checker, mypy can't figure this out (it can't check runtime behavior). One has to state the covariance explicitly, via covariant=True.
Also note: when specifying a covariant TypeVar, one should use the suffix _co in type variable names. This is documented in PEP 484 here.
from typing import TypeVar, Generic
class A: pass
class AA(A): pass
T_co = TypeVar("T_co", AA, A, covariant=True)
class X(Generic[T_co]): pass
class XA(X[A]): pass
class XAA(X[AA]): pass
Output: Success: no issues found in 1 source file
So, what should you do?
I would use TypeVar("T", bound=Union[A, B]), since:
A and B aren't related
You want their subclasses to be allowed
Further reading on LSP-related issues in mypy:
python/mypy #2984: List[subclass] is incompatible with List[superclass]
python/mypy #7049: [Question] why covariant type variable isn't allowed in instance method parameter?
Contains a good example from @Michael0x2a
I have a code like this:
class A:
    def __init__(self, a: int) -> None:
        self.a: int = a

class B(A):
    def __init__(self, a: float) -> None:
        self.a: float = a
The problem is that self.a changes from type int in the base class, A, to float in class B. mypy gives me this error:
typehintancestor.py:8: error: Incompatible types in assignment (expression has type "float", variable has type "int")
(Line 8 is the last line)
Is this a bug in mypy or should I change the implementation of class B?
This is a bug in your code. Suppose it were legal to define your classes that way, and we wrote the following program:
from typing import List
# class definitions here
def extract_int(items: List[A]) -> List[int]:
    return [item.a for item in items]
my_list: List[A] = [A(1), A(2), B(3.14)]
list_of_ints = extract_int(my_list)
We expect the list_of_ints variable to contain just ints, but it'll actually contain a float (3.14).
Basically, mypy is enforcing that your code follows the Liskov substitution principle here.
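One hedged way to fix the hierarchy rather than silence mypy is to make the attribute type a parameter of the class, so each subclass picks its own (a sketch, not the only option; simply declaring self.a: float in A also works):

```python
from typing import Generic, TypeVar

# Per PEP 484's numeric tower, int is acceptable where float is expected.
T = TypeVar("T", bound=float)

class A(Generic[T]):
    def __init__(self, a: T) -> None:
        self.a: T = a

class B(A[float]):
    def __init__(self, a: float) -> None:
        super().__init__(a)

x = A(1)     # inferred as A[int]
y = B(3.14)  # .a is float, consistent with A[float]
assert x.a == 1
assert y.a == 3.14
```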