I'm trying to reuse type hints from a dataclass in my function signature - that is, without having to type the signature out again.
What would be the best way of going about this?
from dataclasses import dataclass
from typing import Set, Tuple, Type

@dataclass
class MyDataClass:
    force: Set[Tuple[str, float, bool]]

# I've had to write the same type annotation in the dataclass and the
# function signature - yuck
def do_something(force: Set[Tuple[str, float, bool]]):
    print(force)

# I want to do something like this, where I reference the type annotation from
# the dataclass. But, doing it this way, pycharm thinks `force` is type `Any`
def do_something_2(force: Type["MyDataClass.force"]):
    print(force)
What would be the best way of going about this?
PEP 484 gives one clear option for this case
Type aliases
Type aliases are defined by simple variable assignments:
(...)
Type aliases may be as complex as type hints in annotations -- anything that is acceptable as a type hint is acceptable in a type alias:
Applied to your example, this amounts to the following (mypy confirms it as correct):
from dataclasses import dataclass

Your_Type = set[tuple[str, float, bool]]

@dataclass
class MyDataClass:
    force: Your_Type

def do_something(force: Your_Type):
    print(force)
The above is written using the generic alias types available from Python 3.9 onward. The syntax is more concise and modern, since typing.Set and typing.Tuple have been deprecated in favour of the built-in set and tuple.
Now, fully understanding this in terms of the Python Data Model is more complicated than it may seem:
3.1. Objects, values and types
Every object has an identity, a type and a value.
Your first attempt at using Type would give an astonishing result:
>>> type(MyDataClass.force)
AttributeError: type object 'MyDataClass' has no attribute 'force'
This is because force exists only as an annotation on the class: an annotation without an assigned value does not create an actual class attribute, so the lookup MyDataClass.force raises AttributeError before the built-in type() is ever called. Note also the distinction the Data Model draws between classes and their instances:
Classes
These objects normally act as factories for new instances of themselves
Class Instances
Instances of arbitrary classes
If instead you checked the type on an instance, you would get the following result (the tuple values here are ordered to match the annotation Tuple[str, float, bool]):
>>> init_values: set = {("the_str", 1.2, True)}
>>> a_var = MyDataClass(init_values)
>>> type(a_var)
<class '__main__.MyDataClass'>
>>> type(a_var.force)
<class 'set'>
Now let's recover the type object (not the type hints) of force by applying type() to the __annotations__ of the class declaration (here we meet the generic alias type mentioned earlier). This time we are indeed checking the type object of the class attribute force.
>>> type(MyDataClass.__annotations__['force'])
<class 'typing._GenericAlias'>
Or we could check the annotations on the Class instance, and recover the type hints as we are used to seeing them.
>>> init_values: set = {("the_str", 1.2, True)}
>>> a_var = MyDataClass(init_values)
>>> a_var.__annotations__
{'force': set[tuple[str, float, bool]]}
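If you want to grab that annotation programmatically, typing.get_type_hints() is usually preferable to reading __annotations__ directly, because it also resolves string (forward-reference) annotations; a small sketch using the same class:

```python
from dataclasses import dataclass
from typing import get_type_hints

@dataclass
class MyDataClass:
    force: set[tuple[str, float, bool]]

# get_type_hints resolves the annotations declared on the class object
hints = get_type_hints(MyDataClass)
print(hints['force'])  # set[tuple[str, float, bool]]
```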
I've had to write the same type annotation in the dataclass and the function signature -
For tuples, annotations tend to become long literals, which justifies creating a dedicated alias for conciseness. In general, though, explicit signatures are more descriptive, and that is what most APIs go for.
The typing Module
Fundamental building blocks:
Tuple, used by listing the element types, for example Tuple[int, int, str]. The empty tuple can be typed as Tuple[()]. Arbitrary-length homogeneous tuples can be expressed using one type and ellipsis, for example Tuple[int, ...]. (The ... here are part of the syntax, a literal ellipsis.)
Related
I'm trying to figure out a way to create precise types for a function that returns class attributes as an immutable dictionary.
For the sake of example, let's say we have a foreign class called A; by "foreign" I mean it comes from a library that I don't own, but it has typing stubs such as:
class A:
    name: str
    content: str
    age: int
Now, I want to create a function that returns a dictionary, so that its shape is exactly that class's available attributes. I am able to get the list of fields and their corresponding types through typing.get_type_hints(), such as:
>>> from typing import get_type_hints
>>> obj = A()
>>> get_type_hints(obj)
{'name': <class 'str'>, 'content': <class 'str'>, 'age': <class 'int'>}
Now, what I'd like to do is to create a function that is fully and strongly typed:
def convert_object_to_dict(_obj: A) -> frozendict[str, Any]:
    ...  # Let's ignore the implementation details here; this is just an example
(I know frozendict is not a type, but let's assume it is)
If I call it with an object that has all the fields defined, it should return a dictionary with three keys, named exactly as in the class A defined earlier; for the sake of argument, let's assume it always returns all the attributes defined in the class.
The obvious return type is frozendict[str, Any], but this tells neither mypy nor the IDE what keys to expect. I could make it more precise by using a Union of all the expected value types, but that means maintaining it by hand, which feels redundant when get_type_hints is able to tell me exactly what is there. An alternative would be making a TypedDict of the same shape as the class, but that would be a nightmare to maintain and would make typing more annoying than useful.
What I'd like to do is, something similar to this:
def convert_object_to_dict(_obj: A) -> TypedDict[A]:
    ...
So that if I call this function, mypy will be able to tell what shape the dictionary is, and so what keys should be possible to be used with other functions later on. It should raise an error if I try to use a key which doesn't exist in some later full typed contexts.
TypedDict[A] in this case, should create a TypedDict, whose fields are exactly as these available in class A.
Is it possible to do with MyPy / Python typing at all?
I'm new to Python annotations (type hints). I noticed that many of the class definitions in pyi files inherit from Generic[_T], with _T = TypeVar('_T').
I am confused, what does the _T mean here?
from typing import Generic, TypeVar

_T = TypeVar('_T')

class Base(Generic[_T]):
    pass
I recommend reading through the entire built-in typing module documentation.
typing.TypeVar
Basic Usage
Specifically, typing.TypeVar is used to specify that multiple possible types are allowed. If no specific types are specified, then any type is valid.
from typing import TypeVar
T = TypeVar('T') # <-- 'T' can be any type
A = TypeVar('A', str, int) # <-- 'A' will be either str or int
But, if T can be any type, then why create a typing.TypeVar like that, when you could just use typing.Any for the type hint?
The reason is so you can ensure that particular input and output arguments have the same type, like in the following examples.
A Dict Lookup Example
from typing import TypeVar, Dict

Key = TypeVar('Key')
Value = TypeVar('Value')

def lookup(input_dict: Dict[Key, Value], key_to_lookup: Key) -> Value:
    return input_dict[key_to_lookup]
This appears to be a trivial example at first, but these annotations require that the type of the keys in the input dictionary matches the type of the key_to_lookup argument, and that the type of the output matches the type of the values in the dict as well.
The keys and values as a whole could be different types, and for any particular call to this function, they could be different (because Key and Value do not restrict the types), but for a given call, the keys of the dict must match the type of the lookup key, and the same for the values and the return type.
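To make that constraint concrete, here is roughly how a checker treats calls to lookup; at runtime the calls behave like normal dict access, and only the type checker enforces the pairing:

```python
from typing import Dict, TypeVar

Key = TypeVar('Key')
Value = TypeVar('Value')

def lookup(input_dict: Dict[Key, Value], key_to_lookup: Key) -> Value:
    return input_dict[key_to_lookup]

ages: Dict[str, int] = {"alice": 30, "bob": 25}

x = lookup(ages, "alice")  # a checker infers x: int
# lookup(ages, 42)         # mypy would flag: expected "str", got "int"
print(x)
```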
An Addition Example
If you create a new TypeVar and limit the types to float and int:
B = TypeVar('B', float, int)

def add_x_and_y(x: B, y: B) -> B:
    return x + y
This function requires that x and y either both be float, or both be int, and the same type must be returned. If x were a float and y were an int, the type checking should fail.
typing.Generic
I'm a little more sketchy on this one, but the typing.Generic (links to the official docs) Abstract Base Class (ABC) allows setting up a Class that has a defined type hint. They have a good example in the linked docs.
In this case they are creating a completely generic type class. If I understand correctly, this allows using Base[AnyName] as a type hint elsewhere in the code and then one can reuse AnyName to represent the same type elsewhere within the same definition (i.e. within the same code scope).
I suppose this would be useful to avoid having to use TypeVar repeatedly, you can basically create new TypeVars at will by just using the Base class as a type hint, as long as you just need it for the scope of that local definition.
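As an illustration of the pattern, here is a minimal user-defined generic container (Box is an invented name, not from the question):

```python
from typing import Generic, TypeVar

T = TypeVar('T')

class Box(Generic[T]):
    """Whatever type goes in is the type that comes back out."""
    def __init__(self, item: T) -> None:
        self.item = item

    def get(self) -> T:
        return self.item

b = Box(42)     # a checker infers Box[int] here
print(b.get())  # 42
```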
I'm a big fan and advocate for static type hints in Python 3. I've been using them for a while with no problems.
I just ran into a new edge case that I can't seem to compile. What if I want to define a custom type, then define its parameters?
For example, this is common in Python 3:
from typing import List, NewType

CustomObject = NewType('CustomObject', List[int])

def f(data: List[CustomObject]):
    ...  # do something
But this won't compile:
class MyContainer():
    ...  # some class definition

from typing import NewType

SpecialContainer = NewType('SpecialContainer', MyContainer)

def f(data: SpecialContainer[str]):
    ...  # do something
I realize that SpecialContainer is technically a function in this case, but it shouldn't be evaluated as one in the context of a type signature. The second code snippet fails with TypeError: 'function' object is not subscriptable.
Compiling My Code Sample
You have to design your classes from the ground up to accept static type hints. This didn't satisfy my original use case, since I was trying to declare special subtypes of 3rd party classes, but it compiles my code sample.
from typing import Sequence, TypeVar, List

# Declare your own accepted types for your container, required
T = TypeVar('T', int, str, float)

# The custom container has to be designed to accept type hints
class MyContainer(Sequence[T]):
    ...  # some class definition

# Now, you can make a special container type as a plain alias
# (a TypeVar may not have a single constraint, so an alias is used here)
SpecialContainer = MyContainer[List[str]]

# And this compiles
def f(data: SpecialContainer):
    ...  # do something
Subtyping a 3rd Party Class
My original intention was to create a type hint that explained how a function, f(), took a pd.DataFrame object that was indexed by integers and whose cells were all strings. Using the above answer, I came up with a contrived way of expressing this.
from typing import Mapping, TypeVar, NewType, NamedTuple
import pandas as pd

# Create custom types, required even if redundant
Index = TypeVar('Index')
Row = TypeVar('Row')

# Create a child class of pd.DataFrame that includes a type signature
# Note that Mapping is a generic for a key-value store
class pdDataFrame(pd.DataFrame, Mapping[Index, Row]):
    pass

# Now, this compiles, and explains what my special pd.DataFrame does
pdStringDataFrame = NewType('pdStringDataFrame', pdDataFrame[int, NamedTuple[str]])

# And this compiles
def f(data: pdStringDataFrame):
    pass
Was it worth it?
If you are writing a custom class that resembles a container generic like Sequence, Mapping, or Any, then go for it. It is free to add the type variable to your class definition.
If you are trying to notate a specific usage of a 3rd party class that doesn't implement type hints:
Try using an existing type variable to get your point across, e.g. MyOrderedDictType = NewType('MyOrderedDictType', Dict[str, float])
If that doesn't work, you'll have to clutter your namespace with trivial child classes and type variables to get the type hint to compile. Better to use a docstring or comment to explain your situation.
The typing module documentation says that the two code snippets below are equivalent.
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int
and
from collections import namedtuple
Employee = namedtuple('Employee', ['name', 'id'])
Are they the exact same thing or, if not, what are the differences between the two implementations?
The type generated by subclassing typing.NamedTuple is equivalent to a collections.namedtuple, but with __annotations__, _field_types and _field_defaults attributes added. The generated code will behave the same, for all practical purposes, since nothing in Python currently acts on those typing related attributes (your IDE might use them, though).
As a developer, using the typing module for your namedtuples allows a more natural declarative interface:
You can easily specify default values for the fields (edit: in Python 3.7, collections.namedtuple got a new defaults keyword so this is no longer an advantage)
You don't need to repeat the type name twice ("Employee")
You can customize the type directly (e.g. adding a docstring or some methods)
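All three points can be sketched in a single class (the default value and the label method are invented for illustration):

```python
from typing import NamedTuple

class Employee(NamedTuple):
    """An employee record."""     # a docstring, attached directly

    name: str
    id: int = 0                   # a default value for a field

    def label(self) -> str:       # a method, added directly
        return f"{self.id}: {self.name}"

e = Employee(name="guido")
print(e.label())  # 0: guido
```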
As before, your class will be a subclass of tuple, and instances will be instances of tuple as usual. Interestingly, your class will not be a subclass of NamedTuple. If you want to know why, read on for more info about the implementation detail.
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int
Behaviour in Python <= 3.8
>>> issubclass(Employee, NamedTuple)
False
>>> isinstance(Employee(name='guido', id=1), NamedTuple)
False
typing.NamedTuple is a class; it uses metaclasses and a custom __new__ to handle the annotations, and then it delegates to collections.namedtuple to build and return the type. As you may have guessed from the lowercase naming convention, collections.namedtuple is not a type/class at all - it's a factory function. It works by building up a string of Python source code and then calling exec on that string. The generated constructor is plucked out of a namespace and passed to a 3-argument invocation of the metaclass type, which builds and returns your class. This explains the weird inheritance breakage seen above: NamedTuple uses a metaclass only in order to use a different metaclass to instantiate the class object.
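You can see the delegation to collections.namedtuple in the attributes of the generated class; a quick sketch (the exact attribute set varies slightly across Python versions):

```python
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int

print(Employee._fields)          # ('name', 'id') - namedtuple machinery
print(Employee.__bases__)        # (<class 'tuple'>,) - a plain tuple subclass
print(Employee.__annotations__)  # {'name': <class 'str'>, 'id': <class 'int'>}
```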
Behaviour in Python >= 3.9
typing.NamedTuple is changed from a type (class) to a function (def)
>>> issubclass(Employee, NamedTuple)
TypeError: issubclass() arg 2 must be a class or tuple of classes
>>> isinstance(Employee(name="guido", id=1), NamedTuple)
TypeError: isinstance() arg 2 must be a type or tuple of types
Multiple inheritance using NamedTuple is now disallowed (it did not work properly in the first place).
See bpo40185 / GH-19371 for the change.
I just wrote a simple @autowired decorator for Python that instantiates classes based on type annotations.
To enable lazy initialization of the class, the package provides a lazy(type_annotation: (Type, str)) function so that the caller can use it like this:
@autowired
def foo(bla, *, dep: lazy(MyClass)):
    ...
This works very well, under the hood this lazy function just returns a function that returns the actual type and that has a lazy_init property set to True. Also this does not break IDEs' (e.g., PyCharm) code completion feature.
But I want to enable the use of a subscriptable Lazy type instead of the lazy function.
Like this:
@autowired
def foo(bla, *, dep: Lazy[MyClass]):
    ...
This would behave very much like typing.Union. And while I'm able to implement the subscriptable type, IDEs' code completion feature will be rendered useless as it will present suggestions for attributes in the Lazy class, not MyClass.
I've been working with this code:
class LazyMetaclass(type):
    def __getitem__(lazy_type, type_annotation):
        return lazy_type(type_annotation)

class Lazy(metaclass=LazyMetaclass):
    def __init__(self, type_annotation):
        self.type_annotation = type_annotation
I tried redefining Lazy.__dict__ as a property to forward to the subscripted type's __dict__ but this seems to have no effect on the code completion feature of PyCharm.
I strongly believe that what I'm trying to achieve is possible as typing.Union works well with IDEs' code completion. I've been trying to decipher what in the source code of typing.Union makes it to behave well with code completion features but with no success so far.
For the Container[Type] notation to work you would want to create a user-defined generic type:
from typing import TypeVar, Generic

T = TypeVar('T')

class Lazy(Generic[T]):
    pass
You then use
def foo(bla, *, dep: Lazy[MyClass]):
and Lazy is seen as a container that holds the class.
Note: this still means the IDE sees dep as an object of type Lazy. Lazy is a container type here, holding an object of type MyClass. Your IDE won't auto-complete for the MyClass type; you can't use it that way.
The notation also doesn't create an instance of the Lazy class; on Python 3.6 it creates a subclass instead, via the GenericMeta metaclass. The subclass has a special attribute __args__ to let you introspect the subscription arguments:
>>> a = Lazy[str]
>>> issubclass(a, Lazy)
True
>>> a.__args__
(<class 'str'>,)
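Note that the subclassing behaviour shown above dates from the Python 3.6 era typing module; from Python 3.7 onward, subscription produces a generic alias rather than a subclass, and typing.get_args() / typing.get_origin() (added in 3.8) are the portable way to introspect it:

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar('T')

class Lazy(Generic[T]):
    pass

alias = Lazy[str]
print(get_args(alias))    # (str,)
print(get_origin(alias))  # <class '__main__.Lazy'>
```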
If all you wanted was to reach into the type annotations at runtime but resolve the name lazily, you could just support a string value:
def foo(bla, *, dep: 'MyClass'):
This is a valid type annotation, and your decorator could resolve the name at runtime by using the typing.get_type_hints() function (at a deferred time, not at decoration time), or by wrapping strings in your lazy() callable at decoration time.
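A minimal sketch of that deferred resolution (MyClass is a stand-in name):

```python
from typing import get_type_hints

class MyClass:
    pass

def foo(bla, *, dep: 'MyClass'):
    ...

# The string 'MyClass' is resolved against the enclosing namespace
# only when get_type_hints() is called, not at definition time.
print(get_type_hints(foo))  # {'dep': <class '__main__.MyClass'>}
```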
If lazy() is meant to flag a type to be treated differently from other type hints, then you are trying to overload the type hint annotations with some other meaning. Type hinting simply doesn't support such use cases, and using a Lazy[...] container can't make it work.