Python Dynamic Type Hints (like Dataclasses) - python

I have a dataclass, and a function which will create an instance of that dataclass using all the kwargs passed to it.
If I try to create an instance of that dataclass, I can see the type hints/autocomplete for the __init__ method. I just need the similar type hints for a custom function that I want to create.
from dataclasses import dataclass

@dataclass
class Model:
    attr1: str
    attr2: int

def my_func(**kwargs):
    model = Model(**kwargs)
    ...  # Do something else
    ret = [model]
    return ret

# my_func should show that it needs 'attr1' & 'attr2'
my_func(attr1='hello', attr2=65535)

If your IDE isn't sophisticated enough to infer that kwargs is used for nothing other than creating a Model instance (I'm not sure such an IDE exists), then it has no way of knowing that attr1 and attr2 are the required arguments, and the obvious solution would be to list them explicitly.
I would refactor the function so that it takes a Model instance as argument instead.
Then you would call it like my_func(Model(...)) and the IDE could offer the autocompletion for Model.
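For illustration, here is a minimal sketch of that refactor (the return type and the list wrapping are carried over from the question's code); the IDE can then complete attr1 and attr2 inside the Model(...) call at the call site:

from dataclasses import dataclass

@dataclass
class Model:
    attr1: str
    attr2: int

def my_func(model: Model) -> list[Model]:
    # Work with an already-constructed instance instead of raw kwargs.
    ...  # Do something else
    return [model]

# The IDE offers attr1/attr2 completion inside the Model(...) call.
result = my_func(Model(attr1='hello', attr2=65535))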

Related

inherit class type of constructor argument

I have a pydantic class, let's say class A.
I want to use a root_validator on the class, and I want IntelliSense to help me type out the class fields,
so I type the values argument as 'Self' to get the IntelliSense. But IntelliSense then assumes the fields of values are dot-accessible, so I wrap my values in a 'dotdict' class that allows me to access them that way.
The problem is that mypy complains that my dotdict class doesn't have an attribute X on every access I make.
QUESTION: how do I make an instance of the dotdict class inherit the type of the argument passed to the constructor?
# myclasses.py
from mytools import dotdict
from pydantic import BaseModel, root_validator
from typing_extensions import Self

class A(BaseModel):
    a: int
    b: str

    @root_validator
    def validateA(cls, values: Self):  # type: ignore[valid-type]
        """
        Pylance complains about the type for 'values' being unknown.
        'values' is a dictionary with the fields of the class, so I type it as Self.
        Pylance is satisfied with this and ONLY NOW provides IntelliSense for 'values'
        (notice my main goal is to get IntelliSense, but I also prefer the dot notation).
        The problem is that IntelliSense reckons the fields of the class are dot-accessible,
        so I wrap 'values' with the 'dotdict' class to make them dot-accessible.
        THIS causes one big problem:
        mypy complains on every 'values.field' like this:
        "class 'dotdict' doesn't have an attribute 'field'"
        """
        values = dotdict(values)  # type: ignore[valid-type]
        assert values.a == int(values.b), "simple assertion for show purposes"
        return values
# mytools.py
from typing import Any, Optional

class dotdict(dict):  # type: ignore
    """
    A dictionary that supports dot notation and returns None on missing keys.
    """
    def __getattr__(self, key: str):
        try:
            v: Optional[Any] = self[key]
            return v
        except KeyError:
            return None
One final note, mypy complains about using 'Self' as a type hint, thus the # type: ignore, but I didn't find any solution to this
Thank You!
Edit: "why are You so desperate for intellisense"
"why don't You just copy Your class as a TypedDict to use that for intellisense"
Very disappointed by these comments. If You want to throw ideas around that's fine, but don't undermine the question and offer answers to questions I didn't ask
I have big models and big root_validators. Maybe I can make other design choices to reduce their size, but the only problem I have right now is that mypy throws a "dotdict doesn't have attribute " error on EACH ACCESS, and I don't wanna add a # type: ignore comment on each access.
I noticed that if I don't suppress the error mypy throws because I type 'values' as Self, then it doesn't throw the "dotdict attribute" errors, so I'm just gonna live with that mypy error (once per root_validator) and keep typing my values as Self. Apparently it'll solve itself in the next release (as per one of the comments).
Thanks for nothing
As I already mentioned in a comment, you don't control the type of values. It will always be a dictionary. But if you are really that desperate for those IntelliSense auto-suggestions, you can always create your own TypedDict to mirror the actual model and annotate values with that.
from pydantic import BaseModel, root_validator
from typing import TypedDict

class DictA(TypedDict):
    a: int
    b: str

class A(BaseModel):
    a: int
    b: str

    @root_validator
    def validate_stuff(cls, values: DictA) -> DictA:
        ...
        return values
If you then try doing something like values["a"]. inside the method, your IDE should give you a selection of int methods and attributes. Note that if your validator is pre=True or you are allowing arbitrary extra values, then that annotation is still a lie and the dictionary can contain whatever.
But again, I don't see the point of this. You can always just selectively cast the values you are actually using in the validator in intermediary variables, if you are doing complex stuff with them. There should hardly be a need to have your IDE know the types of all the fields.
from pydantic import BaseModel, root_validator
from typing import Any, cast

class A(BaseModel):
    a: int
    b: str

    @root_validator
    def validate_stuff(cls, values: dict[str, Any]) -> dict[str, Any]:
        val_a, val_b = cast(int, values["a"]), cast(str, values["b"])
        ...
        return values
But even that seems over the top. I cannot imagine a validator being so complex that you really need that.

Why are attributes defined outside __init__ in popular packages like SQLAlchemy or Pydantic?

I'm modifying an app, trying to use Pydantic for my application models and SQLAlchemy for my database models.
I have existing classes, where I defined attributes inside the __init__ method as I was taught to do:
import pandas as pd

class Measure:
    def __init__(
        self,
        t_received: int,
        mac_address: str,
        data: pd.DataFrame,
        battery_V: float = 0
    ):
        self.t_received = t_received
        self.mac_address = mac_address
        self.data = data
        self.battery_V = battery_V
In both Pydantic and SQLAlchemy, following the docs, I have to define those attributes outside the __init__ method, for example in Pydantic:
import pandas as pd
import pydantic

class Measure(pydantic.BaseModel):
    t_received: int
    mac_address: str
    data: pd.DataFrame
    battery_V: float
Why is it the case? Isn't this bad practice? Is there any impact on other methods (classmethods, staticmethods, properties ...) of that class?
Note that this is also very unhandy because when I instantiate an object of that class, I don't get suggestions on what parameters are expected by the constructor!
Defining attributes of a class in the class namespace directly is totally acceptable and is not special per se for the packages you mentioned. Since the class namespace is (among other things) essentially a blueprint for instances of that class, defining attributes there can actually be useful, when you want to e.g. provide all public attributes with type annotations in a single place in a consistent manner.
Consider also that a public attribute does not necessarily need to be reflected by a parameter in the constructor of the class. For example, this is entirely reasonable:
class Foo:
    a: list[int]
    b: str

    def __init__(self, b: str) -> None:
        self.a = []
        self.b = b
In other words, just because something is a public attribute, that does not mean it should have to be provided by the user upon initialization. To say nothing of protected/private attributes.
What is special about Pydantic (to take your example), is that the metaclass of BaseModel as well as the class itself does a whole lot of magic with the attributes defined in the class namespace. Pydantic refers to a model's typical attributes as "fields" and one bit of magic allows special checks to be done during initialization based on those fields you defined in the class namespace. For example, the constructor must receive keyword arguments that correspond to the non-optional fields you defined.
from pydantic import BaseModel

class MyModel(BaseModel):
    field_a: str
    field_b: int = 1

obj = MyModel(
    field_a="spam",  # required
    field_b=2,       # optional
    field_c=3.14,    # unexpected/ignored
)
If I were to omit field_a during construction of a MyModel instance, an error would be raised. Likewise, if I had tried to pass field_b="eggs", an error would be raised.
So the fact that you don't write your own __init__ method is a feature Pydantic provides you. You only define the fields and an appropriate constructor is "magically" there for you already.
As for the drawback you mentioned, where you don't get any auto-suggestions, that is true by default for all IDEs. Static type checkers cannot make sense of that dynamic constructor and simply infer which arguments are expected. Currently this is solved via extensions, such as the mypy plugin and the PyCharm plugin. Maybe soon the @dataclass_transform decorator from PEP 681 will standardize this for similar packages and thus improve support by static type checkers.
It is also worth noting that even the standard library's dataclasses only work via special extensions in type checkers.
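For illustration, here is a minimal sketch of what @dataclass_transform enables, assuming a hypothetical ModelMeta metaclass that would generate an __init__ from the annotations (the actual Pydantic internals differ):

from typing_extensions import dataclass_transform

@dataclass_transform()
class ModelMeta(type):
    # Hypothetical metaclass: a real implementation would generate an
    # __init__ from the class annotations at class-creation time.
    pass

class Base(metaclass=ModelMeta):
    pass

class Measure(Base):
    t_received: int
    mac_address: str

# A type checker that understands @dataclass_transform now suggests
# Measure(t_received=..., mac_address=...) at the call site.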
To your other question, there is obviously some impact on methods of such classes (by design), though the specifics are not always obvious. You should of course not simply write your own __init__ method without being careful to call the superclass' __init__ properly inside it. Also, @property setters currently don't work as you would expect (though it is debatable whether it even makes sense to use properties on Pydantic models).
To wrap up, this approach is not only not bad practice, it is a great idea to reduce boilerplate code and it is extremely common these days, as evidenced by the fact that hugely popular and established packages (like the aforementioned Pydantic, as well as e.g. SQLAlchemy, Django and others) use this pattern to a certain extent.
Pydantic has its own (rewriting) magic, but SQLAlchemy is a bit easier to explain.
A SQLAlchemy model looks like this:

>>> from sqlalchemy import Column, Integer, String
>>> class User(Base):
...
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
Column, Integer and String are descriptors. A descriptor is a class that overrides the __get__ and __set__ methods. In practice, this means the class can control how data is accessed and stored.
For example this assignment would now use the __set__ method from Column:
class User(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String)

user = User()
user.name = 'John'
This is effectively User.name.__set__(user, 'John'): because of the descriptor protocol, Python finds a __set__ method on the Column object and uses that instead of a plain instance attribute assignment. In a simplified version, the Column looks something like this:
class Column:
    def __init__(self, field=""):
        self.field = field

    def __get__(self, obj, type):
        return obj.__dict__.get(self.field)

    def __set__(self, obj, val):
        if validate_field(val):
            obj.__dict__[self.field] = val
        else:
            print('not a valid value')
(This is similar to using @property. A descriptor is a re-usable @property.)
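For comparison, a rough sketch of the same idea written with @property (validate_field is assumed from the simplified Column above); the property is tied to a single attribute, while the descriptor can be reused for many:

class User:
    def __init__(self):
        self._name = None

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, val):
        if validate_field(val):
            self._name = val
        else:
            print('not a valid value')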

Changing frozen dataclass to support dynamic fields

I have a Python dataclass that looks like
@dataclass(frozen=True)
class BaseDataclass:
    immutable_arg_1 = field1
    immutable_arg_2 = field2
It's preferable not to have to change the base class much, since it's currently used in a bunch of places in the codebase, but now there's a use case where a dynamically constructed dataclass is much more convenient. Is there any way of constructing a new dataclass that extends this one with some dynamically chosen arguments (the subclass itself may be dynamically defined)?
def dataclass_factory(kwargs: Dict) -> BaseDataclass:
    @dataclass(frozen=True)
    class DerivedDataclass(BaseDataclass):
        **kwargs  # not valid python syntax here
    return DerivedDataclass(**kwargs)
so that I can do something like
new_dataclass = dataclass_factory({'immutable_arg_3': field3, 'immutable_arg_4': field4})
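The question is left open above. For illustration only, one possible approach (a sketch, assuming the base fields have concrete types and that the extra field types can be inferred from the passed values) is dataclasses.make_dataclass, which builds a frozen subclass at runtime:

from dataclasses import dataclass, fields, make_dataclass

@dataclass(frozen=True)
class BaseDataclass:
    immutable_arg_1: int
    immutable_arg_2: str

def dataclass_factory(kwargs: dict) -> BaseDataclass:
    base_names = {f.name for f in fields(BaseDataclass)}
    # Extra fields are any keys not already defined on the base class;
    # their types are inferred from the values passed in.
    extra = [(name, type(value)) for name, value in kwargs.items()
             if name not in base_names]
    DerivedDataclass = make_dataclass(
        'DerivedDataclass', extra, bases=(BaseDataclass,), frozen=True)
    return DerivedDataclass(**kwargs)

new_dataclass = dataclass_factory(
    {'immutable_arg_1': 1, 'immutable_arg_2': 'x', 'immutable_arg_3': 3.5})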

Create a pydantic.BaseModel definition with external class or dictionary

I have a (dynamic) definition of a simple class, like so:
class Simple:
    val: int = 1
I intend to use this definition to build a pydantic.BaseModel, so it can be defined from the Simple class; basically doing the following, but via type, inside a metaclass structure from which the Simple class is retrieved.
from pydantic import BaseModel

class SimpleModel(Simple, BaseModel):
    pass

# Actual ways tried:
SimpleModel = type('SimpleModel', (Simple, BaseModel), {})
# or
SimpleModel = type('SimpleModel', (BaseModel,), Simple.__annotations__)
However, these approaches did not return a model class with the fields from the Simple class.
I understand that the BaseModel already uses a rather complex metaclass under the hood, however, my intended implementation is also under a metaclass, where I intend to dynamically transfer the Simple class into a BaseModel from pydantic.
Your suggestions will be kindly appreciated.
I managed to get this working by first casting my Simple class to a dataclass from pydantic, then getting a pydantic model from it.
I am not an expert in pydantic, so would not mind your views on the approach.
from pydantic.dataclasses import dataclass
SimpleModel = dataclass(Simple).__pydantic_model__
The trouble I did find, however (same with the answer provided by @jsbueno), is that when declaring an annotation with a data type like pathlib.Path (as an example) on a BaseModel directly, the string value provided gets coerced to the annotated type. But with my approach or @jsbueno's, the data type remains as originally provided (no coercion).
You can simply call type, passing a dictionary made from Simple's __dict__ attribute - that will contain your fields' default values and the __annotations__ attribute, which is enough information for Pydantic to do its thing.
I would just take the extra step of deleting the __weakref__ attribute that is created by default in the plain Simple class before doing that - to avoid it pointing to the wrong class.
from pydantic import BaseModel

class Simple:
    val: int = 1

new_namespace = dict(Simple.__dict__)  # copies the class dictproxy into a plain dictionary
del new_namespace["__weakref__"]

SimpleModel = type("SimpleModel", (BaseModel,), new_namespace)
and we have
In [58]: SimpleModel.schema()
Out[58]:
{'title': 'SimpleModel',
 'type': 'object',
 'properties': {'val': {'title': 'Val',
   'default': 1,
   'type': 'integer'}}}
That works - but since Pydantic is complex, to make it more future-proof it might be better to use the namespace object supplied by Pydantic's metaclass instead of a plain dictionary. The formal way to do that is by using the helper functions in the types module:
import types
from pydantic import BaseModel

class Simple:
    val: int = 1

SimpleModel = types.new_class(
    "SimpleModel",
    (BaseModel,),
    exec_body=lambda ns: ns.update(
        {key: val for key, val in Simple.__dict__.items()
         if not key.startswith("_")}
    )
)
The new_class call computes the appropriate metaclass and passes the correct namespace object to the callback given as the exec_body argument. There, we just fill it with the contents of the __dict__ of your dynamic class.
Here, I opted to update the namespace and filter out all "_" names in a single line, but you can define the function passed to exec_body as a full multiline function and filter the contents more carefully.

Can a method in a python class be annotated with a type that is defined by a subclass?

I have a superclass that has a method which is shared by its subclasses. However, this method should return an object with a type that is defined on the subclass. I'd like the return type for the method to be statically annotated (not a dynamic type) so code using the subclasses can benefit from mypy type checking on the return values. But I don't want to have to redefine the common method on the subclass just to provide its type annotation. Is this possible with python type annotations and mypy?
Something like this:
from typing import Type

class AbstractModel:
    pass

class OrderModel(AbstractModel):
    def do_order_stuff(self):
        pass

class AbstractRepository:
    model: Type[AbstractModel]

    def get(self) -> model:
        return self.model()

class OrderRepository(AbstractRepository):
    model = OrderModel

repo = OrderRepository()
order = repo.get()

# Type checkers (like mypy) should recognize that this is valid
order.do_order_stuff()

# Type checkers should complain about this, because `OrderModel`
# does not define `foo`
order.foo()
The tricky move here is that get() is defined on the superclass AbstractRepository, which doesn't yet know the type of model. (And the -> model annotation fails, since the value of model hasn't been specified yet).
The value of model is specified by the subclass, but the subclass doesn't (re)define get() in order to provide the annotation. It seems like this should be statically analyzable; though it's a little tricky, since it would require the static analyzer to trace the model reference from the superclass to the subclass.
Any way to accomplish both a shared superclass implementation and a precise subclass return type?
Define AbstractRepository as a generic class.
from typing import TypeVar, Generic, Type, ClassVar

T = TypeVar('T')

class AbstractRepository(Generic[T]):
    model: ClassVar[Type[T]]

    @classmethod
    def get(cls) -> T:
        return cls.model()
(get only makes use of a class attribute, so can--and arguably should--be a class method.)
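For completeness, a sketch of how the subclass from the question would plug into this; parameterizing the base as AbstractRepository[OrderModel] is what lets the checker infer the return type of get:

class OrderModel:
    def do_order_stuff(self) -> None:
        pass

class OrderRepository(AbstractRepository[OrderModel]):
    model = OrderModel

order = OrderRepository.get()
order.do_order_stuff()  # accepted: the checker infers `order` as OrderModel
order.foo()             # rejected: OrderModel has no attribute `foo`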
