Trying to set a superclass field in a subclass using validator - python

I am trying to set a superclass field in a subclass using a validator as follows:
Approach 1
from typing import List
from pydantic import BaseModel, validator, root_validator

class ClassSuper(BaseModel):
    field1: int = 0

class ClassSub(ClassSuper):
    field2: List[int]

    @validator('field1')
    def validate_field1(cls, v, values):
        return len(values["field2"])

sub = ClassSub(field2=[1, 2, 3])
print(sub.field1)  # It prints 0, but expected it to print 3
If I run the code above it prints 0, but I expected it to print 3 (which is basically len(field2)). However, if I use @root_validator() instead, I get the expected result.
Approach 2
from typing import List
from pydantic import BaseModel, validator, root_validator

class ClassSuper(BaseModel):
    field1: int = 0

class ClassSub(ClassSuper):
    field2: List[int]

    @root_validator()
    def validate_field1(cls, values):
        values["field1"] = len(values["field2"])
        return values

sub = ClassSub(field2=[1, 2, 3])
print(sub.field1)  # This prints 3, as expected
I am new to pydantic and a bit puzzled about what I am doing wrong with Approach 1. Thank you for your help.

Approach 1 does not work because, by default, validators for a field are not called when no value is supplied for that field (see docs).
Your validate_field1 is never even called. If you add always=True to your @validator, the method is called even if you don't provide a value for field1.
However, if you try that, you'll see that it still does not work; instead it raises an error about the key "field2" not being present in values.
This in turn is because fields are validated in the order they are defined. Here field1 is defined (in the superclass) before field2, so field2 has not yet been validated when validate_field1 is called. And values only contains previously validated fields (see docs). Thus, at the time validate_field1 is called, values is simply an empty dictionary.
Using @root_validator is the correct approach here because it receives the entire model's data, regardless of whether field values were supplied explicitly or by default.
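To make the ordering issue concrete, here is a toy model in plain Python. It is not pydantic's code, only a sketch of the idea that each per-field validator sees just the fields processed before it; validate_in_order, field1_validator and seen are made-up names for illustration, and the run mimics the always=True case.

```python
# Toy model of per-field validation order. This is NOT pydantic's code.
def validate_in_order(field_order, defaults, supplied, validators):
    """Validate fields one by one; each validator sees only earlier fields."""
    values = {}
    for name in field_order:
        value = supplied.get(name, defaults.get(name))
        if name in validators:
            # `values` holds only the fields processed so far.
            value = validators[name](value, dict(values))
        values[name] = value
    return values


seen = {}


def field1_validator(value, values):
    seen["at_field1"] = values  # record what was visible at this point
    return len(values.get("field2", []))


result = validate_in_order(
    field_order=["field1", "field2"],  # the superclass field comes first
    defaults={"field1": 0},
    supplied={"field2": [1, 2, 3]},
    validators={"field1": field1_validator},
)
print(seen["at_field1"])  # {}
print(result["field1"])   # 0
```

A validator that runs after all fields have been collected, which is what the root validator does, would see field2 and could compute the length.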
And just as a side note: If you don't need to specify any parameters for it, you can use @root_validator without the parentheses.
And as another side note: If you are using Python 3.9+, you can use the regular list class as the type annotation. (See standard generic alias types) That means field2: list[int] without the need for typing.List.
Hope this helps.

Related

Automatically aligning Python dataclasses with comments in Go struct style

Python dataclasses are really great. They let you define classes in a very beautiful way.
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
Moreover, lots of useful tools reuse Python annotations in the same way and let you define classes (which are more like structures in other languages) in the same style. One example is Pydantic.
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
I myself use pydantic quite a lot these days. Look at an example from my recent practice:
class G6A(BaseModel):
    transaction_id: items.TransactionReference # Transaction Id
    mpan_core: items.MPAN # MPAN Core
    registration_date: items.CallistoDate # Registration Date
    action_required: G6AAction # Action Required
I parse a very inconvenient API, which is why I want to leave a comment on every line. That way the model works as self-documentation. The problem is that, at least to me, this looks very ugly. It's hard to look through the lines, because they look like a table with broken columns. Let's try to fix this with careful alignment:
class G6A(BaseModel):
    transaction_id:    items.TransactionReference  # Transaction Id
    mpan_core:         items.MPAN                  # MPAN Core
    registration_date: items.CallistoDate          # Registration Date
    action_required:   G6AAction                   # Action Required
I'm sure that this is much more readable. This way we define the structure like an actual table, where the first column is the attribute name, the second column is the attribute type and the last one is the comment. It is actually inspired by Go structs:
type T struct {
    name  string // name of the object
    value int    // its value
}
So my question is: are there any automatic tools (linters or formatters) that will reformat dataclasses/pydantic models the way I described above? I looked through autopep8 and black and found nothing. I also googled and still found nothing. Any ideas how to achieve this with existing tools?
I think yapf has something like that for comments. Check the SPACES_BEFORE_COMMENT "knob":
The number of spaces required before a trailing comment. This can be a single value (representing the number of spaces before each trailing comment) or list of values (representing alignment column values; trailing comments within a block will be aligned to the first column value that is greater than the maximum line length within the block).
.style.yapf:
[style]
based_on_style = pep8
spaces_before_comment = 10,20,30,40,50,60,70,80
Configures alignment columns: 10, 20, ..., 80.
foo.py:
class G6A(BaseModel):
    transaction_id: items.TransactionReference # Transaction Id
    mpan_core: items.MPAN # MPAN Core
    registration_date: items.CallistoDate # Registration Date
    action_required: G6AAction # Action Required
Output of yapf foo.py:
class G6A(BaseModel):
    transaction_id: items.TransactionReference    # Transaction Id
    mpan_core: items.MPAN                         # MPAN Core
    registration_date: items.CallistoDate         # Registration Date
    action_required: G6AAction                    # Action Required
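yapf aligns only the trailing comments, not the type column. If the full Go-style table is wanted, a small hand-rolled script might be enough. The following is only a sketch, not an existing tool: align_fields and FIELD_RE are hypothetical names, and it assumes simple lines of the form `name: Type  # comment` with no default values (any other line is passed through unchanged).

```python
import re

# Matches simple annotated fields such as "    name: Type  # comment".
# Lines with default values ("x: int = 0") intentionally do not match.
FIELD_RE = re.compile(r"^(\s*)(\w+):\s*([^#=]+?)\s*(?:#\s*(.*))?$")


def align_fields(lines):
    """Align names, types and trailing comments of simple field lines."""
    parsed = [FIELD_RE.match(line) for line in lines]
    rows = [m.groups() for m in parsed if m]
    if not rows:
        return list(lines)
    name_width = max(len(name) for _, name, _, _ in rows) + 1  # room for ":"
    type_width = max(len(type_) for _, _, type_, _ in rows)
    out = []
    for line, m in zip(lines, parsed):
        if not m:
            out.append(line)  # leave non-field lines untouched
            continue
        indent, name, type_, comment = m.groups()
        label = name + ":"
        text = f"{indent}{label:<{name_width}} {type_:<{type_width}}"
        if comment:
            text += f"  # {comment}"
        out.append(text.rstrip())
    return out


source = [
    "class G6A(BaseModel):",
    "    transaction_id: items.TransactionReference # Transaction Id",
    "    mpan_core: items.MPAN # MPAN Core",
]
aligned = align_fields(source)
print("\n".join(aligned))
```

Note that formatters such as black would collapse the extra spaces again, so a script like this only makes sense if it runs last or if those files are excluded from the other formatter.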

Dynamically creating get request query parameters based on Pydantic schema

I know that you can create get requests like this:
@router.get("/findStuff")
def get_stuff(a: int = None, b: str = None):
    return {'a': a, 'b': b}
But is there a way to create the query parameters dynamically from, let's say, a Pydantic schema? I've tried the code below, and although it does seem to create the query parameters in the OpenAPI doc, it's unable to process them, returning a 422 (Unprocessable Entity). Has anyone tried to do something similar? Being able to specify an object containing the query parameters lets me create GET requests dynamically for any arbitrary object with primitive fields. I did this in Flask with webargs, but am not sure what I can do within FastAPI.
class MySchema(BaseModel):
    a: int = None
    b: str = None

@router.get("/findStuff")
def get_stuff(inputs = Depends(MySchema)):
    return inputs
This turned out to be user error. There was an endpoint with a path parameter
@router.get("/{id}")
def get_stuff_by_id(id: int):
    return id
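that was declared above the /findStuff endpoint, so it got clobbered: routes are matched in declaration order, so a request to /findStuff was handled by /{id}, and "findStuff" failed validation as an int, hence the 422.
The solution was simply to move the /findStuff block above the block with the path parameter.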
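For anyone hitting the same thing, the effect of declaration order can be reproduced with a toy first-match router in plain Python. This is only an illustrative model, not FastAPI's implementation; dispatch and the two route tuples are made-up names, and the 422 and 404 return values merely stand in for the HTTP status codes.

```python
import re


# Toy first-match router; it models declaration-order matching only.
def dispatch(routes, path):
    for pattern, handler in routes:
        match = re.fullmatch(pattern, path)
        if match:
            return handler(**match.groupdict())
    return 404


def get_stuff_by_id(id):
    try:
        return int(id)
    except ValueError:
        return 422  # the path segment is not a valid int


def get_stuff():
    return "stuff"


by_id = (r"/(?P<id>[^/]+)", get_stuff_by_id)
find_stuff = (r"/findStuff", get_stuff)

print(dispatch([by_id, find_stuff], "/findStuff"))  # 422
print(dispatch([find_stuff, by_id], "/findStuff"))  # stuff
print(dispatch([find_stuff, by_id], "/7"))          # 7
```

With the fixed path declared first, the catch-all path parameter is reached only by paths that do not match anything more specific.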

Check for extra keys in marshmallow.Schema.dump()

I want to be able to take some Python object (more precisely, a dataclass) and dump it to its dict representation using a schema. Let me give you an example:
from marshmallow import Schema, fields
import dataclasses

@dataclasses.dataclass
class Foo:
    x: int
    y: int
    z: int

class FooSchema(Schema):
    x = fields.Int()
    y = fields.Int()

FooSchema().dump(Foo(1, 2, 3))
As you can see, the schema differs from the Foo definition. I want to somehow detect this when dumping, so that I would get a ValidationError explaining that there's an extra field z. It doesn't really have to be .dump(); I looked at .load() and .validate(), but only .dump() seems to accept objects rather than just dicts.
Is there a way to do this in marshmallow? For now, when I do this dump, I just get a dictionary, {"x": 1, "y": 2}, without z of course, but no errors whatsoever. I would want the same behavior when a key is missing from the dumped object (as if z were in the schema but not in Foo). This would basically serve as a sanity check of changes made to the classes themselves. If it's not possible in marshmallow, maybe you know some library or technique that makes it possible?
So I had this problem today and did some digging. Based on https://github.com/marshmallow-code/marshmallow/issues/1545, it's something people are considering, but the current implementation of dump iterates through the fields listed in the schema definition, so this won't work.
The best I could get to work was:
from marshmallow import EXCLUDE, INCLUDE, RAISE, Schema, fields

class FooSchema(Schema):
    class Meta:
        unknown = INCLUDE

    x = fields.Int()
    y = fields.Int()
Which at least sort of displays as a dict.
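Since dump itself does not complain, one possible workaround is to compare the field names yourself before dumping. Below is a minimal sketch using only the standard library; diff_field_names and check_field_names are hypothetical helpers, not part of marshmallow. It assumes the schema's field names are passed in as a plain iterable (with marshmallow they could presumably be taken from the keys of FooSchema().fields) and that the names match the dataclass attributes one to one, i.e. no data_key or attribute remapping.

```python
import dataclasses


def diff_field_names(obj, schema_field_names):
    """Return (names missing from the schema, names missing from the object)."""
    object_names = {f.name for f in dataclasses.fields(obj)}
    schema_names = set(schema_field_names)
    extra = sorted(object_names - schema_names)
    missing = sorted(schema_names - object_names)
    return extra, missing


def check_field_names(obj, schema_field_names):
    """Raise ValueError if the dataclass and the schema disagree."""
    extra, missing = diff_field_names(obj, schema_field_names)
    if extra or missing:
        raise ValueError(f"not in schema: {extra}; not on object: {missing}")


@dataclasses.dataclass
class Foo:
    x: int
    y: int
    z: int


try:
    check_field_names(Foo(1, 2, 3), ["x", "y"])
except ValueError as exc:
    print(exc)  # not in schema: ['z']; not on object: []
```

Calling check_field_names right before dump (or once in a unit test per schema/class pair) gives the sanity check in both directions.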

Python dataclasses_json: can I store many references to one object?

I want to use the @dataclass_json decorator to store my @dataclass instances.
I also want to have many references to one object in those instances, and I want this reference structure to be preserved when saving (so that I could modify one settings object and the modifications would apply to every object that uses those settings).
This is easy while the dataclass objects live in memory, but when I store them as JSON, a copy of the instance is saved instead of a reference to it. Can I somehow deal with this?
P.S. Here's my code example:
from dataclasses import dataclass
from dataclasses_json import dataclass_json
from typing import List

@dataclass_json
@dataclass
class RadarSettings:
    freq: float = 10e9
    prf: float = 1e-3

@dataclass_json
@dataclass
class Radar:
    name: str = ""
    preset_settings: RadarSettings = None  # Here should be references to some boilerplate preset settings for many radars
    custom_settings: RadarSettings = None  # And here should be the custom settings to this current radar

@dataclass_json
@dataclass
class RadarScene:
    name: str = ""
    radars: List["Radar"] = None

preset = RadarSettings()
radar1 = Radar(name="mega search mode radar from hell", preset_settings=preset)
radar2 = Radar(name="satanic sensor array radar", preset_settings=preset)
# The preset_settings is one same object for both radars! If I modify it, the modifications will be applied to both radars
print(id(radar1.preset_settings), id(radar2.preset_settings))

scene_to_save = RadarScene(name="Infernal scene", radars=[radar1, radar2])
loaded_scene = RadarScene.from_json(scene_to_save.to_json())
print(id(loaded_scene.radars[0].preset_settings), id(loaded_scene.radars[1].preset_settings))
# Alas! Here will be two instances of preset_settings saved. I need one =(
The behavior you have described is expected. When you save your data in JSON format, you get a plain-text string representation of the data, which cannot carry object identity.
You can fix the issue in at least a couple of ways.
Method 1.
Load the RadarScene data, create preset = RadarSettings(), iterate over all Radars in the RadarScene and update the preset_settings attribute: radar.preset_settings = preset. This can be encapsulated in a method of the RadarScene class so you can call it right after loading the data.
Method 2.
Create a new singleton class RadarSettingsDefault that inherits from RadarSettings and modify the Radar class: preset_settings: RadarSettingsDefault = None.
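A rough sketch of Method 1 follows. To keep it runnable without dataclasses_json, it uses plain dataclasses with a hand-written scene_from_dict loader standing in for RadarScene.from_json(); relink_presets and scene_from_dict are illustrative names, and the classes are trimmed-down versions of the ones in the question.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class RadarSettings:
    freq: float = 10e9
    prf: float = 1e-3


@dataclass
class Radar:
    name: str = ""
    preset_settings: Optional[RadarSettings] = None


@dataclass
class RadarScene:
    name: str = ""
    radars: List[Radar] = field(default_factory=list)

    def relink_presets(self, preset):
        """Point every radar at the same preset object (Method 1)."""
        for radar in self.radars:
            radar.preset_settings = preset


def scene_from_dict(data):
    # Hand-written loader standing in for RadarScene.from_json().
    radars = [
        Radar(r["name"], RadarSettings(**r["preset_settings"]))
        for r in data["radars"]
    ]
    return RadarScene(data["name"], radars)


preset = RadarSettings()
scene = RadarScene("Infernal scene", [Radar("a", preset), Radar("b", preset)])
loaded = scene_from_dict(json.loads(json.dumps(asdict(scene))))

# After the round trip each radar has its own copy of the settings.
shared_before = loaded.radars[0].preset_settings is loaded.radars[1].preset_settings
print(shared_before)  # False

# Reuse the first loaded copy as the single shared preset.
loaded.relink_presets(loaded.radars[0].preset_settings)
print(loaded.radars[0].preset_settings is loaded.radars[1].preset_settings)  # True
```

Reusing the first loaded copy, rather than a fresh RadarSettings(), keeps whatever values were stored in the file; which one is appropriate depends on whether the presets are meant to be persisted at all.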

Filter by an object in SQLAlchemy

I have a declared model where the table stores a "raw" path identifier of an object. I then have a @hybrid_property which allows directly getting and setting the object which is identified by this field (which is not another declarative model). Is there a way to query directly on this high level?
I can do this:
session.query(Member).filter_by(program_raw=my_program.raw)
I want to be able to do this:
session.query(Member).filter_by(program=my_program)
where my_program.raw == "path/to/a/program"
Member has a field program_raw and a property program which gets the correct Program instance and sets the appropriate program_raw value. Program has a simple raw field which identifies it uniquely. I can provide more code if necessary.
The problem is that currently, SQLAlchemy simply tries to pass the program instance as a parameter to the query, instead of its raw value. This results in an "Error binding parameter 0 - probably unsupported type." error.
Either SQLAlchemy needs to know that when comparing the program, it must use Member.program_raw and match that against the raw property of the parameter. Getting it to use Member.program_raw is done simply using @program.expression, but I can't figure out how to translate the Program parameter correctly (using a Comparator?), and/or
SQLAlchemy should know that when I filter by a Program instance, it should use the raw attribute.
My use-case is perhaps a bit abstract, but imagine I stored a serialized RGB value in the database and had a property with a Color class on the model. I want to filter by the Color class, and not have to deal with RGB values in my filters. The color class has no problems telling me its RGB value.
Figured it out by reading the source for relationship. The trick is to use a custom Comparator for the property, which knows how to compare two things. In my case it's as simple as:
from sqlalchemy.ext.hybrid import Comparator, hybrid_property

class ProgramComparator(Comparator):
    def __eq__(self, other):
        # Should check for case of `other is None`
        return self.__clause_element__() == other.raw

class Member(Base):
    # ...
    program_raw = Column(String(80), index=True)

    @hybrid_property
    def program(self):
        return Program(self.program_raw)

    @program.comparator
    def program(cls):
        # program_raw becomes __clause_element__ in the Comparator.
        return ProgramComparator(cls.program_raw)

    @program.setter
    def program(self, value):
        self.program_raw = value.raw
Note: In my case, Program('abc') == Program('abc') (I've overridden __new__), so I can just return a "new" Program all the time. For other cases, the instance should probably be lazily created and stored in the Member instance.
