I am creating a configuration management system in Python and am exploring the options: Hydra, pydantic, or both. I get a little confused over when to use MISSING versus just leaving a field blank/optional. I will use an OmegaConf example here, since Hydra is built on top of it.
from dataclasses import dataclass
from omegaconf import MISSING

@dataclass
class User:
    # A simple user class with two missing fields
    name: str = MISSING
    height: Height = MISSING  # Height is defined elsewhere
The docs say that a MISSING field is converted to YAML's ??? equivalent. Can I just leave it blank?
I would like to know the difference between classes built normally in python and those built with the Pydantic lib, for example:
E.g., a normal class:
class Node:
    def __init__(self, chave=None, esquerda=None, direita=None):
        self.chave = chave
        self.esquerda = esquerda
        self.direita = direita
E.g., a pydantic model:
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
There are a few main differences.
Firstly, purpose. Pydantic models are designed for:
Data validation and settings management using python type annotations.
pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid.
When you use type annotations you receive a lot of validators and some useful methods out of the box.
As Ahmed and John say, in your example you can't assign “hello” to id in a BaseModel (pydantic) because id is typed as int. But you can pass the string “1” (it must represent an integer, not a float) and it will be converted to int. In this case:
pydantic uses int(v) to coerce types to an int; see this warning on loss of information during data conversion
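A minimal sketch of that behaviour, assuming pydantic's default (non-strict) validation. The trimmed-down User model is only for illustration.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = "John Doe"


# A numeric string is coerced to int by default.
user = User(id="1")

# A non-numeric string cannot be coerced and is rejected.
try:
    User(id="hello")
    rejected = False
except ValidationError:
    rejected = True
```

A plain Python class with the same attributes would accept both values without complaint, since type hints are not enforced at runtime there.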
Pydantic models also allow you to use many more types than the standard Python types, such as URLs. This means you can easily validate more kinds of data.
You can easily create complex models using composition.
Pydantic has some integration with ORMs (see the docs).
There are a lot of other features, much more than I can describe in a single answer. I strongly recommend reading the documentation, it is very clear and useful.
Pydantic models are very useful, for example, when building microservices, where you can share your interfaces as pydantic models. All models can also easily generate a JSON schema. See: Schema, exporting models.
Pydantic is also a big part of FastAPI, a Python web framework that is growing in popularity.
A Pydantic model with a required field that has an alias is created as follows:
class MedicalFolderUpdate(RWModel):
    id: str = Field(alias='_id')
    university: Optional[str]
How do I give the optional field university the alias 'school', the same way id has one?
The Pydantic website does not document how to combine typing's Optional with a Field default; Optional only appears among the allowed types:
Optional[x] is simply shorthand for Union[x, None]; see Unions below for more detail on parsing and validation and Required Fields for details about required fields that can receive None as a value.
For that, you have to use field customization, as in this example:
class Figure(BaseModel):
    name: str = Field(alias='Name')
    edges: Optional[str] = Field(default=None, alias='Edges')
Without the default value it breaks, because Optional alone does not override the fact that the field is required; it still needs a default value. This is the solution I used to get around the problem while using Pydantic with FastAPI to manage MongoDB resources.
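Applied to the model from the question, the same pattern might look like the sketch below. BaseModel stands in for RWModel, whose definition is not shown in the question, and the sample values are made up.

```python
from typing import Optional

from pydantic import BaseModel, Field


class MedicalFolderUpdate(BaseModel):  # BaseModel stands in for RWModel here
    id: str = Field(alias="_id")
    university: Optional[str] = Field(default=None, alias="school")


# Input uses the alias names; attributes use the field names.
with_school = MedicalFolderUpdate(**{"_id": "abc123", "school": "MIT"})
without_school = MedicalFolderUpdate(**{"_id": "abc123"})
```

Here with_school.university holds the value passed as "school", and without_school.university falls back to None.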
I'm asking myself a question about clean architecture.
Let's imagine a small API that allows us to create and get a user using that type of architecture. This app has two endpoints and stores its data in a database.
Let's say that we have a DB model that looks like this:
class User:
    id: int
    firstname: str
    lastname: str
Firstly, the GET endpoint will use the usecase GetUser and use a User entity. This entity will look like:
class User:
    id: int
    firstname: str
    lastname: str
My question concerns the POST endpoint.
The data passed in this endpoint is only the fields firstname and lastname, obviously.
Do I have to create another entity like the one below?
class UserRequest:
    firstname: str
    lastname: str
I find this unsatisfying because such an entity does not make sense from a business point of view.
Nevertheless, it seems a bit wobbly to make an entity "composite" such as:
class User:
    id: Optional[int]
    firstname: str
    lastname: str
A third option is to use a class inside the use-case file whose only purpose is to model the data coming from the POST request, i.e.:
class UserRequest:
    firstname: str
    lastname: str

class CreateUserUseCase:
    def __init__(self):
        ...

    def execute(self, request: UserRequest):
        ...
So the question is: according to clean architecture principles, what is the best way to model data coming from a POST request that is not a business entity?
Thanks a lot for your help, and don't hesitate to ask questions if my examples are not clear enough.
Stef.
It would be helpful to view multiple endpoints (use-cases) in the context of the same entity as the lifecycle of that entity, for example:
Creating (POST) a new user 'xyz' (writing to database)
Mutating (POST/PUT/PATCH) user 'xyz' (writing to database)
Querying (GET) user 'xyz' (reading from database)
Each of the above actions should involve the same business entity User:
Creating: the User entity is constructed inside the use case (application layer) from the UserRequest DTO (you have actually demonstrated exactly that), then passed to a repository object for persistence.
Mutating: the User entity is retrieved from the database (repository object), modified (application layer), and finally passed back to the repository object for persistence.
Querying: the User entity is retrieved from the database (repository object), passed back to the presentation layer, and finally translated into a response DTO.
One of the principles of CA is to keep DTOs in the presentation layer, where they are mapped to and from the input/output ports. The heart of CA is the domain entities, which are constructed either from input (represented by a request DTO) or from the database.
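The create and query flows described above could be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: InMemoryUserRepository, its next_id method, and GetUserUseCase's body are hypothetical stand-ins, and letting the repository hand out the next id is just one possible choice.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UserRequest:  # request DTO: only what the POST body carries
    firstname: str
    lastname: str


@dataclass
class User:  # domain entity
    id: int
    firstname: str
    lastname: str


class InMemoryUserRepository:
    """Hypothetical stand-in for a database-backed repository."""

    def __init__(self) -> None:
        self._users: Dict[int, User] = {}
        self._next_id = 1

    def next_id(self) -> int:
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def save(self, user: User) -> None:
        self._users[user.id] = user

    def get(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


class CreateUserUseCase:
    def __init__(self, repository: InMemoryUserRepository) -> None:
        self._repository = repository

    def execute(self, request: UserRequest) -> User:
        # Build the entity from the DTO, then hand it to the repository.
        user = User(
            id=self._repository.next_id(),
            firstname=request.firstname,
            lastname=request.lastname,
        )
        self._repository.save(user)
        return user


class GetUserUseCase:
    def __init__(self, repository: InMemoryUserRepository) -> None:
        self._repository = repository

    def execute(self, user_id: int) -> Optional[User]:
        return self._repository.get(user_id)


repository = InMemoryUserRepository()
created = CreateUserUseCase(repository).execute(UserRequest("Ada", "Lovelace"))
fetched = GetUserUseCase(repository).execute(created.id)
```

With this split the User entity always has an id, and UserRequest stays a plain input DTO rather than a second business entity.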
Okay, so pardon me if I don't make much sense. I face this 'ObjectId' object is not iterable whenever I run the collections.find() functions. Going through the answers here, I'm not sure where to start. I'm new to programming, please bear with me.
Every time I hit the route which is supposed to fetch me data from MongoDB, I get ValueError: [TypeError("'ObjectId' object is not iterable"), TypeError('vars() argument must have __dict__ attribute')].
Help
Exclude the "_id" from the output.
result = collection.find_one({'OpportunityID': oppid}, {'_id': 0})
I was having a similar problem to this myself. Not having seen your code I am guessing the traceback similarly traces the error to FastAPI/Starlette not being able to process the "_id" field - what you will therefore need to do is change the "_id" field in the results from an ObjectId to a string type and rename the field to "id" (without the underscore) on return to avoid incurring issues with Pydantic.
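One way to do that conversion is sketched below. The serialize_document helper is a hypothetical name, and FakeObjectId is only a stand-in so the snippet runs without pymongo installed; calling str() on a real bson.ObjectId gives its 24-character hex string in the same way.

```python
from typing import Any, Dict


def serialize_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a MongoDB document with "_id" replaced by a string "id"."""
    result = dict(doc)  # shallow copy so the caller's document is untouched
    if "_id" in result:
        result["id"] = str(result.pop("_id"))
    return result


class FakeObjectId:
    """Stand-in for bson.ObjectId so the sketch runs without pymongo."""

    def __init__(self, hex_value: str) -> None:
        self._hex_value = hex_value

    def __str__(self) -> str:
        return self._hex_value


raw = {"_id": FakeObjectId("507f1f77bcf86cd799439011"), "OpportunityID": 42}
clean = serialize_document(raw)
```

The resulting dictionary contains only JSON-friendly values, so it can be returned from a route or passed to a Pydantic model with an id: str field.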
First of all, if we had some examples of your code, this would be much easier. I can only assume that you are not mapping your MongoDb collection data to your Pydantic BaseModel correctly.
Read this:
MongoDB stores data as BSON. FastAPI encodes and decodes data as JSON strings. BSON has support for additional non-JSON-native data types, including ObjectId which can't be directly encoded as JSON. Because of this, we convert ObjectIds to strings before storing them as the _id.
I want to draw attention to the id field on this model. MongoDB uses _id, but in Python, underscores at the start of attributes have special meaning. If you have an attribute on your model that starts with an underscore, pydantic—the data validation framework used by FastAPI—will assume that it is a private variable, meaning you will not be able to assign it a value! To get around this, we name the field id but give it an alias of _id. You also need to set allow_population_by_field_name to True in the model's Config class.
Here is a working example:
First create the BaseModel:
from bson import ObjectId
from pydantic import BaseModel, Field

# Note: the hooks and Config options below are the pydantic v1 API.
class PyObjectId(ObjectId):
    """Custom type for reading MongoDB IDs."""

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if not ObjectId.is_valid(v):
            raise ValueError("Invalid object_id")
        return ObjectId(v)

    @classmethod
    def __modify_schema__(cls, field_schema):
        field_schema.update(type="string")

class Student(BaseModel):
    id: PyObjectId = Field(default_factory=PyObjectId, alias="_id")
    first_name: str
    last_name: str

    class Config:
        allow_population_by_field_name = True
        arbitrary_types_allowed = True
        json_encoders = {ObjectId: str}
Now just unpack everything:
from fastapi import HTTPException

# `collection` is an async MongoDB collection defined elsewhere.
async def get_student(student_id) -> Student:
    data = await collection.find_one({'_id': student_id})
    if data is None:
        raise HTTPException(status_code=404, detail='Student not found.')
    student: Student = Student(**data)
    return student
Use the response model inside the app decorator. Here is a sample:
from pydantic import BaseModel

class Todo(BaseModel):
    title: str
    details: str
main.py
from fastapi import HTTPException, status

# `app` and `fetch_one_todo` are defined elsewhere.
@app.get("/{title}", response_model=Todo)
async def get_todo(title: str):
    response = await fetch_one_todo(title)
    if not response:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail='not found')
    return response
Use db.collection.find({"_id": ObjectId("...")}), where the string is the document's 24-character hex id.
Here db.collection is the collection on your database object, and the id string goes in double quotes.
I was trying to iterate through all the documents and what worked for me was this solution https://github.com/tiangolo/fastapi/issues/1515#issuecomment-782835977
These lines just needed to be added after the child of ObjectID class. An example is given in the following link.
https://github.com/tiangolo/fastapi/issues/1515#issuecomment-782838556
I had this issue until I upgraded from MongoDB version 5.0.9 to version 6.0.0, so it seems MongoDB made some changes on their end to handle this, if you have the ability to upgrade. I ran into the issue when creating a test server, and creating a new test server on 6.0.0 fixed the error.