So, I'm building an API to interact with my personal wine label collection database.
From what I understand, a Pydantic model's purpose is to serve as a "validator" of the schema that is sent to the API. So, my Pydantic schema for adding a label is the following:
from pydantic import BaseModel
from typing import Optional

class WineLabels(BaseModel):
    name: Optional[str] = None
    type: Optional[str] = None
    year: Optional[int] = None
    grapes: Optional[str] = None
    country: Optional[str] = None
    region: Optional[str] = None
    price: Optional[float] = None
    id: Optional[str] = None
None of the fields is updated automatically. This matches the SQLAlchemy model, since I want to set all the fields manually.
So my question is: let's say I want to create one call to search by ID and another to search by name. I don't believe this schema should be applied there. Should I create another schema? Should I create one like this?:
class SearchWineLabel(WineLabels):
    id: str
Should a schema be created for each purpose that cannot be fulfilled by an already existing schema?
Sorry, but I can't understand the logic behind it.
Thanks!!
If you want to search by id or name, I'm not sure you even need a schema - one or more GET parameters would usually be enough in those cases (and is usually better semantically).
In any case, the schema would be written for what the endpoint is expected to receive, not by using a general schema that contains the field in some other way. Think of the schemas as the input/output definitions for given resources and endpoints.
You usually want to have different schemas for adding and updating (since adding will require certain fields to be present, while updating may allow null or a missing field in any location).
The Pydantic schemas let you express these differences without writing code, and they will be reflected in your generated API docs under /docs.
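For illustration, here is a minimal sketch of how that might look in FastAPI. The WineLabelCreate schema, the paths, and the field choices are my assumptions, not taken from your code:

from typing import Optional
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class WineLabelCreate(BaseModel):
    # creation requires these fields, unlike a hypothetical update schema
    name: str
    year: int
    price: Optional[float] = None

@app.post("/labels")
def create_label(label: WineLabelCreate):
    return label

@app.get("/labels")
def search_labels(id: Optional[str] = None, name: Optional[str] = None):
    # plain GET query parameters; no request schema needed for a simple search
    return {"id": id, "name": name}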
I would like to know the difference between classes built normally in Python and those built with the Pydantic library, for example:
e.g. normal:
class Node:
    def __init__(self, chave=None, esquerda=None, direita=None):
        self.chave = chave
        self.esquerda = esquerda
        self.direita = direita
e.g. Pydantic:
from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
There are a few main differences.
Firstly, purpose. Pydantic models are designed for, quoting the docs:
"Data validation and settings management using python type annotations. pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid."
When you use type annotations you get a lot of validators and some useful methods out of the box.
As Ahmed and John say, in your example you can't assign "hello" to id in a BaseModel (Pydantic), because you typed id as an int. But you can pass the string "1" (it must be an integer string, not a float one) and it will be coerced to an int. In this case:
pydantic uses int(v) to coerce types to an int; see this warning on loss of information during data conversion
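A quick sketch of that coercion behaviour, assuming Pydantic v1 defaults:

from pydantic import BaseModel, ValidationError

class Model(BaseModel):
    id: int

print(Model(id="1").id)  # the string "1" is coerced to the int 1

try:
    Model(id="hello")
except ValidationError as e:
    print(e)  # "value is not a valid integer"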
Pydantic models also allow you to use many more types than the standard Python types, like URLs and much more. This means you can easily validate more kinds of data.
You can easily create complex models using composition.
Pydantic also has some integration with ORMs: docs
There are a lot of other features, much more than I can describe in a single answer. I strongly recommend reading the documentation, it is very clear and useful.
Pydantic models are very useful, for example, in building microservices, where you can share your interfaces as Pydantic models. Also, all models can easily generate their JSON schema. See: Schema, exporting models.
Pydantic is also a big part of FastAPI, a Python web framework that is growing in popularity.
Python dataclasses are really great. They allow you to define classes in a very beautiful way.
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
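For instance, the generated __init__ and the method work as you would expect:

item = InventoryItem('widget', unit_price=3.0, quantity_on_hand=10)
print(item.total_cost())  # 30.0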
Moreover, lots of useful tools reuse Python annotations the same way and let you define classes (which are more like structures in other languages) the same way. One example is Pydantic.
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
I myself use Pydantic quite a lot these days. Look at an example from my recent practice:
class G6A(BaseModel):
    transaction_id: items.TransactionReference  # Transaction Id
    mpan_core: items.MPAN  # MPAN Core
    registration_date: items.CallistoDate  # Registration Date
    action_required: G6AAction  # Action Required
I parse a very inconvenient API, which is why I want to leave a comment on every line; that way the model works as self-documentation. The problem is that, at least for me, this looks very ugly. It's hard to scan the lines, because they look like a table with broken columns. Let's try to fix this with accurate alignment:
class G6A(BaseModel):
    transaction_id:    items.TransactionReference  # Transaction Id
    mpan_core:         items.MPAN                  # MPAN Core
    registration_date: items.CallistoDate          # Registration Date
    action_required:   G6AAction                   # Action Required
I'm sure it is much more readable this way. By doing so we define the structure like an actual table, where the first column is the attribute name, the second is the attribute type, and the last one is the comment. It is actually inspired by Go structs:
type T struct {
    name  string // name of the object
    value int    // its value
}
So, my question is: are there any automatic tools (linters/formatters) that will reformat dataclasses/Pydantic models the way I described above? I looked through autopep8 and the Black formatter and found nothing; I also googled, and still nothing. Any ideas how to achieve that with existing tools?
I think yapf has something like that for comments. Check the SPACES_BEFORE_COMMENT "knob":
"The number of spaces required before a trailing comment. This can be a single value (representing the number of spaces before each trailing comment) or a list of values (representing alignment column values; trailing comments within a block will be aligned to the first column value that is greater than the maximum line length within the block)"
.style.yapf:
[style]
based_on_style = pep8
spaces_before_comment = 10,20,30,40,50,60,70,80
Configures alignment columns: 10, 20, ..., 80.
foo.py:
class G6A(BaseModel):
    transaction_id: items.TransactionReference  # Transaction Id
    mpan_core: items.MPAN  # MPAN Core
    registration_date: items.CallistoDate  # Registration Date
    action_required: G6AAction  # Action Required
Output of yapf foo.py:
class G6A(BaseModel):
    transaction_id: items.TransactionReference    # Transaction Id
    mpan_core: items.MPAN                         # MPAN Core
    registration_date: items.CallistoDate         # Registration Date
    action_required: G6AAction                    # Action Required
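To rewrite the file in place instead of printing to stdout, you can run yapf -i foo.py; yapf picks up the nearest .style.yapf automatically.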
I'm new to the Python language, and my question is certainly a naive one concerning Python syntax. I am at the step where I must go from theory to practice.
Here is a class (a TypeScript one) that I want to translate to Python:
class Category {
    id: number;
    type: 'shop'|'blog';
    name: string;
    slug: string;
    path: string;
    image: string|null;
    items: number;
    customFields: CustomFields;
    parents?: Category[]|null;
    children?: Category[]|null;
}
As Python is a dynamically typed language, I have doubts about how to translate:
the optional property: '?'
the associated class: customFields: CustomFields;
the arrays of the associated class (which is self-referencing) that are nullable: children?: Category[]|null;
I've always worked with typed languages until now, and it's destabilising my habits to just write nothing.
Would that look like this (it's a model for a Django migration)?

from django.db import models

class Category(models.Model):
    id = models.AutoField(primary_key=True)
    type: 'shop'|'blog'
    name = models.CharField(max_length=100)
    slug = models.CharField(max_length=100)
    path = models.CharField(max_length=250)

and then ... ?
Could you also recommend some tutorials, docs, or examples for learning Python in practice?
Thanks to all of you!
When you define a function in Python you can add type hints, but they are not enforced at runtime by Python itself; they are annotations that type checkers and libraries like Pydantic can make use of. You can write something like this:

# for functions
def addition(a: int, b: int) -> int:
    return a + b

addition(4, 10)

# for variables or attributes
name: str = 'test'
age: int = 10
rating: float = 1.11
is_exist: bool = True
There is more on typing in the Python documentation; you can refer to the typing module docs.
It's best to learn the language by following its documentation; there are enough examples there. https://docs.python.org/3/
If you want to learn Django, there are quite a lot of tutorials. But first of all, look here: https://docs.djangoproject.com/en/3.2/
As for your example with the Category class:
The id is determined automatically, and it is not necessary to specify it explicitly in the model; see https://docs.djangoproject.com/en/3.2/topics/db/models/
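As a rough sketch of the rest (not a definitive translation: the CustomFields placeholder, the field lengths, and the choice of a self-referencing many-to-many for parents/children are all my assumptions):

from django.db import models

class CustomFields(models.Model):
    pass  # hypothetical placeholder; define whatever fields you need

class Category(models.Model):
    TYPE_CHOICES = [('shop', 'shop'), ('blog', 'blog')]

    # no explicit id: Django adds an auto primary key for you
    type = models.CharField(max_length=4, choices=TYPE_CHOICES)
    name = models.CharField(max_length=100)
    slug = models.SlugField(max_length=100)
    path = models.CharField(max_length=250)
    # null=True/blank=True covers the TypeScript string|null
    image = models.CharField(max_length=250, null=True, blank=True)
    items = models.IntegerField(default=0)
    custom_fields = models.ForeignKey(CustomFields, on_delete=models.CASCADE)
    # a 'self' relation covers parents, and the reverse accessor
    # ('children') covers children
    parents = models.ManyToManyField('self', symmetrical=False,
                                     related_name='children', blank=True)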
If you must enforce types in your Python code, take a look at isinstance.
For optional class attributes in Python, you can use keyword arguments, as shown in this answer.
In my endless quest to overcomplicate simple stuff, I am researching the most 'Pythonic' way to provide global configuration variables inside the typical 'config.py' found in Python egg packages.
The traditional way (aah, good ol' #define!) is as follows:
MYSQL_PORT = 3306
MYSQL_DATABASE = 'mydb'
MYSQL_DATABASE_TABLES = ['tb_users', 'tb_groups']
The global variables are then imported in one of the following ways:
from config import *

dbname = MYSQL_DATABASE
for table in MYSQL_DATABASE_TABLES:
    print(table)
or:
import config
dbname = config.MYSQL_DATABASE
assert(isinstance(config.MYSQL_PORT, int))
It makes sense, but it can sometimes be a little messy, especially when you're trying to remember the names of certain variables. Besides, providing a 'configuration' object with variables as attributes might be more flexible. So, taking a lead from the bpython config.py file, I came up with:
class Struct(object):
    def __init__(self, *args):
        self.__header__ = str(args[0]) if args else None

    def __repr__(self):
        if self.__header__ is None:
            return super(Struct, self).__repr__()
        return self.__header__

    def next(self):
        """ Fake iteration functionality.
        """
        raise StopIteration

    def __iter__(self):
        """ Fake iteration functionality.
        We skip magic attributes and Structs, and return the rest.
        """
        ks = self.__dict__.keys()
        for k in ks:
            if not k.startswith('__') and not isinstance(getattr(self, k), Struct):
                yield getattr(self, k)

    def __len__(self):
        """ Don't count magic attributes or Structs.
        """
        ks = self.__dict__.keys()
        return len([k for k in ks if not k.startswith('__')
                    and not isinstance(getattr(self, k), Struct)])
and a 'config.py' that imports the class and reads as follows:
from _config import Struct as Section

mysql = Section("MySQL specific configuration")
mysql.user = 'root'
mysql.password = 'secret'  # 'pass' is a reserved keyword, so use another name
mysql.host = 'localhost'
mysql.port = 3306
mysql.database = 'mydb'

mysql.tables = Section("Tables for 'mydb'")
mysql.tables.users = 'tb_users'
mysql.tables.groups = 'tb_groups'
and is used in this way:
from sqlalchemy import MetaData, Table
import config as CONFIG

assert(isinstance(CONFIG.mysql.port, int))

mdata = MetaData(
    "mysql://%s:%s@%s:%d/%s" % (
        CONFIG.mysql.user,
        CONFIG.mysql.password,
        CONFIG.mysql.host,
        CONFIG.mysql.port,
        CONFIG.mysql.database,
    )
)

tables = []
for name in CONFIG.mysql.tables:
    tables.append(Table(name, mdata, autoload=True))
Which seems like a more readable, expressive, and flexible way of storing and fetching global variables inside a package.
Lamest idea ever? What is the best practice for coping with these situations? What is your way of storing and fetching global names and variables inside your package?
How about just using the built-in types like this:
config = {
    "mysql": {
        "user": "root",
        "pass": "secret",
        "tables": {
            "users": "tb_users"
        }
        # etc
    }
}
You'd access the values as follows:
config["mysql"]["tables"]["users"]
If you are willing to sacrifice the potential to compute expressions inside your config tree, you could use YAML and end up with a more readable config file like this:

mysql:
    user: root
    pass: secret
    tables:
        users: tb_users

and use a library like PyYAML to conveniently parse and access the config file.
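A minimal sketch of loading it, assuming the YAML above is saved as config.yaml:

import yaml  # PyYAML

with open("config.yaml") as f:
    config = yaml.safe_load(f)

print(config["mysql"]["user"])  # root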
I like this solution for small applications:
class App:
    __conf = {
        "username": "",
        "password": "",
        "MYSQL_PORT": 3306,
        "MYSQL_DATABASE": 'mydb',
        "MYSQL_DATABASE_TABLES": ['tb_users', 'tb_groups']
    }
    __setters = ["username", "password"]

    @staticmethod
    def config(name):
        return App.__conf[name]

    @staticmethod
    def set(name, value):
        if name in App.__setters:
            App.__conf[name] = value
        else:
            raise NameError("Name not accepted in set() method")
And then usage is:
if __name__ == "__main__":
    # from config import App
    App.config("MYSQL_PORT")      # return 3306
    App.set("username", "hi")     # set new username value
    App.config("username")        # return "hi"
    App.set("MYSQL_PORT", "abc")  # this raises NameError
.. you should like it because it:
uses class variables (no object to pass around, no singleton required),
uses encapsulated built-in types and looks like (is) a method call on App,
has control over individual config immutability; mutable globals are the worst kind of globals,
promotes conventional and well-named access/readability in your source code,
is a simple class that still enforces structured access; an alternative is to use @property, but that requires more variable-handling code per item and is object-based (see the sketch after this list),
requires minimal changes to add new config items and set their mutability.
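For comparison, a hedged sketch of the @property alternative mentioned above (illustrative only; note the extra handling code per item and the need for an instance):

class Config:
    def __init__(self):
        self._username = ""  # one backing attribute per item

    @property
    def username(self):
        return self._username

    @username.setter
    def username(self, value):
        self._username = value

cfg = Config()       # object-based: an instance must be passed around
cfg.username = "hi"
print(cfg.username)  # hi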
--Edit--:
For large applications, storing values in a YAML (i.e. properties) file and reading that in as immutable data is a better approach (i.e. blubb/ohaal's answer).
For small applications, this solution above is simpler.
How about using classes?
# config.py
class MYSQL:
    PORT = 3306
    DATABASE = 'mydb'
    DATABASE_TABLES = ['tb_users', 'tb_groups']

# main.py
from config import MYSQL
print(MYSQL.PORT)  # 3306
Let's be honest, we should probably consider using a Python Software Foundation maintained library:
https://docs.python.org/3/library/configparser.html
Config example (INI format; JSON via the standard library is an alternative):
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes
[bitbucket.org]
User = hg
[topsecret.server.com]
Port = 50022
ForwardX11 = no
Code example:
>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.read('example.ini')
>>> config['DEFAULT']['Compression']
'yes'
>>> config['DEFAULT'].getboolean('MyCompression', fallback=True) # get_or_else
Making it globally-accessible:
import configparser

class App:
    __conf = None

    @staticmethod
    def config():
        if App.__conf is None:  # read only once, lazy
            App.__conf = configparser.ConfigParser()
            App.__conf.read('example.ini')
        return App.__conf

if __name__ == '__main__':
    App.config()['DEFAULT']['MYSQL_PORT']
    # or, better:
    App.config().get(section='DEFAULT', option='MYSQL_PORT', fallback=3306)
Downsides:
Uncontrolled global mutable state.
A small variation on Husky's idea that I use. Make a file called 'globals' (or whatever you like) and then define multiple classes in it, as such:
# globals.py
class dbinfo:  # for database globals
    username = 'abcd'
    password = 'xyz'

class runtime:
    debug = False
    output = 'stdio'
Then, if you have two code files c1.py and c2.py, both can have at the top
import globals as gl
Now all code can access and set values, as such:
gl.runtime.debug = False
print(gl.dbinfo.username)
People forget that classes exist even if no object of that class is ever instantiated. And variables in a class that aren't prefixed with 'self.' are shared across all instances of the class, even if there are none. Once 'debug' is changed by any code, all other code sees the change.
By importing it as gl, you can have multiple such files and variables that lets you access and set values across code files, functions, etc., but with no danger of namespace collision.
This lacks some of the clever error checking of other approaches, but is simple and easy to follow.
Similar to blubb's answer, but I suggest building them with lambda functions to reduce code. Like this:
User = lambda passwd, hair, name: {'password': passwd, 'hair': hair, 'name': name}

#         Username  Password      Hair Color  Real Name
config = {'st3v3': User('password',   'blonde', 'Steve Booker'),
          'blubb': User('12345678',   'black',  'Bubb Ohaal'),
          'suprM': User('kryptonite', 'black',  'Clark Kent'),
          # ...
         }
# ...
config['st3v3']['password']  #> password
config['blubb']['hair']      #> black
This does smell like you may want to make a class, though.
Or, as MarkM noted, you could use namedtuple
from collections import namedtuple
# ...
User = namedtuple('User', ['password', 'hair', 'name'])

#         Username  Password      Hair Color  Real Name
config = {'st3v3': User('password',   'blonde', 'Steve Booker'),
          'blubb': User('12345678',   'black',  'Bubb Ohaal'),
          'suprM': User('kryptonite', 'black',  'Clark Kent'),
          # ...
         }
# ...
config['st3v3'].password  #> password
config['blubb'].hair      #> black
I did that once. Ultimately I found my simplified basicconfig.py adequate for my needs. You can pass in a namespace with other objects for it to reference if you need to. You can also pass in additional defaults from your code. It also maps attribute and mapping style syntax to the same configuration object.
Please check out the IPython configuration system, implemented via traitlets, for the type enforcement you are doing manually.
Cut and pasted here to comply with SO guidelines against just dropping links, as the content of links changes over time.
traitlets documentation
Here are the main requirements we wanted our configuration system to have:
Support for hierarchical configuration information.
Full integration with command line option parsers. Often, you want to read a configuration file, but then override some of the values with command line options. Our configuration system automates this process and allows each command line option to be linked to a particular attribute in the configuration hierarchy that it will override.
Configuration files that are themselves valid Python code. This accomplishes many things. First, it becomes possible to put logic in your configuration files that sets attributes based on your operating system, network setup, Python version, etc. Second, Python has a super simple syntax for accessing hierarchical data structures, namely regular attribute access (Foo.Bar.Bam.name). Third, using Python makes it easy for users to import configuration attributes from one configuration file to another.
Fourth, even though Python is dynamically typed, it does have types that can be checked at runtime. Thus, a 1 in a config file is the integer ‘1’, while a '1' is a string.
A fully automated method for getting the configuration information to the classes that need it at runtime. Writing code that walks a configuration hierarchy to extract a particular attribute is painful. When you have complex configuration information with hundreds of attributes, this makes you want to cry.
Type checking and validation that doesn’t require the entire configuration hierarchy to be specified statically before runtime. Python is a very dynamic language and you don’t always know everything that needs to be configured when a program starts.
To achieve this they basically define 3 object classes and their relations to each other:
1) Configuration - basically a ChainMap / basic dict with some enhancements for merging.
2) Configurable - base class to subclass all things you'd wish to configure.
3) Application - object that is instantiated to perform a specific application function, or your main application for single purpose software.
In their words:
Application: Application
An application is a process that does a specific job. The most obvious application is the ipython command line program. Each application reads one or more configuration files and a single set of command line options and then produces a master configuration object for the application. This configuration object is then passed to the configurable objects that the application creates. These configurable objects implement the actual logic of the application and know how to configure themselves given the configuration object.
Applications always have a log attribute that is a configured Logger. This allows centralized logging configuration per-application.
Configurable: Configurable
A configurable is a regular Python class that serves as a base class for all main classes in an application. The Configurable base class is lightweight and only does one thing.
This Configurable is a subclass of HasTraits that knows how to configure itself. Class level traits with the metadata config=True become values that can be configured from the command line and configuration files.
Developers create Configurable subclasses that implement all of the logic in the application. Each of these subclasses has its own configuration information that controls how instances are created.
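As a rough, hedged sketch of how those pieces fit together in code (the Database class and its traits are my own illustration, not from the IPython source):

from traitlets import Int, Unicode
from traitlets.config import Application, Configurable

class Database(Configurable):
    # traits tagged config=True become configurable via files and the CLI
    port = Int(3306, help="MySQL port").tag(config=True)
    name = Unicode('mydb', help="Database name").tag(config=True)

class MyApp(Application):
    classes = [Database]  # the configurables this application manages

    def start(self):
        # pass the master configuration object down to configurables
        db = Database(config=self.config)
        print(db.port, db.name)

if __name__ == '__main__':
    app = MyApp()
    app.initialize()
    app.start()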