Let's say I have an app.py like this
class myClassA:
    def __init__(self):
        self.id = 100

class myClassB:
    def __init__(self, objA, id):
        pass
Is there a way to use Hydra so that a config file like the one below works the way it intuitively should?
myClassA:
  _target_: myapp.myClassA
myclassB:
  _target_: myapp.myClassB
  param1: ${myClassA}
  param2: ${myclassB.param1.id}
My issue is that in order to instantiate my class B, I need an attribute from the class A object, but this attribute is set in the __init__ function of class A and cannot be set in the config file.
I've tried putting id: ??? but it didn't work.
Thanks a lot!
The following does the trick:
# app.py
import hydra
from hydra.utils import instantiate
from omegaconf import OmegaConf

class myClassA:
    def __init__(self):
        self.id = 100

class myClassB:
    def __init__(self, objA, objA_id):
        assert isinstance(objA, myClassA)
        assert objA_id == 100
        print("myClassB __init__ ran")

@hydra.main(config_name="conf.yaml", config_path=".", version_base="1.2")
def app(cfg):
    instantiate(cfg)

if __name__ == "__main__":
    app()
# conf.yaml
myClassA:
  _target_: __main__.myClassA
myClassB:
  _target_: __main__.myClassB
  objA: ${myClassA}
  objA_id:
    _target_: builtins.getattr
    _args_:
      - ${myClassA}
      - "id"
$ python app.py
myClassB __init__ ran
How does this work? Using builtins.getattr as a target allows for looking up the "id" attribute on an instance of myClassA.
NOTE: Several instances of myClassA will be created here. There is an open feature request in Hydra regarding support for a singleton pattern in recursive instantiation, which would enable re-using the same instance of myClassA in several places.
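To make the note above concrete, here is a rough plain-Python sketch of what the myClassB part of the config boils down to, assuming each ${myClassA} interpolation is instantiated independently. The instances_created counter and the stored attributes are additions for illustration only and are not part of the original code.

```python
class myClassA:
    instances_created = 0  # illustration only: counts constructor calls

    def __init__(self):
        type(self).instances_created += 1
        self.id = 100


class myClassB:
    def __init__(self, objA, objA_id):
        self.objA = objA
        self.objA_id = objA_id


# Each interpolation yields its own myClassA, so objA and the object passed
# to getattr are two distinct instances that carry the same `id` value.
obj_b = myClassB(objA=myClassA(), objA_id=getattr(myClassA(), "id"))
print(obj_b.objA_id, myClassA.instances_created)
```

This is harmless when the constructor is cheap and stateless, but it matters if myClassA holds resources or mutable state that myClassB expects to share.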
I have a Hydra configuration in which I have to use dataclasses. As values for some members I again want to use configs that inherit from a common base class. Let's have a look at the following minimal example:
from dataclasses import dataclass, field
from typing import List, Any
import hydra.utils
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING
# data.py:
# for a machine learning project, I have two different dataset classes.
class Dataset1:
    def __init__(self, member1):
        pass

class Dataset2:
    def __init__(self, member2):
        pass
# They have each their own config dataclass with different members.
# For later use they are also based on a common base class.
@dataclass
class DataConfig:
    """This is just a common base class."""
    pass

@dataclass
class Dataset1Config(DataConfig):
    _target_: str = "Dataset1"
    member1: int = 1

@dataclass
class Dataset2Config(DataConfig):
    _target_: str = "Dataset2"
    member2: int = 2
# I register them at some place in my folder structure.
cs = ConfigStore.instance()
cs.store(group="some/folder", name=Dataset1Config.__name__, node=Dataset1Config)
cs.store(group="some/folder", name=Dataset2Config.__name__, node=Dataset2Config)
# main.py:
# for the training routine of my machine learning project, I also have a config that needs a dataset
# usually I would use dataset1, so I have this as a default.
# In any case, I want to be able to use any config inheriting from `DataConfig` as config for my dataset.
@dataclass
class TrainConfig:
    defaults: List[Any] = field(default_factory=lambda: ["some/folder/Dataset1Config@dataset"])
    dataset: DataConfig = MISSING

cs.store(name="TrainConfig", node=TrainConfig)

@hydra.main(config_name="my_config", version_base="1.2", config_path=".")
def main(cfg):
    instance_dict = hydra.utils.instantiate(cfg)

if __name__ == "__main__":
    main()
Now I want to use Dataset2Config instead of the default Dataset1Config. To this end, I pass the following my_config.yaml to the script.
# my_config.yaml
defaults:
  - TrainConfig  # I want to have this, because in reality there are other defaults set in it I want to use
  - /some/folder/Dataset2Config@dataset  # trying to overwrite the default value in the TrainConfig
I would like everything from Dataset1Config to be replaced by Dataset2Config. However, the cfg I obtain when running main.py is {'dataset': {'_target_': 'Dataset2', 'member1': 1, 'member2': 2}}. (In other, slightly more complex examples the config wasn't built at all, but I can't reproduce that yet.)
What do I have to do to end up with a cfg like {'dataset': {'_target_': 'Dataset2', 'member2': 2}}?
I have a factory as shown in the following code:
class ClassFactory:
    registry = {}

    @classmethod
    def register(cls, name):
        def inner_wrapper(wrapped_class):
            if name in cls.registry:
                print(f'Class {name} already exists. Will replace it')
            cls.registry[name] = wrapped_class
            return wrapped_class
        return inner_wrapper

    @classmethod
    def create_type(cls, name):
        exec_class = cls.registry[name]
        type = exec_class()
        return type

@ClassFactory.register('Class 1')
class M1():
    def __init__(self):
        print("Starting Class 1")

@ClassFactory.register('Class 2')
class M2():
    def __init__(self):
        print("Starting Class 2")
This works fine and when I do
if __name__ == '__main__':
    print(ClassFactory.registry.keys())
    foo = ClassFactory.create_type("Class 2")
I get the expected result of dict_keys(['Class 1', 'Class 2']) Starting Class 2
Now the problem is that I want to isolate classes M1 and M2 to their own files m1.py and m2.py, and in the future add other classes using their own files in a plugin manner.
However, simply placing each class in its own file
m2.py
from test_ import ClassFactory

@ClassFactory.register('Class 2')
class M2():
    def __init__(self):
        print("Starting Class 2")
gives the result dict_keys(['Class 1']) since it never gets to register the class.
So my question is: How can I ensure that the class is registered when placed in a file different from the factory, without making changes to the factory file whenever I want to add a new class? How to self register in this way? Also, is this decorator way a good way to do this kind of thing, or are there better practices?
Thanks
How can I ensure that the class is registered when placed in a file different from the factory, without making changes to the factory file whenever I want to add a new class?
I'm playing around with a similar problem, and I've found a possible solution. It seems too much of a 'hack' though, so set your critical thinking levels to 'high' when reading my suggestion below :)
As you've mentioned in one of your comments above, the trick is to force the loading of the individual *.py files that contain individual class definitions.
Applying this to your example, this would involve:
Keeping all class implementations in a specific folder, e.g., structuring the files as follows:
.
├─ factory.py       # file with the ClassFactory class
└─ classes/
   ├─ __init__.py
   ├─ m1.py         # file with M1 class
   └─ m2.py         # file with M2 class
Adding the following statement to the end of your factory.py file, which will take care of loading and registering each individual class:
from classes import *
Adding a piece of code like the snippet below to the __init__.py within the classes/ folder, to dynamically load all classes [1]:
from inspect import isclass
from pkgutil import iter_modules
from pathlib import Path
from importlib import import_module

# iterate through the modules in the current package
package_dir = Path(__file__).resolve().parent
# iter_modules expects string paths, so convert the Path explicitly
for (_, module_name, _) in iter_modules([str(package_dir)]):
    # import the module and iterate through its attributes
    module = import_module(f"{__name__}.{module_name}")
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if isclass(attribute):
            # Add the class to this package's variables
            globals()[attribute_name] = attribute
If I then run your test code, I get the desired result:
# test.py
from factory import ClassFactory
if __name__ == "__main__":
    print(ClassFactory.registry.keys())
    foo = ClassFactory.create_type("Class 2")
$ python test.py
dict_keys(['Class 1', 'Class 2'])
Starting Class 2
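The key point above, that a class is only registered once its module has actually been imported, can be reproduced in a self-contained way. The following is a minimal sketch, assuming a throwaway project written to a temporary directory; the names demo_factory and demo_classes are hypothetical stand-ins for factory.py and classes/.

```python
import importlib
import pkgutil
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway project on disk (hypothetical names, illustration only).
root = Path(tempfile.mkdtemp())
(root / "demo_factory.py").write_text(textwrap.dedent('''
    class ClassFactory:
        registry = {}

        @classmethod
        def register(cls, name):
            def inner_wrapper(wrapped_class):
                cls.registry[name] = wrapped_class
                return wrapped_class
            return inner_wrapper
'''))
plugins = root / "demo_classes"
plugins.mkdir()
(plugins / "__init__.py").write_text("")
for index in (1, 2):
    (plugins / f"m{index}.py").write_text(textwrap.dedent(f'''
        from demo_factory import ClassFactory

        @ClassFactory.register("Class {index}")
        class M{index}:
            pass
    '''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()
from demo_factory import ClassFactory

names_before = sorted(ClassFactory.registry)  # no plugin module imported yet

# Importing each module runs its decorator, which fills the registry.
for module_info in pkgutil.iter_modules([str(plugins)]):
    importlib.import_module(f"demo_classes.{module_info.name}")

names_after = sorted(ClassFactory.registry)
print(names_before, names_after)
```

Dropping a new m3.py into the package folder would then be picked up without touching the factory module, which is the plugin behaviour asked for.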
Also, is this decorator way a good way to do this kind of thing, or are there better practices?
Unfortunately, I'm not experienced enough to answer this question. However, when searching for answers to this problem, I've come across the following sources that may be helpful to you:
[2] : this presents a method for registering class existence based on Python Metaclasses. As far as I understand, it relies on the registering of subclasses, so I don't know how well it applies to your case. I did not follow this approach, as I've noticed that the new edition of the book suggests the use of another technique (see bullet below).
[3], item 49 : this is the 'current' suggestion for subclass registering, which relies on the definition of the __init_subclass__() function in a base class.
If I had to apply the __init_subclass__() approach to your case, I'd do the following:
Add a Registrable base class to your factory.py (and slightly re-factor ClassFactory), like this:
class Registrable:
    def __init_subclass__(cls, name: str):
        ClassFactory.register(name, cls)

class ClassFactory:
    registry = {}

    @classmethod
    def register(cls, name: str, sub_class: Registrable):
        if name in cls.registry:
            print(f'Class {name} already exists. Will replace it')
        cls.registry[name] = sub_class

    @classmethod
    def create_type(cls, name):
        exec_class = cls.registry[name]
        type = exec_class()
        return type

from classes import *
Slightly modify your concrete classes to inherit from the Registrable base class, e.g.:
from factory import Registrable

class M2(Registrable, name='Class 2'):
    def __init__(self):
        print("Starting Class 2")
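For experimenting with the __init_subclass__() approach without the package layout, the same idea fits in a single file. This is a minimal sketch under that assumption; the **kwargs forwarding to super() is an addition that is not in the snippets above.

```python
class ClassFactory:
    registry = {}

    @classmethod
    def register(cls, name, sub_class):
        cls.registry[name] = sub_class

    @classmethod
    def create_type(cls, name):
        return cls.registry[name]()


class Registrable:
    def __init_subclass__(cls, name, **kwargs):
        # Runs once for every subclass, at class-definition time.
        super().__init_subclass__(**kwargs)
        ClassFactory.register(name, cls)


class M1(Registrable, name="Class 1"):
    pass


class M2(Registrable, name="Class 2"):
    pass


instance = ClassFactory.create_type("Class 2")
print(sorted(ClassFactory.registry), type(instance).__name__)
```

Compared with the decorator, the registration name moves into the class statement itself, and forgetting it raises a TypeError when the subclass is defined.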
I have a scenario where I pass in a file name and check whether any class in that file takes start as its constructor argument. If one does, I have to create an instance of that class.
Consider the example where I have a file named test.py with three classes, A, B and C. Only class A takes just the start parameter; the others take different or extra parameters.
# test.py
class A:
    def __init__(self, start=""):
        pass

class B:
    def __init__(self, randomKeyword, start=""):
        pass

class C:
    def __init__(self):
        pass
Now I want to write a script that takes test.py as an argument and creates an instance of A. So far I have:
import importlib.util
spec = importlib.util.spec_from_file_location('test.py', '/path/to/test.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
Basically, I need a program that finds the init arguments of every class in the file and creates an instance of each class whose init takes start as its argument.
As mentioned by @deceze, it's not a good idea to instantiate a class on the basis of its init parameters, as we can't be sure what is there. But it is possible, so I am posting this answer just so that you know how it can be done.
# test.py
class A:
    def __init__(self, start=""):
        pass

class B:
    def __init__(self, randomKeyword, start=""):
        pass

class C:
    def __init__(self):
        pass
One possibility is:
# init.py
import importlib.util
from inspect import getmembers, isclass, signature

spec = importlib.util.spec_from_file_location('test.py', '/path/to/test.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

for name, cls in getmembers(module, isclass):
    parameters = signature(cls.__init__).parameters.keys()
    # exactly two parameters: self and start
    if len(parameters) == 2 and 'start' in parameters:
        # named obj to avoid shadowing the built-in object
        obj = cls(start="Whatever you want")
Of course it's not the best approach, so more answers are welcome, and if you are in this scenario, consider @deceze's comment and define a builder.
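The signature check can also be tried without loading a file from disk. The sketch below is one possible variant, assuming the three classes are already in scope; accepts_only_start is a hypothetical helper name, and it inspects signature(cls) instead of cls.__init__ so that self does not have to be counted.

```python
from inspect import signature


class A:
    def __init__(self, start=""):
        self.start = start


class B:
    def __init__(self, randomKeyword, start=""):
        self.start = start


class C:
    def __init__(self):
        pass


def accepts_only_start(cls):
    """Return True if `start` is the only constructor parameter."""
    # signature(cls) describes the call cls(...), so `self` is already omitted.
    return list(signature(cls).parameters) == ["start"]


instances = [cls(start="value") for cls in (A, B, C) if accepts_only_start(cls)]
print([type(instance).__name__ for instance in instances])
```

Comparing the full parameter list is stricter than counting parameters, since it also rules out a class whose two parameters happen to include start under a different arrangement.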
I'm fairly new to python and currently attempting to write a unit test for a class, but am having some problems with mocking out dependencies. I have 2 classes, one of which (ClassB) is a dependency of the other (ClassC). The goal is to mock out ClassB and the ArgumentParser classes in the test case for ClassC. ClassB looks as follows:
# defined in a.b.b
class ClassB:
    def doStuff(self) -> None:
        # do stuff
        pass

    def doSomethingElse(self) -> None:
        # do something else
        pass
ClassC:
# defined in a.b.c
from .b import ClassB
from argparse import ArgumentParser

class ClassC:
    b

    def __init__(self) -> None:
        arguments = self.parseArguments()
        self.b = ClassB()
        self.b.doStuff()

    def close(self) -> None:
        self.b.doSomethingElse()

    def parseArguments(self) -> dict:
        parser = ArgumentParser()
        return parser.parse_args()
And finally, the test case for ClassC:
# inside a.b.test
from unittest import TestCase
from unittest.mock import patch, MagicMock
from a.b.c import ClassC

class ClassCTest(TestCase):
    @patch('a.b.c.ClassB')
    @patch('a.b.c.ArgumentParser')
    def test__init__(self, mock_ArgumentParser, mock_ClassB):
        c = ClassC()
        print(isinstance(c.b, MagicMock))  # outputs False
        # for reference
        print(isinstance(mock_ClassB, MagicMock))  # outputs True
I read in the patch docs that it's important to mock the class in the namespace where it is used, not where it is defined. So that's what I did: I mocked a.b.c.ClassB instead of a.b.b.ClassB, although I have tried both. I also tried importing ClassC inside the test__init__ method body, but this also didn't work.
I prefer not mocking methods of ClassB but rather the entire class to keep the test as isolated as possible.
Environment info:
Python 3.6.1
Any help would be greatly appreciated!
Since I'm new to Python, I didn't know about class attributes. I had a class attribute in ClassC that held ClassB, and an instance attribute set in __init__ that shadowed the class attribute.
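The patch target used in the question can be checked in isolation. The following is a minimal sketch, assuming a throwaway package written to a temporary directory; mock_demo is a hypothetical stand-in for a.b, and ClassC is reduced to the part that creates ClassB, with no class-level attribute.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path
from unittest.mock import MagicMock, patch

# Throwaway package on disk; `mock_demo` is a hypothetical stand-in for `a.b`.
root = Path(tempfile.mkdtemp())
package = root / "mock_demo"
package.mkdir()
(package / "__init__.py").write_text("")
(package / "b.py").write_text(textwrap.dedent('''
    class ClassB:
        def doStuff(self):
            pass
'''))
(package / "c.py").write_text(textwrap.dedent('''
    from .b import ClassB

    class ClassC:
        def __init__(self):
            self.b = ClassB()  # name is looked up in mock_demo.c at call time
            self.b.doStuff()
'''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()
from mock_demo.c import ClassC

# Patch the name in the module that uses it, not the one that defines it.
with patch("mock_demo.c.ClassB") as mock_class_b:
    patched = ClassC()
    b_is_mock = isinstance(patched.b, MagicMock)
    mock_class_b.return_value.doStuff.assert_called_once()

# Outside the context manager the real class is back.
unpatched = ClassC()
b_is_real = type(unpatched.b).__name__ == "ClassB"
print(b_is_mock, b_is_real)
```

With the instance created inside __init__, patching the name in the using module is enough for c.b to be a mock.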
I have a class instance I want to access in other modules. This class loads config values using configParser to update a class instance's __dict__ attribute as per this post:
I want to access this instance in other module. The instance is only created in the main.py file where it has access to the required parameters, which come via command line arguments.
I have three files: main.py, config.py and file.py. I don't know the best way to access the instance in the file.py. I only have access to it in main.py and not other modules.
I've looked at the following answers, here and here but they don't fully answer my scenario.
# config.py
class Configuration():
    def __init__(self, *import_sections):
        # use configParser, get config for relevant sections, update self.__dict__

# main.py
from config import Configuration
conf = Configuration('general', 'dev')
# other lines of code use conf instance ... e.g. conf.log_path in log setup

# file.py
# I want to use config instance like this:
class File():
    def __init__(self, conf.feed_path):
        # other code here...
Options considered:
Initialise Configuration in config.py module
In config.py after class definition I could add:
conf = Configuration('general', 'dev')
and in file.py and main.py:
from config import conf
but the general and dev variables are only found in main.py so doesn't look like it will work.
Make Configuration class a function
I could make it a function and create a module-level dictionary and import data into other modules:
# config.py
conf = {}

def set_config(*import_section):
    # use configParser, update conf dictionary
    conf.update(...)
This would mean referring to it as config.conf['log_path'] for example. I'd prefer conf.log_path as it's used multiple times.
Pass via other instances
I could pass the conf instance as parameters via other class instances from main.py, even if the intermediate instances don't use it. Seems very messy.
Other options?
Can I use Configuration as an instance somehow?
By changing your Configuration class into a Borg, you are guaranteed to get a common state from wherever you want. You can either provide initialization through a specific __init__:
#config.py
class Configuration:
    __shared_state = {}

    def __init__(self, *import_sections):
        self.__dict__ = self.__shared_state
        if not import_sections:  # we are not initializing this time
            return
        # your old code verbatim
Initialization is done as usual with c = config.Configuration('general', 'dev'), and any later call to conf = config.Configuration() will get the state that c created.
Or you can provide a separate initialization method to avoid tampering with the shared state in __init__:
#config.py
class Configuration:
    __shared_state = {}

    def __init__(self):
        self.__dict__ = self.__shared_state

    # "import" is a reserved keyword in Python, so the method needs another name
    def load(self, *import_sections):
        # your old __init__
That way there is only one meaning to the __init__ method, which is cleaner.
In both cases, you can get the shared state, once initialized, from anywhere in your code by using config.Configuration().
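A runnable miniature of the Borg idea may help to see the sharing in action. This is a minimal sketch: the **values keyword arguments are a hypothetical stand-in for the configParser loading step, and the paths are made up for illustration.

```python
class Configuration:
    __shared_state = {}  # one dict shared by every instance

    def __init__(self, **values):
        self.__dict__ = self.__shared_state
        # Hypothetical stand-in for the configParser loading step.
        self.__dict__.update(values)


# In main.py: initialise once with the real values.
main_conf = Configuration(log_path="/tmp/app.log", feed_path="/tmp/feed")

# In any other module: a bare call sees the same state.
other_conf = Configuration()
print(other_conf.log_path, other_conf is main_conf)
```

The two objects are distinct instances that share one attribute dictionary, which is what separates a Borg from a classic singleton and keeps the conf.log_path attribute syntax the question asked for.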