Suppose in "./data_writers/excel_data_writer.py", I have:
from generic_data_writer import GenericDataWriter
class ExcelDataWriter(GenericDataWriter):
def __init__(self, config):
super().__init__(config)
self.sheet_name = config.get('sheetname')
def write_data(self, pandas_dataframe):
pandas_dataframe.to_excel(
self.get_output_file_path_and_name(), # implemented in GenericDataWriter
sheet_name=self.sheet_name,
index=self.index)
In "./data_writers/csv_data_writer.py", I have:
from generic_data_writer import GenericDataWriter
class CSVDataWriter(GenericDataWriter):
def __init__(self, config):
super().__init__(config)
self.delimiter = config.get('delimiter')
self.encoding = config.get('encoding')
def write_data(self, pandas_dataframe):
pandas_dataframe.to_csv(
self.get_output_file_path_and_name(), # implemented in GenericDataWriter
sep=self.delimiter,
encoding=self.encoding,
index=self.index)
In "./datawriters/generic_data_writer.py", I have:
import os
class GenericDataWriter:
def __init__(self, config):
self.output_folder = config.get('output_folder')
self.output_file_name = config.get('output_file')
self.output_file_path_and_name = os.path.join(self.output_folder, self.output_file_name)
self.index = config.get('include_index') # whether to include index column from Pandas' dataframe in the output file
Suppose I have a JSON config file that has a key-value pair like this:
{
"__comment__": "Here, user can provide the path and python file name of the custom data writer module she wants to use."
"custom_data_writer_module": "./data_writers/excel_data_writer.py"
"there_are_more_key_value_pairs_in_this_JSON_config_file": "for other input parameters"
}
In "main.py", I want to import the data writer module based on the custom_data_writer_module provided in the JSON config file above. So I wrote this:
import os
import importlib
def main():
# Do other things to read and process data
data_writer_class_file = config.get('custom_data_writer_module')
data_writer_module = importlib.import_module\
(os.path.splitext(os.path.split(data_writer_class_file)[1])[0])
dw = data_writer_module.what_should_this_be? # <=== Here, what should I do to instantiate the right specific data writer (Excel or CSV) class instance?
for df in dataframes_to_write_to_output_file:
dw.write_data(df)
if __name__ == "__main__":
main()
As I asked in the code above, I want to know if there's a way to retrieve and instantiate the class defined in a Python module assuming that there is ONLY ONE class defined in the module. Or if there is a better way to refactor my code (using some sort of pattern) without changing the structure of JSON config file described above, I'd like to learn from Python experts on StackOverflow. Thank you in advance for your suggestions!
You can do this easily with vars:
cls1,=[v for k,v in vars(data_writer_module).items()
if isinstance(v,type)]
dw=cls1(config)
The comma enforces that exactly one class is found. If the module is allowed to do anything like from collections import deque (or even foo=str), you might need to filter based on v.__module__.
I'm trying to use Python's PyYAML to create a custom tag that will allow me to retrieve environment variables with my YAML.
import os
import yaml
class EnvTag(yaml.YAMLObject):
yaml_tag = u'!Env'
def __init__(self, env_var):
self.env_var = env_var
def __repr__(self):
return os.environ.get(self.env_var)
settings_file = open('conf/defaults.yaml', 'r')
settings = yaml.load(settings_file)
And inside of defaults.yaml is simply:
example: !ENV foo
The error I keep getting:
yaml.constructor.ConstructorError:
could not determine a constructor for the tag '!ENV' in
"defaults.yaml", line 1, column 10
I plan to have more than one custom tag as well (assuming I can get this one working)
Your PyYAML class had a few problems:
yaml_tag is case sensitive, so !Env and !ENV are different tags.
So, as per the documentation, yaml.YAMLObject uses meta-classes to define itself, and has default to_yaml and from_yaml functions for those cases. By default, however, those functions require that your argument to your custom tag (in this case !ENV) be a mapping. So, to work with the default functions, your defaults.yaml file must look like this (just for example) instead:
example: !ENV {env_var: "PWD", test: "test"}
Your code will then work unchanged, in my case print(settings) now results in {'example': /home/Fred} But you're using load instead of safe_load -- in their answer below, Anthon pointed out that this is dangerous because the parsed YAML can overwrite/read data anywhere on the disk.
You can still easily use your YAML file format, example: !ENV foo—you just have to define an appropriate to_yaml and from_yaml in class EnvTag, ones that can parse and emit scalar variables like the string "foo".
So:
import os
import yaml
class EnvTag(yaml.YAMLObject):
yaml_tag = u'!ENV'
def __init__(self, env_var):
self.env_var = env_var
def __repr__(self):
v = os.environ.get(self.env_var) or ''
return 'EnvTag({}, contains={})'.format(self.env_var, v)
#classmethod
def from_yaml(cls, loader, node):
return EnvTag(node.value)
#classmethod
def to_yaml(cls, dumper, data):
return dumper.represent_scalar(cls.yaml_tag, data.env_var)
# Required for safe_load
yaml.SafeLoader.add_constructor('!ENV', EnvTag.from_yaml)
# Required for safe_dump
yaml.SafeDumper.add_multi_representer(EnvTag, EnvTag.to_yaml)
settings_file = open('defaults.yaml', 'r')
settings = yaml.safe_load(settings_file)
print(settings)
s = yaml.safe_dump(settings)
print(s)
When this program is run, it outputs:
{'example': EnvTag(foo, contains=)}
{example: !ENV 'foo'}
This code has the benefit of (1) using the original pyyaml, so nothing extra to install and (2) adding a representer. :)
I'd like to share how I resolved this as an addendum to the great answers above provided by Anthon and Fredrick Brennan. Thank you for your help.
In my opinion, the PyYAML document isn't real clear as to when you might want to add a constructor via a class (or "metaclass magic" as described in the doc), which may involve re-defining from_yaml and to_yaml, or simply adding a constructor using yaml.add_constructor.
In fact, the doc states:
You may define your own application-specific tags. The easiest way to do it is to define a subclass of yaml.YAMLObject
I would argue the opposite is true for simpler use-cases. Here's how I managed to implement my custom tag.
config/__init__.py
import yaml
import os
environment = os.environ.get('PYTHON_ENV', 'development')
def __env_constructor(loader, node):
value = loader.construct_scalar(node)
return os.environ.get(value)
yaml.add_constructor(u'!ENV', __env_constructor)
# Load and Parse Config
__defaults = open('config/defaults.yaml', 'r').read()
__env_config = open('config/%s.yaml' % environment, 'r').read()
__yaml_contents = ''.join([__defaults, __env_config])
__parsed_yaml = yaml.safe_load(__yaml_contents)
settings = __parsed_yaml[environment]
With this, I can now have a seperate yaml for each environment using an env PTYHON_ENV (default.yaml, development.yaml, test.yaml, production.yaml). And each can now reference ENV variables.
Example default.yaml:
defaults: &default
app:
host: '0.0.0.0'
port: 500
Example production.yaml:
production:
<<: *defaults
app:
host: !ENV APP_HOST
port: !ENV APP_PORT
To use:
from config import settings
"""
If PYTHON_ENV == 'production', prints value of APP_PORT
If PYTHON_ENV != 'production', prints default 5000
"""
print(settings['app']['port'])
If your goal is to find and replace environment variables (as strings) defined in your yaml file, you can use the following approach:
example.yaml:
foo: !ENV "Some string with ${VAR1} and ${VAR2}"
example.py:
import yaml
# Define the function that replaces your env vars
def env_var_replacement(loader, node):
replacements = {
'${VAR1}': 'foo',
'${VAR2}': 'bar',
}
s = node.value
for k, v in replacements.items():
s = s.replace(k, v)
return s
# Define a loader class that will contain your custom logic
class EnvLoader(yaml.SafeLoader):
pass
# Add the tag to your loader
EnvLoader.add_constructor('!ENV', env_var_replacement)
# Now, use your custom loader to load the file:
with open('example.yaml') as yaml_file:
loaded_dict = yaml.load(yaml_file, Loader=EnvLoader)
# Prints: "Some string with foo and bar"
print(loaded_dict['foo'])
It's worth noting, you don't necessarily need to create a custom EnvLoader class. You can call add_constructor directly on the SafeLoader class or the yaml module itself. However, this can have an unintended side-effect of adding your loader globally to all other modules that rely on those loaders, which could potentially cuase problems if those other modules have their own custom logic for loading that !ENV tag.
There are several problems with your code:
!Env in your YAML file is not the same as !ENV in your code.
You are missing the classmethod from_yaml that has to be provided for EnvTag.
Your YAML document specifies a scalar for !Env, but the subclassing mechanism for yaml.YAMLObject calls construct_yaml_object which in turn calls construct_mapping so a scalar is not allowed.
You are using .load(). This is unsafe, unless you have complete control over the YAML input, now and in the future. Unsafe in the sense that uncontrolled YAML can e.g. wipe or upload any information from your disc. PyYAML doesn't warn you for that possible loss.
PyYAML only supports most of YAML 1.1, the latest YAML specification is 1.2 (from 2009).
You should consistently indent your code at 4 spaces at every level (or 3 spaces, but not 4 at the first and 3 a the next level).
your __repr__ doesn't return a string if the environment variable is not set, which will throw an error.
So change your code to:
import sys
import os
from ruamel import yaml
yaml_str = """\
example: !Env foo
"""
class EnvTag:
yaml_tag = u'!Env'
def __init__(self, env_var):
self.env_var = env_var
def __repr__(self):
return os.environ.get(self.env_var, '')
#staticmethod
def yaml_constructor(loader, node):
return EnvTag(loader.construct_scalar(node))
yaml.add_constructor(EnvTag.yaml_tag, EnvTag.yaml_constructor,
constructor=yaml.SafeConstructor)
data = yaml.safe_load(yaml_str)
print(data)
os.environ['foo'] = 'Hello world!'
print(data)
which gives:
{'example': }
{'example': Hello world!}
Please note that I am using ruamel.yaml (disclaimer: I am the author of that package), so you can use YAML 1.2 (or 1.1) in your YAML file. With minor changes you can do the above with the old PyYAML as well.
You can do this by subclassing of YAMLObject as well, and in a safe way:
import sys
import os
from ruamel import yaml
yaml_str = """\
example: !Env foo
"""
yaml.YAMLObject.yaml_constructor = yaml.SafeConstructor
class EnvTag(yaml.YAMLObject):
yaml_tag = u'!Env'
def __init__(self, env_var):
self.env_var = env_var
def __repr__(self):
return os.environ.get(self.env_var, '')
#classmethod
def from_yaml(cls, loader, node):
return EnvTag(loader.construct_scalar(node))
data = yaml.safe_load(yaml_str)
print(data)
os.environ['foo'] = 'Hello world!'
print(data)
This will give you the same results as above.
I need to assign a module & class to a dictionary key. Then pickle that dictionary to file. Then later, load the pkl file, then import & instantiate the class, based on that dictionary key value.
I've tried this:
import module_example
from module_example import ClassExample
dictionary = {'module': module_example, 'class': ClassExample)
Yet it won't store a reference to module_exmaple.py in the pkl file.
I've tried a workaround of using a string instead of the module & class name. But that's going to lead to a mess if the module name gets refactored or location is changed down the road.
Is there anyway do this directly? Somehow store a reference to the module & class in a dictionary, then later import & instantiate based on that reference?
This works for single class. If you want to do this in multiple modules and classes, you can extend the following code.
module_class_writer.py
import module_example
from module_example import ClassExample
included_module = ["module_example"]
d = {}
for name, val in globals().items():
if name in included_module:
if "__module__" in dir(val):
d["module"] = val.__module__
d["class"] = name
#d = {'module': module_example, 'class': ClassExample}
import pickle
filehandler = open("imports.pkl","wb")
pickle.dump(d, filehandler)
filehandler.close()
module_class_reader.py
import pickle
filehandler = open("imports.pkl",'rb')
d = pickle.load(filehandler)
filehandler.close()
def reload_class(module_name, class_name):
mod = __import__(module_name, fromlist=[class_name])
reload(mod)
return getattr(mod, class_name)
if "class" in d and "module" in d:
reload(__import__(d["module"]))
ClassExample = reload_class(d["module"], d["class"])
If you want the unpickled class to be the exact same object that was pickled, you'd have to store the class's code (using an external library like dill, for example).
Otherwise, there is no way with standard pickle to store a reference to a class that will 'survive' a refactoring.
I have the following problem:
# testcase is a list object
i=0
for e in testcase:
testclass = e['class'] #unicode Object
properties = e['properties'] #dict Object
probertie = testcase[i]['properties']
tester = test.ScreenshotCounterTest.ScreenshotCounterTest(probertie)
#tester = InstanceClass(testclass,propertie)
tester.checkResults()
i=i+1
The unicode object testclass contains several unicode strings like test.ScreenshotCounterTest or FileExistTest
I need a way to initialize these Objects with the property argument
a way to dynamically create objects from this testclass list object.
Thanks for your help
If I get what you need, then this should help you:
import importlib
qual_name = "my.module.MyClass"
splitted = qual_name.split(".")
class_name = splitted[-1]
module_name = ".".join(splitted[:-1])
module_obj = importlib.import_module(module_name)
class_obj = getattr(module_obj, class_name)
class_inst = class_obj() # this is instance of my.module.MyClass; assuming that this class has no-argument constructor; in other case use constructor arguments here
importlib interface may have changed between p2.7 and p3k, I'm not sure. Also, there is (was?) __import__ built-in function, but I find importlib most reliable.
In your case testclass/e['class'] will be used instead of qual_name, and tester should be assigned like class_inst in this example.
I have a python module that I've made that contains regular function defintions as well as classes. For some reason when I call the constructor and pass a value, it's not updating the class variable.
My module is called VODControl (VODControl.py). The class I have declared inside the module is called DRMPath. The DRMPath class has two instance vairables: logfile and results. logfile is a string and results is a dictionary.
My constructor looks like this:
def __init__(self, file):
self.logilfe = file
self.results['GbE1'] = ""
self.results['GbE2'] = ""
self.results['NetCrypt'] = ""
self.results['QAM'] = ""
when I import it from my other python script I do:
import VODControl
The call i use is the following:
d = VODControl.DRMPath('/tmp/adk.log')
However, when I print the value of the logfile instance variable, it isn't updated with what I passed to the constructor:
print d.logfile
After printing, it's still an empty string. What gives?
self.logilfe = file is not the same as self.logfile = file In addition, it is likely returning None, not an empty string.