Load YAML nested with Jinja2 in Python - python

I have a YAML file (all.yaml) that looks like:
...
var1: val1
var2: val2
var3: {{var1}}-{{var2}}.txt
...
If I load it in Python like this:
import yaml
f = open('all.yaml')
dataMap = yaml.safe_load(f)
f.close()
print(dataMap["var3"])
the output is {{var1}}-{{var2}}.txt and not val1-val2.txt.
Is it possible to replace the nested vars with the value?
I tried to load it with:
import jinja2
templateLoader = jinja2.FileSystemLoader( searchpath="/path/to/dir" )
templateEnv = jinja2.Environment( loader=templateLoader )
TEMPLATE_FILE = "all.yaml"
template = templateEnv.get_template( TEMPLATE_FILE )
The exception is no longer thrown, now I am stuck and have to research how to proceed.

First define an Undefined class and load yaml to get known values. Then load it again and render with known values.
#!/usr/bin/env python
import yaml
from jinja2 import Template, Undefined
str1 = '''var1: val1
var2: val2
var3: {{var1}}-{{var2}}.txt
'''
class NullUndefined(Undefined):
    def __getattr__(self, key):
        return ''
t = Template(str1, undefined=NullUndefined)
c = yaml.safe_load(t.render())
print(t.render(c))
Run it:
$ ./test.py
var1: val1
var2: val2
var3: val1-val2.txt

Here is one possible solution:
Parse your YAML document with the yaml module
Iterate over the keys in your YAML document, treating each value as a Jinja2 template to which you pass in the keys of the YAML document as parameters.
For example:
import yaml
from jinja2 import Template
with open('sample.yml') as fd:
    data = yaml.safe_load(fd)
for k, v in data.items():
    t = Template(v)
    data[k] = t.render(**data)
print(yaml.safe_dump(data, default_flow_style=False))
This will work fine with your particular example, but wouldn't do anything useful for, say, nested data structures (in fact, it would probably just blow up).
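For nested data, one option is to walk the structure recursively and render only the strings. The following is a minimal sketch, not part of the answer above: render_nested is a hypothetical helper, and the dictionary stands in for whatever yaml.safe_load() would return.

```python
from jinja2 import Template


def render_nested(node, context):
    """Recursively render every string found in nested dicts and lists."""
    if isinstance(node, dict):
        return {key: render_nested(value, context) for key, value in node.items()}
    if isinstance(node, list):
        return [render_nested(item, context) for item in node]
    if isinstance(node, str):
        return Template(node).render(**context)
    return node  # numbers, booleans and None are left untouched


# Hypothetical data, shaped like the result of yaml.safe_load()
data = {
    "var1": "val1",
    "var2": "val2",
    "files": {"name": "{{ var1 }}-{{ var2 }}.txt", "sizes": [1, 2]},
}
rendered = render_nested(data, data)
print(rendered["files"]["name"])
```

Only top-level keys are available as template variables here, and values that refer to other templated values are not resolved in a second pass.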

There is no replacement/substitution of scalar parts within the YAML specification.
Anything you want to do on that level has to be done in your application. For me, and for YAML, {{var1}} is just a nested mapping. {{var1}} is short for {{var1: null}: null}. After that the - is not allowed.
There are however multiple problems with your post:
You are using PyYAML, which only supports the old (2005) YAML 1.1. Therefore you cannot have multiple documents (i.e. ended with ...) without using an explicit document start (---), as you can in YAML 1.2.
Even if you correct the first line to read --- instead of ..., your file will not load as a dict: {{var1}} cannot be followed by a scalar - (from -{{var2}}.txt).
And if you just used {{var1}} in your file, PyYAML could not load it, because it loads YAML mappings as Python dicts and Python doesn't allow mutable keys for a dict. You get a TypeError in Python in the same way when you try to do: {dict(var1=None): None}
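The last point can be checked without any YAML library. A small sketch in plain Python (error_message is just a name chosen for the illustration):

```python
error_message = None
try:
    # A dict is mutable and therefore unhashable, so it cannot be a dict key
    {dict(var1=None): None}
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```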
So you should at least change your input file all.yaml to:
---
var1: val1
var2: val2
var3: '{{var1}}-{{var2}}.txt'
...
to get this to load in YAML.
You'll have to load this file two times:
once by PyYAML to get the values that you can use to render template
once as template by jinja2
After you render the template you load that (string) once more in PyYAML and you have the value that you want.
Given the corrected all.yaml as specified above in the current directory and this program:
import yaml
import jinja2
YAML_FILE = 'all.yaml'
with open(YAML_FILE) as fp:
    dataMap = yaml.safe_load(fp)
env = jinja2.Environment(loader=jinja2.FileSystemLoader(searchpath='.'))
template = env.get_template(YAML_FILE)
data = yaml.safe_load(template.render(**dataMap))
print(data["var3"])
will print what you wanted:
val1-val2.txt

I do not believe you can use:
yaml.load
or
yaml.safe_load
on a file containing Jinja2 variables as unquoted values. yaml will attempt to interpret {{variable}} as a dict.
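A quick way to see the difference between the unquoted and the quoted form, as a sketch assuming PyYAML is installed (the two input strings are made up for the illustration):

```python
import yaml

unquoted = "var3: {{var1}}\n"
quoted = "var3: '{{var1}}-{{var2}}.txt'\n"

try:
    yaml.safe_load(unquoted)
    unquoted_error = None
except yaml.YAMLError as exc:
    # The inner mapping would have to be used as a dict key, which fails
    unquoted_error = type(exc).__name__

# Quoted, the braces are just characters in a string scalar
loaded = yaml.safe_load(quoted)
print(unquoted_error, loaded)
```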

Related

How do I set environment variables in a cfg file that's going to be parsed by ConfigParser()? [duplicate]

I've been away from Python for a while, and I have a problem reading a .ini file with interpolated values.
This is my ini file:
[DEFAULT]
home=$HOME
test_home=$home
[test]
test_1=$test_home/foo.csv
test_2=$test_home/bar.csv
Those lines
from ConfigParser import SafeConfigParser
parser = SafeConfigParser()
parser.read('config.ini')
print parser.get('test', 'test_1')
does output
$test_home/foo.csv
while I'm expecting
/Users/nkint/foo.csv
EDIT:
I supposed that the $ syntax was implicitly included in the so called string interpolation (referring to the manual):
On top of the core functionality, SafeConfigParser supports
interpolation. This means values can contain format strings which
refer to other values in the same section, or values in a special
DEFAULT section.
But I'm wrong. How to handle this case?
First of all, according to the documentation, you should use %(test_home)s to interpolate test_home. Moreover, the keys are case-insensitive, so you can't use both HOME and home as keys. Finally, you can use SafeConfigParser(os.environ) to take your environment into account.
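For reference, here is a sketch of the same %(name)s interpolation with the Python 3 configparser module. A fixed dictionary stands in for os.environ so the result is predictable; the home value is hypothetical.

```python
import configparser

INI_TEXT = """
[DEFAULT]
test_home = %(home)s

[test]
test_1 = %(test_home)s/foo.csv
test_2 = %(test_home)s/bar.csv
"""

# Hypothetical default; the answer above passes os.environ here instead
parser = configparser.ConfigParser(defaults={"home": "/Users/nkint"})
parser.read_string(INI_TEXT)
test_1 = parser.get("test", "test_1")
print(test_1)
```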
from ConfigParser import SafeConfigParser
import os
parser = SafeConfigParser(os.environ)
parser.read('config.ini')
Where config.ini is
[DEFAULT]
test_home=%(HOME)s
[test]
test_1=%(test_home)s/foo.csv
test_2=%(test_home)s/bar.csv
You can write custom interpolation in case of Python 3:
import configparser
import os
class EnvInterpolation(configparser.BasicInterpolation):
    """Interpolation which expands environment variables in values."""
    def before_get(self, parser, section, option, value, defaults):
        value = super().before_get(parser, section, option, value, defaults)
        return os.path.expandvars(value)
cfg = """
[section1]
key = value
my_path = $PATH
"""
config = configparser.ConfigParser(interpolation=EnvInterpolation())
config.read_string(cfg)
print(config['section1']['my_path'])
If you want to expand some environment variables, you can do so using os.path.expandvars before parsing a StringIO stream:
import ConfigParser
import os
import StringIO
with open('config.ini', 'r') as cfg_file:
    cfg_txt = os.path.expandvars(cfg_file.read())
config = ConfigParser.ConfigParser()
config.readfp(StringIO.StringIO(cfg_txt))
The trick for proper variable substitution from the environment is to use the ${} syntax for the environment variables:
[DEFAULT]
test_home=${HOME}
[test]
test_1=%(test_home)s/foo.csv
test_2=%(test_home)s/bar.csv
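A Python 3 sketch of the same idea, reading the expanded text with read_string. DEMO_HOME is a hypothetical variable that is set here only so the example is self-contained.

```python
import configparser
import os

os.environ["DEMO_HOME"] = "/home/demo"  # hypothetical variable for this example

INI_TEXT = """
[DEFAULT]
test_home = ${DEMO_HOME}

[test]
test_1 = %(test_home)s/foo.csv
"""

config = configparser.ConfigParser()
# Expand ${...} from the environment first, then let configparser
# resolve the %(...)s references between options
config.read_string(os.path.expandvars(INI_TEXT))
test_1 = config.get("test", "test_1")
print(test_1)
```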
ConfigParser.get values are strings, even if you set values as integer or True. But ConfigParser has getint, getfloat and getboolean.
settings.ini
[default]
home=/home/user/app
tmp=%(home)s/tmp
log=%(home)s/log
sleep=10
debug=True
config reader
>>> from ConfigParser import SafeConfigParser
>>> parser = SafeConfigParser()
>>> parser.read('/home/user/app/settings.ini')
>>> parser.get('default', 'home')
'/home/user/app'
>>> parser.get('default', 'tmp')
'/home/user/app/tmp'
>>> parser.getint('default', 'sleep')
10
>>> parser.getboolean('default', 'debug')
True
Edit
Indeed, you can get values from environment variables if you initialize SafeConfigParser with os.environ. Thanks to Michele for that answer.
Quite late, but maybe it can help someone else looking for the same answers that I had recently. Also, one of the comments was how to fetch Environment variables and values from other sections. Here is how I deal with both converting environment variables and multi-section tags when reading in from an INI file.
INI FILE:
[PKG]
# <VARIABLE_NAME>=<VAR/PATH>
PKG_TAG = Q1_RC1
[DELIVERY_DIRS]
# <DIR_VARIABLE>=<PATH>
NEW_DELIVERY_DIR=${DEL_PATH}\ProjectName_${PKG:PKG_TAG}_DELIVERY
Python Class that uses the ExtendedInterpolation so that you can use the ${PKG:PKG_TAG} type formatting. I add the ability to convert the windows environment vars when I read in INI to a string using the builtin os.path.expandvars() function such as ${DEL_PATH} above.
import os
import configparser
from datetime import datetime
class ConfigParser(object):
    def __init__(self):
        """
        initialize the file parser with
        ExtendedInterpolation to use ${Section:option} format
        [Section]
        option=variable
        """
        # Use the module-qualified name, since this class is itself called ConfigParser
        self.config_parser = configparser.ConfigParser(
            interpolation=configparser.ExtendedInterpolation())
    def read_ini_file(self, file='./config.ini'):
        """
        Parses in the passed in INI file and converts any Windows environ vars.
        :param file: INI file to parse
        :return: void
        """
        # Expands Windows environment variable paths
        with open(file, 'r') as cfg_file:
            cfg_txt = os.path.expandvars(cfg_file.read())
        # Parses the expanded config string
        self.config_parser.read_string(cfg_txt)
    def get_config_items_by_section(self, section):
        """
        Retrieves the configurations for a particular section
        :param section: INI file section
        :return: a list of name, value pairs for the options in the section
        """
        return self.config_parser.items(section)
    def get_config_val(self, section, option):
        """
        Get an option value for the named section.
        :param section: INI section
        :param option: option tag for desired value
        :return: Value of option tag
        """
        return self.config_parser.get(section, option)
    @staticmethod
    def get_date():
        """
        Sets up a date formatted string.
        :return: Date string
        """
        return datetime.now().strftime("%Y%b%d")
    def prepend_date_to_var(self, sect, option):
        """
        Function that allows the ability to prepend a
        date to a section variable.
        :param sect: INI section to look for variable
        :param option: INI search variable under INI section
        :return: Void - Date is prepended to variable string in INI
        """
        if self.config_parser.get(sect, option):
            var = self.config_parser.get(sect, option)
            var_with_date = var + '_' + self.get_date()
            self.config_parser.set(sect, option, var_with_date)
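The cross-section part on its own can be sketched with the standard library alone. The /deliveries prefix below is a hypothetical stand-in for the ${DEL_PATH} environment variable in the INI file above.

```python
from configparser import ConfigParser, ExtendedInterpolation

INI_TEXT = """
[PKG]
PKG_TAG = Q1_RC1

[DELIVERY_DIRS]
NEW_DELIVERY_DIR = /deliveries/ProjectName_${PKG:PKG_TAG}_DELIVERY
"""

parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(INI_TEXT)
# ${PKG:PKG_TAG} is resolved from the PKG section when the value is read
delivery_dir = parser.get("DELIVERY_DIRS", "NEW_DELIVERY_DIR")
print(delivery_dir)
```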
Based on @alex-markov's answer (and code) and @srand9's comment, the following solution works with environment variables and cross-section references.
Note that the interpolation is now based on ExtendedInterpolation to allow cross-section references, and on before_read instead of before_get.
#!/usr/bin/env python3
import configparser
import os
class EnvInterpolation(configparser.ExtendedInterpolation):
    """Interpolation which expands environment variables in values."""
    def before_read(self, parser, section, option, value):
        value = super().before_read(parser, section, option, value)
        return os.path.expandvars(value)
cfg = """
[paths]
foo : ${HOME}
[section1]
key = value
my_path = ${paths:foo}/path
"""
config = configparser.ConfigParser(interpolation=EnvInterpolation())
config.read_string(cfg)
print(config['section1']['my_path'])
It seems that in version 3.5.0 ConfigParser was not reading the environment variables, so I ended up providing a custom interpolation based on BasicInterpolation.
import os
from configparser import (BasicInterpolation, InterpolationDepthError,
                          InterpolationMissingOptionError,
                          InterpolationSyntaxError, MAX_INTERPOLATION_DEPTH)
class EnvInterpolation(BasicInterpolation):
    """Interpolation as implemented in the classic ConfigParser,
    plus it checks if the variable is provided as an environment one in uppercase.
    """
    def _interpolate_some(self, parser, option, accum, rest, section, map,
                          depth):
        rawval = parser.get(section, option, raw=True, fallback=rest)
        if depth > MAX_INTERPOLATION_DEPTH:
            raise InterpolationDepthError(option, section, rawval)
        while rest:
            p = rest.find("%")
            if p < 0:
                accum.append(rest)
                return
            if p > 0:
                accum.append(rest[:p])
                rest = rest[p:]
            # p is no longer used
            c = rest[1:2]
            if c == "%":
                accum.append("%")
                rest = rest[2:]
            elif c == "(":
                m = self._KEYCRE.match(rest)
                if m is None:
                    raise InterpolationSyntaxError(option, section,
                        "bad interpolation variable reference %r" % rest)
                var = parser.optionxform(m.group(1))
                rest = rest[m.end():]
                try:
                    v = os.environ.get(var.upper())
                    if v is None:
                        v = map[var]
                except KeyError:
                    raise InterpolationMissingOptionError(option, section, rawval, var) from None
                if "%" in v:
                    self._interpolate_some(parser, option, accum, v,
                                           section, map, depth + 1)
                else:
                    accum.append(v)
            else:
                raise InterpolationSyntaxError(
                    option, section,
                    "'%%' must be followed by '%%' or '(', "
                    "found: %r" % (rest,))
The difference between the BasicInterpolation and the EnvInterpolation is in:
    v = os.environ.get(var.upper())
    if v is None:
        v = map[var]
where I try to find the var in the environment before checking in the map.
Below is a simple solution that
Can use default value if no environment variable is provided
Overrides variables with environment variables (if found)
needs no custom interpolation implementation
Example:
my_config.ini
[DEFAULT]
HOST=http://www.example.com
CONTEXT=${HOST}/auth/
token_url=${CONTEXT}/oauth2/token
ConfigParser:
import os
import configparser
config = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
ini_file = os.path.join(os.path.dirname(__file__), 'my_config.ini')
# replace variables with environment variables(if exists) before loading ini file
with open(ini_file, 'r') as cfg_file:
    cfg_env_txt = os.path.expandvars(cfg_file.read())
config.read_string(cfg_env_txt)
print(config['DEFAULT']['token_url'])
Output:
If no environment variable $HOST or $CONTEXT is present, this config takes the default value.
The user can override the default value by creating a $HOST or $CONTEXT environment variable.
This works well with Docker containers.
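Both cases can be sketched in one script. The option names DEMO_HOST and DEMO_CONTEXT and the helper load_token_url are made up for the illustration, to avoid clashing with real environment variables.

```python
import configparser
import os

INI_TEXT = """
[DEFAULT]
DEMO_HOST = http://www.example.com
DEMO_CONTEXT = ${DEMO_HOST}/auth
token_url = ${DEMO_CONTEXT}/oauth2/token
"""


def load_token_url(ini_text):
    # Environment variables that exist are substituted first; unknown
    # ${...} references are left alone for ExtendedInterpolation to resolve
    expanded = os.path.expandvars(ini_text)
    config = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation())
    config.read_string(expanded)
    return config["DEFAULT"]["token_url"]


# Case 1: no environment variable, so the defaults from the file are used
os.environ.pop("DEMO_HOST", None)
os.environ.pop("DEMO_CONTEXT", None)
default_url = load_token_url(INI_TEXT)

# Case 2: the environment variable overrides the default host
os.environ["DEMO_HOST"] = "http://staging.example.com"
override_url = load_token_url(INI_TEXT)
del os.environ["DEMO_HOST"]

print(default_url)
print(override_url)
```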

Loading and dumping multiple yaml files with ruamel.yaml (python)

Using python 2 (atm) and ruamel.yaml 0.13.14 (RedHat EPEL)
I'm currently writing some code to load yaml definitions, but they are split up in multiple files. The user-editable part contains eg.
users:
  xxxx1:
    timestamp: '2018-10-22 11:38:28.541810'
    << : *userdefaults
  xxxx2:
    << : *userdefaults
    timestamp: '2018-10-22 11:38:28.541810'
the defaults are stored in another file, which is not editable:
userdefaults: &userdefaults
  # Default values for user settings
  fileCountQuota: 1000
  diskSizeQuota: "300g"
I can process these together by loading both, concatenating the strings, and then running them through merged_data = list(yaml.load_all("{}\n{}".format(defaults_data, user_data), Loader=yaml.RoundTripLoader)), which correctly resolves everything. (When not using RoundTripLoader I get errors that the references cannot be resolved, which is normal.)
Now, I want to do some updates via Python code (e.g. update the timestamp), and for that I need to write back just the user part. And that's where things get hairy. So far I haven't found a way to write just that YAML document, not both.
First of all, unless there are multiple documents in your defaults file, you
don't have to use load_all, as you don't concatenate two documents into a
multiple-document stream. If you had, by using a format string with a document-end
marker ("{}\n...\n{}") or with a directives-end marker ("{}\n---\n{}"),
your aliases would not carry over from one document to another, as per the
YAML specification:
It is an error for an alias node to use an anchor that does not
previously occur in the document.
The anchor has to be in the document, not just in the stream (which can consist of multiple
documents).
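This scoping rule is easy to reproduce. The sketch below uses PyYAML's safe loader rather than ruamel.yaml, with two made-up inputs that differ only in the --- marker:

```python
import yaml

single_document = "defaults: &d {quota: 1000}\nuser:\n  <<: *d\n"
two_documents = "defaults: &d {quota: 1000}\n---\nuser:\n  <<: *d\n"

# Anchor and alias live in the same document: the merge key resolves
merged = yaml.safe_load(single_document)

try:
    # The alias now sits in a second document, where the anchor is unknown
    list(yaml.safe_load_all(two_documents))
    alias_error = None
except yaml.YAMLError as exc:
    alias_error = type(exc).__name__

print(merged["user"], alias_error)
```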
I tried some hocus pocus, pre-populating the already represented dictionary
of anchored nodes:
import sys
import datetime
from ruamel import yaml
def load():
    with open('defaults.yaml') as fp:
        defaults_data = fp.read()
    with open('user.yaml') as fp:
        user_data = fp.read()
    merged_data = yaml.load("{}\n{}".format(defaults_data, user_data),
                            Loader=yaml.RoundTripLoader)
    return merged_data
class MyRTDGen(object):
    class MyRTD(yaml.RoundTripDumper):
        def __init__(self, *args, **kw):
            pps = kw.pop('pre_populate', None)
            yaml.RoundTripDumper.__init__(self, *args, **kw)
            if pps is not None:
                for pp in pps:
                    try:
                        anchor = pp.yaml_anchor()
                    except AttributeError:
                        anchor = None
                    node = yaml.nodes.MappingNode(
                        u'tag:yaml.org,2002:map', [], flow_style=None, anchor=anchor)
                    self.represented_objects[id(pp)] = node
    def __init__(self, pre_populate=None):
        assert isinstance(pre_populate, list)
        self._pre_populate = pre_populate
    def __call__(self, *args, **kw):
        kw1 = kw.copy()
        kw1['pre_populate'] = self._pre_populate
        myrtd = self.MyRTD(*args, **kw1)
        return myrtd
def update(md, file_name):
    ud = md.pop('userdefaults')
    MyRTD = MyRTDGen([ud])
    yaml.dump(md, sys.stdout, Dumper=MyRTD)
    with open(file_name, 'w') as fp:
        yaml.dump(md, fp, Dumper=MyRTD)
md = load()
md['users']['xxxx2']['timestamp'] = str(datetime.datetime.utcnow())
update(md, 'user.yaml')
Since the PyYAML based API requires a class instead of an object, you need to
use a class generator, that actually adds the data elements to pre-populate on
the fly from within yaml.load().
But this doesn't work, as a node only gets written out with an anchor once it is
determined that the anchor is used (i.e. there is a second reference). So actually the
first merge key gets written out as an anchor. And although I am quite familiar
with the code base, I could not get this to work properly in a reasonable amount of time.
So instead, I would just rely on the fact that there is only one key that matches
the first key of users.yaml at the root level of the dump of the combined updated
file and strip anything before that.
import sys
import datetime
from ruamel import yaml
with open('defaults.yaml') as fp:
    defaults_data = fp.read()
with open('user.yaml') as fp:
    user_data = fp.read()
merged_data = yaml.load("{}\n{}".format(defaults_data, user_data),
                        Loader=yaml.RoundTripLoader)
# find the key
for line in user_data.splitlines():
    line = line.split('# ')[0].rstrip()  # end of line comment, not checking for strings
    if line and line[-1] == ':' and line[0] != ' ':
        split_key = line
        break
merged_data['users']['xxxx2']['timestamp'] = str(datetime.datetime.utcnow())
buf = yaml.compat.StringIO()
yaml.dump(merged_data, buf, Dumper=yaml.RoundTripDumper)
document = split_key + buf.getvalue().split('\n' + split_key)[1]
sys.stdout.write(document)
which gives:
users:
  xxxx1:
    <<: *userdefaults
    timestamp: '2018-10-22 11:38:28.541810'
  xxxx2:
    <<: *userdefaults
    timestamp: '2018-10-23 09:59:13.829978'
I had to make a virtualenv to make sure I could run the above with ruamel.yaml==0.13.14.
That version is from the time I was still young (I won't claim to have been innocent).
There have been over 85 releases of the library since then.
I can understand that you might not be able to run anything but
Python 2 at the moment and cannot compile/use a newer version. But what
you really should do is install virtualenv (can be done using EPEL, but also without
further "polluting" your system installation), make a virtualenv for the
code you are developing and install the latest version of ruamel.yaml (and
your other libraries) in there. You can also do that if you need
to distribute your software to other systems, just install virtualenv there as well.
I have all my utilities under /opt/util, managed with virtualenvutils, a
wrapper around virtualenv.
To write the user part, you will have to manually split the multi-file output of yaml.dump() and write the appropriate part back to the users YAML file.
import datetime
import StringIO
import ruamel.yaml
yaml = ruamel.yaml.YAML(typ='rt')
data = None
with open('defaults.yaml', 'r') as defaults:
    with open('users.yaml', 'r') as users:
        raw = "{}\n{}".format(''.join(defaults.readlines()), ''.join(users.readlines()))
data = list(yaml.load_all(raw))
data[0]['users']['xxxx1']['timestamp'] = datetime.datetime.now().isoformat()
with open('users.yaml', 'w') as outfile:
    sio = StringIO.StringIO()
    yaml.dump(data[0], sio)
    out = sio.getvalue()
    outfile.write(out.split('\n\n')[1]) # write the second part here as this is the contents of users.yaml

How to get data from a yml file, to then return as a dictionary

I have a python function that will open a YAML file and read the data. The YAML file contains two api keys and a domain. I want to return each value in a dictionary so it can be used in the program. However I get the error
"list indices must be integers, not str".
Should I just make the variables global, so it doesn't have to return anything?
The code is:
def ImportConfig():
    with open("config.yml", 'r') as ymlfile:
        config = yaml.load(ymlfile)
    darksky_api = config['darksky']['api_key']
    gmaps_api = ['gmaps']['api_key']
    gmaps_domain = ['gmaps']['domain']
    return {'darksky_api_key': darksky_api, 'gmaps_api_key': gmaps_api, 'gmaps_domain': gmaps_domain }
What does it mean that the list indices must be integers? I thought curly brackets indicated a dictionary? Also is there a better way to do this?
Independent of your YAML file, if you type ['xy'] at the Python prompt you create a list with one element, and if you then index that with another string:
['xy']['abc']
you'll get that error.
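The same thing as a small runnable sketch; the config dictionary and its key are hypothetical stand-ins for the loaded YAML:

```python
config = {"gmaps": {"api_key": "hypothetical-key"}}

key_list = ["gmaps"]  # square brackets alone create a list, not a lookup
try:
    key_list["api_key"]  # indexing a list with a string
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# With the dictionary in front, both subscripts are key lookups
api_key = config["gmaps"]["api_key"]
print(list_error)
print(api_key)
```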
You are missing config in lines 5 and 6 of your program:
def ImportConfig():
    with open("config.yml", 'r') as ymlfile:
        config = yaml.safe_load(ymlfile)
    darksky_api = config['darksky']['api_key']
    gmaps_api = config['gmaps']['api_key']
    gmaps_domain = config['gmaps']['domain']
    return {'darksky_api_key': darksky_api, 'gmaps_api_key': gmaps_api, 'gmaps_domain': gmaps_domain }
Please note that using load in PyYAML is a security risk, and for your data you should use safe_load().
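To illustrate what safe_load() protects against, here is a sketch assuming PyYAML is installed. The input is a made-up document that asks the loader to call a Python function:

```python
import yaml

# A document that tries to make the loader call os.getcwd()
untrusted = "!!python/object/apply:os.getcwd []\n"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    # The safe loader has no constructor for python/* tags
    rejected = True

print(rejected)
```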

Creating Custom Tag in PyYAML

I'm trying to use Python's PyYAML to create a custom tag that will allow me to retrieve environment variables with my YAML.
import os
import yaml
class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!Env'
    def __init__(self, env_var):
        self.env_var = env_var
    def __repr__(self):
        return os.environ.get(self.env_var)
settings_file = open('conf/defaults.yaml', 'r')
settings = yaml.load(settings_file)
And inside of defaults.yaml is simply:
example: !ENV foo
The error I keep getting:
yaml.constructor.ConstructorError:
could not determine a constructor for the tag '!ENV' in
"defaults.yaml", line 1, column 10
I plan to have more than one custom tag as well (assuming I can get this one working)
Your PyYAML class had a few problems:
yaml_tag is case sensitive, so !Env and !ENV are different tags.
So, as per the documentation, yaml.YAMLObject uses meta-classes to define itself, and has default to_yaml and from_yaml functions for those cases. By default, however, those functions require that your argument to your custom tag (in this case !ENV) be a mapping. So, to work with the default functions, your defaults.yaml file must look like this (just for example) instead:
example: !ENV {env_var: "PWD", test: "test"}
Your code will then work unchanged, in my case print(settings) now results in {'example': /home/Fred} But you're using load instead of safe_load -- in their answer below, Anthon pointed out that this is dangerous because the parsed YAML can overwrite/read data anywhere on the disk.
You can still easily use your YAML file format, example: !ENV foo—you just have to define an appropriate to_yaml and from_yaml in class EnvTag, ones that can parse and emit scalar variables like the string "foo".
So:
import os
import yaml
class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!ENV'
    def __init__(self, env_var):
        self.env_var = env_var
    def __repr__(self):
        v = os.environ.get(self.env_var) or ''
        return 'EnvTag({}, contains={})'.format(self.env_var, v)
    @classmethod
    def from_yaml(cls, loader, node):
        return EnvTag(node.value)
    @classmethod
    def to_yaml(cls, dumper, data):
        return dumper.represent_scalar(cls.yaml_tag, data.env_var)
# Required for safe_load
yaml.SafeLoader.add_constructor('!ENV', EnvTag.from_yaml)
# Required for safe_dump
yaml.SafeDumper.add_multi_representer(EnvTag, EnvTag.to_yaml)
settings_file = open('defaults.yaml', 'r')
settings = yaml.safe_load(settings_file)
print(settings)
s = yaml.safe_dump(settings)
print(s)
When this program is run, it outputs:
{'example': EnvTag(foo, contains=)}
{example: !ENV 'foo'}
This code has the benefit of (1) using the original pyyaml, so nothing extra to install and (2) adding a representer. :)
I'd like to share how I resolved this as an addendum to the great answers above provided by Anthon and Fredrick Brennan. Thank you for your help.
In my opinion, the PyYAML document isn't real clear as to when you might want to add a constructor via a class (or "metaclass magic" as described in the doc), which may involve re-defining from_yaml and to_yaml, or simply adding a constructor using yaml.add_constructor.
In fact, the doc states:
You may define your own application-specific tags. The easiest way to do it is to define a subclass of yaml.YAMLObject
I would argue the opposite is true for simpler use-cases. Here's how I managed to implement my custom tag.
config/__init__.py
import yaml
import os
environment = os.environ.get('PYTHON_ENV', 'development')
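def __env_constructor(loader, node):
    value = loader.construct_scalar(node)
    return os.environ.get(value)
# Register on SafeLoader, because the config is parsed with yaml.safe_load below
yaml.add_constructor(u'!ENV', __env_constructor, Loader=yaml.SafeLoader)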
# Load and Parse Config
__defaults = open('config/defaults.yaml', 'r').read()
__env_config = open('config/%s.yaml' % environment, 'r').read()
__yaml_contents = ''.join([__defaults, __env_config])
__parsed_yaml = yaml.safe_load(__yaml_contents)
settings = __parsed_yaml[environment]
With this, I can now have a separate YAML file for each environment, selected by the environment variable PYTHON_ENV (defaults.yaml, development.yaml, test.yaml, production.yaml). Each can now reference environment variables.
Example defaults.yaml:
defaults: &defaults
  app:
    host: '0.0.0.0'
    port: 500
Example production.yaml:
production:
  <<: *defaults
  app:
    host: !ENV APP_HOST
    port: !ENV APP_PORT
To use:
from config import settings
"""
If PYTHON_ENV == 'production', prints value of APP_PORT
If PYTHON_ENV != 'production', prints the default port
"""
print(settings['app']['port'])
If your goal is to find and replace environment variables (as strings) defined in your yaml file, you can use the following approach:
example.yaml:
foo: !ENV "Some string with ${VAR1} and ${VAR2}"
example.py:
import yaml
# Define the function that replaces your env vars
def env_var_replacement(loader, node):
    replacements = {
        '${VAR1}': 'foo',
        '${VAR2}': 'bar',
    }
    s = node.value
    for k, v in replacements.items():
        s = s.replace(k, v)
    return s
# Define a loader class that will contain your custom logic
class EnvLoader(yaml.SafeLoader):
    pass
# Add the tag to your loader
EnvLoader.add_constructor('!ENV', env_var_replacement)
# Now, use your custom loader to load the file:
with open('example.yaml') as yaml_file:
    loaded_dict = yaml.load(yaml_file, Loader=EnvLoader)
# Prints: "Some string with foo and bar"
print(loaded_dict['foo'])
It's worth noting that you don't necessarily need to create a custom EnvLoader class. You can call add_constructor directly on the SafeLoader class or on the yaml module itself. However, this can have the unintended side effect of adding your constructor globally for all other modules that rely on those loaders, which could cause problems if those modules have their own custom logic for loading that !ENV tag.
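The isolation can be checked with a short sketch, assuming PyYAML is installed. Here the constructor expands real environment variables with os.path.expandvars instead of a fixed table; DEMO_VAR1 and expand_env are names made up for the illustration.

```python
import os
import yaml


class EnvLoader(yaml.SafeLoader):
    """Loader subclass, so the !ENV tag is not registered globally."""


def expand_env(loader, node):
    # Replace ${NAME} references in the scalar with environment values
    return os.path.expandvars(loader.construct_scalar(node))


EnvLoader.add_constructor("!ENV", expand_env)

os.environ["DEMO_VAR1"] = "foo"  # hypothetical variable for this example
text = 'foo: !ENV "Some string with ${DEMO_VAR1}"\n'

loaded = yaml.load(text, Loader=EnvLoader)

try:
    yaml.safe_load(text)  # the plain SafeLoader was not modified
    leaked = True
except yaml.YAMLError:
    leaked = False

print(loaded["foo"], leaked)
```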
There are several problems with your code:
The tag !ENV in your YAML file is not the same as the yaml_tag !Env in your code.
You are missing the classmethod from_yaml that has to be provided for EnvTag.
Your YAML document specifies a scalar for the tag, but the subclassing mechanism for yaml.YAMLObject calls construct_yaml_object, which in turn calls construct_mapping, so a scalar is not allowed.
You are using .load(). This is unsafe, unless you have complete control over the YAML input, now and in the future. Unsafe in the sense that uncontrolled YAML can e.g. wipe or upload any information from your disc. PyYAML doesn't warn you about that possible loss.
PyYAML only supports most of YAML 1.1; the latest YAML specification is 1.2 (from 2009).
You should consistently indent your code at 4 spaces at every level (or 3 spaces, but not 4 at the first and 3 at the next level).
Your __repr__ doesn't return a string if the environment variable is not set, which will throw an error.
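The last point can be reproduced without YAML at all. A sketch with two hypothetical classes, one returning None from __repr__ and one falling back to an empty string:

```python
import os


class BrokenEnvTag:
    def __init__(self, env_var):
        self.env_var = env_var

    def __repr__(self):
        return os.environ.get(self.env_var)  # None when the variable is unset


class SafeEnvTag(BrokenEnvTag):
    def __repr__(self):
        return os.environ.get(self.env_var, '')  # always a string


os.environ.pop("DEMO_UNSET_VAR", None)  # make sure the variable is not set

try:
    repr(BrokenEnvTag("DEMO_UNSET_VAR"))
    repr_error = None
except TypeError as exc:
    repr_error = str(exc)

safe_repr = repr(SafeEnvTag("DEMO_UNSET_VAR"))
print(repr_error)
```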
So change your code to:
import sys
import os
from ruamel import yaml
yaml_str = """\
example: !Env foo
"""
class EnvTag:
    yaml_tag = u'!Env'
    def __init__(self, env_var):
        self.env_var = env_var
    def __repr__(self):
        return os.environ.get(self.env_var, '')
    @staticmethod
    def yaml_constructor(loader, node):
        return EnvTag(loader.construct_scalar(node))
yaml.add_constructor(EnvTag.yaml_tag, EnvTag.yaml_constructor,
                     constructor=yaml.SafeConstructor)
data = yaml.safe_load(yaml_str)
print(data)
os.environ['foo'] = 'Hello world!'
print(data)
which gives:
{'example': }
{'example': Hello world!}
Please note that I am using ruamel.yaml (disclaimer: I am the author of that package), so you can use YAML 1.2 (or 1.1) in your YAML file. With minor changes you can do the above with the old PyYAML as well.
You can do this by subclassing of YAMLObject as well, and in a safe way:
import sys
import os
from ruamel import yaml
yaml_str = """\
example: !Env foo
"""
yaml.YAMLObject.yaml_constructor = yaml.SafeConstructor
class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!Env'
    def __init__(self, env_var):
        self.env_var = env_var
    def __repr__(self):
        return os.environ.get(self.env_var, '')
    @classmethod
    def from_yaml(cls, loader, node):
        return EnvTag(loader.construct_scalar(node))
data = yaml.safe_load(yaml_str)
print(data)
os.environ['foo'] = 'Hello world!'
print(data)
This will give you the same results as above.

Dumping Collection to YAML file with PyYaml

I am writing a python application. I am trying to dump my python object into yaml using PyYaml. I am using Python 2.6 and running Ubuntu Lucid 10.04. I am using the PyYAML package in Ubuntu Package: http://packages.ubuntu.com/lucid/python/python-yaml
My object has 3 text variables and a list of objects. Roughly it is something like this:
ClassToDump:
#3 text variables
text_variable_1
text_variable_2
text_variable_3
#a list of AnotherObjectsClass instances
list_of_another_objects = [object1,object2,object3]
AnotherObjectsClass:
text_variable_1
text_variable_2
text_variable_3
The class that I want to dump contains a list of AnotherObjectClass instances. This class has a few text variables.
PyYaml somehow does not dump the collections in AnotherObjectClass instance. PyYAML does dump text_variable_1, text_variable_2, and text_variable_3.
I am using the following pyYaml API to dump ClassToDump instance:
classToDump = ClassToDump();
yaml.dump(ClassToDump,yaml_file_to_dump)
Does anyone have any experience with dumping a list of objects into YAML?
Here is the actual full code snippet:
def write_config(file_path,class_to_dump):
    config_file = open(file_path,'w');
    yaml.dump(class_to_dump,config_file);
def dump_objects():
    rule = Miranda.Rule();
    rule.rule_condition = Miranda.ALL
    rule.rule_setting = ruleSetting
    rule.rule_subjects.append(rule1)
    rule.rule_subjects.append(rule2)
    rule.rule_verb = ruleVerb
    write_config(rule ,'./config.yaml');
This is the output :
!!python/object:Miranda.Rule
rule_condition: ALL
rule_setting: !!python/object:Miranda.RuleSetting {confirm_action: true, description: My
Configuration, enabled: true, recursive: true, source_folder: source_folder}
rule_verb: !!python/object:Miranda.RuleVerb {compression: true, dest_folder: /home/zainul/Downloads,
type: Move File}
The PyYAML module takes care of the details for you. Hopefully the following snippet will help:
import sys
import yaml
class AnotherClass:
    def __init__(self):
        pass
class MyClass:
    def __init__(self):
        self.text_variable_1 = 'hello'
        self.text_variable_2 = 'world'
        self.text_variable_3 = 'foobar'
        self.list_of_another_objects = [
            AnotherClass(),
            AnotherClass(),
            AnotherClass()
        ]
obj = MyClass()
yaml.dump(obj, sys.stdout)
The output of that code is:
!!python/object:__main__.MyClass
list_of_another_objects:
- !!python/object:__main__.AnotherClass {}
- !!python/object:__main__.AnotherClass {}
- !!python/object:__main__.AnotherClass {}
text_variable_1: hello
text_variable_2: world
text_variable_3: foobar
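If the !!python/object tags are not wanted in the file, one alternative is to convert the objects to plain dicts and lists first and use safe_dump. This is only a sketch: to_plain is a hypothetical helper, and the classes are cut-down versions of the ones above.

```python
import yaml


class AnotherClass:
    def __init__(self, name):
        self.name = name


class MyClass:
    def __init__(self):
        self.text_variable_1 = 'hello'
        self.list_of_another_objects = [AnotherClass('a'), AnotherClass('b')]


def to_plain(obj):
    """Convert an object tree into dicts and lists that safe_dump accepts."""
    if isinstance(obj, list):
        return [to_plain(item) for item in obj]
    if hasattr(obj, '__dict__'):
        return {key: to_plain(value) for key, value in vars(obj).items()}
    return obj


plain = to_plain(MyClass())
dumped = yaml.safe_dump(plain, default_flow_style=False)
print(dumped)
```

The result can be read back with safe_load, but it comes back as dicts and lists rather than as instances of the original classes.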
