Keeping comments in ruamel.yaml - python

I'm trying to manipulate some YAML files using ruamel.yaml, notably removing specific keys. This seems to work, but in the process all comment lines and empty lines that follow the key, up to the next key, are also removed. Minimal example:
from ruamel.yaml import YAML
import sys
yaml_str = """\
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
foz: quz
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
"""
yaml = YAML(typ='rt')
data = yaml.load(yaml_str)
data.pop('foz', None)
yaml.dump(data, sys.stdout)
which gives:
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
opt1: foqz
Is there maybe a way to avoid this and only remove the key itself, and any inline comment?

The ruamel.yaml documentation states
This preservation is normally not broken unless you severely alter
the structure of a component (delete a key in a dict, remove list entries).
so this behavior should not have come as a surprise.
What you can do is extend the comment on the key before the one you delete with the comment
on the key you are going to delete. You can even do this afterwards: the comment is
still there, there is just no longer a key to associate it with while dumping.
import sys
import ruamel.yaml

yaml_str = """\
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
foz: quz
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
"""

undefined = object()

def my_pop(self, key, default=undefined):
    if key not in self:
        if default is undefined:
            raise KeyError(key)
        return default
    keys = list(self.keys())
    idx = keys.index(key)
    if key in self.ca.items:
        if idx == 0:
            raise NotImplementedError('cannot handle moving comment when popping the first key', key)
        prev = keys[idx-1]
        # pop the comment entry only once, so it can be merged or moved as a whole
        entry = self.ca.items.pop(key)
        comment = entry[2]
        if prev in self.ca.items:
            # append the comment lines to those already attached to the previous key
            self.ca.items[prev][2].value += comment.value
        else:
            self.ca.items[prev] = entry
    res = self.__getitem__(key)
    self.__delitem__(key)
    return res

ruamel.yaml.comments.CommentedMap.pop = my_pop

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
data.pop('foz', None)
yaml.dump(data, sys.stdout)
which gives:
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
If you need to be able to pop the first key in a commented map, then you need to inspect self.ca
and handle its comment attribute, which is somewhat more complicated.
As always when using this kind of internals, you should pin the version of ruamel.yaml that you are using.
These internals will change.

Related

Identify if yaml key is anchor or pointer

I use ruamel.yaml in order to parse YAML files and I'd like to identify if the key is the anchor itself or just a pointer. Given the following:
foo: &some_anchor
  bar: 1
baz: *some_anchor
I'd like to understand that foo is the actual anchor and baz is a pointer. From what I can see, there's an anchor property on the node (and also yaml_anchor method), but both baz and foo show that their anchor is some_anchor - meaning that I cannot differentiate.
How can I get this info?
Since PyYAML and ruamel.yaml load an alias node as a reference to the object loaded from the corresponding anchor node, you can traverse the object tree and check whether each node is a reference to a previously visited object.
The following is a simple example that only checks dictionaries.
from ruamel.yaml import YAML

root = YAML().load('''
foo: &some_anchor
  bar: 1
baz: *some_anchor
''')

dict_ids = set()

def visit(parent):
    if isinstance(parent, dict):
        i = id(parent)
        print(parent, ', is_alias:', i in dict_ids)
        dict_ids.add(i)
        for k, v in parent.items():
            visit(v)
    elif isinstance(parent, list):
        for e in parent:
            visit(e)

visit(root)
This will output the following.
ordereddict([('foo', ordereddict([('bar', 1)])), ('baz', ordereddict([('bar', 1)]))]) , is_alias: False
ordereddict([('bar', 1)]) , is_alias: False
ordereddict([('bar', 1)]) , is_alias: True
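The same idea can be extended to report where each shared object was reached. The following is only a minimal sketch on plain Python dicts and lists, with no YAML library involved: find_shared_nodes and the sample data are names made up for this illustration. Because an alias is loaded as a reference to the same object, the two paths that lead to it end up grouped together, while an equal but distinct dict is not reported.

```python
def find_shared_nodes(root):
    """Return a list of path groups, one group per dict/list reached more than once."""
    seen = {}  # id(node) -> list of paths under which the node was reached

    def walk(node, path):
        if not isinstance(node, (dict, list)):
            return
        paths = seen.setdefault(id(node), [])
        paths.append(path)
        if len(paths) > 1:
            return  # already visited: treat as an alias and do not recurse again
        items = node.items() if isinstance(node, dict) else enumerate(node)
        for key, value in items:
            walk(value, path + [key])

    walk(root, [])
    return [paths for paths in seen.values() if len(paths) > 1]


# Hypothetical data: 'foo' and 'baz' share one object, 'other' is merely equal to it.
shared = {'bar': 1}
doc = {'foo': shared, 'baz': shared, 'other': {'bar': 1}}
print(find_shared_nodes(doc))
```

The first path in each group is the one reached first, which corresponds to the position that would carry the anchor.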
In your example &some_anchor is the anchor for the single-element mapping bar: 1 and
*some_anchor is the alias. Writing that "foo is the actual anchor and baz is a pointer" is,
IMO, both incorrect terminology and confuses keys with their (anchored/aliased) values. If you had a YAML document:
- 3
- 5
- 9
- &some_anchor
  bar: 1
- 42
- *some_anchor
would you actually say, probably after carefully counting,
that "4 is the anchor and 6 is the pointer" (or 3 and 5, depending on
where you start counting)?
If you want to test if a key of a dict has a value that was an anchored node in YAML, or if that
value was an aliased node, you'll have to look at the value, and you'll find that the values
for keys foo and baz are the same Python data structure.
On dumping, which key's value gets the anchor and which key's (or keys') value(s) are dumped as an alias
is entirely determined
by which gets dumped first, as the YAML specification states that an anchor has to come before its use as an alias (an
anchor can come after an alias if it is re-defined).
As @relent95 describes, you should recursively walk over the
data structure you loaded (to see which key gets there first) and, in both ruamel.yaml and PyYAML, look at the id().
But for PyYAML that only works for complex data (dict, list, objects), as it throws away anchoring information and will
not find the same id() on e.g. an anchored integer value.
The alternative to using the id is to look at the actual anchor name that ruamel.yaml stores in the attribute/property anchor.
If you know up front that your YAML document is as simple as your example (anchored/aliased nodes are values for
the root-level mapping) you can do:
import sys
import ruamel.yaml

yaml_str = """\
foo: &some_anchor
  bar: 1
baz: *some_anchor
oof: 42
"""

def is_keys_value_anchor(key, data, verbose=0):
    anchor_found = set()
    for k, v in data.items():
        res = None
        try:
            anchor = v.anchor.value
            if anchor is not None:
                res = anchor not in anchor_found
                anchor_found.add(anchor)
        except AttributeError:
            pass
        if k == key:
            break
    if verbose > 0:
        print(f'key "{key}" {res}')
    return res

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
is_keys_value_anchor('foo', data, verbose=1)
is_keys_value_anchor('baz', data, verbose=1)
is_keys_value_anchor('oof', data, verbose=1)
which gives:
key "foo" True
key "baz" False
key "oof" None
But this is inefficient for root mappings with lots of keys, and won't find anchors/aliases that are nested deeply
in the document. A more generic approach is to recursively walk the data structure once and create a dict with
the anchor name as key and a list of "paths" as value, a path itself being a list of keys/indices with
which you can traverse the data structure starting at the root. The first path in the list is the anchor, the rest are aliases:
import sys
import ruamel.yaml

yaml_str = """\
foo: &some_anchor
  - bar: 1
  - klm: &anchored_num 42
baz:
  xyz:
  - *some_anchor
oof: [1, 2, c: 13, magic: [*anchored_num]]
"""

def find_anchor_alias_paths(data, path=None, res=None):
    def check_add_anchor(d, path, anchors):
        # returns False when an alias is found, to prevent recursing into a node twice.
        try:
            anchor = d.anchor.value
            if anchor is not None:
                tmp = anchors.setdefault(anchor, [])
                tmp.append(path)
                return len(tmp) == 1
        except AttributeError:
            pass
        return True

    if path is None:
        path = []
    if res is None:
        res = {}
    if isinstance(data, dict):
        for k, v in data.items():
            next_path = path.copy()
            next_path.append(k)
            if check_add_anchor(v, next_path, res):
                find_anchor_alias_paths(v, next_path, res)
    elif isinstance(data, list):
        for idx, elem in enumerate(data):
            next_path = path.copy()
            next_path.append(idx)
            if check_add_anchor(elem, next_path, res):
                find_anchor_alias_paths(elem, next_path, res)
    return res

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
anchor_alias_paths = find_anchor_alias_paths(data)
for anchor, paths in anchor_alias_paths.items():
    print(f'anchor: "{anchor}", anchor_path: {paths[0]}, alias_path(s): {paths[1:]}')
print('value for last anchor/alias found', data.mlget(paths[-1], list_ok=True))
which gives:
anchor: "some_anchor", anchor_path: ['foo'], alias_path(s): [['baz', 'xyz', 0]]
anchor: "anchored_num", anchor_path: ['foo', 1, 'klm'], alias_path(s): [['oof', 3, 'magic', 0]]
value for last anchor/alias found 42
You can then test the paths you are interested in against the values returned by find_anchor_alias_paths,
or the key against the final elements of such paths.
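Testing a path against that result could look like the sketch below. The helper name classify is hypothetical, and the dictionary is a literal copy of the result printed above, so the snippet runs without a YAML library.

```python
def classify(path, anchor_alias_paths):
    """Return ('anchor', name), ('alias', name) or (None, None) for a given path."""
    for name, paths in anchor_alias_paths.items():
        if path == paths[0]:
            return ('anchor', name)  # the first recorded path carries the anchor
        if path in paths[1:]:
            return ('alias', name)   # any later path is an alias
    return (None, None)


# Literal copy of the paths printed above
anchor_alias_paths = {
    'some_anchor': [['foo'], ['baz', 'xyz', 0]],
    'anchored_num': [['foo', 1, 'klm'], ['oof', 3, 'magic', 0]],
}

print(classify(['foo'], anchor_alias_paths))
print(classify(['baz', 'xyz', 0], anchor_alias_paths))
print(classify(['oof'], anchor_alias_paths))
```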

How to make a config file property unique with config parser?

How can I create a config file in which the Name property is unique?
Or is there a loop I can use to check the config file, so that a duplicated Name value raises an exception?
[NAME+TIMESTAMP]
Name=UNIQUE NAME
Property=something
Property1=something1
Property2=something2
[NAME+TIMESTAMP]
Name=UNIQUE NAME
Property=something
Property1=something1
Property2=something2
Because ConfigParser is dict-like in most respects, you can use the section names as keys to make sure that the sections are unique.
Here is a simple test to show it in action:
import configparser

class Container:
    configs = []  # keeps a list of all initialized objects.

    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            self.__setattr__(k, v)  # Sets each named attribute to their given value.
        Container.configs.append(self)

# Initializing some objects.
Container(
    Name="Test-Object1",
    Property1="Something",
    Property2="Something2",
    Property3="Something3",
)
Container(
    Name="Test-Object2",
    Property1="Something",
    Property2="Something2",
    Property3="Something3",
)
Container(
    Name="Test-Object2",
    Property1="Something Completely different",
    Property2="Something Completely different2",
    Property3="Something Completely different3",
)

config = configparser.ConfigParser()
for item in Container.configs:  # Loops through all the created objects.
    config[item.Name] = item.__dict__  # Adds all variables set on the object, using "Name" as the key.
with open("example.ini", "w") as ConfigFile:
    config.write(ConfigFile)
In the above example, I create three objects that contain variables to be set by configparser. However, the third object shares the Name variable with the second Object. That means that the third one will "overwrite" the second while writing the .ini file.
example.ini:
[Test-Object1]
name = Test-Object1
property1 = Something
property2 = Something2
property3 = Something3
[Test-Object2]
name = Test-Object2
property1 = Something Completely different
property2 = Something Completely different2
property3 = Something Completely different3
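If an exception is preferred over silent overwriting, the standard library already covers part of this. The sketch below relies on two things: a ConfigParser in its default strict mode raises DuplicateSectionError when a single source repeats a section, and a small helper (find_duplicate_names, a name invented here) can report a Name value that is shared by differently named sections. The section names and values are made up for the example.

```python
import configparser

# 1. Duplicate section headers: strict=True (the default) refuses them on read.
ini_text = """\
[job-1]
Name = UNIQUE NAME
Property = something

[job-1]
Name = UNIQUE NAME
Property = something else
"""

parser = configparser.ConfigParser()
try:
    parser.read_string(ini_text)
    duplicate = None
except configparser.DuplicateSectionError as exc:
    duplicate = exc.section
print(duplicate)


# 2. Duplicate Name values spread over differently named sections.
def find_duplicate_names(parser, option='Name'):
    """Map each repeated option value to the sections that use it."""
    seen = {}
    for section in parser.sections():
        if parser.has_option(section, option):
            seen.setdefault(parser.get(section, option), []).append(section)
    return {value: sections for value, sections in seen.items() if len(sections) > 1}


other = configparser.ConfigParser()
other.read_string("[a]\nName = X\n[b]\nName = X\n[c]\nName = Y\n")
print(find_duplicate_names(other))
```

A caller could raise its own exception whenever find_duplicate_names returns a non-empty dict.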

Python new section in config class

I am trying to write a dynamic config .ini file
to which I can add new sections with keys and values, and also keyless values.
I have written code that creates a .ini file, but the section comes out as 'default'.
It also overwrites the file every time instead of adding a new section.
The code is written in Python 3:
import configparser
"""Generates the configuration file with the config class.
The file is a .ini file"""
class Config:
    """Class for data in uuids.ini file management"""
    def __init__(self):
        self.config = configparser.ConfigParser()
        self.config_file = "conf.ini"
        # self.config.read(self.config_file)
    def wrt(self, config_name={}):
        condict = {
            "test": "testval",
            'test1': 'testval1',
            'test2': 'testval2'
        }
        for name, val in condict.items():
            self.config.set(config_name, name, val)
        #self.config.read(self.config_file)
        with open(self.config_file, 'w+') as out:
            self.config.write(out)
if __name__ == "__main__":
    Config().wrt()
I should be able to add new sections with or without keys,
and to append keys or values.
Each section should have a proper name.
Some problems with your code:
Using mutable objects as default parameters is
tricky and can lead to unexpected behavior.
You are using config.set(), which is legacy.
You are defaulting config_name to a dictionary, why?
Too much white space :p
You don't need to iterate through the dictionary items to write them when using the newer (non-legacy) mapping interface, as shown below.
This should work:
"""Generates the configuration file with the config class.
The file is a .ini file
"""
import configparser
import re

class Config:
    """Class for data in uuids.ini file management."""

    def __init__(self):
        self.config = configparser.ConfigParser()
        self.config_file = "conf.ini"
        # self.config.read(self.config_file)

    def wrt(self, config_name='DEFAULT', condict=None):
        if condict is None:
            # no data given: only register the (empty) section, it is written on the next call
            self.config.add_section(config_name)
            return
        self.config[config_name] = condict
        with open(self.config_file, 'w') as out:
            self.config.write(out)
        # after writing to file check if keys have no value in the ini file (e.g: key0 = )
        # the last character is '=', let us strip it off to only have the key
        with open(self.config_file) as out:
            ini_data = out.read()
        with open(self.config_file, 'w') as out:
            # count and flags are passed by keyword; passing them positionally is deprecated
            new_data = re.sub(r'^(.*?)=\s+$', r'\1', ini_data, count=0, flags=re.M)
            out.write(new_data)
            out.write('\n')

condict = {"test": "testval", 'test1': 'testval1', 'test2': 'testval2'}
c = Config()
c.wrt('my section', condict)
c.wrt('EMPTY')
c.wrt(condict={'key': 'val'})
c.wrt(config_name='NO_VALUE_SECTION', condict={'key0': '', 'key1': ''})
This outputs:
[DEFAULT]
key = val
[my section]
test = testval
test1 = testval1
test2 = testval2
[EMPTY]
[NO_VALUE_SECTION]
key1
key0
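The question also mentions that the file is overwritten on every run. A possible variation, sketched here as an alternative rather than as part of the answer above, is to read the existing file before writing and to let configparser handle bare keys through its allow_no_value option instead of post-processing with a regular expression. The function name add_section and the temporary path are assumptions made for this illustration.

```python
import configparser
import os
import tempfile


def add_section(path, name, values):
    """Add or replace one section while keeping the sections already in the file."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read(path)       # a missing file is silently skipped
    parser[name] = values   # a value of None is written as a bare key
    with open(path, 'w') as out:
        parser.write(out)


# Use a throwaway directory so the example does not touch a real conf.ini.
path = os.path.join(tempfile.mkdtemp(), 'conf.ini')
add_section(path, 'my section', {'test': 'testval', 'test1': 'testval1'})
add_section(path, 'NO_VALUE_SECTION', {'key0': None, 'key1': None})

check = configparser.ConfigParser(allow_no_value=True)
check.read(path)
print(check.sections())
print(dict(check['NO_VALUE_SECTION']))
```

Each call re-reads the file, so sections written by earlier calls (or earlier runs) survive.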

How to Parse YAML Using PyYAML if there are '!' within the YAML

I have a YAML file that I'd like to parse the description variable only; however, I know that the exclamation points in my CloudFormation template (YAML file) are giving PyYAML trouble.
I am receiving the following error:
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!Equals'
The file has many !Ref and !Equals. How can I ignore these constructors and get a specific variable I'm looking for -- in this case, the description variable.
If you have to deal with a YAML document with multiple different tags, and
are only interested in a subset of them, you should still
handle them all. If the elements you are interested in are nested
within other tagged constructs, you at least need to handle all of the "enclosing" tags
properly.
There is, however, no need to handle all of the tags individually: you
can write a constructor routine that can handle mappings, sequences
and scalars, and register it with PyYAML's SafeLoader using:
import yaml

inp = """\
MyEIP:
  Type: !Join [ "::", [AWS, EC2, EIP] ]
  Properties:
    InstanceId: !Ref MyEC2Instance
"""

description = []

def any_constructor(loader, tag_suffix, node):
    if isinstance(node, yaml.MappingNode):
        return loader.construct_mapping(node)
    if isinstance(node, yaml.SequenceNode):
        return loader.construct_sequence(node)
    return loader.construct_scalar(node)

yaml.add_multi_constructor('', any_constructor, Loader=yaml.SafeLoader)
data = yaml.safe_load(inp)
print(data)
which gives:
{'MyEIP': {'Type': ['::', ['AWS', 'EC2', 'EIP']], 'Properties': {'InstanceId': 'MyEC2Instance'}}}
(inp can also be a file opened for reading).
As you can see, the above will also continue to work if an unexpected !Join tag shows up in your document,
as well as any other tag like !Equals. The tags are just dropped.
Since there are no variables in YAML, it is a bit of guesswork what
you mean by "like to parse the description variable only". If that has
an explicit tag (e.g. !Description), you can filter out the values by adding 2-3 lines
to the any_constructor, by matching the tag_suffix parameter.
    if tag_suffix == u'!Description':
        description.append(loader.construct_scalar(node))
It is however more likely that there is some key in a mapping that is a scalar description,
and that you are interested in the value associated with that key.
    if isinstance(node, yaml.MappingNode):
        d = loader.construct_mapping(node)
        for k in d:
            if k == 'description':
                description.append(d[k])
        return d
If you know the exact position in the data hierarchy, you can of
course also walk the data structure and extract anything you need
based on keys or list positions. Especially in that case you'd be better off
using my ruamel.yaml, as this can load tagged YAML in round-trip mode without
extra effort (assuming the above inp):
from ruamel.yaml import YAML

with YAML() as yaml:
    data = yaml.load(inp)
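Once the document is loaded, walking the result for a given key can be sketched as a small recursive generator. It only assumes that the loaded data consists of (subclasses of) dict and list. The name find_key and the sample data, including its Description entries, are made up for this illustration.

```python
def find_key(node, wanted):
    """Yield every value stored under the key `wanted`, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == wanted:
                yield value
            yield from find_key(value, wanted)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, wanted)


# Hypothetical stand-in for a loaded CloudFormation template
loaded = {
    'Description': 'Example stack',
    'Resources': {
        'MyEIP': {
            'Type': ['::', ['AWS', 'EC2', 'EIP']],
            'Properties': {'InstanceId': 'MyEC2Instance'},
        },
        'Group': [{'Description': 'nested entry'}],
    },
}

print(list(find_key(loaded, 'Description')))
```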
You can define custom constructors using a custom yaml.SafeLoader subclass:
import yaml

doc = '''
Conditions:
  CreateNewSecurityGroup: !Equals [!Ref ExistingSecurityGroup, NONE]
'''

class Equals(object):
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "Equals(%s)" % self.data

class Ref(object):
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "Ref(%s)" % self.data

def create_equals(loader, node):
    value = loader.construct_sequence(node)
    return Equals(value)

def create_ref(loader, node):
    value = loader.construct_scalar(node)
    return Ref(value)

class Loader(yaml.SafeLoader):
    pass

yaml.add_constructor(u'!Equals', create_equals, Loader)
yaml.add_constructor(u'!Ref', create_ref, Loader)
a = yaml.load(doc, Loader)
print(a)
Outputs:
{'Conditions': {'CreateNewSecurityGroup': Equals([Ref(ExistingSecurityGroup), 'NONE'])}}

How we convert string into json

I want to convert an Ansible INI file into JSON. This is my input file and the code I use:
common_shared file:
[sql]
x.com
[yps_db]
y.com
[ems_db]
c.com
[scc_db]
d.com
[all:vars]
server_url="http://x.com/x"
app_host=abc.com
server_url="https://x.com"
[haproxy]
1.1.1.1 manual_hostname=abc instance_id=i-dddd
2.2.2.2 manual_hostname=xyz instance_id=i-cccc
For converting Ansible INI file in JSON:
import json

options = {}
f = open('common_shared')
x = f.read()
config_entries = x.split()
for key, value in zip(config_entries[0::2], config_entries[1::2]):
    cleaned_key = key.replace("[", '').replace("]", '')
    options[cleaned_key] = value
print(json.dumps(options, indent=4, ensure_ascii=False))
But it will print this result:
{
"scc_db": "xxx",
"haproxy": "x.x.x.x",
"manual_hostname=xxx": "instance_id=xx",
"ems_db": "xxx",
"yps_db": "xxx",
"all:vars": "yps_server_url=\"xxx\"",
"1.1.1.5": "manual_hostname=xxx",
"sql": "xxx",
"xxx": "scc_server_url=xxxx\""
}
But I wanted to print result in proper JSON format but not able to understand how. I tried config parser but didn't get help to print it in desired format.
You can use ConfigParser to read in your file, and then convert it to a dict to dump.
from configparser import ConfigParser  # the module is named ConfigParser on Python 2
from collections import defaultdict

config = ConfigParser()
with open('/path/to/file.ini') as f:
    config.read_file(f)  # read_file() replaces the older readfp()

def convert_to_dict(config):
    config_dict = defaultdict(dict)
    for section in config.sections():
        for key, value in config.items(section):
            config_dict[section][key] = value
    return config_dict

print(convert_to_dict(config))
EDIT
As you stated in your comment, some line items are just 'things' with no value; the below might work for you.
import re
from collections import defaultdict

# raw strings, so the backslashes reach the regex engine unchanged
SECTION_HEADER_RE = re.compile(r'^\[.*\]$')
KEY_VALUE_RE = re.compile(r'^.*=.*$')

def convert_ansible_to_dict(filepath_and_name):
    ansible_dict = defaultdict(dict)
    with open(filepath_and_name) as input_file:
        section_header = None
        for line in input_file:
            if SECTION_HEADER_RE.findall(line.strip()):
                # the header is stored including its square brackets
                section_header = SECTION_HEADER_RE.findall(line.strip())[0]
            elif KEY_VALUE_RE.findall(line.strip()):
                if section_header:
                    # Make sure you have had a header section prior to the line
                    key, value = KEY_VALUE_RE.findall(line.strip())[0].split('=', 1)
                    ansible_dict[section_header][key] = value
            else:
                if line.strip() and section_header:
                    # As they're just attributes without value, assign the value None
                    ansible_dict[section_header][line.strip()] = None
    return ansible_dict
This is a naive approach, and might not catch all corner cases for you, but perhaps it's a step in the right direction. If you have any 'bare-attributes' prior to your first section header, they will not be included in the dictionary as it would not know where to apportion it to, and the regex for key=value pairs is working on the assumption that there will be only 1 equals sign in the line. I'm sure there might be many other cases I'm not seeing right now, but hopefully this helps.
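To get from such a dictionary to the JSON the question asks for, json.dumps is the remaining step. The sketch below is a variation on the same line-by-line idea rather than the code above: it assumes that host lines may carry key=value variables after the host name, and that sections whose name ends in :vars hold plain key=value pairs. The function name inventory_to_dict and the shortened sample are invented for this illustration.

```python
import json
import shlex


def inventory_to_dict(text):
    """Parse a small Ansible-style INI inventory into nested dicts."""
    result = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(('#', ';')):
            continue  # skip blank lines and comments
        if line.startswith('[') and line.endswith(']'):
            section = line[1:-1]
            result[section] = {}
            continue
        if section is None:
            continue  # entries before the first header are ignored
        if section.endswith(':vars'):
            key, _, value = line.partition('=')
            result[section][key.strip()] = value.strip().strip('"')
        else:
            # the first token is the host, the rest are key=value host variables
            host, *pairs = shlex.split(line)
            result[section][host] = dict(p.partition('=')[::2] for p in pairs)
    return result


sample = """\
[sql]
x.com

[all:vars]
app_host=abc.com
server_url="https://x.com"

[haproxy]
1.1.1.1 manual_hostname=abc instance_id=i-dddd
"""

data = inventory_to_dict(sample)
print(json.dumps(data, indent=4))
```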
Christian's answer is the correct one: use ConfigParser.
Your issue with his solution is that you have an incorrectly formatted INI file.
You need to change all your properties to one of these forms:
key=value
key: value
e.g.
[sql]
aaaaaaa: true
https://wiki.python.org/moin/ConfigParserExamples
https://en.wikipedia.org/wiki/INI_file#Keys_.28properties.29
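If changing the file is not an option, Python 3's configparser can be made more tolerant through its constructor options. This is a sketch under that assumption, not a full inventory parser: bare host lines become keys with the value None, quotes around values are kept as-is, and the shortened sample below is made up for the example.

```python
import configparser
import json

inventory = """\
[sql]
x.com

[all:vars]
app_host=abc.com
server_url="https://x.com"
"""

parser = configparser.ConfigParser(
    allow_no_value=True,   # accept bare lines such as "x.com"
    delimiters=('=',),     # do not split on ':' inside hosts or URLs
    interpolation=None,    # do not give '%' in values a special meaning
)
parser.read_string(inventory)

as_dict = {section: dict(parser.items(section)) for section in parser.sections()}
print(json.dumps(as_dict, indent=4))
```

Keys without a value are emitted as null in the JSON output. A file that repeats an option within one section would additionally need strict=False.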
