I'm using ruamel.yaml to insert some values into YAML files that I already have.
I am able to insert new items into the YAML file, but I cannot insert comments.
This is what I'm trying to achieve:
Resources:
  Statement:
  - Sid: Sid2
    Resource:
    # 1 - Account 1
    - item1
    - item2 # New Comment
This is the Python code I'm using, but I'm not able to insert item2 with its comment.
import sys
from ruamel.yaml import YAML
yaml_doc = """\
Resources:
  Statement:
  - Sid: Sid2
    Resource:
    # 1 - Account 1
    - item1
"""
yaml = YAML()
data = yaml.load(yaml_doc)
data['Resources']['Statement'][0]['Resource'].append("item2")
data['Resources']['Statement'][0]['Resource'].yaml_add_eol_comment('New Comment', 'item2', column=0)
yaml.dump(data, sys.stdout)
Can I get some help on how to add the comment in line with the newly added item?
Thanks
I'm trying to manipulate some YAML files using ruamel.yaml, notably removing specific keys. This seems to work, but in the process all comment lines and empty lines that follow the key, up to the next key, are also removed. Minimal example:
from ruamel.yaml import YAML
import sys
yaml_str = """\
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
foz: quz
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
"""
yaml = YAML(typ='rt')
data = yaml.load(yaml_str)
data.pop('foz', None)
yaml.dump(data, sys.stdout)
which gives:
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
opt1: foqz
Is there maybe a way to avoid this and only remove the key itself, and any inline comment?
The ruamel.yaml documentation states
This preservation is normally not broken unless you severely alter
the structure of a component (delete a key in a dict, remove list entries).
so this behavior should not have come as a surprise.
What you can do is extend the comment on the key before the one you delete with the comment on the key
you are going to delete. You can actually do this afterwards: the comment is
still there, but while dumping there is no longer a key to associate it with.
import sys
import ruamel.yaml
yaml_str = """\
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
foz: quz
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
"""
undefined = object()

def my_pop(self, key, default=undefined):
    if key not in self:
        if default is undefined:
            raise KeyError(key)
        return default
    keys = list(self.keys())
    idx = keys.index(key)
    if key in self.ca.items:
        if idx == 0:
            raise NotImplementedError('cannot handle moving comment when popping the first key', key)
        prev = keys[idx-1]
        # print('prev', prev, self.ca)
        # pop the comment entry only once, so it can be reused in either branch
        entry = self.ca.items.pop(key)
        comment = entry[2]
        if prev in self.ca.items:
            # append the comment of the popped key to that of the previous key
            self.ca.items[prev][2].value += comment.value
        else:
            self.ca.items[prev] = entry
    res = self.__getitem__(key)
    self.__delitem__(key)
    return res

ruamel.yaml.comments.CommentedMap.pop = my_pop
yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
data.pop('foz', None)
yaml.dump(data, sys.stdout)
which gives:
# Our app configuration
# foo is our main variable
foo: bar
# foz is also important
# And finally we have our optional configs.
# These are not really mandatory, they can be
# considere as "would be nice".
opt1: foqz
If you need to be able to pop the first key in a commented map, then you need to inspect self.ca
and handle its comment attribute, which is somewhat more complicated.
As always when using these kinds of internals, you should pin the version of ruamel.yaml that you are using.
These internals will change.
I need to create a YAML file that is based on simple YAML that needs to be updated based on properties supplied by developers and contain the following YAML structure:
- switch: null
title: switch
case:
- condition: (parameter1==='Parameter1_value1')
execute:
- switch:
case:
- condition: $(parameter2) == "Parameter2_value1"
execute:
- invoke:
parameter3:
parameter3_value1: null
- condition: $(parameter2) == "Parameter2_value2"
execute:
- invoke:
parameter3:
parameter3_value2: null
- condition: (parameter1==='Parameter1_value2')
execute:
- switch:
case:
- condition: $(parameter2) == "Parameter2_value1"
execute:
- invoke:
parameter3:
parameter3_value1: null
- condition: $(parameter2) == "Parameter2_value2"
execute:
- invoke:
parameter3:
parameter3_value2: null
Both parameter1 and parameter2 can have multiple values, so I need to populate the structure dynamically, according to the values that I receive.
I tried to do the following:
Import the following
import ruamel.yaml
from jinja2 import Template
Load the basic file -
yaml = ruamel.yaml.YAML()
data_src = yaml.load(file_name)
In parallel receive the values from another JSON file, and once I have the data, I created using Jinja the following:
parameter2_data_tmpl = Template(""" - condition: $(parameter2) == "{{ parameter2_value }}"
execute:
- invoke:
parameter3: {{ parameter3_value }}
""");
parameter2_data = parameter2_data_tmpl.render(parameter2_value = parameter2_value, parameter3_value = parameter3_value)
This works like a charm, and when I print it, it looks great. But then I tried to add the new YAML piece to the structure I have by first adding it to the relevant array (using the append method), and then assigning the array to the relevant element in the original YAML structure.
But when I added it to the array, it was added in a different format:
case: [' - condition: parameter2 == \"\
parameter2_value\"\\n execute:\\n -\
parameter3: parameter3_value\\n \
It's like jinja2 created it correctly, but not as YAML syntax.
Why doesn't this work? Is there an alternative way to create this code dynamically?
The result of the render is a string:
from jinja2 import Template
parameter2_data_tmpl = Template(""" - condition: $(parameter2) == "{{ parameter2_value }}"
execute:
- invoke:
parameter3: {{ parameter3_value }}
""");
parameter2_value = 'parameter2_value'
parameter3_value = 'parameter3_value'
parameter2_data = parameter2_data_tmpl.render(parameter2_value = parameter2_value, parameter3_value = parameter3_value)
print(type(parameter2_data))
This gives:
<class 'str'>
When you add that string somewhere in the data structure that you loaded using ruamel.yaml, you
get a quoted scalar.
What you want to do is add a data structure, but you cannot get that from
the template, because the output of the render is not valid YAML.
You can directly create the data structure without rendering. If you don't want
to do that, you should change the template so that it generates valid YAML input, load that,
and then insert the resulting data structure in data_src at the appropriate place
(which is unclear from your incomplete program):
import sys
import ruamel.yaml
from jinja2 import Template
parameter2_data_tmpl = Template("""\
- condition: $(parameter2) == "{{ parameter2_value }}"
  execute:
  - invoke:
      parameter3: {{ parameter3_value }}
""")
parameter2_value = 'parameter2_value'
parameter3_value = 'parameter3_value'
parameter2_data = parameter2_data_tmpl.render(parameter2_value = parameter2_value, parameter3_value = parameter3_value)
yaml = ruamel.yaml.YAML()
yaml.indent(sequence=4, offset=2)
new_data = yaml.load(parameter2_data)
data = yaml.load("""\
case: some value to overwrite
""")
data['case'] = new_data
yaml.dump(data, sys.stdout)
which gives:
case:
  - condition: $(parameter2) == "parameter2_value"
    execute:
      - invoke:
          parameter3: parameter3_value
Please note that there also exists a plugin ruamel.yaml.jinja2
that allows you to load
unrendered Jinja2 templates that generate YAML output and modify values.
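The first option mentioned above, creating the data structure directly without rendering a template, could look roughly like the following sketch. The helper name make_case_entry and the sample value pairs are hypothetical, and plain dicts and lists are used on the assumption that they are acceptable for dumping; the result can be assigned into the loaded data in the same way as new_data above.

```python
def make_case_entry(parameter2_value, parameter3_value):
    """Build one 'case' entry as nested Python data instead of rendered text."""
    return {
        'condition': '$(parameter2) == "{}"'.format(parameter2_value),
        'execute': [
            {'invoke': {'parameter3': parameter3_value}},
        ],
    }

# one entry per value pair received from the other input (values assumed for illustration)
pairs = [
    ('Parameter2_value1', 'parameter3_value1'),
    ('Parameter2_value2', 'parameter3_value2'),
]
new_data = [make_case_entry(p2, p3) for p2, p3 in pairs]
print(new_data[0]['condition'])
```

Because no text is parsed here, there is no indentation to get wrong, which is the main reason to prefer this over a template when the structure is fixed.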
I have the following YAML:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800
  centos-7-crawl:
    priority: 5
    details::
      ha: 1
      args:
        template: *startup
        timeout: 0
The first task defines template name and version, which is then used by other tasks. Template definition should not change, however others especially task name will.
What would be the best way to change template name and version in Python?
I have the following regex for matching (using re.DOTALL):
template:.*name: (.*?)version: (.*?)\s
However, I have not figured out how to use re.sub for this so far. Or is there a more convenient way of doing this?
For this kind of round-tripping (load-modify-dump) of YAML you should be using ruamel.yaml (disclaimer: I am the author of that package).
If your input is in input.yaml, you can then relatively easily find
the name and version under key template and update them:
import sys
import ruamel.yaml
def find_template(d):
    if isinstance(d, list):
        for elem in d:
            x = find_template(elem)
            if x is not None:
                return x
    elif isinstance(d, dict):
        for k in d:
            v = d[k]
            if k == 'template':
                if 'name' in v and 'version' in v:
                    return v
            x = find_template(v)
            if x is not None:
                return x
    return None

yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
with open('input.yaml') as ifp:
    data = yaml.load(ifp)
template = find_template(data)
template['name'] = 'startup-centos-8'
template['version'] = '1.3'
yaml.dump(data, sys.stdout)
which gives:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    'details:':
      ha: 0
      args:
        template: &startup
          name: startup-centos-8
          version: '1.3'
        timeout: 1800
  centos-7-crawl:
    priority: 5
    'details:':
      ha: 1
      args:
        template: *startup
        timeout: 0
Please note that the (superfluous) quotes that I inserted in the input, as well as the comment and the name of the alias are preserved.
I would parse the YAML file into a dictionary, then edit the field and write the dictionary back out to YAML.
See the question "How can I parse a YAML file in Python" for discussion on parsing YAML in Python, but I think you would end up with something like this:
from ruamel.yaml import YAML
from io import StringIO

yaml = YAML(typ='safe')
yaml.default_flow_style = False

# Parse from string; doc is assumed to hold the YAML text from the question
myConfig = yaml.load(doc)

# Example replacement code. The anchor (&startup) is not part of the loaded
# data, so compare against the actual template name instead.
# The key is 'details:' because the question's YAML has a double colon.
for task in myConfig["tasks"]:
    template = myConfig["tasks"][task]["details:"]["args"]["template"]
    if template["name"] == "startup-centos-7":
        template["name"] = "new value"

# Convert back to string
buf = StringIO()
yaml.dump(myConfig, buf)
updatedYml = buf.getvalue()
I have the following scenario:
A YAML file:
team:
  owner:
  contact:
  channel:
  role:
    role1: 3r
    role2: 6q
And a Python script that needs to extract the key-value pairs of role:
def yaml_processor(role):
    filepath = "../1/2/3.yaml"
    data = yaml_loader(filepath)
    data = data.get(role)
    for team in data.iteritems():
        print(role)
    file.close()
There are a few strange things in your code, but let's start with
defining yaml_loader:
from ruamel.yaml import YAML

def yaml_loader(file_path):
    yaml = YAML()
    with open(file_path) as fp:
        data = yaml.load(fp)
    return data
that loads your YAML into a hierarchical data structure. At the root
of that structure is a dict, with one key: team, because at the root
of your document there is a mapping with that key.
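As an illustration of that shape, the loaded data behaves much like the plain dict below. ruamel.yaml returns its own dict-like type, so this literal is only an assumed stand-in. Keys without a value load as None, and looking up anything other than team at the root finds nothing.

```python
# plain-dict stand-in for what yaml_loader returns for the document in the question
data = {
    'team': {
        'owner': None,
        'contact': None,
        'channel': None,
        'role': {'role1': '3r', 'role2': '6q'},
    },
}

root_keys = list(data)          # only 'team' lives at the root
missing = data.get('role')      # 'role' is nested one level down, so this is None
roles = data['team']['role']    # the nested mapping with the actual roles
print(root_keys, missing, roles)
```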
As for your code:
- You always print the argument to yaml_processor for each team you
find. You are not using the data from the YAML document.
- You try to replace the data you get from the YAML document with
data.get(role). That will only work if role == 'team', because
there is only that key at the root level.
- You do file.close(). What is file?
- You are using .iteritems(), which is a Python 2 construct. Python 2
is end-of-life in 2020. Do you really
intend to learn Python 2 specific code this late in the game?
I would include:
from __future__ import print_function
at the top of my program and then do something like:
def yaml_processor():
    # filepath = "../1/2/3.yaml"
    filepath = '3.yaml' # different filestructure on my disc
    data = yaml_loader(filepath)
    for team in data:
        team_data = data[team]
        for role_nr in team_data['role']:
            role = team_data['role'][role_nr]
            print('team {}, role number: {}, role: {}'.format(team, role_nr, role))

yaml_processor()
which gives:
team team, role number: role1, role: 3r
team team, role number: role2, role: 6q
If you are not really using the keys role1 and role2, you should
consider using a sequence in your YAML document:
team:
  owner:
  contact:
  channel:
  roles:
  - 3r
  - 6q
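With that layout the roles load as a list, and iterating becomes simpler. A minimal sketch, using a plain dict as an assumed stand-in for the loaded document:

```python
# stand-in for the data loaded from the YAML document with a roles sequence
data = {'team': {'owner': None, 'contact': None, 'channel': None,
                 'roles': ['3r', '6q']}}

lines = []
for team, team_data in data.items():
    # enumerate supplies the position that the role1/role2 keys used to encode
    for nr, role in enumerate(team_data['roles'], start=1):
        lines.append('team {}, role number: {}, role: {}'.format(team, nr, role))
print('\n'.join(lines))
```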
I want to add a links property to each CouchDB document based on data in a CSV file.
The value of the links property is to be an array of dicts containing the CouchDB _id of the linked document and the linkType.
When I run the script I get a KeyError for links (see error info below).
I am not sure how to create the dict key links if it doesn't exist and add the link data, or otherwise append to the links array if it does exist.
An example of a document with the links will look like this:
{
  _id: p_3,
  name: 'Smurfette',
  links: [
    {to_id: p_2, linkType: 'knows'},
    {to_id: o_56, linkType: 'follows'}
  ]
}
Python script for processing the CSV file:
#!/usr/bin/python
# coding: utf-8
# Version 1
#
# csv fields: ID,fromType,fromID,toType,toID,LinkType,Directional
import csv, sys, couchdb
def csv2couchLinks(database, csvfile):
    # CouchDB Database Connection etc
    server = couchdb.Server()
    #assumes that couchdb runs on http://localhost:5984
    db = server[database]
    #assumes that db is already created
    # CSV file
    data = csv.reader(open(csvfile, "rb")) # Read in the CSV file rb=read/binary
    csv_links= csv.DictReader(open(csvfile, "rb"))
    def makeLink(from_id, to_id, linkType):
        # get doc from db
        doc = db[from_id]
        # construct link object
        link = {'to_id':to_id, 'linkType':linkType}
        # add link reference to array at key 'links'
        if doc['links'] in doc:
            doc['links'].append(link)
        else:
            doc['links'] = [link]
        # update the record in the database
        db[doc.id] = doc
    # read each row in csv file
    for row in csv_links:
        # get entityTypes as lowercase and entityIDs
        fromType = row['fromType'].lower()
        fromID = row['fromID']
        toType = row['toType'].lower()
        toID = row['toID']
        linkType = row['LinkType']
        # concatenate 'entity type' and 'id' to make couch '_id'
        fromIDcouch = fromType[0]+'_'+fromID #eg 'p_2' <= person 2
        toIDcouch = toType[0]+'_'+toID
        makeLink(fromIDcouch, toIDcouch, linkType)
        makeLink(toIDcouch, fromIDcouch, linkType)
# Run csv2couchLinks() if this is not an imported module
if __name__ == '__main__':
    DATABASE = sys.argv[1]
    CSVFILE = sys.argv[2]
    csv2couchLinks(DATABASE,CSVFILE)
error info:
$ python LINKS_csv2couchdb_v1.py "qmhonour" "./tablesAsCsv/links.csv"
Traceback (most recent call last):
  File "LINKS_csv2couchdb_v1.py", line 65, in <module>
    csv2couchLinks(DATABASE,CSVFILE)
  File "LINKS_csv2couchdb_v1.py", line 57, in csv2couchLinks
    makeLink(fromIDcouch, toIDcouch, linkType)
  File "LINKS_csv2couchdb_v1.py", line 33, in makeLink
    if doc['links'] in doc:
KeyError: 'links'
Another option is condensing the if block to this:
doc.setdefault('links', []).append(link)
The dictionary's setdefault method checks to see if links exists in the dictionary, and if it doesn't, it creates a key and makes the value an empty list (the default). It then appends link to that list. If links does exist, it just appends link to the list.
def makeLink(from_id, to_id, linkType):
    # get doc from db
    doc = db[from_id]
    # construct link object
    link = {'to_id':to_id, 'linkType':linkType}
    # add link reference to array at key 'links'
    doc.setdefault('links', []).append(link)
    # update the record in the database
    db[doc.id] = doc
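The behaviour of setdefault can be checked in isolation with plain dicts. The sample documents below are made up for illustration and stand in for CouchDB documents.

```python
link = {'to_id': 'p_2', 'linkType': 'knows'}

# no 'links' key yet: setdefault creates it with an empty list, then we append
new_doc = {'_id': 'p_3', 'name': 'Smurfette'}
new_doc.setdefault('links', []).append(link)

# 'links' already exists: setdefault returns the existing list, which is extended
old_doc = {'_id': 'p_4', 'links': [{'to_id': 'o_56', 'linkType': 'follows'}]}
old_doc.setdefault('links', []).append(link)

print(len(new_doc['links']), len(old_doc['links']))
```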
Replace:
if doc['links'] in doc:
With:
if 'links' in doc:
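The difference between the two tests can be seen on a plain dict, used here as an assumed stand-in for the CouchDB document. The expression doc['links'] in doc first looks up the value, which raises KeyError when the key is absent, whereas 'links' in doc only checks for the key.

```python
doc = {'_id': 'p_3', 'name': 'Smurfette'}

try:
    doc['links'] in doc      # the lookup doc['links'] fails before 'in' is evaluated
    raised = False
except KeyError:
    raised = True

has_links = 'links' in doc   # safe: only tests whether the key exists
print(raised, has_links)
```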