I have a script that reads a YAML file into a python dictionary. How do I read the values and concatenate some of them to be more meaningful?
#script to load the yaml file into a python object
import yaml
from yaml import load, dump
#read data from the config yaml file
with open("config.yaml", "r") as stream:
    try:
        print(yaml.load(stream))
    except yaml.YAMLError as exc:
        print(exc)
Contents of YAML file:
os2:
  host:hostname
  ip:10.123.3.182
  path:/var/log/syslog
  file:syslog
Your YAML is not formatted correctly. There should be a space after the : in each of the sub-items, like so:
os2:
  host: hostname
  ip: 10.123.3.182
  path: /var/log/syslog
  file: syslog
After that, data = yaml.load(stream) should parse the data correctly:
{'os2': {'file': 'syslog',
'host': 'hostname',
'ip': '10.123.3.182',
'path': '/var/log/syslog'}}
Also, you don't need the line from yaml import load, dump since you already import yaml in its entirety.
Once the data is loaded, you can do pretty much anything you wish with it. You might want to use str.format() or f strings (Python 3.6+) as such:
'{host}#{ip}:{path}'.format(**data['os2'])
# 'hostname#10.123.3.182:/var/log/syslog'
This is called string formatting. The **data['os2'] bit unpacks the dictionary stored in data['os2'], so you can refer to its keys directly in your string:
{'file': 'syslog',
'host': 'hostname',
'ip': '10.123.3.182',
'path': '/var/log/syslog'}
Note that since your YAML doesn't include "ubuntu" as a key or a value, there's no way for you to reference that string unless you update your YAML.
Also note: don't confuse dictionary keys with attributes. You cannot reference data.os2.file, as a dictionary has no such attribute. You can, however, reference data['os2']['file'] (note the keys are strings) to retrieve the stored data.
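As a small sketch of the f-string route mentioned above, assuming the data has already been loaded into a dictionary shaped like the one shown. The literal below only stands in for the result of loading config.yaml, and `summary` is just an illustrative name:

```python
# Stand-in for the dictionary produced by loading the corrected config.yaml.
data = {'os2': {'file': 'syslog',
                'host': 'hostname',
                'ip': '10.123.3.182',
                'path': '/var/log/syslog'}}

os2 = data['os2']  # dictionary keys are looked up with [], not with dots

# f-string equivalent of '{host}#{ip}:{path}'.format(**os2)
summary = f"{os2['host']}#{os2['ip']}:{os2['path']}"
print(summary)
```

The f-string needs Python 3.6 or later; on older versions the str.format() form does the same job.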
Your YAML is perfectly normal, and it loads as you can see here.
You have one key (os2) and as value, a multiline plain scalar that loads, following the YAML standard, as a string with a space where the YAML has newline+spaces. That value thus loads as "host:hostname ip:10.123.3.182 path:/var/log/syslog file:syslog".
Since you indicate you expect multiple values, you either have to make the value for os2 a flow-style mapping (in which case you must quote the scalars, otherwise you could e.g. not write plain URLs as scalars in valid YAML):
os2: {
  "host":"hostname",
  "ip":"10.123.3.182",
  "path":"/var/log/syslog",
  "file":"syslog"
  }
or you should follow the guideline from the YAML standard that
Normally, YAML insists the “:” mapping value indicator be separated from the value by white space.
os2:
  host: hostname
  ip: 10.123.3.182
  path: /var/log/syslog
  file: syslog
When using PyYAML you should load YAML with yaml.safe_load(); there is absolutely no need to use the yaml.load() function, which is documented to be potentially unsafe.
With either of the above in config.yaml, you can do:
import yaml
with open('config.yaml') as stream:
    d = yaml.safe_load(stream)
os2 = d['os2']
# "concatenate" host, ip and path
print('{host}#{ip}:{path}'.format(**os2))
to get:
hostname#10.123.3.182:/var/log/syslog
Your yaml file is incorrectly configured. There should be a space between each key and its value. You should have something like:
os2:
  host: hostname
  ip: 10.123.3.182
  path: /var/log/syslog
  file: syslog
yaml.load will return a dictionary whose values you can access normally.
{'os2': {'host': 'hostname', 'ip': '10.123.3.182', 'path': '/var/log/syslog', 'file': 'syslog'}}
Your code will look like this:
#script to load the yaml file into a python object
import yaml
#read data from the config yaml file
with open("config.yaml", "r") as stream:
    try:
        # safe_load avoids the Loader argument that newer PyYAML versions require for load()
        config = yaml.safe_load(stream)
        #concatenate into string
        string = f"{config['os2']['host']}#{config['os2']['ip']}:{config['os2']['path']}"
    except yaml.YAMLError as exc:
        print(exc)
Related
I have a lot of YAML files that have a similar structure but different data. I need to parse out selected data and put it into a single CSV (Excel) file as three columns.
But I am facing an issue with an empty key, which always gives me a "KeyError: 'port'".
my yaml file example:
base:
  server: 10.100.80.47
  port: 3306
  namePrefix: well
  user: user1
  password: kjj&%$
base:
  server: 10.100.80.48
  port:
  namePrefix: done
  user: user2
  password: fhfh#$%
In the second block I have an empty "port", and my script gets stuck at that point.
Whenever an empty key is found, I need the script to write nothing for it.
from asyncio.windows_events import NULL
from queue import Empty
import yaml
import csv
import glob
yaml_file_names = glob.glob('./*.yaml')
rows_to_write = []
for i, each_yaml_file in enumerate(yaml_file_names):
    print("Processing file {} of {} file name: {}".format(
        i+1, len(yaml_file_names),each_yaml_file))
    with open(each_yaml_file) as file:
        data = yaml.safe_load(file)
        for v in data:
            if "port" in v == "":
                data['base']['port'] = ""
        rows_to_write.append([data['base']['server'],data['base']['port'],data['server']['host'],data['server']['contex']])
with open('output_csv_file.csv', 'w', newline='') as out:
    csv_writer = csv.writer(out)
    csv_writer.writerow(["server","port","hostname", "contextPath"])
    csv_writer.writerows(rows_to_write)
print("Output file output_csv_file.csv created")
You are trying to access the key with square brackets, e.g.
data['base']['port']
But what you want is to access it with the get method like so:
data['base'].get('port')
This way, if the key does not exist, it returns None by default, and you can change the default to whatever you want by passing it as the second argument.
In PyYAML, an empty element is returned as None, not an empty string.
if data['base']['port'] is None:
    data['base']['port'] = ""
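The two suggestions above can be combined so that a missing port key and an empty one are treated the same way. The following is a minimal sketch, assuming each file has already been loaded into a dictionary; `build_row`, `first` and `second` are hypothetical names, and the literals stand in for the loaded files:

```python
import csv
import io

def build_row(data):
    """Return [server, port], writing a missing or empty port as ''."""
    base = data.get('base') or {}
    port = base.get('port')  # None if the key is absent or has no value
    return [base.get('server', ''), '' if port is None else port]

# Stand-ins for two loaded files: the second has an empty "port:" entry.
first = {'base': {'server': '10.100.80.47', 'port': 3306}}
second = {'base': {'server': '10.100.80.48', 'port': None}}

out = io.StringIO()  # in-memory file, used here instead of output_csv_file.csv
writer = csv.writer(out)
writer.writerow(['server', 'port'])
writer.writerows([build_row(first), build_row(second)])
print(out.getvalue())
```

Checking for None explicitly, rather than using `port or ''`, keeps a legitimate port value of 0 from being blanked out.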
Your yaml file is invalid. In a yaml file, whenever you have a key (like port: in your example) you must provide a value; you cannot leave it empty and go to the next line. The exception is when the value is the next group of keys, but in that case you need to indent the following lines one step more, which is obviously not what you intend to do here.
This is likely why you cannot parse the file as expected with the python yaml module. If you are the creator of those yaml files, you really need to put a value in the file, like port: None, if you don't want to provide a value for the port, or even better, just don't provide any port key at all.
If they are provided to you by someone else, ask them to provide valid yaml files.
Then the other solutions posted should work.
I have a yaml file as below:
server1:
  host: os1
  ip: ##.###.#.##
  path: /var/log/syslog
  file: syslog
  identityfile: /identityfile/keypair.pub
server2:
  host: os2
  ip: ##.###.#.##
  path: /var/log/syslog
  file: syslog.1
  identityfile: /identityfile/id_rsa.pub
I have a piece of code to parse the yaml and read entries.
# read data from the config yaml file
def read_yaml(file):
    with open(file, "r") as stream:
        try:
            config = yaml.load(stream)
            print(config)
        except yaml.YAMLError as exc:
            print(exc)
            print("\n")
    return config
read_yaml("config_file")
print(config)
My problems:
1. I am unable to return values and I get a "NameError: name 'config' is not defined" at the print statement called outside the function.
How can I iterate and read the values in my yaml file by passing only the parameters?
Ex:
print('{host}#{ip}:{path}'.format(**config['os1']))
but without the 'os1' as the yaml file may have 100s of entries
I ensured there are no duplicates by using sets but want to use a loop and store the values from my string formatting command into a variable without using 'os1' or 'os2' or 'os#'.
def iterate_yaml():
    remotesys = set()
    for key,val in config.items():
        print("{} = {}".format(key,val))
    #check to ensure duplicates are removed by storing it in a set
    remotesys.add('{host}#{ip}:{path}'.format(**config['os1']))
    remotesys.add('{host}#{ip}:{path}'.format(**config['os2']))
    remotesys.add('{host}#{ip}:{path}'.format(**config['os3']))
Thanks for the help.
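You get the NameError because config only exists inside read_yaml: the function returns it, but the call discards the result, so there is no config at module level. Make sure the function returns config and assign the result when you call it.
For example:
def read_yaml(file):
    # code
    return config
config = read_yaml("config_file")
Then config holds your configuration and print(config) works.
Check the Python documentation and tutorials on functions and scope for details.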
2-3. You can perform a for loop using the dict.items method.
For example:
x = {'lol': 1, 'kek': 2}
for name, value in x.items():
    print(name, value)
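Applied to the question, the same loop removes the need to name 'os1', 'os2' and so on. This is a rough sketch, assuming every entry has host, ip and path keys; the dictionary below stands in for the loaded file, the addresses are made up, and `remote_targets` is a hypothetical helper name:

```python
# Stand-in for the loaded configuration; the addresses are made up.
config = {
    'server1': {'host': 'os1', 'ip': '192.0.2.10', 'path': '/var/log/syslog'},
    'server2': {'host': 'os2', 'ip': '192.0.2.11', 'path': '/var/log/syslog'},
}

def remote_targets(config):
    """Build the set of 'host#ip:path' strings, one per entry."""
    targets = set()
    for name, entry in config.items():
        # entry is the inner dictionary, so no hard-coded key is needed
        targets.add('{host}#{ip}:{path}'.format(**entry))
    return targets

remotesys = remote_targets(config)
print(sorted(remotesys))
```

Because the results go into a set, entries that format to the same string are stored only once, however many entries the file has.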
I have the following YAML file:
heat_template_version: 2015-10-15
parameters:
  image:
    type: string
    label: Image name or ID
    default: CirrOS
  private_network_id:
    type: string
    label: Private network name or ID
  floating_ip:
    type: string
I want to add a default key to private_network_id and floating_ip (if default doesn't already exist), and set its value to something I get from the user.
How can I achieve this in python?
The resulting YAML should look like:
heat_template_version: 2015-10-15
parameters:
  image:
    type: string
    label: Image name or ID
    default: CirrOS
  private_network_id:
    type: string
    label: Private network name or ID
    default: <private_network_id>
  floating_ip:
    type: string
    default: <floating_ip>
For this kind of round-tripping you should use ruamel.yaml (disclaimer: I am the author of the package).
Assuming your input is in a file input.yaml and the following program:
from ruamel.yaml import YAML
from pathlib import Path
yaml = YAML()
path = Path('input.yaml')
data = yaml.load(path)
parameters = data['parameters']
# replace assigned values with user input
parameters['private_network_id']['default'] = '<private_network_id>'
parameters['floating_ip']['default'] = '<floating_ip>'
yaml.dump(data, path)
After that, your file will exactly match the output you requested.
Please note that comments in the YAML file, as well as the key ordering are automatically preserved (not guaranteed by the YAML specification).
If you are still using Python2 (which has no pathlib in the standard library) use from ruamel.std.pathlib import Path or rewrite the .load() and .dump() lines with appropriately opened, old style, file objects. E.g.
with open('input.yaml', 'w') as fp:
    yaml.dump(data, fp)
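The question asks to add default only where it does not already exist, whereas plain assignment overwrites any existing value. One way to sketch that condition is dict.setdefault. In the example below plain dictionaries stand in for the loaded data; that the mapping returned by the loader supports the same method is an assumption, and `fill_defaults` and `user_values` are hypothetical names:

```python
def fill_defaults(parameters, user_values):
    """Add a 'default' entry only to parameters that do not have one yet."""
    for name, value in user_values.items():
        # setdefault leaves an existing 'default' untouched
        parameters[name].setdefault('default', value)
    return parameters

# Stand-in for data['parameters'] after loading input.yaml.
parameters = {
    'image': {'type': 'string', 'default': 'CirrOS'},
    'private_network_id': {'type': 'string'},
    'floating_ip': {'type': 'string'},
}
# Made-up values standing in for what the user supplies.
user_values = {
    'image': 'other-image',
    'private_network_id': 'net-1',
    'floating_ip': '203.0.113.5',
}

fill_defaults(parameters, user_values)
print(parameters)
```

Here image keeps its existing default of CirrOS, while the other two parameters receive the user-supplied values.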
I'm trying to make a script to back up a MySQL database. I have a config.yml file:
DB_HOST :'localhost'
DB_USER : 'root'
DB_USER_PASSWORD:'P#$$w0rd'
DB_NAME : 'moodle_data'
BACKUP_PATH : '/var/lib/mysql/moodle_data'
Now I need to read this file. My Python code so far:
import yaml
config = yaml.load(open('config.yml'))
print(config.DB_NAME)
And this is an error that comes up:
  File "conf.py", line 4, in <module>
    print(config.DB_NAME)
AttributeError: 'str' object has no attribute 'DB_NAME'
Does anyone have an idea where I made a mistake?
There are 2 issues:
1. As others have said, yaml.load() loads YAML mappings as Python dicts, so you need to use config['DB_NAME'].
2. The syntax in your config file is not correct: in YAML, keys are separated from values by a colon followed by a space.
Should work if the file is formatted like this:
DB_HOST: 'localhost'
DB_USER: 'root'
DB_USER_PASSWORD: 'P#$$w0rd'
DB_NAME: 'moodle_data'
BACKUP_PATH: '/var/lib/mysql/moodle_data'
To back up your database, you should be able to export it as a .sql file. If you're using a specific interface, look for Export.
Then, as for Python's YAML parser:
DB_HOST :'localhost'
DB_USER : 'root'
DB_USER_PASSWORD:'P#$$w0rd'
DB_NAME : 'moodle_data'
BACKUP_PATH : '/var/lib/mysql/moodle_data'
is a set of key-value pairs (a mapping). In certain languages (such as PHP, I think) mappings are converted to objects. In Python, though, they are converted to dicts (the YAML parser does this, and the JSON parser too).
# access an object's attribute
my_obj.attribute = 'something cool'
my_obj.attribute # something cool
del my_obj.attribute
my_obj.attribute # error
# access a dict's key's value
my_dict = {}
my_dict['hello'] = 'world!'
my_dict['hello'] # world!
del my_dict['hello']
my_dict['hello'] # error
So, that's a really quick presentation of dicts, but it should get you going (run help(dict) and/or have a look at the Python documentation on dictionaries; you won't regret it).
In your case:
config['DB_NAME'] # moodle_data
Try this:
import yaml
with open('config.yaml', 'r') as f:
    doc = yaml.safe_load(f)
To access "DB_NAME" you can use:
txt = doc["DB_NAME"]
print(txt)
This is the part of the mailer.py script:
config = pyfig.Pyfig(config_file)
svnlook = config.general.svnlook #svnlook path
sendmail = config.general.sendmail #sendmail path
From = config.general.from_email #from email address
To = config.general.to_email #to email address
What does this config variable contain? Is there a way to get the value of the config variable without pyfig?
In this case config = a pyfig.Pyfig object initialised with the contents of the file named by the content of the string config_file.
To find out what that object does and contains, you can either look at the pyfig documentation and source code, or print its contents after the initialisation, e.g.:
config = pyfig.Pyfig(config_file)
print "Config Contains:\n\t", '\n\t'.join(dir(config))
if hasattr(config, "keys"):
    print "Config Keys:\n\t", '\n\t'.join(config.keys())
or if you are using Python 3,
config = pyfig.Pyfig(config_file)
print("Config Contains:\n\t", '\n\t'.join(dir(config)))
if hasattr(config, "keys"):
    print("Config Keys:\n\t", '\n\t'.join(config.keys()))
To get the same data without pyfig, you would need to read and parse the content of the file referenced by config_file within your own code.
N.B.: pyfig seems to be more or less abandoned (no updates in over 5 years, web site no longer exists, etc.), so I would strongly recommend converting the code to use a JSON configuration file instead.
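A minimal sketch of what the JSON route could look like while keeping the config.general.svnlook style of access used in mailer.py. The layout of the JSON, the paths and the addresses are all made up for illustration, and `load_config` is a hypothetical helper, not part of pyfig:

```python
import json
from types import SimpleNamespace

# Hypothetical JSON equivalent of the "general" settings; all values are made up.
CONFIG_TEXT = """
{
  "general": {
    "svnlook": "/usr/bin/svnlook",
    "sendmail": "/usr/sbin/sendmail",
    "from_email": "svn@example.com",
    "to_email": "dev@example.com"
  }
}
"""

def load_config(text):
    """Parse JSON text, turning every JSON object into a SimpleNamespace."""
    return json.loads(text, object_hook=lambda d: SimpleNamespace(**d))

config = load_config(CONFIG_TEXT)
svnlook = config.general.svnlook    # svnlook path
sendmail = config.general.sendmail  # sendmail path
print(svnlook, sendmail)
```

In a real script the text would come from reading the file named by config_file. The object_hook is what allows dotted access; without it json.loads returns plain dicts, accessed as config['general']['svnlook'].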