python and yaml extract values of certain yaml field


I have the following scenario:
a yaml file:
team:
  owner:
  contact:
  channel:
  role:
    role1: 3r
    role2: 6q
And a python script that needs to extract the key-value pairs of role:
def yaml_processor(role):
    filepath = "../1/2/3.yaml"
    data = yaml_loader(filepath)
    data = data.get(role)
    for team in data.iteritems():
        print(role)
    file.close()

There are a few strange things in your code, but let's start with
defining yaml_loader:
from ruamel.yaml import YAML
def yaml_loader(file_path):
    yaml = YAML()
    with open(file_path) as fp:
        data = yaml.load(fp)
    return data
that loads your YAML into a hierarchical data structure. At the root
of that structure is a dict, with one key: team, because at the root
of your document there is a mapping with that key.
As for your code:
- You always print the argument to yaml_processor for each team you
  find. You are not using the data from the YAML document.
- You try to replace the data you get from the YAML document with
  data.get(role). That will only work if role == 'team', because
  there is only that key at the root level.
- You do file.close(). What is file?
- You are using .iteritems(), which is a Python 2 construct. Python 2
  is end-of-life in 2020. Do you really intend to learn Python 2
  specific code this late in the game?
I would include:
from __future__ import print_function
at the top of my program and then do something like:
def yaml_processor():
    # filepath = "../1/2/3.yaml"
    filepath = '3.yaml'  # different filestructure on my disc
    data = yaml_loader(filepath)
    for team in data:
        team_data = data[team]
        for role_nr in team_data['role']:
            role = team_data['role'][role_nr]
            print('team {}, role number: {}, role: {}'.format(team, role_nr, role))
yaml_processor()
which gives:
team team, role number: role1, role: 3r
team team, role number: role2, role: 6q
If you are not really using the keys role1 and role2 you should
consider using a sequence in your YAML document:
team:
  owner:
  contact:
  channel:
  roles:
    - 3r
    - 6q
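Whichever layout you pick, the lookup itself can be factored into a small helper. The following is only a sketch and not part of the answer above: extract_roles is a hypothetical name, and the two dicts stand in for what yaml_loader would return for the mapping and the sequence layout (empty YAML values such as owner: load as None).

```python
def extract_roles(data, team_key="team"):
    """Return (name, value) pairs for the roles of one team."""
    team = data.get(team_key) or {}
    # Accept both layouts: a 'role' mapping or a 'roles' sequence.
    roles = team.get("role") or team.get("roles") or {}
    if isinstance(roles, dict):
        return list(roles.items())
    # Sequence layout: there are no keys, so number the entries instead.
    return list(enumerate(roles, start=1))


# Stand-ins for the loaded documents (assumed shapes).
mapping_form = {"team": {"owner": None, "contact": None, "channel": None,
                         "role": {"role1": "3r", "role2": "6q"}}}
sequence_form = {"team": {"owner": None, "contact": None, "channel": None,
                          "roles": ["3r", "6q"]}}

print(extract_roles(mapping_form))
print(extract_roles(sequence_form))
```

Returning a list of pairs keeps the calling code identical for both layouts; only the first element of each pair differs (a key in one case, a position in the other).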

Related

Python - parsing and updating json

I'm able to load and parse a json file with Python by referring to list items by name. My users.json data file:
{
    "joe": {
        "secret": "abc123.321def"
    },
    "sarah": {
        "secret": "led789.321plo"
    },
    "dave": {
        "secret": "ghi532.765dlmn"
    }
}
My code - to output the 'secret' value associated with a specific user (e.g. Dave):
import json
with open('users_sample.json') as f:
    users = json.load(f)
# f.close()
print(users['dave']['secret'])
This outputs Dave's secret:
ghi532.765dlmn
That's easy enough when I can predict or know the user names, but how do I iterate through each user in the users.json file and output each user's 'secret' value?
Thanks in advance!

I would encapsulate the logic to print each user and their associated secret into a helper function:
def print_users(users_dict: dict, header='Before'):
    print(f'-- {header}')
    for u in users_dict:
        print(f' {u}: {users_dict[u].get("secret", "<empty>")}')
Then, upon loading the users object initially via json.load, you can then call the function like so:
print_users(users)
To replace the secret for each user, in this case to replace every occurrence of a dot . with a plus +, a simple approach could be to use a for loop to update the users object in place:
for name, user in users.items():
    if 'secret' in user:
        user['secret'] = user['secret'].replace('.', '+')
Then print the result after the replacements are carried out:
print_users(users, 'After')
Finally, we can write the result users object back out to a file:
with open('users_sample_UPDATED.json', 'w') as out_file:
    json.dump(users, out_file)
The output of the above code, in this case would be:
-- Before
joe: abc123.321def
sarah: led789.321plo
dave: ghi532.765dlmn
-- After
joe: abc123+321def
sarah: led789+321plo
dave: ghi532+765dlmn
The full code:
import json
def main():
    with open('users_sample.json') as f:
        users = json.load(f)
    print_users(users)
    new_users = {name: {'secret': user['secret'].replace('.', '+')}
                 for name, user in users.items()}
    print_users(new_users, 'After')
    with open('users_sample_UPDATED.json', 'w') as out_file:
        json.dump(new_users, out_file)
def print_users(users_dict: dict, header='Before'):
    print(f'-- {header}')
    for u in users_dict:
        print(f' {u}: {users_dict[u].get("secret", "<empty>")}')
if __name__ == '__main__':
    main()

Iterate over the dictionary using a for loop.
Code that works:
import json
with open('users_sample.json') as f:
    users = json.load(f)
for user in users:
    print(f"user name: {user} secret: {users[user]['secret']}")

You have a nested dictionary, i.e. each value associated with a top-level key is also a dictionary. You can iterate over those inner dictionaries with the dictionary's values() method. This leads to:
print(*[e.get('secret') for e in users.values()], sep='\n')
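The approaches above can also be combined with dict.items(), which yields each user name together with its nested dictionary. A minimal sketch, assuming the same structure as users.json: the inline JSON string, the iter_secrets name and the "nosecret" entry are made up for illustration, the last one only to show the fallback for a user without a secret.

```python
import json

# Inline stand-in for the users file (assumed structure).
RAW = ('{"joe": {"secret": "abc123.321def"}, '
       '"sarah": {"secret": "led789.321plo"}, '
       '"nosecret": {}}')


def iter_secrets(users, default="<empty>"):
    """Yield (name, secret) for every user, with a fallback for missing secrets."""
    for name, info in users.items():
        yield name, info.get("secret", default)


users = json.loads(RAW)
for name, secret in iter_secrets(users):
    print(f"{name}: {secret}")
```

Using .get() instead of indexing avoids a KeyError when an entry has no 'secret' key.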

How to get API to output one parameter in JSON

I'm trying to retrieve all the tasks from a specific project within Todoist using their API and Python.
My code looks like this:
ListOfProjects = api.get_projects()
ListOfPeople = api.get_tasks(project_id = 1234567899,)
file = open('outputa.txt', 'w', encoding="utf-8")
print(ListOfPeople, file = file)
file.close()
input("Press Enter To Exit")
This then prints the JSON Encoded information to said file, for example:
[
    Task(
        id: 2995104339,
        project_id: 2203306141,
        section_id: 7025,
        parent_id: 2995104589,
        content: 'Buy Milk',
        description: '',
        comment_count: 10,
        assignee: 2671142,
        assigner: 2671362,
        order: 1,
        priority: 1,
        url: 'https://todoist.com/showTask?id=2995104339'
        ...
    )
This gives me a massive, unwieldy text document (as there is one of the above for every task in a nearly 300 task project). I just need the string after the Content parameter.
Is there a way to specify that just the Content parameter should be printed?

According to the documentation, the get_tasks method returns a list of Task objects, so if you only need to store the content property of each of them, you can change your code like this:
data = [t.content for t in api.get_tasks(project_id = 1234567899,)]
with open('outputa.txt', 'w', encoding="utf-8") as f:
    print(data, file=f)
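Note that print(data, file=f) writes the Python list representation on a single line. If one task per line is preferred, the strings can be joined with newlines instead. This is only a sketch: FakeTask is a hypothetical stand-in for the library's Task objects so the example runs without an API token, and io.StringIO stands in for the output file.

```python
import io
from dataclasses import dataclass


@dataclass
class FakeTask:
    # Hypothetical stand-in; only the attribute used below is modelled.
    content: str


def write_contents(tasks, stream):
    """Write the content of each task on its own line."""
    stream.write("\n".join(t.content for t in tasks) + "\n")


tasks = [FakeTask("Buy Milk"), FakeTask("Call Bob")]
buf = io.StringIO()
write_contents(tasks, buf)
print(buf.getvalue(), end="")
```

With a real file, the same function can be called with the handle returned by open('outputa.txt', 'w', encoding="utf-8").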

how to change a specific records in Mongodb

I have a database in MongoDB that contains a collection called users, with information about people such as name, job title, gender, etc. The problem is that the gender is wrong in most cases: for example, the name is a male name but the gender is recorded as female.
I would be happy if you could help me find a way to write Python code to fix the problem.
I have two files containing female names and male names. Here is the code I have written, but it is not working: some of the records remain unchanged and I do not know what is wrong:
"""
import json
import sys
from pymongo import MongoClient
from tqdm import tqdm
client = MongoClient("mongodb://192.168.20.92:27017")
users = client["97_Production_DB"]["invalid"]
updated_users_sepid = client["db_97_updated"]["users"]
invalid_users_sepid = client["db_97_updated"]["invalid"]
with open("male.json", "rb") as f:
    male = json.load(f)
with open("female.json", "rb") as f:
    female = json.load(f)
person = {}
for key in male:
    person[key] = True
for key in female:
    person[key] = False
for data in tqdm(users.find(no_cursor_timeout=True)):
    firstName = data["firstName"].replace("سید", "").replace("سیده", "").replace("سادات", "").replace(" ", "")
    if firstName in person.keys:
        data["gender"] = person.get(firstName)
        updated_users_sepid.insert_one(data)
    else:
        invalid_
"""

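Two details in the snippet above may explain part of the behaviour. person.keys without parentheses is the method object rather than the keys, so the membership test raises a TypeError in Python 3; testing against the dict itself is enough. Also, replacing "سید" before "سیده" leaves a stray "ه" behind, so such names can never match. The sketch below is only a guess at a fix under those assumptions: it uses a tiny made-up in-memory dict instead of the real name files, touches no database, and normalise and lookup_gender are hypothetical names.

```python
# Longest word first, so "سیده" is removed whole instead of leaving "ه".
TITLE_WORDS = ("سیده", "سادات", "سید")


def normalise(first_name):
    """Strip the title words and all spaces from a first name."""
    for word in TITLE_WORDS:
        first_name = first_name.replace(word, "")
    return first_name.replace(" ", "")


def lookup_gender(first_name, person):
    """Return True (male), False (female) or None when the name is unknown."""
    return person.get(normalise(first_name))


person = {"علی": True, "زهرا": False}  # made-up sample data
print(lookup_gender("سید علی", person))
print(lookup_gender("سیده زهرا", person))
print(lookup_gender("ناشناس", person))
```

Returning None for unknown names lets the caller decide whether to route the record to the "invalid" collection.
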
Updating keys in dict via loop

I'm trying to create a dict that contains a list of users and their ssh-keys.
The list of users and the ssh-keys are stored in different yaml files, which I need to grab the info from. The files are "admins" and "users" and they look like:
Admins file:
admins:
  global:
    - bob
    - john
    - jimmy
    - hubert
SSH key file:
users:
  bob:
    fullname: Bob McBob
    ssh-keys:
      ssh-rsa "thisismysshkey"
  john:
    fullname: John McJohn
    ssh-keys:
      ssh-rsa "thisismysshkey"
So far i have this code:
import yaml
#open admins list as "f"
f = open("./admins.sls", 'r')
#creates "admins" list
admins = yaml.load(f)
#grab only needed names and make a list
admins = admins['admins']['global']
#convert back to dict with dummy values of 0
admin_dict = dict.fromkeys(admins, 0)
So at this point I have this dict:
print(admin_dict)
{'bob': 0, 'john': 0}
Now i want to loop through the list of names in "admins" and update the key (currently set to 0) with their ssh-key from the other file.
So i do:
f = open("./users.sls", 'r')
ssh_keys = yaml.load(f)
for i in admins:
    admin_dict[k] = ssh_keys['users'][i]['ssh-keys']
but when running that for loop, only one value is getting updated.
Kinda stuck here, i'm way out of my python depth... am i on the right track here?
edit:
changed that last loop to be:
for i in admins:
    for key, value in admin_dict.items():
        admin_dict[key] = ssh_keys['users'][i]['ssh-keys']
and things look better. Is this valid?

With an admin.yaml file like:
admins:
  global:
    - bob
    - john
    - jimmy
    - hubert
And an ssh_keys.yaml like so:
users:
  bob:
    fullname: Bob McBob
    ssh-keys:
      ssh-rsa: "bob-rsa-key"
  john:
    fullname: John McJohn
    ssh-keys:
      ssh-rsa: "john-rsa-key"
  jimmy:
    fullname: Jimmy McGill
    ssh-keys:
      ssh-rsa: "jimmy-rsa-key"
      ssh-ecdsa: "jimmy-ecdsa-key"
You could do something like this, assuming you want to know which type of ssh key each user has (if not, just index one level deeper for the specific name of the key type in the dictionary comprehension):
import yaml
import pprint
def main():
    with open('admin.yaml', 'r') as f:
        admins_dict = yaml.load(f, yaml.SafeLoader)
    admins_list = admins_dict['admins']['global']
    with open('ssh_keys.yaml', 'r') as f:
        ssh_dict = yaml.load(f, yaml.SafeLoader)
    users_dict = ssh_dict['users']
    admins_with_keys_dict = {
        admin: users_dict[admin]['ssh-keys'] if admin in users_dict else None
        for admin in admins_list
    }
    pp = pprint.PrettyPrinter(indent=2)
    pp.pprint(admins_with_keys_dict)
if __name__ == '__main__':
    main()
Output:
{ 'bob': {'ssh-rsa': 'bob-rsa-key'},
  'hubert': None,
  'jimmy': {'ssh-ecdsa': 'jimmy-ecdsa-key', 'ssh-rsa': 'jimmy-rsa-key'},
  'john': {'ssh-rsa': 'john-rsa-key'}}
Alternative Output if you only want the rsa keys:
{ 'bob': 'bob-rsa-key',
  'hubert': None,
  'jimmy': 'jimmy-rsa-key',
  'john': 'john-rsa-key'}
The above output is achieved by making the following change to the dictionary comprehension:
    admin: users_dict[admin]['ssh-keys']['ssh-rsa'] if admin in users_dict else None
                                        ^^^^^^^^^^^
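As for the edit in the question: the nested loop assigns the current admin's keys to every entry on each pass, so (provided no lookup fails first) all entries end up holding the last admin's keys. A single loop that uses the same name for the lookup and for the target key avoids that. A small sketch with in-memory dicts standing in for the two loaded files; the key strings are made up:

```python
admins = ["bob", "john", "hubert"]
ssh_keys = {
    "users": {
        "bob": {"fullname": "Bob McBob", "ssh-keys": {"ssh-rsa": "bob-rsa-key"}},
        "john": {"fullname": "John McJohn", "ssh-keys": {"ssh-rsa": "john-rsa-key"}},
    }
}

# Start with dummy values, as in the question.
admin_dict = dict.fromkeys(admins, 0)

for name in admins:
    # .get() returns None for admins that have no entry in the key file.
    user = ssh_keys["users"].get(name)
    admin_dict[name] = user["ssh-keys"] if user else None

print(admin_dict)
```

Here hubert ends up with None because he is listed as an admin but has no entry under users.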

Replacing strings in YAML using Python

I have the following YAML:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800
  centos-7-crawl:
    priority: 5
    details::
      ha: 1
      args:
        template: *startup
        timeout: 0
The first task defines the template name and version, which are then used by the other tasks. The template definition should not change, but other things, especially the task name, will.
What would be the best way to change template name and version in Python?
I have the following regex for matching (using re.DOTALL):
template:.*name: (.*?)version: (.*?)\s
However, I have not figured out how to use re.sub so far. Or is there a more convenient way of doing this?

For this kind of round-tripping (load-modify-dump) of YAML you should be using ruamel.yaml (disclaimer: I am the author of that package).
If your input is in input.yaml, you can then relatively easily find
the name and version under key template and update them:
import sys
import ruamel.yaml
def find_template(d):
    if isinstance(d, list):
        for elem in d:
            x = find_template(elem)
            if x is not None:
                return x
    elif isinstance(d, dict):
        for k in d:
            v = d[k]
            if k == 'template':
                if 'name' in v and 'version' in v:
                    return v
            x = find_template(v)
            if x is not None:
                return x
    return None
yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
with open('input.yaml') as ifp:
    data = yaml.load(ifp)
template = find_template(data)
template['name'] = 'startup-centos-8'
template['version'] = '1.3'
yaml.dump(data, sys.stdout)
which gives:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    'details:':
      ha: 0
      args:
        template: &startup
          name: startup-centos-8
          version: '1.3'
        timeout: 1800
  centos-7-crawl:
    priority: 5
    'details:':
      ha: 1
      args:
        template: *startup
        timeout: 0
Please note that the (superfluous) quotes that I inserted in the input, as well as the comment and the name of the alias are preserved.
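Only one mapping needs updating because, as far as I know, the loader resolves the alias *startup to the same Python object as the anchored mapping, so a change made through one task is visible through the other. The sketch below imitates that with plain dicts so it runs without ruamel.yaml; the nested structure is a trimmed-down stand-in for the loaded document, and find_first is a hypothetical, simplified variant of find_template.

```python
def find_first(node, key):
    """Depth-first search for the first mapping stored under `key`."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, dict):
                return v
            found = find_first(v, key)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_first(item, key)
            if found is not None:
                return found
    return None


# One dict object referenced from two places, like an anchor and its alias.
shared = {"name": "startup-centos-7", "version": 1.2}
data = {
    "tasks": {
        "centos-7-prepare": {"args": {"template": shared, "timeout": 1800}},
        "centos-7-crawl": {"args": {"template": shared, "timeout": 0}},
    }
}

template = find_first(data, "template")
template["name"] = "startup-centos-8"

# The second task sees the change because it holds the same object.
print(data["tasks"]["centos-7-crawl"]["args"]["template"]["name"])
```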

I would parse the YAML file into a dictionary, then edit the field and write the dictionary back out to YAML.
See the question "How can I parse a YAML file in Python" for a discussion on parsing YAML in Python, but I think you would end up with something like this:
from ruamel.yaml import YAML
from io import StringIO
yaml = YAML(typ='safe')
yaml.default_flow_style = False
# Parse from string (doc is assumed to hold the YAML text from the question)
myConfig = yaml.load(doc)
# Example replacement code. The key is "details:" because the question's YAML writes "details::".
# The anchor &startup is not part of the value, so compare against the actual name.
for task in myConfig["tasks"]:
    template = myConfig["tasks"][task]["details:"]["args"]["template"]
    if template["name"] == "startup-centos-7":
        template["name"] = "new value"
# Convert back to string
buf = StringIO()
yaml.dump(myConfig, buf)
updatedYml = buf.getvalue()
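For completeness, the re.sub route the question asked about could look roughly like the sketch below. It works on the raw text rather than on parsed YAML, so it is fragile (it assumes name comes before version and that both are single tokens); the pattern and the trimmed sample text are my own illustration, not taken from either answer.

```python
import re

# Trimmed sample of the question's document (assumed layout).
TEXT = """\
tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800
"""

# Groups 1 and 3 capture the text to keep; groups 2 and 4 are the old values.
PATTERN = re.compile(r"(template:.*?name: )(\S+)(\s+version: )(\S+)", re.DOTALL)

# \g<1> and \g<3> re-insert the kept text around the new values.
new_text, count = PATTERN.subn(r"\g<1>startup-centos-8\g<3>1.3", TEXT, count=1)
print(new_text)
```

The \g<n> form is used instead of \1 and \3 so that a digit following the group reference is not read as part of the group number.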
