I have the following json data.
users:
[
{
"group_ids": [
"group_1"
],
"user_id": "U_1",
"name": "kite"
},
{
"group_ids": [
"group_1",
"group_2"
],
"user_id": "U_2",
"name": "mike"
},
{
"group_ids": [
"group_1",
"group_3"
],
"user_id": "U_3",
"name": "an"
},
{
"group_ids": [
"group_3"
],
"user_id": "U_4",
"name": "joe"
}
]
groups:
{
"group_1": {
"label": "sre",
"group_type": "freelance"
},
"group_2": {
"label": "dev",
"group_type": "staff"
},
"group_3": {
"label": "qa",
"group_type": "member"
},
"group_4": {
"label": "ops",
"group_type": "staff"
}
}
I want to get the following output with the keys in order when given user id U_2.
Any pseudo code or hints will be good.
{
"groups": [
{"label": "sre", "group_type": "freelance"},
{"label": "dev", "group_type": "staff"}
],
"user_id": "U_2",
"name": "mike"
}
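On Python 3.7 and later, a plain dict preserves insertion order, so OrderedDict is only strictly needed on older versions. The snippet below uses the standard OrderedDict class, which keeps keys in order on any version.
In the snippet below, I assume you have two JSON files, users.json and groups.json.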
from collections import OrderedDict
import json
from pathlib import Path
# Load data from JSON files
users = json.loads(Path("users.json").read_text())
groups = json.loads(Path("groups.json").read_text())
# Index users by their ID
users = {user["user_id"]: user for user in users}
def get_groups(user_id):
    # Get a shallow copy of the required user representation
    user = users[user_id].copy()
    # Get the list of its group IDs and remove it from the copy
    group_ids = user.pop("group_ids")
    # Add the group representation for each group
    user["groups"] = [groups[group_id] for group_id in group_ids]
    # Convert user to OrderedDict so the keys appear in the desired order
    keys = "groups", "user_id", "name"
    user = OrderedDict([(key, user[key]) for key in keys])
    # Done!
    return user
The result of get_groups("U_2") is then:
>>> get_groups("U_2")
OrderedDict([('groups', [{'label': 'sre', 'group_type': 'freelance'}, {'label': 'dev', 'group_type': 'staff'}]), ('user_id', 'U_2'), ('name', 'mike')])
Finally, the standard json.dump and json.dumps functions respect the order of keys when you pass them an OrderedDict, so the JSON string comes out in the same order.
>>> print(json.dumps(get_groups("U_2"), indent=4))
{
"groups": [
{
"label": "sre",
"group_type": "freelance"
},
{
"label": "dev",
"group_type": "staff"
}
],
"user_id": "U_2",
"name": "mike"
}
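For comparison, here is a minimal sketch of the same lookup with a plain dict, assuming Python 3.7 or later (where dicts keep insertion order). The data is inlined instead of read from files, and the names users_by_id and get_user_with_groups are made up for this illustration.

```python
import json

# Inlined subset of the question's data (assumption: same shape as users.json / groups.json).
users = [
    {"group_ids": ["group_1", "group_2"], "user_id": "U_2", "name": "mike"},
    {"group_ids": ["group_3"], "user_id": "U_4", "name": "joe"},
]
groups = {
    "group_1": {"label": "sre", "group_type": "freelance"},
    "group_2": {"label": "dev", "group_type": "staff"},
    "group_3": {"label": "qa", "group_type": "member"},
}

# Index users by their ID for direct lookup.
users_by_id = {user["user_id"]: user for user in users}


def get_user_with_groups(user_id):
    """Build a new dict whose keys are inserted in the desired output order."""
    user = users_by_id[user_id]
    return {
        "groups": [groups[group_id] for group_id in user["group_ids"]],
        "user_id": user["user_id"],
        "name": user["name"],
    }


result = get_user_with_groups("U_2")
print(json.dumps(result, indent=4))
```

Building a fresh dict also leaves the original user record untouched, so no copy is needed.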
I am new to Python and want to convert a CSV file into a JSON file. The JSON is nested, and its structure is dynamic: it is defined by the CSV header.
From csv input:
ID, Name, person_id/id_type, person_id/id_value,person_id_expiry_date,additional_info/0/name,additional_info/0/value,additional_info/1/name,additional_info/1/value,salary_info/details/0/grade,salary_info/details/0/payment,salary_info/details/0/amount,salary_info/details/1/next_promotion
1,Peter,PASSPORT,A452817,1-01-2055,Age,19,Gender,M,Manager,Monthly,8956.23,unknown
2,Jane,PASSPORT,B859804,2-01-2035,Age,38,Gender,F,Worker, Monthly,125980.1,unknown
To json output:
[
{
"ID": 1,
"Name": "Peter",
"person_id": {
"id_type": "PASSPORT",
"id_value": "A452817"
},
"person_id_expiry_date": "1-01-2055",
"additional_info": [
{
"name": "Age",
"value": 19
},
{
"name": "Gender",
"value": "M"
}
],
"salary_info": {
"details": [
{
"grade": "Manager",
"payment": "Monthly",
"amount": 8956.23
},
{
"next_promotion": "unknown"
}
]
}
},
{
"ID": 2,
"Name": "Jane",
"person_id": {
"id_type": "PASSPORT",
"id_value": "B859804"
},
"person_id_expiry_date": "2-01-2035",
"additional_info": [
{
"name": "Age",
"value": 38
},
{
"name": "Gender",
"value": "F"
}
],
"salary_info": {
"details": [
{
"grade": "Worker",
"payment": " Monthly",
"amount": 125980.1
},
{
"next_promotion": "unknown"
}
]
}
}
]
Can this be done with the existing pandas API, or do I have to write a lot of complex code to construct the JSON object dynamically? Thanks.
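One possible approach, sketched here with only the standard library rather than pandas: treat each header as a slash-separated path, where a numeric segment means "index into a list" and any other segment means "key in a dict". The helper names (parse_scalar, insert_path, csv_to_nested) are made up for this sketch, the sample is a shortened version of the CSV above, and the type guessing is deliberately naive (for example, float() would also accept a cell such as "nan").

```python
import csv
import io
import json


def parse_scalar(text):
    """Convert a CSV cell to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def insert_path(root, parts, value):
    """Store value in root under the nested path given by parts."""
    node = root
    for part, next_part in zip(parts, parts[1:]):
        # A numeric next segment means the child is a list, otherwise a dict.
        child_type = list if next_part.isdigit() else dict
        if isinstance(node, list):
            index = int(part)
            while len(node) <= index:
                node.append(child_type())
            node = node[index]
        else:
            node = node.setdefault(part, child_type())
    last = parts[-1]
    if isinstance(node, list):
        index = int(last)
        while len(node) <= index:
            node.append(None)
        node[index] = value
    else:
        node[last] = value


def csv_to_nested(text):
    """Turn CSV text with path-style headers into a list of nested records."""
    rows = csv.reader(io.StringIO(text))
    # Header names are stripped because the sample header has stray spaces.
    header = [name.strip().split("/") for name in next(rows)]
    records = []
    for row in rows:
        if not row:  # skip blank lines
            continue
        record = {}
        for parts, cell in zip(header, row):
            insert_path(record, parts, parse_scalar(cell))
        records.append(record)
    return records


sample = (
    "ID, Name, person_id/id_type,additional_info/0/name,additional_info/0/value,"
    "salary_info/details/0/amount,salary_info/details/1/next_promotion\n"
    "1,Peter,PASSPORT,Age,19,8956.23,unknown\n"
)
records = csv_to_nested(sample)
print(json.dumps(records, indent=2))
```

Cell values are left unstripped on purpose, since the expected output above keeps the leading space in " Monthly".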
I have a JSON with the following structure. I want to extract some data to different lists so that I will be able to transform them into a pandas dataframe.
{
"ratings": {
"like": {
"average": null,
"counts": {
"1": {
"total": 0,
"users": []
}
}
}
},
"sharefile_vault_url": null,
"last_event_on": "2021-02-03 00:00:01",
],
"fields": [
{
"type": "text",
"field_id": 130987800,
"label": "Name and Surname",
"values": [
{
"value": "John Smith"
}
],
{
"type": "category",
"field_id": 139057651,
"label": "Gender",
"values": [
{
"value": {
"status": "active",
"text": "Male",
"id": 1,
"color": "DCEBD8"
}
}
],
{
"type": "category",
"field_id": 151333010,
"label": "Field of Studies",
"values": [
{
"value": {
"status": "active",
"text": "Languages",
"id": 3,
"color": "DCEBD8"
}
}
],
}
}
For example, I create a list
names = []
and, whenever "label" in the "fields" list is "Name and Surname", I append ["values"][0]["value"] to it, so names now contains "John Smith". I do exactly the same for the "Gender" label, appending the value to the list genders.
The above dictionary is contained in a list of dictionaries, so I just have to loop through the list and extract the relevant fields like this:
names = []
genders = []
for r in range(len(users)):
    for i in range(len(users[r].json()["items"])):
        for field in users[r].json()["items"][i]["fields"]:
            if field["label"] == "Name and Surname":
                names.append(field["values"][0]["value"])
            elif field["label"] == "Gender":
                genders.append(field["values"][0]["value"]["text"])
            else:
                pass  # Something else
Here users is a list of responses from the API. Each response's JSON has an items key holding a list of dictionaries, and each of those has a fields key whose value is a list of field dictionaries (such as Name and Surname and Gender).
The problem is that the dictionary with "label: Field of Studies" is optional and is not always present in the list of fields.
How can I manage to check for its presence, and if so append its value to a list, and None otherwise?
The data you posted does not appear to be valid JSON (the brackets do not match). That said, if I were you I would try pandas.json_normalize. As far as I know, it fills keys that are missing from a record with NaN instead of raising an error.
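If you would rather stay with plain lists, one way to handle the optional field is to index each item's fields by label and use dict.get, which returns None when the label is absent. This is only a sketch: the helper names field_value and collect_columns are made up, and the second record ("Jane Doe") is invented sample data, not from the question.

```python
def field_value(field):
    """Return the display value of one field dictionary."""
    value = field["values"][0]["value"]
    # Category fields wrap the text in a dict; text fields hold a plain string.
    return value["text"] if isinstance(value, dict) else value


def collect_columns(items, labels):
    """Build one list per label, appending None where an item lacks that label."""
    columns = {label: [] for label in labels}
    for item in items:
        by_label = {field["label"]: field for field in item["fields"]}
        for label in labels:
            field = by_label.get(label)  # None when the label is missing
            columns[label].append(None if field is None else field_value(field))
    return columns


# Trimmed sample items; the second one is invented and has no "Field of Studies".
items = [
    {"fields": [
        {"label": "Name and Surname", "values": [{"value": "John Smith"}]},
        {"label": "Gender", "values": [{"value": {"text": "Male"}}]},
        {"label": "Field of Studies", "values": [{"value": {"text": "Languages"}}]},
    ]},
    {"fields": [
        {"label": "Name and Surname", "values": [{"value": "Jane Doe"}]},
        {"label": "Gender", "values": [{"value": {"text": "Female"}}]},
    ]},
]
columns = collect_columns(items, ["Name and Surname", "Gender", "Field of Studies"])
print(columns)
```

Because every list ends up the same length, the resulting dict should be usable directly as input to pandas.DataFrame.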
I want to print one user that I select from a JSON list loaded into Python, but I can only print all the users. How do you print a specific user? At the moment I have this, which prints all the users in an ugly format:
import json
with open('Admin_sample.json') as f:
    admin_json = json.load(f)
print(admin_json['staff'])
The JSON file looks like this
{
"staff": [
{
"id": "DA7153",
"name": [
"Fran\u00c3\u00a7ois",
"Ullman"
],
"department": {
"name": "Admin"
},
"server_admin": "true"
},
{
"id": "DA7356",
"name": [
"Bob",
"Johnson"
],
"department": {
"name": "Admin"
},
"server_admin": "false"
},
],
"assets": [
{
"asset_name": "ENGAGED SLOTH",
"asset_type": "File",
"owner": "DA8333",
"details": {
"security": {
"cia": [
"HIGH",
"INTERMEDIATE",
"LOW"
],
"data_categories": {
"Personal": "true",
"Personal Sensitive": "true",
"Customer Sensitive": "true"
}
},
"retention": 2
},
"file_type": "Document",
"server": {
"server_name": "ISOLATED UGUISU",
"ip": [
10,
234,
148,
52
]
}
},
{
"asset_name": "ISOLATED VIPER",
"asset_type": "File",
"owner": "DA8262",
"details": {
"security": {
"cia": [
"LOW",
"HIGH",
"LOW"
],
"data_categories": {
"Personal": "false",
"Personal Sensitive": "false",
"Customer Sensitive": "true"
}
},
"retention": 2
},
},
]
I just can't work it out. Any help would be appreciated.
Thanks.
You need to index into the staff list, e.g.:
print(admin_json['staff'][0])
I suggest reading up a bit on dictionaries in Python. Dictionary values can be set to any object: in this case, the value of the staff key is set to a list of dicts. Here's an example that will loop through all the staff members and print their names:
staff_list = admin_json['staff']
for person in staff_list:
    name_parts = person['name']
    full_name = ' '.join(name_parts)  # combine name parts into a string
    print(full_name)
Try something like this:
import json
def findStaffWithId(allStaff, id):
    for staff in allStaff:
        if staff["id"] == id:
            return staff
    return {}  # no staff found
with open('Admin_sample.json') as f:
    admin_json = json.load(f)
print(findStaffWithId(admin_json['staff'], "DA7356"))
You can list all the users' names with
users = [user["name"] for user in admin_json['staff']]
There are two lists in this JSON file: staff and assets. After parsing it, you index into a list to reach a single entry. For example, to get the first staff member's ID:
print(admin_json['staff'][0]['id'])
This will print:
DA7153
json.load simply converts the JSON file into a Python dictionary. For further info:
https://docs.python.org/3/tutorial/datastructures.html#dictionaries
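If you need to look up several users, it may be simpler to index the staff list by ID once and then use dict.get. A small sketch, with the two staff records from the question inlined instead of loaded from Admin_sample.json; the name staff_by_id is made up here, and the first name is written as plain "Francois" for simplicity.

```python
# Inlined stand-in for json.load(open('Admin_sample.json')) (assumption: same shape).
admin_json = {
    "staff": [
        {"id": "DA7153", "name": ["Francois", "Ullman"], "server_admin": "true"},
        {"id": "DA7356", "name": ["Bob", "Johnson"], "server_admin": "false"},
    ]
}

# Build the index once; each lookup is then a single dict access.
staff_by_id = {person["id"]: person for person in admin_json["staff"]}

person = staff_by_id.get("DA7356")  # None if the ID is unknown
if person is not None:
    print(" ".join(person["name"]))  # prints "Bob Johnson"
```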
I am trying to add a key id to each inner dictionary, using the same uuid.uuid4() value when the 'node' values are equal and a new uuid.uuid4() when a distinct node is found.
Say two dictionaries have the same 'node' value, e.g. node: 'Bangalore'. I want to generate the same ID for both, and a fresh ID for every other distinct node.
This is the code I'm working on now:
import uuid
import json
node_list = [
{
"nodes": [
{
"node": "Kunal",
"label": "PERSON"
},
{
"node": "Bangalore",
"label": "LOC"
}
]
},
{
"nodes": [
{
"node": "John",
"label": "PERSON"
},
{
"node": "Bangalore",
"label": "LOC"
}
]
}
]
for outer_node_dict in node_list:
    for inner_dict in outer_node_dict["nodes"]:
        inner_dict['id'] = str(uuid.uuid4())  # Remember the key's value here and apply this statement somehow?
print(json.dumps(node_list, indent = True))
This is the response I want:
"[
{
"nodes": [
{
"node": "Kunal",
"label": "PERSON",
"id": "fbf094eb-8670-4c31-a641-4cf16c3596d1"
},
{
"node": "Bangalore",
"label": "LOC",
"id": "24867c2a-f66a-4370-8c5d-8af5b9a25675"
}
]
},
{
"nodes": [
{
"node": "John",
"label": "PERSON",
"id": "5eddc375-ed3e-4f6a-81dc-3966590e8f35"
},
{
"node": "Bangalore",
"label": "LOC",
"id": "24867c2a-f66a-4370-8c5d-8af5b9a25675"
}
]
}
]"
But currently it generates this:
"[
{
"nodes": [
{
"node": "Kunal",
"label": "PERSON",
"id": "3cce6e36-9d1c-4058-a11b-2bcd0da96c83"
},
{
"node": "Bangalore",
"label": "LOC",
"id": "4d860d3b-1835-4816-a372-050c1cc88fbb"
}
]
},
{
"nodes": [
{
"node": "John",
"label": "PERSON",
"id": "67fc9ba9-b591-44d4-a0ae-70503cda9dfe"
},
{
"node": "Bangalore",
"label": "LOC",
"id": "f83025a0-7d8e-4ec8-b4a0-0bced982825f"
}
]
}
]"
How can I remember a key's value and apply the same ID to it throughout the dictionary?
It looks like you want the UUID to be the same for the same "node" value. So, instead of generating a new one every time, store the UUIDs in a dict keyed by node value:
node_uuids = defaultdict(lambda: uuid.uuid4())
and then, in your inner loop, instead of
inner_dict['id'] = str(uuid.uuid4())
you write
inner_dict['id'] = str(node_uuids[inner_dict['node']])
A complete working example is as follows:
from collections import defaultdict
import uuid
import json
node_list = [
{
"nodes": [
{
"node": "Kunal",
"label": "PERSON"
},
{
"node": "Bangalore",
"label": "LOC"
}
]
},
{
"nodes": [
{
"node": "John",
"label": "PERSON"
},
{
"node": "Bangalore",
"label": "LOC"
}
]
}
]
node_uuids = defaultdict(lambda: uuid.uuid4())
for outer_node_dict in node_list:
    for inner_dict in outer_node_dict["nodes"]:
        inner_dict['id'] = str(node_uuids[inner_dict['node']])
print(json.dumps(node_list, indent = True))
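A different technique, in case the IDs should also stay the same between runs: uuid.uuid5 derives a UUID deterministically from a namespace and a name, so no lookup dict is needed. This is a swapped-in alternative to the random uuid4 approach above, not a replacement for it; the namespace choice and the helper name node_id are arbitrary assumptions for this sketch.

```python
import json
import uuid

# Arbitrary namespace for this sketch; any fixed UUID would do.
NAMESPACE = uuid.NAMESPACE_URL


def node_id(name):
    """Return a name-based (version 5) UUID string; equal names give equal IDs."""
    return str(uuid.uuid5(NAMESPACE, name))


node_list = [
    {"nodes": [{"node": "Kunal", "label": "PERSON"}, {"node": "Bangalore", "label": "LOC"}]},
    {"nodes": [{"node": "John", "label": "PERSON"}, {"node": "Bangalore", "label": "LOC"}]},
]

for outer_node_dict in node_list:
    for inner_dict in outer_node_dict["nodes"]:
        inner_dict["id"] = node_id(inner_dict["node"])

print(json.dumps(node_list, indent=4))
```

The trade-off is that the IDs are predictable from the node names, which may or may not be acceptable for your use case.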
Are there any Python helper libraries I can use to create models that generate complex JSON files such as the one below? I've read about colander, but I'm not sure it does what I need. The tricky part is that the trigger-rule section may contain nested match rules, as described at https://github.com/adnanh/webhook/wiki/Hook-Rules
[
{
"id": "webhook",
"execute-command": "/home/adnan/redeploy-go-webhook.sh",
"command-working-directory": "/home/adnan/go",
"pass-arguments-to-command":
[
{
"source": "payload",
"name": "head_commit.id"
},
{
"source": "payload",
"name": "pusher.name"
},
{
"source": "payload",
"name": "pusher.email"
}
],
"trigger-rule":
{
"and":
[
{
"match":
{
"type": "payload-hash-sha1",
"secret": "mysecret",
"parameter":
{
"source": "header",
"name": "X-Hub-Signature"
}
}
},
{
"match":
{
"type": "value",
"value": "refs/heads/master",
"parameter":
{
"source": "payload",
"name": "ref"
}
}
}
]
}
}
]
Define a class like this:
class AttributeDictionary(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
When you load your JSON, pass AttributeDictionary as the object_hook:
import json
data = json.loads(json_str, object_hook=AttributeDictionary)
Then you can access dict entries by specifying the key as an attribute:
print(data[0].id)
Output
webhook
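Note: Keys containing dashes, such as execute-command, are not valid Python identifiers, so they cannot be read with attribute syntax. Replace the dashes with underscores if you want attribute access to those keys; dictionary-style access such as data[0]["execute-command"] still works.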
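One way to do that replacement at load time is to rename the keys inside the object_hook itself. The sketch below assumes a shortened version of the JSON above; underscore_keys is a made-up helper name, and this variant of AttributeDictionary raises AttributeError for missing keys, which is an addition of mine rather than part of the class shown earlier.

```python
import json


class AttributeDictionary(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # Raise the exception type that getattr/hasattr expect.
            raise AttributeError(name) from None

    __setattr__ = dict.__setitem__


def underscore_keys(obj):
    """object_hook: rename 'some-key' to 'some_key' and wrap the result."""
    return AttributeDictionary(
        (key.replace("-", "_"), value) for key, value in obj.items()
    )


json_str = """
[{"id": "webhook",
  "execute-command": "/home/adnan/redeploy-go-webhook.sh",
  "trigger-rule": {"and": [{"match": {"type": "value"}}]}}]
"""
data = json.loads(json_str, object_hook=underscore_keys)
print(data[0].execute_command)
# "and" is a Python keyword, so it still needs item access or getattr.
print(data[0].trigger_rule["and"][0].match.type)
```

If the structure is later written back with json.dumps, the underscores would have to be turned back into dashes for the output to match the original format.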