Transforming nested JSON to simple dictionary JSON structure - python

I'm calling an API that returns data in this format:
{
"records": [
{
"columns": [
{
"fieldNameOrPath": "Name",
"value": "Burlington Textiles Weaving Plant Generator"
},
{
"fieldNameOrPath": "AccountName",
"value": "Burlington Textiles Corp of America"
}
]
},
{
"columns": [
{
"fieldNameOrPath": "Name",
"value": "Dickenson Mobile Generators"
},
{
"fieldNameOrPath": "AccountName",
"value": "Dickenson plc"
}
]
}
]
}
To use this data in my subsequent workflow, I need a structure such as:
{
"records": [
{
"Name": "Burlington Textiles Weaving Plant Generator",
"AccountName": "Burlington Textiles Corp of America"
},
{
"Name": "Dickenson Mobile Generators",
"AccountName": "Dickenson plc"
}
]
}
So the fieldNameOrPath value needs to become the key and the value value needs to become the value.
Can this transformation be done with a Python function?
These conditions apply:
I don't know how many objects will be inside each columns list.
The key and value names could be different, so I need to pass fieldNameOrPath (the name of the field holding the key) and value (the name of the field holding the value) to the function as parameters.

We'll suppose the data from the API is stored in a variable data. To get the data transformed into the format you propose, we can iterate through all the records, and for each record create a dictionary by iterating through its columns, using the fieldNameOrPath values as the keys, and the value values as the dictionary values.
trans_data = {"records": []}
for record in data["records"]:
    trans_record = {}
    for column in record["columns"]:
        trans_record[column["fieldNameOrPath"]] = column["value"]
    trans_data["records"].append(trans_record)
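Since the question asks for the key and value field names to be passed in, the loop above could be wrapped in a function along these lines. This is only a sketch: the function name flatten_records and its parameter names are made up for illustration, and it assumes every column object contains both fields.

```python
def flatten_records(data, key_field="fieldNameOrPath", value_field="value",
                    list_field="columns"):
    """Collapse each record's list of key/value objects into a single dict.

    key_field and value_field name the entries that hold the key and the
    value; list_field names the list inside each record (all hypothetical
    parameter names chosen for this sketch).
    """
    return {
        "records": [
            {column[key_field]: column[value_field] for column in record[list_field]}
            for record in data["records"]
        ]
    }


data = {
    "records": [
        {"columns": [
            {"fieldNameOrPath": "Name", "value": "Dickenson Mobile Generators"},
            {"fieldNameOrPath": "AccountName", "value": "Dickenson plc"},
        ]},
    ]
}

result = flatten_records(data)
print(result)
```

The number of objects in each columns list does not matter, because the inner comprehension simply walks whatever is there.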

Related

How do I return the value of a key that is nested in an anonymous JSON block with jsonpath?

I am trying to extract the value of a key that is nested in an anonymous JSON block. This is what the JSON block looks like after result:
"extras": [
{
"key": "alternative_name",
"value": "catr"
},
{
"key": "lineage",
"value": "This dataset was amalgamated, optimised and published by the Spatial hub. For more information visit www.spatialhub.scot."
},
{
"key": "ssdi_link",
"value": "https://www.spatialdata.gov.scot/geonetwork/srv/eng/catalog.search#/metadata/4826c148-c1eb-4eaa-abad-ca4b1ec65230"
},
{
"key": "update_frequency",
"value": "annually"
}
],
What I am trying to do is extract the value annually but I can't use index because some other datasets have more keys under the extras section. I am trying to write a jsonpath expression that extracts value where key is update_frequency
So far what I have tried is:
$.result.extras[*].value[?(key='update_frequency')]
And still no luck.
Any idea what could be wrong?
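The filter has to be applied to the elements of extras, using @ for the current element and == for the comparison. This should work:
$.result.extras[?(@.key=="update_frequency")].value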
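If the response has already been parsed in Python, the same lookup can be sketched without a JSONPath library. The variable name response and the helper extra_value are assumptions for this example; the data is a shortened copy of the extras list from the question.

```python
# Shortened copy of the structure from the question, already parsed into a dict.
response = {
    "result": {
        "extras": [
            {"key": "alternative_name", "value": "catr"},
            {"key": "update_frequency", "value": "annually"},
        ]
    }
}


def extra_value(extras, wanted_key, default=None):
    """Return the value of the first extras entry whose key matches, else default."""
    for item in extras:
        if item.get("key") == wanted_key:
            return item.get("value")
    return default


update_frequency = extra_value(response["result"]["extras"], "update_frequency")
print(update_frequency)
```

Because the match is done on the key rather than on a position, it keeps working when other datasets have more entries under extras.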

Pandas DataFrame to JSON with Nested list/dicts

Here is my df:
   text                   date                  channel    sentiment  product  segment
0  I like the new layout  2021-08-30T18:15:22Z  Snowflake  predict    Skills   EMEA
I need to convert this to JSON output that matches the following:
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "product",
"value": "Skills"
},
{
"field": "segment",
"value": "EMEA"
}
]
}
]
I'm getting stuck with mapping the keys of the columns to the values in the first dict and mapping the column and row to new keys in the final dict. I've tried various options using df.groupby with .apply() but am coming up short.
Samples of what I've tried:
df.groupby(['text', 'date','channel','sentiment','product','segment']).apply(
lambda r: r[['27cf2f]].to_dict(orient='records')).unstack('text').apply(lambda s: [
{s.index.name: idx, 'fields': value}
for idx, value in s.items()]
).to_json(orient='records')
Any and all help is appreciated!
Solved with this:
# Specify field column names
fieldcols = ['product','segment']
# Build a dict for each group as a Series named `fields`
res = (df.groupby(['text', 'date', 'channel', 'sentiment'])
         .apply(lambda s: [{'field': field,
                            'value': value}
                           for field in fieldcols
                           for value in s[field].values])
       ).rename('fields')
# Convert Series to DataFrame and then to_json
res = res.reset_index().to_json(orient='records', date_format='iso')
Output:
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "product",
"value": "Skills"
},
{
"field": "segment",
"value": "EMEA"
}
]
}
]
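For comparison, here is a sketch that builds the same shape row by row in plain Python instead of using groupby. It assumes the rows are available as a list of dicts, which is what df.to_dict(orient='records') returns; the function name nest_fields is made up. Unlike the groupby version, it emits one output object per row and does not merge rows that share the same text, date, channel and sentiment.

```python
import json


def nest_fields(rows, field_cols):
    """Move the columns listed in field_cols into a nested 'fields' list."""
    nested = []
    for row in rows:
        # Keep every column that is not being nested, in its original order.
        item = {key: value for key, value in row.items() if key not in field_cols}
        item["fields"] = [{"field": col, "value": row[col]} for col in field_cols]
        nested.append(item)
    return nested


# One row, matching the DataFrame shown in the question.
rows = [{
    "text": "I like the new layout",
    "date": "2021-08-30T18:15:22Z",
    "channel": "Snowflake",
    "sentiment": "predict",
    "product": "Skills",
    "segment": "EMEA",
}]

nested = nest_fields(rows, ["product", "segment"])
print(json.dumps(nested, indent=2))
```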

Very nested JSON with optional fields into pandas dataframe

I have a JSON with the following structure. I want to extract some data to different lists so that I will be able to transform them into a pandas dataframe.
{
"ratings": {
"like": {
"average": null,
"counts": {
"1": {
"total": 0,
"users": []
}
}
}
},
"sharefile_vault_url": null,
"last_event_on": "2021-02-03 00:00:01",
],
"fields": [
{
"type": "text",
"field_id": 130987800,
"label": "Name and Surname",
"values": [
{
"value": "John Smith"
}
],
{
"type": "category",
"field_id": 139057651,
"label": "Gender",
"values": [
{
"value": {
"status": "active",
"text": "Male",
"id": 1,
"color": "DCEBD8"
}
}
],
{
"type": "category",
"field_id": 151333010,
"label": "Field of Studies",
"values": [
{
"value": {
"status": "active",
"text": "Languages",
"id": 3,
"color": "DCEBD8"
}
}
],
}
}
For example, I create a list
names = []
where if "label" in the "fields" list is "Name and Surname" I append ["values"][0]["value"] so names now contains "John Smith". I do exactly the same for the "Gender" label and append the value to the list genders.
The above dictionary is contained in a list of dictionaries, so I just have to loop through the list and extract the relevant fields like this:
names = []
genders = []
for r in range(len(users)):
    for i in range(len(users[r].json()["items"])):
        for field in users[r].json()["items"][i]["fields"]:
            if field["label"] == "Name and Surname":
                names.append(field["values"][0]["value"])
            elif field["label"] == "Gender":
                genders.append(field["values"][0]["value"]["text"])
            else:
                # Something else
                pass
where users is a list of responses from the API. Each response's JSON has an items key holding a list of dictionaries, and each of those has a fields key whose value is a list of field dictionaries (such as Name and Surname and Gender).
The problem is that the dictionary with "label: Field of Studies" is optional and is not always present in the list of fields.
How can I manage to check for its presence, and if so append its value to a list, and None otherwise?
To me, it seems that the data you have is not valid JSON. However, if I were you I would try pandas.json_normalize. When a key is missing from some of the objects, this function fills the corresponding cell with NaN instead of raising an error.
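A plain-Python alternative is to look each label up per item and fall back to None when it is absent, which keeps all lists the same length. The sketch below is not what the answer above proposes; the helper name field_value and the second sample item are invented for illustration, and items stands for one response's "items" list.

```python
def field_value(fields, label):
    """Return the first value of the field with this label, or None if absent."""
    for field in fields:
        if field.get("label") == label:
            values = field.get("values") or []
            if not values:
                return None
            value = values[0]["value"]
            # Category fields wrap the text in a dict; text fields hold it directly.
            return value["text"] if isinstance(value, dict) else value
    return None


# Two trimmed items; the second one (made up) has no "Field of Studies" entry.
items = [
    {"fields": [
        {"label": "Name and Surname", "values": [{"value": "John Smith"}]},
        {"label": "Gender", "values": [{"value": {"text": "Male", "id": 1}}]},
        {"label": "Field of Studies", "values": [{"value": {"text": "Languages", "id": 3}}]},
    ]},
    {"fields": [
        {"label": "Name and Surname", "values": [{"value": "Jane Doe"}]},
        {"label": "Gender", "values": [{"value": {"text": "Female", "id": 2}}]},
    ]},
]

names = [field_value(item["fields"], "Name and Surname") for item in items]
genders = [field_value(item["fields"], "Gender") for item in items]
studies = [field_value(item["fields"], "Field of Studies") for item in items]
print(names, genders, studies)
```

Building each list with one entry per item, rather than appending only when a label is found, is what lets the lists line up as DataFrame columns later.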

Getting Values with Python from JSON Array with multiple entries

I have a question about getting a specific value out of a JSON array based on another value in the array. This might be a little vague, but let me show you.
I have a results array in JSON format:
{
"result": [{
"id": "SomeID1",
"name": "NAME1"
},
{
"id": "SomeID2",
"name": "NAME2"
}
]
}
I always know the name, but the ID is subject to change. So what I want to do is get the ID value based on the name I give. I am not able to alter the JSON format, as it is a result I get from an API call.
So when I enter NAME1, the result should be "SomeID1".
One approach could be (if name is unique):
data={
"result": [{
"id": "SomeID1",
"name": "NAME1"
},
{
"id": "SomeID2",
"name": "NAME2"
}
]
}
known_name ="NAME1"
print(next(x['id'] for x in data["result"] if x["name"]==known_name))
If name is not unique:
for x in data["result"]:
    if x['name'] == known_name:
        print(x["id"])
or you could store them in a list
print([x['id'] for x in data["result"] if x["name"]==known_name])
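One caveat with the next(...) version above is that it raises StopIteration when no entry matches. A small sketch of two variations follows: passing a default to next, and building a name-to-IDs lookup once when many names need to be resolved. The helper name id_for_name is made up for this example.

```python
data = {
    "result": [
        {"id": "SomeID1", "name": "NAME1"},
        {"id": "SomeID2", "name": "NAME2"},
    ]
}


def id_for_name(results, name, default=None):
    """Return the id of the first entry with this name, or default if none matches."""
    return next((x["id"] for x in results if x["name"] == name), default)


# Lookup table for repeated queries; a list per name copes with non-unique names.
ids_by_name = {}
for entry in data["result"]:
    ids_by_name.setdefault(entry["name"], []).append(entry["id"])

print(id_for_name(data["result"], "NAME1"))
print(id_for_name(data["result"], "UNKNOWN"))
print(ids_by_name)
```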

How to get and append output data from json response and store the duplicate keys if any in file with RobotFramework and python?

I have a JSON response with nested values. I have read the required keys and values from the JSON. The issue I am facing is that I am not able to append those keys and values to the JSON file where I am storing them. Can anyone help me with a solution?
The JSON response has tags with multiple keys and values, like this:
${API_Output}= {
"data": {
"resources": {
"edges": [
{
"node": {
"tags": []
}
},
{
"node": {
"tags": [
{
"name": "app",
"value": "e2e"
},
{
"name": "Cost",
"value": "qwerty"
}
]
}
},
{
"node": {
"tags": [
{
"name": "app",
"value": "e2e"
},
{
"name": "Cost",
"value": "qwerty"
},
{
"name": "test",
"value": "qwerty"
}
]
}
}
]
}
}
}
Robot Framework code:
${dict1}= Set Variable ${API_Output}
${cnt}= get length ${dict1['data']['resources']['edges']}
${edge}= set variable ${dict1['data']['resources']['edges']}
run keyword if ${cnt}==0 set test message The resources count is Zero(0)
log to console ${cnt}-count
: FOR ${item} IN RANGE 0 ${cnt}
\ ${readName}= Set Variable ${edge[${item}]['node']['configuration']}
\ ${readvalue2}= Set Variable ${edge[${item}]['node']['tags']}
\ ${tag_Count}= get length ${edge[${item}]['node']['tags']}
\ ${tag_variable}= set variable ${edge[${item}]['node']['tags']}
\ forkeyword ${tag_Count} ${tag_variable} ${readName}
${req_json} Json.Dumps ${dict}
Create File results.json ${req_json}
forkeyword
[Arguments] ${tag_Count} ${tag_variable} ${readName}
#{z}= create list
: FOR ${item} IN RANGE 0 ${tag_Count}
\ ${resourceName}= run keyword if ${tag_Count} > 0 set variable ${readName['name']}
\ log to console ${resourceName}-forloop
\ ${readkey}= set variable ${tag_variable[${item}]['name']}
\ ${readvalue}= set variable ${tag_variable[${item}]['value']}
\ set to dictionary ${dict} resourceName ${resourceName}
\ set to dictionary ${dict} ${readkey} ${readvalue}
set suite variable ${dict}
I expected all values from all tags, but only the last tag's key and value are written. Can anyone please guide me on the code?
Thanks in advance for any help!!
Why do you have duplicate keys in the first place?
Duplicate keys in JSON are not well defined by the spec and can lead to unpredictable behavior. If you read the JSON into a Python dict, information will be lost, since Python dict keys must be unique. Your best bet is to change the JSON; duplicate keys are a bad idea.
If you change the JSON to have no duplicates, it gives the correct result.
Since all the other keys are duplicates, your simplified version would be:
"data": {
"resources": {
"edges": [
{
"node": {
"tags": [
{
"name": "app",
"value": "e2e"
},
{
"name": "Cost",
"value": "qwert"
},
{
"name": "test",
"value": "qwerty"
}
]
}
}
]
}
}
So if your Robot Framework code gives the following result:
{
"name": "app",
"value": "e2e"
},
{
"name": "Cost",
"value": "qwert"
},
{
"name": "test",
"value": "qwerty"
}
It is correct.
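To illustrate the point about unique keys, here is a short Python sketch (Python rather than Robot Framework, and not a drop-in replacement for the test code). It first shows that a dict keeps only the last value for a repeated key, then collects one dict per node into a list so that tag names repeated across nodes do not overwrite each other. The edges data is a trimmed copy of the question's response; the variable names are assumptions.

```python
import json

# A repeated key survives parsing only once: the last value wins.
parsed = json.loads('{"app": "e2e", "app": "other"}')
print(parsed)

# Trimmed copy of the "edges" list from the question.
edges = [
    {"node": {"tags": []}},
    {"node": {"tags": [{"name": "app", "value": "e2e"},
                       {"name": "Cost", "value": "qwerty"}]}},
    {"node": {"tags": [{"name": "app", "value": "e2e"},
                       {"name": "Cost", "value": "qwerty"},
                       {"name": "test", "value": "qwerty"}]}},
]

# One dict per node, appended to a list, instead of one shared dict.
results = []
for edge in edges:
    tags = edge["node"]["tags"]
    results.append({tag["name"]: tag["value"] for tag in tags})

print(json.dumps(results, indent=2))
```

Writing every tag into a single shared dictionary would leave only the last value seen for "app" and "Cost", which matches the behavior described in the question.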
