How to convert one format of json into another using python?

I want to convert JSON1 into JSON2 with minimal looping in Python, because there are many records like the ones below.
JSON1:
{
"1-5":[
{
"NAME": "A",
"AGE": "1"
},
{
"NAME": "A",
"AGE": "2"
},
{
"NAME": "B",
"AGE": "3"
}
],
"6-10":[
{
"NAME": "x",
"AGE": "6"
},
{
"NAME": "y",
"AGE": "6"
},
{
"NAME": "z",
"AGE": "10"
}
]
}
JSON2:
{
"1": [
{
"NAME": "A",
"AGE": "1"
}
],
"2": [
{
"NAME": "A",
"AGE": "2"
},
{
"NAME": "B",
"AGE": "2"
}
],
"3": [
{
"NAME": "B",
"AGE": "1"
}
],
...
}
Is there any way to do this? Could anyone help me with it?
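One way to read JSON2 is as the same records regrouped by their AGE value instead of by age range. The sketch below works under that assumption only: the sample values in JSON2 do not all match JSON1, so the intended rule may differ, and the function name regroup_by_age is hypothetical. It makes a single pass over the records, which is the least work possible since every record has to be visited once.

```python
from collections import defaultdict

# Copy of JSON1 from the question.
json1 = {
    "1-5": [
        {"NAME": "A", "AGE": "1"},
        {"NAME": "A", "AGE": "2"},
        {"NAME": "B", "AGE": "3"},
    ],
    "6-10": [
        {"NAME": "x", "AGE": "6"},
        {"NAME": "y", "AGE": "6"},
        {"NAME": "z", "AGE": "10"},
    ],
}


def regroup_by_age(ranges):
    """Regroup records keyed by age range into records keyed by AGE.

    Assumption: the target format groups by the exact AGE string.
    """
    grouped = defaultdict(list)
    for records in ranges.values():      # one list per age range
        for record in records:           # each record is visited once
            grouped[record["AGE"]].append(record)
    return dict(grouped)


json2 = regroup_by_age(json1)
```

With json.load and json.dump around it, the same function can be applied to data read from and written to files.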

Related

merge/overwrite two JSON object in Python

I want to merge two JSON files while the data overwrites itself in the main JSON file.
My main object is the following:
{
"data": [
{
"name": "name1",
"gender": "male",
"age": "20",
"subject": "Python",
"pass": "No"
}
]
}
It needs to be overwritten with the contents of "new data.json":
{
"data": [
{
"name": "name1",
"subject": "Python",
"pass": "Yes"
}
]
}
The result object should be:
{
"data": [
{
"name": "name1",
"gender": "male",
"age": "20",
"subject": "Python",
"pass": "Yes" //updated
}
]
}

In general, you can use Python's built-in dict.update method, which overwrites the values of keys that exist in both dictionaries with those from the new one and leaves all other keys in the current dictionary untouched.
For your case (assuming you always need to update the FIRST element in the array within the "data" key):
original_data = {
"data": [
{
"name": "name1",
"gender": "male",
"age": "20",
"subject": "Python",
"pass": "No"
}
]
}
new_data = {
"data": [
{
"name": "name1",
"subject": "Python",
"pass": "Yes"
}
]
}
original_data["data"][0].update(new_data["data"][0]) #this updates the original JSON
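If the "data" list can hold more than one record, updating by position stops being reliable. A possible sketch is to match records on a key field instead. Everything here beyond dict.update is an assumption for illustration: the helper name merge_by_key, the choice of "name" as the matching field, and the decision to append records that have no match.

```python
def merge_by_key(original, updates, key="name"):
    """Update records in original["data"] in place, matching on `key`.

    Hypothetical helper: assumes every record has the `key` field and
    that its value is unique within original["data"].
    """
    index = {record[key]: record for record in original["data"]}
    for new_record in updates["data"]:
        target = index.get(new_record[key])
        if target is None:
            # Assumption: unmatched records are appended as new entries.
            original["data"].append(dict(new_record))
        else:
            # Overwrite shared keys, keep the rest of the old record.
            target.update(new_record)
    return original


original_data = {
    "data": [
        {"name": "name1", "gender": "male", "age": "20",
         "subject": "Python", "pass": "No"}
    ]
}
new_data = {
    "data": [
        {"name": "name1", "subject": "Python", "pass": "Yes"}
    ]
}

merged = merge_by_key(original_data, new_data)
```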

Creating custom JSON from existing JSON using Python

(Python beginner alert) I am trying to create a custom JSON from an existing JSON. The scenario: I have a source that can send many sets of fields, but I want to cherry-pick some of them and create a subset while maintaining the original JSON structure. Original sample:
{
"Response": {
"rCode": "11111",
"rDesc": "SUCCESS",
"pData": {
"code": "123-abc-456-xyz",
"sData": [
{
"receiptTime": "2014-03-02T00:00:00.000",
"sessionDate": "2014-02-28",
"dID": {
"d": {
"serialNo": "3432423423",
"dType": "11111",
"dTypeDesc": "123123sd"
},
"mode": "xyz"
},
"usage": {
"duration": "661",
"mOn": [
"2014-02-28_20:25:00",
"2014-02-28_22:58:00"
],
"mOff": [
"2014-02-28_21:36:00",
"2014-03-01_03:39:00"
]
},
"set": {
"abx": "1",
"ayx": "1",
"pal": "1"
},
"rEvents": {
"john": "doe",
"lorem": "ipsum"
}
},
{
"receiptTime": "2014-04-02T00:00:00.000",
"sessionDate": "2014-04-28",
"dID": {
"d": {
"serialNo": "123123",
"dType": "11111",
"dTypeDesc": "123123sd"
},
"mode": "xyz"
},
"usage": {
"duration": "123",
"mOn": [
"2014-04-28_20:25:00",
"2014-04-28_22:58:00"
],
"mOff": [
"2014-04-28_21:36:00",
"2014-04-01_03:39:00"
]
},
"set": {
"abx": "4",
"ayx": "3",
"pal": "1"
},
"rEvents": {
"john": "doe",
"lorem": "ipsum"
}
}
]
}
}
}
Here each entry in the sData array has several fields, of which I want to keep only 24 and get rid of the rest. I know I could use element.pop(), but I cannot go and delete each new field whenever the source starts publishing one. Below is the expected output.
Expected output:
{
"Response": {
"rCode": "11111",
"rDesc": "SUCCESS",
"pData": {
"code": "123-abc-456-xyz",
"sData": [
{
"receiptTime": "2014-03-02T00:00:00.000",
"sessionDate": "2014-02-28",
"usage": {
"duration": "661",
"mOn": [
"2014-02-28_20:25:00",
"2014-02-28_22:58:00"
],
"mOff": [
"2014-02-28_21:36:00",
"2014-03-01_03:39:00"
]
},
"set": {
"abx": "1",
"ayx": "1",
"pal": "1"
}
},
{
"receiptTime": "2014-04-02T00:00:00.000",
"sessionDate": "2014-04-28",
"usage": {
"duration": "123",
"mOn": [
"2014-04-28_20:25:00",
"2014-04-28_22:58:00"
],
"mOff": [
"2014-04-28_21:36:00",
"2014-04-01_03:39:00"
]
},
"set": {
"abx": "4",
"ayx": "3",
"pal": "1"
}
}
]
}
}
}
I took reference from "How can I create a new JSON object form another using Python?", but it's not working as expected. Looking forward to inputs/solutions from all of you gurus. Thanks in advance.

Kind of like this:
import json

with open("fullset.json") as f:
    data = json.load(f)

def subset(d):
    # Keep only the whitelisted fields of one sData entry.
    newd = {}
    for name in ('receiptTime', 'sessionDate', 'usage', 'set'):
        newd[name] = d[name]
    return newd

data['Response']['pData']['sData'] = [subset(d) for d in data['Response']['pData']['sData']]
with open('newdata.json', 'w') as f:
    json.dump(data, f)
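The same whitelist idea can be tried on in-memory data. The sketch below is a variant, not the answer's exact code: it skips whitelisted fields that are missing from an entry instead of raising KeyError, which is an assumption about the desired behaviour. The constant KEEP and the trimmed sample are made up for illustration.

```python
# Fields to keep in every sData entry; anything else the source adds
# later is dropped automatically, so no pop() calls are needed.
KEEP = ("receiptTime", "sessionDate", "usage", "set")


def subset(entry, keep=KEEP):
    """Return a new dict with only the whitelisted keys present in entry."""
    return {name: entry[name] for name in keep if name in entry}


# Trimmed sample in the shape of the question's JSON.
data = {
    "Response": {
        "rCode": "11111",
        "pData": {
            "code": "123-abc-456-xyz",
            "sData": [
                {
                    "receiptTime": "2014-03-02T00:00:00.000",
                    "sessionDate": "2014-02-28",
                    "dID": {"mode": "xyz"},
                    "usage": {"duration": "661"},
                    "set": {"abx": "1"},
                    "rEvents": {"john": "doe"},
                }
            ],
        },
    }
}

p_data = data["Response"]["pData"]
p_data["sData"] = [subset(entry) for entry in p_data["sData"]]
```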

How to use S3 Select for Nested Parquet Objects

I have dumped data into a parquet file.
When I use
SELECT * FROM s3object s LIMIT 1
it gives me the following result.
{
"name": "John",
"age": "45",
"country": "USA",
"experience": [{
"company": {
"name": "ABC",
"years": "10",
"position": "Manager"
}
},
{
"company": {
"name": "BBC",
"years": "2",
"position": "Assistant"
}
}
]
}
I want to filter the result where company.name = "ABC"
so the output should look like the following:
{
"name": "John",
"age": "45",
"country": "USA",
"experience": [{
"company": {
"name": "ABC",
"years": "10",
"position": "Manager"
}
}
]
}
or this
{
"name": "John",
"age": "45",
"country": "USA",
"experience.company.name": "ABC",
"experience.company.years": "10",
"experience.company.position": "Manager"
}
Any support is highly appreciated.
Thanks.

Wit.ai Python - Extract confidence level from API output

I am new to Wit.ai and have started using it in my code. I am looking for an easier way than hardcoding to extract all the confidence levels from a given Wit.ai API output.
For example (API output):
{
"_text": "I believe I am a human",
"entities": {
"statement": [
{
"confidence": 0.97691847787856,
"value": "I",
"type": "value"
},
{
"confidence": 0.91728476663947,
"value": "I",
"type": "value"
}
],
"query": [
{
"confidence": 1,
"value": "am",
"type": "value"
}
]
},
"msg_id": "0YKCUvDvHC2gyydiU"
}
Thank You in advance.

You can iterate over entities to get confidence.
Something like :
data = {
"_text": "I believe I am a human",
"entities": {
"statement": [
{
"confidence": 0.97691847787856,
"value": "I",
"type": "value"
},
{
"confidence": 0.91728476663947,
"value": "I",
"type": "value"
}
],
"query": [
{
"confidence": 1,
"value": "am",
"type": "value"
}
]
},
"msg_id": "0YKCUvDvHC2gyydiU"
}
confidence = []
# .items() is the Python 3 spelling; the original used Python 2's .iteritems()
for k, v in data['entities'].items():
    for item in v:
        confidence.append((item['value'], item['confidence']))
print(confidence)
Which gives us:
[('I', 0.97691847787856), ('I', 0.91728476663947), ('am', 1)]
Hope this helps
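If the entity name is needed alongside the scores, the same iteration can be written as a dict comprehension that keeps one list of confidences per entity. This is a small sketch over a trimmed copy of the sample output; the variable name by_entity is hypothetical.

```python
# Trimmed copy of the API output from the question.
data = {
    "_text": "I believe I am a human",
    "entities": {
        "statement": [
            {"confidence": 0.97691847787856, "value": "I", "type": "value"},
            {"confidence": 0.91728476663947, "value": "I", "type": "value"},
        ],
        "query": [
            {"confidence": 1, "value": "am", "type": "value"},
        ],
    },
}

# Map each entity name to the confidences of its matches.
by_entity = {
    name: [item["confidence"] for item in items]
    for name, items in data["entities"].items()
}
```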

Unable to pull data from json using python

I have the following json
{
"response": {
"message": null,
"exception": null,
"context": [
{
"headers": null,
"name": "aname",
"children": [
{
"type": "cluster-connectivity",
"name": "cluster-connectivity"
},
{
"type": "consistency-groups",
"name": "consistency-groups"
},
{
"type": "devices",
"name": "devices"
},
{
"type": "exports",
"name": "exports"
},
{
"type": "storage-elements",
"name": "storage-elements"
},
{
"type": "system-volumes",
"name": "system-volumes"
},
{
"type": "uninterruptible-power-supplies",
"name": "uninterruptible-power-supplies"
},
{
"type": "virtual-volumes",
"name": "virtual-volumes"
}
],
"parent": "/clusters",
"attributes": [
{
"value": "true",
"name": "allow-auto-join"
},
{
"value": "0",
"name": "auto-expel-count"
},
{
"value": "0",
"name": "auto-expel-period"
},
{
"value": "0",
"name": "auto-join-delay"
},
{
"value": "1",
"name": "cluster-id"
},
{
"value": "true",
"name": "connected"
},
{
"value": "synchronous",
"name": "default-cache-mode"
},
{
"value": "true",
"name": "default-caw-template"
},
{
"value": "blah",
"name": "default-director"
},
{
"value": [
"blah",
"blah"
],
"name": "director-names"
},
{
"value": [
],
"name": "health-indications"
},
{
"value": "ok",
"name": "health-state"
},
{
"value": "1",
"name": "island-id"
},
{
"value": "blah",
"name": "name"
},
{
"value": "ok",
"name": "operational-status"
},
{
"value": [
],
"name": "transition-indications"
},
{
"value": [
],
"name": "transition-progress"
}
],
"type": "cluster"
}
],
"custom-data": null
}
}
which I'm trying to parse using the json module in Python. I am only interested in getting the following information out of it.
Name Value
operational-status Value
health-state Value
Here is what I have tried.
In the script below, data is the JSON returned from a webpage.
json = json.loads(data)
healthstate= json['response']['context']['operational-status']
operationalstatus = json['response']['context']['health-status']
Unfortunately, I think I must be missing something, as the above results in an error saying that indexes must be integers, not strings.
If I try
healthstate= json['response'][0]
it errors, saying index 0 is out of range.
Any help would be gratefully received.

json['response']['context'] is a list, so that object requires you to use integer indices.
Each item in that list is itself a dictionary again. In this case there is only one such item.
To get all "name": "health-state" dictionaries out of that structure you'd need to do a little more processing:
[attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
would give you a list of matching values for health-state in the first context.
Demo:
>>> [attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
[u'ok']

You have to follow the data structure. It's best to manipulate the data interactively and check what every item is. If it's a list, you have to index it positionally or iterate through it and check the values. If it's a dict, you have to index it by its keys. For example, here is a function that gets the context and then iterates through its attributes, checking for a particular name.
def get_attribute(data, attribute):
    for attrib in data['response']['context'][0]['attributes']:
        if attrib['name'] == attribute:
            return attrib['value']
    return 'Not Found'
>>> data = json.loads(s)
>>> get_attribute(data, 'operational-status')
u'ok'
>>> get_attribute(data, 'health-state')
u'ok'
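When several attributes are needed, a possible alternative is to turn the list of name/value pairs into a dict once and then look values up by name. This is a sketch on a trimmed copy of the response; the variable names raw and attributes are made up for illustration, and it assumes attribute names are unique within one context.

```python
import json

# Trimmed copy of the response from the question.
raw = """
{
  "response": {
    "context": [
      {
        "name": "aname",
        "attributes": [
          {"value": "1", "name": "cluster-id"},
          {"value": ["blah", "blah"], "name": "director-names"},
          {"value": "ok", "name": "health-state"},
          {"value": "ok", "name": "operational-status"}
        ],
        "type": "cluster"
      }
    ]
  }
}
"""

data = json.loads(raw)

# "context" is a list, so take its first item before reading "attributes".
first_context = data["response"]["context"][0]

# Build a name -> value lookup in one pass over the attributes.
attributes = {attr["name"]: attr["value"] for attr in first_context["attributes"]}

health_state = attributes.get("health-state")
operational_status = attributes.get("operational-status")
```

Using .get returns None for a name that is absent, so a missing attribute does not raise an exception.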

json['response']['context'] is a list, not a dict. The structure is not exactly what you think it is.
For example, the only "operational-status" I see in there is the entry at index 14 of the attributes list, so it can be read with the following:
json['response']['context'][0]['attributes'][14]['value']
