Parsing multi dimensional JSON array to Python - python

I'm in over my head, trying to parse JSON for my first time and dealing with a multi dimensional array.
{
"secret": "[Hidden]",
"minutes": 20,
"link": "http:\/\/www.1.com",
"bookmark_collection": {
"free_link": {
"name": "#free_link#",
"bookmarks": [
{
"name": "1",
"link": "http:\/\/www.1.com"
},
{
"name": "2",
"link": "http:\/\/2.dk"
},
{
"name": "3",
"link": "http:\/\/www.3.in"
}
]
},
"boarding_pass": {
"name": "Boarding Pass",
"bookmarks": [
{
"name": "1",
"link": "http:\/\/www.1.com\/"
},
{
"name": "2",
"link": "http:\/\/www.2.com\/"
},
{
"name": "3",
"link": "http:\/\/www.3.hk"
}
]
},
"sublinks": {
"name": "sublinks",
"link": [
"http:\/\/www.1.com",
"http:\/\/www.2.com",
"http:\/\/www.3.com"
]
}
}
}
This is divided into 3 parts: the static data on my first dimension (secret, minutes, link), which I need to get as separate strings.
Then I need a dictionary per "bookmark collection". These do not have fixed names, so I need the name of each one and the links/names of each bookmark.
Then there are the separate sublinks, which are always the same, where I need all the links in a separate dictionary.
I'm reading about parsing JSON, but most of the stuff I find is a simple array put into 1 dictionary.
Does anyone have any good techniques to do this?

After you parse the JSON, you will end up with a Python dict. So, suppose the above JSON is in a string named input_data:
import json
# This converts from JSON to a python dict
parsed_input = json.loads(input_data)
# Now, all of your static variables are referenceable as keys:
secret = parsed_input['secret']
minutes = parsed_input['minutes']
link = parsed_input['link']
# Plus, you can get your bookmark collection as:
bookmark_collection = parsed_input['bookmark_collection']
# Print a list of names of the bookmark collections...
print bookmark_collection.keys() # Note this contains sublinks, so remove it if needed
# Get the name of the Boarding Pass bookmark:
print bookmark_collection['boarding_pass']['name']
# Print out a list of all bookmark links as:
# Boarding Pass
# * 1: http://www.1.com/
# * 2: http://www.2.com/
# ...
for bookmark_definition in bookmark_collection.values():
    # Skip sublinks...
    if bookmark_definition['name'] == 'sublinks':
        continue
    print bookmark_definition['name']
    for bookmark in bookmark_definition['bookmarks']:
        print " * %(name)s: %(link)s" % bookmark
# Get the sublink definition:
sublinks = parsed_input['bookmark_collection']['sublinks']
# .. and print them
print sublinks['name']
for link in sublinks['link']:
    print ' *', link
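For readers on Python 3, the same traversal can be sketched as below. This is only a sketch under assumptions: the sample is trimmed to one bookmark per collection, the URLs are written without the `\/` escapes, and the names `collections`, `sublinks` and `bookmarks_by_collection` are illustrative rather than anything from the question.

```python
import json

# Trimmed sample in the shape of the question's JSON (assumed data).
input_data = """
{
  "secret": "[Hidden]",
  "minutes": 20,
  "link": "http://www.1.com",
  "bookmark_collection": {
    "free_link": {
      "name": "#free_link#",
      "bookmarks": [{"name": "1", "link": "http://www.1.com"}]
    },
    "boarding_pass": {
      "name": "Boarding Pass",
      "bookmarks": [{"name": "1", "link": "http://www.1.com/"}]
    },
    "sublinks": {
      "name": "sublinks",
      "link": ["http://www.1.com", "http://www.2.com"]
    }
  }
}
"""

parsed = json.loads(input_data)

# Top-level scalar values.
secret = parsed["secret"]
minutes = parsed["minutes"]
link = parsed["link"]

# Work on a shallow copy so popping "sublinks" leaves `parsed` untouched.
collections = dict(parsed["bookmark_collection"])
sublinks = collections.pop("sublinks")

# One dict per collection: display name -> {bookmark name: bookmark link}.
bookmarks_by_collection = {
    definition["name"]: {b["name"]: b["link"] for b in definition["bookmarks"]}
    for definition in collections.values()
}

print(bookmarks_by_collection)
print(sublinks["link"])
```

Popping `sublinks` from a copy keeps the special entry out of the loop without needing a `continue` check on its name.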

Hmm, doesn't json.loads do the trick?
For example, if your data is in a file,
import json
text = open('/tmp/mydata.json').read()
d = json.loads(text)
# first level fields
print d['minutes'] # or 'secret' or 'link'
# the names of each of bookmark_collections's items
print d['bookmark_collection'].keys()
# the sublinks section, as a dict
print d['bookmark_collection']['sublinks']
The output of this code (given your sample input above) is:
20
[u'sublinks', u'free_link', u'boarding_pass']
{u'link': [u'http://www.1.com', u'http://www.2.com', u'http://www.3.com'], u'name': u'sublinks'}
Which, I think, gets you what you need?

Related

Improperly formatted json? [duplicate]

First, I am new to Python and working with JSON.
I am trying to extract just one value from an API request response, and I am having a difficult time parsing out the data I need.
I have done a lot of searching on how to do this, but almost all the examples use a string or file that is formatted much more simply than what I am getting.
I understand the key-value pair concept, but I am unsure how to reference the key-value pair I want. I think it has something to do with the response having multiple objects with the same key names. Or maybe the first line "Bookmark" is making things goofy.
The value I want is for the model name in the response example below.
That's all I need from this. Any help would be greatly appreciated.
{
"Bookmark": "<B><P><p>SerNum</p><p>Item</p></P><D><f>false</f><f>false</f></D><F><v>1101666</v><v>ADDMASTER IJ7102-23E</v></F><L><v>123456</v><v>Model Name</v></L></B>",
"Items": [
[
{
"Name": "SerNum",
"Value": "123456"
},
{
"Name": "Item",
"Value": "Model Name"
},
{
"Name": "_ItemId",
"Value": "PBT=[unit] unt.DT=[2021-07-28 08:20:33.513] unt.ID=[eae2621d-3e9f-4515-9763-55e67f65fae6]"
}
]
],
"Message": "Success",
"MessageCode": 0
}
If you want to find value of dictionary with key 'Name' and value 'Item' you can do:
import json
with open('your_data.json', 'r') as f_in:
    data = json.load(f_in)
model_name = next((i['Value'] for lst in data['Items'] for i in lst if i['Name'] == 'Item'), 'Model name not found.')
print(model_name)
Prints:
Model Name
Note: if the dictionary is not found string 'Model name not found.' is returned
First, load the JSON into a python dict:
import json
x = '''{
"Bookmark": "<B><P><p>SerNum</p><p>Item</p></P><D><f>false</f><f>false</f></D><F><v>1101666</v><v>ADDMASTER IJ7102-23E</v></F><L><v>123456</v><v>Model Name</v></L></B>",
"Items": [
[
{
"Name": "SerNum",
"Value": "123456"
},
{
"Name": "Item",
"Value": "Model Name"
},
{
"Name": "_ItemId",
"Value": "PBT=[unit] unt.DT=[2021-07-28 08:20:33.513] unt.ID=[eae2621d-3e9f-4515-9763-55e67f65fae6]"
}
]
],
"Message": "Success",
"MessageCode": 0
}'''
# parse x:
y = json.loads(x)
# The result is a Python dictionary.
Now if you want the value 'Model Name', you would do:
print(y['Items'][0][1]['Value'])
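Indexing by position stops working if the API ever returns the entries in a different order. A minimal sketch of an alternative, assuming each inner list holds `Name`/`Value` pairs with unique names: turn the first row into a plain dict and look the field up by name. The variable `row` and the shortened sample are illustrative.

```python
import json

# Shortened version of the response from the question (assumed sample data).
x = '''{
  "Items": [[
    {"Name": "SerNum", "Value": "123456"},
    {"Name": "Item", "Value": "Model Name"}
  ]],
  "Message": "Success",
  "MessageCode": 0
}'''

y = json.loads(x)

# Map each Name to its Value for the first (and here only) row of Items.
row = {entry["Name"]: entry["Value"] for entry in y["Items"][0]}

print(row["Item"])         # Model Name
print(row.get("Missing"))  # None
```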

Convert multiple string stored in a variable into a single list in python

I hope everyone is doing well.
I need a little help getting all the strings from a variable and storing them in a single list in Python.
For example -
I have a JSON file from which I am getting ids, and each id is stored in a variable called id. When I run print(id) I get:
17298626-991c-e490-bae6-47079c6e2202
17298496-19bd-2f89-7b5f-881921abc632
17298698-3e17-7a9b-b337-aacfd9483b1b
172986ac-d91d-c4ea-2e50-d53700480dd0
172986d0-18aa-6f51-9c62-6cb087ad31e5
172986f4-80f0-5c21-3aee-12f22a5f4322
17298712-a4ac-7b36-08e9-8512fa8322dd
17298747-8cc6-d9d0-8d05-50adf228c029
1729875c-050f-9a99-4850-bb0e6ad35fb0
1729875f-0d50-dc94-5515-b4891c40d81c
17298761-c26b-3ce5-e77e-db412c38a5b4
172987c8-2b5d-0d94-c365-e8407b0a8860
1729881a-e583-2b54-3a52-d092020d9c1d
1729881c-64a2-67cf-d561-6e5e38ed14cb
172987ec-7a20-7eb6-3ebe-a9fb621bb566
17298813-7ac4-258b-d6f9-aaf43f9147b1
17298813-f1ef-d28a-0817-5f3b86c3cf23
17298828-b62b-9ee6-248b-521b0663226e
17298825-7449-2fcb-378e-13671cb4688a
I want all these values to be stored in a single list.
Can someone please help me out with this?
Below is the code I am using:
import json
with open('requests.json') as f:
    data = json.load(f)
print(type(data))
for i in data:
    if 'traceId' in i:
        id = i['traceId']
        newid = id.split()
        #print(type(newid))
        print(newid)
And below is my json file looks like:
[
{
"id": "376287298-hjd8-jfjb-khkf-6479280283e9",
"submittedTime": 1591692502558,
"traceId": "17298626-991c-e490-bae6-47079c6e2202",
"userName": "ABC",
"onlyChanged": true,
"description": "Not Required",
"startTime": 1591694487929,
"result": "NONE",
"state": "EXECUTING",
"paused": false,
"application": {
"id": "16b22a09-a840-f4d9-f42a-64fd73fece57",
"name": "XYZ"
},
"applicationProcess": {
"id": "dihihdosfj9279278yrie8ue",
"name": "Deploy",
"version": 12
},
"environment": {
"id": "fkjdshkjdshglkjdshgldshldsh03r937837",
"name": "DEV"
},
"snapshot": {
"id": "djnglkfdglki98478yhgjh48yr844h",
"name": "DEV_snapshot"
}
},
{
"id": "17298495-f060-3e9d-7097-1f86d5160789",
"submittedTime": 1591692844597,
"traceId": "17298496-19bd-2f89-7b5f-881921abc632",
"userName": "UYT",
"onlyChanged": true,
"startTime": 1591692845543,
"result": "NONE",
"state": "EXECUTING",
"paused": false,
"application": {
"id": "osfodsho883793hgjbv98r3098w",
"name": "QA"
},
"applicationProcess": {
"id": "owjfoew028r2uoieroiehojehfoef",
"name": "EDC",
"version": 5
},
"environment": {
"id": "16cf69c5-4194-e557-707d-0663afdbceba",
"name": "DTESTU"
}
}
]
From where I am trying to get the traceId.
You could use a simple split method like the following:
ids = '''17298626-991c-e490-bae6-47079c6e2202 17298496-19bd-2f89-7b5f-881921abc632 17298698-3e17-7a9b-b337-aacfd9483b1b 172986ac-d91d-c4ea-2e50-d53700480dd0 172986d0-18aa-6f51-9c62-6cb087ad31e5 172986f4-80f0-5c21-3aee-12f22a5f4322 17298712-a4ac-7b36-08e9-8512fa8322dd 17298747-8cc6-d9d0-8d05-50adf228c029 1729875c-050f-9a99-4850-bb0e6ad35fb0 1729875f-0d50-dc94-5515-b4891c40d81c 17298761-c26b-3ce5-e77e-db412c38a5b4 172987c8-2b5d-0d94-c365-e8407b0a8860 1729881a-e583-2b54-3a52-d092020d9c1d 1729881c-64a2-67cf-d561-6e5e38ed14cb 172987ec-7a20-7eb6-3ebe-a9fb621bb566 17298813-7ac4-258b-d6f9-aaf43f9147b1 17298813-f1ef-d28a-0817-5f3b86c3cf23 17298828-b62b-9ee6-248b-521b0663226e 17298825-7449-2fcb-378e-13671cb4688a'''
l = ids.split(" ")
print(l)
This gives the following result. I assumed the separator is a single space; adjust it as needed:
['17298626-991c-e490-bae6-47079c6e2202', '17298496-19bd-2f89-7b5f-881921abc632', '17298698-3e17-7a9b-b337-aacfd9483b1b', '172986ac-d91d-c4ea-2e50-d53700480dd0', '172986d0-18aa-6f51-9c62-6cb087ad31e5', '172986f4-80f0-5c21-3aee-12f22a5f4322', '17298712-a4ac-7b36-08e9-8512fa8322dd', '17298747-8cc6-d9d0-8d05-50adf228c029', '1729875c-050f-9a99-4850-bb0e6ad35fb0', '1729875f-0d50-dc94-5515-b4891c40d81c', '17298761-c26b-3ce5-e77e-db412c38a5b4', '172987c8-2b5d-0d94-c365-e8407b0a8860', '1729881a-e583-2b54-3a52-d092020d9c1d', '1729881c-64a2-67cf-d561-6e5e38ed14cb', '172987ec-7a20-7eb6-3ebe-a9fb621bb566', '17298813-7ac4-258b-d6f9-aaf43f9147b1', '17298813-f1ef-d28a-0817-5f3b86c3cf23', '17298828-b62b-9ee6-248b-521b0663226e', '17298825-7449-2fcb-378e-13671cb4688a']
Edit
You get a list of lists because each iteration reads only one id and splits it on its own. Instead, create an empty list before the loop and append each id to it:
l = []
for i in data:
    if 'traceId' in i:
        id = i['traceId']
        l.append(id)
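The same loop can also be written as a list comprehension. A minimal sketch, assuming `data` is the list of records returned by `json.load`; here it is replaced by a small stand-in list, and the name `trace_ids` is illustrative.

```python
# Stand-in for the list returned by json.load(f); the second record
# deliberately lacks "traceId" (assumed sample data).
data = [
    {"id": "a", "traceId": "17298626-991c-e490-bae6-47079c6e2202"},
    {"id": "b"},
    {"id": "c", "traceId": "17298496-19bd-2f89-7b5f-881921abc632"},
]

# Collect every traceId that is present, in file order.
trace_ids = [record["traceId"] for record in data if "traceId" in record]
print(trace_ids)
```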
You can append the ids variable to a list, such as:
#list declaration
l1=[]
#this must be in your loop
l1.append(ids)
I'm assuming you get the id as a str type value. Using id.split() will return a list of all ids in one single Python list, as each id is separated by space here in your example.
id = """17298626-991c-e490-bae6-47079c6e2202 17298496-19bd-2f89-7b5f-881921abc632
17298698-3e17-7a9b-b337-aacfd9483b1b 172986ac-d91d-c4ea-2e50-d53700480dd0
172986d0-18aa-6f51-9c62-6cb087ad31e5 172986f4-80f0-5c21-3aee-12f22a5f4322
17298712-a4ac-7b36-08e9-8512fa8322dd 17298747-8cc6-d9d0-8d05-50adf228c029
1729875c-050f-9a99-4850-bb0e6ad35fb0 1729875f-0d50-dc94-5515-b4891c40d81c
17298761-c26b-3ce5-e77e-db412c38a5b4 172987c8-2b5d-0d94-c365-e8407b0a8860
1729881a-e583-2b54-3a52-d092020d9c1d 1729881c-64a2-67cf-d561-6e5e38ed14cb
172987ec-7a20-7eb6-3ebe-a9fb621bb566 17298813-7ac4-258b-d6f9-aaf43f9147b1
17298813-f1ef-d28a-0817-5f3b86c3cf23 17298828-b62b-9ee6-248b-521b0663226e
17298825-7449-2fcb-378e-13671cb4688a"""
id_list = id.split()
print(id_list)
Output:
['17298626-991c-e490-bae6-47079c6e2202', '17298496-19bd-2f89-7b5f-881921abc632',
'17298698-3e17-7a9b-b337-aacfd9483b1b', '172986ac-d91d-c4ea-2e50-d53700480dd0',
'172986d0-18aa-6f51-9c62-6cb087ad31e5', '172986f4-80f0-5c21-3aee-12f22a5f4322',
'17298712-a4ac-7b36-08e9-8512fa8322dd', '17298747-8cc6-d9d0-8d05-50adf228c029',
'1729875c-050f-9a99-4850-bb0e6ad35fb0', '1729875f-0d50-dc94-5515-b4891c40d81c',
'17298761-c26b-3ce5-e77e-db412c38a5b4', '172987c8-2b5d-0d94-c365-e8407b0a8860',
'1729881a-e583-2b54-3a52-d092020d9c1d', '1729881c-64a2-67cf-d561-6e5e38ed14cb',
'172987ec-7a20-7eb6-3ebe-a9fb621bb566', '17298813-7ac4-258b-d6f9-aaf43f9147b1',
'17298813-f1ef-d28a-0817-5f3b86c3cf23', '17298828-b62b-9ee6-248b-521b0663226e',
'17298825-7449-2fcb-378e-13671cb4688a']
Called with no arguments, split() splits on runs of any whitespace (spaces, tabs, newlines). You can pass the sep argument to use another separator if needed.
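The difference between the default behaviour and an explicit separator is easy to miss, so here is a small illustration. The sample strings are made up for the example.

```python
text = "a  b\tc\nd"  # two spaces, a tab and a newline between the letters

default_split = text.split()       # any run of whitespace is one separator
space_split = text.split(" ")      # only a single space is a separator
comma_split = "1,2,3".split(sep=",")

print(default_split)  # ['a', 'b', 'c', 'd']
print(space_split)    # ['a', '', 'b\tc\nd']
print(comma_split)    # ['1', '2', '3']
```

With an explicit `" "` the two consecutive spaces produce an empty string, and the tab and newline are left inside the last element.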

Python: Iterate JSON and remove items with specific criteria

I am trying to filter out data from API JSON response with Python and I get weird results. I would be glad if somebody can guide me how to deal with the situation.
The main idea is to remove irrelevant data in the JSON and keep only the data that is associated with particular people which I hold in a list.
Here is a snip of the JSON file:
{
"result": [
{
"number": "Number1",
"short_description": "Some Description",
"assignment_group": {
"display_value": "Some value",
"link": "https://some_link.com"
},
"incident_state": "Closed",
"sys_created_on": "2020-03-30 11:51:24",
"priority": "4 - Low",
"assigned_to": {
"display_value": "John Doe",
"link": "https://some_link.com"
}
},
{
"number": "Number2",
"short_description": "Some Description",
"assignment_group": {
"display_value": "Some value",
"link": "https://some_link.com"
},
"incident_state": "Closed",
"sys_created_on": "2020-03-10 11:07:13",
"priority": "4 - Low",
"assigned_to": {
"display_value": "Tyrell Greenley",
"link": "https://some_link.com"
}
},
{
"number": "Number3",
"short_description": "Some Description",
"assignment_group": {
"display_value": "Some value",
"link": "https://some_link.com"
},
"incident_state": "Closed",
"sys_created_on": "2020-03-20 10:23:35",
"priority": "4 - Low",
"assigned_to": {
"display_value": "Delmar Vachon",
"link": "https://some_link.com"
}
},
{
"number": "Number4",
"short_description": "Some Description",
"assignment_group": {
"display_value": "Some value",
"link": "https://some_link.com"
},
"incident_state": "Closed",
"sys_created_on": "2020-03-30 11:51:24",
"priority": "4 - Low",
"assigned_to": {
"display_value": "Samual Isham",
"link": "https://some_link.com"
}
}
]
}
Here is the Python code:
import json
users_test = ['Ahmad Wickert', 'Dick Weston', 'Gerardo Salido', 'Rosendo Dewey', 'Samual Isham']
# Load JSON file
with open('extract.json', 'r') as input_file:
    input_data = json.load(input_file)
# Create a function to clear the data
def clear_data(data, users):
    """Filter out the data and leave only records for the names in the users_test list"""
    for elem in data:
        print(elem['assigned_to']['display_value'] not in users)
        if elem['assigned_to']['display_value'] not in users:
            print('Removing {} from JSON as not present in list of names.'.format(elem['assigned_to']['display_value']))
            data.remove(elem)
        else:
            print('Keeping the record for {} in JSON.'.format(elem['assigned_to']['display_value']))
    return data
cd = clear_data(input_data['result'], users_test)
And here is the output, which seems to iterate through only 2 of the items in the file:
True
Removing John Doe from JSON as not present in list of names.
True
Removing Delmar Vachon from JSON as not present in list of names.
Process finished with exit code 0
It seems that the problem is more or less related to the .remove() method however I don't find any other suitable solution to delete these particular items that I do not need.
Here is the output of the iteration without applying the remove() method:
True
Removing John Doe from JSON as not present in list of names.
True
Removing Tyrell Greenley from JSON as not present in list of names.
True
Removing Delmar Vachon from JSON as not present in list of names.
False
Keeping the record for Samual Isham in JSON.
Process finished with exit code 0
Note: I have left the check for the name visible on purpose.
I would appreciate any ideas to sort out the situation.
If you don't need to log info about people you are removing you could simply try
filtered = [i for i in data['result'] if i['assigned_to']['display_value'] in users_test]
users_test = ['Ahmad Wickert', 'Dick Weston', 'Gerardo Salido', 'Rosendo Dewey', 'Samual Isham']
solution = []
for user in users_test:
    print(user)
    for value in data['result']:
        if user == value['assigned_to']['display_value']:
            solution.append(value)
print(solution)
For more efficient code, as asked by @NomadMonad:
solution = list(filter(lambda x: x['assigned_to']['display_value'] in users_test, data['result']))
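If the log messages from the question are still wanted, one option is to build a new list instead of removing from the old one. This is a sketch under assumptions: the records are cut down to the fields used here, and turning `users` into a set is an optional choice for faster membership tests, not something the question requires.

```python
def clear_data(records, users):
    """Return only the records assigned to one of `users`, logging each decision."""
    allowed = set(users)  # set membership tests are O(1) on average
    kept = []
    for record in records:
        name = record["assigned_to"]["display_value"]
        if name in allowed:
            print("Keeping the record for {} in JSON.".format(name))
            kept.append(record)
        else:
            print("Removing {} from JSON as not present in list of names.".format(name))
    return kept


# Cut-down records in the shape of the question's JSON (assumed sample data).
records = [
    {"number": "Number1", "assigned_to": {"display_value": "John Doe"}},
    {"number": "Number4", "assigned_to": {"display_value": "Samual Isham"}},
]
result = clear_data(records, ["Dick Weston", "Samual Isham"])
print([r["number"] for r in result])  # ['Number4']
```

Because the input list is never modified, every record is visited exactly once.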
You are modifying a list while at the same time iterating through it. Check out this blog post which describes this behavior.
A safer way to do this is to make a copy of your list to iterate over, and to delete from your original list:
import copy
def clear_data(data, users):
    """Filter out the data and leave only records for the names in the users_test list"""
    for elem in copy.deepcopy(data): # deepcopy handles nested dicts
        # Still call data.remove() in here
        if elem['assigned_to']['display_value'] not in users:
            data.remove(elem)
    return data
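The skipped records from the question can be reproduced on a small list. A minimal sketch, assuming plain strings in place of the incident records; the names `names`, `names_fixed` and `keep` are illustrative.

```python
keep = {"Samual Isham"}

# Removing while iterating: the list shrinks under the iterator, so the
# element that slides into the freed slot is never visited.
names = ["John Doe", "Tyrell Greenley", "Delmar Vachon", "Samual Isham"]
for name in names:
    if name not in keep:
        names.remove(name)
print(names)  # ['Tyrell Greenley', 'Samual Isham']

# Iterating over a copy visits every element exactly once.
names_fixed = ["John Doe", "Tyrell Greenley", "Delmar Vachon", "Samual Isham"]
for name in list(names_fixed):
    if name not in keep:
        names_fixed.remove(name)
print(names_fixed)  # ['Samual Isham']
```

The first loop only ever looks at "John Doe" and "Delmar Vachon", which matches the output shown in the question.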

Grab element from json dump

I'm using the following python code to connect to a jsonrpc server and nick some song information. However, I can't work out how to get the current title in to a variable to print elsewhere. Here is the code:
TracksInfo = []
for song in playingSongs:
    data = { "id":1,
             "method":"slim.request",
             "params":[ "",
                        ["songinfo",0,100, "track_id:%s" % song, "tags:GPASIediqtymkovrfijnCYXRTIuwxN"]
                      ]
           }
    params = json.dumps(data, sort_keys=True, indent=4)
    conn.request("POST", "/jsonrpc.js", params)
    httpResponse = conn.getresponse()
    data = httpResponse.read()
    responce = json.loads(data)
    print json.dumps(responce, sort_keys=True, indent=4)
    TrackInfo = responce['result']["songinfo_loop"][0]
    TracksInfo.append(TrackInfo)
This brings me back the data in json format and the print json.dump brings back:
pi@raspberrypi ~/pithon $ sudo python tom3.py
{
"id": 1,
"method": "slim.request",
"params": [
"",
[
"songinfo",
"0",
100,
"track_id:-140501481178464",
"tags:GPASIediqtymkovrfijnCYXRTIuwxN"
]
],
"result": {
"songinfo_loop": [
{
"id": "-140501481178464"
},
{
"title": "Witchcraft"
},
{
"artist": "Pendulum"
},
{
"duration": "253"
},
{
"tracknum": "1"
},
{
"type": "Ogg Vorbis (Spotify)"
},
{
"bitrate": "320k VBR"
},
{
"coverart": "0"
},
{
"url": "spotify:track:2A7ZZ1tjaluKYMlT3ItSfN"
},
{
"remote": 1
}
]
}
}
What I'm trying to get is result.songinfoloop.title (but I tried that!)
The songinfo_loop structure is.. peculiar. It is a list of dictionaries each with just one key.
Loop through it until you have one with a title:
TrackInfo = next(d['title'] for d in responce['result']["songinfo_loop"] if 'title' in d)
TracksInfo.append(TrackInfo)
A better option would be to 'collapse' all those dictionaries into one:
from functools import reduce  # a builtin on Python 2, must be imported on Python 3
songinfo = reduce(lambda d, p: d.update(p) or d,
                  responce['result']["songinfo_loop"], {})
TracksInfo.append(songinfo['title'])
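On Python 3 the same collapse can be written without `reduce`. A minimal sketch, assuming the list has the shape shown in the question; the sample is shortened to three entries.

```python
# Shortened copy of the songinfo_loop list from the question (assumed sample data).
songinfo_loop = [
    {"id": "-140501481178464"},
    {"title": "Witchcraft"},
    {"artist": "Pendulum"},
]

# Merge the one-key dicts into one; on a repeated key the later entry wins.
songinfo = {key: value for entry in songinfo_loop for key, value in entry.items()}

print(songinfo["title"])   # Witchcraft
print(songinfo["artist"])  # Pendulum
```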
songinfo_loop is a list, not a dict. That means you need to index it by position, or loop through it and find the dict that has a "title" key.
positional:
responce["result"]["songinfo_loop"][1]["title"]
loop:
for info in responce["result"]["songinfo_loop"]:
    if "title" in info.keys():
        print info["title"]
        break
else:
    print "no song title found"
Really, it seems like you would want to have the songinfo_loop be a dict, not a list. But if you need to leave it as a list, this is how you would pull the title.
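The result is really a standard Python dict, but songinfo_loop inside it is a list of one-key dicts, so you have to index into the list before asking for the title:
responce["result"]["songinfo_loop"][1]["title"]
which should work as long as the title stays in the second position.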

How to format unicode strings to utf-8 in Python?

I'm reading in a JSON string which is littered with u'string' style strings. Example:
[
{
"!\/award\/award_honor\/honored_for": {
"award": {
"id": "\/en\/spiel_des_jahres"
},
"year": {
"value": "1996"
}
},
"guid": "#9202a8c04000641f80000000003a0ee6",
"type": "\/games\/game",
"id": "\/en\/el_grande",
"name": "El Grande"
},
{
"!\/award\/award_honor\/honored_for": {
"award": {
"id": "\/en\/spiel_des_jahres"
},
"year": {
"value": "1995"
}
},
"guid": "#9202a8c04000641f80000000000495ec",
"type": "\/games\/game",
"id": "\/en\/settlers_of_catan",
"name": "Settlers of Catan"
}
]
If I assign name = result.name, then when I log or pass that value to a Django template, it displays as u'Dominion'
How do I format it to display as Dominion?
++ UPDATE ++
I think the problem has to do with printing values from a list or dictionary. For example:
result = freebase.mqlread(query)
games = {}
count = 0
r = result[0]
name = r.name
games["name"] = name,
self.response.out.write(games["name"])
self.response.out.write(name)
This displays as:
(u'Dominion',) // saved response to dictionary, and then printed
Dominion // when calling the value directly from the response
I need to iterate through an array of JSON items and the values are being shown with the unicode. Why?
The comma at the end of games["name"] = name, makes it a 1-tuple. Remove it.
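The effect of the trailing comma is easy to check in isolation. The snippet below is a small illustration in Python 3 syntax, so the `u` prefix from the question does not appear; the variable names are made up for the example.

```python
name = "Dominion"

with_comma = name,      # trailing comma builds a 1-tuple
without_comma = name    # plain assignment keeps the string

print(type(with_comma).__name__, with_comma)        # tuple ('Dominion',)
print(type(without_comma).__name__, without_comma)  # str Dominion
```

Writing a tuple to the response prints its repr, which is where the parentheses and quotes in the output come from.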
>>> # example
>>> s = u"Jägermütze"
>>> s.encode("utf-8")
'J\xc3\xa4germ\xc3\xbctze'
>>> print s.encode("utf-8") # on a utf-8 terminal
Jägermütze
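Don't know much about Django, but not accepting unicode strings seems unpythonic to me.
You can use str(your_string) to do this, but in Python 2 that only works when the text is pure ASCII; a unicode string with non-ASCII characters raises UnicodeEncodeError.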
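For comparison, here is a rough sketch of the same round trip on Python 3, where `str` is already Unicode and there is no `u'...'` form to get rid of. The sample word is written with escapes only to keep the snippet ASCII-safe.

```python
s = "J\u00e4germ\u00fctze"  # the same word as above, written with escapes

encoded = s.encode("utf-8")        # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(encoded)       # b'J\xc3\xa4germ\xc3\xbctze'
print(decoded == s)  # True
```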
