python nested lists/dictionaries and popping values


Apologies in advance for this being such a newbie question. I'm just beginning to write Python and I've been confused about popping values from nested dictionaries/lists, so I appreciate any help!
I have this sample JSON data:
{ "scans": [
{ "status": "completed", "starttime": "20150803T000000", "id":533},
{ "status": "completed", "starttime": "20150803T000000", "id":539}
] }
I'd like to pop the 'id' out of the "scans" key.
def listscans():
    response = requests.get(scansurl + "scans", headers=headers, verify=False)
    json_data = json.loads(response.text)
    print json.dumps(json_data['scans']['id'], indent=2)
This doesn't work because the nested key/values are inside a list, i.e.:
>>> print json.dumps(json_data['scans']['id'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
Can anyone point me in the right direction to get this to work? My long-term goal is to create a for loop that places all of the ids into another dictionary or list that I can use in another function.

json_data['scans'] returns a list of dicts. You are trying to index that list with a str, i.e. [...]["id"], which fails because list indices must be integers. Use an integer index to get each sub-element:
print json_data['scans'][0]['id'] # -> first dict
print json_data['scans'][1]['id'] # -> second dict
Or, to see all the ids, iterate over the list of dicts in json_data["scans"]:
for dct in json_data["scans"]:
    print(dct["id"])
To save them, append to a list:
all_ids = []
for dct in json_data["scans"]:
    all_ids.append(dct["id"])
Or use a list comprehension:
all_ids = [dct["id"] for dct in json_data["scans"]]
If there is a chance the key id might not be in every dict, use in to check before you access it:
all_ids = [dct["id"] for dct in json_data["scans"] if "id" in dct]
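Putting the pieces above together, here is a minimal Python 3 sketch that runs without the HTTP request. The payload is copied from the question; the third entry, which has no "id", is a hypothetical addition that shows what the membership check buys you.

```python
import json

# Sample payload copied from the question; the third entry (without "id")
# is a hypothetical addition to show the membership check.
raw = """
{ "scans": [
  { "status": "completed", "starttime": "20150803T000000", "id": 533 },
  { "status": "completed", "starttime": "20150803T000000", "id": 539 },
  { "status": "running", "starttime": "20150804T000000" }
] }
"""

json_data = json.loads(raw)

# Indexing the list by position works; indexing it by "id" would raise TypeError.
first_id = json_data["scans"][0]["id"]

# Collect every id, skipping dicts that lack the key.
all_ids = [dct["id"] for dct in json_data["scans"] if "id" in dct]

print(first_id)
print(all_ids)
```

Without the `if "id" in dct` filter, the third entry would raise a KeyError.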

Here is how you can iterate over the items and extract all ids:
json_data = ...
ids = []
for scan in json_data['scans']:
    scan_id = scan.pop('id')
    # you can use get instead of pop
    # then your initial data would not be changed,
    # but you'll still have the ids
    # scan_id = scan.get('id')
    ids.append(scan_id)
This approach will work too:
ids = [item.pop('id') for item in json_data['scans']]
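To make the difference between pop and get concrete, here is a small sketch in Python 3. The data is shaped like the question's sample; the deep copy exists only so that both variants can be shown side by side.

```python
import copy

# Sample data shaped like the question's; values are taken from the question.
json_data = {
    "scans": [
        {"status": "completed", "starttime": "20150803T000000", "id": 533},
        {"status": "completed", "starttime": "20150803T000000", "id": 539},
    ]
}

# get() reads the value and leaves each dict untouched.
untouched = copy.deepcopy(json_data)
ids_via_get = [scan.get("id") for scan in untouched["scans"]]

# pop() returns the value and removes the key from each dict.
# The second argument (None) is returned if the key is missing,
# which avoids a KeyError.
ids_via_pop = [scan.pop("id", None) for scan in json_data["scans"]]

print(ids_via_get)
print(ids_via_pop)
print("id" in untouched["scans"][0])   # key is still present after get()
print("id" in json_data["scans"][0])   # key is gone after pop()
```

If the rest of the program still needs the full scan records, get (or plain indexing) is the safer choice, since pop changes the parsed data in place.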

Related

Python3 - Parse list of strings inside nested json

Python noob here. I saw many similar questions, but none of them matches my exact use case. I have a simple nested JSON, and I'm trying to access the element name inside metadata. Below is my sample JSON.
{
  "items": [{
      "metadata": {
        "name": "myname1"
      }
    },
    {
      "metadata": {
        "name": "myname1"
      }
    }
  ]
}
Below is the code that I have tried so far, without success.
import json
f = open('./myfile.json')
x = f.read()
data = json.loads(x)
for i in data['items']:
    for j in i['metadata']:
        print (j['name'])
It errors out with the following:
  File "pythonjson.py", line 8, in <module>
    print (j['name'])
TypeError: string indices must be integers
When I ran print (type(j)) I got the output <class 'str'>. So I can see that it is a list of strings and not a dictionary. How can I parse a list of strings? Any official documentation or guide explaining the concept would be very helpful.

Your json is bad, and the python exception is clear and unambiguous. You have the basic string "name" and you are trying to ... do a lookup on that?
Let's cut out all the JSON and look at the real issue: iterating over a dict yields its keys, not its values. In your inner loop j is the string "name", so j['name'] tries to index a string. If you want the values too, you need dict.items():
https://docs.python.org/3/tutorial/datastructures.html#looping-techniques
metadata = {"name": "myname1"}
for key, value in metadata.items():
    if key == "name":
        print ('the name is', value)
But why bother if you already know the key you want to look up?
This is literally why we have dict.
print ('the name is', metadata["name"])

You likely need:
import json
with open('./myfile.json') as f:
    data = json.load(f)
for item in data['items']:
    print(item["metadata"]["name"])
Your original JSON is not valid (colons missing).

To access the contents of name, use i["metadata"].keys(); this returns all keys in "metadata".
Working code to access all values of the dictionary in "metadata":
for i in data['items']:
    for j in i["metadata"].keys():
        print (i["metadata"][j])
Update: working code to access the contents of "name" only:
for i in data['items']:
    print (i["metadata"]["name"])
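The answers above agree on the underlying point, which a short sketch can show. It inlines the question's sample document as a string (an assumption made so it runs without myfile.json) and contrasts what iterating a dict yields with what a direct key lookup returns.

```python
import json

# Sample document from the question, inlined as a string so the sketch
# runs without myfile.json.
raw = """
{
  "items": [
    {"metadata": {"name": "myname1"}},
    {"metadata": {"name": "myname1"}}
  ]
}
"""

data = json.loads(raw)

# Iterating over a dict yields its keys (strings), not its values.
keys_seen = [key for item in data["items"] for key in item["metadata"]]

# Looking the key up directly gives the value we actually want.
names = [item["metadata"]["name"] for item in data["items"]]

print(keys_seen)
print(names)
```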

Take 2 key values from list of python dicts & make new list/tuple/array/dictionary with each index containing 2 key values from 1st listed dict

I have a list of dictionaries in a json file.
I have iterated through the list and each dictionary to obtain two specific key:value pairs from each dictionary for each element.
i.e. List[dictionary{i(key_x:value_x, key_y:value_y)}]
My question is now:
How do I place these two new key: value pairs in a new list/dictionary/array/tuple, representing the two key: value pairs extracted for each listed element in the original?
To be clear:
ORIGINAL_LIST (i.e. with each element being a nested dictionary) =
[{"a":{"blah":"blah",
       "key_1":value_a1,
       "key_2":value_a2,
       "key_3":value_a3,
       "key_4":value_a4,
       "key_5":value_a5,},
  "b":"something_a"},
 {"a":{"blah":"blah",
       "key_1":value_b1,
       "key_2":value_b2,
       "key_3":value_b3,
       "key_4":value_b4,
       "key_5":value_b5,},
  "b":"something_b"}]
So my code so far is:
import json
from collections import *
from pprint import pprint
json_file = "/some/path/to/json/file"
with open(json_file) as json_data:
    data = json.load(json_data)
    json_data.close()
for i in data:
    event = dict(i)
    event_key_b = event.get('b')
    event_key_2 = event.get('key_2')
    print(event_key_b)  # print value of "b" for each nested dict for 'i'
    print(event_key_2)  # print value of "key_2" for each nested dict for 'i'
To be clear:
FINAL_LIST(i.e. with each element being a nested dictionary) =
[{"b":"something_a", "key_2":value_2},
{"b":"something_b", "key_2":value_2}]

So I have an answer to getting the keys into individual dictionaries, as follows in the code below. The only problem is that the value for 'key_2' in the original json dictionaries is either an int value or it is "" for values which are 0. My script just returns 'None' for all instances of value_2 for key_2. How can I get it to read the appropriate values for 'value_2'? I want to only return dictionaries for cases where 'value_2' > 0 (i.e. where value_2 != "")
Below is the current code:
import json
from pprint import pprint
json_file = "/some/path/to/json/file"
with open(json_file) as json_data:
    data = json.load(json_data)
    json_data.close()
for i in data:
    event_key_b = event.get('b')
    for x in i:
        event_key_2 = event.get('key_2')
        x = {'b' : something_b, 'key_2' : value_2}
        print(x)
Also, if there are any more elegant solutions anyone can think of I would really be interested in learning them ... Some of the json files I'm looking at can range from 200 dictionary entries in the original list to 2,000,000. I'm planning to feed my parsed results into a message queue for processing by a different service and any efficiencies in the code will help for scalability in processing. Also if anyone has any recommendations to give on Redis vs. RabbitMQ, I'd really appreciate it
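One possible approach, offered as a sketch rather than a definitive answer: in ORIGINAL_LIST, key_2 sits inside the nested "a" dictionary, so calling get('key_2') on the outer dictionary returns None. The code below assumes that layout. The sample values and the function name extract_events are hypothetical, since the real data is not shown.

```python
# Hypothetical stand-in for the parsed JSON file; the real values are not
# given in the question, so the numbers and "" below are made up.
data = [
    {"a": {"blah": "blah", "key_1": 1, "key_2": 7, "key_3": 3}, "b": "something_a"},
    {"a": {"blah": "blah", "key_1": 4, "key_2": "", "key_3": 6}, "b": "something_b"},
    {"a": {"blah": "blah", "key_1": 4, "key_2": 2, "key_3": 6}, "b": "something_c"},
]


def extract_events(events):
    """Yield {'b': ..., 'key_2': ...} for entries whose key_2 is a positive number."""
    for event in events:
        # key_2 lives inside the nested "a" dict, so look it up there;
        # event.get('key_2') on the outer dict returns None.
        key_2 = event.get("a", {}).get("key_2")
        if isinstance(key_2, (int, float)) and key_2 > 0:
            yield {"b": event.get("b"), "key_2": key_2}


final_list = list(extract_events(data))
print(final_list)
```

Because extract_events is a generator, the results can also be consumed one at a time (for example, handed to a queue producer) instead of being collected into a list. That may help with the larger files, although json.load still reads the whole input into memory first.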

Python 2.7: Why does json.loads not convert my string to a dict correctly?

I am asking an ElasticSearch database to provide me with a list of indices and their creation dates using Python 2.7 and the Requests package. The idea is to quickly calculate which indices have exceeded the retention policy and need to be put to sleep.
The request works perfectly and the results are exactly what I want. However, when I run the code below, when I try to convert the json result to a dict, the type of theDict is correct but it reports a size of 1, when there should be at least a couple dozen entries. What am I doing wrong? I have a feeling it's something really dumb but I just can't snag it! :)
import json
import requests
esEndPoint = "https://localhost:9200"
retrieveString = "/_cat/indices?h=index,creation.date.string&format=json&s=creation.date"
# Gets the current indices and their creation dates
def retrieveIndicesAndDates():
    try:
        theResult = requests.get(esEndPoint+retrieveString)
        print (theResult.content)
    except Exception as e:
        print("Unable to retrieve list of indices with creation dates.")
        print("Error: "+str(e))
        exit(3)
    return theResult.content
def main():
    theDict = dict(json.loads(retrieveIndicesAndDates()))
    print(type(theDict)) # Reports correct type
    print(len(theDict)) # Always outputs "1" ??
    for index, creationdate in theDict.items():
        print("Index: ",index,", Creation date: ",theDict[index])
    return
The json the call returns:
[{"index":".kibana","creation.date.string":"2017-09-14T15:01:38.611Z"},{"index":"logstash-2018.07.23","creation.date.string":"2018-07-23T00:00:01.024Z"},{"index":"cwl-2018.07.23","creation.date.string":"2018-07-23T00:00:03.877Z"},{"index":"k8s-testing-internet-2018.07.23","creation.date.string":"2018-07-23T14:19:10.024Z"},{"index":"logstash-2018.07.24","creation.date.string":"2018-07-24T00:00:01.023Z"},{"index":"k8s-testing-internet-2018.07.24","creation.date.string":"2018-07-24T00:00:01.275Z"},{"index":"cwl-2018.07.24","creation.date.string":"2018-07-24T00:00:02.157Z"},{"index":"k8s-testing-internet-2018.07.25","creation.date.string":"2018-07-25T00:00:01.022Z"},{"index":"logstash-2018.07.25","creation.date.string":"2018-07-25T00:00:01.186Z"},{"index":"cwl-2018.07.25","creation.date.string":"2018-07-25T00:00:04.012Z"},{"index":"logstash-2018.07.26","creation.date.string":"2018-07-26T00:00:01.026Z"},{"index":"k8s-testing-internet-2018.07.26","creation.date.string":"2018-07-26T00:00:01.185Z"},{"index":"cwl-2018.07.26","creation.date.string":"2018-07-26T00:00:02.587Z"},{"index":"k8s-testing-internet-2018.07.27","creation.date.string":"2018-07-27T00:00:01.027Z"},{"index":"logstash-2018.07.27","creation.date.string":"2018-07-27T00:00:01.144Z"},{"index":"cwl-2018.07.27","creation.date.string":"2018-07-27T00:00:04.485Z"},{"index":"ctl-2018.07.27","creation.date.string":"2018-07-27T09:02:09.854Z"},{"index":"cfl-2018.07.27","creation.date.string":"2018-07-27T11:12:44.681Z"},{"index":"elb-2018.07.27","creation.date.string":"2018-07-27T11:13:51.340Z"},{"index":"cfl-2018.07.24","creation.date.string":"2018-07-27T11:45:23.697Z"},{"index":"cfl-2018.07.23","creation.date.string":"2018-07-27T11:45:24.646Z"},{"index":"cfl-2018.07.25","creation.date.string":"2018-07-27T11:45:25.700Z"},{"index":"cfl-2018.07.26","creation.date.string":"2018-07-27T11:45:26.341Z"},{"index":"elb-2018.07.24","creation.date.string":"2018-07-27T11:45:27.440Z"},{"index":"elb-2018.07.25","creation.date.string"
:"2018-07-27T11:45:29.572Z"},{"index":"elb-2018.07.26","creation.date.string":"2018-07-27T11:45:36.170Z"},{"index":"logstash-2018.07.28","creation.date.string":"2018-07-28T00:00:01.023Z"},{"index":"k8s-testing-internet-2018.07.28","creation.date.string":"2018-07-28T00:00:01.316Z"},{"index":"cwl-2018.07.28","creation.date.string":"2018-07-28T00:00:03.945Z"},{"index":"elb-2018.07.28","creation.date.string":"2018-07-28T00:00:53.992Z"},{"index":"ctl-2018.07.28","creation.date.string":"2018-07-28T00:07:19.543Z"},{"index":"k8s-testing-internet-2018.07.29","creation.date.string":"2018-07-29T00:00:01.026Z"},{"index":"logstash-2018.07.29","creation.date.string":"2018-07-29T00:00:01.378Z"},{"index":"cwl-2018.07.29","creation.date.string":"2018-07-29T00:00:04.100Z"},{"index":"elb-2018.07.29","creation.date.string":"2018-07-29T00:00:59.241Z"},{"index":"ctl-2018.07.29","creation.date.string":"2018-07-29T00:06:44.199Z"},{"index":"logstash-2018.07.30","creation.date.string":"2018-07-30T00:00:01.024Z"},{"index":"k8s-testing-internet-2018.07.30","creation.date.string":"2018-07-30T00:00:01.179Z"},{"index":"cwl-2018.07.30","creation.date.string":"2018-07-30T00:00:04.417Z"},{"index":"elb-2018.07.30","creation.date.string":"2018-07-30T00:01:01.442Z"},{"index":"ctl-2018.07.30","creation.date.string":"2018-07-30T00:08:28.936Z"},{"index":"cfl-2018.07.30","creation.date.string":"2018-07-30T06:52:16.739Z"}]

Your error is trying to convert a list of dicts to a dict:
theDict = dict(json.loads(retrieveIndicesAndDates()))
#         ^^^^^                                     ^
That would only work for a dict of lists. It would be redundant, though.
Just use the reply directly. Each entry is a dict with the appropriate keys:
data = json.loads(retrieveIndicesAndDates())
for entry in data:
    print("Index: ", entry["index"], ", Creation date: ", entry["creation.date.string"])
So what happens when you do convert that list to a dict? Why is there just one entry?
The dict understands three initialisation methods: keywords, mappings and iterables. A list fits the last one.
Initialisation from an iterable goes through it and expects key-value iterables as elements. If one were to do it manually, it would look like this:
def sequence2dict(sequence):
    mapping = {}
    for element in sequence:
        key, value = element
        mapping[key] = value
    return mapping
Notice how each element is unpacked via iteration? In the reply each element is a dict with two entries. Iteration on that yields the two keys but ignores the values.
key, value = {"index":".kibana","creation.date.string":"2017-09-14T15:01:38.611Z"}
print(key, '=>', value) # prints "index => creation.date.string"
To the dict constructor, every element in the reply has the same key-value pair: "index" and "creation.date.string". Since keys in a dict are unique, all elements collapse to the same entry: {"index": "creation.date.string"}.
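The collapse described above can be reproduced with a shortened copy of the reply (the first three entries only). The sketch also shows one way to get the mapping the question seems to be after, from index name to creation date. The variable names here are illustrative and not part of the original code.

```python
import json

# A shortened copy of the reply shown in the question (first three entries).
reply = """[
  {"index": ".kibana", "creation.date.string": "2017-09-14T15:01:38.611Z"},
  {"index": "logstash-2018.07.23", "creation.date.string": "2018-07-23T00:00:01.024Z"},
  {"index": "cwl-2018.07.23", "creation.date.string": "2018-07-23T00:00:03.877Z"}
]"""

entries = json.loads(reply)

# dict() unpacks each two-key dict into its keys, so every element
# contributes the same pair and the result has a single entry.
collapsed = dict(entries)

# Building the mapping explicitly keeps one entry per index.
creation_dates = {e["index"]: e["creation.date.string"] for e in entries}

print(collapsed)
print(len(creation_dates))
```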

How do I use a for loop when reading from a dictionary that might contain a list of dicts, but might not?

I apologize in advance that the title is so confusing. It makes a lot more sense in code, so here goes:
I am parsing data from a REST API that returns JSON, and I have a bit of an issue with this particular structure:
{ 'Order' : [
    { 'orderID': '1',
      'OrderLines': {
        'OrderLine': [
          { 'lineID':'00001', 'quantity':'1', 'cost':'10', 'description':'foo' },
          { 'lineID':'00002', 'quantity':'2', 'cost':'23.42', 'description':'bar' }
        ]}
    },
    { 'orderID': '2',
      'OrderLines': {
        'OrderLine':
          { 'lineID':'00003', 'quantity':'6', 'cost':'12.99', 'description':'lonely' }
      }
    }
]}
If you'll notice, the second order only has one OrderLine, so instead of returning a list containing dictionaries, it returns the dictionary. Here is what I am trying to do:
orders_json = json.loads(from_server)
for order in orders_json['Order']:
    print 'Order ID: {}'.format(order['orderID'])
    for line in order['OrderLines']['OrderLine']:
        print '-> Line ID: {}, Quantity: {}'.format(line['lineID'], line['quantity'])
It works just fine for the first order, but the second order throws TypeError: string indices must be integers since line is now a string containing the dictionary, instead of a dictionary from the list. I've been banging my head against this for hours now, and I feel like I am missing something obvious.
Here are some of the things I have tried:
Using len(line) to see if it gave me something unique for the one line orders. It does not. It returns the number of key:value pairs in the dictionary, which in my real program is 10, which an order containing 10 lines would also return.
Using a try/except. Well, that stops the TypeError from halting the whole thing, but I can't figure out how to address the dictionary once I've done that. Line is a string for single line orders instead of a dictionary.

Whoever designed that API did not do a terribly good job. Anyway, you could check whether OrderLine is a list and, if it's not, wrap it in a one-element list before doing any processing:
if not isinstance(order_line, list):
    order_line = [order_line]
That would work, but my personal preference would be to get the API fixed.

I'd check if the type is correct and then convert it to a list if necessary to have a uniform access:
lines = order['OrderLines']['OrderLine']
lines = [lines] if not isinstance(lines, list) else lines
for line in lines:
    ...
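The wrap-in-a-list idea can be pulled into a small helper so the loop body stays the same for both shapes. This is a Python 3 sketch; as_list is a hypothetical name, and the data mirrors the structure shown in the question.

```python
def as_list(value):
    """Return value unchanged if it is a list, else wrap it in a one-element list."""
    return value if isinstance(value, list) else [value]


# Data shaped like the response in the question.
orders_json = {
    "Order": [
        {
            "orderID": "1",
            "OrderLines": {
                "OrderLine": [
                    {"lineID": "00001", "quantity": "1", "cost": "10", "description": "foo"},
                    {"lineID": "00002", "quantity": "2", "cost": "23.42", "description": "bar"},
                ]
            },
        },
        {
            "orderID": "2",
            "OrderLines": {
                "OrderLine": {
                    "lineID": "00003", "quantity": "6", "cost": "12.99", "description": "lonely"
                }
            },
        },
    ]
}

line_ids = []
for order in orders_json["Order"]:
    print("Order ID: {}".format(order["orderID"]))
    # as_list() makes the single-dict case look like the list case.
    for line in as_list(order["OrderLines"]["OrderLine"]):
        print("-> Line ID: {}, Quantity: {}".format(line["lineID"], line["quantity"]))
        line_ids.append(line["lineID"])
```

Iterating directly over the single dict would yield its keys (strings), which is where the "string indices must be integers" error in the question comes from.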

You can check the type of the object you try to access:
# ...
print 'Order ID: {0}'.format(order['orderID'])
lines = order['OrderLines']['OrderLine']
if isinstance(lines, list):
    for line in lines:
        print line['lineID']
elif isinstance(lines, dict):
    print lines['lineID']
else:
    raise ValueError('Illegal JSON object')
Edit: Wrapping the dict in a list as proposed by @NPE is the nicer and smarter solution.

Accessing nested values in nested dictionaries in Python 3.3

I'm writing in Python 3.3.
I have a set of nested dictionaries (shown below) and am trying to search using a key at the lowest level and return each of the values that correspond to the second level.
Patients = {}
Patients['PatA'] = {'c101':'AT', 'c367':'CA', 'c542':'GA'}
Patients['PatB'] = {'c101':'AC', 'c367':'CA', 'c573':'GA'}
Patients['PatC'] = {'c101':'AT', 'c367':'CA', 'c581':'GA'}
I'm trying to use a set of 'for loops' to pull out the value attached to the c101 key in each Pat* dictionary nested under the main Patients dictionary.
This is what I have so far:
pat = 'PatA'
mutations = Patients[pat]
for Pat in Patients.values(): #iterate over the Pat* dictionaries
    for mut in Pat.keys(): #iterate over the keys in the Pat* dictionaries
        if mut == 'c101': #when the key in a Pat* dictionary matches 'c101'
            print(Pat[mut].values()) #print the value attached to the 'c101' key
I get the following error, suggesting that my for loop returns each value as a string and that this can't then be used as a dictionary key to pull out the value.
Traceback (most recent call last):
  File "filename", line 13, in <module>
    for mut in Pat.keys():
AttributeError: 'str' object has no attribute 'keys'
I think I'm missing something obvious to do with the dictionary class, but I can't quite tell what it is. I've had a look through this question, but I don't think it's quite what I'm asking.
Any advice would be greatly appreciated.

Patients.keys() gives you the list of keys in the Patients dictionary (['PatA', 'PatC', 'PatB']), not the list of values, hence the error. You can use dict.items to iterate over key: value pairs like this:
for patient, mutations in Patients.items():
    if 'c101' in mutations.keys():
        print(mutations['c101'])
To make your code work:
# Replace keys by values
for Pat in Patients.values():
    # Iterate over keys from Pat dictionary
    for mut in Pat.keys():
        if mut == 'c101':
            # Take value of Pat dictionary using
            # 'c101' as a key
            print(Pat['c101'])
If you want, you can create a list of mutations in a simple one-liner:
[mutations['c101'] for p, mutations in Patients.items() if mutations.get('c101')]
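A variation on the one-liner above, as a sketch: a dict comprehension keeps track of which patient each value belongs to. The PatD entry is a hypothetical addition (not in the question) that shows patients without a c101 key being skipped.

```python
Patients = {
    "PatA": {"c101": "AT", "c367": "CA", "c542": "GA"},
    "PatB": {"c101": "AC", "c367": "CA", "c573": "GA"},
    "PatC": {"c101": "AT", "c367": "CA", "c581": "GA"},
    # Hypothetical extra patient without a c101 entry, added to show the filter.
    "PatD": {"c367": "CA"},
}

# Map each patient to its c101 value, skipping patients that lack the key.
c101_by_patient = {
    patient: mutations["c101"]
    for patient, mutations in Patients.items()
    if "c101" in mutations
}

print(c101_by_patient)
```

Testing with `"c101" in mutations` rather than `mutations.get('c101')` also keeps entries whose value is an empty string or another falsy value, which may or may not be what you want.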

Patients = {}
Patients['PatA'] = {'c101':'AT', 'c367':'CA', 'c542':'GA'}
Patients['PatB'] = {'c101':'AC', 'c367':'CA', 'c573':'GA'}
Patients['PatC'] = {'c101':'AT', 'c367':'CA', 'c581':'GA'}
for keys, values in Patients.items():
    # print(keys, values)
    for keys1, values1 in values.items():
        if keys1 == 'c101':
            print(keys1, values1)
            # print(values1)
