Parse output from json python


I have the dict below (parsed from JSON), and I want to extract values from it.
I can do something like this to get one specific value:
print(abc['everything']['A']['1']['tree']['value'])
But what is the best way to extract every "value"?
I want to output good, bad, good.
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}

If you are willing to use pandas, you could just use pd.json_normalize, which is actually quite fast:
import pandas as pd
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
df = pd.json_normalize(abc)
print(df.values[0])
['good' 'bad' 'good']
Without any extra libraries, you will have to iterate through your nested dictionary:
values = [abc['everything'][e][k][k1]['value']
          for e in abc['everything']
          for k in abc['everything'][e]
          for k1 in abc['everything'][e][k]]
print(values)
['good', 'bad', 'good']

Provided every branch of the dictionary eventually contains a "value" key, you can try this:
Create a function (or inline the code) that keeps descending into the first element of the dictionary until the "value" key exists, then returns it. Note that there are other ways of doing this.
Iterate over the top-level entries, collecting the result for each one.
# Define function
def get(d):
    while "value" not in d:
        d = list(d.values())[0]
    return d["value"]
# Get the results from your example
results = [get(v) for v in abc["everything"].values()]
print(results)
['good', 'bad', 'good']

A recursive way:
def fun(my_dict, values=None):
    # Create a fresh list for each top-level call; a mutable default
    # argument (values=[]) would be shared between calls.
    if values is None:
        values = []
    if not isinstance(my_dict, dict):
        return values
    for i, j in my_dict.items():
        if i == 'value':
            values.append(j)
        else:
            values = fun(j, values)
    return values
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
data = fun(abc)
print(data)
Output:
['good', 'bad', 'good']
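If the parsed JSON may also contain lists, a generator variant handles them without passing an accumulator list around. This is only a minimal sketch: the name iter_values and the target_key parameter are illustrative and do not come from the answer above.

```python
def iter_values(node, target_key="value"):
    """Yield every value stored under target_key, at any nesting depth."""
    if isinstance(node, dict):
        for key, child in node.items():
            if key == target_key:
                yield child
            else:
                yield from iter_values(child, target_key)
    elif isinstance(node, list):
        # JSON arrays become Python lists, so descend into them as well.
        for child in node:
            yield from iter_values(child, target_key)


abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
found = list(iter_values(abc))
print(found)  # ['good', 'bad', 'good']
```

Because it is a generator, you can also stop early, for example with next(iter_values(abc)) to get only the first match.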

Firstly, the syntax you are using is incorrect.
If you are using pandas, you can write:
import pandas as pd
df4 = pd.DataFrame({"TreeType": ["Tree1", "Tree2", "Tree3"],
                    "Values": ["Good", "Bad", "Good"]})
df4.index = ["A", "B", "C"]
Then evaluate df4 to get the following output:
  TreeType Values
A    Tree1   Good
B    Tree2    Bad
C    Tree3   Good

Related

Extract subset of dictionaries inside a list of dictionaries until particular key is found

I have two lists of dicts:
dict = [{'a':'1'},{'b':'2'},{'c':'3'},{'Stop':'appending'},{'d':'4'},{'e':'5'},{'f':'6'}]
dict1 = [{'a':'1'},{'b':'2'},{'c':'3'},{'d':'4'},{'Stop':'appending'},{'e':'5'},{'f':'6'}]
I want to extract all list elements until the key 'Stop' is found and append them to a new list.
Expected output:
new_dict = [{'a':'1'},{'b':'2'},{'c':'3'}]
new_dict1 = [{'a':'1'},{'b':'2'},{'c':'3'},{'d':'4'}]
Code:
temp_dict = []
for i in range(0,len(list)):
    for key,value in list[i].items():
        if key == 'Stop':
            break
        temp_dict.append(list[i])

This is already in the standard library:
import itertools
dict1 = [{'a':'1'},{'b':'2'},{'c':'3'},{'d':'4'},{'Stop':'appending'},{'e':'5'},{'f':'6'}]
res = list(itertools.takewhile(lambda x: "Stop" not in x, dict1))
print(res)
output:
[{'a': '1'}, {'b': '2'}, {'c': '3'}, {'d': '4'}]

You can use enumerate() to get the index of the element that contains the key Stop and then use list slicing on top of that:
dic = [{'a':'1'},{'b':'2'},{'c':'3'},{'Stop':'appending'},{'d':'4'},{'e':'5'},{'f':'6'}]
index = next(index for index, elt in enumerate(dic) if 'Stop' in elt)
new_dic = dic[0:index] # [{'a': '1'}, {'b': '2'}, {'c': '3'}]
Also, don't use dict as a variable name; it shadows the built-in type.
Update: If you just want to skip the element with the key Stop and keep all the others, change the slicing operation to:
new_dic = dic[0:index] + dic[index+1:] # [{'a': '1'}, {'b': '2'}, {'c': '3'}, {'d': '4'}, {'e': '5'}, {'f': '6'}]
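Note that the next(...) call above raises StopIteration when no element contains the key. A hedged sketch of a variant that falls back to returning the whole list in that case; the helper name take_until_key is hypothetical.

```python
def take_until_key(items, stop_key="Stop"):
    """Return the elements that come before the first dict containing stop_key."""
    # next() with a default avoids StopIteration when stop_key never appears.
    index = next(
        (i for i, elt in enumerate(items) if stop_key in elt),
        len(items),
    )
    return items[:index]


dic = [{'a': '1'}, {'b': '2'}, {'c': '3'}, {'Stop': 'appending'}, {'d': '4'}]
head = take_until_key(dic)
no_stop = take_until_key([{'a': '1'}, {'b': '2'}])
print(head)     # [{'a': '1'}, {'b': '2'}, {'c': '3'}]
print(no_stop)  # [{'a': '1'}, {'b': '2'}]
```

Whether a missing key should yield the full list or an error depends on your data; raising explicitly may be the safer choice if the marker is mandatory.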

First find the indices of the elements that have 'Stop' as a key, then slice the list up to the first of those indices.
Try:
inds = [i for i in range(len(dict1)) if 'Stop' in dict1[i].keys()]
new_dict1 = dict1[:inds[0]]
Also, I think you should choose better names for your lists, especially the first one: dict is the name of a built-in type in Python, and assigning to it shadows that type.

Convert pandas.DataFrame to list of dictionaries in Python

I have a dictionary which is converted from a DataFrame as below:
a = d.to_json(orient='index')
Dictionary:
{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
What I need is for it to be in a list, so essentially a list of dictionaries.
So I just wrap it in [], because that is the format used in the rest of the code.
input_dict = [a]
input_dict:
['{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}']
I need to get the single quotes removed just after the [ and just before the ]. I also need the PKID values in the form of a list.
How can this be achieved?
Expected output:
[ {"yr":2017,"PKID":[58306, 57011],"Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":[1234,54321],"Subject":"XYZ","ID":"T002"} ]
NOTE: The PKID column has multiple integer values, which have to come out as a list of integers. A string is not acceptable,
so we need "PKID":[58306, 57011] and not "PKID":"[58306, 57011]".

pandas.DataFrame.to_json returns a string (JSON string), not a dictionary. Try to_dict instead:
>>> df
   col1  col2
0     1     3
1     2     4
>>> [df.to_dict(orient='index')]
[{0: {'col1': 1, 'col2': 3}, 1: {'col1': 2, 'col2': 4}}]
>>> df.to_dict(orient='records')
[{'col1': 1, 'col2': 3}, {'col1': 2, 'col2': 4}]
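Neither call converts the PKID strings into lists of integers, which the question also asks for. A minimal sketch of that step, applied to records shaped like the output of to_dict(orient='records'). The helper name split_pkid is hypothetical, and the code assumes every PKID is a comma-separated string of integers.

```python
def split_pkid(record, field="PKID"):
    """Return a copy of record with a comma-separated field parsed into ints."""
    parsed = dict(record)
    # int() tolerates surrounding whitespace, so "58306, 57011" works as-is.
    parsed[field] = [int(part) for part in record[field].split(",")]
    return parsed


# Records in the shape produced by df.to_dict(orient='records').
records = [
    {"yr": 2017, "PKID": "58306, 57011", "Subject": "ABC", "ID": "T001"},
    {"yr": 2018, "PKID": "1234,54321", "Subject": "XYZ", "ID": "T002"},
]
input_dict = [split_pkid(r) for r in records]
print(input_dict[0]["PKID"])  # [58306, 57011]
```

The original records are left untouched because each one is copied before the field is replaced.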

Here is one way:
from collections import OrderedDict
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
list(OrderedDict(sorted(d.items())).values())
# [{'ID': 'T001', 'PKID': '58306, 57011', 'Subject': 'ABC', 'yr': 2017},
# {'ID': 'T002', 'PKID': '1234,54321', 'Subject': 'XYZ', 'yr': 2018}]
Note the ordered dictionary is ordered by text string keys, as supplied. You may wish to convert these to integers first before any processing via d = {int(k): v for k, v in d.items()}.
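The difference starts to matter once there are ten or more rows, because string keys sort lexicographically. A small illustration with made-up keys and values:

```python
d = {"10": "row ten", "2": "row two", "1": "row one"}

# Sorting by the raw string keys puts "10" before "2".
by_text = [v for _, v in sorted(d.items())]
# Converting each key to int restores numeric order.
by_number = [v for _, v in sorted(d.items(), key=lambda kv: int(kv[0]))]

print(by_text)    # ['row one', 'row ten', 'row two']
print(by_number)  # ['row one', 'row two', 'row ten']
```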

You are converting your dictionary to JSON, which is a string. Then you wrap the resulting string in a list. So, naturally, the result is a string inside a list.
Try instead: [d], where d is your raw dictionary (not converted to JSON).

You can use a list comprehension
Ex:
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
print([{k: v} for k, v in d.items()])
Output (Python 3.7+, where dicts keep insertion order):
[{'0': {'yr': 2017, 'PKID': '58306, 57011', 'Subject': 'ABC', 'ID': 'T001'}}, {'1': {'yr': 2018, 'PKID': '1234,54321', 'Subject': 'XYZ', 'ID': 'T002'}}]

What about something like this:
from operator import itemgetter
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":
{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
sorted_d = sorted(d.items(), key=lambda x: int(x[0]))
print(list(map(itemgetter(1), sorted_d)))
Which Outputs:
[{'yr': 2017, 'PKID': '58306, 57011', 'Subject': 'ABC', 'ID': 'T001'},
{'yr': 2018, 'PKID': '1234,54321', 'Subject': 'XYZ', 'ID': 'T002'}]

list comprehension using dictionary entries

I'm trying to figure out how I might use a list comprehension for the following.
I have a dictionary:
dict = {}
dict ['one'] = {"tag":"A"}
dict ['two'] = {"tag":"B"}
dict ['three'] = {"tag":"C"}
and I would like to create a list (let's call it "list") populated with the "tag" value of each key, i.e.
['A', 'B', 'C']
Is there an efficient way to do this using a list comprehension? I was thinking of something like:
list = [x for x in dict[x]["tag"]]
but obviously this doesn't quite work. Any help appreciated!

This takes an extra step, but it gets the desired output and avoids shadowing built-in names:
d = {}
d['one'] = {"tag":"A"}
d['two'] = {"tag":"B"}
d['three'] = {"tag":"C"}
new_list = []
for k in ('one', 'two', 'three'):
    new_list += [x for x in d[k]["tag"]]
print(new_list)

Try this:
d = {'one': {'tag': 'A'},
     'two': {'tag': 'B'},
     'three': {'tag': 'C'}}
tag_values = [d[i][j] for i in d for j in d[i]]
>>> print(tag_values)
['A', 'B', 'C']
Dicts preserve insertion order in Python 3.7+. On older versions the order is arbitrary, so sort the list afterwards if it matters.
If you have other key/value pairs in the inner dicts, apart from 'tag', you may want to specify the 'tag' keys, like this:
tag_values = [d[i]['tag'] for i in d if 'tag' in d[i]]
for the same result. If 'tag' is definitely always there, remove the if 'tag' in d[i] part.
As a side note, it is never a good idea to call a list 'list', since that shadows the built-in type in Python.

You can try this:
[i['tag'] for i in dict.values()]

I would do something like this:
untransformed = {
    'one': {'tag': 'A'},
    'two': {'tag': 'B'},
    'three': {'tag': 'C'},
    'four': 'bad'
}
transformed = [value.get('tag') for key, value in untransformed.items() if isinstance(value, dict) and 'tag' in value]
It also sounds like you're trying to get some information out of JSON, so you might want to look into a tool like jq: https://stedolan.github.io/jq/manual/

Python remove duplicate value in a combined dictionary's list

I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the values corresponding to that key in the new dictionary should be a list of unique values. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked at filter and lambda, but I couldn't figure out how to use them here (maybe because it's a list inside a dictionary?).
Any help would be appreciated. And thank you in advance for all your help!

Just test whether the element is already in the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:   # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want only unique elements in the value list, you can use a set as the value instead. You can also use a defaultdict here, so that you don't have to test for key existence before adding.
Also, don't use built-in names as your variable names. Use some other name instead of dict.
So, you can modify your merge function as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:
        # dict.items() yields (k, v) tuples,
        # so you can unpack each tuple directly into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown,
    # you can build a normal dict of lists out of the defaultdict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
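Sets do not preserve insertion order, so the lists built from them may come out in a different order from run to run. If the order of first appearance matters, one option is to collect the values as dict keys instead, since dicts keep insertion order in Python 3.7+. A minimal sketch under that assumption; merge_ordered is a hypothetical name.

```python
from collections import defaultdict


def merge_ordered(*dicts):
    """Merge dicts into {key: [unique values in first-seen order]}."""
    seen = defaultdict(dict)
    for each_dict in dicts:
        for k, v in each_dict.items():
            # Dict keys are unique and keep insertion order (Python 3.7+).
            seen[k][v] = None
    return {k: list(values) for k, values in seen.items()}


f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
combined = merge_ordered(f, g, h, r)
print(combined['b'])  # ['bat', 'boy']
```

As with the set-based version, this requires the values to be hashable.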

>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
Note the results are in a set, not a list -- far more computationally efficient this way. If you would like the final form to be lists then you can do the following:
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}

In your for loop, add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added.

Use a set when you want unique elements:
def merge_dicts(*d):
    result = {}
    for each_dict in d:
        for key, value in each_dict.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.

Dictionary transformation and counter

Object:
data = [{'key': 11, 'country': 'USA'},{'key': 21, 'country': 'Canada'},{'key': 12, 'country': 'USA'}]
The result should be:
{'USA': {0: {'key':11}, 1: {'key': 12}}, 'Canada': {0: {'key':21}}}
I started experimenting with:
result = {}
for i in data:
    k = 0
    result[i['country']] = dict(k = dict(key=i['key']))
and I get:
{'Canada': {'k': {'key': 21}}, 'USA': {'k': {'key': 12}}}
So how can I use the counter as the key instead of the literal k? Maybe there is a more elegant way to create the dictionary?

I used the len() of the existing result item:
>>> import collections
>>> data = [{'key': 11, 'country': 'USA'},{'key': 21, 'country': 'Canada'},{'key': 12, 'country': 'USA'}]
>>> result = collections.defaultdict(dict)
>>> for item in data:
...     country = item['country']
...     result[country][len(result[country])] = {'key': item['key']}
...
>>> dict(result)
{'Canada': {0: {'key': 21}}, 'USA': {0: {'key': 11}, 1: {'key': 12}}}
There may be a more efficient way to do this, but I thought this would be most readable.

@zigg's answer is better.
Here's an alternative way:
import itertools as it, operator as op
def dict_transform(dataset, key_name=None, group_by=None):
    result = {}
    # groupby() only groups adjacent items, so sort by the same key first.
    sorted_dataset = sorted(dataset, key=op.itemgetter(group_by))
    for k, g in it.groupby(sorted_dataset, key=op.itemgetter(group_by)):
        result[k] = {i: {key_name: j[key_name]} for i, j in enumerate(g)}
    return result
if __name__ == '__main__':
    data = [{'key': 11, 'country': 'USA'},
            {'key': 21, 'country': 'Canada'},
            {'key': 12, 'country': 'USA'}]
    expected_result = {'USA': {0: {'key':11}, 1: {'key': 12}},
                       'Canada': {0: {'key':21}}}
    result = dict_transform(data, key_name='key', group_by='country')
    assert result == expected_result

To add the number, use the {key: value} syntax:
result = {}
for i in data:
    k = 0
    result[i['country']] = dict({k : dict(key=i['key'])})

dict(k = dict(key=i['key']))
This passes i['key'] as the key keyword argument to the dict constructor (which is what you want - since that results in the string "key" being used as a key), and then passes the result of that as the k keyword argument to the dict constructor (which is not what you want) - that's how parameter passing works in Python. The fact that you have a local variable named k is irrelevant.
To make a dict where the value of k is used as a key, the simplest way is to use the literal syntax for dictionaries: {1:2, 3:4} is a dict where the key 1 is associated with the value 2, and the key 3 is associated with the value 4. Notice that here we're using arbitrary expressions for keys and values - not names - so we can use a local variable and the resulting dictionary will use the named value.
Thus, you want {k: {'key': i['key']}}.
Maybe there is a more elegant way to create the dictionary?
You could create a list by appending items, and then transform the list into a dictionary with dict(enumerate(the_list)). That at least saves you from having to do the counting manually, but it's pretty indirect.
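That list-based approach could look like the following sketch, which first groups the inner dicts per country and only then numbers them. The variable name grouped is illustrative.

```python
from collections import defaultdict

data = [{'key': 11, 'country': 'USA'},
        {'key': 21, 'country': 'Canada'},
        {'key': 12, 'country': 'USA'}]

# Step 1: collect the inner dicts for each country in a plain list.
grouped = defaultdict(list)
for item in data:
    grouped[item['country']].append({'key': item['key']})

# Step 2: let enumerate() supply the 0, 1, 2, ... counters.
result = {country: dict(enumerate(items)) for country, items in grouped.items()}
print(result)
# {'USA': {0: {'key': 11}, 1: {'key': 12}}, 'Canada': {0: {'key': 21}}}
```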
