Extracting data from JSON depending on other parameters

What are the options for extracting a value from JSON depending on other parameters (using Python)? For example, given this JSON:
"list": [
{
"name": "value",
"id": "123456789"
},
{
"name": "needed-value",
"id": "987654321"
}
]
Using json_name["list"][0]["id"] obviously returns 123456789. Is there a way to specify the "name" value "needed-value" so that I get 987654321 in return?

For example:
import json as j
s = '''
{
"list": [
{
"name": "value",
"id": "123456789"
},
{
"name": "needed-value",
"id": "987654321"
}
]
}
'''
js = j.loads(s)
print([x["id"] for x in js["list"] if x["name"] == "needed-value"])

A good way to handle this is to convert the list into a single dictionary. Since each entry holds only a "name" and an "id", you can build the dictionary with the value from "name" as the key and the value from "id" as the value. This works as long as the names are unique.
import json
j = '''{
"list":[
{
"name": "value",
"id": "123456789"
},{
"name": "needed-value",
"id": "987654321"
}
]
}'''
jlist = json.loads(j)['list']
d = {jd['name']: jd['id'] for jd in jlist}
print(d)  # {'value': '123456789', 'needed-value': '987654321'}
Now you can iterate over the items as you normally would with a dictionary.
for k, v in d.items():
    print(k, v)
# value 123456789
# needed-value 987654321
And since the names are now hashed, you can check membership more efficiently than continually querying the list.
assert 'needed-value' in d
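If names can repeat, a plain name-to-id dictionary keeps only one id per name. A minimal sketch of one way around that, assuming the same list-of-dicts shape, groups the ids per name with collections.defaultdict. The sample records and the name ids_by_name are illustrative and not part of the original answer.

```python
from collections import defaultdict

# Illustrative records; "needed-value" appears twice on purpose.
records = [
    {"name": "value", "id": "123456789"},
    {"name": "needed-value", "id": "987654321"},
    {"name": "needed-value", "id": "555555555"},
]

# Map each name to the list of every id seen for it.
ids_by_name = defaultdict(list)
for record in records:
    ids_by_name[record["name"]].append(record["id"])

print(ids_by_name["needed-value"])  # ['987654321', '555555555']
print(ids_by_name.get("missing"))   # None (get() does not create an entry)
```

Lookups stay hash-based, and a name with a single id simply maps to a one-element list.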

jsn = {
"list": [
{
"name": "value",
"id": "123456789"
},
{
"name": "needed-value",
"id": "987654321"
}
]
}
def get_id(items, name):
    # Yield the id of every element whose name matches.
    for el in items:
        if el['name'] == name:
            yield el['id']
print(list(get_id(jsn['list'], 'needed-value')))

json.loads converts JSON into the matching Python structures: objects become dictionaries and arrays become lists. Here that gives a dictionary whose "list" key holds a list of dictionaries. If you already know the position of the entry you need, you can index it directly.
In your case, with items = json_name["list"], I would use items[1]["id"]
If, however, you don't know the position of the needed value within the list, then you can run an old-fashioned for loop this way:
def find_id(items, name):
    for user in items:
        if user["name"] == name:
            return user["id"]
    return None
This assumes the needed value appears only once in your list; the function returns the first match, or None if there is none.

Related

Get the Parent key and the nested value in nested json

I have a nested json for a JSON schema like this:
{
"config": {
"x-permission": true
},
"deposit_schema": {
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"control_number": {
"type": "string",
"x-cap-permission": {
"users": [
"test#test.com"
]
}
},
"initial": {
"properties": {
"status": {
"x-permission": {
"users": [
"test3#test.com"
]
},
"title": "Status",
"type": "object",
"properties": {
"main_status": {
"type": "string",
"title": "Stage"
}
}
},
"gitlab_repo": {
"description": "Add your repository",
"items": {
"properties": {
"directory": {
"title": "Subdirectory",
"type": "string",
"x-permission": {
"users": [
"test1#test.com",
"test2#test.com"
]
}
},
"gitlab": {
"title": "Gitlab",
"type": "string"
}
},
"type": "object"
},
"title": "Gitlab Repository",
"type": "array"
},
"title": "Initial Input",
"type": "object"
}
},
"title": "Test Analysis"
}
}
The JSON is nested and I want to have the dict of x-permission fields with their parent_key like this:
{
"control_number": {"users": ["test#test.com"]},
"initial.properties.status": {"users": ["test3#test.com"]},
"initial.properties.gitlab_repo.items.properties.directory": {"users": [
"test1#test.com",
"test2#test.com"
]}
}
I am trying to implement recursive logic for every key in the JSON like this:
def extract(obj, parent_key):
    """Recursively search for values of key in JSON tree."""
    for k, v in obj.items():
        key = parent_key + '.' + k
        if isinstance(v, dict):
            if v.get('x-permission'):
                return key, v.get('x-permission')
            elif v.get('properties'):
                return extract(v.get('properties'), key)
    return None, None
def collect_permission_info(object_):
    # _schema = _schema.deposit_schema.get('properties')
    _schema = object_ # above json
    x_cap_fields = {}
    for k in _schema:
        parent_key, permission_info = extract(_schema.get(k), k)
        if parent_key and permission_info:
            x_cap_fields.update({parent_key: permission_info})
    return x_cap_fields
I am getting an empty dict now. What am I missing here?

You could use this generator of key/value tuples:
def collect_permission_info(schema):
    for key, child in schema.items():
        if isinstance(child, dict):
            if "x-permission" in child:
                yield key, child["x-permission"]
            if "properties" in child:
                for rest, value in collect_permission_info(child["properties"]):
                    yield key + "." + rest, value
Then call it like this:
result = dict(collect_permission_info(schema))
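For the exact dotted paths shown in the question, including the intermediate properties and items segments, one option is to walk every nested dictionary and record the path to each one that carries x-permission. The following is a minimal sketch rather than the method of the answer above; find_permissions and the trimmed-down sample schema are illustrative names and data.

```python
def find_permissions(node, path=()):
    """Yield (dotted_path, value) for every 'x-permission' entry under node."""
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        if key == "x-permission":
            # The path so far names the dict that owns this permission.
            yield ".".join(path), value
        elif isinstance(value, dict):
            yield from find_permissions(value, path + (key,))

# Trimmed-down stand-in for schema["deposit_schema"]["properties"].
properties = {
    "control_number": {"type": "string", "x-permission": {"users": ["a"]}},
    "initial": {
        "properties": {
            "status": {"x-permission": {"users": ["b"]}, "type": "object"},
            "gitlab_repo": {
                "items": {
                    "properties": {
                        "directory": {"x-permission": {"users": ["c", "d"]}}
                    }
                }
            },
        }
    },
}

result = dict(find_permissions(properties))
for dotted_path, permission in result.items():
    print(dotted_path, permission)
```

Because the walker descends into every dictionary rather than only into "properties", it also reaches fields nested under "items".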

A few issues I can spot:
You use the parent_key directly in the recursive function. When an object has multiple properties ("_experiment" has 2 properties), the path will be incorrect (e.g. _experiment.type.x-permission is constructed in the second loop iteration). Use a new variable so that each subsequent loop iteration starts from the initial parent_key value.
The elif branch is never executed, as the first branch has priority. It is a duplicate.
The return value from the recursive extract(...) call is ignored, so anything you might find on deeper levels is lost.
Judging by your example JSON schema and the desired result, a recursive call on the "initial": {...} object should return multiple results. You would have to modify the extract(...) function to allow for multiple results instead of a single one.
You only check whether an object contains an x-permission or a properties attribute. This misses the desired result in the provided "initial" schema branch, which contains x-permission nested inside the status and main_status branches. The easiest solution is to make a recursive call every time isinstance(v, dict) is True.

After reading through the comments and the answers, I got this solution working for my use case.
def parse_schema_permission_info(schema):
    x_fields = {}
    def extract_permission_field(node, parent_field):
        # node is the dict being scanned; parent_field is its dotted path.
        for field, value in node.items():
            if field == 'x-permission':
                x_fields.update({parent_field: value})
            if isinstance(value, dict):
                key = parent_field + '.' + field
                if value.get('x-permission'):
                    x_fields.update(
                        {key: value.get('x-permission')}
                    )
                extract_permission_field(value, key)
    for field in schema:
        extract_permission_field(schema.get(field), field)
    return x_fields

How to iterate through a nested list in python?

I want to iterate through a list that has a lot of dictionaries inside it. The json response I'm trying to iterate looks something like this:
user 1 JSON response:
[
{
"id": "333",
"name": "hello"
},
{
"id": "999",
"name": "hi"
},
{
"id": "666",
"name": "abc"
},
]
user 2 JSON response:
[
{
"id": "555",
"name": "hello"
},
{
"id": "1001",
"name": "hi"
},
{
"id": "26236",
"name": "abc"
},
]
This is not the actual JSON response but it is structured the same way. What I'm trying to do is to find a specific id and store it in a variable. The JSON response I'm trying to iterate is not organized and changes every time depending on the user. So I need to find the specific id which would be easy but there are many dictionaries inside the list. I tried iterating like this:
for guild_info in guilds:
    for guild_ids in guild_info:
This returns the first dictionary which is id: 333. For example, I want to find the value 666 and store it in a variable. How would I do that?

What you have is a list of dictionaries.
When you run for guild_info in guilds: you will iterate through dictionaries, so here each guild_info will be a dictionary. Therefore simply take the key id like so: guild_info['id'].
If what you want is the name corresponding to a specific id, you can use a list comprehension and take its first element, as follows (this raises IndexError if no entry matches):
name = [x['name'] for x in guilds if x['id'] == '666'][0]

Here's a function that will search only until it finds the matching id and then return, which avoids checking further entries unnecessarily.
def get_name_for_id(user, id_to_find):
    # user is a list, and each guild in it is a dictionary.
    for guild in user:
        if guild['id'] == id_to_find:
            # Once the matching id is found, we're done.
            return guild['name']
    # If the loop completes without returning, then there was no match.
    return None
user = [
{
"id": "333",
"name": "hello"
},
{
"id": "999",
"name": "hi"
},
{
"id": "666",
"name": "abc"
},
]
name = get_name_for_id(user, '666')
print(name)
name2 = get_name_for_id(user, '10000')
print(name2)
Output:
abc
None

If you are looking for a simple approach, this loop iterates over the list of dictionaries and prints every value:
for every_dictionary in List_of_dictionary:
    for every_dictionary_item in every_dictionary.keys():
        print(every_dictionary[every_dictionary_item])

Getting Values with Python from JSON Array with multiple entries

I have a question about getting a specific value out of a JSON array based on another value in the same array entry. This might be a little vague, so let me show you.
I have a results array in JSON format:
{
"result": [{
"id": "SomeID1",
"name": "NAME1"
},
{
"id": "SomeID2",
"name": "NAME2"
}
]
}
I always know the name, but the ID is subject to change. So what I want to do is get the ID value based on the name I give. I am not able to alter the JSON format, as it is the result of an API call.
So when I enter NAME1, the result should be "SomeID1".

One approach could be (if name is unique):
data={
"result": [{
"id": "SomeID1",
"name": "NAME1"
},
{
"id": "SomeID2",
"name": "NAME2"
}
]
}
known_name = "NAME1"
print(next(x["id"] for x in data["result"] if x["name"] == known_name))
If name is not unique:
for x in data["result"]:
    if x["name"] == known_name:
        print(x["id"])
Or you could store them in a list:
print([x["id"] for x in data["result"] if x["name"] == known_name])
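Note that next() over a generator raises StopIteration when no entry matches. A hedged variant passes a default as the second argument, so a missing name yields None instead of an exception. The helper name id_for_name is hypothetical; the data is the sample from the question.

```python
data = {
    "result": [
        {"id": "SomeID1", "name": "NAME1"},
        {"id": "SomeID2", "name": "NAME2"},
    ]
}

def id_for_name(entries, name, default=None):
    """Return the id of the first entry with the given name, else default."""
    return next((x["id"] for x in entries if x["name"] == name), default)

print(id_for_name(data["result"], "NAME1"))  # SomeID1
print(id_for_name(data["result"], "NAME3"))  # None
```

The generator is consumed lazily, so the search stops at the first match.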

Remove key and its value in nested dictionary using python

Looking for a generic solution where I can remove the specific key and its value from dict.
For example, if dict contains the following nested key-value pair:
data={
"set": {
"type": "object", #<-- should remove this key:value pair
"properties": {
"action": {
"type": "string", #<-- should NOT remove this key:value pair
"description": "My settings"
},
"settings": {
"type": "object", #<-- should remove this key:value pair
"description": "for settings",
"properties": {
"temperature": {
"type": "object", #<-- should remove this key:value pair
"description": "temperature in degree C",
"properties": {
"heater": {
"type": "object", #<-- should remove this key:value pair
"properties": {
"setpoint": {
"type": "number"
},
},
"additionalProperties": false
},
},
"additionalProperties": false
},
},
"additionalProperties": false
}
},
"additionalProperties": false
}
}
I want an output dict with every occurrence of the "type": "object" pair removed, while other "type" entries (such as "type": "string") are kept.

You can write a recursive function:
def remove_a_key(d, remove_key):
    if isinstance(d, dict):
        for key in list(d.keys()):
            if key == remove_key:
                del d[key]
            else:
                remove_a_key(d[key], remove_key)
and call it as:
remove_a_key(data, 'type')
This recursively removes the 'type' key and its value from every nested dictionary, no matter how deep it is.
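The function above deletes every 'type' key, including "type": "string", which the question wants to keep. A minimal sketch of a variant that deletes a key only when it holds a specific value, and that also descends into lists, could look like this. The name remove_pair and the shortened sample are illustrative assumptions.

```python
def remove_pair(node, key, value):
    """Delete node[key] in place wherever it equals value, at any depth."""
    if isinstance(node, dict):
        if key in node and node[key] == value:
            del node[key]
        for child in node.values():
            remove_pair(child, key, value)
    elif isinstance(node, list):
        for child in node:
            remove_pair(child, key, value)

# Shortened stand-in for the question's data.
sample = {
    "set": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "description": "My settings"},
            "settings": {"type": "object", "description": "for settings"},
        },
        "additionalProperties": False,
    }
}

remove_pair(sample, "type", "object")
print(sample)
```

The dictionary is modified in place; copy it first (for example with copy.deepcopy) if the original must be preserved.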

Use the Python module nested-lookup to work with any kind of nested document.
See https://pypi.org/project/nested-lookup/ for more info.
In your case, use the nested_delete function to delete all occurrences of a key.
Usage:
from nested_lookup import nested_delete
print(nested_delete(data, 'type'))

I get the following error with the recursive function:
for key in list(d.keys()):
TypeError: 'dict' object is not callable

Convert Csv to JSON with nested array

I have a CSV file
group, first, last
fans, John, Smith
fans, Alice, White
students, Ben, Smith
students, Joan, Carpenter
...
The Output JSON file needs this format:
[
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
},
{
"group" : "students",
"user" : [
{
"first" : "Ben",
"last" : "Smith"
},
{
"first" : "Joan",
"last" : "Carpenter"
}
]
}
]

Short answer
Use itertools.groupby, as described in the documentation.
Long answer
This is a multi-step process.
Start by getting your CSV into a list of dict:
from csv import DictReader
# newline='' is the file mode the csv module documentation recommends.
with open('data.csv', newline='') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]
groupby needs sorted data, so define a function to get the key, and pass it in like so:
def keyfunc(x):
    return x['group']
data = sorted(data, key=keyfunc)
Last, call groupby, providing your sorted data and your key function:
from itertools import groupby
groups = []
for k, g in groupby(data, keyfunc):
    groups.append({
        "group": k,
        # Copy each row without its 'group' column.
        "user": [{key: value for key, value in d.items() if key != 'group'} for d in g]
    })
This will iterate over your data, and every time the key changes, it drops into the for block and executes that code, providing k (the key for that group) and g (the dict objects that belong to it). Here we just store those in a list for later.
In this example, the user key uses some pretty dense comprehensions to remove the group key from every row of user. If you can live with that little bit of extra data, that whole line can be simplified as:
"user": list(g)
The result looks like this:
[
{
"group": "fans",
"user": [
{
"first": "John",
"last": "Smith"
},
{
"first": "Alice",
"last": "White"
}
]
},
{
"group": "students",
"user": [
{
"first": "Ben",
"last": "Smith"
},
{
"first": "Joan",
"last": "Carpenter"
}
]
}
]
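The steps above can be combined into one self-contained sketch. It assumes the same three columns as the question and reads the CSV from an in-memory string (io.StringIO) instead of data.csv so that it runs as-is; csv_text and get_group are illustrative names.

```python
import csv
import io
import json
from itertools import groupby
from operator import itemgetter

# In-memory stand-in for data.csv (assumption: same columns as the question).
csv_text = """group, first, last
fans, John, Smith
fans, Alice, White
students, Ben, Smith
students, Joan, Carpenter
"""

rows = list(csv.DictReader(io.StringIO(csv_text), skipinitialspace=True))

# groupby only merges adjacent rows, so sort by the grouping key first.
get_group = itemgetter("group")
rows.sort(key=get_group)

groups = [
    {
        "group": name,
        # Copy each row without its "group" column.
        "user": [{k: v for k, v in row.items() if k != "group"} for row in members],
    }
    for name, members in groupby(rows, key=get_group)
]

print(json.dumps(groups, indent=2))
```

To write the result to a file instead of printing it, json.dump(groups, f, indent=2) with an open file object f would be the usual counterpart.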
