python querying a json objectpath


I have a nested JSON structure and I'm using objectpath (the Python API version), but I don't understand how to select and filter nested information in the structure.
For example, I want to select the "description" of the action "reading" for the user "John".
JSON:
{
"user":
{
"actions":
[
{
"name": "reading",
"description": "blablabla"
}
]
"name": "John"
}
}
CODE:
$.user[#.name is 'John' and #.actions.name is 'reading'].actions.description
but it doesn't work: it returns an empty set, even though my JSON contains a matching entry.
Any suggestions?

Is this what you are trying to do?
import objectpath

data = {
    "user": {
        "actions": {
            "name": "reading",
            "description": "blablabla"
        },
        "name": "John"
    }
}
tree = objectpath.Tree(data)
result = tree.execute("$.user[#.name is 'John'].actions[#.name is 'reading'].description")
# Iterate over the result to get each matching value
for entry in result:
    print(entry)
Output
blablabla
I had to fix your JSON. Also, tree.execute returns a generator. You could replace the for loop with print(next(result)), but the for loop seemed clearer.

from objectpath import Tree

your_json = {"name": "felix", "last_name": "diaz"}
# This JSON path returns all the key-values of your JSON
your_json_path = '$.*'
my_key_values = Tree(your_json).execute(your_json_path)
# If you want to retrieve the name node, specify it.
my_name = Tree(your_json).execute('$.name')
# If you want to retrieve the last_name node, specify it.
last_name = Tree(your_json).execute('$.last_name')

I believe you're just missing a comma in JSON:
{
    "user": {
        "actions": [
            {
                "name": "reading",
                "description": "blablabla"
            }
        ],
        "name": "John"
    }
}
Assuming there is only one "John", with only one "reading" activity, the following query works:
$.user[#.name is 'John'].actions[0][#.name is 'reading'][0].description
If there could be multiple "John"s, with multiple "reading" activities, the following query will almost work:
$.user.*[#.name is 'John'].actions..*[#.name is 'reading'].description
I say almost because the use of .. will be problematic if there are other nested dictionaries with "name" and "description" entries, such as
{
"user": {
"actions": [
{
"name": "reading",
"description": "blablabla",
"nested": {
"name": "reading",
"description": "broken"
}
}
],
"name": "John"
}
}
A fully correct query depends on an open issue about correctly implementing queries into arrays: https://github.com/adriank/ObjectPath/issues/60
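Until that issue is resolved, one option is to skip the query language for this lookup and walk the structure in plain Python. The following is only a sketch under that assumption, not an ObjectPath feature; the helper name find_descriptions is hypothetical. It looks only at the direct entries of "actions", so the "nested" dictionary from the example above is not picked up.

```python
import json

raw = """
{
  "user": {
    "actions": [
      {
        "name": "reading",
        "description": "blablabla",
        "nested": {"name": "reading", "description": "broken"}
      }
    ],
    "name": "John"
  }
}
"""


def find_descriptions(doc, user_name, action_name):
    """Return descriptions of the user's direct actions named action_name."""
    user = doc.get("user", {})
    if user.get("name") != user_name:
        return []
    return [
        action["description"]
        for action in user.get("actions", [])
        if action.get("name") == action_name and "description" in action
    ]


data = json.loads(raw)
descriptions = find_descriptions(data, "John", "reading")
print(descriptions)
```

Because the loop never descends below the "actions" list, the nested "broken" description is ignored.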

Related

Append to a json file using python

Trying to append to a nested json file
My goal is to append some values to a JSON file.
Here is my original JSON file
{
"web": {
"all": {
"side": {
"tags": [
"admin"
],
"summary": "Generates",
"operationId": "Key",
"consumes": [],
"produces": [
"application/json"
],
"responses": {
"200": {
"description": "YES",
"schema": {
"type": "string"
}
}
},
"Honor": [
{
"presidential": []
}
]
}
}
}
}
It is my intention to add two additional lines inside the key "Honor", with the values "Required" : "YES" and "Prepay" : "NO". As a result of appending the two values, I will have the following JSON file.
{
"web": {
"all": {
"side": {
"tags": [
"admin"
],
"summary": "Generates",
"operationId": "Key",
"consumes": [],
"produces": [
"application/json"
],
"responses": {
"200": {
"description": "YES",
"schema": {
"type": "string"
}
}
},
"Honor": [
{
"presidential": [],
"Required" : "YES",
"Prepay" : "NO"
}
]
}
}
}
}
Below is the Python code that I have written
import json

def write_json(data, filename="exmpleData.json"):
    with open(filename, "w") as f:
        json.dump(data, f, indent=2)

with open("exmpleData.json") as json_files:
    data = json.load(json_files)

temp = data["Honor"]
y = {"required": "YES", "type": "String"}
temp.append(y)
write_json(data)
I am receiving the following error message:
temp = data["Honor"]
KeyError: 'Honor'
I would appreciate any guidance that you can provide to help me achieve my goal. I am running Python 3.7

'Honor' is deeply nested in other dictionaries, and its value is a 1-element list containing a dictionary. Here's how to access:
import json

def write_json(data, filename='exmpleData.json'):
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

with open('exmpleData.json') as json_files:
    data = json.load(json_files)

# 'Honor' is deeply nested in other dictionaries
honor = data['web']['all']['side']['Honor']
# Its value is a 1-element list containing another dictionary.
honor[0]['Required'] = 'YES'
honor[0]['Prepay'] = 'NO'
write_json(data)

I'd recommend that you practice your fundamentals a bit more since you're making many mistakes in your data structure handling. The good news is, your JSON load/dump is fine.
The cause of your error message is that data doesn't have an "Honor" property. It only has a "web" property, which contains "all", which contains "side", which contains "Honor", which holds an array with a dictionary, and that dictionary is where you are trying to add properties. So you want to set temp with temp = data['web']['all']['side']['Honor'][0]
You also cannot use append on Python dictionaries. Instead, check out dict.update().
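As a minimal sketch of the dict.update() suggestion, the snippet below uses a hypothetical in-memory copy of the relevant part of the file (only the path down to "Honor" is kept) rather than reading exmpleData.json from disk.

```python
import json

# Hypothetical, trimmed copy of the question's file: only the path to "Honor".
data = json.loads("""
{"web": {"all": {"side": {"Honor": [{"presidential": []}]}}}}
""")

# Walk down to the dictionary stored in the one-element "Honor" list.
temp = data["web"]["all"]["side"]["Honor"][0]

# update() merges the new key-value pairs into the existing dictionary in place.
temp.update({"Required": "YES", "Prepay": "NO"})

print(json.dumps(data["web"]["all"]["side"]["Honor"], indent=2))
```

Since temp refers to the same dictionary that lives inside data, dumping data afterwards writes the two new keys as well.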

Populating JSON data from API in Python pandas DataFrame - TypeError and IndexError

I am trying to populate a pandas DataFrame with select information from JSON output fetched from an API.
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        candidate_list.append([
            candidate['id'],
            candidate['attributes']['first_name'],
            candidate['attributes']['last_name'],
            candidate['relationships']['educations']['data']['id'],
        ])
The DataFrame populates fine until I add candidate['relationships']['educations']['data']['id'], which throws TypeError: list indices must be integers or slices, not str.
When trying to get the values of the indexes for ['id'] by using candidate['relationships']['educations']['data'][0]['id'] instead, I get IndexError: list index out of range.
The JSON output looks something like:
"data": [
{
"attributes": {
"first_name": "Tester",
"last_name": "Testman",
"other stuff": "stuff",
},
"id": "732887",
"relationships": {
"educations": {
"data": [
{
"id": "605372",
"type": "educations"
},
{
"id": "605371",
"type": "educations"
},
{
"id": "605370",
"type": "educations"
}
]
}
},
How would I go about successfully filling a column in the DataFrame with the 'id's under 'relationships'>'educations'>'data'?

Note that candidate['relationships']['educations']['data']['id'] raises that error because 'data' holds a list, not a dictionary, and a list cannot be indexed with a string key.
Assuming what you are trying to achieve is one row per entry in relationships.educations.data for each candidate, here is complete code that works:
import json
json_string = """{
"data": [
{
"attributes": {
"first_name": "Tester",
"last_name": "Testman",
"other stuff": "stuff"
},
"id": "732887",
"relationships": {
"educations": {
"data": [
{
"id": "605372",
"type": "educations"
},
{
"id": "605371",
"type": "educations"
},
{
"id": "605370",
"type": "educations"
}
]
}
}
}
]
}"""
candidate_response = json.loads(json_string)
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        for data in candidate['relationships']['educations']['data']:
            candidate_list.append(
                [
                    candidate['id'],
                    candidate['attributes']['first_name'],
                    candidate['attributes']['last_name'],
                    data['id']
                ]
            )
print(candidate_list)
Code run available at ideone.

I analyzed your code and also ran it in a Jupyter notebook. It all looks good and I am getting output.
The error "list indices must be integers or slices, not str" occurs because you were not using an integer index. An index is required because the value you are looking for is inside a list.
As for "IndexError: list index out of range": this may be a typo on your side, since the code itself is fine.
Here is the code I ran:
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        candidate_list.append([candidate['id'], candidate['attributes']['first_name'], candidate['attributes']['last_name'], candidate['relationships']['educations']['data'][0]['id']])

Probably candidate['relationships']['educations']['data'] is an empty list for at least one candidate.
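If an empty list is indeed the cause, a guard before indexing avoids the IndexError. The sketch below uses a hypothetical response in which the second candidate has no educations. Recording None for that case is an assumption about what the DataFrame column should contain.

```python
# Hypothetical response: the second candidate has an empty educations list.
candidate_response = {
    "data": [
        {
            "id": "732887",
            "attributes": {"first_name": "Tester", "last_name": "Testman"},
            "relationships": {
                "educations": {"data": [{"id": "605372", "type": "educations"}]}
            },
        },
        {
            "id": "732888",
            "attributes": {"first_name": "Empty", "last_name": "Case"},
            "relationships": {"educations": {"data": []}},
        },
    ]
}

candidate_list = []
for candidate in candidate_response["data"]:
    educations = candidate["relationships"]["educations"]["data"]
    # Take the first education id if there is one; otherwise record None.
    first_education_id = educations[0]["id"] if educations else None
    candidate_list.append([
        candidate["id"],
        candidate["attributes"]["first_name"],
        candidate["attributes"]["last_name"],
        first_education_id,
    ])

print(candidate_list)
```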

Create nested json file

I want to create a json file that can be used as a configuration file. I have different files from multiple companies that report the same information with different column names.
I want to use the information provided in the json file to run a python script to consolidate all the information from all files and companies in one master file.
The structure looks as follows:
{"companies":
{"company1": [
{"path": "C:/USER/Path/Company1/",
"files": [
{
{"_CO": {"ID": "ID", "Report Number": "Report_Number"}},
{"_TR": {"ID": "Trade_Ident", "Report Number": "Number of Report"}},
},
],
},
],
},
{"company2": [
{"path": "C:/USER/Path/Company2/",
"files": [
{
{"_CO": {"ID": "Identification", "Report Number": "Report-Number"}},
{"_TR": {"ID": "Ident", "Report Number": "NumberReport"}},
},
],
},
],
},
},
However, I receive the following error when trying to read the .json in python.
json.decoder.JSONDecodeError: Expecting property name enclosed in
double quotes: line 6 column 5 (char 90)
To read the file I use:
import json

path = "/user_folder/USER/Desktop/Data/"
file = "ConfigFile.json"
with open(path + file) as f:
    my_test = json.load(f)
Any help appreciated, as I can't figure out my mistake in the file structure.

You're getting the error because your JSON file is incorrectly formatted, so calling json.load() raises a JSONDecodeError.
Your JSON structure should look like this:
{
"companies": {
"company1": [
{
"path": "C:/USER/Path/Company1/",
"files": [
{
"_CO": {
"ID": "ID",
"Report Number": "Report_Number"
}
},
{
"_TR": {
"ID": "Trade_Ident",
"Report Number": "Number of Report"
}
}
]
}
],
"company2": [
{
"path": "C:/USER/Path/Company2/",
"files": [
{
"_CO": {
"ID": "Identification",
"Report Number": "Report-Number"
}
},
{
"_TR": {
"ID": "Ident",
"Report Number": "NumberReport"
}
}
]
}
]
}
}
Hope it helps you!

You have some objects (the ones in curly braces) without keys, for example in
{
{"_CO": {"ID": "ID", "Report Number": "Report_Number"}}, ...
Objects in JSON are sets of key-value pairs. Just remove the outer set of braces and it should be OK.
You can use an online JSON formatter/validator, and it will easily point out the problem. Otherwise, you can use a JSON linter for your editor. It does the work for you and also improves indentation :)
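A similar check can be done locally with the standard json module, which reports where parsing failed. This is a small sketch, not a replacement for a full linter. The broken string is a hypothetical one-line reduction of the problem, and describe_json_error is a made-up helper name.

```python
import json

# Hypothetical reduction of the problem: an object whose first member is
# another object instead of a "key": value pair.
broken = '{"files": [{{"_CO": {"ID": "ID"}}}]}'
fixed = '{"files": [{"_CO": {"ID": "ID"}}]}'


def describe_json_error(text):
    """Return None if text is valid JSON, else a description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # msg, lineno and colno are attributes of JSONDecodeError.
        return f"{err.msg} at line {err.lineno}, column {err.colno}"
    return None


print(describe_json_error(broken))
print(describe_json_error(fixed))
```

The parser stops at the first problem, so a file with several mistakes has to be fixed and re-checked one error at a time.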

No enum error when validating JSON using jsonschema in python

First of all, I am not getting a proper error response on the web platform either (https://jsonschemalint.com). I am using jsonschema in Python, and I have a proper JSON schema and JSON data that works.
The problem I'd like to solve is the following: before we deliver JSON files with example data, we need to run them through SoapUI to test whether they are valid. We are dealing with huge files, and our devs sometimes make errors when generating them, so we do a final check.
I'd like to create a script to automate this and avoid SoapUI. After googling, I came across jsonschema and tried to use it. I get the proper results, and I get errors when I delete certain elements, as expected, but the biggest issues are the following:
Example :
I have a subsubsub object in my JSON schema, let's call it Test1, which contains the following :
Schema:
{
"exname":"2",
"info":{},
"consumes":{},
"produces":{},
"schemes":{},
"tags":{},
"parameters":{},
"paths":{},
"definitions":{
"MainTest1":{
"description":"",
"minProperties":1,
"properties":{
"test1":{
"items":{
"$ref":"#//Test1"
},
"maxItems":10,
"minItems":1,
"type":"array"
},
"test2":{
"items":{
"$ref":"#//"
},
"maxItems":10,
"minItems":1,
"type":"array"
}
}
},
"Test1":{
"description":"test1des",
"minProperties":1,
"properties":{
"prop1":{
"description":"prop1des",
"example":"prop1exam",
"maxLength":10,
"minLength":2,
"type":"string"
},
"prop2":{
"description":"prop2des",
"example":"prop2example",
"maxLength":200,
"minLength":2,
"type":"string"
},
"prop3":{
"enum":[
"enum1",
"enum2",
"enum3"
],
"example":"enum1",
"type":"string"
}
},
"required":[
"prop3"
],
"type":"object"
}
}
}
Proper example for Test1:
{
"Test1": [{
"prop1": "TestStr",
"prop2": "Test and Test",
"prop3": "enum1"
}]
}
Improper example that still passes validation for Test1:
{
"test1": [{
"prop1": "TestStr123456", [wrong as it passes the max limit]
"prop2": "Test and Test",
"prop3": " enum1" [wrong as it has a whitespace char before enum1]
}]
}
The first issue I ran across is that the enum in prop3 isn't validated correctly. When I use " enum1" or "enumruwehrqweur" or "literally anything", the tests pass. In addition, the minimum and maximum lengths are not checked anywhere in my JSON: no matter how many characters I use in any field, I do not get an error. Does anyone have any idea how to fix this, or has anyone found a better way to do what I would like to do? Thank you in advance!

There were a few issues with your schema. I'll address each of them.
In your schema, you have "Test1". In your JSON instance, you have "test1". Case is important. I would guess this is just an error in creating your example.
In your schema, you have "Test1" at the root level. Because this is not a schema keyword, it is ignored and has no effect on validation. You need to nest it inside a "properties" object, as you have done elsewhere.
{
"properties": {
"test1": {
Your validation would still not work correctly. If you want to validate each item in an array, you need to use the items keyword.
{
"properties": {
"test1": {
"items": {
"description": "test1des",
Finally, you'll need to nest the required and type keywords inside the items object.
Here's the complete schema:
{
"properties": {
"test1": {
"items": {
"description": "test1des",
"minProperties": 1,
"properties": {
"prop1": {
"description": "prop1des",
"example": "prop1exam",
"maxLength": 10,
"minLength": 2,
"type": "string"
},
"prop2": {
"description": "prop2des",
"example": "prop2example",
"maxLength": 200,
"minLength": 2,
"type": "string"
},
"prop3": {
"enum": [
"enum1",
"enum2",
"enum3"
],
"example": "enum1",
"type": "string"
}
},
"required": [
"prop3"
],
"type": "object"
}
}
}
}
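To check that a schema shaped like this rejects the improper example from the question, something like the following sketch could be used. It assumes the jsonschema package is installed, trims the schema to the constraints under discussion, and picks Draft4Validator as an assumption, since the question does not say which draft the schema targets.

```python
from jsonschema import Draft4Validator

# Trimmed version of the complete schema above: only the constraints under test.
schema = {
    "properties": {
        "test1": {
            "items": {
                "type": "object",
                "required": ["prop3"],
                "properties": {
                    "prop1": {"type": "string", "minLength": 2, "maxLength": 10},
                    "prop3": {"type": "string", "enum": ["enum1", "enum2", "enum3"]},
                },
            }
        }
    }
}

# The improper example from the question.
instance = {
    "test1": [
        {"prop1": "TestStr123456", "prop2": "Test and Test", "prop3": " enum1"}
    ]
}

validator = Draft4Validator(schema)
# iter_errors() yields every violation instead of stopping at the first one.
errors = sorted(
    validator.iter_errors(instance),
    key=lambda e: [str(part) for part in e.absolute_path],
)
for error in errors:
    location = "/".join(str(part) for part in error.absolute_path)
    print(f"{location}: {error.message}")
```

With the constraints nested under items, both the over-long prop1 and the prop3 value with a leading space should be reported.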

Printing each instance of a single line item from a JSON using python

Does anyone know how to print multiple instances of the same line item from JSON output?
The data I want to parse looks something like this:
[
{
"project": {
"id": 6514847,
"name": "Trial_1",
"code": "123",
"created_at": "2014-10-08T04:22:14Z",
"updated_at": "2017-04-11T00:32:43Z",
"starts_on": "2014-10-08"
}
},
{
"project": {
"id": 6514864,
"name": "Trial_2",
"code": "456",
"created_at": "2014-10-08T04:26:39Z",
"updated_at": "2017-04-11T00:32:46Z",
"starts_on": "2014-10-08"
}
},
{
"project": {
"id": 12502453,
"name": "Trial_3",
"code": "789",
"created_at": "2016-12-08T05:14:38Z",
"updated_at": "2017-04-11T00:32:38Z",
"starts_on": "2016-12-08"
}
}
]
This data came from a requests.get() call.
I know I can print a single instance of this using
req = requests.get(url, headers=headers)
read_req = req.json()
trial = read_req['project']['code']
print(trial) #123
The final product I wish to see is linking each Project Name to its relevant Project Code.

You have a list of dicts of dicts. To iterate over each "project" dict you just use a for loop.
for entry in read_req:
    trial = entry['project']['code']
    print(trial)
In this case, each time through the loop entry will be a dictionary containing the "project" key.

You need a for loop.
read_req = req.json()
for project in read_req:
    print(project['project']['code'])

This should work for you, assuming jsontxt holds the input data:
for i in range(0, len(jsontxt)):
    print(jsontxt[i]['project']['name'], jsontxt[i]['project']['code'])
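Since the stated goal is to link each project name to its code, a dictionary built in one pass may be more convenient than printing inside the loop. This is a sketch on sample data shaped like the response in the question, with fields trimmed. The variable name name_to_code is made up for illustration.

```python
# Sample shaped like the response in the question (fields trimmed).
read_req = [
    {"project": {"id": 6514847, "name": "Trial_1", "code": "123"}},
    {"project": {"id": 6514864, "name": "Trial_2", "code": "456"}},
    {"project": {"id": 12502453, "name": "Trial_3", "code": "789"}},
]

# Map each project name to its code.
name_to_code = {
    entry["project"]["name"]: entry["project"]["code"] for entry in read_req
}

for name, code in name_to_code.items():
    print(name, code)
```

If two projects could share a name, the later one would overwrite the earlier one in the dictionary, so a list of (name, code) pairs would be safer in that case.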
