I have a problem with a nested json in python script, i need to reproduce the following jq query:
cat inventory.json | jq '.hostvars[] | [.openstack.hostname, .openstack.accessIPv4]'
the json file has a structure like this:
{
"hostvars": {
"096b430e-20f0-4655-bb97-9bb3ab2db73c": {
"openstack": {
"accessIPv4": "192.168.3.6",
"hostname": "vm-1"
}
}
"8fb7b9b7-5ccc-47c8-addf-64563fdd0d4c": {
"openstack": {
"accessIPv4": "192.168.3.7",
"hostname": "vm-2"
}
}
}
}
and the query with jq gives me the correct output:
# cat test.json | jq '.hostvars[] | [.openstack.hostname, .openstack.accessIPv4]'
[
"vm-1",
"192.168.3.6"
]
[
"vm-2",
"192.168.3.7"
]
Now i want reproduce this in python, to handle the individual values in variable but I can't parse the contents of each id, what with jq i do with .hostvars [].
with open('inventory.json', 'r') as inv:
data=inv.read()
obj=json.loads(data)
objh=obj['hostvars'][096b430e-20f0-4655-bb97-9bb3ab2db73c]['openstack']
print(objh)
Calling the id works, but if I replace it with 0 or [] I have a syntax error.
Serializing JSON Data
I think when you are dealing with json python you should use convert to Serializing:
The json module exposes two methods for serializing Python objects into JSON format.
dump() will write Python data to a file-like object. We use this when we want to serialize our Python data to an external JSON file.
dumps() will write Python data to a string in JSON format. This is useful if we want to use the JSON elsewhere in our program, or if we just want to print it to the console to check that it’s correct.
Both the dump() and dumps() methods allow us to specify an optional indent argument. This will change how many spaces is used for indentation, which can make our JSON easier to read.
json_str = json.dumps(data, indent=4)
for exampel:
import json
data={"user":{
"name":"CodeView",
"age":29
}
}
with open("data_file.json","w")as write_file:
json.dump(data,write_file)
json_str=json.dumps(data)
print(json_str)
json_data = {
"hostvars": {
"096b430e-20f0-4655-bb97-9bb3ab2db73c": {
"openstack": {
"accessIPv4": "192.168.3.6",
"hostname": "vm-1"
}
},
"8fb7b9b7-5ccc-47c8-addf-64563fdd0d4c": {
"openstack": {
"accessIPv4": "192.168.3.7",
"hostname": "vm-2"
}
}
}
}
result = [[value['openstack']['hostname'], value['openstack']['accessIPv4']]
for value in json_data['hostvars'].values()]
print(result)
output
[['vm-1', '192.168.3.6'], ['vm-2', '192.168.3.7']]
Related
I'm new to Python and I'm quite stuck (I've gone through multiple other stackoverflows and other sites and still can't get this to work).
I've the below json coming out of an API connection
{
"results":[
{
"group":{
"mediaType":"chat",
"queueId":"67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
},
"data":[
{
"interval":"2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
"metrics":[
{
"metric":"nOffered",
"qualifier":null,
"stats":{
"max":null,
"min":null,
"count":14,
"count_negative":null,
"count_positive":null,
"sum":null,
"current":null,
"ratio":null,
"numerator":null,
"denominator":null,
"target":null
}
}
],
"views":null
}
]
}
]
}
and what I'm mainly looking to get out of it is (or at least something as close as)
MediaType
QueueId
NOffered
Chat
67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d
14
Is something like that possible? I've tried multiple things and I either get the whole of this out in one line or just get different errors.
The error you got indicates you missed that some of your values are actually a dictionary within an array.
Assuming you want to flatten your json file to retrieve the following keys: mediaType, queueId, count.
These can be retrieved by the following sample code:
import json
with open(path_to_json_file, 'r') as f:
json_dict = json.load(f)
for result in json_dict.get("results"):
media_type = result.get("group").get("mediaType")
queue_id = result.get("group").get("queueId")
n_offered = result.get("data")[0].get("metrics")[0].get("count")
If your data and metrics keys will have multiple indices you will have to use a for loop to retrieve every count value accordingly.
Assuming that the format of the API response is always the same, have you considered hardcoding the extraction of the data you want?
This should work: With response defined as the API output:
response = {
"results":[
{
"group":{
"mediaType":"chat",
"queueId":"67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
},
"data":[
{
"interval":"2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
"metrics":[
{
"metric":"nOffered",
"qualifier":'null',
"stats":{
"max":'null',
"min":'null',
"count":14,
"count_negative":'null',
"count_positive":'null',
"sum":'null',
"current":'null',
"ratio":'null',
"numerator":'null',
"denominator":'null',
"target":'null'
}
}
],
"views":'null'
}
]
}
]
}
You can extract the results as follows:
results = response["results"][0]
{
"mediaType": results["group"]["mediaType"],
"queueId": results["group"]["queueId"],
"nOffered": results["data"][0]["metrics"][0]["stats"]["count"]
}
which gives
{
'mediaType': 'chat',
'queueId': '67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d',
'nOffered': 14
}
I am trying to add data into a json key from a csv file and maintain the original structure as is.. the json file looks like this..
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The csv file I am trying to load has the following structure..
note that the
mimetype
is not included in the csv file.
I already have code that can do this, however its a bit manual and I am looking for a simpler approach that would just require a csv file with the values and this data will be added into the json structure. The expected outcome should look like this:
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://sampleinvoices/Handwritten/1.pdf",
"mimeType": "application/pdf"
},
{
"gcsUri": "gs://sampleinvoices/Handwritten/2.pdf",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The code that I am currently using, which is a bit manual looks like this..
import json
# function to add to JSON
def write_json(new_data, filename='keyvalue.json'):
with open(filename,'r+') as file:
# load existing data into a dict.
file_data = json.load(file)
# Join new_data with file_data inside documents
file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_data)
# Sets file's current position at offset.
file.seek(0)
# convert back to json.
json.dump(file_data, file, indent = 4)
# python object to be appended
y = {
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
write_json(y)
I would suggest something like this:
import pandas as pd
import json
from pathlib import Path
df_csv = pd.read_csv("your_data.csv")
json_file = Path("your_data.json")
json_data = json.loads(json_file.read_text())
documents = [
{
"gcsUri": cell,
"mimeType": "application/pdf"
}
for cell in df_csv["column_name"]
]
json_data["inputDocuments"]["gcsDocuments"]["documents"] = documents
json_file.write_text(json.dumps(json_data))
Probably you should split this into separate functions, but it should communicate the general idea.
Updated: The XHR response was not correct earlier
I'm failing with flatten my json in a correct way from a XHR-response.
I have just expanded one item below, to make it more readable.
I am using python and I have tried, with incorrect outcome.
u = "URL"
SE_units = requests.get(u,headers=h).json()
dp = pd.json_normalize(SE_units,[SE_units,"Items"])
SE_dp_list.append(dp)
From the XHR-Response below I would like to have the Items-information into a CSV but when i do export.to_CSV I see that it haven't been flattened correctly
{"Content":{
"PaginationCount":12,"FilterValues":null,"Items":
[{
"Id":258370,
"OriginalType":"BostadObjectPage",
"PublishDate":null,
"Title":"02 Skogsvagen",
"Image":
{
"description":null,
"alt":null,
"externalUrl":"/abc.jpg"
},
"StaticMapImage":null,
"Url":"/abcd/",
"HideReadMore":false,
"ProjectData":null,
"ObjectData":
{
"BuildingTypeLabel":"Rad-/Kedje-/Parhus",
"ObjectStatus":"SalesInProgress",
"ObjectStatusLabel":"Till salu",
"ObjectNumber":"02",
"City":"staden",
"RoomInterval":"2-3",
"LivingArea":"101",
"SalesPrice":"2 150 000",
"MonthlyFee":null,
"Elevator":false,
"Balcony":false,
"Terrace":true
},
"FastighetProjectData":null,
"FastighetObjectData":null,
"OfficeData":null
},
{
"Id":258372,
"OriginalType":"BostadObjectPage",
"PublishDate":null,
....."same structure as above"
"OfficeData":null
}],
"NoResultsMessage":null,
"SimplifiedBuildingType":null,
"NextIndex":-1,
"TotalCount":12,
"Heading":null,
"ShowMoreLabel":null,
"DataColumns":null,
"Error":null},
"ObjectSearchData":
{
"BuildingVariantId":"Houses",
"BuildingsFoundLabel":" {count}",
"BuildingTypeIds":[400],
"BuildingsAvailableForSale":12,
"BuildingNoResultsLabel":""
}
}
Expected output format after writing to CSV
I have a JSON file as follows :
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
},
}
}
I am having issues trying to convert this into a string for use with AWS IoT. The function I am using is deviceShadowHandler.shadowUpdate(JSONPayload, customShadowCallback_Update, 5)
Where JSONPayload should be the JSON string.
I have tried :
with open('JSONfile.json' , 'r') as f:
dict = json.load(f)
JSONPayload = str(dict)
but I receive an "Invalid JSON file error".
An attempt to manually create a literal string from the jSON file gets messy with complaints about "EOL while scanning string literal" etc.
What is the best solution to solve this? I am new to JSON and stuff and Python.
Trailing commas are not allowed in JSON.
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
} # <- no comma there
}
}
I need to parse the below Json data using python and write to a csv file. Below I have included only 2 server names but my list is big. Please help with sample code to get the desired output.
Below is my json data in a file server_info.json:
{
"dev-server":
{
"hoststatus":
{
"host_name":"dev-server",
"current_state":"2",
"last_time_up":"1482525184"
},
"servicestatus":
{
"/ Filesystem Check":
{
"host_name":"dev-server",
"service_description":"/ Filesystem Check",
"current_state":"1",
"state_type":"1"
},
"/home Filesystem Check":
{
"host_name":"dev-server",
"service_description":"/home Filesystem Check",
"current_state":"2",
"state_type":"2"
}
}
},
"uat-server":
{
"hoststatus":
{
"host_name":"uat-server",
"current_state":"0",
"last_time_up":"1460000000"
},
"servicestatus":
{
"/ Filesystem Check":
{
"host_name":"uat-server",
"service_description":"/ Filesystem Check",
"current_state":"0",
"state_type":"1"
},
"/home Filesystem Check":
{
"host_name":"uat-server",
"service_description":"/home Filesystem Check",
"current_state":"1",
"state_type":"2"
}
}
}
}
Expected Output:
output format:
hoststatus.host_name,hoststatus.current_state,hoststatus.last_time_up
-------------------------------------------------------------
dev-server,2,1482525184
uat-server,0,1460000000
and
output format:
servicestatus.host_name,servicestatus.service_description,servicestatus.current_state,servicestatus.state_type
--------------------------------------------------------------------------------
dev-server,/ Filesystem Check,1,1
dev-server,/home Filesystem Check,2,2
uat-server,/ Filesystem Check,0,1
uat-server,/home Filesystem Check,1,2
Elaborating more on what Jean-Fracois Fabre mentioned, json.load() can be used to read a JSON file and parse into Python object representation of JSON. json.loads() does the same except that the input is a string instead of a file (see json module for more details).
Bearing this in mind, say if you have your server logs in a file then you can start with the following:
import json
file = open('logs.txt')
data = json.load(file) # now the JSON object is represented as Python dict
for key in data.keys(): # dev-server and uat-server are keys
service_status = data[key]['servicestatus'] # this would give out the servicestatus
host_status = data[key]['hoststatus'] # this would give out the hoststatus
With this, you could use csv module to write it as CSV file in the format you desire.
A example by list comprehension.
import json
d = json.loads(data)
print("\n".join([','.join((hstat['host_name'], hstat['current_state'], hstat['last_time_up']))
for g in d.values()
for k, hstat in g.items() if k == 'hoststatus']))
print("\n".join([','.join((v['host_name'], v['service_description'], v['current_state'], v['state_type']))
for g in d.values()
for k, sstat in g.items() if k == 'servicestatus'
for v in sstat.values()]))