Update arrays in json object - python

Tried to update the array in json object. Here is my json object
{
"api.version": "v1",
"source": {
"thirdPartyRef": {
"resources": [{
"serviceType": "AwsElbBucket",
"path": {
"pathExpression": "songs/*"
},
"authentication": {
"type": "S3BucketAuthentication"
}
}]
}
}
}
Code that reads json and update awsId. My requirement is to add aws creds int the authentication secition.
Once program run successfully, it should look like
"authentication": {
"type": "S3BucketAuthentication",
"awsId": "AKIAXXXXX",
"awsKey": "MYHSHSYjusXXX"
}
Here is my snippet of code args[5] is the jsonfile
with open(args[5]) as json_data:
source = json.loads(json_data.read())
# source['source']['category']['awsID'] = "test"
source.update( {"awsId" : "AKIAXXXXX", "awsKey": "HHSJSHS"})
print source
output:
{u'api.version': u'v1', 'awsKey': 'HHSJSHS', 'awsId': 'AKIAXXXXX', u'source': {u'thirdPartyRef': {u'resources': [{u'path': {u'pathExpression': u'songs/*'}, u'serviceType': u'AwsElbBucket', u'authentication': {u'type': u'S3BucketAuthentication'}}]}}}
I tried to source.update( "source":{"awsId" : "AKIAXXXXX", "awsKey": "HHSJSHS"}}), it overwrites the rest of the json.

The data structure that you want to update is buried fairly deeply. You can't access it from the very top level.
Try this:
import json
with open('arg5.json') as json_data:
source = json.loads(json_data.read())
print source
source["source"]["thirdPartyRef"]["resources"][0]["authentication"].update(
{"awsId" : "AKIAXXXXX", "awsKey": "HHSJSHS"})

Related

Extracting data from JSON log

I am a beginner when it comes to programming. I'm trying to extract elements from a JSON log file, but I get an error and I don't know how to deal with it.
import json
with open("/Users/milosz/Desktop/logi.json") as f:
data = json.load(f)
print(type(data['Objects']))
print(data)
for object in data ['Objects']:
print(object)
Error:
File "/Users/milosz/PycharmProjects/JsonDataExtracter/Program/Python Exracter.py", line 4, in <module>
print(type(data['Objects']))
TypeError: list indices must be integers or slices, not str
Process finished with exit code 1
I am sending the log below.
{
"_id": "635bd4bfc594743ce9b1a5a3",
"dateStart": "2022-10-28T13:09:28.609Z",
"dateFinish": "2022-10-28T13:10:23.698Z",
"method": "customer.file.upsert",
"request": {
"Objects": [
{
"ERPId": "6915",
"B24Id": 403772,
"FileName": "B2B000202",
"FileContent": "JVBERi0xLjMNJeLjz9MN",
"B24EntityId": 3334
}
]
Following up on the guidance from #accdias, here is a code snippet that closes the gaps in your JSON snippet and demonstrates how to access the Objects section:
import json
json_string = """
{
"_id": "635bd4bfc594743ce9b1a5a3",
"dateStart": "2022-10-28T13:09:28.609Z",
"dateFinish": "2022-10-28T13:10:23.698Z",
"method": "customer.file.upsert",
"request": {
"Objects": [
{
"ERPId": "6915",
"B24Id": 403772,
"FileName": "B2B000202",
"FileContent": "JVBERi0xLjMNJeLjz9MN",
"B24EntityId": 3334
}
]
}
}
"""
json_dict = json.loads(json_string)
print(json_dict["request"]["Objects"])
Output:
[{'ERPId': '6915', 'B24Id': 403772, 'FileName': 'B2B000202', 'FileContent': 'JVBERi0xLjMNJeLjz9MN', 'B24EntityId': 3334}]

How to parse nested JSON object?

I am working on a new project in HubSpot that returns nested JSON like the sample below. I am trying to access the associated contacts id, but am struggling to reference it correctly (the id I am looking for is the value '201' in the example below). I've put together this script, but this script only returns the entire associations portion of the JSON and I only want the id. How do I reference the id correctly?
Here is the output from the script:
{'contacts': {'paging': None, 'results': [{'id': '201', 'type': 'ticket_to_contact'}]}}
And here is the script I put together:
import hubspot
from pprint import pprint
client = hubspot.Client.create(api_key="API_KEY")
try:
api_response = client.crm.tickets.basic_api.get_page(limit=2, associations=["contacts"], archived=False)
for x in range(2):
pprint(api_response.results[x].associations)
except ApiException as e:
print("Exception when calling basic_api->get_page: %s\n" % e)
Here is what the full JSON looks like ('contacts' property shortened for readability):
{
"results": [
{
"id": "34018123",
"properties": {
"content": "Hi xxxxx,\r\n\r\nCan you clarify on how the blocking of script happens? Is it because of any CSP (or) the script will decide run time for every URL’s getting triggered from browser?\r\n\r\nRegards,\r\nLogan",
"createdate": "2019-07-03T04:20:12.366Z",
"hs_lastmodifieddate": "2020-12-09T01:16:12.974Z",
"hs_object_id": "34018123",
"hs_pipeline": "0",
"hs_pipeline_stage": "4",
"hs_ticket_category": null,
"hs_ticket_priority": null,
"subject": "RE: call followup"
},
"createdAt": "2019-07-03T04:20:12.366Z",
"updatedAt": "2020-12-09T01:16:12.974Z",
"archived": false
},
{
"id": "34018892",
"properties": {
"content": "Hi Guys,\r\n\r\nI see that we were placed back on the staging and then removed again.",
"createdate": "2019-07-03T07:59:10.606Z",
"hs_lastmodifieddate": "2021-12-17T09:04:46.316Z",
"hs_object_id": "34018892",
"hs_pipeline": "0",
"hs_pipeline_stage": "3",
"hs_ticket_category": null,
"hs_ticket_priority": null,
"subject": "Re: Issue due to server"
},
"createdAt": "2019-07-03T07:59:10.606Z",
"updatedAt": "2021-12-17T09:04:46.316Z",
"archived": false,
"associations": {
"contacts": {
"results": [
{
"id": "201",
"type": "ticket_to_contact"
}
]
}
}
}
],
"paging": {
"next": {
"after": "35406270",
"link": "https://api.hubapi.com/crm/v3/objects/tickets?associations=contacts&archived=false&hs_static_app=developer-docs-ui&limit=2&after=35406270&hs_static_app_version=1.3488"
}
}
}
You can do api_response.results[x].associations["contacts"]["results"][0]["id"].
Sorted this out, posting in case anyone else is struggling with the response from the HubSpot v3 Api. The response schema for this call is:
Response schema type: Object
String results[].id
Object results[].properties
String results[].createdAt
String results[].updatedAt
Boolean results[].archived
String results[].archivedAt
Object results[].associations
Object paging
Object paging.next
String paging.next.after
String paging.next.linkResponse schema type: Object
String results[].id
Object results[].properties
String results[].createdAt
String results[].updatedAt
Boolean results[].archived
String results[].archivedAt
Object results[].associations
Object paging
Object paging.next
String paging.next.after
String paging.next.link
So to access the id of the contact associated with the ticket, you need to reference it using this notation:
api_response.results[1].associations["contacts"].results[0].id
notes:
results[x] - reference the result in the index
associations["contacts"] -
associations is a dictionary object, you can access the contacts item
by it's name
associations["contacts"].results is a list - reference
by the index []
id - is a string
In my case type was ModelProperty or CollectionResponseProperty couldn't reach dict anyhow.
For the record this got me to go through the results.
for result in list(api_response.results):
ID = result.id

How to add data into a json key from a csv file using python

I am trying to add data into a json key from a csv file and maintain the original structure as is.. the json file looks like this..
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The csv file I am trying to load has the following structure..
note that the
mimetype
is not included in the csv file.
I already have code that can do this, however its a bit manual and I am looking for a simpler approach that would just require a csv file with the values and this data will be added into the json structure. The expected outcome should look like this:
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://sampleinvoices/Handwritten/1.pdf",
"mimeType": "application/pdf"
},
{
"gcsUri": "gs://sampleinvoices/Handwritten/2.pdf",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The code that I am currently using, which is a bit manual looks like this..
import json
# function to add to JSON
def write_json(new_data, filename='keyvalue.json'):
with open(filename,'r+') as file:
# load existing data into a dict.
file_data = json.load(file)
# Join new_data with file_data inside documents
file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_data)
# Sets file's current position at offset.
file.seek(0)
# convert back to json.
json.dump(file_data, file, indent = 4)
# python object to be appended
y = {
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
write_json(y)
I would suggest something like this:
import pandas as pd
import json
from pathlib import Path
df_csv = pd.read_csv("your_data.csv")
json_file = Path("your_data.json")
json_data = json.loads(json_file.read_text())
documents = [
{
"gcsUri": cell,
"mimeType": "application/pdf"
}
for cell in df_csv["column_name"]
]
json_data["inputDocuments"]["gcsDocuments"]["documents"] = documents
json_file.write_text(json.dumps(json_data))
Probably you should split this into separate functions, but it should communicate the general idea.

Flatten json in Python from XHR-response

Updated: The XHR response was not correct earlier
I'm failing with flatten my json in a correct way from a XHR-response.
I have just expanded one item below, to make it more readable.
I am using python and I have tried, with incorrect outcome.
u = "URL"
SE_units = requests.get(u,headers=h).json()
dp = pd.json_normalize(SE_units,[SE_units,"Items"])
SE_dp_list.append(dp)
From the XHR-Response below I would like to have the Items-information into a CSV but when i do export.to_CSV I see that it haven't been flattened correctly
{"Content":{
"PaginationCount":12,"FilterValues":null,"Items":
[{
"Id":258370,
"OriginalType":"BostadObjectPage",
"PublishDate":null,
"Title":"02 Skogsvagen",
"Image":
{
"description":null,
"alt":null,
"externalUrl":"/abc.jpg"
},
"StaticMapImage":null,
"Url":"/abcd/",
"HideReadMore":false,
"ProjectData":null,
"ObjectData":
{
"BuildingTypeLabel":"Rad-/Kedje-/Parhus",
"ObjectStatus":"SalesInProgress",
"ObjectStatusLabel":"Till salu",
"ObjectNumber":"02",
"City":"staden",
"RoomInterval":"2-3",
"LivingArea":"101",
"SalesPrice":"2 150 000",
"MonthlyFee":null,
"Elevator":false,
"Balcony":false,
"Terrace":true
},
"FastighetProjectData":null,
"FastighetObjectData":null,
"OfficeData":null
},
{
"Id":258372,
"OriginalType":"BostadObjectPage",
"PublishDate":null,
....."same structure as above"
"OfficeData":null
}],
"NoResultsMessage":null,
"SimplifiedBuildingType":null,
"NextIndex":-1,
"TotalCount":12,
"Heading":null,
"ShowMoreLabel":null,
"DataColumns":null,
"Error":null},
"ObjectSearchData":
{
"BuildingVariantId":"Houses",
"BuildingsFoundLabel":" {count}",
"BuildingTypeIds":[400],
"BuildingsAvailableForSale":12,
"BuildingNoResultsLabel":""
}
}
Expected output format after writing to CSV

Grab element from json dump

I'm using the following python code to connect to a jsonrpc server and nick some song information. However, I can't work out how to get the current title in to a variable to print elsewhere. Here is the code:
TracksInfo = []
for song in playingSongs:
data = { "id":1,
"method":"slim.request",
"params":[ "",
["songinfo",0,100, "track_id:%s" % song, "tags:GPASIediqtymkovrfijnCYXRTIuwxN"]
]
}
params = json.dumps(data, sort_keys=True, indent=4)
conn.request("POST", "/jsonrpc.js", params)
httpResponse = conn.getresponse()
data = httpResponse.read()
responce = json.loads(data)
print json.dumps(responce, sort_keys=True, indent=4)
TrackInfo = responce['result']["songinfo_loop"][0]
TracksInfo.append(TrackInfo)
This brings me back the data in json format and the print json.dump brings back:
pi#raspberrypi ~/pithon $ sudo python tom3.py
{
"id": 1,
"method": "slim.request",
"params": [
"",
[
"songinfo",
"0",
100,
"track_id:-140501481178464",
"tags:GPASIediqtymkovrfijnCYXRTIuwxN"
]
],
"result": {
"songinfo_loop": [
{
"id": "-140501481178464"
},
{
"title": "Witchcraft"
},
{
"artist": "Pendulum"
},
{
"duration": "253"
},
{
"tracknum": "1"
},
{
"type": "Ogg Vorbis (Spotify)"
},
{
"bitrate": "320k VBR"
},
{
"coverart": "0"
},
{
"url": "spotify:track:2A7ZZ1tjaluKYMlT3ItSfN"
},
{
"remote": 1
}
]
}
}
What i'm trying to get is result.songinfoloop.title (but I tried that!)
The songinfo_loop structure is.. peculiar. It is a list of dictionaries each with just one key.
Loop through it until you have one with a title:
TrackInfo = next(d['title'] for d in responce['result']["songinfo_loop"] if 'title' in d)
TracksInfo.append(TrackInfo)
A better option would be to 'collapse' all those dictionaries into one:
songinfo = reduce(lambda d, p: d.update(p) or d,
responce['result']["songinfo_loop"], {})
TracksInfo.append(songinfo['title'])
songinfo_loop is a list not a dict. That means you need to call it by position, or loop through it and find the dict with a key value of "title"
positional:
responce["result"]["songinfo_loop"][1]["title"]
loop:
for info in responce["result"]["songinfo_loop"]:
if "title" in info.keys():
print info["title"]
break
else:
print "no song title found"
Really, it seems like you would want to have the songinfo_loop be a dict, not a list. But if you need to leave it as a list, this is how you would pull the title.
The result is really a standard python dict, so you can use
responce["result"]["songinfoloop"]["title"]
which should work

Categories