Update nested JSON with the name of the JSON file - Python

I'm wondering if you could help me with filling JSONs with their original filenames.
Here is a sample JSON:
jsv is a list of JSON objects (the first main key is the document number: document_0, document_1, ...):
jsv = [
    {
        "document_0": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "09-12-2022",
            "source_filename": "None",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None"
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2020-11-30",
                    "comment": "None",
                    "confidence": 97,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu"
                }
            ]
        }
    },
    {
        "document_1": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "09-12-2022",
            "source_filename": "None",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None"
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2020-11-30",
                    "comment": "None",
                    "confidence": 97,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu"
                }
            ]
        }
    }
]
Inside each of these JSON objects there is a key, source_filename, which I want to update with the real name of the JSON file.
My folder with files, as an example:
'11111.pdf.json',
'11112.pdf.json',
'11113.pdf.json',
'11114.pdf.json',
'11115.pdf.json'
What I want to achieve:
jsv = [
    {
        "document_0": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "09-12-2022",
            "source_filename": "11111.pdf.json",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None"
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2222-22-22",
                    "comment": "None",
                    "confidence": 22,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu"
                }
            ]
        }
    },
    {
        "document_1": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "22-22-2222",
            "source_filename": "11111.pdf.json",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None"
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2222-11-22",
                    "comment": "None",
                    "confidence": 22,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu"
                }
            ]
        }
    }
]
document_0 and document_1 share the same filename.
What I've managed to get:
from os import listdir
from os.path import isfile, join

dir_name = 'path_name'
onlyfiles = [f for f in listdir(dir_name) if isfile(join(dir_name, f))]
onlyfiles is then a list of the filenames of my JSON files.
Now I was thinking of updating my jsv with it in a loop.
But I'm also looking for a method that will be very efficient, due to the large amount of data I have to process.
EDIT:
I've managed to do it with a for loop, but maybe there is a more efficient way:
for i in range(len(jsv)):
    if isinstance(jsv[i], dict):
        # note: this assumes every entry has a "document_0" key
        jsv[i]["document_0"].update({"source_filename": onlyfiles[i]})
    else:
        print(onlyfiles[i])

If your jsv is:
jsv = [
{
"document_0": {
"id": 111,
"laboratory": "xxx",
"document_type": "xxx",
"language": "pl",
"creation_date": "09-12-2022",
"source_filename": "None",
"version": "0.1",
"exams_ocr_avg_confidence": 0.0,
"patient_data": {
"first_name": "YYYY",
"surname": "YYYY",
"pesel": "12345678901",
"birth_date": "1111-22-22",
"sex": "F",
"age": "None",
},
"exams": [
{
"name": "xx",
"sampling_date": "2020-11-30",
"comment": "None",
"confidence": 97,
"result": "222",
"unit": "ml",
"norm": "None",
"material": "None",
"icd9": "uuuuu",
},
],
}
},
{
"document_1": {
"id": 111,
"laboratory": "xxx",
"document_type": "xxx",
"language": "pl",
"creation_date": "09-12-2022",
"source_filename": "None",
"version": "0.1",
"exams_ocr_avg_confidence": 0.0,
"patient_data": {
"first_name": "YYYY",
"surname": "YYYY",
"pesel": "12345678901",
"birth_date": "1111-22-22",
"sex": "F",
"age": "None",
},
"exams": [
{
"name": "xx",
"sampling_date": "2020-11-30",
"comment": "None",
"confidence": 97,
"result": "222",
"unit": "ml",
"norm": "None",
"material": "None",
"icd9": "uuuuu",
},
],
},
},
]
In Python, you can do something like this:
arq = ['11111.pdf.json', '11112.pdf.json']
if len(arq) == len(jsv):
    for i, doc in enumerate(jsv):
        for key in doc.keys():  # e.g. "document_0"
            doc[key]['source_filename'] = arq[i]
Note that you need to check that the length of the files list is the same as the length of the jsv list!
The result is this jsv:
[
{
"document_0": {
"id": 111,
"laboratory": "xxx",
"document_type": "xxx",
"language": "pl",
"creation_date": "09-12-2022",
"source_filename": "11111.pdf.json",
"version": "0.1",
"exams_ocr_avg_confidence": 0.0,
"patient_data": {
"first_name": "YYYY",
"surname": "YYYY",
"pesel": "12345678901",
"birth_date": "1111-22-22",
"sex": "F",
"age": "None",
},
"exams": [
{
"name": "xx",
"sampling_date": "2020-11-30",
"comment": "None",
"confidence": 97,
"result": "222",
"unit": "ml",
"norm": "None",
"material": "None",
"icd9": "uuuuu",
}
],
}
},
{
"document_1": {
"id": 222,
"laboratory": "xxx",
"document_type": "xxx",
"language": "pl",
"creation_date": "09-12-2022",
"source_filename": "11112.pdf.json",
"version": "0.1",
"exams_ocr_avg_confidence": 0.0,
"patient_data": {
"first_name": "YYYY",
"surname": "YYYY",
"pesel": "12345678901",
"birth_date": "1111-22-22",
"sex": "F",
"age": "None",
},
"exams": [
{
"name": "xx",
"sampling_date": "2020-11-30",
"comment": "None",
"confidence": 97,
"result": "222",
"unit": "ml",
"norm": "None",
"material": "None",
"icd9": "uuuuu",
}
],
}
},
]
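On the efficiency point: pairing the two lists with zip avoids the index bookkeeping, and mutating the dicts in place is already about as cheap as this update can get. A small sketch, assuming onlyfiles is ordered so that it lines up with jsv:
for doc, fname in zip(jsv, onlyfiles):
    # each element has a single top-level key such as "document_0"
    for inner in doc.values():
        inner["source_filename"] = fname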

Related

Explode json without pandas

I have a JSON object:
{
"data": {
"geography": [
{
"id": "1",
"state": "USA",
"properties": [
{
"code": "CMD-01",
"value": "34"
},
{
"code": "CMD-02",
"value": "24"
}
]
},
{
"id": "2",
"state": "Canada",
"properties": [
{
"code": "CMD-04",
"value": "50"
},
{
"code": "CMD-05",
"value": "60"
}
]
}
]
}
}
I want to get the result as a new JSON, but without using pandas (and all those explode, flatten and normalize functions...). Is there any option to get this structure without using pandas or running into an out-of-memory issue?
The output should be:
{ "id": "1",
"state": "USA",
"code": "CMD-01",
"value": "34"
},
{ "id": "1",
"state": "USA",
"code": "CMD-02",
"value": "24",
},
{ "id": "2",
"state": "Canada",
"code": "CMD-04",
"value": "50"
},
{ "id": "2",
"state": "Canada",
"code": "CMD-05",
"value": "60"
},
You can simply loop over the list associated with "geography" and build new dictionaries that you will add to a newly created list:
dict_in = {
"data": {
"geography": [
{
"id": "1",
"state": "USA",
"properties": [
{
"code": "CMD-01",
"value": "34"
},
{
"code": "CMD-02",
"value": "24"
}
]
},
{
"id": "2",
"state": "Canada",
"properties": [
{
"code": "CMD-04",
"value": "50"
},
{
"code": "CMD-05",
"value": "60"
}
]
}
]
}
}
import json

rec_out = []
for obj in dict_in["data"]["geography"]:
    for prop in obj["properties"]:
        # copy the parent fields, then merge in the property fields
        dict_out = {
            "id": obj["id"],
            "state": obj["state"]
        }
        dict_out.update(prop)
        rec_out.append(dict_out)
print(json.dumps(rec_out, indent=4))
Output:
[
{
"id": "1",
"state": "USA",
"code": "CMD-01",
"value": "34"
},
{
"id": "1",
"state": "USA",
"code": "CMD-02",
"value": "24"
},
{
"id": "2",
"state": "Canada",
"code": "CMD-04",
"value": "50"
},
{
"id": "2",
"state": "Canada",
"code": "CMD-05",
"value": "60"
}
]
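Since the question mentions out-of-memory concerns, a generator variant of the same loop (a sketch over the same dict_in structure) yields one flat record at a time instead of materialising the whole list:
def explode(geography):
    # yield one flat record per (entry, property) pair
    for obj in geography:
        for prop in obj["properties"]:
            yield {"id": obj["id"], "state": obj["state"], **prop}

for rec in explode(dict_in["data"]["geography"]):
    print(rec)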

What is the best way for me to iterate over this dataset to return all matching values from another key-value pair if I match a separate key?

I want to be able to search through this list of dicts (see the bottom of the post; I think that is what this particular arrangement is called) for an ['address'] that matches '0xd2'. If that match is found, I want to return/print all the corresponding ['id']s.
So in this case I would like to return:
632, 315, 432, 100
I'm able to extract individual values like this:
none = None
print(my_dict['result'][2]["id"])
432
I'm struggling with how to get a loop to do this properly.
{
"total": 4,
"page": 0,
"page_size": 100,
"result": [
{
"address": "0xd2",
"id": "632",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "315",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "432",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0x44",
"id": "100",
"amount": "1",
"name": "Suicide Squad",
"group": "DC",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
}
],
"status": "SYNCED"
}
Welcome to Stack Overflow.
You can try a list comprehension:
[res["id"] for res in my_dict["result"] if res["address"] == "0xd2"]
If you'd like to use a for loop:
l = []
for res in my_dict["result"]:
    if res["address"] == "0xd2":
        l.append(res["id"])
You can use a list comprehension.
import json
json_string = """{
"total": 4,
"page": 0,
"page_size": 100,
"result": [
{
"address": "0xd2",
"id": "632",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "315",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "432",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0x44",
"id": "100",
"amount": "1",
"name": "Suicide Squad",
"group": "DC",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
}
],
"status": "SYNCED"
}"""
json_dict = json.loads(json_string)
result = [elem['id'] for elem in json_dict['result'] if elem['address'] == '0xd2']
print(result)
Output:
['632', '315', '432']
This would store the associated ids in the list:
ids = []
for r in dataset.get('result'):
    if r.get('address') == '0xd2':
        ids.append(r.get('id'))
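Note that in the sample data the id '100' belongs to address '0x44', so it will not match '0xd2' (hence the answers return ['632', '315', '432']). If you need this lookup for many different addresses, grouping the ids by address once (a small sketch) avoids rescanning the list each time:
from collections import defaultdict

ids_by_address = defaultdict(list)
for res in my_dict["result"]:
    ids_by_address[res["address"]].append(res["id"])

print(ids_by_address["0xd2"])  # ['632', '315', '432']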

How to create a very large JSON object in Python

I have a json file like this:
{
"students": [
{
"name": "Jack",
"age": "12",
"class": "8",
"start_date": "2021-01-01",
"score": {
"Eng": "90",
"Math": "90",
"History": "91",
"Art": "80"
},
"friend": {},
"talents": [
"dance"
]
}
],
"school": [
{
"city": "LA",
"state": "CA",
"country": "US",
}
]
}
Now I want to modify this file and add new students' info into it, like this (adding more similar JSON objects into students with minimal change):
{
"students": [
{
"name": "Jack",
"age": "12",
"class": "8",
"start_date": "2021-01-01",
"score": {
"Eng": "90",
"Math": "90",
"History": "91",
"Art": "80"
},
"friend": {},
"talents": [
"dance"
]
},
{
"name": "David",
"age": "12",
"class": "8",
"start_date": "2021-02-01",
"score": {
"Eng": "92",
"Math": "90",
"History": "95",
"Art": "70"
},
"friend": {},
"talents": [
"skate"
]
},
... ...
],
"school": [
{
"city": "LA",
"state": "CA",
"country": "US",
}
]
}
Is there a way to do this at scale, since I need to append more than 1000 new students?
And since this is a large JSON and only a minimal change, I don't want to specify each key and value over 1000 times.
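A minimal sketch of one way to do this in Python, assuming the whole document fits in memory (the file name students.json and the new_students list are placeholder names):
import json

# placeholder list of new student records; in practice this could be
# built programmatically or read from another file
new_students = [
    {
        "name": "David",
        "age": "12",
        "class": "8",
        "start_date": "2021-02-01",
        "score": {"Eng": "92", "Math": "90", "History": "95", "Art": "70"},
        "friend": {},
        "talents": ["skate"],
    },
]

with open("students.json") as fp:
    data = json.load(fp)

# extend appends all new records without touching the rest of the document
data["students"].extend(new_students)

with open("students.json", "w") as fp:
    json.dump(data, fp, indent=2)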

How to use S3 Select for Nested Parquet Objects

I have dumped data into a parquet file.
When I use
SELECT * FROM s3object s LIMIT 1
it gives me the following result.
{
"name": "John",
"age": "45",
"country": "USA",
"experience": [{
"company": {
"name": "ABC",
"years": "10",
"position": "Manager"
}
},
{
"company": {
"name": "BBC",
"years": "2",
"position": "Assistant"
}
}
]
}
I want to filter the result where company.name = "ABC",
so the output should look like the following:
{
"name": "John",
"age": "45",
"country": "USA",
"experience": [{
"company": {
"name": "ABC",
"years": "10",
"position": "Manager"
}
}
]
}
or this
{
"name": "John",
"age": "45",
"country": "USA",
"experience.company.name": "ABC",
"experience.company.years": "10",
"experience.company.position": "Manager"
}
Any support is highly appreciated.
Thanks.
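As far as I know, the S3 Select SQL dialect cannot flatten or filter inside arrays, so one common workaround is to fetch the record and filter the experience array client-side (Athena is the usual choice when the filtering has to happen server-side). A minimal client-side sketch, assuming line holds one JSON record returned by the query above:
import json

# 'line' is one record returned by the S3 Select query (hypothetical variable)
record = json.loads(line)
record["experience"] = [
    exp for exp in record["experience"]
    if exp["company"]["name"] == "ABC"
]
print(json.dumps(record, indent=2))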

Append multiple JSON files together and output 1 Avro file using Python

I have a use case where I am required to append multiple JSON files and then convert them into one single Avro file. I have written the code below, which appends the JSON files together and then converts them into an Avro file. The issue I am having is that the JSON files get appended, but the entire JSON is enclosed in [] brackets, so I get an error while converting it into an Avro file. I am trying to figure out how to get rid of the [] from the first and the last line in the JSON file. Any help is appreciated.
The error I am getting (a snippet; the error is too long to paste in full): avro.io.AvroTypeException: The datum [{'event_type': 'uplink'.....}] is not an example of the schema
My code:
Laird.py
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter
from avro import schema, datafile, io
import json
from datetime import date
import glob

data = []
for f in glob.glob("*.txt"):
    with open(f) as infile:
        data.append(json.load(infile))
# json.dumps(data)

with open("laird.json", "w") as outfile:
    json.dump(data, outfile)

def json_to_avro():
    fo = open("laird.json", "r")
    data = fo.readlines()
    final_header = []
    final_rec = []
    for header in data[0:1]:
        header = header.strip("\n")
        header = header.split(",")
        final_header = header
    for rec in data[1:]:
        rec = rec.strip("\n")
        rec = rec.split(" ")
        rec = ' '.join(rec).split()
        final_rec = rec
    final_dict = dict(zip(final_header, final_rec))
    # print(final_dict)
    json_dumps = json.dumps(final_dict, ensure_ascii=False)
    # print(json_dumps)
    schema = avro.schema.parse(open("laird.avsc", "rb").read())
    # print(schema)
    writer = DataFileWriter(open("laird.avro", "wb"), DatumWriter(), schema)
    with open("laird.json") as fp:
        contents = json.load(fp)
        print(contents)
    writer.append(contents)
    writer.close()

json_to_avro()

# Script to read/convert AVRO file to JSON
reader = DataFileReader(open("laird.avro", "rb"), DatumReader())
for user in reader:
    print(user)
reader.close()
Schema: laird.avsc
{
"name": "MyClass",
"type": "record",
"namespace": "com.acme.avro",
"fields": [
{
"name": "event_type",
"type": "string"
},
{
"name": "event_data",
"type": {
"name": "event_data",
"type": "record",
"fields": [
{
"name": "device_id",
"type": "string"
},
{
"name": "user_id",
"type": "string"
},
{
"name": "payload",
"type": {
"type": "array",
"items": {
"name": "payload_record",
"type": "record",
"fields": [
{
"name": "name",
"type": "string"
},
{
"name": "sensor_id",
"type": "string"
},
{
"name": "type",
"type": "string"
},
{
"name": "unit",
"type": "string"
},
{
"name": "value",
"type": "float"
},
{
"name": "channel",
"type": "int"
},
{
"name": "timestamp",
"type": "long"
}
]
}
}
},
{
"name": "client_id",
"type": "string"
},
{
"name": "hardware_id",
"type": "string"
},
{
"name": "timestamp",
"type": "long"
},
{
"name": "application_id",
"type": "string"
},
{
"name": "device_type_id",
"type": "string"
}
]
}
},
{
"name": "company",
"type": {
"name": "company",
"type": "record",
"fields": [
{
"name": "id",
"type": "int"
},
{
"name": "address",
"type": "string"
},
{
"name": "city",
"type": "string"
},
{
"name": "country",
"type": "string"
},
{
"name": "created_at",
"type": "string"
},
{
"name": "industry",
"type": "string"
},
{
"name": "latitude",
"type": "float"
},
{
"name": "longitude",
"type": "float"
},
{
"name": "name",
"type": "string"
},
{
"name": "state",
"type": "string"
},
{
"name": "status",
"type": "int"
},
{
"name": "timezone",
"type": "string"
},
{
"name": "updated_at",
"type": "string"
},
{
"name": "user_id",
"type": "string"
},
{
"name": "zip",
"type": "string"
}
]
}
},
{
"name": "location",
"type": {
"name": "location",
"type": "record",
"fields": [
{
"name": "id",
"type": "int"
},
{
"name": "address",
"type": "string"
},
{
"name": "city",
"type": "string"
},
{
"name": "country",
"type": "string"
},
{
"name": "created_at",
"type": "string"
},
{
"name": "industry",
"type": "string"
},
{
"name": "latitude",
"type": "float"
},
{
"name": "longitude",
"type": "float"
},
{
"name": "name",
"type": "string"
},
{
"name": "state",
"type": "string"
},
{
"name": "status",
"type": "int"
},
{
"name": "timezone",
"type": "string"
},
{
"name": "updated_at",
"type": "string"
},
{
"name": "user_id",
"type": "string"
},
{
"name": "zip",
"type": "string"
},
{
"name": "company_id",
"type": "int"
}
]
}
},
{
"name": "device_type",
"type": {
"name": "device_type",
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "application_id",
"type": "string"
},
{
"name": "category",
"type": "string"
},
{
"name": "codec",
"type": "string"
},
{
"name": "data_type",
"type": "string"
},
{
"name": "description",
"type": "string"
},
{
"name": "manufacturer",
"type": "string"
},
{
"name": "model",
"type": "string"
},
{
"name": "name",
"type": "string"
},
{
"name": "parent_constraint",
"type": "string"
},
{
"name": "proxy_handler",
"type": "string"
},
{
"name": "subcategory",
"type": "string"
},
{
"name": "transport_protocol",
"type": "string"
},
{
"name": "version",
"type": "string"
},
{
"name": "created_at",
"type": "string"
},
{
"name": "updated_at",
"type": "string"
}
]
}
},
{
"name": "device",
"type": {
"name": "device",
"type": "record",
"fields": [
{
"name": "id",
"type": "int"
},
{
"name": "thing_name",
"type": "string"
},
{
"name": "created_at",
"type": "string"
},
{
"name": "updated_at",
"type": "string"
},
{
"name": "status",
"type": "int"
}
]
}
}
]
}
Generated JSON File: laird.json
[{"event_type": "uplink", "event_data": {"device_id": "42934500-fcfb-11ea-9f13-d1d0271289a6", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "payload": [{"name": "Humidity", "sensor_id": "42abaf00-fcfb-11ea-9c71-c517ac227ea5", "type": "rel_hum", "unit": "p", "value": 94.29, "channel": 4, "timestamp": 1605007797789}, {"name": "Temperature", "sensor_id": "42b0df20-fcfb-11ea-bf5c-d11ce3dbc1cb", "type": "temp", "unit": "c", "value": 21.64, "channel": 3, "timestamp": 1605007797789}, {"name": "Battery", "sensor_id": "42a98c20-fcfb-11ea-b4dd-cd2887a335f7", "type": "batt", "unit": "p", "value": 100, "channel": 5, "timestamp": 1605007797789}, {"name": "Local Backup", "sensor_id": "42b01bd0-fcfb-11ea-9f13-d1d0271289a6", "type": "digital_sensor", "unit": "d", "value": 1, "channel": 400, "timestamp": 1605007797789}, {"name": "RSSI", "sensor_id": "42b39e40-fcfb-11ea-bf5c-d11ce3dbc1cb", "type": "rssi", "unit": "dbm", "value": -53, "channel": 100, "timestamp": 1605007797789}, {"name": "SNR", "sensor_id": "", "type": "snr", "unit": "db", "value": 10.2, "channel": 101, "timestamp": 1605007797789}], "client_id": "b8468c50-baf0-11ea-a5e9-89c3b09de43a", "hardware_id": "0025ca0a0000e232", "timestamp": 1605007797789, "application_id": "shipcomwireless", "device_type_id": "70776630-e15e-11ea-a8c9-05cd631755a5"}, "company": {"id": 7696, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-11T18:44:50Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "Harris Health System - Production", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-15T03:34:58Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054"}, "location": {"id": 9153, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-18T02:08:03Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "HHS Van Sensors", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-18T02:08:03Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054", "company_id": 7696}, "device_type": {"id": "70776630-e15e-11ea-a8c9-05cd631755a5", "application_id": "", "category": "module", "codec": "lorawan.laird.rs1xx-backup", "data_type": "", "description": "Temp Sensor", "manufacturer": "Laird", "model": "RS1xx", "name": "Laird Temp & Humidity with Local Backup", "parent_constraint": "NOT_ALLOWED", "proxy_handler": "PrometheusClient", "subcategory": "lora", "transport_protocol": "lorawan", "version": "", "created_at": "2020-08-18T14:23:51Z", "updated_at": "2020-08-18T18:16:37Z"}, "device": {"id": 269231, "thing_name": "Van 18-1775 (Ambient)", "created_at": "2020-09-22T17:44:27Z", "updated_at": "2020-09-25T22:39:57Z", "status": 0}}, {"event_type": "uplink", "event_data": {"device_id": "7de32cf0-f9d2-11ea-b4dd-cd2887a335f7", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "payload": [{"name": "Humidity", "sensor_id": "7dfbbe00-f9d2-11ea-9c71-c517ac227ea5", "type": "rel_hum", "unit": "p", "value": 0, "channel": 4, "timestamp": 1604697684139}, {"name": "Temperature", "sensor_id": "7dfb48d0-f9d2-11ea-9c71-c517ac227ea5", "type": "temp", "unit": "c", "value": -27.22, "channel": 3, "timestamp": 1604697684139}, {"name": "Battery", "sensor_id": "7dfa5e70-f9d2-11ea-bf5c-d11ce3dbc1cb", "type": "batt", "unit": "p", "value": 100, "channel": 5, "timestamp": 1604697684139}, {"name": "Local Backup", "sensor_id": 
"7dfb96f0-f9d2-11ea-b4dd-cd2887a335f7", "type": "digital_sensor", "unit": "d", "value": 1, "channel": 400, "timestamp": 1604697684139}, {"name": "RSSI", "sensor_id": "7dfc5a40-f9d2-11ea-b4dd-cd2887a335f7", "type": "rssi", "unit": "dbm", "value": -7, "channel": 100, "timestamp": 1604697684139}, {"name": "SNR", "sensor_id": "", "type": "snr", "unit": "db", "value": 10, "channel": 101, "timestamp": 1604697684139}], "client_id": "b8468c50-baf0-11ea-a5e9-89c3b09de43a", "hardware_id": "0025ca0a0000be6a", "timestamp": 1604697684139, "application_id": "shipcomwireless", "device_type_id": "70776630-e15e-11ea-a8c9-05cd631755a5"}, "company": {"id": 7696, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-11T18:44:50Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "Harris Health System - Production", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-15T03:34:58Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054"}, "location": {"id": 9080, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-11T18:46:07Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "HHS Cooler Sensors", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-18T14:17:28Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054", "company_id": 7696}, "device_type": {"id": "70776630-e15e-11ea-a8c9-05cd631755a5", "application_id": "", "category": "module", "codec": "lorawan.laird.rs1xx-backup", "data_type": "", "description": "Temp Sensor", "manufacturer": "Laird", "model": "RS1xx", "name": "Laird Temp & Humidity with Local Backup", "parent_constraint": "NOT_ALLOWED", "proxy_handler": "PrometheusClient", "subcategory": "lora", "transport_protocol": "lorawan", "version": "", "created_at": "2020-08-18T14:23:51Z", "updated_at": "2020-08-18T18:16:37Z"}, "device": {"id": 268369, "thing_name": "Cooler F-0201-AH", "created_at": "2020-09-18T17:15:04Z", "updated_at": "2020-09-25T22:39:57Z", "status": 0}}, {"event_type": "uplink", "event_data": {"device_id": "1c5c66f0-fcfb-11ea-8ae3-2ffdc909c57b", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "payload": [{"name": "Humidity", "sensor_id": "1c7a4f30-fcfb-11ea-8ae3-2ffdc909c57b", "type": "rel_hum", "unit": "p", "value": 81.22, "channel": 4, "timestamp": 1605148608302}, {"name": "Temperature", "sensor_id": "1c793dc0-fcfb-11ea-bf5c-d11ce3dbc1cb", "type": "temp", "unit": "c", "value": 24.47, "channel": 3, "timestamp": 1605148608302}, {"name": "Battery", "sensor_id": "1c76a5b0-fcfb-11ea-bf5c-d11ce3dbc1cb", "type": "batt", "unit": "p", "value": 100, "channel": 5, "timestamp": 1605148608302}, {"name": "Local Backup", "sensor_id": "1c73e690-fcfb-11ea-9c71-c517ac227ea5", "type": "digital_sensor", "unit": "d", "value": 1, "channel": 400, "timestamp": 1605148608302}, {"name": "RSSI", "sensor_id": "1c780540-fcfb-11ea-b4dd-cd2887a335f7", "type": "rssi", "unit": "dbm", "value": -14, "channel": 100, "timestamp": 1605148608302}, {"name": "SNR", "sensor_id": "", "type": "snr", "unit": "db", "value": 8.8, "channel": 101, "timestamp": 1605148608302}], "client_id": "b8468c50-baf0-11ea-a5e9-89c3b09de43a", "hardware_id": "0025ca0a0000e1e3", "timestamp": 1605148608302, "application_id": "shipcomwireless", "device_type_id": "70776630-e15e-11ea-a8c9-05cd631755a5"}, "company": {"id": 7696, "address": "9240 Kirby Dr", "city": "Houston", "country": "United 
States", "created_at": "2020-09-11T18:44:50Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "Harris Health System - Production", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-15T03:34:58Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054"}, "location": {"id": 9153, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-18T02:08:03Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "HHS Van Sensors", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-18T02:08:03Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054", "company_id": 7696}, "device_type": {"id": "70776630-e15e-11ea-a8c9-05cd631755a5", "application_id": "", "category": "module", "codec": "lorawan.laird.rs1xx-backup", "data_type": "", "description": "Temp Sensor", "manufacturer": "Laird", "model": "RS1xx", "name": "Laird Temp & Humidity with Local Backup", "parent_constraint": "NOT_ALLOWED", "proxy_handler": "PrometheusClient", "subcategory": "lora", "transport_protocol": "lorawan", "version": "", "created_at": "2020-08-18T14:23:51Z", "updated_at": "2020-08-18T18:16:37Z"}, "device": {"id": 269213, "thing_name": "Van 19-1800 (Ambient)", "created_at": "2020-09-22T17:43:23Z", "updated_at": "2020-09-25T22:39:56Z", "status": 0}}, {"event_type": "uplink", "event_data": {"device_id": "851fd480-f70e-11ea-9f13-d1d0271289a6", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "payload": [{"name": "Humidity", "sensor_id": "85411820-f70e-11ea-8ae3-2ffdc909c57b", "type": "rel_hum", "unit": "p", "value": 49.52, "channel": 4, "timestamp": 1604558153188}, {"name": "Temperature", "sensor_id": "853f9180-f70e-11ea-9f13-d1d0271289a6", "type": "temp", "unit": "c", "value": 20.52, "channel": 3, "timestamp": 1604558153188}, {"name": "Battery", "sensor_id": "85429ec0-f70e-11ea-9621-a51b22d5dc1d", "type": "batt", "unit": "p", "value": 100, "channel": 5, "timestamp": 1604558153188}, {"name": "Local Backup", "sensor_id": "853f4360-f70e-11ea-9f13-d1d0271289a6", "type": "digital_sensor", "unit": "d", "value": 1, "channel": 400, "timestamp": 1604558153188}, {"name": "RSSI", "sensor_id": "8543b030-f70e-11ea-8ae3-2ffdc909c57b", "type": "rssi", "unit": "dbm", "value": -91, "channel": 100, "timestamp": 1604558153188}, {"name": "SNR", "sensor_id": "", "type": "snr", "unit": "db", "value": 8.5, "channel": 101, "timestamp": 1604558153188}], "client_id": "b8468c50-baf0-11ea-a5e9-89c3b09de43a", "hardware_id": "0025ca0a0000be5b", "timestamp": 1604558153188, "application_id": "shipcomwireless", "device_type_id": "70776630-e15e-11ea-a8c9-05cd631755a5"}, "company": {"id": 7696, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-11T18:44:50Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "Harris Health System - Production", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-15T03:34:58Z", "user_id": "a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054"}, "location": {"id": 9080, "address": "9240 Kirby Dr", "city": "Houston", "country": "United States", "created_at": "2020-09-11T18:46:07Z", "industry": "[\"Health Care\"]", "latitude": 29.671324, "longitude": -95.415535, "name": "HHS Cooler Sensors", "state": "TX", "status": 0, "timezone": "America/Chicago", "updated_at": "2020-09-18T14:17:28Z", "user_id": 
"a5d78945-9f24-48a1-9107-5bee62bf007a", "zip": "77054", "company_id": 7696}, "device_type": {"id": "70776630-e15e-11ea-a8c9-05cd631755a5", "application_id": "", "category": "module", "codec": "lorawan.laird.rs1xx-backup", "data_type": "", "description": "Temp Sensor", "manufacturer": "Laird", "model": "RS1xx", "name": "Laird Temp & Humidity with Local Backup", "parent_constraint": "NOT_ALLOWED", "proxy_handler": "PrometheusClient", "subcategory": "lora", "transport_protocol": "lorawan", "version": "", "created_at": "2020-08-18T14:23:51Z", "updated_at": "2020-08-18T18:16:37Z"}, "device": {"id": 265040, "thing_name": "Cooler R-0306-PHAR", "created_at": "2020-09-15T04:47:12Z", "updated_at": "2020-09-25T22:39:54Z", "status": 0}}]
contents is a list of records, but writer.append expects a single record, so you need to iterate over your records and append them one by one.
You just need to change:
writer.append(contents)
to:
for record in contents:
    writer.append(record)
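As a side note, since each input .txt already holds a single record (your loop does json.load once per file), you could skip the intermediate laird.json entirely and append records as you load them. A minimal sketch under that assumption, reusing the file names from the question:
import glob
import json

import avro.schema
from avro.datafile import DataFileWriter
from avro.io import DatumWriter

# parse the schema and open the container file once
schema = avro.schema.parse(open("laird.avsc", "rb").read())
writer = DataFileWriter(open("laird.avro", "wb"), DatumWriter(), schema)
for fname in glob.glob("*.txt"):
    with open(fname) as infile:
        # one record per file, so no wrapping list is ever created
        writer.append(json.load(infile))
writer.close()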
