Add a key to existing json file

Add a key to existing json file - python

I've a very large JSON file(>60MB), which i was unable to open and edit in any editors. I see that we can do it using python.
Here is my sample data.
[
{
"type": "Feature",
"geometry": {
"type": "MultiPolygon"
},
"id": "94601"
},
{
"type": "Feature",
"geometry": {
"type": "MultiPolygon"
},
"id": "94801"
}
]
and my expected output is
{
"type": "FeatureCollection",
"features":[
{
"type": "Feature",
"geometry": {
"type": "MultiPolygon"
},
"id": "94601"
},
{
"type": "Feature",
"geometry": {
"type": "MultiPolygon"
},
"id": "94801"
}
]
}
I'm not sure of how can I do it in python. Here is the code that I'm using to read the file. But not sure of how to proceed further.
import json
f = open('/Users/Downloads/out_ca_california_zip_codes_geo.json', 'r')
out_file = open("/Users/Downloads/outNew_ca_california_zip_codes_geo.json", "w")
out_file.write(json.dumps(myFinalResult))
Thanks

Decode your json which then become a Python dictionary then you can simply add your key to that, then re-encode and rewrite the file.
Try this:
import json
with open(json_file) as json_file:
json_decoded = json.load(json_file)
json_decoded['ADDED_KEY'] = 'ADDED_VALUE'
with open(json_file, 'w') as json_file:
json.dump(json_decoded, json_file)
For more details check here: https://docs.python.org/3.4/library/json.html

Related

How to position attribute in geojson on top of all properties?

I am trying to remove feature_id property from properties array and move upwards.
with open('test.geojson', 'r+') as gjson:
data = json.load(gjson)
for l in range(0, len(data['features'])):
data['features'][l]['id'] = data['features'][l]['properties']['feature_id']
del data['features'][l]['properties']['feature_id']
gjson.seek(0)
json.dump(data, gjson, indent=4)
gjson.truncate()
This is the input.
{
"type": "FeatureCollection",
"name": "name",
"features": [
{
"type": "Feature",
"properties": {
"feature_id": "1e181120-2047-4f97-a359-942ef5940da1",
"type": 1
},
"geometry": {
"type": "Polygon",
"coordinates": [
[...]
]
}
}
]
}
It does the job but adds the property at the bottom
{
"type": "FeatureCollection",
"name": "name",
"features": [
{
"type": "Feature",
"properties": {
"type": 1
},
"geometry": {
"type": "Polygon",
"coordinates": [
[..]
]
},
"id": "1e181120-2047-4f97-a359-942ef5940da1"
}
]
}
As you can see id is added at last but it should be on top before properties.

You can use OrderedDict for that.
with open('test.geojson', 'r+') as gjson:
data = json.load(gjson, object_pairs_hook=OrderedDict)
for l in range(0, len(data['features'])):
d = data['features'][l]
d['id'] = data['features'][l]['properties']['feature_id']
d.move_to_end('id', last=False)
del d['properties']['feature_id']
gjson.seek(0)
json.dump(data, gjson, indent=4)
gjson.truncate()

Python: JSON to CSV

I am receiving a JSON file from a Docparser API, which I would like to convert to a CSV document.
The structure is here below:
{
"type": "object",
"properties": {
"id": {
"type": "string"
},
"document_id": {
"type": "string"
},
"remote_id": {
"type": "string"
},
"file_name": {
"type": "string"
},
"page_count": {
"type": "integer"
},
"uploaded_at": {
"type": "string"
},
"processed_at": {
"type": "string"
},
"table_data": [
{
"type": "array",
"items": {
"type": "object",
"properties": {
"account_ref": {
"type": "string"
},
"client": {
"type": "string"
},
"transaction_type": {
"type": "string"
},
"key_4": {
"type": "string"
},
"date_yyyymmdd": {
"type": "string"
},
"amount_excl": {
"type": "string"
}
},
"required": [
"account_ref",
"client",
"transaction_type",
"key_4",
"date_yyyymmdd",
"amount_excl"
]
}
}
]
}
}
The first problem that I have is how to only work with the table_data section?
My second problem is writing the actual code that allows me to put each section, i.e. account_ref, client, etc., into their own columns. I had so many changes to my code, the output varied from adding the properties into columns and dumping the table_data part into one cell, to only printing the headers into a single cell (as a list).
Here's my current code (which is not working correctly):
import pydocparser
import json
import pandas as pd
parser = pydocparser.Parser()
parser.login('API')
data2 = str(parser.fetch("Name of Parser", 'documentID'))
data2 = str(data2).replace("'", '"') # I had to put this in because it kept saying that it needs double quotes.
y = json.loads(str(data2))
json_file = open(r"C:\File.json", "w")
json_file.write(str(y))
json_file.close()
df1 = df = pd.DataFrame({str(y)})
df1.to_csv(r"C:\jsonCSV.csv")
Thanks for your help!

Pandas has a nice built in function called pandas.json_noramlize()
If you're using pandas version lower then 1.0.0 use pandas.io.json.json_normalize(), it should split the columns nicely.
read more about it here:
>1.0.0:
https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.io.json.json_normalize.html
=<1.0.0
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

Python - Parse complex JSON with objectpath

i need parse terraform file, write in JSON format. I have to extract two data, resource and id, this is example file:
{
"version": 1,
"serial": 1,
"modules": [
{
"path": [
"root"
],
"outputs": {
},
"resources": {
"aws_security_group.vpc-xxxxxxx-test-1": {
"type": "aws_security_group",
"primary": {
"id": "sg-xxxxxxxxxxxxxx",
"attributes": {
"description": "test-1",
"name": "test-1"
}
}
},
"aws_security_group.vpc-xxxxxxx-test-2": {
"type": "aws_security_group",
"primary": {
"id": "sg-yyyyyyyyyyyy",
"attributes": {
"description": "test-2",
"name": "test-2"
}
}
}
}
}
]
}
I need export for any resources, the first key and value of id, in this case, aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx and aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy
I have tried to write this in python:
#!/usr/bin/python3.6
import json
import objectpath
with open('file.json') as json_file:
data = json.load(json_file)
json_tree = objectpath.Tree(data['modules'])
result = tuple(json_tree.execute('$..resources[0]'))
result is
('aws_security_group.vpc-xxxxxxx-test-1', 'aws_security_group.vpc-xxxxxxx-test-2')
It's'ok but I can't extract the id, any help is appreciated, also use other methods
Thanks

I don't know objectpath, but I think you need:
tree.execute('$..resources[0]..primary.id')
or even just
tree.execute('$..resources[0]..id')

Combine part of geojson object into another in Python

EDIT: I am trying to manipulate JSON files in Python. In my data some polygons have multiple related information: coordinates (LineString) and area percent and area (Text and Area in Point), I want to combine them to a single JSON object. As an example, the data from files are as follows:
data = {
"type": "FeatureCollection",
"name": "entities",
"features": [{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "2F1"
},
"geometry": {
"type": "LineString",
"coordinates": [
[61.971069681118479, 36.504485105673659],
[46.471068755199667, 36.504485105673659],
[46.471068755199667, 35.954489281866685],
[44.371068755199758, 35.954489281866685],
[44.371068755199758, 36.10448936390457],
[43.371069617387093, 36.104489150107824],
[43.371069617387093, 23.904496401184584],
[48.172716774891342, 23.904496401184584],
[48.171892994728751, 17.404489374370311],
[61.17106949647404, 17.404489281863786],
[61.17106949647404, 19.404489281863786],
[61.971069689453991, 19.404489282256687],
[61.971069681118479, 36.504485105673659]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbMText",
"EntityHandle": "2F1",
"Text": "6%"
},
"geometry": {
"type": "Point",
"coordinates": [49.745686139884583, 28.11445704760262, 0.0]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbMText",
"EntityHandle": "2F1",
"Area": "100"
},
"geometry": {
"type": "Point",
"coordinates": [50.216857362443989, 63.981197759829229, 0.0]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "2F7"
},
"geometry": {
"type": "LineString",
"coordinates": [
[62.37106968111857, 36.504489398648715],
[62.371069689452725, 19.404489281863786],
[63.171069496474047, 19.404489281863786],
[63.171069496474047, 17.404489281863786],
[77.921070051947027, 17.404489281863786],
[77.921070051947027, 19.504489281855054],
[78.671070051947027, 19.504489281855054],
[78.671070051897914, 36.504485105717322],
[62.37106968111857, 36.504489398648715]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbMText",
"EntityHandle": "2F7",
"Text": "5.8%"
},
"geometry": {
"type": "Point",
"coordinates": [67.27548061311245, 28.11445704760262, 0.0]
}
}
]
}
I want to combine Point's Text and Area key and values to LineString based on EntityHandle's values, and also delete Point lines. The expected output is:
{
"type": "FeatureCollection",
"name": "entities",
"features": [{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "2F1",
"Text": "6%",
"Area": "100"
},
"geometry": {
"type": "LineString",
"coordinates": [
[61.971069681118479, 36.504485105673659],
[46.471068755199667, 36.504485105673659],
[46.471068755199667, 35.954489281866685],
[44.371068755199758, 35.954489281866685],
[44.371068755199758, 36.10448936390457],
[43.371069617387093, 36.104489150107824],
[43.371069617387093, 23.904496401184584],
[48.172716774891342, 23.904496401184584],
[48.171892994728751, 17.404489374370311],
[61.17106949647404, 17.404489281863786],
[61.17106949647404, 19.404489281863786],
[61.971069689453991, 19.404489282256687],
[61.971069681118479, 36.504485105673659]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "2F7",
"Text": "5.8%"
},
"geometry": {
"type": "LineString",
"coordinates": [
[62.37106968111857, 36.504489398648715],
[62.371069689452725, 19.404489281863786],
[63.171069496474047, 19.404489281863786],
[63.171069496474047, 17.404489281863786],
[77.921070051947027, 17.404489281863786],
[77.921070051947027, 19.504489281855054],
[78.671070051947027, 19.504489281855054],
[78.671070051897914, 36.504485105717322],
[62.37106968111857, 36.504489398648715]
]
}
}
]
}
Is it possible to get result above in Python? Thanks.
Updated solution, thanks to #dodopy:
import json
features = data["features"]
point_handle_text = {
i["properties"]["EntityHandle"]: i["properties"]["Text"]
for i in features
if i["geometry"]["type"] == "Point"
}
point_handle_area = {
i["properties"]["EntityHandle"]: i["properties"]["Area"]
for i in features
if i["geometry"]["type"] == "Point"
}
combine_features = []
for i in features:
if i["geometry"]["type"] == "LineString":
i["properties"]["Text"] = point_handle_text.get(i["properties"]["EntityHandle"])
combine_features.append(i)
data["features"] = combine_features
combine_features = []
for i in features:
if i["geometry"]["type"] == "LineString":
i["properties"]["Area"] = point_handle_area.get(i["properties"]["EntityHandle"])
combine_features.append(i)
data["features"] = combine_features
with open('test.geojson', 'w+') as f:
json.dump(data, f, indent=2)
But I get an error:
Traceback (most recent call last):
File "<ipython-input-131-d132c8854a9c>", line 6, in <module>
for i in features
File "<ipython-input-131-d132c8854a9c>", line 7, in <dictcomp>
if i["geometry"]["type"] == "Point"
KeyError: 'Text'

example like this:
import json
data = json.loads(json_data)
features = data["features"]
point_handle_text = {
i["properties"]["EntityHandle"]: i["properties"]["Text"]
for i in features
if i["geometry"]["type"] == "Point"
}
combine_features = []
for i in features:
if i["geometry"]["type"] == "LineString":
i["properties"]["Text"] = point_handle_text.get(i["properties"]["EntityHandle"])
combine_features.append(i)
data["features"] = combine_features
json_data = json.dumps(data)

Yes, it is possible to get your result in python. It just requires storing the json data into a data structure we can work with in python and then writing an algorithm to combine features with the same entity type. I wrote up a script to do just that, along with comments. The program extracts the text property from the Point feature and places it into the properties of the LineString feature. Then, we essentially discard Point.
BTW, your 'before' json data has a trailing comma that shouldn't be there.
Using Python 3.7.0:
import json
import collections
def main():
with open('before_data.json') as f:
before_data = json.load(f) # makes a python dict from the json file and stores in before
features = before_data['features'] # list of features
# loop through features, construct dictionary of entity handle mapped to point texts
point_entities = collections.defaultdict() # to avoid 'if key not in' pattern
for feature in features:
entity_handle = feature['properties']['EntityHandle']
# only append points
if feature['geometry']['type'] == 'Point':
point_entities[entity_handle] = feature['properties']['Text']
merged_features = []
for feature in features:
if feature['geometry']['type'] == 'LineString':
entity_handle = feature['properties']['EntityHandle']
text_percent = point_entities[entity_handle]
feature['properties']['Text'] = text_percent
merged_features.append(feature)
# print(json.dumps(before_data, indent=4))
result = before_data
result['features'] = merged_features
# compare with your expected output
with open('after_data.json') as f:
after_data = json.load(f)
print(result == after_data) # returns True
# finally, write your result to a file
with open('result.json', 'w') as output_file:
json.dump(result, output_file)
if __name__ == '__main__':
main()

How to modify nested JSON with python

I need to update (CRUD) a nested JSON file using Python. To be able to call python function(s)(to update/delete/create) entires and write it back to the json file.
Here is a sample file.
I am looking at the remap library but not sure if this will work.
{
"groups": [
{
"name": "group1",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
],
"groups": [
{
"name": "group-child",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
]
}
]
},
{
"name": "group2",
"properties": [
{
"name": "Test-Key2-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value2"
}
}
]
}
]
}

I feel like I'm missing something in your question. In any event, what I understand is that you want to read a json file, edit the data as a python object, then write it back out with the updated data?
Read the json file:
import json
with open("data.json") as f:
data = json.load(f)
That creates a dictionary (given the format you've given) that you can manipulate however you want. Assuming you want to write it out:
with open("data.json","w") as f:
json.dump(data,f)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Add a key to existing json file - python

Related

How to position attribute in geojson on top of all properties?

Python: JSON to CSV

Python - Parse complex JSON with objectpath

Combine part of geojson object into another in Python

How to modify nested JSON with python

Categories

Resources