Python: How to search and replace parts of a json file? - python

I'm new to Python and I would like to search and replace titles of IDs in a JSON-file. Normally I would use R for this Task, but how to do it in Python. Here a sample of my JSON code (with a Service ID and a layer ID). I'm interested in replacing the titles in the layer IDs:
...{"services": [
{
"id": "service",
"url": "http://...",
"title": "GEW",
"layers": [
{
"id": "0",
"title": "wrongTitle",
},
{
"id": "1",
"title": "againTitleWrong",
},
],
"options": {}
},],}
For the replace I would use a table/csv like this:
serviceID layerID oldTitle newTitle
service 0 wrongTitle newTitle1
service 1 againTitleWrong newTitle2
....
Do you have ideas? Thanks

Here's an working example on repl.it.
Code:
import json
import io
import csv
### json input
input = """
{
"layers": [
{
"id": "0",
"title": "wrongTitle"
},
{
"id": "1",
"title": "againTitleWrong"
}
]
}
"""
### parse the json
parsed_json = json.loads(input)
#### csv input
csv_input = """serviceID,layerID,oldTitle,newTitle
service,0,wrongTitle,newTitle1
service,1,againTitleWrong,newTitle2
"""
### parse csv and generate a correction lookup
parsed_csv = csv.DictReader(io.StringIO(csv_input))
lookup = {}
for row in parsed_csv:
lookup[row["layerID"]] = row["newTitle"]
#correct and print json
layers = parsed_json["layers"]
for layer in layers:
layer["title"] = lookup[layer["id"]]
parsed_json["layers"] = layers
print(json.dumps(parsed_json))

You don't say which version of Python you are using but there are built-on JSON parsers for the language.
For 2.x: https://docs.python.org/2.7/library/json.html
For 3.x: https://docs.python.org/3.4/library/json.html
These should be able to help you to parse the JSON and replace what you want.

As other users suggested, check the JSON module will be helpful.
Here gives a basic example on python2.7:
import json
j = '''{
"services":
[{
"id": "service",
"url": "http://...",
"title": "GEW",
"options": {},
"layers": [
{
"id": "0",
"title": "wrongTitle"
},
{
"id": "1",
"title": "againTitleWrong"
}
]
}]
}'''
s = json.loads(j)
s["services"][0]["layers"][0]["title"] = "new title"
# save json object to file
with open('file.json', 'w') as f:
json.dump(s, f)
You can index the element and change its title according to your csv file, which requires the use of CSV module.

Related

Iterate and update through folder of .json files and update value in python

I've struck out trying to find a suitable script to iterate through a folder of .json files and update a single line.
Below is an example json file located in a path among others. I would like to iterate through the json files in a folder containing several files like this with various information and update the "seller_fee_basis_points" from "0" to say "500" and save.
Would really appreciate the assistance.
{
"name": "Solflare X NFT",
"symbol": "",
"description": "Celebratory Solflare NFT for the Solflare X launch",
"seller_fee_basis_points": 0,
"image": "https://www.arweave.net/abcd5678?ext=png",
"animation_url": "https://www.arweave.net/efgh1234?ext=mp4",
"external_url": "https://solflare.com",
"attributes": [
{
"trait_type": "web",
"value": "yes"
},
{
"trait_type": "mobile",
"value": "yes"
},
{
"trait_type": "extension",
"value": "yes"
}
],
"collection": {
"name": "Solflare X NFT",
"family": "Solflare"
},
"properties": {
"files": [
{
"uri": "https://www.arweave.net/abcd5678?ext=png",
"type": "image/png"
},
{
"uri": "https://watch.videodelivery.net/9876jkl",
"type": "unknown",
"cdn": true
},
{
"uri": "https://www.arweave.net/efgh1234?ext=mp4",
"type": "video/mp4"
}
],
"category": "video",
"creators": [
{
"address": "SOLFLR15asd9d21325bsadythp547912501b",
"share": 100
}
]
}
}
Updated with an answer due to #JCaesar's help
import json
import glob
import os
SOURCE_DIRECTORY = r'my_favourite_directory'
KEY = 'seller_fee_basis_points'
NEW_VALUE = 500
for file in glob.glob(os.path.join(SOURCE_DIRECTORY, '*.json')):
json_data = json.loads(open(file, encoding="utf8").read())
# note that using the update method means
# that if KEY does not exist then it will be created
# which may not be what you want
json_data.update({KEY: NEW_VALUE})
json.dump(json_data, open(file, 'w'), indent=4)
I recommend using glob to find the files you're interested in. Then utilise the json module for reading and writing the JSON content.
This is very concise and has no sanity checking / exception handling but you should get the idea:
import json
import glob
import os
SOURCE_DIRECTORY = 'my_favourite_directory'
KEY = 'seller_fee_basis_points'
NEW_VALUE = 500
for file in glob.glob(os.path.join(SOURCE_DIRECTORY, '*.json')):
json_data = json.loads(open(file).read())
# note that using the update method means
# that if KEY does not exist then it will be created
# which may not be what you want
json_data.update({KEY: NEW_VALUE})
json.dump(json_data, open(file, 'w'), indent=4)

Python - normalizing nested json file

I have nested json file and i am trying to get the data into data frame. I have to extract sensor-time, then elements and finally sensor info.
Here is how the json file looks like:
{
"sensor-time": {
"timezone": "America/Los_Angeles",
"time": "2019-11-21T01:00:04-08:00"
},
"status": {
"code": "OK"
},
"content": {
"element": [{
"element-id": 0,
"element-name": "Line 0",
"sensor-type": "SINGLE_SENSOR",
"data-type": "LINE",
"from": "2019-11-21T00:00:00-08:00",
"to": "2019-11-21T01:00:00-08:00",
"resolution": "ONE_HOUR",
"measurement": [{
"from": "2019-11-21T00:00:00-08:00",
"to": "2019-11-21T01:00:00-08:00",
"value": [{
"value": 0,
"label": "fw"
}, {
"value": 0,
"label": "bw"
}
]
}
]
}
]
},
"sensor-info": {
"serial-number": "D8:80:39:D9:6B:9B",
"ip-address": "192.168.0.3",
"name": "XD01",
"group": "Boost Mobile",
"device-type": "PC2"
}
}
And here is my code so far:
import json
from pandas.io.json import json_normalize
import glob
import urllib
import sqlalchemy as sa
# Create empty dataframe
# Drill through each file with json extension in the folder, open it, load it and parse it into dataframe
file = 'C:/Test/Loading/testfile.json'
with open(file) as json_file:
json_data = json.load(json_file)
df = json_normalize(json_data, meta=['sensor-time'])
df
and here is the output when I run my code:
I tried using flatten_json librarry and the best I can get it is with this code:
with open(file) as json_file:
json_data = json.load(json_file)
flat = flatten_json(json_data)
df = json_normalize(flat)
And i get output with one row with 33 columns. So in my case since i have multiple values under measurments part of json files, i am getting a column for each of the measurements. What i have to get is 3 rows with 24 columns. One row for each measurements.
So how do i modify this now?
The simplest way I think would be to use pandas.DataFrame(json_data); then you can access these information doing:
pandas.DataFrame(json_data)['sensor-time']['time']
pandas.DataFrame(json_data)['content']['element]
pandas.DataFrame(json_data)['content']['element]

CSV file to JSON for nested array generic template using python (for csv to mongodb insert)

I want to create the JSON file from CSV file using the generic python script.
Found hone package from GitHub but some of the functionalities missing in that code.
csv to json
I want to code like generic template CSV to JSON.
[
{
"birth": {
"day": "7",
"month": "May",
"year": "1985"
},
"name": "Bob",
"reference": "TRUE",
"reference name": "Smith"
}
]
Only handled above type of JSON only.
[
{
"Type": "AwsEc2Instance",
"Id": "i-cafebabe",
"Partition": "aws",
"Region": "us-west-2",
"Tags": {
"billingCode": "Lotus-1-2-3",
"needsPatching": "true"
},
"Details": {
"AwsEc2Instance": {
"Type": "i3.xlarge",
"ImageId": "ami-abcd1234",
"IpV4Addresses": [ "54.194.252.215", "192.168.1.88" ],
"IpV6Addresses": [ "2001:db812341a2b::123" ],
"KeyName": "my_keypair",
"VpcId": "vpc-11112222",
"SubnetId": "subnet-56f5f633",
"LaunchedAt": "2018-05-08T16:46:19.000Z"
}
}
}
]
I want to handle nested array[] ,{}
I have done something like this before and below code can be modified as I have not seen your dataset.
dataframe = pd.read_excel('dataframefilepath', encoding='utf-8', header=0)
'''Adding to list to finally save it as JSON'''
df = []
for (columnName, columnData) in dataframe.iteritems():
if dataframe.columns.get_loc(columnName) > 0:
for indata, rwdata in dataframe.iterrows():
for insav, rwsave in df_to_Save.iterrows():
if rwdata.Selected_Prediction == rwsave.Selected_Prediction:
#print()
df_to_Save.loc[insav, 'Value_to_Save'] = rwdata[dataframe.columns.get_loc(columnName)]
#print(rwdata[dataframe.columns.get_loc(columnName)])
df.append(df_to_Save.set_index('Selected_Prediction').T.to_dict('record'))
df = eval(df)
'''Saving in JSON format'''
path_to_save = '\\your path'
with open(path_to_save, 'w') as json_file:
json.dump(df, json_file)

Parsing JSON data if a key value is matched and print a key value in Python

I am very much new to JSON parsing. Below is my JSON:
[
{
"description": "Newton",
"exam_code": {
"date_added": "2015-05-13T04:49:54+00:00",
"description": "Production",
"exam_tags": [
{
"date_added": "2012-01-13T03:39:17+00:00",
"descriptive_name": "Production v0.1",
"id": 1,
"max_count": "147",
"name": "Production"
}
],
"id": 1,
"name": "Production",
"prefix": "SA"
},
"name": "CM"
},
{
"description": "Opera",
"exam_code": {
"date_added": "2015-05-13T04:49:54+00:00",
"description": "Production",
"test_tags": [
{
"date_added": "2012-02-22T12:44:55+00:00",
"descriptive_name": "Production v0.1",
"id": 1,
"max_count": "147",
"name": "Production"
}
],
"id": 1,
"name": "Production",
"prefix": "SA"
},
"name": "OS"
}
]
Here I am trying to find if name value is CM print description value.
If name value is OS then print description value.
Please help me to to understand how JSON parsing can be done?
Considering you have already read the JSON string from somewhere, be it a file, stdin, or any other source.
You can actually deserialize it into a Python object by doing:
import json
# ...
json_data = json.loads(json_str)
Where json_str is the JSON string that you want to parse.
In your case, json_str will get deserialized into a Python list, so you can do any operation on it as you'd normally do with a list.
Of course, this includes iterating over the elements:
for item in json_data:
if item.get('name') in ('CM', 'OS'):
print(item['description'])
As you can see, the items in json_data have been deserialized into dict, so you can access the actual fields using dict operations.
Note
You can also deserialize a JSON from the source directly, provided you have access to the file handler/descriptor or stream:
# Loading from a file
import json
with open('my_json.json', 'r') as fd:
# Note that we're using json.load, not json.loads
json_data = json.load(fd)
# Loading from stdin
import json, sys
json_data = json.load(sys.stdin)

How to get a fields particular value of JSON in python?

I am currently working on extracting fields from json and then make some use of that. Hence I have face parameters, and I want to store each field's value. I am trying to fetch the Gender value from a JSON of face :
The JSON is as follows:
{
"face": [
{
"attribute": {
"age": {
"range": 5,
"value": 24
},
"gender": {
"confidence": 99.9999,
"value": "Female"
},
"glass": {
"confidence": 99.4157,
"value": "None"
},
"pose": {
"pitch_angle": {
"value": 0.000001
},
"roll_angle": {
"value": 0.650337
},
"yaw_angle": {
"value": -0.42409
}
},
"race": {
"confidence": 98.058,
"value": "Asian"
},
"smiling": {
"value": 3.78394
}
},
"face_id": "42245f24335ad21ea7c54f2db96a09b3",
"position": {
"center": {
"x": 50.121951,
"y": 35.97561
},
"eye_left": {
"x": 43.465122,
"y": 30.670488
},
"eye_right": {
"x": 56.80878,
"y": 30.821951
},
"height": 27.560976,
"mouth_left": {
"x": 45.649512,
"y": 45.041707
},
"mouth_right": {
"x": 55.134878,
"y": 44.858049
},
"nose": {
"x": 50.183415,
"y": 38.410732
},
"width": 27.560976
},
"tag": ""
}
],
"img_height": 410,
"img_id": "1e3007cb3d6cfbaed3a1b4135524ed25",
"img_width": 410,
"session_id": "76ec7f99a471418fa8862a2138cc589d",
"url": "http://www.faceplusplus.com/wp-content/themes/faceplusplus/assets/img/demo/1.jpg?v=2"
}
I want to extract 'Female' from the above json.
And for that, I used this :
import urllib2
import io, json
from bs4 import BeautifulSoup
data = soup #soup has all the data json
with open('data.json', 'w') as outfile:
json.dump(data, outfile, sort_keys = True, indent = 4, ensure_ascii=False)
#content = json.loads(soup)
jsonFile = open('data.json', 'r')
values = json.load(jsonFile)
jsonFile.close()
gender = soup['face'][0]['gender']['value']
print gender
Where is my code incorrect?
According to your json example , gender is inside attribute , so you need to access it as -
gender = soup['face'][0]['attribute']['gender']['value']
Also, seems like values is the json that is read back (dictionary), so you may want to access it using values , though I am not sure what you are trying to achieve so I cannot say for sure.
Finally, I got an answer and it is working perfect.
with open('data.json') as da:
data = json.loads(json.load(da))
print data['face'][0]['attribute']['gender']['value']
You can use some libraries like objectpath, it makes you able to search in JSON in easy way.
just import the library and build the object tree, then type your word that you want to search for.
Importing:
import json
import objectpath
Building the search tree:
gender_tree = objectpath.Tree(values['face'])
Typing your searching word
gender_tuple = tuple(actions_tree.execute('$..gender'))
Now, you can deal with gender_tuple for your required values.
Here, the word that you are searching for is "gender", replace it with your suitable word for future searches.

Categories