I am using GenSON to generate the schema: https://github.com/wolverdude/GenSON/
I have the following JSON file:
{
'name':'Sam',
},
{
'name':'Jack',
}
and so on ...
I am wondering how to iterate over a large JSON file. I want to parse each JSON object and pass it to GenSON to generate a schema like this:
{
"$schema": "http://json-schema.org/schema#",
"type": "object",
"properties": {
"name": {
"type": [
"string"
]
}
},
"required": [
"name"
]
}
I think you should:
import json
from genson import SchemaBuilder

builder = SchemaBuilder()
with open(filename, 'r') as f:
    datastore = json.load(f)
    builder.add_object(datastore)

# to_schema() returns the generated schema as a dict
schema = builder.to_schema()
Where filename is your file path.
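If the file holds several objects one after another, as in the question, json.load will reject it because the content is not a single JSON document. A minimal sketch of one way to walk through such content, assuming the objects use double quotes (the single quotes in the question's sample are not valid JSON) and are separated by commas and/or whitespace. The helper name iter_json_objects is hypothetical, and it works on text that has already been read into memory:

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found in text.

    Values may be separated by whitespace and/or commas.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading separators, so skip them here
        if text[pos].isspace() or text[pos] == ",":
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"name": "Sam"},\n{"name": "Jack"}'
names = [obj["name"] for obj in iter_json_objects(sample)]
print(names)
```

Each yielded object could then be passed to builder.add_object(obj) before calling builder.to_schema() once at the end.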
I've struck out trying to find a suitable script to iterate through a folder of .json files and update a single line.
Below is an example json file located in a path among others. I would like to iterate through the json files in a folder containing several files like this with various information and update the "seller_fee_basis_points" from "0" to say "500" and save.
Would really appreciate the assistance.
{
"name": "Solflare X NFT",
"symbol": "",
"description": "Celebratory Solflare NFT for the Solflare X launch",
"seller_fee_basis_points": 0,
"image": "https://www.arweave.net/abcd5678?ext=png",
"animation_url": "https://www.arweave.net/efgh1234?ext=mp4",
"external_url": "https://solflare.com",
"attributes": [
{
"trait_type": "web",
"value": "yes"
},
{
"trait_type": "mobile",
"value": "yes"
},
{
"trait_type": "extension",
"value": "yes"
}
],
"collection": {
"name": "Solflare X NFT",
"family": "Solflare"
},
"properties": {
"files": [
{
"uri": "https://www.arweave.net/abcd5678?ext=png",
"type": "image/png"
},
{
"uri": "https://watch.videodelivery.net/9876jkl",
"type": "unknown",
"cdn": true
},
{
"uri": "https://www.arweave.net/efgh1234?ext=mp4",
"type": "video/mp4"
}
],
"category": "video",
"creators": [
{
"address": "SOLFLR15asd9d21325bsadythp547912501b",
"share": 100
}
]
}
}
Updated with an answer due to #JCaesar's help:
import json
import glob
import os

SOURCE_DIRECTORY = r'my_favourite_directory'
KEY = 'seller_fee_basis_points'
NEW_VALUE = 500

for file in glob.glob(os.path.join(SOURCE_DIRECTORY, '*.json')):
    with open(file, encoding="utf8") as f:
        json_data = json.load(f)
    # note that using the update method means
    # that if KEY does not exist then it will be created
    # which may not be what you want
    json_data.update({KEY: NEW_VALUE})
    with open(file, 'w', encoding="utf8") as f:
        json.dump(json_data, f, indent=4)
I recommend using glob to find the files you're interested in. Then utilise the json module for reading and writing the JSON content.
This is very concise and has no sanity checking / exception handling but you should get the idea:
import json
import glob
import os

SOURCE_DIRECTORY = 'my_favourite_directory'
KEY = 'seller_fee_basis_points'
NEW_VALUE = 500

for file in glob.glob(os.path.join(SOURCE_DIRECTORY, '*.json')):
    with open(file) as f:
        json_data = json.load(f)
    # note that using the update method means
    # that if KEY does not exist then it will be created
    # which may not be what you want
    json_data.update({KEY: NEW_VALUE})
    with open(file, 'w') as f:
        json.dump(json_data, f, indent=4)
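As the comment in the code warns, update creates the key when it is missing. A hedged variant that only touches files which already contain the key could look like the sketch below. The function name update_key_in_folder and its return value are illustrative choices, not part of the answer above:

```python
import json
from pathlib import Path


def update_key_in_folder(directory, key, new_value):
    """Set key to new_value in every *.json file in directory that already
    has that key. Return the names of the files that were rewritten."""
    changed = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf8") as f:
            data = json.load(f)
        # Skip files whose top level is not an object or that lack the key
        if not isinstance(data, dict) or key not in data:
            continue
        data[key] = new_value
        with path.open("w", encoding="utf8") as f:
            json.dump(data, f, indent=4)
        changed.append(path.name)
    return changed
```

Calling update_key_in_folder('my_favourite_directory', 'seller_fee_basis_points', 500) would then rewrite only the files that already define that key.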
I need to parse a Terraform file, which is written in JSON format. I have to extract two pieces of data, the resource and the id. This is an example file:
{
"version": 1,
"serial": 1,
"modules": [
{
"path": [
"root"
],
"outputs": {
},
"resources": {
"aws_security_group.vpc-xxxxxxx-test-1": {
"type": "aws_security_group",
"primary": {
"id": "sg-xxxxxxxxxxxxxx",
"attributes": {
"description": "test-1",
"name": "test-1"
}
}
},
"aws_security_group.vpc-xxxxxxx-test-2": {
"type": "aws_security_group",
"primary": {
"id": "sg-yyyyyyyyyyyy",
"attributes": {
"description": "test-2",
"name": "test-2"
}
}
}
}
}
]
}
For every resource, I need to export the first key and the value of id; in this case, aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx and aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy
I have tried to write this in Python:
#!/usr/bin/python3.6
import json
import objectpath

with open('file.json') as json_file:
    data = json.load(json_file)
    json_tree = objectpath.Tree(data['modules'])
    result = tuple(json_tree.execute('$..resources[0]'))
The result is:
('aws_security_group.vpc-xxxxxxx-test-1', 'aws_security_group.vpc-xxxxxxx-test-2')
That's OK, but I can't extract the id. Any help is appreciated, including other methods.
Thanks
I don't know objectpath, but I think you need:
tree.execute('$..resources[0]..primary.id')
or even just
tree.execute('$..resources[0]..id')
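Since the question is open to other methods, here is a minimal sketch that uses only plain dictionary access instead of objectpath. It assumes the state file has the layout shown in the question (modules, then resources, then primary.id). The function name resource_ids is hypothetical:

```python
def resource_ids(state):
    """Return (resource_name, id) pairs for every resource in every module."""
    pairs = []
    for module in state.get("modules", []):
        for name, resource in module.get("resources", {}).items():
            pairs.append((name, resource["primary"]["id"]))
    return pairs


# A trimmed-down copy of the example state from the question
state = {
    "modules": [
        {
            "resources": {
                "aws_security_group.vpc-xxxxxxx-test-1": {
                    "primary": {"id": "sg-xxxxxxxxxxxxxx"}
                },
                "aws_security_group.vpc-xxxxxxx-test-2": {
                    "primary": {"id": "sg-yyyyyyyyyyyy"}
                },
            }
        }
    ]
}

for name, resource_id in resource_ids(state):
    print(name, resource_id)
```

With a real file, state would come from json.load(json_file) as in the question's script.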
I have an input json like the following:
{
"page": 2,
"limit": 10,
"order": [
{
"field": "id",
"type": "asc"
},
{
"field": "email",
"type": "desc"
},
...
{
"field": "fieldN",
"type": "desc"
}
],
"filter": [
{
"field": "company_id",
"type": "=",
"value": 1
},
...
{
"field": "counter",
"type": ">",
"value": 5
}
]
}
How do I dynamically construct a SQLAlchemy query based on my input JSON if I don't know the number of fields?
Something like this:
User.query.filter(filter.field, filter.type, filter.value).filter(filter.field1, filter.type1, filter.value1)...filter(filter.fieldN, filter.typeN, filter.valueN).order_by("id", "asc").order_by("email", "desc").order_by("x1", "y1")....order_by("fieldN"...."desc").all()
Convert the JSON into a dictionary and read the values from it. Note that "order" and "filter" are lists of dictionaries, so the field names come from their entries (for example data["order"][0]["field"]), not from top-level keys.
If your JSON is in a file (say, data.json), the json library will satisfy your needs:
import json

with open("data.json") as f:
    data = json.load(f)

# assumes the first filter entry is the company_id equality filter
query = User.query.filter_by(company_id=data["filter"][0]["value"])
for rule in data["order"]:
    column = getattr(User, rule["field"])
    query = query.order_by(column.asc() if rule["type"] == "asc" else column.desc())
users = query.all()
If your JSON is a string (say, json_data), only the loading step changes:
import json

data = json.loads(json_data)
If your JSON is a response from the Python requests library, i.e. res = requests.get(...), then res.json() will return a dictionary:
data = res.json()
In both cases the query is then built from data in the same way as for the file.
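To handle an unknown number of filter and order entries, both lists from the question can be applied in a loop. The sketch below is one possible approach, not a confirmed recipe from the answer: it relies on SQLAlchemy columns supporting the standard comparison operators and the asc() and desc() methods. The OPERATORS table and the apply_spec name are assumptions made for illustration:

```python
import operator

# Hypothetical mapping from the "type" strings in the input to Python operators
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def apply_spec(query, model, spec):
    """Apply every filter and ordering rule in spec to query and return it."""
    for rule in spec.get("filter", []):
        column = getattr(model, rule["field"])
        compare = OPERATORS[rule["type"]]
        # On a SQLAlchemy column, compare(column, value) builds a SQL expression
        query = query.filter(compare(column, rule["value"]))
    for rule in spec.get("order", []):
        column = getattr(model, rule["field"])
        # Anything other than "desc" is treated as ascending
        ordering = column.desc() if rule["type"] == "desc" else column.asc()
        query = query.order_by(ordering)
    return query
```

Usage would be along the lines of users = apply_spec(User.query, User, data).all(). Because getattr is driven by the input, it is probably wise to check each field name against a list of allowed columns first.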
I am very much new to JSON parsing. Below is my JSON:
[
{
"description": "Newton",
"exam_code": {
"date_added": "2015-05-13T04:49:54+00:00",
"description": "Production",
"exam_tags": [
{
"date_added": "2012-01-13T03:39:17+00:00",
"descriptive_name": "Production v0.1",
"id": 1,
"max_count": "147",
"name": "Production"
}
],
"id": 1,
"name": "Production",
"prefix": "SA"
},
"name": "CM"
},
{
"description": "Opera",
"exam_code": {
"date_added": "2015-05-13T04:49:54+00:00",
"description": "Production",
"test_tags": [
{
"date_added": "2012-02-22T12:44:55+00:00",
"descriptive_name": "Production v0.1",
"id": 1,
"max_count": "147",
"name": "Production"
}
],
"id": 1,
"name": "Production",
"prefix": "SA"
},
"name": "OS"
}
]
Here I am trying to print the description value if the name value is CM.
If the name value is OS, I also want to print the description value.
Please help me understand how JSON parsing can be done.
Assuming you have already read the JSON string from somewhere (a file, stdin, or any other source), you can deserialize it into a Python object by doing:
import json
# ...
json_data = json.loads(json_str)
Where json_str is the JSON string that you want to parse.
In your case, json_str will get deserialized into a Python list, so you can do any operation on it as you'd normally do with a list.
Of course, this includes iterating over the elements:
for item in json_data:
    if item.get('name') in ('CM', 'OS'):
        print(item['description'])
As you can see, the items in json_data have been deserialized into dict, so you can access the actual fields using dict operations.
Note
You can also deserialize JSON from the source directly, provided you have access to the file handle/descriptor or stream:
# Loading from a file
import json
with open('my_json.json', 'r') as fd:
    # Note that we're using json.load, not json.loads
    json_data = json.load(fd)

# Loading from stdin
import json, sys
json_data = json.load(sys.stdin)
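If the descriptions need to be looked up by name more than once, one option is to build a dictionary keyed by name after parsing. This is a small sketch on a shortened copy of the question's data; the variable names are illustrative:

```python
import json

# Shortened version of the JSON from the question
json_str = '[{"name": "CM", "description": "Newton"}, {"name": "OS", "description": "Opera"}]'

items = json.loads(json_str)

# Map each name to its description for direct lookup
descriptions = {item["name"]: item["description"] for item in items}

print(descriptions["CM"])
print(descriptions["OS"])
```

This assumes every item has both keys and that names are unique; a later duplicate name would overwrite an earlier one.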
I have a CSV file with the data below and, using Python, I would like to convert it to JSON.
I would like to convert it to the JSON format shown below. Can you tell me which library I should use, or give any suggestions for pseudocode?
I am able to convert it to standard JSON with key-value pairs, but I don't know how to produce the JSON format below.
"T-shirt","Long-tshirt",18
"T-shirt","short-tshirt"19
"T-shirt","half-tshirt",20
"top","very-nice",45
"top","not-nice",56
{
"T-shirts":[
{
"name":"Long-tshirt",
"size":"18"
},
{
"name":"short-tshirt",
"size":"19"
},
{
"name":"half-tshirt",
"size":"20"
}
],
"top":[
{
"name":"very-nice",
"size":45
},
{
"name":"not-nice",
"size":45
}
]
}
In this code, I put your CSV into a test.csv file (as a heads-up, the CSV you provided was missing a comma before the 19):
"T-shirt","Long-tshirt",18
"T-shirt","short-tshirt",19
"T-shirt","half-tshirt",20
"top","very-nice",45
"top","not-nice",56
Then, using the built-in csv and json modules, you can iterate over each row and add it to a dictionary. I used a defaultdict to save time, and wrote the data out to a JSON file.
import csv, json
from collections import defaultdict

my_data = defaultdict(list)
with open("test.csv") as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        if row:  # To ignore blank lines
            my_data[row[0]].append({"name": row[1], "size": row[2]})

with open("out.json", "w") as out_file:
    json.dump(my_data, out_file, indent=2)
Generated output file:
{
"T-shirt": [
{
"name": "Long-tshirt",
"size": "18"
},
{
"name": "short-tshirt",
"size": "19"
},
{
"name": "half-tshirt",
"size": "20"
}
],
"top": [
{
"name": "very-nice",
"size": "45"
},
{
"name": "not-nice",
"size": "56"
}
]
}
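The csv module returns every field as a string, which is why the sizes above are quoted. If numeric sizes are wanted, a variant along these lines could convert them while grouping. This is a sketch that assumes every row has exactly three columns and an integer size; group_rows is a hypothetical name, and io.StringIO stands in for the file:

```python
import csv
import io
import json
from collections import defaultdict


def group_rows(csv_file):
    """Group CSV rows by their first column, storing sizes as integers."""
    grouped = defaultdict(list)
    for row in csv.reader(csv_file):
        if not row:  # ignore blank lines
            continue
        category, name, size = row
        grouped[category].append({"name": name, "size": int(size)})
    return dict(grouped)


sample = io.StringIO(
    '"T-shirt","Long-tshirt",18\n'
    '"top","very-nice",45\n'
    '"top","not-nice",56\n'
)
result = group_rows(sample)
print(json.dumps(result, indent=2))
```

With a real file, the StringIO object would be replaced by the handle from open("test.csv").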
import json
json_string = json.dumps(your_dict)
You now have a string containing JSON-formatted data from your original dictionary. Is that what you wanted?