How to extract elements from JSON File with python? - python

I have the JSON file below, and I am trying to extract the value of dob_year from each item whose name element is jim.
This is my attempt:
import json
with open('j.json') as json_file:
    data = json.load(json_file)
    if data['name'] == 'jim'
        print(data['dob_year'])
Error:
  File "extractjson.py", line 6
    if data['name'] == 'jim'
                            ^
SyntaxError: invalid syntax
This is my file:
[
{
"name": "jim",
"age": "10",
"sex": "male",
"dob_year": "2007",
"ssn": "123-23-1234"
},
{
"name": "jill",
"age": "6",
"sex": "female",
"dob_year": "2011",
"ssn": "123-23-1235"
}
]

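You need to iterate over the list in the JSON file (and note that the if statement needs a trailing colon):
import json
with open('j.json') as json_file:
    data = json.load(json_file)
for item in data:
    if item['name'] == 'jim':
        print(item['dob_year'])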

I suggest using a list comprehension:
import json
with open('j.json') as json_file:
    data = json.load(json_file)
print([item["dob_year"] for item in data if item["name"] == "jim"])

json.load() returns a list of entries here, because the top-level value in your file is a JSON array. You have to iterate through the list items and check the field in each item.
Also, if this is a repetitive task, make a function out of it. That way you can pass in different files, fields to extract, and so on, which can make your job easier.
Example:
import json
def extract_info(search_field, extract_field, name, filename):
    # Print extract_field for every item whose search_field equals name
    with open(filename) as json_file:
        data = json.load(json_file)
    for item in data:
        if item[search_field] == name:
            print(item[extract_field])
extract_info('name', 'dob_year', 'jim', 'j.json')
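A variant that returns the matches instead of printing them can be easier to reuse. The following is only a sketch: the name `find_values` and the inlined `SAMPLE` string are illustrative assumptions (the sample mirrors the j.json file from the question so the code runs without a file).

```python
import json

# Inlined copy of the question's data, so the sketch runs on its own.
SAMPLE = """
[
  {"name": "jim", "age": "10", "sex": "male", "dob_year": "2007", "ssn": "123-23-1234"},
  {"name": "jill", "age": "6", "sex": "female", "dob_year": "2011", "ssn": "123-23-1235"}
]
"""

def find_values(records, search_field, wanted, extract_field):
    """Return extract_field from every record whose search_field equals wanted."""
    return [
        record[extract_field]
        for record in records
        # .get() avoids a KeyError when a record lacks the search field
        if record.get(search_field) == wanted and extract_field in record
    ]

records = json.loads(SAMPLE)
jim_years = find_values(records, "name", "jim", "dob_year")
print(jim_years)
```

Working on already-parsed data keeps the file handling separate from the search, so the same helper can be applied to data from a file, a string, or an HTTP response.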

data is a list of dictionaries, so you cannot access the key/value pairs on it directly.
Wrap the lookup in a loop that checks every dict in the list:
import json
with open('j.json') as json_file:
    data = json.load(json_file)
# "item" is used rather than "set", which would shadow the built-in type
for item in data:
    if item['name'] == 'jim':
        print(item['dob_year'])

Related

How to update json object with dictionary in python

The goal is to update a json object that contains a particular key
The json file looks like:
{
"collection": [
{"name": "name1", "phone": "10203040"},
{"name": "name2", "phone": "20304050", "corporateIdentificationNumber": "1234"},
{"name": "name3", "phone": "30405060", "corporateIdentificationNumber": "5678"}
]}
If a JSON object contains the key 'corporateIdentificationNumber', then iterate through a dictionary and update 'name' and 'corporateIdentificationNumber' from the dictionary. The dictionary looks like this:
dict = {"westbuilt": "4232", "Northbound": "5556"}
In other words, I need to update the JSON objects from a dictionary: each time I update a JSON object, it should take one key/value pair from the dictionary, and then move on to the next key/value pair for the next JSON object containing 'corporateIdentificationNumber'.
Code:
r = requests.get(url="*URL*")
file = r.json()
for i in file['collection']:
    if 'corporateIdentificationNumber' in i:
        --- select next iterated key/value from dict---
        --- update json object ---
The result should look like:
{
"collection": [
{"name": "name1", "phone": "10203040"},
{"name": "westbuilt", "phone": "20304050", "corporateIdentificationNumber": "4232"},
{"name": "Northbound", "phone": "30405060", "corporateIdentificationNumber": "5556"}
]}
I think you need an iterator over the dictionary's items:
import requests
updates = {"westbuilt": "4232", "Northbound": "5556"}
r = requests.get(url="*URL*")
file = r.json()
items = iter(updates.items())
try:
    for i in file['collection']:
        if 'corporateIdentificationNumber' in i:
            # Take the next (name, number) pair; raises StopIteration when exhausted
            d = next(items)
            i['name'] = d[0]
            i["corporateIdentificationNumber"] = d[1]
except StopIteration:
    pass
print(file)
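To change a value and write the object back to a file (json_object is the already-loaded dictionary):
import json
json_object["corporateIdentificationNumber"] = "updated value"
with open("your_json_file.json", "w") as file:
    json.dump(json_object, file)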
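Another way to express the pairing is to select the matching objects first and then `zip` them with the dictionary items. Because `zip` stops at the shorter input, no `StopIteration` handling is needed. This is a sketch under assumptions: `payload` is an inline stand-in for the HTTP response from the question, and the result relies on dictionaries preserving insertion order (guaranteed from Python 3.7).

```python
import json

# Inline stand-in for the response body from the question (assumed to have the same shape).
payload = {
    "collection": [
        {"name": "name1", "phone": "10203040"},
        {"name": "name2", "phone": "20304050", "corporateIdentificationNumber": "1234"},
        {"name": "name3", "phone": "30405060", "corporateIdentificationNumber": "5678"},
    ]
}
updates = {"westbuilt": "4232", "Northbound": "5556"}

# Only the objects that carry the key are candidates for an update.
targets = [obj for obj in payload["collection"] if "corporateIdentificationNumber" in obj]

# zip() pairs each target with the next (name, number) item and stops at the shorter input.
for obj, (name, number) in zip(targets, updates.items()):
    obj["name"] = name
    obj["corporateIdentificationNumber"] = number

print(json.dumps(payload, indent=2))
```

The objects in `targets` are the same dictionaries that live inside `payload`, so updating them in the loop updates `payload` as well.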

Reading a json file that has multiple lines

I have a function that I apply to a json file. It works if it looks like this:
import json
def myfunction(dictionary):
    # does things
    return new_dictionary
data = """{
"_id": {
"$oid": "5e7511c45cb29ef48b8cfcff"
},
"description": "some text",
"startDate": {
"$date": "5e7511c45cb29ef48b8cfcff"
},
"completionDate": {
"$date": "2021-01-05T14:59:58.046Z"
},
"videos":[{"$oid":"5ecf6cc19ad2a4dfea993fed"}]
}"""
info = json.loads(data)
refined = myfunction(info)
new_data = json.dumps(refined)
print(new_data)
However, I need to apply it to a whole file, and the input looks like this (there are multiple elements, one after another, not separated by commas):
{"_id":{"$oid":"5f06cb272cfede51800b6b53"},"company":{"$oid":"5cdac819b6d0092cd6fb69d3"},"name":"SomeName","videos":[{"$oid":"5ecf6cc19ad2a4dfea993fed"}]}
{"_id":{"$oid":"5ddb781fb4a9862c5fbd298c"},"company":{"$oid":"5d22cf72262f0301ecacd706"},"name":"SomeName2","videos":[{"$oid":"5dd3f09727658a1b9b4fb5fd"},{"$oid":"5d78b5a536e59001a4357f4c"},{"$oid":"5de0b85e129ef7026f27ad47"}]}
How could I do this? I tried opening and reading the file, using load and dump instead of loads and dumps, and it still doesn't work. Do I need to read, or iterate over every line?
You are dealing with the NDJSON (newline-delimited JSON) data format.
Read the whole file as a string, split it into lines, and parse each line as a JSON object, which gives you a list of dictionaries:
import json
def parse_ndjson(data):
    # Parse each non-empty line as its own JSON document
    return [json.loads(line) for line in data.splitlines() if line.strip()]
with open('C:\\Users\\test.json', 'r', encoding="utf8") as handle:
    data = handle.read()
dicts = parse_ndjson(data)
for d in dicts:
    new_d = myfunction(d)
    print("New dict", new_d)
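For large files it may be preferable not to read everything into memory at once. The sketch below parses one line at a time with a generator; the name `iter_ndjson` is a hypothetical helper, and `io.StringIO` stands in for an open file so the example runs on its own.

```python
import io
import json

def iter_ndjson(lines):
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)

# io.StringIO stands in for a file here; with a real file, pass the handle returned by open().
sample = io.StringIO(
    '{"_id":{"$oid":"5f06cb272cfede51800b6b53"},"name":"SomeName"}\n'
    '\n'
    '{"_id":{"$oid":"5ddb781fb4a9862c5fbd298c"},"name":"SomeName2"}\n'
)
names = [record["name"] for record in iter_ndjson(sample)]
print(names)
```

Iterating over a file handle yields one line at a time, so only the current record has to be held in memory while it is processed.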

Delete a certain value in a json file

{
"Basketball": {
"first_name": "Michael",
"last_name": "Jordan"
},
"Football": {
"first_name": "Leo",
"last_name": "Messi"
},
"Football2": {
"first_name": "Cristiano",
"last_name": "Ronaldo"
}
}
This is my json file. I want to delete "Football2" out of this json file. It should work no matter what the value of "Football2" is.
So it should look like this after my code is executed:
{
"Basketball": {
"first_name": "Michael",
"last_name": "Jordan"
},
"Football": {
"first_name": "Leo",
"last_name": "Messi"
}
}
This is my code.
def delete_profile():
    delete_profile = input('Which one would you like to delete? ')
    with open(os.path.join('recources\datastorage\profiles.json')) as data_file:
        data = json.load(data_file)
    for element in data:
        print(element)
        if delete_profile in element:
            del(element[delete_profile])
    with open('data.json', 'w') as data_file:
        json.dump(data, data_file)
But it gives this error: TypeError: 'str' object does not support item deletion
What am I doing wrong and what is the correct way to do this?
You're looping over the keys of your JSON dictionary unnecessarily when you just want to delete a top-level item. Iterating over a dictionary yields its keys, which are strings, and that is why del fails:
import json
import os
def delete_profile():
    profile = input('Which one would you like to delete? ')
    # Build the path from its parts instead of embedding backslashes in the string
    with open(os.path.join('recources', 'datastorage', 'profiles.json')) as data_file:
        data = json.load(data_file)
    # No need to loop: the profiles are top-level keys of the JSON dictionary
    if profile in data:
        # Delete the profile from the data dictionary, not from its elements
        del data[profile]
    # Maybe add an else to handle the case where the requested profile is not in the JSON?
    with open('data.json', 'w') as data_file:
        json.dump(data, data_file)
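The membership check and the `del` can also be folded into a single `dict.pop` call with a default value, which never raises `KeyError`. A minimal sketch, assuming the data has already been loaded; `remove_profile` and `_MISSING` are hypothetical names, and the profiles are inlined from the question so the code runs without a file.

```python
import json

_MISSING = object()  # sentinel, so that a stored null value still counts as "present"

def remove_profile(profiles, key):
    """Remove key from the profiles dict; return True if something was removed."""
    # pop() with a default never raises KeyError, so no prior membership test is needed.
    return profiles.pop(key, _MISSING) is not _MISSING

profiles = json.loads("""
{
  "Basketball": {"first_name": "Michael", "last_name": "Jordan"},
  "Football": {"first_name": "Leo", "last_name": "Messi"},
  "Football2": {"first_name": "Cristiano", "last_name": "Ronaldo"}
}
""")

removed = remove_profile(profiles, "Football2")
missing = remove_profile(profiles, "Tennis")
print(removed, missing, list(profiles))
```

The return value gives the caller a place to report that the requested profile did not exist, which covers the "else" case mentioned in the comment above.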

Is there a way to just grab one subset of json data from a large text file?

I'm looking to pull the "name" field from a large json text file and be able to store them in another file for later, but I'm getting every piece of data that was in my previous json file albeit slightly modified. How do I make it so I only grab the data after the "name": field in my json file?
I've tried
names = []
with open('./out.json', 'r') as f:
    data = json.load(f)
for name in data:
    names.append(data[name])
with open('./names.json','w') as f:
    for name in names:
        f.write('%s\r\n' % name)
and I'm getting my exact JSON file back, with no formatting and u' in front of everything, likely from the json.load(f), but I have no idea how to remedy this.
My text file is formatted like this, if it matters:
{
"array":[
{
"name": "Seranul",
"id": 5,
"type": "Paladin",
"itemLevel": 414,
"icon": "Paladin-Holy",
"total": 11107150,
"activeTime": 2205387,
"activeTimeReduced": 2205387
},
{
"name": "Contherious",
"id": 9,
"type": "Hunter",
"itemLevel": 412,
"icon": "Hunter-Marksmanship",
"total": 51102811,
"activeTime": 2637303,
"activeTimeReduced": 2637303
},
{
"name": "Unicorns",
"id": 17,
"type": "Priest",
"itemLevel": null,
"icon": "Priest",
"total": 12252005,
"activeTime": 1768883,
"activeTimeReduced": 1761797
},
...
}
]}
I'm expecting to see the corresponding data for each name field, but I'm getting my entire document back.
It looks like your code is ignoring the structure of the JSON data. Specifically, you are iterating through the keys of the top-level JSON dictionary, of which there is only one, array, and then appending its value to your names list. This puts the whole array property into your names variable.
Here is what I believe you want: iterate through the entries in array, add each name to a list, then export that list as JSON to another file.
import json
names = []
with open('./out.json', 'r') as f:
    data = json.load(f)
for entry in data["array"]:
    names.append(entry["name"])
with open('./names.json', 'w') as f:
    json.dump(names, f)
This will result in the following JSON in names.json:
["Seranul", "Contherious", "Unicorns"]
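The unformatted output with `u'` prefixes described in the question comes from writing Python's own representation of the loaded objects with `'%s' % name`, rather than serializing them back to JSON. The sketch below contrasts the two on a shortened, hypothetical entry; it is written for Python 3, where the `u` prefix no longer appears but the output is still not valid JSON.

```python
import json

# Shortened, illustrative entry based on the question's data.
entry = {"name": "Unicorns", "itemLevel": None}

as_repr = "%s" % entry       # Python's dict representation: single quotes, None
as_json = json.dumps(entry)  # JSON text: double quotes, null

print(as_repr)
print(as_json)
```

Only the second string can be read back with `json.loads`, which is why `json.dump` or `json.dumps` is the safer choice when the output file is meant to be JSON.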

Can't access JSON loaded with json.dumps(json.loads(input))

Suppose I have json data like this.
{"id": {"$oid": "57dbv34346"}, "from": {"$oid": "57dbv34346sbgwe"}, "type": "int"}
{"id": {"$oid": "57dbv34345"}, "from": {"$oid": "57dbv34345sbgwe"}, "type": "int"}
I wrote a script like this in Python:
import json
with open('klinks_buildson.json', 'r') as f:
    for line in f:
        distros_dict = json.dumps(json.loads(line), sort_keys=True, indent=4)
        print distros_dict['from']
        print "\n"
But it is giving me an error:
    print distros_dict['from']
TypeError: string indices must be integers, not str
I want the data of the from field in both lines.
If the file were a single valid JSON document (for example, an array of objects), you would not need to parse it line by line; you could load the whole file at once:
with open('klinks_buildjson.json', 'r') as f:
    data = json.load(f)
data would then be a list in which each item is an object, and you could iterate through it:
for row in data:
    print(row['from'])
Your file, however, has one object per line, so json.load would fail on it and parsing line by line is correct. To fix your immediate problem, remove json.dumps, which converts an object to a string and is not what you want here:
distros_dict = json.loads(line)
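Put together, the distinction is that `json.loads` turns a string into a dictionary that can be indexed by key, while `json.dumps` turns it back into a string for display. A small sketch in Python 3, with the two lines from the question inlined in a `lines` list (an assumption made so the code runs without the file):

```python
import json

# The two records from the question, inlined in place of the file's lines.
lines = [
    '{"id": {"$oid": "57dbv34346"}, "from": {"$oid": "57dbv34346sbgwe"}, "type": "int"}',
    '{"id": {"$oid": "57dbv34345"}, "from": {"$oid": "57dbv34345sbgwe"}, "type": "int"}',
]

from_ids = []
for line in lines:
    record = json.loads(line)                # str -> dict, can be indexed by key
    from_ids.append(record["from"]["$oid"])  # nested lookup into the "from" object
    pretty = json.dumps(record, sort_keys=True, indent=4)  # dict -> str, for display only

print(from_ids)
```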
