Parse through a JSON file and pick out certain properties - python

I have a JSON file containing over 8000 features. An example of one is below:
{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"uuid":"____","part2":"_____","length":"529","function":"______"},"geometry":{"type":"LineString","coordinates":[[-360909.60698310119,7600968.922204642,0.0],[-361357.344715965,7600811.951385159,0.0],[-361805.08159795138,7600654.939420643,0.0]]}}
I am trying to use Python to iterate through the features and print certain values from the properties. At the moment I have the following code, which iterates through the features but prints the first feature's properties 8420 times.
import json
# Open the JSON file
f = open('file.json')
# json.load returns the JSON object as a dictionary
data = json.load(f)
# Iterate through the features list
for i in data['features']:
    print(data["features"][0]["properties"])
How can I amend the above to show the features and properties for each of the 8420 features? I have tried
for i in data['features']:
    print(data["features"][i]["properties"])
but I get the error TypeError: list indices must be integers or slices, not dict

i is a dictionary (one element of data['features']), not an index, so use it directly:
for i in data['features']:
    print(i["properties"])
If you need the index as well, use enumerate or range:
for i, d in enumerate(data['features']):
    # data['features'][i] is the same object as d
    print(i, d["properties"])
for i in range(len(data['features'])):
    print(i, data['features'][i]["properties"])
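As a minimal sketch of the direct loop applied to the question's file layout: the snippet below parses a small FeatureCollection from a string (the values are made up and stand in for file.json) and collects two properties from every feature. The helper name pick_properties is hypothetical, not something from the question.

```python
import json

# Hypothetical stand-in for the contents of file.json (values are made up).
raw = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "properties": {"uuid": "a1", "length": "529"},
    "geometry": {"type": "LineString",
                 "coordinates": [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]]}},
   {"type": "Feature",
    "properties": {"uuid": "b2", "length": "73"},
    "geometry": {"type": "LineString",
                 "coordinates": [[2.0, 2.0, 0.0], [3.0, 3.0, 0.0]]}}
 ]}
"""

def pick_properties(collection, keys):
    """Return one dict per feature containing only the requested property keys."""
    picked = []
    for feature in collection["features"]:  # feature is a dict, not an index
        props = feature["properties"]
        # .get() yields None instead of raising KeyError when a key is absent
        picked.append({key: props.get(key) for key in keys})
    return picked

data = json.loads(raw)
rows = pick_properties(data, ["uuid", "length"])
for row in rows:
    print(row)
```

With a real file, json.load on the opened file would replace json.loads(raw); the loop itself stays the same.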

Related

How to get complete dictionary data from a JSON file based on a value

I have a JSON file that I read in order to create an Excel report based on the xyz details. The sample JSON file I use holds the data as multiple dictionaries.
My requirement is to fetch each xyz value one by one and, based on it, create a report from certain fields. Below is a small snippet of the code where I read the file and populate the results by key. The data I reference comes from reading the file.
def pop_ws(dictionary,ws):
    r=1
    count=1
    for k,v in dictionary.items():
        offs=len(v['current'])
        ws.cell(row=r+1,column=1).value = k
        ws.cell(row=r+1,column=4).value = v['abc']
        ws.cell(row=r+1,column=5).value = v['def']
        wrk=read_cves(k)
        count +=1
        if wrk !='SAT':
            ws.cell(row=r+1,column=7).value =k
            ws.cell(row=r+1,column=8).value =tmp1['public_date']
            if 'cvss' in list(tmp1.keys()):
                .
                .
                .
def read_f(data):
    with open('dat.json') as f:
        wrk = f.read()
I am stuck on how to write read_f(data) so that it reads dat.json and, based on the value passed in as data, fetches the matching dictionaries one by one and populates the worksheet as pop_ws does.
The data argument is dynamic: I need to find every dictionary that has that value stored against a key and then write the whole dictionary to another JSON file.
Any suggestions would be appreciated.
Use the json package to load JSON data, like below:
# Python program to read a JSON file
import json
# Open the JSON file; the with block closes it automatically
with open('data.json') as f:
    # json.load returns the JSON object as a dictionary
    data = json.load(f)
# Iterate through the list stored under 'emp_details'
for i in data['emp_details']:
    print(i)
I got this from this link; you now have a dict loaded from the file.
Next, you can filter the dict by a specific value, as shown below.
Use the built-in filter() function with a function that returns True if the dictionary contains one of the values.
def filter_func(dic, filterdic):
    for k,v in filterdic.items():
        if k == 'items':
            if any(elemv in dic[k] for elemv in v):
                return True
        elif v == dic[k]:
            return True
    return False
def filter_cards(deck, filterdic):
    return list(filter(lambda dic, filterdic=filterdic: filter_func(dic, filterdic) , deck))
Pass a dictionary as the second argument:
filter_cards(deck, {'CVE': 'moderate'})
Hopefully this is helpful for your situation.
Thanks.
Once you get your json object, you can access each value using the key like so:
print(json_obj["key"]) #prints the json value for that key
In your case
print(wrk["CVE"]) # prints CVE-2020-25624
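For the original requirement (select the dictionaries that hold a given value against a key, then write them to another JSON file), a minimal sketch might look like the following. It assumes the data is a dict of dicts, which is how pop_ws iterates over dictionary.items(); the record IDs, the "severity" field, the helper name filter_records and the output path are all made up for illustration.

```python
import json
import os
import tempfile

# Assumed layout: a dict of dicts, matching how pop_ws iterates over
# dictionary.items(). The IDs, field names and values below are made up.
records = {
    "CVE-0000-0001": {"severity": "moderate", "abc": 1},
    "CVE-0000-0002": {"severity": "low", "abc": 2},
    "CVE-0000-0003": {"severity": "moderate", "abc": 3},
}

def filter_records(records, key, wanted):
    """Keep only the entries whose inner dict stores `wanted` under `key`."""
    return {k: v for k, v in records.items() if v.get(key) == wanted}

matches = filter_records(records, "severity", "moderate")

# Write the matching entries to a separate JSON file (a temporary path here).
out_path = os.path.join(tempfile.mkdtemp(), "filtered.json")
with open(out_path, "w") as f:
    json.dump(matches, f, indent=2)

# Read the file back to confirm the round trip.
with open(out_path) as f:
    reloaded = json.load(f)
print(sorted(reloaded))
```

Using v.get(key) rather than v[key] means entries that lack the key are skipped instead of raising KeyError.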

Take 2 key values from list of python dicts & make new list/tuple/array/dictionary with each index containing 2 key values from 1st listed dict

I have a list of dictionaries in a json file.
I have iterated through the list and each dictionary to obtain two specific key:value pairs from each dictionary for each element.
i.e. List[dictionary{i(key_x:value_x, key_y:value_y)}]
My question is now:
How do I place these two new key: value pairs in a new list/dictionary/array/tuple, representing the two key: value pairs extracted for each listed element in the original?
To be clear:
ORIGINAL_LIST (i.e. with each element being a nested dictionary) =
[{"a":{"blah":"blah",
"key_1":value_a1,
"key_2":value_a2,
"key_3":value_a3,
"key_4":value_a4,
"key_5":value_a5,},
"b":"something_a"},
{"a":{"blah":"blah",
"key_1":value_b1,
"key_2":value_b2,
"key_3":value_b3,
"key_4":value_b4,
"key_5":value_b5,},
"b":"something_b"}]
So my code so far is:
import json
from collections import *
from pprint import pprint
json_file = "/some/path/to/json/file"
with open(json_file) as json_data:
    data = json.load(json_data)
    json_data.close()
for i in data:
    event = dict(i)
    event_key_b = event.get('b')
    event_key_2 = event.get('key_2')
    print(event_key_b)  # print value of "b" for each nested dict for 'i'
    print(event_key_2)  # print value of "key_2" for each nested dict for 'i'
To be clear:
FINAL_LIST(i.e. with each element being a nested dictionary) =
[{"b":"something_a", "key_2":value_2},
{"b":"something_b", "key_2":value_2}]
So I have an answer to getting the keys into individual dictionaries, as follows in the code below. The only problem is that the value for 'key_2' in the original json dictionaries is either an int value or it is "" for values which are 0. My script just returns 'None' for all instances of value_2 for key_2. How can I get it to read the appropriate values for 'value_2'? I want to only return dictionaries for cases where 'value_2' > 0 (i.e. where value_2 != "")
Below is the current code:
import json
from pprint import pprint
json_file = "/some/path/to/json/file"
with open(json_file) as json_data:
    data = json.load(json_data)
    json_data.close()
for i in data:
    event_key_b = event.get('b')
    for x in i:
        event_key_2 = event.get('key_2')
        x = {'b' : something_b, 'key_2' : value_2}
        print(x)
Also, if there are any more elegant solutions anyone can think of I would really be interested in learning them ... Some of the json files I'm looking at can range from 200 dictionary entries in the original list to 2,000,000. I'm planning to feed my parsed results into a message queue for processing by a different service and any efficiencies in the code will help for scalability in processing. Also if anyone has any recommendations to give on Redis vs. RabbitMQ, I'd really appreciate it
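One possible approach, sketched under the assumption that "key_2" sits inside the nested "a" dictionary exactly as in ORIGINAL_LIST above: calling .get('key_2') on the outer dictionary finds no such key and returns None, which would explain the None values. The data and the helper name extract_pairs below are made up for illustration.

```python
# Made-up data in the shape of ORIGINAL_LIST; "key_2" is either an int or "".
original_list = [
    {"a": {"blah": "blah", "key_1": 1, "key_2": 7}, "b": "something_a"},
    {"a": {"blah": "blah", "key_1": 2, "key_2": ""}, "b": "something_b"},
    {"a": {"blah": "blah", "key_1": 3, "key_2": 12}, "b": "something_c"},
]

def extract_pairs(events):
    """Yield {'b': ..., 'key_2': ...} for events whose key_2 is a positive int."""
    for event in events:
        # key_2 lives inside the nested "a" dict, so look it up there;
        # event.get("key_2") on the outer dict would return None.
        value = event.get("a", {}).get("key_2")
        if isinstance(value, int) and value > 0:
            yield {"b": event.get("b"), "key_2": value}

final_list = list(extract_pairs(original_list))
print(final_list)
```

Because extract_pairs is a generator, the results can also be consumed one at a time (for example, handed to a queue producer) without building the full output list first. Note that json.load still reads the whole input file into memory.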

How do I replace a JSON list to print to a CSV?

I'm using an API to gather some data that comes to me in JSON format. I'm using json.loads to import the data and can successfully write it to a CSV. Unfortunately, the data comes in in a format that I don't want so I'd like to reformat the json list.
I've tried creating a new list and assigning the JSON list to the desired list. I get the following error: TypeError: list indices must be integers or slices, not str
import requests
import json
import csv
response = requests.get(url).text  # json source
data = json.loads(response)
newsdata = (data["response"]["docs"])
# These two lines reformat the date to what I want it to look like
newsdate = [y["pub_date"] for y in newsdata]
newsdate = [y.split('T')[0] for y in newsdate]
newsdata["pub_date"] = newsdate  # This line is what I've tried to replace the json
newssnip = [y["snippet"] for y in newsdata]
newshead = [y["headline"]["main"] for y in newsdata]
for z in newsdata:
    csvwriter.writerow([z["pub_date"],  # This is the JSON data I want to reformat
                        z["headline"]["main"],
                        z["snippet"],
                        z["web_url"]])
I expected the newsdata["pub_date"] to be overwritten when I assigned newsdate to it but I get the following error instead: TypeError: list indices must be integers or slices, not str
Thank you for your help! :)
EDIT:
I've uploaded an example json response here on github called "exmaple.json": https://github.com/theChef613/nytnewsscrapper
That error says that newsdata is a list and therefore cannot be indexed with a string. Post the raw JSON data returned, or print(type(newsdata)) to find out what class newsdata is and how to work with it. It's also possible that newsdata is a 2D (or N-d) array where the first element is the key and the second element is the value.
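If newsdata is a list of dictionaries, as the list comprehensions in the question suggest, one way to reformat the date is to update each dictionary in place rather than assigning to newsdata["pub_date"]. The sketch below uses made-up records and writes to an in-memory buffer instead of a real CSV file; the field names follow the question's code.

```python
import csv
import io

# Made-up stand-in for data["response"]["docs"]; field names follow the question.
newsdata = [
    {"pub_date": "2019-04-01T10:15:00+0000",
     "headline": {"main": "First headline"},
     "snippet": "First snippet",
     "web_url": "https://example.com/1"},
    {"pub_date": "2019-04-02T08:00:00+0000",
     "headline": {"main": "Second headline"},
     "snippet": "Second snippet",
     "web_url": "https://example.com/2"},
]

# newsdata is a list, so update each dict in it rather than newsdata["pub_date"].
for doc in newsdata:
    doc["pub_date"] = doc["pub_date"].split("T")[0]

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
csvwriter = csv.writer(buffer)
for doc in newsdata:
    csvwriter.writerow([doc["pub_date"], doc["headline"]["main"],
                        doc["snippet"], doc["web_url"]])
print(buffer.getvalue())
```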

Nested dictionary behavior

I am trying to learn how to manipulate data in python.
I have the following data in a txt file
{"summonerId":000000,"games":[{"gameId":111111,"invalid":false,"gameMode":"CLASSIC","gameType":"MATCHED_GAME","subType":"NORMAL","mapId":11,"teamId":200,"championId":89,"spell1":3,"spell2":4,"level":30,"ipEarned":237,"createDate":1443314494341,"fellowPlayers":[{"summonerId":46350758,"teamId":100,"championId":157}],"stats":{"level":15,"goldEarned":10173,"numDeaths":5,"minionsKilled":48,"championsKilled":1,"goldSpent":9205,"totalDamageDealt":48752,"totalDamageTaken":23464,"team":200,"win":true,"largestMultiKill":1,"physicalDamageDealtPlayer":9064,"magicDamageDealtPlayer":35714,"physicalDamageTaken":18944,"magicDamageTaken":4005,"timePlayed":1831,"totalHeal":4129,"totalUnitsHealed":5,"assists":24,"item0":3401,"item1":2049,"item2":3117,"item3":3068,"item4":3075,"item5":1028,"item6":3340,"magicDamageDealtToChampions":9062,"physicalDamageDealtToChampions":3348,"totalDamageDealtToChampions":12411,"trueDamageDealtPlayer":3974,"trueDamageTaken":514,"wardKilled":1,"wardPlaced":16,"totalTimeCrowdControlDealt":104,"playerRole":2,"playerPosition":4}]}
My end goal is to be able to display a specific piece of information from the "stats" dictionary.
When I run the following code
import json
matches = open('testdata.txt', 'r')
output = matches.read()
data=json.loads(output)
display = data["games"]
print("Info: " + str(display))
The output is everything that corresponds to the "games" key as I would expect.
When I try
import json
matches = open('testdata.txt', 'r')
output = matches.read()
data=json.loads(output)
display = data["games"]["stats"]
print("Info: " + str(display))
I receive: TypeError: list indices must be integers, not str
I'm not really sure how to proceed given that the key is clearly a string and not an integer...
Your data["games"] value is a list; each element in that list is a dictionary, and it is those dictionaries in the list that (may) have the 'stats' key. A list can contain 0 or more elements; in this specific case there is just 1 but there could be more or none.
Loop over the list of dictionaries, or pick a specific dictionary from the list with indexing. Since there is only one in your specific example, you could just index that 1 element with the 0 index:
display = data["games"][0]["stats"]
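When the list holds more than one game, looping is the more general option. A minimal sketch, using a trimmed and made-up stand-in for testdata.txt with two games:

```python
import json

# Trimmed, made-up stand-in for testdata.txt with two games.
raw = (
    '{"summonerId": 1, "games": ['
    '{"gameId": 111111, "stats": {"level": 15, "win": true}}, '
    '{"gameId": 222222, "stats": {"level": 9, "win": false}}]}'
)

data = json.loads(raw)

# data["games"] is a list, so visit each game dict and read its "stats" dict.
levels = []
for game in data["games"]:
    stats = game.get("stats", {})  # tolerate a game without a "stats" key
    levels.append(stats.get("level"))
print(levels)
```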

Sum JSON objects and get a JSON file in Python

I have a JSON file with many objects. I want to filter it to discard all the objects that do not have a specific field called 'id'. I developed a piece of code but it does not work:
import json
b=open("all.json","r")
sytems_objs=json.loads(b.read())
flag=0
for i in range(len(sytems_objs)):
    if sytems_objs[i]["id"]<>None:
        if flag==0:
            total=sytems_objs[i]
            flag=1
        else:
            total=total+sytems_objs[i]
file1=open("filtered.json","w+")
json.dump(total, file1)
c=open("filtered.json","r")
sytems_objs2=json.loads(b.read())
I get an error: ValueError: No JSON object could be decoded
What am I doing wrong?
I'm assuming that system_objs is originally an array of objects.
import json
with open('all.json') as b:
    system_objs = json.loads(b.read())
# keep only the dicts that have the property 'id' with a non-null value
system_objs = [o for o in system_objs if 'id' in o and o['id'] is not None]
# how many dicts have an 'id'
print(len(system_objs))
# write the filtered objects to a JSON file
with open('filtered.json', 'w') as f:
    f.write(json.dumps(system_objs))
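The ValueError in the question most likely comes from its last line, which calls b.read() a second time on a file that has already been read to the end (so it returns an empty string) instead of reading from c; filtered.json is also reopened without first being closed. The sketch below shows the same filter as a full round trip, using made-up objects and a temporary path in place of all.json and filtered.json.

```python
import json
import os
import tempfile

# Made-up stand-in for the contents of all.json.
objects = [{"id": 1, "name": "a"}, {"name": "no id"}, {"id": None}, {"id": 4}]

# Keep only objects that have an "id" whose value is not null.
filtered = [o for o in objects if o.get("id") is not None]

path = os.path.join(tempfile.mkdtemp(), "filtered.json")
with open(path, "w") as f:  # the with block closes (and flushes) the file
    json.dump(filtered, f)

with open(path) as f:  # reopen the file that was just written
    reloaded = json.load(f)
print(reloaded)
```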
