I have a JSON file that looks like this:
[
"{'_filled': False,\n 'affiliation': u'Postdoctoral Scholar, University of California, Berkeley',\n 'citedby': 113,\n 'email': u'#berkeley.edu',\n 'id': u'4bahYMkAAAAJ',\n 'interests': [u'3D Shape',\n u'Shape from Texture',\n u'Shape from Shading',\n u'Naive Physics',\n u'Haptics'],\n 'name': u'Steven A. Cholewiak',\n 'url_citations': u'/citations?user=4bahYMkAAAAJ&hl=en',\n 'url_picture': u'/citations?view_op=view_photo&user=4bahYMkAAAAJ&citpid=1'}",
"\n"]
I am using Python to extract the value of citedby, but I cannot figure out how.
Here is my code:
import json
json_data = open("output.json")
data = json.load(json_data)
print data[]
I know that data is a list, so it only accepts an integer index, whereas I want a dictionary that I can search by key.
Is there any way I can achieve this?
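import json
import ast
# Each list element is a string holding a Python dict literal, not a JSON object
with open("output.json") as json_data:
    data = json.load(json_data)
# Parse the first string into a dict, then look up the key
print(ast.literal_eval(data[0])['citedby'])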
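The answer above works because each element of the JSON array is a string containing a Python dict literal rather than a JSON object. Here is a minimal sketch of the same idea, assuming a shortened sample inlined in place of output.json (the names `raw`, `entries`, and `records` are illustrative, not from the original post). It also skips the trailing "\n" entry, which would otherwise fail to parse:

```python
import ast
import json

# Shortened, inlined stand-in for the contents of output.json (assumption).
raw = json.dumps([
    "{'_filled': False,\n 'citedby': 113,\n 'name': u'Steven A. Cholewiak'}",
    "\n",
])

# json.loads gives a list of strings, not a list of dicts.
entries = json.loads(raw)

# Each non-blank string is a Python literal, so ast.literal_eval can parse it.
records = [ast.literal_eval(entry) for entry in entries if entry.strip()]

print(records[0]["citedby"])
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for data like this.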
https://github.com/Asabeneh/30-Days-Of-Python/blob/ff24ab221faaec455b664ad5bbdc6e0de76c3caf/data/countries_data.json
How can I loop through this countries_data.json file (see link above) to get 'languages'?
I have tried:
import json
f = open("countries_data.json")
file = f.read()
# print(file)
for item in file:
    print(item)
Your setup is almost correct, but you never parsed the JSON: f.read() returns a plain string, so the loop iterates over single characters. Passing "r" to open() is optional, since read mode is the default.
Correct code:
import json
f = open("countries_data.json", "r")
file = json.loads(f.read())
for item in file:
    print(item)
Hope this helped, always double check your code.
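To see why the original loop printed single characters, here is a small sketch, assuming a two-entry sample inlined in place of countries_data.json (the variable names are illustrative). Iterating over the raw text yields characters, while iterating over the parsed result yields one dictionary per country:

```python
import json

# Inlined two-entry sample standing in for countries_data.json (assumption).
text = (
    '[{"name": "Afghanistan", "languages": ["Pashto", "Uzbek", "Turkmen"]},'
    ' {"name": "Åland Islands", "languages": ["Swedish"]}]'
)

# Iterating over the raw string yields one character at a time.
first_chars = list(text)[:3]
print(first_chars)

# After parsing, iteration yields one dictionary per country.
countries = json.loads(text)
for country in countries:
    print(country["languages"])
```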
You import the json module at the beginning, so you might as well use it.
The documentation describes a function, json.load, that reads this file directly.
You end up with a list of dictionaries, so the code can be summarized as follows:
import json
with open("test/countries_data.json") as file:
    data = json.load(file)
for item in data:
    print(item["languages"])
You are missing one essential step: parsing the JSON data into Python data structures.
import json
# read file
f = open("countries.json")
# parse JSON to Python datastructures
countries = json.load(f)
# now you have a list of countries
print(type(countries))
# loop through list of countries
for country in countries:
    # you can access languages with country["languages"]; JSON objects are Python dictionaries now
    print(type(country))
    for language in country["languages"]:
        print(language)
f.close()
Expected output:
<class 'list'>
<class 'dict'>
Pashto
Uzbek
Turkmen
...
You can use the built-in json package to deserialize the content of that file.
A usage sample:
import json
data = """[
{
"name": "Afghanistan",
"capital": "Kabul",
"languages": [
"Pashto",
"Uzbek",
"Turkmen"
],
"population": 27657145,
"flag": "https://restcountries.eu/data/afg.svg",
"currency": "Afghan afghani"
},
{
"name": "Åland Islands",
"capital": "Mariehamn",
"languages": [
"Swedish"
],
"population": 28875,
"flag": "https://restcountries.eu/data/ala.svg",
"currency": "Euro"
}]"""
# deserializing
print(json.loads(data))
For more complex content, have a look at JSONDecoder in the json documentation.
EDIT:
import json
path = "countries_data.json"  # replace with the path to your file
with open(path, 'r') as fd:
    # iterate over the dictionaries
    for d in json.loads(fd.read()):
        print(d['languages'])
EDIT: extra - top 10 languages
import json
import itertools as it
path = "countries_data.json"  # replace with the path to your file
with open(path, 'r') as fd:
    text = fd.read()
languages_from_file = list(it.chain(*(d['languages'] for d in json.loads(text))))
# get unique "list" of languages
languages_all = set(languages_from_file)
# count the repeated languages
languages_count = {l: languages_from_file.count(l) for l in languages_all}
# order them per descending value
top_ten_languages = sorted(languages_count.items(), key=lambda k: k[1], reverse=True)[:10]
print(top_ten_languages)
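As an alternative to counting with list.count, the same ranking can be sketched with collections.Counter from the standard library, which tallies in a single pass. The sample below is inlined and partly invented for illustration (the "Sweden" entry is an assumption, added so that one language repeats):

```python
import json
from collections import Counter

# Small inlined sample; the "Sweden" entry is invented for illustration.
text = """[
  {"name": "Afghanistan", "languages": ["Pashto", "Uzbek", "Turkmen"]},
  {"name": "Aland Islands", "languages": ["Swedish"]},
  {"name": "Sweden", "languages": ["Swedish"]}
]"""

# Count every language across all countries in one pass.
language_counts = Counter(
    language
    for country in json.loads(text)
    for language in country["languages"]
)

# most_common(10) returns (language, count) pairs, highest count first.
top_ten = language_counts.most_common(10)
print(top_ten)
```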
For example
I have tried
import json
data = [{"ModelCode":"VH017","MakeCode":"VM020","VehicleTypeCode":"VC00000052","Year":2017,"IsActive":true,"RegistrationNumber":"KCC 254 ZY","IsApproved":true,"ApprovedBy":null,"Color":"BLUE","Id":"8c5062da-727b-40d5-b763-408cafdc53d8","_id":"3ce92939-4df7-4b9e-af48-647e218736da"},{"ModelCode":"VH024","MakeCode":"VM026","VehicleTypeCode":"VC00000053","Year":2008,"IsActive":false,"RegistrationNumber":"kkk 333k","IsApproved":false,"ApprovedBy":null,"Color":"blue","Id":"8c5062da-727b-40d5-b763-408cafdc53d8"}]
data_from_api = data.strip('][').split(',')
json.loads(data_from_api)
print(data_from_api)
I get a "NameError: name 'true' is not defined"
json.loads(s) parses a JSON string into Python data. If you want to create a JSON string from Python data to save in a file, as your title suggests, use json.dumps instead.
And as the comments point out, in Python null is spelled None, false is False, and true is True.
import json
data = [
{
"ModelCode":"VH017",
"IsActive":True,
"ApprovedBy":None
},{
"ModelCode":"VH024",
"IsActive":False,
"ApprovedBy":None
}
]
json_str = json.dumps(data)
print(json_str)
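If the data arrives as JSON text (for example, from an API), the opposite direction applies: keep it as a string and let json.loads translate true, false, and null. A minimal sketch, assuming a shortened version of the payload from the question:

```python
import json

# JSON text, shortened from the question; note that the whole payload is a string.
payload = (
    '[{"ModelCode":"VH017","Year":2017,"IsActive":true,"ApprovedBy":null},'
    '{"ModelCode":"VH024","Year":2008,"IsActive":false,"ApprovedBy":null}]'
)

# json.loads maps true/false/null to True/False/None.
vehicles = json.loads(payload)
print(vehicles[0]["IsActive"])
print(vehicles[1]["ApprovedBy"])

# json.dumps goes the other way and produces JSON text again.
json_str = json.dumps(vehicles)
print(json_str)
```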
I'm getting face detection data from an API in this form:
{"id":1,"ageMin":0,"ageMax":100,"faceConfidence":66.72220611572266,"emotion":"ANGRY","emotionConfidence":50.0'
b'2540969848633,"eyeglasses":false,"eyeglassesConfidence":50.38102722167969,"eyesOpen":true,"eyesOpenConfidence":50.20328140258789'
b',"gender":"Male","genderConfidence":50.462989807128906,"smile":false,"smileConfidence":50.15522384643555,"sunglasses":false,"sun'
b'glassesConfidence":50.446510314941406}]'
I'd like to save this to a csv-file like this:
id ageMin ageMax faceConfidence
1 0 100 66
... and so on.
I tried to do it this way:
response = requests.get(url, headers=headers)
with open('detections.csv', 'w') as f:
    writer = csv.writer(f)
    for item in response:
        writer.writerow(str(item))
That puts every char in its own cell. I've also tried to use item.id, but that gives an error: AttributeError: 'bytes' object has no attribute 'id'.
Could someone point me in the right direction?
This may be overkill for a small task, but you can do the following.
Convert the JSON response to Python objects (do not forget to handle exceptions, etc.):
dic = response.json()
Create a DataFrame using pandas:
df = pandas.DataFrame(dic)
Save to CSV, omitting the index (sep="\t" makes the file tab-separated; drop it to get commas):
df.to_csv('detections.csv', index=False, sep="\t")
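If pandas feels too heavy, the standard library's csv.DictWriter can produce the same one-row-per-detection layout. This is only a sketch: the response body is inlined and shortened (an assumption; with requests you would call response.json() instead), and it writes to an in-memory buffer rather than detections.csv:

```python
import csv
import io
import json

# Inlined, shortened stand-in for the API response body (assumption).
body = (
    '[{"id": 1, "ageMin": 0, "ageMax": 100, '
    '"faceConfidence": 66.72220611572266, "emotion": "ANGRY"}]'
)
detections = json.loads(body)

# In-memory buffer for the demo; open("detections.csv", "w", newline="")
# could be used in its place.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(detections[0].keys()))
writer.writeheader()            # column names from the first record
writer.writerows(detections)    # one CSV row per dictionary

print(buffer.getvalue())
```

Taking the field names from the first record assumes every record has the same keys.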
You can do this relatively easily with the pandas and json libraries.
import pandas as pd
import json
response = """{
"id": 1,
"ageMin": 0,
"ageMax": 100,
"faceConfidence": 66.72220611572266,
"emotion": "ANGRY",
"emotionConfidence": 50.0,
"eyeglasses": false,
"eyeglassesConfidence": 50.38102722167969,
"eyesOpen": true,
"eyesOpenConfidence": 50.20328140258789,
"gender": "Male",
"genderConfidence": 50.462989807128906,
"smile": false,
"smileConfidence": 50.15522384643555,
"sunglasses": false,
"glassesConfidence":50.446510314941406
}"""
record = json.loads(response)
df = pd.DataFrame({"data": record})
df.to_csv("response.csv")
This is the response formatted as CSV (the row order may differ depending on the pandas version):
,data
ageMax,100
ageMin,0
emotion,ANGRY
emotionConfidence,50.0
eyeglasses,False
eyeglassesConfidence,50.38102722167969
eyesOpen,True
eyesOpenConfidence,50.20328140258789
faceConfidence,66.72220611572266
gender,Male
genderConfidence,50.462989807128906
glassesConfidence,50.446510314941406
id,1
smile,False
smileConfidence,50.15522384643555
sunglasses,False
I need to get the values of the keywords from the JSON file below, like:
output = ['abc', 'cde']
The JSON file structure looks like:
d = [{
"response": {"docs": [
{"keywords": [{"value": "abc"}]},
{"keywords": [{"value": "cde"}]}
]}
}]
I have tried the below. I believe it's redundant though since I get only one level of ["response"]["docs"].
keywords = []
data = json.load(data_file)
for i in data:
    keywords.append(i["response"]["docs"][0]["keywords"])
keyword_Value = [g['value'] for d in keywords for g in d]
There's a JSON encoder/decoder built in to Python. See: https://docs.python.org/2/library/json.html
Something like:
import json
with open('path/to/your_data.json') as json_data:
    data = json.load(json_data)
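Once the data is loaded, the values can be collected with a nested comprehension that walks every level, instead of indexing only docs[0]. A minimal sketch, using the structure from the question inlined in place of the file:

```python
# Structure copied from the question; in practice this comes from json.load.
data = [{
    "response": {"docs": [
        {"keywords": [{"value": "abc"}]},
        {"keywords": [{"value": "cde"}]},
    ]}
}]

# Walk every top-level item, every doc, and every keyword entry.
keywords = [
    keyword["value"]
    for item in data
    for doc in item["response"]["docs"]
    for keyword in doc["keywords"]
]
print(keywords)
```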
If you do not mind using an external library, this task is quite easy using jmespath like:
import jmespath
keywords = jmespath.search('[].response.docs[].keywords[].value', data)
Code:
data = [{
"response": {"docs": [
{"keywords": [{"value": "abc"}]},
{"keywords": [{"value": "cde"}]}
]}
}]
import jmespath
keywords = jmespath.search('[].response.docs[].keywords[].value', data)
print(keywords)
Results:
['abc', 'cde']
I have a .json file that I have opened in Python. However, I want to extract only the orderIds from it instead of printing the whole thing. Here is my code so far:
import json
from pprint import pprint
with open('data-3.json') as data_file:
    data = json.load(data_file)
pprint(data)
and here is my .json file:
{'orders': [{'createdTime': '2016-02-29T23:26:32Z',
'currentStatus': {'additionalProperties': {},
'customInfo': None,
'stateActionDescription': None,
'stateCode': 'DESPATCH_END',
'stateDescription': 'Despatch completed',
'stateType': 'DISPATCH',
'timestamp': '2016-03-02T12:47:26Z',
'updateId': 378379,
'user': 'Dave Ffitch'},
Once loaded, the JSON data is an ordinary Python dictionary, so you access its elements the same way:
Json = {'A':
{'B':['C','D','E']
}
}
Json['A']['B'] will give you ['C','D','E']
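Applied to the orders data, the same indexing could look like the sketch below. Note the assumptions: the posted output is truncated before any order ID appears, so the "orderId" key name and the sample values here are hypothetical and would need to match the real file:

```python
# Hypothetical sample: the "orderId" key and its values are assumptions,
# because the posted output is truncated before that field appears.
data = {
    "orders": [
        {"orderId": "A-1", "createdTime": "2016-02-29T23:26:32Z"},
        {"orderId": "A-2", "createdTime": "2016-03-01T08:00:00Z"},
    ]
}

# data["orders"] is a list of dictionaries; pick one field from each.
order_ids = [order["orderId"] for order in data["orders"]]
print(order_ids)
```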