I am using the code below which I found here, but it always returns the same quote. I can see that data itself has more quotes, but not sure how to parse them out one at a time. Thank you.
import requests
url = "http://www.forbes.com/forbesapi/thought/uri.json?enrich=true&query=1&relatedlimit=5"
response = requests.get(url)
data = response.json()
quote=data['thought']['quote'].strip()
Returns:
u'Teach self-denial and make its practice pleasure, and you can create for the world a destiny more sublime that ever issued from the brain of the wildest dreamer.'
type(data) returns a dict. Then data.keys() returns [u'thought']. data.items() returns a bunch of text. Why only one key, if are seemingly more than one quote inside? And why does data = response.json() return disc with type(data), rather than a json object?
I copied your json file at http://jsonviewer.stack.hu/ and saw the structure of data. Once you understand the structure.
import requests
url = "http://www.forbes.com/forbesapi/thought/uri.json?enrich=true&query=1&relatedlimit=5"
response = requests.get(url)
data = response.json()
#print recent quote from Thought
print '*'*10
quote=data['thought']["quote"]
print quote
#print quotes from related authors
print '*'*10
quotes_of_related_authors = data['thought']['relatedAuthorThoughts']
for i in quotes_of_related_authors:
print i.get('quote')
# print quotes from related theme thoughts
print '*'*10
quotes_of_related_theme_thoughts = data['thought']['relatedThemeThoughts']
for i in quotes_of_related_theme_thoughts:
print i.get('quote')
Related
I am using Python 3.7 and I am trying to handle some JSON data that I receive back from a website. A sample of the JSON response is below but it can vary in length. In essence, it returns details about 'officers' and in the example below, there is data for two officers. This is using the OpenCorporates API
{"api_version":"0.4","results":{"page":1,"per_page":30,"total_pages":1,"total_count":2,"officers":[{"officer":{"id":212927580,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/212927580","start_date":"2015-01-28","end_date":null,"occupation":"SERVICE MANAGER","current_status":null,"inactive":false,"company":{"name":"GRSS LIMITED","jurisdiction_code":"gb","company_number":"09411531","opencorporates_url":"https://opencorporates.com/companies/gb/09411531"}}},{"officer":{"id":190031476,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/190031476","start_date":"2002-05-17","end_date":null,"occupation":"COMPANY DIRECTOR","current_status":null,"inactive":false,"company":{"name":"GILBERT ROAD SERVICE STATION LIMITED","jurisdiction_code":"gb","company_number":"04441363","opencorporates_url":"https://opencorporates.com/companies/gb/04441363"}}}]}}
My code so far is:-
response = requests.get(url)
response.raise_for_status()
jsonResponse = response.json()
officerDetails = jsonResponse['results']['officers']
This works well but my ultimate goal is to create variables and write them to a .csv. So I'd like to write something like:-
name = jsonResponse['results']['officers']['name']
position = jsonResponse['results']['officers']['name']
companyName = jsonResponse['results']['officers']['company']['name']
Any suggestions how I could do this? As said, I'd like to loop through each 'officer' in the JSON response and then capture these values and write to a .csv (I will tackle the .csv part once I have them assigned to the variables)
officers = jsonResponse['results']['officers']
res = []
for officer in officers:
data = {}
data['name'] = officer['officer']['name']
data['position'] = officer['officer']['position']
data['company_name'] = officer['officer']['company']['name']
res.append(data)
You can then go ahead to write res, which is a list of objects to a csv file.
Disclaimer: I am very new to python, and have no idea what i am doing, i am teaching myself from the web.
I have some code that looks like this
Code:
from request import get # Ed: added for clarity
myurl = URLBASE + args.key
response = get(myurl)
# check key is valid
json = response.text # Ed: this is a requests.Response object
print(json)
if json is None:
sys.exit("Problem getting API data, check your key")
print("how did i get here")
Output:
null
how did i get here
But I have no idea how that is possible ... it literally says it is null in the print, but then doesn't match in the 'if'. Any help would be appreciated.
thx
So I am sure I still don't fully understand, but this "fixes" my problem.
The requests.Response object has Property/Method json - so i should have been using that, thanks wim, instead of text. So changing the code to this (below), as suggested, makes the code work.
from request import get
myurl = URLBASE + args.key
response = get(myurl)
# check key is valid
json = response.json()
if json is None:
sys.exit("Problem getting API data, check your key")
print("how did i get here")
The question (for me inquisitively) remains, how would I do an if statement to determine if a string is null?
Thanks to Ry and wim, for their help.
I used python2 to make a request to RNAcentral database, and I read the response as JSON format by the use of this command: response.json().
This let me read the data as a dictionary data type, so I used the corresponding syntax to obtain the data from cross references, which contained some links to other databases, but when I try to make the request for each link using the command mentioned above, I can't read it as JSON, because I can only obtain the response content as HTML.
So I need to know how to read make a request to each link from cross references and read it as JSON using python language.
Here is the code:
direcc = 'http://rnacentral.org/api/v1/rna/'+code+'/?flat=true.json'
resp = requests.get(direcc)
datos=resp.json()
d={}
links = []
for diccionario in datos['xrefs']['results']:
if diccionario['taxid']==9606:
base_datos=diccionario['database']
for llave,valor in diccionario['accession'].iteritems():
d[base_datos]={'url':diccionario['accession']['url'],
'expert_db_url':diccionario['accession']['expert_db_url'],
'source_url':diccionario['accession']['source_url']}
for key,value in d.iteritems():
links.append(d[key]['expert_db_url'])
for item in links:
response = requests.get(item)
r = response.json()
And this is the error I get: ValueError: No JSON object could be decoded.
Thank you.
So I'm experimenting with json abit and this is the code I've got this far,
import json
from utorrent.client import UTorrentClient
uTorrent = UTorrentClient("xxxx", "xxxx", "xxxx")
data = uTorrent.list()
torrents = json.loads(data)["torrents"]
for torrent in torrents:
print item[0] # hash
print item[2] # name
print item[21] # status
print item[26] # folder
The typical json output can be viewed here. But im getting an "expected string or buffer" error. Anyone with any pointers?
The point with above code is to print out each hash/name.. for each torrent found in the list provided by uTorrent
Did you try using load instead of loads? I was having the same problem and I realized there's a difference.
I have a URL:
http://somewhere.com/relatedqueries?limit=2&query=seedterm
where modifying the inputs, limit and query, will generate wanted data. Limit is the max number of term possible and query is the seed term.
The URL provides text result formatted in this way:
oo.visualization.Query.setResponse({version:'0.5',reqId:'0',status:'ok',sig:'1303596067112929220',table:{cols:[{id:'score',label:'Score',type:'number',pattern:'#,##0.###'},{id:'query',label:'Query',type:'string',pattern:''}],rows:[{c:[{v:0.9894380670262618,f:'0.99'},{v:'newterm1'}]},{c:[{v:0.9894380670262618,f:'0.99'},{v:'newterm2'}]}],p:{'totalResultsCount':'7727'}}});
I'd like to write a python script that takes two arguments (limit number and the query seed), go fetch the data online, parse the result and return a list with the new terms ['newterm1','newterm2'] in this case.
I'd love some help, especially with the URL fetching since I have never done this before.
It sounds like you can break this problem up into several subproblems.
Subproblems
There are a handful of problems that need to be solved before composing the completed script:
Forming the request URL: Creating a configured request URL from a template
Retrieving data: Actually making the request
Unwrapping JSONP: The returned data appears to be JSON wrapped in a JavaScript function call
Traversing the object graph: Navigating through the result to find the desired bits of information
Forming the request URL
This is just simple string formatting.
url_template = 'http://somewhere.com/relatedqueries?limit={limit}&query={seedterm}'
url = url_template.format(limit=2, seedterm='seedterm')
Python 2 Note
You will need to use the string formatting operator (%) here.
url_template = 'http://somewhere.com/relatedqueries?limit=%(limit)d&query=%(seedterm)s'
url = url_template % dict(limit=2, seedterm='seedterm')
Retrieving data
You can use the built-in urllib.request module for this.
import urllib.request
data = urllib.request.urlopen(url) # url from previous section
This returns a file-like object called data. You can also use a with-statement here:
with urllib.request.urlopen(url) as data:
# do processing here
Python 2 Note
Import urllib2 instead of urllib.request.
Unwrapping JSONP
The result you pasted looks like JSONP. Given that the wrapping function that is called (oo.visualization.Query.setResponse) doesn't change, we can simply strip this method call out.
result = data.read()
prefix = 'oo.visualization.Query.setResponse('
suffix = ');'
if result.startswith(prefix) and result.endswith(suffix):
result = result[len(prefix):-len(suffix)]
Parsing JSON
The resulting result string is just JSON data. Parse it with the built-in json module.
import json
result_object = json.loads(result)
Traversing the object graph
Now, you have a result_object that represents the JSON response. The object itself be a dict with keys like version, reqId, and so on. Based on your question, here is what you would need to do to create your list.
# Get the rows in the table, then get the second column's value for
# each row
terms = [row['c'][2]['v'] for row in result_object['table']['rows']]
Putting it all together
#!/usr/bin/env python3
"""A script for retrieving and parsing results from requests to
somewhere.com.
This script works as either a standalone script or as a library. To use
it as a standalone script, run it as `python3 scriptname.py`. To use it
as a library, use the `retrieve_terms` function."""
import urllib.request
import json
import sys
E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2
def parse_result(result):
"""Parse a JSONP result string and return a list of terms"""
prefix = 'oo.visualization.Query.setResponse('
suffix = ');'
# Strip JSONP function wrapper
if result.startswith(prefix) and result.endswith(suffix):
result = result[len(prefix):-len(suffix)]
# Deserialize JSON to Python objects
result_object = json.loads(result)
# Get the rows in the table, then get the second column's value
# for each row
return [row['c'][2]['v'] for row in result_object['table']['rows']]
def retrieve_terms(limit, seedterm):
"""Retrieves and parses data and returns a list of terms"""
url_template = 'http://somewhere.com/relatedqueries?limit={limit}&query={seedterm}'
url = url_template.format(limit=limit, seedterm=seedterm)
try:
with urllib.request.urlopen(url) as data:
data = perform_request(limit, seedterm)
result = data.read()
except:
print('Could not request data from server', file=sys.stderr)
exit(E_OPERATION_ERROR)
terms = parse_result(result)
print(terms)
def main(limit, seedterm):
"""Retrieves and parses data and prints each term to standard output"""
terms = retrieve_terms(limit, seedterm)
for term in terms:
print(term)
if __name__ == '__main__'
try:
limit = int(sys.argv[1])
seedterm = sys.argv[2]
except:
error_message = '''{} limit seedterm
limit must be an integer'''.format(sys.argv[0])
print(error_message, file=sys.stderr)
exit(2)
exit(main(limit, seedterm))
Python 2.7 version
#!/usr/bin/env python2.7
"""A script for retrieving and parsing results from requests to
somewhere.com.
This script works as either a standalone script or as a library. To use
it as a standalone script, run it as `python2.7 scriptname.py`. To use it
as a library, use the `retrieve_terms` function."""
import urllib2
import json
import sys
E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2
def parse_result(result):
"""Parse a JSONP result string and return a list of terms"""
prefix = 'oo.visualization.Query.setResponse('
suffix = ');'
# Strip JSONP function wrapper
if result.startswith(prefix) and result.endswith(suffix):
result = result[len(prefix):-len(suffix)]
# Deserialize JSON to Python objects
result_object = json.loads(result)
# Get the rows in the table, then get the second column's value
# for each row
return [row['c'][2]['v'] for row in result_object['table']['rows']]
def retrieve_terms(limit, seedterm):
"""Retrieves and parses data and returns a list of terms"""
url_template = 'http://somewhere.com/relatedqueries?limit=%(limit)d&query=%(seedterm)s'
url = url_template % dict(limit=2, seedterm='seedterm')
try:
with urllib2.urlopen(url) as data:
data = perform_request(limit, seedterm)
result = data.read()
except:
sys.stderr.write('%s\n' % 'Could not request data from server')
exit(E_OPERATION_ERROR)
terms = parse_result(result)
print terms
def main(limit, seedterm):
"""Retrieves and parses data and prints each term to standard output"""
terms = retrieve_terms(limit, seedterm)
for term in terms:
print term
if __name__ == '__main__'
try:
limit = int(sys.argv[1])
seedterm = sys.argv[2]
except:
error_message = '''{} limit seedterm
limit must be an integer'''.format(sys.argv[0])
sys.stderr.write('%s\n' % error_message)
exit(2)
exit(main(limit, seedterm))
i didn't understand well your problem because from your code there it seem to me that you use Visualization API (it's the first time that i hear about it by the way).
But well if you are just searching for a way to fetch data from a web page you could use urllib2 this is just for getting data, and if you want to parse the retrieved data you will have to use a more appropriate library like BeautifulSoop
if you are dealing with another web service (RSS, Atom, RPC) rather than web pages you can find a bunch of python library that you can use and that deal with each service perfectly.
import urllib2
from BeautifulSoup import BeautifulSoup
result = urllib2.urlopen('http://somewhere.com/relatedqueries?limit=%s&query=%s' % (2, 'seedterm'))
htmletxt = resul.read()
result.close()
soup = BeautifulSoup(htmltext, convertEntities="html" )
# you can parse your data now check BeautifulSoup API.