Python JSON KeyError for non-missing key

When I run the script below, the following error is returned along with the desired output. This was working without any errors last night, and although the API output changes every minute, I wouldn't expect a KeyError. I can't pinpoint where this error is coming from:
[u'#AAPL 151204C00128000'] <----- What I want to see printed
Traceback (most recent call last):
  File "Options_testing.py", line 60, in <module>
    main()
  File "Options_testing.py", line 56, in main
    if quotes[x]['greeks']['impvol'] > 0: #change this for different greek vals
KeyError: 'impvol'
Here is a little snippet of data:
{"results":{"optionchain":{"expire":"all","excode":"oprac","equityinfo":{"longname":"Apple Inc","shortname":"AAPL"},"money":"at","callput":"all","key":{"symbol":["AAPL"],"exLgName":"Nasdaq Global Select","exShName":"NGS","exchange":"NGS"},"symbolstring":"AAPL"},"quote":[{"delaymin":15,"contract":{"strike":108,"openinterest":3516,"contracthigh":6.16,"contractlow":0.02,"callput":"Put","type":"WEEK","expirydate":"2015-11-13"},"root":{"equityinfo":{"longname":"Apple Inc","shortname":"AAPL"},"key":{"symbol":["AAPL"],"exLgName":"Nasdaq Global Select","exShName":"NGS","exchange":"NGS"}},"greeks":{"vega":0,"theta":0,"gamma":0,"delta":0,"impvol":0,"rho":0}
Code:
#Options screener using Quotemedia's API
import json
import requests
#import csv

def main():
    url_auth = "https://app.quotemedia.com/user/g/authenticate/v0/102368/XXXXX/XXXXX"
    decode_auth = requests.get(url_auth)
    #print decode_auth.json()
    #print(type(decode_auth))
    auth_data = json.dumps(decode_auth.json())
    #Parse decode_auth, grab 'sid'
    sid_parsed = json.loads(auth_data)["sid"]
    #print sid_parsed

    #Pass sid into qm_options
    #Construct URL
    symbol = 'AAPL'
    SID = sid_parsed
    url_raw = 'http://app.quotemedia.com/data/getOptionQuotes.json?webmasterId=102368'
    url_data = url_raw + '&symbol=' + symbol + '&greeks=true' + '&SID=' + SID
    #print url_data
    response = requests.get(url_data)
    #print response
    data = json.dumps(response.json())
    #print data
    #save data to a file
    with open('AAPL_20151118.json', 'w') as outfile:
        json.dumps(data, outfile)
    #Turn into json object
    obj = json.loads(data)
    #slim the object
    quotes = obj['results']['quote']
    #find the number of options contracts
    range_count = obj['results']['symbolcount']
    #print all contracts with an implied vol > 0
    for x in range(0, range_count):
        if quotes[x]['greeks']['impvol'] > 0: #change this for different greek vals
            print quotes[x]['key']['symbol']

if __name__ == '__main__':
    main()
I can provide sample data if necessary.

for x in range(0, range_count):
    if quotes[x]['greeks']['impvol'] > 0: #change this for different greek vals
        print quotes[x]['key']['symbol']
This loops through multiple quotes, so maybe there is just one that does not have an impvol property.
You should add some error handling so you find out when that happens. Something like this:
# no need to iterate over indexes, just iterate over the items
for quote in quotes:
    if 'greeks' not in quote:
        print 'Quote does not contain `greeks`:', quote
    elif 'impvol' not in quote['greeks']:
        print 'Quote does not contain `impvol`:', quote
    elif quote['greeks']['impvol'] > 0:
        print quote['key']['symbol']
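If you would rather silently skip incomplete quotes than report them, dict.get with a default works too. A minimal sketch along the same lines (not part of the original answer):

# treat a missing 'greeks' or 'impvol' as 0 and skip the quote
for quote in quotes:
    if quote.get('greeks', {}).get('impvol', 0) > 0:
        print quote['key']['symbol']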

Related

TypeError: 'in <string>' requires string as left operand, not dict

I want to compare two request JSON responses and store some items from them in a variable (you can run the code and see the first response), but I'm getting this error, which I have never seen before and don't know how to work with. It occurs on the if item not in liveRN: line.
My code is:
import requests
import json
import time

getFirst = requests.get("https://api-mainnet.magiceden.dev/v2/collections?offset=0&limit=1")
liveRN = json.dumps(getFirst.json(), indent=4)

while True:
    new = []
    get = requests.get("https://api-mainnet.magiceden.dev/v2/collections?offset=0&limit=2")
    dataPretty = json.dumps(get.json(), indent=4)
    data = get.json()
    print(data)
    symbol = data[0]['symbol']
    img = data[0]['image']
    name = data[0]['name']
    print(symbol)
    print(img)
    print(name)
    if get.status_code == 200:
        print("ok")
    for item in data:
        if item not in liveRN:
            print(f"Found difference: {item}")
            liveRN.append(item)
    print(new)
    time.sleep(15)
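The traceback points at the comparison itself: json.dumps returns a str, so if item not in liveRN is a substring test with a dict as the left operand, which raises this TypeError, and a str has no .append either. A minimal sketch of the likely fix, assuming the goal is to diff the parsed lists (an illustration, not a posted answer):

import time
import requests

# keep the baseline as parsed JSON (a list of dicts), not a dumped string
liveRN = requests.get("https://api-mainnet.magiceden.dev/v2/collections?offset=0&limit=1").json()

while True:
    data = requests.get("https://api-mainnet.magiceden.dev/v2/collections?offset=0&limit=2").json()
    for item in data:
        if item not in liveRN:  # dict-in-list membership test is fine
            print(f"Found difference: {item}")
            liveRN.append(item)  # list has .append, str does not
    time.sleep(15)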

Unusual error while running Python script for fetching information from an API

I have a Python script for extracting balance sheet reports in a loop for multiple locations through API GET requests. I have set up an else statement to print all the location IDs that fetch me no JSON data.
Sometimes the loop works all the way to the end and gets the final report, but most of the time the code throws the error below and stops running:
Traceback (most recent call last):
  File "<ipython-input-2-85715734b89c>", line 1, in <module>
    runfile('C:/Users/PVarimalla/.spyder-py3/temp.py', wdir='C:/Users/PVarimalla/.spyder-py3')
  File "C:\Users\PVarimalla\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
  File "C:\Users\PVarimalla\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/PVarimalla/.spyder-py3/temp.py", line 107, in <module>
    dict1 = json.loads(json_data)
  File "C:\Users\PVarimalla\Anaconda3\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\PVarimalla\Anaconda3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\PVarimalla\Anaconda3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value
For example:
Perfect run: it prints 15 location IDs out of 50 that it couldn't fetch JSON data for, and I get a dataframe with all the other franchisees' balance sheets appended.
Incorrect runs: each time I run the script it prints 5, 6, or 3 location IDs that it couldn't fetch JSON data for and then stops with the above error.
I don't understand why the script runs perfectly sometimes and behaves weirdly the rest of the time (most of the time). Is it my internet connection or an issue with Spyder 3.7?
I don't think there is an error in my script, but I'm unsure why I'm facing this issue. Please help me with this.
Below is the code:
import requests
import json
#import DataFrame
import pandas as pd
#from pandas.io.json import json_normalize
#import json_normalize

access_token = 'XXXXXXXXX'
url = 'https://api.XXXX.com/v1/setup'
url_company = "https://api.*****.com/v1/Reporting/ProfitAndLoss?CompanyId=1068071&RelativeDateRange=LastMonth&DateFrequency=Monthly&UseAccountMapping=true&VerticalAnalysisType=None"
url_locations_trend = "https://api.*****.com/v1/location/search?CompanyId=1068071"
url_locations_mu = "https://api.*****.com/v1/location/search?CompanyId=2825826"
url_locations_3yrs = "https://api.qvinci.com/v1/location/search?CompanyId=1328328"

ult_result = requests.get(url_locations_trend,
                          headers={'X-apiToken': '{}'.format(access_token)})
#decoded_result= result.read().decode("UTF-8")
json_data_trend = ult_result.text
dict_trend = json.loads(json_data_trend)
locations_trend = {}
#Name
locations_trend["Name"] = []
for i in dict_trend["Items"]:
    locations_trend["Name"].append(i["Name"])
#ID
locations_trend["ID"] = []
for i in dict_trend["Items"]:
    locations_trend["ID"].append(i["Id"])
#creates dataframe for locations under trend transformations
df_trend = pd.DataFrame(locations_trend)

#making a call to get locations data for under 3 yrs
ul3_result = requests.get(url_locations_3yrs,
                          headers={'X-apiToken': '{}'.format(access_token)})
#decoded_result= result.read().decode("UTF-8")
json_data_3yrs = ul3_result.text
dict_3yrs = json.loads(json_data_3yrs)
locations_3yrs = {}
#Name
locations_3yrs["Name"] = []
for i in dict_3yrs["Items"]:
    locations_3yrs["Name"].append(i["Name"])
#ID
locations_3yrs["ID"] = []
for i in dict_3yrs["Items"]:
    locations_3yrs["ID"].append(i["Id"])
#creates dataframe for locations under 3 yrs
df_3yrs = pd.DataFrame(locations_3yrs)

#making a call to get locations data for multi-unit
ulm_result = requests.get(url_locations_mu,
                          headers={'X-apiToken': '{}'.format(access_token)})
#decoded_result= result.read().decode("UTF-8")
json_data_mu = ulm_result.text
dict_mu = json.loads(json_data_mu)
locations_mu = {}
#Name
locations_mu["Name"] = []
for i in dict_mu["Items"]:
    locations_mu["Name"].append(i["Name"])
#ID
locations_mu["ID"] = []
for i in dict_mu["Items"]:
    locations_mu["ID"].append(i["Id"])
#creates dataframe for multi-unit locations
df_mu = pd.DataFrame(locations_mu)

locations_df = pd.concat([df_mu, df_3yrs, df_trend])

df_final = pd.DataFrame()
count = 0
for i in locations_df["ID"]:
    if count < 3:
        url_bs = "https://api.******.com/v1/Reporting/BalanceSheet?DateFrequency=Monthly&UseAccountMapping=true&VerticalAnalysisType=None&IncludeComputedColumns=true&RelativeDateRange=LastTwoCYTD&UseCustomDateRange=false&CompanyId=2825826&Locations=" + i
    elif 2 < count < 12:
        url_bs = "https://api.******.com/v1/Reporting/BalanceSheet?DateFrequency=Monthly&UseAccountMapping=true&VerticalAnalysisType=None&IncludeComputedColumns=true&RelativeDateRange=LastTwoCYTD&UseCustomDateRange=false&CompanyId=1328328&Locations=" + i
    else:
        url_bs = "https://api.******.com/v1/Reporting/BalanceSheet?DateFrequency=Monthly&UseAccountMapping=true&VerticalAnalysisType=None&IncludeComputedColumns=true&RelativeDateRange=LastTwoCYTD&UseCustomDateRange=false&CompanyId=1068071&Locations=" + i
    result = requests.get(url_bs,
                          headers={'X-apiToken': '{}'.format(access_token)})
    #decoded_result= result.read().decode("UTF-8")
    json_data = result.text
    if json_data != "":
        final = {}
        dict1 = json.loads(json_data)
        final["Months"] = dict1["ReportModel"]["ColumnNames"]
        final["Location"] = [dict1["SelectedOptions"]["Locations"][0]] * len(final["Months"])
        set = {"Total 10000 Cash", "Total 12000 Inventory Asset", "Total Other Current Assets", "Total Fixed Assets", "Total ASSETS",
               "Total Accounts Payable", "Total Credit Cards", "24004 Customer Deposits", "Total Liabilities", "Total Equity", "Total Long Term Liabilities"}
        def search(dict2):
            if len(dict2["Children"]) == 0:
                return
            for i in dict2["Children"]:
                if i["Name"] in set:
                    final[i["Name"]] = []
                    for j in i["Values"]:
                        final[i["Name"]].append(j["Value"])
                search(i)
            if ("Total " + dict2["Name"]) in set:
                final["Total " + dict2["Name"]] = []
                for j in dict2["TotalRow"]["Values"]:
                    final["Total " + dict2["Name"]].append(j["Value"])
            return
        for total in dict1["ReportModel"]["TopMostRows"]:
            search(total)
        df_final = pd.concat([df_final, pd.DataFrame(final)], sort=False)
    else:
        print(i)
    count = count + 1

#exporting dataframe to csv
#df_final.to_csv(, sep='\t', encoding='utf-8')
df_final.to_csv('file1.csv')
Thank you.
You should post the code and the entire exception for a more accurate answer. However, it seems to me that the API occasionally is not returning JSON (you could, for example, be making too many requests in a very short period, so the API returns a 404).
Try printing/logging the API response before decoding to verify this.
EDIT:
Given the feedback, setting an interval between iterations should resolve your issue. You can use time.sleep(0.5) inside the for loop (remember to add import time).
You should also consider using try/except in your code so you can handle exceptions more broadly.
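A minimal sketch of both suggestions combined (the helper name, retry count, and pause length are illustrative, not part of the original answer):

import time
import requests

def fetch_json(url, token, pause=0.5, retries=3):
    # Throttle and retry: if the body is not JSON (e.g. an HTML error page),
    # wait briefly and try again instead of crashing in json.loads
    for attempt in range(retries):
        response = requests.get(url, headers={'X-apiToken': token})
        try:
            return response.json()  # raises ValueError when the body is not JSON
        except ValueError:
            time.sleep(pause)
    return None  # caller can log the location ID when this happens

# usage inside the question's loop:
#   dict1 = fetch_json(url_bs, access_token)
#   if dict1 is None:
#       print(i)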

TypeError: must be str, not bytes in IBM Watson

I just finished the CodeAcademy IBM Watson course, which was written in Python 2. When I brought the file over to Python 3, I kept getting this error. The script and all the credentials worked fine on CodeAcademy. Is this because I'm working in Python 3, or is it an issue in the code?
Traceback (most recent call last):
  File "c:\Users\Guppy\Programs\PythonCode\Celebrity Match\CelebrityMatch.py", line 58, in <module>
    user_result = analyze(user_handle)
  File "c:\Users\Guppy\Programs\PythonCode\Celebrity Match\CelebrityMatch.py", line 22, in analyze
    text += status.text.encode('utf-8')
TypeError: must be str, not bytes
Does anyone know what's wrong? The code is below:
import sys
import operator
import requests
import json
import twitter
from watson_developer_cloud import PersonalityInsightsV2 as PersonalityInsights

def analyze(handle):
    twitter_consumer_key = '<key>'
    twitter_consumer_secret = '<secret>'
    twitter_access_token = '<token>'
    twitter_access_secret = '<secret>'
    twitter_api = twitter.Api(consumer_key=twitter_consumer_key, consumer_secret=twitter_consumer_secret, access_token_key=twitter_access_token, access_token_secret=twitter_access_secret)
    statuses = twitter_api.GetUserTimeline(screen_name=handle, count=200, include_rts=False)
    text = ""
    for status in statuses:
        if (status.lang == 'en'): #English tweets
            text += status.text.encode('utf-8')
    #The IBM Bluemix credentials for Personality Insights!
    pi_username = '<username>'
    pi_password = '<password>'
    personality_insights = PersonalityInsights(username=pi_username, password=pi_password)
    pi_result = personality_insights.profile(text)
    return pi_result

def flatten(orig):
    data = {}
    for c in orig['tree']['children']:
        if 'children' in c:
            for c2 in c['children']:
                if 'children' in c2:
                    for c3 in c2['children']:
                        if 'children' in c3:
                            for c4 in c3['children']:
                                if (c4['category'] == 'personality'):
                                    data[c4['id']] = c4['percentage']
                        if 'children' not in c3:
                            if (c3['category'] == 'personality'):
                                data[c3['id']] = c3['percentage']
    return data

def compare(dict1, dict2):
    compared_data = {}
    for keys in dict1:
        if dict1[keys] != dict2[keys]:
            compared_data[keys] = abs(dict1[keys] - dict2[keys])
    return compared_data

user_handle = "#itsguppythegod"
celebrity_handle = "#giselleee_____"
user_result = analyze(user_handle)
celebrity_result = analyze(celebrity_handle)
user = flatten(user_result)
celebrity = flatten(celebrity_result)
compared_results = compare(user, celebrity)
sorted_result = sorted(compared_results.items(), key=operator.itemgetter(1))
for keys, value in sorted_result[:5]:
    print(keys, end=" ")
    print(user[keys], end=" ")
    print('->', end=" ")
    print(celebrity[keys], end=" ")
    print('->', end=" ")
    print(compared_results[keys])
You created a str (unicode text) object here:
text = ""
and then proceed to append UTF-8 encoded bytes:
text += status.text.encode('utf-8')
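On Python 3 those same two lines fail immediately, which is the traceback above in miniature (illustrative):

text = ""
text += "some tweet".encode('utf-8')  # TypeError: must be str, not bytes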
In Python 2, "" created a bytestring, and that was all fine (although you were then posting UTF-8 bytes to a service that would interpret them as Latin-1; see the API documentation).
To fix this, don't encode the status texts until you are done collecting all the tweets. In addition, tell Watson to expect UTF-8 data. Last but not least, you should really build a list of tweet texts first and concatenate them in one step later on with str.join(), as concatenating strings in a loop takes quadratic time:
text = []
for status in statuses:
    if (status.lang == 'en'): #English tweets
        text.append(status.text)

# ...

personality_insights = PersonalityInsights(username=pi_username, password=pi_password)
pi_result = personality_insights.profile(
    ' '.join(text).encode('utf8'),
    content_type='text/plain; charset=utf-8'
)

TypeError: unorderable types: NoneType() > int()

I am new to Python 3, and am getting the following error trying to read earthquake data from the last day:
Traceback (most recent call last):
  File "Z:\Python learning\Up and run\Exercise Files\Ch5\jsondata_finished.py", line 54, in <module>
    main()
  File "Z:\Python learning\Up and run\Exercise Files\Ch5\jsondata_finished.py", line 49, in main
    printResults(data)
  File "Z:\Python learning\Up and run\Exercise Files\Ch5\jsondata_finished.py", line 33, in printResults
    if (feltReports != None) & (feltReports > 0):
TypeError: unorderable types: NoneType() > int()
I am unable to identify the error. Here is my code:
import urllib.request
import json

def printResults(data):
    # Use the json module to load the string data into a dictionary
    theJSON = json.loads(data.decode())
    # now we can access the contents of the JSON like any other Python object
    if "title" in theJSON["metadata"]:
        print(theJSON["metadata"]["title"])
    # output the number of events, plus the magnitude and each event name
    count = theJSON["metadata"]["count"]
    print(str(count) + " events recorded")
    # for each event, print the place where it occurred
    for i in theJSON["features"]:
        print(i["properties"]["place"])
    # print the events that only have a magnitude greater than 4
    # for i in theJSON["features"]:
    #     if i["properties"]["mag"] >= 4.0:
    #         print "%2.1f" % i["properties"]["mag"], i["properties"]["place"]
    # print only the events where at least 1 person reported feeling something
    print("Events that were felt:")
    for i in theJSON["features"]:
        feltReports = i["properties"]["felt"]
        if (feltReports != None) & (feltReports > 0):
            print("%2.1f" % i["properties"]["mag"], i["properties"]["place"], " reported " + str(feltReports) + " times")

def main():
    # define a variable to hold the source URL
    # In this case we'll use the free data feed from the USGS
    # This feed lists all earthquakes for the last day larger than Mag 2.5
    urlData = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson"
    # Open the URL and read the data
    webUrl = urllib.request.urlopen(urlData)
    print(webUrl.getcode())
    if (webUrl.getcode() == 200):
        data = webUrl.read()
        # print out our customized results
        printResults(data)
    else:
        print("Received an error from server, cannot retrieve results " + str(webUrl.getcode()))

if __name__ == "__main__":
    main()
Please help! I have tried a few things by looking at other users' solutions, but I still get the same error over and over again.
Use and instead of &; unlike and, & does not short-circuit, so feltReports > 0 is still evaluated when feltReports is None. Also, use is not None to check whether an object is not None, instead of != None.
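Applied to the check in printResults, that gives (a one-line sketch of the same suggestion):

if feltReports is not None and feltReports > 0:
    print("%2.1f" % i["properties"]["mag"], i["properties"]["place"], " reported " + str(feltReports) + " times")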

python tweet parsing

I'm trying to parse tweet data.
My data shape is as follows:
59593936 3061025991 null null <d>2009-08-01 00:00:37</d> <s><a href="http://help.twitter.com/index.php?pg=kb.page&id=75" rel="nofollow">txt</a></s> <t>honda just recalled 440k accords...traffic around here is gonna be light...win!!</t> ajc8587 15 24 158 -18000 0 0 <n>adrienne conner</n> <ud>2009-07-23 21:27:10</ud> <t>eastern time (us & canada)</t> <l>ga</l>
22020233 3061032620 null null <d>2009-08-01 00:01:03</d> <s><a href="http://alexking.org/projects/wordpress" rel="nofollow">twitter tools</a></s> <t>new blog post: honda recalls 440k cars over airbag risk http://bit.ly/2wsma</t> madcitywi 294 290 9098 -21600 0 0 <n>madcity</n> <ud>2009-02-26 15:25:04</ud> <t>central time (us & canada)</t> <l>madison, wi</l>
I want to get the total number of tweets and the number of keyword-related tweets. I prepared the keywords in a text file. In addition, I want to get the tweet text contents and the total numbers of tweets that contain a mention (#), a retweet (RT), and a URL (I want to save every URL in a separate file).
So I coded it like this:
import time
import os

total_tweet_count = 0
related_tweet_count = 0
rt_count = 0
mention_count = 0
URLs = {}

def get_keywords(filepath):
    with open(filepath) as f:
        for line in f:
            yield line.split()

for line in open('/nas/minsu/2009_06.txt'):
    tweet = line.strip()
    total_tweet_count += 1
    with open('./related_tweets.txt', 'a') as save_file_1:
        keywords = get_keywords('./related_keywords.txt', 'r')
        if keywords in line:
            text = line.split('<t>')[1].split('</t>')[0]
            if 'http://' in text:
                try:
                    url = text.split('http://')[1].split()[0]
                    url = 'http://' + url
                    if url not in URLs:
                        URLs[url] = []
                    URLs[url].append('\t' + text)
                    save_file_3 = open('./URLs_in_related_tweets.txt', 'a')
                    print >> save_file_3, URLs
                except:
                    pass
            if '#' in text:
                mention_count += 1
            if 'RT' in text:
                rt_count += 1
            related_tweet_count += 1
            print >> save_file_1, text

save_file_2 = open('./info_related_tweets.txt', 'w')
print >> save_file_2, str(total_tweet_count) + '\t' + srt(related_tweet_count) + '\t' + str(mention_count) + '\t' + str(rt_count)
save_file_1.close()
save_file_2.close()
save_file_3.close()
The keyword set looks like this:
Happy
Hello
Together
I think my code has many problems, but the first error is as follows:
Traceback (most recent call last):
  File "health_related_tweets.py", line 21, in <module>
    keywords = get_keywords('./public_health_related_words.txt', 'r')
TypeError: get_keywords() takes exactly 1 argument (2 given)
Please help me out!
The issue is self-explanatory from the error: you have passed two arguments in your call to get_keywords(), but your implementation only takes one parameter. You should change your get_keywords implementation to something like:
def get_keywords(filepath, mode):
    with open(filepath, mode) as f:
        for line in f:
            yield line.split()
Then you can use the following line without that specific error:
keywords = get_keywords('./related_keywords.txt', 'r')
Now you are getting this error:
Traceback (most recent call last):
  File "health_related_tweets.py", line 23, in <module>
    if keywords in line:
TypeError: 'in <string>' requires string as left operand, not generator
The reason is that keywords = get_keywords(...) returns a generator. Logically, keywords should be a list of all the keywords, and for each keyword in this list you want to check whether it's in the tweet/line or not.
Sample code:
keywords = get_keywords('./related_keywords.txt', 'r')
has_keyword = False
for keyword in keywords:
if keyword in line:
has_keyword = True
break
if has_keyword:
# Your code here (for the case when the line has at least one keyword)
(The above code would be replacing if keywords in line:)
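One caveat with the sketch above: get_keywords yields line.split(), i.e. a list per line, so keyword would still be a list and in would raise a similar TypeError. Yielding individual words avoids that (a sketch, assuming one or more keywords per line in the file):

def get_keywords(filepath, mode):
    # yield each whitespace-separated keyword as its own string
    with open(filepath, mode) as f:
        for line in f:
            for word in line.split():
                yield word

keywords = list(get_keywords('./related_keywords.txt', 'r'))
has_keyword = any(keyword in line for keyword in keywords)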
