Geopy with error handling - python

I have some Python code with error handling in place but for some reason the code still seems to be unable to handle this particular error:
raise GQueryError("No corresponding geographic location could be found for the specified location, possibly because the address is relatively new, or because it may be incorrect.")
geopy.geocoders.google.GQueryError: No corresponding geographic location could be found for the specified location, possibly because the address is relatively new, or because it may be incorrect.
This is the source:
import csv
from geopy import geocoders
import time

g = geocoders.Google()
spamReader = csv.reader(open('locations.csv', 'rb'), delimiter='\t', quotechar='|')
f = open("output.txt", 'w')
for row in spamReader:
    a = ', '.join(row)
    #exactly_one = False
    time.sleep(1)
    try:
        place, (lat, lng) = g.geocode(a)
    except ValueError:
        #print("Error: geocode failed on input %s with message %s" % (a, error_message))
        continue
    b = str(place) + "," + str(lat) + "," + str(lng) + "\n"
    print b
    f.write(b)
Have I not included enough error handling? I was under the impression that "except ValueError" would handle this situation but I must be wrong on that.
Thanks in advance for any help!
P.S. I pulled this out of the code but I don't know what it really means yet:
def check_status_code(self, status_code):
    if status_code == 400:
        raise GeocoderResultError("Bad request (Server returned status 400)")
    elif status_code == 500:
        raise GeocoderResultError("Unkown error (Server returned status 500)")
    elif status_code == 601:
        raise GQueryError("An empty lookup was performed")
    elif status_code == 602:
        raise GQueryError("No corresponding geographic location could be found for the specified location, possibly because the address is relatively new, or because it may be incorrect.")
    elif status_code == 603:
        raise GQueryError("The geocode for the given location could be returned due to legal or contractual reasons")
    elif status_code == 610:
        raise GBadKeyError("The api_key is either invalid or does not match the domain for which it was given.")
    elif status_code == 620:
        raise GTooManyQueriesError("The given key has gone over the requests limit in the 24 hour period or has submitted too many requests in too short a period of time.")

Right now the try/except is only catching ValueErrors. To catch GQueryError as well, replace the except ValueError: line with:
except (ValueError, GQueryError):
Or if GQueryError isn't in your namespace, you may need something like this:
except (ValueError, geocoders.google.GQueryError):
Or to catch ValueError and all the errors listed in check_status_code:
except (ValueError, GQueryError, GeocoderResultError,
        GBadKeyError, GTooManyQueriesError):
(Again, add geocoders.google. or whatever the error location is to the front of all the geopy errors if they're not in your namespace.)
Or, if you just want to catch all possible exceptions, you could simply do:
except:
but this is generally bad practice, because a bare except will also swallow unrelated problems (for example, a NameError caused by a typo in your place, (lat, lng) = g.geocode(a) line) that you don't want caught. It's better to examine the geopy code to find all the possible exceptions it could be throwing that you want to catch. Hopefully all of those are listed in that bit of code you found.
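Putting it together, here is a minimal sketch of the loop with the broader except clause, assuming the old geopy API from the question (where, as noted above, the error classes live in geopy.geocoders.google):

import csv
from geopy import geocoders
import time

g = geocoders.Google()
spamReader = csv.reader(open('locations.csv', 'rb'), delimiter='\t', quotechar='|')
f = open("output.txt", 'w')
for row in spamReader:
    a = ', '.join(row)
    time.sleep(1)
    try:
        place, (lat, lng) = g.geocode(a)
    except (ValueError, geocoders.google.GQueryError):
        # Skip any row that cannot be geocoded and move on to the next one.
        continue
    b = str(place) + "," + str(lat) + "," + str(lng) + "\n"
    print b
    f.write(b)
f.close()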

Related

Python function not failing and returning none

I am writing robot test cases with python functions.
I have a python function like this
import json

def ParseJSONUserData(jsonstring):
    if len(jsonstring) == 0:
        print("String is empty. No processing is possible.")
    json_dict = json.loads(jsonstring)
    if len(json_dict) == 0:
        print("No data found in file.")
    currentdt = json_dict[CURRENTDATETIME]
    if len(currentdt) == 0:
        print("The Current datetime property is empty")
    print("Systems found: ", len(json_dict[SYSTEMS]))
    if len(json_dict[SYSTEMS]) == 0:
        print("No systems found!")
    for system_result in json_dict[SYSTEMS]:
        if system_result[SERIALNUM] in systems:
            print("Duplicate systemserialnumber value: " + system_result[SERIALNUM])
        systems.add(system_result[SERIALNUM])
        if len(system_result[PUBKEYS][0]["pubkey"]) == 0:
            print("No pubkey found for system: " + system_result[SERIALNUM])
        if len(system_result[PUBKEYS][0]["system"]) == 0:
            print("No pubkey system designator (typically intraop) found for system: " + system_result[SERIALNUM])
This is my robot framework code.
${response}=    GET    ${host}/data    ${data}    headers=${header}
${op}=    ParseJSONUserData    response.json()
Log to console    ${op}
What I want is for the Robot test case to fail whenever any one of the validations in the Python function fails. But even if I pass wrong data, the Python function runs to completion and the Robot test case still passes. Any help will be much appreciated.
For a keyword to fail, it must throw an exception. Instead of printing a string, raise an error. For example:
if len(jsonstring) == 0:
    raise Exception("String is empty. No processing is possible.")
For more information see Reporting keyword status in the Robot Framework user guide, where it says this:
Reporting keyword status is done simply using exceptions. If an executed method raises an exception, the keyword status is FAIL, and if it returns normally, the status is PASS.
Normal execution failures and errors can be reported using the standard exceptions such as AssertionError, ValueError and RuntimeError. There are, however, some special cases explained in the subsequent sections where special exceptions are needed.
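Applied to the function in the question, a sketch could look like the following. It assumes, as in the original code, that CURRENTDATETIME, SYSTEMS, SERIALNUM, PUBKEYS and the systems set are defined elsewhere in the same library; the choice of ValueError and the return value are illustrative only:

import json

def ParseJSONUserData(jsonstring):
    if len(jsonstring) == 0:
        raise ValueError("String is empty. No processing is possible.")
    json_dict = json.loads(jsonstring)  # itself raises an error on malformed JSON
    if len(json_dict) == 0:
        raise ValueError("No data found in file.")
    if len(json_dict[CURRENTDATETIME]) == 0:
        raise ValueError("The Current datetime property is empty")
    if len(json_dict[SYSTEMS]) == 0:
        raise ValueError("No systems found!")
    for system_result in json_dict[SYSTEMS]:
        if system_result[SERIALNUM] in systems:
            raise ValueError("Duplicate systemserialnumber value: " + system_result[SERIALNUM])
        systems.add(system_result[SERIALNUM])
        if len(system_result[PUBKEYS][0]["pubkey"]) == 0:
            raise ValueError("No pubkey found for system: " + system_result[SERIALNUM])
    # Returning a value gives ${op} something meaningful to log in the Robot test.
    return len(json_dict[SYSTEMS])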

How to define multiple error handling statements?

I would like to read a set of csv files from a URL as dataframes. These files contain a date in their names, like YYYYMMDD.csv. I need to iterate over a set of predefined dates and read the corresponding file into a Python dataframe.
Sometimes the file does not exist and an error as follows is thrown:
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
What I would do in this situation is add one day to the date, turning 2020-05-01 into 2020-05-02, and if the aforementioned error is thrown again, add 2 days, or at most 3 days, until there is a URL available without an error.
I would like to know how I can write this in a program, maybe with nested try/except, where if adding 1 day to the date leads to a URL without an error, the subsequent steps are not executed.
As I don't have the data I will use the following URL as an example:
import io

import pandas as pd
import requests

url = 'http://winterolympicsmedals.com/medals.csv'
s = requests.get(url).content
c = pd.read_csv(io.StringIO(s.decode('utf-8')))  # wrap the raw bytes so read_csv gets a file-like object
Here the file being read is medals.csv. If you try madels.csv or modals.csv you will get the error I am talking about. So I need to know how I can control the errors in 3 steps by replacing the file name until I get the desired dataframe: first we try madels.csv, resulting in an error, then modals.csv, also resulting in an error, and after that medals.csv, which gives the desired output.
My problem is that sometimes the modified file name also fails inside the except block, so I need to know how I can accommodate a second (and third) modification.
No need to do any nested try-except blocks, all you need is one try-except and a for loop.
First, a function that tries to read a file (it returns the content of the file, or None if the file is not found):
def read_file(fp):
    try:
        with open(fp, 'r') as f:
            text = f.read()
        return text
    except Exception as e:
        print(e)
        return None
Then, a function that tries to find a file for a predefined date (an example input would be '20220514'). The function tries to read the content of the file with the given date, or with dates up to 3 days after it:
from datetime import datetime, timedelta

def read_from_predefined_date(date):
    date_format = '%Y%m%d'
    date = datetime.strptime(date, date_format)
    result = None
    for i in range(4):
        date_to_read = date + timedelta(days=i)
        date_as_string = date_to_read.strftime(date_format)
        fp = f'data/{date_as_string}.csv'
        result = read_file(fp)
        if result:
            break
    return result
To test, create e.g. a data/20220515.csv file and run the following code:
d = '20220514'
result = read_from_predefined_date(d)
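Since the question is about URLs rather than local files, read_file can be swapped for a URL-based variant along these lines (a sketch, assuming the requests library; the read_url name is illustrative). read_from_predefined_date then only needs its fp = ... line changed to build a URL instead of a local path:

import io

import pandas as pd
import requests

def read_url(url):
    # Return a dataframe for the given URL, or None if the file is not there.
    try:
        response = requests.get(url)
        response.raise_for_status()  # turn a 404 into an exception
        return pd.read_csv(io.StringIO(response.text))
    except requests.exceptions.RequestException as e:
        print(e)
        return None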
Here's a simple function that, given such a URL, will do exactly what you ask. Be aware that a slight change in the input URL can lead to a few errors, so make sure the date format is exactly what you mentioned. In any case, note the comment on URL parsing in the code:
import pandas as pd
import datetime as dt

def url_next_day(url):
    # If the full URL is passed I suggest you use urllib.parse,
    # but here's a workaround:
    filename = url.rstrip("/").split("/")[-1]
    date = dt.datetime.strptime(filename.strip(".csv"), "%Y%m%d").date()
    date_plus_one_day = date + dt.timedelta(days=1)
    new_file_name = dt.datetime.strftime(date_plus_one_day, "%Y%m%d") + ".csv"
    url_next_day = url.replace(filename, new_file_name)
    return url_next_day
for url in list_of_urls:
    try:
        s = requests.get(url).content
        c = pd.read_csv(io.StringIO(s.decode('utf-8')))
    except Exception as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                s = requests.get(url).content
                c = pd.read_csv(io.StringIO(s.decode('utf-8')))
                break
            except Exception:
                pass
        else:
            print("No file available in the days after. Moving on")
Happy Coding!
OK, I have enough changes I want to recommend on top of @Daniel Gonçalves's initial solution that I'm going to post them as a second answer.
1- The loop trying additional days needs to break when it gets a hit, so it doesn't keep going.
2- That loop needs an else: block to handle the complete failure case.
3- It is best practice to catch only the exception you mean to catch and know how to handle. Here a urllib.error.HTTPError means a failure to fetch the page, but another exception would mean something else is wrong with the program, and it would be best not to catch that, so you notice it and fix your program when it happens.
The result:
import urllib.error  # needed for the except clauses below

for url in list_of_urls:
    try:
        s = requests.get(url).content
        c = pd.read_csv(io.StringIO(s.decode('utf-8')))
    except urllib.error.HTTPError as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                s = requests.get(url).content
                c = pd.read_csv(io.StringIO(s.decode('utf-8')))
                break
            except urllib.error.HTTPError:
                print(f"Also failed to fetch {url}...")
        else:
            # this block is only executed if the loop never breaks
            print("No file available in the days after. Moving on.")
            c = None  # or an empty data frame, or whatever won't crash the rest of your code

Python Try with some Additional Logic

Hello I am trying to make 'try' only work under one condition:
try:
    print "Downloading URL: ", url
    contents = urllib2.urlopen(url).read()
except:
    message = "No record retrieved."
    print message
    return None
I do not want the above code to work if the kwarg nodownload is True.
So I have tried the following:
try:
    if nodownload:
        print "Not downloading file!"
        time.sleep(6)
        raise
    print "Downloading URL: ", url
    contents = urllib2.urlopen(url).read()
except:
    message = "No record retrieved."
    print message
    return None
The above always downloads, no matter whether the --nd argument is passed on the command line. The version below always skips the file whether the argument is passed or not.
try:
    if not nodownload:
        print "Not downloading file!"
        time.sleep(6)
        raise
    print "Downloading URL: ", url
    contents = urllib2.urlopen(url).read()
except:
    message = "No record retrieved."
    print message
    return None
nodownload is set from the command line:
parser.add_argument('--nodownload', dest='nodownload', action='store_true',
                    help="This doesn't work for some reason")
You can use raise to cause an exception when you need to, thus making the try fail.
As others have mentioned, one can raise an exception.
Apart from using predefined exceptions, you may also use your own:
class BadFriend(Exception):
    pass

class VirtualFriend(Exception):
    pass

class DeadFriend(Exception):
    pass
try:
    name = raw_input("Tell me name of your friend: ")
    if name in ["Elvis"]:
        raise DeadFriend()
    if name in ["Drunkie", "Monkey"]:
        raise BadFriend()
    if name in ["ET"]:
        raise VirtualFriend()
    print("It is nice, you have such a good friend.")
except BadFriend:
    print("Better avoid bad friends.")
except VirtualFriend:
    print("When did you shake hands with him last time?")
except DeadFriend:
    print("I am very sorry to tell you, that...")
You can even pass some data via the raised exception, but be careful not to abuse it (if standard control structures are working, use the simpler ones).
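Applied to the question, a minimal sketch might look like this (it sits inside the same function as the original snippet; the DownloadSkipped class and the messages are illustrative, not part of the original code):

class DownloadSkipped(Exception):
    pass

try:
    if nodownload:
        print "Not downloading file!"
        raise DownloadSkipped()
    print "Downloading URL: ", url
    contents = urllib2.urlopen(url).read()
except DownloadSkipped:
    print "Skipped because --nodownload was given."
    return None
except urllib2.URLError:
    # Covers urllib2.HTTPError as well, since HTTPError subclasses URLError.
    message = "No record retrieved."
    print message
    return None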

how to properly use try/except/with inside functions and main

I am a relative python newbie and I am getting confused with how to properly handle exceptions. Apologies for the dumb question.
In my main() I iterate through a list of dates and for each date I call a function, which downloads a csv file from a public web server. I want to properly catch exceptions for obvious reasons but especially because I do not know when the files of interest will be available for download. My program will execute as part of a cron job and will attempt to download these files every 3 hours if available.
What I want is to download the first file in the list of dates and if that results in a 404 then the program shouldn't proceed to the next file because the assumption is if the oldest date in the list is not available then none of the others that come after it will be available either.
I have the following Python pseudo code. I have try/except blocks inside the function that attempts to download the files, but if an exception occurs inside the function, how do I properly handle it in main() so I can decide whether or not to proceed to the next date? The reason I created a function to perform the download is that I want to re-use that code later on in the same main() block for other file types.
def main():
    ...
    ...
    # datelist is a list of date objects
    for date in datelist:
        download_file(date)

def download_file(date):
    date_string = str(date.year) + str(date.strftime('%m')) + str(date.strftime('%d'))
    request = HTTP_WEB_PREFIX + date_string + FILE_SUFFIX
    try:
        response = urllib2.urlopen(request)
    except urllib2.HTTPError, e:
        print "HTTPError = " + str(e)
    except urllib2.URLError, e:
        print "URLError = " + str(e)
    except httplib.HTTPException, e:
        print "HTTPException = " + str(e)
    except IOError, e:
        print "IOError = " + str(e)
    except Exception:
        import traceback
        print "Generic exception: " + traceback.format_exc()
    else:
        print "No problem downloading %s - continue..." % (response)
        try:
            with open(TMP_DOWNLOAD_DIRECTORY + response, 'wb') as f:
                f.write(response.read())
        except IOError, e:
            print "IOError = " + str(e)
The key concept here is, if you can fix the problem, you should trap the exception; if you can't, it's the caller's problem to deal with. In this case, the downloader can't fix things if the file isn't there, so it should bubble up its exceptions to the caller; the caller should know to stop the loop if there's an exception.
So let's move all the exception handling out of the function into the loop, and fix it so it craps out if there's a failure downloading the file, as the spec requires:
for date in datelist:
    date_string = (str(date.year) +
                   str(date.strftime('%m')) +
                   str(date.strftime('%d')))
    try:
        download_file(date_string)
    except:
        e = sys.exc_info()[0]  # needs import sys at the top of the file
        print ( "Error downloading for date %s: %s" % (date_string, e) )
        break
download_file should now, unless you want to put in retries or something like that, simply not trap the exceptions at all. Since you've decoded the date as you like in the caller, that code can come out of download_file as well, giving the much simpler
def download_file(date_string):
    request = HTTP_WEB_PREFIX + date_string + FILE_SUFFIX
    response = urllib2.urlopen(request)
    print "No problem downloading %s - continue..." % (response)
    with open(TMP_DOWNLOAD_DIRECTORY + response, 'wb') as f:
        f.write(response.read())
        f.close()
I would suggest that the print statement is superfluous, but if you really want it, the logging module is a more flexible way forward, as it will let you turn the message on or off later by changing a config file instead of the code.
From my understanding of your question: you should just put the code that you want to execute when you encounter a particular exception into the corresponding except block. You don't have to print the encountered error; you can do whatever you feel is necessary when it is raised, e.g. show a popup box with information/options or otherwise direct your program to the next step. Your else section isolates the success path, so it will only execute if none of your exceptions are raised.
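If you would rather not use a bare except in the caller, a narrower variant of the loop might look like this (a sketch, assuming only HTTP failures should stop the loop, as the question's spec suggests; any other exception would still propagate and surface bugs):

import urllib2

for date in datelist:
    date_string = (str(date.year) +
                   str(date.strftime('%m')) +
                   str(date.strftime('%d')))
    try:
        download_file(date_string)
    except urllib2.HTTPError, e:
        # A 404 means this date (and, by the spec, every later date) is not
        # available yet, so stop the loop here.
        print "Error downloading for date %s: %s" % (date_string, e)
        break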

Python hide error

Alright, I am fairly new to Python and I am making a console which will allow multiple features. One of those is to grab a page source and either print it, or, if another arg is given, save it to a file named after that arg... The first arg would be the website URL to grab the source from.
My imports are:
import os, urllib.request
This is my code:
def grab(command, args, argslist):
    if args == "":
        print("The " + command + " command wan't used correctly type help " + command + " for help...")
    if args != "":
        print("This may take a second...")
        try:
            argslistcheck = argslist[0]
            if argslistcheck[0:7] != "http://":
                argslist[0] = "http://" + argslist[0]
            with urllib.request.urlopen(argslist[0]) as url:
                source = url.read()
                source = str(source, "utf-8")
        except IndexError:
            print("Couln't connect")
            source = ""
        try:
            filesourcename = argslist[1] + ".txt"
            filesourceopen = open(filesourcename, "w")
            filesourceopen.write(source)
            filesourceopen.close()
            print("You can find the file save in " + os.getcwd() + " named " + argslist[1] + ".txt.")
        except IndexError:
            print(source)
Now, while I would be OK with improving my code, right now I'm focusing on the main point. It works; I will improve the code later on. The only problem is that if the user inputs a fake website, or a page that doesn't exist, then it returns lots of errors. Yet if I change:
except IndexError:
    print("Coulnd't connect")
    source = ""
to just:
except:
    print("Couldn't connect")
    source = ""
Then it always says Couldn't connect...
Any help? I didn't put the rest of my code because I didn't think it would be useful, if you need it I can put it all.
The reason I titled this "hide error" is because it still works: for some reason it just says that it was unable to connect, and if the user types a second argument then it will save the source to the file they named.
try:
    argslistcheck = argslist[0]
    if argslistcheck[0:7] != "http://":
        argslist[0] = "http://" + argslist[0]
    with urllib.request.urlopen(argslist[0]) as url:
        source = url.read()
        source = str(source, "utf-8")
except IndexError:
    print("Couln't connect")
    source = ""
In that code block, the only thing that can raise an IndexError exception is the argslist[0]. This will happen if there is no element within that list. This is very likely not your problem.
Now if an invalid address is entered, urlopen will fail. But it will not raise an IndexError but rather an urllib.error.URLError or the more specialized urllib.error.HTTPError.
If you just write except IndexError you will only catch that error, but not the exception raised by the urlopen. If you want to catch those as well, you have to add another except case:
except IndexError:
    print('Argument is missing')
except urllib.error.URLError:
    print('Could not connect to the URL.')
The alternative is to just catch any exception by not specifying one (this is what you did in your last code). Note that this is usually not recommended, as it will hide any exceptions that occur which you might not have expected to ever happen; i.e. it will hide bugs. So if you know that there are only a few possible exceptions, just catch those and handle them explicitly.
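Put together, the try block inside grab() could end up looking something like the sketch below (the rest of the function is unchanged and omitted; the printed messages are illustrative):

import urllib.error
import urllib.request

try:
    argslistcheck = argslist[0]
    if argslistcheck[0:7] != "http://":
        argslist[0] = "http://" + argslist[0]
    with urllib.request.urlopen(argslist[0]) as url:
        source = str(url.read(), "utf-8")
except IndexError:
    # No URL argument was given at all.
    print("No URL given")
    source = ""
except urllib.error.URLError:
    # Covers urllib.error.HTTPError too, since HTTPError subclasses URLError.
    print("Couldn't connect")
    source = ""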
