I am using the following code to read data from Yahoo Finance for a list of symbols downloaded from NASDAQ.
pnls = {i: dreader.DataReader(i, 'yahoo', '1985-01-01', '2017-03-30') for i in symbols}
for df_name in pnls:
    try:
        pnls.get(df_name).to_csv("/Users/Jiong/Documents/data/{}_data.csv".format(df_name), index=True, header=True)
    except:
        print("error {}".format(df_name))
    else:
        print("done {}".format(df_name))
I guess some symbols may not be valid and Yahoo Finance throws a RemoteDataError exception.
The code above is supposed to continue on, but it still stops at the errors.
Isn't a bare except supposed to catch all exceptions? Or is this a runtime error?
Is there any way to get the code to ignore the error and continue? Thanks. See the errors from running it below:
118 if params is not None and len(params) > 0:
119 url = url + "?" + urlencode(params)
--> 120 raise RemoteDataError('Unable to read URL: {0}'.format(url))
121
122 def _read_lines(self, out):
RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv?c=1985&f=2017&s=MITT%5EA&g=d&ignore=.csv&d=2&e=30&a=0&b=1
You need to catch the exception where it is raised, or execution will halt at that point; an uncaught exception interrupts the program. Note that in your code the DataReader calls all run inside the dict comprehension on the first line, before the loop even starts, so the try/except around to_csv never gets a chance to catch the RemoteDataError; the try block has to wrap the DataReader call itself.
What you need is something like this:
except RemoteDataError as exp:
    print('Unable to read URL: {0}'.format(url))
You can refer to this documentation for more details on errors.
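Applied to the code in the question, that means building pnls in a plain loop instead of a dict comprehension, so each DataReader call sits inside the try block. A minimal sketch, assuming the same symbols list and dreader alias from the question:
from pandas_datareader._utils import RemoteDataError

pnls = {}
for i in symbols:
    try:
        # guard the download itself, not just the to_csv call
        pnls[i] = dreader.DataReader(i, 'yahoo', '1985-01-01', '2017-03-30')
    except RemoteDataError:
        print("error {}".format(i))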
Grabbed this import from another page; it seemed to correct the issue for me. I was getting a RemoteDataError while trying to pull stock data from Yahoo. You'll need the following import before you can catch the RemoteDataError exception:
from pandas_datareader._utils import RemoteDataError
import pandas_datareader.data as web
import pandas as pd
import numpy as np

df_ = pd.DataFrame()
for i in assets:  # assets: list of ticker symbols
    print(i)
    try:
        vec = web.DataReader(i, 'yahoo', start='12/10/2006', end='2/1/2007')
        vec['asset'] = i
        vec['returns_close_raw'] = np.log(vec.Close / vec.Close.shift())
        vec['returns_open_raw'] = np.log(vec.Open / vec.Open.shift())
        vec['returns_open_raw10'] = np.log(vec.Open / vec.Open.shift(10))
        vec['returns_close_raw10'] = np.log(vec.Close / vec.Close.shift(10))
        df_ = pd.concat([df_, vec])
    except RemoteDataError:
        print('remote error')
    except KeyError:
        print('key error')
I would like to read a set of CSV files from a URL as dataframes. These files contain a date in their names, like YYYYMMDD.csv. I need to iterate over a set of predefined dates and read the corresponding file into a pandas DataFrame.
Sometimes the file does not exist and an error like the following is thrown:
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
What I would do in this situation is add one day to the date, turning 2020-05-01 into 2020-05-02; if the aforementioned error is thrown again, I would add 2 days to the date, or at most 3 days, until there is a URL available without an error.
I would like to know how to write this in a program, maybe with nested try-except, where if adding 1 day to the date leads to a URL without an error, the subsequent steps are not executed.
As I don't have the data, I will use the following URL as an example:
import io
import pandas as pd
import requests

url = 'http://winterolympicsmedals.com/medals.csv'
s = requests.get(url).content
c = pd.read_csv(io.BytesIO(s))  # wrap the raw bytes in a buffer for read_csv
Here the file being read is medals.csv. If you try madels.csv or modals.csv instead, you will get the error I am talking about. So I need to know how to control the errors in 3 steps by replacing the file name until I get the desired dataframe: first we try madels.csv, resulting in an error, then modals.csv, also resulting in an error, and then medals.csv, which results in the desired output.
My problem is that sometimes the modified file name also fails in the except block, so I need to know how to accommodate a second modification.
No need for nested try-except blocks; all you need is one try-except and a for loop.
First, a function that tries to read a file (it returns the content of the file, or None if the file is not found):
def read_file(fp):
    try:
        with open(fp, 'r') as f:
            text = f.read()
            return text
    except Exception as e:
        print(e)
        return None
Then, a function that tries to find a file from a predefined date (an example input would be '20220514'). The function tries to read the content of the file with the given date, or with dates up to 3 days after it:
from datetime import datetime, timedelta

def read_from_predefined_date(date):
    date_format = '%Y%m%d'
    date = datetime.strptime(date, date_format)
    result = None
    for i in range(4):
        date_to_read = date + timedelta(days=i)
        date_as_string = date_to_read.strftime(date_format)
        fp = f'data/{date_as_string}.csv'
        result = read_file(fp)
        if result:
            break
    return result
To test, create e.g. a data/20220515.csv and run the following code:
d = '20220514'
result = read_from_predefined_date(d)
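Since the question is actually about URLs rather than local files, the same pattern carries over. Here is a minimal sketch assuming a requests-based fetch; read_url and the data/{date}.csv URL template are illustrative, not part of the answer above:
import io
import requests
import pandas as pd

def read_url(url):
    try:
        resp = requests.get(url)
        resp.raise_for_status()  # turn a 404 into an HTTPError
        return pd.read_csv(io.BytesIO(resp.content))
    except requests.HTTPError as e:
        print(e)
        return None
You would then call read_url(f'http://yourserver/data/{date_as_string}.csv') in place of read_file(fp) inside the loop.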
Here's a simple function that, given that URL, will do exactly what you ask. Be aware that a slight change in the input URL can lead to a few errors, so make sure the date format is exactly what you mentioned. In any case:
import pandas as pd
import datetime as dt

def url_next_day(url):
    # if the full URL is passed I suggest using urllib.parse, but here's a workaround
    filename = url.rstrip("/").split("/")[-1]
    # slice off the suffix; str.strip(".csv") would remove characters, not the suffix
    date = dt.datetime.strptime(filename[:-len(".csv")], "%Y%m%d").date()
    date_plus_one_day = date + dt.timedelta(days=1)
    new_file_name = dt.datetime.strftime(date_plus_one_day, "%Y%m%d") + ".csv"
    url_next_day = url.replace(filename, new_file_name)
    return url_next_day
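As a quick sanity check (the URL is illustrative): url_next_day('http://example.com/data/20200501.csv') should return 'http://example.com/data/20200502.csv'. For completeness, the urllib.parse approach alluded to in the comment above could look like this sketch (filename_from_url is a hypothetical helper):
from urllib.parse import urlparse
import posixpath

def filename_from_url(url):
    # extract just the final path component, ignoring query strings etc.
    return posixpath.basename(urlparse(url).path)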
import io
import requests
import pandas as pd

for url in list_of_urls:
    try:
        resp = requests.get(url)
        resp.raise_for_status()  # raise on 404 instead of silently parsing the error page
        c = pd.read_csv(io.BytesIO(resp.content))
    except Exception as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                resp = requests.get(url)
                resp.raise_for_status()
                c = pd.read_csv(io.BytesIO(resp.content))
                break
            except Exception:
                pass
        else:
            print("No file available in the days after. Moving on")
Happy Coding!
OK, I have enough changes I want to recommend on top of @Daniel Gonçalves's initial solution that I'm going to post them as a second answer.
1- The loop trying additional days needs to break when it gets a hit, so it doesn't keep going.
2- That loop needs an else: block to handle the complete-failure case.
3- It is best practice to catch only the exception you mean to catch and know how to handle. Here a urllib.error.HTTPError means a failure to fetch the page, but any other exception would mean something else is wrong with the program, and it is best not to catch it, so you notice it and fix your program when it happens.
The result:
import urllib.error
import pandas as pd

for url in list_of_urls:
    try:
        # let pandas fetch the URL itself; a missing file then raises urllib.error.HTTPError
        c = pd.read_csv(url)
    except urllib.error.HTTPError as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                c = pd.read_csv(url)
                break
            except urllib.error.HTTPError:
                print(f"Also failed to fetch {url}...")
        else:
            # this block is only executed if the loop never breaks
            print("No file available in the days after. Moving on.")
            c = None  # or an empty data frame, or whatever won't crash the rest of your code
I'm stuck with my simple code; I'm a newbie at coding. I'm using Python, and in my list I have bad values that cause exceptions (HTTPError: 404). I want to ignore these exceptions and continue my loop, but with my code the print("Http error") runs again and again. I don't know how to skip past the exception and let the loop move on.
while i < len(list_siret):
    try:
        data = api.siret(list_sirets[i]).get()
        str_datajs = json.dumps(data, indent=4)
        a_json = json.loads(str_datajs)
        i += 1
        print("test1", i, str_datajs)
    except urllib.error.URLError:
        print("Http error")
        pass
Since you have print("Http error") inside the except block, it is executed every time the exception occurs; and because i is only incremented inside the try block, the loop keeps retrying the same bad value forever.
Consider the more idiomatic approach below:
for siret in list_siret:
    try:
        data = api.siret(siret).get()
    except urllib.error.URLError:
        continue
    str_datajs = json.dumps(data, indent=4)
    a_json = json.loads(str_datajs)
    print("test1", siret, str_datajs)
We iterate directly over list_siret without needing to index into it and manage i manually, and instead of passing we just continue to the next element of the list when an exception is raised.
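If you would rather keep the while loop, the minimal fix is to increment i unconditionally, so a bad value is skipped instead of retried forever. A sketch along those lines, reusing the api, list_siret, and json names from the question:
import urllib.error

i = 0
while i < len(list_siret):
    try:
        data = api.siret(list_siret[i]).get()
        print("test1", i, json.dumps(data, indent=4))
    except urllib.error.URLError:
        print("Http error")
    i += 1  # advance even on failure, so the loop can't get stuck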
I am trying to scrape data using yfinance and have hit a road block when trying to retrieve a ticker with no data; the error is: 7086.KL: No data found for this date range, symbol may be delisted.
How do I catch this error? I've tried catching it as seen in the code below, but it still prints that error.
The code:
import yfinance as yf

tickerdata = yf.Ticker("7086.KL")
try:
    history = tickerdata.history(start="2019-06-01", end="2020-05-01")
except ValueError as ve:
    print("Error")
Any advice on how to solve this?
I just had a look at the source code. It looks like they are indeed just printing the message rather than raising it, but they also add the error to a dictionary in the shared.py file. You can use this to check for errors:
import yfinance as yf
from yfinance import shared

ticker = "7086.KL"  # any ticker as a string
tickerdata = yf.Ticker(ticker)
history = tickerdata.history(start="2019-06-01", end="2020-05-01")
error_message = shared._ERRORS.get(ticker)  # None if the download succeeded
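A minimal sketch of how you might branch on that dictionary (assuming, as above, that an empty DataFrame plus an entry in shared._ERRORS indicates a failed download):
if history.empty and ticker in shared._ERRORS:
    print("Download failed for {}: {}".format(ticker, shared._ERRORS[ticker]))
else:
    print(history.head())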
I am trying to catch an error when writing an Excel file, for example when the file is already open:
import pandas as pd
import xlsxwriter

test = "2020/04/02|17:50:33|Conversion succeeded (SemanticProtocolConverter): TEST/Neuro/Neuro/Dot Tete VE11/AX T1 mprage post|TE: 3.24 --> 3.02 ms; Echo Spacing: 7.84 --> 7.62 ms; Coil Selection: Manual --> ACS All but spine|22808"
test2 = test.split("|")
df = pd.DataFrame(test2)
df = df.transpose()
outDF = test2
outXLSX = pd.ExcelWriter("test.xlsx", engine='xlsxwriter')
df.to_excel(outXLSX, 'Test', index=False)
try:
    outXLSX.save()
except IOError:
    print("Cannot open the file")
print("done")
The problem is that it doesn't catch the error. How can I make sure I can write to the file?
Thanks,
Bart
If it is not catching the error, that means the exception you specified in your try/except statement is not the exception being thrown. You can try a more general Exception; printing it might help you identify the actual error and solve the issue.
try:
    outXLSX.save()
except Exception as e:
    print(e)
finally:
    print("done")
The error you are getting comes from xlsxwriter. To catch it, use this block of code:
try:
    outXLSX.save()
except xlsxwriter.exceptions.FileCreateError:
    print("Cannot save because the file is open. Close it and retry.")
except Exception as error:
    print(error)
Have a look at xlsxwriter's documentation on exceptions for more details.
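A common follow-up, since the question asks how to make sure the file can be written, is to keep prompting until the save succeeds. This is a sketch of that retry pattern, not part of the answer above, reusing the outXLSX writer from the question:
import xlsxwriter

while True:
    try:
        outXLSX.save()
        break
    except xlsxwriter.exceptions.FileCreateError:
        # the file is locked, most likely open in Excel
        input("Please close test.xlsx and press Enter to retry...")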
I have the following code in a Python script:
try:
    # send the query request
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
    sf.close()
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
I'm concerned because if I encounter an error on sf.read(), then sf.close() is not called.
I tried putting sf.close() in a finally block, but if there's an exception on urlopen() then there's no file to close and I encounter an exception in the finally block!
So then I tried:
try:
    with urllib2.urlopen(search_query) as sf:
        search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
but this raised an invalid syntax error on the with... line.
How can I best handle this? I feel stupid!
As commenters have pointed out, I am using PyS60, which is Python 2.5.4.
I would use contextlib.closing (in combination with from __future__ import with_statement for old Python versions):
from contextlib import closing

with closing(urllib2.urlopen('http://blah')) as sf:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
Or, if you want to avoid the with statement:
sf = None
try:
    sf = urllib2.urlopen('http://blah')
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
finally:
    if sf:
        sf.close()
Not quite as elegant though.
finally:
    if sf: sf.close()
Why not just try closing sf, and passing if it doesn't exist?
import urllib2

try:
    search_query = 'http://blah'
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except urllib2.URLError, err:
    print(err.reason)
finally:
    try:
        sf.close()
    except NameError:
        pass
Given that you are trying to use 'with', you should be on Python 2.5, and then this applies too: http://docs.python.org/tutorial/errors.html#defining-clean-up-actions
If urlopen() raises an exception, catch it and call the exception's close() function, like this:
try:
    req = urllib2.urlopen(url)
    req.close()
    print 'request {0} ok'.format(url)
except urllib2.HTTPError, e:
    e.close()
    print 'request {0} failed, http code: {1}'.format(url, e.code)
except urllib2.URLError, e:
    print 'request {0} error, error reason: {1}'.format(url, e.reason)
The exception is also a full response object; you can see this issue for details: http://bugs.jython.org/issue1544
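Since the HTTPError doubles as a file-like response object, you can even read the error body before closing it. A small sketch in the same Python 2 style (the url variable is assumed from the snippet above):
try:
    req = urllib2.urlopen(url)
    req.close()
except urllib2.HTTPError, e:
    body = e.read()  # the server's error page, because HTTPError is file-like
    e.close()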
Looks like the problem runs deeper than I thought: this forum thread indicates that urllib2's urlopen didn't support the with statement until after Python 2.6, and possibly not until 3.1.
You could create your own generic URL opener:
from contextlib import contextmanager

@contextmanager
def urlopener(inURL):
    """Open a URL and yield the fileHandle, then close the connection when leaving the 'with' clause."""
    fileHandle = urllib2.urlopen(inURL)
    try:
        yield fileHandle
    finally:
        fileHandle.close()
Then you could use the syntax from your original question:
with urlopener(theURL) as sf:
    search_soup = BeautifulSoup.BeautifulSoup(sf.read())
This solution gives you a clean separation of concerns: a generic urlopener syntax that handles the complexities of properly closing the resource, regardless of errors that occur underneath your with clause.
Why not just use multiple try/except blocks?
import sys

try:
    # send the query request
    sf = urllib2.urlopen(search_query)
except urllib2.URLError, url_error:
    sys.stderr.write("Error requesting url: %s\n" % (search_query,))
    raise

try:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:  # Maybe catch more specific Exceptions here
    sys.stderr.write("Couldn't get programme information from url: %s\n" % (search_query,))
    raise  # or return as in your original code
finally:
    sf.close()