I am trying to scrape data using yfinance and have hit a roadblock when retrieving a ticker that has no data. The error is: 7086.KL: No data found for this date range, symbol may be delisted.
How do I catch this error? I've tried wrapping the call in try/except as shown in the code below, but it still prints that error.
The code:
import yfinance as yf

tickerdata = yf.Ticker("7086.KL")
try:
    history = tickerdata.history(start="2019-06-01", end="2020-05-01")
except ValueError as ve:
    print("Error")
Any advice on how to solve this?
I just had a look at the source code. It looks like they are indeed just printing the message. But they are also adding the error to a dictionary in the shared.py file. You can use this to check for errors:
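import yfinance as yf
from yfinance import shared

ticker = "7086.KL"  # the ticker symbol as a string
tickerdata = yf.Ticker(ticker)
history = tickerdata.history(start="2019-06-01", end="2020-05-01")
# .get() avoids a KeyError when no error was recorded for this ticker
error_message = shared._ERRORS.get(ticker)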
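Since the message is printed rather than raised, one way to treat it like a failure is to wrap the call and inspect the error dictionary afterwards. The following is a minimal sketch, not part of yfinance: history_or_none is a hypothetical helper, and it assumes shared._ERRORS behaves as described above (it is a private attribute, so that may change between versions). The ticker object and the error dictionary are passed in as parameters so the helper can be tried without network access.

```python
def history_or_none(ticker_obj, symbol, errors, **history_kwargs):
    """Return the price history, or None if an error was recorded for symbol.

    ticker_obj: an object with a .history(**kwargs) method, e.g. yf.Ticker(symbol)
    errors:     the dict the library records errors in (assumed: shared._ERRORS)
    """
    errors.pop(symbol, None)  # drop any stale entry from an earlier call
    history = ticker_obj.history(**history_kwargs)
    if symbol in errors:
        print(f"{symbol}: {errors[symbol]}")
        return None
    # An empty DataFrame also means there is nothing to work with.
    if getattr(history, "empty", False):
        return None
    return history
```

With yfinance installed, it would be called as history_or_none(yf.Ticker("7086.KL"), "7086.KL", shared._ERRORS, start="2019-06-01", end="2020-05-01").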
I would like to read a set of CSV files from a URL as dataframes. The files have a date in their name, like YYYYMMDD.csv. I need to iterate over a set of predefined dates and read the corresponding file into Python.
Sometimes the file does not exist and an error as follows is thrown:
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
What I would do in this situation is add one day to the date, e.g. turn 2020-05-01 into 2020-05-02. If that also throws the aforementioned error, I would add 2 days, or at most 3 days, until a URL is available without an error.
I would like to know how to write this in a program, maybe with nested try-except blocks, so that if adding 1 day to the date leads to a URL without an error, the subsequent steps are not executed.
As I don't have the data I will use the following URL as an example:
import pandas as pd

url = 'http://winterolympicsmedals.com/medals.csv'
# pandas downloads the URL itself; a missing file raises urllib.error.HTTPError
c = pd.read_csv(url)
Here the file being read is medals.csv. If you try madels.csv or modals.csv you will get the error I am talking about. So I need to know how to handle the errors in 3 steps, replacing the file name until I get the desired dataframe: first try madels.csv, which results in an error, then modals.csv, which also results in an error, and finally medals.csv, which gives the desired output.
My problem is that sometimes the modification I make to the file name in the except block also fails, so I need to know how to accommodate a second modification.
No need to do any nested try-except blocks, all you need is one try-except and a for loop.
First, a function that tries to read a file (it returns the content of the file, or None if the file cannot be read):
def read_file(fp):
    try:
        with open(fp, 'r') as f:
            text = f.read()
        return text
    except Exception as e:
        print(e)
        return None
Then, a function that tries to find a file for a predefined date (an example input would be '20220514'). The function tries to read the content of the file with the given date, or with dates up to 3 days after it:
from datetime import datetime, timedelta

def read_from_predefined_date(date):
    date_format = '%Y%m%d'
    date = datetime.strptime(date, date_format)
    result = None
    for i in range(4):
        date_to_read = date + timedelta(days=i)
        date_as_string = date_to_read.strftime(date_format)
        fp = f'data/{date_as_string}.csv'
        result = read_file(fp)
        if result:
            break
    return result
To test, e.g. create a data/20220515.csv file and run the following code:
d = '20220514'
result = read_from_predefined_date(d)
Here's a simple function that, given that URL, will do exactly what you ask. Be aware that a slight change in the input URL can lead to errors, so make sure the date format is exactly what you mentioned. In any case:
Notes: URL Parsing
import pandas as pd
import datetime as dt

def url_next_day(url):
    # If the full URL is passed, I suggest you use urllib.parse,
    # but here's a workaround.
    filename = url.rstrip("/").split("/")[-1]
    # Slice off the ".csv" suffix; str.strip(".csv") strips characters, not a suffix.
    date = dt.datetime.strptime(filename[:-len(".csv")], "%Y%m%d").date()
    date_plus_one_day = date + dt.timedelta(days=1)
    new_file_name = date_plus_one_day.strftime("%Y%m%d") + ".csv"
    return url.replace(filename, new_file_name)
for url in list_of_urls:
    try:
        c = pd.read_csv(url)  # pandas downloads the URL itself
    except Exception as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                c = pd.read_csv(url)
                break
            except Exception:
                pass
        else:
            print("No file available in the days after. Moving on")
Happy Coding!
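The code comment above suggests urllib.parse for handling a full URL. A minimal sketch of that variant follows; next_day_url is a hypothetical name, and it assumes the last path component is a date in YYYYMMDD form followed by an extension. Unlike str.replace on the whole URL, it only touches the file name and leaves the host and any query string alone.

```python
from datetime import datetime, timedelta
from posixpath import basename, dirname, join, splitext
from urllib.parse import urlsplit, urlunsplit

def next_day_url(url, days=1, date_format="%Y%m%d"):
    """Return url with the date in its file name shifted forward by `days`."""
    parts = urlsplit(url)
    stem, ext = splitext(basename(parts.path))  # e.g. ("20200501", ".csv")
    shifted = datetime.strptime(stem, date_format) + timedelta(days=days)
    new_path = join(dirname(parts.path), shifted.strftime(date_format) + ext)
    return urlunsplit(parts._replace(path=new_path))
```

A file name that does not parse as a date raises ValueError from strptime, which is probably what you want to see rather than silently request a wrong URL.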
OK, I have enough changes I want to recommend on top of @Daniel Gonçalves's initial solution that I'm going to post them as a second answer.
1- The loop trying additional days needs to break when it gets a hit, so it doesn't keep going.
2- That loop needs an else: block to handle the complete failure case.
3- It is best practice to catch only the exceptions you mean to catch and know how to handle. Here a urllib.error.HTTPError means a failure to fetch the page, but any other exception would mean something else is wrong with the program, and it is best not to catch that, so you notice it and fix your program when it happens.
The result:
import urllib.error

for url in list_of_urls:
    try:
        # pd.read_csv(url) raises urllib.error.HTTPError on a 404;
        # requests.get() would return the error page without raising.
        c = pd.read_csv(url)
    except urllib.error.HTTPError as e:
        print(f"Invalid URL: {url}. The error: {e}. Trying the days after...")
        for _ in range(3):  # because you want at most 3 days after
            try:
                url = url_next_day(url)
                c = pd.read_csv(url)
                break
            except urllib.error.HTTPError:
                print(f"Also failed to fetch {url}...")
        else:
            # this block is only executed if the loop never breaks
            print("No file available in the days after. Moving on.")
            c = None  # or an empty data frame, or whatever won't crash the rest of your code
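The three points above can also be folded into one small helper, so the first attempt and the retries share a single try-except. This is only a sketch: first_available and its parameters are hypothetical names, and the loader is passed in so the logic can be tried without network access. urllib.error.HTTPError is a subclass of OSError, so the default catches it; a narrower tuple can be passed instead.

```python
def first_available(candidates, load, expected=(OSError,)):
    """Return (candidate, result) for the first candidate that loads.

    Only exceptions listed in `expected` count as "not available";
    anything else propagates so that real bugs stay visible.
    Returns (None, None) when every candidate fails.
    """
    for candidate in candidates:
        try:
            return candidate, load(candidate)
        except expected as exc:
            print(f"Failed to load {candidate}: {exc}")
    return None, None
```

For the question's case, candidates would be the original URL followed by the URLs for the next three days, load would be pd.read_csv, and expected would be (urllib.error.HTTPError,).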
I am downloading historical data from the NSE site by running the code below. It gives a TIMEOUT error for public holidays, as data is not available for those days. How can I ignore this error and let the loop continue with the next date?
from datetime import date, datetime
from pandas.tseries.offsets import BDay
from jugaad_data.nse import bhavcopy_fo_save

i = 0
while i < 2610:
    Date = (datetime.today() - BDay(i)).date()
    bhavcopy_fo_save(Date, r"C:\Users\Mohit\ALGO_TRADING\bhavcopy")
    i = i + 1
Thanks,
Mohit
What you could do is to wrap the code that might throw the exception in a try-except block.
So for instance:
try:
    bhavcopy_fo_save(Date, r"C:\Users\Mohit\ALGO_TRADING\bhavcopy")
except TimeoutError as e:  # Python's keyword is except, not catch
    pass
Instead of pass you could print the error, or do something else that acknowledges that the error occurred, but it will no longer stop the program.
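The screenshots are not available here, so the exact exception class is unknown. The builtin TimeoutError is a subclass of OSError, and so are the timeout exceptions raised by the requests library, so the sketch below catches OSError by default. That default is an assumption about where the error comes from, and save_skipping_failures is a hypothetical helper, with the save function passed in as a parameter.

```python
def save_skipping_failures(days, save, expected=(OSError,)):
    """Call save(day) for every day and return the days that failed.

    Exceptions listed in `expected` are reported and skipped;
    any other exception still stops the loop.
    """
    skipped = []
    for day in days:
        try:
            save(day)
        except expected as exc:
            print(f"Skipping {day}: {exc!r}")
            skipped.append(day)
    return skipped
```

Keeping the list of skipped days makes it possible to check afterwards whether they really were public holidays or whether the connection simply failed.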
I am scraping data from websites, and searching for a table with a certain id. I have something like:
table = pd.read_html(page,attrs={'id': 'SpecificID'})[0]
The problem is that if the table with that id does not exist, my script stops with the following error message:
ValueError: No tables found
Is there a way I can catch the error code for pd.read_html? Something like:
if pd.read_html(page,attrs={'id': 'SpecificID'})[0]:
    # No error
    table = pd.read_html(page,attrs={'id': 'SpecificID'})[0]
else:
    # Error
    print("Error")
Any help would be appreciated. Thanks.
Just use a try statement:
try:
    # No error
    table = pd.read_html(page, attrs={'id': 'SpecificID'})[0]
except ValueError:
    # Error: this is the "No tables found" case from the question
    print("Error")
I am using the following code to read data from Yahoo Finance for a list of symbols downloaded from Nasdaq.
pnls = {i:dreader.DataReader(i,'yahoo','1985-01-01','2017-03-30') for i in symbols}
for df_name in pnls:
    try:
        pnls.get(df_name).to_csv("/Users/Jiong/Documents/data/{}_data.csv".format(df_name), index=True, header=True)
    except:
        print("error {}".format(df_name))
    else:
        print("done {}".format(df_name))
I guess some symbols may not be valid, and Yahoo Finance throws a RemoteDataError exception.
The code above is supposed to continue, but it still stops at the errors.
Isn't except supposed to catch all exceptions? Or is this a runtime error?
Is there any way to get the code to ignore it and continue? Thanks. See the errors below:
118 if params is not None and len(params) > 0:
119 url = url + "?" + urlencode(params)
--> 120 raise RemoteDataError('Unable to read URL: {0}'.format(url))
121
122 def _read_lines(self, out):
RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv?c=1985&f=2017&s=MITT%5EA&g=d&ignore=.csv&d=2&e=30&a=0&b=1
You need to handle the raised exception, or execution halts at the point where it is raised. In other words, if a raised exception is not caught and handled, the program is interrupted.
What you need is something like this :
except RemoteDataError as exp:
    print('Unable to read data: {0}'.format(exp))
You can refer to this documentation for more details on errors.
Grabbed this import from another page; it seemed to correct the issue for me. I was getting a remote error while trying to pull stock data from Yahoo. You'll need to do the following import prior to adding the RemoteDataError exception:
import numpy as np
import pandas as pd
import pandas_datareader.data as web
from pandas_datareader._utils import RemoteDataError

df_ = pd.DataFrame()
for i in assets:
    print(i)
    try:
        vec = web.DataReader(i, 'yahoo', start='12/10/2006', end='2/1/2007')
        vec['asset'] = i
        vec['returns_close_raw'] = np.log(vec.Close/vec.Close.shift())
        vec['returns_open_raw'] = np.log(vec.Open/vec.Open.shift())
        vec['returns_open_raw10'] = np.log(vec.Open/vec.Open.shift(10))
        vec['returns_close_raw10'] = np.log(vec.Close/vec.Close.shift(10))
        df_ = pd.concat([df_, vec])
    except RemoteDataError:
        print('remote error')
    except KeyError:
        print('key error')
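One more thing worth noting about the question's code: the DataReader calls run inside the dict comprehension, before the for loop and its try block are ever entered, which would explain why the except never fires. A minimal sketch of moving the download inside the try follows; fetch_all is a hypothetical helper, and the fetch function and the exception types to skip are passed in as parameters.

```python
def fetch_all(symbols, fetch, expected):
    """Build {symbol: fetch(symbol)}, skipping symbols that raise `expected`.

    Returns (results, failed), where failed lists the skipped symbols.
    """
    results, failed = {}, []
    for symbol in symbols:
        try:
            # The call that can fail now sits inside the try block.
            results[symbol] = fetch(symbol)
        except expected as exc:
            print("error {}: {}".format(symbol, exc))
            failed.append(symbol)
    return results, failed
```

For the question's case this could be called as fetch_all(symbols, lambda s: dreader.DataReader(s, 'yahoo', '1985-01-01', '2017-03-30'), (RemoteDataError,)), with RemoteDataError imported as shown above.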
My script has the following line:
libro_dia = xlrd.open_workbook(file_contents = libro_dia)
When libro_dia is not valid, it raises the following error:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<!DOCTYP'
I want to handle this error, so I wrote:
try:
    libro_dia = xlrd.open_workbook(file_contents = libro_dia)
except XLRDError:
    no_termina = False
But it raises the following error:
NameError: name 'XLRDError' is not defined
What's going on?
You don't have XLRDError imported. I'm not familiar with xlrd, but something like:
from xlrd import XLRDError
might work. Alternatively, qualify your Error when handling it:
try:
    libro_dia = xlrd.open_workbook(file_contents = libro_dia)
except xlrd.XLRDError: #<-- Qualified error here
    no_termina = False
The above is assuming you have the following import:
import xlrd
In response to your comment:
There are several ways to use imports in Python. If you import using import xlrd, you have to qualify every object in that module as xlrd.SomeObject. An alternative is the form from xlrd import *, which lets you reference XLRDError without its module namespace. This is lazy and a bad idea though, as it can lead to namespace clashes. If you would like to reference the error without qualifying it, the correct way is from xlrd import XLRDError, which lets you write except XLRDError. Read more about Python Modules
XLRDError is a custom exception and must be imported to your namespace just like any other object.
Edit: As Burhan Khalid has noted, you could just modify the except block to except xlrd.XLRDError if you already have import xlrd in your module.
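The import rule described above can be tried without xlrd installed. The sketch below uses the standard library's json module and its JSONDecodeError purely as a stand-in for xlrd and XLRDError; the two function names are made up for the illustration.

```python
import json
from json import JSONDecodeError

def parse_qualified(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError:  # qualified name: "import json" is enough
        return None

def parse_unqualified(text):
    try:
        return json.loads(text)
    except JSONDecodeError:  # bare name: needs the "from json import ..." line
        return None
```

Removing the from json import JSONDecodeError line makes parse_unqualified fail with a NameError as soon as the except clause is evaluated, which is the same situation as in the question.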
Try using except xlrd.XLRDError as e; the error string is then available as str(e) (or as e.message on Python 2).
Sample:
import sys
import xlrd

try:
    workbook = xlrd.open_workbook(sys.argv[1])
except xlrd.XLRDError as e:
    print(e)  # Python 3 syntax; printing the exception shows its message
    sys.exit(-1)