Pandas pd.read_html error code when table does not exist - python

I am scraping data from websites, and searching for a table with a certain id. I have something like:
table = pd.read_html(page,attrs={'id': 'SpecificID'})[0]
The problem is that if the table with that id does not exist, my script stops with the following error message:
ValueError: No tables found
Is there a way I can catch the error code for pd.read_html? Something like:
if pd.read_html(page, attrs={'id': 'SpecificID'})[0]:
    # No error
    table = pd.read_html(page, attrs={'id': 'SpecificID'})[0]
else:
    # Error
    print("Error")
Any help would be appreciated. Thanks.

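Wrap the call in a try statement and catch the ValueError that pd.read_html raises:
try:
    # No error
    table = pd.read_html(page, attrs={'id': 'SpecificID'})[0]
except ValueError:
    # Error: no table with that id was found
    print("Error")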
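If the same lookup is needed in several places, the try statement can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the name read_table_or_none and its reader parameter are hypothetical, and the reader argument exists only so the helper can be tried without pandas or a real page.

```python
def read_table_or_none(page, table_id, reader=None):
    """Return the first table whose id is table_id, or None if there is none."""
    if reader is None:
        # Imported lazily so that a stand-in reader can be used without pandas.
        import pandas as pd
        reader = pd.read_html
    try:
        tables = reader(page, attrs={"id": table_id})
    except ValueError:
        # pd.read_html raises ValueError ("No tables found") when nothing matches.
        return None
    return tables[0]


def fake_reader(page, attrs):
    """Stand-in for pd.read_html (an assumption, for demonstration only)."""
    if attrs["id"] != "SpecificID":
        raise ValueError("No tables found")
    return ["first table", "second table"]


found = read_table_or_none("<html>", "SpecificID", reader=fake_reader)
missing = read_table_or_none("<html>", "OtherID", reader=fake_reader)
print(found, missing)
```

In real use the reader argument would simply be omitted, so that pd.read_html is called, and the caller would check the result against None.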

Related

Retrieving BigQuery validation errors when loading JSONL data via the Python API

How can I retrieve more information relating to the validation errors when loading a JSONL file into BigQuery? (The question is not about solving the issue)
Example code:
import logging

from google.cloud.bigquery import (
    LoadJobConfig,
    QueryJobConfig,
    Client,
    SourceFormat,
    WriteDisposition
)

LOGGER = logging.getLogger(__name__)

# variables depending on the environment
filename = '...'
gcp_project_id = '...'
dataset_name = '...'
table_name = '...'
schema = [ ... ]

# loading data
client = Client(project=gcp_project_id)
dataset_ref = client.dataset(dataset_name)
table_ref = dataset_ref.table(table_name)
job_config = LoadJobConfig()
job_config.source_format = SourceFormat.NEWLINE_DELIMITED_JSON
job_config.write_disposition = WriteDisposition.WRITE_APPEND
job_config.schema = schema
LOGGER.info('loading from %s', filename)
with open(filename, "rb") as source_file:
    job = client.load_table_from_file(
        source_file, destination=table_ref, job_config=job_config
    )
# Waits for the load job to complete
job.result()
Here I am using bigquery-schema-generator to generate a schema (as BigQuery otherwise only looks at the first 100 rows).
Running that might error with the following error message (google.api_core.exceptions.BadRequest):
400 Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.
Looking at the errors property basically doesn't provide any new information:
[{'reason': 'invalid',
'message': 'Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.'}]
I also looked at __dict__ of the exception but that hasn't revealed any further information.
Trying to load the table using the bq command line (in this case without explicit schema) results in a much more helpful message:
BigQuery error in load operation: Error processing job '...': Provided Schema does not match Table <table name>. Field <field name> has changed type from TIMESTAMP to DATE
My question now is: how can I retrieve such a helpful message from the Python API?
Solution based on accepted answer
Here is a copy-and-paste workaround that one could add in order to show more information by default. (There may be downsides to it.)
import google.cloud.exceptions
import google.cloud.bigquery.job

def get_improved_bad_request_exception(
    job: google.cloud.bigquery.job.LoadJob
) -> google.cloud.exceptions.BadRequest:
    errors = job.errors
    result = google.cloud.exceptions.BadRequest(
        '; '.join([error['message'] for error in errors]),
        errors=errors
    )
    result._job = job
    return result

def wait_for_load_job(
    job: google.cloud.bigquery.job.LoadJob
):
    try:
        job.result()
    except google.cloud.exceptions.BadRequest as exc:
        raise get_improved_bad_request_exception(job) from exc
Calling wait_for_load_job(job) instead of job.result() directly then results in a more useful exception (both the error message and the errors property).
To show a more helpful error message, catch google.api_core.exceptions.BadRequest and then read the errors attribute of the LoadJob, which holds the verbose error messages from the job.
from google.api_core.exceptions import BadRequest
...
...
try:
    load_job.result()  # Waits for the job to complete.
except BadRequest:
    for error in load_job.errors:
        print(error["message"])  # error is of type dictionary
For testing I used the sample code BQ load json data and changed the input file to produce an error. In the file I changed the value for "post_abbr" from string to an array value.
File used:
{"name": "Alabama", "post_abbr": "AL"}
{"name": "Alaska", "post_abbr": "AK"}
{"name": "Arizona", "post_abbr": [65,2]}
See the output below when the code snippet above is applied. The last error message shows the actual error: "post_abbr" received an array for a non-repeated field.
Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 3; errors: 1. Please look into the errors[] collection for more details.
Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 3; errors: 1; max bad: 0; error percent: 0
Error while reading data, error message: JSON parsing error in row starting at position 78: Array specified for non-repeated field: post_abbr.
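Since the most specific message comes last in the output above, it can help to collect the distinct messages into one string before logging or re-raising. The following is a minimal sketch: the helper name summarize_job_errors is hypothetical, it assumes each entry is a dictionary with a "message" key as in the output above, and it treats a missing errors list as empty. The sample data reuses the three messages shown above.

```python
def summarize_job_errors(errors):
    """Join the distinct messages of a job's errors, keeping their order."""
    messages = []
    for error in errors or []:
        message = error.get("message", "")
        if message and message not in messages:
            messages.append(message)
    return "\n".join(messages)


# Sample data copied from the output above; load_job.errors would be used instead.
sample_errors = [
    {"message": "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 3; errors: 1. Please look into the errors[] collection for more details."},
    {"message": "Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 3; errors: 1; max bad: 0; error percent: 0"},
    {"message": "Error while reading data, error message: JSON parsing error in row starting at position 78: Array specified for non-repeated field: post_abbr."},
]

summary = summarize_job_errors(sample_errors)
most_specific = summary.splitlines()[-1]
print(most_specific)
```

Inside the except BadRequest block, the call would be summarize_job_errors(load_job.errors).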

How do I name these 2 errors in an except clause? (Selenium and database)

I'm handling errors with try and except, but I'm having trouble with 2 types of errors. Here is an example of what I need. This except with ValueError works fine, because I know the error is called ValueError, so:
#ONLY EXAMPLE
except ValueError:
    return "FAILED"
EXCEPT ERROR N.1
My question is: how should I write the except clause for the following error? It is caused by a wrong HTML class name when scraping. I wrote selenium.common.exceptions.NoSuchElementException, but catching the error does not work (the script does not start at all).
except selenium.common.exceptions.NoSuchElementException:
    return "FAILED"
EXCEPT ERROR N.2
The same goes for the database insert error. I want an error to be raised if 0 records are saved in the database. I tried to write it like this, but it does not work:
except records_added_Results = records_added_Results = + 0:
    return "FAILED"
I wrote this because I have the following code for the insertion into the database. It works correctly; I include it only for context:
import sqlite3

con = sqlite3.connect('/home/mypc/Scrivania/folder/Database.db')
cursor = con.cursor()
records_added_Results = 0
Values = ((SerieA_text,), (SerieB_text,))
sqlite_insert_query = 'INSERT INTO ARCHIVIO_Campionati (Nome_Campionato) VALUES (?);'
count = cursor.executemany(sqlite_insert_query, Values)
con.commit()
print("Record inserted successfully ", cursor.rowcount)
records_added_Results = records_added_Results + 1
cursor.close()
How should I write this? Thank you.
Import the exception before you try to catch it. Add this line to your imports:
from selenium.common.exceptions import NoSuchElementException
Then you can simply do:
try:
    ...  # code that could raise the NoSuchElementException
except NoSuchElementException:
    ...  # exception handling
For the other case, it looks like you're trying to raise an exception yourself. Consider something like this:
if records_added_Results == 0:
    raise ValueError("error message")
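The database check can also be based on the number of rows the insert actually affected, rather than on a manually maintained counter. The following is a minimal sketch under stated assumptions: the function name insert_championships is hypothetical, an in-memory SQLite database stands in for Database.db, and the table is assumed to have a single text column. It relies on cursor.rowcount, which the question's own code already prints.

```python
import sqlite3


def insert_championships(con, names):
    """Insert one row per name; raise ValueError if nothing was inserted."""
    cursor = con.cursor()
    try:
        cursor.executemany(
            "INSERT INTO ARCHIVIO_Campionati (Nome_Campionato) VALUES (?);",
            [(name,) for name in names],
        )
        inserted = cursor.rowcount
    finally:
        cursor.close()
    if inserted < 1:
        raise ValueError("no records were inserted")
    con.commit()
    return inserted


# In-memory database with an assumed single-column table, for demonstration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ARCHIVIO_Campionati (Nome_Campionato TEXT)")

added = insert_championships(con, ["Serie A", "Serie B"])
try:
    insert_championships(con, [])
    status = "OK"
except ValueError:
    status = "FAILED"
print(added, status)
```

The caller then handles the failure with an ordinary except ValueError clause, exactly like the example in the question.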

Getting an error when using prepared statement in python

Whenever I run a normal query, everything works fine: the code executes and I get the results. But whenever I try to use a prepared statement in Python, I keep getting the following error:
1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '? WHERE name = ?' at line 1
The code I'm trying to run:
cursor = con.db.cursor(prepared=True)
try:
    cursor.execute("SELECT * FROM %s WHERE name = %s", ('operations', 'check', ))
except mysql.connector.Error as error:
    print(error)
except TypeError as e:
    print(e)
I've also tried changing the tuple object to a string and removing one of the '%s' placeholders just to check, but I still get an error for the '%s' syntax.
Another thing I've tried is a dict object: I changed the '%s' placeholders to '%(table)s' and '%(name)s' and used a dict of
{'table': 'operations', 'name': 'check'}
Example:
cursor.execute("SELECT * FROM %(table)s WHERE name = %(name)s", {'table': 'operations', 'name': 'check'})
But again it didn't work and I still got the exception.
Am I missing something?
Thanks in advance!
-------- Edit --------
Thanks to @khelwood, I've fixed the problem.
As @khelwood mentioned in the comments, the problem was that I tried to use '%s' as a parameter for the table name.
Prepared statements in Python can't take parameters for things such as table names,
so that's what threw the exception.
You can't insert a table name as a query parameter. You can pass the name you're looking for as a parameter, but it should be in a tuple: ("check",)
So
cursor.execute("SELECT * FROM operations WHERE name = %s", ("check", ))
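If the table name really has to vary, one common approach is to check it against a fixed list of known tables and format it into the SQL text, while the value still goes through a %s placeholder. This is a minimal sketch, not something the answer itself proposes: ALLOWED_TABLES and build_select_by_name are hypothetical names, and the allow-list contents are an assumption.

```python
ALLOWED_TABLES = {"operations"}  # hypothetical allow-list of known tables


def build_select_by_name(table):
    """Return a SELECT statement for an allow-listed table name."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: %r" % (table,))
    # Only the identifier is formatted in; the value stays a %s placeholder.
    return "SELECT * FROM `{}` WHERE name = %s".format(table)


query = build_select_by_name("operations")
params = ("check",)
print(query, params)
# With a real connection: cursor.execute(query, params)
```

Because the name is only ever taken from the allow-list, user input never ends up in the SQL text itself.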

Exception handling with Yahoo-finance

I am trying to scrape data using yfinance and have hit a roadblock when trying to retrieve a ticker with no data. The error is: 7086.KL: No data found for this date range, symbol may be delisted.
How do I catch this error? I've tried catching it with try/except as shown in the code below, but it still prints that error.
The code:
tickerdata = yf.Ticker("7086.KL")
try:
    history = tickerdata.history(start="2019-06-01", end="2020-05-01")
except ValueError as ve:
    print("Error")
Any advice on how to solve this?
I just had a look at the source code. It looks like they are indeed just printing the message rather than raising an exception, which is why the except clause never runs. But they also add the error to a dictionary in the shared.py file. You can use this to check for errors:
import yfinance as yf
from yfinance import shared
ticker = "7086.KL"  # ticker as a string
tickerdata = yf.Ticker(ticker)
history = tickerdata.history(start="2019-06-01", end="2020-05-01")
error_message = shared._ERRORS.get(ticker)  # None if no error was recorded
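Because shared._ERRORS is a private attribute, another option is to look at the returned data itself and raise your own exception when it is empty. The following is a rough sketch that assumes history() returns an empty DataFrame when no data is found, which is worth verifying for the installed yfinance version. NoDataError and require_history are hypothetical names, and a simple stand-in object replaces the DataFrame so the snippet runs without network access.

```python
from types import SimpleNamespace


class NoDataError(LookupError):
    """Hypothetical exception for a ticker that returned no rows."""


def require_history(history, ticker):
    """Return history unchanged, or raise NoDataError if it has no rows."""
    if history.empty:
        raise NoDataError("no data returned for %s" % ticker)
    return history


# Stand-in with the same .empty attribute as a pandas DataFrame.
empty_history = SimpleNamespace(empty=True)
try:
    require_history(empty_history, "7086.KL")
    outcome = "data found"
except NoDataError as exc:
    outcome = str(exc)
print(outcome)
```

With real data, the call would be require_history(tickerdata.history(...), ticker) inside the try block.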

python cx_oracle delete stmt error [duplicate]

This question already has an answer here:
cx_oracle Error handling issue
(1 answer)
Closed 10 years ago.
I'm trying to delete a particular row from a table in a try/except block, but I get the following error:
self.returnvals['ERROR_CD'] = error.code
AttributeError: 'str' object has no attribute 'code'
Code:
try:
    # code deleting from a table
except cx_Oracle.DatabaseError as ex:
    error, = ex.args
    self.conn.rollback()
    self.returnerr['ID'] = 0
    self.returnerr['ERROR_CD'] = error.code
    self.returnerr['ERROR_MSG'] = error.message
    self.returnerr['TABLE_NAME'] = self.debug_val
Your exception handling is broken. Some sort of Oracle database error is being thrown inside your try block, and it would be useful to see it, but this line in your code
self.returnerr['ERROR_CD'] = error.code
is throwing the error you're reporting because the "error" object is just a string and doesn't have a .code attribute.
Also...
delete from a table
Doesn't look like the actual DML you are attempting. Why don't you post the actual DELETE statement and maybe we can see if there's a syntax error. If this is your literal code, you need to read the documentation. I believe it's supposed to look more like:
import cx_Oracle as db
conn = db.connect("user/password@dsn")  # placeholder connection string
cur = conn.cursor()
cur.execute("DELETE FROM TABLE WHERE somecolumn = someval")
conn.commit()
conn.close()
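To keep the handler itself from failing when the first exception argument is a plain string, the attributes can be read defensively. This is a minimal sketch: describe_db_error is a hypothetical helper, the dictionary keys mirror those in the question, and ordinary Exception objects stand in for cx_Oracle.DatabaseError so the snippet runs without a database.

```python
from types import SimpleNamespace


def describe_db_error(exc):
    """Build an error record whether exc.args[0] is a string or an error object."""
    error = exc.args[0] if exc.args else exc
    return {
        "ERROR_CD": getattr(error, "code", None),
        "ERROR_MSG": getattr(error, "message", str(error)),
    }


# Stand-ins for cx_Oracle.DatabaseError, for demonstration only.
plain = describe_db_error(Exception("connection lost"))
rich = describe_db_error(
    Exception(SimpleNamespace(code=942, message="ORA-00942: table or view does not exist"))
)
print(plain)
print(rich)
```

In the handler from the question, the two dictionary entries would be copied into self.returnerr after the rollback.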
