Python - MySQL - logging queries that generate warnings

I have Python code that makes MySQL calls. It logs all MySQL errors (and sends me a notification on Google Chat). However, warnings such as the one below don't get reported, which makes sense since they are not errors. I would, however, like the MySQL statement logged when there is a warning, so I can fix the underlying issue. What is the best way to find those warnings and get them into the log (along with the offending MySQL statement)?
/usr/local/lib/python3.4/dist-packages/pymysql/cursors.py:170: Warning: (1366, "Incorrect string value: '\\xF3n Cha...' for column 'recipient-name' at row 1")
try:
    cursor.execute(query_string, field_split)
    db.commit()
except pymysql.err.InternalError as e:
    logger.warning('Mysql Error : %s', e)
    logger.warning('Statement : %s', cursor._last_executed)
    string_google = str(e.args[1] + ' - ' + cursor._last_executed)
    googlechat(string_google)
    return  # exit rather than marking the report run as good

Use catch_warnings from the warnings module. It's a context manager that provides you with a list of the warnings. The code would look something like this:
with warnings.catch_warnings(record=True) as w:
    function_that_triggers_warning()
    if w:
        logging_function(w[-1])
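Putting that together with the code from the question, a minimal sketch might look like the following. It assumes the same cursor, db, query_string, field_split, logger and googlechat names from the question, and keeps the private cursor._last_executed attribute the question already relies on:

import warnings
import pymysql

try:
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        cursor.execute(query_string, field_split)
        db.commit()
    for w in caught:
        # w.message is the warning pymysql raised, e.g. (1366, "Incorrect string value ...")
        logger.warning('Mysql Warning : %s', w.message)
        logger.warning('Statement : %s', cursor._last_executed)
        googlechat(str(w.message) + ' - ' + cursor._last_executed)
except pymysql.err.InternalError as e:
    logger.warning('Mysql Error : %s', e)
    logger.warning('Statement : %s', cursor._last_executed)
    googlechat(str(e.args[1]) + ' - ' + cursor._last_executed)
    # then bail out, as the original code does, rather than marking the report run as good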

Related

What is the difference between building cypher and executing the prepared statement in the following code

I am unable to understand why there are two queries being executed. First we execute the prepared statement, and then we use the buildCypher function. The code can be found here:
https://github.com/apache/age/blob/master/drivers/python/age/age.py
def execCypher(conn:ext.connection, graphName:str, cypherStmt:str, cols:list=None, params:tuple=None) -> ext.cursor :
    if conn == None or conn.closed:
        raise _EXCEPTION_NoConnection

    cursor = conn.cursor()
    #clean up the string for modification
    cypherStmt = cypherStmt.replace("\n", "")
    cypherStmt = cypherStmt.replace("\t", "")
    cypher = str(cursor.mogrify(cypherStmt, params))
    cypher = cypher[2:len(cypher)-1]

    preparedStmt = "SELECT * FROM age_prepare_cypher({graphName},{cypherStmt})"

    cursor = conn.cursor()
    try:
        cursor.execute(sql.SQL(preparedStmt).format(graphName=sql.Literal(graphName),cypherStmt=sql.Literal(cypher)))
    except SyntaxError as cause:
        conn.rollback()
        raise cause
    except Exception as cause:
        conn.rollback()
        raise SqlExecutionError("Execution ERR[" + str(cause) +"](" + preparedStmt +")", cause)

    stmt = buildCypher(graphName, cypher, cols)

    cursor = conn.cursor()
    try:
        cursor.execute(stmt)
        return cursor
    except SyntaxError as cause:
        conn.rollback()
        raise cause
    except Exception as cause:
        conn.rollback()
        raise SqlExecutionError("Execution ERR[" + str(cause) +"](" + stmt +")", cause)
Both statements perform the same operation.
The difference is that preparedStmt and the buildCypher function use different forms of the cypher query, as shown in the code (cypherStmt and cypher), and their code for building the query is slightly different.
I can't tell you why it's done this way, but I'll show you why it's different. (Apologies in advance; I'm not used to Python or C.)
The prepared statement calls a custom Postgres function, age_prepare_cypher, defined in apache/age/src/backend/utils/adt/age_session_info.c, which calls set_session_info(graph_name_str, cypher_statement_str).
set_session_info, in that same file, simply stores the values in session-global variables such as session_info_cypher_statement.
So your graph name and query are being stored in the session.
There is another function, convert_cypher_to_subquery, that gets your graph name and query back out of the session. It only retrieves them if is_session_info_prepared() is true, and only if the graph_name and query_str provided to it are NULL.
Seems strange, right? Now let's look at this bit of the Python buildCypher function code:
stmtArr = []
stmtArr.append("SELECT * from cypher(NULL,NULL) as (")
stmtArr.append(','.join(columnExp))
stmtArr.append(");")
return "".join(stmtArr)
It builds the statement with NULL for the graph name and query string.
So we can conclude that the prepared statement stores those values in session memory, and when you execute the statement built by buildCypher afterwards, Postgres pulls them back out of session memory and completes the statement.
I can't explain exactly why or how it does it, but I can see a chunk of test SQL in the project that does the same sort of thing:
-- should return true and execute cypher command
SELECT * FROM age_prepare_cypher('analyze', 'MATCH (u) RETURN (u)');
SELECT * FROM cypher(NULL, NULL) AS (result agtype);
So, tl;dr: executing the prepared statement stores the graph name and query in session memory, and executing the normal statement built by buildCypher grabs what was just stored in the session.
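For illustration only, here is a rough psycopg2 sketch of that two-step pattern, mirroring the test SQL above. The connection string and graph name are placeholders, and the usual AGE setup (loading the extension, setting search_path) is omitted:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
cur = conn.cursor()

# Step 1: stash the graph name and cypher text in backend session memory.
cur.execute("SELECT * FROM age_prepare_cypher(%s, %s)",
            ('analyze', 'MATCH (u) RETURN (u)'))

# Step 2: cypher(NULL, NULL) pulls the stored graph name and query back out
# of the session and executes them.
cur.execute("SELECT * FROM cypher(NULL, NULL) AS (result agtype);")
print(cur.fetchall())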

Error: Increase MaxLocksPerFile registry entry via Python

I am running a rather complex UPDATE query against MS Access from Python:
qry = '''
UPDATE H500_ODFlows INNER JOIN H500_UPDATE ON
(H500_ODFlows.Product = H500_UPDATE.Product)
AND (H500_ODFlows.Dest = H500_UPDATE.DestCode)
AND (H500_ODFlows.Orig = H500_UPDATE.OrigCode)
SET H500_ODFlows.Pieces = [H500_UPDATE].[Pieces],
H500_ODFlows.Weight = [H500_UPDATE].[Weight],
H500_ODFlows.Cons = [H500_UPDATE].[Pieces],
H500_ODFlows.DeadWeight = [H500_UPDATE].[DeadWeight],
H500_ODFlows.DoNotRead = [H500_UPDATE].DoNotRead,
H500_ODFlows.[_OrigCountryCode] = [H500_UPDATE].[_OrigCountryCode],
H500_ODFlows.[_DestCountryCode] = [H500_UPDATE].[_DestCountryCode]
'''
try:
    crsr.execute(lb.cleanqry(qry))
    cnxn.commit()
    print('Updating was successful.')
except Exception as err:
    print('Updating failed. See the error.' + str(err))
but get the following error:
('HY000', '[HY000] [Microsoft][ODBC Microsoft Access Driver] File
sharing lock count exceeded. Increase MaxLocksPerFile registry entry.
(-1033) (SQLExecDirectW)')
I followed the instructions to increase "MaxLocksPerFile" but it is not helping. Moreover, the query runs in MS Access quite OK but not through Python. Any advice?
Try running the query with autocommit on. That way, the database won't need to keep all those locks open, but can just commit everything as the query runs.
qry = '''
UPDATE H500_ODFlows INNER JOIN H500_UPDATE ON
(H500_ODFlows.Product = H500_UPDATE.Product)
AND (H500_ODFlows.Dest = H500_UPDATE.DestCode)
AND (H500_ODFlows.Orig = H500_UPDATE.OrigCode)
SET H500_ODFlows.Pieces = [H500_UPDATE].[Pieces],
H500_ODFlows.Weight = [H500_UPDATE].[Weight],
H500_ODFlows.Cons = [H500_UPDATE].[Pieces],
H500_ODFlows.DeadWeight = [H500_UPDATE].[DeadWeight],
H500_ODFlows.DoNotRead = [H500_UPDATE].DoNotRead,
H500_ODFlows.[_OrigCountryCode] = [H500_UPDATE].[_OrigCountryCode],
H500_ODFlows.[_DestCountryCode] = [H500_UPDATE].[_DestCountryCode]
'''
try:
    cnxn.autocommit = True
    crsr.execute(lb.cleanqry(qry))
    print('Updating was successful.')
except Exception as err:
    print('Updating failed. See the error.' + str(err))
Since you note that the query runs fine in MS Access but not through Python: one possible reason is that Access stored queries are more efficient than queries called from the application layer, since the engine saves and caches the best execution plan. When the Jet/ACE engine processes an SQL string passed in from the application layer (Python, VBA, etc.), it does not have the chance to plan the best execution.
Therefore, consider the following:
Add any needed indexes to the JOIN fields of the respective tables.
Save your UPDATE query as a stored query inside the database. When the query is saved, Access checks the syntax, calculates and optimizes the plan, and caches stats.
Run Compact & Repair on the database to refresh the stats.
Then run the query from Python as a stored query with the CALL command:
# SET AUTOCOMMIT PREFERENCE IN CONNECTION
cnxn = pyodbc.connect(..., autocommit=True)
...
crsr.execute("{CALL myUpdateQuery}")

Python psycopg2 PostgreSQL simple SELECT query gives syntax error

I am trying to go through each query in a SQL file and execute it in my python script using psycopg2. Each query has an id which I replace before executing.
The first query in the sql file is the following:
select * from subscriber where org_id = '1111111111';
I get the old id and replace it with the new id that I am looking for
id_regex = re.compile("\d{10,}")
m = id_regex.search(q)
old_id = m.group(0)
new_q = q.replace(old_id, new_id)
I then execute the queries in the following manner:
for index, cmd in enumerate(cmds):
    # ... (other stuff here)
    elif cmd != '\n':
        new_cmd = p_helper.replace_id(org_id, cmd)
        logger.debug("Running Command:\n" + new_cmd)
        try:
            if not test_run:
                db_cursor.execute(new_cmd)
        except psycopg2.Error as e:
            logger.error(e.pgerror)
    else:
        pass
        # DO NOTHING
When I run my program I get the following error:
ERROR: syntax error at or near "select"
LINE 1: select * from subscriber where org_id = '9999999999';
^
Every query after the first one then fails with:
ERROR: current transaction is aborted, commands ignored until end of transaction block
I ran the select query manually in psql and it worked perfectly, so I don't think the problem is the syntax of the statement. I think it has something to do with the formatting of the queries that psycopg2 accepts. I'm not sure exactly what to change; I have looked at other SO posts and could not figure out what I needed. It'd be great if someone could help me figure this out. Thanks!
Versions
python: 2.7.6
psycopg2: 2.4.5

IntegrityError: distinguish between unique constraint and not null violations

I have this code:
try:
    principal = cls.objects.create(
        user_id=user.id,
        email=user.email,
        path='something'
    )
except IntegrityError:
    principal = cls.objects.get(
        user_id=user.id,
        email=user.email
    )
It tries to create a user with the given id and email, and if there already exists one - tries to get the existing record.
I know this is a bad construction and it will be refactored anyway. But my question is this:
How do I determine what kind of IntegrityError has happened: the one related to a unique constraint violation (there is a unique key on (user_id, email)), or the one related to a not-null constraint (path cannot be null)?
psycopg2 provides the SQLSTATE with the exception as the pgcode member, which gives you quite fine-grained error information to match on.
python3
>>> import psycopg2
>>> conn = psycopg2.connect("dbname=regress")
>>> curs = conn.cursor()
>>> try:
...     curs.execute("INVALID;")
... except Exception as ex:
...     xx = ex
...
>>> xx.pgcode
'42601'
See Appendix A: Error Codes in the PostgreSQL manual for code meanings. Note that you can match coarsely on the first two chars for broad categories. In this case I can see that SQLSTATE 42601 is syntax_error in the Syntax Error or Access Rule Violation category.
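For instance, a coarse check that an error belongs to the integrity-constraint class (SQLSTATE class 23) could look like this sketch, reusing the curs cursor from the interactive session above (the table and values are made up for illustration):

try:
    curs.execute("INSERT INTO principal (user_id, email) VALUES (%s, %s)",
                 (1, None))  # made-up table and values
except psycopg2.Error as ex:
    if ex.pgcode is not None and ex.pgcode.startswith('23'):
        # class 23 covers unique, not-null, foreign key and check violations
        print('integrity constraint violation:', ex.pgcode)
    else:
        raise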
The codes you want are:
23505 unique_violation
23502 not_null_violation
so you could write:
try:
    principal = cls.objects.create(
        user_id=user.id,
        email=user.email,
        path='something'
    )
except IntegrityError as ex:
    if ex.pgcode == '23505':
        principal = cls.objects.get(
            user_id=user.id,
            email=user.email
        )
    else:
        raise
That said, this is a bad way to do an upsert or merge. #pr0gg3d is presumably right in suggesting the right way to do it with Django; I don't do Django so I can't comment on that bit. For general info on upsert/merge see depesz's article on the topic.
Update as of 9-6-2017:
A pretty elegant way to do this is to try/except IntegrityError as exc, and then use some useful attributes on exc.__cause__ and exc.__cause__.diag (a diagnostic class that gives you some other super relevant information on the error at hand - you can explore it yourself with dir(exc.__cause__.diag)).
The first one you can use was described above. To make your code more future proof you can reference the psycopg2 codes directly, and you can even check the constraint that was violated using the diagnostic class I mentioned above:
except IntegrityError as exc:
    from psycopg2 import errorcodes as pg_errorcodes
    assert exc.__cause__.pgcode == pg_errorcodes.UNIQUE_VIOLATION
    assert exc.__cause__.diag.constraint_name == 'tablename_colA_colB_unique_constraint'
edit for clarification: I have to use the __cause__ accessor because I'm using Django, so to get to the psycopg2 IntegrityError class I have to call exc.__cause__
It could be better to use:
try:
    obj, created = cls.objects.get_or_create(user_id=user.id, email=user.email)
except IntegrityError:
    ....
as in https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
The IntegrityError should then be raised only when there is a NOT NULL constraint violation.
Furthermore, you can use the created flag to know whether the object already existed.
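A small sketch of how that flag might be used, assuming the same cls, user and path values as in the question; the defaults= argument is only applied when a new row has to be created:

principal, created = cls.objects.get_or_create(
    user_id=user.id,
    email=user.email,
    defaults={'path': 'something'},  # used only if the row does not exist yet
)
if created:
    # a new principal row was inserted
    pass
else:
    # an existing row with this (user_id, email) was returned
    pass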

PySVN - Determine if a repository exists

I'm writing a small script that manages several SVN repositories. Users pass in the ID of the repository they want to change (the roots of the repos are of the form https://www.mydomain.com/).
I need to check if the given repo actually exists. I've tried using Client.list to see if I can find any files, like so:
client = pysvn.Client()
client.list("https://.../<username>/")
But if the repo does not exist then the script hangs on the list line. From digging through the tracebacks it looks like pysvn is actually hanging on the login credentials callback (client.callback_get_login - which I have implemented but omitted, it does not fail if the repo exists).
Can you suggest how I can determine if a repo exists or not using pysvn?
Cheers,
Pete
I couldn't reproduce your hanging in credentials callback problem, so it might need an expanded description of the problem. I'm running pysvn 1.7.2 on Ubuntu 10.04, Python 2.6.6.
When I try to list a non-existent remote repository with client.list() it raises an exception. You could also use client.info2() to check for existence of a remote repository:
head_rev = pysvn.Revision(pysvn.opt_revision_kind.head)
bad_repo = 'https://.../xyz_i_dont_exist'
good_repo = 'https://.../real_project'
for url in (bad_repo, good_repo):
    try:
        info = client.info2(url, revision=head_rev, recurse=False)
        print url, 'exists.'
    except pysvn._pysvn_2_6.ClientError, ex:
        if 'non-existent' in ex.args[0]:
            print url, 'does not exist'
        else:
            print url, 'error:', ex.args[0]
Peter,
My team and I have experienced the same challenge. Samplebias, try providing a callback_get_login function but set your callback_ssl_server_trust_prompt to return (True, trust_dict['failures'], True). If Subversion has not cached your server certificate trust settings, you may find that info2() (or Peter's list() command) hangs (it's not actually hanging, it just intermittently takes much longer to return). Oddly, when you Ctrl-C the interpreter in these scenarios, you'll get an indication that it hung on the login callback, not the server certificate verification. Play around with your ~/.subversion/auth settings (in particular the svn.simple and svn.ssl.server directories) and you'll see different amounts of 'hang time'. Look at pysvn.Client.callback_cancel if you need to handle situations which truly never return.
Considering http://pysvn.tigris.org/docs/pysvn_prog_ref.html#pysvn_client_callback_ssl_server_trust_prompt, you need to decide what your desired behavior is. Do you want to allow ONLY those connections for which you already have a cached trust answer? Or do you want to ALWAYS accept, regardless of server certificate verification (WARNING: this could obviously have negative security implications)? Consider the following suggestion:
import pysvn

URL1 = "https://exists.your.org/svn/repos/dev/trunk/current"
URL2 = "https://doesntexit.your.org/svn/repos/dev/trunk/current"
URL3 = "https://exists.your.org/svn/repos/dev/trunk/youDontHavePermissionsBranch"
ALWAYS = "ALWAYS"
NEVER = "NEVER"
DESIRED_BEHAVIOR = ALWAYS

def ssl_server_certificate_trust_prompt(trust_dict):
    if DESIRED_BEHAVIOR == NEVER:
        return (False, 0, False)
    elif DESIRED_BEHAVIOR == ALWAYS:
        return (True, trust_dict['failures'], True)
    raise Exception, "Unsupported behavior"

def testURL(url):
    try:
        c.info2(url)
        return True
    except pysvn.ClientError, ce:
        if ('non-existant' in ce.args[0]) or ('Host not found' in ce.args[0]):
            return False
        else:
            raise ce

c = pysvn.Client()
c.callback_ssl_server_trust_prompt = lambda t: (False, t['failures'], True)
c.callback_get_login = lambda x, y, z: (True, "uname", "pw", False)

if not testURL(URL1): print "Test1 failed."
if testURL(URL2): print "Test2 failed."
try:
    testURL(URL3)
    print "Test3 failed."
except: pass
In actuality, you probably don't want to get as fancy as I have with the return values. I do think it was important to consider a potential 403 returned by the server and the "Host not found" scenario separately.
