Python code causes SQL Server transaction log to increase exponentially

I have a Python script that executes a stored procedure to purge the tables in a database. This SP in turn calls another SP that contains the delete statements for each table. Something like this:
Python calls - Stored procedure Purge_DB
Purge_DB calls - Stored procedure Purge_Table
Purge_Table has definition to delete data from each table.
When I run this Python script, the transaction log grows rapidly, and after running the script 2-3 times I get a "transaction log full" error.
Please note that the deletion happens inside a transaction:
BEGIN TRAN
EXEC (@DEL_SQL)
COMMIT TRAN
Earlier I was executing the same SP from a VB script and never had any issue with the transaction log.
Does Python use a different mechanism that affects how the transaction log is written?
Why is the log size much bigger with Python than with the VB script?

This is resolved now.
pyodbc opens connections with autocommit off by default, so a transaction starts implicitly when the execute method is called and stays open until commit() is called explicitly. Because the purge SP was called for more than 100 tables, every delete stayed inside that one open transaction (the inner COMMIT TRAN statements only reduced the nesting level), so the transaction log kept filling until the transaction was closed in the Python code. That is why the log was getting full because of this job.
I have set the autocommit property of the pyodbc connection to True, so each SQL statement is now committed automatically as soon as it is executed on that connection. Please refer to the documentation here:
https://github.com/mkleehammer/pyodbc/wiki/Database-Transaction-Management
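A minimal sketch of that fix, assuming the outer procedure is named Purge_DB as in the question. The helper name run_purge and its connect parameter are hypothetical; in real use connect would be pyodbc.connect, which accepts an autocommit keyword. Passing the function in keeps the sketch runnable without a database driver.

```python
def run_purge(connect, conn_str, proc_name="Purge_DB"):
    """Run the purge procedure on an autocommit connection.

    `connect` is assumed to behave like pyodbc.connect.
    """
    # With autocommit=True the driver does not wrap the call in one
    # implicit transaction, so each BEGIN TRAN ... COMMIT TRAN inside
    # the procedure commits on its own.
    conn = connect(conn_str, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute("EXEC " + proc_name)
    finally:
        # Always release the connection, even if the procedure fails.
        conn.close()
```

With pyodbc installed, the call would be run_purge(pyodbc.connect, conn_str), where conn_str is your own connection string.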

Related

Is it possible to trigger a script or program if any data is updated in a database, like MySQL?

It doesn't have to be exactly a trigger inside the database. I just want to know how I should design this, so that when changes are made inside MySQL or SQL server, some script could be triggered.
One way would be to keep a counter for the last updated row in the database, and then keep polling (checking) the database from Python for new records at short intervals.
If the value of the counter has increased, you could use the subprocess module to call another Python script.
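The polling idea could be sketched roughly as follows. This assumes the watched table has an auto-incrementing id column; the table name events and the script name handler.py are hypothetical, and conn is any DB-API connection (for example one from a MySQL driver).

```python
import subprocess
import sys
import time


def newest_id(conn):
    """Return the highest id in the (hypothetical) events table, or 0."""
    # End any open read transaction first; on MySQL/InnoDB this drops the
    # old snapshot so rows committed by other clients become visible.
    conn.commit()
    cursor = conn.cursor()
    cursor.execute("SELECT COALESCE(MAX(id), 0) FROM events")
    return cursor.fetchone()[0]


def poll_once(conn, last_seen, on_change):
    """Call on_change(new_id) if rows were added since last_seen."""
    current = newest_id(conn)
    if current > last_seen:
        on_change(current)
    return current


def launch_handler(new_id):
    # Assumption: handler.py is the script to trigger; it receives the id.
    subprocess.run([sys.executable, "handler.py", str(new_id)], check=False)


def watch(conn, interval=1.0):
    """Poll forever, sleeping `interval` seconds between checks."""
    last_seen = newest_id(conn)
    while True:
        last_seen = poll_once(conn, last_seen, launch_handler)
        time.sleep(interval)
```

Keeping poll_once separate from the endless loop makes the comparison logic easy to exercise with any callback in place of launch_handler.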
It's possible to execute an external script from a MySQL trigger, but I have never used it and I don't know the implications of something like this.
MySQL provides a way to implement your own functions, called User Defined Functions (UDFs). With this you can define your own functions and call them from MySQL events. You need to write your own logic in a C program, following the interface provided by MySQL.
Fortunately, someone has already written a library for calling an external program from MySQL: LIB_MYSQLUDF_SYS. After installing it, the following trigger should work:
DELIMITER $$
CREATE TRIGGER Test_Trigger
AFTER INSERT ON MyTable
FOR EACH ROW
BEGIN
  DECLARE cmd CHAR(255);
  DECLARE result int(10);
  SET cmd=CONCAT('/YOUR_SCRIPT');
  SET result = sys_exec(cmd);
END$$
DELIMITER ;

SQL Stored Procedures not finishing when called from Python

I'm trying to call a stored procedure in my MSSQL database from a python script, but it does not run completely when called via python. This procedure consolidates transaction data into hour/daily blocks in a single table which is later grabbed by the python script. If I run the procedure in SQL studio, it completes just fine.
When I run it via my script, it gets cut short about two-thirds of the way through. For now I have a workaround: making the program sleep for 10 seconds before moving on to the next SQL statement. However, this is neither time efficient nor reliable, as some procedures may not finish in that time. I'm looking for a more elegant way to implement this.
Current Code:
cursor.execute("execute mySP")
time.sleep(10)
cursor.commit()
The most related article I can find to my issue is here:
make python wait for stored procedure to finish executing
I tried the solution using Tornado and I/O generators, but ran into the same issue as listed in the article, which was never resolved. I also tried the accepted solution of having my stored procedures set a Status field in a RunningStatus table. At the beginning of my SP, Status is updated to 1 in RunningStatus, and when the SP finishes, Status is updated to 0. Then I implemented the following Python code:
conn=pyodbc_connect(conn_str)
cursor=conn.cursor()
sconn=pyodbc_connect(conn_str)
scursor=sconn.cursor()
cursor.execute("execute mySP")
cursor.commit()
while 1:
    q=scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if(q[0]==0):
        break
When I implement this, the same problem happens as before: the Python code moves on as if the stored procedure had finished before it is actually complete. If I eliminate my cursor.commit(), as follows, the connection just hangs indefinitely until I kill the Python process.
conn=pyodbc_connect(conn_str)
cursor=conn.cursor()
sconn=pyodbc_connect(conn_str)
scursor=sconn.cursor()
cursor.execute("execute mySP")
while 1:
    q=scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if(q[0]==0):
        break
Any assistance in finding a more efficient and reliable way to implement this, as opposed to time.sleep(10) would be appreciated.
As OP found out, inconsistent or incomplete processing of stored procedures called from an application layer like Python may be due to straying from T-SQL scripting best practices.
As @AaronBertrand highlights in this Stored Procedures Best Practices Checklist blog post, consider the following, among other items:
Explicitly and liberally use BEGIN ... END blocks;
Use SET NOCOUNT ON to avoid sending a message to the client for every row-affecting action, which can interrupt the workflow;
Use semicolons as statement terminators.
Example
CREATE PROCEDURE dbo.myStoredProc
AS
BEGIN
    SET NOCOUNT ON;
    SELECT * FROM foo;
    SELECT * FROM bar;
END
GO
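On the Python side, a complementary pattern often suggested for pyodbc is to step through every result set with cursor.nextset() before committing, so the client does not abandon the batch while the procedure still has results pending. The sketch below assumes a DB-API connection such as pyodbc's; the helper name run_procedure is hypothetical.

```python
def run_procedure(conn, sql):
    """Execute a procedure call and consume every result set it returns."""
    cursor = conn.cursor()
    cursor.execute(sql)
    results = []
    while True:
        # description is None for statements that return no rows
        # (for example the row-count messages NOCOUNT would suppress).
        if cursor.description is not None:
            results.append(cursor.fetchall())
        # nextset() is falsy once there is nothing left to read.
        if not cursor.nextset():
            break
    # Commit only after everything has been read.
    conn.commit()
    return results
```

A call such as run_procedure(conn, "execute mySP") would then replace the execute / sleep / commit sequence from the question.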

Why is a COMMIT needed before SELECT, for a previously committed UPDATE?

The situation is detailed in my previous question:
MySQLdb is caching SELECT results?
In short:
python 2.7 + MySQLdb
the "issue" happens inside a Python script (but not from the mysql client prompt)
when querying a SELECT inside a loop, the first result is repeated for all subsequent iterations of the loop
this happens even though another program updates the DB (and commits).
I can see the changes from mysql clients, but not from my python loop.
SQL_NO_CACHE didn't fix it
recreating a cursor didn't help!
autocommit(True) worked --> each query reflects DB change.
So why does MySQLdb think it's inside a transaction, when it's clearly irrelevant?

Python MySQL- Queries are being unexpectedly cached

I have a small issue (for lack of a better word) with a MySQL DB. I am using Python.
I have a table into which rows are inserted regularly, as often as 1 row/sec.
I run two Python scripts together. One simulates the insertion at 1 row/sec; in it I have turned autocommit off and commit explicitly after some number of rows, say 10.
The other script is a simple "SELECT count(*) ..." query on the table. This query doesn't show me the number of rows the table currently has. It is stubbornly stuck at whatever number of rows the table had initially when the script started running. I have even tried "SELECT SQL_NO_CACHE count(*) ..." to no effect.
Any help would be appreciated.
My guess is you're using InnoDB with the REPEATABLE READ isolation level. Try setting the isolation level to READ COMMITTED:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
Another way is to start a new transaction every time you perform a SELECT query.
If autocommit is turned off in the reader as well, then it will be doing the reads inside a transaction and thus not seeing the writes the other script is doing.
My guess is that either the reader or writer (most likely the writer) is operating inside a transaction which hasn't been committed. Try ensuring that the writer is committing after each write, and try a ROLLBACK from the reader to make sure that it isn't inside a transaction either.
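The "new transaction for every read" suggestion can be sketched as below. It assumes a DB-API connection such as one from MySQLdb with autocommit off; the function name fresh_count and the table name readings are hypothetical.

```python
def fresh_count(conn, query="SELECT COUNT(*) FROM readings"):
    """Return the row count as seen by a brand-new transaction."""
    # Ending the current transaction discards the old REPEATABLE READ
    # snapshot; the SELECT below then starts a new one.
    conn.commit()
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchone()[0]
    finally:
        cursor.close()
```

Calling conn.rollback() instead of conn.commit() would serve the same purpose for a reader that never writes.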

python: how to get notifications for mysql database changes?

In Python, is there a way to get notified that a specific table in a MySQL database has changed?
It's theoretically possible but I wouldn't recommend it:
Essentially you have a trigger on the table that calls a UDF, which communicates with your Python app in some way.
Pitfalls include what happens if there's an error?
What if it blocks? Anything that happens inside a trigger should ideally be near-instant.
What if it's inside a transaction that gets rolled back?
I'm sure there are many other problems that I haven't thought of as well.
A better way if possible is to have your data access layer notify the rest of your app. If you're looking for when a program outside your control modifies the database, then you may be out of luck.
Another way that's less ideal, but in my opinion better than calling another program from within a trigger, is to keep some kind of "LastModified" table that gets updated by triggers. Then in your app just check whether that datetime is later than when you last checked.
If by changed you mean if a row has been updated, deleted or inserted then there is a workaround.
You can create a trigger in MySQL
DELIMITER $$
CREATE TRIGGER ai_tablename_each AFTER INSERT ON tablename FOR EACH ROW
BEGIN
  DECLARE exec_result integer;
  SET exec_result = sys_exec(CONCAT('my_cmd '
                                   ,'insert on table tablename '
                                   ,',id=',new.id));
  IF exec_result = 0 THEN BEGIN
    INSERT INTO table_external_result (id, tablename, result)
    VALUES (null, 'tablename', 0);
  END; END IF;
END$$
DELIMITER ;
This will call the executable script my_cmd on the server with some parameters (see sys_exec for more info).
my_cmd can be a Python program or anything you can execute from the commandline using the user account that MySQL uses.
You'd have to create a trigger for every change (INSERT/UPDATE/DELETE) that you'd want your program to be notified of, and for each table.
Also you'd need to find some way of linking your running Python program to the command-line util that you call via sys_exec().
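One possible way to link the two, sketched here under assumptions, is a local UDP socket: the long-running Python program listens on a loopback port and my_cmd sends its arguments there. The port number and all function names below are hypothetical, and this is only one of several options (a named pipe or a message queue would also do).

```python
import socket

ADDR = ("127.0.0.1", 50007)  # hypothetical loopback port


def notify(message, addr=ADDR):
    """What my_cmd would do: forward its arguments to the running app."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), addr)


def open_listener(addr=ADDR):
    """Called once by the long-running Python program."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(addr)
    return sock


def wait_for_change(sock, timeout=None):
    """Block until a notification arrives and return it as text.

    Raises socket.timeout if nothing arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    data, _sender = sock.recvfrom(4096)
    return data.decode("utf-8")
```

Because the sender only fires a datagram and exits, my_cmd returns quickly even when the listening program is not running.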
Not recommended
This sort of behaviour is not recommended because:
it is likely to slow MySQL down;
it can make MySQL hang or time out if my_cmd does not return;
if you are using transactions, you will be notified before the transaction ends;
I'm not sure whether you'll get notified of a delete if the transaction rolls back;
it's an ugly design.
Links
sys_exec: http://www.mysqludf.org/lib_mysqludf_sys/index.php
Yes, though it may not be standard SQL: PostgreSQL supports this with LISTEN and NOTIFY (notification payloads arrived in version 9.0).
http://www.postgresql.org/docs/9.0/static/sql-notify.html
Not possible with standard SQL functionality.
It might not be a bad idea to try using a network monitor instead of a MySQL trigger. Extending a network monitor like this:
http://sourceforge.net/projects/pynetmontool/
And then writing a script that waits for activity on port 3306 (or whatever port your MySQL server listens on), and then checks the database when the network activity meets certain filter conditions.
It's a very high level idea that you'll have to research further, but you don't run into the DB trigger problems and you won't have to write a cron job that runs every second.
