I am using MobaXterm to run my Python script.
The script fetches records from three tables. I can see the output of my query in MySQL Workbench, but when the same query runs in my script, the only output I get is "Killed".
What is the reason? The query itself seems correct.
select tsp.data_ip, tsp.IP, tvp.vm_d_ip, tvp.IP
FROM cmdb.t_server tsp, cmdb.t_vm tvp, t_ip ip
where tvp.SERIALNUMBER = 'AD123'
   or tsp.SERIALNUMBER = 'AD123'
  and (ip.ip = tsp.d_ip or ip.ip = tsp.IP or ip.ip = tvp.dip or ip.ip = tvp.IP);
The reason this happens in the Python script is that the query returns far too many records. Fetching them takes longer than the script is allowed to run, so the process gets killed.
As the SELECT shows, the query reads from three tables at the same time, with a WHERE clause that mixes several AND and OR conditions. Because AND binds more tightly than OR, the condition on tvp.SERIALNUMBER stands alone and is never combined with the IP filters, so rows matching it are cross-joined against every row of the other two tables.
Explicit joins should be used instead.
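For illustration only, here is a sketch of how the query could be rewritten with explicit joins and run from the script. The assumed relationship between t_server and t_vm, the IN-based join on t_ip, and the pymysql connection details are guesses, not something taken from the question:

# Hypothetical sketch: assumes pymysql and guesses at the intended join logic.
import pymysql

QUERY = """
    SELECT tsp.data_ip, tsp.IP, tvp.vm_d_ip, tvp.IP
    FROM cmdb.t_server tsp
    JOIN cmdb.t_vm tvp
      ON tvp.SERIALNUMBER = tsp.SERIALNUMBER           -- assumed relationship
    JOIN t_ip ip
      ON ip.ip IN (tsp.d_ip, tsp.IP, tvp.dip, tvp.IP)  -- replaces the OR chain
    WHERE tsp.SERIALNUMBER = %s
"""

conn = pymysql.connect(host="localhost", user="user", password="***", database="cmdb")
try:
    with conn.cursor() as cur:
        cur.execute(QUERY, ("AD123",))
        while True:
            batch = cur.fetchmany(500)   # fetch in batches instead of everything at once
            if not batch:
                break
            for row in batch:
                print(row)
finally:
    conn.close()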
I'm writing a Python script that connects to an Oracle DB. I collect specific reference IDs into a variable and then execute a stored procedure in a for loop. It works fine, but it takes a very long time to complete.
Here's the code:
sql = f"SELECT STATEMENT"
cursor.execute(sql)
result = cursor.fetchall()
for i in result:
    cursor.callproc('DeleteStoredProcedure', [i[0]])
    print("Deleted:", i[0])
The first SQL SELECT statement collects around 600 reference IDs, but the loop takes around 3 minutes to run the stored procedure for all of them, which is far too long once we have around 10K or more records.
BTW, the stored procedure is configured to delete rows from three different tables based on the reference ID, and it runs quickly from Oracle Toad.
Is there any way to improve the performance?
I think you could create a single stored procedure that executes the SELECT statement itself and then does whatever DeleteStoredProcedure does, so everything happens in one database call.
Alternatively, you can use threads to execute the stored procedure calls in parallel: https://docs.python.org/3/library/threading.html
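A rough sketch of the threaded option, assuming cx_Oracle, a session pool, and that DeleteStoredProcedure calls can safely run concurrently; the credentials, DSN, and worker count are placeholders:

# Hypothetical sketch: one pooled connection per worker, deletes run in parallel.
from concurrent.futures import ThreadPoolExecutor
import cx_Oracle

pool = cx_Oracle.SessionPool(user="user", password="***", dsn="host/service",
                             min=2, max=8, increment=1, threaded=True)

def delete_ref(ref_id):
    conn = pool.acquire()
    try:
        cur = conn.cursor()
        cur.callproc("DeleteStoredProcedure", [ref_id])
        conn.commit()
        return ref_id
    finally:
        pool.release(conn)

conn = pool.acquire()
cur = conn.cursor()
cur.execute("SELECT STATEMENT")   # placeholder, as in the question
ref_ids = [row[0] for row in cur.fetchall()]
pool.release(conn)

with ThreadPoolExecutor(max_workers=8) as executor:
    for done in executor.map(delete_ref, ref_ids):
        print("Deleted:", done)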
On a SQL Server database, I would like to run an external (Python) script whenever a new row is inserted into a table. The Python script needs to process the data from this row.
Is a DML trigger AFTER INSERT a safe method to use here? I saw several warnings/discouragements in other questions (see, e.g., How to run a script on every insert or Trigger script when SQL Data changes). From what I understand so far, the script may fail when the INSERT is not yet committed, because then the script cannot see/load the row? However, as I understand the example in https://www.mssqltips.com/sqlservertip/5909/sql-server-trigger-example/, during the execution of the trigger there exists a virtual table named inserted that holds the data affected by the trigger execution. So technically, I should be able to pass the row that the Python script needs by retrieving it directly from this inserted table?
I am new to triggers, which is why I am asking - so thank you for any clarification on best practices here! :)
After some testing, I found that the following trigger seems to successfully pass the row from the inserted table to an external Python (SQL Server Machine Learning Services) script:
CREATE TRIGGER index_new_row
ON dbo.triggertesttable
AFTER INSERT
AS
DECLARE @new_row nvarchar(max) = (SELECT * FROM inserted FOR JSON AUTO);
EXEC sp_execute_external_script @language = N'Python',
    @script = N'
import pandas as pd
OutputDataSet = pd.read_json(new_row)
',
    @params = N'@new_row nvarchar(max)',
    @new_row = @new_row
GO
When testing this with an INSERT on dbo.triggertesttable, the demo Python script behaves like a SELECT on the inserted table, so it returns all rows that were inserted.
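For completeness, exercising the trigger from a Python client could look roughly like this; pyodbc, the connection string, and the column name some_column are assumptions, since the actual table definition is not shown:

# Hypothetical sketch: connection string and column name are placeholders.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
                      "DATABASE=testdb;Trusted_Connection=yes")
cur = conn.cursor()
cur.execute("INSERT INTO dbo.triggertesttable (some_column) VALUES (?)", "test value")
conn.commit()   # the AFTER INSERT trigger fires as part of this statement's execution
conn.close()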
I have a Python script that executes a stored procedure to purge the tables in a database. This SP in turn calls another SP which has DELETE statements for each table in the database. Something like below:
Python calls stored procedure Purge_DB.
Purge_DB calls stored procedure Purge_Table.
Purge_Table contains the DELETE statements for each table.
When I run this Python script, the transaction log grows exponentially, and after running the script 2-3 times I get the transaction log full error.
Please note that the deletion happens inside a transaction:
BEGIN TRAN
EXEC (@DEL_SQL)
COMMIT TRAN
Earlier I was executing the same SP from a VB script and never had any issue related to the transaction log.
Does Python handle transactions in a different way that generates more transaction log?
Why is the log size so much bigger with Python than with the VB script?
This is resolved now.
pyodbc starts a transaction implicitly when execute() is called, and that transaction stays open until commit() is called explicitly. Since this purge SP was called for more than 100 tables, everything accumulated in the transaction log until the transaction was closed in the Python code, and that is why the log kept filling up during this job.
I have set pyodbc's autocommit property to True, which now automatically commits each SQL statement as soon as it is executed on that connection. Please refer to the documentation here:
https://github.com/mkleehammer/pyodbc/wiki/Database-Transaction-Management
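A minimal sketch of that fix (Purge_DB comes from the question; the connection string is a placeholder):

# Sketch: with autocommit=True, pyodbc does not hold one long implicit transaction open.
import pyodbc

conn = pyodbc.connect(conn_str, autocommit=True)   # conn_str is a placeholder
cursor = conn.cursor()
cursor.execute("EXEC Purge_DB")   # the SP's own BEGIN TRAN / COMMIT TRAN pairs now commit per table
conn.close()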
I'm trying to call a stored procedure in my MSSQL database from a Python script, but it does not run completely when called via Python. This procedure consolidates transaction data into hour/daily blocks in a single table, which is later grabbed by the Python script. If I run the procedure in SQL Studio, it completes just fine.
When I run it via my script, it gets cut short about two-thirds of the way through. Currently I have a workaround: making the program sleep for 10 seconds before moving on to the next SQL statement. However, this is not time-efficient and is unreliable, as some procedures may not finish in that time. I'm looking for a more elegant way to implement this.
Current Code:
cursor.execute("execute mySP")
time.sleep(10)
cursor.commit()
The most closely related article I can find to my issue is here:
make python wait for stored procedure to finish executing
I tried the solution using Tornado and I/O generators, but ran into the same issue mentioned in the article, which was never resolved. I also tried the accepted solution of setting a running-status field in the database from my stored procedures: at the beginning of my SP, Status is updated to 1 in RunningStatus, and when the SP finishes, Status is updated to 0 in RunningStatus. Then I implemented the following Python code:
conn=pyodbc_connect(conn_str)
cursor=conn.cursor()
sconn=pyodbc_connect(conn_str)
scursor=sconn.cursor()
cursor.execute("execute mySP")
cursor.commit()
while 1:
    q = scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if q[0] == 0:
        break
When I implement this, the same problem happens as before: the stored procedure appears to finish executing prior to it actually being complete. If I eliminate cursor.commit(), as follows, the connection just hangs indefinitely until I kill the Python process.
conn=pyodbc_connect(conn_str)
cursor=conn.cursor()
sconn=pyodbc_connect(conn_str)
scursor=sconn.cursor()
cursor.execute("execute mySP")
while 1:
    q = scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if q[0] == 0:
        break
Any assistance in finding a more efficient and reliable way to implement this, as opposed to time.sleep(10), would be appreciated.
As the OP found out, inconsistent or incomplete processing of stored procedures from an application layer like Python may be due to straying from best practices of T-SQL scripting.
As Aaron Bertrand highlights in his Stored Procedure Best Practices Checklist blog post, consider the following among other items:
Explicitly and liberally use BEGIN ... END blocks;
Use SET NOCOUNT ON to suppress the "rows affected" messages sent to the client for every affecting action, which can interrupt the workflow;
Use semicolons for statement terminators.
Example
CREATE PROCEDURE dbo.myStoredProc
AS
BEGIN
SET NOCOUNT ON;
SELECT * FROM foo;
SELECT * FROM bar;
END
GO
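On the Python side, once the procedure follows these practices, the sleep should no longer be needed. A hedged sketch with pyodbc (mySP is from the question; the connection string and the defensive nextset() loop are assumptions):

# Sketch: drain any remaining result sets so the whole batch is processed before committing.
import pyodbc

conn = pyodbc.connect(conn_str)   # conn_str is a placeholder
cursor = conn.cursor()
cursor.execute("EXEC mySP")

while cursor.nextset():           # no-op if SET NOCOUNT ON leaves nothing extra to consume
    pass

conn.commit()
conn.close()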
I've got a PostgreSQL query that returns 120 rows {integer, boolean, integer, varchar(255), varchar(255), bigint, text} in about 70 ms when run in the database with psql.
Using Python/Django with django.db.connection.cursor.execute(), it takes 10 s to run on the same machine.
I've tried putting all the rows into an array, and then into a single string (18k characters; returning only the first 500 characters takes the same time), so that only one row is returned, but with no gain.
Any ideas as to why there is such a dramatic slowdown between running the query from within Python and running it in the db?
EDIT
I had to increase work_mem to get the function running in a timely fashion in psql. Other functions/queries don't show the same pattern; for them, the difference between psql and Python is only a few milliseconds.
EDIT
Cutting work_mem down to 1 MB shows similar numbers in psql and the Django shell. Could it be that Django is not picking up the work_mem I set?
EDIT
Ugh. The problem was that the work_mem set in psql is not valid globally; if I set the memory in the function itself, the call is timely. I suppose setting it in the configuration file would make it apply globally.
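For reference, a sketch of the two per-call ways to do this; my_function, its signature, and the 64MB value are placeholders, not taken from the question:

# Hypothetical sketch: raise work_mem only where it is needed.
from django.db import connection

with connection.cursor() as cursor:
    # Option 1: set work_mem for this session/connection, then run the query.
    cursor.execute("SET work_mem = '64MB'")
    cursor.execute("SELECT * FROM my_function(%s)", [42])
    rows = cursor.fetchall()

# Option 2 (run once in the database): attach the setting to the function itself,
# which is what made the call timely above:
#   ALTER FUNCTION my_function(integer) SET work_mem = '64MB';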
If the timing between "in situ" queries and psql queries differs that much, then the first and usual suspect is this: if the framework uses prepared statements, then you have to check the timing in psql using prepared statements too. For example:
prepare foo as select * from sometable where intcolumn = $1;
execute foo(42);
If the timing of the EXECUTE is in the same ballpark as your in situ query, then you can EXPLAIN and EXPLAIN ANALYZE the EXECUTE line.
If the timing is not in the same ballpark, you have to look for something else.
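If you prefer to reproduce that check through the same Django connection the application uses rather than in psql, a sketch along these lines should work (sometable and intcolumn are the placeholders from the example above):

# Hypothetical sketch: run the prepared-statement timing check through Django's connection.
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("PREPARE foo AS SELECT * FROM sometable WHERE intcolumn = $1")
    cursor.execute("EXPLAIN ANALYZE EXECUTE foo(42)")
    for (plan_line,) in cursor.fetchall():   # one text column: the query plan
        print(plan_line)
    cursor.execute("DEALLOCATE foo")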