PyODBC execute stored procedure does not complete - python

I have the following code; the stored procedure it runs is used to call several other stored procedures. I can run the stored procedure in SQL Server 2012 and it completes without issues. I am using Python 3.3.
cnxn = pyodbc.connect('DRIVER={SQL Server};Server=.\SQLEXPRESS;Database=MyDatabase;Trusted_Connection=yes;')
cursor = cnxn.cursor()
cnxn.timeout = 0
cnxn.autocommit = True
cursor.execute("""exec my_SP""")
The Python code is executing; I have confirmed this by inserting numerous print statements.
I did see the other question about making Python wait for the stored procedure to finish. I tried adding a time.sleep() after the execute and varying the delay (up to 120 seconds), with no change.
The stored procedure appears to be only partially executing, based on the results. The data suggests that it is even interrupting one of the sub-stored-procedures, yet the SP runs fine from Query Analyzer.
My best guess is that this is something SQL Server configuration related, but I am at a loss as to where to look.
Any thoughts?

Adding SET NOCOUNT ON to my proc worked for me.

I had the same issue and solved it with a combination of setting a locking variable (see answer from Ben Caine in this thread: make python wait for stored procedure to finish executing) and adding
"SET NOCOUNT ON"
after "CREATE PROCEDURE ... AS"

Just a follow-up: I have had limited success using the timing features from the link below and by reducing the level of stored procedure nesting.
At the level I was calling above, there were 4 layers of nested SPs; pyodbc seems to behave a little better when you have 3 layers or fewer. It doesn't make a lot of sense to me, but it works.
make python wait for stored procedure to finish executing
Any input on the rationale behind this would be greatly appreciated.


Python code doesn't run the SQL stored procedure completely

I am not proficient in Python, but I have written Python code that executes a SQL Server stored procedure which in turn calls multiple stored procedures, so it usually takes 5 minutes or so to run in SSMS.
When I run the Python code, I can see that the stored procedure only runs about halfway through, without any error, which makes me think it somehow needs more time to execute when called from Python.
I found other posts where people suggested subprocess, but I don't know how to code this. Below is an example (not mine) of Python code that executes the stored procedure.
mydb_lock = pyodbc.connect('Driver={SQL Server Native Client 11.0};'
                           'Server=localhost;'
                           'Database=InterelRMS;'
                           'Trusted_Connection=yes;'
                           'MARS_Connection=yes;'
                           'user=sa;'
                           'password=Passw0rd;')
mycursor_lock = mydb_lock.cursor()
sql_nodes = "Exec IVRP_Nodes"
mycursor_lock.execute(sql_nodes)
mydb_lock.commit()
How can I edit the above code to use subprocess? Is subprocess the right choice? Is there any other method you can suggest?
Many thanks.
Python 2.7 and 3
SQL Server
UPDATE 04/04/2022:
@AlwaysLearning, I tried
NEWcnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password+';Connection Timeout=0')
But there was no change. To check how far the code executes, I inserted the following two lines right after each other somewhere in the nested procedure, where I thought the SP stopped.
INSERT INTO CheckTable (OrgID,Stage,Created) VALUES(@OrgID,2.5331,getdate())
INSERT INTO CheckTable (OrgID,Stage,Created) VALUES(@OrgID,2.5332,getdate())
Only the first query completes. I use Azure SQL Database, if that helps.
UPDATE 05/04/2022:
I tried what @AlwaysLearning suggested: after my connection I added NEWcnxn.timeout = 4000 and it's working now. Many thanks.
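For reference, a minimal sketch of the working setup described above (server, database, username and password are the placeholders from the earlier snippet, and NEWcursor is a hypothetical name; pyodbc's connection timeout attribute is a per-query timeout in seconds):
NEWcnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=' + server +
                         ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
NEWcnxn.timeout = 4000        # query timeout in seconds; long-running nested procs need headroom
NEWcursor = NEWcnxn.cursor()  # hypothetical cursor name for this sketch
NEWcursor.execute("Exec IVRP_Nodes")
NEWcnxn.commit()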

SQL Stored Procedures not finishing when called from Python

I'm trying to call a stored procedure in my MS SQL database from a Python script, but it does not run completely when called via Python. This procedure consolidates transaction data into hour/daily blocks in a single table which is later grabbed by the Python script. If I run the procedure in SQL Server Management Studio, it completes just fine.
When I run it via my script, it gets cut short about two-thirds of the way through. I currently have a workaround of making the program sleep for 10 seconds before moving on to the next SQL statement, but this is neither time-efficient nor reliable, as some procedures may not finish in that time. I'm looking for a more elegant way to implement this.
Current Code:
cursor.execute("execute mySP")
time.sleep(10)
cursor.commit()
The most related article I can find to my issue is here:
make python wait for stored procedure to finish executing
I tried the solution using Tornado and I/O generators, but ran into the same issue listed in that post, which was never resolved. I also tried the accepted solution of having my stored procedure set a running-status field in the database: at the beginning of my SP, Status is updated to 1 in RunningStatus, and when the SP finishes, Status is updated to 0 in RunningStatus. Then I implemented the following Python code:
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
sconn = pyodbc.connect(conn_str)
scursor = sconn.cursor()
cursor.execute("execute mySP")
cursor.commit()
while 1:
    q = scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if q[0] == 0:
        break
When I implement this, the same problem happens as before, with my stored procedure appearing to finish executing before it is actually complete. If I eliminate the cursor.commit(), as follows, the connection just hangs indefinitely until I kill the Python process.
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
sconn = pyodbc.connect(conn_str)
scursor = sconn.cursor()
cursor.execute("execute mySP")
while 1:
    q = scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if q[0] == 0:
        break
Any assistance in finding a more efficient and reliable way to implement this, as opposed to time.sleep(10) would be appreciated.
As the OP found out, inconsistent or incomplete processing of stored procedures from an application layer like Python may be due to straying from T-SQL scripting best practices.
As @AaronBertrand highlights in his Stored Procedures Best Practices Checklist blog post, consider the following among other items:
Explicitly and liberally use BEGIN ... END blocks;
Use SET NOCOUNT ON to avoid sending a message to the client for every row-affecting action, which can interrupt the workflow;
Use semicolons for statement terminators.
Example
CREATE PROCEDURE dbo.myStoredProc
AS
BEGIN
    SET NOCOUNT ON;

    SELECT * FROM foo;

    SELECT * FROM bar;
END
GO
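On the Python side, a minimal sketch of the status-flag polling described in the question (RunningStatus/Status and conn_str come from the question; the sleep just avoids hammering the server, and autocommit=True is an assumption so the EXEC is not held inside an open transaction):
import time
import pyodbc

conn = pyodbc.connect(conn_str, autocommit=True)    # connection that runs the proc
cursor = conn.cursor()
sconn = pyodbc.connect(conn_str, autocommit=True)   # second connection for status checks
scursor = sconn.cursor()

cursor.execute("execute mySP")
while cursor.nextset():                             # consume anything the proc returns
    pass

while True:
    row = scursor.execute("SELECT Status FROM RunningStatus").fetchone()
    if row and row[0] == 0:                         # proc resets Status to 0 when it finishes
        break
    time.sleep(1)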

cx_Oracle Statement getting Stuck

While using cx_Oracle (Python), the code hangs when the following statement is executed:
some_connection.execute(some_sql)
What could be the reason?
Without seeing the actual SQL in question it is hard to know for sure. Some possible answers include:
1) the SQL actually takes a long time to execute (and you just have to be patient)
2) the SQL is blocked by another transaction (and that transaction needs to be committed or rolled back first)
You can find out by examining the contents of dba_locks, specifically looking at the blocking_others column. You can also attempt to issue the same SQL in SQL*Plus and see if it exhibits the same behaviour.
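If locking is suspected, something along these lines can check it from Python as well (a sketch only: credentials are placeholders, and it assumes your account can read the DBA_LOCKS view mentioned above):
import cx_Oracle

conn = cx_Oracle.connect("user", "password", "host/service")   # placeholder credentials
cur = conn.cursor()

# Inspect the blocking_others column the answer refers to.
cur.execute("SELECT session_id, lock_type, mode_held, blocking_others FROM dba_locks")
for session_id, lock_type, mode_held, blocking_others in cur:
    print(session_id, lock_type, mode_held, blocking_others)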

Python/Hive interface slow with fetchone(), hangs with fetchall()

I have a python script that is querying HiveServer2 using pyhs2, like so:
import pyhs2

conn = pyhs2.connect(host='localhost',
                     port=10000,
                     user='user',
                     password='password',
                     database='default')
cur = conn.cursor()
cur.execute("SELECT name,data,number,time FROM table WHERE date = '2014-01-01' AND number in (1,5,6,22) ORDER BY name,time ASC")
line = cur.fetchone()
while line is not None:
    # do some processing, including writing to stdout
    line = cur.fetchone()
I have also tried using fetchall() instead of fetchone(), but that just seems to hang forever.
My query runs just fine and returns ~270 million rows. For testing, I dumped the output from Hive into a flat, tab-delimited file and wrote the guts of my Python script against that, so I didn't have to wait for the query to finish every time I ran it. The script that reads the flat file finishes in ~20 minutes. What confuses me is that I don't see the same performance when I query Hive directly; in fact, it takes about 5 times longer to finish processing. I am pretty new to Hive and Python, so maybe I am making some bone-headed error, but the examples I see online show a setup such as this. I just want to iterate through my Hive results, getting one row at a time as quickly as possible, much like I did with my flat file. Any suggestions?
P.S. I have found this question that sounds similar:
Python slow on fetchone, hangs on fetchall
but that ended up being a SQLite issue, and I have no control over my Hive set up.
Have you considered using fetchmany()?
That would be the DB-API answer for pulling data in chunks (bigger than one row, where per-call overhead is an issue, and smaller than all rows, where memory is an issue).
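A minimal sketch of that fetchmany() pattern, applied to the cursor from the question (the batch size is just a starting point to tune):
batch_size = 10000                      # tune: larger batches reduce per-call overhead
while True:
    rows = cur.fetchmany(batch_size)
    if not rows:
        break
    for name, data, number, ts in rows:
        pass                            # do the per-row processing / write to stdout here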

Python in Windows: large number of inserts using pyodbc causes memory leak

I am trying to populate an MS SQL 2005 database using Python on Windows. I am inserting millions of rows, and by 7 million I am using almost a gigabyte of memory. The test below eats up 4 MB of RAM for each 100k rows inserted:
import pyodbc
connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
cursor = connection.cursor()
connection.autocommit = True
while 1:
    cursor.execute("insert into x (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)", 1, 2, 3, 4, 5, 6)
connection.close()
Hack solution: I ended up spawning a new process using the multiprocessing module to return memory. Still confused about why inserting rows in this way consumes so much memory. Any ideas?
I had the same issue, and it looks like a pyodbc issue with parameterized inserts: http://code.google.com/p/pyodbc/issues/detail?id=145
Temporarily switching to a static insert with the VALUES clause populated eliminates the leak, until I try a build from the current source.
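For illustration, a rough sketch of that workaround: building the VALUES clause as a literal instead of binding parameters (only sensible for trusted numeric data, since it bypasses parameterization; some_row_source is a placeholder iterable):
import pyodbc

connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
connection.autocommit = True
cursor = connection.cursor()

for a, b, c, d, e, f in some_row_source:     # placeholder iterable of numeric tuples
    # Static statement, no parameter binding (the binding path is what the
    # linked pyodbc issue identifies as the source of the leak).
    cursor.execute("insert into x (a,b,c,d,e,f) VALUES (%d,%d,%d,%d,%d,%d)"
                   % (a, b, c, d, e, f))
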
I faced the same problem too.
I had to read more than 50 XML files, each about 300 MB, and load them into SQL Server 2005.
I tried the following:
Using the same cursor by dereferencing.
Closing/opening the connection.
Setting the connection to None.
Finally I ended up bootstrapping each XML file load using the Process module.
Now I have replaced that approach with IronPython, using System.Data.SqlClient.
This gives better performance and also a better interface.
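A minimal sketch of that process-per-file bootstrapping with the standard multiprocessing module (load_one_file is a hypothetical helper that opens its own connection, parses one XML file and inserts its rows; the point is that memory is returned to the OS when each child process exits):
from multiprocessing import Process

def load_one_file(path):
    # Hypothetical: open a fresh pyodbc connection here, parse the XML file
    # at `path`, insert its rows, then close the connection.
    pass

if __name__ == '__main__':
    for path in ['file1.xml', 'file2.xml']:          # placeholder file list
        p = Process(target=load_one_file, args=(path,))
        p.start()
        p.join()   # one file per child process; memory is freed when it exits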
Maybe close & re-open the connection every million rows or so?
Sure it doesn't solve anything, but if you only have to do this once you could get on with life!
Try creating a separate cursor for each insert. Reuse the cursor variable each time through the loop to implicitly dereference the previous cursor. Add a connection.commit after each insert.
You may only need something as simple as a time.sleep(0) at the bottom of each loop to allow the garbage collector to run.
You could also try forcing a garbage collection every once in a while with gc.collect() after importing the gc module.
Another option might be to use cursor.executemany() and see if that clears up the problem. The nasty thing about executemany(), though, is that it takes a sequence rather than an iterator (so you can't pass it a generator). I'd try the garbage collector first.
EDIT: I just tested the code you posted, and I am not seeing the same issue. Are you using an old version of pyodbc?
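For completeness, a rough sketch of the executemany() suggestion above combined with an occasional manual garbage collection (row data and batch size are placeholders):
import gc
import pyodbc

connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
cursor = connection.cursor()

rows = [(1, 2, 3, 4, 5, 6)] * 100000       # placeholder data
batch = 10000
for start in range(0, len(rows), batch):
    cursor.executemany("insert into x (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)",
                       rows[start:start + batch])
    connection.commit()
    gc.collect()                           # nudge the collector between batches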
