Pyodbc stored procedure with params not updating table - python

I am using python 3.9 with a pyodbc connection to call a SQL Server stored procedure with two parameters.
This is the code I am using:
connectionString = buildConnection() # build connection
cursor = connectionString.cursor() # Create cursor
command = """exec [D7Ignite].[Service].[spInsertImgSearchHitResults] #RequestId = ?, #ImageInfo = ?"""
values = (requestid, data)
cursor.execute(command, (values))
cursor.commit()
cursor.close()
requestid is simply an integer, and data is defined as follows (a list of JSON objects):
[{"ImageSignatureId":"27833", "SimilarityPercentage":"1.0"}]
The stored procedure I am trying to run is supposed to insert data into a table, and it works perfectly fine when executed from Management Studio. When I run the code above there are no errors, but no data is inserted into the table.
To help me debug, I printed the query preview:
exec [D7Ignite].[Service].[spInsertImgSearchHitResults] @RequestId = 1693, @ImageInfo = [{"ImageSignatureId":"27833", "SimilarityPercentage":"1.0"}]
Pasting this exact line into SQL Server runs the stored procedure with no problem, and data is properly inserted.
I have enabled autocommit=True when setting up the connection, and other CRUD commands work perfectly fine with pyodbc.
Is there anything I'm overlooking? Or is pyodbc simply not processing my query properly? If so, are there any other ways to run Stored Procedures from Python?
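For reference, here is one shape such a call could take with pyodbc. This is a minimal sketch, not a confirmed fix: it assumes the parameter markers should be @ (standard T-SQL), prepends SET NOCOUNT ON so that row-count messages emitted by the procedure cannot interfere with the batch, and commits explicitly on the connection. The connection string details are placeholders.

import pyodbc

# Sketch only: placeholder connection details, explicit commit, and '@' parameter names.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=D7Ignite;"
    "UID=<user>;PWD=<password>"
)

requestid = 1693
data = '[{"ImageSignatureId":"27833", "SimilarityPercentage":"1.0"}]'

sql = """
    SET NOCOUNT ON;
    EXEC [D7Ignite].[Service].[spInsertImgSearchHitResults]
        @RequestId = ?, @ImageInfo = ?;
"""

cursor = conn.cursor()
cursor.execute(sql, (requestid, data))
conn.commit()  # commit on the connection as well, in case autocommit is not active
cursor.close()
conn.close()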

Related

MySQL table definition has changed error when reading from a table that has been written to by PySpark

I am currently working on a data pipeline with pyspark. As part of the pipeline, I write a spark dataframe to mysql using the following function:
def jdbc_insert_overwrite_table(df, mysql_user, mysql_pass, mysql_host, mysql_port, mysql_db,
                                num_executors, table_name, logger):
    mysql_url = "jdbc:mysql://{}:{}/{}?characterEncoding=utf8".format(mysql_host, mysql_port, mysql_db)
    logger.warn("JDBC Writing to table " + table_name)
    df.write.format('jdbc')\
        .options(
            url=mysql_url,
            driver='com.mysql.cj.jdbc.Driver',
            dbtable=table_name,
            user=mysql_user,
            password=mysql_pass,
            truncate=True,
            numpartitions=num_executors,
            batchsize=100000
        ).mode('Overwrite').save()
This works with no issue. However, later on in the pipeline (within the same PySpark app/ spark session), this table is a dependency for another transformation, and I try reading from this table using the following function:
def read_mysql_table_in_session_df(spark, mysql_conn, query_str, query_schema):
    cursor = mysql_conn.cursor()
    cursor.execute(query_str)
    records = cursor.fetchall()
    df = spark.createDataFrame(records, schema=query_schema)
    return df
And I get this MySQL error: Error 1412: Table definition has changed, please retry transaction.
I've been able to resolve this by closing the connection and reconnecting with ping(reconnect=True), but I don't like this solution as it feels like a band-aid.
Any ideas why I'm getting this error? I've confirmed writing to the table does not change the table definition (schema wise, at least).
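For context, the workaround described above looks roughly like the sketch below. It assumes mysql_conn is a mysql-connector-python connection; ping(reconnect=True) simply re-establishes the session before the read, which is the band-aid being described, not a root-cause fix.

def read_mysql_table_with_reconnect(spark, mysql_conn, query_str, query_schema):
    # Band-aid: force a fresh connection/session so the read no longer hits error 1412
    mysql_conn.ping(reconnect=True)
    cursor = mysql_conn.cursor()
    cursor.execute(query_str)
    records = cursor.fetchall()
    cursor.close()
    return spark.createDataFrame(records, schema=query_schema)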

Running simple query through python: No results

I am trying to learn how to get Microsoft SQL Server query results using Python and the pyodbc module, and I have run into an issue: I don't get the same results that the same query returns in SQL Server Management Studio.
I've looked at the pyodbc documentation and set up my connection correctly... at least I'm not getting any connection errors at execution. The only issue seems to be returning the table data.
import pyodbc
import sys
import csv
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=<server>;DATABASE=<db>;UID=<uid>;PWD=<PWD>')
cursor = cnxn.cursor()
cursor.execute("""
SELECT request_id
From audit_request request
where request.reception_datetime between '2019-08-18' and '2019-08-19' """)
rows = cursor.fetchall()
for row in cursor:
print(row.request_id)
When I run the above code I get this in the Python terminal window:
Process returned 0 (0x0) execution time : 0.331 s
Press any key to continue . . .
I tried this same query in SQL Management Studio and it returns the results I am looking for. There must be something I'm missing as far as displaying the results using python.
You're not actually setting your cursor up to be used. You should have something like this before executing:
cursor = cnxn.cursor()
Learn more here: https://github.com/mkleehammer/pyodbc/wiki/Connection#cursor
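Separately, one detail in the question's code is worth a note (an observation, not part of the answer above): cursor.fetchall() consumes the result set, so iterating the cursor afterwards yields nothing even when rows were returned. A minimal sketch that loops over the fetched list instead:

rows = cursor.fetchall()
if not rows:
    print("query returned no rows")
for row in rows:
    print(row.request_id)  # pyodbc Row objects expose columns as attributes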

REINDEX DATABASE cannot run inside a transaction block

I am using an old version of SQLAlchemy (0.8) and I need to execute "REINDEX DATABASE <dbname>" on PostgreSQL 9.4 using the SQLAlchemy API.
Initially I tried with:
conn = pg_db.connect()
conn.execute('REINDEX DATABASE sg2')
conn.close()
but I got the error "REINDEX DATABASE cannot run inside a transaction block".
I read around on the internet and tried other approaches:
engine.execute(text("REINDEX DATABASE sg2").execution_options(autocommit=True))
(I also tried with autocommit=False.)
and
conn = engine.raw_connection()
cursor = conn.cursor()
cursor.execute('REINDEX DATABASE sg2')
cursor.close()
I always get the same error.
I also tried the following:
conn.execution_options(isolation_level="AUTOCOMMIT").execute(query)
but I got error
Invalid value 'AUTOCOMMIT' for isolation_level. Valid isolation levels for postgresql are REPEATABLE READ, READ COMMITTED, READ UNCOMMITTED, SERIALIZABLE
What am I missing here? Thanks for any help.
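One possibility, offered as a sketch rather than a confirmed fix: since this old SQLAlchemy version rejects the AUTOCOMMIT isolation level, the statement can be issued through the DBAPI driver directly with autocommit enabled. This assumes the driver is psycopg2; the connection parameters are placeholders.

import psycopg2

# Sketch only: run REINDEX on an autocommitting psycopg2 connection,
# since REINDEX DATABASE cannot run inside a transaction block.
conn = psycopg2.connect(dbname='sg2', user='<user>', password='<password>', host='<host>')
conn.autocommit = True
cur = conn.cursor()
cur.execute('REINDEX DATABASE sg2')
cur.close()
conn.close()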

python script hangs when calling cursor.fetchall() with large data set

I have a query that returns over 125K rows.
The goal is to write a script that iterates through the rows and, for each one, populates a second table with data processed from the query result.
To develop the script, I created a duplicate database with a small subset of the data (4,126 rows).
On the small database, the following code works:
import os
import sys
import random
import mysql.connector

cnx = mysql.connector.connect(user='dbuser', password='thePassword',
                              host='127.0.0.1',
                              database='db')
cnx_out = mysql.connector.connect(user='dbuser', password='thePassword',
                                  host='127.0.0.1',
                                  database='db')
ins_curs = cnx_out.cursor()
curs = cnx.cursor(dictionary=True)
#curs = cnx.cursor(dictionary=True,buffered=True) #fail

with open('sql\\getRawData.sql') as fh:
    sql = fh.read()

curs.execute(sql, params=None, multi=False)
result = curs.fetchall()  #<=== script stops at this point
print len(result)  #<=== this line never executes
print curs.column_names
curs.close()
cnx.close()
cnx_out.close()
sys.exit()
The line curs.execute(sql, params=None, multi=False) succeeds on both the large and small databases.
If I use curs.fetchone() in a loop, I can read all records.
If I alter the line:
curs = cnx.cursor(dictionary=True)
to read:
curs = cnx.cursor(dictionary=True,buffered=True)
The script hangs at curs.execute(sql, params=None, multi=False).
I can find no documentation on any limits to fetchall(), nor any way to increase the buffer size, nor any way to tell how large a buffer I would even need.
There are no exceptions raised.
How can I resolve this?
I was having this same issue, first on a query that returned ~70k rows and then on one that only returned around 2k rows (and for me RAM was also not the limiting factor). I switched from mysql.connector (i.e. the mysql-connector-python package) to MySQLdb (i.e. the mysql-python package) and was then able to fetchall() on large queries with no problem. Both packages seem to follow the Python DB API, so for me MySQLdb was a drop-in replacement for mysql.connector, with no code changes necessary beyond the line that sets up the connection. YMMV if you're leveraging something specific about mysql.connector.
Pragmatically speaking, if you don't have a specific reason to be using mysql.connector, the solution is simply to switch to a package that works better!
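As a side note, and independent of the library switch suggested above, a common way to avoid materializing all 125K rows at once is chunked fetching with fetchmany(). A minimal sketch, assuming curs has already executed the query as in the question's code:

# Sketch only: process the result set in chunks instead of one fetchall().
CHUNK_SIZE = 10000
while True:
    batch = curs.fetchmany(CHUNK_SIZE)
    if not batch:
        break
    for row in batch:
        # process each row and insert into the second table via ins_curs here
        pass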

Python with Mysql - pdf file insertion during runtime

I have a script that stores results in PDF format in a particular folder. I want to create a MySQL database (which is successful with the code below) and populate it with the PDF results. What would be the best way: storing the file itself, or storing a reference to its location? The file size would be around 2 MB. Could someone explain this with some working examples? I am new to both Python and MySQL. Thanks in advance.
To clarify: I tried using LOAD DATA INFILE and a BLOB type for the result file column, but it doesn't seem to work. I am using the pymysql module to connect to the database. The code below connects to the database and is successful.
import pymysql

conn = pymysql.connect(host='hostname', port=3306, user='root', passwd='abcdef', db='mydb')
cur = conn.cursor()
cur.execute("SELECT * FROM userlogin")
for r in cur.fetchall():
    print(r)
cur.close()
conn.close()
Since you seem to be close to getting MySQL to store strings for you (user names), your best bet is to stick with that approach and store the file path, just as you stored the strings in your userlogin table (but in a different table with a foreign key to userlogin). It will probably be the most efficient approach in the long run anyway, especially if you store useful metadata along with the file path (like keywords or even complete n-gram sets). At that point you're talking about a file indexing system like Google Desktop or Xapian, which should give you a sense of what you're up against if you want to do this the "best" way.
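A minimal sketch of the path-based approach described above, assuming a hypothetical pdf_results table that references userlogin; the table and column names are illustrative placeholders, not part of the original schema.

import pymysql

# Sketch only: store the PDF's path, not its bytes. Names here are hypothetical.
conn = pymysql.connect(host='hostname', port=3306, user='root', passwd='abcdef', db='mydb')
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS pdf_results (
        id INT AUTO_INCREMENT PRIMARY KEY,
        userlogin_id INT NOT NULL,
        file_path VARCHAR(512) NOT NULL
    )
""")

pdf_path = '/path/to/results/report_001.pdf'  # placeholder for a real file written by the script
cur.execute(
    "INSERT INTO pdf_results (userlogin_id, file_path) VALUES (%s, %s)",
    (1, pdf_path),
)
conn.commit()
cur.close()
conn.close()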
