pandas to_sql() gives a SADeprecationWarning - python

The to_sql() function in pandas is now producing a SADeprecationWarning.
df.to_sql(name=tablename, con=c, if_exists='append', index=False )
[..]/lib/python3.8/site-packages/pandas/io/sql.py:1430: SADeprecationWarning:The Connection.run_callable() method is deprecated and will be removed in a future release. Use a context manager instead. (deprecated since: 1.4)
I was getting this even with the pd.read_sql() command when running SQL select statements. Changing it to pd.read_sql_query(), which read_sql() wraps, got rid of the warning, so I suspect the two are linked.
So the question is: how do I write a dataframe to SQL without relying on something that will be removed in a future release? What does "use a context manager" mean, and how can I implement that?
Versions:
pandas: 1.1.5 | SQLAlchemy: 1.4.0 | pyodbc: 4.0.30 | Python: 3.8.0
Working with a mssql database.
OS: Linux Mint Xfce, 18.04. Using a python virtual environment.
If it matters, connection created like so:
conn_str = r'mssql+pyodbc:///?odbc_connect={}'.format(dbString).strip()
sqlEngine = sqlalchemy.create_engine(conn_str,echo=False, pool_recycle=3600)
c = sqlEngine.connect()
And after the db operation,
c.close()
Doing so keeps the main connection sqlEngine "alive" between api calls and lets me use a pooled connection rather than having to connect anew.

Update: according to the pandas team, this will be fixed in Pandas 1.2.4 which as of the time of writing has not been released yet.
Adding this as an answer since Google led here but the accepted answer is not applicable.
Our surrounding code that uses Pandas does use a context manager:
with get_engine(dbname).connect() as conn:
    df = pd.read_sql(stmt, conn, **kwargs)
return df
In my case, this error is being thrown from within pandas itself, not in the surrounding code that uses pandas:
/Users/tommycarpenter/Development/python-indexapi/.tox/py37/lib/python3.7/site-packages/pandas/io/sql.py:1430: SADeprecationWarning: The Engine.run_callable() method is deprecated and will be removed in a future release. Use the Engine.connect() context manager instead. (deprecated since: 1.4)
The snippet from pandas itself is:
def has_table(self, name, schema=None):
    return self.connectable.run_callable(
        self.connectable.dialect.has_table, name, schema or self.meta.schema
    )
I raised an issue: https://github.com/pandas-dev/pandas/issues/40825
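Until a fixed pandas release is installed, one stopgap is to silence only this particular warning around the pandas call. The sketch below uses just the standard-library warnings module. It assumes that SADeprecationWarning is a subclass of DeprecationWarning, and the helper name call_quietly and the message pattern are made up for illustration, so treat it as a starting point rather than an official recommendation.

```python
import warnings

# Regex matched against the start of the warning message (hypothetical pattern,
# based on the warning text quoted above).
RUN_CALLABLE_PATTERN = r".*run_callable\(\) method is deprecated.*"


def call_quietly(func, *args, **kwargs):
    """Call func while ignoring only the run_callable() deprecation warning."""
    with warnings.catch_warnings():
        # The filter is active only inside this block; the previous filters
        # are restored when the block exits.
        warnings.filterwarnings(
            "ignore",
            message=RUN_CALLABLE_PATTERN,
            category=DeprecationWarning,
        )
        return func(*args, **kwargs)
```

It would be used along the lines of df = call_quietly(pd.read_sql, stmt, conn). Other warnings still get through, which is the reason for filtering by message instead of ignoring the whole category.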

You could try...
connection_string = r'mssql+pyodbc:///?odbc_connect={}'.format(dbString).strip()
engine = sqlalchemy.create_engine(connection_string, echo=False, pool_recycle=3600)
with engine.connect() as connection:
    df.to_sql(name=tablename, con=connection, if_exists='append', index=False)
This approach uses a context manager. The with block around engine.connect() hands you a connection and automatically invokes connection.close() on it when the block exits, even if an error is raised. Another useful thing to know: if you also want transaction handling, engine.begin() works as a context manager too. It begins a transaction, commits it when the block finishes, and automatically invokes a rollback in case of an error.
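If "use a context manager" is still abstract, here is a minimal sketch of the same two ideas (always close the connection, commit or roll back the transaction) using only the standard-library sqlite3 module. sqlite3 is a stand-in chosen so the example runs without a database server, pandas or SQLAlchemy; the function name append_rows and the demo table are hypothetical.

```python
import sqlite3
from contextlib import closing


def append_rows(db_path, rows):
    """Insert rows in one transaction and always close the connection."""
    # closing() calls conn.close() when the block exits, even on error.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS demo (id INTEGER PRIMARY KEY, name TEXT)"
        )
        # A sqlite3 connection used in a with block manages the transaction:
        # commit if the block succeeds, rollback if it raises.
        with conn:
            conn.executemany("INSERT INTO demo (id, name) VALUES (?, ?)", rows)
        return conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
```

For example, append_rows(":memory:", [(1, "a"), (2, "b")]) inserts two rows and returns the row count. The point is that cleanup is tied to leaving the with block, so there is no separate c.close() call to forget.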

Related

pyodbc MERGE INTO error: HY000: The driver did not supply an error

I'm trying to execute many (~1000) MERGE INTO statements into oracledb 11.2.0.4.0(64bit) using python 3.9.2(64bit) and pyodbc 4.0.30(64bit). However, all the statements return an exception:
HY000: The driver did not supply an error
I've tried everything I can think of to solve this problem, but no luck. I tried changing code, encodings/decodings and ODBC driver from oracle home 12.1(64bit) to oracle home 19.1(64bit). I also tried using pyodbc 4.0.22 in which case the error just changed into:
<class 'pyodbc.ProgrammingError'> returned a result with an error set
Which is not any more helpful error than the first one. The issue I assume cannot be the MERGE INTO statement itself, because when I try running them directly in the database shell, it completes without issue.
Below is my code. I guess I should also mention the commands and parameters are read from stdin before being executed, and oracledb is using utf8 characterset.
cmds = sys.stdin.readlines()
comms = json.loads(cmds[0])
conn = pyodbc.connect(connstring)
conn.setencoding(encoding="utf-8")
cursor = conn.cursor()
cursor.execute("""ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD"T"HH24:MI:SS.....'""")
for comm in comms:
    params = [(None) if str(x)=='None' or str(x)=='NULL' else (x) for x in comm["params"]]
    try:
        cursor.execute(comm["sql"],params)
    except Exception as e:
        print(e)
conn.commit()
conn.close()
Edit: Another thing worth mentioning for sure: this issue began after the update from python2.7 to 3.9.2. The code itself didn't require any changes at all in this particular location, though.
I've had my share of HY000 errors in the past. It almost always came down to a syntax error in the SQL query. Double check all your double and single quotes, and make sure the query works when run independently in an SQL session to your database.

MySQL stored procedure sometimes returns 0 rows

EDIT: I've now tried pyodbc as well as pymysql, and have the same result (zero rows returned when calling a stored procedure). Forgot to mention before that this is on Ubuntu 16.04.2 LTS using the MySQL ODBC 5.3 Driver (libmyodbc5w.so).
I'm using pymysql (0.7.11) on Python 3.5.2, executing various stored procedures against a MySQL 5.6.10 database. I'm running into a strange and inconsistent issue where I'm occasionally getting zero results returned, though I can immediately re-run the exact same code and get the number of rows I expect.
The code is pretty straightforward...
from collections import OrderedDict
import pymysql
from pymysql.cursors import DictCursorMixin, Cursor
class OrderedDictCursor(DictCursorMixin, Cursor):
    dict_type = OrderedDict
try:
    connection = pymysql.connect(
        host=my_server,
        user=my_user,
        password=my_password,
        db=my_database,
        connect_timeout=60,
        cursorclass=pymysql.cursors.DictCursor
    )
    param1 = '2017-08-23 00:00:00'
    param2 = '2017-08-24 00:00:00'
    proc_args = tuple([param1, param2])
    proc = 'my_proc_name'
    cursor = connection.cursor(OrderedDictCursor)
    cursor.callproc(proc, proc_args)
    result = cursor.fetchall()
except Exception as e:
    print('Error: ', e)
finally:
    if not isinstance(connection, str):
        connection.close()
More often than not, it works just fine. But every once in a while, it completes almost instantly with zero rows in the result set. There is no error that I can see, just nothing. Run it again, and no problem.
Turns out that the problem had nothing to do with pymysql, odbc, etc., but rather was a problem with the order in which the parameters were passed to the stored procedure.
On my desktop, I was using Python 3.6 and things worked just fine. I didn't realize, though, that one of the changes between 3.5.2 and 3.6 affected how items added to a dictionary object via json.loads were ordered.
The parameters being passed were coming from a dict object originally populated via json.loads... since they were unordered pre-3.6, running the code would occasionally mean that my starttime and endtime parameters were passed to the MySQL stored procedure backwards. Hence, zero rows returned.
Once I realized that was the issue, fixing it was just a matter of adding object_pairs_hook=OrderedDict to the json.loads part.
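A minimal sketch of that fix, assuming the parameters arrive as a JSON object whose key order is the order the stored procedure expects. The payload and the key names starttime and endtime are illustrative, not taken from the real application.

```python
import json
from collections import OrderedDict

# Hypothetical payload; in the real code this came from elsewhere.
payload = '{"starttime": "2017-08-23 00:00:00", "endtime": "2017-08-24 00:00:00"}'

# object_pairs_hook receives the key/value pairs in document order, so the
# resulting OrderedDict keeps that order on every Python version.
params = json.loads(payload, object_pairs_hook=OrderedDict)

# Positional arguments for cursor.callproc(), in a deterministic order.
proc_args = tuple(params.values())
```

An alternative that does not depend on ordering at all is to pull the values out by key, e.g. (params["starttime"], params["endtime"]).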

latency with group in pymongo in tests

Good Day.
I have run into the following issue using pymongo==2.1.1 in python2.7 with mongo 2.4.8.
I have tried to find a solution using Google and Stack Overflow but failed.
What's the issue?
I have the following function:
from bson.code import Code
from pymongo import Connection
def read(groupped_by=None):
    reducer = Code("""
        function(obj, prev){
            prev.count++;
        }
    """)
    client = Connection('localhost', 27017)
    db = client.urlstats_database
    results = db.http_requests.group(key={k:1 for k in groupped_by},
                                     condition={},
                                     initial={"count": 0},
                                     reduce=reducer)
    groupped_by = list(groupped_by) + ['count']
    result = [tuple(res[col] for col in groupped_by) for res in results]
    return sorted(result)
Then I am trying to write a test for this function:
class UrlstatsViewsTestCase(TestCase):
    test_data = {'data%s' % i : 'data%s' % i for i in range(6)}
    def test_one_criterium(self):
        client = Connection('localhost', 27017)
        db = client.urlstats_database
        for column in self.test_data:
            db.http_requests.remove()
            db.http_requests.insert(self.test_data)
            response = read([column])
            self.assertEqual(response, [(self.test_data[column], 1)])
This test sometimes fails, as I understand it because of latency: as far as I can see, the response still contains data that should have been removed.
If I add a delay after remove(), the test passes every time.
Is there a proper way to test such functionality?
Thanks in Advance.
A few questions regarding your environment / code:
What version of pymongo are you using?
If you are using any of the newer versions that have MongoClient, is there any specific reason you are using Connection instead of MongoClient?
The reason I ask the second question is that Connection provides fire-and-forget behaviour for the operations you are doing, while MongoClient works in safe mode by default and is also the preferred approach since mongodb 2.2+.
The behaviour that you see is consistent with using Connection instead of MongoClient. With Connection, your remove is sent to the server, and the moment it leaves the client your program moves on to the next step, which is to add new entries. Depending on latency and how long the remove takes to complete, the two operations can conflict, as you have already noticed in your test case.
Can you change to use MongoClient and see if that helps you with your test code?
Additional Ref: pymongo: MongoClient or Connection
Thanks All.
There is no MongoClient class in the version of pymongo I use, so I was forced to find out what exactly differs.
As soon as I upgrade to 2.2+ I will test whether everything is OK with MongoClient. But with the Connection class one can use a write concern to control this latency.
In older versions one should create the connection with the corresponding arguments.
I have tried these two: journal=True, safe=True (the journal write concern can't be used in non-safe mode).
j or journal: Block until write operations have been committed to the journal. Ignored if the server is running without journaling. Implies safe=True.
I think this makes performance worse, but for automated tests this should be OK.

What's the difference between pyodbc and MySQLdb?

I have some code written with pyodbc on win x64 using python 2.6 and it works without problems.
Using the same code but switching to MySQLdb, I get errors.
Example: Long object not iterable....
What's the difference between pyodbc and MySQLdb?
EDIT
import csv, pyodbc, os
import numpy as np
cxn = pyodbc.connect('DSN=MySQL;PWD=me')
import MySQLdb
cxn = MySQLdb.connect (host = "localhost",user="root",passwd ="me")
csr = cxn.cursor()
try:
    csr.execute('Call spex.updtop')
    cxn. commit
except: pass
csr.close()
cxn.close()
del csr, cxn
Without seeing code, it's not obvious why you're getting errors. You can connect to MySQL databases using either one, and they both implement version 2.x of the Python DB API, though their underlying workings are totally different, as Ignacio Vazquez-Abrams commented.
Some things to consider:
Are you using extensions to the Python DB API that might not be implemented in both?
Are the two libraries translating MySQL datatypes to Python datatypes the same way?
Is there example code you could post?
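One concrete way to compare two drivers is to look at the module-level attributes that the Python DB API (PEP 249) requires every driver to expose. The sketch below inspects the standard-library sqlite3 module only because it is always available; the helper name describe_driver is made up, and you would pass pyodbc or MySQLdb instead. As far as I know pyodbc reports the 'qmark' parameter style (? placeholders) and MySQLdb reports 'format' (%s placeholders), which is one common reason the same execute() call behaves differently.

```python
import sqlite3


def describe_driver(module):
    """Return the PEP 249 module-level attributes of a DB-API driver."""
    return {
        "apilevel": module.apilevel,          # DB API version, e.g. "2.0"
        "paramstyle": module.paramstyle,      # placeholder style for parameters
        "threadsafety": module.threadsafety,  # integer level defined by PEP 249
    }


# sqlite3 is used here purely as an always-installed example driver.
info = describe_driver(sqlite3)
```

Running describe_driver(pyodbc) and describe_driver(MySQLdb) side by side shows where the two differ before you start comparing query results.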

Sqlite / SQLAlchemy: how to enforce Foreign Keys?

The new version of SQLite has the ability to enforce Foreign Key constraints, but for the sake of backwards-compatibility, you have to turn it on for each database connection separately!
sqlite> PRAGMA foreign_keys = ON;
I am using SQLAlchemy -- how can I make sure this always gets turned on?
What I have tried is this:
engine = sqlalchemy.create_engine('sqlite:///:memory:', echo=True)
engine.execute('pragma foreign_keys=on')
...but it is not working!...What am I missing?
EDIT:
I think my real problem is that I have more than one version of SQLite installed, and Python is not using the latest one!
>>> import sqlite3
>>> print sqlite3.sqlite_version
3.3.4
But I just downloaded 3.6.23 and put the exe in my project directory!
How can I figure out which .exe it's using, and change it?
For recent versions (SQLAlchemy ~0.7) the SQLAlchemy homepage says:
PoolListener is deprecated. Please refer to PoolEvents.
Then the example by CarlS becomes:
engine = create_engine(database_url)
def _fk_pragma_on_connect(dbapi_con, con_record):
    dbapi_con.execute('pragma foreign_keys=ON')
from sqlalchemy import event
event.listen(engine, 'connect', _fk_pragma_on_connect)
Building on the answers from conny and shadowmatter, here's code that will check if you are using SQLite3 before emitting the PRAGMA statement:
from sqlalchemy import event
from sqlalchemy.engine import Engine
from sqlite3 import Connection as SQLite3Connection
@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA foreign_keys=ON;")
        cursor.close()
I now have this working:
Download the latest sqlite and pysqlite2 builds as described above: make sure correct versions are being used at runtime by python.
import sqlite3
import pysqlite2
print sqlite3.sqlite_version # should be 3.6.23.1
print pysqlite2.__path__ # eg C:\\Python26\\lib\\site-packages\\pysqlite2
Next add a PoolListener:
from sqlalchemy.interfaces import PoolListener
class ForeignKeysListener(PoolListener):
    def connect(self, dbapi_con, con_record):
        db_cursor = dbapi_con.execute('pragma foreign_keys=ON')
engine = create_engine(database_url, listeners=[ForeignKeysListener()])
Then be careful how you test if foreign keys are working: I had some confusion here. When using sqlalchemy ORM to add() things my import code was implicitly handling the relation hookups so could never fail. Adding nullable=False to some ForeignKey() statements helped me here.
The way I test sqlalchemy sqlite foreign key support is enabled is to do a manual insert from a declarative ORM class:
# example
ins = Coverage.__table__.insert().values(id = 99,
                                         description = 'Wrong',
                                         area = 42.0,
                                         wall_id = 99, # invalid fkey id
                                         type_id = 99) # invalid fkey_id
session.execute(ins)
Here wall_id and type_id are both ForeignKey()'s and sqlite throws an exception correctly now if trying to hookup invalid fkeys. So it works! If you remove the listener then sqlalchemy will happily add invalid entries.
I believe the main problem may be multiple sqlite3.dll's (or .so) lying around.
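To see which SQLite library Python is actually using, check the version reported by the sqlite3 module itself. Python's sqlite3 module talks to the SQLite library it was built against or loads at runtime, not to a sqlite3.exe shell sitting in the project directory. A small sketch, with the helper name supports_foreign_keys made up for illustration and the 3.6.19 threshold taken from the version requirement mentioned in this thread:

```python
import sqlite3

# Minimum SQLite version with foreign key enforcement, per this thread.
MIN_FK_VERSION = (3, 6, 19)


def supports_foreign_keys(version=sqlite3.sqlite_version_info):
    """Return True if the given SQLite version can enforce foreign keys."""
    return tuple(version) >= MIN_FK_VERSION


# Version of the SQLite library in use (not the version of the Python module).
print(sqlite3.sqlite_version, supports_foreign_keys())
```

With the 3.3.4 library from the question, supports_foreign_keys((3, 3, 4)) is False, so no PRAGMA would help until Python picks up a newer library.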
As a simpler approach if your session creation is centralised behind a Python helper function (rather than exposing the SQLA engine directly), you can just issue session.execute('pragma foreign_keys=on') before returning the freshly created session.
You only need the pool listener approach if arbitrary parts of your application may create SQLA sessions against the database.
From the SQLite dialect page:
SQLite supports FOREIGN KEY syntax when emitting CREATE statements for tables, however by default these constraints have no effect on the operation of the table.
Constraint checking on SQLite has three prerequisites:
At least version 3.6.19 of SQLite must be in use
The SQLite library must be compiled without the SQLITE_OMIT_FOREIGN_KEY or SQLITE_OMIT_TRIGGER symbols enabled.
The PRAGMA foreign_keys = ON statement must be emitted on all connections before use.
SQLAlchemy allows for the PRAGMA statement to be emitted automatically for new connections through the usage of events:
from sqlalchemy.engine import Engine
from sqlalchemy import event
@event.listens_for(Engine, "connect")
def set_sqlite_pragma(dbapi_connection, connection_record):
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()
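The effect of the PRAGMA can be checked without SQLAlchemy at all. The sketch below uses only the standard-library sqlite3 module and two throwaway tables (parent and child, names made up for illustration). It assumes a SQLite build where foreign keys are off by default, which is the usual case but can be changed at compile time.

```python
import sqlite3


def fk_enforced(enable_pragma):
    """Return True if inserting an orphan child row is rejected."""
    conn = sqlite3.connect(":memory:")
    try:
        if enable_pragma:
            # Must be issued per connection, outside of a transaction.
            conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE child ("
            "id INTEGER PRIMARY KEY, "
            "parent_id INTEGER REFERENCES parent(id))"
        )
        try:
            # parent 99 does not exist, so this row violates the constraint.
            conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 99)")
        except sqlite3.IntegrityError:
            return True
        return False
    finally:
        conn.close()
```

Calling fk_enforced(True) and fk_enforced(False) on the same machine shows the difference the PRAGMA makes, which is a handy sanity check before debugging the SQLAlchemy event hookup.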
One-liner version of conny's answer:
from sqlalchemy import event
event.listen(engine, 'connect', lambda c, _: c.execute('pragma foreign_keys=on'))
I had the same problem before (scripts with foreign key constraints were going through but the actual constraints were not enforced by the sqlite engine); I got it solved by:
downloading, building and installing the latest version of sqlite from here: sqlite-sqlite-amalgamation; before this I had sqlite 3.6.16 on my ubuntu machine; which didn't support foreign keys yet; it should be 3.6.19 or higher to have them working.
installing the latest version of pysqlite from here: pysqlite-2.6.0
after that I started getting exceptions whenever foreign key constraint failed
hope this helps, regards
If you need to execute something for setup on every connection, use a PoolListener.
Enforce Foreign Key constraints for sqlite when using Flask + SQLAlchemy.
import logging
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
logger = logging.getLogger(__name__)  # assumed; not shown in the original snippet
db = SQLAlchemy()  # assumed; not shown in the original snippet
def create_app(config: str=None):
    app = Flask(__name__, instance_relative_config=True)
    if config is None:
        app.config.from_pyfile('dev.py')
    else:
        logger.debug('Using %s as configuration', config)
        app.config.from_pyfile(config)
    db.init_app(app)
    # Ensure FOREIGN KEY for sqlite3
    if 'sqlite' in app.config['SQLALCHEMY_DATABASE_URI']:
        def _fk_pragma_on_connect(dbapi_con, con_record): # noqa
            dbapi_con.execute('pragma foreign_keys=ON')
        with app.app_context():
            from sqlalchemy import event
            event.listen(db.engine, 'connect', _fk_pragma_on_connect)
    return app  # assumed; an app factory normally returns the app
Source:
https://gist.github.com/asyd/a7aadcf07a66035ac15d284aef10d458
