Execute a prepared statement in SQLAlchemy - Python

I have to run 40K requests against a username:
SELECT * from user WHERE login = :login
It's slow, so I figured I would just use a prepared statement.
So I do
import sqlalchemy

e = sqlalchemy.create_engine(...)
c = e.connect()
c.execute("PREPARE userinfo(text) AS SELECT * FROM user WHERE login = $1")
r = c.execute("EXECUTE userinfo('bob')")
for x in r:
    do_foo()
But I get:
InterfaceError: (InterfaceError) cursor already closed None None
I don't understand why I get this exception.

Not sure how to solve your cursor-related error message, but I don't think a prepared statement will solve your performance issue. As long as you're using SQL Server 2005 or later, the execution plan for SELECT * from user WHERE login = $login will already be re-used, and there will be no performance gain from the prepared statement. I don't know about MySQL or other SQL database servers, but I suspect they too have similar optimisations for ad-hoc queries that make the prepared statement redundant.
It sounds like the cause of the performance hit is more down to the fact that you are making 40,000 round trips to the database. You should try to rewrite the query so that you are only executing one SQL statement with a list of the login names. Am I right in thinking that MySQL supports an array data type? If it doesn't (or you are using Microsoft SQL Server), you should look into passing in some sort of delimited list of usernames.
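For example, a minimal sketch of the single-query approach using SQLAlchemy's expanding bind parameters (SQLAlchemy 1.2+; the engine URL and login list here are placeholders):
import sqlalchemy

# Placeholder URL and login list; substitute your real values.
engine = sqlalchemy.create_engine("postgresql://localhost/mydb")
logins = ["bob", "alice", "carol"]  # ... the full 40K names

query = sqlalchemy.text("SELECT * FROM user WHERE login IN :logins")
query = query.bindparams(sqlalchemy.bindparam("logins", expanding=True))

with engine.connect() as conn:
    # One round trip for the whole batch instead of 40,000 queries.
    for row in conn.execute(query, {"logins": logins}):
        print(row)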

From this discussion, it might be a good idea to check your paster debug logs in case there is a better error message there.

Related

pyodbc: using WHERE in the SQL cursor execute

A student of mine is undertaking a piece of coursework in which they create a small program/artefact, and they have chosen to link Python with a database using pyodbc.
So far he can successfully connect, and if he uses a SELECT * statement and then fetchall he can print out the whole database. But naturally, to extend this work, he wants to be able to filter results using WHERE, but it doesn't seem to work as intended, and my experience in this is very limited.
For example the code:
cursor.execute("select * from Films where BBFC = '12'")
Gives this error
pyodbc.Error: ('07002', '[07002] [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 1. (-3010) (SQLExecDirectW)')
It is a database of films, and he wants to filter it by age rating (the BBFC column). I have taken a look myself and can't seem to fix the issue, so any help or guidance would be massively appreciated.
The problem here might be a spelling mistake or a case-sensitive field or table name; the Access driver treats an unrecognized column name as a query parameter, which is why the error complains about "too few parameters". Would you be able to make sure that 'Films' and 'BBFC' are spelt correctly and match the DB?
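If spelling checks out, a quick way to confirm the real column names from Python is pyodbc's cursor.columns() metadata call. A minimal sketch (the driver string and file path are assumptions; adjust them to the student's database):
import pyodbc

# Hypothetical Access connection; point DBQ at the actual file.
cnxn = pyodbc.connect(r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\films.accdb")
cursor = cnxn.cursor()

# Print the actual column names, so a typo such as 'BBFC' vs 'BbfcRating' shows up immediately.
for col in cursor.columns(table='Films'):
    print(col.column_name)

# Once the names are confirmed, a parameterized query avoids quoting issues:
cursor.execute("SELECT * FROM Films WHERE BBFC = ?", '12')
print(cursor.fetchall())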

Effective insert-only permissions for peewee tables

I'm wondering what the best strategy is for using insert-only permissions to a postgres db with Peewee. I'd like this in order to be certain that a specific user can't read any data back out of the database.
I granted INSERT permissions to my table, 'test', in postgres. But I've run into the problem that when I try to save new rows with something like:
thing = Test(value=1)
thing.save()
The SQL actually contains a RETURNING clause that needs more permissions (namely, SELECT) than just INSERT:
INSERT INTO "test" ("value") VALUES (1) RETURNING "test"."id"
The same SQL seems to be generated when I use query = Test.insert(value=1) followed by query.execute() as well.
From looking around, it seems like you need to either grant SELECT privileges or use a more exotic feature like row-level security in Postgres. Is there any way to go about this with peewee out of the box? Or another suggestion of how to add new rows with truly write-only permissions?
You can omit the returning clause by explicitly writing your INSERT query and supplying a blank RETURNING. Peewee uses RETURNING whenever possible so that the auto-generated PK can be recovered in a single operation, but it is possible to disable it:
# Empty call to returning will disable the RETURNING clause:
iq = Test.insert(value=1).returning()
iq.execute()
You can also override this for all INSERT operations by setting the returning_clause attribute on the DB to False:
db = PostgresqlDatabase(...)
db.returning_clause = False
This is not an officially supported approach, though, and may have unintended side-effects or weird behavior - caveat emptor.
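Putting the two options together, a minimal sketch (the database name, role, and model here are hypothetical):
from peewee import IntegerField, Model, PostgresqlDatabase

# Hypothetical INSERT-only role.
db = PostgresqlDatabase('mydb', user='insert_only_user')
db.returning_clause = False  # global opt-out (unofficial, see the caveat above)

class Test(Model):
    value = IntegerField()

    class Meta:
        database = db

# Relies on the global flag; issues a plain INSERT with no RETURNING.
Test.insert(value=1).execute()

# Per-query alternative: an empty returning() call suppresses the clause.
Test.insert(value=2).returning().execute()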

Postgres fails fetching data in Python

I am using Python with psycopg2 module to get data from Postgres database.
The database is quite large (tens of GB).
Everything appears to be working; I am creating objects from the fetched data.
However, after creating ~160,000 objects I get the following error:
I suppose the reason is the amount of data, but I could not get anywhere searching for a solution online. I am not aware of using any proxy and have never used any on this machine before; the database is on localhost.
It's interesting how often the "It's a local server so I'm not open to SQL injection" stance leads to people thinking that string interpolation is somehow easier than a parameterized query. In your case it's ended up with:
'... cookie_id = \'{}\''.format(cookie)
So you've ended up with something that's less legible and also fails (though from the specific error I don't know exactly how). Use parameterization:
cursor.execute("SELECT user_id, created_at FROM cookies WHERE cookie_id = %s ORDER BY created_at DESC;", (cookie,))
Bottom line, do it the correct way all the time. Note, there are cases where you must use string interpolation, e.g. for table names:
cursor.execute("SELECT * FROM %s", (table_name,)) # Not valid
cursor.execute("SELECT * FROM {}".format(table_name)) # Valid
And in those cases, you need to take other precautions if someone else can interact with the code.
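With psycopg2 2.7+ there is also a middle ground: the psycopg2.sql module composes identifiers safely instead of formatting them into the string. A minimal sketch, reusing the cursor, table_name, and cookie from above:
from psycopg2 import sql

# table_name is quoted as an identifier; cookie remains a bound parameter.
query = sql.SQL("SELECT user_id, created_at FROM {} WHERE cookie_id = %s").format(sql.Identifier(table_name))
cursor.execute(query, (cookie,))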

Is SQL injection protection built into SQLAlchemy's ORM or Core?

I'm developing an aiohttp server application, and I just saw that apparently it isn't able to use SQLAlchemy's ORM layer. So, I was wondering: if my application will only be able to use SQLAlchemy's core, is it still protected against SQL injection attacks?
My code is the following:
async def add_sensor(db_engine, name):
    async with db_engine.acquire() as connection:
        query = model.Sensor.__table__.insert().values(name=name)
        await connection.execute(query)
A comment on the accepted answer in this related question makes me doubt:
you can still use execute() or other literal data that will NOT be
escaped by SQLAlchemy.
So, with the execute() used in my code, does the above quote mean that my code is unsafe? And in general: is protection against SQL Injection only possible with the SQLAlchemy ORM layer, as with the Core layer you'll end up launching execute()?
In your example above I don't see any variable being supplied to the database query. Since there is no user-supplied input, there is also no SQL injection possible.
Even if there were a user-supplied value, as long as you don't use handwritten SQL statements with SQLAlchemy, and instead use the ORM model approach (model.Sensor.__table__.select()) as seen in your example, you are secure against SQL injection.
In the end it's all about telling SQLAlchemy explicitly which columns and tables to select and insert data from/to, and keeping that separate from the data being inserted or selected. Never combine the data string with the query string, and always use SQLAlchemy ORM model objects to describe your query.
Bad way (SQL injectable):
Session.execute("select * from users where name = '%s'" % request.GET['name'])
Good way (not SQL injectable):
Session.execute(model.users.__table__.select().where(model.users.name == request.GET['name']))
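When a handwritten statement is unavoidable, SQLAlchemy's text() construct with a bound parameter is also safe. A minimal sketch (Session and request are assumed from the examples above):
from sqlalchemy import text

# :name is sent as a bound parameter, never spliced into the SQL string.
Session.execute(text("select * from users where name = :name"), {"name": request.GET['name']})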

Is it possible to use pyodbc to read Paradox tables that are open in the Paradox gui?

I'm working in a environment with a very poorly managed legacy Paradox database system. (I'm not the administrator.) I've been messing around with using pyodbc to interact with our tables, and the basic functionality seems to work. Here's some (working) test code:
import pyodbc
LOCATION = "C:\test"
cnxn = pyodbc.connect(r"Driver={{Microsoft Paradox Driver (*.db )\}};DriverID=538;Fil=Paradox 5.X;DefaultDir={0};Dbq={0};CollatingSequence=ASCII;".format(LOCATION), autocommit=True, readonly=True)
cursor = cnxn.cursor()
cursor.execute("select last, first from test")
row = cursor.fetchone()
print(row)
The problem is that most of our important tables are going to be open in someone's Paradox GUI at pretty much all times. I get this error whenever I try to do a select from one of those tables:
pyodbc.Error: ('HY000', "[HY000] [Microsoft][ODBC Paradox Driver] Could not lock
table 'test'; currently in use by user '(unknown)' on machine '(unknown)'. (-1304)
(SQLExecDirectW)")
This is, obviously, because pyodbc tries to lock the table when cursor.execute() is called on it. This behavior makes perfect sense, since cursor.execute() runs arbitrary SQL code and could change the table.
However, Paradox itself (through its GUI) seems to handle multiple users fine. It only gives you similar errors if you try to restructure the table while people are using it.
Is there any way I can get pyodbc to use some sort of read-only mode, such that it doesn't have to lock the table when I'm just doing select and such? Or is locking a fundamental part of how it works that I'm not going to be able to get around?
Solutions that would use other modules are also totally fine.
Ok, I finally figured it out.
Apparently, ODBC dislikes Paradox tables that have no primary key. You cannot update tables with no primary key under any circumstances, and you cannot read from tables with no primary key unless you are the only user trying to access that table.
Unrelatedly, you get essentially the same error messages from password-protected tables if you don't supply a password.
So I was testing my script on two different tables, one of which has both a password and a primary key, and one of which had neither. I assumed the error messages had the same root cause, but it was actually two different problems, with different solutions.
There still seems to be no way to get access to tables without primary keys if they are open in someone's GUI, but that's a smaller issue.
Make sure that you have the latest version of pyodbc (3.0.6); according to the release notes, they:
- Added Cursor.commit() and Cursor.rollback(). It is now possible to use only a cursor in your code instead of keeping track of a connection and a cursor.
- Added readonly keyword to connect. If set to True, SQLSetConnectAttr SQL_ATTR_ACCESS_MODE is set to SQL_MODE_READ_ONLY. This may provide better locking semantics or speed for some drivers.
- Fixed an error reading SQL Server XML data types longer than 4K.
Also, I have tested this on a Paradox server using readonly, and it does work.
Hope this helps!
I just published a Python library for reading Paradox database files via the pxlib C library: https://github.com/mherrmann/pypxlib. It operates at the file level, so it should also let you read the database independently of who else is currently accessing it. Since it does not synchronize read/write accesses, you do have to be careful, though!
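A minimal read sketch with pypxlib (the Table API follows the project's README; the path and field names are taken from the pyodbc example above and may differ for your tables):
from pypxlib import Table

# Reads the Paradox .db file directly, bypassing the ODBC driver and its table lock.
table = Table(r'C:\test\test.db')
for row in table:
    print(row.last, row.first)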
