I was looking at the question and decided to try using bind variables. I use
sql = 'insert into abc2 (intfield, textfield) values (%s, %s)'
a = time.time()
for i in range(10000):
    # just a wrapper around cursor.execute
    db.executeUpdateCommand(sql, (i, 'test'))
db.commit()
and
sql = 'insert into abc2 (intfield, textfield) values (%(x)s, %(y)s)'
for i in range(10000):
    db.executeUpdateCommand(sql, {'x': i, 'y': 'test'})
db.commit()
Looking at the time taken for the two runs above, there doesn't seem to be much difference; in fact, the second one takes longer. Can someone correct me if I've made a mistake somewhere? I'm using psycopg2 here.
The queries are equivalent in PostgreSQL.
"Bind" is Oracle lingo. When you use it, the query plan is saved, so the next execution is a little faster. PREPARE does the same thing in PostgreSQL.
http://www.postgresql.org/docs/current/static/sql-prepare.html
psycopg2 supports an internal "bind" (not PREPARE) via cursor.executemany() and cursor.execute().
(But don't say "bind" to pg people; call it "prepare" or they may not know what you mean. :)
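For illustration, you can issue the PREPARE yourself through psycopg2 and then run the saved plan with EXECUTE. A minimal sketch, reusing the abc2 table from the question (the connection string is an assumption):
import psycopg2

conn = psycopg2.connect('dbname=test')  # assumed connection string
cur = conn.cursor()

# Plan the statement once on the server...
cur.execute('PREPARE ins (int, text) AS '
            'INSERT INTO abc2 (intfield, textfield) VALUES ($1, $2)')

# ...then run the saved plan repeatedly.
for i in range(10000):
    cur.execute('EXECUTE ins (%s, %s)', (i, 'test'))

conn.commit()
cur.execute('DEALLOCATE ins')  # drop the server-side plan when done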
IMPORTANT UPDATE:
I've looked into the source of all the Python libraries for connecting to PostgreSQL in the FreeBSD ports, and I can say that only py-postgresql does real prepared statements! But it is Python 3+ only.
py-pg_queue is also an interesting little library that implements the official DB protocol (Python 2.4+).
You've missed the answer to that question: use prepared statements as much as possible. "Bound variables" are the better form of this; let's see:
sql_q = 'insert into abc (intfield, textfield) values (?, ?)'     # common form
sql_b = 'insert into abc2 (intfield, textfield) values (:x, :y)'  # needs driver and db support
so your test should be this:
sql = 'insert into abc2 (intfield, textfield) values (:x, :y)'
for i in range(10000):
    cur.execute(sql, x=i, y='test')
or this:
def _data(n):
    for i in range(n):
        yield (i, 'test')

sql = 'insert into abc2 (intfield, textfield) values (?, ?)'
cur.executemany(sql, _data(10000))
and so on.
UPDATE:
I've just found an interesting recipe for transparently replacing SQL queries with prepared statements while still using %(name)s placeholders.
As far as I know, psycopg2 has never supported server-side parameter binding ("bind variables" in Oracle parlance). Current versions of PostgreSQL do support it at the protocol level using prepared statements, but only a few connector libraries make use of it. The Postgres wiki notes this here. Here are some connectors that you might want to try: (I haven't used these myself.)
pg8000
python-pgsql
py-postgresql
As long as you're using DB-API calls, you probably ought to consider cursor.executemany() instead of repeatedly calling cursor.execute().
Also, binding parameters to their query in the server (instead of in the connector) is not always going to be faster in PostgreSQL. Note this FAQ entry.
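For example, the first loop from the question collapses into a single executemany() call. A minimal sketch, assuming a psycopg2 connection named conn:
cur = conn.cursor()
rows = [(i, 'test') for i in range(10000)]
cur.executemany('insert into abc2 (intfield, textfield) values (%s, %s)', rows)
conn.commit()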
def update_inv_quant():
    new_quant = int(input("Enter the updated quantity in stock: "))
Hello! I'm wondering how to insert a user variable into an SQL statement so that a record is updated to said variable. It'd also be really helpful if you could help me figure out how to print records of the database to the actual Python console. Thank you!
I tried doing something like ("INSERT INTO Inv(ItemName) Value {user_iname)") but I'm not surprised it didn't work.
It would have been more helpful if you specified an actual database.
First method (Bad)
The usual way (which is highly discouraged, as Graybeard said in the comments) is using Python's f-strings. You can google what they are and how to use them in more depth.
But basically, say you have two variables user_id = 1 and user_name = 'fish'; an f-string turns something like f"INSERT INTO mytable(id, name) values({user_id},'{user_name}')" into the string INSERT INTO mytable(id,name) values(1,'fish').
As mentioned before, this opens the door to something called SQL injection. There are many good YouTube videos that demonstrate what that is and why it's dangerous.
Second method
The second method depends on what database you are using. For example, in psycopg2 (the driver for the PostgreSQL database), the cursor.execute method uses the following syntax to pass variables: cur.execute('SELECT id FROM users WHERE cookie_id = %s', (cookieid,)). Notice that the variables are passed in a tuple as the second argument.
All database drivers use similar methods, with minor differences. For example, I believe sqlite3 uses ? instead of psycopg2's %s. That's why I said that specifying the actual database would have been more helpful.
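For example, the same kind of parameterized update in sqlite3 would look roughly like this (a sketch; the database file name and the Quantity column are made up to fit the question):
import sqlite3

conn = sqlite3.connect('inventory.db')      # hypothetical database file
cur = conn.cursor()
new_quant = int(input('Enter the updated quantity in stock: '))
item_name = input('Enter the item name: ')  # hypothetical second input
cur.execute('UPDATE Inv SET Quantity = ? WHERE ItemName = ?',
            (new_quant, item_name))
conn.commit()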
Fetching records
I am most familiar with PostgreSQL and psycopg2, so you will have to read the docs of your database of choice.
To fetch records, you send the query with cursor.execute() like we said before, and then call cursor.fetchone() which returns a single row, or cursor.fetchall() which returns all rows in an iterable that you can directly print.
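A short sketch, again using the hypothetical Inv table from above:
cur.execute('SELECT ItemName, Quantity FROM Inv')
rows = cur.fetchall()   # or cur.fetchone() for a single row
for row in rows:
    print(row)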
Execute didn't update the database?
Statements executed through drivers are transactional, which is a whole topic by itself, and I'm sure there are people on the internet who can explain it better than I can. To keep things short: for the statement to physically change the database, you call connection.commit() after cursor.execute().
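So an update that actually sticks looks like this (a sketch with psycopg2-style placeholders):
cur.execute('UPDATE Inv SET Quantity = %s WHERE ItemName = %s',
            (new_quant, item_name))
conn.commit()   # without this, the change is rolled back when the connection closes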
So finally to answer both of your questions, read the documentation of the database's driver and look for the execute method.
This is what I do (which is for sqlite3 and would be similar for other SQL type databases):
Assuming that you have connected to the database and the table exists (otherwise you need to create the table). For the purpose of the example, I have used a table called trades.
new_quant = 1000
# insert one record (row)
command = f"""INSERT INTO trades VALUES (
'some_ticker', {new_quant}, other_values, ...
) """
cur.execute(command)
con.commit()
print('trade inserted !!')
You can then wrap the above into your function accordingly.
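If any of those values ever comes from user input, the same insert can be written with sqlite3's ? placeholders instead of an f-string (a sketch; the table is assumed to have just the two columns shown):
command = "INSERT INTO trades VALUES (?, ?)"
cur.execute(command, ('some_ticker', new_quant))
con.commit()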
Can somebody please recommend a Python DBAL library that best suits my requirements? I would like to write my SQL statements directly; most of the logic will be in DB stored procedures (PostgreSQL), so I only need to invoke DB procedures, pass arguments to them, and fetch the results. The library should help me with quoting (preventing SQL injection).
I played with SQLAlchemy, but I think there is no quoting helper when writing SQL statements directly for the engine.execute method.
Thank you
You should have given SQLAlchemy a deeper look; it does a fine job of quoting placeholders:
>>> engine = sqlalchemy.create_engine("sqlite:///:memory:")
>>> engine.execute("select ?", 5).fetchall()
[(5,)]
>>> engine.execute("select ?", "; drop table users; --").fetchall()
[(u'; drop table users; --',)]
psycopg2 (via DB-API) will automatically quote to prevent SQL injection, IF you use it properly. (Interpolating values with Python's % operator is wrong; you have to pass the parameters as arguments to the execute call itself.)
WRONG:
cur.execute('select * from table where last="%s" and first="%s"'
% (last, first))
RIGHT:
cur.execute('select * from table where last=%s and first=%s',
(last, first))
Note: you don't use %, and you don't put quotes around your values.
The syntax is slightly different for MySQLdb and sqlite3. (For example, sqlite uses ? instead of %s.)
Also, for psycopg2, always use %s even if you're dealing with numbers or some other type.
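For example, both of these are fine; psycopg2 adapts the Python values itself:
cur.execute('select * from table where id = %s', (123,))       # int: still %s
cur.execute('select * from table where price > %s', (9.99,))   # float: still %s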
I am inserting records into SQL Server from Python using pymssql. The database takes 2 milliseconds to execute a query, yet it inserts only 6 rows per second. The only problem is on the code side. How can I optimize the following code, and what is the fastest method to insert records?
def save(self):
    conn = pymssql.connect(host=dbHost, user=dbUser,
                           password=dbPassword, database=dbName, as_dict=True)
    cur = conn.cursor()
    self.pageURL = self.pageURL.replace("'", "''")
    query = "my query is there"
    cur.execute(query)
    conn.commit()
    conn.close()
It looks like you're creating a new connection per insert there. That's probably the major reason for the slowdown: building new connections is typically quite slow. Create the connection outside the method and you should see a large improvement. You can also create the cursor outside the function and re-use it, which will be another speedup.
Depending on your situation, you may also want to use the same transaction for more than a single insertion. This changes the behaviour a little -- since a transaction is supposed to be atomic and either completely succeeds or completely fails -- but committing a transaction is typically a slow operation, because it has to be certain the whole operation succeeded.
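A minimal sketch of that restructuring (the class name is made up, the connection settings are the ones from the question, and the query stays elided as in the question):
import pymssql

class PageSaver:
    def __init__(self):
        # One connection and cursor for the lifetime of the object.
        self.conn = pymssql.connect(host=dbHost, user=dbUser,
                                    password=dbPassword, database=dbName,
                                    as_dict=True)
        self.cur = self.conn.cursor()

    def save(self):
        query = "my query is there"
        self.cur.execute(query)

    def commit(self):
        # Commit once per batch of inserts, not once per row.
        self.conn.commit()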
In addition to Thomas' great advice,
I'd suggest you look into executemany()*, e.g.:
cur.executemany("INSERT INTO persons VALUES(%d, %s)",
[ (1, 'John Doe'), (2, 'Jane Doe') ])
...where the second argument of executemany() should be a sequence of rows to insert.
This brings up another point:
You probably want to send your query and query parameters as separate arguments to either execute() or executemany(). This will allow the PyMSSQL module to handle any quoting issues for you.
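For example, the manual replace("'", "''") from the question becomes unnecessary once pageURL is passed as a parameter (a sketch; the table and column names are made up):
cur.execute('INSERT INTO pages (pageURL) VALUES (%s)', (self.pageURL,))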
*executemany() as described in the Python DB-API:
.executemany(operation, seq_of_parameters)

Prepare a database operation (query or command) and then execute it against all parameter sequences or mappings found in the sequence seq_of_parameters.
I've been trying to find a postgres interface for python 2.x that supports real prepared statements, but can't seem to find anything. I don't want one that just escapes quotes in the params you pass in and then interpolates them into the query before executing it. Anyone have any suggestions?
Either py-postgresql for Python3 or pg_proboscis for Python2 will do this.
Python-pgsql will also do this but is not threadsafe. Notably, SQLAlchemy does not make use of prepared statements.
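If memory serves, py-postgresql's prepared-statement API looks roughly like this (Python 3; the DSN and table are made up):
import postgresql

db = postgresql.open('pq://user:password@localhost/dbname')
ps = db.prepare('SELECT name FROM users WHERE id = $1')  # a real server-side prepare
row = ps.first(42)                                       # run the prepared plan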
Have a look at web.py's db module. Examples can be found at:
http://webpy.org/cookbook/select
http://webpy.org/cookbook/update
http://webpy.org/cookbook/delete
http://webpy.org/Insert
These links hint at the answer when using psycopg2. You don't need special API extensions.
Re: psycopg2 and prepared statements
Prepared Statements in Postgresql
Transparently execute SQL queries as prepared statements with Postgresql (Python recipe)
Here's an example that I played with. A word of caution, though: it didn't give me the performance increase I had hoped for. In fact, it was even slower (just slightly) in a contrived case where I tried to read a whole table of one million rows, one row at a time.
cur.execute('''
PREPARE prepared_select(text, int) AS
SELECT * FROM test
WHERE (name = $1 and rowid > $2) or name > $1
ORDER BY name, rowid
LIMIT 1
''')
name = ''
rowid = 0
cur.execute('EXECUTE prepared_select(%s, %s)', (name, rowid))
What's the best way to make psycopg2 pass parameterized queries to PostgreSQL? I don't want to write my own escaping mechanisms or adapters, and the psycopg2 source code and examples are difficult to read in a web browser.
If I need to switch to something like PyGreSQL or another python pg adapter, that's fine with me. I just want simple parameterization.
psycopg2 follows the rules for DB-API 2.0 (set down in PEP 249). That means you can call the execute method from your cursor object and use the pyformat binding style, and it will do the escaping for you. For example, the following should be safe (and work):
cursor.execute("SELECT * FROM student WHERE last_name = %(lname)s",
{"lname": "Robert'); DROP TABLE students;--"})
From the psycopg documentation
(http://initd.org/psycopg/docs/usage.html)
Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
The correct way to pass variables in a SQL command is using the second argument of the execute() method:
SQL = "INSERT INTO authors (name) VALUES (%s);" # Note: no quotes
data = ("O'Reilly", )
cur.execute(SQL, data) # Note: no % operator
Here are a few examples you might find helpful
cursor.execute('SELECT * from table where id = %(some_id)s', {'some_id': 1234})  # note: psycopg2 only accepts the s format character, even for numbers
Or you can dynamically build your query based on a dict of field name, value:
fields = ', '.join(my_dict.keys())        # column names: must come from trusted code
marks = ', '.join(['%s'] * len(my_dict))  # one placeholder per value
query = 'INSERT INTO some_table (%s) VALUES (%s)' % (fields, marks)
cursor.execute(query, list(my_dict.values()))
Note: the fields must be defined in your code, not user input, otherwise you will be susceptible to SQL injection.
I love the official docs about this:
https://www.psycopg.org/psycopg3/docs/basic/params.html