How can I access the number of rows affected by:
cursor.execute("SELECT COUNT(*) from result where server_state='2' AND name LIKE '"+digest+"_"+charset+"_%'")
Try using fetchone:
cursor.execute("SELECT COUNT(*) from result where server_state='2' AND name LIKE '"+digest+"_"+charset+"_%'")
result=cursor.fetchone()
result will hold a tuple with one element, the value of COUNT(*).
So to find the number of rows:
number_of_rows=result[0]
Or, if you'd rather do it in one fell swoop:
cursor.execute("SELECT COUNT(*) from result where server_state='2' AND name LIKE '"+digest+"_"+charset+"_%'")
(number_of_rows,)=cursor.fetchone()
PS. It's also good practice to use parametrized arguments whenever possible, because the driver then quotes the arguments for you when needed and protects against SQL injection.
The correct syntax for parametrized arguments depends on your python/database adapter (e.g. mysqldb, psycopg2 or sqlite3). It would look something like
cursor.execute("SELECT COUNT(*) from result where server_state= %s AND name LIKE %s",[2,digest+"_"+charset+"_%"])
(number_of_rows,)=cursor.fetchone()
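As a minimal runnable sketch of the same idea, here is the parametrized COUNT(*) query using the standard library's sqlite3 module, which uses ? rather than %s as its placeholder. The in-memory table, its rows, and the digest and charset values are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory table standing in for the `result` table in the question.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE result (name TEXT, server_state TEXT)")
cur.executemany(
    "INSERT INTO result (name, server_state) VALUES (?, ?)",
    [
        ("abc_utf8_1", "2"),
        ("abc_utf8_2", "2"),
        ("abc_utf8_3", "1"),
        ("xyz_utf8_1", "2"),
    ],
)

digest, charset = "abc", "utf8"  # example values, not taken from the question

# sqlite3 uses "?" placeholders; the values are passed separately as a tuple.
cur.execute(
    "SELECT COUNT(*) FROM result WHERE server_state = ? AND name LIKE ?",
    ("2", digest + "_" + charset + "_%"),
)
(number_of_rows,) = cur.fetchone()
print(number_of_rows)
```

Note that COUNT(*) always returns exactly one row, so fetchone() is enough here.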
From PEP 249, which is usually implemented by Python database APIs:
Cursor Objects should respond to the following methods and attributes:
[…]
.rowcount
This read-only attribute specifies the number of rows that the last .execute*() produced (for DQL statements like 'select') or affected (for DML statements like 'update' or 'insert').
But be careful—it goes on to say:
The attribute is -1 in case no .execute*() has been performed on the cursor or the rowcount of the last operation is cannot be determined by the interface. [7]
Note:
Future versions of the DB API specification could redefine the latter case to have the object return None instead of -1.
So if you've executed your statement, and it works, and you're certain your code will always be run against the same version of the same DBMS, this is a reasonable solution.
With MySQLdb, the number of rows affected is returned from execute:
rows_affected=cursor.execute("SELECT ... ")
Of course, as AndiDog already mentioned, you can also read the rowcount property of the cursor at any time to get the count for the last execute:
cursor.execute("SELECT ... ")
rows_affected=cursor.rowcount
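How rowcount behaves depends on the driver. As a small sketch with the standard library's sqlite3 module (the table and rows are made up), rowcount reports the number of changed rows for an UPDATE but stays at -1 for a SELECT:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE result (name TEXT, server_state TEXT)")
cur.executemany(
    "INSERT INTO result VALUES (?, ?)",
    [("a", "1"), ("b", "1"), ("c", "2")],
)

# DML statement: rowcount is the number of rows that were modified.
cur.execute("UPDATE result SET server_state = '2' WHERE server_state = '1'")
updated_rows = cur.rowcount

# SELECT statement: sqlite3 cannot know the count up front and reports -1.
cur.execute("SELECT * FROM result")
select_rowcount = cur.rowcount

print(updated_rows, select_rowcount)
```

Other drivers may report a real number after a SELECT, so it is worth checking the documentation for the one you use.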
From the inline documentation of python MySQLdb:
def execute(self, query, args=None):
"""Execute a query.
query -- string, query to execute on server
args -- optional sequence or mapping, parameters to use with query.
Note: If args is a sequence, then %s must be used as the
parameter placeholder in the query. If a mapping is used,
%(key)s must be used as the placeholder.
Returns long integer rows affected, if any
"""
In my opinion, the simplest way to get the amount of selected rows is the following:
The cursor object returns the results when using the fetch commands (fetchall(), fetchone(), fetchmany()), and fetchall() returns them as a list. To get the number of selected rows, just print the length of this list. This only makes sense for fetchall(), though. ;-)
# python2
print len(cursor.fetchall())
# python3
print(len(cursor.fetchall()))
To get the number of selected rows I usually use the following:
cursor.execute(sql)
count = len(cursor.fetchall())
When using count(*) with a cursor that returns rows as dictionaries, the result is {'count(*)': 9}
-- where 9 is, for instance, the number of rows in the table.
So, to fetch just the number, this worked in my case, using MySQL 8:
cursor.fetchone()['count(*)']
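A key like 'count(*)' is easy to mistype, so one option is to give the column an alias. This is a sketch using sqlite3 with its sqlite3.Row row factory, which allows access by column name; the table and the alias n are made up for the example, and other drivers expose dictionary rows through their own cursor classes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can now be indexed by column name
cur = conn.cursor()
cur.execute("CREATE TABLE items (id INTEGER)")
cur.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(9)])

# The alias "n" replaces the awkward 'count(*)' key.
cur.execute("SELECT COUNT(*) AS n FROM items")
row = cur.fetchone()
count = row["n"]
print(count)
```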
def update_inv_quant():
new_quant = int(input("Enter the updated quantity in stock: "))
Hello! I'm wondering how to insert a user variable into an SQL statement so that a record is updated to that value. It would also be really helpful if you could help me figure out how to print records from the database to the Python console. Thank you!
I tried doing something like ("INSERT INTO Inv(ItemName) Value {user_iname)") but I'm not surprised it didn't work.
It would have been more helpful if you specified an actual database.
First method (Bad)
The usual way (which is highly discouraged, as Graybeard said in the comments) is to use Python's f-strings. You can look up what they are and how to use them in more depth.
Basically, say you have two variables user_id = 1 and user_name = 'fish'. An f-string turns something like f"INSERT INTO mytable(id, name) values({user_id},'{user_name}')" into the string INSERT INTO mytable(id, name) values(1,'fish').
As we mentioned before, this leaves the code open to SQL injection. There are many good YouTube videos that demonstrate what that is and why it's dangerous.
Second method
The second method depends on which database you are using. For example, in psycopg2 (a driver for PostgreSQL), the cursor.execute method uses the following syntax to pass variables: cur.execute('SELECT id FROM users WHERE cookie_id = %s',(cookieid,)). Notice that the variables are passed in a tuple as the second argument.
All database drivers use similar methods, with minor differences. For example, sqlite3 uses ? instead of psycopg2's %s. That's why I said that specifying the actual database would have been more helpful.
Fetching records
I am most familiar with PostgreSQL and psycopg2, so you will have to read the docs of your database of choice.
To fetch records, you send the query with cursor.execute() like we said before, and then call cursor.fetchone() which returns a single row, or cursor.fetchall() which returns all rows in an iterable that you can directly print.
Execute didn't update the database?
Statements executed through drivers are transactional, which is a whole topic by itself, and I am sure you will find people on the internet who can explain it better than I can. To keep things short: for the statement to actually change the database, you call connection.commit() after cursor.execute().
So finally to answer both of your questions, read the documentation of the database's driver and look for the execute method.
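Putting the pieces above together, here is a minimal sketch using sqlite3: a parameterized UPDATE, a commit, and a fetch that prints the records. The Inv table layout, the Quantity column, and the extra item_name parameter are assumptions made for the example, since the question does not show the schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical inventory table; the column names are assumptions.
cur.execute("CREATE TABLE Inv (ItemName TEXT, Quantity INTEGER)")
cur.execute("INSERT INTO Inv (ItemName, Quantity) VALUES (?, ?)", ("widget", 5))
con.commit()


def update_inv_quant(item_name, new_quant):
    """Set the stock quantity for one item using placeholders, then commit."""
    cur.execute(
        "UPDATE Inv SET Quantity = ? WHERE ItemName = ?",
        (new_quant, item_name),
    )
    con.commit()  # without this, the change is not made permanent


update_inv_quant("widget", 12)

# Fetch the records and print them to the console.
cur.execute("SELECT ItemName, Quantity FROM Inv")
rows = cur.fetchall()
for row in rows:
    print(row)
```

In the original function the new quantity would come from int(input(...)) instead of being passed in directly.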
This is what I do (which is for sqlite3 and would be similar for other SQL type databases):
This assumes that you have connected to the database and that the table exists (otherwise you need to create the table). For the purpose of the example, I have used a table called trades.
new_quant = 1000
# insert one record (row)
command = f"""INSERT INTO trades VALUES (
'some_ticker', {new_quant}, other_values, ...
) """
cur.execute(command)
con.commit()
print('trade inserted !!')
You can then wrap the above into your function accordingly.
I am aware that queries in Python can be parameterized using either ? or %s in the execute call.
However, I have a long query that uses a constant variable defined at the beginning of the query:
Set @my_const = 'xyz';
select @my_const;
-- Query that uses @my_const 40 times
select ... coalesce(field1, @my_const), case(.. then @my_const)...
I would like to modify the MySQL query as little as possible. So instead of changing the query to
pd.read_sql(select ... coalesce(field1, %s), case(.. then %s)... , [my_const, my_const, my_const, ..]
I would like to write something along the lines of the initial query. When I try the following, however, I get a TypeError: 'NoneType' object is not iterable
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
Any idea how to use the original variable defined in the MySQL query?
The reason
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
throws that exception is that all you are doing is setting @null_val to '' and then selecting that ''. What exactly would you have expected that to give you? EDIT: read_sql only seems to execute one query at a time, and as the first query returns no rows it results in that exception.
If you split them into two calls to read_sql, then the second call will in fact return the value of @null_val. Because of this behaviour, read_sql is clearly not a good way to do this. I strongly suggest you use one of my suggestions below.
Why do you want to set the variable in the SQL using '@' anyway?
You could try using the .format style of string formatting.
Like so:
query_str = "select ... coalesce(field1, {c}), case(.. then {c})...".format(c=my_const)
pd.read_sql(query_str, con=db)
Just remember that if you do it this way and your my_const is a user input then you will need to sanitize it manually to prevent SQL injection.
Another possibility is using a dict of params like so:
query_str = "select ... coalesce(field1, %(my_const)s), case(.. then %(my_const)s)..."
pd.read_sql(query_str, con=db, params={'my_const': const_value})
However this is dependent on which database driver you use.
From the pandas.read_sql docs:
Check your database driver documentation for which of the five syntax
styles, described in PEP 249's paramstyle, is supported. Eg. for
psycopg2, uses %(name)s so use params={'name' : 'value'}
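To illustrate how a single named parameter can stand in for a constant that appears many times, here is a self-contained sketch using the standard library's sqlite3 module, which uses the :name style rather than %(name)s. The table t and its rows are made up; with pandas and a MySQL driver the placeholder syntax would differ as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (field1 TEXT, field2 TEXT)")
cur.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", None), (None, "b"), (None, None)],
)

# The same named parameter is referenced twice but supplied only once.
query_str = (
    "SELECT COALESCE(field1, :my_const), COALESCE(field2, :my_const) "
    "FROM t ORDER BY rowid"
)
cur.execute(query_str, {"my_const": "xyz"})
rows = cur.fetchall()
print(rows)
```

The value is bound by the driver, so no manual quoting or sanitizing of my_const is needed.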
How does rowcount work? I am using pyodbc and it's always returning -1.
return_query = conn.query_db_param(query, q_params)
print(return_query.rowcount)
def query_db_param(self, query, params):
self.cursor.execute(query,params)
print(self.cursor.rowcount)
rowcount refers to the number of rows affected by the last operation. So, if you do an insert and insert only one row, then it will return 1. If you update 200 rows, then it will return 200. On the other hand, if you SELECT, the last operation doesn't really affect rows, it is a result set. In that case, 0 would be syntactically incorrect, so the interface returns -1 instead.
It will also return -1 for operations where you do things like set variables or use create/alter commands.
You are connecting to a database that can't give you that number for your query. Many database engines produce rows as you fetch results, scanning their internal table and index data structures for the next matching result as you do so. The engine can't know the final count until you fetched all rows.
When the rowcount is not known, the Python DB-API 2.0 specification for Cursor.rowcount states that the number must be set to -1:
The attribute is -1 in case [...] the rowcount of the last operation is cannot be determined by the interface.
The pyodbc Cursor.rowcount documentation conforms to this requirement:
The number of rows modified by the last SQL statement.
This is -1 if no SQL has been executed or if the number of rows is unknown. Note that it is not uncommon for databases to report -1 immediately after a SQL select statement for performance reasons. (The exact number may not be known before the first records are returned to the application.)
pyodbc is not alone in this. Another easy-to-link-to example is the Python standard library sqlite3 module; its Cursor.rowcount documentation states:
As required by the Python DB API Spec, the rowcount attribute “is -1 in case no executeXX() has been performed on the cursor or the rowcount of the last operation is not determinable by the interface”. This includes SELECT statements because we cannot determine the number of rows a query produced until all rows were fetched.
Note that for a subset of database implementations, the rowcount value can be updated after fetching some of the rows. You'll have to check the documentation for the specific database you are connecting to in order to see whether that implementation can do this, or whether the rowcount must remain at -1. You could always experiment, of course.
You could execute a COUNT() select first, or, if the result set is not expected to be too large, use cursor.fetchall() and use len() on the resulting list.
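Both options can be sketched as follows. The example uses sqlite3 rather than pyodbc so that it runs without a database server, and the events table is made up; with pyodbc the placeholder is also ?, so the calls should look much the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE events (kind TEXT)")
cur.executemany("INSERT INTO events VALUES (?)", [("a",), ("b",), ("a",)])

# Option 1: ask the database for the count before running the real query.
cur.execute("SELECT COUNT(*) FROM events WHERE kind = ?", ("a",))
(count_from_sql,) = cur.fetchone()

# Option 2: fetch everything and measure the list (fine for small results).
cur.execute("SELECT kind FROM events WHERE kind = ?", ("a",))
rows = cur.fetchall()
count_from_len = len(rows)

print(count_from_sql, count_from_len)
```

Option 1 costs an extra query, while option 2 holds the whole result set in memory.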
If you are using Microsoft SQL Server and you want to get the number of rows returned by the prior select statement, you can just execute select @@rowcount.
E.g.:
cursor.execute("select @@rowcount")
rowcount = cursor.fetchall()[0][0]
I am trying to do a simple filter operation on a query in sqlalchemy, like this:
q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))
where
inall is a list of strings
Genotypes is mapped to a table:
class Genotypes(object):
pass
Genotypes.mapper = mapper(Genotypes, kg_table, properties={'rsid': getattr(kg_table.c, 'rs#')})
This seems pretty straightforward to me, but I get the following error when I execute the above query by doing q.first():
"sqlalchemy.exc.OperationalError: (OperationalError) too many SQL
variables u'SELECT" followed by a list of the 1M items in the inall
list. But they aren't supposed to be SQL variables, just a list whose
membership is the filtering criteria.
Am I doing the filtering incorrectly?
(the db is sqlite)
If the table where you are getting your rsids from is available in the same database I'd use a subquery to pass them into your Genotypes query rather than passing the one million entries around in your Python code.
sq = session.query(RSID_Source).subquery()
q = session.query(Genotypes).filter(Genotypes.rsid.in_(sq))
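If the ids only exist in Python, one way to get them into the database is to load them into a temporary table first and then filter with a subquery. This is a sketch of that idea using the sqlite3 module directly rather than SQLAlchemy; the table names, the rsid column, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE genotypes (rsid TEXT, value TEXT)")
cur.executemany(
    "INSERT INTO genotypes VALUES (?, ?)",
    [("rs1", "AA"), ("rs2", "AG"), ("rs3", "GG")],
)

inall = ["rs1", "rs3", "rs999"]  # stands in for the list of 1M ids

# Each insert binds a single variable, so the per-statement limit is never hit.
cur.execute("CREATE TEMP TABLE wanted (rsid TEXT PRIMARY KEY)")
cur.executemany("INSERT OR IGNORE INTO wanted VALUES (?)", ((r,) for r in inall))

# The membership test is now a subquery instead of a giant parameter list.
cur.execute(
    "SELECT rsid, value FROM genotypes "
    "WHERE rsid IN (SELECT rsid FROM wanted) ORDER BY rsid"
)
matches = cur.fetchall()
print(matches)
```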
The issue is that in order to pass that list to SQLite (or any database, really), SQLAlchemy has to pass over each entry for your in clause as a variable. The SQL translates roughly to:
-- Not valid SQLite SQL
DECLARE @Param1 TEXT;
SET @Param1 = ?;
DECLARE @Param2 TEXT;
SET @Param2 = ?;
-- snip 999,998 more
SELECT field1, field2, -- etc.
FROM Genotypes G
WHERE G.rsid IN (@Param1, @Param2, /* snip */)
The below workaround worked for me:
q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))
query_as_string = str(q.statement.compile(compile_kwargs={"literal_binds": True}))
session.execute(query_as_string).first()
This basically forces the query to compile as a string before execution, which bypasses the whole variables issue. Some details on this are available in SQLAlchemy's docs here.
BTW, if you're not using SQLite you can make use of the ANY operator to pass the list object as a single parameter (see my answer to this question here).
I'm trying to create a Python script that constructs valid SQLite queries. I want to avoid SQL injection, so I cannot use '%s'. I've found how to execute queries, cursor.execute('sql ?', (param,)), but I want to know how to get the resulting SQL with the parameters substituted. It's not a problem if I have to execute the query first in order to obtain the last query executed.
If you're trying to transmit changes to the database to another computer, why do they have to be expressed as SQL strings? Why not pickle the query string and the parameters as a tuple, and have the other machine also use SQLite parameterization to query its database?
If you're not after just parameter substitution, but full construction of the SQL, you have to do that using string operations on your end. The ? replacement always just stands for a value. Internally, the SQL string is compiled to SQLite's own bytecode (you can find out what it generates with EXPLAIN thesql) and ? replacements are done by just storing the value at the correct place in the value stack; varying the query structurally would require different bytecode, so just replacing a value wouldn't be enough.
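A short sketch of what "always just stands for a value" means in practice: binding a column name through ? gives back the text itself, not the column's contents. The stocks table, the ALLOWED_COLUMNS set, and the select_column helper are hypothetical names for this example; checking against a fixed allow-list is one common way to vary the structure of a query safely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stocks (symbol TEXT, price REAL)")
cur.execute("INSERT INTO stocks VALUES (?, ?)", ("IBM", 45.0))

# The placeholder is bound as a value, so this selects the literal text.
cur.execute("SELECT ? FROM stocks", ("symbol",))
as_value = cur.fetchone()

# Structural parts must be built with string operations; restrict them to
# a fixed set of known-good names before putting them into the SQL.
ALLOWED_COLUMNS = {"symbol", "price"}


def select_column(column):
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column name: {column!r}")
    cur.execute(f"SELECT {column} FROM stocks")
    return cur.fetchone()


as_column = select_column("symbol")
print(as_value, as_column)
```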
Yes, this does mean you have to be ultra-careful. If you don't want to allow updates, try opening the DB connection in read-only mode.
Use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method.
# Never do this -- insecure!
symbol = 'hello'
c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)
# Do this instead
t = (symbol,)
c.execute('SELECT * FROM stocks WHERE symbol=?', t)
print(c.fetchone())
More reference is in the manual.
I want how to get the parsed 'sql param'.
It's all open source, so you have full access to the code doing the parsing and sanitization. Why not just read this code to find out how it works, and whether there's some (possibly undocumented) implementation that you can reuse?
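One documented hook that may help here is Connection.set_trace_callback in the sqlite3 module, which calls a function with the text of each statement as it is executed. This is only a sketch: whether the traced text shows the bound values or the original ? placeholders depends on the Python and SQLite versions in use, so treat the output as a debugging aid rather than as SQL to re-execute. The stocks table is made up.

```python
import sqlite3

traced = []  # collects every statement text SQLite reports

conn = sqlite3.connect(":memory:")
conn.set_trace_callback(traced.append)

cur = conn.cursor()
cur.execute("CREATE TABLE stocks (symbol TEXT)")
cur.execute("INSERT INTO stocks VALUES (?)", ("hello",))
cur.execute("SELECT * FROM stocks WHERE symbol = ?", ("hello",))
row = cur.fetchone()

conn.set_trace_callback(None)  # stop tracing

for statement in traced:
    print(statement)
```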