How to get number of affected rows in sqlalchemy? - python

I have a question about Python and the SQLAlchemy module. What is the SQLAlchemy equivalent of cursor.rowcount?

ResultProxy objects have a rowcount property as well.

Actually, there is no way to know this precisely for Postgres.
The closest thing is rowcount. But rowcount is not the number of affected rows. It's the number of matched rows. Here is what the docs say:
This attribute returns the number of rows matched, which is not necessarily the same as the number of rows that were actually modified - an UPDATE statement, for example, may have no net change on a given row if the SET values given are the same as those present in the row already. Such a row would be matched but not modified. On backends that feature both styles, such as MySQL, rowcount is configured by default to return the match count in all cases
So rowcount will report 1 in both of the following scenarios, because both report Rows matched: 1.
Scenario 1: one row is changed by the update statement.
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Scenario 2: the same update statement is executed again.
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
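To see the "matched, not changed" behaviour without a MySQL server, here is a minimal sketch using the standard library sqlite3 module as a stand-in. The table and column names are made up for illustration, and other backends and drivers may count differently.

```python
import sqlite3

# sqlite3 is only a stand-in here; the items table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items (id, name) VALUES (1, 'old')")

# First run: the row is matched and its value really changes.
first = conn.execute("UPDATE items SET name = 'new' WHERE id = 1").rowcount

# Second run: the row is matched again, but the value is already 'new'.
second = conn.execute("UPDATE items SET name = 'new' WHERE id = 1").rowcount

print(first, second)
conn.close()
```

With sqlite3 both runs report a rowcount of 1, even though the second update does not change any data.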

What Shiplu says is correct, of course. However, to answer the question: in many cases you can easily make the number of matched rows equal the number of changed rows by adding a condition to the WHERE clause that the value is different, i.e.:
UPDATE table
SET column_to_be_changed = "something different"
WHERE column_to_be_changed != "something different" AND [...your other conditions...]
rowcount will then return the number of affected rows, because it equals the number of matched rows.
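A minimal sketch of that WHERE trick, again using sqlite3 as a stand-in with a hypothetical items table and a hypothetical helper set_name. One assumption to keep in mind: in standard SQL a comparison with NULL is never true, so rows where the column is NULL would need extra handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items (id, name) VALUES (1, 'old')")


def set_name(conn, item_id, new_name):
    """Update the name only if it differs; return the driver's rowcount."""
    cur = conn.execute(
        "UPDATE items SET name = ? WHERE id = ? AND name != ?",
        (new_name, item_id, new_name),
    )
    return cur.rowcount


changed_first = set_name(conn, 1, "new")  # value differs, so the row matches
changed_again = set_name(conn, 1, "new")  # value already set, nothing matches

print(changed_first, changed_again)
conn.close()
```

Here the second call reports 0, because the extra condition stops the unchanged row from being matched at all.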

Shiplu's analysis is 100% correct.
Pushing the discussion a bit further, here is how to display the updated rowcount and not the matched rowcount using sqlalchemy for MySQL: the engine needs to be created with the flag client_flag=0.
from sqlalchemy import create_engine

engine = create_engine(
    'mysql+pymysql://user:password@host:port/db',
    connect_args={'client_flag': 0}
)
To give a bit more details, the rowcount returned by MySQL depends on the CLIENT_FOUND_ROWS flag provided to the C-API function mysql_real_connect() as stated in MySQL documentation:
For UPDATE statements, the affected-rows value by default is the number of rows actually changed. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is the number of rows “found”; that is, matched by the WHERE clause.
The flag value is 2 (see the MySQL constants), and SQLAlchemy adds it automatically when creating the engine, as visible in the SQLAlchemy source.
Setting client_flag in connect_args overrides this value.
Note: this might break something that relies on sane_rowcount (apparently only used in the ORM); in my case, I only use SQLAlchemy Core. From the same source:
# FOUND_ROWS must be set in CLIENT_FLAGS to enable
# supports_sane_rowcount.

You can use .returning to get back the rows the statement wrote, and then use result.rowcount to count them.
For example:
from sqlalchemy import insert

# mytable, myvalues and get_engine() are defined elsewhere in the application
insertstmt = insert(mytable).values(myvalues).returning(mytable.c.mytableid)
with get_engine().begin() as conn:
    result = conn.execute(insertstmt)
    print(result.rowcount)

Related

pyodbc rowcount only returns -1

How does rowcount work? I am using pyodbc and it's always returning -1.
return_query = conn.query_db_param(query, q_params)
print(return_query.rowcount)

def query_db_param(self, query, params):
    self.cursor.execute(query, params)
    print(self.cursor.rowcount)
rowcount refers to the number of rows affected by the last operation. So, if you do an insert and insert only one row, then it will return 1. If you update 200 rows, then it will return 200. On the other hand, if you SELECT, the last operation doesn't really affect rows, it is a result set. In that case, 0 would be syntactically incorrect, so the interface returns -1 instead.
It will also return -1 for operations where you do things like set variables or use create/alter commands.
You are connecting to a database that can't give you that number for your query. Many database engines produce rows as you fetch results, scanning their internal table and index data structures for the next matching result as you do so. The engine can't know the final count until you fetched all rows.
When the rowcount is not known, the Python DB-API 2.0 specification for Cursor.rowcount states the number must be set to -1 in that case:
The attribute is -1 in case [...] the rowcount of the last operation is cannot be determined by the interface.
The pyodbc Cursor.rowcount documentation conforms to this requirement:
The number of rows modified by the last SQL statement.
This is -1 if no SQL has been executed or if the number of rows is unknown. Note that it is not uncommon for databases to report -1 immediately after a SQL select statement for performance reasons. (The exact number may not be known before the first records are returned to the application.)
pyodbc is not alone in this. Another easy-to-link-to example is the Python standard library sqlite3 module; its Cursor.rowcount documentation states:
As required by the Python DB API Spec, the rowcount attribute “is -1 in case no executeXX() has been performed on the cursor or the rowcount of the last operation is not determinable by the interface”. This includes SELECT statements because we cannot determine the number of rows a query produced until all rows were fetched.
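A small sketch of what that looks like in practice with sqlite3 (the table t is made up for illustration): the rowcount is -1 after a SELECT, while a DELETE with a WHERE clause reports the number of rows it removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (3,)])

cur = conn.cursor()

# For a SELECT, sqlite3 does not know the row count up front.
cur.execute("SELECT x FROM t")
select_rowcount = cur.rowcount
rows = cur.fetchall()

# For a data-modifying statement, the count is available.
cur.execute("DELETE FROM t WHERE x > 1")
delete_rowcount = cur.rowcount

print(select_rowcount, len(rows), delete_rowcount)
conn.close()
```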
Note that for a subset of database implementations, the rowcount value can be updated after fetching some of the rows. You'll have to check the documentation for the specific database you are connecting to, to see whether that implementation can do this or whether the rowcount must remain at -1. You could always experiment, of course.
You could execute a COUNT() select first, or, if the result set is not expected to be too large, use cursor.fetchall() and use len() on the resulting list.
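Both options can be sketched like this, using sqlite3 and a hypothetical table t. The same pattern should carry over to other DB-API drivers, though the parameter placeholder style differs between them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(n,) for n in range(5)])

# Option 1: let the database count, without transferring the rows.
(count_from_db,) = conn.execute(
    "SELECT COUNT(*) FROM t WHERE x >= ?", (2,)
).fetchone()

# Option 2: fetch everything and count in Python (fine for small results).
rows = conn.execute("SELECT x FROM t WHERE x >= ?", (2,)).fetchall()
count_from_list = len(rows)

print(count_from_db, count_from_list)
conn.close()
```

Note that with option 1 the count and a later SELECT are two separate queries, so the numbers can differ if another session changes the table in between.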
If you are using Microsoft SQL Server and you want the number of rows returned by the prior select statement, you can just execute select @@rowcount.
E.g.:
cursor.execute("select @@rowcount")
rowcount = cursor.fetchall()[0][0]

I deleted the model object with pk=1, but the new object has pk=2 [duplicate]

I have a table with an auto-increment primary key. This table is meant to store millions of records, and I don't need to delete anything for now. The problem is that when new rows are inserted, because of some error, the auto-increment key leaves gaps in the ids. For example, after 5 the next id is 8, leaving a gap at 6 and 7. As a result, when I count the rows I get 28000, but the max id is 58000. What can be the reason? I am not deleting anything. And how can I fix this issue?
P.S. I am using insert ignore while inserting records so that it doesn't give error when I try to insert duplicate entry in unique column.
This is by design and will always happen.
Why?
Let's take two overlapping transactions that are doing INSERTs:
Transaction 1 does an INSERT, gets the value (let's say 42), does more work
Transaction 2 does an INSERT, gets the value 43, does more work
Then
Transaction 1 fails. Rolls back. 42 stays unused
Transaction 2 completes with 43
If consecutive values were guaranteed, every transaction would have to happen one after the other. Not very scalable.
Also see Do Inserted Records Always Receive Contiguous Identity Values (SQL Server but same principle applies)
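The sequence of events above can be imitated with a toy model in plain Python. This is not how any database implements it; ToySequence and run_transaction are invented names, and the only point is that an id is reserved when the insert starts and is not handed back on rollback.

```python
import itertools


class ToySequence:
    """Hands out increasing ids and never takes one back."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_id(self):
        return next(self._counter)


def run_transaction(seq, committed, succeed):
    """Reserve an id, then either commit it or roll back."""
    new_id = seq.next_id()  # the id is reserved immediately
    if succeed:
        committed.append(new_id)
    # On failure the id is simply discarded, which leaves a gap.
    return new_id


seq = ToySequence()
committed = []
run_transaction(seq, committed, succeed=False)  # reserves 1, rolls back
run_transaction(seq, committed, succeed=True)   # reserves 2, commits
run_transaction(seq, committed, succeed=True)   # reserves 3, commits

print(committed)
```

The committed ids start at 2, so the row count and the maximum id already disagree after a single failed insert.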
You can create a trigger to handle the auto increment as:
CREATE DEFINER=`root`@`localhost` TRIGGER `mytable_before_insert` BEFORE INSERT ON `mytable` FOR EACH ROW
BEGIN
    SET NEW.id = (SELECT IFNULL(MAX(id), 0) + 1 FROM mytable);
END
This is a problem in the InnoDB, the storage engine of MySQL.
It really isn't a problem: the docs on “AUTO_INCREMENT Handling in InnoDB” basically say that InnoDB uses a special auto-increment counter, which it initializes at startup.
And the query it uses is something like
SELECT MAX(ai_col) FROM t FOR UPDATE;
This improves concurrency without really having an effect on your data.
To avoid this, use MyISAM instead of InnoDB as the storage engine.
Perhaps (I haven't tested this) a solution is to set innodb_autoinc_lock_mode to 0.
According to http://dev.mysql.com/doc/refman/5.7/en/innodb-auto-increment-handling.html this might make things a bit slower (if you perform inserts of multiple rows in a single query) but should remove gaps.
You can try an insert like:
insert ignore into table select (select max(id)+1 from table), "value1", "value2" ;
This will try to:
insert the new data with the next unused id (not auto-increment)
ignore the row if a duplicate entry is found in a unique field
otherwise insert the new data normally
(but this method does not support updating fields when a duplicate entry is found)

How do I get the number of rows affected with SQL Alchemy?

How do I get the number of rows affected for an update statement with sqlalchemy? (I am using mysql and python/pyramid):
from sqlalchemy.engine.base import ResultProxy

@classmethod
def myupdate(cls, id, myvalue):
    DBSession.query(cls).filter(cls.id == id).update({'mycolumn': myvalue})
    if ResultProxy.rowcount == 1:
        return True
    else:
        return False
Note: I saw this post, but according to the docs: "The ‘rowcount’ reports the number of rows matched by the WHERE criterion of an UPDATE or DELETE statement." In other words, it doesn't return the number of rows affected by the update or delete statement.
You can override this behaviour by specifying the right option to the DBAPI, according to the doc.
I don't have a mysql ready to test, but I think adding the right option (either client_flag or found_rows=False depending on the api used) to the configuration url should do the trick. Check the corresponding source for mysqlconnector and oursql for more info.
I hope this will be enough to help you.

How can I "fetch two" with python-mysql?

I have a table, and I want to execute a query that will return the values of two rows:
cursor.execute("""SELECT `egg_id`
FROM `groups`
WHERE `id` = %s;""", (req_id))
req_egg = str(cursor.fetchone())
print req_egg
The column egg_id has two rows it can return for that query, however the above code will only print the first result -- I want it to also show the second, how would I get both values?
Edit: Would there be any way to store each one in a separate variable, with fetchmany?
In this case you can use fetchmany to fetch a specified number of rows:
req_egg = cursor.fetchmany(2)
Edit:
But be aware: if your table has many rows and you only need two, you should also use a LIMIT in your SQL query. Otherwise all rows are returned from the database, but only two are used by your program.
Call .fetchone() a second time, and it would return the next result.
Otherwise if you are 100% positively sure that your query would always return exactly two results, even if you've had a bug or inconsistent data in the database, then just do a .fetchall() and capture both results.
Try this:
Cursor.fetchmany(size=2)
Documentation for sqlite3 (which also implements dbapi): http://docs.python.org/library/sqlite3.html#sqlite3.Cursor.fetchmany
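To store each value in a separate variable, something along these lines should work. It is a sketch against sqlite3 with a made-up egg_groups table, so the placeholder is ? rather than the %s used by MySQL drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# egg_groups is a hypothetical stand-in for the table in the question.
conn.execute("CREATE TABLE egg_groups (id INTEGER, egg_id INTEGER)")
conn.executemany(
    "INSERT INTO egg_groups (id, egg_id) VALUES (?, ?)",
    [(7, 100), (7, 200), (8, 300)],
)

cur = conn.cursor()
# LIMIT keeps the database from returning more rows than needed.
cur.execute(
    "SELECT egg_id FROM egg_groups WHERE id = ? ORDER BY egg_id LIMIT 2",
    (7,),
)
rows = cur.fetchmany(2)

# Each row is a tuple with one column; unpack the two values.
first_egg, second_egg = (row[0] for row in rows)

print(first_egg, second_egg)
conn.close()
```

The unpacking raises ValueError if the query returns fewer than two rows, so check len(rows) first if that can happen.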

how to count value returned from select query in python

I want to count the total number of rows returned by a query. I am able to retrieve the rows returned by the query, but I also need to handle the case where no data exists, that is, when the query returns no value from the database.
The code I used to solve this problem is:
try:
    cur.execute(query)
    id = cur.fetchone()[0]
    if(id is None):
        return '-1'
    else:
        return id
But this doesn't help when no value is returned by the query (when no row meets the criteria defined in the select statement).
cur.fetchall() will give you a sequence of all the rows. You can look at the length of that sequence to see if any rows were returned. This works for small result sets, but may not be ideal for queries that return large amounts of data.
Alternatively, you can look at cur.rowcount. Rowcount will return the number of rows in the query, or -1 if the number cannot be determined. It is up to the implementation to set rowcount; several popular python database modules (most notably sqlite3), do not set rowcount for all queries. For modules that do not set rowcount, the only way to count the number of result rows is to load the full result set into memory.
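For the single-value case in the question, note that fetchone() returns None when there is no row, so indexing it directly fails. A minimal sketch of a guard, using sqlite3 and a hypothetical helper fetch_first_value (it returns the integer -1 as the default, where the question used the string '-1'):

```python
import sqlite3


def fetch_first_value(cur, query, params=(), default=-1):
    """Return the first column of the first row, or default if no row."""
    cur.execute(query, params)
    row = cur.fetchone()
    if row is None:  # the query matched nothing
        return default
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'ann')")
cur = conn.cursor()

found = fetch_first_value(cur, "SELECT id FROM users WHERE name = ?", ("ann",))
missing = fetch_first_value(cur, "SELECT id FROM users WHERE name = ?", ("bob",))

print(found, missing)
conn.close()
```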
