Locking a MySQL table in a concurrent function causes an unexpected infinite loop - python

I have a function in my code, as follows:
async def register():
    db = connector.connect(host='localhost', user='root', password='root', database='testing')
    cursor = db.cursor()
    cursor.execute('LOCK TABLES Data WRITE;')
    cursor.execute('SELECT Total_Reg FROM Data;')
    data = cursor.fetchall()
    reg = data[0][0]
    print(reg)
    if reg >= 30:
        print("CLOSED!")
        return
    await asyncio.sleep(1)
    cursor.execute('UPDATE Data SET Total_Reg = Total_Reg + 1 WHERE Id = 1')
    cursor.execute('COMMIT;')
    print("REGISTERED!")
    db.close()
When multiple instances of this register function run at the same time, an unexpected infinite loop occurs and blocks my entire code. Why is that? Also, if it is a deadlock (as I assume), why does my program not raise any error? And what can be done to prevent this issue?

A much simpler construct:
db = connector.connect(...)
cursor = db.cursor()
cursor.execute('UPDATE Data SET Total_Reg = Total_Reg + 1 WHERE Id = 1 AND Total_Reg < 30')
db.commit()
# mysql.connector's execute() returns None, so check rowcount to see
# how many rows the UPDATE actually changed
if cursor.rowcount:
    print("REGISTERED!")
else:
    print("CLOSED!")
db.close()
So:
- Don't use LOCK TABLES, not until you understand transactions, and then rarely
- Use the SQL to enforce the constraints you want
- Check the affected-row count (cursor.rowcount here; some drivers also return it from execute()) to see whether any rows were changed
- Don't use sleep statements
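As for why the original hangs: mysql.connector's calls are synchronous, so while one coroutine holds the WRITE lock and is suspended in await asyncio.sleep(1), another instance's LOCK TABLES call blocks the whole event-loop thread. The first coroutine can never resume to release the lock, both sides wait forever, and since a lock wait is not an exception, no error is raised. (COMMIT does not release LOCK TABLES locks either; only UNLOCK TABLES or closing the session does.) If you really need a read-check-update flow, a row lock inside a transaction is the safer tool. A minimal sketch, assuming mysql.connector and an InnoDB Data table with the Id = 1 row from the question:

import mysql.connector as connector

db = connector.connect(host='localhost', user='root',
                       password='root', database='testing')
cursor = db.cursor()
db.start_transaction()
# FOR UPDATE takes a row lock; every exit path below releases it via
# commit() or rollback(), so no caller can wedge the table the way a
# leaked LOCK TABLES does
cursor.execute('SELECT Total_Reg FROM Data WHERE Id = 1 FOR UPDATE')
(reg,) = cursor.fetchone()
if reg >= 30:
    db.rollback()
    print("CLOSED!")
else:
    cursor.execute('UPDATE Data SET Total_Reg = Total_Reg + 1 WHERE Id = 1')
    db.commit()
    print("REGISTERED!")
db.close()

If you keep the coroutine design, run these blocking calls in a thread (e.g. asyncio.to_thread) or use an async driver; otherwise mysql.connector blocks the event loop.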

Related

python-mysql-connector: I need to speed up the time it takes to update multiple items in mySQL table

I currently have a list of ids, approximately 10,000 of them. I need to update all rows in the MySQL table whose id is in the inactive_ids list you see below, changing their active status to 'No', which is a column in the MySQL table.
I am using the mysql.connector Python library.
When I run the code below, it takes about 0.7 seconds to execute each iteration of the for loop. That's about a 2-hour run time for all 10,000 ids to be changed. Is there a more optimal/quicker way to do this?
# inactive_ids are unique strings something like shown below
# inactive_ids = ['a9okeoko', 'sdfhreaa', 'xsdfasy', ..., 'asdfad']

# initialize connection
mydb = mysql.connector.connect(
    user="REMOVED",
    password="REMOVED",
    host="REMOVED",
    database="REMOVED"
)

# initialize cursor
mycursor = mydb.cursor(buffered=True)

# Function to execute multiple lines
def alter(state, msg, count):
    result = mycursor.execute(state, multi=True)
    result.send(None)
    print(str(count), ': ', msg, result)
    count += 1
    return count

# Try to execute, throw exception if fails
try:
    count = 0
    for Id in inactive_ids:
        # SAVE THE QUERY AS STRING
        sql_update = "UPDATE test_table SET Active = 'No' WHERE NoticeId = '" + Id + "'"
        # ALTER
        count = alter(sql_update, "done", count)
    # commits all changes to the database
    mydb.commit()
except Exception as e:
    mydb.rollback()
    raise e
Do it with a single query that uses IN (...) instead of multiple queries.
placeholders = ','.join(['%s'] * len(inactive_ids))
sql_update = f"""
    UPDATE test_table
    SET Active = 'No'
    WHERE NoticeId IN ({placeholders})
"""
mycursor.execute(sql_update, inactive_ids)
mydb.commit()  # commit once; mysql.connector does not autocommit
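If you'd rather keep one statement per id, executemany is a middle ground: one call, one commit, and bound parameters, which also closes the SQL-injection hole opened by concatenating Id into the string. A sketch reusing the question's mydb and mycursor:

# bound parameters instead of string concatenation; the driver replays
# the statement for each tuple within a single call
mycursor.executemany(
    "UPDATE test_table SET Active = 'No' WHERE NoticeId = %s",
    [(notice_id,) for notice_id in inactive_ids],
)
mydb.commit()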

Populate a database with python and Django efficiently

We are trying to populate a database with random numbers using Python and Django, but we have a lot of rows to go through, and it takes about 20 minutes to carry out the task.
This is our code. We have 210,000 rows to go through:
def populate(request):
    all_accounts = Account.objects.all()
    count = 0
    for account in all_accounts:
        account.avg_deal_size = round(random.randint(10, 200000), 2)
        account.save()
        print(f"Counter of accounts: {count}")
        count += 1
Thank you!
Assuming you don't need any logic in .save(), or signals to be executed, or such, just use SQL and have your RDBMS do the heavy lifting. This should execute in seconds, if that.
from django.db import connection

def populate(request):
    with connection.cursor() as cur:
        cur.execute("UPDATE myapp_account SET avg_deal_size = round(10 + random() * 190000, 2)")
        # an UPDATE returns no result set, so read the affected-row count
        # from the cursor rather than calling fetchone()
        print(f"{cur.rowcount} rows affected.")

Accessing database speed

I have a simple application (a Telegram bot) that about 2000-3000 people are using right now.
I want to increase the balance of users every N seconds; the increment depends on their current status. The code works fine, but the function seems to stop working after some amount of time. I started with 30 seconds; after the first issue I thought 30 seconds was not enough to go through all the rows, so right now I'm running it every 1200 seconds, but the balances still stop growing after a while.
So is it just because of that, or am I doing something wrong in the code itself?
P.S. I'm using Python 3 and SQLite, and the bot runs constantly on a cheap, weak VPS server.
def balance_growth():
    try:
        cursor = connMembers.cursor()
        sql = "SELECT * FROM members"
        cursor.execute(sql)
        data = cursor.fetchall()
        for single_data in data:
            if single_data[5] == "Basic":
                sql = "UPDATE members SET balance = {B} + 1 WHERE chat_id = {I}".format(B=single_data[1], I=single_data[0])
                cursor.execute(sql)
            elif single_data[5] == "Bronze":
                sql = "UPDATE members SET balance = {B} + 2 WHERE chat_id = {I}".format(B=single_data[1], I=single_data[0])
                cursor.execute(sql)
            elif single_data[5] == "Silver":
                sql = "UPDATE members SET balance = {B} + 12 WHERE chat_id = {I}".format(B=single_data[1], I=single_data[0])
                cursor.execute(sql)
            elif single_data[5] == "Gold":
                sql = "UPDATE members SET balance = {B} + 121 WHERE chat_id = {I}".format(B=single_data[1], I=single_data[0])
                cursor.execute(sql)
            elif single_data[5] == "Platinum":
                sql = "UPDATE members SET balance = {B} + 1501 WHERE chat_id = {I}".format(B=single_data[1], I=single_data[0])
                cursor.execute(sql)
            cursor.execute(sql)
            connMembers.commit()
        cursor.close()
        t = threading.Timer(120, balance_growth).start()
    except Exception as err:
        print(err)
Why not just do it all in a single UPDATE statement instead of one per row? Something like:
UPDATE members SET balance = balance + (CASE whatever_column
    WHEN "Platinum" THEN 1501
    WHEN "Gold" THEN 121
    WHEN "Silver" THEN 12
    WHEN "Bronze" THEN 2
    ELSE 1 END)
Edit:
Other suggestions:
- Use integers instead of strings for the different levels, which will both be faster to compare and take up less space in the database.
- Redesign your logic so it doesn't need an update every single tick: keep track of the last time a row's balance was updated, and update it according to the time elapsed whenever you need to read the balance (see the sketch after this list).
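A sketch of that second suggestion, assuming a last_update UNIX-timestamp column added to members and a hypothetical status column (the question reads it as single_data[5]): each balance is brought up to date lazily whenever it is read, so no periodic full-table sweep is needed.

import time

RATES = {"Basic": 1, "Bronze": 2, "Silver": 12, "Gold": 121, "Platinum": 1501}
TICK_SECONDS = 1200  # the question's current interval

def current_balance(conn, chat_id):
    cur = conn.cursor()
    cur.execute("SELECT balance, status, last_update FROM members WHERE chat_id = ?",
                (chat_id,))
    balance, status, last_update = cur.fetchone()
    ticks = int(time.time() - last_update) // TICK_SECONDS
    if ticks > 0:
        # credit all missed ticks at once and move the timestamp forward
        balance += ticks * RATES.get(status, 1)
        cur.execute("UPDATE members SET balance = ?, last_update = ? WHERE chat_id = ?",
                    (balance, last_update + ticks * TICK_SECONDS, chat_id))
        conn.commit()
    return balance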
The problem is that you're calling commit() after every single UPDATE statement, which forces the database to write back all changes from its cache.
Do a single commit after you have finished everything.
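Putting both answers together, a minimal sketch: one statement for the whole table, one commit for the whole sweep. It assumes the question's connMembers SQLite connection and the same hypothetical status column. Note also that the original schedules the next Timer inside the try block, so any exception silently stops all future runs, which would explain balances that stop growing; rescheduling outside the try avoids that.

import threading

GROWTH_SQL = """
    UPDATE members SET balance = balance + (CASE status
        WHEN 'Platinum' THEN 1501
        WHEN 'Gold'     THEN 121
        WHEN 'Silver'   THEN 12
        WHEN 'Bronze'   THEN 2
        ELSE 1 END)
"""

def balance_growth():
    # assumes connMembers allows cross-thread use (check_same_thread=False),
    # which the original Timer-based code already requires
    try:
        cursor = connMembers.cursor()
        cursor.execute(GROWTH_SQL)  # one statement updates every row
        connMembers.commit()        # one commit for the whole sweep
        cursor.close()
    except Exception as err:
        print(err)
    threading.Timer(1200, balance_growth).start()  # reschedule even on error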

getting only updated data from database

I have to get recently updated data from a database. To solve this, I save the last-read row number into a Python shelve. The following code works for a simple query like select * from rows. My code is:
from pyodbc import connect
from peewee import *
import random
import shelve
import connection

d = shelve.open("data.shelve")
db = SqliteDatabase("data.db")

class Rows(Model):
    valueone = IntegerField()
    valuetwo = IntegerField()

    class Meta:
        database = db

def CreateAndPopulate():
    db.connect()
    db.create_tables([Rows], safe=True)
    with db.atomic():
        for i in range(100):
            row = Rows(valueone=random.randrange(0, 100), valuetwo=random.randrange(0, 100))
            row.save()
    db.close()

def get_last_primay_key():
    return d.get('max_row', 0)

def doWork():
    query = "select * from rows"  # could be anything
    conn = connection.Connection("localhost", "", "SQLite3 ODBC Driver", "data.db", "", "")
    max_key_query = "SELECT MAX(%s) from %s" % ("id", "rows")
    max_primary_key = conn.fetch_one(max_key_query)[0]
    print "max_primary_key " + str(max_primary_key)
    last_primary_key = get_last_primay_key()
    print "last_primary_key " + str(last_primary_key)
    if max_primary_key == last_primary_key:
        print "no new records"
    elif max_primary_key > last_primary_key:
        print "There are some datas"
        optimizedQuery = query + " where id>" + str(last_primary_key)
        print query
        for data in conn.fetch_all(optimizedQuery):
            print data
        d['max_row'] = max_primary_key
        # print d['max_row']

# CreateAndPopulate()  # to populate data
doWork()
While the code works for a simple query without a WHERE clause, the query can be anything from simple to complex, with joins and multiple WHERE clauses; in that case the part where I append the where id> filter will fail. How can I get only the newest data from the database, whatever the query is?
PS: I cannot modify the database. I just have to fetch from it.
Use an OFFSET clause. For example:
SELECT * FROM [....] WHERE [....] LIMIT -1 OFFSET 1000
In your query, replace 1000 with a parameter bound to your shelve variable. That will skip the already-seen rows and only grab the newer ones. You may want to consider a more robust refactor eventually, but good luck.
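A minimal sketch of that, going through the sqlite3 module directly (the question's connection.Connection wrapper is opaque, so a plain connection to the same data.db is assumed). Since the shelve stores the max id, using it as an OFFSET only lines up while ids are contiguous from 1:

import sqlite3

def fetch_new_rows(query, rows_already_read):
    # LIMIT -1 means "no limit" in SQLite; OFFSET skips the rows read so far
    conn = sqlite3.connect("data.db")
    try:
        cur = conn.cursor()
        cur.execute(query + " LIMIT -1 OFFSET ?", (rows_already_read,))
        return cur.fetchall()
    finally:
        conn.close()

new_rows = fetch_new_rows("select * from rows", d.get('max_row', 0))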

Index out of range while executing results from db

I'm having a problem while trying simply to print row data fetched from a db (sqlite3). The DB input has 4 fields, which are saved once entered. But here's my problem: when I print all 4 rows, if one of the fields was not filled in, I get an error.
That's the database code:
def ids(self):
    con = lite.connect('foo.db')
    with con:
        cur = con.cursor()
        cur.execute("SELECT Id FROM foo")
        while True:
            ids = cur.fetchall()
            if ids == None:
                continue
            return ids
And since there are 4 rows, my output code:
print ''.join(ids[0]) + ',' + ''.join(ids[1]) + ',' + ''.join(ids[2]) + ',' + ''.join(ids[3])
So my question is: how do I handle the case where a row doesn't exist, so that nothing is printed for it and only the rows that actually exist show up? I tried doing if ids[0] is not None: # do something, but that would make my code really slow, and it's a non-Pythonic way, I guess. Is there any better way to make this work? Any help will be appreciated.
You don't seem to have 4 rows. Make it generic and just join an arbitrary number of rows:
ids = someobject.ids()
print ','.join(''.join(row) for row in ids)
You can simplify your database query; there is no need to 'poll' it:
def ids(self):
    with lite.connect('foo.db') as con:
        cur = con.cursor()
        cur.execute("SELECT Id FROM foo")
        return cur.fetchall()
You could also just loop directly over the cursor; the database will handle buffering as you fetch:
def ids(self):
    with lite.connect('foo.db') as con:
        cur = con.cursor()
        cur.execute("SELECT Id FROM foo")
        return cur  # just the cursor, no fetching

ids = someobject.ids()
# this'll loop over the cursor, which yields rows as required
print ','.join(''.join(row) for row in ids)
