Problems implementing a Python db listener

I'm writing a module for a program that needs to listen for new entries in a database table and execute a function whenever new rows are posted to that table; in effect, a trigger.
I have written some code, but it does not work. Here's my logic:
Connect to the db, query for the newest row, and compare that row with a stored variable; if they differ, run a function and store the newest row in the variable, otherwise close the connection. Run this every 2 seconds to compare the newest row with whatever is stored in the variable/object.
Everything runs fine and pulls the expected results from the db; however, I'm getting a "local variable 'last_sent' referenced before assignment" error.
This confuses me for two reasons:
1. I thought I set last_sent to an empty string as a global variable/object before the functions are called.
2. For my comparison logic to work, I can't set last_sent within the sendListener() function before the if/else.
Here's the code.
from Logger import Logger
from sendSMS import sendSMS
from Needles import dbUser, dbHost, dbPassword, pull_stmt
import pyodbc
import time

# set last_sent to something
last_sent = ''

def sendListener():
    # connect to db
    cnxn = pyodbc.connect('UID='+dbUser+';PWD='+dbPassword+';DSN='+dbHost)
    cursor = cnxn.cursor()
    # run query to pull newest row
    cursor.execute(pull_stmt)
    results = cursor.fetchone()
    # if query results differ from results stored in last_sent, run function,
    # then set last_sent to the query results for the next comparison.
    if results != last_sent:
        sendSMS()
        last_sent = results
    else:
        cnxn.close()

# a loop to run the check every 2 seconds, to lessen cpu usage
def sleepLoop():
    while 0 == 0:
        sendListener()
        time.sleep(2.0)

sleepLoop()
I'm sure there is a better way to implement this.

Here:
if results != last_sent:
    sendSMS()
    last_sent = results
else:
    cnxn.close()
Python sees that you assign to last_sent inside this function, and since it isn't marked as global there, it must be a local variable. But you read it in results != last_sent before that assignment, so you get the error.
To solve this, mark it as global at the beginning of the function:
def sendListener():
    global last_sent
    ...
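Putting it together, here is a minimal sketch of the corrected function, reusing the names from the question; the global declaration is the fix for the error, and closing the connection on every pass is an optional tidy-up:

def sendListener():
    global last_sent  # assign to the module-level variable instead of creating a local one
    # connect to db
    cnxn = pyodbc.connect('UID=' + dbUser + ';PWD=' + dbPassword + ';DSN=' + dbHost)
    cursor = cnxn.cursor()
    # run query to pull newest row
    cursor.execute(pull_stmt)
    results = cursor.fetchone()
    if results != last_sent:
        sendSMS()
        last_sent = results
    # close the connection on every pass, not only when nothing changed
    cnxn.close()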

Related

Why is a subsequent query not able to find newly-inserted rows?

I'm using AWS RDS, which I'm accessing with pymysql. I have a python lambda function that inserts a row into one of my tables. I then call cursor.commit() on the pymysql cursor object. Later, my lambda invokes a second lambda; this second lambda (using a different db connection) executes a SELECT to look for the newly-added row. Unfortunately, the row is not found immediately. As a debugging step, I added code like this:
lambda_handler.py

...
uuid_values = [uuid_value]  # A single-item list
things = queries.get_things(uuid_values)
# Added for debugging
if not things:
    print('For debugging: things not found.')
    time.sleep(5)
    things = queries.get_things(uuid_values)
    print(f'for debugging: {str(things)}')
return things
queries.py

def get_things(uuid_values):
    # Creates a string of the form 'UUID_TO_BIN(%s), UUID_TO_BIN(%s)' for use in the query below
    format_string = ','.join(['UUID_TO_BIN(%s)'] * len(uuid_values))
    tuple_of_keys = tuple([str(key) for key in uuid_values])
    with db_conn.get_cursor() as cursor:
        # Lightly simplified query
        cursor.execute('''
            SELECT ...
            FROM table1 t1
            JOIN table2 t2 ON t1.id = t2.t1_id
            WHERE
                t1.uuid_value IN ({format_string})
                AND t2.status_id = 1
            '''.format(format_string=format_string),
            tuple_of_keys)
        results = cursor.fetchall()
    db_conn.conn.commit()
    return results
This outputs
'For debugging: things not found.'
'<thing list>'
meaning the row is not found immediately, but is found after a brief delay. I'd rather not leave this delay in when I ship to production. I'm not doing anything explicit with transactions or isolation levels, so it's very strange to me that this second query does not find the newly-inserted row. Any idea what might be causing this?
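One thing worth checking, under the assumption that these are InnoDB tables: with the default REPEATABLE READ isolation level, a connection with an open transaction keeps reading the snapshot taken at its first read, so a row committed afterwards by another connection stays invisible. A minimal sketch, assuming pymysql (connection details are placeholders), of the two usual remedies, ending the open transaction before the lookup or reading at READ COMMITTED:

import pymysql

conn = pymysql.connect(host='...', user='...', password='...', db='...')  # placeholder credentials

# Remedy 1: end any open transaction so the next SELECT takes a fresh snapshot
conn.commit()

# Remedy 2: read at READ COMMITTED so each SELECT sees rows committed by other connections
with conn.cursor() as cursor:
    cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED")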

What is the correct way to work with PostgreSQL from Python threads?

I need to speed up the parsing of a heap of XML files. I decided to try Python threads, but I don't know how to work with the DB from them correctly.
My DB stores only links to the files. I decided to add an isProcessing column to my DB to prevent the same rows from being acquired by multiple threads.
So the resulting table looks like:
|xml_path|isProcessing|
Every thread sets this flag before it starts processing, and other threads select rows for processing where this flag is not set.
But I am not sure this is the correct way, because I am not sure the acquisition is atomic, and two threads may process the same row twice.
def select_single_file_for_processing():
    # ...
    sql = """UPDATE processing_files SET "isProcessing" = 'TRUE' WHERE "xml_name"='{0}'""".format(xml_name)
    cursor.execute(sql)
    conn.commit()

def worker():
    result = select_single_file_for_processing()
    # ...
    # processing()

def main():
    # ....
    while unprocessed_xml_count != 0:  # now unprocessed_xml_count is global! I know that it's wrong, but how to fix it?
        checker_thread = threading.Thread(target=select_total_unpocessed_xml_count)
        checker_thread.start()  # if we have files for processing
        for i in range(10):  # run processed
            t = Process(target=worker)
            t.start()
The second question: what is the best practice for working with the DB from the multiprocessing module?
As written, your isProcessing flag could have problems with multiple threads. You should include a predicate for isProcessing = FALSE and check how many rows are updated. One thread will report 1 row and any other threads will report 0 rows.
As for best practices: this is a reasonable approach. The key is to be specific. A plain UPDATE sets the values unconditionally; the operation you are actually trying to perform is to change the value from a to b, hence the predicate for a in the statement.
UPDATE processing_files
SET isProcessing = 'TRUE'
WHERE xmlName = '...'
  AND isProcessing = 'FALSE';
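In Python terms, a minimal sketch of that claim-and-check step, assuming psycopg2 and the quoted column names used in the question's own UPDATE:

import psycopg2

def try_claim_file(conn, xml_name):
    """Try to claim one file; returns True only if this call flipped the flag."""
    with conn.cursor() as cursor:
        cursor.execute(
            """UPDATE processing_files
               SET "isProcessing" = 'TRUE'
               WHERE "xml_name" = %s
                 AND "isProcessing" = 'FALSE'""",
            (xml_name,))
        claimed = cursor.rowcount == 1  # 1 row: this thread won the race; 0 rows: another thread already claimed it
    conn.commit()
    return claimed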

python function error with sqlite code

def readswitch(x, y, connn, read):
    x = 'create vlan'
    y = 'global'
    conn = sqlite3.connect('server.db')
    if conn:
        cur = conn.cursor()
        run = cur.execute("SELECT command FROM switch WHERE function =? or type = ? ORDER BY key ASC", (x, y))
        read = cur.fetchall()
        return run

for row in read:
    print(readswitch())
I want to search for x and y in my database and have the function return the result of my SQL statement for the command,
but it seems I can't run this function; I get:
for row in read:
NameError: name 'read' is not defined
Can anyone fix this error?
Your code has several problems, including how arguments are passed and variable scope, and I'm not sure what it's really trying to do. I suggest rewriting it with no function, just straight sequential execution. Once you get that working, try factoring it back into a function.
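A minimal sketch of that straight-line version, assuming the same server.db schema as the question:

import sqlite3

x = 'create vlan'
y = 'global'

conn = sqlite3.connect('server.db')
cur = conn.cursor()
cur.execute(
    "SELECT command FROM switch WHERE function = ? OR type = ? ORDER BY key ASC",
    (x, y))
rows = cur.fetchall()

for row in rows:
    print(row[0])  # each row is a 1-tuple holding the command column

conn.close()

Once this prints the commands you expect, wrap it back up in a function that takes x and y as parameters and returns rows.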

Looping through list of tuples to post each one within a function

So, my first two functions, sqlPull() and dupCatch(), work perfectly, but when I try to pass new_data (the unique MySQL tuple rows) to the post() function, nothing happens. I am not getting errors, and it continues to run. Normally, if I were to execute a static post request, I would see it instantaneously in Google Analytics, but nothing is appearing, so I know something is wrong with the function. I assume the error lies in the for loop within post(), but I am not sure what about it. Maybe I can't unpack the variables the way I am currently doing because of what I did to them in the previous function?
import mysql.connector
import datetime
import requests
import time

def sqlPull():
    connection = mysql.connector.connect(user='xxxxx', password='xxxxx', host='xxxxx', database='MeshliumDB')
    cursor = connection.cursor()
    cursor.execute("SELECT TimeStamp, MAC, RSSI FROM wifiscan ORDER BY TimeStamp DESC LIMIT 20;")
    data = cursor.fetchall()
    connection.close()
    time.sleep(5)
    return data

seen = set()

def dupCatch():
    data = sqlPull()
    new_data = []
    for (TimeStamp, MAC, RSSI) in data:
        if (TimeStamp, MAC, RSSI) not in seen:
            seen.add((TimeStamp, MAC, RSSI))
            new_data.append((TimeStamp, MAC, RSSI))
    return new_data

def post():
    new_data = dupCatch()
    for (TimeStamp, MAC, RSSI) in new_data:
        requests.post("http://www.google-analytics.com/collect",
                      data="v=1&tid=UA-22560594-2&cid={}&t=event&ec={}&ea=InStore&el=RSSI&ev={}&pv=SipNSiz_Store".format(
                          MAC,
                          RSSI,
                          TimeStamp))

while run is True:
    sqlPull()
    dupCatch()
    post()
Your post function calls dupCatch(). But you also call dupCatch in your main run loop, right before calling post.
Similarly, your dupCatch function calls sqlPull(), but you also call sqlPull in your main run loop.
Those extra calls mean you end up throwing away 2 batches of data for each batch you process.
You could restructure your code so your functions take their values as arguments, like this:
while run is True:
    data = sqlPull()
    newdata = dupCatch(data)
    post(newdata)
… and then change dupCatch and post so they use those arguments, instead of calling the functions themselves.
Alternatively, you could just remove the extra calls in the main run loop.
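A minimal sketch of the restructured functions, reusing the names from the question; only the signatures change and the inner calls are removed (the original run flag is replaced with a plain while True here):

def dupCatch(data):
    new_data = []
    for (TimeStamp, MAC, RSSI) in data:
        if (TimeStamp, MAC, RSSI) not in seen:
            seen.add((TimeStamp, MAC, RSSI))
            new_data.append((TimeStamp, MAC, RSSI))
    return new_data

def post(new_data):
    for (TimeStamp, MAC, RSSI) in new_data:
        requests.post(
            "http://www.google-analytics.com/collect",
            data="v=1&tid=UA-22560594-2&cid={}&t=event&ec={}&ea=InStore&el=RSSI&ev={}&pv=SipNSiz_Store".format(
                MAC, RSSI, TimeStamp))

while True:
    data = sqlPull()            # pull the latest 20 rows once per iteration
    new_data = dupCatch(data)   # keep only rows we have not seen before
    post(new_data)              # send only the new rows to Google Analytics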

Python's MySQLdb not getting updated row

I have a script that waits until some row in a db is updated:
con = MySQLdb.connect(server, user, pwd, db)
When the script starts, the row's value is "running", and it waits for the value to become "finished":
while True:
    sql = '''select value from table where some_condition'''
    cur = self.getCursor()
    cur.execute(sql)
    r = cur.fetchone()
    cur.close()
    res = r['value']
    if res == 'finished':
        break
    print res
    time.sleep(5)
When I run this script it hangs forever. Even though I see the value of the row has changed to "finished" when I query the table, the printout of the script is still "running".
Is there some setting I didn't set?
EDIT: The python script only queries the table. The update to the table is carried out by a tomcat webapp, using JDBC, that is set on autocommit.
This is an InnoDB table, right? InnoDB is a transactional storage engine. Setting autocommit to true will probably fix this behavior for you.
conn.autocommit(True)
Alternatively, you could change the transaction isolation level. You can read more about this here:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
The reason for this behavior is that inside a single transaction the reads need to be consistent: all consistent reads within the same transaction read the snapshot established by the first read. Even if your script only reads the table, this is considered a transaction too. This is the default behavior in InnoDB, and you need to change it or run conn.commit() after each read.
This page explains this in more details: http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html
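A minimal sketch of the polling loop with those fixes applied, assuming MySQLdb as in the question (enable autocommit once, or commit after each read so the next SELECT starts a new snapshot; either one is enough):

import time
import MySQLdb
from MySQLdb.cursors import DictCursor

con = MySQLdb.connect(server, user, pwd, db)
con.autocommit(True)  # option 1: run every statement in its own transaction

while True:
    cur = con.cursor(DictCursor)  # DictCursor so rows can be indexed by column name
    cur.execute("select value from table where some_condition")
    r = cur.fetchone()
    cur.close()
    con.commit()  # option 2: end the transaction so the next read takes a fresh snapshot
    if r['value'] == 'finished':
        break
    print(r['value'])
    time.sleep(5)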
I worked around this by running
c.execute("""set session transaction isolation level READ COMMITTED""")
early on in my reading session. Updates from other threads do come through now.
In my instance I was keeping connections open for a long time (inside mod_python) and so updates by other processes weren't being seen at all.
