Background
In my code below I have a function called process that does some stuff and when it is running i want to make sure it is not run concurrently. I have a table called creation_status where i set a time stamp anytime i start the process. The reason i use a time stamp is because it allows me to know what time i started this process in case i need to.
I always check if there is already a time stamp and if there is raise an exception to make sure i am not running this script concurrently.
Code
def is_in_process() -> bool:
status = db.run_query(sql="SELECT is_in_process FROM creation_status")
return False if status[0].is_in_process is None else True
def set_status() -> None:
db.execute(sql="UPDATE creation_status SET is_in_process = NOW()")
def delete_status() -> None:
db.execute(sql="UPDATE creation_status SET is_in_process = NULL")
def process():
if is_in_process():
raise Exception("Must not run concurrent process creations." )
set_status()
# stuff happens
delete_status()
Issue
I want to make sure my query is atomic to eliminate race conditions. It is possible that the by the time i check the function is_in_process and call the function set_status another script could get kicked off. How do i ensure both those things happen in one go so i avoid race conditions.
Please let me know if i can explain something more clear and i am open to all suggestions.
Don't use multiple steps when you don't need to.
UPDATE creation_status SET is_in_process = NOW() where is_in_process is null returning is_in_process
Then check to see if a row is returned.
Of course this should probably be done in the same transaction as the rest of the stuff, which we can't tell from your code, but then things will just block until the previous is done, rather than aborting.
Related
I found this post which asks a similar question. However, the answers were not what I expected to find so I'm going to try asking it a little differently.
Let's assume a function searching_for_connection will run indefinitely in a while True loop. It that function, we'll loop and preform a check to see if a new connection has been made with /dev/ttyAMA0. If that connection exists, exit the loop, finish searching_for_connection, and begin some other processes. Is this possible to do and how would I go about doing that?
My current approach is sending a carriage return and checking for a response. My problem is that this method has been pretty spotty and hasn't yielded consistent results for me. Sometimes this method works and sometimes it will just stop working
def serial_device_connected(serial_device: "serial.Serial") -> bool:
try:
serial_device.write(b"\r")
return bool(serial_device.readlines())
except serial.SerialException
return False
I suggest having a delay to allow time for the device to respond.
import time
def serial_device_connected(serial_device: "serial.Serial") -> bool:
try:
serial_device.write(b"\r")
time.sleep(0.01)
return bool(serial_device.readlines())
except serial.SerialException
return False
I have a Django app saving objects to the database and a celery task that periodically does some processing on some of those objects. The problem is that the user can delete an object after it has been selected by the celery task for processing, but before the celery task has actually finished processing and saving it. So when the celery task does call .save(), the object re-appears in the database even though the user deleted it. Which is really spooky for users, of course.
So here's some code showing the problem:
def my_delete_view(request, pk):
thing = Thing.objects.get(pk=pk)
thing.delete()
return HttpResponseRedirect('yay')
#app.task
def my_periodic_task():
things = get_things_for_processing()
# if the delete happens anywhere between here and the .save(), we're hosed
for thing in things:
process_thing(thing) # could take a LONG time
thing.save()
I thought about trying to fix it by adding an atomic block and a transaction to test if the object actually exists before saving it:
#app.task
def my_periodic_task():
things = Thing.objects.filter(...some criteria...)
for thing in things:
process_thing(thing) # could take a LONG time
try:
with transaction.atomic():
# just see if it still exists:
unused = Thing.objects.select_for_update().get(pk=thing.pk)
# no exception means it exists. go ahead and save the
# processed version that has all of our updates.
thing.save()
except Thing.DoesNotExist:
logger.warning("Processed thing vanished")
Is this the correct pattern to do this sort of thing? I mean, I'll find out if it works within a few days of running it in production, but it would be nice to know if there are any other well-accepted patterns for accomplishing this sort of thing.
What I really want is to be able to update an object if it still exists in the database. I'm ok with the race between user edits and edits from the process_thing, and I can always throw in a refresh_from_db just before the process_thing to minimize the time during which user edits would be lost. But I definitely can't have objects re-appearing after the user has deleted them.
if you open a transaction for the time of processing of celery task, you should avoid such a problems:
#app.task
#transaction.atomic
def my_periodic_task():
things = get_things_for_processing()
# if the delete happens anywhere between here and the .save(), we're hosed
for thing in things:
process_thing(thing) # could take a LONG time
thing.save()
sometimes, you would like to report to the frontend, that you are working on the data, so you can add select_for_update() to your queryset (most probably in get_things_for_processing), then in the code responsible for deletion you need to handle errors when db will report that specific record is locked.
For now, it seems like the pattern of "select again atomically, then save" is sufficient:
#app.task
def my_periodic_task():
things = Thing.objects.filter(...some criteria...)
for thing in things:
process_thing(thing) # could take a LONG time
try:
with transaction.atomic():
# just see if it still exists:
unused = Thing.objects.select_for_update().get(pk=thing.pk)
# no exception means it exists. go ahead and save the
# processed version that has all of our updates.
thing.save()
except Thing.DoesNotExist:
logger.warning("Processed thing vanished")
(this is the same code as in my original question).
I'm running a python script on a raspberry pi that constantly checks on a Yocto button and when it gets pressed it puts data from a different sensor in a database.
a code snippet of what constantly runs is:
#when all set and done run the program
Active = True
while Active:
if ResponseType == "b":
while Active:
try:
if GetButtonPressed(ResponseValue):
DoAllSensors()
time.sleep(5)
else:
time.sleep(0.5)
except KeyboardInterrupt:
Active = False
except Exception, e:
print str(e)
print "exeption raised continueing after 10seconds"
time.sleep(10)
the GetButtonPressed(ResponseValue) looks like the following:
def GetButtonPressed(number):
global buttons
if ModuleCheck():
if buttons[number - 1].get_calibratedValue() < 300:
return True
else:
print "module not online"
return False
def ModuleCheck():
global moduleb
return moduleb.isOnline()
I'm not quite sure about what might be going wrong. But it takes about an hour before the RPI runs out of memory.
The memory increases in size constantly and the button is only pressed once every 15 minutes or so.
That already tells me that the problem must be in the code displayed above.
The problem is that the yocto_api.YAPI object will continue to accumulate _Event objects in its _DataEvents dict (a class-wide attribute) until you call YAPI.YHandleEvents. If you're not using the API's callbacks, it's easy to think (I did, for hours) that you don't need to ever call this. The API docs aren't at all clear on the point:
If your program includes significant loops, you may want to include a call to this function to make sure that the library takes care of the information pushed by the modules on the communication channels. This is not strictly necessary, but it may improve the reactivity of the library for the following commands.
I did some playing around with API-level callbacks before I decided to periodically poll the sensors in my own code, and it's possible that some setting got left enabled in them that is causing these events to accumulate. If that's not the case, I can't imagine why they would say calling YHandleEvents is "not strictly necessary," unless they make ARM devices with unlimited RAM in Switzerland.
Here's the magic static method that thou shalt call periodically, no matter what. I'm doing so once every five seconds and that is taking care of the problem without loading down the system at all. API code that would accumulate unwanted events still smells to me, but it's time to move on.
#noinspection PyUnresolvedReferences
#staticmethod
def HandleEvents(errmsgRef=None):
"""
Maintains the device-to-library communication channel.
If your program includes significant loops, you may want to include
a call to this function to make sure that the library takes care of
the information pushed by the modules on the communication channels.
This is not strictly necessary, but it may improve the reactivity
of the library for the following commands.
This function may signal an error in case there is a communication problem
while contacting a module.
#param errmsg : a string passed by reference to receive any error message.
#return YAPI.SUCCESS when the call succeeds.
On failure, throws an exception or returns a negative error code.
"""
errBuffer = ctypes.create_string_buffer(YAPI.YOCTO_ERRMSG_LEN)
#noinspection PyUnresolvedReferences
res = YAPI._yapiHandleEvents(errBuffer)
if YAPI.YISERR(res):
if errmsgRef is not None:
#noinspection PyAttributeOutsideInit
errmsgRef.value = YByte2String(errBuffer.value)
return res
while len(YAPI._DataEvents) > 0:
YAPI.yapiLockFunctionCallBack(errmsgRef)
if not (len(YAPI._DataEvents)):
YAPI.yapiUnlockFunctionCallBack(errmsgRef)
break
ev = YAPI._DataEvents.pop(0)
YAPI.yapiUnlockFunctionCallBack(errmsgRef)
ev.invokeData()
return YAPI.SUCCESS
I'm adding automatic validation to one of my models in a pyramid application with before_insert.
So far I've got this:
def Property_before_insert_listener(mapper, connection, target):
formvalidator = PropertySchema()
try:
return formvalidator.to_python(target.__table__.columns)
except formencode.Invalid as error:
print ("***************************************ERROR" + str(error))
event.listen(
Property, 'before_insert', Property_before_insert_listener)
Everything seems to be working fine, and I get the proper error printed out in the console. However, after handling the error, it continues with the insert. How do I stop the insert from happening?
In the sqlAlquemy documentation for Mapper Events you have something that might help you:
retval=False
when True, the user-defined event function must have a return value, the purpose of which is either to control subsequent event propagation, or to otherwise alter the operation in progress by the mapper. Possible return > values are:
sqlalchemy.orm.interfaces.EXT_CONTINUE - continue event processing normally.
sqlalchemy.orm.interfaces.EXT_STOP - cancel all subsequent event handlers in the chain.
other values - the return value specified by specific listeners.
I'm writing a program that adds normal UNIX accounts (i.e. modifying /etc/passwd, /etc/group, and /etc/shadow) according to our corp's policy. It also does some slightly fancy stuff like sending an email to the user.
I've got all the code working, but there are three pieces of code that are very critical, which update the three files above. The code is already fairly robust because it locks those files (ex. /etc/passwd.lock), writes to to a temporary files (ex. /etc/passwd.tmp), and then, overwrites the original file with the temporary. I'm fairly pleased that it won't interefere with other running versions of my program or the system useradd, usermod, passwd, etc. programs.
The thing that I'm most worried about is a stray ctrl+c, ctrl+d, or kill command in the middle of these sections. This has led me to the signal module, which seems to do precisely what I want: ignore certain signals during the "critical" region.
I'm using an older version of Python, which doesn't have signal.SIG_IGN, so I have an awesome "pass" function:
def passer(*a):
pass
The problem that I'm seeing is that signal handlers don't work the way that I expect.
Given the following test code:
def passer(a=None, b=None):
pass
def signalhander(enable):
signallist = (signal.SIGINT, signal.SIGQUIT, signal.SIGABRT, signal.SIGPIPE, signal.SIGALRM, signal.SIGTERM, signal.SIGKILL)
if enable:
for i in signallist:
signal.signal(i, passer)
else:
for i in signallist:
signal.signal(i, abort)
return
def abort(a=None, b=None):
sys.exit('\nAccount was not created.\n')
return
signalhander(True)
print('Enabled')
time.sleep(10) # ^C during this sleep
The problem with this code is that a ^C (SIGINT) during the time.sleep(10) call causes that function to stop, and then, my signal handler takes over as desired. However, that doesn't solve my "critical" region problem above because I can't tolerate whatever statement encounters the signal to fail.
I need some sort of signal handler that will just completely ignore SIGINT and SIGQUIT.
The Fedora/RH command "yum" is written is Python and does basically exactly what I want. If you do a ^C while it's installing anything, it will print a message like "Press ^C within two seconds to force kill." Otherwise, the ^C is ignored. I don't really care about the two second warning since my program completes in a fraction of a second.
Could someone help me implement a signal handler for CPython 2.3 that doesn't cause the current statement/function to cancel before the signal is ignored?
As always, thanks in advance.
Edit: After S.Lott's answer, I've decided to abandon the signal module.
I'm just going to go back to try: except: blocks. Looking at my code there are two things that happen for each critical region that cannot be aborted: overwriting file with file.tmp and removing the lock once finished (or other tools will be unable to modify the file, until it is manually removed). I've put each of those in their own function inside a try: block, and the except: simply calls the function again. That way the function will just re-call itself in the event of KeyBoardInterrupt or EOFError, until the critical code is completed.
I don't think that I can get into too much trouble since I'm only catching user provided exit commands, and even then, only for two to three lines of code. Theoretically, if those exceptions could be raised fast enough, I suppose I could get the "maximum reccurrsion depth exceded" error, but that would seem far out.
Any other concerns?
Pesudo-code:
def criticalRemoveLock(file):
try:
if os.path.isFile(file):
os.remove(file)
else:
return True
except (KeyboardInterrupt, EOFError):
return criticalRemoveLock(file)
def criticalOverwrite(tmp, file):
try:
if os.path.isFile(tmp):
shutil.copy2(tmp, file)
os.remove(tmp)
else:
return True
except (KeyboardInterrupt, EOFError):
return criticalOverwrite(tmp, file)
There is no real way to make your script really save. Of course you can ignore signals and catch a keyboard interrupt using try: except: but it is up to your application to be idempotent against such interrupts and it must be able to resume operations after dealing with an interrupt at some kind of savepoint.
The only thing that you can really to is to work on temporary files (and not original files) and move them after doing the work into the final destination. I think such file operations are supposed to be "atomic" from the filesystem prospective. Otherwise in case of an interrupt: restart your processing from start with clean data.