Loop through changing dataset with inlineCallbacks/yield (python-twisted) - python

I have a defer.inlineCallback function for incrementally updating a large (>1k) list one piece at a time. This list may change at any time, and I'm getting bugs because of that behavior.
The simplest representation of what I'm doing is:-
@defer.inlineCallbacks
def _get_details(self, dt=None):
    data = self.data
    for e in data:
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            do_the_update(e, more_detail)
    schedule_future(self._get_details)
self.data is a list of dictionaries which is initially populated with basic information (e.g. a name and ID) at application start. _get_details will run whenever allowed to by the reactor to get more detailed information for each item in data, updating the item as it goes along.
This works well when self.data does not change, but once it is changed (can be at any point) the loop obviously refers to the wrong information. In fact in that situation it would be better to just stop the loop entirely.
I'm able to set a flag in my class (which the inlineCallback can then check) when the data is changed.
Where should this check be conducted?
How does the inlineCallbacks code execute compared to a normal Deferred (and indeed to a normal Python generator)?
Does code execution stop every time it encounters yield (i.e. can I rely on the code between one yield and the next being atomic)?
In the case of unreliable large lists, should I even be looping through the data (for e in data), or is there a better way?

The Twisted reactor never preempts your code while it is executing -- you have to voluntarily yield to the reactor by returning a value. This is why it is such a terrible thing to write Twisted code that blocks on I/O: the reactor is not able to schedule any tasks while you are waiting for your disk.
So the short answer is that yes, execution is atomic between yields.
Without @inlineCallbacks, the _get_details function returns a generator. The @inlineCallbacks decorator simply wraps the generator in a Deferred that traverses the generator until it reaches a StopIteration exception or a defer.returnValue exception. When either of those conditions is reached, inlineCallbacks fires its Deferred. It's quite clever, really.
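A rough sketch of that equivalence, assuming a hypothetical fetch() that returns a Deferred and a plain process() function; both forms run the same logic:

from twisted.internet import defer

def with_callbacks():
    # explicit Deferred style: chain a callback to run when fetch() fires
    d = fetch()
    d.addCallback(process)
    return d

@defer.inlineCallbacks
def with_inline_callbacks():
    # generator style: the generator suspends at the yield until the
    # Deferred fires, then resumes with its result
    result = yield fetch()
    defer.returnValue(process(result))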
I don't know enough about your use case to help with your concurrency problem. Maybe make a copy of the list with tuple() and update that. But it seems like you really want an event-driven solution and not a state-driven one.
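A minimal sketch of that snapshot idea, applied to the loop from the question; iterating over an immutable copy means later mutations of self.data cannot shift the loop underneath you:

@defer.inlineCallbacks
def _get_details(self, dt=None):
    snapshot = tuple(self.data)    # freeze the current contents
    for e in snapshot:
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            if e in self.data:     # skip items removed while we waited
                do_the_update(e, more_detail)
    schedule_future(self._get_details)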

You need to protect access to the shared resource (self.data).
You can do this with twisted.internet.defer.DeferredLock:
http://twistedmatrix.com/documents/current/api/twisted.internet.defer.DeferredLock.html
Method acquire
Attempt to acquire the lock. Returns a Deferred that fires on lock
acquisition with the DeferredLock as the value. If the lock is locked,
then the Deferred is placed at the end of a waiting list.
Method release
Release the lock. If there is a waiting list, then the first Deferred in that waiting list will be called back.
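A minimal sketch of how that could look here, assuming a self._lock = defer.DeferredLock() created at init time; note that any code that mutates self.data would need to acquire the same lock:

from twisted.internet import defer

@defer.inlineCallbacks
def _get_details(self, dt=None):
    yield self._lock.acquire()
    try:
        for e in self.data:
            if needs_update(e):
                more_detail = yield get_more_detail(e)
                do_the_update(e, more_detail)
    finally:
        self._lock.release()
    schedule_future(self._get_details)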

@defer.inlineCallbacks
def _get_details(self, dt=None):
    data = self.data
    i = 0
    while i < len(data):
        e = data[i]
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            # the data may have changed while we waited on the yield
            if i >= len(data) or data[i] != e:
                break
            do_the_update(e, more_detail)
        i += 1
    schedule_future(self._get_details)
Based on more testing, the following are my observations.
for e in data iterates through elements, and each element continues to exist even if data itself is replaced, both before and after the yield statement.
As far as I can tell, execution is atomic between one yield and the next.
Looping through the data is more transparently done by using a counter. This also allows for checking whether the data has changed. The check can be done anytime after yield because any changes must have occurred before yield returned. This results in the code shown above.

self.data is a list of dictionaries...once it is changed (can be at any point) the loop obviously refers to the wrong information
If you're modifying a list while you iterate it, as Raymond Hettinger would say, "You're living in the land of sin and you deserve everything that happens to you." :) Scenarios like this should be avoided, or the list should be immutable. To circumvent this problem, you can use self.data.pop() or a DeferredQueue object to store the data. This way you can add and remove elements at any time without causing adverse effects. Example with a list:
@defer.inlineCallbacks
def _get_details(self, dt=None):
    try:
        data = self.data.pop()
    except IndexError:
        schedule_future(self._get_details)
        defer.returnValue(None)  # exit function
    if needs_update(data):
        more_detail = yield get_more_detail(data)
        do_the_update(data, more_detail)
    schedule_future(self._get_details)
Take a look at DeferredQueue: a Deferred is returned when the get() function is called, to which you can chain callbacks to handle each element you pop from the queue.
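For example, a minimal sketch of the DeferredQueue variant; producers call queue.put(item) whenever new entries arrive, and the consumer loop waits on get(), which returns a Deferred that fires once an item is available:

from twisted.internet import defer

queue = defer.DeferredQueue()

@defer.inlineCallbacks
def consume_details():
    while True:
        e = yield queue.get()    # fires when someone puts an item
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            do_the_update(e, more_detail)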

Related

what's the difference between yield from and yield in python 3.3.2+

Since Python 3.3.2+, Python supports a new syntax in generator functions:
yield from <expression>
I made a quick try of this:
>>> def g():
... yield from [1,2,3,4]
...
>>> for i in g():
... print(i)
...
1
2
3
4
>>>
It seems simple to use, but the PEP document is complex. My question is: is there any other difference compared to the previous yield statement? Thanks.
For most applications, yield from just yields everything from another iterable, in order:
def iterable1():
    yield 1
    yield 2

def iterable2():
    yield from iterable1()
    yield 3

assert list(iterable2()) == [1, 2, 3]
For 90% of users who see this post, I'm guessing that this will be explanation enough for them. yield from simply delegates to the iterable on the right hand side.
Coroutines
However, there are some more esoteric generator circumstances that also have importance here. A lesser-known fact about generators is that they can be used as co-routines. This isn't super common, but you can send data to a generator if you want:
def coroutine():
    x = yield None
    yield 'You sent: %s' % x

c = coroutine()
next(c)
print(c.send('Hello world'))
Aside: You might be wondering what the use-case is for this (and you're not alone). One example is the contextlib.contextmanager decorator. Co-routines can also be used to parallelize certain tasks. I don't know too many places where this is taken advantage of, but google app-engine's ndb datastore API uses it for asynchronous operations in a pretty nifty way.
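As a concrete illustration of that contextlib.contextmanager use-case: the decorator drives a one-yield generator, resuming it (or throwing into it) when the with-block exits. The file path here is just an example:

from contextlib import contextmanager

@contextmanager
def opened(path):
    f = open(path)
    try:
        yield f        # execution pauses here while the with-block body runs
    finally:
        f.close()      # resumed (or thrown into) when the block exits

with opened('example.txt') as f:
    print(f.read())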
Now, let's assume you send data to a generator that is yielding data from another generator... How does the original generator get notified? The answer is that it doesn't in python2.x, where you need to wrap the generator yourself:
def python2_generator_wrapper():
    for item in some_wrapped_generator():
        yield item
At least not without a whole lot of pain:
def python2_coroutine_wrapper():
    """This doesn't work. Somebody smarter than me needs to fix it. . .
    Pain. Misery. Death lurks here :-("""
    # See https://www.python.org/dev/peps/pep-0380/#formal-semantics for actual working implementation :-)
    g = some_wrapped_generator()
    for item in g:
        try:
            val = yield item
        except Exception as forward_exception:  # What exceptions should I not catch again?
            g.throw(forward_exception)
        else:
            if val is not None:
                g.send(val)  # Oops, we just consumed another cycle of g ... How do we handle that properly ...
This all becomes trivial with yield from:
def coroutine_wrapper():
    yield from coroutine()
Because yield from truly delegates (everything!) to the underlying generator.
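A quick demonstration of that delegation, reusing coroutine() from above; send() on the wrapper passes straight through to the inner generator:

def coroutine_wrapper():
    yield from coroutine()

w = coroutine_wrapper()
next(w)                        # advance to the inner generator's first yield
print(w.send('Hello world'))   # prints: You sent: Hello world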
Return semantics
Note that the PEP in question also changes the return semantics. While not directly in OP's question, it's worth a quick digression if you are up for it. In python2.x, you can't do the following:
def iterable():
    yield 'foo'
    return 'done'
It's a SyntaxError. With the update to yield, the above function is now legal. Again, the primary use-case is with coroutines (see above). You can send data to the generator and it can do its work magically (maybe using threads?) while the rest of the program does other things. When flow control passes back to the generator, StopIteration will be raised (as is normal for the end of a generator), but now the StopIteration will have a data payload. It is the same thing as if the programmer had instead written:
raise StopIteration('done')
Now the caller can catch that exception and do something with the data payload to benefit the rest of humanity.
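A short example of these return semantics; yield from itself evaluates to the generator's return value (the StopIteration payload):

def inner():
    yield 'foo'
    return 'done'

def outer():
    result = yield from inner()   # result == 'done' once inner is exhausted
    yield result

print(list(outer()))              # ['foo', 'done']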
At first sight, yield from is an algorithmic shortcut for:
def generator1():
    for item in generator2():
        yield item
    # do more things in this generator
Which is then mostly equivalent to just:
def generator1():
    yield from generator2()
    # more things on this generator
In English: when used inside a generator, yield from yields each element of another iterable, as if each item were coming from the first generator, from the point of view of the code calling the first generator.
The main reason for its creation is to allow easy refactoring of code that relies heavily on iterators. Code using ordinary functions can always, at very little extra cost, have blocks of one function refactored into other functions which are then called; that divides tasks, simplifies reading and maintaining the code, and allows small code snippets to be reused more often.
So, large functions like this:
def func1():
    # some calculation
    for i in somesequence:
        # complex calculation using i
        # ...
        # ...
        # ...
    # some more code to wrap up results
    # finalizing
    # ...
Can become code like this, without drawbacks:
def func2(i):
    # complex calculation using i
    # ...
    # ...
    # ...
    return calculated_value

def func1():
    # some calculation
    for i in somesequence:
        func2(i)
    # some more code to wrap up results
    # finalizing
    # ...
When getting to iterators however, the form
def generator1():
    for item in generator2():
        yield item
    # do more things in this generator

for item in generator1():
    # do things
    ...
requires that for each item consumed from generator2, the running context first be switched to generator1, where nothing is done, and then switched again to generator2; and when generator2 yields a value, there is another intermediate context switch to generator1 before the value reaches the code actually consuming those values.
With yield from these intermediate context switches are avoided, which can save quite some resources if there are a lot of iterators chained: the context switches straight from the context consuming the outermost generator to the innermost generator, skipping the context of the intermediate generators altogether, until the inner ones are exhausted.
Later on, the language took advantage of this "tunnelling" through intermediate contexts to use these generators as co-routines: functions that can make asynchronous calls. With the proper framework in place, as described in https://www.python.org/dev/peps/pep-3156/ , these co-routines are written so that when they call a function that would take a long time to resolve (due to a network operation, or a CPU-intensive operation that can be offloaded to another thread), that call is made with a yield from statement. The framework's main loop then arranges for the expensive function to be properly scheduled and retakes execution (the framework main loop is always the code calling the co-routines themselves). When the expensive result is ready, the framework makes the called co-routine behave like an exhausted generator, and execution of the first co-routine resumes.
From the programmer's point of view, it is as if the code were running straight through, with no interruptions. From the process's point of view, the co-routine was paused at the point of the expensive call while other code (possibly parallel calls to the same co-routine) continued running.
So, one might write as part of a web crawler some code along:
@asyncio.coroutine
def crawler(url):
    page_content = yield from async_http_fetch(url)
    urls = parse(page_content)
    ...
Which could fetch tens of html pages concurrently when called from the asyncio loop.
Python 3.4 added the asyncio module to the stdlib as the default provider for this kind of functionality. It worked so well that, in Python 3.5, several new keywords were added to the language to distinguish co-routines and asynchronous calls from the generator usage described above. These are described in https://www.python.org/dev/peps/pep-0492/
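For reference, here is the crawler example rewritten in the Python 3.5+ syntax those PEPs introduced; async_http_fetch and parse are the same assumed helpers as in the example above:

async def crawler(url):
    # 'await' replaces 'yield from' and 'async def' replaces the decorator
    page_content = await async_http_fetch(url)   # assumed awaitable
    urls = parse(page_content)
    ...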
Here is an example that illustrates it:
>>> def g():
... yield from range(5)
...
>>> list(g())
[0, 1, 2, 3, 4]
>>> def g():
... yield range(5)
...
>>> list(g())
[range(0, 5)]
>>>
yield from yields each item of the iterable, but yield yields the iterable itself.
The difference is simple:
yield:
[Extra info; if you know how generators work, you can skip this.]
yield is used to produce a single value from a generator function. When the generator function is called, it starts executing, and when a yield statement is encountered, it temporarily suspends the execution of the function, returns the value to the caller, and saves its current state. The next time it is resumed, it continues from where it left off until it hits the next yield statement.
In the example below, generator1 and generator2 each return values wrapped in a generator object, and combined_generator also returns a generator object, but one that delegates to other generator objects. To get the values out of these nested generators, we use yield from:
class Gen:
    def generator1(self):
        yield 1
        yield 2
        yield 3

    def generator2(self):
        yield 'a'
        yield 'b'
        yield 'c'

    def combined_generator(self):
        """
        This method delegates to two other generators. Using `yield from`
        lets the consuming code receive their values directly.
        """
        yield from self.generator1()
        yield from self.generator2()

    def run(self):
        print("Gen running ...")
        for item in self.combined_generator():
            print(item)

g = Gen()
g.run()
The output of above is:
Gen running ...
1
2
3
a
b
c

Twisted wait for event in loop

I want to read and process some data from an external service. I ask the service if there is any data, if something was returned I process it and ask again (so data can be processed immediately when it's available) and otherwise I wait for a notification that data is available. This can be written as an infinite loop:
def loop(self):
    while True:
        data = yield self.get_data_nonblocking()
        if data is not None:
            yield self.process_data(data)
        else:
            yield self.data_available

def on_data_available(self):
    self.data_available.fire()
How can data_available be implemented here? It could be a Deferred but a Deferred cannot be reset, only recreated. Are there better options?
Can this loop be integrated into the Twisted event loop? I can read and process data right in on_data_available and write some code instead of the loop checking get_data_nonblocking but I feel like then I'll need some locks to make sure data is processed in the same order it arrives (the code above enforces it because it's the only place where it's processed). Is this a good idea at all?
Consider the case of a TCP connection. The receiver buffer for a TCP connection can either have data in it or not. You can get that data, or get nothing, without blocking by using the non-blocking socket API:
data = socket.recv(1024)
if data:
    self.process_data(data)
You can wait for data to be available using select() (or any of the basically equivalent APIs):
socket.setblocking(False)
while True:
    data = socket.recv(1024)
    if data:
        self.process_data(data)
    else:
        select([socket], [], [])
Of these, only select() is particularly Twisted-unfriendly (though the Twisted idiom is certainly not to make your own socket.recv calls). You could replace the select call with a Twisted-friendly version though (implement a Protocol with a dataReceived method that fires a Deferred - sort of like your on_data_available method - toss in some yields and make this whole thing an inlineCallbacks generator).
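A rough sketch of that Twisted-friendly replacement (the class and method names here are illustrative, not a Twisted API): the protocol fires a Deferred when data arrives, instead of blocking in select():

from twisted.internet import defer, protocol

class NotifyingProtocol(protocol.Protocol):
    def __init__(self):
        self._waiting = None

    def dataReceived(self, data):
        # hand the data to whoever is waiting, if anyone
        if self._waiting is not None:
            d, self._waiting = self._waiting, None
            d.callback(data)

    def wait_for_data(self):
        # returns a Deferred that fires with the next chunk received
        self._waiting = defer.Deferred()
        return self._waiting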
But though that's one way you can get data from a TCP connection, that's not the API that Twisted encourages you to use to do so. Instead, the API is:
class SomeProtocol(Protocol):
    def dataReceived(self, data):
        # Your logic here
        ...
I don't see how your case is substantially different. What if, instead of the loop you wrote, you did something like this:
class YourDataProcessor(object):
    def process_data(self, data):
        # Your logic here
        ...

class SomeDataGetter(object):
    def __init__(self, processor):
        self.processor = processor

    def on_available_data(self):
        data = self.get_data_nonblocking()
        if data is not None:
            self.processor.process_data(data)
Now there are no Deferreds at all (except perhaps in whatever implements on_available_data or get_data_nonblocking but I can't see that code).
If you leave this roughly as-is, you are guaranteed in-order execution, because Twisted is single-threaded (except in a couple of places that are very clearly marked), and in a single-threaded program an earlier call to process_data must complete before any later call to process_data can be made (excepting, of course, the case where process_data re-entrantly invokes itself - but that's another story).
If you switch this back to using inlineCallbacks (or any equivalent "coroutine" flavored drink mix) then you are probably introducing the possibility of out-of-order execution.
For example, if get_data_nonblocking returns a Deferred and you write something like this:
@inlineCallbacks
def on_available_data(self):
    data = yield self.get_data_nonblocking()
    if data is not None:
        self.processor.process_data(data)
Then you have changed on_available_data to say that a context switch is allowed when calling get_data_nonblocking. In this case, depending on your implementation of get_data_nonblocking and on_available_data, it's entirely possible that:
1. on_available_data is called.
2. get_data_nonblocking is called and returns a Deferred.
3. on_available_data tells execution to switch to another context (via yield / inlineCallbacks).
4. on_available_data is called again.
5. get_data_nonblocking is called again and returns a Deferred (perhaps the same one! perhaps a new one! it depends on how it's implemented).
6. The second invocation of on_available_data tells execution to switch to another context (same reason).
7. The reactor spins around for a while and eventually an event arrives that causes the Deferred returned by the second invocation of get_data_nonblocking to fire.
8. Execution switches back to the second on_available_data frame.
9. process_data is called with whatever data the second get_data_nonblocking call returned.
10. Eventually the same things happen to the first set of objects, and process_data is called again with whatever data the first get_data_nonblocking call returned.
Now perhaps you've processed data out of order - again, this depends on more details of other parts of your system.
If so, you can always re-impose order. There are a lot of different possible approaches to this. Twisted itself doesn't come with any APIs that are explicitly in support of this operation so the solution involves writing some new code. Here's one idea (untested) for an approach - a queue-like class that knows about object sequence numbers:
class SequencedQueue(object):
    """
    A queue-like type which guarantees objects come out of the queue in the order
    defined by a sequence number associated with the objects when they are put into
    the queue.

    Application code manages sequence number assignment so that sequence numbers don't
    have to have the same order as `put` calls on this type.
    """
    def __init__(self):
        # The sequence number of the object that should be given out
        # by the next call to `get`
        self._next_sequence = 0
        # The sequence number of the next result that needs to be provided.
        self._next_result = 0
        # A holding area for objects past _next_sequence
        self._queue = {}
        # A holding area for Deferreds waiting on sequence numbers
        self._waiting = {}

    def put(self, sequence, object):
        """
        Put an object into the queue at a particular point in the sequence.
        """
        if sequence < self._next_sequence:
            # Programming error. The sequence number
            # of the object being put has already been used.
            raise ...
        self._queue[sequence] = object
        self._check_waiters()

    def get(self):
        """
        Get an object from the queue which has the next sequence number
        following whatever was previously gotten.
        """
        result = self._waiting[self._next_sequence] = Deferred()
        self._next_sequence += 1
        self._check_waiters()
        return result

    def _check_waiters(self):
        """
        Find any Deferreds previously given out by get calls which can now be given
        their results and give them to them.
        """
        while True:
            seq = self._next_result
            if seq in self._queue and seq in self._waiting:
                self._next_result += 1
                # XXX Probably a re-entrancy bug here. If a callback calls back in to
                # put then this loop might run recursively
                self._waiting.pop(seq).callback(self._queue.pop(seq))
            else:
                break
The expected behavior (modulo any bugs I accidentally added) is something like:
q = SequencedQueue()
d1 = q.get()
d2 = q.get()
# Nothing in particular happens
q.put(1, "second result")
# d1 fires with "first result" and afterwards d2 fires with "second result"
q.put(0, "first result")
Using this, just make sure you assign sequence numbers in the order you want data dispatched rather than the order it actually shows up somewhere. For example:
@inlineCallbacks
def on_available_data(self):
    sequence = self._process_order
    data = yield self.get_data_nonblocking()
    if data is not None:
        self._process_order += 1
        self.sequenced_queue.put(sequence, data)
Elsewhere, some code can consume the queue sort of like:
@inlineCallbacks
def queue_consumer(self):
    while True:
        data = yield self.sequenced_queue.get()
        yield self.process_data(data)

How to add an item to a memcached list atomically (in Python)

Behold my simple Python memcached code below:
import memcache
memcache_client = memcache.Client(['127.0.0.1:11211'], debug=True)
key = "myList"
obj = ["A", "B", "C"]
memcache_client.set(key, obj)
Now, suppose I want to append an element "D" to the list cached as myList, how can I do it atomically?
I know this is wrong because it is not atomic:
memcache_client.set(key, memcache_client.get(key) + ["D"])
The above statement contains a race condition. If another thread executes this same instruction at the exact right moment, one of the updates will get clobbered.
How can I solve this race condition? How can I update a list or dictionary stored in memcached atomically?
Here's the corresponding function of the python client API
https://cloud.google.com/appengine/docs/python/memcache/clientclass#Client_cas
Also, here's a nice tutorial by Guido van Rossum; hopefully he explains Python stuff better than I do ;)
Here's what the code could look like in your case:
memcache_client = memcache.Client(['127.0.0.1:11211'], debug=True)
key = "myList"

while True:  # Retry loop; it should probably be limited to some reasonable number of retries
    obj = memcache_client.gets(key)
    assert obj is not None, 'Uninitialized object'
    if memcache_client.cas(key, obj + ["D"]):
        break
The whole workflow remains the same: first you fetch a value (with some internal information bound to the key), then modify the fetched value, then attempt to update it in memcache. The only difference is that the value (actually, the key/value pair) is checked to ensure it hasn't been changed simultaneously by a parallel process. In the latter case the call fails and you should retry the workflow from the beginning. Also, if you have a multi-threaded application, then each memcache_client instance likely should be thread-local.
Also, don't forget that there are incr() and decr() methods for simple integer counters, which are "atomic" by their nature.
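For example, with the same client (the value must already exist and be an integer for these calls to work):

memcache_client.set("pageviews", 0)
memcache_client.incr("pageviews")       # 1
memcache_client.incr("pageviews", 10)   # 11
memcache_client.decr("pageviews")       # 10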
If you want to avoid the race condition within a single process, you can use the Lock primitive from the threading module. (Note that a threading.Lock only serializes threads in one process; it cannot protect against other processes writing to the same key.) For example:
lock = threading.Lock()

def thread_func():
    lock.acquire()
    try:
        # read and modify while holding the lock, so no other thread
        # can interleave its own get/set between these two calls
        obj = memcache_client.get(key)
        memcache_client.set(key, obj + ["D"])
    finally:
        lock.release()

Can Python Generators be used in Django Views?

Question:
Essentially I want to return a unique result from the database every time a view is called (until I run out of unique objects and have to start over). I was thinking that a simple and elegant solution would be to use a generator to handle this. Is this possible, and if so, how can this be approached with regard to pulling values from the ORM?
Note:
I think sessions or utilizing a design pattern like Memento may be a solution here, but I'm really curious to see if and how Python generators could be used in this context.
Since Django is synchronous WSGI, you have to process each request as standalone; your Python process can be killed or switched to another at any time.
Still, if you have no fear and have a single process, you can make a module-scope dictionary of session ids to iterators that you consume one step at a time:
from django.shortcuts import render
from collections import defaultdict
import uuid

def iterator():
    for item in DatabaseTable.objects.all():
        yield item

sessions_current_iterators = defaultdict(iterator)

def my_view(request):
    id = request.session.get("iterator_id", None)
    if id is None:
        id = str(uuid.uuid4())
        request.session["iterator_id"] = id
    try:
        return render(request, "item_template.html",
                      {"item": next(sessions_current_iterators[id])})
    except StopIteration:
        request.session.pop("iterator_id")
        return render(request, "end_template.html", {})
but: NEVER USE THIS ON A PRODUCTION ENVIRONMENT!
Generators are great for reducing memory consumption while computing a request, and can be good for a Tornado web service, but clearly Django should not share data between requests in local variables.
You can always use yield where you can use return (these are Python features, not Django ones). The only caveat is that the same function is called for every request, so the continuation after the yield may serve another client instead of the one you intend. You can beat this problem by using a higher-level generator: the function keeps a dictionary of generators indexed by unique keys derived from the requests. Every time the function is called, it checks whether an entry already exists for the request in the dictionary; if not, it adds a new generator for that request. Then it pulls the next value from the generator for the given request and yields it.

To keep the dictionary in memory, the main function is itself a generator that never really exits: it starts by initializing the dictionary to an empty dict, then wraps everything else in an infinite while loop. When called the first time, the dictionary is initialized and the loop starts. Inside the loop, the function creates and stores a generator in the dictionary if no entry already exists for the given request, then yields the next value produced by that request's generator at the bottom of the while. When resumed, the main function continues at the top of the while. The code is like so:
def main_func(request, *args):
    funcs = {}
    while True:
        request_key = make_key(request)
        if request_key not in funcs:
            def generator_func():
                # your generator code here...
                # remember to delete the funcs entry before returning...
                yield ...  # placeholder body
            funcs[request_key] = generator_func()  # store the generator instance
        yield next(funcs[request_key])
def make_key(request):
    # quick and dirty impl
    return str(request.session)

Is this Python producer-consumer lockless approach thread-safe?

I recently wrote a program that used a simple producer/consumer pattern. It initially had a bug related to improper use of threading.Lock that I eventually fixed. But it made me think whether it's possible to implement producer/consumer pattern in a lockless manner.
Requirements in my case were simple:
One producer thread.
One consumer thread.
Queue has place for only one item.
Producer can produce next item before the current one is consumed. The current item is therefore lost, but that's OK for me.
Consumer can consume current item before the next one is produced. The current item is therefore consumed twice (or more), but that's OK for me.
So I wrote this:
QUEUE_ITEM = None

# this is executed in one threading.Thread object
def producer():
    global QUEUE_ITEM
    while True:
        i = produce_item()
        QUEUE_ITEM = i

# this is executed in another threading.Thread object
def consumer():
    global QUEUE_ITEM
    while True:
        i = QUEUE_ITEM
        consume_item(i)
My question is: Is this code thread-safe?
Immediate comment: this code isn't really lockless - I use CPython and it has GIL.
I tested the code a little and it seems to work. It translates to some LOAD and STORE ops, which are atomic because of the GIL. But I also know that the del x operation isn't atomic when x implements a __del__ method. So if my item has a __del__ method and some nasty scheduling happens, things may break. Or not?
Another question is: What kind of restrictions (for example on produced items' type) do I have to impose to make the above code work fine?
My questions are only about theoretical possibility to exploit CPython's and GIL's quirks in order to come up with lockless (i.e. no locks like threading.Lock explicitly in code) solution.
Trickery will bite you. Just use Queue to communicate between threads.
Yes this will work in the way that you described:
That the producer may produce a skippable element.
That the consumer may consume the same element.
But I also know that the del x operation isn't atomic when x implements a __del__ method. So if my item has a __del__ method and some nasty scheduling happens, things may break.
I don't see a "del" here. If a del happens in consume_item then the del may occur in the producer thread. I don't think this would be a "problem".
Don't bother using this, though. You will end up burning CPU on pointless polling cycles, and it is not any faster than using a queue with locks, since Python already has a global lock.
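For comparison, a minimal sketch of the Queue-based version; note that it deliberately changes the stated semantics, since nothing is ever skipped or consumed twice:

import Queue  # the module is named queue in Python 3

q = Queue.Queue(maxsize=1)

def producer():
    while True:
        q.put(produce_item())    # blocks while the single slot is full

def consumer():
    while True:
        consume_item(q.get())    # blocks until an item is available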
This is not really thread safe because producer could overwrite QUEUE_ITEM before consumer has consumed it and consumer could consume QUEUE_ITEM twice. As you mentioned, you're OK with that but most people aren't.
Someone with more knowledge of CPython internals will have to answer your more theoretical questions.
I think it's possible that a thread is interrupted while producing/consuming, especially if the items are big objects.
Edit: this is just a wild guess. I'm no expert.
Also the threads may produce/consume any number of items before the other one starts running.
You can use a list as the queue as long as you stick to append/pop since both are atomic.
QUEUE = []

# this is executed in one threading.Thread object
def producer():
    global QUEUE
    while True:
        i = produce_item()
        QUEUE.append(i)

# this is executed in another threading.Thread object
def consumer():
    global QUEUE
    while True:
        try:
            i = QUEUE.pop(0)
        except IndexError:
            # queue is empty
            continue
        consume_item(i)
In a class scope like below, you can even clear the queue.
class Atomic(object):
    def __init__(self):
        self.queue = []

    # this is executed in one threading.Thread object
    def producer(self):
        while True:
            i = produce_item()
            self.queue.append(i)

    # this is executed in another threading.Thread object
    def consumer(self):
        while True:
            try:
                i = self.queue.pop(0)
            except IndexError:
                # queue is empty
                continue
            consume_item(i)

    # There's the possibility the producer is still working on its current item.
    def clear_queue(self):
        self.queue = []
You'll have to find out which list operations are atomic by looking at the bytecode generated.
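For example, dis from the stdlib shows that list.append compiles down to a single method call, which executes entirely in C while holding the GIL:

import dis

def enqueue(queue, item):
    queue.append(item)

dis.dis(enqueue)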
The __del__ method could be a problem, as you said. It could be avoided if only there were a way to prevent the garbage collector from invoking the __del__ method on the old object before we finish assigning the new one to QUEUE_ITEM. We would need something like:
increase the reference counter on the old object
assign a new one to `QUEUE_ITEM`
decrease the reference counter on the old object
I'm afraid I don't know if it is possible, though.
