getting last n items from queue - python

Everything I see is about lists, but this is about events = queue.Queue(), a queue of objects that I want to extract. How would I go about getting the last N elements from that queue?

By definition, you can't.
What you can do is use a loop or a comprehension to get the first N elements (you can't get from the end of a queue):
N = 2
first_N_elements = [my_queue.get() for _ in range(N)]

If you're multi-threading, "the last N elements from that queue" is undefined and the question doesn't make sense.
If there is no multi-threading, it depends on whether you care about the other elements (not the last N).
If you don't:
for i in range(events.qsize() - N):
    events.get()
after that, get N items
If you don't want to throw away the other items, you'll just have to move everything to a different data structure (like a list). The whole point of a queue is to get things in a certain order.
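For example, here is a minimal sketch of that (the last_n helper is hypothetical, and this is only safe when no other thread is touching the queue): it drains the queue into a list, slices off the last N items, and puts the rest back if you still need them.
import queue

def last_n(q, n):
    # Drain the queue into a list so nothing is lost.
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            break
    # The tail of the list holds the most recently put items.
    return items[-n:], items[:-n]

events = queue.Queue()
for x in range(10):
    events.put(x)

last_two, rest = last_n(events, 2)
print(last_two)  # [8, 9]
for x in rest:   # put the earlier items back, in their original order
    events.put(x)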

Though this was years ago, it bears mentioning that there is an approach, which depends on the queue type you need. If the latest entries to the queue are your concern, use a last-in-first-out queue, the opposite of the standard queue, which is first-in-first-out. (Please give larger examples next time.) With the little info given, here's how you get the last N elements from a LIFO queue:
from queue import LifoQueue, Empty

events = LifoQueue()
# Wait for the queue to be loaded with N events
N = 5
while N:
    try:
        event = events.get_nowait()
    except Empty:
        break
    # process event
    N -= 1
You should handle the case where the queue is empty; in that case, exit the loop and do not attempt to process.

Related

Passing updated args to multiple threads periodically in python

I have three base stations that have to work in parallel, and each will receive a list every 10 seconds containing information about its cluster; I want to run this code for about 10 minutes. So, every 10 seconds my three threads have to call the target method with new arguments, and this process should last for 10 minutes. I don't know how to do this, but I came up with the idea below, which doesn't seem to be a good one! Thus I'd appreciate any help.
I have a list named base_centroid_assign whose items I want to pass to distinct threads. The list's content will be updated frequently (say, every 10 seconds), so I wish to reuse my previous threads and give the updated items to them.
In the code below, the list contains three items, each of which contains multiple items itself (it's nested). I want the three threads to stop after executing the quite simple target function, and then to be called again with the updated items; however, when I run the code below, I end up with 30 threads! (The run_time variable is 10 and the list's length is 3.)
How can I implement the idea described above?
import threading
import time

run_time = 10

def cluster_status_broadcasting(info_base_cent_avr):
    print(threading.current_thread().name)
    info_base_cent_avr.sort(key=lambda item: item[2], reverse=True)

start = time.time()
while run_time > 0:
    for item in base_centroid_assign:
        t = threading.Thread(target=cluster_status_broadcasting, args=(item,))
        t.daemon = True
        t.start()
    print('Entire job took:', time.time() - start)
    run_time -= 1
Welcome to Stack Overflow.
Problems with thread synchronisation can be so tricky to handle that Python already has some very useful libraries specifically for such tasks. The primary such library is queue.Queue in Python 3. The idea is to have a queue for each "worker" thread. The main thread collects new data and puts it onto each queue, and the subsidiary threads get the data from that queue.
When you call a Queue's get method, its normal action is to block the thread until something is available, but presumably you want the threads to continue working on their current inputs until new ones are available, in which case it makes more sense to poll the queue and continue with the current data if there is nothing from the main thread.
I outline such an approach in my answer to this question, though in that case the worker threads are actually sending return values back on another queue.
The structure of your worker threads' run method would then need to be something like the following pseudo-code:
def run(self):
    request_data = self.inq.get()  # Wait for first item
    while True:
        process_with(request_data)
        try:
            request_data = self.inq.get(block=False)
        except queue.Empty:
            continue
You might like to add logic to terminate the thread cleanly when a sentinel value such as None is received.
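For instance, here is a minimal sketch of that sentinel handling, reusing the names from the pseudo-code above:
def run(self):
    request_data = self.inq.get()    # wait for the first item
    while request_data is not None:  # None is the shutdown sentinel
        process_with(request_data)
        try:
            request_data = self.inq.get(block=False)
        except queue.Empty:
            continue  # nothing new yet: keep working with the current data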

Choose Items from a List in multithreaded python

I am a beginner in Python and can't figure out how to do this:
I am running a Python script that puts a new value into a list every 5-10 seconds. I want to consume these elements from the list in another, multi-threaded Python script, one value per thread, so that no value is reused; if there's no next value, a thread should wait until one is present. I have some code where I tried to do it, but with no success:
Script that creates values:
import random
import time

values = ['a', 'b', 'c', 'd', 'e', 'f']
cap = []
while True:
    cap.append(random.choice(values))
    print cap
    time.sleep(5)
Script that needs these values:
def adding(self):
    p = cap.pop()
    print(p)
However, in a multithreaded environment, each thread gives me the same value, even though I want the value for each thread to be different (e.g. remove a value already used by a thread). What are my options here?
If I understood correctly, you want to use one thread (a producer) to fill a list with values, and then a few different threads (consumers) to remove from that same list, resulting in a series of consumers which have mutually exclusive subsets of the values added by the producer.
A possible outcome might be:
Producer
cap.append('a')
cap.append('c')
cap.append('b')
cap.append('f')
Consumer 1
cap.pop() # a
cap.pop() # f
Consumer 2
cap.pop() # c
cap.pop() # b
If this is the behavior you want, I recommend using a thread-safe object like a queue (the Queue module in Python 2, queue in Python 3).
Here is one possible implementation
Producer
import Queue
values = ['a','b','c','d','e','f']
q = Queue.Queue()
while True:
q.put(random.choice(values))
print q
time.sleep(5)
Consumer
val = q.get() # this call will block (aka wait) for something to be available
print(val)
It's also very important that both the producer and the consumer have access to the same instance of q.
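Putting the two halves together, here is a minimal runnable sketch (using the Python 3 module names; the daemon threads and the fixed run time are added purely for demonstration):
import queue
import random
import threading
import time

values = ['a', 'b', 'c', 'd', 'e', 'f']
q = queue.Queue()  # the single shared instance

def producer():
    while True:
        q.put(random.choice(values))
        time.sleep(5)

def consumer(name):
    while True:
        val = q.get()  # blocks until something is available
        print(name, 'got', val)  # each value is seen by exactly one consumer

threading.Thread(target=producer, daemon=True).start()
for i in range(2):
    threading.Thread(target=consumer, args=('consumer-%d' % i,), daemon=True).start()

time.sleep(30)  # let the demo run for a while, then exit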

How to add threads depending on a number

In a part of my software code written in Python, I have a list of items whose size can vary greatly, from one item up to 12. For each item in this list I'm doing some processing (sending an HTTP request related to the given item, parsing results, and many other operations). I'd like to speed up my code using threading: I'd like to create two threads, where each one takes a number of items and does the processing asynchronously.
Example 1: Let's say that my list has 12 items; each thread would take 6 items in this case and call the processing functions on each item.
Example 2: Now let's say that my list has 9 items; one thread would take 5 items and the other thread would take the 4 remaining items.
Currently I'm not applying any threading, and my code base is very large, so here is some code that does almost the same thing as my case:
# This procedure needs to be used with threading.
itemList = getItems()  # This function returns an unknown number of items between 1 and 12
if len(itemList) > 0:  # Make sure that the list is not empty
    for item in itemList:
        processItem(item)  # This is an imaginary function that does the processing on each item
The above is basic, simplified code that explains what I'm doing. I can't figure out how to make my threads flexible, so that each one takes a number of items and the other takes the rest (as explained in examples 1 & 2).
Thanks for your time.
You might rather implement it using shared queues
https://docs.python.org/3/library/queue.html#queue-objects
import queue
import threading

def worker():
    while True:
        item = q.get()
        if item is None:
            break
        do_work(item)
        q.task_done()

q = queue.Queue()
threads = []
for i in range(num_worker_threads):
    t = threading.Thread(target=worker)
    t.start()
    threads.append(t)

for item in source():
    q.put(item)

# block until all tasks are done
q.join()

# stop workers
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()
Quoting from
https://docs.python.org/3/library/queue.html#module-queue:
The queue module implements multi-producer, multi-consumer queues. It
is especially useful in threaded programming when information must be
exchanged safely between multiple threads.
The idea is that you have a shared store and each thread attempts to read items from it one by one.
This is much more flexible than distributing the load in advance, as you don't know how the threads' execution will be scheduled by your OS, how much time each iteration will take, etc.
Furthermore, you might add items for further processing to this queue dynamically — for example, having a producer thread running in parallel.
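For example, the feeding loop above could itself run in a producer thread, so items keep arriving while the workers are already busy (a minimal sketch reusing q and source() from the code above):
def producer():
    # runs in parallel with the workers, feeding the same shared queue
    for item in source():
        q.put(item)

producer_thread = threading.Thread(target=producer)
producer_thread.start()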
Some helpful links:
A brief introduction into concurrent programming in python:
http://www.slideshare.net/dabeaz/an-introduction-to-python-concurrency
More details on producer-consumer pattern with line-by-line explanation:
http://www.informit.com/articles/article.aspx?p=1850445&seqNum=8
You can use the ThreadPoolExecutor class from the concurrent.futures module in Python 3. The module is not present in Python 2, but there are some workarounds (which I will not discuss).
A thread pool executor does basically what @ffeast proposed, but with fewer lines of code for you to write. It manages a pool of threads which will execute all the tasks that you submit to it, presumably in the most efficient manner possible. The results will be returned through Future objects, which represent a "pending" result.
Since you seem to know the list of tasks up front, this is especially convenient for you. While you cannot guarantee how the tasks will be split between the threads, the result will probably be at least as good as anything you coded by hand.
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=2) as executor:
    for item in getItems():
        executor.submit(processItem, item)
If you need more information with the output, like some way of identifying the futures that have completed or getting results out of them, see the example in the Python documentation (on which the code above is heavily based).
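For instance, here is a minimal sketch along the lines of that documentation example, mapping each future back to its item and handling results (and exceptions) as they complete:
from concurrent.futures import ThreadPoolExecutor, as_completed

with ThreadPoolExecutor(max_workers=2) as executor:
    future_to_item = {executor.submit(processItem, item): item
                      for item in getItems()}
    for future in as_completed(future_to_item):
        item = future_to_item[future]
        try:
            result = future.result()  # re-raises any exception from processItem
        except Exception as exc:
            print('%r generated an exception: %s' % (item, exc))
        else:
            print('%r returned %r' % (item, result))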

Python 3: How to properly add new Futures to a list while already waiting upon it?

I have a concurrent.futures.ThreadPoolExecutor and a list. And with the following code I add futures to the ThreadPoolExecutor:
for id in id_list:
    future = self._thread_pool.submit(self.myfunc, id)
    self._futures.append(future)
And then I wait upon the list:
concurrent.futures.wait(self._futures)
However, self.myfunc does some network I/O, and thus there will be some network exceptions. When errors occur, self.myfunc submits a new self.myfunc with the same id to the same thread pool and adds a new future to the same list, just as above:
try:
    do_stuff(id)
except:
    future = self._thread_pool.submit(self.myfunc, id)
    self._futures.append(future)
    return None
Here comes the problem: I got an error on the line of concurrent.futures.wait(self._futures):
File "/usr/lib/python3.4/concurrent/futures/_base.py", line 277, in wait
f._waiters.remove(waiter)
ValueError: list.remove(x): x not in list
How should I properly add new Futures to a list while already waiting upon it?
Looking at the implementation of wait(), it certainly doesn't expect that anything outside concurrent.futures will ever mutate the list passed to it. So I don't think you'll ever get that "to work". It's not just that it doesn't expect the list to mutate, it's also that significant processing is done on list entries, and the implementation has no way to know that you've added more entries.
Untested, I'd suggest trying this instead: skip all that, and just keep a running count of threads still active. A straightforward way is to use a Condition guarding a count.
Initialization:
self._count_cond = threading.Condition()
self._thread_count = 0
When my_func is entered (i.e., when a new thread starts):
with self._count_cond:
    self._thread_count += 1
When my_func is done (i.e., when a thread ends), for whatever reason (exceptional or not):
with self._count_cond:
    self._thread_count -= 1
    self._count_cond.notify()  # wake up the waiting logic
And finally the main waiting logic:
with self._count_cond:
    while self._thread_count:
        self._count_cond.wait()
POSSIBLE RACE
It seems possible that the thread count could reach 0 while work for a new thread has been submitted, but before its my_func invocation starts running (and so before _thread_count is incremented to account for the new thread).
So the:
with self._count_cond:
    self._thread_count += 1
part should really be done instead right before each occurrence of
self._thread_pool.submit(self.myfunc, id)
Or write a new method to encapsulate that pattern; e.g., like so:
def start_new_thread(self, id):
    with self._count_cond:
        self._thread_count += 1
    self._thread_pool.submit(self.myfunc, id)
A DIFFERENT APPROACH
Offhand, I expect this could work too (but, again, haven't tested it): keep all your code the same except change how you're waiting:
while self._futures:
    self._futures.pop().result()
So this simply waits for one thread at a time, until none remain.
Note that .pop() and .append() on lists are atomic in CPython, so no need for your own lock. And because your my_func() code appends before the thread it's running in ends, the list won't become empty before all threads really are done.
AND YET ANOTHER APPROACH
Keep the original waiting code, but rework the rest not to create new threads in case of exception. For example, rewrite my_func to return True if it quits due to an exception and False otherwise, and start threads running a wrapper instead:
def my_func_wrapper(self, id):
    keep_going = True
    while keep_going:
        keep_going = self.my_func(id)
This may be especially attractive if you someday decide to use multiple processes instead of multiple threads (creating new processes can be a lot more expensive on some platforms).
AND A WAY USING cf.wait()
Another way is to change just the waiting code:
while self._futures:
    fs = self._futures[:]
    for f in fs:
        self._futures.remove(f)
    concurrent.futures.wait(fs)
Clear? This makes a copy of the list to pass to .wait(), and the copy is never mutated. New threads show up in the original list, and the whole process is repeated until no new threads show up.
Which of these ways makes most sense seems to me to depend mostly on pragmatics, but there's not enough info about all you're doing for me to make a guess about that.

Is there a "single slot" queue?

I need to use a queue which holds only one element, any new element discarding the existing one. Is there a built-in solution?
The solution I coded works but I strive not to reinvent the wheel :)
import Queue

def myput(q, what):
    # empty the queue
    while not q.empty():
        q.get()
    q.put(what)

q = Queue.Queue()
print("queue size: {}".format(q.qsize()))
myput(q, "hello")
myput(q, "hello")
myput(q, "hello")
print("queue size: {}".format(q.qsize()))
EDIT: following some comments & answers -- I know that a variable is just for that :) In my program, though, queues will be used to communicate between processes.
As you specify that you are using queues to communicate between processes, you should use multiprocessing.Queue.
In order to ensure there is only one item in the queue at once, you can have the producers sharing a lock and, whilst locked, first get_nowait from the queue before put. This is similar to the loop you have in your code, but without the race condition of two producers both emptying the queue before putting their new item, and therefore ending up with two items in the queue.
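A minimal sketch of that idea (the put_latest helper name is hypothetical; Python 3 module names):
import multiprocessing
from queue import Empty  # multiprocessing queues raise this same exception

def put_latest(q, lock, item):
    # Every producer must share `lock`: while it is held, no other producer
    # can run its own empty-then-put sequence, so at most one item remains.
    with lock:
        try:
            q.get_nowait()  # discard the existing item, if any
        except Empty:
            pass
        q.put(item)

q = multiprocessing.Queue()
lock = multiprocessing.Lock()
Each producer process is handed the same q and lock and calls put_latest instead of a bare put.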
Although the OP is regarding inter-process-communication, I came across a situation where I needed a queue with a single element (such that old elements are discarded when a new element is appended) set up between two threads (producer/consumer).
The following code illustrates the solution I came up with using a collections.deque as was mentioned in the comments:
import collections
import _thread
import time

def main():
    def producer(q):
        i = 0
        while True:
            q.append(i)
            i += 1
            time.sleep(0.75)

    def consumer(q):
        while True:
            try:
                v = q.popleft()
                print(v)
            except IndexError:
                print("nothing to pop...queue is empty")
                time.sleep(1)

    deq = collections.deque(maxlen=1)
    print("starting")
    _thread.start_new_thread(producer, (deq,))
    _thread.start_new_thread(consumer, (deq,))
    while True:  # keep the main thread alive; _thread threads die with it
        time.sleep(1)

if __name__ == "__main__":
    main()
In the code above, since the producer is faster than the consumer (sleeps less), some of the elements will not be processed.
Notes (from the documentation):
Deques support thread-safe, memory efficient appends and pops from
either side of the deque with approximately the same O(1) performance
in either direction.
Once a bounded length deque is full, when new items are added, a
corresponding number of items are discarded from the opposite end.
Warning: The code never stops :)
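As an aside, the discard behavior described in the second note is easy to see in isolation:
from collections import deque

d = deque(maxlen=1)
d.append('a')
d.append('b')  # 'a' is discarded from the opposite end
print(d)       # deque(['b'], maxlen=1)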
