Python Priority Queue implementation

I'm having trouble creating an insert function with the following parameters. The insert function should take a priority queue and an element, and insert the element using the priority rules.
The priority queue will take a series of tasks and order them
based on their importance. Each task has an integer priority from 10 (highest priority) to 1
(lowest priority). If two tasks have the same priority, the order should be based on the order
they were inserted into the priority queue (earlier first).
So, as of right now I've created the following code to initialize some of the things needed:
class Tasks():
    __slots__ = ('name', 'priority')

    def __init__(bval, myName, myPriority):
        bval.name = myName
        bval.priority = myPriority

class PriorityQueue():
    __slots__ = ('queue', 'element')

    def __init__(aval, queue, element):
        aval.queue = queue
        aval.element = element
The code I'm trying to write is insert(element, queue), which should insert the element using the priority rules. Note that myPriority is an integer from 1 to 10.
Similarly, can I do the following to ensure that I create a priority from 1 to 10?
def __init__(bval, myPriority=10):
    bval.priority = myPriority
    bval.pq = [[] for priority in range(myPriority)]
so that I can replace myPriority in the insert task with bval.pq
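(For concreteness, here is a minimal sketch of such an insert, assuming pq is a list of ten buckets where pq[p - 1] holds the tasks of priority p; the names are illustrative, not from the code above:)

# A sketch: pq is a list of 10 lists, pq[p - 1] holds the tasks of priority p (1..10).
def insert(pq, element, priority):
    # Appending preserves insertion order within a priority level.
    pq[priority - 1].append(element)

def pop_highest(pq):
    # Scan from priority 10 down to 1; earlier insertions come out first.
    for bucket in reversed(pq):
        if bucket:
            return bucket.pop(0)
    raise IndexError('pop from an empty priority queue')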

Why are you trying to re-invent the wheel?
from Queue import PriorityQueue  # the module is named queue (lowercase) in Python 3
http://docs.python.org/2/library/queue.html?highlight=priorityqueue#Queue.PriorityQueue
The lowest valued entries are retrieved first (the lowest valued entry is the one returned by sorted(list(entries))[0]). A typical pattern for entries is a tuple in the form:
(priority_number, data).
I use such a module to communicate between the UI and a background polling thread.
READ_LOOP = 5
LOW_PRI = 3
MED_PRI = 2
HI_PRI = 1
X_HI_PRI = 0
and then something like this:
CoreGUI.TX_queue.put((X_HI_PRI,'STOP',[]))
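On the receiving side, get() returns the lowest tuple first, so the X_HI_PRI messages come out ahead of everything else. A self-contained sketch of the pattern (the names are illustrative, using the Python 3 module name):

from queue import PriorityQueue

q = PriorityQueue()
q.put((3, 'low priority work', []))
q.put((0, 'STOP', []))
q.put((2, 'medium priority work', []))

while not q.empty():
    priority, message, args = q.get()
    print(priority, message)  # prints 0 first, then 2, then 3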

Note that there is Queue.PriorityQueue in the standard library. If you are okay with it being synchronized (it takes a lock on every operation), I would use that.
Otherwise, you should use a heap to maintain your queue; see the heapq documentation, which includes an example of exactly that.
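A minimal sketch of the heap approach (my example, not taken verbatim from the docs), using the (priority, counter, task) entry layout the heapq documentation recommends so that ties keep insertion order:

import heapq
import itertools

_counter = itertools.count()  # tie-breaker: equal priorities stay FIFO

def insert(heap, task, priority):
    # heapq is a min-heap; negating the priority makes 10 pop before 1
    heapq.heappush(heap, (-priority, next(_counter), task))

def pop(heap):
    return heapq.heappop(heap)[-1]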

From a great book "Modern Python Standard Library Cookbook" by Alessandro Molina
Heaps are a perfect match for everything that has priorities, such as
a priority queue:
import time
import heapq

class PriorityQueue:
    def __init__(self):
        self._q = []

    def add(self, value, priority=0):
        heapq.heappush(self._q, (priority, time.time(), value))

    def pop(self):
        return heapq.heappop(self._q)[-1]
Example:
>>> def f1(): print('hello')
>>> def f2(): print('world')
>>>
>>> pq = PriorityQueue()
>>> pq.add(f2, priority=1)
>>> pq.add(f1, priority=0)
>>> pq.pop()()
hello
>>> pq.pop()()
world
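Note the time.time() in the middle of each pushed tuple: it breaks ties between equal priorities by insertion time, which is exactly the "earlier first" tie-break asked about above. (heapq's documentation suggests a monotonically increasing counter for the same job, since two add() calls can land within the clock's resolution.)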

A deque (from collections import deque) is the Python implementation of a single queue. You can add items to one end and remove them from the other. If you have a deque for each priority level, you can add to the priority level you want.
Together, it looks a bit like this:
from collections import deque

class PriorityQueue:
    def __init__(self, priorities=10):
        # One FIFO sub-queue per priority level; index 0 is served first.
        self.subqueues = [deque() for _ in range(priorities)]

    def enqueue(self, priority, value):
        self.subqueues[priority].append(value)

    def dequeue(self):
        for queue in self.subqueues:
            try:
                return queue.popleft()
            except IndexError:
                continue
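A quick usage sketch (with this layout index 0 is served first, so the question's "10 is highest" convention would need mapping, e.g. enqueue(10 - priority, value)):

pq = PriorityQueue()
pq.enqueue(0, 'urgent')
pq.enqueue(3, 'later')
pq.enqueue(0, 'also urgent')

print(pq.dequeue())  # 'urgent'
print(pq.dequeue())  # 'also urgent' -- FIFO within a level
print(pq.dequeue())  # 'later'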

Related

Is list comprehension atomic?

This concerns the _Status.getOps method. I'm very new to multithreading and I'm not sure if I need to hold the lock for this method. Is a list comprehension like this atomic or not? I can't imagine it compiles to a single bytecode instruction, but locking here seems clumsy.
import threading

class _Status:
    '''A container object for formatting what we are doing in a
    nice way. This way, the information about what we're doing
    is retained even after we're long-done doing it. '''
    def __init__(self):
        self.nextid = 0
        self.items = {}
        self.lock = threading.RLock()

    def add(self, opStr):
        ''' Pass this method a string describing an operation and
        we'll put it into the data structure, then return to you an
        id you can use to do things to this operation later. '''
        with self.lock:
            id = self._getid()
            # Store the operation
            self.items[id] = {
                'name': opStr,
                'status': 'Initializing',
                'errors': []
            }
            return id

    def getOps(self):
        ''' This method returns a list of strings describing what's
        going on right now. Also comes with some suggested colors
        for displaying them. '''
        with self.lock:
            return ['{name}: {status}'.format(**op) for op in
                    [self.items[id] for id in sorted(self.items.keys())]]

    def clear(self):
        ''' This removes all stored operations--use with caution. '''
        self.items = {}

    def _updateOp(self, opId, opStatus):
        ''' This method needs an id to work with and a string which
        describes the status of that id's operation. I'll update it
        if it can. This is a private method; we'll give aliases for
        it, however. '''
        with self.lock:
            # Do nothing if we can't find the item
            if opId not in self.items:
                return
            self.items[opId]['status'] = opStatus

    def _getid(self):
        ''' Internal method to increment the next id we hand out. '''
        id = hex(self.nextid)
        self.nextid += 1
        return id
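(For what it's worth, one way to see that the comprehension cannot be atomic is to disassemble it; this snippet is illustrative and not part of the original class:)

import dis

def get_ops(items):
    return ['{name}: {status}'.format(**op) for op in
            [items[id] for id in sorted(items.keys())]]

# Prints dozens of bytecode instructions. Roughly speaking, CPython's GIL
# only guarantees atomicity of individual instructions, so another thread
# could mutate `items` between them -- holding the lock, as getOps does,
# is the safe choice.
dis.dis(get_ops)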

Implementing Priority Queue in python

I'm having a bit of trouble implementing a priority queue in Python. Essentially I copied all the code from the heapq documentation; however, in my application, which recreates the data structure for Dijkstra's algorithm, I would like to update the priority of each item.
import heapq
import itertools

class PriorityQueue():
    REMOVED = '<removed-task>'  # placeholder for a removed task

    def __init__(self, arr=None):
        if arr is None:
            self.data = []
        else:
            self.data = []
        self.entry_finder = {}            # mapping of tasks to entries
        self.counter = itertools.count()  # unique sequence count

    def add_task(self, task, priority=0):
        'Add a new task or update the priority of an existing task'
        if task in self.entry_finder:
            self.remove_task(task)
        count = next(self.counter)  # increments each time an item is added (saves index)
        entry = [priority, count, task]
        self.entry_finder[task] = entry
        heapq.heappush(self.data, entry)

    def remove_task(self, task):
        'Mark an existing task as REMOVED. Raise KeyError if not found.'
        entry = self.entry_finder.pop(task)
        entry[-1] = self.REMOVED

    def pop_task(self):
        'Remove and return the lowest priority task. Raise KeyError if empty.'
        while self.data:
            priority, count, task = heapq.heappop(self.data)
            if task is not self.REMOVED:
                del self.entry_finder[task]
                return task
        raise KeyError('pop from an empty priority queue')

pq = PriorityQueue()
pq.add_task("A", 12)
pq.add_task("B", 6)
pq.add_task("C", 8)
pq.add_task("D", 2)
pq.add_task("E", 1)
pq.add_task("A", 5)
The preceding code works fine, except that it adds the new task "A" and the old task "A" is merely renamed to '<removed-task>' and keeps its index in the heap. I would prefer to outright delete the original entry instead. I was confused as to how calling add_task (which calls remove_task) to change the value of entry was implicitly changing items in self.data. I then realized that in add_task:
    entry = [priority, count, task]
    self.entry_finder[task] = entry
the two names are aliases referencing the same list object. If this is the case, how can I delete an object by its id? I think this would solve my problem, unless anyone has another solution to find the item in self.data and remove it in O(1). Thanks!
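(A tiny standalone illustration of that aliasing:)

entry = [5, 0, 'A']
entry_finder = {'A': entry}
heap = [entry]

# Mutating the list through one name is visible through the other:
entry_finder['A'][-1] = '<removed-task>'
print(heap[0])  # [5, 0, '<removed-task>'] -- both names refer to one list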

Using multiprocessing module in class

I have the following program and I want to use the multiprocessing module. It uses external files: I call the PSO class from another file, costfunc is a function from another file, and the other args are just variables.
Swarm is a list containing as many objects as the value of ps, and each object has multiple attributes which need updating at every iteration.
Following Hannu's answer, I implemented multiprocessing.Pool and it is working; however, it is taking much more time than running sequentially.
I would appreciate it if you could tell me the reasons for this and how I can make it run faster.
# IMPORT PACKAGES -----------------------------------------------------------+
import random
import multiprocessing
import numpy as np
from functools import partial
# IMPORT FILES --------------------------------------------------------------+
from Reducer import initial

# Particle Class ------------------------------------------------------------+
class Particle:
    def __init__(self, D, bounds_x, bounds_v):
        self.Position_i = []         # particle position
        self.Velocity_i = []         # particle velocity
        self.Cost_i = -1             # cost individual
        self.Position_Best_i = []    # best position individual
        self.Cost_Best_i = -1        # best cost individual
        self.Constraint_Best_i = []  # best cost individual constraints
        self.Constraint_i = []       # constraints individual
        self.Penalty_i = -1          # penalty individual
        x0, v0 = initial(D, bounds_x, bounds_v)
        for i in range(0, D):
            self.Velocity_i.append(v0[i])
            self.Position_i.append(x0[i])

    # evaluate current fitness
    def evaluate(self, costFunc, i):
        self.Cost_i, self.Constraint_i, self.Penalty_i = costFunc(self.Position_i, i)
        # check to see if the current position is an individual best
        if self.Cost_i < self.Cost_Best_i or self.Cost_Best_i == -1:
            self.Position_Best_i = self.Position_i
            self.Cost_Best_i = self.Cost_i
            self.Constraint_Best_i = self.Constraint_i
            self.Penalty_Best_i = self.Penalty_i
        return self

def proxy(gg, costf, i):
    return gg.evaluate(costf, i)

# Swarm Class ---------------------------------------------------------------+
class PSO():
    def __init__(self, costFunc, bounds_x, bounds_v, ps, D, maxiter):
        self.Cost_Best_g = -1      # Best Cost for Group
        self.Position_Best_g = []  # Best Position for Group
        self.Constraint_Best_g = []
        self.Penalty_Best_g = -1
        # Establish Swarm
        Swarm = []
        for i in range(0, ps):
            Swarm.append(Particle(D, bounds_x, bounds_v))
        # Begin optimization Loop
        i = 1
        self.Evol = []
        while i <= maxiter:
            pool = multiprocessing.Pool(processes=4)
            results = pool.map_async(partial(proxy, costf=costFunc, i=i), Swarm)
            pool.close()
            pool.join()
            Swarm = results.get()
            for j in range(0, ps):
                if Swarm[j].Cost_i < self.Cost_Best_g or self.Cost_Best_g == -1:
                    self.Position_Best_g = list(Swarm[j].Position_i)
                    self.Cost_Best_g = float(Swarm[j].Cost_i)
                    self.Constraint_Best_g = list(Swarm[j].Constraint_i)
                    self.Penalty_Best_g = float(Swarm[j].Penalty_i)
            self.Evol.append(self.Cost_Best_g)
            i += 1
You need a proxy function to do the function call, and as you need to deliver arguments to the function, you will need partial as well. Consider this:
from time import sleep
from multiprocessing import Pool
from functools import partial

class Foo:
    def __init__(self, a):
        self.a = a
        self.b = None

    def evaluate(self, CostFunction, i):
        xyzzy = CostFunction(i)
        sleep(0.01)
        self.b = self.a * xyzzy
        return self

def CostFunc(i):
    return i * i

def proxy(gg, costf, i):
    return gg.evaluate(costf, i)

def main():
    Swarm = []
    for i in range(0, 10):
        nc = Foo(i)
        Swarm.append(nc)
    p = Pool()
    for i in range(100, 102):
        results = p.map_async(partial(proxy, costf=CostFunc, i=i), Swarm)
    p.close()
    p.join()
    Swarm = []
    for a in results.get():
        Swarm.append(a)
    for s in Swarm:
        print(s.b)

main()
This creates a Swarm list of objects, and within each of these objects is the evaluate method you need to call. Then we have parameters (CostFunc and an integer, as in your code).
We now use Pool.map_async to map your Swarm list to your pool. This gives each worker one instance of Foo from your Swarm list, and we have a proxy function that actually calls evaluate().
However, as map_async only sends one object from the iterable to the target function, instead of using proxy directly as the target, we use partial to create the target function and bake in the "fixed" arguments.
And as you apparently want to get the modified objects back, this requires another trick. If you modify the target object in a Pool worker, it just modifies the worker's local copy, which is thrown away as soon as the processing completes. A subprocess cannot modify the main process's memory (nor vice versa), since each process has its own address space.
Instead, after modifying the object, we return self. When your pool has completed its work, we discard the old Swarm and reassemble it from the returned objects.

How to use filter in Python with a function which belongs to an object which is an element on the list being filtered?

To be specific in my case, the class Job has a number of Task objects on which it operates.
import tasker

class Job(object):
    _name = None
    _tasks = []
    _result = None

    def __init__(self, Name):
        self._name = Name

    def ReadTasks(self):
        # read from a Json file and create a list of task objects.
        pass

    def GetNumTasks(self):
        return len(self._tasks)

    def GetNumFailedTasks(self):
        failTaskCnt = 0
        for task in self._tasks:
            if task.IsTaskFail():
                failTaskCnt += 1
        return failTaskCnt
To make GetNumFailedTasks more succinct, I would like to use filter, but I am not sure of the correct way to give filter IsTaskFail as its first parameter.
In case, this is a duplicate, please mark it so, and point to the right answer.
You can use a generator expression with sum:
failTaskCnt = sum(1 for task in self._tasks if task.IsTaskFail())
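If you specifically want filter, one option (a sketch; it assumes the task class is named Task with the IsTaskFail method, as in the question, and lives inside GetNumFailedTasks) is operator.methodcaller, or an equivalent lambda:

from operator import methodcaller

# inside Job:
def GetNumFailedTasks(self):
    # filter keeps the tasks whose IsTaskFail() returns True
    return len(list(filter(methodcaller('IsTaskFail'), self._tasks)))
    # equivalently: filter(lambda task: task.IsTaskFail(), self._tasks)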

How to create queues of objects in Django?

I am new to Django and I am trying to build a blog myself. I'm trying to create a feature I've seen implemented in Drupal using the nodequeue module.
What I want to do is to be able to create queues of objects, for example, queues of blog posts. Below, I describe how I imagine the queues to work:
the size of each queue should be user-defined.
the date an object is added to a queue should be recorded.
I would like to be able to define the order of the items that belong to each queue (but I think this would be very difficult).
if the queue is full, the addition of an extra item should discard the oldest item of the queue.
An example of how such a feature would be useful is the creation of a featured posts queue.
My current knowledge does not allow me to figure out what would be the right way to do it. I would appreciate any pointers.
Thanks in advance
Here's one approach:
import collections, datetime, itertools, operator

class nodequeue(object):
    def __init__(self, N):
        # prefill with placeholders so add() can always pop one item first
        self.data = collections.deque(N * [(None, None)])

    def add(self, anobj):
        self.data.popleft()
        self.data.append((anobj, datetime.datetime.now()))

    def __iter__(self):
        return itertools.dropwhile(lambda x: x[1] is None, self.data)
This ignores the "ordering" desiderata, but that wouldn't be too hard to add, e.g.:
class nodequeueprio(object):
    def __init__(self, N):
        self.data = collections.deque(N * [(None, None, None)])

    def add(self, anobj, prio):
        self.data.popleft()
        self.data.append((anobj, datetime.datetime.now(), prio))

    def __iter__(self):
        return iter(sorted(itertools.dropwhile(lambda x: x[1] is None, self.data),
                           key=operator.itemgetter(2)))
I think that prefilling the queue with placeholder Nones simplifies the code because add can always drop the leftmost (oldest or None) item before adding the new thingy -- even though __iter__ must then remove the placeholders, that's not too bad.
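A quick usage sketch of nodequeue (illustrative only):

nq = nodequeue(3)
for post in ['post-1', 'post-2', 'post-3', 'post-4']:
    nq.add(post)

# 'post-1' has been pushed out; each surviving entry carries its add-time
for obj, added_at in nq:
    print(obj, added_at)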
Alex's approach is great. I won't pretend to compete with his level of expertise, but for sake of completeness, here is another approach using the fantastic Queue.Queue class (bonus: thread-safe, but that's kinda useless to you based on your description). This might be easier to understand for you, as you expressed some concern on that point:
myqueue.py
#!/usr/bin/python
# Renamed in Python 3.0
try:
    from Queue import Queue, Full, Empty
except ImportError:
    from queue import Queue, Full, Empty
from datetime import datetime

# Spec 1: Size of each queue should be user-defined.
#         - maxsize on __init__
# Spec 2: Date an object is added should be recorded.
#         - datetime.now() is first member of tuple, data is second
# Spec 3: I would like to be able to define the order of the items that
#         belong to each queue.
#         - Order cannot be rearranged with this queue.
# Spec 4: If the queue is full, the addition of an extra item should discard
#         the oldest item of the queue.
#         - implemented in put()

class MyQueue(Queue):
    "Wrapper around Queue that discards old items instead of blocking."
    def __init__(self, maxsize=10):
        assert type(maxsize) is int, "maxsize should be an integer"
        Queue.__init__(self, maxsize)

    def put(self, item):
        "Put an item into the queue, possibly discarding an old item."
        try:
            Queue.put(self, (datetime.now(), item), False)
        except Full:
            # If we're full, pop an item off and add on the end.
            Queue.get(self, False)
            Queue.put(self, (datetime.now(), item), False)

    def put_nowait(self, item):
        "Put an item into the queue, possibly discarding an old item."
        self.put(item)

    def get(self):
        "Get a tuple containing an item and the datetime it was entered."
        try:
            return Queue.get(self, False)
        except Empty:
            return None

    def get_nowait(self):
        "Get a tuple containing an item and the datetime it was entered."
        return self.get()

def main():
    "Simple test method, showing at least spec #4 working."
    queue = MyQueue(5)
    for i in range(1, 7):
        queue.put("Test item number %u" % i)
    while not queue.empty():
        time_and_data = queue.get()
        print("%s => %s" % time_and_data)

if __name__ == "__main__":
    main()
expected output
2009-11-02 23:18:37.518586 => Test item number 2
2009-11-02 23:18:37.518600 => Test item number 3
2009-11-02 23:18:37.518612 => Test item number 4
2009-11-02 23:18:37.518625 => Test item number 5
2009-11-02 23:18:37.518655 => Test item number 6
You can use django-activity-stream. It doesn't have a UI like Nodequeue does, but it can be used to create different queues of objects.
https://github.com/justquick/django-activity-stream
http://www.slideshare.net/steveivy/activity-streams-lightning-talk-djangocon-2011-day-3
