The default heapq is a min-heap implementation. Is there an option for a max-heap? Thanks.
I tried using _heapify_max to build a max-heap, but how do I push and pop elements dynamically? It seems _heapify_max can only be used at initialization time.
import heapq

def heapsort(iterable):
    h = []
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]

if __name__ == "__main__":
    print(heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))
Edit: I tried _heapify_max, but it does not seem to work when pushing and popping elements dynamically. Both methods below produce the same output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].
def heapsort(iterable):
    h = []
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]

def heapsort2(iterable):
    h = []
    heapq._heapify_max(h)
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]

if __name__ == "__main__":
    print(heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))
    print(heapsort2([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))
Thanks in advance,
Lin
In the past I have simply used SortedList from the sortedcontainers package for this:
>>> from sortedcontainers import SortedList
>>> a = SortedList()
>>> a.add(3)
>>> a.add(2)
>>> a.add(1)
>>> a.pop()
3
It's not a heap, but it's fast and works directly as required.
If you absolutely need it to be a heap, you could make a general negation class to hold your items.
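class Neg(object):
    """Wraps an item and reverses its ordering, so a min-heap acts as a max-heap."""
    def __init__(self, x):
        self.x = x
    def __lt__(self, other):
        # heapq only compares items with <, so inverting it is enough
        return self.x > other.x

def maxheappush(heap, item):
    heapq.heappush(heap, Neg(item))

def maxheappop(heap):
    return heapq.heappop(heap).x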
But that will be using a little more memory.
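For plain numbers, a lighter variant of the same idea is to store negated values, which avoids the wrapper object. This is only a minimal sketch, assuming every item supports unary minus; max_heapsort is an illustrative name, not part of heapq:

```python
import heapq

def max_heapsort(numbers):
    """Return the numbers in descending order using heapq's min-heap."""
    heap = []
    for n in numbers:
        heapq.heappush(heap, -n)  # smallest negated value == largest original
    # Negate again on the way out to recover the original values
    return [-heapq.heappop(heap) for _ in range(len(heap))]

print(max_heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))
```

This does not help for items such as strings, where the Neg wrapper above is still needed.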
There is a _heappop_max function in the latest cpython source that you may find useful:
def _heappop_max(heap):
    """Maxheap version of a heappop."""
    lastelt = heap.pop()  # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt
If you change the heappush logic using heapq._siftdown_max you should get the desired output:
def _heappush_max(heap, item):
    heap.append(item)
    heapq._siftdown_max(heap, 0, len(heap)-1)

def _heappop_max(heap):
    """Maxheap version of a heappop."""
    lastelt = heap.pop()  # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt

def heapsort2(iterable):
    h = []
    heapq._heapify_max(h)
    for value in iterable:
        _heappush_max(h, value)
    return [_heappop_max(h) for i in range(len(h))]
Output:
In [14]: heapsort2([1,3,6,2,7,9,0,4,5,8])
Out[14]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In [15]: heapsort2([7, 8, 9, 6, 4, 2, 3, 5, 1, 0])
Out[15]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In [16]: heapsort2([19,13,15,17,11,10,14,20,18])
Out[16]: [20, 19, 18, 17, 15, 14, 13, 11, 10]
In [17]: heapsort2(["foo","bar","foobar","baz"])
Out[17]: ['foobar', 'foo', 'baz', 'bar']
I'm trying to write a function that takes a flat list and returns a nested list that groups the values of the original list by their value.
For example, given
L = [1, 19, 5, 2, 22, 12, 28]
grouping in ranges of 10 should give
L = [[1, 2, 5], [12, 19], [22, 28]]
However, I HAVE to use recursion to do this. I'm a little lost and don't know where to start. I would appreciate a bit of help.
def group_lists(L, pos):
    if pos>=len(L):
        return None
    else:
        if L[pos]<=10:
            return ?
        elif 20>=L[pos]>10:
            return ?
        else:
            return ?
You should pass the intermediate result through the recursive calls, and then return it when you've run out of elements. If you must use exactly two parameters for whatever reason, make a wrapper function or use the global keyword. (I would not recommend using global in practice, however.)
def group_lists(L, pos, result):
    if len(L) <= pos:
        return result
    else:
        result[L[pos] // 10].append(L[pos])
        return group_lists(L, pos + 1, result)

L = [1, 19, 5, 2, 22, 12, 28]
print(group_lists(L, 0, [[], [], []]))
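If the two-parameter signature is required, the wrapper idea mentioned above could look roughly like this. It is only a sketch: _group is a made-up helper name, and the code assumes non-negative integers so that one bucket per range of ten is enough.

```python
def group_lists(L, pos=0):
    """Two-parameter entry point that sets up the accumulator."""
    # One empty bucket per range of ten, up to the largest value in L
    buckets = [[] for _ in range(max(L) // 10 + 1)] if L else []
    return _group(L, pos, buckets)

def _group(L, pos, result):
    """Recursive helper (hypothetical name) that carries the result along."""
    if pos >= len(L):
        return result
    result[L[pos] // 10].append(L[pos])
    return _group(L, pos + 1, result)

print(group_lists([1, 19, 5, 2, 22, 12, 28]))
```

Sizing the buckets from max(L) means the caller no longer has to know in advance how many groups there will be.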
Here's a version that adds one sublist per recursive call:
>>> def group_lists(arr):
...     if isinstance(arr[-1], list):
...         return arr
...     i = sum(isinstance(sub, list) for sub in arr)
...     return group_lists(
...         arr[:i]
...         + [[n for n in arr[i:] if n // 10 == i]]
...         + [n for n in arr[i:] if n // 10 != i]
...     )
...
>>> group_lists([1, 19, 5, 2, 22, 12, 28])
[[1, 5, 2], [19, 12], [22, 28]]
You can try the following:
lst = [1, 19, 5, 2, 22, 12, 28]

def group_lists(lst):
    if not lst:  # empty list
        return []
    output = group_lists(lst[1:])  # recursion
    while len(output) <= lst[0] // 10:  # if not enough slots
        output.append([])  # make as many as needed
    output[lst[0] // 10].append(lst[0])  # append the item to the right slot
    return output

print(group_lists(lst))  # [[2, 5, 1], [12, 19], [28, 22]]
This would automatically add slots as needed, using a while loop. It will make empty slots if your input is something like [1, 21].
You can use the defaultdict structure from the collections module, for example:
from collections import defaultdict

def group_list(L, res=None):
    if res is None:  # fresh accumulator for each top-level call
        res = defaultdict(list)
    if len(L) == 1:
        res[L[0] // 10].append(L[0])
    elif len(L) > 1:
        mid = len(L) // 2
        group_list(L[:mid], res)
        group_list(L[mid:], res)
    return res

print(group_list([1, 19, 5, 2, 22, 12, 28]))
#defaultdict(<class 'list'>, {0: [1, 5, 2], 1: [19, 12], 2: [22, 28]})
My most recent lab assignment has me trying to implement a Greedy algorithm for the 0/1 Knapsack problem, and print out the contents of the knapsack along with the total value of the knapsack. So far, I was able to get it to output the total value of the knapsack without issue, but I'm having trouble with outputting what items went into the knapsack.
# class definitions for the greedy approach
class Item:
    def __init__(self, weight, value):
        self.weight = weight
        self.value = value
        self.price_kg = value / weight

    def __repr__(self):
        return f"Item(weight={self.weight}, value={self.value},v/w={self.price_kg})\n"

class Knapsack:
    def __init__(self, max_weight, items):
        self.max_weight = max_weight
        self.items = items
        self.contents = list()

    def fillGreedy(self):
        self.items.sort(key=lambda x: x.price_kg, reverse=True)  # sorts the items by value/weight, best first
        for i in self.items:
            self.contents.append(i)  # tries putting the item in the bag
            if sum(c.weight for c in self.contents) > self.max_weight:
                self.contents.remove(i)  # removes the item if it is too heavy for the bag
            elif sum(c.weight for c in self.contents) == self.max_weight:  # the bag is exactly full, so stop early
                return sum(c.value for c in self.contents)
        return sum(c.value for c in self.contents)

# main method
max_weights = [10, 13, 15, 30, 30]
weights = [
    [4, 5, 7],
    [6, 5, 7, 3, 1],
    [2, 3, 5, 5, 3, 7],
    [10, 13, 17, 15],
    [5, 4, 7, 6, 3, 4, 2, 1, 7, 6]
]
values = [
    [2, 3, 4],
    [7, 3, 4, 4, 3],
    [3, 4, 10, 9, 6, 13],
    [21, 17, 30, 23],
    [3, 1, 3, 2, 1, 3, 2, 3, 1, 4]
]
for i in range(len(max_weights)):
    items = list()
    for j in range(len(weights[i])):
        items.append(Item(weights[i][j], values[i][j]))  # adds the contents of the arrays to the items list
    ks = Knapsack(max_weights[i], items)
    v1 = ks.fillGreedy()
    print(f"Total value = {v1}")
    # print(items)
So far, I tried printing out the contents of the ks and v1 objects, but that only gives the memory addresses of the objects. I tried printing out the 'items' list itself after iterating through the fillGreedy method, but it prints out all the contents of the list and not the ones in the knapsack itself. I also tried doing something in the fillGreedy method that would print the item that was just added, but it ended up causing conflicts. I'm unsure where to continue from here. Is there a way to print out the items of the knapsack using this approach?
Welcome to the site.
You already have a collection of the selected items inside the Knapsack object, so you could iterate over ks.contents and print out the contents or whatever is needed from there...
for item in ks.contents:
    print(item)
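If it helps, the fill step can also return the chosen items together with the total, so the caller never has to reach into the object. This is just a sketch under my own assumptions: fill_greedy and the namedtuple-based Item are illustrative stand-ins, not the classes from the question.

```python
from collections import namedtuple

# Illustrative stand-in for the Item class in the question
Item = namedtuple("Item", ["weight", "value"])

def fill_greedy(max_weight, items):
    """Greedy 0/1 fill by value/weight ratio; returns (chosen items, total value)."""
    chosen = []
    total_weight = 0
    # Best value per unit of weight first
    for item in sorted(items, key=lambda it: it.value / it.weight, reverse=True):
        if total_weight + item.weight <= max_weight:  # only take items that still fit
            chosen.append(item)
            total_weight += item.weight
    return chosen, sum(it.value for it in chosen)

chosen, total = fill_greedy(10, [Item(4, 2), Item(5, 3), Item(7, 4)])
for item in chosen:
    print(item)
print(f"Total value = {total}")
```

Tracking the running weight also avoids re-summing the contents on every iteration.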
I have three forms that looks like that:
class RoomsForm(forms.Form):
rooms = forms.IntegerField(min_value=1)
class PeopleForm(forms.Form):
adult = forms.IntegerField(min_value=1)
children = forms.IntegerField(required=False)
class ChildrenAgeForm(forms.Form):
children_age = forms.IntegerField(max_value=10, required=False)
The number of PeopleForm instances depends on the value of the rooms field in RoomsForm, and the number of ChildrenAgeForm instances depends on the children field of each PeopleForm. So I create formsets for PeopleForm and ChildrenAgeForm and multiply them using JS. Finally, I need to build a string that looks like this if the value of rooms is, for example, 3:
'<Room Adult=2 Children=2>
<ChildAge>2</ChildAge>
<ChildAge>1</ChildAge>
</Room>
<Room Adult=1 Children=0>
</Room>
<Room Adult=1 Children=1>
<ChildAge>3</ChildAge>
</Room>'
Based on this, I wrote the following loop in views.py:
PeopleFormSet = formset_factory(PeopleForm, extra = 1, max_num = 15)
ChildrenAgeFormSet = formset_factory(ChildrenAgeForm, extra = 1, max_num = 20)
rooms_form = RoomsForm(request.POST, prefix='rooms_form')
people_formset = PeopleFormSet(request.POST, prefix='people')
childrenage_formset = ChildrenAgeFormSet(request.POST, prefix='childrenage')
if rooms_form.is_valid() and people_formset.is_valid() and childrenage_formset.is_valid():
    people = ''
    childrenage_str = []
    for i in range(0, childrenage_formset.total_form_count()):
        childrenage_form = childrenage_formset.forms[i]
        childrenage = str(childrenage_form.cleaned_data['children_age'])
        childrenage_str += childrenage
    for n in range(0, people_formset.total_form_count()):
        childrenage_lst = childrenage_str
        people_form = people_formset.forms[n]
        adults = str(people_form.cleaned_data['adult'])
        children = people_form.cleaned_data['children']
        for i in range(0, children):
            childage_str = ''
            childage = childrenage_lst.pop(i)
            childage_str += '<ChildAge>%s</ChildAge>' % childage
        people += '<Room Adults="%s">%s</Room>' % (adults, childage_str)
But I get the error "pop index out of range". I hope you can help me fix my script.
By using pop you're removing elements from the list:
>>> mylist = [0,1,2,3,4,5,6,7,8,9]
>>> for i in range(0, len(mylist)):
...     print(mylist)
...     print(mylist.pop(i))
...
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
0
[1, 2, 3, 4, 5, 6, 7, 8, 9]
2
[1, 3, 4, 5, 6, 7, 8, 9]
4
[1, 3, 5, 6, 7, 8, 9]
6
[1, 3, 5, 7, 8, 9]
8
[1, 3, 5, 7, 9]
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
IndexError: pop index out of range
So children, which you're using the length of, is constant, but childrenage_lst is constantly getting shorter and shorter. If you're confident that the two will always start out being the same length, then just access elements in childrenage_lst using []:
for i in range(0, children):
    print(childrenage_lst[i])
That said, because of its initialisation, childrenage_str = '' and then childrenage_lst = childrenage_str, it looks like childrenage_lst is a string, which doesn't have a pop method, so I think there's something missing from the code you've posted, to get the TraceBack you're getting.
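One way to sidestep the index bookkeeping altogether is to walk the ages with a single shared iterator, so each room simply takes the next few values. The following is only a sketch with Django left out: build_rooms_xml is a made-up name, rooms is assumed to be a list of (adults, children) pairs taken from the people formset's cleaned_data, and child_ages is assumed to be the flat list of ages in the same order.

```python
from itertools import islice

def build_rooms_xml(rooms, child_ages):
    """Build the <Room> string from (adults, children) pairs and a flat age list."""
    ages = iter(child_ages)  # shared iterator: each room consumes the next ages
    parts = []
    for adults, children in rooms:
        count = children or 0  # the children field is optional, so it may be None
        age_tags = ''.join(
            '<ChildAge>%s</ChildAge>' % age for age in islice(ages, count)
        )
        parts.append('<Room Adult=%s Children=%s>%s</Room>' % (adults, count, age_tags))
    return ''.join(parts)

print(build_rooms_xml([(2, 2), (1, 0), (1, 1)], [2, 1, 3]))
```

Because islice just stops when the iterator runs dry, a room with more children than remaining ages gets fewer tags instead of raising an IndexError; whether that is acceptable depends on how the forms are validated.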
I was faced with the problem of executing n concurrent events that all return iterators over the results they acquired. However, there was an optional limit parameter that says, basically, to consolidate all the iterators and return up to limit results.
So, for example: I execute 2,000 URL requests on 8 threads but just want the first 100 results, and not all 100 from the same thread.
Thus, unravel:
import itertools
def unravel(*iterables, with_limit=None):
    make_iter = {a: iter(i) for a, i in enumerate(iterables)}
    if not isinstance(with_limit, int):
        with_limit = -1  # a negative value never reaches 0, i.e. no limit
    while make_iter:
        exhausted = []
        for iid, take_from in make_iter.items():
            if with_limit == 0:
                return  # a generator must return, not raise StopIteration (PEP 479)
            try:
                yield next(take_from)
            except StopIteration:
                exhausted.append(iid)  # remember it; don't mutate the dict mid-loop
            else:
                with_limit -= 1
        for iid in exhausted:
            make_iter.pop(iid)
Usage:
>>> a = [1,2,3,4,5]
>>> b = [6,7,8,9,10]
>>> c = [1,3,5,7]
>>> d = [2,4,6,8]
>>>
>>> print([e for e in unravel(c, d)])
[1, 2, 3, 4, 5, 6, 7, 8]
>>> print([e for e in unravel(c, d, with_limit = 3)])
[1, 2, 3]
>>> print([e for e in unravel(a, b, with_limit = 6)])
[1, 6, 2, 7, 3, 8]
>>> print([e for e in unravel(a, b, with_limit = 100)])
[1, 6, 2, 7, 3, 8, 4, 9, 5, 10]
Does something like this already exist, or is this a decent implementation?
Thanks
EDIT, WORKING FIX
Inspired by #abernert 's suggestion, this is what I went with. Thanks everybody!
def unravel(*iterables, limit=None):
    yield from itertools.islice(
        filter(None,
            itertools.chain.from_iterable(
                itertools.zip_longest(*iterables)
            )
        ),
        limit)
>>> a = [x for x in range(10)]
>>> b = [x for x in range(5)]
>>> c = [x for x in range(0, 20, 2)]
>>> d = [x for x in range(1, 30, 2)]
>>>
>>> print(list(unravel(a, b)))
[1, 1, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 9]
>>> print(list(unravel(a, b, limit = 3)))
[1, 1, 2]
>>> print(list(unravel(a, b, c, d, limit = 20)))
[1, 1, 1, 2, 3, 2, 2, 4, 5, 3, 3, 6, 7, 4, 4, 8, 9, 5, 10, 11]
What you're doing here is almost just zip.
You want a flat iterable, rather than an iterable of sub-iterables, but chain fixes that.
And you want to take only the first N values, but islice fixes that.
So, if the lengths are all equal:
>>> list(chain.from_iterable(zip(a, b)))
[1, 6, 2, 7, 3, 8, 4, 9, 5, 10]
>>> list(islice(chain.from_iterable(zip(a, b)), 7))
[1, 6, 2, 7, 3, 8, 4]
But if the lengths aren't equal, that will stop as soon as the first iterable finishes, which you don't want. And the only alternative in the stdlib is zip_longest, which fills in missing values with None.
You can pretty easily write a zip_longest_skipping (which is effectively the round_robin in Peter's answer), but you can also just zip_longest and filter out the results:
>>> list(filter(None, chain.from_iterable(zip_longest(a, b, c, d))))
[1, 6, 1, 2, 2, 7, 3, 4, 3, 8, 5, 6, 4, 9, 7, 8, 5, 10]
(Obviously this doesn't work if any of your values are falsy, such as 0, empty strings, or None, because filter(None, ...) drops them; when they're all positive integers it works fine. To handle that case, do sentinel = object(), pass it to zip_longest as fillvalue, then filter on x is not sentinel.)
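A sketch of that sentinel variant, in case it is useful. The name interleave and the module-level _MISSING marker are my own choices, not anything from itertools:

```python
from itertools import chain, islice, zip_longest

_MISSING = object()  # unique marker that cannot collide with real values

def interleave(*iterables, limit=None):
    """Round-robin over the iterables, keeping falsy values such as 0 or None."""
    flat = chain.from_iterable(zip_longest(*iterables, fillvalue=_MISSING))
    kept = (x for x in flat if x is not _MISSING)  # drop only the padding
    return islice(kept, limit)  # limit=None means no cap

print(list(interleave([0, 1, 2], [None, ''])))
print(list(interleave([0, 1, 2], [None, ''], limit=3)))
```

Comparing with `is not` matters here: an equality test could be fooled by objects that define their own `__eq__`.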
From the itertools example recipes:
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

Use itertools.islice to enforce your with_limit, e.g.:
print([e for e in islice(roundrobin(c, d), 3)])
>>> list(roundrobin(a, b, c, d))
[1, 6, 1, 2, 2, 7, 3, 4, 3, 8, 5, 6, 4, 9, 7, 8, 5, 10]
For what you're actually trying to do, there's probably a much better solution.
"I execute 2,000 url requests on 8 threads but just want the first 100 results, but not all 100 from the same potential thread."
OK, so why are the results in 8 separate iterables? There's no good reason for that. Instead of giving each thread its own queue (or global list and lock, or whatever you're using) and then trying to zip them together, why not have them all share a queue in the first place?
In fact, that's the default way that almost any thread pool is designed (including multiprocessing.Pool and concurrent.futures.Executor in the stdlib). Look at the main example for concurrent.futures.ThreadPoolExecutor:
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
That's almost exactly your use case—spamming a bunch of URL downloads out over 5 different threads and gathering the results as they come in—without your problem even arising.
Of course it's missing with_limit, but you can just wrap that as_completed iterable in islice to handle that, and you're done.
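Roughly, the islice wrapping could look like the sketch below. Note that fetch is a hypothetical stand-in that does no real network access, and first_results is just an illustrative name:

```python
import concurrent.futures
from itertools import islice

def fetch(url):
    # Hypothetical stand-in for a real download; returns (url, fake size)
    return url, len(url)

def first_results(urls, limit, max_workers=8):
    """Return the first `limit` results in completion order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch, url) for url in urls]
        done = concurrent.futures.as_completed(futures)
        # islice stops pulling from as_completed once `limit` futures finished
        return [future.result() for future in islice(done, limit)]

urls = ["http://example.com/%d" % i for i in range(20)]
print(first_results(urls, 5))
```

One caveat: leaving the with block still waits for the remaining submitted tasks to finish, so if that matters you would want to cancel the pending futures before returning.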
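This uses a generator and zip_longest (izip_longest on Python 2) to pull one item at a time from multiple iterators:
from itertools import zip_longest  # on Python 2: from itertools import izip_longest as zip_longest

def unravel(cap, *iters):
    counter = 0
    for group in zip_longest(*iters):
        for entry in group:
            if entry is None:  # padding from an exhausted iterator
                continue
            yield entry
            counter += 1
            if counter >= cap:
                return  # a plain break would only leave the inner loop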
Here's my attempt at creating a countdown in which all the numbers get appended to a list.
timeleft = 3
num1 = 24 - timeleft
mylist = []
def countdown():
    while num1 != 0:
        num1 -= 1
        mylist.append(num1)
countdown()
This is a small section of a schedule making app I'm making.
Instead of using global variables, I'd write a countdown function which accepts a start parameter and returns a list like this:
def countdown(start):
    return list(range(start,0,-1))
Demo:
timeleft = 3
num1 = 24 - timeleft
cd = countdown(num1)
print(cd) # [21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
If you want to count to zero use range(start,-1,-1).
def countdown(time_left):
    return [x for x in range(24 - time_left, -1, -1)]
Test:
>>> countdown(20)
[4, 3, 2, 1, 0]
>>> countdown(15)
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In Python 2, just return the call to range, which returns a list:
def countdown2(x, y=0):
    return range(x, y-1, -1)
In Python 3, you need to materialize the range into a list:
def countdown2(x, y=0):
    return list(range(x, y-1, -1))
To actually append to a list:
def countdown(x, y=0):
    '''countdown from x to y, return list'''
    l = []
    for i in range(x, y-1, -1):  # Python 2, use xrange
        l.append(i)
    return l
But calling list() on the range directly would be the standard approach here, rather than a list comprehension or an append loop.