Minimal elements in heap for Python

I want to get the minimal element of a min-heap in Python using heapq. Here is my code; I am wondering whether h[0] is the correct way, or whether heapq has a more elegant public API for this. I looked for an API that returns the minimal element of a heap but could not find one.
BTW, I am using Python 2.
import heapq
def heapMin(iterable):
    h = []
    for value in iterable:
        heapq.heappush(h, value)
    return h[0]
if __name__ == "__main__":
    print heapMin([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
thanks in advance,
Lin

To turn your list into a heap in one go, use heapify() instead of pushing in a loop. It works in place and needs a list, and heappop() then removes and returns the smallest item (the one at index 0):
heapq.heapify(iterable)
print heapq.heappop(iterable)
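For peeking at the smallest item without removing it, the heapq documentation describes heap[0] as the way to do it, so the h[0] in the question is fine. A minimal sketch, written for Python 3 (the print calls need adjusting on Python 2); heap_min is a hypothetical helper name:

```python
import heapq

def heap_min(iterable):
    """Return the smallest item of iterable by building a heap and peeking."""
    heap = list(iterable)      # copy so the caller's data is not reordered
    heapq.heapify(heap)        # O(n) in-place heap construction
    return heap[0]             # peek: smallest item, heap left intact

data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
smallest = heap_min(data)
two_smallest = heapq.nsmallest(2, data)
print(smallest)
print(two_smallest)
```

If only the minimum is needed and no heap is kept afterwards, the built-in min(iterable) gives the same value without building a heap.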

Related

How can I return a tuple with unique elements given a recursive function

I have looked around in other posts and haven't been able to find a solution for this. I have a function that needs to call itself recursively, using an itertools expression, in order to return a tuple of the unique elements with their order preserved.
For instance:
def function(list):
    return list and (list[0],) + function(some_itertools_expression)
For example, function([1, 7, 7, 9, 0, 1]) should return (1, 7, 9, 0).
I've tried using:
return list and (list[0],) + function(tuple([itertools.groupby(list)][:len(list)]))
but I end up running into RecursionError: maximum recursion depth exceeded. How can I solve this without getting the max recursion depth error?
If you must use a function from itertools in a recursive call, I would grab the first item of the sequence in each recursion and use itertools.filterfalse to filter items equal to the first from the sequence returned by a recursive call with the rest of the items:
from itertools import filterfalse
def unique(lst):
    if not lst:
        return ()
    first, *rest = lst
    return (first, *filterfalse(lambda i: i == first, unique(rest)))
print(unique([1, 7, 7, 9, 0, 1]))
This outputs:
(1, 7, 9, 0)
Demo: https://replit.com/#blhsing/WelloffPlainAutomaticparallelization
You can do this fairly easily, without needing recursion, by making a tuple from dictionary keys. A dict has unique keys and (from Python 3.7 on) preserves insertion order, so the result keeps the order of the original input sequence.
>>> data = [1, 7, 7, 9, 0, 1]
>>> (*{}.fromkeys(data),)
(1, 7, 9, 0)
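The dictionary approach can be wrapped in a function. This is a sketch assuming Python 3.7+, where dicts keep insertion order, and assuming every element is hashable; unique_tuple is a hypothetical name. Because it does not recurse, it is not affected by the recursion limit:

```python
def unique_tuple(items):
    """Return a tuple of the distinct items, keeping first-seen order."""
    return tuple(dict.fromkeys(items))

result = unique_tuple([1, 7, 7, 9, 0, 1])
# An input this long would be far deeper than the default recursion limit.
long_result = unique_tuple(list(range(5000)) * 2)
print(result)
```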

passing a list pointer to a function rather than the list

Is it possible to pass a (moving) pointer to a list start into a function in Python?
I have a recursive function working on a section of a list. The list itself is not changed, only the pointer to a 'starting-point' into it. The problem I ran into was that long lists killed the code with memory overrun.
Here is the code:
def trim(l):
    print("list len= ", len(l))
    if len(l) != 1:
        trim(l[1:])
    else:
        print("done")
The above example is contrived, my actual code does different stuff than just trimming the list, but it also has a moving start-pointer. A list of 1 million integers blew out of memory on a 10G RAM machine.
Any ideas are welcome.
Couldn't you just pass the index instead of passing the whole new list?
So you would call trim(l, 0) and then check the index against the length of the list, and then call trim(l, 1) if needed.
def trim(l, idx):
    print("list len = ", (len(l) - idx))
    if idx < (len(l) - 1):
        trim(l, idx + 1)
    else:
        print("done")
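Passing an index avoids copying the list, but the function above still recurses once per element, so a list of a million items would hit CPython's recursion limit (1000 frames by default) long before finishing. Below is a sketch of the same idea as a loop, with an integer as the moving start pointer. It is written for Python 3; trim_iterative and its visit callback are hypothetical names standing in for whatever the real code does at each position:

```python
def trim_iterative(items, visit):
    """Call visit(items, start) for each start position without copying items."""
    for start in range(len(items)):
        visit(items, start)    # the "moving pointer" is just an integer

remaining = []

def record_length(lst, start):
    # Record how many items are left at a few sample positions.
    if start % 250000 == 0:
        remaining.append(len(lst) - start)

big = list(range(1000000))
trim_iterative(big, record_length)
print(remaining)
```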
If you're writing a non-tail-call recursive function to iterate over a list, your problem is more likely to be a stack overflow, or out-of-memory error related to the stack size.
I recommend rewriting this with an integer index and a for loop, since Python does not perform tail-call optimisation.
Here's a guess at what you might be wanting to do:
x = [0,0,0,0,0,1,2,3,4]
def trim_leading_zero(l):
    for i in xrange(len(l)):  # use range() on Python 3
        if l[i] != 0:
            return l[i:]
    return []  # every element was zero
>>> trim_leading_zero(x)
[1, 2, 3, 4]
It's not clear from your code what it's meant to actually do. If you're trying to actually return a sequence, then you may want to look at Generators, which don't require holding an entire sequence in memory.
When dealing with large data, use generators instead of building whole lists.
def trim(l):
    print("list len= ", len(l))
    pointer = 0
    if len(l) != 1:
        yield l[pointer:]
        pointer += 1
    else:
        print("done")
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in trim(x):
    print i
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Generators yield one item at a time and let you do whatever you need with it, so you avoid creating the whole list before processing it. If you want to get a list out of it, you can simply do list(trim(x)).
There are great explanations of yield and generators here - What does the yield keyword do

Python : Split list based on negative integers

I have a list, say l = [1,5,8,-3,6,8,-3,2,-4,6,8]. I'm trying to split it into sublists of positive integers, i.e. the above list would give me [[1,5,8],[6,8],[2],[6,8]]. I've tried the following:
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
index = 0
def sublist(somelist):
    a = []
    for i in somelist:
        if i > 0:
            a.append(i)
        else:
            global index
            index += somelist.index(i)
            break
    return a
print sublist(l)
With this I can get the first sublist ([1,5,8]) and the index of the first negative integer, 3. If I run my function again and pass it l[index+1:], I can get the next sublist, and I assume index will be updated to 6. However, I can't for the life of me figure out how to run the function in a loop, or what condition to use, so that I keep calling it with l[index+1:], where index is the position of the most recently encountered negative integer. Any help will be greatly appreciated.
You need to keep track of two levels of list here - the large list that holds the sublists, and the sublists themselves. Start a large list, start a sublist, and keep appending to the current sublist while i is non-negative (which includes positive numbers and 0, by the way). When i is negative, append the current sublist to the large list and start a new sublist. Also note that you should handle cases where the first element is negative or the last element isn't negative.
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
def sublist(somelist):
    result = []
    a = []
    for i in somelist:
        if i >= 0: # non-negative, so 0 counts too
            a.append(i)
        else:
            if a: # make sure a has something in it
                result.append(a)
            a = []
    if a: # if a is still accumulating elements
        result.append(a)
    return result
The result:
>>> sublist(l)
[[1, 5, 8], [6, 8], [2], [6, 8]]
Since somelist never changes, calling index again always returns the position of the first occurrence of that value, not the one you just reached. I'd suggest using enumerate to get the index and the element together as you loop, so no calls to index are needed.
That said, you could use the included batteries to solve this as a one-liner, using itertools.groupby:
from itertools import groupby
def sublist(somelist):
    # (0).__le__(x) is 0 <= x, so k is True for runs of non-negative values
    return [list(g) for k, g in groupby(somelist, key=(0).__le__) if k]
Still worth working through your code to understand it, but the above is going to be fast and fairly simple.
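The enumerate suggestion can be sketched as follows: remember the index where each run of non-negative values starts, and slice the run out when a negative value ends it. This is an illustrative variant written for Python 3, not the answer's own code; split_on_negatives is a hypothetical name:

```python
def split_on_negatives(values):
    """Split values into runs of non-negative numbers, dropping the negatives."""
    result = []
    start = None                      # index where the current run began
    for idx, value in enumerate(values):
        if value >= 0:
            if start is None:
                start = idx           # a new run begins here
        elif start is not None:
            result.append(values[start:idx])   # a negative value closes the run
            start = None
    if start is not None:             # input ended while a run was still open
        result.append(values[start:])
    return result

first = split_on_negatives([1, 5, 8, -3, 6, 8, -3, 2, -4, 6, 8])
second = split_on_negatives([-1, 0, 2, -5, -6, 3])
print(first)
print(second)
```

The second call covers a leading negative, consecutive negatives, a zero, and a trailing run that is not closed by a negative.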
The code below makes use of concepts found in this question:
Python list comprehension- "pop" result from original list?
Applying the concept found there to your problem, here are some alternatives to what others have posted so far; they separate the negative numbers from the rest of the list in place. Both use list comprehensions, and the comments explain the purpose of the second option versus the first. I did this experiment as part of my own learning curve, but I hope it helps you and others on this thread as well.
What's nice about these is that if your input list is very large, you won't have to double your memory use to get the job done: one list is built up as the other shrinks.
This code was tested on Python 2.7 and Python 3.6:
o1 = [1,5,8,-3,6,9,-4,2,-5,6,7,-7, 999, -43, -1, 888]
# modified version of poster's list
o1b = [1,5,8,-3,6,8,-3,2,-4,6,8] # poster's list
o2 = [x for x in (o1.pop() for i in range(len(o1))) \
      if (lambda x: True if x < 0 else o1.insert(0, x))(x)]
o2b = [x for x in (o1b.pop() for i in range(len(o1b))) \
       if (lambda x: True if x < 0 else o1b.insert(0, x))(x)]
print(o1)
print(o2)
print("")
print(o1b)
print(o2b)
It produces result sets like this (on iPython Jupyter Notebooks):
[1, 5, 8, 6, 9, 2, 6, 7, 999, 888]
[-1, -43, -7, -5, -4, -3]
[1, 5, 8, 6, 8, 2, 6, 8]
[-4, -3, -3]
Here is another version that also uses list comprehensions as the workhorse, but wraps the code in functions in a way that is more readable (I think) and easier to test with different numeric lists. Some will probably prefer the original code since it is shorter:
p1 = [1,5,8,-3,6,9,-4,2,-5,6,7,-7, 999, -43, -1, 888]
# modified version of poster's list
p1b = [1,5,8,-3,6,8,-3,2,-4,6,8] # poster's list
def lst_mut_byNeg_mod(x, pLst): # list mutation by neg nums module
    # this function only makes sense in the context of its use in
    # split_pos_negs_in_list()
    if x < 0: return True
    else:
        pLst.insert(0,x)
        return False
def split_pos_negs_in_list(pLst):
    pLngth = len(pLst) # reduces nesting of ((()))
    return [x for x in (pLst.pop() for i in range(pLngth)) \
            if lst_mut_byNeg_mod(x, pLst)]
p2 = split_pos_negs_in_list(p1)
print(p1)
print(p2)
print("")
p2b = split_pos_negs_in_list(p1b)
print(p1b)
print(p2b)
Final Thoughts:
The link provided earlier had a number of ideas in its comment thread:
It recommends a Google search for "python bloom filter library". This sounds promising from a performance standpoint, but I have not yet looked into it.
There is a post on that thread with 554 up-votes, and yet it has at least 4 comments explaining what might be faulty with it. When exploring options, it may be advisable to scan the comment trail and not just review what gets the most votes. There are many options proposed for situations like this.
Just for fun, you can use re too for a one-liner.
import re
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
print map(lambda x: map(int,x.split(",")), re.findall(r"(?<=[,\[])\s*\d+(?:,\s*\d+)*(?=,\s*-\d+|\])", str(l)))
Output: [[1, 5, 8], [6, 8], [2], [6, 8]]

Inserting and removing into/from sorted list in Python

I have a sorted list of integers, L, and I have a value X that I wish to insert into the list such that L's order is maintained. Similarly, I wish to quickly find and remove the first instance of X.
Questions:
How do I use the bisect module to do the first part, if possible?
Is L.remove(X) going to be the most efficient way to do the second part? Does Python detect that the list has been sorted and automatically use a logarithmic removal process?
Example code attempts:
i = bisect_left(L, y)
L.pop(i) #works
del L[bisect_left(L, i)] #doesn't work if I use this instead of pop
You use the bisect.insort() function:
bisect.insort(L, X)
L.remove(X) will scan the whole list until it finds X. Use del L[bisect.bisect_left(L, X)] instead (provided that X is indeed in L).
Note that removing from the middle of a list is still going to incur a cost as the elements from that position onwards all have to be shifted left one step. A binary tree might be a better solution if that is going to be a performance bottleneck.
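A sketch combining both steps, written for Python 3. remove_sorted is a hypothetical helper; it checks that the value is actually present before deleting, since bisect_left returns an insertion point even when the value is missing:

```python
import bisect

def remove_sorted(sorted_list, value):
    """Remove the first occurrence of value from sorted_list; return True if found."""
    i = bisect.bisect_left(sorted_list, value)      # O(log n) search
    if i < len(sorted_list) and sorted_list[i] == value:
        del sorted_list[i]                          # still O(n): later items shift left
        return True
    return False

L = [1, 3, 3, 7, 10]
bisect.insort(L, 5)            # L stays sorted: [1, 3, 3, 5, 7, 10]
found = remove_sorted(L, 3)    # removes the first 3
missing = remove_sorted(L, 4)  # 4 is absent, list unchanged
print(L, found, missing)
```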
You could use Raymond Hettinger's IndexableSkiplist. It performs three operations in O(log n) time:
insert value
remove value
lookup value by rank
import skiplist
import random
random.seed(2013)
N = 10
skip = skiplist.IndexableSkiplist(N)
data = list(range(N))
random.shuffle(data)
for num in data:
    skip.insert(num)
print(list(skip))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for num in data[:N//2]:
    skip.remove(num)
print(list(skip))
# [0, 3, 4, 6, 9]

limit output from a sort method

if my views code is:
arttags = sorted(arttags, key=operator.attrgetter('date_added'), reverse=True)
what is the argument that will limit the result to 50 tags?
I'm assuming this:
.... limit=50)
is incorrect.
more complete code follows:
videoarttags = Media.objects.order_by('date_added').filter(topic__exact='art')
audioarttags = Audio.objects.order_by('date_added').filter(topic__exact='art')
conarttags = Concert.objects.order_by('date_added').filter(topic__exact='art')
arttags = list(chain(videoarttags, audioarttags, conarttags))
arttags = sorted(arttags, key=operator.attrgetter('date_added'), reverse=True)
How do I incorporate this?
itertools.islice(sorted(...), 50)
What about heapq.nlargest? From the documentation:
Return a list with the n largest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in the iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key, reverse=True)[:n]
>>> from heapq import nlargest
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> nlargest(3, data)
[9, 8, 7]
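Applied to the question, the key function can be the same attrgetter that is passed to sorted. A sketch assuming Python 3; the Tag namedtuple and its sample data are hypothetical stand-ins for the model instances:

```python
from collections import namedtuple
from datetime import date
from heapq import nlargest
from operator import attrgetter

# Hypothetical stand-in for the Media/Audio/Concert objects in the question.
Tag = namedtuple("Tag", ["name", "date_added"])

arttags = [
    Tag("video", date(2010, 1, 5)),
    Tag("audio", date(2010, 3, 1)),
    Tag("concert", date(2009, 12, 24)),
    Tag("poster", date(2010, 2, 14)),
]

# Newest two items, newest first; use 50 instead of 2 for the real data.
newest = nlargest(2, arttags, key=attrgetter("date_added"))
print([tag.name for tag in newest])
```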
You'll probably find that a slice works for you:
arttags = sorted(arttags, key=operator.attrgetter('date_added'), reverse=True)[:50]
The general idea of what you want is a take, I believe. From the itertools documentation:
from itertools import islice
def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))
I think I was pretty much barking up the wrong tree. What I was trying to accomplish was actually very simple using a template filter (slice) which I didn't know I could do.
The code was as follows:
{% for arttag in arttags|slice:":50" %}
Yes, I feel pretty stupid, but I'm glad I got it done :-)
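You might also want to add [:50] to each of the objects.order_by(...).filter(...) calls. That way you only ever sort at most 150 items in memory in Python instead of possibly many more. Since you want the newest items, order each query with order_by('-date_added') so the slice keeps the 50 most recent rather than the 50 oldest.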
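A sketch of that idea in plain Python 3, with lists of numbers standing in for the three querysets (the data and the top_n name are hypothetical). It assumes each source is already ordered newest first, as a queryset ordered with order_by('-date_added') would be, so the per-source slices can be merged lazily instead of re-sorted:

```python
from heapq import merge
from itertools import islice

def top_n(n, *sources):
    """Return the n largest values from sources that are each sorted descending."""
    trimmed = (src[:n] for src in sources)         # at most n items per source
    return list(islice(merge(*trimmed, reverse=True), n))

videos = [9, 7, 4, 1]     # each list is sorted newest (largest) first
audio = [8, 6, 5]
concerts = [10, 3, 2]

top_five = top_n(5, videos, audio, concerts)
print(top_five)
```

For model instances, heapq.merge also accepts a key argument, so key=attrgetter('date_added') could be passed alongside reverse=True.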
