I've been playing with BST (binary search tree) and I'm wondering how to do an early exit. Following is the code I've written to find kth smallest. It recursively calls the child node's find_smallest_at_k, stack is just a list passed into the function to add all the elements in inorder. Currently this solution walks all the nodes inorder and then I have to select the kth item from "stack" outside this function.
def find_smallest_at_k(self, k, stack, i):
    if self is None:
        return i
    if (self.left is not None):
        i = self.left.find_smallest_at_k(k, stack, i)
    print(stack, i)
    stack.insert(i, self.data)
    i += 1
    if i == k:
        print(stack[k - 1])
        print "Returning"
    if (self.right is not None):
        i = self.right.find_smallest_at_k(k, stack, i)
    return i
It's called like this,
our_stack = []
self.root.find_smallest_at_k(k, our_stack, 0)
return our_stack[k-1]
I'm not sure if it's possible to exit early from that function. If my k is say 1, I don't really have to walk all the nodes then find the first element. It also doesn't feel right to pass list from outside function - feels like passing pointers to a function in C. Could anyone suggest better alternatives than what I've done so far?
Passing list as arguments: Passing the list as an argument can be good practice if you make your function tail-recursive. Otherwise it's pointless. With a BST, where there are two potential recursive calls to make, that's a bit of a tall ask.
Otherwise you can just return the list. I don't see the need for the variable i. Anyway, if you absolutely need to return multiple values, you can always use tuples, like this: return i, stack and i, stack = root.find_smallest_at_k(k).
Fast-forwarding: For the fast-forwarding, note that the right child of a BST node, and everything below it, is always bigger than the node itself. Thus if you descend the tree always through the right children, you get an increasing sequence of values. Each step to the right leaves at least one smaller value behind, so after k steps to the right at least k smaller values exist, and it's pointless to go right k times or more.
Even if you go left at times in the middle of your descent, the same holds for the total number of steps to the right. The BST property ensures that once you go right, ALL subsequent values below in the hierarchy are greater than the parent. Thus going right k times or more in total is useless.
Code: Here is a pseudo-python code quickly made. It's not tested.
def findKSmallest( self, k, rightSteps=0 ):
    if rightSteps >= k: # We went right k times or more
        return []
    leftSmallest = self.left.findKSmallest( k, rightSteps ) if self.left != None else []
    rightSmallest = self.right.findKSmallest( k, rightSteps + 1 ) if self.right != None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
EDIT The other version, following my comment.
def findKSmallest( self, k ):
    if k == 0:
        return []
    leftSmallest = self.left.findKSmallest( k ) if self.left != None else []
    rightSmallest = self.right.findKSmallest( k - 1 ) if self.right != None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
Note that if k==1, this is indeed the search for the smallest element. Any move to the right will immediately return [], which contributes nothing.
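Both versions above still build and sort intermediate lists. As an alternative, here is a minimal sketch of a lazy in-order traversal that stops as soon as the k-th value has been produced. It is not taken from the answers above: the Node class, inorder and kth_smallest are hypothetical names chosen for illustration, and it assumes each node exposes data, left and right.

```python
from itertools import islice


class Node:
    """Hypothetical minimal BST node, used only for this sketch."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def inorder(node):
    """Lazily yield the values of the subtree rooted at node in ascending order."""
    if node is None:
        return
    yield from inorder(node.left)
    yield node.data
    yield from inorder(node.right)


def kth_smallest(root, k):
    """Return the k-th smallest value (1-based), or None if there are fewer than k nodes."""
    # islice skips the first k - 1 values; next() then pulls exactly one more,
    # so the generator is never advanced past the k-th value.
    return next(islice(inorder(root), k - 1, None), None)


root = Node(5, Node(3, Node(2), Node(4)), Node(8, Node(7), Node(9)))
print(kth_smallest(root, 1))  # 2
print(kth_smallest(root, 4))  # 5
```

Because the traversal is a generator, no list has to be passed in from outside, and the caller decides how many values to consume.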
As Lærne said, you have to take care to turn your function into a tail-recursive one; then you may be interested in using a continuation-passing style, so that your function can call either itself or the "escape" function. I wrote a module called tco for optimizing tail calls; see https://github.com/baruchel/tco
Hope it can help.
Here is another approach: it doesn't exit recursion early, instead it prevents additional function calls if not needed, which is essentially what you're trying to achieve.
class Node:
    def __init__(self, v):
        self.v = v
        self.left = None
        self.right = None
def find_smallest_at_k(root, k):
    res = [None]
    count = [k]
    def helper(root):
        if root is None:
            return
        helper(root.left)
        count[0] -= 1
        if count[0] == 0:
            print("found it!")
            res[0] = root
            return
        if count[0] > 0:
            print("visiting right")
            helper(root.right)
    helper(root)
    return res[0].v
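The same idea can also be written without recursion, which makes the early exit a plain return. The following is only a sketch under assumptions: it reuses the attribute names v, left and right from the answer above, while kth_smallest_iterative and the small sample tree are made up for illustration.

```python
class Node:
    """Same shape as the Node class above: a value plus left/right children."""

    def __init__(self, v, left=None, right=None):
        self.v = v
        self.left = left
        self.right = right


def kth_smallest_iterative(root, k):
    """Return the k-th smallest value (1-based), or None if there are fewer than k nodes."""
    stack = []
    node = root
    while stack or node is not None:
        # Descend as far left as possible, remembering the path on the stack.
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        k -= 1
        if k == 0:
            return node.v  # early exit: no later node is visited
        node = node.right
    return None


root = Node(5, Node(3, Node(2), Node(4)), Node(8))
print(kth_smallest_iterative(root, 3))  # 4
```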
If you want to exit as early as possible, then use exit(0) (note that this ends the whole program, not just the recursion).
This will make your task easy!
How can I implement yield from in my recursion? I am trying to understand how to implement it but failing:
import pandas as pd
# some data
init_parent = [1020253]
df = pd.DataFrame({'parent': [1020253, 1020253],
                   'id': [1101941, 1101945]})
# look for parent child
def recur1(df, parents, parentChild=None, step=0):
    if len(parents) != 0:
        yield parents, parentChild
    else:
        parents = df.loc[df['parent'].isin(parents)][['id', 'parent']]
        parentChild = parents['parent'].to_numpy()
        parents = parents['id'].to_numpy()
        yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1)
# exec / only printing results atm
out = recur1(df, init_parent, step=0)
[x for x in out]
I'd say your biggest issue here is that recur1 isn't always guaranteed to return a generator. For example, suppose your stack calls into the else branch three times before calling into the if branch. In this case, the top three frames would be returning a generator received from the lower frame, but the lowest frame would be returning from this:
yield parents, parentChild
So, then, there is a really simple way you can fix this code to ensure that yield from works. Simply transform your return from a tuple to a generator-compatible type by enclosing it in a list:
yield [(parents, parentChild)]
Then, when you call yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1) you'll always be working with something for which yield from makes sense.
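To see how yield from composes across recursive calls in isolation, here is a minimal sketch that replaces the DataFrame with a plain dict. In Python, calling any function whose body contains yield returns a generator, so the recursive call can be delegated to directly. The names descendants and children_of, and the id 42, are made up for illustration and are not part of the question.

```python
def descendants(children_of, parents, step=0):
    """Yield (step, parent, child) for every descendant, one level at a time."""
    next_level = []
    for parent in parents:
        for child in children_of.get(parent, []):
            yield step, parent, child
            next_level.append(child)
    if next_level:  # base case: stop recursing once a level has no children
        yield from descendants(children_of, next_level, step + 1)


# Hypothetical parent -> children mapping; 42 is an invented grandchild id.
children_of = {1020253: [1101941, 1101945], 1101941: [42]}
print(list(descendants(children_of, [1020253])))
# [(0, 1020253, 1101941), (0, 1020253, 1101945), (1, 1101941, 42)]
```

The explicit base case (no children found at the current level) is what ends the recursion; without it the generator would keep delegating to itself.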
I am currently trying to understand recursion with a made-up example. Imagine you have a briefcase which can be opened with a key. The key is in a big box, which can contain other, smaller boxes, and the key might be in any of them.
In my example, boxes are lists. The recursion appears when we find a smaller box: we search it for the key. The problem is that my function only finds the key if it is directly in the box being searched; it can't go back up when a smaller box contains nothing like 'key'.
Unfortunately, I could not work out how to go back up if there is no key in the smaller box. Can you help me solve this puzzle? By the way, have a nice day! Here is the code (the big box is set up so that the key can be found and returned):
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']
def look_for_key(box):
    for item in box:
        if isinstance(item, list) == True:
            look_for_key(item)
        elif item == 'key':
            print('found the key')
            key = item
            return key
print(look_for_key(box))
Iteration
The solution closest to yours that is still readable, as far as I could find, is:
def look_for_key(box):
    for item in box:
        if item == 'key':
            return item
        elif isinstance(item, list) and look_for_key(item) is not None:
            return look_for_key(item)
        else:
            pass
box = [['sock','papers'],['jewelry','key']]
look_for_key(box)
# ==> 'key'
I don't like it because its deduction condition includes a recursive call which is hard to interpret. It does not help to improve interpretability if you assign look_for_key(item) to a variable and check for not None afterwards. It is just similarly difficult to interpret. An equivalent but more interpretable solution is:
def look_for_key(box):
    def inner(item, remained):
        if item == [] and remained == []:
            return None
        elif isinstance(item, list) and item != []:
            return inner(item[0], [item[1:], remained])
        elif item == [] or item != 'key':
            return inner(remained[0], remained[1:])
        elif item == 'key':
            return item
    return inner(box[0], box[1:])
box = [['sock','papers'],['jewelry','key']]
look_for_key(box)
# ==> 'key'
It explicitly splits the tree to branches (see below what this means) with return inner(item[0], [item[1:], remained]) and return inner(remained[0], remained[1:]) instead of intrinsically reusing the recursive call conditionally during deduction - if look_for_key(item) is not None: return look_for_key(item) - with this line of code it is hard to see a diagram and understand in which direction the recursion goes.
The 2nd solution also makes it easier to infer the complexity using a tree diagram since you see the branches explicitly, for example remained[0] vs. remained[1:].
As inner is simply an iteration written in a functional way and for loop is a syntactic sugar to form iteration, both solutions should have similar complexity in principle.
Since you do not just want a solution but also a better understanding of recursion, I would try the following approach.
Mapping over Trees (Map-Reduce)
This is a typical textbook tree recursion question. What you want is to traverse a hierarchical data structure called a tree. A typical solution is mapping a function over the tree:
from functools import reduce
def look_for_key(tree):
    def look_inner(sub_tree):
        if isinstance(sub_tree, list):
            return look_for_key(sub_tree)
        elif sub_tree == 'key':
            return [sub_tree]
        else:
            return []
    return reduce(lambda left_branch, right_branch: look_inner(left_branch) + look_inner(right_branch), tree, [])
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']
look_for_key(box)
# ==> ['key']
To make it explicit I use tree, sub_tree, left_branch, right_branch as variable names instead of box, inner_box and so on as in your example. Notice how the function look_for_key is mapped over each left_branch and right_branch of the sub_trees in the tree. The result is then summarized using reduce (A classic map-reduce procedure).
To be more clear, you can omit the reduce part and keep only the map part:
def look_for_key(tree):
    def look_inner(sub_tree):
        if isinstance(sub_tree, list):
            return look_for_key(sub_tree)
        elif sub_tree == 'key':
            return sub_tree
        else:
            return None
    return list(map(look_inner, tree))
look_for_key(box)
# ==> [None, None, [None, None, 'key'], None, None, None]
This does not generate your intended format of the result. But it helps to understand how the recursion works. map just adds an abstract layer to recursively look for keys into sub trees which is equivalent to the syntactic sugar of for loop provided by python. That is not important. The essential thing is decomposing the tree properly (deduction) and set-up proper base condition to return the result.
Native Tree Recursion
If it is still not clear enough, you can get rid of all abstractions and syntactic sugars and just build a native recursion from scratch:
def look_for_key(box):
    if box == []:
        return []
    elif not isinstance(box, list) and box == 'key':
        print('found the key')
        return [box]
    elif not isinstance(box, list) and box != 'key':
        return []
    else:
        return look_for_key(box[0]) + look_for_key(box[1:])
look_for_key(box)
# ==> found the key
# ==> ['key']
Here all three fundamental elements of recursion:
base cases
deduction
recursive calls
are explicitly displayed. You can also see from this example clearly that there is no miracle of going out of an inner box (or sub-tree). To look into every possible corner inside the box (or tree), you just repeatedly split it to two parts in every smaller box (or sub tree). Then you properly combine your results at each level (so called fold or reduce or accumulate), here using +, then recursive calls will take care of it and help to return to the top level.
Both the native recursion and map-reduce approaches are able to find out multiple keys, because they traverse over the whole tree and accumulate all matches:
box = ['a','key','c', ['e', ['f','key']]]
look_for_key(box)
# ==> found the key
# ==> found the key
# ==> ['key', 'key']
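If only the first match is needed, the same traversal can be made lazy, so that the search stops as soon as a key turns up. This is a sketch rather than part of the answer above: iter_key_paths and its target and path parameters are hypothetical, and it reports the index path to each match instead of the matched value.

```python
def iter_key_paths(box, target='key', path=()):
    """Lazily yield the index path of every occurrence of target in a nested list."""
    for index, item in enumerate(box):
        if isinstance(item, list):
            # Delegate to the smaller box; its results flow back up automatically.
            yield from iter_key_paths(item, target, path + (index,))
        elif item == target:
            yield path + (index,)


box = ['a', 'key', 'c', ['e', ['f', 'key']]]
print(next(iter_key_paths(box), None))  # (1,)
print(list(iter_key_paths(box)))        # [(1,), (3, 1, 1)]
```

next(..., None) consumes a single result and returns None when there is no key at all, while list(...) walks the whole structure as the accumulating versions do.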
Recursion Visualization
Finally, to fully understand what is going on with the tree recursion, you could plot the recursive depth and visualize how the calls are moving to deeper levels and then returned.
import functools
import matplotlib.pyplot as plt
# ignore the error of unhashable data type
def ignore_unhashable(func):
    uncached = func.__wrapped__
    attributes = functools.WRAPPER_ASSIGNMENTS + ('cache_info', 'cache_clear')
    @functools.wraps(func, assigned=attributes)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TypeError as error:
            if 'unhashable type' in str(error):
                return uncached(*args, **kwargs)
            raise
    wrapper.__uncached__ = uncached
    return wrapper
# rewrite the native recursion and cache the recursive calls
@ignore_unhashable
@functools.lru_cache(None)
def look_for_key(box):
    global depth, depths
    depth += 1
    depths.append(depth)
    result = ([] if box == [] else
              [box] if not isinstance(box, list) and box == 'key' else
              [] if not isinstance(box, list) and box != 'key' else
              look_for_key(box[0]) + look_for_key(box[1:]))
    depth -= 1
    return result
# function to plot recursion depth
def plot_depths(f, *args, show=slice(None), **kwargs):
    """Plot the call depths for a cached recursive function"""
    global depth, depths
    depth, depths = 0, []
    f.cache_clear()
    f(*args, **kwargs)
    plt.figure(figsize=(12, 6))
    plt.xlabel('Recursive Call Number'); plt.ylabel('Recursion Depth')
    X, Y = range(1, len(depths) + 1), depths
    plt.plot(X[show], Y[show], '.-')
    plt.grid(True); plt.gca().invert_yaxis()
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs']
plot_depths(look_for_key, box)
Whenever the function is called recursively, the curve goes to a deeper level - the downward slash. When the tree/sub-tree is split into left and right branches, two calls happen at the same level - the horizontal line that connects two dots (the two calls look_for_key(box[0]) + look_for_key(box[1:])). When it has traversed a complete sub-tree (or branch) and reaches the last leaf in that sub-tree (a base condition where a value or [] is returned), it starts to go back to the upper levels - the valley in the curve. If you have multiple sub-lists or nested lists there will be multiple valleys. Eventually it traverses the whole tree and returns the results.
You can play with boxes (or trees) of different nest structures to understand better how it works. Hopefully these provide you enough information and a more comprehensive understanding of tree-recursion.
Integrating the above comments:
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']
def look_for_key(box):
    for item in box:
        if isinstance(item, list) == True:
            in_box = look_for_key(item)
            if in_box is not None:
                return in_box
        elif item == 'key':
            print('found the key')
            return item
    # not found
    return None
print(look_for_key(box))
which prints:
found the key
key
If the key is deleted from the box, executing the code prints:
None
I'm trying to write a function in Python, and I don't want to change the BST class at all to do this. The function should find the sum of the path from the root to the node with the highest depth. If there are multiple nodes at the same depth, I'm looking for the maximum sum among them and want to return it.
What I got so far (Using a basic BST Class)
class BTNode(object):
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
After trying for a while to come up with an algorithm to solve this, I had a couple of ideas, but they failed and I didn't know how to code them.
ALGORITHM:
I think this would work:
We start from the root. (The base case I think should be whenever we hit a leaf as in when there is no child in a node so no left or right child, and also when there is a left no right, or when there is a right no left)
I check the left subtree first and get the depth of it, we'll call it depth_L as well with the sum. Then I check the right subtree and we will call it depth_R then get the depth of it and its sum.
The second condition would be to check if they are equal, if they are equal then its easy and we just take the max sum of either two depths. Else we would see who has the highest depth and try to get the sum of it.
Now, there are a couple of things I don't know how to do.
1: I never learned optional parameters, so I'm trying to avoid them in this exercise, but I don't think I can. I'd really appreciate it if someone could show me some cool helper functions instead.
2: It's not the total sum of the right side or the left side that I need, it's the sum along the path. It's kind of confusing to think of a way to get just the path.
(Here's my renewed attempt using the algorithm above):
def deepest_sum(self, bsum = 0, depth = 0):
    # The root is in every path so
    bsum = bsum + self.data
    depth = depth + 1
    # Base case whenever we find a leaf
    if self.left == None and self.right == None:
        result = bsum,depth
    elif self.left is not None and self.right is None:
        pass
    elif self.left is None and self.right is not None:
        pass
    else:
        # Getting the left and right subtree's depth as well as their
        # sums, but once we hit a leaf it will stop
        # These optional parameters is messing me up
        if self.left:
            (sums1, depth_L) = self.left.deepest_sum(bsum,depth)
        if self.right:
            (sums2, depth_R) = self.right.deepest_sum(bsum,depth)
        # The parameter to check if they are equal, the highest goes through
        if depth_L == depth_R:
            result = max(sums1, sums2), depth_R
        else:
            if depth_L > depth_R:
                result = sums1, depth_L
            else:
                result = sums2, depth_R
    return result
I'm stuck on the parts I mentioned. Here's an example:
>>> BST(8, BST(7, BST(10), BST(11)), BST(6, BST(11), BST(9, None, BST(14))))
37 (depth 3 is the highest so 8 + 6 + 9 + 14 is the path)
Sorry, I wrote BST out of habit; it's a binary tree, not a BST.
I know mine gives a tuple but I can always make a helper function to fix that, I just thought it'd be easier keeping track of the nodes.
You could simplify the implementation quite a bit if the function doesn't need to be a method of BTNode. Then you could keep track of the depth & sum, iterate past the leaf and return the current depth and sum. Additionally if you return (depth, sum) tuples you can compare them directly against each other with max:
class BTNode(object):
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
def deepest_sum(node, depth=0, current=0):
    # Base case
    if not node:
        return (depth, current)
    depth += 1
    current += node.data
    return max(deepest_sum(node.left, depth, current),
               deepest_sum(node.right, depth, current))
tree = BTNode(8, BTNode(7, BTNode(10), BTNode(11)), BTNode(6, BTNode(11), BTNode(9, None, BTNode(14))))
print(deepest_sum(tree))
Output:
(4, 37)
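Since the question asked for a version without optional parameters, here is one possible sketch that hides the bookkeeping in a nested helper and returns only the sum. The helper name walk is made up for illustration; the tuple comparison trick is the same as in the code above.

```python
class BTNode(object):
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def deepest_sum(root):
    """Return the largest root-to-node path sum among the deepest nodes (0 for an empty tree)."""

    def walk(node, depth, total):
        # Past a leaf (or a missing child): report what was accumulated so far.
        if node is None:
            return (depth, total)
        # Tuples compare element by element: depth first, then sum.
        return max(walk(node.left, depth + 1, total + node.data),
                   walk(node.right, depth + 1, total + node.data))

    return walk(root, 0, 0)[1]


tree = BTNode(8, BTNode(7, BTNode(10), BTNode(11)),
              BTNode(6, BTNode(11), BTNode(9, None, BTNode(14))))
print(deepest_sum(tree))  # 37
```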
I am having trouble finding a node in a tree with an arbitrary branching factor. Each Node carries data and has zero or more children. The search method is inside the Node class; it checks whether that Node carries the data and then checks all of that Node's children. I keep ending up with infinite loops in my recursive method, any help?
def find(self, x):
    _level = [self]
    _nextlevel = []
    if _level == []:
        return None
    else:
        for node in _level:
            if node.data is x:
                return node
            _nextlevel += node.children
        _level = _nextlevel
        return self.find(x) + _level
The find method is in the Node class and checks if data x is in the node the method is called from, then checks all of that nodes children. I keep getting an infinite loop, really stuck at this point any insight would be appreciated.
There are a few issues with this code. First, note that on line 2 you have _level = [self]. That means the if _level == [] check that follows will always be false.
The 2nd issue is that your for loop goes over everything in _level, but, as noted above, that will always be [self] due to line 2.
The 3rd issue is the return statement. You have return self.find(x) + _level. That gets evaluated in 2 parts. First, call self.find(x), then concatenate what that returns with the contents of _level. But, when you call self.find(x) that will call the same method with the same arguments and that, in turn, will then hit the same return self.find(x) + _level line, which will call the same method again, and on and on forever.
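One way to avoid the endless self-call is to recurse into each child rather than back into self. The following is a minimal sketch under assumptions: the Node class here is hypothetical (the question does not show its constructor), children is assumed to be a list of Node objects, and == is used instead of is so that equal values match even when they are not the same object.

```python
class Node:
    """Hypothetical node with a payload and a list of child nodes."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = children if children is not None else []

    def find(self, x):
        """Return the first node (depth-first, pre-order) whose data equals x, else None."""
        if self.data == x:
            return self
        for child in self.children:
            found = child.find(x)  # each call works on a different, smaller subtree
            if found is not None:
                return found
        return None


tree = Node('a', [Node('b', [Node('d')]), Node('c')])
print(tree.find('d').data)  # d
print(tree.find('z'))       # None
```

Every recursive call receives a strictly smaller subtree, so the recursion ends at the leaves.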
A simple pattern for recursive searches is to use a generator. That makes it easy to pass up the answers to calling code without managing the state of the recursion yourself.
class Example(object):
    def __init__(self, datum, *children):
        self.Children = list(children) # < assumed to be of the same or duck-similar class
        self.Datum = datum
    def GetChildren(self):
        for item in self.Children:
            for subitem in item.GetChildren():
                yield subitem
            yield item
    def FindInChildren(self, query): # where query is an expression that is true for desired data
        for item in self.GetChildren():
            if query(item):
                yield item
In a tree structure, I'm trying to find all leafs of a branch. Here is what I wrote:
def leafs_of_branch(node,heads=[]):
    if len(node.children()) == 0:
        heads.append(str(node))
    else:
        for des in node.children():
            leafs_of_branch(des)
    return heads
leafs_of_branch(node)
I don't know why but it feels wrong for me. It works but I want to know if there is a better way to use recursion without creating the heads parameter.
This
def leafs_of_branch(node,heads=[]):
is always a bad idea. Better would be
def leafs_of_branch(node,heads=None):
    if heads is None:
        heads = []
as otherwise every call to leafs_of_branch that omits heads shares the same list. In your specific case it might be OK, but sooner or later you will run into problems.
I recommend:
def leafs_of_branch(node):
    leafs = []
    for des in node.children():
        leafs.extend(leafs_of_branch(des))
    if len(leafs)==0:
        leafs.append(str(node))
    return leafs
leafs_of_branch(node)
Instead of doing an if len(node.children()) == 0, I check len(leafs) after descending into all (possibly zero) children. Thus I call node.children() only once.
I believe this should work:
def leafs_of_branch(node):
    if len(node.children()) == 0:
        return [str(node)]
    else:
        x = []
        for des in node.children():
            x += leafs_of_branch(des) #x.extend(leafs_of_branch(des)) would work too :-)
        return x
It's not very pretty and could probably be condensed a bit more, but I was trying to keep the form of your original code as much as possible to make it obvious what was going on.
Your original version won't actually work if you call it more than once because as you append to the heads list, that list will actually be saved between calls.
As far as recursion goes, you are doing it right IMO; you are missing the heads parameter on the recursive call, though. The reason it works anyway is what other people said: default parameter values are created once and reused between calls.
If you want to avoid recursion altogether, in this case you can use either a queue or a stack and a loop:
def leafs_of_branch(node):
    traverse = [node]
    leafs = []
    while traverse:
        node = traverse.pop()
        children = node.children()
        if children:
            traverse.extend(children)
        else:
            leafs.append(str(node))
    return leafs
You may also define recursively an iterator this way.
def leafs_of_branch(node):
    if len(node.children()) == 0:
        yield str(node)
    else:
        for des in node.children():
            for leaf in leafs_of_branch(des):
                yield leaf
leafs = list(leafs_of_branch(node))
First of all, refrain from using mutable objects (lists, dicts etc.) as default values, since default values are created once, when the function is defined, and reused between function calls:
def bad_func(val, dest=[]):
    dest.append(val)
    print(dest)
>>> bad_func(1)
[1]
>>> bad_func(2)
[1, 2] # surprise!
So, subsequent calls will do something completely unexpected.
As for the recursion question, I'd re-write it like this:
from itertools import chain
def leafs_of_branch(node):
    children = node.children()
    if not children: # better than len(children) == 0
        return (node, )
    all_leafs = (leafs_of_branch(child) for child in children)
    return chain(*all_leafs)
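To try this out end to end, here is a small self-contained sketch. The Node class is hypothetical (the question never shows its tree class), and it swaps chain(*...) for itertools.chain.from_iterable, which consumes the generator of sub-results lazily instead of unpacking it into arguments first. It also returns str(node), as in the question, rather than the node itself.

```python
from itertools import chain


class Node:
    """Hypothetical tree node exposing children() and __str__, as the question assumes."""

    def __init__(self, name, *kids):
        self.name = name
        self._kids = list(kids)

    def children(self):
        return self._kids

    def __str__(self):
        return self.name


def leafs_of_branch(node):
    """Return an iterator over the string form of every leaf below node."""
    children = node.children()
    if not children:
        return iter((str(node),))
    # from_iterable flattens the per-child iterators one after another, lazily.
    return chain.from_iterable(leafs_of_branch(child) for child in children)


tree = Node('root', Node('a', Node('a1'), Node('a2')), Node('b'))
print(list(leafs_of_branch(tree)))  # ['a1', 'a2', 'b']
```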