I am currently trying to understand recursion with a made-up example. Imagine you have a briefcase that can be opened with a key. The key is in a big box, which can contain other, smaller boxes, and the key might be in any of them.
In my example the boxes are lists. The recursion happens when we find a smaller box: we search it for the key. The problem is that my function can find the key if it is actually in the box, but it can't go back if there is nothing like 'key'.
Unfortunately, I could not work out how to go back if there is no key in the smaller box. Can you help me solve this puzzle? By the way, have a nice day! Here is the code (the big box is set up so that the key can be found and returned):
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']

def look_for_key(box):
    for item in box:
        if isinstance(item, list) == True:
            look_for_key(item)
        elif item == 'key':
            print('found the key')
            key = item
            return key

print(look_for_key(box))
Iteration
The solution closest to yours that is still readable, as far as I could find, is:
def look_for_key(box):
    for item in box:
        if item == 'key':
            return item
        elif isinstance(item, list) and look_for_key(item) is not None:
            return look_for_key(item)
        else:
            pass

box = [['sock','papers'],['jewelry','key']]
look_for_key(box)
# ==> 'key'
I don't like it because its deduction condition includes a recursive call, which is hard to interpret. Assigning look_for_key(item) to a variable and checking it for not None afterwards does not improve interpretability; it is just as difficult to read. An equivalent but more interpretable solution is:
def look_for_key(box):
    def inner(item, remained):
        if item == [] and remained == []:
            return None
        elif isinstance(item, list) and item != []:
            return inner(item[0], [item[1:], remained])
        elif item == [] or item != 'key':
            return inner(remained[0], remained[1:])
        elif item == 'key':
            return item
    return inner(box[0], box[1:])

box = [['sock','papers'],['jewelry','key']]
look_for_key(box)
# ==> 'key'
It explicitly splits the tree into branches (see below for what this means) with return inner(item[0], [item[1:], remained]) and return inner(remained[0], remained[1:]), instead of implicitly reusing the recursive call inside the condition (if look_for_key(item) is not None: return look_for_key(item)). With that line of code it is hard to picture a diagram and to see in which direction the recursion goes.
The second solution also makes it easier to infer the complexity from a tree diagram, since you see the branches explicitly, for example remained[0] vs. remained[1:].
As inner is simply an iteration written in a functional way, and a for loop is syntactic sugar for iteration, both solutions should have similar complexity in principle.
Since you do not just want a solution but also a better understanding of recursion, I would try the following approach.
Mapping over Trees (Map-Reduce)
This is a typical textbook tree recursion question. What you want is to traverse a hierarchical data structure called a tree. A typical solution is mapping a function over the tree:
from functools import reduce

def look_for_key(tree):
    def look_inner(sub_tree):
        if isinstance(sub_tree, list):
            return look_for_key(sub_tree)
        elif sub_tree == 'key':
            return [sub_tree]
        else:
            return []
    return reduce(lambda left_branch, right_branch: look_inner(left_branch) + look_inner(right_branch), tree, [])

box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']
look_for_key(box)
# ==> ['key']
To make it explicit, I use tree, sub_tree, left_branch, right_branch as variable names instead of box, inner_box and so on as in your example. Notice how the function look_for_key is mapped over each left_branch and right_branch of the sub_trees in the tree. The result is then summarized using reduce (a classic map-reduce procedure).
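The map step and the reduce step can also be written as two separate lines, which may make the procedure easier to see. This is only a sketch of the same idea: using operator.add as the combining function and the intermediate name mapped are my own choices for illustration, not part of the code above.

```python
from functools import reduce
from operator import add


def look_for_key(tree):
    def look_inner(sub_tree):
        # Map step for one element: a sub-list is searched recursively,
        # a leaf becomes a (possibly empty) list of matches.
        if isinstance(sub_tree, list):
            return look_for_key(sub_tree)
        return [sub_tree] if sub_tree == 'key' else []

    mapped = map(look_inner, tree)   # map: one result list per element
    return reduce(add, mapped, [])   # reduce: concatenate the result lists


box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes']
print(look_for_key(box))
```

Here each element is visited exactly once, and the accumulator is only ever concatenated, never searched again.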
To make this clearer, you can omit the reduce part and keep only the map part:
def look_for_key(tree):
    def look_inner(sub_tree):
        if isinstance(sub_tree, list):
            return look_for_key(sub_tree)
        elif sub_tree == 'key':
            return sub_tree
        else:
            return None
    return list(map(look_inner, tree))

look_for_key(box)
# ==> [None, None, [None, None, 'key'], None, None, None]
This does not produce the result in your intended format, but it helps to understand how the recursion works. map just adds a layer of abstraction for recursively looking for keys in sub-trees, which is equivalent to the syntactic sugar of the for loop provided by Python. That is not important. The essential thing is decomposing the tree properly (deduction) and setting up a proper base condition to return the result.
Native Tree Recursion
If it is still not clear enough, you can get rid of all abstractions and syntactic sugar and just build a native recursion from scratch:
def look_for_key(box):
    if box == []:
        return []
    elif not isinstance(box, list) and box == 'key':
        print('found the key')
        return [box]
    elif not isinstance(box, list) and box != 'key':
        return []
    else:
        return look_for_key(box[0]) + look_for_key(box[1:])

look_for_key(box)
# ==> found the key
# ==> ['key']
Here all three fundamental elements of recursion:
base cases
deduction
recursive calls
are explicitly displayed. You can also see clearly from this example that there is no miracle in getting out of an inner box (or sub-tree). To look into every possible corner inside the box (or tree), you just repeatedly split it into two parts in every smaller box (or sub-tree). Then you properly combine your results at each level (the so-called fold, reduce or accumulate), here using +, and the recursive calls take care of the rest and return the result to the top level.
Both the native recursion and the map-reduce approach are able to find multiple keys, because they traverse the whole tree and accumulate all matches:
box = ['a','key','c', ['e', ['f','key']]]
look_for_key(box)
# ==> found the key
# ==> found the key
# ==> ['key', 'key']
Recursion Visualization
Finally, to fully understand what is going on with the tree recursion, you can plot the recursion depth and visualize how the calls move to deeper levels and then return.
import functools
import matplotlib.pyplot as plt

# ignore the error of unhashable data type
def ignore_unhashable(func):
    uncached = func.__wrapped__
    attributes = functools.WRAPPER_ASSIGNMENTS + ('cache_info', 'cache_clear')
    @functools.wraps(func, assigned=attributes)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TypeError as error:
            # lists cannot be hashed by lru_cache: fall back to the plain function
            if 'unhashable type' in str(error):
                return uncached(*args, **kwargs)
            raise
    wrapper.__uncached__ = uncached
    return wrapper

# rewrite the native recursion and cache the recursive calls
@ignore_unhashable
@functools.lru_cache(None)
def look_for_key(box):
    global depth, depths
    depth += 1
    depths.append(depth)
    result = ([] if box == [] else
              [box] if not isinstance(box, list) and box == 'key' else
              [] if not isinstance(box, list) and box != 'key' else
              look_for_key(box[0]) + look_for_key(box[1:]))
    depth -= 1
    return result

# function to plot recursion depth
def plot_depths(f, *args, show=slice(None), **kwargs):
    """Plot the call depths for a cached recursive function"""
    global depth, depths
    depth, depths = 0, []
    f.cache_clear()
    f(*args, **kwargs)
    plt.figure(figsize=(12, 6))
    plt.xlabel('Recursive Call Number'); plt.ylabel('Recursion Depth')
    X, Y = range(1, len(depths) + 1), depths
    plt.plot(X[show], Y[show], '.-')
    plt.grid(True); plt.gca().invert_yaxis()
    plt.show()

box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs']
plot_depths(look_for_key, box)
Whenever the function is called recursively, the curve goes to a deeper level (the downward slope). When the tree or sub-tree is split into left and right branches, two calls happen at the same level: the horizontal line that connects two dots (the two calls look_for_key(box[0]) + look_for_key(box[1:])). When it has traversed a complete sub-tree (or branch) and reaches the last leaf in that sub-tree (a base condition, where a value or [] is returned), it starts to go back to the upper levels (the valley in the curve). If you have multiple nested lists there will be multiple valleys. Eventually it traverses the whole tree and returns the results.
You can play with boxes (or trees) of different nesting structures to understand better how it works. Hopefully this gives you enough information and a more comprehensive understanding of tree recursion.
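If matplotlib is not at hand, the same depth information can be inspected as plain text. The following is a rough sketch, not the plotting code above: the function name trace_look_for_key and its depth and trace parameters are hypothetical, introduced only to record each call together with its recursion depth.

```python
def trace_look_for_key(box, depth=0, trace=None):
    """Same split as the native recursion, but records (depth, argument)."""
    if trace is None:
        trace = []
    trace.append((depth, box))
    if box == []:
        found = []
    elif not isinstance(box, list):
        found = [box] if box == 'key' else []
    else:
        # split into the first element and the rest, one level deeper
        found = (trace_look_for_key(box[0], depth + 1, trace)[0]
                 + trace_look_for_key(box[1:], depth + 1, trace)[0])
    return found, trace


found, trace = trace_look_for_key(['a', ['key'], 'b'])
for depth, arg in trace:
    # indentation grows with the recursion depth
    print('  ' * depth + repr(arg))
```

Each printed line is one call; lines with the same indentation are the two halves of one split.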
Integrating the above comments:
box = ['socks', 'papers', ['jewelry', 'flashlight', 'key'], 'dishes', 'souvernirs', 'posters']

def look_for_key(box):
    for item in box:
        if isinstance(item, list) == True:
            in_box = look_for_key(item)
            if in_box is not None:
                return in_box
        elif item == 'key':
            print('found the key')
            return item
    # not found
    return None

print(look_for_key(box))
which prints:
found the key
key
If the key is deleted from the box, executing the code prints:
None
Related
How can I implement yield from in my recursion? I am trying to understand how to implement it but failing:
import pandas as pd

# some data
init_parent = [1020253]
df = pd.DataFrame({'parent': [1020253, 1020253],
                   'id': [1101941, 1101945]})

# look for parent child
def recur1(df, parents, parentChild=None, step=0):
    if len(parents) != 0:
        yield parents, parentChild
    else:
        parents = df.loc[df['parent'].isin(parents)][['id', 'parent']]
        parentChild = parents['parent'].to_numpy()
        parents = parents['id'].to_numpy()
        yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1)

# exec / only printing results atm
out = recur1(df, init_parent, step=0)
[x for x in out]
I'd say your biggest issue here is that recur1 isn't always guaranteed to return a generator. For example, suppose your stack calls into the else branch three times before calling into the if branch. In this case, the top three frames would be returning a generator received from the lower frame, but the lowest frame would be returning from this:
yield parents, parentChild
So, then, there is a really simple way you can fix this code to ensure that yield from works. Simply transform your return from a tuple to a generator-compatible type by enclosing it in a list:
yield [(parents, parentChild)]
Then, when you call yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1) you'll always be working with something for which yield from makes sense.
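For reference, here is a small sketch of recursive delegation with yield from that does not depend on pandas. It is not the code from the question: the function name descendants, the dict children_of and the extra id 1200001 are made up for illustration, and the data is assumed to contain no cycles.

```python
def descendants(children_of, parents, step=0):
    """Yield (step, parents) for each generation until none are left."""
    if not parents:                      # base case: nothing left to expand
        return
    yield step, parents
    next_parents = [child
                    for parent in parents
                    for child in children_of.get(parent, [])]
    # Delegate to the recursive generator; its items are re-yielded here.
    yield from descendants(children_of, next_parents, step + 1)


# Hypothetical data: parent id -> list of child ids.
children_of = {1020253: [1101941, 1101945], 1101941: [1200001]}
print(list(descendants(children_of, [1020253])))
```

The caller only ever sees one flat stream of (step, parents) pairs, no matter how deep the recursion goes.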
I have a branching nested dictionary to visualize species taxonomy data. I'm trying to write a function that gives me all the branches at a particular level.
I've tried iterative and recursive functions, but I have only gotten close using a recursive function.
However, depending on where I put return/print statements, my function either returns None (but prints the correct information), or returns only one branch of the data.
Using the second option, the output is perfect until the dataset branches.
tree = {"k-b":
            {"p-a":
                {"c-a":{"o-a":{}, "o-b":{}},
                 "c-b":{"o-a":{}}},
             "p-b":
                {"c-a":{"o-a":{},"o-b":{}}}}}

def branches(tree, level):
    if level == 0:
        #print(tree.keys())
        return tree.keys()
    else:
        for i in tree.keys():
            return branches(tree[i], level-1)

print(branches(tree, 2))
For level 2, I expect [['c-a', 'c-b'], ['c-a']] (it doesn't have to be an array of arrays, and I don't care if it has dict_keys() or anything else around it)
I actually get dict_keys(['c-a', 'c-b']), which excludes the second branch
Alternatively, if I remove the 'return' before recursively calling branches, and uncomment the print statement, it prints:
dict_keys(['c-a', 'c-b'])
dict_keys(['c-a'])
Which is exactly the output I want, but the function returns None so I can't store that information for future applications
Your code always returns the first item in the loop, so your algorithm ends prematurely and doesn't explore all the necessary branches. You could yield the results to create a generator function (among other approaches):
tree = {"k-b":
            {"p-a":
                {"c-a":{"o-a":{}, "o-b":{}},
                 "c-b":{"o-a":{}}},
             "p-b":
                {"c-a":{"o-a":{},"o-b":{}}}}}

def branches(tree, level):
    if level == 0:
        yield list(tree.keys())
    elif level > 0:
        for v in tree.values():
            yield from branches(v, level - 1)

for i in range(4):
    print(f"level {i}:", list(branches(tree, i)))
Output:
level 0: [['k-b']]
level 1: [['p-a', 'p-b']]
level 2: [['c-a', 'c-b'], ['c-a']]
level 3: [['o-a', 'o-b'], ['o-a'], ['o-a', 'o-b']]
The line elif level > 0: is an optimization to avoid walking deeper into the tree than necessary.
Also, iterating with for i in tree.keys() and then using tree[i] to access the value could be written more clearly as for v in tree.values().
You might want to return a list of all items at that level:
tree = {"k-b":
            {"p-a":
                {"c-a":{"o-a":{}, "o-b":{}},
                 "c-b":{"o-a":{}}},
             "p-b":
                {"c-a":{"o-a":{},"o-b":{}}}}}

def branches(tree, level):
    if level == 0:
        #print(tree.keys())
        return tree.keys()
    else:
        return [branches(tree[i], level-1) for i in tree.keys()]

print(branches(tree, 2))
Output:
[[dict_keys(['c-a', 'c-b']), dict_keys(['c-a'])]]
It sounds like you want to return a list of all branches. One way to do this is with a list comprehension:
def branches(tree, level):
    if level == 0:
        #print(tree.keys())
        return tree.keys()
    else:
        return [branches(tree[i], level-1) for i in tree.keys()]
Note that this will return a deeply nested list. Flattening is left as an exercise for the reader.
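One possible way to do that flattening is sketched below. It assumes that branches returns plain lists (list(tree.keys()) rather than dict_keys) and that every group of keys sits at the same nesting depth; the helper name flatten_to_groups is hypothetical.

```python
tree = {"k-b":
            {"p-a":
                {"c-a": {"o-a": {}, "o-b": {}},
                 "c-b": {"o-a": {}}},
             "p-b":
                {"c-a": {"o-a": {}, "o-b": {}}}}}


def branches(tree, level):
    if level == 0:
        return list(tree.keys())
    return [branches(subtree, level - 1) for subtree in tree.values()]


def flatten_to_groups(nested):
    """Collect the innermost lists (lists that contain no further lists)."""
    if not any(isinstance(item, list) for item in nested):
        return [nested]
    groups = []
    for item in nested:
        groups.extend(flatten_to_groups(item))
    return groups


print(flatten_to_groups(branches(tree, 2)))
```

The flattening is itself a small tree recursion: a list without sub-lists is the base case, anything else is split into its items and the results are concatenated.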
I've been playing with BST (binary search tree) and I'm wondering how to do an early exit. Following is the code I've written to find kth smallest. It recursively calls the child node's find_smallest_at_k, stack is just a list passed into the function to add all the elements in inorder. Currently this solution walks all the nodes inorder and then I have to select the kth item from "stack" outside this function.
def find_smallest_at_k(self, k, stack, i):
    if self is None:
        return i

    if (self.left is not None):
        i = self.left.find_smallest_at_k(k, stack, i)

    print(stack, i)
    stack.insert(i, self.data)
    i += 1
    if i == k:
        print(stack[k - 1])
        print("Returning")

    if (self.right is not None):
        i = self.right.find_smallest_at_k(k, stack, i)

    return i
It's called like this,
our_stack = []
self.root.find_smallest_at_k(k, our_stack, 0)
return our_stack[k-1]
I'm not sure if it's possible to exit early from that function. If my k is say 1, I don't really have to walk all the nodes then find the first element. It also doesn't feel right to pass list from outside function - feels like passing pointers to a function in C. Could anyone suggest better alternatives than what I've done so far?
Passing list as arguments: Passing the list as an argument can be good practice if you make your function tail-recursive. Otherwise it's pointless. With a BST, where there are two potential recursive calls to make, that's a bit of a tall order.
Otherwise you can just return the list. I don't see the need for the variable i. Anyway, if you absolutely need to return multiple values, you can always use tuples, like this: return i, stack and i, stack = root.find_smallest_at_k(k).
Fast-forwarding: For the fast-forwarding, note that the right nodes of a BST parent node are always bigger than the parent. Thus if you always descend the tree through the right children, you'll end up with an increasing sequence of values. Thus the first k values of that sequence are necessarily its smallest, so it's pointless to go right k times or more in a row.
Even if you go left at times in the middle of your descent, it's pointless to go right more than k times. The BST property ensures that if you go right, ALL subsequent numbers below in the hierarchy will be greater than the parent. Thus going right k times or more is useless.
Code: Here is some quickly written pseudo-Python code. It's not tested.
def findKSmallest( self, k, rightSteps=0 ):
    if rightSteps >= k: #We went right more than k times
        return []
    leftSmallest = self.left.findKSmallest( k, rightSteps ) if self.left != None else []
    rightSmallest = self.right.findKSmallest( k, rightSteps + 1 ) if self.right != None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
EDIT The other version, following my comment.
def findKSmallest( self, k ):
    if k == 0:
        return []
    leftSmallest = self.left.findKSmallest( k ) if self.left != None else []
    rightSmallest = self.right.findKSmallest( k - 1 ) if self.right != None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
Note that if k==1, this is indeed the search for the smallest element. Any move to the right will immediately return [], which contributes nothing.
As Lærne said, you have to take care to turn your function into a tail-recursive one; then you may be interested in using continuation-passing style. That way your function could call either itself or the "escape" function. I wrote a module called tco for optimizing tail calls; see https://github.com/baruchel/tco
Hope it can help.
Here is another approach: it doesn't exit recursion early, instead it prevents additional function calls if not needed, which is essentially what you're trying to achieve.
class Node:
    def __init__(self, v):
        self.v = v
        self.left = None
        self.right = None

def find_smallest_at_k(root, k):
    res = [None]
    count = [k]

    def helper(root):
        if root is None:
            return
        helper(root.left)
        count[0] -= 1
        if count[0] == 0:
            print("found it!")
            res[0] = root
            return
        if count[0] > 0:
            print("visiting right")
            helper(root.right)

    helper(root)
    return res[0].v
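A different way to avoid unnecessary calls, swapped in here as an alternative to the counter above, is a lazy in-order generator combined with itertools.islice. This is only a sketch: the Node constructor with left and right arguments and the names in_order and kth_smallest are assumptions for illustration, and k is assumed to be at least 1.

```python
from itertools import islice


class Node:
    def __init__(self, v, left=None, right=None):
        self.v = v
        self.left = left
        self.right = right


def in_order(node):
    """Lazily yield the values of a BST in ascending order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.v
        yield from in_order(node.right)


def kth_smallest(root, k):
    # islice skips k - 1 values and next() takes one more; the generator
    # is never resumed after that, so the rest of the tree is not visited.
    return next(islice(in_order(root), k - 1, None), None)


root = Node(5, Node(3, Node(2), Node(4)), Node(8))
print(kth_smallest(root, 1), kth_smallest(root, 3))
```

If the tree has fewer than k nodes, this sketch returns None instead of raising an error.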
If you want to exit as early as possible, you can use exit(0).
Be aware that this terminates the whole program, not just the recursion.
class EmptyMap():
    """
    EmptyMap has no slots
    """
    __slots__ = ()

class NonEmptyMap():
    """
    Has slots of left, key, value, and right.
    """
    __slots__ = ('left', 'key', 'value', 'right')

def mkEmptyMap():
    """
    Is a function that takes no arguments and returns an instance of EmptyMap
    """
    return EmptyMap()

def mkNonEmptyMap(left, key, value, right):
    """
    Is a function that takes a map, a key, a value, and another map,
    and returns an instance of NonEmptyMap. This function merely initializes the slots;
    it is possible to use this function to create trees that are not binary search trees.
    """
    nonEmptyMap = NonEmptyMap()
    nonEmptyMap.left = left
    nonEmptyMap.key = key
    nonEmptyMap.value = value
    nonEmptyMap.right = right
    return nonEmptyMap
def mapInsert(key, value, node):
    """
    Is a function that takes a key, a value, and a map, and returns an instance
    of NonEmptyMap. Further, the map that is returned is a binary search tree based
    on the keys. The function inserts the key-value pair into the correct position in the
    map. The map returned need not be balanced. Before coding, review the binary
    search tree definition and the structurally recursive design pattern, and determine
    what the function should look like for maps. If the key already exists, the new value
    should replace the old value.
    """
    if isinstance(node, EmptyMap):
        return mkNonEmptyMap(mkEmptyMap(), key, value, mkEmptyMap())
    else:
        if key > node.key:
            node.right = mapInsert(key, value, node.right)
            return node.right
        elif key < node.key:
            node.left = mapInsert(key, value, node.left)
            return node.left
        elif key == node.key:
            node.value = value
            return mapInsert(key, value, node)
        else:
            raise TypeError("Bad Tree Map")
def mapToString(node):
    """
    Is a function that takes a map, and returns a string that represents the
    map. Before coding, review the structurally recursive design pattern, and determine
    how to adapt it for maps. An EmptyMap is represented as ’ ’. For an instance of
    NonEmptyMap, the left sub-tree appears on the left, and the right sub-tree appears
    on the right.
    """
    if isinstance(node, EmptyMap):
        return '_'
    elif isinstance(node, NonEmptyMap):
        return '(' + mapToString(node.left) + ',' + str(node.key) + '->' + str(node.value) + ',' + mapToString(node.right)+ ')'
    else:
        raise TypeError("Not a Binary Tree")

def mapSearch(key, node):
    """
    Is a function that takes a key and a map, and returns the value associated
    with the key or None if the key is not there. Before coding, review the binary search
    tree definition and the structurally recursive design pattern, and determine how it
    should look for maps.
    """
    if isinstance(node, EmptyMap):
        return 'None'
    elif isinstance(node, NonEmptyMap):
        if key == node.key:
            return str(node.value)
        elif key < node.key:
            return mapSearch(key, node.left)
        elif key > node.key:
            return mapSearch(key, node.right)
    else:
        raise TypeError("Not a Binary Tree")
def main():
    smallMap = mapInsert(\
        'one',\
        1,\
        mapInsert(\
            'two',\
            2,\
            mapInsert(\
                'three',\
                3,\
                mkEmptyMap())))
    print(smallMap.key)
    print(smallMap.left.key)
    print(smallMap.right.key)

main()
When I run the program, I get an error and I have no idea what I am doing wrong. I am pretty sure the EmptyMap has an object, which is in the mkNonEmptyMap function. This is my homework problem.
A map is a data structure that associates values with keys. One can search for a particular key to find its associated value. For example, the value 3 could be associated with the key ’three’.
one
Traceback (most recent call last):
File "/Users/USER/Desktop/test.py", line 113, in <module>
main()
File "/Users/USER/Desktop/test.py", line 110, in main
print(smallMap.left.key)
AttributeError: 'EmptyMap' object has no attribute 'key'
If you look at what's in smallMap, its left and right are both EmptyMaps. So of course smallMap.left.key isn't going to work—EmptyMaps don't have keys.
So, why is it wrong? Well, let's break that monster expression down into steps and see where it goes wrong:
>>> empty = mkEmptyMap()
>>> mapToString(empty)
'_'
>>> three = mapInsert('three', 3, mkEmptyMap())
>>> mapToString(three)
'(_,three->3,_)'
>>> two = mapInsert('two', 2, three)
>>> mapToString(two)
'(_,two->2,_)'
There's a problem. The two object has no left or right. What about three?
>>> mapToString(three)
'(_,three->3,(_,two->2,_))'
OK, so we do have a valid balanced tree—but it's not in the two object returned by mapInsert, it's in the three object that you passed in to mapInsert (which your original program isn't even keeping a reference to).
So, why is that happening? Is that valid? It depends on your design. If you want to mutate your arguments like this, it's perfectly reasonable to do so (although I suspect it's not what your teacher actually wanted—anyone who's trying to force you to write ML in Python like this probably wants you to use non-mutating algorithms…). But then you need to always return the root node. Your function is clearly trying to return the newly-created node whether it's the root or not. So, just fix that:
    if key > node.key:
        node.right = mapInsert(key, value, node.right)
        return node  # not node.right
And likewise for the other two cases. (I'm not sure why you were trying to call yourself recursively in the == case in the first place.)
If you do that, the code no longer has an error.
It doesn't seem to be actually balancing the tree correctly, but that's the next problem for you to solve.
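If a non-mutating insert is what the assignment is after, it could look roughly like the sketch below. This is an illustration under my own assumptions, not the code from the question: None stands in for EmptyMap, a namedtuple called Node stands in for NonEmptyMap, and map_insert and map_to_string are hypothetical names.

```python
from collections import namedtuple

# Hypothetical stand-ins: None plays the role of EmptyMap,
# Node plays the role of NonEmptyMap.
Node = namedtuple('Node', ['left', 'key', 'value', 'right'])


def map_insert(key, value, node):
    """Return a new tree containing key -> value; the input is not modified."""
    if node is None:
        return Node(None, key, value, None)
    if key < node.key:
        return node._replace(left=map_insert(key, value, node.left))
    if key > node.key:
        return node._replace(right=map_insert(key, value, node.right))
    return node._replace(value=value)   # same key: replace the value


def map_to_string(node):
    if node is None:
        return '_'
    return '({},{}->{},{})'.format(map_to_string(node.left), node.key,
                                   node.value, map_to_string(node.right))


three = map_insert('three', 3, None)
small = map_insert('one', 1, map_insert('two', 2, three))
print(map_to_string(small))
```

Every call returns the root of the (new) subtree it was given, so the outermost call always returns the root of the whole tree, and three itself stays a single-node tree.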
In a tree structure, I'm trying to find all leafs of a branch. Here is what I wrote:
def leafs_of_branch(node,heads=[]):
    if len(node.children()) == 0:
        heads.append(str(node))
    else:
        for des in node.children():
            leafs_of_branch(des)
    return heads

leafs_of_branch(node)
I don't know why but it feels wrong for me. It works but I want to know if there is a better way to use recursion without creating the heads parameter.
This
def leafs_of_branch(node,heads=[]):
is always a bad idea. Better would be
def leafs_of_branch(node,heads=None):
    heads = heads or []
as otherwise you always use the same list for leafs_of_branch. In your specific case it might be OK, but sooner or later you will run into problems.
I recommend:
def leafs_of_branch(node):
    leafs = []
    for des in node.children():
        leafs.extend(leafs_of_branch(des))
    if len(leafs)==0:
        leafs.append(str(node))
    return leafs

leafs_of_branch(node)
Instead of checking if len(node.children()) == 0, I check len(leafs) after descending into all (possibly zero) children. Thus I call node.children() only once.
I believe this should work:
def leafs_of_branch(node):
    if len(node.children()) == 0:
        return [str(node)]
    else:
        x = []
        for des in node.children():
            x += leafs_of_branch(des)  #x.extend(leafs_of_branch(des)) would work too :-)
        return x
It's not very pretty and could probably be condensed a bit more, but I was trying to keep the form of your original code as much as possible to make it obvious what was going on.
Your original version won't actually work if you call it more than once because as you append to the heads list, that list will actually be saved between calls.
As far as recursion goes, you are doing it right IMO; you are missing the heads parameter on the recursive call, though. The reason it works anyway is what other people said: default parameter values are global and reused between calls.
If you want to avoid recursion altogether, in this case you can use either a queue or a stack and a loop:
def leafs_of_branch(node):
    traverse = [node]
    leafs = []
    while traverse:
        node = traverse.pop()
        children = node.children()
        if children:
            traverse.extend(children)
        else:
            leafs.append(str(node))
    return leafs
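Note that a list used as a stack pops the last child first, so the loop above collects the leaves from right to left. If the left-to-right order of the recursive versions matters, the children can be pushed in reverse. The sketch below assumes a hypothetical Node class whose children() method returns a list; the real node type from the question is not shown.

```python
class Node:
    """Hypothetical node type: children() returns a list of child nodes."""
    def __init__(self, name, *kids):
        self.name = name
        self._kids = list(kids)

    def children(self):
        return self._kids

    def __str__(self):
        return self.name


def leafs_of_branch(node):
    stack = [node]
    leafs = []
    while stack:
        current = stack.pop()
        children = current.children()
        if children:
            # Push in reverse so that the leftmost child is popped first.
            stack.extend(reversed(children))
        else:
            leafs.append(str(current))
    return leafs


tree = Node('root', Node('a', Node('a1'), Node('a2')), Node('b'))
print(leafs_of_branch(tree))
```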
You may also define an iterator recursively this way.
def leafs_of_branch(node):
    if len(node.children()) == 0:
        yield str(node)
    else:
        for des in node.children():
            for leaf in leafs_of_branch(des):
                yield leaf

leafs = list(leafs_of_branch(node))
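On Python 3.3 and later, the inner for loop can be replaced by yield from. A minimal sketch, again using a hypothetical Node class with a children() method in place of the real node type:

```python
class Node:
    """Hypothetical node type: children() returns a list of child nodes."""
    def __init__(self, name, *kids):
        self.name = name
        self._kids = list(kids)

    def children(self):
        return self._kids

    def __str__(self):
        return self.name


def leafs_of_branch(node):
    children = node.children()
    if not children:
        yield str(node)          # a leaf yields itself
    for child in children:
        # delegate to the sub-generator; its leaves are passed straight through
        yield from leafs_of_branch(child)


tree = Node('root', Node('a', Node('a1'), Node('a2')), Node('b'))
print(list(leafs_of_branch(tree)))
```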
First of all, refrain from using mutable objects (lists, dicts etc) as default values, since default values are global and reused between the function calls:
def bad_func(val, dest=[]):
    dest.append(val)
    print(dest)

>>> bad_func(1)
[1]
>>> bad_func(2)
[1, 2]  # surprise!
So, subsequent calls will do something completely unexpected.
As for the recursion question, I'd re-write it like this:
from itertools import chain

def leafs_of_branch(node):
    children = node.children()
    if not children:  # better than len(children) == 0
        return (node, )
    all_leafs = (leafs_of_branch(child) for child in children)
    return chain(*all_leafs)