Finding a node in a tree

I am having trouble finding a node in a tree with an arbitrary branching factor. Each Node carries data and has zero or more children. The search method is inside the Node class; it checks whether that Node carries the data and then checks all of that Node's children. I keep ending up with infinite loops in my recursive method. Any help?
def find(self, x):
    _level = [self]
    _nextlevel = []
    if _level == []:
        return None
    else:
        for node in _level:
            if node.data is x:
                return node
            _nextlevel += node.children
        _level = _nextlevel
        return self.find(x) + _level
The find method is in the Node class. It checks whether data x is in the node the method is called from, then checks all of that node's children. I keep getting an infinite loop and am really stuck at this point. Any insight would be appreciated.

There are a few issues with this code. First, the second line of the method sets _level = [self], so the later check if _level == [] is always false.
The second issue is that the for loop iterates over everything in _level, but, as noted above, that is always just [self].
The third issue is the return statement, return self.find(x) + _level. It is evaluated in two parts: first self.find(x) is called, then whatever it returns is concatenated with the contents of _level. But self.find(x) calls the same method on the same node with the same argument, which reaches the same return self.find(x) + _level line and calls the method again, and so on forever. In practice Python stops this with a RecursionError once the recursion limit is reached.
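One way to act on these three points is to drop the recursion and keep a queue of nodes that still need visiting. This is only a minimal sketch: it assumes each node has data and children attributes as in the question, the Node constructor is hypothetical, and it compares with == rather than is (an assumption about what "carries data x" should mean).

```python
from collections import deque


class Node:
    """Hypothetical minimal node: a payload plus a list of child nodes."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = list(children) if children else []

    def find(self, x):
        """Return the first node (breadth-first) whose data equals x, else None."""
        queue = deque([self])
        while queue:
            node = queue.popleft()
            if node.data == x:  # == compares values; `is` compares identity
                return node
            queue.extend(node.children)  # visit the next level later
        return None


tree = Node("a", [Node("b", [Node("d")]), Node("c")])
print(tree.find("d").data)  # d
print(tree.find("z"))       # None
```

The loop ends either when a match is returned or when the queue runs dry, so it cannot spin forever on a finite tree.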

A simple pattern for recursive searches is to use a generator. That makes it easy to pass up the answers to calling code without managing the state of the recursion yourself.
class Example(object):
    def __init__(self, datum, *children):
        self.Children = list(children)  # assumed to be of the same or duck-similar class
        self.Datum = datum
    def GetChildren(self):
        for item in self.Children:
            for subitem in item.GetChildren():
                yield subitem
            yield item
    def FindInChildren(self, query):  # where query is an expression that is true for desired data
        for item in self.GetChildren():
            if query(item):
                yield item
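For completeness, here is how the generator pattern might be used to get a single match. It is a sketch with a hypothetical TreeNode class and method names (walk, find_first); unlike GetChildren above, walk also yields the node it is called on, so the root itself can be found.

```python
class TreeNode:
    """Hypothetical node class; datum and children mirror the Example class."""

    def __init__(self, datum, *children):
        self.datum = datum
        self.children = list(children)

    def walk(self):
        """Yield this node, then every descendant, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find_first(self, query):
        """Return the first node for which query(node) is true, or None."""
        return next((node for node in self.walk() if query(node)), None)


root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
match = root.find_first(lambda n: n.datum == 4)
print(match.datum)                               # 4
print(root.find_first(lambda n: n.datum == 99))  # None
```

Because the generator is lazy, next() stops the traversal as soon as the first match is produced.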

Related

Python: How can I implement yield in my recursion?

How can I implement yield from in my recursion? I am trying to understand how to implement it but failing:
import pandas as pd
# some data
init_parent = [1020253]
df = pd.DataFrame({'parent': [1020253, 1020253],
                   'id': [1101941, 1101945]})
# look for parent child
def recur1(df, parents, parentChild=None, step=0):
    if len(parents) != 0:
        yield parents, parentChild
    else:
        parents = df.loc[df['parent'].isin(parents)][['id', 'parent']]
        parentChild = parents['parent'].to_numpy()
        parents = parents['id'].to_numpy()
        yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1)
# exec / only printing results atm
out = recur1(df, init_parent, step=0)
[x for x in out]

I'd say your biggest issue here is that recur1 isn't always guaranteed to return a generator. For example, suppose your stack calls into the else branch three times before calling into the if branch. In this case, the top three frames would be returning a generator received from the lower frame, but the lowest frame would be returning from this:
yield parents, parentChild
So, then, there is a really simple way you can fix this code to ensure that yield from works. Simply transform your return value from a tuple to a generator-compatible type by enclosing it in a list:
yield [(parents, parentChild)]
Then, when you call yield from recur1(df=df, parents=parents, parentChild=parentChild, step=step+1) you'll always be working with something for which yield from makes sense.
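The recursion-plus-yield-from pattern the question is reaching for can also be sketched without pandas. A function whose body contains yield returns a generator every time it is called, so yield from can delegate to the recursive call directly. Everything below is an assumption made for illustration: walk_levels and children_of are hypothetical names, the id 1200001 is invented test data, and the parent/child links are assumed to be free of cycles.

```python
def walk_levels(children_of, parents, step=0):
    """Yield (step, parent, child) for each edge, one level at a time.

    children_of is a hypothetical dict mapping a parent id to its child ids.
    """
    next_parents = []
    for parent in parents:
        for child in children_of.get(parent, []):
            yield step, parent, child
            next_parents.append(child)
    if next_parents:  # recurse only while there is another level to visit
        yield from walk_levels(children_of, next_parents, step + 1)


children_of = {1020253: [1101941, 1101945], 1101941: [1200001]}
edges = list(walk_levels(children_of, [1020253]))
print(edges)
# [(0, 1020253, 1101941), (0, 1020253, 1101945), (1, 1101941, 1200001)]
```

The base case is simply "no children were found at this level", at which point the generator finishes without recursing.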

How to use yield in BinarySearchTree?

I am following the BinarySearchTree code in the book Data Structure and Algorithms.
Would you like to read the full code in this link?
And I am not clear how this method works
def __iter__(self):
    if self.left != None:
        for elem in self.left:
            yield elem
        yield self.val
    if self.right != None:
        for elem in self.right:
            yield elem
Is the elem variable an instance of the Node class or is it a float (from the inputs)? In the debugger it appears to be both. I guess the value changes because of the line yield elem, but I do not understand it.
What are the differences between yield elem and yield self.val? How many generator objects are there in this situation?
In addition, could you share some experience in debugging generator functions? I am confused by yield when debugging.

1. elem is a Node instance. From the for loops, we know that elem is always either self.left or self.right. You can see in the example usage that float values are inserted into the binary tree with tree.insert(float(x)) and the BinarySearchTree.insert() method ultimately calls BinarySearchTree.Node(val) where val is float(x) in this case. Therefore self.left and self.right are always Node instances.
Edit: As mentioned by user "don't talk just code" in the comments, elem is a float. I did not see this at first because I assumed that iterating over self.left would produce a list of Node elements. That is not correct: iterating over self.left works by calling self.left.__iter__(). I break this __iter__() function down into three cases, almost like a recursive function. (It does not call itself on the same object; it calls the __iter__() method of different instances of the Node class, but the behavior is similar.)
First, the Node has no left or right children. This is straightforward: the iter will just yield self.val, which is a float.
Second, the Node has left children. In this case, the for loop will traverse down all the left children in an almost recursive fashion until it reaches a Node that has no left children. Then we are back at the first case.
Third, the Node has right children. In this case, after the node's own self.val is yielded, the iterator continues to the first right node, and the process repeats.
There is only one generator function, namely Node.__iter__(), but every call to it creates a new generator object, so a full traversal uses one generator object per node. The function uses multiple yield statements to yield different values depending on the situation: yield self.val yields the current Node's value, while yield elem passes along a value produced by iterating over the left or right child.
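To check both questions empirically (what type elem has, and how many generator objects are involved), here is a small sketch. It assumes a stripped-down Node class standing in for BinarySearchTree.Node; the started counter is a hypothetical addition for the demo and is not part of the book's code.

```python
class Node:
    """Hypothetical stand-in for BinarySearchTree.Node from the question."""

    started = 0  # number of generator objects that have begun running

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

    def __iter__(self):
        Node.started += 1  # runs once per generator, when it is first advanced
        if self.left is not None:
            for elem in self.left:   # elem is a value yielded by the child
                yield elem
        yield self.val
        if self.right is not None:
            for elem in self.right:
                yield elem


root = Node(2.2, Node(1.1), Node(3.3))
values = list(root)
print(values)                               # [1.1, 2.2, 3.3]
print({type(v).__name__ for v in values})   # {'float'}
print(Node.started)                         # 3
```

With three nodes, three generator objects run: one for the root and one for each child.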
I do not have specific tips for debugging yield statements in particular. In general I use IPython for interactive work when building code and use its built-in %debug magic operator. You might also find rubber duck debugging useful.
Using IPython you can run the following in a cell to debug interactively.
In [37]: %%debug
    ...: for x in tree.root:
    ...:     print(x)
    ...:
NOTE: Enter 'c' at the ipdb> prompt to continue execution.
You can then use the s command at the debugger prompt, ipdb> , to step through the code, stepping into function calls.
ipdb> s
--Call--
> <ipython-input-1-c4e297595467>(30)__iter__()
     28     # of the nodes of the tree yielding all the values. In this way, we get
     29     # the values in ascending order.
---> 30     def __iter__(self):
     31         if self.left != None:
     32             for elem in self.left:
While debugging, you can evaluate expressions by preceding them with an exclamation point, !.
ipdb> !self
BinarySearchTree.Node(5.5,BinarySearchTree.Node(4.4,BinarySearchTree.Node(3.3,BinarySearchTree.Node(2.2,BinarySearchTree.Node(1.1,None,None),None),None),None),None)

First, there is an indentation issue in the code you shared: yield self.val should not be in the if block:
def __iter__(self):
    if self.left != None:
        for elem in self.left:
            yield elem
    yield self.val  # Unconditional. This is also the base case
    if self.right != None:
        for elem in self.right:
            yield elem
To understand this code, first start imagining a tree with just one node. Let's for a moment ignore the BinarySearchTree class and say we have direct access to the Node class. We can create a node and then iterate it:
node = Node(1)
for value in node:
    print(value)
This loop calls the __iter__ method, which in this case skips both if blocks, since the node has no children, and only executes yield self.val. That yielded value is what the loop variable value receives, and it is what gets printed.
Now extend this little exercise with 2 more nodes:
node = Node(1,
    Node(0),
    Node(2)
)
for value in node:
    print(value)
Here we have created this tree, and node refers to its root:
    1   <-- node
   / \
  0   2
When the for..in loop now calls __iter__, it first enters the first if block, where we get a form of recursion. The for statement there executes __iter__ again, but this time on the left child of node, i.e. the node with value 0. That is a case we already know: this node has no children, and we know from the first example above that this results in one iteration where the loop variable is the value of that node, i.e. 0, and that value is yielded. That means the main program gets an iteration with value equal to 0, which gets printed.
So elem is numeric. It would have been better to call it value or val to avoid any confusion.
After that if block has executed we get to yield self.val. self is here node, and so we yield 1. That means the main program gets to execute a second iteration, this time with value equal to 1.
Finally the second if block is executed, and now the right child of node is the subject of a recursive __iter__ call. It is the same principle as with the left child. This yields value 2, and the main program prints 2.
We could again extend the tree with more nodes, but the principle is the same: by recursive calls of __iter__ all the values of the tree are yielded.
yield from
There is a syntax that allows simplification of the code, and also it is more common practice to use the is operator when comparing with None:
def __iter__(self):
    if self.left is not None:
        yield from self.left
    yield self.val
    if self.right is not None:
        yield from self.right
This results in the same behavior. yield from will yield all values that come from the iterable. And since node instances are iterable as they have the __iter__ method, this works as intended.
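A runnable sketch of the yield from version, to try the behavior out. The Node class and its insert method here are hypothetical and much smaller than the book's BinarySearchTree; they only assume that smaller values go to the left.

```python
class Node:
    """Hypothetical minimal BST node, not the book's full class."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

    def insert(self, val):
        """Insert val below this node, keeping smaller values on the left."""
        if val < self.val:
            if self.left is None:
                self.left = Node(val)
            else:
                self.left.insert(val)
        else:
            if self.right is None:
                self.right = Node(val)
            else:
                self.right.insert(val)

    def __iter__(self):
        if self.left is not None:
            yield from self.left   # delegates to the left child's __iter__
        yield self.val
        if self.right is not None:
            yield from self.right  # delegates to the right child's __iter__


root = Node(5)
for v in [3, 8, 1, 4, 9, 7]:
    root.insert(v)
print(list(root))  # [1, 3, 4, 5, 7, 8, 9]
```

The values come out in ascending order because every node yields its left subtree, then itself, then its right subtree.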

Does this type of recursion have a name?

I have a linked list in Python and I want to write a filter function that returns a new linked list containing only the items for which f(item) is true. This implementation has a variable, filtered, that builds the list from the bottom up. I'm having trouble understanding this recursion. What type of recursion is this?
I'm more familiar with recursion like Fibonacci, where the recursive call is in the return statement at the very bottom.
class Link:
    empty = ()
    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest
    def __getitem__(self, i):
        if i == 0:
            return self.first
        else:
            return self.rest[i-1]
    def __len__(self):
        return 1 + len(self.rest)
    def __repr__(self):
        if self.rest == Link.empty:
            return "Link(" + str(self.first) + ")"
        return 'Link({0}, {1})'.format(self.first, repr(self.rest))
def filter_link(f, s):
    if s is Link.empty:
        return s
    else:
        filtered = filter_link(f,s.rest) # How does this work?
        if f(s.first):
            return Link(s.first, filtered)
        else:
            return filtered

This is the sort of recursion you are used to.
I just looked up a recursive fibonacci solution where the early return is on the second line, just like your code. Also, like your code, the recursion in the example occurs before the more normal returns.
It looks like your code returns a new linked list of the elements that the function f approves of, from the bottom up. That is, it creates new instances of Link around elements s.first, terminated by the single instance of Link.empty.
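To see the bottom-up construction in action, here is a small runnable sketch. The Link class is a trimmed copy of the one in the question (only what the demo needs), and the sample list and the odd-number predicate are made up for illustration.

```python
class Link:
    """Trimmed copy of the question's Link class."""

    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest

    def __repr__(self):
        if self.rest is Link.empty:
            return "Link({0})".format(self.first)
        return "Link({0}, {1!r})".format(self.first, self.rest)


def filter_link(f, s):
    if s is Link.empty:
        return s
    filtered = filter_link(f, s.rest)   # filter the tail first
    if f(s.first):
        return Link(s.first, filtered)  # keep this item in front of the tail
    return filtered                     # drop this item


numbers = Link(1, Link(2, Link(3, Link(4))))
odds = filter_link(lambda n: n % 2 == 1, numbers)
print(odds)  # Link(1, Link(3))
```

The calls descend to the end of the list first; the result is then assembled as each call returns, which is why the last kept element (3) is wrapped before the first one (1).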

Python recursion - how to exit early

I've been playing with BST (binary search tree) and I'm wondering how to do an early exit. Following is the code I've written to find kth smallest. It recursively calls the child node's find_smallest_at_k, stack is just a list passed into the function to add all the elements in inorder. Currently this solution walks all the nodes inorder and then I have to select the kth item from "stack" outside this function.
def find_smallest_at_k(self, k, stack, i):
    if self is None:
        return i
    if (self.left is not None):
        i = self.left.find_smallest_at_k(k, stack, i)
    print(stack, i)
    stack.insert(i, self.data)
    i += 1
    if i == k:
        print(stack[k - 1])
        print("Returning")
    if (self.right is not None):
        i = self.right.find_smallest_at_k(k, stack, i)
    return i
It's called like this,
our_stack = []
self.root.find_smallest_at_k(k, our_stack, 0)
return our_stack[k-1]
I'm not sure if it's possible to exit early from that function. If my k is say 1, I don't really have to walk all the nodes then find the first element. It also doesn't feel right to pass list from outside function - feels like passing pointers to a function in C. Could anyone suggest better alternatives than what I've done so far?

Passing a list as an argument: Passing the list as an argument can be good practice if you make your function tail-recursive. Otherwise it's pointless. With a BST, where there are two potential recursive calls to make, that's a bit of a tall order.
Otherwise you can just return the list. I don't see the need for the variable i. Anyway, if you absolutely need to return multiple values, you can always use a tuple, like this: return i, stack, and unpack it like this: i, stack = root.find_smallest_at_k(k).
Fast-forwarding: For the fast-forwarding, note that the nodes in the right subtree of a BST node are always bigger than that node. Thus if you descend the tree always through the right children, you get an increasing sequence of values. Each step to the right leaves at least one smaller value behind, so after k steps to the right every remaining value has at least k smaller values and cannot be among the k smallest. It is therefore pointless to go right k times or more.
Even if you go left at times in the middle of your descent, it's pointless to go right k times or more in total. The BST property ensures that if you go right, ALL subsequent values below in the hierarchy are greater than the parent.
Code: Here is a pseudo-python code quickly made. It's not tested.
def findKSmallest(self, k, rightSteps=0):
    if rightSteps >= k:  # we went right k times or more
        return []
    leftSmallest = self.left.findKSmallest(k, rightSteps) if self.left is not None else []
    rightSmallest = self.right.findKSmallest(k, rightSteps + 1) if self.right is not None else []
    mySmallest = sorted(leftSmallest + [self.data] + rightSmallest)
    return mySmallest[:k]
EDIT The other version, following my comment.
def findKSmallest(self, k):
    if k == 0:
        return []
    leftSmallest = self.left.findKSmallest(k) if self.left is not None else []
    rightSmallest = self.right.findKSmallest(k - 1) if self.right is not None else []
    mySmallest = sorted(leftSmallest + [self.data] + rightSmallest)
    return mySmallest[:k]
Note that if k == 1, this is indeed a search for the smallest element. Any move to the right immediately returns [], which contributes nothing.

As Lærne said, you need to turn your function into a tail-recursive one; you may then be interested in using continuation-passing style, so that your function can call either itself or an "escape" function. I wrote a module called tco for optimizing tail calls; see https://github.com/baruchel/tco
Hope it can help.

Here is another approach: it doesn't exit recursion early, instead it prevents additional function calls if not needed, which is essentially what you're trying to achieve.
class Node:
    def __init__(self, v):
        self.v = v
        self.left = None
        self.right = None
def find_smallest_at_k(root, k):
    res = [None]
    count = [k]
    def helper(root):
        if root is None:
            return
        helper(root.left)
        count[0] -= 1
        if count[0] == 0:
            print("found it!")
            res[0] = root
            return
        if count[0] > 0:
            print("visiting right")
            helper(root.right)
    helper(root)
    return res[0].v
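Another option, which is a different technique from the helper above, is to write the in-order walk as a generator and stop consuming it after k items with itertools.islice. This is only a sketch: the Node constructor with left and right keyword arguments is hypothetical, in_order and kth_smallest are made-up names, and k is assumed to be at least 1.

```python
from itertools import islice


class Node:
    """Hypothetical BST node with the same fields as the class above."""

    def __init__(self, v, left=None, right=None):
        self.v = v
        self.left = left
        self.right = right


def in_order(node):
    """Lazily yield the values of the subtree rooted at node in ascending order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.v
        yield from in_order(node.right)


def kth_smallest(root, k):
    """Return the k-th smallest value (1-based), or None if there are fewer than k."""
    return next(islice(in_order(root), k - 1, None), None)


root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7), Node(9)))
print(kth_smallest(root, 1))   # 1
print(kth_smallest(root, 4))   # 5
print(kth_smallest(root, 10))  # None
```

Since generators are lazy, nodes after the k-th one are never visited, and no list has to be passed in from outside.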

If you want to exit as early as possible, then use exit(0).
This will make your task easy!

PYTHON Binary Search Tree - Recursive Remove

I'm working on a binary search tree class and we have to implement the remove method using recursion. Here is the code that was given to us.
def remove (self, x):
    def recurse (p):
        # Modify this method.
        if p==None:
            return p
        elif x<p.data:
            return p
        elif x>p.data:
            return p
        elif p.left==None and p.right==None: # case (1)
            return p
        elif p.right==None: # case (2)
            return p
        elif p.left==None: # case (3)
            return p
        else: # case (4)
            return p
    self.root = recurse (self.root)
Obviously there should be more than just return p for each conditional. And I'm pretty sure the first if and two elifs are used to 'locate' the node containing x. However I am not sure how to implement the four cases. Any input would be appreciated.
Also, the end goal is to use this method to iterate through the BST and remove each node.

Well, your first step is to locate x. Remember that you do this by recursion, so you should recurse on the subtree where x would be located (left if x < p.data, right if x > p.data, and only if that subtree exists). Next, you need to decide what to do once you find x (x == p.data).
I would recommend drawing a tree and thinking it through as an algorithm. :)
Remember the BST property! The tree will necessarily have to modify its structure to preserve the property, so who is the new parent of the orphaned children?
Hints:
a sibling, possibly?
remember the BST characteristic
recursion!

pseudo_code method_recurse requires argument: this_node
if this_node is None then return Not Found
if this_node is target then return this_node
if this_node is less_than target then return call self with this_node.right
if this_node is greater_than target then return call self with this_node.left
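Putting the hints together, one possible way to fill in the template from the question is sketched below. It is not the assignment's official solution: the Node and BST classes are hypothetical scaffolding, remove_min is a helper that is not in the original template, and case (4) uses the in-order successor (the smallest value in the right subtree), which is one common choice among several.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


class BST:
    """Hypothetical wrapper; only root and remove mirror the question."""

    def __init__(self, root=None):
        self.root = root

    def remove(self, x):
        def remove_min(p):
            """Return the subtree rooted at p without its smallest node."""
            if p.left is None:
                return p.right
            p.left = remove_min(p.left)
            return p

        def recurse(p):
            if p is None:                      # x is not in this subtree
                return p
            elif x < p.data:
                p.left = recurse(p.left)       # x can only be on the left
                return p
            elif x > p.data:
                p.right = recurse(p.right)     # x can only be on the right
                return p
            elif p.left is None and p.right is None:  # case (1): leaf
                return None
            elif p.right is None:              # case (2): only a left child
                return p.left
            elif p.left is None:               # case (3): only a right child
                return p.right
            else:                              # case (4): two children
                successor = p.right
                while successor.left is not None:
                    successor = successor.left
                p.data = successor.data        # copy the successor's value up
                p.right = remove_min(p.right)  # then unlink the successor node
                return p

        self.root = recurse(self.root)


def in_order(p):
    """Return the values of the subtree rooted at p as a sorted list."""
    return [] if p is None else in_order(p.left) + [p.data] + in_order(p.right)


tree = BST(Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7), Node(9))))
tree.remove(5)
print(in_order(tree.root))  # [1, 3, 4, 7, 8, 9]
```

Each branch returns the node that should take p's place, which is why the caller always assigns the result back (p.left = ..., p.right = ..., self.root = ...).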
