Recursive Depth First Search in Python

So I've been trying to implement recursive depth first search in Python. My program should return the parent array (the parent of each vertex in the graph).
def dfs_tree(graph, start):
    a_list = graph.adjacency_list
    parent = [None]*len(a_list)
    state = [False]*len(a_list)
    state[start] = True
    for item in a_list[start]:
        if state[item[0]] == False:
            parent[item[0]] = start
            dfs_tree(graph, item[0])
            state[start] = True
    return parent
It says that the maximum recursion depth is reached. How do I fix this?

The main reason for such behavior, as far as I can see, is the re-initialization of the state (and parent) arrays on each recursive call. You need to initialize them only once per traversal. The common approach is to add these arrays to the function's argument list, default them to None, and replace them with lists on the first call:
def dfs_tree(graph, start, parent=None, state=None):
    a_list = graph.adjacency_list
    if parent is None:
        parent = [None]*len(a_list)
    if state is None:
        state = [False]*len(a_list)
    state[start] = True
    for item in a_list[start]:
        if state[item[0]] == False:
            parent[item[0]] = start
            dfs_tree(graph, item[0], parent, state)
            state[start] = True
    return parent
After that change the algorithm seems to be correct. But the problem mentioned by Esdes is still there: if you want to correctly traverse graphs of size n, you need to call sys.setrecursionlimit(n + 10). The 10 stands for the maximum number of nested function calls outside dfs_tree, including hidden initial calls; increase it if you need to.
But that's not a final solution. The Python interpreter has a limit on stack memory itself, which differs depending on the OS and some settings. So if you increase the recursion limit above some threshold, you will start to get "segmentation fault" errors when the interpreter's stack limit is exceeded. I don't remember the exact value of this threshold, but it looks to be around 3000 calls; I don't have an interpreter at hand to check.
In that case you may want to consider re-implementing your algorithm as a non-recursive version using an explicit stack, or try to change the Python interpreter's stack memory limit.
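A minimal sketch of that non-recursive version, using an explicit stack in place of the call stack. It assumes the same graph shape as the question: each entry of graph.adjacency_list[v] is a tuple whose first element is a neighbouring vertex.

```python
def dfs_tree_iterative(graph, start):
    a_list = graph.adjacency_list
    parent = [None] * len(a_list)
    state = [False] * len(a_list)
    state[start] = True
    stack = [start]
    while stack:
        node = stack.pop()
        for item in a_list[node]:
            if not state[item[0]]:
                # mark when pushing, so a vertex is never pushed twice
                state[item[0]] = True
                parent[item[0]] = node
                stack.append(item[0])
    return parent
```

This version never recurses, so it is unaffected by sys.setrecursionlimit and works for graphs of any size that fits in memory.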
P.S. You may want to change if state[item[0]] == False: to if not state[item[0]]:. Comparing explicitly against booleans is usually considered bad practice.

To fix this error, just increase the default recursion depth limit (by default it is 1000):
import sys
sys.setrecursionlimit(2000)

Related

K array Tree Height using DFS with Recursion, is the code logic wrong?

I feel embarrassed asking this question, but I spent close to four hours trying to make sense of why this code works. To me it looks like it works when the longest path is explored first, but not when a sub-optimal path is explored first. My guess is that when a shorter path is explored, the depth value resets on the way back while the height persists, and then the next path is explored??? Can someone please explain?
'''
For your reference:
class TreeNode:
    def __init__(self):
        self.children = []
'''
def find_height(root):
    global max
    max = 0
    if not root:
        return 0
    traverse(root, 0)
    return max - 1

def traverse(node, depth):
    global max
    depth += 1
    if depth > max:
        max = depth
    for child in node.children:
        traverse(child, depth)
The code is correct, and it is like you say, but maybe it helps to take a step back and abstract a bit. The code essentially traverses the whole tree. It does not really matter in which order it does that. Here it performs a depth-first traversal, probably because that is the easiest to implement and has a small memory footprint, but the order is not really relevant for understanding how it can return the length of the longest path.
Just note that the algorithm is guaranteed to visit each node and knows at which depth each node occurs. It should be clear that when you take the maximum of all the depths you encounter during the traversal, you have found the length of the longest path from the root, in other words, the height of the tree.
The following statement makes sure the code always updates the maximum depth in light of the current depth:
if depth > max:
    max = depth
I should maybe also highlight that depth is a local variable, which belongs to one execution context of the function traverse. So each execution of the function creates a new instance of that variable. When you return out of a recursive call, execution comes back to where you have the previous version of this variable. A recursive call does not modify the value of depth from where that call is made. So it truly reflects the depth of the current node (after depth += 1 is executed).
In contrast, max is a global variable, and so there is only one version of that variable. And so the effect of max=depth persists also after a call of traverse terminates.
As a side note, many would say that modifying a global variable from inside a function is not ideal (it is a side effect), and there are better ways to code this. The recursive function should instead return the height of the subtree rooted in the node that is passed as argument. This also has the advantages that you don't need the depth argument, you don't need a second function, and the built-in max (which the global variable was shadowing) becomes usable again:
def find_height(root):
    if not root:
        return 0
    # the height of the tree is 1 more than that of the tallest subtree below it
    return 1 + max((find_height(child) for child in root.children), default=0)
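As a quick sanity check, here is that version run on a tiny hand-built tree (TreeNode and the function are re-declared so the snippet runs standalone; note the extra parentheses, which max() needs in order to accept the generator together with default=):

```python
class TreeNode:
    def __init__(self):
        self.children = []

def find_height(root):
    if not root:
        return 0
    # height is 1 more than the tallest subtree; default=0 handles leaves
    return 1 + max((find_height(child) for child in root.children), default=0)

# root -> a -> leaf is the longest path, so the height (counted in nodes) is 3
root, a, b, leaf = TreeNode(), TreeNode(), TreeNode(), TreeNode()
root.children = [a, b]
a.children = [leaf]
```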

Count the number of nodes in a linked list recursively

Problem: Return the number of nodes in the linked list.
I have been learning recursion recently. I know how to use iteration to solve this problem, but I am stuck on the recursive way. The following is my code, and it always returns 1 instead of the real count of the linked list. I cannot figure out the problem and hope someone can help me. How can I fix it?
def numberOfNodes(head):
    total_node = 0
    return __helper(total_node, head)

def __helper(total_node, head):
    if not head:
        return total_node
    total_node += 1
    __helper(total_node, head.next)
    return total_node
Recursion is a poor choice for this sort of thing (it adds overhead and risks blowing the call stack for no good reason), but if you do use it for educational purposes, it's easiest to pass the total up the call stack, not down:
def linked_list_len(head):
    return linked_list_len(head.next) + 1 if head else 0
Basically, add 1 per frame where the head node exists, then call the function again with the next node. The base case is when head is None.
In some languages that offer tail call optimization, you can avoid the + 1 work applied per frame to the value returned by the child recursive call. This allows the compiler or interpreter to convert the recursion to a loop, avoiding stack overflows. The code would look like this (similar to your approach, except that the + 1 is added in the recursive call):
def linked_list_len(head, total=0):
    return linked_list_len(head.next, total + 1) if head else total
But Python doesn't support TCO so you may as well write it the simpler way shown above.
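To see the simpler version in action, here is a minimal Node class (an assumption; your list node may look different) and a three-element list:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def linked_list_len(head):
    # add 1 per frame where head exists; the base case is head is None
    return linked_list_len(head.next) + 1 if head else 0

head = Node(1, Node(2, Node(3)))  # 1 -> 2 -> 3
```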

Bi-Directional Binary Search Trees?

I have tried to implement a BST. As of now it only adds keys according to the BST property (left: lower, right: bigger), though I implemented it in a different way.
This is how I think BST's are supposed to be
Single Direction BST
How I have implemented my BST
Bi-Directional BST
The question is whether or not this is a correct implementation of a BST?
(The way I see it, in double-sided BSTs it would be easier to search, delete and insert.)
import pdb

class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None
        self.left_child = None
        self.right_child = None

class BST:
    def __init__(self, root=None):
        self.root = root

    def add(self, value):
        #pdb.set_trace()
        new_node = Node(value)
        self.tp = self.root
        if self.root is not None:
            while True:
                if self.tp.parent is None:
                    break
                else:
                    self.tp = self.tp.parent
            # the self.tp variable always points at the first node
            while True:
                if new_node.value >= self.tp.value:
                    if self.tp.right_child is None:
                        new_node.parent = self.tp
                        self.tp.right_child = new_node
                        break
                    elif self.tp.right_child is not None:
                        self.tp = self.tp.right_child
                        print("Going Down Right")
                        print(new_node.value)
                elif new_node.value < self.tp.value:
                    if self.tp.left_child is None:
                        new_node.parent = self.tp
                        self.tp.left_child = new_node
                        break
                    elif self.tp.left_child is not None:
                        self.tp = self.tp.left_child
                        print("Going Down Left")
                        print(new_node.value)
        else:
            self.root = new_node

newBST = BST()
newBST.add(9)
newBST.add(10)
newBST.add(2)
newBST.add(15)
newBST.add(14)
newBST.add(1)
newBST.add(3)
Edit: I have used while loops instead of recursion. Could someone please elaborate on why using while loops instead of recursion would be a bad idea, in this particular case and in general?
BSTs with parent links are used occasionally.
The benefit is not that the links make it easier to search or update (they don't really), but that you can insert before or after any given node, or traverse forward or backward from that node, without having to search from the root.
It becomes convenient to use a pointer to a node to represent a position in the tree, instead of a full path, even when the tree contains duplicates, and that position remains valid as updates or deletions are performed elsewhere.
In an abstract data type, these properties make it easy, for example, to provide iterators that aren't invalidated by mutations.
You haven't described how you gain anything with the parent pointer. An algorithm that cares about rewinding to the parent node will do so by crawling back up the call stack.
I've been there -- in my data structures class, I implemented my stuff with bi-directional pointers. When we got to binary trees, those pointers ceased to be useful. Proper use of recursion replaces the need to follow a link back up the tree.
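To make the earlier point about parent links concrete, here is a sketch of an in-order successor that follows only child and parent links, never starting from the root (field names follow the Node class from the question; the class is re-declared so the snippet runs standalone):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None
        self.left_child = None
        self.right_child = None

def successor(node):
    # next key in sorted order, found without touching the root
    if node.right_child is not None:
        # smallest key in the right subtree
        node = node.right_child
        while node.left_child is not None:
            node = node.left_child
        return node
    # otherwise climb until we arrive from a left child
    while node.parent is not None and node is node.parent.right_child:
        node = node.parent
    return node.parent  # None if node held the largest key
```

This is the kind of operation that is awkward without parent pointers: a recursive, root-down traversal cannot resume "one key after this node" in O(height) time.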

How to make list lookup faster in this recursive function

I have a recursive function which creates a json object
def add_to_tree(name, parent, start_tree):
    for x in start_tree:
        if x["name"] == parent:
            x["children"].append({"name":name, "parent":parent, "children":[]})
        else:
            add_to_tree(name, parent, x["children"])
It is called from another function
def caller():
    # basic structure of the json object which holds the d3.js tree data
    start_tree = [{"name":"root", "parent":"null", "children":[]}]
    for x in new_list:
        name = x.split('/')[-2]
        parent = x.split('/')[-3]
        add_to_tree(name, parent, start_tree)
new_list is a list which holds links in this form:
/root/A/
/root/A/B/
/root/A/B/C/
/root/A/D/
/root/E/
/root/E/F/
/root/E/F/G/
/root/E/F/G/H/
...
Everything is working fine except that the run time grows exponentially with the input size.
Normally new_list holds ~500k links, and the depth of these links can be more than 10, so there is a lot of looping and lookups involved in the add_to_tree() function.
Any ideas on how to make this faster?
You are searching your whole tree each time you add a new entry. This is hugely inefficient as your tree grows; you can easily end up with O(N^2) behaviour this way: for each new element, you search the whole tree again.
You could use a dictionary mapping names to specific tree entries, for fast O(1) lookups; this lets you avoid having to traverse the tree each time. It can be as simple as treeindex[parent]. This'll take some more memory however, and you may need to handle the case where the parent is added after the children (using a queue).
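A minimal sketch of that index idea (it assumes node names are unique across the whole tree, and that a parent always appears in the input before its children, so no queue is needed):

```python
def build_tree(links):
    root = {"name": "root", "parent": "null", "children": []}
    index = {"root": root}  # name -> node dict, for O(1) parent lookup
    for link in links:
        # same extraction as the original caller(): /root/A/B/ -> B under A
        parts = link.split('/')
        name, parent = parts[-2], parts[-3]
        node = {"name": name, "parent": parent, "children": []}
        index[parent]["children"].append(node)
        index[name] = node
    return root
```

Each insert is now a dictionary lookup plus an append, so building the tree is O(N) overall instead of a tree search per link.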
However, since your input list appears to be sorted, you could just process your list recursively or use a stack and take advantage of the fact you just found the parent already. If your path is longer than the previous entry, it'll be a child of that entry. If the path is equal or shorter, it'll be a sibling entry to the previous node or a parent of that node, so return or pop the stack.
For example, for these three elements:
/root/A/B/
/root/A/B/C/
/root/A/D/
/root/A/B/C does not have to search the tree from the root for /root/A/B, it was the previously processed entry. That'll be the parent call for this recursive iteration, or the top of the stack. Just add to that parent directly.
/root/A/D is a sibling of a parent; the path is shorter than /root/A/B/C/, so return or pop that entry off the stack. The length is equal to /root/A/B/, so it is a direct sibling; again return or pop the stack. Now you'll be at the /root/A level, and /root/A/D/ is a child. Add it, and continue your process.
I have not tested this, but it looks like the loop does not stop when an insertion has been made, so every entry in new_list will cause a recursive search through all of the tree. This should speed it up:
def add_to_tree(name, parent, start_tree):
    for x in start_tree:
        if x["name"] == parent:
            x["children"].append({"name":name, "parent":parent, "children":[]})
            return True
        elif add_to_tree(name, parent, x["children"]):
            return True
    return False
It stops searching as soon as the parent is found.
That said, I think there is a bug in the approach. What if you have:
/root/A/B/C/
/root/D/B/E/
Your algorithm only looks at the last two path elements, so it seems that both C and E will be placed under whichever B is found first. I think you will need to take all elements into account and make your way down the tree element by element. That is better anyway, since at each level you will know which branch to take, and the correct version will be much faster: each insert costs a walk down one path rather than a search of the whole tree.
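Along those lines, a sketch of a helper (hypothetical, not the poster's code) that walks the full path down from the root, element by element, so identically-named nodes under different parents stay separate:

```python
def add_path(link, root):
    # root is the usual {"name": "root", "parent": "null", "children": []}
    node = root
    for name in link.strip('/').split('/')[1:]:  # skip the leading "root"
        for child in node["children"]:
            if child["name"] == name:
                node = child  # path element already exists, descend into it
                break
        else:
            # not found at this level: create it and descend
            child = {"name": name, "parent": node["name"], "children": []}
            node["children"].append(child)
            node = child
    return root
```

Because intermediate elements are created on demand, the input no longer needs to be sorted, and the /root/A/B/C/ vs /root/D/B/E/ ambiguity disappears.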

Testing if an object is dependent to another object

Is there a way to check if an object is dependent via parenting, constraints, or connections to another object? I would like to do this check prior to parenting an object to see if it would cause dependency cycles or not.
I remember 3DsMax had a command to do this exactly. I checked OpenMaya but couldn't find anything. There is cmds.cycleCheck, but this only works when there currently is a cycle, which would be too late for me to use.
The tricky thing is that these 2 objects could be anywhere in the scene hierarchy, so they may or may not have direct parenting relationships.
EDIT
It's relatively easy to check if the hierarchy will cause any issues:
children = cmds.listRelatives(obj1, ad=True, f=True)
if obj2 in children:
    print "Can't parent to its own children!"
Checking for constraints or connections is another story though.
Depending on what you're looking for, cmds.listHistory or cmds.listConnections will tell you what's coming in to a given node. listHistory is limited to the subset of possible connections that drive shape node changes, so if you're interested in constraints you'll need to traverse the listConnections of your node and see what's upstream. The list can be arbitrarily large, because it may include lots of hidden nodes, like unit translations, group parts and so on, that you probably don't care about.
Here's a simple way to troll the incoming connections of a node and get a tree of incoming connections:
def input_tree(root_node):
    visited = set()  # so we don't get into loops

    # recursively extract input connections
    def upstream(node, depth=0):
        if node not in visited:
            visited.add(node)
            children = cmds.listConnections(node, s=True, d=False)
            if children:
                grandparents = ()
                for history_node in children:
                    grandparents += tuple(d for d in upstream(history_node, depth + 1))
                yield node, tuple(g for g in grandparents if len(g))

    # unfold the recursive generation of the tree
    tree_iter = tuple(i for i in upstream(root_node))
    # return the grandparent array of the first node
    return tree_iter[0][-1]
Which should produce a nested list of input connections like
((u'pCube1_parentConstraint1',
((u'pSphere1',
((u'pSphere1_orientConstraint1', ()),
(u'pSphere1_scaleConstraint1', ()))),)),
(u'pCube1_scaleConstraint1', ()))
in which each level contains a list of inputs. You can then troll through that to see if your proposed change includes that item.
This won't tell you if the connection will cause a real cycle, however: that's dependent on the data flow within the different nodes. Once you identify the possible cycle you can work your way back to see if the cycle is real (two items affecting each other's translation, for example) or harmless (I affect your rotation and you affect my translation).
This is not the most elegant approach, but it's a quick and dirty way that seems to be working OK so far. The idea is that if a cycle happens, just undo the operation and stop the rest of the script. Testing with a rig, it catches the cycle no matter how complex the connections are.
# Class to use to undo operations
class UndoStack():
    def __init__(self, inputName=''):
        self.name = inputName
    def __enter__(self):
        cmds.undoInfo(openChunk=True, chunkName=self.name, length=300)
    def __exit__(self, type, value, traceback):
        cmds.undoInfo(closeChunk=True)

# Create a sphere and a box
mySphere = cmds.polySphere()[0]
myBox = cmds.polyCube()[0]

# Parent box to the sphere
myBox = cmds.parent(myBox, mySphere)[0]

# Set constraint from sphere to box (will cause cycle)
with UndoStack("Parent box"):
    cmds.parentConstraint(myBox, mySphere)

# If there's a cycle, undo it
hasCycle = cmds.cycleCheck([mySphere, myBox])
if hasCycle:
    cmds.undo()
    cmds.warning("Can't do this operation, a dependency cycle has occurred!")
