Print all paths of a binary tree (DFS) - python

I was trying to print all root-to-leaf paths of a binary tree, but to no avail.
My strategy is to use recursion: the base case is that the tree is None or the node is a leaf, in which case the function returns; otherwise, it traverses the left and right subtrees.
But I can't find a way to retain the results of both the left and the right subtree.
def pathSum(self, root, target, result):
    if not root:
        return []
    if not root.left and not root.right:
        return [root.val]
    for i in [root.left, root.right]:
        path = [root.val] + self.pathSum(i, target, result)
        print("path", path)
        return path

The idea is to build the path (a list) as each node is visited: if the current node is a leaf, add it to the path and print the path; if not, add it to the path and continue into its children:
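def pathSum(self, path):
    # Method on the node class; path holds the values from the root down to this node's parent.
    if not self.left and not self.right:
        print(path + [self.val])
        return
    # Guard each child so that a node with only one child does not raise AttributeError.
    if self.left:
        self.left.pathSum(path + [self.val])
    if self.right:
        self.right.pathSum(path + [self.val])
root.pathSum([])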
Update: If you want to keep all paths:
def pathSum(self, current_path, all_paths):
    if not self.left and not self.right:
        print('Path found: ' + str(current_path + [self.val]))
        all_paths.append(current_path + [self.val])
        return
    # Guard each child so that a node with only one child does not raise AttributeError.
    if self.left:
        self.left.pathSum(current_path + [self.val], all_paths)
    if self.right:
        self.right.pathSum(current_path + [self.val], all_paths)
all_paths = []
root.pathSum([], all_paths)
print('All paths: ' + str(all_paths))
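For reference, the same idea can be sketched as a standalone function that returns the paths instead of filling a list passed in by the caller. This is only a minimal sketch: the `Node` class and the name `root_to_leaf_paths` are assumptions made for illustration and do not appear in the original code.

```python
class Node:
    """Hypothetical minimal binary-tree node, used only for this sketch."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def root_to_leaf_paths(node, prefix=()):
    """Return every root-to-leaf path below `node` as a list of lists."""
    if node is None:
        return []                      # a missing child contributes no paths
    path = list(prefix) + [node.val]   # fresh list per call, so nothing is shared
    if node.left is None and node.right is None:
        return [path]                  # leaf: exactly one complete path
    # Combine the paths found in the left subtree with those in the right subtree.
    return root_to_leaf_paths(node.left, path) + root_to_leaf_paths(node.right, path)


tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(root_to_leaf_paths(tree))  # [[1, 2, 4], [1, 2, 5], [1, 3]]
```

Because each call returns its own list of paths, the results of the left and right subtrees are kept simply by concatenating them.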

After some iterations, I found that the following solution works, but I'm not sure whether there is a more efficient way of finding all root-to-leaf paths.
The idea behind this solution is pre-order traversal:
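def allPaths(self, root, path, all_path):
    # Check for a missing node first, so a node with only one child does not raise AttributeError.
    if not root:
        return all_path
    path.append(root.val)
    if not root.left and not root.right:
        # Leaf: store a copy, because path keeps changing while backtracking.
        all_path.append(path[:])
    else:
        self.allPaths(root.left, path, all_path)
        self.allPaths(root.right, path, all_path)
    # Backtrack: remove this node before returning to the parent.
    path.pop()
    return all_path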
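On the efficiency question: any approach has to write out every path, so the work is at least proportional to the total length of all root-to-leaf paths. One alternative, sketched below, replaces the recursion with an explicit stack, which avoids hitting Python's recursion limit on very deep trees. The `Node` class and the function name `all_paths_iterative` are assumptions for illustration.

```python
class Node:
    """Hypothetical minimal binary-tree node, used only for this sketch."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def all_paths_iterative(root):
    """Collect all root-to-leaf paths with an explicit stack (pre-order DFS)."""
    if root is None:
        return []
    paths = []
    stack = [(root, [root.val])]       # each entry: (node, path from the root to node)
    while stack:
        node, path = stack.pop()
        if node.left is None and node.right is None:
            paths.append(path)
            continue
        # Push the right child first so that the left child is popped first.
        if node.right is not None:
            stack.append((node.right, path + [node.right.val]))
        if node.left is not None:
            stack.append((node.left, path + [node.left.val]))
    return paths


tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(all_paths_iterative(tree))  # [[1, 2, 4], [1, 2, 5], [1, 3]]
```

Each stack entry carries its own copy of the path, so no backtracking step (`pop`) is needed, at the cost of copying the path at every node.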

Related

Recursive function does not return array in python

I'm writing some simple code that returns the path to a destination node in a BST.
class TreeNode(object):
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None
root = TreeNode(6)
root.left = TreeNode(2)
root.right = TreeNode(8)
root.left.left = TreeNode(0)
root.left.right = TreeNode(4)
root.left.right.left = TreeNode(3)
root.left.right.right = TreeNode(5)
root.right.left = TreeNode(7)
root.right.right = TreeNode(9)
After defining the tree:
p = 2
q = 8
def pathFind(path, cur, node):
    # path : path the function has walked through so far
    # cur  : current node
    # node : destination node's value
    #print(f'current node value is {cur.val}')
    #print(path)
    ## ending condition ##
    if cur.val == node:  # If we reach the destination node
        return path
    elif cur.val < node and cur.right != None:
        # the 'cur.right != None' check is redundant, since the problem guarantees that the destination value exists in the BST
        path.append(cur)
        return pathFind(path, cur.right, node)
    elif cur.val > node and cur.left != None:  # cur.val > node:
        path.append(cur)
        return pathFind(path, cur.left, node)
    else:
        return None
path_p = pathFind([root], root, p)
I checked that my function reaches the destination and records the path toward it without any problem, but the last line, path_p = pathFind([root], root, p), doesn't work.
Could anyone help?
In the function pathFind(), the path is returned only by the execution in which the tested node contains the target value. All earlier executions in the recursion discard that return value. Fix it by putting return before the recursive calls.
Try the code below to find the path to a target node:
def pathFind(path, node, target):
    if node == None:
        return
    else:
        path.append(node.val)
        if node.val == target:
            return path
        elif node.val < target and node.right != None:
            return pathFind(path, node.right, target)
        elif node.val > target and node.left != None:
            return pathFind(path, node.left, target)
        else:
            return None
print(pathFind([], root, 9))
output
[6, 8, 9]
A few remaining issues (after the edits to your question):
The root node is in most cases added twice to the path: once in the initial call, and again when moving to the left or right subtree of the root node: cur is the root node and path.append(cur) is executed.
So, pass an empty list in the initial call instead of [root]
When the target node is found, that node is not appended to the path, yet I suppose it should be the final node in the path. So path.append(cur) should also happen in the first if block.
Those changes will fix your code.
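Putting those two fixes together, a minimal sketch of the corrected function might look like this. The name `path_find` and the early `None` check are assumptions for illustration; the tree is the one from the question.

```python
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None


def path_find(path, cur, target):
    """Return the list of nodes from the starting node to the node holding target, or None."""
    if cur is None:
        return None                    # fell off the tree: target is not present
    path.append(cur)                   # append every visited node, including the target
    if cur.val == target:
        return path
    if target > cur.val:
        return path_find(path, cur.right, target)
    return path_find(path, cur.left, target)


# Same tree as in the question.
root = TreeNode(6)
root.left = TreeNode(2)
root.right = TreeNode(8)
root.left.left = TreeNode(0)
root.left.right = TreeNode(4)
root.left.right.left = TreeNode(3)
root.left.right.right = TreeNode(5)
root.right.left = TreeNode(7)
root.right.right = TreeNode(9)

path_p = path_find([], root, 2)        # start with an empty list, not [root]
print([n.val for n in path_p])         # [6, 2]
```

Note that when the target is missing the function returns None, but the list passed in has still been extended with the nodes visited along the way.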

Discrepancy of list append in python

I am getting a different result when I am using append(path) vs. append(list(path))
I have the following code to find all paths for a sum:
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def find_paths(root, sum):
    allPaths = []
    dfs(root, sum, [], allPaths)
    return allPaths
def dfs(root, sum, path, res):
    if not root:
        return
    path.append(root.val)
    # A leaf has neither a left nor a right child (the second check was a repeated root.left).
    if root.val == sum and root.left is None and root.right is None:
        res.append(path)
    dfs(root.left, sum - root.val, path, res)
    dfs(root.right, sum - root.val, path, res)
    del path[-1]
def main():
    root = TreeNode(12)
    root.left = TreeNode(7)
    root.right = TreeNode(1)
    root.left.left = TreeNode(4)
    root.right.left = TreeNode(10)
    root.right.right = TreeNode(5)
    sum = 23
    print("Tree paths with sum " + str(sum) +
          ": " + str(find_paths(root, sum)))
main()
This has the following output:
Tree paths with sum 23: [[], []]
But if I change res.append(path) to res.append(list(path)), it returns the correct answer: Tree paths with sum 23: [[12, 7, 4], [12, 1, 10]]. I am confused about why using the list operation changes the answer.
res.append(path) appends the object path itself to the list res. After that line, when you modify the path object (like del path[-1]), the modification is also applied to the appended object in res, because, well, they are the same object.
list(path), on the other hand, "copies" the path. So this one is now a different object from path. When you modify path after that, the modification does not propagate to this different object.
You will have the same result if you do path[:] or path.copy() instead of list(path).
res.append(path) appends the actual path object, not a copy of it. So if path changes later on, the change will appear in res also.
res.append(list(path)) appends a copy.
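The difference can be seen in isolation with a short sketch. The variable names `res_alias` and `res_copy` are made up for this illustration; the two `del` statements stand in for the backtracking step in dfs.

```python
path = [12, 7]
res_alias = []
res_copy = []

res_alias.append(path)         # stores a reference to the very same list object
res_copy.append(list(path))    # stores a new list with the same elements (shallow copy)

# Later mutations of path, like the `del path[-1]` backtracking step.
del path[-1]
del path[-1]

print(res_alias)               # [[]]
print(res_copy)                # [[12, 7]]
print(res_alias[0] is path)    # True
print(res_copy[0] is path)     # False
```

This is why the original output was a list of empty lists: every entry in res was the same object as path, which is empty by the time the traversal finishes.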

How come these 2 implementations of dfs give different results?

This one gives the correct result:
def binaryTree(root):
    paths = []
    def dfs(root, path=""):
        if root:
            if path != "":
                path += "->"
            path += str(root.val)
            if not root.left and not root.right:
                paths.append(path)
            dfs(root.left, path)
            dfs(root.right, path)
    dfs(root)
    return paths  # gives ['1->2->4', '1->2->5', '1->3']
And in this one the path list keeps growing:
def binaryTree2(root):
    paths = []
    def dfs(root, path=[]):
        if root:
            path.append(root.val)
            if not root.left and not root.right:
                paths.append("->".join(map(str, path)))
            dfs(root.left, path)
            dfs(root.right, path)
    dfs(root)
    return paths  # gives ['1->2->4', '1->2->4->5', '1->2->4->5->3']
The tree is like this: <1, <2, <4, None, None>, <5, None, None>>, <3, None, None>>
The only difference is that in one I concatenate strings and in the other I append to list.
In the first implementation, every path += ... statement creates a new string and makes the local name path refer to it, so the caller's string is never changed.
In the second implementation, a single list is passed around the whole time, and every append changes it for all callers. You should pop the node off again right before dfs returns.
def binaryTree2(root):
    paths = []
    def dfs(root, path=[]):
        if root:
            path.append(root.val)
            if not root.left and not root.right:
                paths.append("->".join(map(str, path)))
            dfs(root.left, path)
            dfs(root.right, path)
            path.pop()  # this clears your stack as your functions return
    dfs(root)
    return paths
Edit: Python strings are immutable, i.e. once created, they can't be modified.
# the line below creates a string object "huhu"
# and binds the name `path` to it.
path = "huhu"
# this creates another string object, "huhu123".
# So at this point we have 3 string objects:
# "123", "huhu" and "huhu123", plus the name `path`,
# which now refers to "huhu123".
path += "123"
If these were ordinary objects, they would be garbage collected once no references to them remained. Strings can get special treatment: in CPython, literals such as "huhu" and "123" are typically interned and kept alive as constants, although that is an implementation detail rather than a language guarantee.
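The contrast between the two dfs versions comes down to rebinding versus mutation, which a small sketch can show outside of any tree code. The function names `extend_str` and `extend_list` are hypothetical and exist only for this illustration.

```python
def extend_str(s):
    s += "->2"         # builds a new string; only the local name s is rebound
    return s


def extend_list(items):
    items.append(2)    # mutates the caller's list in place
    return items


text = "1"
nums = [1]
new_text = extend_str(text)
new_nums = extend_list(nums)

print(text, new_text)      # 1 1->2
print(nums, new_nums)      # [1, 2] [1, 2]
print(new_nums is nums)    # True
```

The caller's string is untouched after the call, while the caller's list has grown, which is exactly why the list version needs an explicit pop and the string version does not.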

Adding to list a class instance

I'm implementing code to find the shortest path between two nodes, but
why does the output change when I change the first line of the DFS function?
Isn't it true that
path += [start] is equivalent to path = path + [start]?
The output before the change is:
Current DFS path: 0
Current DFS path: 0->1
Current DFS path: 0->1->2
Current DFS path: 0->1->2->3
Current DFS path: 0->1->2->3->4
Current DFS path: 0->1->2->3->5
Current DFS path: 0->1->2->4
Current DFS path: 0->2
Current DFS path: 0->2->3
Current DFS path: 0->2->3->1
Current DFS path: 0->2->3->4
Current DFS path: 0->2->3->5
Current DFS path: 0->2->4
shortest path is 0->2->3->5
After the change it is:
Current DFS path: 0
Current DFS path: 0->1
Current DFS path: 0->1->2
Current DFS path: 0->1->2->3
Current DFS path: 0->1->2->3->4
Current DFS path: 0->1->2->3->4->5
shortest path is 0->1->2->3->4->5
The code:
class Node(object):
    def __init__(self, name):
        """Assumes name is a string"""
        self.name = name
    def getName(self):
        return self.name
    def __str__(self):
        return self.name
class Edge(object):
    def __init__(self, src, dest):
        """Assumes src and dest are nodes"""
        self.src = src
        self.dest = dest
    def getSource(self):
        return self.src
    def getDestination(self):
        return self.dest
    def __str__(self):
        return self.src.getName() + '->' + self.dest.getName()
class WeightedEdge(Edge):
    def __init__(self, src, dest, weight = 1.0):
        """Assumes src and dest are nodes, weight a number"""
        self.src = src
        self.dest = dest
        self.weight = weight
    def getWeight(self):
        return self.weight
    def __str__(self):
        return self.src.getName() + '->(' + str(self.weight) + ')'\
               + self.dest.getName()
#Figure 12.8
class Digraph(object):
    #nodes is a list of the nodes in the graph
    #edges is a dict mapping each node to a list of its children
    def __init__(self):
        self.nodes = []
        self.edges = {}
    def addNode(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate node')
        else:
            self.nodes.append(node)
            self.edges[node] = []
    def addEdge(self, edge):
        src = edge.getSource()
        dest = edge.getDestination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)
    def childrenOf(self, node):
        return self.edges[node]
    def hasNode(self, node):
        return node in self.nodes
    def __str__(self):
        result = ''
        for src in self.nodes:
            for dest in self.edges[src]:
                result = result + src.getName() + '->'\
                         + dest.getName() + '\n'
        return result[:-1] #omit final newline
class Graph(Digraph):
    def addEdge(self, edge):
        Digraph.addEdge(self, edge)
        rev = Edge(edge.getDestination(), edge.getSource())
        Digraph.addEdge(self, rev)
#Figure 12.9
def printPath(path):
    """Assumes path is a list of nodes"""
    result = ''
    for i in range(len(path)):
        result = result + str(path[i])
        if i != len(path) - 1:
            result = result + '->'
    return result
def DFS(graph, start, end, path, shortest, toPrint = False):
    """Assumes graph is a Digraph; start and end are nodes;
          path and shortest are lists of nodes
       Returns a shortest path from start to end in graph"""
    path = path + [start]
    if toPrint:
        print('Current DFS path:', printPath(path))
    if start == end:
        return path
    for node in graph.childrenOf(start):
        if node not in path: #avoid cycles
            if shortest == None or len(path) < len(shortest):
                newPath = DFS(graph, node, end, path, shortest,
                              toPrint)
                if newPath != None:
                    shortest = newPath
    return shortest
def shortestPath(graph, start, end, toPrint = False):
    """Assumes graph is a Digraph; start and end are nodes
       Returns a shortest path from start to end in graph"""
    return DFS(graph, start, end, [], None, toPrint)
#Figure 12.10
def testSP():
    nodes = []
    for name in range(6): #Create 6 nodes
        nodes.append(Node(str(name)))
    g = Digraph()
    for n in nodes:
        g.addNode(n)
    g.addEdge(Edge(nodes[0],nodes[1]))
    g.addEdge(Edge(nodes[1],nodes[2]))
    g.addEdge(Edge(nodes[2],nodes[3]))
    g.addEdge(Edge(nodes[2],nodes[4]))
    g.addEdge(Edge(nodes[3],nodes[4]))
    g.addEdge(Edge(nodes[3],nodes[5]))
    g.addEdge(Edge(nodes[0],nodes[2]))
    g.addEdge(Edge(nodes[1],nodes[0]))
    g.addEdge(Edge(nodes[3],nodes[1]))
    g.addEdge(Edge(nodes[4],nodes[0]))
    sp = shortestPath(g, nodes[0], nodes[5])
    print('Shortest path found by DFS:', printPath(sp))
Note: this code is taken from a book.
They are not the same
path += [start] is equivalent to path.extend([start]) -- it mutates path.
On the other hand
path = path + [start] creates a new list and binds the local name path to it.
Consider the following experiment, and note the IDs:
>>> a = [1]
>>> id(a)
55937672
>>> a += [2,3]
>>> id(a)
55937672
>>> b = [1]
>>> id(b)
55930440
>>> b = b + [1,2]
>>> id(b)
55937288
The ID of b changed but the ID of a didn't.
As to why it makes a difference in your code -- DFS is a function. In the version which uses path += [start], you are modifying the passed parameter path -- and this modification persists after the call returns. On the other hand, in the version which uses path = path + [start], you are creating a new local variable named path, one which goes out of scope when the call returns, without any changes to the parameter path.
In the line
path=path+[start]
you create a new list object.
In the line
path+=[start]
you modify a list object that already exists.
If you want to keep +=, you can copy the list first:
path2=path[:]
path2+=[start]
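The effect on the caller can be checked with a small sketch that is independent of the graph code. The function names `extend_in_place` and `extend_copy` are hypothetical, chosen only for this illustration.

```python
def extend_in_place(path, node):
    path += [node]           # same as path.extend([node]): mutates the caller's list
    return path


def extend_copy(path, node):
    path = path + [node]     # builds a new list; the caller's list is left unchanged
    return path


shared = [0]
r1 = extend_in_place(shared, 1)
print(shared, r1 is shared)      # [0, 1] True

fresh = [0]
r2 = extend_copy(fresh, 1)
print(fresh, r2, r2 is fresh)    # [0] [0, 1] False
```

In the DFS above, the `+=` form means every recursive call appends to one shared list that is never trimmed when a call returns, so nodes visited on abandoned branches stay in the path.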

How does this Python graph / tree traversal work?

def find_all_paths(graph, start, end, path=[]):
    path = path + [start]
    if start == end:
        return [path]
    # dict.has_key() was removed in Python 3; use the `in` operator instead.
    if start not in graph:
        return []
    paths = []
    for node in graph[start]:
        if node not in path:
            newpaths = find_all_paths(graph, node, end, path)
            for newpath in newpaths:
                paths.append(newpath)
    return paths
I cannot understand the recursive part of this function. I've been looking at it for a few hours now, but it is still very confusing to me. Would someone be able to explain this function in an ELI5 way? Thanks.
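One way to read the recursion: each call answers the question "what are all the ways to get from start to end without revisiting a node already on path?" It extends the path with start, returns that single path if start is the destination, and otherwise asks the same question of every unvisited neighbour and collects the answers. The sketch below is a Python 3 version run on a small graph; the example graph and the use of `path=None` as the default are assumptions added for illustration.

```python
def find_all_paths(graph, start, end, path=None):
    """Return every cycle-free path from start to end as a list of node lists."""
    path = (path or []) + [start]      # new list per call; the caller's path is untouched
    if start == end:
        return [path]                  # reached the destination: one complete path
    if start not in graph:
        return []                      # dead end: this node has no outgoing edges
    paths = []
    for node in graph[start]:
        if node not in path:           # skip nodes already on the current path (avoids cycles)
            # Every path from the neighbour to end, prefixed by the current path.
            paths.extend(find_all_paths(graph, node, end, path))
    return paths


# Hypothetical example graph as a dict of adjacency lists.
graph = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"], "D": ["C"]}
print(find_all_paths(graph, "A", "D"))
# [['A', 'B', 'C', 'D'], ['A', 'B', 'D'], ['A', 'C', 'D']]
```

Tracing the call for "A": it first recurses into "B", which in turn finds A->B->C->D (via "C") and A->B->D (directly); back in "A", it then recurses into "C", which finds A->C->D. Because `path + [start]` builds a new list at every level, returning from a call automatically restores the shorter path for the next neighbour.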
