How does this Python graph / tree traversal work?

def find_all_paths(graph, start, end, path=[]):
    # path + [start] builds a new list, so the shared default is never mutated
    path = path + [start]
    if start == end:
        return [path]
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if start not in graph:
        return []
    paths = []
    for node in graph[start]:
        if node not in path:
            newpaths = find_all_paths(graph, node, end, path)
            for newpath in newpaths:
                paths.append(newpath)
    return paths
I cannot understand the recursive part of this function. I have been looking at it for a few hours now, but it is still very confusing to me. Could someone explain this function in an ELI5 way? Thanks.
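One way to see what the recursion does is to print every call, indented by its depth. The sketch below is a lightly modified copy of the function above, not the original: the depth parameter, the print call, and the sample graph are assumptions added purely for illustration.

```python
def find_all_paths(graph, start, end, path=None, depth=0):
    """Return every cycle-free path from start to end, printing each call."""
    # Each call builds its own copy of the path, so sibling branches
    # never see each other's nodes.
    path = (path or []) + [start]
    print("  " * depth + "visit " + "->".join(path))  # illustration only
    if start == end:
        return [path]  # one complete path found
    paths = []
    for node in graph.get(start, []):
        if node not in path:  # skip nodes already on this path (no cycles)
            paths.extend(find_all_paths(graph, node, end, path, depth + 1))
    return paths


# Hypothetical sample graph: adjacency lists keyed by node name.
graph = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"], "D": []}
result = find_all_paths(graph, "A", "D")
print(result)  # [['A', 'B', 'C', 'D'], ['A', 'B', 'D'], ['A', 'C', 'D']]
```

Each indented line in the output is one recursive call: the function walks as deep as it can, returns the complete paths it found, and the caller collects them before trying its next neighbour.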

Related

Discrepancy of list append in python

I am getting a different result when I am using append(path) vs. append(list(path))
I have the following code to find all paths for a sum:
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def find_paths(root, sum):
    allPaths = []
    dfs(root, sum, [], allPaths)
    return allPaths
def dfs(root, sum, path, res):
    if not root:
        return
    path.append(root.val)
    # a leaf has neither a left nor a right child
    if root.val == sum and root.left is None and root.right is None:
        res.append(path)
    dfs(root.left, sum - root.val, path, res)
    dfs(root.right, sum - root.val, path, res)
    del path[-1]
def main():
    root = TreeNode(12)
    root.left = TreeNode(7)
    root.right = TreeNode(1)
    root.left.left = TreeNode(4)
    root.right.left = TreeNode(10)
    root.right.right = TreeNode(5)
    sum = 23
    print("Tree paths with sum " + str(sum) +
          ": " + str(find_paths(root, sum)))
main()
This has the following output:
Tree paths with sum 23: [[], []]
But if I change res.append(path) to res.append(list(path)), it returns the correct answer: Tree paths with sum 23: [[12, 7, 4], [12, 1, 10]]. I am confused about why using list() would change the answer.

res.append(path) appends the object path itself to the list res. After that line, when you modify the path object (for example with del path[-1]), the modification also shows up in the element stored in res, because they are the same object.
list(path), on the other hand, makes a (shallow) copy of path, so the appended list is a different object from path. When you modify path after that, the modification does not propagate to the copy.
You will get the same result if you use path[:] or path.copy() instead of list(path).
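The difference can be reproduced without any tree at all. This is a minimal sketch; the names aliased and copied are made up for the illustration.

```python
path = [12, 7]
aliased = []
copied = []

aliased.append(path)        # stores a reference to the same list object
copied.append(list(path))   # stores an independent shallow copy

# Simulate the backtracking step (del path[-1]) emptying the working list.
del path[-1]
del path[-1]

print(aliased)              # [[]]
print(copied)               # [[12, 7]]
print(aliased[0] is path)   # True
```

Because dfs keeps deleting from the one shared path until it is empty, every entry stored without a copy ends up as the same empty list.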

res.append(path) appends the actual path object, not a copy of it. So if path changes later on, the change will appear in res also.
res.append(list(path)) appends a copy.

My implementation of Dijkstra's algorithm in Python using recursion

I'm a beginner and this is my implementation. Can you give me your opinion on this algorithm?
I managed to build it after 3 days.
I found lots of implementations, but I found them confusing,
and I wanted to build it by myself.
class Graph:
    def __init__(self, size):
        self.edges = {}
        self.cost = {}
    def addNode(self, node):
        self.edges[node] = {}
    def addEdge(self, node1, node2, w):
        self.edges[node1][node2] = w
    def getSub(self, node):
        return self.edges[node]
    def getCost(self, node):
        return self.cost[node]
    def setCost(self, node1, node2, edge):
        if self.getCost(node1) + edge < self.getCost(node2):
            self.cost[node2] = self.getCost(node1) + edge
    def getDistance(self, node1, node2, c):
        return self.getSub(node1)[node2] + c
    # This function travels the whole graph and updates the cost of each node.
    def Dijkstra(self, start, visited):
        visited += [start]
        for child in self.getSub(start):
            self.setCost(start, child, self.getSub(start)[child])
        for node in self.getSub(start):
            if node not in visited:
                self.Dijkstra(node, visited)
    # After the cost of each node/vertex is set, this function finds the shortest path.
    def Dijkstra_helper(self, start, end, paths, dis=0):
        paths += [start]
        if start == end:
            return paths
        for node in self.getSub(start):
            if end in paths:
                break
            if node not in paths:
                new_dis = self.getDistance(start, node, dis)
                # Part of an example graph:
                #   S --1-- C
                #    \2
                #     A --6--> B(3)
                # If we come from A, the distance from A to B is 6 > 3, which
                # means we should come from C; the if statement below prevents
                # us from continuing along this path.
                if new_dis <= self.getCost(node) and new_dis <= self.getCost(end):
                    self.Dijkstra_helper(node, end, paths, new_dis)
        return paths
if __name__ == "__main__":
    nodes = ["S", "B", "C", "D", "E"]
    g = Graph(len(nodes))
    for node in nodes:
        g.addNode(node)
    g.cost["S"] = 0
    infinity = float('inf')
    for n in nodes[1:]:
        g.cost[n] = infinity
    g.addEdge("S", "D", 1)
    g.addEdge("S", "B", 6)
    g.addEdge("B", "C", 5)
    g.addEdge("D", "E", 1)
    g.addEdge("D", "B", 2)
    g.addEdge("E", "B", 2)
    g.addEdge("E", "C", 5)
    g.Dijkstra("S", [])
    print(g.cost)
    print(g.Dijkstra_helper("S", "C", []))
I tested this algorithm and it works, but there is one problem: the shortest path it returns depends on the order in which the edges are added. For example, adding S --> D first gives a different result than adding S --> B first; you can test the algorithm to see what I mean. Is there any way I could further optimize it?
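For comparison, here is a sketch of the usual priority-queue formulation of Dijkstra's algorithm, which does not depend on the order in which edges were added. It is not the code above: it swaps the recursion for a loop over a heap from the standard heapq module, and the function names dijkstra and rebuild_path are hypothetical. It assumes non-negative edge weights and the same dict-of-dicts edge layout as the Graph class.

```python
import heapq


def dijkstra(edges, start):
    """Return (dist, prev) for all nodes reachable from start."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter route was already found
        for nxt, w in edges.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return dist, prev


def rebuild_path(prev, start, end):
    """Walk the predecessor links back from end to start.

    Raises KeyError if end is not reachable from start.
    """
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Same edges as in the question.
edges = {
    "S": {"D": 1, "B": 6},
    "B": {"C": 5},
    "C": {},
    "D": {"E": 1, "B": 2},
    "E": {"B": 2, "C": 5},
}
dist, prev = dijkstra(edges, "S")
path = rebuild_path(prev, "S", "C")
print(dist["C"], path)  # 7 ['S', 'D', 'E', 'C']
```

The heap always hands back the closest unfinished node, so each node's distance is final the first time it is popped with an up-to-date entry.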

How come these 2 implementations of dfs give different results?

This one gives the correct result:
def binaryTree(root):
    paths = []
    def dfs(root, path=""):
        if root:
            if path != "":
                path += "->"
            path += str(root.val)
            if not root.left and not root.right:
                paths.append(path)
            dfs(root.left, path)
            dfs(root.right, path)
    dfs(root)
    return paths  # gives ['1->2->4', '1->2->5', '1->3']
And in this one the list of path keeps growing:
def binaryTree2(root):
    paths = []
    def dfs(root, path=[]):
        if root:
            path.append(root.val)
            if not root.left and not root.right:
                paths.append("->".join(map(str, path)))
            dfs(root.left, path)
            dfs(root.right, path)
    dfs(root)
    return paths  # gives ['1->2->4', '1->2->4->5', '1->2->4->5->3']
The tree is like this: <1, <2, <4, None, None>, <5, None, None>>, <3, None, None>>
The only difference is that in one I concatenate strings and in the other I append to list.

In the first implementation, every path += ... statement creates a new string and makes the local name path point to it, so each recursive call works with its own string.
In the second implementation, a single list is passed around to every call, so values appended in one branch are still there in the next. You should pop the node off the list right before dfs returns.
def binaryTree2(root):
    paths = []
    def dfs(root, path=[]):
        if root:
            path.append(root.val)
            if not root.left and not root.right:
                paths.append("->".join(map(str, path)))
            dfs(root.left, path)
            dfs(root.right, path)
            path.pop()  # this clears your stack as your functions return
    dfs(root)
    return paths
Edit: Python strings are immutable - i.e. once created, they can't be modified.
# below line essentially creates a pointer,
# and a string object that `path` points to.
path = "huhu"
# this creates another string object `huhu123`.
# So at this point we have 3 strings objects,
# "123", "huhu" and "huhu123". And a pointer `path` if you will.
# `path` points to "huhu123"
path += "123"
Ordinary objects are garbage collected once no references to them remain. Strings can get special treatment: CPython may intern some strings (typically short literals such as "huhu" and "123"), so those can stay alive longer than the references to them.

Print all paths of a binary tree (DFS)

I was trying to print all root-to-leaf paths of a binary tree, but to no avail.
My strategy is to use recursion, with the base case being that the tree is None or the node is a leaf; otherwise, traverse the left and right subtrees.
But I can't find a way to retain the paths from both the left and right subtrees.
def pathSum(self, root, target, result):
    if not root:
        return []
    if not root.left and not root.right:
        return [root.val]
    for i in [root.left, root.right]:
        path = [root.val] + self.pathSum(i, target, result)
        print("path", path)
    return path

The idea is to build the path (a list) at each node visit: if the current node is a leaf, add it to the path and print the path; if not, just add it to extend the path and recurse:
def pathSum(self, path):
    # path + [...] creates a new list, so each branch gets its own copy
    path = path + [self.val]
    if not self.left and not self.right:
        print(path)
        return
    # guard against nodes that have only one child
    if self.left:
        self.left.pathSum(path)
    if self.right:
        self.right.pathSum(path)
root.pathSum([])
Update: If you want to keep all paths:
def pathSum(self, current_path, all_paths):
    # a new list per call, so stored paths are never modified later
    current_path = current_path + [self.val]
    if not self.left and not self.right:
        print('Path found: ' + str(current_path))
        all_paths.append(current_path)
        return
    # guard against nodes that have only one child
    if self.left:
        self.left.pathSum(current_path, all_paths)
    if self.right:
        self.right.pathSum(current_path, all_paths)
all_paths = []
root.pathSum([], all_paths)
print('All paths: ' + str(all_paths))
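The same idea can be written as a standalone function that returns the paths instead of filling a list passed in by the caller. This is only a sketch: the TreeNode class and the name root_to_leaf_paths are assumptions for the illustration, and the tree is the one from the earlier question, <1, <2, <4>, <5>>, <3>>.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def root_to_leaf_paths(node, current_path=()):
    """Return a list of all root-to-leaf paths below node."""
    if node is None:
        return []  # an empty subtree contributes no paths
    # Tuples are immutable, so every call gets its own extended path.
    current_path = current_path + (node.val,)
    if node.left is None and node.right is None:
        return [list(current_path)]  # leaf: one finished path
    return (root_to_leaf_paths(node.left, current_path)
            + root_to_leaf_paths(node.right, current_path))


root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
all_paths = root_to_leaf_paths(root)
print(all_paths)  # [[1, 2, 4], [1, 2, 5], [1, 3]]
```

Returning [] for a missing child means nodes with only one child need no special handling.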

After some iterations, I found that the following solution works, but I'm not sure whether there is a more efficient way of finding all root-to-leaf paths.
The idea behind this solution is pre-order traversal:
def allPaths(self, root, path, all_path):
    # check for None first, so a node with a single child does not crash
    if not root:
        return all_path
    path.append(root.val)
    if not root.left and not root.right:
        all_path.append(path[:])  # store a copy of the finished path
    else:
        self.allPaths(root.left, path, all_path)
        self.allPaths(root.right, path, all_path)
    path.pop()  # backtrack: remove this node before returning
    return all_path

Adding to list a class instance

I'm implementing code to find the shortest path between two nodes, but
why does the output change when I change the first line of the DFS function?
Isn't it true that
path += [start] is equivalent to path = path + [start]?
The output before the change is:
Current DFS path: 0
Current DFS path: 0->1
Current DFS path: 0->1->2
Current DFS path: 0->1->2->3
Current DFS path: 0->1->2->3->4
Current DFS path: 0->1->2->3->5
Current DFS path: 0->1->2->4
Current DFS path: 0->2
Current DFS path: 0->2->3
Current DFS path: 0->2->3->1
Current DFS path: 0->2->3->4
Current DFS path: 0->2->3->5
Current DFS path: 0->2->4
shortest path is 0->2->3->5
The output after the change is:
Current DFS path: 0
Current DFS path: 0->1
Current DFS path: 0->1->2
Current DFS path: 0->1->2->3
Current DFS path: 0->1->2->3->4
Current DFS path: 0->1->2->3->4->5
shortest path is 0->1->2->3->4->5
The code:
class Node(object):
    def __init__(self, name):
        """Assumes name is a string"""
        self.name = name
    def getName(self):
        return self.name
    def __str__(self):
        return self.name
class Edge(object):
    def __init__(self, src, dest):
        """Assumes src and dest are nodes"""
        self.src = src
        self.dest = dest
    def getSource(self):
        return self.src
    def getDestination(self):
        return self.dest
    def __str__(self):
        return self.src.getName() + '->' + self.dest.getName()
class WeightedEdge(Edge):
    def __init__(self, src, dest, weight = 1.0):
        """Assumes src and dest are nodes, weight a number"""
        self.src = src
        self.dest = dest
        self.weight = weight
    def getWeight(self):
        return self.weight
    def __str__(self):
        return self.src.getName() + '->(' + str(self.weight) + ')'\
               + self.dest.getName()
#Figure 12.8
class Digraph(object):
    #nodes is a list of the nodes in the graph
    #edges is a dict mapping each node to a list of its children
    def __init__(self):
        self.nodes = []
        self.edges = {}
    def addNode(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate node')
        else:
            self.nodes.append(node)
            self.edges[node] = []
    def addEdge(self, edge):
        src = edge.getSource()
        dest = edge.getDestination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)
    def childrenOf(self, node):
        return self.edges[node]
    def hasNode(self, node):
        return node in self.nodes
    def __str__(self):
        result = ''
        for src in self.nodes:
            for dest in self.edges[src]:
                result = result + src.getName() + '->'\
                         + dest.getName() + '\n'
        return result[:-1] #omit final newline
class Graph(Digraph):
    def addEdge(self, edge):
        Digraph.addEdge(self, edge)
        rev = Edge(edge.getDestination(), edge.getSource())
        Digraph.addEdge(self, rev)
#Figure 12.9
def printPath(path):
    """Assumes path is a list of nodes"""
    result = ''
    for i in range(len(path)):
        result = result + str(path[i])
        if i != len(path) - 1:
            result = result + '->'
    return result
def DFS(graph, start, end, path, shortest, toPrint = False):
    """Assumes graph is a Digraph; start and end are nodes;
       path and shortest are lists of nodes
       Returns a shortest path from start to end in graph"""
    path = path + [start]
    if toPrint:
        print('Current DFS path:', printPath(path))
    if start == end:
        return path
    for node in graph.childrenOf(start):
        if node not in path: #avoid cycles
            if shortest is None or len(path) < len(shortest):
                newPath = DFS(graph, node, end, path, shortest,
                              toPrint)
                if newPath is not None:
                    shortest = newPath
    return shortest
def shortestPath(graph, start, end, toPrint = False):
    """Assumes graph is a Digraph; start and end are nodes
       Returns a shortest path from start to end in graph"""
    return DFS(graph, start, end, [], None, toPrint)
#Figure 12.10
def testSP():
    nodes = []
    for name in range(6): #Create 6 nodes
        nodes.append(Node(str(name)))
    g = Digraph()
    for n in nodes:
        g.addNode(n)
    g.addEdge(Edge(nodes[0],nodes[1]))
    g.addEdge(Edge(nodes[1],nodes[2]))
    g.addEdge(Edge(nodes[2],nodes[3]))
    g.addEdge(Edge(nodes[2],nodes[4]))
    g.addEdge(Edge(nodes[3],nodes[4]))
    g.addEdge(Edge(nodes[3],nodes[5]))
    g.addEdge(Edge(nodes[0],nodes[2]))
    g.addEdge(Edge(nodes[1],nodes[0]))
    g.addEdge(Edge(nodes[3],nodes[1]))
    g.addEdge(Edge(nodes[4],nodes[0]))
    sp = shortestPath(g, nodes[0], nodes[5])
    print('Shortest path found by DFS:', printPath(sp))
Note: this code is taken from a book.

They are not the same.
path += [start] is equivalent to path.extend([start]): it mutates path.
On the other hand,
path = path + [start] creates a new list and binds the name path to it.
Consider the following experiment, and note the IDs:
>>> a = [1]
>>> id(a)
55937672
>>> a += [2,3]
>>> id(a)
55937672
>>> b = [1]
>>> id(b)
55930440
>>> b = b + [1,2]
>>> id(b)
55937288
The ID of b changed but the ID of a didn't.
As to why it makes a difference in your code -- DFS is a function. In the version which uses path += [start], you are modifying the passed parameter path -- and this modification persists after the call returns. On the other hand, in the version which uses path = path + [start], you are creating a new local variable named path, one which goes out of scope when the call returns, without any changes to the parameter path.
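The same effect can be seen at the level of a function call. This is a minimal sketch; the two helper functions are made up for the illustration and stand in for the first line of DFS.

```python
def extend_in_place(path, node):
    path += [node]        # list.__iadd__: mutates the caller's list
    return path


def extend_copy(path, node):
    path = path + [node]  # builds a new list; the caller's list is untouched
    return path


shared = [0]
extend_in_place(shared, 1)
print(shared)     # [0, 1]

untouched = [0]
result = extend_copy(untouched, 1)
print(untouched)  # [0]
print(result)     # [0, 1]
```

With the in-place version, every recursive call to DFS keeps adding to one shared list, so nodes visited in an abandoned branch are never removed. That is why the printed path only ever grows in the second output.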

In the line
path=path+[start]
you create a new list object.
In the line
path+=[start]
you modify the list object that already exists.
You can try this:
path2=path[:]
path2+=[start]
