I have the following directed graph, in which every node has one or more attributes. I am trying to modify the BFS algorithm to find all possible paths from a starting node that extend until the given attributes are covered. I also want each path I find not to be part of a cycle.
For this graph, if I start from node 1 and want to cover attr 4, the paths that my algorithm finds are:
1-2-3
1-2-5-3
1-2-5-6-8
If I add the edge 3-1, I want the paths 1-2-3 and 1-2-5-3 to be rejected because they are part of a cycle. So in my algorithm I check the neighbors of the last visited node, and if a neighbor has already been visited I try to discard the path. But this doesn't work: after adding the edge 3-1, the algorithm still returns the same paths. How can I fix this?
Here is my code:
import queue

import networkx as nx

G = nx.DiGraph()
G.add_edge(1,2)
G.add_edge(2,3)
G.add_edge(2,5)
G.add_edge(3,4)
G.add_edge(5,3)
G.add_edge(5,6)
G.add_edge(5,7)
G.add_edge(6,8)
G.add_edge(3,1)

def checkIfRequiredAttrsAreCovered(path, attrsToBeCovered):
    coveredAttrs = []
    counter = 0
    for node in path:
        coveredAttrs.extend(G.node[node]['attrs'])
    for i in attrsToBeCovered:
        if i in coveredAttrs:
            counter = counter + 1
    if counter == len(attrsToBeCovered):
        return True
    else:
        return False

def bfs(G, startingNode, attrsToBeCovered):
    paths = []
    q = queue.Queue()
    q.put([startingNode])
    while not q.empty():
        v = q.get()
        if checkIfRequiredAttrsAreCovered(v, attrsToBeCovered) == True:
            for i in G.neighbors(v[-1]):
                if i in v:
                    break
            paths.append(v) #print(v)
        else:
            for node in G.neighbors(v[-1]):
                if node not in v:
                    path = []
                    path.extend(v)
                    path.append(node)
                    q.put(path)
    print(paths)
I'll assume that you don't care whether nodes are part of a bigger cycle, e.g. if 4 is connected to 1, then 3 is in the cycle 1-2-3-4. If you want to handle this, you could start a DFS from each matching node, with the current path marked as visited.
First, you should use snake case in Python.
Second, you should use sets to compare the attributes covered to the attributes to be covered. For a path, compute the set of covered attributes and compare the sets:
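def check_if_required_attrs_are_covered(G, path, attrs_to_be_covered): # be sure to pass G as an argument here
    # each node's 'attrs' is a collection, so flatten them all into a single set
    # (on NetworkX >= 2.4, write G.nodes[n] instead of G.node[n])
    covered_attrs = set(attr for n in path for attr in G.node[n]['attrs'])
    return covered_attrs >= attrs_to_be_covered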
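To illustrate the set comparison used above, here is a minimal sketch with made-up attribute names (they are assumptions, not taken from the original graph). The `>=` operator on sets tests whether the left side is a superset of the right side.

```python
# Hypothetical attribute sets, for illustration only.
required = {"attr4"}
covered = {"attr1", "attr2", "attr4"}
partial = {"attr1", "attr2"}

# `a >= b` is True when every element of b is also in a (superset test).
is_covered = covered >= required
is_partial_covered = partial >= required

print(is_covered, is_partial_covered)
```

This replaces the manual counter loop from the question with a single expression.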
Third, some remarks on the bfs function:
A test if b == True: is equivalent to if b:, because for a boolean b, b == (b == True) (try with True and False to convince yourself).
The way you append a path to q can be shortened to q.put(v + [node]).
You probably do not need a synchronized queue: use a list.
Use return instead of print or, even better, create a generator that yields paths when they are found.
Fourth: what is the problem? Look at the for i in G.neighbors(v[-1]): loop.
Whether you break or not, you reach the line paths.append(v).
That's why you do not exclude the paths with cycles. You want to distinguish the normal end of the loop from the break.
That's a perfect case for a little-known loop syntax in Python: the for...else loop.
I quote the docs: "a loop's else clause runs when no break occurs". This gives the following code:
for i in G.neighbors(v[-1]):
    if i in v:
        break
else: # no neighbor from v[-1] in v
    yield v # instead of paths.append(v)
But you can also use any for a more natural expression:
if not any(i in v for i in G.neighbors(v[-1])):
    yield v # instead of paths.append(v)
The complete function becomes:
def bfs(G, starting_node, attrs_to_be_covered):
    q = [[starting_node]]
    while q:
        v = q.pop(0) # take from the front so paths are explored in BFS order
        if check_if_required_attrs_are_covered(G, v, attrs_to_be_covered): # be sure to pass G as an argument
            if not any(i in v for i in G.neighbors(v[-1])):
                yield v
        else:
            for node in G.neighbors(v[-1]):
                if node not in v:
                    q.append(v + [node])
Try it with:
print (list(bfs(G, 1, set(["attr4"]))))
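For readers without NetworkX at hand, the same idea can be sketched with a plain dictionary. This is only a sketch under stated assumptions: the function name `covering_paths`, the adjacency dictionary, and especially the attribute assignment are hypothetical, since the original figure is not available. The attributes were chosen so that nodes 3 and 8 carry attr4, which matches the paths listed in the question.

```python
from collections import deque


def covering_paths(graph, attrs, start, required):
    """Yield simple paths from `start` that cover `required` and whose
    last node has no edge leading back into the path."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()  # FIFO order gives a breadth-first search
        covered = set().union(*(attrs[n] for n in path))
        neighbors = graph.get(path[-1], [])
        if covered >= required:
            # Accept only if no neighbor of the last node is already on the path.
            if not any(n in path for n in neighbors):
                yield path
        else:
            for n in neighbors:
                if n not in path:
                    queue.append(path + [n])


# Hypothetical attribute assignment (assumption, not from the original post).
attrs = {1: {"attr1"}, 2: {"attr2"}, 3: {"attr4"}, 4: {"attr3"},
         5: {"attr2"}, 6: {"attr1"}, 7: {"attr3"}, 8: {"attr4"}}
graph = {1: [2], 2: [3, 5], 3: [4], 5: [3, 6, 7], 6: [8]}

without_cycle = list(covering_paths(graph, attrs, 1, {"attr4"}))
graph[3] = [4, 1]  # add the edge 3-1
with_cycle = list(covering_paths(graph, attrs, 1, {"attr4"}))

print(without_cycle)
print(with_cycle)
```

With the edge 3-1 added, the two paths ending at node 3 are rejected because node 3 then has a neighbor that is already on the path.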
I'm solving this LeetCode problem and here's my code:
from collections import defaultdict
from typing import List

class Solution:
    def alienOrder(self, words: List[str]) -> str:
        adjacent = defaultdict(set)
        for i in range(1, len(words)):
            w1, w2 = words[i - 1], words[i]
            min_len = min(len(w1), len(w2))
            if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]:
                return ""
            for j in range(min_len):
                if w1[j] != w2[j]:
                    adjacent[w1[j]].add(w2[j]) # modify, not in the loop causing error
                    break
        visited = dict()
        res = []
        def dfs(c):
            if c in visited:
                return visited[c]
            visited[c] = True
            for neighbor in adjacent[c]: # read only
                if dfs(neighbor):
                    return True
            visited[c] = False
            res.append(c)
            return False
        for c in adjacent: # RuntimeError: dictionary changed size during iteration
            if dfs(c):
                return ''
        return ''.join(reversed(res))
The line for c in adjacent throws "RuntimeError: dictionary changed size during iteration", which I don't understand. I'm not modifying adjacent in dfs(), am I?
The main problem is that the dfs function executes this line:
for neighbor in adjacent[c]:
With a defaultdict, this lookup returns the associated value if the key exists. If the key does not exist, the lookup creates it and adds it to the dictionary.
The line that triggers a lookup in adjacent without knowing whether the key exists is
if dfs(neighbor):
neighbor may or may not be a key in adjacent, so this call can make the defaultdict grow while you are iterating over it. You could check whether the key exists first and skip it if it does not.
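The effect is easy to reproduce in isolation. The following minimal sketch (the keys "a", "b", and "z" are arbitrary examples) shows that merely reading a missing key from a defaultdict inserts it, while .get() does not.

```python
from collections import defaultdict

adjacent = defaultdict(set)
adjacent["a"].add("b")

keys_before = list(adjacent)  # only "a" is a key so far

_ = adjacent["b"]  # a plain read of a missing key inserts it
keys_after_read = list(adjacent)

neighbors = adjacent.get("z", set())  # .get() never inserts
keys_after_get = list(adjacent)

print(keys_before, keys_after_read, keys_after_get)
```

Reading neighbors with .get() inside dfs is therefore one way to keep the dictionary size stable during iteration.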
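@gilf0yle points out that the problem is the defaultdict potentially inserting new keys. One solution is to "freeze" it before iteration by converting it to a normal dict. Note that dfs must then read neighbors with adjacent.get(c, ()), because a plain dict raises KeyError for letters that have no outgoing edges.
adjacent = dict(adjacent) # No more default key insertion from here on
for c in adjacent:
    if dfs(c):
        return ''
return ''.join(reversed(res))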
I am using DFS to get all routes between two nodes.
My python code is as follows:
graph = {0: [1, 2, 3],
         1: [3],
         2: [0, 1],
         3: []}
def DFS(start, stop, path=[], visited=[]):
    global count
    global result
    # add the visited node to path
    path.append(start)
    # mark this node visited to avoid infinite loop
    visited.append(start)
    # found
    if start == stop:
        print(path)
    else:
        # if not found
        values = graph.get(start)
        for next_ in values:
            # not visited node
            if not next_ in visited:
                DFS(next_, stop, path, visited)
    # remove the node from path and unmarked it
    path.remove(start)
    visited.remove(start)
The problem is that if I print path in if start == stop, all 3 routes can be printed correctly.
>>> DFS(2, 3)
[2, 0, 1, 3]
[2, 0, 3]
[2, 1, 3]
But if I change to return path in if start == stop, it would return nothing.
def DFS(start, stop, path=[], visited=[]):
    global count
    global result
    # add the visited node to path
    path.append(start)
    # mark this node visited to avoid infinite loop
    visited.append(start)
    # found
    if start == stop:
        return path
    else:
        # if not found
        values = graph.get(start)
        for next_ in values:
            # not visited node
            if not next_ in visited:
                DFS(next_, stop, path, visited)
    # remove the node from path and unmarked it
    path.remove(start)
    visited.remove(start)
>>> result = DFS(2, 3)
>>> result
"But if I change to return path in if start == stop, it would return nothing."
Right; because you got to this level of recursion from the previous one, which recursively called DFS(next_, stop, path, visited)... and ignored the result.
It is the same as if you called functions normally:
def inner():
    return "hello"
def outer():
    inner() # oops, it is not returned.
print(outer()) # None
In general you want to return the results from your recursive calls; but your case is a little special because you need to accumulate the results from multiple recursive calls (for next_ in values:). You could build a list and return it, but this is a bit tricky:
if start == stop:
    result = [path[:]] # for uniformity, we need a list of paths in this case too.
    # We store a copy, because `path` itself is mutated again by the cleanup below.
    # Also, we can't `return` here, because we'll miss the cleanup at the end.
else:
    result = []
    values = graph.get(start)
    for next_ in values:
        # BTW, Python treats `not in` as a single operator that does
        # what we want here. It's preferred because it's easier to read.
        if next_ not in visited:
            # add results from the recursive call to our result.
            result.extend(DFS(next_, stop, path, visited))
            # it is `.extend` and not `.append` here because otherwise we will
            # build a tree of nested lists - do you understand why?
# Either way, we want to do our cleanup, and return the collected result.
path.remove(start)
visited.remove(start)
return result # important!
Tricky, right?
My preferred solution for these situations, therefore, is to write a recursive generator, and collect the results outside the recursion:
# Inside the function, we do:
if start == stop:
    yield path[:] # yield a copy; `path` itself keeps being mutated
else:
    values = graph.get(start)
    for next_ in values:
        if next_ not in visited:
            yield from DFS(next_, stop, path, visited)
path.remove(start)
visited.remove(start)
# Then when we call the function, collect the results:
paths = list(DFS(2, 3))
# Or iterate over them directly:
for path in DFS(2, 3):
    print("For example, you could take this route:", path)
(Also, the comment you received was good advice. Recursion is a lot easier to understand when you don't try to mutate the arguments and clean up afterwards. Instead, always pass those arguments, and when you make the recursive call, pass a modified version. When the recursion returns, cleanup is automatic, because you just go back to using the old object in the old stack frame.)
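A minimal sketch of that non-mutating style, assuming the same graph dictionary as in the question. The function name `all_paths` and the tuple-based `path` argument are illustrative choices, not part of the original code. Each call builds a new tuple, so there is nothing to clean up, and the path itself doubles as the visited set.

```python
graph = {0: [1, 2, 3],
         1: [3],
         2: [0, 1],
         3: []}


def all_paths(graph, start, stop, path=()):
    """Yield every simple route from start to stop as a list of nodes."""
    path = path + (start,)  # new tuple; the caller's tuple is untouched
    if start == stop:
        yield list(path)
        return
    for next_ in graph.get(start, []):
        if next_ not in path:  # nodes already on the path count as visited
            yield from all_paths(graph, next_, stop, path)


paths = list(all_paths(graph, 2, 3))
print(paths)
```

Because tuples are immutable, the default argument `path=()` does not suffer from the shared mutable default problem that `path=[]` has.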
The problem with your code is that
result = DFS(2, 3)
will only return a valid result if start == stop, which is not the case here since 2 != 3.
To get a value back you have to change the line
DFS(next_, stop, path, visited)
to
return DFS(next_, stop, path, visited)
Now whenever start becomes equal to stop, the path is returned and this value is propagated upwards. Note that this returns at most one route (from the first branch explored), not all of them.
I'm working on a homework problem in which we have to write an algorithm that can determine whether a graph is bipartite or not. My Python solution works, but right now it throws an exception if the graph is not bipartite. Instead, I would like it to return a bool. How could I modify this code?
def is_bipartite(v, visited, colors, counter):
    print(v)
    # Set this vertex to visited
    visited[v] = True
    colors[v] = counter % 2
    # Explore links
    for u in v.links:
        # If linked up node u has already been visited, check its color to see if it breaks
        # the bipartite property of the graph
        if u in visited:
            if colors[v] == colors[u]:
                raise Exception("Not Bipartite")
        # If the link has not been visited then visit it
        if u not in visited:
            visited[u] = False
            is_bipartite(u, visited, colors, counter + 1)
If I understand your code correctly, you want to return False if you get matching colors anywhere along your recursive search. You want to return True if you get to the end of the search without finding anything.
That is not too hard to do. Just change the raise statement to return False and check the result of the recursive calls, and return False if any of them return a False result. Then just put return True at the end of the function and you're done:
def is_bipartite(v, visited, colors, counter):
    visited[v] = True
    colors[v] = counter % 2
    for u in v.links:
        if u in visited:
            if colors[v] == colors[u]:
                return False # return instead of raise in this base case
        if u not in visited:
            visited[u] = False
            if not is_bipartite(u, visited, colors, counter + 1): # check the recursion
                return False # pass on any False
    return True # return True only if you got to the end without returning False above
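To try the same return-and-propagate pattern without the original node class, here is a self-contained sketch. It assumes a hypothetical adjacency dictionary in place of v.links, and a single colors dictionary that also serves as the visited record. Like the original, it only examines the connected component containing the starting vertex.

```python
def is_bipartite(graph, v, colors=None, color=0):
    """Two-color the component containing v; return False on any conflict."""
    if colors is None:
        colors = {}
    colors[v] = color
    for u in graph[v]:
        if u in colors:
            if colors[u] == color:
                return False  # neighbor has the same color: not bipartite
        elif not is_bipartite(graph, u, colors, 1 - color):
            return False  # pass on any False from the recursion
    return True


# Hypothetical test graphs: an even cycle is bipartite, a triangle is not.
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(is_bipartite(square, 0), is_bipartite(triangle, 0))
```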
I have created an algorithm that, given two nodes A and B in a BST, switches the roles (or positions in the tree) of the two by simply moving pointers. In my representation of a BST, I am using a doubly linked connection (i.e. A.parent == B and (B.left == A or B.right == A)). I am not sure whether it's completely correct. I have divided the algorithm into two situations:
A and B are directly connected (either A is the parent of B or B is the parent of A)
All the other cases
For each of these cases I have created a nested function. I would like your opinion, first on the correctness of the algorithm, and then on whether I can improve it somehow. Here's the code:
def switch(self, x: BSTNode, y: BSTNode, search_first=False):
    if not x:
        raise ValueError("x cannot be None.")
    if not y:
        raise ValueError("y cannot be None.")
    if x == y:
        raise ValueError("x cannot be equal to y")
    if search_first:
        if not self.search(x.key) or not self.search(y.key):
            raise LookupError("x or y not found.")
    def switch_1(p, s):
        """Switches the roles of p and s,
        where p (parent) is the direct parent of s (son)."""
        assert s.parent == p
        if s.is_left_child():
            p.left = s.left
            if s.left:
                s.left.parent = p
            s.left = p
            s.right, p.right = p.right, s.right
            if s.right:
                s.right.parent = s
            if p.right:
                p.right.parent = p
        else:
            p.right = s.right
            if s.right:
                s.right.parent = p
            s.right = p
            s.left, p.left = p.left, s.left
            if s.left:
                s.left.parent = s
            if p.left:
                p.left.parent = p
        if p.parent:
            if p.is_left_child():
                p.parent.left = s
            else:
                p.parent.right = s
        else: # p is the root
            self.root = s
        s.parent = p.parent
        p.parent = s
    def switch_2(u, v):
        """u and v are nodes in the tree
        that are not related by a parent-son
        or a grandparent-son relationship."""
        if not u.parent:
            self.root = v
            if v.is_left_child():
                v.parent.left = u
            else:
                v.parent.right = u
        elif not v.parent:
            self.root = u
            if u.is_left_child():
                u.parent.left = v
            else:
                u.parent.right = v
        else: # neither u nor v is the root
            if u.is_left_child():
                if v.is_left_child():
                    v.parent.left, u.parent.left = u, v
                else:
                    v.parent.right, u.parent.left = u, v
            else:
                if v.is_left_child():
                    v.parent.left, u.parent.right = u, v
                else:
                    v.parent.right, u.parent.right = u, v
        v.parent, u.parent = u.parent, v.parent
        u.left, v.left = v.left, u.left
        u.right, v.right = v.right, u.right
        if u.left:
            u.left.parent = u
        if u.right:
            u.right.parent = u
        if v.left:
            v.left.parent = v
        if v.right:
            v.right.parent = v
    if x.parent == y:
        switch_1(y, x)
    elif y.parent == x:
        switch_1(x, y)
    else:
        switch_2(x, y)
I really need switch to work in all cases, no matter which nodes x and y we choose. I have already done some tests, and it seems to work, but I am still not sure.
EDIT
Finally, in case it's helpful, here is the complete implementation of my BST (with the tests I am doing):
https://github.com/dossan/ands/blob/master/ands/ds/BST.py
EDIT 2 (just a curiosity)
@Rishav commented:
I do not understand the intention behind this function.. if it is to swap two nodes in the BST, is it not sufficient to swap their data instead of manipulating pointers?
I answered:
OK, maybe I should have said a little more about the reason behind this "monster" function. I can insert BSTNode objects or any comparable objects in my BST. When the user decides to insert any comparable object, the responsibility of creating the BSTNode is mine, so the user has no access to an initial BSTNode reference unless they search for the key. But a BSTNode would only be returned after the insertion of the key, or if there's already another BSTNode object in the tree with the same key (or value), but this latter case is irrelevant.
The user can also insert a BSTNode object in the tree, which has an initial key (or value) that should remain constant. If I just exchanged the values or keys of the nodes, the user would hold a reference to a node with a different key than the key of the node they inserted. Of course, I want to avoid this.
You need proper unit testing. I recommend python-nose, which is very easy to use.
As for the test vectors, I'd recommend using every potential combination of two nodes a and b.
In the case of BSTs you have 3 types of nodes:
leaf node,
1-child node,
2-children node,
in combination with the following additional cases:
a is the root, or
a is the parent of b, or
a is not the parent of b,
and their combinations as well (also in the symmetric situation).
Then, after swapping, you'll need to check all the nodes involved, i.e.
a, b, the children of a and b, and the parents of a and b, to see whether everything went as planned.
I'd do that using a small tree that contains all the types of nodes.
Then go through all possible combinations of nodes: swap them, check against the expected outcome, and swap again to bring the tree back to its original state.
[ EDIT ]
If your question was how to avoid all the tedious work, you may consider looking for a well-established BST implementation and comparing its results with those of your function. Test vectors can be created automatically by using a prepared tree and generating all possible pairs of nodes of this tree.
[/EDIT]
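As a rough sketch of how such vectors and checks could be generated automatically: the `Node` class, `attach`, `links_consistent`, and `all_nodes` below are hypothetical helpers written for this illustration, not part of the BST implementation under discussion. The checker verifies that every child's parent pointer points back at its parent, which is the invariant any swap must preserve.

```python
from itertools import combinations


class Node:
    """Hypothetical minimal node with parent, left, and right pointers."""
    def __init__(self, key):
        self.key = key
        self.parent = self.left = self.right = None


def attach(parent, child, side):
    """Link child under parent on the given side ('left' or 'right')."""
    setattr(parent, side, child)
    child.parent = parent


def links_consistent(root):
    """Return True if every child's parent pointer points back at its parent."""
    if root is None:
        return True
    if root.parent is not None:
        return False
    stack = [root]
    while stack:
        node = stack.pop()
        for child in (node.left, node.right):
            if child is not None:
                if child.parent is not node:
                    return False
                stack.append(child)
    return True


def all_nodes(root):
    """Collect every node reachable from root through child pointers."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is not None:
            found.append(node)
            stack.extend((node.left, node.right))
    return found


# A small tree with a 2-children node (5), a 1-child node (3), and leaves (2, 8).
root = Node(5)
three, eight, two = Node(3), Node(8), Node(2)
attach(root, three, "left")
attach(root, eight, "right")
attach(three, two, "left")

pairs = list(combinations(all_nodes(root), 2))  # every unordered pair to swap
print(len(pairs), links_consistent(root))
```

A test loop would then call the swap function on each pair, assert `links_consistent` on the current root, and swap back.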
As for unwanted input to the function, you'll need to use your imagination, although in my opinion you have most of the cases covered, except the one that Austin Hastings mentions, where at least one of the input nodes does not belong to the tree.
I found an old version of the same function written for one of my private projects, maybe you can find it useful:
def swap( a, b ):
    if a == b: return
    if a is None or b is None: return
    #if a not in self or b not in self: return
    if b.parent == a:
        a, b = b, a
    #swap connections naively
    a.parent, b.parent = b.parent, a.parent
    a.left, b.left = b.left, a.left
    a.right, b.right = b.right, a.right
    if b.parent == b: #b was the p of a
        b.parent = a
    if a.parent is not None:
        if a.parent.left == b: a.parent.left = a
        else: a.parent.right = a
    else:
        self.root = a
    if b.parent is not None:
        if b.parent.left == a: b.parent.left = b
        else: b.parent.right = b
    else:
        self.root = b
    if a.right is not None: a.right.parent = a
    if a.left is not None: a.left.parent = a
    if b.right is not None: b.right.parent = b
    if b.left is not None: b.left.parent = b
And a performance-optimised version:
def swap_opt( a, b ):
    if a == b: return
    if a is None or b is None: return
    #if a not in self or b not in self: return
    if b.p == a:
        a, b = b, a
    #swap connections naively
    a.p, b.p = b.p, a.p
    a.l, b.l = b.l, a.l
    a.r, b.r = b.r, a.r
    if b.p == b: #b was the p of a
        b.p = a
        if a.l == a:
            a.l = b
            if a.r is not None: a.r.p = a
        else:
            a.r = b
            if a.l is not None: a.l.p = a
        if b.r is not None: b.r.p = b
        if b.l is not None: b.l.p = b
        if a.p is not None:
            if a.p.l == b: a.p.l = a
            else: a.p.r = a
        else:
            #set new root to a
            pass
    else:
        if a.r is not None: a.r.p = a
        if a.l is not None: a.l.p = a
        if b.r is not None: b.r.p = b
        if b.l is not None: b.l.p = b
        if a.p is not None:
            if a.p.l == b: a.p.l = a
            else: a.p.r = a
        else:
            #set new root to a
            pass
        if b.p is not None:
            if b.p.l == a: b.p.l = b
            else: b.p.r = b
        else:
            #set new root to b
            pass
I haven't written proper unit tests for this code, but it worked as I expected it to. I was more interested in the performance differences between the implementations.
swap_opt handles neighbouring nodes a bit faster, giving it a speed increase of around 5% over the compact implementation of swap. [EDIT2] But that depends on the tree used for testing and on the hardware [/EDIT2]
Your BST.py defines class BST. Members of that class have an element, self.root that can point to a node. Your code, as shown, does not account for this.
I believe you need to handle these cases:
Swap the root node with one of its children.
Swap the root node with a non-child.
Swap a non-root node with one of its children.
Swap a non-root node with a non-child non-root node.
Edit: After re-examining switch_1, I think you do handle all the cases.
Also, there is the possibility that a caller could request you swap a node that is not a member of the tree for a node that is a member. Or swap two nodes that are both not members of the current tree. It would cost some code to detect these cases, but you could probably get by with a dict or set to trace tree membership. I don't know if you want to consider "swap-ins" as a valid operation or not.
In several places you compare nodes using ==. That is an operation that can be overridden. You should use is and is not for identity comparisons and comparisons against None.
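The difference matters as soon as a node class defines its own equality. In this minimal sketch, the `Node` class and its key-based `__eq__` are assumptions made for illustration: two distinct nodes compare equal with == but are not the same object.

```python
class Node:
    """Hypothetical node whose equality is based on its key."""
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Node) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


a = Node(1)
b = Node(1)  # a different object with the same key

equal_by_value = (a == b)    # uses the overridden __eq__
same_object = (a is b)       # identity: are these the very same object?

print(equal_by_value, same_object)
```

In a pointer-swapping routine, a check such as x == y could therefore reject or confuse two distinct nodes that merely share a key, whereas x is y cannot.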
Finally, please consider Pythonifying your BST class. It is a mutable iterable container, so it should support the standard operations as much as possible.
I've been playing with BST (binary search tree) and I'm wondering how to do an early exit. Following is the code I've written to find kth smallest. It recursively calls the child node's find_smallest_at_k, stack is just a list passed into the function to add all the elements in inorder. Currently this solution walks all the nodes inorder and then I have to select the kth item from "stack" outside this function.
def find_smallest_at_k(self, k, stack, i):
    if self is None:
        return i
    if (self.left is not None):
        i = self.left.find_smallest_at_k(k, stack, i)
    print(stack, i)
    stack.insert(i, self.data)
    i += 1
    if i == k:
        print(stack[k - 1])
        print("Returning")
    if (self.right is not None):
        i = self.right.find_smallest_at_k(k, stack, i)
    return i
It's called like this,
our_stack = []
self.root.find_smallest_at_k(k, our_stack, 0)
return our_stack[k-1]
I'm not sure if it's possible to exit early from that function. If my k is say 1, I don't really have to walk all the nodes then find the first element. It also doesn't feel right to pass list from outside function - feels like passing pointers to a function in C. Could anyone suggest better alternatives than what I've done so far?
Passing a list as an argument: Passing the list as an argument can be good practice if you make your function tail-recursive. Otherwise it's pointless. With a BST, where there are two potential recursive calls to make, that's a bit of a tall order.
Otherwise you can just return the list. I don't see the need for the variable i. Anyway, if you absolutely need to return multiple values, you can always use a tuple: return i, stack in the function, and i, stack = root.find_smallest_at_k(k) at the call site.
Fast-forwarding: Note that the right descendants of a BST node are always bigger than that node. So if you descend the tree always through right children, you get an increasing sequence of values. A node reached after going right k times has at least k smaller ancestors, so it cannot be among the k smallest values, and it's pointless to go right k times or more.
This still holds if you go left at times during the descent. The BST property ensures that once you go right, ALL subsequent values below in the hierarchy are greater than that parent. Thus going right k times or more is useless.
Code: Here is some quickly written pseudo-Python. It's not tested.
def findKSmallest( self, k, rightSteps=0 ):
    if rightSteps >= k: # we went right k times or more
        return []
    leftSmallest = self.left.findKSmallest( k, rightSteps ) if self.left is not None else []
    rightSmallest = self.right.findKSmallest( k, rightSteps + 1 ) if self.right is not None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
EDIT: The other version, following my comment.
def findKSmallest( self, k ):
    if k == 0:
        return []
    leftSmallest = self.left.findKSmallest( k ) if self.left is not None else []
    rightSmallest = self.right.findKSmallest( k - 1 ) if self.right is not None else []
    mySmallest = sorted( leftSmallest + [self.data] + rightSmallest )
    return mySmallest[:k]
Note that if k == 1, this is indeed the search for the smallest element: any move to the right immediately returns [], which contributes nothing.
As Lærne said, you have to take care to turn your function into a tail-recursive one; then you may be interested in using continuation-passing style, so that your function can call either itself or the "escape" function. I wrote a module called tco for optimizing tail calls; see https://github.com/baruchel/tco
Hope it helps.
Here is another approach: it doesn't exit recursion early, instead it prevents additional function calls if not needed, which is essentially what you're trying to achieve.
class Node:
    def __init__(self, v):
        self.v = v
        self.left = None
        self.right = None

def find_smallest_at_k(root, k):
    res = [None]
    count = [k]
    def helper(root):
        if root is None:
            return
        helper(root.left)
        count[0] -= 1
        if count[0] == 0:
            print("found it!")
            res[0] = root
            return
        if count[0] > 0:
            print("visiting right")
            helper(root.right) # recurse with helper; there is no function named find
    helper(root)
    return res[0].v
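Another way to get the same effect is a lazy in-order generator, sketched below. The `Node` constructor with optional children, and the names `in_order` and `kth_smallest`, are assumptions for this example. Because generators only run as far as they are consumed, itertools.islice stops the traversal right after the k-th value is produced, and no list needs to be passed in from outside.

```python
from itertools import islice


class Node:
    """Hypothetical BST node; children are optional constructor arguments."""
    def __init__(self, v, left=None, right=None):
        self.v = v
        self.left = left
        self.right = right


def in_order(node):
    """Lazily yield the values of the subtree in ascending order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.v
        yield from in_order(node.right)


def kth_smallest(root, k):
    """Return the k-th smallest value (k >= 1), or None if there are fewer than k."""
    return next(islice(in_order(root), k - 1, None), None)


# Example tree:      5
#                  /   \
#                 3     8
#                / \
#               2   4
root = Node(5, Node(3, Node(2), Node(4)), Node(8))

print(kth_smallest(root, 1), kth_smallest(root, 3), kth_smallest(root, 5))
```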
If you want to exit as early as possible, you can use exit(0).
That is the easiest way out, but note that it terminates the whole program, not just the recursion.