Python DFS recursive function retaining values from previous call - python

I'm sorry if the title is misleading, but I could not put it in any other way.
I am trying to implement BFS and DFS in order to refresh some concepts, but an odd behavior is occurring with the recursive versions of the code.
This is what is happening:
def rec_dfs(g, start_node, visited=[]):
    visited.append(start_node)
    for next_node in g[start_node]:
        if next_node not in visited:
            rec_dfs(g, next_node, visited)
    return visited
graph2 = {'A': ['B', 'C', 'D'],
          'B': ['A', 'E', 'F'],
          'C': ['A', 'F'],
          'D': ['A'],
          'E': ['B'],
          'F': ['B', 'C']}
rec_dfs(graph2, "A") #['A', 'B', 'E', 'F', 'C', 'D'] OK
rec_dfs(graph2, "A") #['A', 'B', 'E', 'F', 'C', 'D', 'A'] NOK
rec_dfs(graph2, "A") #['A', 'B', 'E', 'F', 'C', 'D', 'A', 'A'] NOK
It should always return the first case, but when I investigated I could see that the second call already had "visited" populated.
If I call the function like:
rec_dfs(graph2, "A", []) #['A', 'B', 'E', 'F', 'C', 'D'] OK
rec_dfs(graph2, "A", []) #['A', 'B', 'E', 'F', 'C', 'D'] OK
rec_dfs(graph2, "A", []) #['A', 'B', 'E', 'F', 'C', 'D'] OK
it works just fine...
I would really appreciate it if someone could explain why this behavior is happening, and whether there is a way to avoid it.
Thanks!

You're using the visited list as a mutable default argument, which is created only once, when the function is defined, and not on each call (see http://code.activestate.com/recipes/577786-smarter-default-arguments/).
Unless the caller passes a fresh list explicitly, every call to rec_dfs() shares and mutates that same list, so its contents carry over from one call to the next.
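A common way to avoid this is to default to None and create the list inside the function. The following is a minimal sketch of that idiom applied to the rec_dfs function and graph2 from the question; it is one possible fix, not the only one.

```python
def rec_dfs(g, start_node, visited=None):
    # A fresh list is created on every top-level call, so nothing is
    # shared between separate calls. Recursive calls pass the list along.
    if visited is None:
        visited = []
    visited.append(start_node)
    for next_node in g[start_node]:
        if next_node not in visited:
            rec_dfs(g, next_node, visited)
    return visited


graph2 = {'A': ['B', 'C', 'D'],
          'B': ['A', 'E', 'F'],
          'C': ['A', 'F'],
          'D': ['A'],
          'E': ['B'],
          'F': ['B', 'C']}

print(rec_dfs(graph2, "A"))  # ['A', 'B', 'E', 'F', 'C', 'D']
print(rec_dfs(graph2, "A"))  # same result on the second call
```

None works as a sentinel here because it is immutable, so evaluating the default only once at definition time causes no harm.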

Related

Path finder for network link wiring - follow-up

See the initial post on code review
Thanks to @Graipher, who proposed the Python library networkx as an answer to my question. My code is now improved and cleaner:
# Path finder improved
import networkx as nx
class Edge:
    # An Edge is a link (physical, radio, logical) between two assets/nodes/vertices
    def __init__(self, sku, e1, e2, re1, re2):
        # The SKU is the unique ID of the edge
        # An edge joins two vertices and can be traversed in either direction (undirected edge)
        self.sku = sku
        self.sku_endpoint_1 = e1
        self.sku_endpoint_2 = e2
        self.reverse_sku_endpoint_1 = re1
        self.reverse_sku_endpoint_2 = re2
# We can instantiate edges like this
edge1 = Edge("Edge1", "A", "B", "B", "A")
edge2 = Edge("Edge2", "B", "C", "C", "B")
edge3 = Edge("Edge3", "A", "C", "C", "A")
edge4 = Edge("Edge4", "C", "D", "D", "C")
edge5 = Edge("Edge5", "B", "E", "E", "B")
edge6 = Edge("Edge6", "D", "E", "E", "D")
edges = [edge1, edge2, edge3, edge4, edge5, edge6]
# And then we can find all paths using @Graipher's method
def solve(edges, source, target):
    g = nx.Graph()  # bi-directional edges.
    for edge in edges:
        g.add_edge(edge.sku_endpoint_1, edge.sku_endpoint_2, sku=edge.sku)
    paths = nx.all_simple_paths(g, source=source, target=target)
    index = 0
    paths_dict = {}
    # Creating the dict of paths with only the edge SKUs
    for path in map(nx.utils.pairwise, paths):
        paths_dict[index] = []
        for edge in path:
            paths_dict[index].append(g.edges[edge]["sku"])
        index += 1
    return paths_dict
But now, what about finding all paths with repeated nodes, but without repeating the same edge? I now see that the networkx library is explicitly not repeating nodes while searching paths...
But consider the following graph:
g.add_edges_from([("A", "B", {"sku": "Edge1"}),
                  ("B", "C", {"sku": "Edge2"}),
                  ("A", "C", {"sku": "Edge3"}),
                  ("C", "D", {"sku": "Edge4"}),
                  ("B", "E", {"sku": "Edge5"}),
                  ("D", "E", {"sku": "Edge6"}),
                  ("C", "E", {"sku": "Edge7"})])
The graph we see looks like that:
When we want to find all paths from A to D, we also want to find paths that revisit an already discovered node (here, C). The only rule is that a path must not use the same edge twice (to prevent an infinite loop).
In this example, one path matching these rules from A to D is:
A->C : "Edge3"
C->E : "Edge7"
E->B : "Edge5"
B->C : "Edge2"
C->D : "Edge4"
Is there a way to do that with this library? Because with my code (see previous post on codereview) I was able to find these paths. But that's not very optimised because the program searches ALL paths and only then I remove duplicated and non meaningful paths.
Here is an attempt, but it's not so great since it doesn't track back all the way to 'a' and re-search all paths via a -> c etc...
If you swap the order of ['b', 'c'] you will get the example path you specified in your question here... Not ideal since it doesn't scale, but hopefully this might show you where I'm headed with this...
graph = {
    'a': ['b', 'c'],
    'b': ['a', 'c', 'e'],
    'c': ['a', 'b', 'd', 'e'],
    'd': ['c', 'e'],
    'e': ['b', 'c', 'd']
}
def non_simple_paths(g, u, v):
    from collections import defaultdict
    paths = []
    path = []
    edges_used = defaultdict(bool)
    def dfs(g, u, v):
        for n in g[u]:
            e = (u, n)
            re = (n, u)
            if edges_used[re] or edges_used[e]:
                continue
            elif v in e:
                c = path[:]
                c.append(e)
                paths.append(c)
                edges_used[e] = True
                edges_used[re] = True
            else:
                path.append(e)
                edges_used[e] = True
                edges_used[re] = True
                dfs(g, n, v)
        if path:
            path.pop()  # going back to parent
        return
    dfs(g, u, v)
    return paths
# ================================
ps = non_simple_paths(graph, 'a', 'd')
print(ps)
I thought a while about this interesting problem, but unfortunately don’t have a great answer.
1. Approach
My first approach was based on the following observation (I’m calling the paths you are looking for edge-simple):
Each simple path is obviously edge-simple.
Each edge-simple path can be reduced to a simple path by removing the cycles (loops) between multiple nodes.
To illustrate the second point, look at the path you used as an example: A-C-E-B-C-D. It contains the node C twice, and after removing the corresponding cycle C-E-B-C it is simple: A-C-D.
My idea was to use the simple paths between two nodes as a basis for the edge-simple ones
simple_paths = list(nx.all_simple_paths(G, 'A', 'D'))
and add cycles to the nodes it contains (here constructed via the (full) corresponding directed graph)
H = nx.DiGraph()
H.add_edges_from(list(G.edges) + [(edge[1], edge[0]) for edge in G.edges])
cycles_basis = [cycle for cycle in nx.simple_cycles(H) if len(cycle) > 2]
But I got lost on the way through the second part ...
2. Approach
I ended up with a second approach that resembles the one given by @JordanSimba:
import networkx as nx
def remove_edge(G, node1, node2):
    return G.edge_subgraph(list(set(G.edges)
                                .difference({(node1, node2), (node2, node1)})))
def all_edge_simple_paths(G, source, target):
    paths = []
    if source == target:
        paths.append([source])
    for node in G[source]:
        G_sub = remove_edge(G, source, node)
        if node in G_sub.nodes and target in G_sub.nodes:
            paths.extend([[source] + path
                          for path in all_edge_simple_paths(G_sub, node, target)])
        else:
            if node == target:
                paths.append([source, target])
    return paths
With your graph
G = nx.Graph()
G.add_edges_from([('A', 'B'), ('B', 'C'), ('A', 'C'), ('C', 'D'), ('B', 'E'),
('D', 'E'), ('C', 'E')])
the result (all_edge_simple_paths(G, 'A', 'D')) is
[['A', 'B', 'C', 'D'],
['A', 'B', 'C', 'E', 'D'],
['A', 'B', 'E', 'D'],
['A', 'B', 'E', 'C', 'D'],
['A', 'C', 'B', 'E', 'D'],
['A', 'C', 'B', 'E', 'C', 'D'],
['A', 'C', 'D'],
['A', 'C', 'E', 'B', 'C', 'D'],
['A', 'C', 'E', 'D']]
If a small cycle gets added onto node D
G.add_edges_from([('D', 'F'), ('F', 'G'), ('G', 'D')])
the result includes it (in both directions through the cycle)
[['A', 'B', 'C', 'D'],
['A', 'B', 'C', 'D', 'F', 'G', 'D'],
['A', 'B', 'C', 'D', 'G', 'F', 'D'],
['A', 'B', 'C', 'E', 'D'],
['A', 'B', 'C', 'E', 'D', 'F', 'G', 'D'],
['A', 'B', 'C', 'E', 'D', 'G', 'F', 'D'],
['A', 'B', 'E', 'D'],
['A', 'B', 'E', 'D', 'F', 'G', 'D'],
['A', 'B', 'E', 'D', 'G', 'F', 'D'],
['A', 'B', 'E', 'C', 'D'],
['A', 'B', 'E', 'C', 'D', 'F', 'G', 'D'],
['A', 'B', 'E', 'C', 'D', 'G', 'F', 'D'],
['A', 'C', 'B', 'E', 'D'],
['A', 'C', 'B', 'E', 'D', 'F', 'G', 'D'],
['A', 'C', 'B', 'E', 'D', 'G', 'F', 'D'],
['A', 'C', 'B', 'E', 'C', 'D'],
['A', 'C', 'B', 'E', 'C', 'D', 'F', 'G', 'D'],
['A', 'C', 'B', 'E', 'C', 'D', 'G', 'F', 'D'],
['A', 'C', 'D'],
['A', 'C', 'D', 'F', 'G', 'D'],
['A', 'C', 'D', 'G', 'F', 'D'],
['A', 'C', 'E', 'B', 'C', 'D'],
['A', 'C', 'E', 'B', 'C', 'D', 'F', 'G', 'D'],
['A', 'C', 'E', 'B', 'C', 'D', 'G', 'F', 'D'],
['A', 'C', 'E', 'D'],
['A', 'C', 'E', 'D', 'F', 'G', 'D'],
['A', 'C', 'E', 'D', 'G', 'F', 'D']]
To be honest: I’m not 100% sure it works correctly (for all situations). I just don't have the time for extensive testing. And the number of paths grows rapidly with increasing graph size, which makes it hard to keep track of what's going on.
And I have serious doubts regarding the efficiency. Someone pointed out to me recently that working with the subgraph view could slow things down. So maybe only a lower level implementation might produce the speed you’re looking for.
But maybe it helps.
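As a possible lower-level alternative to the subgraph views, the same search can be sketched as plain backtracking over a set of used edges, without networkx. The function name edge_simple_paths and the edge-list input are assumptions made for this illustration, and the sketch assumes an undirected graph without parallel edges. It has not been benchmarked against the version above.

```python
from collections import defaultdict


def edge_simple_paths(edges, source, target):
    """Return all paths from source to target that use no edge twice."""
    # Build an undirected adjacency list from (u, v) pairs.
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    paths = []
    used = set()      # undirected edges on the current path, as frozensets
    path = [source]   # nodes on the current path

    def dfs(node):
        if node == target:
            paths.append(list(path))
        # Keep exploring even after reaching the target, because a path
        # may pass through the target and come back to it later.
        for neighbour in adjacency[node]:
            edge = frozenset((node, neighbour))
            if edge in used:
                continue
            used.add(edge)
            path.append(neighbour)
            dfs(neighbour)
            path.pop()        # backtrack
            used.remove(edge)

    dfs(source)
    return paths


edges = [('A', 'B'), ('B', 'C'), ('A', 'C'), ('C', 'D'),
         ('B', 'E'), ('D', 'E'), ('C', 'E')]
for p in edge_simple_paths(edges, 'A', 'D'):
    print(p)
```

For the example graph this yields the same nine paths as listed above, though possibly in a different order. The number of such paths can still grow exponentially with graph size, so this only removes the per-step subgraph construction, not the inherent cost of enumerating every path.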

Python code to list dependencies, avoiding loops

Say you have a dictionary describing item dependencies, along the lines of:
deps = {
'A': ['B', 'C', 'D'],
'B': ['C', 'E'],
'C': ['D', 'F'],
'D': ['C', 'G'],
'E': ['A'],
'H': ['N'],
}
meaning that item 'A' depends on items 'B', 'C', and 'D', etc. Obviously, this could be of arbitrary complexity.
How do you write a function get_all_deps(item) that gives you a list of all the dependencies of item, without duplicates and without item itself? E.g.:
> get_all_deps('H')
['N']
> get_all_deps('A')
['B', 'C', 'D', 'E', 'F', 'G']
> get_all_deps('E')
['A', 'B', 'C', 'D', 'F', 'G']
I'm looking for concise code - ideally a single recursive function. Performance is not terribly important for my use case - we're talking about fairly small dependency graphs (e.g. a few dozen items)
You can use a stack/todo list to avoid a recursive implementation:
deps = {
    'A': ['B', 'C', 'D'],
    'B': ['C', 'E'],
    'C': ['D', 'F'],
    'D': ['C', 'G'],
    'E': ['A'],
    'H': ['N'],
}
def get_all_deps(item):
    todo = set(deps[item])
    rval = set()
    while todo:
        subitem = todo.pop()
        if subitem != item:  # don't add start item to the list
            rval.add(subitem)
            to_add = set(deps.get(subitem, []))
            todo.update(to_add.difference(rval))
    return sorted(rval)
print(get_all_deps('A'))
print(get_all_deps('E'))
print(get_all_deps('H'))
result:
['B', 'C', 'D', 'E', 'F', 'G']
['A', 'B', 'C', 'D', 'F', 'G']
['N']
The todo set contains the elements still to be processed.
Pop one element and add it to the return value set (unless it is the start item).
Add that element's dependencies to todo, but only those not already in the return value.
Loop until todo is empty (okay, there's a loop in here).
Return the sorted list.
The set difference avoids the problem with cyclic dependencies, and the "max recursion depth" is avoided. Only limit is system memory.
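Since the question asked for a recursive function, here is a minimal recursive sketch as well. The helper name _collect and the explicit deps parameter are assumptions made for this illustration; a set of already seen items is what stops the recursion on cycles.

```python
def _collect(item, deps, seen):
    # Visit each dependency once; 'seen' prevents infinite recursion on cycles.
    for dep in deps.get(item, []):
        if dep not in seen:
            seen.add(dep)
            _collect(dep, deps, seen)
    return seen


def get_all_deps(item, deps):
    # Remove the start item in case a cycle led back to it.
    return sorted(_collect(item, deps, set()) - {item})


deps = {
    'A': ['B', 'C', 'D'],
    'B': ['C', 'E'],
    'C': ['D', 'F'],
    'D': ['C', 'G'],
    'E': ['A'],
    'H': ['N'],
}
print(get_all_deps('A', deps))  # ['B', 'C', 'D', 'E', 'F', 'G']
print(get_all_deps('E', deps))  # ['A', 'B', 'C', 'D', 'F', 'G']
print(get_all_deps('H', deps))  # ['N']
```

For graphs of a few dozen items the recursion depth is not a concern; for very deep dependency chains the iterative version above avoids Python's recursion limit.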

Compare a users input list to a set list in order with duplicates

I am trying to take a set of answers, each either 'A', 'B', 'C' or 'D', in a specific order (as in a multiple choice test) and have the user input their answers. Afterwards I would like to create a third list and print out what was right and wrong. Here is what I have so far.
userAnswersList = []
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
while len(userAnswersList) <= 19:
    userAnswers = input('Give me each answer total of 20 questions I\'ll let you know how many you missed.')
    userAnswersList.append(userAnswers.upper())
correctedList = []
for i in userAnswersList:
    if i in correctAnswers:
        correctedList.append(i)
    else:
        correctedList.append('XX')
print(correctedList)
So my end result would be the corrected list with 'XX' in each place where they missed the answer. If an answer is right, it just puts the user input in that place.
So after the user input their 20 answers it would look like
['A', 'C', 'A', 'XX', 'D', 'B', 'C', 'XX', 'C', 'B', 'A', 'XX', 'C', 'A', 'D', 'XX', 'B', 'B', 'XX', 'A']
if they missed 5 questions in that order
EDIT
Thank you again for all your help I was able to solve my problems with your help and some great answers. I used Nicks solution as that is how we are learning it.
I will try out others just so I can get used to them.
Rather than using:
for i in userAnswersList:
you may find it easier to iterate through the array and check if the values are equal, such as:
for i in range(len(userAnswersList)):
    if userAnswersList[i] == correctAnswers[i]:
        correctedList.append(userAnswersList[i])
    else:
        correctedList.append('XX')
There is no question here, so I'll assume you're asking what's wrong with what you have.
The comparison `if i in correctAnswers` only checks whether the user's answer appears anywhere in correctAnswers, not whether it matches the correct answer at the same position, so it cannot work as intended.
You'll need something along the following lines:
for i in range(len(correctAnswers)):
    correctedList.append(correctAnswers[i] if userAnswersList[i] == correctAnswers[i] else 'XX')
This can be done using Python's map method.
As explained in the help for map:
map(...)
map(function, sequence[, sequence, ...]) -> list
Return a list of the results of applying the function to the items of
the argument sequence(s). If more than one sequence is given, the
function is called with an argument list consisting of the corresponding
item of each sequence, substituting None for missing values when not all
sequences have the same length. If the function is None, return a list of
the items of the sequence (or a list of tuples if more than one sequence).
So in that case, you want to compare each item of your two equal lists, and apply a condition against them. The condition we will introduce will follow this logic:
if something in one list is not equal to the other, set 'XX', otherwise return the value.
So, we will introduce what is called a "lambda" function here to put that above condition. Here is documentation on what a lambda is: http://www.python-course.eu/lambda.php
lambda x, y: 'XX' if x != y else y
So, when we put it all together, we have this:
d = list(map(lambda x, y: 'XX' if x != y else y, userAnswersList, correctAnswers))  # list() is needed on Python 3, where map returns an iterator
Demo:
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
userAnswersList = ['A', 'C', 'A', 'B', 'D', 'B', 'C', 'A', 'A', 'B', 'A', 'C', 'C', 'A', 'D', 'C', 'D', 'C', 'D', 'B']
Result:
['A', 'C', 'A', 'XX', 'D', 'B', 'C', 'A', 'XX', 'B', 'A', 'XX', 'C', 'A', 'D', 'C', 'XX', 'XX', 'D', 'XX']
You can zip the two lists together and check the elements at common indexes. You can also collect all the answers with a list comprehension, replacing your while loop with range:
userAnswersList = [input('Give me each answer total of 20 questions I\'ll let you know how many you missed.').upper()
                   for _ in range(20)]
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
correctedList = ["XX" if u_a != c_a else u_a for u_a, c_a in zip(userAnswersList, correctAnswers)]
If the corresponding elements from each list are the same we add the letter, if not we add "XX" to mark an incorrect answer.
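To try the zip-based comparison without typing 20 answers, it can be wrapped in a small function and run on the demo lists from the map answer. The function name mark_answers and its wrong_marker parameter are assumptions made for this sketch.

```python
def mark_answers(user_answers, correct_answers, wrong_marker="XX"):
    # zip silently stops at the shorter list, so check the lengths first.
    if len(user_answers) != len(correct_answers):
        raise ValueError("both lists must have the same length")
    return [u if u == c else wrong_marker
            for u, c in zip(user_answers, correct_answers)]


correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B',
                  'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
userAnswersList = ['A', 'C', 'A', 'B', 'D', 'B', 'C', 'A', 'A', 'B',
                   'A', 'C', 'C', 'A', 'D', 'C', 'D', 'C', 'D', 'B']

correctedList = mark_answers(userAnswersList, correctAnswers)
print(correctedList)
print("Missed:", correctedList.count("XX"))  # Missed: 6
```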

Prevent doctest incorrect failure when unordered output is involved

Suppose we have the function below:
def f(x):
    """
    Turns x into a set.
    >>> given_x = ['a', 'b', 'c', 'd', 'e', 'f']
    >>> f(given_x)
    {'a', 'b', 'c', 'd', 'e', 'f'}
    """
    return set(x)
Running the doctest will (usually) cause something like this:
Failure
**********************************************************************
File "/home/black/Dev/exp_2.py", line 6, in f
Failed example:
    f(given_x)
Expected:
    {'a', 'b', 'c', 'd', 'e', 'f'}
Got:
    {'d', 'e', 'f', 'c', 'a', 'b'}
Apparently this failure shouldn't have happened since the function works as expected, but it did because of the result being unordered.
My actual function's output can be much more complex than this. It could be a dict with dicts, sets, lists inside of it.
I need a general solution (if there is one). Simply sort() on the presented example would not solve my real case problem.
Question:
How can I prevent the doctest from (incorrectly) failing when unordered output is involved?
Why not just move the expected output up so you're testing for equality, with an expected output of "True"?
def f(x):
    """
    Turns x into a set.
    >>> given_x = ['a', 'b', 'c', 'd', 'e', 'f']
    >>> f(given_x) == {'a', 'b', 'c', 'd', 'e', 'f'}
    True
    """
    return set(x)
Output:
Trying:
    given_x = ['a', 'b', 'c', 'd', 'e', 'f']
Expecting nothing
ok
Trying:
    f(given_x) == {'a', 'b', 'c', 'd', 'e', 'f'}
Expecting:
    True
ok
The order of elements is not guaranteed for Python sets, so you can't rely on it.
I'd force the output into something you can trust:
>>> given_x = ['z', 'a', 'a', 'b', 'c', 'd', 'e', 'f']
>>> type(f(given_x))
<class 'set'>
>>> sorted(f(given_x))
['a', 'b', 'c', 'd', 'e', 'f', 'z']
This tests that the result has the expected type, that the set actually removed the duplicates, and that the contents are what I expect.
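The equality approach also extends to nested output, because == on dicts and sets ignores ordering at every level. Below is a minimal sketch; the function group_by_first_letter is a hypothetical stand-in for a function that returns a dict containing sets.

```python
import doctest


def group_by_first_letter(words):
    """
    Group words into sets keyed by their first letter.

    Comparing with == avoids depending on the printed order of the sets.

    >>> result = group_by_first_letter(['apple', 'avocado', 'bean'])
    >>> result == {'a': {'apple', 'avocado'}, 'b': {'bean'}}
    True
    """
    groups = {}
    for word in words:
        groups.setdefault(word[0], set()).add(word)
    return groups


if __name__ == "__main__":
    doctest.testmod()
```

The trade-off is that a failing example only reports False instead of True, without showing the actual value, so it can help to keep one such comparison per logical property.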

Iteratively collect first two elements of each vector of a matrix

I have a matrix:
matrix = [['F', 'B', 'F', 'A', 'C', 'F'],
['D', 'E', 'B', 'E', 'B', 'E'],
['F', 'A', 'D', 'B', 'F', 'B'],
['B', 'E', 'F', 'B', 'D', 'D']]
I want to remove and collect the first two elements of each sub-list, and add them to a new list.
So far I have:
while messagecypher:
    for vector in messagecypher:
        final.extend(vector[:2])
The problem is that the slice doesn't seem to remove the elements, and I end up with a huge list of repeated chars. I could use .pop(0) twice, but that isn't very clean.
NOTE: the reason I remove the elements is because I need to keep going over each vector until the matrix is empty.
You can keep your slice and do:
final = []
for i in range(len(matrix)):
    matrix[i], final = matrix[i][2:], final + matrix[i][:2]
Note that this simultaneously assigns the remainder of each row back to matrix and adds the sliced-off first two elements to final.
Well, you can use a list comprehension to get the thing done, but it's perhaps counter-intuitive:
>>> matrix = [['F', 'B', 'F', 'A', 'C', 'F'],
['D', 'E', 'B', 'E', 'B', 'E'],
['F', 'A', 'D', 'B', 'F', 'B'],
['B', 'E', 'F', 'B', 'D', 'D']]
>>> while [] not in matrix: print([i for var in matrix for i in [var.pop(0), var.pop(0)]])
['F', 'B', 'D', 'E', 'F', 'A', 'B', 'E']
['F', 'A', 'B', 'E', 'D', 'B', 'F', 'B']
['C', 'F', 'B', 'E', 'F', 'B', 'D', 'D']
EDIT:
Using range makes the syntax look cleaner:
>>> matrix = [['C', 'B', 'B', 'D', 'F', 'B'], ['D', 'B', 'B', 'A', 'B', 'A'], ['B', 'D', 'E', 'F', 'C', 'B'], ['B', 'A', 'C', 'B', 'E', 'F']]
>>> while [] not in matrix: print([var.pop(0) for var in matrix for i in range(2)])
['C', 'B', 'D', 'B', 'B', 'D', 'B', 'A']
['B', 'D', 'B', 'A', 'E', 'F', 'C', 'B']
['F', 'B', 'B', 'A', 'C', 'B', 'E', 'F']
Deleting elements is not an efficient way to go about your task. It requires Python to perform a lot of unnecessary work shifting things around to fill the holes left by the deleted elements. Instead, just shift your slice over by two places each time through the loop:
final = []
for i in range(0, len(messagecypher[0]), 2):  # use xrange on Python 2
    for vector in messagecypher:
        final.extend(vector[i:i+2])
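As a runnable illustration of the shifting-slice idea on the matrix from the question, here is a small sketch. The function name collect_pairs and its width parameter are assumptions made for this example, and it assumes all rows have the same length.

```python
def collect_pairs(matrix, width=2):
    # Walk a window of 'width' columns across the rows instead of deleting
    # elements; the input matrix is left unchanged.
    if not matrix:
        return []
    final = []
    for i in range(0, len(matrix[0]), width):
        for vector in matrix:
            final.extend(vector[i:i + width])
    return final


matrix = [['F', 'B', 'F', 'A', 'C', 'F'],
          ['D', 'E', 'B', 'E', 'B', 'E'],
          ['F', 'A', 'D', 'B', 'F', 'B'],
          ['B', 'E', 'F', 'B', 'D', 'D']]

final = collect_pairs(matrix)
print(final[:8])  # ['F', 'B', 'D', 'E', 'F', 'A', 'B', 'E']
```

The elements come out in the same order as with the pop-based versions above, but each element is copied only once and nothing has to be shifted within the rows.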
