Find edges in a cycle networkx python - python

I would like to write an algorithm that finds whether an edge belongs to a cycle in an undirected graph, using networkx in Python.
I am thinking of using cycle_basis to get all the cycles in the graph.
My problem is that cycle_basis returns a list of nodes. How can I convert them to edges?

You can construct the edges from the cycle by connecting adjacent nodes.
In [1]: import networkx as nx
In [2]: G = nx.Graph()
In [3]: G.add_cycle([1,2,3,4])
In [4]: G.add_cycle([10,20,30])
In [5]: basis = nx.cycle_basis(G)
In [6]: basis
Out[6]: [[2, 3, 4, 1], [20, 30, 10]]
In [7]: edges = [list(zip(nodes, nodes[1:] + nodes[:1])) for nodes in basis]
In [8]: edges
Out[8]: [[(2, 3), (3, 4), (4, 1), (1, 2)], [(20, 30), (30, 10), (10, 20)]]
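For membership tests it can help to normalise each edge so that its direction does not matter. A minimal sketch in plain Python, assuming the cycles arrive as node lists like the cycle_basis output in Out[6] above; `cycle_edges` is a hypothetical helper name, not part of networkx:

```python
def cycle_edges(nodes):
    """Return the undirected edges of a cycle given as a list of nodes."""
    # Pair every node with its successor, wrapping around to the first node.
    return {frozenset(pair) for pair in zip(nodes, nodes[1:] + nodes[:1])}

basis = [[2, 3, 4, 1], [20, 30, 10]]  # node lists as in Out[6] above
edges_in_cycles = set().union(*(cycle_edges(c) for c in basis))

print(frozenset((1, 2)) in edges_in_cycles)   # True
print(frozenset((2, 1)) in edges_in_cycles)   # True
print(frozenset((1, 10)) in edges_in_cycles)  # False
```

Storing each edge as a frozenset means (1, 2) and (2, 1) compare equal, so no separate check for the reversed tuple is needed.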

Here is my take at it, using just lambda functions (I love lambda functions!):
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
in_path = lambda e, path: (e[0], e[1]) in path or (e[1], e[0]) in path
cycle_to_path = lambda path: list(zip(path+path[:1], path[1:] + path[:1]))
in_a_cycle = lambda e, cycle: in_path(e, cycle_to_path(cycle))
in_any_cycle = lambda e, g: any(in_a_cycle(e, c) for c in nx.cycle_basis(g))
for edge in G.edges():
    print(edge, 'in a cycle:', in_any_cycle(edge, G))

In case you don't find a nice solution, here's an ugly one.
With edges() you can get a list of the edges that are adjacent to the nodes in a cycle. Unfortunately, this also includes edges that lead to nodes outside the cycle.
You can then filter that list by removing the edges that connect to nodes that are not part of the cycle.
Please keep us posted if you find a less wasteful solution.
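The filtering step described above could look roughly like this. It is a sketch in plain Python, assuming the candidate edges were collected beforehand (for example with G.edges(cycle_nodes)); `edges_within_cycle` is an illustrative name, not a networkx function:

```python
def edges_within_cycle(edges, cycle_nodes):
    """Keep only the edges whose two endpoints both lie on the cycle."""
    on_cycle = set(cycle_nodes)
    return [(u, v) for u, v in edges if u in on_cycle and v in on_cycle]

# Hypothetical input: every edge adjacent to the cycle 1-2-3-4,
# including the edge (1, 10) that leaves the cycle.
adjacent = [(1, 2), (1, 4), (1, 10), (2, 3), (3, 4)]
print(edges_within_cycle(adjacent, [1, 2, 3, 4]))
# [(1, 2), (1, 4), (2, 3), (3, 4)]
```

Note that a chord joining two nodes of the cycle would also pass this filter, so the result is the set of edges between cycle nodes rather than strictly the edges of that one cycle.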

With the help of Aric, and a little trick to check both directions, I finally came up with this, which looks OK.
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
def edge_in_cycle(edge, graph):
    u, v = edge
    basis = nx.cycle_basis(graph)
    # Materialise each zip so it can be searched more than once (Python 3).
    edges = [list(zip(nodes, nodes[1:] + nodes[:1])) for nodes in basis]
    found = False
    for cycle in edges:
        if (u, v) in cycle or (v, u) in cycle:
            found = True
    return found
for edge in G.edges():
    print(edge, 'in a cycle:', edge_in_cycle(edge, G))
output:
(1, 2) in a cycle: True
(1, 4) in a cycle: True
(1, 10) in a cycle: False
(2, 3) in a cycle: True
(3, 4) in a cycle: True
(10, 20) in a cycle: True
(10, 30) in a cycle: True
(20, 30) in a cycle: True

You can obtain the edges of a cycle directly with the find_cycle function. To test whether an edge belongs to a cycle, check whether both of its vertices are part of the same cycle.
Using the example in the answers above:
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
nx.find_cycle(G, 1) # [(1, 2), (2, 3), (3, 4), (4, 1)]
nx.find_cycle(G, 10) # [(10, 20), (20, 30), (30, 10)]
On the other hand, the edge (2, 3) (or (3, 2), as your graph is undirected) is part of the cycle defined first:
nx.find_cycle(G, 2) # [(2, 1), (1, 4), (4, 3), (3, 2)]
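An edge lies on some cycle exactly when its endpoints stay connected after the edge is removed (that is, when it is not a bridge). A minimal pure-Python sketch of that test, assuming a simple undirected graph stored as an adjacency dict; `edge_in_some_cycle` and `adj` are illustrative names, not networkx API:

```python
from collections import deque

def edge_in_some_cycle(adj, u, v):
    """True if the undirected edge (u, v) lies on at least one cycle.

    `adj` maps each node to the set of its neighbours (simple graph assumed).
    """
    # Breadth-first search from u that is not allowed to use the edge u-v.
    seen = {u}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if {node, nbr} == {u, v}:
                continue  # skip the edge under test
            if nbr == v:
                return True  # reached v by another route
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False

# Same graph as above: cycles 1-2-3-4 and 10-20-30 joined by edge (1, 10).
edge_list = [(1, 2), (2, 3), (3, 4), (4, 1),
             (10, 20), (20, 30), (30, 10), (1, 10)]
adj = {}
for a, b in edge_list:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

print(edge_in_some_cycle(adj, 2, 3))   # True
print(edge_in_some_cycle(adj, 1, 10))  # False
```

Unlike a single find_cycle call, this does not depend on which cycle happens to be discovered first.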

Related

Finding a closed path from list of start and end nodes

I have a list of edges (E) of a graph with nodes V = [1,2,3,4,5,6]:
E = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
where each tuple (a,b) refers to the start & end node of the edge respectively.
If I know the edges form a closed path in graph G, can I recover the path?
Note that E is not the set of all edges of the graph. It's just a set of edges.
In this example, the path would be 1->2->3->1->5->6->1
A naive approach I can think of is to use a tree: I start with a node, say 1, then look at all tuples that start with 1, here (1,2) and (1,5). That gives two branches, with nodes 2 and 5, and I continue the process until a branch ends at the starting node.
How can I code this efficiently in Python?
It is possible that constructing an nx.MultiDiGraph() is slower and less efficient than the question wants, or that pulling in an external package for a single function is excessive. If so, there is another way.
Plan: first we find some path from start_node back to start_node, then we insert all the loops that have not been visited yet.
from itertools import chain
from collections import defaultdict, deque
from typing import Tuple, List, Iterable, Iterator, DefaultDict, Deque
def retrieve_closed_path(arcs: List[Tuple[int, int]], start_node: int = 1) -> Iterator[int]:
    if not arcs:
        return iter([])
    # for each node `u` carries queue of its
    # neighbours to be visited from node `u`
    d: DefaultDict[int, Deque[int]] = defaultdict(deque)
    for u, v in arcs:
        # deque pop and append complexity is O(1)
        d[u].append(v)
    def _dfs(node) -> Iterator[int]:
        out: Iterator[int] = iter([])
        # guarantee, that all queues
        # will be emptied at the end
        while d[node]:
            # chain returns an iterator and helps to
            # avoid unnecessary memory reallocations
            out = chain([node], _dfs(d[node].pop()), out)
            # if we return in this loop from recursive call, then
            # `out` already carries some (node, ...) and we need
            # only to insert all other loops which start at `node`
        return out
    return chain(_dfs(start_node), [start_node])
def path_to_string(path: Iterable[int]) -> str:
    return '->'.join(str(x) for x in path)
Examples:
E = [(1, 2), (2, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->2->1
E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->5->6->1->2->3->1
E = [(1, 2), (2, 3), (3, 4), (4, 2), (2, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->2->3->4->2->1
E = [(5, 1), (1, 5), (5, 2), (2, 5), (5, 1), (1, 4), (4, 5)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->4->5->1->5->2->5->1
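If recursion depth is a concern for long edge lists, the same idea can be written with an explicit stack, in the spirit of Hierholzer's algorithm. This is only a sketch under the assumption that the arcs really do form one closed trail through the start node; `eulerian_circuit_nodes` is a hypothetical name:

```python
from collections import defaultdict

def eulerian_circuit_nodes(arcs, start):
    """Return the node sequence of a directed Eulerian circuit from `start`.

    Assumes `arcs` really do form one closed trail through `start`.
    """
    out = defaultdict(list)
    for u, v in arcs:
        out[u].append(v)

    stack, circuit = [start], []
    while stack:
        node = stack[-1]
        if out[node]:
            # Follow an unused arc out of the current node.
            stack.append(out[node].pop())
        else:
            # Dead end: this node is finished, emit it.
            circuit.append(stack.pop())
    circuit.reverse()
    return circuit

E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
print('->'.join(map(str, eulerian_circuit_nodes(E, 1))))
# 1->5->6->1->2->3->1
```

Each arc is pushed and popped once, so the running time stays linear in the number of arcs.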
You're looking for a directed Eulerian circuit in your (sub)graph. An Eulerian circuit is a trail that visits every edge exactly once.
The networkx package has a function that can generate the desired circuit for you in linear time:
import networkx as nx
edges = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
G = nx.MultiDiGraph()
G.add_edges_from(edges)
# Prints [(1, 5), (5, 6), (6, 1), (1, 2), (2, 3), (3, 1)]
# which matches the desired output (as asked in the comments).
print([edge for edge in nx.algorithms.euler.eulerian_circuit(G)])
The documentation cites a 1973 paper, if you're interested in understanding how the algorithm works. You can also take a look at the source code here. Note that we're working with multigraphs here, since you can have multiple edges that have the same source and destination node. There are probably other implementations floating around on the Internet, but they may or may not work for multigraphs.

Get networkx subgraph containing all nodes in between

I have a networkx DiGraph and I want to extract a subgraph from it by passing in a list of nodes. The subgraph however can contain all nodes that might be in between the nodes that I have passed. I checked nx.subgraph() but it does not work like I intend to. As for a small example:
import networkx as nx
G = nx.DiGraph()
edges = [(7, 4), (3, 8), (3, 2), (3, 0), (3, 1), (7, 5), (7, 6), (7, 8)]
G.add_edges_from(edges)
H = get_subgraph(G, [0,6,7,8])
How can I write the function get_subgraph() so that H has the edges [(3, 8), (3, 0), (7, 6), (7, 8)]?
The subgraph I need is such that it contains all the nodes that are in the outgoing and incoming paths between the nodes that I pass to the get_subgraph() function.
A way to do this could be to find the longest path length between the specified set of nodes, and then find the corresponding induced subgraph containing all nodes in the path. However, being a directed graph, there will be no direct path between say nodes 3 and 7. So we need to find the paths in an undirected copy of the graph. Let's set up the problem:
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
G = nx.DiGraph()
edges = [(7, 4), (3, 8), (3, 2), (3, 0), (3, 1), (7, 5), (7, 6), (7, 8)]
G.add_edges_from(edges)
plt.figure(figsize=(10,6))
pos = nx.spring_layout(G, scale=20, k=3/np.sqrt(G.order()))
nx.draw(G, pos, node_color='lightblue',
        with_labels=True,
        node_size=1500,
        arrowsize=20)
Now we can obtain an undirected copy of the graph with nx.to_undirected and find all nx.shortest_path_length values for the specified nodes:
from itertools import combinations
H = nx.to_undirected(G)
nodelist = [0,6,7,8]
paths = {}
for nodes in combinations(nodelist, r=2):
    paths[nodes] = nx.shortest_path_length(H, *nodes)
print(paths)
# {(0, 6): 4, (0, 7): 3, (0, 8): 2, (6, 7): 1, (6, 8): 2, (7, 8): 1}
We can find the longest path in the undirected graph with:
max_path = max(paths.items(), key=lambda x: x[1])[0]
longest_induced_path = nx.shortest_path(H, *max_path)
And the corresponding induced subgraph can be obtained with Graph.subgraph:
sG = nx.subgraph(G, longest_induced_path)
pos = nx.spring_layout(sG, scale=20, k=3/np.sqrt(G.order()))
nx.draw(sG, pos, node_color='lightblue',
        with_labels=True,
        node_size=1500,
        arrowsize=20)
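The steps above keep only the nodes of the single longest path. When the requested nodes do not all lie on one path, one option is to take the union of the undirected shortest paths between every pair instead. A minimal pure-Python sketch, assuming the graph is given as a directed edge list and that all requested nodes occur in it; `get_subgraph_edges` and `_shortest_path` are hypothetical helper names:

```python
from collections import deque
from itertools import combinations

def _shortest_path(adj, source, target):
    """Breadth-first shortest path in an undirected adjacency dict."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nbr in adj[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    if target not in parent:
        return []  # no connection between the two nodes
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def get_subgraph_edges(edges, nodelist):
    """Directed edges induced by the nodes on any pairwise shortest path."""
    adj = {}
    for u, v in edges:  # ignore direction while searching for paths
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    keep = set(nodelist)
    for a, b in combinations(nodelist, 2):
        keep.update(_shortest_path(adj, a, b))
    return [(u, v) for u, v in edges if u in keep and v in keep]

edges = [(7, 4), (3, 8), (3, 2), (3, 0), (3, 1), (7, 5), (7, 6), (7, 8)]
print(get_subgraph_edges(edges, [0, 6, 7, 8]))
# [(3, 8), (3, 0), (7, 6), (7, 8)]
```

For the example graph this reproduces the edge list asked for in the question.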
This is how I understand the question:
You provide some of the nodes of a path, the algorithm should return all the nodes of that path, and you can then pass those nodes to a graph to build a new graph.
This should be what you want:
1. Iterate over all pairs of nodes with this method:
from itertools import combinations
b = combinations('ABCD', 2)
print(list(b))  # [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
2. Get all the paths between each pair with this:
https://networkx.github.io/documentation/stable/reference/algorithms/simple_paths.html
3. Select the path with the maximum number of nodes; that is your solution.

How to iterate through edges in a graph if they are not in a pre-selected list

I am filtering a subset of edges so I can iterate through them. In this case, I am excluding the "end edges", which are the final edges along a chain:
import networkx as nx
graph = nx.Graph()
graph.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4)])
end_nodes = [n for n in graph.nodes if nx.degree(graph, n) == 1]
end_edges = graph.edges(end_nodes)
print(f"end edges: {end_edges}")
for edge in graph.edges:
    if edge not in end_edges:
        print(f"edge {edge} is not an end edge.")
    else:
        print(f"edge {edge} is an end edge.")
However, when you run this code, you get the following output:
end edges: [(0, 1), (4, 3)]
edge (0, 1) is an end edge.
edge (1, 2) is an end edge.
edge (2, 3) is an end edge.
edge (3, 4) is an end edge.
Edges (1, 2) and (2, 3) are not in end_edges, yet it returns False when the conditional edge not in end_edges is checked (seeming to imply that it is in fact included, when it seems to not be).
What is going on, and how can I filter this properly?
Python version is 3.7, NetworkX is 2.4.
You can convert end_nodes to a set of edges and keep the edges unordered.
>>> graph = nx.Graph()
>>> graph.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4)])
>>> end_nodes = [n for n in graph.nodes if nx.degree(graph, n) == 1]
>>> end_edges = set(map(frozenset, graph.edges(end_nodes)))
>>> end_edges
{frozenset({3, 4}), frozenset({0, 1})}
>>> for edge in graph.edges:
...     print(edge, frozenset(edge) in end_edges)
...
(0, 1) True
(1, 2) False
(2, 3) False
(3, 4) True
import networkx as nx
graph = nx.Graph()
graph.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4)])
end_nodes = [n for n in graph.nodes if nx.degree(graph, n) == 1]
end_edges = graph.edges(end_nodes)
print(f"end edges: {end_edges}")
for edge in graph.edges:
    if edge not in list(end_edges):
        print(f"edge {edge} is not an end edge.")
    else:
        print(f"edge {edge} is an end edge.")
This should return what you ask for.
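Another way to sidestep the membership test on the edge view altogether is to classify each edge by the degree of its endpoints. A minimal sketch in plain Python, assuming a simple graph given as an edge list; `is_end_edge` is an illustrative name:

```python
from collections import Counter

edge_list = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Count how many edges touch each node (its degree in a simple graph).
degree = Counter(node for edge in edge_list for node in edge)

def is_end_edge(edge):
    """An end edge has at least one endpoint of degree 1."""
    return any(degree[node] == 1 for node in edge)

for edge in edge_list:
    print(edge, is_end_edge(edge))
# (0, 1) True
# (1, 2) False
# (2, 3) False
# (3, 4) True
```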

Graph isomorphism with constraints on the edges using networkx

I would like to define my own isomorphism of two graphs. I want to check if two graphs are isomorphic given that each edge has some attribute --- basically the order of placing each edge. I wonder if one can use the method:
networkx.is_isomorphic(G1,G2, edge_match=some_callable)
somehow by defining function some_callable().
For example, the following graphs are isomorphic, because you can relabel the nodes to obtain one from another.
Namely, relabel [2<->3].
But, the following graphs are not isomorphic.
There is no way to obtain one from another by re-labeling the nodes.
Here you go. This is exactly what the edge_match option is for. I'll create 3 graphs: the first two are isomorphic (even though the weights have different names --- I've set the comparison function to account for that). The third is not isomorphic to them.
import networkx as nx
G1 = nx.Graph()
G1.add_weighted_edges_from([(0,1,0), (0,2,1), (0,3,2)], weight = 'aardvark')
G2 = nx.Graph()
G2.add_weighted_edges_from([(0,1,0), (0,2,2), (0,3,1)], weight = 'baboon')
G3 = nx.Graph()
G3.add_weighted_edges_from([(0,1,0), (0,2,2), (0,3,2)], weight = 'baboon')
def comparison(D1, D2):
    #for an edge u,v in first graph and x,y in second graph
    #this tests if the attribute 'aardvark' of edge u,v is the
    #same as the attribute 'baboon' of edge x,y.
    return D1['aardvark'] == D2['baboon']
nx.is_isomorphic(G1, G2, edge_match = comparison)
> True
nx.is_isomorphic(G1, G3, edge_match = comparison)
> False
Here is an answer to the specific problem in the question, using the very same graphs. Note that I'm using networkx.MultiGraph and considering an 'ordering' in which those edges are placed.
import networkx as nx
G1,G2,G3,G4=nx.MultiGraph(),nx.MultiGraph(),nx.MultiGraph(),nx.MultiGraph()
G1.add_weighted_edges_from([(0, 1, 0), (0, 2, 1), (0, 3, 2)], weight='ordering')
G2.add_weighted_edges_from([(0, 1, 0), (0, 3, 1), (0, 2, 2)], weight='ordering')
G3.add_weighted_edges_from([(0, 1, 0), (0, 1, 1), (2, 3, 2)], weight='ordering')
G4.add_weighted_edges_from([(0, 1, 0), (2, 3, 1), (0, 1, 2)], weight='ordering')
def comparison(D1,D2):
    return D1[0]['ordering'] == D2[0]['ordering']
nx.is_isomorphic(G1,G2, edge_match=comparison)
>True
nx.is_isomorphic(G3,G4, edge_match=comparison)
>False
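To see what an edge-attribute constraint means operationally, here is a brute-force illustration in plain Python. It is not how networkx works internally (networkx uses a far more efficient matching algorithm); it simply tries every node relabelling on tiny simple graphs. `labelled_isomorphic` is a hypothetical name, and each graph is assumed to be a dict mapping an edge to its single label:

```python
from itertools import permutations

def labelled_isomorphic(edges1, edges2):
    """Brute-force check for tiny simple graphs with one label per edge.

    Each argument maps an edge (u, v) to its label. Tries every node
    relabelling, so it is only practical for a handful of nodes.
    """
    g1 = {frozenset(e): w for e, w in edges1.items()}
    g2 = {frozenset(e): w for e, w in edges2.items()}
    nodes1 = sorted(set().union(*g1))
    nodes2 = sorted(set().union(*g2))
    if len(nodes1) != len(nodes2) or len(g1) != len(g2):
        return False
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        relabelled = {frozenset(mapping[n] for n in e): w for e, w in g1.items()}
        if relabelled == g2:
            return True
    return False

# Same edge labels as the first answer's G1, G2 and G3.
G1 = {(0, 1): 0, (0, 2): 1, (0, 3): 2}
G2 = {(0, 1): 0, (0, 2): 2, (0, 3): 1}
G3 = {(0, 1): 0, (0, 2): 2, (0, 3): 2}
print(labelled_isomorphic(G1, G2))  # True
print(labelled_isomorphic(G1, G3))  # False
```

G1 maps onto G2 by swapping nodes 2 and 3, while no relabelling can turn the labels {0, 1, 2} into {0, 2, 2}.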

Strange interference between the heapq module and a dictionary

On one hand, I have a grid defaultdict that stores the neighboring nodes of each node on a grid and its weight (all 1 in the example below).
node (w nbr_node)
grid = { 0: [(1, -5), (1, -4), (1, -3), (1, -1), (1, 1), (1, 3), (1, 4), (1, 5)],
1: [(1, -4), (1, -3), (1, -2), (1, 0), (1, 2), (1, 4), (1, 5), (1, 6)],
2: [(1, -3), (1, -2), (1, -1), (1, 1), (1, 3), (1, 5), (1, 6), (1, 7)],
3: [(1, -2), (1, -1), (1, 0), (1, 2), (1, 4), (1, 6), (1, 7), (1, 8)],
...
}
On the other, I have a Dijkstra function that computes the shortest path between 2 nodes on this grid. The algorithm uses the heapq module and works perfectly fine.
from heapq import heappop, heappush
def Dijkstra(s, e, grid): #startpoint, endpoint, grid
    visited = set()
    distances = {s: 0}
    p = {}
    queue = [(0, s)]
    while queue != []:
        weight, node = heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        for n_weight, n_node in grid[node]:
            if n_node in visited:
                continue
            total = weight + n_weight
            if n_node not in distances or distances[n_node] > total:
                distances[n_node] = total
                heappush(queue, (total, n_node))
                p[n_node] = node
Problem: when calling the Dijkstra function multiple times, heappush is... adding new keys to the grid dictionary for no reason!
Here is a MCVE:
import random
from collections import defaultdict
# Creating the dictionary
grid = defaultdict(list)
N = 4
kernel = (-N-1, -N, -N+1, -1, 1, N-1, N, N+1)
for i in range(N*N):
    for n in kernel:
        if i > N and i < (N*N) - 1 - N and (i%N) > 0 and (i%N) < N - 1:
            grid[i].append((1, i+n))
# Calling Dijkstra multiple times
keys = list(range(N*N))
while keys:
    k1, k2 = random.sample(keys, 2)
    Dijkstra(k1, k2, grid)
    keys.remove(k1)
    keys.remove(k2)
The original grid defaultdict:
dict_keys([5, 6, 9, 10])
...and after calling the Dijkstra function multiple times:
dict_keys([5, 6, 9, 10, 4, 0, 1, 2, 8, 3, 7, 11, 12, 13, 14, 15])
When calling the Dijkstra function multiple times without heappush (just commenting out heappush at the end):
dict_keys([5, 6, 9, 10])
Question:
How can I avoid this strange behavior?
Please note that I'm using Python 2.7 and can't use numpy.
I could reproduce and fix it. The problem is in the way you are building grid: its values contain nodes that are not among its keys (from -4 to 0 and from 16 to 20 in the example). So you push those nonexistent nodes onto the heap, and later pop them.
You then end up executing for n_weight, n_node in grid[node]: where node does not yet exist in grid. As grid is a defaultdict, a new node is automatically inserted with an empty list as its value.
The fix is trivial (at least for the example data): it is enough to ensure, with a modulo, that every node added as a value in grid also exists as a key:
for i in range(N*N):
    for n in kernel:
        grid[i].append((1, (i+n + N + 1)%(N*N)))
But even for real data it should not be very hard to ensure that all nodes existing in grid values also exist in keys...
BTW, if grid had been a simple dict the error would have been immediate with a KeyError on grid[node].
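The defaultdict behaviour described above is easy to reproduce in isolation. A small sketch with a toy grid (the keys 5 to 8 are arbitrary illustration values):

```python
from collections import defaultdict

grid = defaultdict(list)
grid[5].append((1, 6))

# Merely reading a missing key inserts it with an empty list.
for _ in grid[6]:
    pass
print(sorted(grid))  # [5, 6]

# .get() looks the key up without triggering the default factory.
for _ in grid.get(7, ()):
    pass
print(sorted(grid))  # [5, 6]

# Converting to a plain dict makes the bad lookup fail loudly instead.
plain = dict(grid)
try:
    plain[8]
except KeyError as err:
    print('KeyError:', err)  # KeyError: 8
```

So inside the search loop, iterating over grid.get(node, ()) instead of grid[node] would leave the grid untouched even when a popped node has no entry.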
