How do I calculate the Graph Edit Distance with networkx (Python)?

I am working with the graph edit distance. By definition, it is the minimum total cost of the edit operations needed to transform the original graph G1 into a graph that is isomorphic to G2.
The graph edit operations typically include:
vertex insertion to introduce a single new labeled vertex to a graph.
vertex deletion to remove a single (often disconnected) vertex from a graph.
vertex substitution to change the label (or color) of a given vertex.
edge insertion to introduce a new colored edge between a pair of vertices.
edge deletion to remove a single edge between a pair of vertices.
edge substitution to change the label (or color) of a given edge.
Now I want to use the networkx implementation. I do not have any edge labels, the node sets of G1 and G2 are the same, and I do not want a graph isomorphic to G2: I want G2 itself.
This is mainly because G1: 1->2->3 and G2: 3->2->1 are isomorphic to each other, but if the nodes represent events, they are very different from a causality perspective.
So in this context, I've been running a test like the following:
import networkx as nx
G=nx.DiGraph()
G.add_node(1)
G.add_node(2)
G.add_node(3)
G.add_edges_from([(1, 2),(2,3)])
G2=nx.DiGraph()
G2.add_node(1)
G2.add_node(2)
G2.add_node(3)
G2.add_edges_from([(3, 2),(2, 1)])
nx.graph_edit_distance(G,G2)
But it returns a distance of zero, which makes sense because the graphs are isomorphic to each other.
So I tried to set up node_match, but still no luck:
import networkx as nx
def nmatch(n1, n2):
    return n1==n2
G=nx.DiGraph()
G.add_node(1)
G.add_node(2)
G.add_node(3)
G.add_edges_from([(1, 2),(2,3)])
G2=nx.DiGraph()
G2.add_node(1)
G2.add_node(2)
G2.add_node(3)
G2.add_edges_from([(3, 2),(2, 1)])
nx.graph_edit_distance(G,G2, node_match=nmatch)
If we assume a cost of 1 to delete or add an edge or vertex, then the edit distance should be 4, because we can:
delete both edges in G, then add the 2 edges from G2
How can I calculate the edit distance based on actual equivalence of the graphs rather than isomorphism?

It doesn't seem that you are comparing what you want to compare. In your code, n1 and n2 in nmatch are always {}. From the documentation:
(...) That is, the function will receive the node attribute dictionaries for n1 and n2 as inputs.
So you are not comparing the node objects themselves, but the attribute dictionaries associated with them (which hold any data you attach).
You can add your custom data to that dictionary when adding nodes, for example:
import networkx as nx
def nmatch(n1, n2):
    return n1==n2
G=nx.DiGraph()
G.add_node(1, id=1)
G.add_node(2, id=2)
G.add_node(3, id=3)
G.add_edges_from([(1, 2),(2,3)])
G2=nx.DiGraph()
G2.add_node(1, id=1)
G2.add_node(2, id=2)
G2.add_node(3, id=3)
G2.add_edges_from([(3, 2),(2,1)])
nx.graph_edit_distance(G,G2, node_match=nmatch)
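This returns 2. Note that the 2 appears to come from two node substitutions rather than edge substitutions: mapping node 1 to node 3 and node 3 to node 1 makes every edge line up, and each of those two mappings costs 1 because nmatch returns False for them. (With no edge_match given, edge substitution costs 0 by default.) If you want the result to be 4 (2 edge deletions, 2 edge insertions), you could probably raise the node substitution cost so that relabeling is never the cheaper option.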
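As a minimal sketch of that idea, the snippet below passes a node_subst_cost function instead of node_match, so that mapping a node onto a node with a different id becomes expensive. The helper path_graph and the constant MISMATCH_COST are names made up for this illustration, and the value 10 is an arbitrary assumption; deletions and insertions keep their default cost of 1. Under those assumptions the cheapest edit path should be to delete both edges and insert the two reversed ones.

```python
import networkx as nx

def path_graph(edges):
    """Hypothetical helper: DiGraph on nodes 1..3, each tagged with its own id."""
    G = nx.DiGraph()
    for n in (1, 2, 3):
        G.add_node(n, id=n)
    G.add_edges_from(edges)
    return G

G = path_graph([(1, 2), (2, 3)])
G2 = path_graph([(3, 2), (2, 1)])

# Assumed value: large enough that relabeling a node never pays off here.
MISMATCH_COST = 10

def node_subst_cost(attrs1, attrs2):
    # Both arguments are node attribute dictionaries, not node names.
    return 0 if attrs1["id"] == attrs2["id"] else MISMATCH_COST

distance = nx.graph_edit_distance(G, G2, node_subst_cost=node_subst_cost)
print(distance)
```

When node_subst_cost is given, networkx uses it in place of node_match, so the two should not be combined.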

This is another solution, which also produces 2:
import networkx as nx
G=nx.DiGraph()
G.add_node(1, id=1)
G.add_node(2, id=2)
G.add_node(3, id=3)
G.add_edges_from([(1, 2),(2,3)])
G2=nx.DiGraph()
G2.add_node(1, id=1)
G2.add_node(2, id=2)
G2.add_node(3, id=3)
G2.add_edges_from([(3, 2),(2,1)])
# cost functions for nodes; each argument is a node attribute dict
def node_subst_cost(node1, node2):
    # no cost if the node attributes are equal, otherwise 1
    if node1 == node2:
        return 0
    return 1
def node_del_cost(node):
    return 1 # here you apply the cost for node deletion
def node_ins_cost(node):
    return 1 # here you apply the cost for node insertion
# cost functions for edges; each argument is an edge attribute dict
def edge_subst_cost(edge1, edge2):
    # no cost if the edge attributes are equal, otherwise 1
    if edge1 == edge2:
        return 0
    return 1
def edge_del_cost(edge):
    return 1 # here you apply the cost for edge deletion
def edge_ins_cost(edge):
    return 1 # here you apply the cost for edge insertion
paths, cost = nx.optimal_edit_paths(
G,
G2,
node_subst_cost=node_subst_cost,
node_del_cost=node_del_cost,
node_ins_cost=node_ins_cost,
edge_subst_cost=edge_subst_cost,
edge_del_cost=edge_del_cost,
edge_ins_cost=edge_ins_cost
)
print(cost)
If you run it on Python 2.7, add the following lines to the header
# This Python file uses the following encoding: utf-8
from __future__ import print_function, unicode_literals
from __future__ import absolute_import, division
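For the special case described in the question (both graphs are directed, nodes must keep their identity, there are no labels, and every insertion or deletion costs 1), the distance can also be counted directly, without any search. The sketch below is not part of networkx; identity_edit_distance is a hypothetical helper that simply counts the nodes and edges present in one graph but not the other.

```python
import networkx as nx

def identity_edit_distance(g1, g2):
    """Count unit-cost edits between two DiGraphs whose nodes keep their identity.

    Assumes directed graphs without attributes: an edge (u, v) only matches
    the same ordered pair (u, v) in the other graph.
    """
    node_edits = len(set(g1.nodes) ^ set(g2.nodes))  # nodes to insert or delete
    edge_edits = len(set(g1.edges) ^ set(g2.edges))  # edges to insert or delete
    return node_edits + edge_edits

G = nx.DiGraph([(1, 2), (2, 3)])
G2 = nx.DiGraph([(3, 2), (2, 1)])
print(identity_edit_distance(G, G2))  # 4: delete 2 edges, insert 2 edges
```

This runs in time linear in the size of the graphs, whereas the general graph edit distance search can become very slow as graphs grow.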

Related

Graph generation using Networkx

I would like to generate undirected graphs without self-loops using the networkx library, based on the following idea: we start with two nodes and one edge connecting them. On each subsequent iteration we create a new node and connect it with two edges to randomly chosen nodes of the graph. So the next iteration gives 3 nodes and 3 edges (a cycle), then 4 nodes and 5 edges, 5 nodes and 7 edges, etc.
Here is my approach:
import random
import networkx as nx
def generateGraph(n):
    G = nx.Graph()
    G.add_node(0)
    G.add_node(1)
    G.add_edge(0, 1)
    nodes = []
    while G.number_of_nodes() < n:
        new_node = G.number_of_nodes()
        new_edge = G.number_of_edges()
        G.add_node(new_node)
        destination = random.choice(new_node)
        nodes.append(destination)
        G.add_edge(new_node, new_edge)
        G.add_edge(node, new_edge)
    return G
What am I missing here and how can I change my approach? The problem is that this model produces only n nodes and n+1 edges, whereas each new node should add 2 edges (for example, 7 edges for 5 nodes). Thank you.
Something like this should follow your algorithm:
import random
import networkx as nx
def generate_graph(n):
    G = nx.Graph([(0, 1)]) # start from a single edge between nodes 0 and 1
    for new_node in range(2, n):
        # pick two distinct existing nodes at random
        nodes_to_take = random.sample(list(G.nodes), 2)
        # connect the new node to both chosen nodes
        G.add_edges_from([
            (new_node, nodes_to_take[0]), (new_node, nodes_to_take[1])
        ])
    return G
Result:
gr = generate_graph(7)
nx.draw_networkx(gr)
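The counts stated in the question (3 nodes and 3 edges, 4 and 5, 5 and 7) correspond to 2n - 3 edges for n nodes, since the graph starts with one edge and every later node adds exactly two. A small check along those lines is sketched below; the seed parameter is an addition for reproducibility and is not part of the answer above.

```python
import random
import networkx as nx

def generate_graph(n, seed=None):
    """Grow a graph from one edge; each new node attaches to two random nodes."""
    rng = random.Random(seed)  # seed is a hypothetical extra, for repeatable runs
    G = nx.Graph([(0, 1)])
    for new_node in range(2, n):
        first, second = rng.sample(list(G.nodes), 2)  # two distinct targets
        G.add_edges_from([(new_node, first), (new_node, second)])
    return G

for n in range(2, 8):
    G = generate_graph(n, seed=0)
    # n nodes, one starting edge plus two per added node, and no self-loops
    print(n, G.number_of_nodes(), G.number_of_edges(), 2 * n - 3)
```

Because random.sample draws without replacement and the new node is not yet in the graph when the targets are chosen, neither duplicate edges nor self-loops can occur.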

Remove weights from networkx graph

I have a weighted Networkx graph G. I first want to make some operation on G with weights (which is why I just don't read the input and set weights=None) and then remove them from G afterwards. What is the most straightforward way to make it unweighted?
I could just do:
G = nx.from_scipy_sparse_array(nx.to_scipy_sparse_array(G,weight=None))
Or loop through the G.adj dictionary and set weights=0, but both of these options feel too complicated. I am looking for something like:
G = G.drop_weights()
It is possible to access the data structure of the networkx graphs directly and remove any unwanted attributes.
In the end, what you can do is define a function that loops over the dictionaries and removes the "weight" attribute.
def drop_weights(G):
    '''Drop the weights from a networkx weighted graph.'''
    for node, edges in nx.to_dict_of_dicts(G).items():
        for edge, attrs in edges.items():
            attrs.pop('weight', None)
and an example of usage:
import networkx as nx
def drop_weights(G):
    '''Drop the weights from a networkx weighted graph.'''
    for node, edges in nx.to_dict_of_dicts(G).items():
        for edge, attrs in edges.items():
            attrs.pop('weight', None)
G = nx.Graph()
G.add_weighted_edges_from([(1,2,0.125), (1,3,0.75), (2,4,1.2), (3,4,0.375)])
print(nx.is_weighted(G)) # True
F = nx.Graph(G)
print(nx.is_weighted(F)) # True
# OP's suggestion
F = nx.from_scipy_sparse_array(nx.to_scipy_sparse_array(G,weight=None))
print(nx.is_weighted(F)) # True
# Correct solution
drop_weights(F)
print(nx.is_weighted(F)) # False
Note that even reconstructing the graph without the weights through nx.to_scipy_sparse_array is not enough, because the rebuilt graph still carries weights; they are just all set to 1.
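A somewhat shorter variant of the same idea, as a sketch: iterating over G.edges(data=True) yields each edge's attribute dictionary, and removing the "weight" key from it modifies the graph in place. The example graph is the one used above.

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 0.125), (1, 3, 0.75), (2, 4, 1.2), (3, 4, 0.375)])
print(nx.is_weighted(G))  # True

# Each attrs dict is the graph's own edge data, so pop() changes G directly.
for _, _, attrs in G.edges(data=True):
    attrs.pop("weight", None)

print(nx.is_weighted(G))  # False
```

Other edge attributes, if any, are left untouched.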

NetworkX DiGraphMatcher returns no results on directed graphs?

I have a large graph in which I want to find a subgraph isomorphism using the built-in VF2 algorithm in NetworkX. Both the 'haystack' as well as 'needle' graphs are directed. Take the following trivial example:
import networkx as nx
from networkx.algorithms.isomorphism import DiGraphMatcher
G1 = nx.complete_graph(20, nx.DiGraph)
G2 = nx.DiGraph()
G2.add_edge(1, 2)
list(DiGraphMatcher(G1, G2).subgraph_isomorphisms_iter())
The final line returns an empty list [].
My understanding is that this should return all edges in the graph, and indeed, if I substitute GraphMatcher for DiGraphMatcher, I get a list of all edges.
Is there something wrong with DiGraphMatcher, or perhaps something wrong with my understanding of what DiGraphMatcher should be doing?
Versions:
Python: 3.7.7
NetworkX: 2.4
Example undirected graph code (replaces all DiGraph with Graph, otherwise the same):
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher
G1 = nx.complete_graph(20, nx.Graph)
G2 = nx.Graph()
G2.add_edge(1, 2)
list(GraphMatcher(G1, G2).subgraph_isomorphisms_iter())
Answering my own question after many hours of sorrow. I was hoping this was going to be an interesting technical question. Turns out it's just a run-of-the-mill nomenclature question!
NetworkX defines a subgraph isomorphism as the following:
If G'=(N',E') is a node-induced subgraph, then:
N' is a subset of N
E' is the subset of edges in E relating nodes in N'
(Taken from networkx inline code comments.)
It defines a monomorphism as the following:
If G'=(N',E') is a monomorphism, then:
N' is a subset of N
E' is a subset of the set of edges in E relating nodes in N'
And further, notes:
Note that if G' is a node-induced subgraph of G, then it is always a
subgraph monomorphism of G, but the opposite is not always true, as a
monomorphism can have fewer edges.
In other words, because there are other edges involved in this graph than are described by the G2 graph, the DiGraphMatcher considers the set of edges E' to be not equal to the subset of edges in E relating nodes in N'.
Instead, the edges in E' are a subset of the set of edges in E relating nodes in N', and so networkx calls this a monomorphism instead.
To better illustrate this point, consider the following:
import networkx as nx
from networkx.algorithms.isomorphism import DiGraphMatcher
G1 = nx.DiGraph()
G1.add_edge(1, 2)
G1.add_edge(2, 1)
G2 = nx.DiGraph()
G2.add_edge(1, 2)
print(list(DiGraphMatcher(G1, G2).subgraph_isomorphisms_iter()))
print(list(DiGraphMatcher(G1, G2).subgraph_monomorphisms_iter()))
This will print the following:
[] # subgraph ISOmorphism
[{1: 1, 2: 2}, {2: 1, 1: 2}] # subgraph MONOmorphism
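Applied to the example from the question, the same distinction can be sketched as follows. This assumes a networkx version that provides subgraph_monomorphisms_iter (2.4, the version mentioned in the question, or later). Since the complete directed graph on 20 nodes has an edge for every ordered pair of distinct nodes, one mapping per ordered pair is expected, which is 20 * 19 = 380.

```python
import networkx as nx
from networkx.algorithms.isomorphism import DiGraphMatcher

G1 = nx.complete_graph(20, nx.DiGraph)  # haystack: every ordered pair is an edge
G2 = nx.DiGraph([(1, 2)])               # needle: a single directed edge

# Node-induced matching: any two nodes of G1 induce two edges, so nothing matches.
iso = list(DiGraphMatcher(G1, G2).subgraph_isomorphisms_iter())

# Monomorphism: extra edges in G1 are allowed, so every ordered pair matches.
mono = list(DiGraphMatcher(G1, G2).subgraph_monomorphisms_iter())

print(len(iso), len(mono))
```

Each mapping is a dictionary from nodes of G1 to nodes of G2.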

How to combine two edges and nodes into one when they have a common starting node in NetworkX?

I am quite new to networkx and I am asking for help from the Stack Overflow community.
I am trying to combine nodes and edges that have a common starting node as shown below in the figure. The arrow shows the expected result.
nodes_to_combine = [n for n in graph.nodes if len(list(graph.neighbors(n))) == 2]
for node in nodes_to_combine:
    graph.add_edge(*graph.neighbors(node))
nx.draw(graph, with_labels=True)
Can anyone help me figure this out?
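NetworkX can merge two nodes with nx.contracted_nodes, but the merged node keeps the name of the first one. If you want a new node with a combined name, you can implement the merge manually. Here is an example without attribute merging (which can have its own logic):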
def merge(G, n1, n2):
    # Get all predecessors and successors of two nodes
    pre = set(G.predecessors(n1)) | set(G.predecessors(n2))
    suc = set(G.successors(n1)) | set(G.successors(n2))
    # Create the new node with combined name
    name = str(n1) + '/' + str(n2)
    # Add predecessors and successors edges
    # We have DiGraph so there should be one edge per nodes pair
    G.add_edges_from([(p, name) for p in pre])
    G.add_edges_from([(name, s) for s in suc])
    # Remove old nodes
    G.remove_nodes_from([n1, n2])
Here is how it works:
import networkx as nx
G = nx.DiGraph()
G.add_edges_from([
('0','20'),
('10','20'),
('10','30'),
('20','40'),
('30','50'),
])
nx.draw(
G,
pos=nx.nx_agraph.graphviz_layout(G, prog='dot'),
node_color='#FF0000',
with_labels=True
)
merge(G, '20', '30')
nx.draw(
G,
pos=nx.nx_agraph.graphviz_layout(G, prog='dot'),
node_color='#FF0000',
with_labels=True
)
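As an alternative sketch, the same merged graph can be obtained with two existing networkx functions: nx.contracted_nodes folds the second node into the first, and nx.relabel_nodes then gives the result the combined name. The example reuses the edges from the graph above and skips the drawing, so it does not need Graphviz.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ('0', '20'),
    ('10', '20'),
    ('10', '30'),
    ('20', '40'),
    ('30', '50'),
])

# Fold node '30' into node '20'; by default this returns a new graph.
H = nx.contracted_nodes(G, '20', '30', self_loops=False)
# Rename the surviving node so the name reflects both originals.
H = nx.relabel_nodes(H, {'20': '20/30'})

print(sorted(H.edges()))
```

Unlike the manual merge function, this leaves the original graph G unchanged, and contracted_nodes records what was contracted in a 'contraction' attribute.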

Generate a directed graph using any Python library

I am implementing the Bellman-Ford algorithm from GeeksForGeeks in Python. I want to generate the graph (in diagrammatic form, not as a dictionary, which is easy) using some library like pyplot or networkx or something similar. I want the graph UI to show the nodes, the edges, and their respective costs.
from collections import defaultdict
# Class to represent a graph
class Graph:
    def __init__(self, vertices):
        self.V = vertices # No. of vertices
        self.graph = [] # list of [u, v, w] edges
    # function to add an edge to graph
    def addEdge(self, u, v, w):
        self.graph.append([u, v, w])
    # utility function used to print the solution
    def printArr(self, dist):
        print("Vertex Distance from Source")
        for i in range(self.V):
            print("%d \t\t %d" % (i, dist[i]))
    # The main function that finds shortest distances from src to
    # all other vertices using Bellman-Ford algorithm. The function
    # also detects negative weight cycle
    def BellmanFord(self, src):
        # Step 1: Initialize distances from src to all other vertices
        # as INFINITE
        dist = [float("Inf")] * self.V
        dist[src] = 0
        # Step 2: Relax all edges |V| - 1 times. A simple shortest
        # path from src to any other vertex can have at-most |V| - 1
        # edges
        for i in range(self.V - 1):
            # Relax every edge (u, v, w): update dist[v] if going
            # through u gives a shorter path
            for u, v, w in self.graph:
                if dist[u] != float("Inf") and dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
        # Step 3: check for negative-weight cycles. The above step
        # guarantees shortest distances if graph doesn't contain
        # negative weight cycle. If we get a shorter path, then there
        # is a cycle.
        for u, v, w in self.graph:
            if dist[u] != float("Inf") and dist[u] + w < dist[v]:
                print("Graph contains negative weight cycle")
                return
        # print all distance
        self.printArr(dist)
g = Graph(5)
g.addEdge(0, 1, -1)
g.addEdge(0, 2, 4)
g.addEdge(1, 2, 3)
g.addEdge(1, 3, 2)
g.addEdge(1, 4, 2)
g.addEdge(3, 2, 5)
g.addEdge(3, 1, 1)
g.addEdge(4, 3, -3)
The graph that I want either in terminal or in separate file is(based on above code):
The link of documentation by ekiim is highly useful. This is the code that I did for plotting graph:
import networkx as nx
import matplotlib.pyplot as plt
G=nx.DiGraph()
G.add_nodes_from([0, 1, 2, 3, 4])
G.add_edges_from([(0, 1), (1, 2), (0, 2), (1, 4), (1, 3), (3, 2), (3, 1), (4, 3)])
nx.draw(G, with_labels=True, font_weight='bold')
plt.show()
This code draws the directed graph without costs. I tried drawing it with costs, but the output was highly distorted, with the costs jumbled up: some were written in blank space and only one or two appeared on the edges. If someone knows how to implement that, it would be highly useful.
If you check this tutorial for networkx, you'll see how easy it is to create a directed graph and plot it.
The API is pretty much the same for a directed or a simple graph, and the plotting is also simple enough and uses Matplotlib to generate it.
You could make a Tk app that lets you enter the nodes and edges manually, stores them in ListBoxes, and plots a graph from them. This won't be drag and drop, but it still helps you visualize the graph on the fly.
This Matplotlib tutorial will give you an idea of how to embed it in a Tk app.
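For the edge costs that the plotting code in the question leaves out, one possible sketch is shown below. It stores each cost as a "weight" edge attribute and draws the labels with nx.draw_networkx_edge_labels, passing the same pos layout to both drawing calls; using different layouts for the graph and the labels is one common cause of labels landing in blank space, though that is only a guess at what went wrong in the question. The names EDGES, build_graph and draw_with_costs are made up for this example.

```python
import networkx as nx

# (u, v, cost) triples copied from the addEdge calls in the question.
EDGES = [
    (0, 1, -1), (0, 2, 4), (1, 2, 3), (1, 3, 2),
    (1, 4, 2), (3, 2, 5), (3, 1, 1), (4, 3, -3),
]

def build_graph(edges):
    """Create a DiGraph whose edges carry their cost in the 'weight' attribute."""
    G = nx.DiGraph()
    G.add_weighted_edges_from(edges)
    return G

def draw_with_costs(G):
    """Draw G and write each edge's cost next to it."""
    import matplotlib.pyplot as plt  # imported here so building the graph needs no display

    pos = nx.circular_layout(G)  # one layout, shared by nodes, edges and labels
    nx.draw(G, pos, with_labels=True, font_weight='bold')
    labels = nx.get_edge_attributes(G, 'weight')
    # label_pos away from 0.5 keeps the labels of 1->3 and 3->1 from overlapping
    nx.draw_networkx_edge_labels(G, pos, edge_labels=labels, label_pos=0.3)
    plt.show()

G = build_graph(EDGES)
```

Calling draw_with_costs(G) should then open a window with the labeled graph.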
