Coalesce 2 nodes in a networkx graph - python

I was going through the blockmodel function in networkx. It seems very similar to what I want.
I wish to coalesce two nodes in a networkx graph and replace them with a single node labeled like either of the nodes being joined. The rest of the nodes should remain as they are, with no change in their names. (The node joining rules are as explained in the blockmodel tutorial.)
As far as I understand, blockmodel requires creating explicit partitions of the whole graph before it can be used, which is not so convenient.
There is also no control over the names of the blocks formed (i.e. the nodes of the new graph).
How can I achieve this seemingly simpler task of collapsing 2 nodes into one? I want to do it over an undirected graph with weighted edges.

Here's my attempt at creating a coalesce function for a graph coloring program I'm working on. It, however, does not work with weighted edges.
import networkx as nx
# G is a graph created using nx
# k is the number of colors
k = 5
# inputs node1 and node2 are labels of nodes, e.g. 'a' or 'b' etc.
def coalesce(G, node1, node2):
    """Performs Briggs coalescing. Takes in the graph and two nodes.
    Returns 1 if unable to coalesce, 0 otherwise."""
    if G.has_edge(node1, node2):
        print("Cannot coalesce. Node", node1, "and node", node2, "share an edge")
        return 1
    elif G.degree(node1) + G.degree(node2) >= k:
        print("Cannot coalesce. Combined degree of", node1, "and", node2,
              "is", G.degree(node1) + G.degree(node2), "which is too high for k =", k)
        return 1
    else:
        # Reattach every neighbor of node2 to node1, then drop node2
        newedge = [(node1, nbr) for nbr in G.neighbors(node2)]
        G.add_edges_from(newedge)
        G.remove_node(node2)
        # The merged node is renamed to the concatenation of both labels
        nx.relabel_nodes(G, {node1: node1 + node2}, copy=False)
        return 0
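For the weighted case asked about above, one possible approach is to move the edges of the dropped node onto the kept node and combine weights when both nodes share a neighbor. The sketch below is only an illustration: the function name merge_nodes is hypothetical, summing the colliding weights is an assumption about the desired joining rule, and a missing weight is assumed to count as 1.

```python
import networkx as nx

def merge_nodes(G, keep, drop, weight="weight"):
    """Merge `drop` into `keep` in place; `keep` retains its label.

    Assumption: when both nodes have an edge to the same neighbor,
    the two weights are summed. A missing weight counts as 1.
    """
    for nbr, data in list(G[drop].items()):
        if nbr == keep:
            continue  # the edge between the two merged nodes is discarded
        w = data.get(weight, 1)
        if G.has_edge(keep, nbr):
            G[keep][nbr][weight] = G[keep][nbr].get(weight, 1) + w
        else:
            G.add_edge(keep, nbr, **{weight: w})
    G.remove_node(drop)
    return G

G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "c", 1.0), ("b", "c", 2.0), ("b", "d", 3.0), ("a", "b", 5.0)]
)
merge_nodes(G, "a", "b")
print(sorted(G.edges(data="weight")))
```

All other node names are left untouched, and swapping the keep and drop arguments chooses which of the two labels survives.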

Related

How to generate all directed permutations of an undirected graph?

I am looking for a way to generate all possible directed graphs from an undirected template. For example, given this graph "template":
I want to generate all six of these directed versions:
In other words, for each edge in the template, choose LEFT, RIGHT, or BOTH direction for the resulting edge.
There is a huge number of outputs for even a small graph, because there are 3^E valid permutations (where E is the number of edges in the template graph), but many of them are duplicates (specifically, they are automorphic to another output). Take these two, for example:
I only need one.
I'm curious first: Is there a term for this operation? This must be a formal and well-understood process already?
And second, is there a more efficient algorithm to produce this list? My current code (Python, NetworkX, though that's not important for the question) looks like this, which has two things I don't like:
I generate all permutations even if they are isomorphic to a previous graph
I check isomorphism at the end, so it adds additional computational cost
Results := Empty List
T := The Template (Undirected Graph)
For i in range(3^E):
    Create an empty directed graph G
    Convert i to trinary
    For each nth edge (A, B) in T:
        If the nth digit of i in trinary is 1:
            Add the edge to G as (A, B)
        If the nth digit of i in trinary is 2:
            Add the edge to G as (B, A)
        If the nth digit of i in trinary is 0:
            Add the reversed AND forward edges to G
    For every graph R in Results:
        If G is isomorphic to R, discard G and continue with the next i
    Add G to Results
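The pseudocode above can be written as runnable Python roughly as follows. This is a sketch of the same brute-force idea, not a more efficient algorithm: it still enumerates all 3^E orientations and filters with nx.is_isomorphic. The function name directed_variants and the state labels "fwd", "rev", and "both" are made up for illustration, and itertools.product stands in for the trinary counter.

```python
import itertools
import networkx as nx

def directed_variants(template):
    """Return one directed graph per isomorphism class of edge orientations."""
    edges = list(template.edges())
    results = []
    # Each template edge is oriented forward, backward, or in both directions.
    for choice in itertools.product(("fwd", "rev", "both"), repeat=len(edges)):
        g = nx.DiGraph()
        g.add_nodes_from(template.nodes())
        for (a, b), state in zip(edges, choice):
            if state in ("fwd", "both"):
                g.add_edge(a, b)
            if state in ("rev", "both"):
                g.add_edge(b, a)
        # Keep g only if no previously kept graph is isomorphic to it.
        if not any(nx.is_isomorphic(g, seen) for seen in results):
            results.append(g)
    return results

template = nx.path_graph(3)  # three nodes in a line, two edges
variants = directed_variants(template)
print(len(variants))
```

For the three-node path used here, 9 orientations are generated and 6 remain after removing isomorphic duplicates.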

Retrieving original node names in Networkit

I am not sure I understand how Networkit handles the names of the nodes.
Let's say that I read a large graph from an edgelist, using another Python module like Networkx; then I convert it to a Networkit graph and I perform some operations, like computing the pairwise distances. A simple piece of code to do this could be:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
easy-peasy.
Now, what if I want to do something with those distances? For example, what if I want to plot them against, I don’t know, the weights on the paths, or any other measure that requires the retrieval of the original node ids?
The getDistances() function returns a list of lists, one for each node with the distance to every other node, but I have no clue on how Networkit maps the nodes’ names to the sequence of ints that it uses as nodes identifiers, thus the order it followed to compute the distances and store them in the output.
When creating a new graph from networkx, NetworKit creates a dictionary that maps each node id in nxG to an unique integer from 0 to n - 1 in G (where n is the number of nodes) with this instruction.
Unfortunately, this mapping is not returned by nx2nk, so you should create it yourself.
Let's assume that you want to get a distance from node 1 to node 2, where 1 and 2 are node ids in nxG:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
# Get mapping from node ids in nxG to node ids in G
idmap = dict((id, u) for (id, u) in zip(nxG.nodes(), range(nxG.number_of_nodes())))
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
# Get distance from node `1` to node `2`
dist_from_1_to_2 = dist[idmap['1']][idmap['2']]
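Going the other way, from the integer ids back to the original names, only needs the inverse of idmap. Here is a minimal sketch that runs without either library: the list node_names is a hypothetical stand-in for list(nxG.nodes()), and the small hand-written matrix stands in for the result of apsp.getDistances().

```python
# Stand-ins so the sketch is self-contained (both are assumptions):
node_names = ["a", "b", "c"]          # plays the role of list(nxG.nodes())
dist = [[0.0, 1.0, 3.0],              # plays the role of apsp.getDistances()
        [1.0, 0.0, 2.0],
        [3.0, 2.0, 0.0]]

# Forward map (original name -> integer id), built as in the answer above
idmap = {name: i for i, name in enumerate(node_names)}
# Inverse map (integer id -> original name)
inverse = {i: name for name, i in idmap.items()}

# Re-key the distance matrix by pairs of original names
named_dist = {
    (inverse[i], inverse[j]): d
    for i, row in enumerate(dist)
    for j, d in enumerate(row)
}
print(named_dist[("a", "c")])
```

With the distances keyed by original names, they can be joined against any other per-node or per-pair measure computed in networkx.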

Networkx: how to use node_match argument in optimize_graph_edit_distance?

I'm trying to compare two graphs with the help of the optimize_graph_edit_distance algorithm on NetworkX:
optimize_graph_edit_distance(G1, G2,
node_match=None,
node_subst_cost=None,
edge_match=None,
node_del_cost=None,
edge_subst_cost=None,
edge_del_cost=None,
edge_ins_cost=None,
upper_bound=None)
I gave each node on both graphs a set amount of attributes in the form of a dictionary and with the help of node_match I can specify if nodes N1 and N2 should be considered equal during matching.
The function node_match should be called like so:
node_match(G1.nodes[n1], G2.nodes[n2]), that is, it receives the attribute dictionaries of nodes n1 and n2 as input.
My problem is that I have more than one node in each graph. Therefore, how do I give the function all other attribute dictionaries to compare all other nodes?
node_match is a function that returns True if node n1 in G1 and n2 in G2 should be considered equal during matching. For example:
import networkx as nx
G1 = nx.DiGraph()
G1.add_nodes_from([(0, {'label':'a'}), (1, {'label':'b'}),(2, {'label':'c'})])
G1.add_edges_from([(0,1),(0,2)])
G2 = nx.DiGraph()
G2.add_nodes_from([(3, {'label':'a'}), (4, {'label':'b'}),(5, {'label':'c'})])
G2.add_edges_from([(3,4),(3,5)])
print(G1.nodes())
print(G1.edges())
print(G2.nodes())
print(G2.edges())
for dist in nx.algorithms.similarity.optimize_graph_edit_distance(G1, G2, node_match=lambda a,b: a['label'] == b['label']):
    print(dist)
Here, even though the node identifiers in the two graphs are different, the distance will be zero. That's because we defined the function that compares 2 nodes as lambda a,b: a['label'] == b['label'], meaning that two nodes are considered equal during matching if they have the same 'label' value.
Similarly, you can implement any logic you wish to without specifically treating every pair of nodes in your graphs.
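To make the callback's inputs concrete, here is a small sketch with a named function that compares two attributes instead of one. The attribute "kind" and the function name same_label_and_kind are hypothetical. The sketch calls the function by hand on every cross-graph pair, only to show what kind of arguments it receives; in real use you would pass it as node_match=same_label_and_kind and let the algorithm decide which pairs to test.

```python
import networkx as nx

def same_label_and_kind(attrs1, attrs2):
    # Receives two attribute dictionaries, one from each graph.
    return (attrs1.get("label") == attrs2.get("label")
            and attrs1.get("kind") == attrs2.get("kind"))

G1 = nx.Graph()
G1.add_nodes_from([(0, {"label": "a", "kind": "x"}),
                   (1, {"label": "b", "kind": "y"})])
G2 = nx.Graph()
G2.add_nodes_from([(7, {"label": "a", "kind": "x"}),
                   (8, {"label": "b", "kind": "z"})])

# Evaluate the callback on every (n1, n2) pair by hand, for illustration only.
verdicts = {
    (n1, n2): same_label_and_kind(G1.nodes[n1], G2.nodes[n2])
    for n1 in G1
    for n2 in G2
}
print(verdicts)
```

Only nodes 0 and 7 agree on both attributes, so that is the only pair the callback reports as equal.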

Is there a method in networkx to get the top 10 nodes by clustering coefficient?

I am working on finding the top 10 nodes according to the clustering coefficient, so far I have tried this
clus_coef = []
for node in G.nodes():
    clus_coef.append(nx.clustering(G, node))
I am not sure how to get the top 10 nodes along with their coefficients.
You first get the clustering result clus_coef and then sort the dictionary based on values. Use the Counter.most_common() method, it'll sort the items for you. You can then return top N items:
import networkx as nx
clus_coef = nx.clustering(G)
from collections import Counter
c = Counter(clus_coef)
c.most_common(10)
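An alternative that avoids wrapping the result in a Counter is heapq.nlargest, which picks the top items directly from the dictionary returned by nx.clustering. A minimal sketch, using the built-in karate club graph as a stand-in for your G:

```python
import heapq
import networkx as nx

G = nx.karate_club_graph()  # stand-in graph for illustration
clus_coef = nx.clustering(G)  # dict: node -> clustering coefficient

# Ten (node, coefficient) pairs with the largest coefficients, in descending order
top10 = heapq.nlargest(10, clus_coef.items(), key=lambda item: item[1])
for node, coef in top10:
    print(node, coef)
```

Ties are common for clustering coefficients (many nodes can have a value of exactly 1.0), so which tied nodes make the cut is not something either approach lets you control.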

Identify first node after source node having two neighbors in NetworkX DiGraph

I am implementing a DiGraph with NetworkX. The source is the red node. I need to identify the first node, starting at the red one, that has two neighbors (in "flow direction"). If I iterate over all nodes, the order seems random. Would be great if someone could help!
You can use the successors method. If your DiGraph instance is called G, and your red node has index 0 then you can take a breadth first search approach like this:
import networkx as nx
# Construct graph from example image, all edges pointing away from source
G = nx.DiGraph()
nx.add_path(G, [0, 1, 2, 3, 4])
nx.add_path(G, [1, 5])
nx.add_path(G, [3, 6])
nx.add_path(G, [2, 7, 8])
# Find first with 2 neighbors
# successors() returns an iterator, so copy it into a list we can extend
neighbors = list(G.successors(0))
for n in neighbors:
    nneighbors = set(G.successors(n))
    if len(nneighbors) == 2:
        print("Found", n)
        break
    # Items appended here are visited later by the same loop
    neighbors.extend(nneighbors)
The neighbors method is interchangeable with successors for a DiGraph in networkx. If you also want to count incoming edges for each node, add G.predecessors(n) to the neighbors you count, but remember not to include them when you extend neighbors. The code would then be:
# Find first with 2 neighbors (counting incoming and outgoing edges)
neighbors = list(G.successors(0))
for n in neighbors:
    if len(list(G.predecessors(n)) + list(G.successors(n))) == 2:
        print("Found", n)
        break
    nneighbors = set(G.successors(n))
    neighbors.extend(nneighbors)
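If the graph may contain cycles, the loops above could revisit nodes indefinitely. The sketch below wraps the same breadth-first idea in a function that tracks visited nodes. The function name first_branching_node and the n_out parameter are made up for illustration, and it counts outgoing edges only.

```python
from collections import deque
import networkx as nx

def first_branching_node(G, source, n_out=2):
    """Return the first node reached breadth first from `source`
    that has exactly `n_out` successors, or None if there is none."""
    seen = {source}
    queue = deque()
    for s in G.successors(source):
        if s not in seen:
            seen.add(s)
            queue.append(s)
    while queue:
        n = queue.popleft()
        succ = list(G.successors(n))
        if len(succ) == n_out:
            return n
        for s in succ:
            if s not in seen:  # skip nodes already queued or visited
                seen.add(s)
                queue.append(s)
    return None

G = nx.DiGraph()
nx.add_path(G, [0, 1, 2, 3, 4])
nx.add_path(G, [1, 5])
nx.add_path(G, [3, 6])
nx.add_path(G, [2, 7, 8])
print(first_branching_node(G, 0))
```

Like the loops above, this starts checking at the successors of the source, so the source itself is never returned.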
