I am not sure I understand how NetworKit handles the names of the nodes.
Let's say that I read a large graph from an edgelist, using another Python module like NetworkX; then I convert it to a NetworKit graph and I perform some operations, like computing the pairwise distances. A simple piece of code to do this could be:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
Easy-peasy.
Now, what if I want to do something with those distances? For example, what if I want to plot them against, I don't know, the weights on the paths, or any other measure that requires retrieving the original node ids?
The getDistances() function returns a list of lists, one for each node, with the distance to every other node. But I have no clue how NetworKit maps the nodes' names to the sequence of ints that it uses as node identifiers, and thus which order it followed to compute the distances and store them in the output.
When creating a new graph from networkx, NetworKit creates a dictionary that maps each node id in nxG to a unique integer from 0 to n - 1 in G (where n is the number of nodes) with this instruction.
Unfortunately, this mapping is not returned by nx2nk, so you should create it yourself.
Let's assume that you want to get a distance from node 1 to node 2, where 1 and 2 are node ids in nxG:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
# Get mapping from node ids in nxG to node ids in G
idmap = dict((id, u) for (id, u) in zip(nxG.nodes(), range(nxG.number_of_nodes())))
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
# Get distance from node `1` to node `2`
dist_from_1_to_2 = dist[idmap['1']][idmap['2']]
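To go the other way (from integer ids back to the original names), the same idmap convention can be inverted. The following is only a sketch: the small graph, the names variable and dist_by_name are made up for illustration, and the dist matrix is computed with networkx purely as a stand-in for apsp.getDistances(), so that the example runs without NetworKit installed.

```python
import networkx as nx

# Small weighted graph standing in for the one read from 'test.edgelist'.
nxG = nx.Graph()
nxG.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0)])

# Same convention as in the answer: the k-th node of nxG.nodes() gets id k.
idmap = {name: k for k, name in enumerate(nxG.nodes())}
names = list(nxG.nodes())  # inverse mapping: integer id -> original name

# Stand-in for apsp.getDistances(): a list of lists indexed by integer ids.
# (Assumption: computed with networkx here only to keep the sketch runnable.)
lengths = dict(nx.all_pairs_dijkstra_path_length(nxG, weight="weight"))
dist = [[lengths[u][v] for v in names] for u in names]

# Re-key the matrix by the original node names.
dist_by_name = {
    names[i]: {names[j]: d for j, d in enumerate(row)}
    for i, row in enumerate(dist)
}
```

With this, dist_by_name['a']['c'] and dist[idmap['a']][idmap['c']] refer to the same entry, which makes it easier to join the distances with other per-node data.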
I have a directed acyclic graph and have specific requirements on the topological sort:
Depth first search: I want each branch to reach an end before a new branch is added to the sorting
Several nodes have multiple outgoing edges. For those nodes I have a sorted list of successor nodes, that is to be used in choosing with which node to continue the sorting.
Example:
When the node n is reached, that has three successors m1, m2, m3 of which each one of them would be a valid option to continue, I would provide a list such as [m3, m1, m2] that would indicate to continue with the node m3.
I am using networkx. I thought about iterating through the nodes with
sorting = []
for n in dfs_edges(dag, source = 'root'):
    sorting.append(n[0])
or using the method dfs_preorder_nodes, but I have not found a way to make either of them use the list.
Any hints?
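One possible direction, sketched under assumptions: a reversed depth-first postorder is a valid topological order, and visiting the successors in reverse preference order makes the preferred successor appear first after its parent. The function name dfs_topological_sort, the plain dict used for the successor lists, and the example DAG are all hypothetical; this does not use any networkx API.

```python
def dfs_topological_sort(successors, root):
    """Return a topological order of the nodes reachable from `root`.

    successors: dict mapping node -> list of successors in preferred order.
    """
    visited = set()
    postorder = []

    def visit(node):
        visited.add(node)
        # Visit in reverse preference so the preferred successor finishes last
        # and therefore comes first once the postorder is reversed.
        for nxt in reversed(successors.get(node, [])):
            if nxt not in visited:
                visit(nxt)
        postorder.append(node)

    visit(root)
    return postorder[::-1]


# Hypothetical DAG: node 'n' prefers m3, then m1, then m2.
dag = {
    "root": ["n"],
    "n": ["m3", "m1", "m2"],
    "m3": ["x"],
    "m1": ["x"],
    "m2": [],
    "x": [],
}
order = dfs_topological_sort(dag, "root")
```

Note that a node with several predecessors (x here) can only be placed after all of them, so a branch may be interrupted at such a node. The recursion is also limited by Python's recursion depth on very deep graphs.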
Hello, I'm using the networkx library. I have created a graph, but I'm having trouble finding paths to multiple targets. The targets are a bit tricky, because a node counts as a target when a given substring occurs within its name.
Example:
Nodes = ['C0111', 'N6186', 'C5572', 'N6501', 'C0850-IASW-NO01', 'C1182-IUPE-NO01']
Edges = [('C0111','N6186'),('N6186','C0850-IASW-NO01'),('C0111','C5572'),('C5572','N6501'),('N6501','C1182-IUPE-NO01')]
Problem:
Source = 'C0111'
Target = ['IASW','IUPE']
Some special nodes are considered targets (8 of them), including nodes containing 'IUPE', 'IASW', etc.
I can create graph using networkx.
import networkx as nx
G = nx.Graph()
G.add_nodes_from(Nodes)
G.add_edges_from(Edges)
nx.shortest_path(G,source='C0111',target=?)
For multiple targets I can iterate over them, but I'm confused about how to match a substring within a node name.
Example:
Normal way ==> nx.shortest_path(G,source='C0111',target='C0850-IASW-NO01')
'C0850-IASW-NO01' => that's how the node is created,
but I want to check whether the target has IASW or IUPE in it.
One solution is to use the pattern to subset the target nodes before looking for the shortest paths:
target_nodes = [n for n in G if "IASW" in str(n) or "IUPE" in str(n)]
With a list of target nodes, now it's possible to iterate over them and find the shortest path of interest (as you describe).
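Put together with the data from the question, this could look roughly like the sketch below. The names patterns and paths are made up for illustration, and the nx.has_path check is an extra precaution (not part of the question) so that unreachable targets are skipped instead of raising an exception.

```python
import networkx as nx

edges = [
    ("C0111", "N6186"),
    ("N6186", "C0850-IASW-NO01"),
    ("C0111", "C5572"),
    ("C5572", "N6501"),
    ("N6501", "C1182-IUPE-NO01"),
]
G = nx.Graph()
G.add_edges_from(edges)

source = "C0111"
patterns = ["IASW", "IUPE"]  # substrings that mark a node as a target

# Keep every node whose name contains at least one of the patterns.
target_nodes = [n for n in G if any(p in str(n) for p in patterns)]

# Shortest path from the source to each reachable target.
paths = {}
for t in target_nodes:
    if nx.has_path(G, source, t):
        paths[t] = nx.shortest_path(G, source=source, target=t)
```

Here paths maps each matching target node to the list of nodes on its shortest path from the source.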
Description of the problem:
The objective is to extract the component that a certain vertex belongs to in order to calculate its size.
Steps of the code:
Use the igraph method clusters() to get the list of all connected components (c.c) in the graph.
Then, iterate over the c.c list while checking each time if that certain node belongs to it or not.
When it is found, I calculate its size.
The code is as follows:
def sizeofcomponent(clusters, vertex):
    for i in range(len(clusters)):
        if str(vertex) in clusters.subgraphs()[i].vs["name"]:
            return(len(clusters.subgraphs()[i].vs["name"]))
The problem is that this code will be used with extremely large graphs, and this way of doing things slows my code down a lot. Is there a way to improve it?
EDIT 01: Explanation of how the algorithm works
Suppose that the following graph is the main graph:
The Maximal Independent Set (MIS) is calculated and we get the following graph that I call components:
Randomly add a node from the main graph that does not already belong to components (i.e., is not part of the MIS). Example: adding node 10 to components.
Calculate the size of the component it forms.
The process is repeated for all nodes that don't belong to components (the MIS).
In the end, the node that forms the smallest component (smallest size) is the one added permanently to components.
Your solution:
When the following code is executed (i being the vertex):
cls = components.clusters()
c = cls.membership[i]
The value of the variable c would be the following list:
Example: node (2) belongs to the component 1 (of id 1).
Why it wouldn't work for me:
The following line of code wouldn't give me the correct result:
cls = components.clusters()
c = cls.membership[i]
because the ids of the nodes in the list c don't match the names of the nodes. Example: cls.membership[i] would raise an exception (list index out of range) instead of giving the correct result, which is 4.
Also, from your code, the size, in your case, is calculated in the following way:
c = components.membership[i]
s = components.membership.count(c)
You can simply get the component vertex i belongs to by doing
components = G.clusters()
c = components.membership[i]
You can then get the size of component c using
s = components.size(c)
I am working on finding the top 10 nodes according to the clustering coefficient. So far I have tried this:
clus_coef = []
for node in G.nodes():
    clus_coef.append(nx.clustering(G,node))
I am not sure how to get the top 10 nodes along with their coefficients.
You first get the clustering result clus_coef and then sort the dictionary based on its values. Use the Counter.most_common() method; it will sort the items for you. You can then return the top N items:
import networkx as nx
clus_coef = nx.clustering(G)
from collections import Counter
c = Counter(clus_coef)
c.most_common(10)
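As an alternative sketch, the same result can be obtained without Counter by sorting the (node, coefficient) pairs directly. The small example graph and the variable top are made up for illustration; for the question, replace 2 with 10.

```python
import networkx as nx

# Small example graph: a triangle a-b-c with an extra node d attached to a.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])

# nx.clustering(G) returns a dict mapping node -> clustering coefficient.
clus_coef = nx.clustering(G)

# Sort the (node, coefficient) pairs by coefficient, highest first, keep the top 2.
top = sorted(clus_coef.items(), key=lambda item: item[1], reverse=True)[:2]
```

Each entry of top is a (node, coefficient) tuple, so the node names stay attached to their coefficients.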
I was going through the blockmodel function in networkx. It seems to be something very similar to what I want.
I wish to coalesce two nodes in a networkx graph and replace it by node labels corresponding to any of the nodes being joined. The rest of the nodes should remain as is without any change in their names. (The node joining rules are as explained in the blockmodel 1 tutorial)
As far as I understand, blockmodel requires creating explicit partitions of the whole graph before it can be used, which is not so convenient.
There is no control over the names of the block models formed (ie. nodes of the new graph).
How can I achieve this seemingly simpler task of collapsing 2 nodes into one? I want to do it over an undirected graph with weighted edges.
Here's my attempt at creating a coalesce function for a graph coloring program I'm working on. It, however, does not work with weighted edges.
import networkx as nx
# G is a graph created using nx
# this is to establish the number of colors
k = 5
# inputs node1 and node2 are labels of nodes, e.g. 'a' or 'b' etc.
def coalesce(G, node1, node2):
    """Performs Briggs coalescing. Takes in the graph and two nodes.
    Returns 1 if unable to coalesce, 0 otherwise."""
    if G.has_edge(node1, node2):
        print("Cannot coalesce. Node", node1, "and node", node2, "share an edge")
        return 1
    elif G.degree(node1) + G.degree(node2) >= k:
        print("Cannot coalesce. Combined degree of", node1, "and", node2,
              "is", G.degree(node1) + G.degree(node2), "which is too high for k =", k)
        return 1
    else:
        # G.neighbors() returns an iterator in networkx 2+, so iterate over it
        # instead of indexing into it
        newedge = [(node1, nbr) for nbr in G.neighbors(node2)]
        G.add_edges_from(newedge)
        G.remove_node(node2)
        nx.relabel_nodes(G, {node1: node1 + node2}, copy=False)
        return 0
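For the weighted case, a minimal sketch could look like the following. The function name merge_nodes is hypothetical, and two choices here are assumptions rather than anything stated in the question: weights of edges that end up parallel are summed, and a missing weight counts as 1.0. It is meant for a simple undirected nx.Graph only.

```python
import networkx as nx

def merge_nodes(G, keep, drop):
    """Merge node `drop` into node `keep` in place and return G.

    Assumption: when both nodes share a neighbor, the two edge weights are summed.
    """
    for nbr, data in list(G[drop].items()):
        if nbr == keep:
            continue  # drop the edge between the merged nodes (no self-loop)
        w = data.get("weight", 1.0)
        if G.has_edge(keep, nbr):
            G[keep][nbr]["weight"] = G[keep][nbr].get("weight", 1.0) + w
        else:
            G.add_edge(keep, nbr, weight=w)
    G.remove_node(drop)
    return G

G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "b", 1.0), ("a", "c", 2.0), ("b", "c", 3.0), ("b", "d", 4.0)]
)
merge_nodes(G, "a", "b")
```

The merged node keeps the label of keep, and all other nodes keep their names. If a different rule is needed for combining weights (for example the maximum), only the line that adds w has to change.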