Partition graph into subgraphs based on node's attribute NetworkX - python

I'm using NetworkX to compute some measures of a graph, such as diameter and clustering coefficient. It's straightforward to do this for the graph as a whole. What I'm interested in is computing these measures among nodes that share an attribute (say, color). If I could partition the graph into subgraphs, where the nodes in each subgraph have the same color, then I could go ahead and measure the diameter of each subgraph. So my question is: is there a way to partition a graph into subgraphs that contain nodes of the same color?
I would really appreciate any insight.

Use Graph.subgraph(nodes)
NetworkX 2.x+:
Demo
import networkx as nx
G = nx.Graph()
G.add_nodes_from([1, 2, 3], color="red")
G.add_nodes_from([4, 5, 6])
G.nodes # NodeView((1, 2, 3, 4, 5, 6))
# create generator
nodes = (
node
for node, data
in G.nodes(data=True)
if data.get("color") == "red"
)
subgraph = G.subgraph(nodes)
subgraph.nodes # NodeView((1, 2, 3))
Older NetworkX versions (1.x):
Iterate over the nodes (Graph.nodes_iter()) and filter them based on your criteria. Pass the result to Graph.subgraph() and it'll return a copy of those nodes and their internal edges.
For example:
G = nx.Graph()
# ... build or do whatever to the graph
nodes = (n for n, d in G.nodes_iter(data=True) if d.get('color') == 'red')
subgraph = G.subgraph(nodes)
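To partition the whole graph rather than extract a single color, the same idea can be applied once per attribute value. The following is a minimal sketch for NetworkX 2.x or later; the helper name partition_by_attribute and the sample path graph are illustrative assumptions, not part of the answer above.

```python
from collections import defaultdict
import networkx as nx

def partition_by_attribute(G, attr, missing=None):
    """Map each attribute value to the subgraph induced by the nodes carrying it.

    Hypothetical helper: nodes without the attribute are grouped under `missing`.
    """
    groups = defaultdict(list)
    for node, data in G.nodes(data=True):
        groups[data.get(attr, missing)].append(node)
    return {value: G.subgraph(nodes) for value, nodes in groups.items()}

# Sample graph: a path 0-1-2-3-4-5, first half red, second half blue.
G = nx.path_graph(6)
for n in G.nodes:
    G.nodes[n]["color"] = "red" if n < 3 else "blue"

parts = partition_by_attribute(G, "color")
for color, sub in parts.items():
    # nx.diameter raises an error on a disconnected graph, so check first.
    if nx.is_connected(sub):
        print(color, nx.diameter(sub), nx.average_clustering(sub))
```

Note that a same-color subgraph is not necessarily connected, in which case the diameter is undefined and has to be handled separately (for example, per connected component).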

Related

Create undirected graph in NetworkX in python from pandas dataframe

I am new to the NetworkX package in Python. I want to solve the following problem.
Let's say this is my data set:
import pandas as pd
d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)
df
label and size in the df are quite self-explanatory. The dist column measures the distance from the biggest label (label 1) to the rest of the labels. Hence dist is 0 in the case of label 1.
I want to produce something similar to the picture below:
The biggest label by size (label 1) is in the central position. The edges represent the distance from label 1 to every other label, and the node sizes are proportional to the size of each label. Is it possible?
Thank you very much in advance. Please let me know if the question is unclear.
import matplotlib.pyplot as plt
import networkx as nx
G = nx.Graph()
for _, row in df.iterrows():
    G.add_node(row['label'], pos=(row['dist'], 0), size=row['size'])
biggest_node = 1
for node in G.nodes:
    if node != biggest_node:
        G.add_edge(biggest_node, node)
nx.draw(G,
pos={node: attrs['pos'] for node, attrs in G.nodes.items()},
node_size=[node['size'] * 100 for node in G.nodes.values()],
with_labels=True
)
plt.show()
Which plots
Notes:
You will notice that the edges 1-3 and 1-2 are thicker, because they overlap with sections of the edges 1-5 and 1-4 respectively. You can address that by drawing only one edge from the center to the furthest node in each direction; since every node is on the same line, it will look the same.
coords = [(attrs['pos'][0], node) for node, attrs in G.nodes.items()]
nx.draw(G,
# same arguments as before and also add
edgelist=[(biggest_node, min(coords)[1]), (biggest_node, max(coords)[1])]
)
The 100 factor in the list for the node_size argument is just a scaling factor. You can change that to whatever you want.

How to combine two edges and nodes into one when they have a common starting node in NetworkX?

I am quite new to NetworkX and I am asking for help from the Stack Overflow community.
I am trying to combine nodes and edges that have a common starting node, as shown in the figure below. The arrow shows the expected result.
nodes_to_combine = [n for n in graph.nodes if len(list(graph.neighbors(n))) == 2]
for node in nodes_to_combine:
    graph.add_edge(*graph.neighbors(node))
nx.draw(graph, with_labels=True)
Can anyone help me to figure out this?
One way is to implement the merge manually. Here is an example without attribute merging (which would need its own logic):
def merge(G, n1, n2):
    # Get all predecessors and successors of two nodes
    pre = set(G.predecessors(n1)) | set(G.predecessors(n2))
    suc = set(G.successors(n1)) | set(G.successors(n2))
    # Create the new node with combined name
    name = str(n1) + '/' + str(n2)
    # Add predecessors and successors edges
    # We have DiGraph so there should be one edge per nodes pair
    G.add_edges_from([(p, name) for p in pre])
    G.add_edges_from([(name, s) for s in suc])
    # Remove old nodes
    G.remove_nodes_from([n1, n2])
Here is how it works:
import networkx as nx
G = nx.DiGraph()
G.add_edges_from([
('0','20'),
('10','20'),
('10','30'),
('20','40'),
('30','50'),
])
nx.draw(
G,
pos=nx.nx_agraph.graphviz_layout(G, prog='dot'),
node_color='#FF0000',
with_labels=True
)
merge(G, '20', '30')
nx.draw(
G,
pos=nx.nx_agraph.graphviz_layout(G, prog='dot'),
node_color='#FF0000',
with_labels=True
)
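As an alternative to a hand-written merge, NetworkX does provide nx.contracted_nodes, which folds one node into another and keeps the first node's name. The sketch below applies it to the same example graph and then renames the surviving node with nx.relabel_nodes to get the combined label; whether this matches the result wanted in the question depends on how attributes and self-loops should be treated, so take it as one possible approach.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ('0', '20'),
    ('10', '20'),
    ('10', '30'),
    ('20', '40'),
    ('30', '50'),
])

# Fold node '30' into node '20'; the merged node keeps the name '20'.
# self_loops=False drops edges that would connect the merged node to itself.
H = nx.contracted_nodes(G, '20', '30', self_loops=False)

# Rename the surviving node so the label shows both original nodes.
H = nx.relabel_nodes(H, {'20': '20/30'})

print(sorted(H.edges()))
# [('0', '20/30'), ('10', '20/30'), ('20/30', '40'), ('20/30', '50')]
```

Both calls return new graphs by default, so the original G is left unchanged.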

Number of ways of arrangement [duplicate]

I am using networkx to find the maximum cardinality matching of a bipartite graph.
The matched edges are not unique for the particular graph.
Is there a way for me to find all the maximum matchings?
For the following example, all edges below can be the maximum matching:
{1: 2, 2: 1} or {1: 3, 3: 1} or {1: 4, 4: 1}
import networkx as nx
import matplotlib.pyplot as plt
G = nx.MultiDiGraph()
edges = [(1,3), (1,4), (1,2)]
G.add_edges_from(edges)
nx.is_bipartite(G)
True
nx.draw(G, with_labels=True)
plt.show()
Unfortunately,
nx.bipartite.maximum_matching(G)
only returns
{1: 2, 2: 1}
Is there a way I can get the other combinations as well?
The paper "Algorithms for Enumerating All Perfect, Maximum and Maximal Matchings in Bipartite Graphs" by Takeaki Uno has an algorithm for this. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.8179&rep=rep1&type=pdf
Theorem 2 says
"Maximum matchings in a bipartite graph can be enumerated in O(mn^{1/2} + nN_m) time and O(m) space, where N_m is the number of maximum matchings in G."
I've read Uno's work and tried to come up with an implementation. Below is my very lengthy code with a working example. In this particular case there are 4 "feasible" vertices (in Uno's terminology), so by switching each with an already covered vertex you get 2^4 = 16 different possible maximum matchings altogether.
I have to admit that I'm very new to graph theory and I wasn't following Uno's procedure exactly; there are minor differences, and I mostly didn't attempt any optimizations. I did struggle to understand the paper, as I think the explanations are not quite perfect and the figures may contain errors. So please use it with care, and if you can help optimize it, that would be great!
import networkx as nx
from networkx import bipartite
def plotGraph(graph):
    import matplotlib.pyplot as plt
    fig=plt.figure()
    ax=fig.add_subplot(111)
    pos=[(ii[1],ii[0]) for ii in graph.nodes()]
    pos_dict=dict(zip(graph.nodes(),pos))
    nx.draw(graph,pos=pos_dict,ax=ax,with_labels=True)
    plt.show(block=False)
    return
def formDirected(g,match):
    '''Form directed graph D from G and matching M.
    <g>: undirected bipartite graph. Nodes are separated by their
         'bipartite' attribute.
    <match>: list of edges forming a matching of <g>.
    Return <d>: directed graph, with edges in <match> pointing from set-0
                (bipartite attribute==0) to set-1 (bipartite attribute==1),
                and the other edges in <g> but not in <match> pointing
                from set-1 to set-0.
    '''
    d=nx.DiGraph()
    for ee in g.edges():
        if ee in match or (ee[1],ee[0]) in match:
            # g.nodes replaces g.node, which newer NetworkX releases removed
            if g.nodes[ee[0]]['bipartite']==0:
                d.add_edge(ee[0],ee[1])
            else:
                d.add_edge(ee[1],ee[0])
        else:
            if g.nodes[ee[0]]['bipartite']==0:
                d.add_edge(ee[1],ee[0])
            else:
                d.add_edge(ee[0],ee[1])
    return d
def enumMaximumMatching(g):
    '''Find all maximum matchings in an undirected bipartite graph.
    <g>: undirected bipartite graph. Nodes are separated by their
         'bipartite' attribute.
    Return <all_matches>: list, each is a list of edges forming a maximum
                          matching of <g>.
    '''
    all_matches=[]
    #----------------Find one matching M----------------
    match=bipartite.hopcroft_karp_matching(g)
    #---------------Re-orient match arcs---------------
    match2=[]
    for kk,vv in match.items():
        if g.nodes[kk]['bipartite']==0:
            match2.append((kk,vv))
    match=match2
    all_matches.append(match)
    #-----------------Enter recursion-----------------
    all_matches=enumMaximumMatchingIter(g,match,all_matches,None)
    return all_matches
def enumMaximumMatchingIter(g,match,all_matches,add_e=None):
    '''Recursively search maximum matchings.
    <g>: undirected bipartite graph. Nodes are separated by their
         'bipartite' attribute.
    <match>: list of edges forming one maximum matching of <g>.
    <all_matches>: list, each is a list of edges forming a maximum
                   matching of <g>. Newly found matchings will be appended
                   into this list.
    <add_e>: tuple, the edge used to form subproblems. If not None,
             will be added to each newly found matchings.
    Return <all_matches>: updated list of all maximum matchings.
    '''
    #---------------Form directed graph D---------------
    d=formDirected(g,match)
    #-----------------Find cycles in D-----------------
    cycles=list(nx.simple_cycles(d))
    if len(cycles)==0:
        #---------If no cycle, find a feasible path---------
        all_uncovered=set(g.nodes).difference(set([ii[0] for ii in match]))
        all_uncovered=all_uncovered.difference(set([ii[1] for ii in match]))
        all_uncovered=list(all_uncovered)
        #--------------If no path, terminate--------------
        if len(all_uncovered)==0:
            return all_matches
        #----------Find a length 2 feasible path----------
        idx=0
        uncovered=all_uncovered[idx]
        while True:
            if uncovered not in nx.isolates(g):
                paths=nx.single_source_shortest_path(d,uncovered,cutoff=2)
                len2paths=[vv for kk,vv in paths.items() if len(vv)==3]
                if len(len2paths)>0:
                    reversed=False
                    break
                #----------------Try reversed path----------------
                paths_rev=nx.single_source_shortest_path(d.reverse(),uncovered,cutoff=2)
                len2paths=[vv for kk,vv in paths_rev.items() if len(vv)==3]
                if len(len2paths)>0:
                    reversed=True
                    break
            idx+=1
            if idx>len(all_uncovered)-1:
                return all_matches
            uncovered=all_uncovered[idx]
        #-------------Create a new matching M'-------------
        len2path=len2paths[0]
        if reversed:
            len2path=len2path[::-1]
        # list() is needed in Python 3, where zip() is a one-shot iterator
        len2path=list(zip(len2path[:-1],len2path[1:]))
        new_match=[]
        for ee in d.edges():
            if ee in len2path:
                if g.nodes[ee[1]]['bipartite']==0:
                    new_match.append((ee[1],ee[0]))
            else:
                if g.nodes[ee[0]]['bipartite']==0:
                    new_match.append(ee)
        if add_e is not None:
            for ii in add_e:
                new_match.append(ii)
        all_matches.append(new_match)
        #---------------------Select e---------------------
        e=set(len2path).difference(set(match))
        e=list(e)[0]
        #-----------------Form subproblems-----------------
        g_plus=g.copy()
        g_minus=g.copy()
        g_plus.remove_node(e[0])
        g_plus.remove_node(e[1])
        g_minus.remove_edge(e[0],e[1])
        add_e_new=[e,]
        if add_e is not None:
            add_e_new.extend(add_e)
        all_matches=enumMaximumMatchingIter(g_minus,match,all_matches,add_e)
        all_matches=enumMaximumMatchingIter(g_plus,new_match,all_matches,add_e_new)
    else:
        #----------------Find a cycle in D----------------
        cycle=cycles[0]
        cycle.append(cycle[0])
        # list() is needed in Python 3, where zip() is a one-shot iterator
        cycle=list(zip(cycle[:-1],cycle[1:]))
        #-------------Create a new matching M'-------------
        new_match=[]
        for ee in d.edges():
            if ee in cycle:
                if g.nodes[ee[1]]['bipartite']==0:
                    new_match.append((ee[1],ee[0]))
            else:
                if g.nodes[ee[0]]['bipartite']==0:
                    new_match.append(ee)
        if add_e is not None:
            for ii in add_e:
                new_match.append(ii)
        all_matches.append(new_match)
        #-----------------Choose an edge E-----------------
        e=set(match).intersection(set(cycle))
        e=list(e)[0]
        #-----------------Form subproblems-----------------
        g_plus=g.copy()
        g_minus=g.copy()
        g_plus.remove_node(e[0])
        g_plus.remove_node(e[1])
        g_minus.remove_edge(e[0],e[1])
        add_e_new=[e,]
        if add_e is not None:
            add_e_new.extend(add_e)
        all_matches=enumMaximumMatchingIter(g_plus,match,all_matches,add_e_new)
        all_matches=enumMaximumMatchingIter(g_minus,new_match,all_matches,add_e)
    return all_matches
if __name__=='__main__':
    g=nx.Graph()
    edges=[
        [(1,0), (0,0)],
        [(1,1), (0,0)],
        [(1,2), (0,2)],
        [(1,3), (0,2)],
        [(1,4), (0,3)],
        [(1,4), (0,5)],
        [(1,5), (0,2)],
        [(1,5), (0,4)],
        [(1,6), (0,1)],
        [(1,6), (0,4)],
        [(1,6), (0,6)]
    ]
    for ii in edges:
        g.add_node(ii[0],bipartite=0)
        g.add_node(ii[1],bipartite=1)
    g.add_edges_from(edges)
    plotGraph(g)
    all_matches=enumMaximumMatching(g)
    for mm in all_matches:
        g_match=nx.Graph()
        for ii in mm:
            g_match.add_edge(ii[0],ii[1])
        plotGraph(g_match)
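For small graphs, a brute-force enumeration can serve as a cross-check on any implementation of Uno's algorithm. The sketch below is not Uno's method: it simply tries every edge subset of maximum-matching size, so it takes exponential time. The function name brute_force_maximum_matchings is a hypothetical helper, and it assumes NetworkX 2.x or later, where nx.max_weight_matching returns a set of edges.

```python
from itertools import combinations
import networkx as nx

def brute_force_maximum_matchings(g):
    """Return every maximum matching of g as a frozenset of edges.

    Exponential time; intended only for validating results on tiny graphs.
    """
    # Cardinality of a maximum matching (works for general graphs too).
    size = len(nx.max_weight_matching(g, maxcardinality=True))
    result = []
    for candidate in combinations(g.edges(), size):
        endpoints = [n for edge in candidate for n in edge]
        # A set of edges is a matching iff no endpoint is used twice.
        if len(endpoints) == len(set(endpoints)):
            result.append(frozenset(candidate))
    return result

# The star graph from the question: node 1 connected to 2, 3 and 4.
star = nx.Graph([(1, 2), (1, 3), (1, 4)])
matchings = brute_force_maximum_matchings(star)
print(len(matchings))  # 3
```

Each of the three edges is a maximum matching on its own, which agrees with the three alternatives listed in the question.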

Coloring of nodes based on depth from root node

I have a graph on python using networkx
o = net.DiGraph()
hfollowers = defaultdict(lambda: 0)
for (twitter_user, followed_by, followers) in twitter_network:
    o.add_edge(twitter_user, followed_by, followers=int(followers))
    hfollowers[twitter_user] = int(followers)
I have a root defined - which is the name of the twitter user
SEED = 'BarackObama'
I have initialized a subgraph from SEED
g = net.DiGraph(net.ego_graph(o, SEED, radius=4))
Now, I want to assign color to nodes based on its depth from SEED and plot it. How do I do this ?
In the code below I create a graph. Then I get the distance of each node from SEED. Then I invert that mapping so that, for each distance, I have a list of the nodes at that distance. Then for each distance I plot the nodes with a given color. Note that an unreachable node does not appear in the distance dictionary, so it gets no distance color. If such nodes exist, you need to decide separately what to do with them.
Also because I'm using just the standard colors (except white), I can only do a few distances. If you need more, then you'll have to use some other way to make your list of colors (or perhaps make a function that returns a color based on distance). It will take an RGB or HEX definition for a color.
import networkx as nx
import matplotlib.pyplot as plt
G=nx.erdos_renyi_graph(10,0.4)
G.add_node(11) # here's a new node, it's not connected
SEED=1
distanceDict = nx.shortest_path_length(G, SEED) #for each node know how far it is
inverse_dict = {} #for each distance this will say which nodes are there.
for k,v in distanceDict.items():
    inverse_dict[v] = inverse_dict.get(v,[])
    inverse_dict[v].append(k)
# inverse_dict is now, for example:
# {0: [1], 1: [0, 5, 6], 2: [2, 3, 4, 8, 9], 3: [7]}
colors = ['r', 'b', 'g', 'k', 'c', 'm']#create a list of colors
pos = nx.spring_layout(G) #set positions so that each plot below uses same position
for distance in inverse_dict.keys():
    if distance<len(colors): #plot these nodes with the right color
        nx.draw_networkx(G, pos = pos, nodelist = inverse_dict[distance], node_color = colors[distance])
max_dist = len(colors)-1 #assumption: the largest distance that has its own color
tooFar = []
for node in G.nodes():
    if node not in distanceDict or distanceDict[node]>max_dist:
        tooFar.append(node)
nx.draw_networkx(G,pos=pos, nodelist=tooFar, node_color='w')
plt.show()
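If the fixed list of colors runs out, a function that returns a color based on distance, as suggested above, could look like the following sketch. The name color_for_distance, the red-to-blue blend, and the small path graph are illustrative assumptions; any mapping that yields RGB tuples would work as node_color.

```python
import networkx as nx

def color_for_distance(distance, max_distance):
    """Blend linearly from red (distance 0) to blue (max_distance).

    Hypothetical helper; returns an RGB tuple with components in [0, 1].
    """
    t = distance / max_distance if max_distance else 0.0
    return (1.0 - t, 0.0, t)

G = nx.path_graph(5)   # 0-1-2-3-4
G.add_node(99)         # unreachable from the seed
SEED = 0

# Only nodes reachable from SEED appear in this dictionary.
depth = nx.single_source_shortest_path_length(G, SEED)
deepest = max(depth.values())
reachable = list(depth)
node_colors = [color_for_distance(depth[n], deepest) for n in reachable]

def draw_by_depth():
    """Draw the reachable nodes with their depth colors (needs matplotlib)."""
    import matplotlib.pyplot as plt
    pos = nx.spring_layout(G)
    nx.draw_networkx(G, pos=pos, nodelist=reachable, node_color=node_colors)
    plt.show()
```

Calling draw_by_depth() opens the plot. Node 99 has no entry in depth, so it is left out of nodelist and needs its own drawing call if it should appear.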

Is there a way to run pagerank algorithm on NetworkX's MultiGraph?

I'm working on a graph with multiple edges between the same nodes (the edges have different values). To model this graph I need to use a MultiGraph instead of a normal Graph. Unfortunately, it's not possible to run the PageRank algorithm on it.
Any workarounds known ?
NetworkXNotImplemented: not implemented for multigraph type
You could make a graph without parallel edges and then run pagerank.
Here is an example of summing edge weights of parallel edges to make a simple graph:
import networkx as nx
G = nx.MultiGraph()
G.add_edge(1,2,weight=7)
G.add_edge(1,2,weight=10)
G.add_edge(2,3,weight=9)
# make new graph with sum of weights on each edge
H = nx.Graph()
for u,v,d in G.edges(data=True):
    w = d['weight']
    if H.has_edge(u,v):
        H[u][v]['weight'] += w
    else:
        H.add_edge(u,v,weight=w)
print(H.edges(data=True))
#[(1, 2, {'weight': 17}), (2, 3, {'weight': 9})]
print(nx.pagerank(H))
#{1: 0.32037465332634, 2: 0.4864858243244209, 3: 0.1931395223492388}
You can still compose a DiGraph by combining the edges while adding their weights.
# combining edges using defaultdict
# input-- combined list of all edges
# output-- list of edges with summed weights for duplicate edges
from collections import defaultdict
def combine_edges(combined_edge_list):
    ddict = defaultdict(list)
    for edge in combined_edge_list:
        n1,n2,w = edge
        ddict[(n1,n2)].append(w)
    for k in ddict.keys():
        ddict[k] = sum(ddict[k])
    edges = list(zip( ddict.keys(), ddict.values() ) )
    return [(n1,n2,w) for (n1,n2),w in edges]
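The same summing idea can be applied directly to a MultiDiGraph to get a DiGraph that pagerank accepts. This is a minimal sketch; the helper name collapse_multidigraph and the default weight of 1.0 for edges without a weight attribute are assumptions made for illustration.

```python
from collections import defaultdict
import networkx as nx

def collapse_multidigraph(multi, weight="weight", default=1.0):
    """Return a DiGraph whose edge weights are the sums over parallel edges.

    Hypothetical helper; edges lacking the weight attribute count as `default`.
    """
    totals = defaultdict(float)
    for u, v, w in multi.edges(data=weight, default=default):
        totals[(u, v)] += w
    simple = nx.DiGraph()
    simple.add_nodes_from(multi.nodes(data=True))  # keep isolated nodes too
    simple.add_weighted_edges_from(
        ((u, v, w) for (u, v), w in totals.items()), weight=weight
    )
    return simple

M = nx.MultiDiGraph()
M.add_edge(1, 2, weight=7)
M.add_edge(1, 2, weight=10)
M.add_edge(2, 3, weight=9)

H = collapse_multidigraph(M)
print(sorted(H.edges(data="weight")))  # [(1, 2, 17.0), (2, 3, 9.0)]
```

The resulting H can then be passed to nx.pagerank(H). Summing is only one way to combine parallel edges; taking the maximum or the mean may suit other data better.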
