Python/NetworkX: calculate edge weights on the fly - python

I have an unweighted graph created with networkx for which I would like to calculate the weight of edges between nodes based on the count/frequency of an edge occurrence. An edge in my graph can occur more than once but the frequency of an edge appearance is not known in advance. The purpose is to visualize the edges based on the weight (e.g. count/frequency) of moves between connected nodes. Essentially, I'd like to create a network traffic map of movement between connected nodes, and visualize based on color or edge width. E.g., edge from node 0 to 1 has 10 movements between them, and node 1 to 2 has 5, so edge 0-1 would be visualized using a different edge color/size.
How can I calculate the weight of edges between two nodes, on the fly (after adding them to the graph with g.add_edges_from()), and then reapply to my graph for visualization? Below is a sample of my graph, data, and code I've used to create the graph initially and a solution I attempted that failed.
Graph
Sample Data
Cluster centroids(nodes)
cluster_label,latitude,longitude
0,39.18193382,-77.51885109
1,39.18,-77.27
2,39.17917928,-76.6688633
3,39.1782,-77.2617
4,39.1765,-77.1927
5,39.1762375,-76.8675441
6,39.17468,-76.8204499
7,39.17457332,-77.2807235
8,39.17406072,-77.274685
9,39.1731621,-77.2716502
10,39.17,-77.27
Trajectories(edges)
user_id,trajectory
11011.0,"[[340, 269], [269, 340]]"
80973.0,"[[398, 279]]"
608473.0,"[[69, 28]]"
2139671.0,"[[382, 27], [27, 285]]"
3945641.0,"[[120, 422], [422, 217], [217, 340], [340, 340]]"
5820642.0,"[[458, 442]]"
6060732.0,"[[291, 431]]"
6912362.0,"[[68, 27]]"
7362602.0,"[[112, 269]]"
8488782.0,"[[133, 340], [340, 340]]"
Code
import csv
import networkx as nx
import pandas as pd
import community
import matplotlib.pyplot as plt
import time
import mplleaflet
g = nx.MultiGraph()
df = pd.read_csv('cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude,df.latitude))
dict_pos = dict(zip(df.cluster_label,df.pos))
#print dict_pos
for row in csv.reader(open('edges.csv', 'r')):
    if '[' in row[1]:  # skip rows without a trajectory list (e.g. the header)
        g.add_edges_from(eval(row[1]))
# Plotting with mplleaflet
fig, ax = plt.subplots()
nx.draw_networkx_nodes(g,pos=dict_pos,node_size=50,node_color='b')
nx.draw_networkx_edges(g,pos=dict_pos,linewidths=0.01,edge_color='k', alpha=.05)
nx.draw_networkx_labels(g,dict_pos)
mplleaflet.show(fig=ax.figure)
I have tried using g.add_weighted_edges_from() and adding weight=1 as an attribute, but have not had any luck. I also tried using this which also did not work:
for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]

Since this went unanswered, a 2nd question on this topic was opened (here: Python/NetworkX: Add Weights to Edges by Frequency of Edge Occurance) which received responses. To add weights to edges based on count of edge occurrence:
from collections import Counter
g = nx.MultiDiGraph()
df = pd.read_csv(r'G:\cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude, df.latitude))
dict_pos = dict(zip(df.cluster_label, df.pos))
for row in csv.reader(open(r'G:\edges.csv', 'r')):
    if '[' in row[1]:  # skip rows without a trajectory list (e.g. the header)
        g.add_edges_from(eval(row[1]))
# c maps each directed (u, v) pair to the number of times it occurs
c = Counter(g.edges())
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]
for u, v, d in g.edges(data=True):
    print(u, v, d)
To scale color and edge width based on the above count:
import matplotlib.colors as colors
import matplotlib.cm as cmx
minLineWidth = 0.25
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v] * minLineWidth
edges, weights = zip(*nx.get_edge_attributes(g, 'weight').items())
values = range(len(g.edges()))
jet = plt.get_cmap('YlOrRd')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)
colorList = []
for i in range(len(g.edges())):
    colorVal = scalarMap.to_rgba(values[i])
    colorList.append(colorVal)
Then pass width=[d['weight'] for u, v, d in g.edges(data=True)] and edge_color=colorList as arguments to nx.draw_networkx_edges().
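The counting step can also be sketched without a multigraph. This is a minimal illustration rather than the original answer's code: it assumes the trajectory strings look like the sample rows in the question, parses them with ast.literal_eval instead of eval, and stores each count as a weight attribute on a plain DiGraph. The names parse_trajectory, build_weighted_graph and min_line_width are hypothetical.

```python
import ast
from collections import Counter

import networkx as nx


def parse_trajectory(text):
    """Parse a string such as "[[340, 269], [269, 340]]" into (u, v) tuples."""
    return [tuple(pair) for pair in ast.literal_eval(text)]


def build_weighted_graph(trajectories):
    """Count each directed (u, v) move and store the count as 'weight'."""
    counts = Counter()
    for text in trajectories:
        counts.update(parse_trajectory(text))
    g = nx.DiGraph()
    g.add_weighted_edges_from((u, v, n) for (u, v), n in counts.items())
    return g


# Three trajectory strings copied from the sample data in the question.
sample = [
    "[[340, 269], [269, 340]]",
    "[[120, 422], [422, 217], [217, 340], [340, 340]]",
    "[[133, 340], [340, 340]]",
]
g = build_weighted_graph(sample)

# One width per edge, built in the same order that g.edges() iterates.
min_line_width = 0.25
widths = [d["weight"] * min_line_width for _, _, d in g.edges(data=True)]
```

Because widths is built by iterating g.edges(data=True), it lines up with the default edge order and could be passed as width=widths to nx.draw_networkx_edges().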

Related

How to visualize communities from a list in igraph python

I have a community list as the following list_community.
How do I edit the code below to make the community visible?
import igraph as ig
list_community = [['A', 'B', 'C', 'D'],['E','F','G'],['G', 'H','I','J']]
list_nodes = ['A', 'B', 'C', 'D','E','F','G','H','I','J']
tuple_edges = [('A','B'),('A','C'),('A','D'),('B','C'),('B','D'), ('C','D'),('C','E'),
               ('E','F'),('E','G'),('F','G'),('G','H'),
               ('G','I'), ('G','J'),('H','I'),('H','J'),('I','J'),]
# Make a graph
g_test = ig.Graph()
g_test.add_vertices(list_nodes)
g_test.add_edges(tuple_edges)
# Plot
layout = g_test.layout("kk")
g_test.vs["name"] = list_nodes
visual_style = {}
visual_style["vertex_label"] = g_test.vs["name"]
visual_style["layout"] = layout
ig.plot(g_test, **visual_style)
I would like a plot that visualizes the community as shown below.
I can also do this by using a module other than igraph.
Thank you.
In igraph you can use the VertexCover to draw polygons around clusters (as also suggested by Szabolcs in his comment). You have to supply the option mark_groups when plotting the cover, possibly with some additional palette if you want. See some more detail in the documentation here.
In order to construct the VertexCover, you first have to make sure you get integer indices for each node in the graph you created. You can do that using g_test.vs.find.
clusters = [[g_test.vs.find(name=v).index for v in cl] for cl in list_community]
cover = ig.VertexCover(g_test, clusters)
After that, you can simply draw the cover like
ig.plot(cover,
        mark_groups=True,
        palette=ig.RainbowPalette(3))
resulting in the following picture
Here is a script that somewhat achieves what you're looking for. I had to handle the cases of single-, and two-nodes communities separately, but for greater than two nodes this draws a polygon within the nodes.
I had some trouble with matplotlib not accounting for overlapping edges and faces of polygons which meant the choice was between (1) not having the polygon surround the nodes or (2) having an extra outline just inside the edge of the polygon due to matplotlib overlapping the widened edge with the fill of the polygon. I left a comment on how to change the code from option (2) to option (1).
I also blatantly borrowed a convenience function from this post to handle correctly sorting the nodes in the polygon for appropriate filling by matplotlib's plt.fill().
Full code:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
def sort_xy(x, y):
    # sort the points by angle around their centroid so plt.fill() draws a simple polygon
    x0 = np.mean(x)
    y0 = np.mean(y)
    r = np.sqrt((x-x0)**2 + (y-y0)**2)
    angles = np.where((y-y0) > 0, np.arccos((x-x0)/r), 2*np.pi-np.arccos((x-x0)/r))
    mask = np.argsort(angles)
    x_sorted = x[mask]
    y_sorted = y[mask]
    return x_sorted, y_sorted
G = nx.karate_club_graph()
pos = nx.spring_layout(G, seed=42)
fig, ax = plt.subplots(figsize=(8, 10))
nx.draw(G, pos=pos, with_labels=True)
communities = nx.community.louvain_communities(G)
alpha = 0.5
edge_padding = 10
# plt.get_cmap replaces cm.get_cmap, which newer matplotlib releases no longer provide
colors = plt.get_cmap('viridis', len(communities))
for i, comm in enumerate(communities):
    if len(comm) == 1:
        # single node: draw a circle around it
        cir = plt.Circle(pos[next(iter(comm))], edge_padding / 100, alpha=alpha, color=colors(i))
        ax.add_patch(cir)
    elif len(comm) == 2:
        # two nodes: draw a thick line between them
        comm_pos = {k: pos[k] for k in comm}
        coords = [a for a in zip(*comm_pos.values())]
        x, y = coords[0], coords[1]
        plt.plot(x, y, linewidth=edge_padding, linestyle="-", alpha=alpha, color=colors(i))
    else:
        # three or more nodes: fill the polygon spanned by the nodes
        comm_pos = {k: pos[k] for k in comm}
        coords = [a for a in zip(*comm_pos.values())]
        x, y = sort_xy(np.array(coords[0]), np.array(coords[1]))
        plt.fill(x, y, alpha=alpha, facecolor=colors(i),
                 edgecolor=colors(i),  # set to None to remove edge padding
                 linewidth=edge_padding)
plt.show()

How to draw trees left to right

Consider the tree below.
import matplotlib.pyplot as plt
import networkx as nx
import pydot
from networkx.drawing.nx_pydot import graphviz_layout
T = nx.balanced_tree(2, 5)
for line in nx.generate_adjlist(T):
    print(line)
pos = graphviz_layout(T, prog="dot")
nx.draw(T, pos, node_color="y", edge_color='#909090', node_size=200, with_labels=True)
plt.show()
How can I draw this left to right so that the whole image is rotated by 90 degrees with the root on the right?
If you want to have fine-grained control over node positions (which includes rotating the whole graph) you can actually set each node's position explicitly. Here's a way to do that that produces a 'centred' hierarchy, left to right.
import itertools
import matplotlib.pyplot as plt
import networkx as nx
plt.figure(figsize=(12,8))
subset_sizes = [1, 2, 4, 8, 16, 32]
def multilayered_graph(*subset_sizes):
    extents = nx.utils.pairwise(itertools.accumulate((0,) + subset_sizes))
    layers = [range(start, end) for start, end in extents]
    G = nx.Graph()
    for (i, layer) in enumerate(layers):
        G.add_nodes_from(layer, layer=i)
    for layer1, layer2 in nx.utils.pairwise(layers):
        G.add_edges_from(itertools.product(layer1, layer2))
    return G
# Instantiate the graph
G = multilayered_graph(*subset_sizes)
# use the multipartite layout
pos = nx.multipartite_layout(G, subset_key="layer")
nodes = G.nodes
nodes_0 = set([n for n in nodes if G.nodes[n]['layer']==0])
nodes_1 = set([n for n in nodes if G.nodes[n]['layer']==1])
nodes_2 = set([n for n in nodes if G.nodes[n]['layer']==2])
nodes_3 = set([n for n in nodes if G.nodes[n]['layer']==3])
nodes_4 = set([n for n in nodes if G.nodes[n]['layer']==4])
nodes_5 = set([n for n in nodes if G.nodes[n]['layer']==5])
# setup a position list
pos = dict()
base = 128
thisList = list(range(-int(base/2),int(base/2),1))
# then assign nodes to indices
pos.update( (n, (10, thisList[int(base/2)::int(base/2)][i])) for i, n in enumerate(nodes_0) )
pos.update( (n, (40, thisList[int(base/4)::int(base/2)][i])) for i, n in enumerate(nodes_1) )
pos.update( (n, (60, thisList[int(base/8)::int(base/4)][i])) for i, n in enumerate(nodes_2) )
pos.update( (n, (80, thisList[int(base/16)::int(base/8)][i])) for i, n in enumerate(nodes_3) )
pos.update( (n, (100, thisList[int(base/32)::int(base/16)][i])) for i, n in enumerate(nodes_4) )
pos.update( (n, (120, thisList[int(base/64)::int(base/32)][i])) for i, n in enumerate(nodes_5) )
nx.draw(G, pos, node_color='y', edge_color='grey', with_labels=True)
plt.show()
By using a position list, you can easily transform this graph into any number of alignments or rotations.
Notes
add nodes with a layer key and use multipartite_layout to make the graph layered
setup a "position list" based on the number of nodes in your widest layer (to make the layout centre-aligned, use a zero-centred list)
To assign positions in each layer use basic Python list slice/skip notation to grab the right number of positions, spaced the appropriate amount apart, starting at the right position for the alignment you want
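As a small illustration of transforming a position dictionary, the sketch below reorients a top-down layout so that it reads sideways with the root on the right. It assumes pos maps each node to an (x, y) pair, as NetworkX layout functions return. The helper name root_on_right is hypothetical and the three-node pos is made-up sample data. Strictly, swapping the coordinates is a reflection across the diagonal rather than a rotation, which should make no visible difference for a tree drawing.

```python
def root_on_right(pos):
    """Swap x and y so the topmost node (largest y) becomes the rightmost."""
    return {node: (y, x) for node, (x, y) in pos.items()}


# Made-up top-down layout: node 0 is the root, drawn above its two children.
pos = {0: (0.0, 2.0), 1: (-1.0, 1.0), 2: (1.0, 1.0)}
rotated = root_on_right(pos)
```

The result can be passed to nx.draw in place of the original dictionary, for example nx.draw(T, root_on_right(pos)).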
You can do this with the rankdir attribute from graphviz, which can be set on a networkx graph by:
T.graph["graph"] = dict(rankdir="RL")
networkx issue #3547 gives some more info about setting graph attributes.

Networkx apparently scrambling color list python [duplicate]

I managed to produce the graph correctly, but after some more testing I noticed inconsistent results for the following two snippets:
colors = [h.edge[i][j]['color'] for (i,j) in h.edges_iter()]
widths = [h.edge[i][j]['width'] for (i,j) in h.edges_iter()]
nx.draw_circular(h, edge_color=colors, width=widths)
This approach results in consistent output, while the following produces wrong color/size per the orders of edges:
colors = list(nx.get_edge_attributes(h,'color').values())
widths = list(nx.get_edge_attributes(h,'width').values())
nx.draw_circular(h, edge_color=colors, width=widths)
However, it looks to me the above two lines both rely on the function call to return the attributes per the order of edges. Why the different results?
It looks a bit clumsy to me to access attributes with h[][][]; is it possible to access it by dot convention, e.g. edge.color for edge in h.edges().
Or did I miss anything?
The order of the edges passed to the drawing functions is important. If you don't specify it (using the edgelist keyword), you'll get the default order of G.edges(). It is safest to pass the parameter explicitly, like this:
import networkx as nx
G = nx.Graph()
G.add_edge(1,2,color='r',weight=2)
G.add_edge(2,3,color='b',weight=4)
G.add_edge(3,4,color='g',weight=6)
pos = nx.circular_layout(G)
edges = list(G.edges())
colors = [G[u][v]['color'] for u,v in edges]
weights = [G[u][v]['weight'] for u,v in edges]
nx.draw(G, pos, edgelist=edges, edge_color=colors, width=weights)
This results in an output like this:
Dictionaries are the underlying data structure used for NetworkX graphs, and as of Python 3.7+ they maintain insertion order.
This means that we can safely use nx.get_edge_attributes to retrieve edge attributes since we are guaranteed to have the same edge order in every run of Graph.edges() (which is internally called by get_edge_attributes).
So when plotting, we can directly set attributes such as edge_color and width from the result returned by get_edge_attributes. Here's an example:
G = nx.Graph()
G.add_edge(0,1,color='r',weight=2)
G.add_edge(1,2,color='g',weight=4)
G.add_edge(2,3,color='b',weight=6)
G.add_edge(3,4,color='y',weight=3)
G.add_edge(4,0,color='m',weight=1)
colors = list(nx.get_edge_attributes(G,'color').values())
weights = list(nx.get_edge_attributes(G,'weight').values())
pos = nx.circular_layout(G)
nx.draw(G, pos,
        edge_color=colors,
        width=weights,
        with_labels=True,
        node_color='lightgreen')
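To make the ordering explicit rather than relying on it implicitly, the edge list and the attribute values can be pulled out of the same dictionary in one pass. This is a minimal sketch using a three-edge graph similar to the one above; the variable names are illustrative.

```python
import networkx as nx

G = nx.Graph()
G.add_edge(0, 1, color="r", weight=2)
G.add_edge(1, 2, color="g", weight=4)
G.add_edge(2, 3, color="b", weight=6)

# items() yields ((u, v), value) pairs, so unzipping keeps each edge
# paired with its own attribute value.
edges, colors = zip(*nx.get_edge_attributes(G, "color").items())

# Look up the second attribute through the same edge sequence.
widths = [G[u][v]["weight"] for u, v in edges]
```

Passing edgelist=list(edges), edge_color=list(colors) and width=widths to nx.draw_networkx_edges() then ties every color and width to a specific edge.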
If you want to avoid setting edge colors and alphas/widths manually, you may also find this function helpful:
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
def rgb_to_hex(rgb):
    return '#%02x%02x%02x' % rgb
adjacency_matrix = np.array([[0, 0, 0.5], [1, 0, 1], [1, 0.5, 0]])
n_graphs = 5
fig, axs = plt.subplots(1, n_graphs, figsize=(19,2.5))
for graph in range(n_graphs):
    pos = {0: (1, 0.9), 1: (0.9, 1), 2: (1.1, 1)}
    # build a directed graph from the adjacency matrix
    # (from_numpy_array replaces from_numpy_matrix, which was removed in NetworkX 3.0)
    gr = nx.from_numpy_array(adjacency_matrix, create_using=nx.DiGraph)
    weights = nx.get_edge_attributes(gr, "weight")
    # adding nodes
    all_rows = range(0, adjacency_matrix.shape[0])
    for n in all_rows:
        gr.add_node(n)
    # getting edges
    edges = gr.edges()
    # weight and color of edges
    scaling_factor = 4 # to emphasise differences
    alphas = [weights[edge] * scaling_factor for edge in edges]
    colors = [rgb_to_hex(tuple(np.repeat(int(255 * (1 - weights[edge])), 3)))
              for edge in edges]
    # draw graph
    nx.draw(gr,
            pos,
            ax=axs[graph],
            edgecolors='black',
            node_color='white',
            node_size=2000,
            labels={0: "A", 1: "B", 2: "C"},
            font_weight='bold',
            linewidths=2,
            with_labels=True,
            connectionstyle="arc3,rad=0.15",
            edge_color=colors,
            width=alphas)
plt.tight_layout()

Python NetworkX: edges based on indices

I want to connect nodes based on the index of arrays. An example:
import networkx as nx
import numpy as np
G=nx.Graph()
G.add_nodes_from(["N1","N2","N3","N4","N5"])
set1 = {'A1':np.array([1,0,1,1,0])}
set1["A2"] = np.array([1,1,1,0,1])
set1["A3"]= np.array([0,0,0,0,1])
set1["A4"] = np.array([1,0,1,0,1])
I created a graph G with five nodes (N1 ... N5) and a dictionary set1 with four keys (A1 ... A4). The values for the keys are numpy arrays of length 5 containing the values 0 or 1. Every entry corresponds to a node. All nodes with a 1 should be connected with edges. E.g. A1 = [1,0,1,1,0]: the node N1 should be connected with N3, N1 with N4 and N3 with N4. The same applies to A2, A3 and A4.
Therefore, I tried the following:
for key, value in set1.items():
    position = np.where(value)
    for x in np.nditer(position[0]):
        #G.add_edge(names
#nx.draw(G,with_labels=True)
I am stuck here. It would be great if someone could help me.
An easier representation is probably an adjacency matrix, which describes all of the edges in the graph (weights, which can just be 0/1).
For this, you need to represent your edges differently, e.g.
# adding an all-zeros row for node 5, since the example dict has no A5 entry
adj = np.array([[1,0,1,1,0], [1,1,1,0,1], [0,0,0,0,1], [1,0,1,0,1], [0,0,0,0,0]])
G1 = nx.from_numpy_array(adj)
# some relabelling because the nodes are automatically given integer labels
mapping = {k:"N{}".format(k+1) for k in G1.nodes()}
G1 = nx.relabel_nodes(G1, mapping)
If you have some other reason why the edge data has to remain in a dictionary, and you don't want to produce an adjacency matrix, you could use the following procedure:
for key, value in set1.items():
    # get the source node name, Nx from the key Ax
    source_node = key.replace("A", "N")
    # and the list of targets
    tgt_nodes = np.where(value)[0]
    for tgt_i in tgt_nodes:
        # construct target - note that python arrays are zero-indexed
        # and your node list starts at 1.
        tgt_node = "N{}".format(tgt_i + 1)
        G.add_edge(source_node, tgt_node)
Now let's draw the two with the same layout:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,2,sharex=True, sharey=True)
pos = nx.circular_layout(G)
nx.draw(G, with_labels=True, ax=ax[0], pos=pos)
nx.draw(G1, with_labels=True, ax=ax[1], pos=pos)
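The question's wording (A1 should connect N1 with N3, N1 with N4 and N3 with N4) reads as if each array should link every pair of its flagged nodes, which is a different rule from the one the two snippets above apply. A minimal sketch of that pairwise reading follows, assuming that interpretation is what was intended. G2 and node_names are names introduced here for illustration.

```python
from itertools import combinations

import networkx as nx
import numpy as np

node_names = ["N1", "N2", "N3", "N4", "N5"]
set1 = {
    "A1": np.array([1, 0, 1, 1, 0]),
    "A2": np.array([1, 1, 1, 0, 1]),
    "A3": np.array([0, 0, 0, 0, 1]),
    "A4": np.array([1, 0, 1, 0, 1]),
}

G2 = nx.Graph()
G2.add_nodes_from(node_names)
for flags in set1.values():
    # Names of the nodes whose entry is non-zero in this array.
    members = [node_names[i] for i in np.flatnonzero(flags)]
    # Connect every unordered pair of those nodes.
    G2.add_edges_from(combinations(members, 2))
```

An array with a single flagged node, such as A3, contributes no edges, and pairs repeated across arrays are stored only once because G2 is a simple Graph.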
