Python NetworkX: edges based on indices

I want to connect nodes based on the index of arrays. An example:
import networkx as nx
import numpy as np
G=nx.Graph()
G.add_nodes_from(["N1","N2","N3","N4","N5"])
set1 = {'A1':np.array([1,0,1,1,0])}
set1["A2"] = np.array([1,1,1,0,1])
set1["A3"]= np.array([0,0,0,0,1])
set1["A4"] = np.array([1,0,1,0,1])
I created a graph G with five nodes (N1 ... N5) and a dictionary set1 with four keys (A1 ... A4). The values are numpy arrays of length 5 containing 0 or 1, and every entry corresponds to a node. All nodes marked with 1 should be connected to each other by edges. E.g. A1 = [1,0,1,1,0]: the node N1 should be connected with N3, N1 with N4 and N3 with N4. The same applies to A2, A3 and A4.
Therefore, I tried the following:
for key, value in set1.items():
    position = np.where(value)
    for x in np.nditer(position[0]):
        #G.add_edge(names
#nx.draw(G,with_labels=True)
I am stuck here; it would be great if someone could help me.

An easier representation is probably an adjacency matrix, which describes all of the edges in the graph (weights, which can just be 0/1).
For this, you need to represent your edges differently, e.g.
# adding an all-zeros row for node 5, since the example dict has no A5 entry
adj = np.array([[1,0,1,1,0], [1,1,1,0,1], [0,0,0,0,1], [1,0,1,0,1], [0,0,0,0,0]])
G1 = nx.from_numpy_array(adj)
# some relabelling because the nodes are automatically given integer labels
mapping = {k:"N{}".format(k+1) for k in G1.nodes()}
G1 = nx.relabel_nodes(G1, mapping)
If you have some other reason why the edge data has to remain in a dictionary, and you don't want to produce an adjacency matrix, you could use the following procedure:
for key, value in set1.items():
    # get the source node name, Nx, from the key Ax
    source_node = key.replace("A", "N")
    # and the list of targets
    tgt_nodes = np.where(value)[0]
    for tgt_i in tgt_nodes:
        # construct target - note that numpy arrays are zero-indexed
        # and your node list starts at 1.
        tgt_node = "N{}".format(tgt_i + 1)
        G.add_edge(source_node, tgt_node)
Now let's draw the two with the same layout:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,2,sharex=True, sharey=True)
pos = nx.circular_layout(G)
nx.draw(G, with_labels=True, ax=ax[0], pos=pos)
nx.draw(G1, with_labels=True, ax=ax[1], pos=pos)
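The question itself describes a slightly different rule from the two approaches above: within each array, every pair of nodes marked with 1 should be connected. A minimal sketch of that reading follows, assuming position i in each array corresponds to node N(i+1). The helper name clique_edges is hypothetical, and the arrays are written as plain lists here (numpy arrays iterate the same way).

```python
from itertools import combinations

# Same data as set1 in the question, written as plain lists.
set1 = {
    "A1": [1, 0, 1, 1, 0],
    "A2": [1, 1, 1, 0, 1],
    "A3": [0, 0, 0, 0, 1],
    "A4": [1, 0, 1, 0, 1],
}

def clique_edges(indicator_sets):
    """Return the set of edges joining every pair of nodes flagged 1 in the same array."""
    edges = set()
    for flags in indicator_sets.values():
        # position i in the array corresponds to node N(i+1)
        members = ["N{}".format(i + 1) for i, flag in enumerate(flags) if flag]
        # every unordered pair of flagged nodes becomes one edge
        edges.update(combinations(members, 2))
    return edges

edges = clique_edges(set1)
print(sorted(edges))
```

The result can be handed to the graph from the question with G.add_edges_from(edges). An array with a single 1, such as A3, contributes no edges.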

Related

How to visualize communities from a list in igraph python

I have a community list as the following list_community.
How do I edit the code below to make the community visible?
import igraph as ig
list_community = [['A', 'B', 'C', 'D'],['E','F','G'],['G', 'H','I','J']]
list_nodes = ['A', 'B', 'C', 'D','E','F','G','H','I','J']
tuple_edges = [('A','B'),('A','C'),('A','D'),('B','C'),('B','D'), ('C','D'),('C','E'),
               ('E','F'),('E','G'),('F','G'),('G','H'),
               ('G','I'), ('G','J'),('H','I'),('H','J'),('I','J'),]
# Make a graph
g_test = ig.Graph()
g_test.add_vertices(list_nodes)
g_test.add_edges(tuple_edges)
# Plot
layout = g_test.layout("kk")
g_test.vs["name"] = list_nodes
visual_style = {}
visual_style["vertex_label"] = g_test.vs["name"]
visual_style["layout"] = layout
ig.plot(g_test, **visual_style)
I would like a plot that visualizes the community as shown below.
I can also do this by using a module other than igraph.
Thank you.
In igraph you can use the VertexCover to draw polygons around clusters (as also suggested by Szabolcs in his comment). You have to supply the option mark_groups when plotting the cover, possibly with some additional palette if you want. See some more detail in the documentation here.
In order to construct the VertexCover, you first have to make sure you get integer indices for each node in the graph you created. You can do that using g_test.vs.find.
clusters = [[g_test.vs.find(name=v).index for v in cl] for cl in list_community]
cover = ig.VertexCover(g_test, clusters)
After that, you can simply draw the cover like
ig.plot(cover,
        mark_groups=True,
        palette=ig.RainbowPalette(3))
resulting in the following picture
Here is a script that somewhat achieves what you're looking for. I had to handle the cases of single-node and two-node communities separately, but for more than two nodes this draws a polygon through the nodes.
I had some trouble with matplotlib not accounting for overlapping edges and faces of polygons which meant the choice was between (1) not having the polygon surround the nodes or (2) having an extra outline just inside the edge of the polygon due to matplotlib overlapping the widened edge with the fill of the polygon. I left a comment on how to change the code from option (2) to option (1).
I also blatantly borrowed a convenience function from this post to handle correctly sorting the nodes in the polygon for appropriate filling by matplotlib's plt.fill().
Option 1:
Option 2:
Full code:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
def sort_xy(x, y):
    # sort the points by angle around their centroid so plt.fill() traces the outline in order
    x0 = np.mean(x)
    y0 = np.mean(y)
    r = np.sqrt((x-x0)**2 + (y-y0)**2)
    angles = np.where((y-y0) > 0, np.arccos((x-x0)/r), 2*np.pi-np.arccos((x-x0)/r))
    mask = np.argsort(angles)
    x_sorted = x[mask]
    y_sorted = y[mask]
    return x_sorted, y_sorted
G = nx.karate_club_graph()
pos = nx.spring_layout(G, seed=42)
fig, ax = plt.subplots(figsize=(8, 10))
nx.draw(G, pos=pos, with_labels=True)
communities = nx.community.louvain_communities(G)
alpha = 0.5
edge_padding = 10
# plt.get_cmap is used instead of cm.get_cmap, which newer matplotlib releases no longer provide
colors = plt.get_cmap('viridis', len(communities))
for i, comm in enumerate(communities):
    if len(comm) == 1:
        # single node: draw a circle around it
        cir = plt.Circle((pos[comm.pop()]), edge_padding / 100, alpha=alpha, color=colors(i))
        ax.add_patch(cir)
    elif len(comm) == 2:
        # two nodes: draw a thick line between them
        comm_pos = {k: pos[k] for k in comm}
        coords = [a for a in zip(*comm_pos.values())]
        x, y = coords[0], coords[1]
        plt.plot(x, y, linewidth=edge_padding, linestyle="-", alpha=alpha, color=colors(i))
    else:
        # three or more nodes: fill the polygon through the nodes
        comm_pos = {k: pos[k] for k in comm}
        coords = [a for a in zip(*comm_pos.values())]
        x, y = sort_xy(np.array(coords[0]), np.array(coords[1]))
        plt.fill(x, y, alpha=alpha, facecolor=colors(i),
                 edgecolor=colors(i),  # set to None to remove edge padding
                 linewidth=edge_padding)
plt.show()

How to draw trees left to right

Consider the tree below.
import matplotlib.pyplot as plt
import networkx as nx
import pydot
from networkx.drawing.nx_pydot import graphviz_layout
T = nx.balanced_tree(2, 5)
for line in nx.generate_adjlist(T):
    print(line)
pos = graphviz_layout(T, prog="dot")
nx.draw(T, pos, node_color="y", edge_color='#909090', node_size=200, with_labels=True)
plt.show()
How can I draw this left to right so that the whole image is rotated by 90 degrees with the root on the right?
If you want to have fine-grained control over node positions (which includes rotating the whole graph) you can actually set each node's position explicitly. Here's a way to do that that produces a 'centred' hierarchy, left to right.
import itertools
import matplotlib.pyplot as plt
import networkx as nx
plt.figure(figsize=(12,8))
subset_sizes = [1, 2, 4, 8, 16, 32]
def multilayered_graph(*subset_sizes):
    extents = nx.utils.pairwise(itertools.accumulate((0,) + subset_sizes))
    layers = [range(start, end) for start, end in extents]
    G = nx.Graph()
    for (i, layer) in enumerate(layers):
        G.add_nodes_from(layer, layer=i)
    for layer1, layer2 in nx.utils.pairwise(layers):
        G.add_edges_from(itertools.product(layer1, layer2))
    return G
# Instantiate the graph
G = multilayered_graph(*subset_sizes)
# use the multipartite layout
pos = nx.multipartite_layout(G, subset_key="layer")
nodes = G.nodes
nodes_0 = set([n for n in nodes if G.nodes[n]['layer']==0])
nodes_1 = set([n for n in nodes if G.nodes[n]['layer']==1])
nodes_2 = set([n for n in nodes if G.nodes[n]['layer']==2])
nodes_3 = set([n for n in nodes if G.nodes[n]['layer']==3])
nodes_4 = set([n for n in nodes if G.nodes[n]['layer']==4])
nodes_5 = set([n for n in nodes if G.nodes[n]['layer']==5])
# setup a position list
pos = dict()
base = 128
thisList = list(range(-int(base/2),int(base/2),1))
# then assign nodes to indices
pos.update( (n, (10, thisList[int(base/2)::int(base/2)][i])) for i, n in enumerate(nodes_0) )
pos.update( (n, (40, thisList[int(base/4)::int(base/2)][i])) for i, n in enumerate(nodes_1) )
pos.update( (n, (60, thisList[int(base/8)::int(base/4)][i])) for i, n in enumerate(nodes_2) )
pos.update( (n, (80, thisList[int(base/16)::int(base/8)][i])) for i, n in enumerate(nodes_3) )
pos.update( (n, (100, thisList[int(base/32)::int(base/16)][i])) for i, n in enumerate(nodes_4) )
pos.update( (n, (120, thisList[int(base/64)::int(base/32)][i])) for i, n in enumerate(nodes_5) )
nx.draw(G, pos, node_color='y', edge_color='grey', with_labels=True)
plt.show()
By using a position list, you can easily transform this graph into any number of alignments or rotations.
Notes
Add nodes with a layer key and use multipartite_layout to make the graph layered.
Set up a "position list" based on the number of nodes in your widest layer (to make the layout centre-aligned, use a zero-centred list).
To assign positions in each layer, use basic Python list slice/skip notation to grab the right number of positions, spaced the appropriate amount apart, starting at the right position for the alignment you want.
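Since a layout is just a dictionary of node positions, a rotation can also be applied to an existing layout after the fact. The sketch below assumes a top-down layout (root at the largest y, as a typical top-to-bottom tree layout produces) and turns it so the root ends up on the right. The helper name rotate_root_right is hypothetical.

```python
def rotate_root_right(pos):
    """Turn a top-down layout into one with the root on the right.

    pos maps node -> (x, y), with the root at the largest y.
    """
    # (x, y) -> (y, -x): a 90 degree clockwise rotation, so the old
    # vertical axis becomes the horizontal one
    return {node: (y, -x) for node, (x, y) in pos.items()}

# tiny top-down tree: node 0 is the root, nodes 1 and 2 are its children
top_down = {0: (2.0, 2.0), 1: (1.0, 1.0), 2: (3.0, 1.0)}
left_right = rotate_root_right(top_down)
print(left_right)
```

The rotated dictionary can be passed to nx.draw in place of the original pos.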
You can do this with the rankdir attribute from graphviz, which can be set on a networkx graph by:
T.graph["graph"] = dict(rankdir="RL")
networkx issue #3547 gives some more info about setting graph attributes.

Plot size of each node according to dictionary

So far I have the following code:
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
source = ['0','0','0','0','0','0','0']
destination = ['1','2','3','4','5','6','7']
FB_network_graph = pd.DataFrame({ 'from':source, 'to':destination})
G=nx.from_pandas_edgelist(FB_network_graph, 'from', 'to')
plt.figure(figsize = (100,100))
nx.draw(G, with_labels=True)
I want to plot a graph in which node '0' has a size of 7 and nodes '1' to '7' each have a size of 1.
It looks like you want to adjust the node_size according to the degree. For that you can define a dictionary from the result of G.degree and set the size according to the corresponding node degree by looking up the dictionary:
scale = 300
d = dict(G.degree)
nx.draw(G, node_color='lightblue',
        with_labels=True,
        nodelist=list(d),
        node_size=[d[k]*scale for k in d])
Alternatively, you could define a custom dictionary and use it to set the node sizes in node_size. For this specific case, something like:
d = {str(k):1 for k in range(1,8)}
d['0'] = 7
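To make the connection between such a dictionary and the drawing call explicit, here is a small sketch that turns the dictionary into a size list in the same order as the node list. The helper name sizes_for and its default argument are hypothetical, and the scale of 300 is simply the value used above.

```python
def sizes_for(nodelist, size_by_node, scale=300, default=1):
    """Build a node_size list in the same order as nodelist."""
    # nodes missing from the dictionary fall back to the default size
    return [size_by_node.get(node, default) * scale for node in nodelist]

d = {str(k): 1 for k in range(1, 8)}
d["0"] = 7
nodelist = ["0"] + [str(k) for k in range(1, 8)]
node_size = sizes_for(nodelist, d)
print(node_size)
```

Both lists would then go to the drawing call together, e.g. nx.draw(G, nodelist=nodelist, node_size=node_size, with_labels=True), so that each size lines up with the right node.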

Speed up Python cKDTree

I currently have a function that connects each blue dot with its (at most) 3 nearest neighbors within a pixel range of 55. The vertices_xy_list is an extremely large list of points (nested list) of about 5000-10000 pairs.
Example of vertices_xy_list:
[[3673.3333333333335, 2483.3333333333335],
[3718.6666666666665, 2489.0],
[3797.6666666666665, 2463.0],
[3750.3333333333335, 2456.6666666666665],...]
I have written this calculate_draw_vertice_lines() function, which uses a cKDTree inside a while loop to find all points within 55 pixels and then connects each of them with a green line.
This becomes much slower as the list gets longer, since the tree is rebuilt on every iteration. Is there any way to speed up this function significantly, such as vectorizing operations?
def calculate_draw_vertice_lines():
    global vertices_xy_list
    global cell_wall_lengths
    global list_of_lines_references
    index = 0
    while True:
        if (len(vertices_xy_list) == 1):
            break
        point_tree = spatial.cKDTree(vertices_xy_list)
        index_of_closest_points = point_tree.query_ball_point(vertices_xy_list[index], 55)
        index_of_closest_points.remove(index)
        for stuff in index_of_closest_points:
            list_of_lines_references.append(plt.plot([vertices_xy_list[index][0],vertices_xy_list[stuff][0]] , [vertices_xy_list[index][1],vertices_xy_list[stuff][1]], color = 'green'))
            wall_length = math.sqrt( (vertices_xy_list[index][0] - vertices_xy_list[stuff][0])**2 + (vertices_xy_list[index][1] - vertices_xy_list[stuff][1])**2 )
            cell_wall_lengths.append(wall_length)
        del vertices_xy_list[index]
    fig.canvas.draw()
If I understand the logic of selecting the green lines correctly, there is no need to create a KDTree at each iteration. For each pair (p1, p2) of blue points, the line should be drawn if and only if the following hold:
p1 is one of 3 closest neighbors of p2.
p2 is one of 3 closest neighbors of p1.
dist(p1, p2) < 55.
You can create the KDTree once and create a list of green lines efficiently. Here is part of the implementation that returns a list of pairs of indices for points between which the green lines need to be drawn. The runtime is about 0.5 seconds on my machine for 10,000 points.
import numpy as np
from scipy import spatial
data = np.random.randint(0, 1000, size=(10_000, 2))
def get_green_lines(data):
    tree = spatial.cKDTree(data)
    # each key in g points to indices of 3 nearest blue points
    g = {i: set(tree.query(data[i,:], 4)[-1][1:]) for i in range(data.shape[0])}
    green_lines = list()
    for node, candidates in g.items():
        for node2 in candidates:
            if node2 < node:
                # avoid double-counting
                continue
            if node in g[node2] and spatial.distance.euclidean(data[node,:], data[node2,:]) < 55:
                green_lines.append((node, node2))
    return green_lines
You can proceed to plot green lines as follows:
import matplotlib.pyplot as plt
from matplotlib import collections as mc
green_lines = get_green_lines(data)
fig, ax = plt.subplots()
ax.scatter(data[:, 0], data[:, 1], s=1)
# one line segment per pair of connected points
lines = [[data[i], data[j]] for i, j in green_lines]
line_collection = mc.LineCollection(lines, color='green')
ax.add_collection(line_collection)
plt.show()
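For checking a tree-based result on small inputs, the three conditions listed earlier can also be written as a brute-force reference. This is only a sketch for a handful of points (it compares every pair, so it is far too slow for 10,000 points). The function name brute_force_green_lines is hypothetical, and ties between equally distant neighbours may be resolved differently than by the tree.

```python
import math

def brute_force_green_lines(points, k=3, max_dist=55):
    """Return index pairs (i, j), i < j, that are mutual k-nearest neighbours and closer than max_dist."""
    n = len(points)
    nearest = {}
    for i in range(n):
        # all other indices, ordered by distance to point i
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        nearest[i] = set(others[:k])
    return [(i, j)
            for i in range(n)
            for j in sorted(nearest[i])
            if i < j and i in nearest[j]
            and math.dist(points[i], points[j]) < max_dist]

# four points in a small square plus one far-away point
points = [(0, 0), (10, 0), (0, 10), (10, 10), (500, 500)]
pairs = brute_force_green_lines(points)
print(pairs)
```

The far-away point gets no line: the square's corners are among its nearest neighbours, but it is not among theirs, and it is also further away than 55.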
Example output:

Python/NetworkX: calculate edge weights on the fly

I have an unweighted graph created with networkx for which I would like to calculate the weight of edges between nodes based on the count/frequency of an edge occurrence. An edge in my graph can occur more than once but the frequency of an edge appearance is not known in advance. The purpose is to visualize the edges based on the weight (e.g. count/frequency) of moves between connected nodes. Essentially, I'd like to create a network traffic map of movement between connected nodes, and visualize based on color or edge width. E.g., edge from node 0 to 1 has 10 movements between them, and node 1 to 2 has 5, so edge 0-1 would be visualized using a different edge color/size.
How can I calculate the weight of edges between two nodes, on the fly (after adding them to the graph with g.add_edges_from()), and then reapply to my graph for visualization? Below is a sample of my graph, data, and code I've used to create the graph initially and a solution I attempted that failed.
Graph
Sample Data
Cluster centroids(nodes)
cluster_label,latitude,longitude
0,39.18193382,-77.51885109
1,39.18,-77.27
2,39.17917928,-76.6688633
3,39.1782,-77.2617
4,39.1765,-77.1927
5,39.1762375,-76.8675441
6,39.17468,-76.8204499
7,39.17457332,-77.2807235
8,39.17406072,-77.274685
9,39.1731621,-77.2716502
10,39.17,-77.27
Trajectories(edges)
user_id,trajectory
11011.0,"[[340, 269], [269, 340]]"
80973.0,"[[398, 279]]"
608473.0,"[[69, 28]]"
2139671.0,"[[382, 27], [27, 285]]"
3945641.0,"[[120, 422], [422, 217], [217, 340], [340, 340]]"
5820642.0,"[[458, 442]]"
6060732.0,"[[291, 431]]"
6912362.0,"[[68, 27]]"
7362602.0,"[[112, 269]]"
8488782.0,"[[133, 340], [340, 340]]"
Code
import csv
import networkx as nx
import pandas as pd
import community
import matplotlib.pyplot as plt
import time
import mplleaflet
g = nx.MultiGraph()
df = pd.read_csv('cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude,df.latitude))
dict_pos = dict(zip(df.cluster_label,df.pos))
#print dict_pos
for row in csv.reader(open('edges.csv', 'r')):
    if '[' in row[1]: #
        g.add_edges_from(eval(row[1]))
# Plotting with mplleaflet
fig, ax = plt.subplots()
nx.draw_networkx_nodes(g,pos=dict_pos,node_size=50,node_color='b')
nx.draw_networkx_edges(g,pos=dict_pos,linewidths=0.01,edge_color='k', alpha=.05)
nx.draw_networkx_labels(g,dict_pos)
mplleaflet.show(fig=ax.figure)
I have tried using g.add_weighted_edges_from() and adding weight=1 as an attribute, but have not had any luck. I also tried using this which also did not work:
for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]
Since this went unanswered, a 2nd question on this topic was opened (here: Python/NetworkX: Add Weights to Edges by Frequency of Edge Occurance) which received responses. To add weights to edges based on count of edge occurrence:
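import ast
from collections import Counter
g = nx.MultiDiGraph()
df = pd.read_csv(r'G:\cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude,df.latitude))
dict_pos = dict(zip(df.cluster_label,df.pos))
#print dict_pos
for row in csv.reader(open(r'G:\edges.csv', 'r')):
    if '[' in row[1]: # skip rows without an edge list, e.g. the header
        g.add_edges_from(ast.literal_eval(row[1]))
# c counts how often each (u, v) pair occurs; its definition is assumed here,
# since the scaling code further down uses it as c[u, v]
c = Counter(g.edges())
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]
for u,v,d in g.edges(data=True):
    print(u, v, d)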
To scale color and edge width based on the above count:
import matplotlib.colors as colors
import matplotlib.cm as cmx
minLineWidth = 0.25
# c maps each (u, v) pair to its occurrence count
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]*minLineWidth
edges,weights = zip(*nx.get_edge_attributes(g,'weight').items())
values = range(len(g.edges()))
jet = cm = plt.get_cmap('YlOrRd')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)
colorList = []
for i in range(len(g.edges())):
    colorVal = scalarMap.to_rgba(values[i])
    colorList.append(colorVal)
Then pass width=[d['weight'] for u,v, d in g.edges(data=True)], edge_color=colorList as arguments to nx.draw_networkx_edges().
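The counting step can also be done before anything is added to a graph. The sketch below takes a few of the trajectory strings from the sample data, counts how often each directed (u, v) move occurs, and builds (u, v, weight) triples. The variable names are illustrative only, and treating (340, 269) and (269, 340) as different edges is an assumption that matches a directed graph.

```python
import ast
from collections import Counter

# three trajectory strings taken from the sample data above
rows = [
    "[[340, 269], [269, 340]]",
    "[[120, 422], [422, 217], [217, 340], [340, 340]]",
    "[[133, 340], [340, 340]]",
]

counts = Counter()
for text in rows:
    # literal_eval parses the list literal without executing arbitrary code
    for u, v in ast.literal_eval(text):
        counts[(u, v)] += 1

# one (u, v, weight) triple per distinct move
weighted_edges = [(u, v, n) for (u, v), n in counts.items()]
print(weighted_edges)
```

These triples can be loaded into a plain nx.DiGraph with g.add_weighted_edges_from(weighted_edges), which leaves one edge per pair with its count stored in the weight attribute, rather than several parallel edges.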
