`graph-tool` edge bundling circular layout graph from weighted adjacency block matrix - python

I am trying graph-tool by Tiago Peixoto to build a graph (either directed or undirected) from a given weighted adjacency matrix with a block structure. So far, unsuccessfully. My question partly overlaps with this thread on SO, which, however, remains without a clear solution.
Suppose I have a function that generates my block matrix of weights J, which is in the form:
Each block Jij is some random binary matrix with entries drawn from a given distribution. The scalars s and g respectively denote weights for connections within diagonal blocks (i.e. when i = j) and blocks off the diagonal (i.e. i ≠ j).
I build my graph in graph_tool as follows:
import graph_tool.all as gt
directed = False # True if we want the graph to be directed
J = generate_adj_bmatrix(...,s=0.1,g=0.01,directed=directed) # Some function to generate the weighted adjacency matrix (here the matrix will be symmetric since we want the graph undirected)
# Define graph
G = gt.Graph(directed=directed)
indexes = J.nonzero()
G.add_edge_list(np.transpose(indexes))
# Add weight information
G.ep['weight'] = G.new_ep("double", vals=J[indexes])
I can also add, if I want, some VertexProperty to my G graph to whose block my nodes belong. But how do I include this information in the code whereby I can build the circular graph? The code reads (pasted here from graph-tool docs):
state = gt.minimize_blockmodel_dl(G) # or should I consider instead state = gt.minimize_nested_blockmodel_dl(G)?
gt.draw_hierarchy(state)
t = gt.get_hierarchy_tree(state)[0]
tpos = pos = gt.radial_tree_layout(t, t.vertex(t.num_vertices() - 1), weighted=True)
cts = gt.get_hierarchy_control_points(G, t, tpos)
pos = G.own_property(tpos)
b = state.levels[0].b
shape = b.copy()
shape.a %= 14 # Have not yet figured out what I need it for
gt.graph_draw(G, pos=pos, vertex_fill_color=b, vertex_shape=shape,
edge_control_points=cts,edge_color=[0, 0, 0, 0.3], vertex_anchor=0)
Noteworthy is that the above code currently hangs seemingly too long. The minimize_blockmodel_dl(G) line appears to engage in an endless loop. Ideally, I should not sample my graph for clusters, since this information could already be provided as a property to the vertexes, based on my knowledge of the block structure of J. At the same time, minimize_blockmodel_dl(G) seems necessary in order to access the edge bundling option, doesn't it?

Here is the solution I came up with.
def visualize_network(J,N_sizes):
"""
Visualize a network from weighted block adjacency matrix in a circular layout with FEB.
Input arguments:
-- J : Weighted adjacency matrix (in block-matrix form, but can be any, as far as it is square).
-- N_sizes : {<block1_label>: size; <block2_label>: size,...} such that node indexes of block n follow immediately those of block n-1.
"""
import numpy as np
import matplotlib.colors as mcolors
import graph_tool.all as gt
# Generate the graph
G = gt.Graph(directed=True) # In my case, network edges are oriented
eindexes = J.nonzero()
G.add_edge_list(np.transpose(eindexes))
# Add weight information
weight = G.new_ep("double", vals = J[eindexes])
# Assign color to each vertex based on the block it belongs to
colors = {'B1' : 'k',
'B2' : 'r',
'B3' : 'g',
'B4' : 'b'}
regs = np.asarray(list(N_sizes.keys()))
rindexes = np.cumsum(list(N_sizes.values()))
iidd = regs[np.searchsorted(rindexes,np.arange(np.shape(J)[0]))]
region_id = G.new_vp("string",vals=iidd)
vcolors = [colors[id] for id in iidd]
vertex_color = G.new_vp("string",vals=vcolors)
# Assigns edge colors by out-node.
eid = regs[np.searchsorted(rindexes,np.arange(np.shape(J)[0]))]
ecolors = [mcolors.to_hex(c) for c in regs[np.searchsorted(rindexes,eindexes[0]]]
edge_color = G.new_ep("string",vals=ecolors)
# Construct a graph in a circular layout with FEB
G = gt.GraphView(G, vfilt=gt.label_largest_component(G))
state = gt.minimize_nested_blockmodel_dl(G)
t = gt.get_hierarchy_tree(state)[0]
tpos = gt.radial_tree_layout(t, t.vertex(t.num_vertices() - 1, use_index=False), weighted=True)
cts = gt.get_hierarchy_control_points(G, t, tpos)
pos = G.own_property(tpos)
gt.graph_draw(G,
pos = pos,
vertex_fill_color = vertex_color,
edge_control_points = cts,
edge_color = edge_color,
vertex_anchor = 0)
Additional documentation on the circular layout and this way of building the graph can be found at this graph-tool doc page.

Related

Finding neighbourhoods (cliques) in street data (a graph)

I am looking for a way to automatically define neighbourhoods in cities as polygons on a graph.
My definition of a neighbourhood has two parts:
A block: An area inclosed between a number of streets, where the number of streets (edges) and intersections (nodes) is a minimum of three (a triangle).
A neighbourhood: For any given block, all the blocks directly adjacent to that block and the block itself.
See this illustration for an example:
E.g. B4 is block defined by 7 nodes and 6 edges connecting them. As most of the examples here, the other blocks are defined by 4 nodes and 4 edges connecting them. Also, the neighbourhood of B1 includes B2 (and vice versa) while B2 also includes B3.
I am using osmnx to get street data from OSM.
Using osmnx and networkx, how can I traverse a graph to find the nodes and edges that define each block?
For each block, how can I find the adjacent blocks?
I am working myself towards a piece of code that takes a graph and a pair of coordinates (latitude, longitude) as input, identifies the relevant block and returns the polygon for that block and the neighbourhood as defined above.
Here is the code used to make the map:
import osmnx as ox
import networkx as nx
import matplotlib.pyplot as plt
G = ox.graph_from_address('Nørrebrogade 20, Copenhagen Municipality',
network_type='all',
distance=500)
and my attempt at finding cliques with different number of nodes and degrees.
def plot_cliques(graph, number_of_nodes, degree):
ug = ox.save_load.get_undirected(graph)
cliques = nx.find_cliques(ug)
cliques_nodes = [clq for clq in cliques if len(clq) >= number_of_nodes]
print("{} cliques with more than {} nodes.".format(len(cliques_nodes), number_of_nodes))
nodes = set(n for clq in cliques_nodes for n in clq)
h = ug.subgraph(nodes)
deg = nx.degree(h)
nodes_degree = [n for n in nodes if deg[n] >= degree]
k = h.subgraph(nodes_degree)
nx.draw(k, node_size=5)
Theory that might be relevant:
Enumerating All Cycles in an Undirected Graph
Finding city blocks using the graph is surprisingly non-trivial.
Basically, this amounts to finding the smallest set of smallest rings (SSSR), which is an NP-complete problem.
A review of this problem (and related problems) can be found here.
On SO, there is one description of an algorithm to solve it here.
As far as I can tell, there is no corresponding implementation in networkx (or in python for that matter).
I tried this approach briefly and then abandoned it -- my brain is not up to scratch for that kind of work today.
That being said, I will award a bounty to anybody that might visit this page at a later date and post a tested implementation of an algorithm that finds the SSSR in python.
I have instead pursued a different approach, leveraging the fact that the graph is guaranteed to be planar.
Briefly, instead of treating this as a graph problem, we treat this as an image segmentation problem.
First, we find all connected regions in the image. We then determine the contour around each region,
transform the contours in image coordinates back to longitudes and latitudes.
Given the following imports and function definitions:
#!/usr/bin/env python
# coding: utf-8
"""
Find house blocks in osmnx graphs.
"""
import numpy as np
import osmnx as ox
import networkx as nx
import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import PathPatch
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from skimage.measure import label, find_contours, points_in_poly
from skimage.color import label2rgb
ox.config(log_console=True, use_cache=True)
def k_core(G, k):
H = nx.Graph(G, as_view=True)
H.remove_edges_from(nx.selfloop_edges(H))
core_nodes = nx.k_core(H, k)
H = H.subgraph(core_nodes)
return G.subgraph(core_nodes)
def plot2img(fig):
# remove margins
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=0, hspace=0)
# convert to image
# https://stackoverflow.com/a/35362787/2912349
# https://stackoverflow.com/a/54334430/2912349
canvas = FigureCanvas(fig)
canvas.draw()
img_as_string, (width, height) = canvas.print_to_buffer()
as_rgba = np.fromstring(img_as_string, dtype='uint8').reshape((height, width, 4))
return as_rgba[:,:,:3]
Load the data. Do cache the imports, if testing this repeatedly -- otherwise your account can get banned.
Speaking from experience here.
G = ox.graph_from_address('Nørrebrogade 20, Copenhagen Municipality',
network_type='all', distance=500)
G_projected = ox.project_graph(G)
ox.save_graphml(G_projected, filename='network.graphml')
# G = ox.load_graphml('network.graphml')
Prune nodes and edges that cannot be part of a cycle.
This step is not strictly necessary but results in nicer contours.
H = k_core(G, 2)
fig1, ax1 = ox.plot_graph(H, node_size=0, edge_color='k', edge_linewidth=1)
Convert plot to image and find connected regions:
img = plot2img(fig1)
label_image = label(img > 128)
image_label_overlay = label2rgb(label_image[:,:,0], image=img[:,:,0])
fig, ax = plt.subplots(1,1)
ax.imshow(image_label_overlay)
For each labelled region, find the contour and convert the contour pixel coordinates back to data coordinates.
# using a large region here as an example;
# however we could also loop over all unique labels, i.e.
# for ii in np.unique(labels.ravel()):
ii = np.argsort(np.bincount(label_image.ravel()))[-5]
mask = (label_image[:,:,0] == ii)
contours = find_contours(mask.astype(np.float), 0.5)
# Select the largest contiguous contour
contour = sorted(contours, key=lambda x: len(x))[-1]
# display the image and plot the contour;
# this allows us to transform the contour coordinates back to the original data cordinates
fig2, ax2 = plt.subplots()
ax2.imshow(mask, interpolation='nearest', cmap='gray')
ax2.autoscale(enable=False)
ax2.step(contour.T[1], contour.T[0], linewidth=2, c='r')
plt.close(fig2)
# first column indexes rows in images, second column indexes columns;
# therefor we need to swap contour array to get xy values
contour = np.fliplr(contour)
pixel_to_data = ax2.transData + ax2.transAxes.inverted() + ax1.transAxes + ax1.transData.inverted()
transformed_contour = pixel_to_data.transform(contour)
transformed_contour_path = Path(transformed_contour, closed=True)
patch = PathPatch(transformed_contour_path, facecolor='red')
ax1.add_patch(patch)
Determine all points in the original graph that fall inside (or on) the contour.
x = G.nodes.data('x')
y = G.nodes.data('y')
xy = np.array([(x[node], y[node]) for node in G.nodes])
eps = (xy.max(axis=0) - xy.min(axis=0)).mean() / 100
is_inside = transformed_contour_path.contains_points(xy, radius=-eps)
nodes_inside_block = [node for node, flag in zip(G.nodes, is_inside) if flag]
node_size = [50 if node in nodes_inside_block else 0 for node in G.nodes]
node_color = ['r' if node in nodes_inside_block else 'k' for node in G.nodes]
fig3, ax3 = ox.plot_graph(G, node_color=node_color, node_size=node_size)
Figuring out if two blocks are neighbors is pretty easy. Just check if they share a node:
if set(nodes_inside_block_1) & set(nodes_inside_block_2): # empty set evaluates to False
print("Blocks are neighbors.")
I'm not completely sure that cycle_basis will give you the neighborhoods you seek, but if it does, it's a simple thing to get the neighborhood graph from it:
import osmnx as ox
import networkx as nx
import matplotlib.pyplot as plt
G = ox.graph_from_address('Nørrebrogade 20, Copenhagen Municipality',
network_type='all',
distance=500)
H = nx.Graph(G) # make a simple undirected graph from G
cycles = nx.cycles.cycle_basis(H) # I think a cycle basis should get all the neighborhoods, except
# we'll need to filter the cycles that are too small.
cycles = [set(cycle) for cycle in cycles if len(cycle) > 2] # Turn the lists into sets for next loop.
# We can create a new graph where the nodes are neighborhoods and two neighborhoods are connected if
# they are adjacent:
I = nx.Graph()
for i, n in enumerate(cycles):
for j, m in enumerate(cycles[i + 1:], start=i + 1):
if not n.isdisjoint(m):
I.add_edge(i, j)
I appreciate this question is a little bit old now but I have an alternative approach that is relatively straighforward - it does require stepping away from networkx for a moment though.
Creating the blocks
Get the projected graph:
import osmnx as ox
import geopandas as gpd
from shapely.ops import polygonize
G = ox.graph_from_address('Nørrebrogade 20, Copenhagen Municipality',
network_type='all',
dist=500)
G_projected = ox.project_graph(G)
Convert the graph to undirected - this removes duplicate edges that would cause the subsequent polygonization to fail:
G_undirected = G_projected.to_undirected()
Extract just the edges into a GeoPandas GeoDataFrame:
G_edges_as_gdf = ox.graph_to_gdfs(G_undirected, nodes=False, edges=True)
Use polygonize from shapely.ops on the edges to create the block faces and then use these as the geometry in a new GeoDataFrame:
block_faces = list(polygonize(G_edges_as_gdf['geometry']))
blocks = gpd.GeoDataFrame(geometry=block_faces)
Plot the result:
ax = G_edges_as_gdf.plot(figsize=(10,10), color='red', zorder=0)
blocks.plot(ax=ax, facecolor='gainsboro', edgecolor='k', linewidth=2, alpha=0.5, zorder=1)
blocks created from line fragments by shapely.ops.polygonize()
Finding neighbors
PySAL does a great job of this with its spatial weights see https://pysal.org/notebooks/lib/libpysal/weights.html for further information. libpysal can be installed from conda-forge.
Here we use Rook weights to identify blocks that share an edge as in the original question. Queen weights would also include those that only share a node (i.e. meet at a street junction)
from libpysal.weights import Rook # Queen, KNN also available
w_rook = Rook.from_dataframe(blocks)
The spatial weights are just for the neighbours so we need to append the original block (index number 18 here just as an example):
block = 18
neighbors = w_rook.neighbors[block]
neighbors.append(block)
neighbors
Then we can plot using neighbors as a filter:
ax = blocks.plot(figsize=(10,10), facecolor='gainsboro', edgecolor='black')
blocks[blocks.index.isin(neighbors)].plot(ax=ax, color='red', alpha=0.5)
neighboring blocks highlighted on top of all blocks
Notes
This doesn't resolve the sliver polygons caused by multiple road centre lines and it can be challenging to resolve ambiguities caused by pedestrianised streets and footpaths - presumably a pedestrianised street is a legitimate division between blocks but a footpath may not be.
From memory I have in the past needed to fragment the LineStrings created from the edges further before polygonizing them - this didn't seem to be necessary with this example.
I have left out the final step of linking the new geometries back to the Node IDs with some kind of spatial join etc.
I don't have a code, but I guess that once i'm on the sidewalk, if I keep turning to the right at each corner, I will cycle through the edges of my block. I don't know the libraries so I'll just talk algo here.
from your point, go north until you reach a street
turn right as much as you can and walk on the street
on the next corner, find all the steets, chose the one that makes the smallest angle with your street counting from the right.
walk on that street.
turn right, etc.
It's actually an algorithm to use to exit a maze : keep your right hand on the wall and walk. It doesn't work in case of loops in the maze, you just loop around. But it gives a solution to your problem.
This is an implementation of Hashemi Emad's idea. It works well as long as the starting position is chosen such that there exists a way to step counterclockwise in a tight circle. For some edges, in particular around the outside of the map, this is not possible. I don't have an idea how to select good starting positions, or how to filter solutions -- but maybe somebody else has one.
Working example (starting with edge (1204573687, 4555480822)):
Example, where this approach does not work (starting with edge (1286684278, 5818325197)):
Code
#!/usr/bin/env python
# coding: utf-8
"""
Find house blocks in osmnx graphs.
"""
import numpy as np
import networkx as nx
import osmnx as ox
import matplotlib.pyplot as plt; plt.ion()
from matplotlib.path import Path
from matplotlib.patches import PathPatch
ox.config(log_console=True, use_cache=True)
def k_core(G, k):
H = nx.Graph(G, as_view=True)
H.remove_edges_from(nx.selfloop_edges(H))
core_nodes = nx.k_core(H, k)
H = H.subgraph(core_nodes)
return G.subgraph(core_nodes)
def get_vector(G, n1, n2):
dx = np.diff([G.nodes.data()[n]['x'] for n in (n1, n2)])
dy = np.diff([G.nodes.data()[n]['y'] for n in (n1, n2)])
return np.array([dx, dy])
def angle_between(v1, v2):
# https://stackoverflow.com/a/31735642/2912349
ang1 = np.arctan2(*v1[::-1])
ang2 = np.arctan2(*v2[::-1])
return (ang1 - ang2) % (2 * np.pi)
def step_counterclockwise(G, edge, path):
start, stop = edge
v1 = get_vector(G, stop, start)
neighbors = set(G.neighbors(stop))
candidates = list(set(neighbors) - set([start]))
if not candidates:
raise Exception("Ran into a dead end!")
else:
angles = np.zeros_like(candidates, dtype=float)
for ii, neighbor in enumerate(candidates):
v2 = get_vector(G, stop, neighbor)
angles[ii] = angle_between(v1, v2)
next_node = candidates[np.argmin(angles)]
if next_node in path:
# next_node might not be the same as the first node in path;
# therefor, we backtrack until we end back at next_node
closed_path = [next_node]
for node in path[::-1]:
closed_path.append(node)
if node == next_node:
break
return closed_path[::-1] # reverse to have counterclockwise path
else:
path.append(next_node)
return step_counterclockwise(G, (stop, next_node), path)
def get_city_block_patch(G, boundary_nodes, *args, **kwargs):
xy = []
for node in boundary_nodes:
x = G.nodes.data()[node]['x']
y = G.nodes.data()[node]['y']
xy.append((x, y))
path = Path(xy, closed=True)
return PathPatch(path, *args, **kwargs)
if __name__ == '__main__':
# --------------------------------------------------------------------------------
# load data
# # DO CACHE RESULTS -- otherwise you can get banned for repeatedly querying the same address
# G = ox.graph_from_address('Nørrebrogade 20, Copenhagen Municipality',
# network_type='all', distance=500)
# G_projected = ox.project_graph(G)
# ox.save_graphml(G_projected, filename='network.graphml')
G = ox.load_graphml('network.graphml')
# --------------------------------------------------------------------------------
# prune nodes and edges that should/can not be part of a cycle;
# this also reduces the chance of running into a dead end when stepping counterclockwise
H = k_core(G, 2)
# --------------------------------------------------------------------------------
# pick an edge and step counterclockwise until you complete a circle
# random edge
total_edges = len(H.edges)
idx = np.random.choice(total_edges)
start, stop, _ = list(H.edges)[idx]
# good edge
# start, stop = 1204573687, 4555480822
# bad edge
# start, stop = 1286684278, 5818325197
steps = step_counterclockwise(H, (start, stop), [start, stop])
# --------------------------------------------------------------------------------
# plot
patch = get_city_block_patch(G, steps, facecolor='red', edgecolor='red', zorder=-1)
node_size = [100 if node in steps else 20 for node in G.nodes]
node_color = ['crimson' if node in steps else 'black' for node in G.nodes]
fig1, ax1 = ox.plot_graph(G, node_size=node_size, node_color=node_color, edge_color='k', edge_linewidth=1)
ax1.add_patch(patch)
fig1.savefig('city_block.png')

how to draw communities with networkx

How can I draw a graph with it's communities using python networkx like this image :
image url
The documentation for networkx.draw_networkx_nodes and networkx.draw_networkx_edges explains how to set the node and edge colors. The patches bounding the communities can be made by finding the positions of the nodes for each community and then drawing a patch (e.g. matplotlib.patches.Circle) that contains all positions (and then some).
The hard bit is the graph layout / setting the node positions.
AFAIK, there is no routine in networkx to achieve the desired graph layout "out of the box". What you want to do is the following:
Position the communities with respect to each other: create a new, weighted graph, where each node corresponds to a community, and the weights correspond to the number of edges between communities. Get a decent layout with your favourite graph layout algorithm (e.g.spring_layout).
Position the nodes within each community: for each community, create a new graph. Find a layout for the subgraph.
Combine node positions in 1) and 3). E.g. scale community positions calculated in 1) by a factor of 10; add those values to the positions of all nodes (as computed in 2)) within that community.
I have been wanting to implement this for a while. I might do it later today or over the weekend.
EDIT:
Voila. Now you just need to draw your favourite patch around (behind) the nodes.
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
def community_layout(g, partition):
"""
Compute the layout for a modular graph.
Arguments:
----------
g -- networkx.Graph or networkx.DiGraph instance
graph to plot
partition -- dict mapping int node -> int community
graph partitions
Returns:
--------
pos -- dict mapping int node -> (float x, float y)
node positions
"""
pos_communities = _position_communities(g, partition, scale=3.)
pos_nodes = _position_nodes(g, partition, scale=1.)
# combine positions
pos = dict()
for node in g.nodes():
pos[node] = pos_communities[node] + pos_nodes[node]
return pos
def _position_communities(g, partition, **kwargs):
# create a weighted graph, in which each node corresponds to a community,
# and each edge weight to the number of edges between communities
between_community_edges = _find_between_community_edges(g, partition)
communities = set(partition.values())
hypergraph = nx.DiGraph()
hypergraph.add_nodes_from(communities)
for (ci, cj), edges in between_community_edges.items():
hypergraph.add_edge(ci, cj, weight=len(edges))
# find layout for communities
pos_communities = nx.spring_layout(hypergraph, **kwargs)
# set node positions to position of community
pos = dict()
for node, community in partition.items():
pos[node] = pos_communities[community]
return pos
def _find_between_community_edges(g, partition):
edges = dict()
for (ni, nj) in g.edges():
ci = partition[ni]
cj = partition[nj]
if ci != cj:
try:
edges[(ci, cj)] += [(ni, nj)]
except KeyError:
edges[(ci, cj)] = [(ni, nj)]
return edges
def _position_nodes(g, partition, **kwargs):
"""
Positions nodes within communities.
"""
communities = dict()
for node, community in partition.items():
try:
communities[community] += [node]
except KeyError:
communities[community] = [node]
pos = dict()
for ci, nodes in communities.items():
subgraph = g.subgraph(nodes)
pos_subgraph = nx.spring_layout(subgraph, **kwargs)
pos.update(pos_subgraph)
return pos
def test():
# to install networkx 2.0 compatible version of python-louvain use:
# pip install -U git+https://github.com/taynaud/python-louvain.git#networkx2
from community import community_louvain
g = nx.karate_club_graph()
partition = community_louvain.best_partition(g)
pos = community_layout(g, partition)
nx.draw(g, pos, node_color=list(partition.values())); plt.show()
return
Addendum
Although the general idea is sound, my old implementation above has a few issues. Most importantly, the implementation doesn't work very well for unevenly sized communities. Specifically, _position_communities gives each community the same amount of real estate on the canvas. If some of the communities are much larger than others, these communities end up being compressed into the same amount of space as the small communities. Obviously, this does not reflect the structure of the graph very well.
I have written a library for visualizing networks, which is called netgraph. It includes an improved version of the community layout routine outlined above, which also considers the sizes of the communities when arranging them. It is fully compatible with networkx and igraph Graph objects, so it should be easy and fast to make great looking graphs (at least that is the idea).
import matplotlib.pyplot as plt
import networkx as nx
# installation easiest via pip:
# pip install netgraph
from netgraph import Graph
# create a modular graph
partition_sizes = [10, 20, 30, 40]
g = nx.random_partition_graph(partition_sizes, 0.5, 0.1)
# since we created the graph, we know the best partition:
node_to_community = dict()
node = 0
for community_id, size in enumerate(partition_sizes):
for _ in range(size):
node_to_community[node] = community_id
node += 1
# # alternatively, we can infer the best partition using Louvain:
# from community import community_louvain
# node_to_community = community_louvain.best_partition(g)
community_to_color = {
0 : 'tab:blue',
1 : 'tab:orange',
2 : 'tab:green',
3 : 'tab:red',
}
node_color = {node: community_to_color[community_id] for node, community_id in node_to_community.items()}
Graph(g,
node_color=node_color, node_edge_width=0, edge_alpha=0.1,
node_layout='community', node_layout_kwargs=dict(node_to_community=node_to_community),
edge_layout='bundled', edge_layout_kwargs=dict(k=2000),
)
plt.show()

How to reshape a networkx graph in Python?

So I created a really naive (probably inefficient) way of generating hasse diagrams.
Question:
I have 4 dimensions... p q r s .
I want to display it uniformly (tesseract) but I have no idea how to reshape it. How can one reshape a networkx graph in Python?
I've seen some examples of people using spring_layout() and draw_circular() but it doesn't shape in the way I'm looking for because they aren't uniform.
Is there a way to reshape my graph and make it uniform? (i.e. reshape my hasse diagram into a tesseract shape (preferably using nx.draw() )
Here's what mine currently look like:
Here's my code to generate the hasse diagram of N dimensions
#!/usr/bin/python
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
#Create edge from empty set
if i == 0:
for ax in axis_labels:
H.add_edge('O',ax)
else:
#Create all non-overlapping combinations
combinations = [c for c in itertools.combinations(axis_labels,i)]
D_len_node[i] = combinations
#Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
if i > 1:
for node in D_len_node[i]:
for p_node in D_len_node[i-1]:
#if set.intersection(set(p_node),set(node)): Oops
if all(p in node for p in p_node) == True: #should be this!
H.add_edge(''.join(p_node),''.join(node))
#Show Plot
nx.draw(H,with_labels = True,node_shape = 'o')
plt.show()
I want to reshape it like this:
If anyone knows of an easier way to make Hasse Diagrams, please share some wisdom but that's not the main aim of this post.
This is a pragmatic, rather than purely mathematical answer.
I think you have two issues - one with layout, the other with your network.
1. Network
You have too many edges in your network for it to represent the unit tesseract. Caveat I'm not an expert on the maths here - just came to this from the plotting angle (matplotlib tag). Please explain if I'm wrong.
Your desired projection and, for instance, the wolfram mathworld page for a Hasse diagram for n=4 has only 4 edges connected all nodes, whereas you have 6 edges to the 2 and 7 edges to the 3 bit nodes. Your graph fully connects each "level", i.e. 4-D vectors with 0 1 values connect to all vectors with 1 1 value, which then connect to all vectors with 2 1 values and so on. This is most obvious in the projection based on the Wikipedia answer (2nd image below)
2. Projection
I couldn't find a pre-written algorithm or library to automatically project the 4D tesseract onto a 2D plane, but I did find a couple of examples, e.g. Wikipedia. From this, you can work out a co-ordinate set that would suit you and pass that into the nx.draw() call.
Here is an example - I've included two co-ordinate sets, one that looks like the projection you show above, one that matches this one from wikipedia.
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
#Create edge from empty set
if i == 0:
for ax in axis_labels:
H.add_edge('O',ax)
else:
#Create all non-overlapping combinations
combinations = [c for c in itertools.combinations(axis_labels,i)]
D_len_node[i] = combinations
#Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
if i > 1:
for node in D_len_node[i]:
for p_node in D_len_node[i-1]:
if set.intersection(set(p_node),set(node)):
H.add_edge(''.join(p_node),''.join(node))
#This is manual two options to project tesseract onto 2D plane
# - many projections are available!!
wikipedia_projection_coords = [(0.5,0),(0.85,0.25),(0.625,0.25),(0.375,0.25),
(0.15,0.25),(1,0.5),(0.8,0.5),(0.6,0.5),
(0.4,0.5),(0.2,0.5),(0,0.5),(0.85,0.75),
(0.625,0.75),(0.375,0.75),(0.15,0.75),(0.5,1)]
#Build the "two cubes" type example projection co-ordinates
half_coords = [(0,0.15),(0,0.6),(0.3,0.15),(0.15,0),
(0.55,0.6),(0.3,0.6),(0.15,0.4),(0.55,1)]
#make the coords symmetric
example_projection_coords = half_coords + [(1-x,1-y) for (x,y) in half_coords][::-1]
print example_projection_coords
def powerset(s):
ch = itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
return [''.join(t) for t in ch]
pos={}
for i,label in enumerate(powerset(axis_labels)):
if label == '':
label = 'O'
pos[label]= example_projection_coords[i]
#Show Plot
nx.draw(H,pos,with_labels = True,node_shape = 'o')
plt.show()
Note - unless you change what I've mentioned in 1. above, they still have your edge structure, so won't look exactly the same as the examples from the web. Here is what it looks like with your existing network generation code - you can see the extra edges if you compare it to your example (e.g. I don't this pr should be connected to pqs:
'Two cube' projection
Wikimedia example projection
Note
If you want to get into the maths of doing your own projections (and building up pos mathematically), you might look at this research paper.
EDIT:
Curiosity got the better of me and I had to search for a mathematical way to do this. I found this blog - the main result of which being the projection matrix:
This led me to develop this function for projecting each label, taking the label containing 'p' to mean the point has value 1 on the 'p' axis, i.e. we are dealing with the unit tesseract. Thus:
def construct_projection(label):
r1 = r2 = 0.5
theta = math.pi / 6
phi = math.pi / 3
x = int( 'p' in label) + r1 * math.cos(theta) * int('r' in label) - r2 * math.cos(phi) * int('s' in label)
y = int( 'q' in label) + r1 * math.sin(theta) * int('r' in label) + r2 * math.sin(phi) * int('s' in label)
return (x,y)
Gives a nice projection into a regular 2D octagon with all points distinct.
This will run in the above program, just replace
pos[label] = example_projection_coords[i]
with
pos[label] = construct_projection(label)
This gives the result:
play with r1,r2,theta and phi to your heart's content :)

Fix position of subset of nodes in NetworkX spring graph

Using Networkx in Python, I'm trying to visualise how different movie critics are biased towards certain production companies. To show this in a graph, my idea is to fix the position of each production-company-node to an individual location in a circle, and then use the spring_layout algorithm to position the remaining movie-critic-nodes, such that one can easily see how some critics are drawn more towards certain production companies.
My problem is that I can't seem to fix the initial position of the production-company-nodes. Surely, I can fix their position but then it is just random, and I don't want that - I want them in a circle. I can calculate the position of all nodes and afterwards set the position of the production-company-nodes, but this beats the purpose of using a spring_layout algorithm and I end up with something wacky like:
Any ideas on how to do this right?
Currently my code does this:
def get_coordinates_in_circle(n):
return_list = []
for i in range(n):
theta = float(i)/n*2*3.141592654
x = np.cos(theta)
y = np.sin(theta)
return_list.append((x,y))
return return_list
G_pc = nx.Graph()
G_pc.add_edges_from(edges_2212)
fixed_nodes = []
for n in G_pc.nodes():
if n in production_companies:
fixed_nodes.append(n)
pos = nx.spring_layout(G_pc,fixed=fixed_nodes)
circular_positions = get_coordinates_in_circle(len(dps_2211))
i = 0
for p in pos.keys():
if p in production_companies:
pos[p] = circular_positions[i]
i += 1
colors = get_node_colors(G_pc, "gender")
nx.draw_networkx_nodes(G_pc, pos, cmap=plt.get_cmap('jet'), node_color=colors, node_size=50, alpha=0.5)
nx.draw_networkx_edges(G_pc,pos, alpha=0.01)
plt.show()
To create a graph and set a few positions:
import networkx as nx
G=nx.Graph()
G.add_edges_from([(1,2),(2,3),(3,1),(1,4)]) #define G
fixed_positions = {1:(0,0),2:(-1,2)}#dict with two of the positions set
fixed_nodes = fixed_positions.keys()
pos = nx.spring_layout(G,pos=fixed_positions, fixed = fixed_nodes)
nx.draw_networkx(G,pos)
Your problem appears to be that you calculate the positions of all the nodes before you set the positions of the fixed nodes.
Move pos = nx.spring_layout(G_pc,fixed=fixed_nodes) to after you set pos[p] for the fixed nodes, and change it to pos = nx.spring_layout(G_pc,pos=pos,fixed=fixed_nodes)
The dict pos stores the coordinates of each node. You should have a quick look at the documentation. In particular,
pos : dict or None optional (default=None).
Initial positions for nodes as a dictionary with node as keys and values as a list or tuple. If None, then nuse random initial positions.
fixed : list or None optional (default=None).
Nodes to keep fixed at initial position.
list or None optional (default=None)
You're telling it to keep those nodes fixed at their initial position, but you haven't told them what that initial position should be. So I would believe it takes a random guess for that initial position, and holds it fixed. However, when I test this, it looks like I run into an error. It appears that if I tell (my version of) networkx to hold nodes in [1,2] as fixed, but I don't tell it what their positions are, I get an error (at bottom of this answer). So I'm surprised your code is running.
For some other improvements to the code using list comprehensions:
def get_coordinates_in_circle(n):
thetas = [2*np.pi*(float(i)/n) for i in range(n)]
returnlist = [(np.cos(theta),np.sin(theta)) for theta in thetas]
return return_list
G_pc = nx.Graph()
G_pc.add_edges_from(edges_2212)
circular_positions = get_coordinates_in_circle(len(dps_2211))
#it's not clear to me why you don't define circular_positions after
#fixed_nodes with len(fixed_nodes) so that they are guaranteed to
#be evenly spaced.
fixed_nodes = [n for n in G_pc.nodes() if n in production_companies]
pos = {}
for i,p in enumerate(fixed_nodes):
pos[p] = circular_positions[i]
colors = get_node_colors(G_pc, "gender")
pos = nx.spring_layout(G_pc,pos=pos, fixed=fixed_nodes)
nx.draw_networkx_nodes(G_pc, pos, cmap=plt.get_cmap('jet'), node_color=colors, node_size=50, alpha=0.5)
nx.draw_networkx_edges(G_pc,pos, alpha=0.01)
plt.show()
Here's the error I see:
import networkx as nx
G=nx.Graph()
G.add_edge(1,2)
pos = nx.spring_layout(G, fixed=[1,2])
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
<ipython-input-4-e9586af20cc2> in <module>()
----> 1 pos = nx.spring_layout(G, fixed=[1,2])
.../networkx/drawing/layout.pyc in fruchterman_reingold_layout(G, dim, k, pos, fixed, iterations, weight, scale)
253 # We must adjust k by domain size for layouts that are not near 1x1
254 nnodes,_ = A.shape
--> 255 k=dom_size/np.sqrt(nnodes)
256 pos=_fruchterman_reingold(A,dim,k,pos_arr,fixed,iterations)
257 if fixed is None:
UnboundLocalError: local variable 'dom_size' referenced before assignment

Create a weighted graph from an adjacency matrix in graph-tool, python interface

How should I create a graph using graph-tool in python, out of an adjacency matrix?
Assume we have adj matrix as the adjacency matrix.
What I do now is like this:
g = graph_tool.Graph(directed = False)
g.add_vertex(len(adj))
edge_weights = g.new_edge_property('double')
for i in range(adj.shape[0]):
for j in range(adj.shape[1]):
if i > j and adj[i,j] != 0:
e = g.add_edge(i, j)
edge_weights[e] = adj[i,j]
But it doesn't feel right, do we have any better solution for this?
(and I guess a proper tag for this would be graph-tool, but I can't add it, some kind person with enough privileges could make the tag?)
Graph-tool now includes a function to add a list of edges to the graph. You can now do, for instance:
import graph_tool as gt
import numpy as np
g = gt.Graph(directed=False)
adj = np.random.randint(0, 2, (100, 100))
g.add_edge_list(np.transpose(adj.nonzero()))
this is the extension of Tiago's answer for the weighted graph:
adj = numpy.random.randint(0, 10, (100, 100)) # a random directed graph
idx = adj.nonzero()
weights = adj[idx]
g = Graph()
g.add_edge_list(transpose(idx)))
#add weights as an edge propetyMap
ew = g.new_edge_property("double")
ew.a = weights
g.ep['edge_weight'] = ew
This should be a comment to Tiago's answer, but I don't have enough reputation for that.
For the latest version (2.26) of graph_tool I believe there is a missing transpose there. The i,j entry of the adjacency matrix denotes the weight of the edge going from vertex j to vertex i, so it should be
g.add_edge_list(transpose(transpose(adj).nonzero()))
import numpy as np
import graph_tool.all as gt
g = gt.Graph(directed=False)
adj = np.tril(adj)
g.add_edge_list(np.transpose(adj.nonzero()))
Without np.tril the adjacency matrix will contain entries with 2s instead one 1s because every edge is counted twice. Things like gt.num_edges() will be incorrect too.

Categories