Visualization of force-driven large graph: python and graphviz - python

I am working on community detection algorithms, and I am currently trying to visualize the results of Louvain algorithm (https://arxiv.org/abs/0803.0476) on a graph of 70K nodes and 8M edges.
I plotted a smaller graph before (20K nodes, 650K edges) with igraph, following the approach in "How to plot Community-based graph using igraph for python", and it took almost 30 minutes. Plotting 70K nodes and 8M edges takes 8 hours.
To plot the current graph, I moved to sfdp for performance reasons (e.g., sfdp foo.dot -Goutputorder="edgesfirst" -Goverlap=false -Tpdf -O). However, I cannot get a layout that highlights distinct communities by separating them. I tried tuning K at the graph level, and len and/or weight at the edge level (setting high values for intra-community edges, e.g., 1000, and low values for inter-community edges, e.g., 1). sfdp seems to ignore the weights, although, as an extension of fdp, it should not.
Examples on a small graph: igraph with the fruchterman_reingold layout, and sfdp.
Am I missing something? How can I highlight community differences as done in the above link?
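For reference, the edge-level tuning described above can be sketched as a small script that writes the DOT input. This is only an illustration of the setup in the question: the function name `community_dot` and the `membership` mapping are hypothetical, and whether sfdp actually honors the `weight` attribute is exactly what is in doubt here.

```python
def community_dot(edges, membership, intra_weight=1000, inter_weight=1):
    """Build DOT text where intra-community edges get a high weight.

    `membership` maps each node to a community label (an assumed input,
    e.g. the partition returned by a Louvain implementation).
    """
    lines = ["graph G {"]
    for u, v in edges:
        same_community = membership[u] == membership[v]
        weight = intra_weight if same_community else inter_weight
        lines.append(f'  "{u}" -- "{v}" [weight={weight}];')
    lines.append("}")
    return "\n".join(lines)


edges = [(0, 1), (1, 2), (2, 3)]
membership = {0: "a", 1: "a", 2: "b", 3: "b"}
dot_text = community_dot(edges, membership)
print(dot_text)
```

The resulting text can be saved as foo.dot and passed to the sfdp command shown in the question.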

Related

Visualizing a medium size graph in python

I have a medium sized graph with ~400 nodes and ~6000 edges that I am trying to visualize via python. At the moment I am trying to use networkx and this is the output.
There are two issues:
The layout seems to be too dense and I can't make out any of the edges near the center of the graph
There's a set of nodes that are semi-bipartite (they have no edges within themselves), and I would like to place these nodes on a vertical line on the right, and all the other nodes on the left. I can't figure out how to manage this with networkx.
Any help would be appreciated, thanks!
I suggest you experiment with layout engines other than dot. Consider neato, twopi or circo. The gallery section on the official graphviz site has really nice examples (300+ nodes) that you can mimic.
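For the second issue in the question (one set of nodes on a vertical line on the right), one option is to compute the positions by hand. The sketch below is a minimal illustration, not a networkx feature: the helper `two_column_positions` is hypothetical, and for simplicity it also places the remaining nodes on a single column on the left.

```python
def two_column_positions(left_nodes, right_nodes, gap=1.0):
    """Return {node: (x, y)} with left nodes at x=0 and right nodes at x=gap."""

    def column(nodes, x):
        n = len(nodes)
        # Spread nodes evenly between y=0 and y=1; a single node sits at 0.5.
        return {
            node: (x, 0.5 if n == 1 else i / (n - 1))
            for i, node in enumerate(nodes)
        }

    pos = column(list(left_nodes), 0.0)
    pos.update(column(list(right_nodes), gap))
    return pos


pos = two_column_positions(["a", "b", "c"], ["x", "y"])
print(pos)
```

A dictionary of this shape can be passed as the `pos` argument of `networkx.draw`. To keep a force-directed layout for the left group, only the right-hand entries would need to be overridden.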

draw a large graph with many nodes and edges with igraph

I'm trying to visualize a big data set of nodes and edges. I have two files, nodes.txt and edges.txt, and I want to draw a graph from them. It has 403,394 nodes and 3,387,388 edges. Note that I generated them randomly.
I decided to use igraph for Python to draw it with layout and plot. When I draw a simple graph with a few edges it works, but with this huge data set I get a memory error and it does not work. I would like some help drawing a graph from my edge list with igraph, or, if there is a better way to do this, please suggest it.
I use layout with the DrL algorithm and the plot function.

Visualize hierarchical / tree data where each node may have more than one parent in R or python

I'm looking for a way to visualize hierarchical data where there is a many to many relationship between parent and child - this is not a tree, but should be hierarchical like a tree. Is there a good package in R for doing this? I've looked at a few but they're either for visualizing trees or for visualizing graphs, but I'd like to visualize a graph that is also hierarchical.
I think you want to visualise a Directed Acyclic Graph (DAG), i.e. a graph with no cycles in which each node may have several incoming and several outgoing edges. Graph libraries will usually visualise these correctly if you set the right parameters. I would recommend networkx for small/medium-sized graphs, or Gephi for large graphs (Gephi is a GUI program but makes good visualisations). Networkx's Graphviz drawing backend will do a good job of drawing DAGs.
https://networkx.github.io/documentation/latest/reference/algorithms.dag.html
http://gephi.github.io/
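To make the "hierarchical but not a tree" idea concrete, here is a minimal sketch in plain Python that assigns each node of a DAG to a layer (the length of the longest path from any root), which is the kind of ranking a hierarchical drawing relies on. The function name `dag_layers` and the example edges are hypothetical, and this is not how networkx or Graphviz compute their layouts.

```python
from collections import defaultdict, deque


def dag_layers(edges):
    """Map each node to its layer: 0 for roots, else 1 + max parent layer."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    # Roots are nodes without any parent.
    layer = {n: 0 for n in nodes if indegree[n] == 0}
    queue = deque(layer)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            layer[child] = max(layer.get(child, 0), layer[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:  # all parents have been processed
                queue.append(child)

    if visited != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return layer


# "C" and "D" each have two parents, so this is a DAG but not a tree.
layers = dag_layers([("A", "C"), ("B", "C"), ("A", "D"), ("C", "D")])
print(layers)
```

Nodes in the same layer can then be drawn on the same horizontal level, with every edge pointing downward.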

Graphviz: how to insert two new linked nodes and minimize edge crossings?

I have the following graph :
As you can see, there are two natural clusters. I would like to figure out a way to separate these clusters into two graphs.
The key step, of course, is to compute the right split. I would like to insert two nodes n1 & n2, link them e(n1, n2), and move them around, minimizing the number of edge crossings (of course fixing all nodes/edges exactly where they are).
Can anyone offer any help here? I don't think graphviz has anything that enables me to do it.
I think you are mixing up two different tasks here: one is the analysis of a graph, the other is its visualization.
Graphviz, as the name suggests, is a tool for visualization of graphs. Visualization can take many forms, typically one tries to "make it look good" by having those nodes close to each other that are connected, thus reducing the visual edge lengths. One can utilize some spring- or gravitational model to calculate optimal positions for all nodes. Other options include circular- or shell-layouts.
A certain visualization should not be the basis for the analysis of a graph. Graph properties, like average shortest path length or clustering coefficient, are independent of any visualization.
You say you want to "minimize the number of edge crossings". The number of edge crossings is a property of your visualization, not of your graph! It probably changes each time you let graphviz calculate the layout, even if the graph is unchanged. Who says that 2d is the only possible representation of your graph? Add just one dimension, and you won't have any edge crossing.
I'd recommend concentrating on graph analysis. I don't know if you're aware of NetworkX. It has dozens of algorithms for analyzing your graph. Maybe the clustering and clique sections are of interest to you.
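The point that edge crossings belong to the drawing rather than to the graph can be checked with a small sketch. It counts crossings between straight-line edges for the same graph (the complete graph on four nodes) under two hand-picked layouts. The helper names and the coordinates are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the turn p -> q -> r (positive: counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def count_crossings(edges, pos):
    """Count pairs of straight-line edges that properly cross each other."""
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing an endpoint do not count as crossing
        p1, p2, p3, p4 = pos[a], pos[b], pos[c], pos[d]
        # Each segment must have the other's endpoints on opposite sides.
        if (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
                and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0):
            crossings += 1
    return crossings


k4_edges = list(combinations(range(4), 2))
square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
triangle = {0: (0, 0), 1: (4, 0), 2: (2, 4), 3: (2, 1)}

square_crossings = count_crossings(k4_edges, square)
triangle_crossings = count_crossings(k4_edges, triangle)
print(square_crossings, triangle_crossings)
```

With the nodes on the corners of a square the two diagonals cross once. With the fourth node placed inside a triangle formed by the other three, the same graph is drawn without any crossing.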

python tsp travelling salesman undirected graph

In other posts (e.g. "Creating undirected graphs in Python"), Networkx was suggested as "my friend". But there doesn't seem to be a ready-to-use function that solves the TSP.
I have an undirected graph, the suggested solutions are all related to directed graphs, and I want to know a short tour to visit all nodes using the available edges.
(also, the tsp with directed graphs I could not find in the documentation of networkx)
Has anybody done something like this for an undirected graph, or should I modify solutions for directed graphs by using infinite costs for unconnected nodes?
edit: I am learning: Actually, as the graph is unweighted (or 'all weights' are the same), and not every node is connected to all other nodes, I just need to find a cycle in the graph containing all the nodes. When that cycle does not exist, nodes may be repeated (so, it is not a cycle anymore...). There are no isolated groups (there is a path from each node to another). I think that this is not the salesman problem?!
Thanks for your feedback so far (when milliseconds start to matter, I will install a photofinish :) )
If you already have code for directed graphs, I would just convert your undirected graph. Replace each undirected edge with two directed edges, one in each direction, preserving the edge weight.
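A minimal sketch of that conversion, assuming the undirected graph is stored as a list of `(u, v, weight)` tuples (the representation and the function name `to_directed` are assumptions, not taken from the question):

```python
def to_directed(undirected_edges):
    """Replace each undirected edge by two directed edges of equal weight."""
    directed = []
    for u, v, weight in undirected_edges:
        directed.append((u, v, weight))  # forward direction
        directed.append((v, u, weight))  # reverse direction, same weight
    return directed


undirected = [("a", "b", 1), ("b", "c", 1)]
directed = to_directed(undirected)
print(directed)
```

If the graph is already a networkx `Graph`, its `to_directed()` method performs the same kind of conversion.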
