I have the following graph:
As you can see, there are two natural clusters. I would like to figure out a way to separate these clusters into two graphs.
The key step, of course, is to compute the right split. I would like to insert two nodes n1 & n2, link them e(n1, n2), and move them around, minimizing the number of edge crossings (of course fixing all nodes/edges exactly where they are).
Can anyone offer any help here? I don't think graphviz has anything that enables me to do it.
I think you are conflating two different tasks here: one is the analysis of a graph, the other is its visualization.
Graphviz, as the name suggests, is a tool for visualizing graphs. Visualization can take many forms. Typically one tries to "make it look good" by placing connected nodes close to each other, thus reducing the visual edge lengths. One can use a spring or gravitational model to calculate optimal positions for all nodes. Other options include circular or shell layouts.
A particular visualization should not be the basis for the analysis of a graph. Graph properties, like average shortest path length or clustering coefficient, are independent of any visualization.
You say you want to "minimize the number of edge crossings". The number of edge crossings is a property of your visualization, not of your graph! It probably changes each time you let graphviz calculate the layout, even if the graph is unchanged. Who says that 2d is the only possible representation of your graph? Add just one dimension, and you won't have any edge crossing.
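To make the point concrete, here is a minimal sketch that counts straight-line edge crossings for the same graph under two different layouts. The helper names (`orientation`, `segments_cross`, `count_crossings`) and the coordinates are made up for illustration; the graph is the complete graph on four nodes.

```python
from itertools import combinations


def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def count_crossings(edges, pos):
    """Count pairs of edges that cross when drawn as straight lines at pos."""
    count = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue  # edges that share an endpoint are not counted
        if segments_cross(pos[u1], pos[v1], pos[u2], pos[v2]):
            count += 1
    return count


# Same graph (complete graph on 4 nodes), two layouts.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}   # nodes on a square
nested = {0: (0, 0), 1: (4, 0), 2: (2, 4), 3: (2, 1)}   # node 3 inside a triangle

print(count_crossings(edges, square))  # the two diagonals cross
print(count_crossings(edges, nested))  # no crossings in this layout
```

The edge list never changes between the two calls; only the coordinates do, and the crossing count changes with them.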
I'd recommend concentrating on graph analysis. I don't know if you're aware of NetworkX. It has dozens of algorithms for analyzing your graph. Maybe the clustering and clique sections are of interest to you.
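As a rough sketch of what that could look like, the snippet below splits a toy graph into communities and builds one subgraph per community. It uses `greedy_modularity_communities`, which is only one of several community-detection routines in NetworkX and is my choice for illustration, not necessarily the best fit for your data. The toy graph (two 5-node cliques joined by a single edge) is an assumption standing in for your real graph.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy stand-in for the real graph: two 5-node cliques joined by one edge.
G = nx.barbell_graph(5, 0)

# Partition the nodes into communities by greedily maximizing modularity.
communities = greedy_modularity_communities(G)

# Build an independent graph for each community.
subgraphs = [G.subgraph(c).copy() for c in communities]

# Sorted node lists, just for readable printing.
clusters = sorted(sorted(c) for c in communities)
for nodes in clusters:
    print(nodes)
```

Note that nothing here depends on node positions: the split is computed from the edges alone.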
Related
I have a medium sized graph with ~400 nodes and ~6000 edges that I am trying to visualize via python. At the moment I am trying to use networkx and this is the output.
There are two issues:
The layout seems to be too dense and I can't make out any of the edges near the center of the graph
There's a set of nodes that are semi-bipartite (they have no edges within themselves), and I would like to place these nodes on a vertical line on the right, and all the other nodes on the left. I can't figure out how to manage this with networkx.
Any help would be appreciated, thanks!
I suggest you experiment with layout engines other than dot. Consider neato, twopi or circo. The gallery section on the official graphviz site has really nice examples (300+ nodes) that you can mimic.
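For the second issue in the question (one set of nodes on a vertical line on the right), one option is to compute the positions yourself. The sketch below is plain Python; `two_column_layout` is a hypothetical helper, and putting the remaining nodes in a single column on the left is a simplifying assumption. The resulting dictionary has the shape that `nx.draw` accepts as its `pos` argument.

```python
def two_column_layout(left_nodes, right_nodes):
    """Return {node: (x, y)} with left_nodes at x=0 and right_nodes at x=1.

    Nodes in each column are spread evenly over y in [0, 1].
    """
    def column(nodes, x):
        nodes = list(nodes)
        if len(nodes) == 1:
            return {nodes[0]: (x, 0.5)}  # a single node sits mid-height
        step = 1.0 / (len(nodes) - 1)
        return {node: (x, i * step) for i, node in enumerate(nodes)}

    pos = column(left_nodes, 0.0)
    pos.update(column(right_nodes, 1.0))
    return pos


# Hypothetical node names: "x" and "y" form the set with no internal edges.
pos = two_column_layout(["a", "b", "c"], ["x", "y"])
for node, xy in pos.items():
    print(node, xy)
```

With a NetworkX graph `G` whose nodes match the keys, `nx.draw(G, pos=pos)` would then draw it with these fixed positions.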
I have the following edge-detected image:
edge
As you can see, there are two clearly defined lines that make up two sides of an object. I can fit one curve to the entire thing, but how can I perform a curve fitting so that two curves are fit exclusively and simultaneously to such edges, so that I end up with two lines that describe the shape? In other images, the ends may not be discontinuous in this manner.
Is there a good way to specify number of curves to fit to data, like n=2 in this instance, and have it perform a fit where each curve takes into account the other to stay away from it? I have been unable to find similar problems or libraries that can perform something like this.
The two fitted curves in this instance would end up looking like this: curves
I am working on community detection algorithms, and I am currently trying to visualize the results of the Louvain algorithm (https://arxiv.org/abs/0803.0476) on a graph of 70K nodes and 8M edges.
I plotted a smaller graph before (20K nodes, 650K edges) with igraph by taking inspiration from How to plot Community-based graph using igraph for python, and it took almost 30 minutes. Plotting 70K nodes and 8M edges takes 8 hours.
To plot the current graph, I moved to sfdp for performance reasons (e.g.,
sfdp foo.dot -Goutputorder="edgesfirst" -Goverlap=false -Tpdf -O). However, I am not able to achieve a good layout that highlights distinct communities by separating them. I tried tuning K at the graph level, and len and/or weight at the edge level (setting high values for intra-community edges, e.g., 1000, and low values for inter-community edges, e.g., 1). sfdp seems to ignore the weights, although, as an extension of fdp, it should not.
Examples on a small graph
igraph + fruchterman_reingold layout
sfdp
Am I missing something? How can I highlight community differences as done in the above link?
I already have a way of clustering my graph, so the process of clustering isn't the issue here. What I want to do is, once we have all the nodes clustered - to draw the clustered graph in Python, something like this:
I looked into networkx, igraph and graph-tool, but they seem to do the clustering and not the drawing. Any ideas or suggestions for which library I should use to draw the already clustered graph while minimizing the number of crossing links?
Take a look at GraphViz
http://www.graphviz.org/Gallery/directed/cluster.html
There's a Python binding for that, but I have to say I always create the text files directly as they're easy enough to write. Don't be fooled by the plain-looking examples, every aspect of your graph is highly customizable and you can make some pretty nifty graph visualizations with it. Not sure about nested clusters though, never tried that out.
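To give an idea of how little is needed when writing the text file yourself, here is a minimal sketch that turns an existing clustering into DOT text. `clustered_dot` and the example data are made up for illustration. The only Graphviz convention relied on is that a subgraph whose name starts with `cluster` is drawn as a box by engines that support clusters, such as dot.

```python
def clustered_dot(clusters, edges):
    """Build DOT text for an undirected graph with one box per cluster.

    clusters: {label: [node, ...]}, edges: [(u, v), ...]
    """
    lines = ["graph G {"]
    for index, (label, nodes) in enumerate(clusters.items()):
        # Subgraph names starting with "cluster" are treated as clusters.
        lines.append(f"  subgraph cluster_{index} {{")
        lines.append(f'    label="{label}";')
        for node in nodes:
            lines.append(f'    "{node}";')
        lines.append("  }")
    for u, v in edges:
        lines.append(f'  "{u}" -- "{v}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical clustering result and edge list.
clusters = {"group A": ["a1", "a2"], "group B": ["b1", "b2"]}
edges = [("a1", "a2"), ("b1", "b2"), ("a2", "b1")]
dot_text = clustered_dot(clusters, edges)
print(dot_text)
```

Saving the output to a file such as `clusters.dot` (name chosen for illustration) and running `dot -Tpdf -O clusters.dot` should render it.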
In other posts, NetworkX was suggested as "my friend", but there doesn't seem to be a ready-to-use function that solves the TSP.
i.e. Creating undirected graphs in Python
I have an undirected graph, the suggested solutions are all related to directed graphs, and I want to know a short tour to visit all nodes using the available edges.
(Also, I could not find a TSP solver for directed graphs in the networkx documentation.)
Has anybody done something like this for an undirected graph, or should I modify the solutions for directed graphs by assigning infinite costs to unconnected nodes?
edit: I am learning: Actually, as the graph is unweighted (or 'all weights' are the same), and not every node is connected to all other nodes, I just need to find a cycle in the graph containing all the nodes. When that cycle does not exist, nodes may be repeated (so, it is not a cycle anymore...). There are no isolated groups (there is a path from each node to another). I think that this is not the salesman problem?!
Thanks for your feedback so far (when milliseconds start to matter, I will install a photofinish :) )
If you already have code for directed graphs, I would just convert your undirected graph. Replace each undirected edge with two directed edges, one in each direction, preserving the edge weight.
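A minimal sketch of that conversion, assuming the edges are stored as `(u, v, weight)` tuples (the representation and the name `to_directed_edges` are assumptions; adapt them to whatever your existing code expects):

```python
def to_directed_edges(undirected_edges):
    """Replace each undirected edge (u, v, w) with (u, v, w) and (v, u, w)."""
    directed = []
    for u, v, weight in undirected_edges:
        directed.append((u, v, weight))  # forward direction
        directed.append((v, u, weight))  # reverse direction, same weight
    return directed


# Unweighted graph from the question, modelled with weight 1 on every edge.
undirected = [("a", "b", 1), ("b", "c", 1)]
directed = to_directed_edges(undirected)
print(directed)
```

If the graph is already a NetworkX `Graph`, its `to_directed()` method performs the same kind of conversion.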