I'm trying to get the path on a graph which covers all edges, and traverses them only once.
This means there will only be two "end" points - which will have an odd-number of attached nodes. These end points would either have one connecting edge, or be part of a loop and have 3 connections.
So in the simple case below I need to traverse the nodes in this order 1-2-3-4-5 (or 5-4-3-2-1):
In the more complicated case below the path would be 1-2-3-4-2 (or 1-2-4-3-2):
Below is also a valid graph, with 2 end-points: 1-2-4-3-2-5
I've tried to find the name of an algorithm to solve this, and thought it was the "Chinese Postman Problem", but implementing this based on code at https://github.com/rkistner/chinese-postman/blob/master/postman.py didn't provide the results I expected.
The Eulerian path looks almost what is needed, but the networkx implementation will only work for closed (looped) networks.
I also looked at a Hamiltonian Path - and tried the networkx algorithm - but the graph types were not supported.
Ideally I'd like to use Python and networkx to implement this, and there may be a simple solution that is already part of the library, but I can't seem to find it.
You're looking for Eulerian Path that visits every edge exactly once. You can use Fleury's algorithm to generate the path. Fleury's algorithm has O(E^2) time complexity, if you need more efficient algorithm check Hierholzer's algorithm which is O(E) instead.
There is also an unmerged pull request for the networkx library that implements this. The source is easy to use.
(For networkx 1.11 the .edge has to be replaced with .edge_iter).
This is known as the Eulerian Path of a graph. It has now been added to NetworkX as eulerian_path().
Related
I've been searching for graph matching algorithms written in Python but I haven't been able to find much.
I'm currently trying to match two different graphs that derive from two distinct sets of character sequences. I know that there is an underlying connection between the two graphs, more precisely a one-to-one mapping between the nodes. But the graphs don't have the same labels and as such I need graph matching algorithms that return nodes mappings just by comparing topology and/or attributes. By testing, I hope to maximize correct matches.
I've been using Blondel and Heymans from the graphsim package and intend to also use Tacsim from the same package.
I would like to test other options, probably more standard, like maximum subgraph isomorphism or finding subgraphs with very good matchings between the two graphs. Graph edit distance might also help if it manages to give a matching.
The problem is that I can't find anything implemented, even in Networkx that I'm using. Does anyone know of any Python implementations? Would be a plus if those options used Networkx.
I found this implementation of Graph Edit Distance algorithms which uses NetworkX in Python.
https://github.com/Jacobe2169/GMatch4py
"GMatch4py is a library dedicated to graph matching. Graph structure are stored in NetworkX graph objects. GMatch4py algorithms were implemented with Cython to enhance performance."
I have a large undirected graph with m specific source and n specific terminal nodes, and I want to check whether there is a connection between all sources to all terminals or not. The answer will be a binary scalar, i.e. 1 if all sources are connected to all terminals, and 0 if there exists a source and a terminal which are not connected.
For one-source one-terminal connectivity, I can use Networkx to check the connection (which is based on the DFS algorithm):
has_path(G, source, target)
The most simple way to check the m-source n-terminal connectivity is to use m+n-1independent DFS runs (using the function above). However, this is not probably the most efficient way of doing the task, and would be slow if we want to do this task repetitively (say, for millions of graphs). What is the most efficient algorithm? What is the minimum number of the required DFS runs? I am using Python, and I prefer to use Networkx to perform connectivity check. Thanks!
I believe the best way to do this for an undirected graph is to use networkx's node_connected_component command to find all the nodes in the same component as one of your source nodes. Then check if all of the target and source nodes are also in that component.
The node_connected_component returns a list in 1.11 and a set in 2.0. Probably the best way to do the test is to see if the set of sources and targets when intersected with the set of that component is equal to the set of sources and targets.
Given a graph g and a set of N nodes my_nodes = [n1, n2, n3, ...], how can I check if there's a path that contains all N nodes?
Checking among all_simple_paths for paths that contain all nodes in my_nodes becomes computationally cumbersome as the graph grows
The search above can be limited to paths between my_nodes pairwise couples. This reduces complexity only to a small degree. Plus it requires a lot of python looping, which is quite slow
Is there a faster solution to the problem?
You may try out some greedy algorithm here, starting the path find check from all the nodes to find, and step by step explore your graph. Can't provide some real sample, but pseudo-code should be something like this:
Start n path stubs from all your n nodes to find
For all these path stubs adjust them by all the neighbors which weren't checked before
If you have some intersection between path stubs, then you got a new one, which does contain more of your needed nodes than before
If after merging the stub paths you have the one which covers all needed nodes, you're done
If there are still some additional nodes to add to the path, you continue with second step again
If there are no nodes left in graph, the path doesn't exists
This algorithm has complexity O(E + N), because you're visiting the edges and nodes in non-recursive fashion.
However, in case of directed graph the "merge" will be a bit more complicated, yet still be done, but in this case the worst scenario may take a lot of time.
Update:
As you say that the graph is directed, the above approach wouldn't work well. In this case you may simplify your task like this:
Find the strongly connected components in graph (I suggest you to implement it by yourself, e.g., Kosaraju's algorithm). The complexity is O(E + N). You can use a NetworkX method for this, if you want some out-ofbox solution.
Create the condensation of graph, based on step 1 information, with saving the information about which component can be visited from other. Again, there is a NetworkX method for this.
Now you can easily say, which nodes from your set are in the same component, so a path containing all of them definitely exists.
After that all you need to check is a connectivity between different components for your nodes. For example, you can get the topological sort of condensation and do check in linear time again.
Does anybody know any module in Python that computes the best bipartite matching?
I have tried the following two:
munkres
hungarian
However, in my case, I have to deal with non-complete graph (i.e., there might not be an edge between two nodes), and therefore, there might not be a match if the node has no edge. The above two packages seem not to be able to deal with this.
Any advice?
Set cost to infinity or a large value for an edge that does not exist. You can then tell by the result whether an invalid edge was used.
I am interested in finding a path (not necessarily shortest) in a short amount of time. Dijsktra and AStar in networkx is taking too long.
Why is there no DFS or BFS in networkx?
I plan to write my own DFS and BFS search (I am leaning more towards BFS because my graph is pretty deep). Is there anything that I can use in networkx's lib to speed me up?
The Traversal module has multiple depth-first-search variations. Breadth-first-search is implemented in the connected components functions, also in that module. Either use that, or if you need custom behaviour, re-implement your own using that as the example.
There is now a depth-first search and breadth-first search here
These are modified from Eppstein's code at www.ics.uci.edu/~eppstein/PADS
which is also a good place to look for Python graph algorithms.