Related
I have a list of edges (E) of a graph with nodes V = [1,2,3,4,5,6]:
E = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
where each tuple (a,b) refers to the start & end node of the edge respectively.
If I know the edges form a closed path in graph G, can I recover the path?
Note that E is not the set of all edges of the graph. Its just a set of edges.
In this example, the path would be 1->2->3->1->5->6->1
A naive approach, I can think of is using a tree where I start with a node, say 1, then I look at all tuples that start with 1, here, (1,2) and (1,5). Then I have two branches, and with nodes as 2 & 5, I continue the process till I end at the starting node at a branch.
How to code this efficiently in python?
The networkx package has a function that can generate the desired circuit for you in linear time...
It is possible, that construction of nx.MultiDiGraph() is slower and not such efficient, as desired in question, or usage of external packages for only one function is rather excessive. If it is so, there is another way.
Plan: firstly we will find some way from start_node to start_node, then we will insert all loops, that were not visited yet.
from itertools import chain
from collections import defaultdict, deque
from typing import Tuple, List, Iterable, Iterator, DefaultDict, Deque
def retrieve_closed_path(arcs: List[Tuple[int, int]], start_node: int = 1) -> Iterator[int]:
if not arcs:
return iter([])
# for each node `u` carries queue of its
# neighbours to be visited from node `u`
d: DefaultDict[int, Deque[int]] = defaultdict(deque)
for u, v in arcs:
# deque pop and append complexity is O(1)
d[u].append(v)
def _dfs(node) -> Iterator[int]:
out: Iterator[int] = iter([])
# guarantee, that all queues
# will be emptied at the end
while d[node]:
# chain returns an iterator and helps to
# avoid unnecessary memory reallocations
out = chain([node], _dfs(d[node].pop()), out)
# if we return in this loop from recursive call, then
# `out` already carries some (node, ...) and we need
# only to insert all other loops which start at `node`
return out
return chain(_dfs(start_node), [start_node])
def path_to_string(path: Iterable[int]) -> str:
return '->'.join(str(x) for x in path)
Examples:
E = [(1, 2), (2, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->2->1
E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->5->6->1->2->3->1
E = [(1, 2), (2, 3), (3, 4), (4, 2), (2, 1)]
p = retrieve_closed_path(E, 1)
print(path_to_string(p))
>> 1->2->3->4->2->1
E = [(5, 1), (1, 5), (5, 2), (2, 5), (5, 1), (1, 4), (4, 5)]
p = retrieve_closed_path(E, 1)
print(path_to_string())
>> 1->4->5->1->5->2->5->1
You're looking for a directed Eulerian circuit in your (sub)graph. An Eulerian circuit is a trail that visits every edge exactly once.
The networkx package has a function that can generate the desired circuit for you in linear time:
import networkx as nx
edges = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
G = nx.MultiDiGraph()
G.add_edges_from(edges)
# Prints [(1, 5), (5, 6), (6, 1), (1, 2), (2, 3), (3, 1)]
# which matches the desired output (as asked in the comments).
print([edge for edge in nx.algorithms.euler.eulerian_circuit(G)])
The documentation cites a 1973 paper, if you're interested in understanding how the algorithm works. You can also take a look at the source code here. Note that we're working with multigraphs here, since you can have multiple edges that have the same source and destination node. There are probably other implementations floating around on the Internet, but they may or may not work for multigraphs.
I am writing my own function that calculates the Laplacian matrix for any directed graph, and am struggling with filling the diagonal entries of the resulting matrix. The following equation is what I use to calculate entries of the Laplacian matrix, where e_ij represents an edge from node i to node j.
I am creating graph objects with NetworkX (https://networkx.org/). I know NetworkX has its own Laplacian function for directed graphs, but I want to be 100% sure I am using a function that carries out the correct computation for my purposes. The code I have developed thus far is shown below, for the following example graph:
# Create a simple example of a directed weighted graph
G = nx.DiGraph()
G.add_nodes_from([1, 2, 3])
G.add_weighted_edges_from([(1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 3, 1), (3, 1, 1), (3, 2, 1)])
# Put node, edge, and weight information into Python lists
node_list = []
for item in G.nodes():
node_list.append(item)
edge_list = []
weight_list = []
for item in G.edges():
weight_list.append(G.get_edge_data(item[0],item[1])['weight'])
item = (item[0]-1,item[1]-1)
edge_list.append(item)
print(edge_list)
> [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
# Fill in the non-diagonal entries of the Laplacian
num_nodes = len(node_list)
num_edges = len(edge_list)
J = np.zeros(shape = (num_nodes,num_nodes))
for x in range(num_edges):
i = edge_list[x][0]
j = edge_list[x][1]
J[i,j] = weight_list[x]
I am struggling to figure out how to fill in the diagonal entries. edge_list is a list of tuples. To perform the computation in the above equation for L(G), I need to loop through the second entries of each tuple, store the first entry into a temporary list, sum over all the elements of that temporary list, and finally store the negative of the sum in the correct diagonal entry of L(G).
Any suggestions would be greatly appreciated, especially if there are steps above that can be done more efficiently or elegantly.
I adjusted networkx.laplacian_matrix function for undirected graphs a little bit
import networkx as nx
import scipy.sparse
G = nx.DiGraph()
G.add_nodes_from([1, 2, 3])
G.add_weighted_edges_from([(1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 3, 1), (3, 1, 1), (3, 2, 1)])
nodelist = list(G)
A = nx.to_scipy_sparse_matrix(G, nodelist=nodelist, weight="weight", format="csr")
n, m = A.shape
diags = A.sum(axis=0) # 1 = outdegree, 0 = indegree
D = scipy.sparse.spdiags(diags.flatten(), [0], m, n, format="csr")
print((A - D).todense())
# [[-2 1 1]
# [ 1 -2 1]
# [ 1 1 -2]]
I will deviate a little from your method, since I prefer to work with Numpy if possible :P.
In the following snippet, I generate test data for a network of n=10 nodes; that is, I generate an array of tuples V to populate with random nodes, and also a (n,n) array A with the values of the edges between nodes. Hopefully the code is somewhat self-explanatory and is correct (let me know otherwise):
from random import sample
import numpy as np
# Number and list of nodes
n = 10
nodes = list(np.arange(n)) # random.sample needs list
# Test array of linked nodes
# V[i] is a tuple with all nodes the i-node connects to.
V = np.zeros(n, dtype = tuple)
for i in range(n):
nv = np.random.randint(5) # Random number of edges from node i
# To avoid self-loops (do not know if it is your case - comment out if necessary)
itself = True
while itself:
cnodes = sample(nodes, nv) # samples nv elements from the nodes list w/o repetition
itself = i in cnodes
V[i] = cnodes
# Test matrix of weighted edges (from i-node to j-node)
A = np.zeros((n,n))
for i in range(n):
for j in range(n):
if j in V[i]:
A[i,j] = np.random.random()*5
# Laplacian of network
J = np.copy(A) # This already sets the non-diagonal elements
for i in range(n):
J[i,i] = - np.sum(A[:,i]) - A[i,i]
Thank you all for your suggestions! I agree that numpy is the way to go. As a rudimentary solution that I will optimize later, this is what I came up with:
def Laplacian_all(edge_list,weight_list,num_nodes,num_edges):
J = np.zeros(shape = (num_nodes,num_nodes))
for x in range(num_edges):
i = edge_list[x][0]
j = edge_list[x][1]
J[i,j] = weight_list[x]
for i in range(num_nodes):
temp = []
for x in range(num_edges):
if i == edge_list[x][1]:
temp.append(weight_list[x])
temp_sum = -1*sum(temp)
J[i,i] = temp_sum
return J
I have yet to test this on different graphs, but this was what I was hoping to figure out for my immediate purposes.
I would like to define my own isomorphism of two graphs. I want to check if two graphs are isomorphic given that each edge has some attribute --- basically the order of placing each edge. I wonder if one can use the method:
networkx.is_isomorphic(G1,G2, edge_match=some_callable)
somehow by defining function some_callable().
For example, the following graphs are isomorphic, because you can relabel the nodes to obtain one from another.
Namely, relabel [2<->3].
But, the following graphs are not isomorphic.
There is no way to obtain one from another by re-labeling the nodes.
Here you go. This is exactly what the edge_match option is for doing. I'll create 3 graphs the first two are isomorphic (even though the weights have different names --- I've set the comparison function to account for that). The third is not isomorphic.
import networkx as nx
G1 = nx.Graph()
G1.add_weighted_edges_from([(0,1,0), (0,2,1), (0,3,2)], weight = 'aardvark')
G2 = nx.Graph()
G2.add_weighted_edges_from([(0,1,0), (0,2,2), (0,3,1)], weight = 'baboon')
G3 = nx.Graph()
G3.add_weighted_edges_from([(0,1,0), (0,2,2), (0,3,2)], weight = 'baboon')
def comparison(D1, D2):
#for an edge u,v in first graph and x,y in second graph
#this tests if the attribute 'aardvark' of edge u,v is the
#same as the attribute 'baboon' of edge x,y.
return D1['aardvark'] == D2['baboon']
nx.is_isomorphic(G1, G2, edge_match = comparison)
> True
nx.is_isomorphic(G1, G3, edge_match = comparison)
> False
Here answer the problem specifically in the question, with the very same graphs. Note that I'm using the networkx.MultiGraph and consider some 'ordering' in placing those edges.
import networkx as nx
G1,G2,G3,G4=nx.MultiGraph(),nx.MultiGraph(),nx.MultiGraph(),nx.MultiGraph()
G1.add_weighted_edges_from([(0, 1, 0), (0, 2, 1), (0, 3, 2)], weight='ordering')
G2.add_weighted_edges_from([(0, 1, 0), (0, 3, 1), (0, 2, 2)], weight='ordering')
G3.add_weighted_edges_from([(0, 1, 0), (0, 1, 1), (2, 3, 2)], weight='ordering')
G4.add_weighted_edges_from([(0, 1, 0), (2, 3, 1), (0, 1, 2)], weight='ordering')
def comparison(D1,D2):
return D1[0]['ordering'] == D2[0]['ordering']
nx.is_isomorphic(G1,G2, edge_match=comparison)
>True
nx.is_isomorphic(G3,G4, edge_match=comparison)
>False
I have to write a very little Python program that checks whether some group of coordinates are all connected together (by a line, not diagonally). The next 2 pictures show what I mean. In the left picture all colored groups are cohesive, in the right picture not:
I've already made this piece of code, but it doesn't seem to work and I'm quite stuck, any ideas on how to fix this?
def cohesive(container):
co = container.pop()
container.add(co)
return connected(co, container)
def connected(co, container):
done = {co}
todo = set(container)
while len(neighbours(co, container, done)) > 0 and len(todo) > 0:
done = done.union(neighbours(co, container, done))
return len(done) == len(container)
def neighbours(co, container, done):
output = set()
for i in range(-1, 2):
if i != 0:
if (co[0] + i, co[1]) in container and (co[0] + i, co[1]) not in done:
output.add((co[0] + i, co[1]))
if (co[0], co[1] + i) in container and (co[0], co[1] + i) not in done:
output.add((co[0], co[1] + i))
return output
this is some reference material that should return True:
cohesive({(1, 2), (1, 3), (2, 2), (0, 3), (0, 4)})
and this should return False:
cohesive({(1, 2), (1, 4), (2, 2), (0, 3), (0, 4)})
Both tests work, but when I try to test it with different numbers the functions fail.
You can just take an element and attach its neighbors while it is possible.
def dist(A,B):return abs(A[0]-B[0]) + abs(A[1]-B[1])
def grow(K,E):return {M for M in E for N in K if dist(M,N)<=1}
def cohesive(E):
K={min(E)} # an element
L=grow(K,E)
while len(K)<len(L) : K,L=L,grow(L,E)
return len(L)==len(E)
grow(K,E) return the neighborhood of K.
In [1]: cohesive({(1, 2), (1, 3), (2, 2), (0, 3), (0, 4)})
Out[1]: True
In [2]: cohesive({(1, 2), (1, 4), (2, 2), (0, 3), (0, 4)})
Out[2]: False
Usually, to check if something is connected, you need to use disjoint set data structures, the more efficient variations include weighted quick union, weighted quick union with path compression.
Here's an implementation, http://algs4.cs.princeton.edu/15uf/WeightedQuickUnionUF.java.html which you can modify to your needs. Also, the implementation found in the book "The Design and Analysis of Computer Algorithms" by A. Aho, allows you to specify the name of the group that you add 2 connected elements to, so I think that's the modification you're looking for.(It just involves using 1 extra array which keeps track of group numbers).
As a side note, since disjoint sets usually apply to arrays, don't forget that you can represent an N by N matrix as an array of size N*N.
EDIT: just realized that it wasn't clear to me what you were asking at first, and I realized that you also mentioned that diagonal components aren't connected, in that case the algorithm is as follows:
0) Check if all elements refer to the same group.
1) Iterate through the array of pairs that represent coordinates in the matrix in question.
2) For each pair make a set of pairs that satisfies the following formula:
|entry.x - otherEntry.x| + |entry.y - otherEntry.y|=1.
'entry' refers to the element that the outer for loop is referring to.
3) Check if all of the sets overlap. That can be done by "unioning" the sets you're looking at, at the end if you get more than 1 set, then the elements are not cohesive.
The complexity is O(n^2 + n^2 * log(n)).
Example:
(0,4), (1,2), (1,4), (2,2), (2,3)
0) check that they are all in the same group:
all of them belong to group 5.
1) make sets:
set1: (0,4), (1,4)
set2: (1,2), (2,2)
set3: (0,4), (1,4) // here we suppose that sets are sorted, other than that it
should be (1,4), (0,4)
set4: (1,2), (2,2), (2,3)
set5: (2,2), (2,3)
2) check for overlap:
set1 overlaps with set3, so we get:
set1' : (0,4), (1,4)
set2 overlaps with set4 and set 5, so we get:
set2' : (1,2), (2,2), (2,3)
as you can see set1' and set2' don't overlap, hence you get 2 disjoint sets that are in the same group, so the answer is 'false'.
Note that this is inefficient, but I have no idea how to do it more efficiently, but this answers your question.
The logic in your connected function seems wrong. You make a todo variable, but then never change its contents. You always look for neighbours around the same starting point.
Try this code instead:
def connected(co, container):
done = {co}
todo = {co}
while len(todo) > 0:
co = todo.pop()
n = neighbours(co, container, done)
done = done.union(n)
todo = todo.union(n)
return len(done) == len(container)
todo is a set of all the points we are still to check.
done is a set of all the points we have found to be 4-connected to the starting point.
I'd tackle this problem differently... if you're looking for five exactly, that means:
Every coordinate in the line has to be neighbouring another coordinate in the line, because anything less means that coordinate is disconnected.
At least three of the coordinates have to be neighbouring another two or more coordinates in the line, because anything less and the groups will be disconnected.
Hence, you can just get the coordinate's neighbours and check whether both conditions are fulfilled.
Here is a basic solution:
def cells_are_connected(connections):
return all(c > 0 for c in connections)
def groups_are_connected(connections):
return len([1 for c in connections if c > 1]) > 2
def cohesive(coordinates):
connections = []
for x, y in coordinates:
neighbours = [(x-1, y), (x+1, y), (x, y-1), (x, y+1)]
connections.append(len([1 for n in neighbours if n in coordinates]))
return cells_are_connected(connections) and groups_are_connected(connections)
print cohesive([(1, 2), (1, 3), (2, 2), (0, 3), (0, 4)]) # True
print cohesive([(1, 2), (1, 4), (2, 2), (0, 3), (0, 4)]) # False
No need for a general-case solution or union logic. :) Do note that it's specific to the five-in-a-line problem, however.
I would like to make an algorithm to find if an edge belongs to a cycle, in an undirected graph, using networkx in Python.
I am thinking to use cycle_basis and get all the cycles in the graph.
My problem is that cycle_basis returns a list of nodes. How can I convert them to edges?
You can construct the edges from the cycle by connecting adjacent nodes.
In [1]: import networkx as nx
In [2]: G = nx.Graph()
In [3]: G.add_cycle([1,2,3,4])
In [4]: G.add_cycle([10,20,30])
In [5]: basis = nx.cycle_basis(G)
In [6]: basis
Out[6]: [[2, 3, 4, 1], [20, 30, 10]]
In [7]: edges = [zip(nodes,(nodes[1:]+nodes[:1])) for nodes in basis]
In [8]: edges
Out[8]: [[(2, 3), (3, 4), (4, 1), (1, 2)], [(20, 30), (30, 10), (10, 20)]]
Here is my take at it, using just lambda functions (I love lambda functions!):
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
in_path = lambda e, path: (e[0], e[1]) in path or (e[1], e[0]) in path
cycle_to_path = lambda path: list(zip(path+path[:1], path[1:] + path[:1]))
in_a_cycle = lambda e, cycle: in_path(e, cycle_to_path(cycle))
in_any_cycle = lambda e, g: any(in_a_cycle(e, c) for c in nx.cycle_basis(g))
for edge in G.edges():
print(edge, 'in a cycle:', in_any_cycle(edge, G))
in case you don't find a nice solution, here's an ugly one.
with edges() you can get a list of edges that are adjacent to nodes in a cycle. unfortunately, this includes edges adjacent to nodes outside the cycle
you can now filter the list of edges by removing those which connect nodes that are not part of the cycle.
please keep us posted if you find a less wasteful solution.
With the help of Aric, and a little trick to check both directions, I finally did this that looks ok.
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
def edge_in_cycle(edge, graph):
u, v = edge
basis = nx.cycle_basis(graph)
edges = [zip(nodes,(nodes[1:]+nodes[:1])) for nodes in basis]
found = False
for cycle in edges:
if (u, v) in cycle or (v, u) in cycle:
found = True
return found
for edge in G.edges():
print edge, 'in a cycle:', edge_in_cycle(edge, G)
output:
(1, 2) in a cycle: True
(1, 4) in a cycle: True
(1, 10) in a cycle: False
(2, 3) in a cycle: True
(3, 4) in a cycle: True
(10, 20) in a cycle: True
(10, 30) in a cycle: True
(20, 30) in a cycle: True
You can directly obtain the edges in a cycle with the find_cycle method. If you want to test if an edge belongs to a cycle, you should check if both of its vertices are part of the same cycle.
Using the example in the answers above:
import networkx as nx
G = nx.Graph()
G.add_cycle([1,2,3,4])
G.add_cycle([10,20,30])
G.add_edge(1,10)
nx.find_cycle(G, 1) # [(1, 2), (2, 3), (3, 4), (4, 1)]
nx.find_cycle(G, 10) # [(10, 20), (20, 30), (30, 10)]
On the other hand, the edge (2, 3) (or (3, 2) as your graph is undirected) is part of a cycle defined first:
nx.find_cycle(G, 2) # [(2, 1), (1, 4), (4, 3), (3, 2)]