Does NetworkX create the same adjacency matrix and same node positions? - python

I'm working with NetworkX to generate hundreds of random graphs of the following classes:
complete_graph()
star_graph()
balanced_tree()
wheel_graph()
watts_strogatz_graph(n, 2, 0)
I set the number of nodes to n=100.
For each class I create 20 examples.
My question is as follows:
for i in numpy.arange(20):
    G = networkx.complete_graph(n)
    node_positions = networkx.spring_layout(G, scale=100)
    Adjacency = networkx.adjacency_matrix(G)
Do I get 20 different graphs in terms of adjacency matrices and node positions, or do all 20 graphs have the same adjacency matrix and the same node positions?

You get the same adjacency matrix each time you call complete_graph(n): the n-by-n matrix that is 1 everywhere except on the diagonal. This, and the other non-random graph constructors in NetworkX, yield the same result every time.
The layout methods, however, are not deterministic. They involve an optimization with a random starting point: first the vertices are placed randomly, then they are moved to minimize a certain "energy" value. The resulting layout of a graph will therefore differ on each invocation of spring_layout.
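For example, the following sketch illustrates both points (it assumes a NetworkX version recent enough for spring_layout to accept a seed argument; fixing the seed is the only way to make the layout repeatable):

import networkx as nx
import numpy as np

n = 100
G = nx.complete_graph(n)

# Same graph class and size -> identical adjacency matrix on every call
A1 = nx.adjacency_matrix(G)
A2 = nx.adjacency_matrix(nx.complete_graph(n))
assert (A1 != A2).nnz == 0

# Layouts differ between calls unless the random seed is fixed
pos_a = nx.spring_layout(G, scale=100, seed=42)
pos_b = nx.spring_layout(G, scale=100, seed=42)
assert all(np.allclose(pos_a[v], pos_b[v]) for v in G)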

Related

Is the computed number of spanning trees for this undirected graph reasonable/correct?

This is part of my Master thesis which is about designing hydrogen pipeline networks.
For a particular graph with 135 nodes and 157 edges, I need to compute the number of spanning trees (trees spanning all nodes).
I used Kirchhoff's theorem and implemented it (using scipy and networkx) this way:
import networkx as nx
from scipy import linalg

def nbr_spanning_trees(a_G):
    L = nx.laplacian_matrix(a_G)  # Laplacian matrix = degree matrix - adjacency matrix
    L = L.toarray()  # convert the sparse matrix to a dense array
    L_cof = L[1:, 1:]  # submatrix for the (1,1)-entry's cofactor (any cofactor can be taken)
    det = linalg.det(L_cof)  # the cofactor's determinant counts the spanning trees
    return det
The returned result is: 1.879759212930661e+16 spanning trees.
This number seems gigantic. Is it reasonable/correct? Is my implementation correct?
As an addition, the graph has 23 independent cycles (its cycle space has dimension 157 - 135 + 1 = 23). I know this since I have identified the minimum spanning tree, and it has 23 fewer edges than the underlying graph.
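As a sanity check, Cayley's formula says the complete graph K_n has n**(n-2) spanning trees, so a quick test of the implementation might be (note that linalg.det works in floating point, so very large counts are only approximations):

print(nbr_spanning_trees(nx.complete_graph(5)))  # expect a value very close to 125 (5**3)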

Plotting in numpy a graph using the two lowest eigenvalues and eigenvectors

How do I use the eigenvectors corresponding to the two lowest eigenvalues (different from 0) as the x,y coordinates to plot the graph?
I have arrays of eigenvalues and eigenvectors from the Laplacian of a graph. I want to do something similar to the following MATLAB example, but in Python. Also, instead of just the Fiedler vector, I want the eigenvector for the next-larger eigenvalue as well:
https://www.mathworks.com/examples/matlab/mw/matlab-ex64540792-partition-graph-with-laplacian-matrix
Thanks in advance!
This answer assumes you've already got nodes in a list nodelist and two vectors vec1 and vec2 which are in the same order.
Networkx uses a dict (usually called pos) for which pos[u] is the (x,y) coordinates of the node u.
import networkx as nx

# your code here to define the network and find the vectors.
# Please edit a simple version of this code into your question so I can
# add it here.

pos = {node: (x, y) for node, x, y in zip(nodelist, vec1, vec2)}
nx.draw_networkx(G, pos=pos)
In defining pos, I've used a dict comprehension and zip.
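A fuller sketch of the whole pipeline might look like this (the karate club graph stands in for your own network; np.linalg.eigh returns eigenvalues in ascending order, so for a connected graph index 0 is the zero eigenvalue):

import numpy as np
import networkx as nx

G = nx.karate_club_graph()  # substitute your own graph
nodelist = list(G.nodes())
L = nx.laplacian_matrix(G, nodelist=nodelist).toarray().astype(float)

vals, vecs = np.linalg.eigh(L)  # eigenvalues ascending

# Skip the trivial zero eigenvalue; take the Fiedler vector and the next one
vec1, vec2 = vecs[:, 1], vecs[:, 2]
pos = {node: (x, y) for node, x, y in zip(nodelist, vec1, vec2)}
nx.draw_networkx(G, pos=pos)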

Python NumPy vectorization

I'm trying to code what is known as the List Right Heuristic for the unweighted vertex cover problem. The background is as follows:
Vertex Cover Problem: In the vertex cover problem, we are given an undirected graph G = (V, E) where V is the set of vertices and E is the set of Edges. We need to find the smallest set V' which is a subset of V such that V' covers G. A set V' is said to cover a graph G if all the edges in the graph have at least one vertex in V'.
List Right Heuristic: The algorithm is very simple. Given a list of vertices V = [v1, v2, ... vn], where n is the number of vertices in G, vi is said to be a right neighbor of vj if i > j and vi and vj are connected by an edge of G. We initialize a cover C = {} (the empty set) and scan V from right to left. At any point, say the current vertex being scanned is u: if u has at least one right neighbor not in C, then u is added to C. V is scanned exactly once.
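In plain, non-vectorized Python, the heuristic for a single graph might look like this (an illustrative sketch; adj is an n-by-n 0/1 adjacency matrix):

def list_right_heuristic_single(adj):
    n = len(adj)
    cover = set()
    for u in range(n - 1, -1, -1):  # scan right to left
        # add u if it has a right neighbor that is not yet covered
        if any(adj[u][v] and v not in cover for v in range(u + 1, n)):
            cover.add(u)
    return cover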
I'm solving this for multiple graphs (with the same vertices but different edges) at once.
I coded the List Right Heuristic in Python and was able to vectorize it to solve multiple graphs at once, but I was unable to vectorize the original for loop. I represent each graph by its adjacency matrix. I was wondering whether it can be vectorized further. Here's my code:
import numpy as np
import numpy.matlib

def list_right_heuristic(population: np.ndarray, adj_matrix: np.ndarray):
    # one copy of the adjacency matrix per graph in the population
    adj_matrices = np.matlib.repmat(adj_matrix, population.shape[0], 1).reshape((population.shape[0], *adj_matrix.shape))
    for i in range(population.shape[0]):
        # remove covered vertices from the graph: delete the corresponding edges
        adj_matrices[i, np.outer(population[i], population[i]).astype(bool)] = 0
    vertex_covers = np.zeros(shape=population.shape, dtype=population.dtype)
    for index in range(population.shape[-1] - 1, -1, -1):
        # number of right neighbors (per row) not yet in vertex_covers
        inclusion_rows = np.sum(((1 - vertex_covers) * adj_matrices[..., index])[..., index + 1:], axis=-1).astype(bool)
        # only add the vertex for rows with at least one right neighbor not in the cover
        vertex_covers[inclusion_rows, index] = 1
    return vertex_covers
I have p graphs to solve simultaneously, where p = population.shape[0]. Each graph has the same vertices but different edges. The population array is a 2D array in which each row indicates the vertices of G that are already in the cover. I only want to find the vertices that are not yet in the cover, so I set all rows and columns of covered vertices to 0, i.e., I delete the corresponding edges. The heuristic should then, in theory, return only vertices not already in the cover.
So in the first for loop, I just set the corresponding rows and columns of each adjacency matrix to 0. Next, I traverse the 2D array of vertices from right to left and, for each row, count the right neighbors not in vertex_covers. To do this, I first find the vertices not in the cover (1 - vertex_covers) and multiply elementwise by the corresponding column of adj_matrices (equivalently, the row, since the adjacency matrix is symmetric), which gives the uncovered neighbors of the vertex being scanned. I then sum the entries to the right of the current index; if this sum is greater than 0, there is at least one right neighbor not in vertex_covers.
Am I doing this correctly, for one?
And is there any way to vectorize the second for loop (or the first, for that matter), or to speed up the code in general? I'm calling this function thousands of times from other code, on large graphs (with 1000+ vertices). Any help would be appreciated.
You can use np.einsum to perform many complex operations between indices. In your case, the first loop can be performed this way:
adj_matrices[np.einsum('ij, ik->ijk', population, population).astype(bool)] = 0
It took me some time to understand how einsum works. I found this SO question very helpful.
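For instance, here is a quick check on toy data that the einsum expression reproduces the per-row outer products of the original loop:

import numpy as np

population = np.random.randint(0, 2, size=(4, 6))  # toy 0/1 data
masks = np.einsum('ij,ik->ijk', population, population).astype(bool)
for i in range(population.shape[0]):
    assert np.array_equal(masks[i], np.outer(population[i], population[i]).astype(bool))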
BTW, your code gave me the following syntax error (star-unpacking inside a tuple literal requires Python 3.5+):
SyntaxError: can use starred expression only as assignment target
and I had to rewrite the first line of the function as:
adj_matrices = np.matlib.repmat(adj_matrix, population.shape[0],
                                1).reshape((population.shape[0],) + adj_matrix.shape)

Generating power-law degree-distributed random directed graphs

I searched for a way to generate random directed graphs $G(V,E)$ with a specific node and edge count, specified in- and out-degree distributions, no self-loops, and full connectivity. I found an R function at this link.
I searched networkx but found only this function, in which the graph grows by preferential attachment, so the number of edges is not controllable.
Is there an equivalent to the R function in Python?
It might not be so easy to generate a graph like that (fixed number of edges and nodes, given degree distribution, connected)...
But the directed configuration model might get you most of the way there.
http://networkx.github.io/documentation/latest/reference/generated/networkx.generators.degree_seq.directed_configuration_model.html#networkx.generators.degree_seq.directed_configuration_model
Return a directed_random graph with the given degree sequences. The configuration model generates a random directed pseudograph (graph with parallel edges and self loops) by randomly assigning edges to match the given degree sequences.
The example shows how to remove self-loops and parallel edges.
>>> D=nx.DiGraph([(0,1),(1,2),(2,3)]) # directed path graph
>>> din=list(D.in_degree().values())
>>> dout=list(D.out_degree().values())
>>> din.append(1)
>>> dout[0]=2
>>> D=nx.directed_configuration_model(din,dout)
To remove parallel edges:
>>> D=nx.DiGraph(D)
To remove self loops:
>>> D.remove_edges_from(D.selfloop_edges())
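With NetworkX 2.x and later, the same steps need small adjustments, since degree methods return views of (node, degree) pairs and selfloop_edges moved to the module level (a rough sketch; check the docs for your installed version):

din = [d for _, d in D.in_degree()]
dout = [d for _, d in D.out_degree()]
D = nx.directed_configuration_model(din, dout)
D = nx.DiGraph(D)  # collapse parallel edges
D.remove_edges_from(list(nx.selfloop_edges(D)))  # drop self loops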
You will need to generate both an in-degree and an out-degree sequence of your specified length and sum as inputs. If you remove self-loops and parallel edges, that will likely reduce the number of edges below your original specification.
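One way to produce such sequences is sketched below (it assumes networkx.utils.powerlaw_sequence; the exponent and the crude sum-matching step are choices you would tune):

import networkx as nx
from networkx.utils import powerlaw_sequence

n = 200  # desired number of nodes (illustrative)
din = [int(round(x)) for x in powerlaw_sequence(n, exponent=2.5)]
dout = [int(round(x)) for x in powerlaw_sequence(n, exponent=2.5)]

# both sequences must sum to the same number of edges; patch the gap
gap = sum(din) - sum(dout)
if gap > 0:
    dout[0] += gap
else:
    din[0] -= gap

D = nx.directed_configuration_model(din, dout)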
Also, there is no guarantee that your graph will be connected.

Networkx graph clustering

In NetworkX, how can I cluster nodes based on node color? E.g., I have 100 nodes; some are close to black, while others are close to white. In the graph layout, I want nodes with similar colors to stay close to each other, and nodes with very different colors to stay far apart. How can I do that? Basically, how does the edge weight influence the layout in spring_layout? If NetworkX cannot do this, is there another tool that can compute such a layout?
Thanks
OK, let's build an adjacency matrix W for the graph following this simple procedure: if adjacent vertices i and j have the same color, then the weight W_{i,j} of the edge between them is a big number (which you will tune in your experiments later); otherwise it is some small number, chosen analogously.
Now, let's write the Laplacian of the matrix as
L = D - W, where D is the diagonal matrix whose elements d_{i,i} equal the sum of the i-th row of W.
Now, one can easily show that the value of
f L f^T = 1/2 * sum_{i,j} W_{i,j} (f_i - f_j)^2, where f is an arbitrary vector, is small when vertices with large adjacency weights have close values of f. You may think of this as a way to set up a coordinate system for the graph, in which the i-th vertex has coordinate f_i in 1D space.
Now, let's choose some number of such vectors f^k, which give a representation of the graph as a set of points in a Euclidean space in which, for example, k-means works: the i-th vertex of the initial graph now has coordinates f^1_i, f^2_i, ..., and adjacent vertices of the same color in the initial graph will be close in this new coordinate space.
The question of how to choose the vectors f is a simple one: just take a couple of eigenvectors of the matrix L corresponding to small but nonzero eigenvalues.
This is a well known method called spectral clustering.
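A minimal end-to-end sketch (the random graph and the scalar color values are stand-ins for your data; the weight function and the number of clusters are choices to tune):

import numpy as np
import networkx as nx
from scipy.cluster.vq import kmeans2

G = nx.erdos_renyi_graph(100, 0.08, seed=1)  # stand-in graph
color = np.random.rand(G.number_of_nodes())  # stand-in node colors in [0, 1]

n = G.number_of_nodes()
W = np.zeros((n, n))
for u, v in G.edges():
    # similar colors -> big weight, dissimilar colors -> small weight
    W[u, v] = W[v, u] = np.exp(-10.0 * (color[u] - color[v]) ** 2)

D = np.diag(W.sum(axis=1))
L = D - W

vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
embedding = vecs[:, 1:3]  # skip the trivial zero eigenvalue
centroids, labels = kmeans2(embedding, 3, minit='points')  # k-means in spectral space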
Further reading:
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman,
which is available for free from the authors' page: http://www-stat.stanford.edu/~tibs/ElemStatLearn/
