I have a Graph G1 with 50 nodes and 100 edges. All edges are weighted. I have created a list of edges (sorted by a pre-defined order, removing specific edges with large values), and they are indexed like:
Edgelist: [75, 35, 32, 1, ...]
I want to add the edges to a different graph G2 in batches of 10 (to save computation time), but add_edges seems to want a list of vertex-pair tuples. So:
How can I convert the edge list above into a tuple list, e.g. [(40,2),(10,1),(10,11),(0,0),...]? I've tried a loop with G1.es[edge].tuple, but iGraph reads the [edge] variable as an attribute, whereas if you just write G1.es[75].tuple, it works fine.
How can I look up weights from G1 and add them to G2 in batches of 10?
Be aware that indexing G1.es with a single number returns an object of type Edge, while indexing it with a list of numbers returns an object of type EdgeSeq. Edge objects have a property named tuple, but EdgeSeq objects don't, which is why G1.es[edgelist].tuple does not work. However, you can do this:
sorted_tuples = [edge.tuple for edge in G1.es[edgelist]]
You can also extract the value of the weight attribute directly from the EdgeSeq object:
sorted_weights = G1.es[edgelist]["weight"]
Here you can make use of the fact that if G2 has M edges and you add m extra edges, then the IDs of these new edges will be in the range from M (inclusive) to M+m (exclusive):
M = G2.ecount()
m = len(sorted_tuples)
G2.add_edges(sorted_tuples)
G2.es[M:(M+m)]["weight"] = sorted_weights
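To do this in batches of 10, as the question asks, the two sorted lists can be cut into matching slices first. The following is only a sketch: the helper name batches is made up for illustration, and the two lists are stand-in data so the snippet runs without igraph.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in data; in the answer above these would come from G1.es[edgelist].
sorted_tuples = [(i, i + 1) for i in range(25)]
sorted_weights = [0.5 * i for i in range(25)]

for edge_batch, weight_batch in zip(batches(sorted_tuples, 10),
                                    batches(sorted_weights, 10)):
    # With igraph, this is the place to add one batch of edges to G2
    # and then assign the weights of that batch.
    print(len(edge_batch), len(weight_batch))
```

Inside the loop, the same pattern as above applies per batch: record M = G2.ecount(), call G2.add_edges(edge_batch), then assign weight_batch to the "weight" attribute of G2.es[M:M + len(edge_batch)].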
1) Graph G1 has had unwanted edges deleted already. Edgelist is the edge order for G1.
tuple_list=[]
for e in G1.es:
    tuple_list.append(e.tuple)
sorted_tuples=[tuple_list[i] for i in Edgelist]
sorted_weights = [G1.es['weight'][o] for o in Edgelist]
2) Add edges - this can simply be looped for all edges in G1. Example below for first 10.
edges_to_add=sorted_tuples[0:10]
weights_to_add=sorted_weights[0:10]
G2.add_edges(edges_to_add)
for edge in range(len(edges_to_add)):
    G2.es[G2.get_eid(edges_to_add[edge][0],edges_to_add[edge][1],0)]['weight'] = weights_to_add[edge]
Edge weights are added individually, which is a little slow, but there doesn't seem to be a way of adding edge weights in a batch in iGraph
I'm a beginner in Python and I'm having trouble solving an exercise on graphs. The exercise is as follows:
A graph G = (V, A) stores the information of a set of vertices and a set of edges. The degree of a vertex is the number of edges incident on it. The degree of a graph is the maximum value of the degree of its vertices. The "D(degree)" is the inverse idea, the minimum value of the degree of the vertices. Write a program that takes a series of instructions and processes them to generate an undirected graph.
IV A inserts the vertex with id==A into the graph;
IA A B inserts an edge from the vertex of id==A to the vertex of id==B, if the vertices exist;
RV A removes the vertex of id==A, if it exists, and all edges related to it; and
RA A B removes the edge from the vertex of id==A to the vertex of id==B, if it exists;
Input:
The input consists of a line containing the number 0 ≤ n ≤ 100 indicating the number of operations on the graph, followed by n lines, each containing an instruction as shown. Each id is a string with a maximum of 10 characters.
Output:
Present, in one line, the "D(degree)" of the graph.
Note:
Insert operations overwrite existing information. In the first example, the two vertices have the least number of edges. In the second case, vertex A has the fewest edges. In the last example, vertices A and B have only one edge while C has two.
This is the beginning of my code; I couldn't get any further:
n = int(input())
G = {}
for i in range(n):
    l = input().split()
    if l[0] == 'IV':
        if l[1] not in G:
            G[l[1]] = []
It's easy, actually. You mentioned that n <= 100, so at most 200 different ids can be introduced. We will therefore maintain a 200x200 array.
So you will do the following things:
Always map the ids to integers. Keep a dictionary that associates each id with a number: on IV A, populate the dictionary with {'A': 0}; on a later IV B, it becomes {'A': 0, 'B': 1}; and so on.
When you get an edge insertion or removal, first check whether both vertices exist in that dictionary.
If they don't, ignore the instruction.
If they do, update the 2D array (which was initialized with 0). Set the entries rather than incrementing them: the note says insert operations overwrite existing information, so a repeated IA must not count twice.
IA A B --> set positions [0][1] and [1][0] of the 2D array to 1.
RA A B --> set positions [0][1] and [1][0] of the 2D array to 0.
In the end, count how many non-zero entries each row has (considering only the rows that correspond to ids present in the dictionary) and return the minimum of those counts.
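For comparison, here is a minimal sketch of the same bookkeeping using a dictionary of sets instead of the 200x200 array. It is not the approach described above, just an alternative. The function name min_degree is made up, and two things are assumptions because the exercise does not specify them: an empty graph reports 0, and a self-loop counts as one edge.

```python
def min_degree(instructions):
    """Process IV/IA/RV/RA instructions and return the smallest vertex degree."""
    adjacency = {}  # vertex id -> set of neighbouring vertex ids

    for line in instructions:
        parts = line.split()
        if not parts:
            continue
        op, args = parts[0], parts[1:]

        if op == "IV" and len(args) == 1:
            # Re-inserting an existing vertex leaves its edges untouched.
            adjacency.setdefault(args[0], set())
        elif op == "IA" and len(args) == 2:
            a, b = args
            if a in adjacency and b in adjacency:
                # Sets make a repeated insert a no-op.
                adjacency[a].add(b)
                adjacency[b].add(a)
        elif op == "RV" and len(args) == 1:
            removed = args[0]
            for neighbour in adjacency.pop(removed, set()):
                if neighbour in adjacency:  # skips a self-loop on the removed vertex
                    adjacency[neighbour].discard(removed)
        elif op == "RA" and len(args) == 2:
            a, b = args
            if a in adjacency and b in adjacency:
                adjacency[a].discard(b)
                adjacency[b].discard(a)

    # Assumption: an empty graph reports 0.
    return min((len(neighbours) for neighbours in adjacency.values()), default=0)


print(min_degree(["IV A", "IV B", "IV C", "IA A C", "IA B C"]))  # prints 1
```

To hook this up to the input format of the exercise, read n with int(input()) and pass the next n lines to min_degree.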
I am not sure I understand how Networkit handles the names of the nodes.
Let's say that I read a large graph from an edgelist, using another Python module like NetworkX; then I convert it to a NetworKit graph and I perform some operations, like computing the pairwise distances. A simple piece of code to do this could be:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
easy-peasy.
Now, what if I want to do something with those distances? For example, what if I want to plot them against, I don't know, the weights on the paths, or any other measure that requires the retrieval of the original node ids?
The getDistances() function returns a list of lists, one for each node with the distance to every other node, but I have no clue on how Networkit maps the nodes’ names to the sequence of ints that it uses as nodes identifiers, thus the order it followed to compute the distances and store them in the output.
When creating a new graph from networkx, NetworKit creates a dictionary that maps each node id in nxG to a unique integer from 0 to n - 1 in G (where n is the number of nodes).
Unfortunately, this mapping is not returned by nx2nk, so you should create it yourself.
Let's assume that you want to get a distance from node 1 to node 2, where 1 and 2 are node ids in nxG:
import networkx as nx
import networkit as nk
nxG=nx.read_edgelist('test.edgelist',data=True)
G = nk.nxadapter.nx2nk(nxG, weightAttr='weight')
# Get mapping from node ids in nxG to node ids in G
idmap = {node_id: u for u, node_id in enumerate(nxG.nodes())}
apsp = nk.distance.APSP(G)
apsp.run()
dist=apsp.getDistances()
# Get distance from node `1` to node `2`
dist_from_1_to_2 = dist[idmap['1']][idmap['2']]
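If you need all distances keyed by the original names (for plotting, say), the same mapping can be applied to the whole matrix at once. This is only a sketch: label_distances is a made-up helper, and the small matrix below is stand-in data so the snippet runs without NetworKit. It relies on the same assumption as above, namely that nodes are numbered in the order of nxG.nodes().

```python
def label_distances(dist, node_names):
    """Return {source_name: {target_name: distance}} from an index-based matrix."""
    return {
        source: dict(zip(node_names, row))
        for source, row in zip(node_names, dist)
    }


# Stand-ins for list(nxG.nodes()) and apsp.getDistances() on a 3-node graph.
node_names = ["a", "b", "c"]
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]

labelled = label_distances(dist, node_names)
print(labelled["a"]["c"])  # prints 3.0
```

With the real objects, the call would be label_distances(apsp.getDistances(), list(nxG.nodes())).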
I'm trying to compare two graphs with the help of the optimize_graph_edit_distance algorithm on NetworkX:
optimize_graph_edit_distance(G1, G2,
    node_match=None,
    node_subst_cost=None,
    edge_match=None,
    node_del_cost=None,
    edge_subst_cost=None,
    edge_del_cost=None,
    edge_ins_cost=None,
    upper_bound=None)
I gave each node on both graphs a set amount of attributes in the form of a dictionary and with the help of node_match I can specify if nodes N1 and N2 should be considered equal during matching.
The function node_match should be called like so:
node_match(G1.nodes[n1], G2.nodes[n2])  # the two arguments are the attribute dictionaries of nodes n1 and n2
My problem is that I have more than one node in each graph. Therefore, how do I give the function all other attribute dictionaries to compare all other nodes?
node_match is a function that returns True if node n1 in G1 and n2 in G2 should be considered equal during matching. For example:
import networkx as nx
G1 = nx.DiGraph()
G1.add_nodes_from([(0, {'label':'a'}), (1, {'label':'b'}),(2, {'label':'c'})])
G1.add_edges_from([(0,1),(0,2)])
G2 = nx.DiGraph()
G2.add_nodes_from([(3, {'label':'a'}), (4, {'label':'b'}),(5, {'label':'c'})])
G2.add_edges_from([(3,4),(3,5)])
print(G1.nodes())
print(G1.edges())
print(G2.nodes())
print(G2.edges())
for dist in nx.algorithms.similarity.optimize_graph_edit_distance(G1, G2, node_match=lambda a,b: a['label'] == b['label']):
    print(dist)
Here, even though the node identifiers in the two graphs are different, the distance will be zero. That's because we defined the function that compares 2 nodes as lambda a,b: a['label'] == b['label'], meaning that two nodes are considered equal during matching if they have the same 'label' value.
Similarly, you can implement any logic you wish to without specifically treating every pair of nodes in your graphs.
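If your nodes carry several attributes, the comparison can be written once as an ordinary function. Here is a sketch; make_node_match is a made-up helper and 'color' is a hypothetical second attribute used only for illustration. The snippet exercises the function on plain dictionaries, so it runs without networkx.

```python
def make_node_match(*keys):
    """Build a node_match callable that compares the given attribute keys."""
    def node_match(attrs1, attrs2):
        # attrs1 and attrs2 are the attribute dictionaries of one node from each graph.
        # A key missing from both dictionaries compares as equal (None == None).
        return all(attrs1.get(key) == attrs2.get(key) for key in keys)
    return node_match


# 'color' is a hypothetical attribute, not part of the example above.
match = make_node_match("label", "color")
print(match({"label": "a", "color": "red"}, {"label": "a", "color": "red"}))   # prints True
print(match({"label": "a", "color": "red"}, {"label": "a", "color": "blue"}))  # prints False
```

You would then pass node_match=make_node_match("label", "color") to optimize_graph_edit_distance, and it gets called with the attribute dictionaries of each candidate pair of nodes, so you never hand over the dictionaries yourself.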
I am trying to find the closely lying points and remove duplicate points for some shape data (co-ordinates) in Python. I name the co-ordinates nodes as 1,2,3.. and so on and I'm using the shapely package and creating polygons around the node points 1,2,3.. by saying
polygons = [Point(nodes[i]).buffer(1) for i in range(len(nodes))]
and to find the cascading ones I use
cascade = cascaded_union(polygons)
The cascade that is returned is a MultiPolygon with many coordinates listed. I want to know exactly which of the points from my nodes are cascaded (based on the buffer value of 1) so that I can replace them with a new node. How can I find this out?
Instead of using the cascaded_union method, it might be easier to write your own method to check if any two polygons intersect. If I'm understanding what you want to do correctly, you want to find if two polygons overlap, and then delete one of them and edit another accordingly.
You could do something like this (not the best solution, I'll explain why):
from shapely.geometry import Point
def clean_closely_lying_points(nodes):
    polygons = [Point(nodes[i]).buffer(1) for i in range(len(nodes))]
    for i in range(len(polygons) - 1):
        if polygons[i] is None:
            continue
        for j in range(i + 1, len(polygons)):
            if polygons[j] is None:
                continue
            if polygons[i].intersects(polygons[j]):
                polygons[j] = None
                nodes[j] = None
                # Now overwrite entry i with whatever you want it to be, given that
                # polygons[i] and polygons[j] intersect:
                # polygons[i] = ...
                # nodes[i] = ...
However, overall, I feel like the creation of polygons is time intensive and unnecessary. It's also tedious to update the polygons list and the nodes list together. Instead, you could just use the nodes themselves, and use shapely's distance method to check if two points are within 2 units of each other.
This should be mathematically equivalent since the intersection between two circles both of radius 1 means that their center points are at most distance 2 away. In this scenario, your for loops would take a similar structure except they would iterate over the nodes.
from shapely.geometry import Point
def clean_closely_lying_points(nodes):
    # Cast each node (assumed to be an (x, y) tuple) to a shapely Point.
    point_nodes = [Point(node) for node in nodes]
    for i in range(len(point_nodes) - 1):
        if point_nodes[i] is None:
            continue
        for j in range(i + 1, len(point_nodes)):
            if point_nodes[j] is None:
                continue
            if point_nodes[i].distance(point_nodes[j]) < 2:
                point_nodes[j] = None
                # point_nodes[i] can be replaced here with whatever you want it to be,
                # now that point_nodes[j] is known to lie within a distance of 2
                # (it could also remain itself).
    return [node for node in point_nodes if node is not None]
The result of this method would be a list of shapely point objects, with closely lying points eliminated.
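If what you need is to know which of your original nodes ended up merged together, a variation is to return groups of node indices instead of deleting points. The sketch below uses plain Python (math.dist, Python 3.8+) rather than shapely, and a small union-find so that chains of close points land in one group, which is roughly how overlapping buffers merge in a union. The function name group_close_points and the default threshold of 2 (two buffers of radius 1) are assumptions for illustration.

```python
import math


def group_close_points(nodes, threshold=2.0):
    """Group indices of points that are chained together by gaps below `threshold`."""
    parent = list(range(len(nodes)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) < threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i in range(len(nodes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


nodes = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (2.5, 0.0)]
print(group_close_points(nodes))  # prints [[0, 1, 3], [2]]
```

Every group with more than one index is a set of nodes you could replace by a single new node, for example their centroid. Note that node 3 joins node 0's group only through node 1, which is the chaining behaviour mentioned above.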
What if I need to create a graph in igraph and add a bunch of edges, but the edges have associated attributes? It looks like .add_edges can only take a list of edges without attributes, so I've been adding them one by one with .add_edge
graph.add_edge('A','B',weight = 20)
Here A and B are names of nodes
You can assign the attributes later; e.g.:
graph.es["weight"] = range(graph.ecount())
This will assign weights to all the edges at once. If you want to assign attributes to only a subset of the edges, index or slice the edge sequence (graph.es) however you want:
graph.es[10:20]["weight"] = range(10)
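If your data starts out as one record per edge, you first need the two parallel lists that this pattern works with. A small sketch in plain Python; the triples in weighted_edges are made-up sample data.

```python
# Hypothetical input: one (source, target, weight) triple per edge.
weighted_edges = [("A", "B", 20), ("B", "C", 5), ("A", "C", 12)]

# Split the triples into the two parallel lists that a bulk insert needs.
edges = [(source, target) for source, target, _ in weighted_edges]
weights = [weight for _, _, weight in weighted_edges]

print(edges)    # prints [('A', 'B'), ('B', 'C'), ('A', 'C')]
print(weights)  # prints [20, 5, 12]
```

From there, note the current graph.ecount(), call graph.add_edges(edges), and assign weights to the "weight" attribute of the slice of graph.es that starts at the noted count. This assumes the vertices named in the triples already exist in the graph.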