Create undirected graph in NetworkX in python from pandas dataframe - python

I am new to the NetworkX package in Python and want to solve the following problem.
Let's say this is my data set:
import pandas as pd
d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)
df
label and size in the df are quite self-explanatory. The dist column measures the distance from the biggest label (label 1) to the rest of the labels. Hence dist is 0 in the case of label 1.
I want to produce something similar to the picture below:
Where the biggest label in size (label 1) is in a central position. Edges are the distances from label 1 to all other labels, and the size of each node is proportional to the size of its label. Is it possible?
Thank you very much in advance. Please let me know if the question is unclear.

import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
for _, row in df.iterrows():
    G.add_node(row['label'], pos=(row['dist'], 0), size=row['size'])
biggest_node = 1
for node in G.nodes:
    if node != biggest_node:
        G.add_edge(biggest_node, node)
nx.draw(G,
        pos={node: attrs['pos'] for node, attrs in G.nodes.items()},
        node_size=[node['size'] * 100 for node in G.nodes.values()],
        with_labels=True
        )
plt.show()
Which plots
Notes:
You will notice that the edges 1-3 and 1-2 look thicker, because they overlap with sections of the edges 1-5 and 1-4 respectively. You can address that by drawing only one edge from the center to the furthest node in each direction; since every node lies on the same line, the result looks the same.
coords = [(attrs['pos'][0], node) for node, attrs in G.nodes.items()]
nx.draw(G,
        # same arguments as before and also add
        edgelist=[(biggest_node, min(coords)[1]), (biggest_node, max(coords)[1])]
        )
The 100 factor in the list for the node_size argument is just a scaling factor. You can change that to whatever you want.
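As a rough sketch of the same idea without hard-coding the central node, the center can be taken from the row with the largest size. The helper names build_star_graph and outermost_edges are made up for this illustration and are not part of NetworkX; the data frame is the one from the question.

```python
import networkx as nx
import pandas as pd


def build_star_graph(df):
    """Build a star graph centred on the row with the largest `size` (hypothetical helper)."""
    G = nx.Graph()
    for label, size, dist in zip(df["label"], df["size"], df["dist"]):
        # Cast to plain Python numbers so node keys are not NumPy scalars.
        G.add_node(int(label), pos=(float(dist), 0), size=int(size))
    center = int(df.loc[df["size"].idxmax(), "label"])
    G.add_edges_from((center, n) for n in list(G.nodes) if n != center)
    return G, center


def outermost_edges(G, center):
    """Return the two edges from the center to the leftmost and rightmost nodes."""
    by_x = sorted(G.nodes, key=lambda n: G.nodes[n]["pos"][0])
    return [(center, by_x[0]), (center, by_x[-1])]


df = pd.DataFrame({"label": [1, 2, 3, 4, 5],
                   "size": [10, 8, 6, 4, 2],
                   "dist": [0, 2, -2, 4, -4]})
G, center = build_star_graph(df)
edgelist = outermost_edges(G, center)
node_sizes = [G.nodes[n]["size"] * 100 for n in G.nodes]  # 100 is an arbitrary scale
```

The resulting edgelist and node_sizes can be passed to nx.draw through its edgelist and node_size arguments, together with the pos dictionary built as in the answer above.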

Related

Networkx: Multiple conditions for edges

I'm trying to generate a network through a dataframe like the following:
import itertools as it
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

df1 = pd.DataFrame({'id_emp': [1, 2, 3, 4, 5],
                    'roi': ['positive', 'negative', 'positive', 'negative', 'negative'],
                    'description': ['middle', 'low', 'middle', 'high', 'low']})
df1 = df1.set_index('id_emp')
In the network that I am trying to develop, the nodes represent the values of the id_emp column, and there is an edge between two nodes if the roi AND description column values are the same. Here is the code I'm using:
G = nx.Graph()
G.add_nodes_from([a for a in df1.index])
for cr in set(df1['roi']):
    indices = df1[df1['roi'] == cr].index
    G.add_edges_from(it.product(indices, indices))
for d in set(df1['description']):
    indices = df1[df1['description'] == d].index
    G.add_edges_from(it.product(indices, indices))
pos = nx.kamada_kawai_layout(G)
plt.figure(figsize=(3, 3))
nx.draw(G, pos, node_size=100, width=0.5, with_labels=True)
plt.show()
Output:
Problem: Edges are being generated for nodes with equal values in the description OR roi columns. In the given example, node 4 should have no connection because it has a different value in the description column.
What should I do to analyze the two conditions together to have an edge between two nodes?
I'm not sure why you're using a graph theory tool in this case. NetworkX would be interesting here if you wanted to find the connected components, for instance (i.e. linked nodes).
However, if two nodes must have exactly the same values in both columns to be considered part of the same component, that is essentially the same as obtaining a list of duplicate rows in the dataframe, which could be achieved by:
df1.roi.str.cat(df1.description, sep='-').reset_index().groupby('roi').id_emp.apply(list)
roi
negative-high         [4]
negative-low       [2, 5]
positive-middle    [1, 3]
Name: id_emp, dtype: object
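If the graph itself is still wanted, one possible way to require both conditions is to group on the two columns together and connect the members of each group. This is only a sketch using the data frame from the question; it swaps the two separate loops for a single groupby on ['roi', 'description'].

```python
import itertools as it

import networkx as nx
import pandas as pd

df1 = pd.DataFrame({'id_emp': [1, 2, 3, 4, 5],
                    'roi': ['positive', 'negative', 'positive', 'negative', 'negative'],
                    'description': ['middle', 'low', 'middle', 'high', 'low']})
df1 = df1.set_index('id_emp')

G = nx.Graph()
G.add_nodes_from(df1.index)

# Rows land in the same group only if roi AND description both match.
for _, group in df1.groupby(['roi', 'description']):
    # combinations() yields each unordered pair once and never pairs a node with itself.
    G.add_edges_from(it.combinations(group.index, 2))

edges = sorted(tuple(sorted(e)) for e in G.edges)
```

With this data, nodes 1 and 3 are linked, nodes 2 and 5 are linked, and node 4 stays isolated. The graph can then be drawn with the same nx.draw call as in the question.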

Find all simplices a point is a part of in scipy.spatial.Delaunay python

Is there any way to get all the simplices/triangles a certain point is a part of in a Delaunay triangulation using scipy.spatial.Delaunay?
I know there is the find_simplex() function, but that only returns one triangle that a point is a part of, and I would like to get all triangles that it is a part of.
So in the example, when I call find_simplex() for point 6, it only returns triangle 2, but I would like it to return triangles 1, 2, 3, 4, 10, and 9, as point 6 is a part of all of those triangles.
Any help would be appreciated!
You don’t want find_simplex because it is geometric, not topological. That is, it treats a point as a location rather than as a component of the triangulation: almost all points lie in only one simplex, so that’s what it reports.
Instead, use the vertex number. The trivial answer is to use the simplices attribute:
vert=6
[i for i,s in enumerate(d.simplices) if vert in s]
With a good bit more code, it is possible to search more efficiently using the vertex_to_simplex and neighbors attributes.
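When many vertices have to be looked up, a middle ground is to build an index from vertex number to simplex numbers in a single pass. The function name simplices_by_vertex is made up for this sketch, and the small hand-written simplices list stands in for the simplices attribute of a Delaunay object, which should be usable in the same way.

```python
from collections import defaultdict


def simplices_by_vertex(simplices):
    """Map each vertex number to the list of simplex indices that contain it."""
    index = defaultdict(list)
    for i, simplex in enumerate(simplices):
        for v in simplex:
            index[int(v)].append(i)
    return dict(index)


# Stand-in for `d.simplices`: three triangles sharing vertex 2.
simplices = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
index = simplices_by_vertex(simplices)
triangles_of_2 = index[2]
```

Building the index costs one pass over all simplices; each later lookup is a dictionary access.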
You can get all simplices adjacent to a given vertex efficiently by
def get_simplices(self, vertex):
    "Find all simplices this `vertex` belongs to"
    visited = set()
    queue = [self.vertex_to_simplex[vertex]]
    while queue:
        simplex = queue.pop()
        for i, s in enumerate(self.neighbors[simplex]):
            if self.simplices[simplex][i] != vertex and s != -1 and s not in visited:
                queue.append(s)
        visited.add(simplex)
    return np.array(list(visited))
Example:
import scipy.spatial
import numpy as np
np.random.seed(0)
points = np.random.rand(10, 2)
tri = scipy.spatial.Delaunay(points)
vertex = 2
simplices = get_simplices(tri, vertex)
# 0, 2, 5, 9, 11
neighbors = np.unique(tri.simplices[simplices].reshape(-1))
# 0, 1, 2, 3, 7, 8
Visualisation:
import matplotlib.pyplot as plt
plt.triplot(points[:,0], points[:,1], tri.simplices)
plt.plot(points[neighbors,0], points[neighbors,1], 'or')
plt.plot(points[vertex,0], points[vertex,1], 'ob')
plt.show()

Coloring of nodes based on depth from root node

I have a graph in Python using networkx
o = net.DiGraph()
hfollowers = defaultdict(lambda: 0)
for (twitter_user, followed_by, followers) in twitter_network:
    o.add_edge(twitter_user, followed_by, followers=int(followers))
    hfollowers[twitter_user] = int(followers)
I have a root defined - which is the name of the twitter user
SEED = 'BarackObama'
I have initialized a subgraph from SEED
g = net.DiGraph(net.ego_graph(o, SEED, radius=4))
Now, I want to assign color to nodes based on its depth from SEED and plot it. How do I do this ?
In the code below I create a graph. Then I get the distances of each node from that graph. Then I invert that so that instead for each distance I have a list of nodes at that distance. Then for each distance I plot the nodes with a given color. Note that if there is an unreachable node, it won't get plotted. If such nodes exist you need to decide what you're doing with them separately.
Also because I'm using just the standard colors (except white), I can only do a few distances. If you need more, then you'll have to use some other way to make your list of colors (or perhaps make a function that returns a color based on distance). It will take an RGB or HEX definition for a color.
import matplotlib.pyplot as plt
import networkx as nx

G = nx.erdos_renyi_graph(10, 0.4)
G.add_node(11)  # here's a new node, it's not connected
SEED = 1
distanceDict = nx.shortest_path_length(G, SEED)  # for each node know how far it is
inverse_dict = {}  # for each distance this will say which nodes are there.
for k, v in distanceDict.items():  # use .iteritems() on Python 2
    inverse_dict[v] = inverse_dict.get(v, [])
    inverse_dict[v].append(k)
inverse_dict
# e.g. {0: [1], 1: [0, 5, 6], 2: [2, 3, 4, 8, 9], 3: [7]}
colors = ['r', 'b', 'g', 'k', 'c', 'm']  # create a list of colors
max_dist = len(colors) - 1  # largest distance that still has a color
pos = nx.spring_layout(G)  # set positions so that each plot below uses same position
for distance in inverse_dict:
    if distance <= max_dist:  # plot these nodes with the right color
        nx.draw_networkx(G, pos=pos, nodelist=inverse_dict[distance], node_color=colors[distance])
tooFar = []
for node in G.nodes:  # use G.nodes_iter() on NetworkX 1.x
    if node not in distanceDict or distanceDict[node] > max_dist:
        tooFar.append(node)
nx.draw_networkx(G, pos=pos, nodelist=tooFar, node_color='w')
plt.show()
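An alternative sketch is to build one color per node, in node order, so that a single draw call is enough. The function name depth_colors and its parameters are made up for this illustration; nodes that are unreachable or deeper than the palette fall back to a default color.

```python
import networkx as nx


def depth_colors(G, seed, palette, unreachable="w"):
    """Return one color per node of G, chosen by shortest-path depth from `seed`."""
    depth = nx.single_source_shortest_path_length(G, seed)
    colors = []
    for node in G.nodes:
        d = depth.get(node)
        if d is None or d >= len(palette):
            colors.append(unreachable)  # unreachable, or deeper than the palette covers
        else:
            colors.append(palette[d])
    return colors


G = nx.path_graph(4)   # 0 - 1 - 2 - 3
G.add_node(99)         # isolated node, not reachable from the seed
colors = depth_colors(G, 0, ["r", "b", "g"])
```

The list can then be passed as node_color to nx.draw_networkx, since it follows the order of G.nodes. On a DiGraph the depths follow edge direction, so nodes that only point towards the seed count as unreachable.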

partition graph into subgraphs based on node's attribute NetworkX

I'm using NetworkX to compute some measures of a graph such as diameter, clustering coefficient, etc. It's straightforward how to do this for the graph as a whole. What I'm interested in is finding these measures between nodes that have the same attribute (say color). I'm thinking that if I could partition the graph into different subgraphs, where the nodes in each subgraph are of the same color, then I could go ahead and measure the diameter in each subgraph. So my question is: is there a way to partition a graph into subgraphs which contain nodes of the same color?
I would really appreciate any insight.
Use Graph.subgraph(nodes)
NetworkX 2.x+:
Demo
import networkx as nx
G = nx.Graph()
G.add_nodes_from([1, 2, 3], color="red")
G.add_nodes_from([4, 5, 6])
G.nodes # NodeView((1, 2, 3, 4, 5, 6))
# create generator
nodes = (
    node
    for node, data
    in G.nodes(data=True)
    if data.get("color") == "red"
)
subgraph = G.subgraph(nodes)
subgraph.nodes # NodeView((1, 2, 3))
Older NetworkX (1.x):
Iterate over the nodes (Graph.nodes_iter()) and filter them based on your criteria. Pass the result to Graph.subgraph() and it'll return a copy of those nodes and their internal edges.
For example:
G = nx.Graph()
# ... build or do whatever to the graph
nodes = (n for n, d in G.nodes_iter(data=True) if d.get('color') == 'red')
subgraph = G.subgraph(nodes)
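To partition on every value of the attribute at once, the nodes can first be grouped by attribute value and then turned into one subgraph per group. This is a sketch for NetworkX 2.x or later; the helper name subgraphs_by_attribute is an assumption for illustration, not a NetworkX function.

```python
from collections import defaultdict

import networkx as nx


def subgraphs_by_attribute(G, attr):
    """Return {attribute value: subgraph of the nodes carrying that value}."""
    groups = defaultdict(list)
    for node, data in G.nodes(data=True):
        groups[data.get(attr)].append(node)  # nodes without the attribute go under None
    return {value: G.subgraph(nodes) for value, nodes in groups.items()}


G = nx.Graph()
G.add_nodes_from([0, 1, 2], color="red")
G.add_nodes_from([3, 4, 5], color="blue")
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])

parts = subgraphs_by_attribute(G, "color")
# nx.diameter raises an error if a subgraph is not connected.
red_diameter = nx.diameter(parts["red"])
```

Edges between differently colored nodes, such as (2, 3) here, are dropped from every part, so a color class may end up disconnected in a real graph.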

Colouring edges by weight in networkx

I have only found something similar to what I want here:
Coloring networkx edges based on weight
However I can't seem to apply this to my problem. I have a graph with weighted edges, but the weights aren't unique (so there are like 15 edges with weight 1). I want to colour my edges based on the weight they have, the lower the weight the lighter the colour.
I tried to apply the method suggested in the above question, but from what I understand this requires the weights to be unique on each edge?
So far I've produced a list in ascending order of the different edge weights and wanted to use this to classify the possible edge colours. I'm trying to avoid drawing the edges by weight as I may need to draw a very large graph in the future with a huge range of weights on the edges.
If it's unclear let me know in comments and I'll give more specific info.
Thanks!
EDIT:
def draw_graph(target):
    nlist = [target] + list(G.neighbors(target))  # list() is needed on NetworkX 2.x+
    H = nx.subgraph(G, nlist)
    n = H.number_of_edges()
    colours = range(n)
    labels, weights = colour_and_label_edges(H)
    pos = nx.spring_layout(H)
    nx.draw(H, pos, node_color='#A0CBE2', edge_color=colours, node_size=100, edge_cmap=plt.cm.Blues, width=0.5, with_labels=False)
    nx.draw_networkx_edge_labels(H, pos, edge_labels=labels)
    plt.savefig("Graphs/edge_colormap_%s.png" % target)  # save as png
    plt.show()  # display

def colour_and_label_edges(graph):
    d = {}
    for (u, v) in graph.edges():
        d[u, v] = graph[u][v]['weight']
    temp = []
    for val in d.values():
        if val not in temp:
            temp.append(val)
    weights = sorted(temp, key=int)
    return d, weights
The above code is incomplete, but the idea is the function gives me a list of the weights, as so:
[1, 2, 3, 4, 5, 6, 9, 10, 16, 21, 47, 89, 124, 134, 224]
I then want to use this list to assign each weight a colour, the higher the weight the darker the colour. (I've used a very small subgraph for this example relative to the data set). Hope that clears it up a little :S
You can use the edge weights and a colormap to draw them. You might want a different colormap from the one below.
import matplotlib.pyplot as plt
import networkx as nx
import random
G = nx.gnp_random_graph(10,0.3)
for u, v, d in G.edges(data=True):
    d['weight'] = random.random()
edges,weights = zip(*nx.get_edge_attributes(G,'weight').items())
pos = nx.spring_layout(G)
nx.draw(G, pos, node_color='b', edgelist=edges, edge_color=weights, width=10.0, edge_cmap=plt.cm.Blues)
plt.savefig('edges.png')
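Weights do not have to be unique for this to work: edges with equal weight simply get the same colour. If the weights span a huge range, one possible variation is to colour by the rank of each weight among the distinct weights instead of by the raw value. The helper weight_ranks below is a hypothetical name used only for this sketch.

```python
import networkx as nx


def weight_ranks(G, attr="weight"):
    """Return (edges, ranks): rank 0 is the lowest distinct weight, equal weights share a rank."""
    weights = nx.get_edge_attributes(G, attr)
    order = {w: i for i, w in enumerate(sorted(set(weights.values())))}
    edges = list(weights)
    ranks = [order[weights[e]] for e in edges]
    return edges, ranks


G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 1), (2, 3, 1), (3, 4, 47), (4, 1, 224)])
edges, ranks = weight_ranks(G)
rank_of = {frozenset(e): r for e, r in zip(edges, ranks)}
```

Passing edgelist=edges and edge_color=ranks to nx.draw, together with edge_cmap=plt.cm.Blues, should then give the lightest colour to the lowest weight and evenly spaced shades for the rest.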
