How to find nodes with Python string matching functions in Networkx? - python

Given a dependency parse graph, if I want to find the shortest path length between two fixed nodes, this is how I've coded it:
nx.shortest_path_length (graph, source='cost', target='20.4')
My question here is: What if I want to match for all sentences in the graph or collection a target with any number formatted approximately as a currency? Would I have to first find every node in the graph that is a currency, and then iterate over the set of currency values?
It would be ideal to have:
nx.shortest_path_length (graph, source='cost', target=r'^[$€£]?(\d+([\.,]00)?)$')
Or, from @bluepnume: ^[$€£]?((([1-5],?)?\d{2,3}|[5-9])(\.\d{2})?)$

You could do it in two steps, without calling shortest_path_length once per currency node.
Step 1: Calculate the shortest distance from your 'cost' node to all reachable nodes.
Step 2: Subset (using regex) just the currency nodes that you are interested in.
Here's an example to illustrate.
import networkx as nx
import matplotlib.pyplot as plt
import re
g = nx.DiGraph()
#create a dummy graph for illustration
g.add_edges_from([('cost','apples'),('cost', 'of'),
('$2', 'pears'),('lemon', '£1.414'),
('apples', '$2'),('lemon', '£1.414'),
('€3.5', 'lemon'),('pears', '€3.5'),
], distance=0.5) # using a list of edge tuples & specifying distance
g.add_edges_from([('€3.5', 'lemon'),('of', '€3.5')],
distance=0.7)
nx.draw(g, with_labels=True)
which produces a drawing of the graph.
Now, you can calculate the shortest paths to your nodes of interest, subsetting using regex like you wanted to.
paths = nx.single_source_dijkstra_path(g, 'cost', weight='distance')
lengths = nx.single_source_dijkstra_path_length(g, 'cost', weight='distance')
# keep only the reachable nodes whose label contains a currency symbol
currency_nodes = [n for n in lengths if re.search(r'[$€£]', n)]
[(n, length) for (n, length) in lengths.items() if n in currency_nodes]
produces:
[('$2', 1.0), ('€3.5', 1.2), ('£1.414', 2.4)]
Hope that helps you move forward.
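The two steps above can also be wrapped in a small helper that takes a regex instead of a fixed target. This is only a sketch: the function name shortest_lengths_to_matching and the sample sentence graph are made up for illustration, and the pattern is a simplified currency regex, not the ones quoted in the question.

```python
import re
import networkx as nx

def shortest_lengths_to_matching(graph, source, pattern, weight=None):
    """Return {node: length} for nodes reachable from `source` whose label matches `pattern`."""
    regex = re.compile(pattern)
    if weight is None:
        # unweighted: count hops
        lengths = nx.single_source_shortest_path_length(graph, source)
    else:
        # weighted: use the named edge attribute
        lengths = nx.single_source_dijkstra_path_length(graph, source, weight=weight)
    # only string labels can be tested against the regex
    return {n: dist for n, dist in lengths.items()
            if isinstance(n, str) and regex.search(n)}

# hypothetical dependency-parse-like graph
g = nx.DiGraph()
g.add_edges_from([("cost", "of"), ("of", "$20.40"),
                  ("cost", "apples"), ("apples", "are"), ("are", "€3")])

matches = shortest_lengths_to_matching(g, "cost", r"^[$€£]\d+([.,]\d{2})?$")
print(matches)
```

The shortest-path search runs once per source, and the regex only filters the result, so this should scale better than one shortest_path_length call per candidate target.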

Related

Osmnx returns just the start and destination OSM ids when finding route between them

To get the route between two coordinates using osmnx, I used the following code:
import osmnx as ox
ox.config(use_cache=True, log_console=True)
G = ox.graph_from_place('Sydney,New South Wales,Australia', network_type='drive')
import networkx as nx
# find the nearest node to the start location
orig_node = ox.get_nearest_node(G,(intersections['lat'][0],intersections['lon'][0]))
# find the nearest node to the end location
dest_node = ox.get_nearest_node(G,(intersections['lat'][1],intersections['lon'][1]))
shortest_route=nx.shortest_path(G,orig_node,dest_node,weight='time')
where intersections is a dataframe that contains the latitude and longitude of various intersections in Sydney.
intersections['lat'][0],intersections['lon'][0] represents the latitude and longitude of the 0th entry and so on.
When I plot this, I do get the appropriate results (plot showing the route).
I get the OSM ids of the points in these routes as:
[771347, 1612748582]
But these seem to be the start and destination points themselves.
Is there any way I can get all the coordinates in the route shown in the image above using osmnx itself? I'm aware I can use various APIs for this, but since I have 75000 points and need to find the routes between all of them (along with the coordinates that form each route), I would like a more efficient solution.
nx.shortest_path() returns a list of OSM ids of the nodes that form the route.
In osmnx, you can get OSM's nodes information using ox.graph_to_gdfs() that will return a GeoDataFrame with all the nodes of the graph.
Once you have all the nodes in a GeoDataFrame, you can easily extract the coordinates:
# Get the nodes given the graph, as a GeoDataFrame
nodes = ox.graph_to_gdfs(G, nodes=True, edges=False)
# Extract only the nodes that form your route, in route order
# (the nodes GeoDataFrame is indexed by OSM node id)
route_nodes = nodes.loc[shortest_route]
# Store the route coordinates in a DataFrame, keeping only the useful columns
# (OSMnx stores longitude as 'x' and latitude as 'y')
route_df = route_nodes[['x', 'y']]
"But these seem to be the start and destination points itself. Is there any way I can get all the coordinates in the route shown in the image above using osmnx itself."
You are getting all the nodes in the route. The origin and destination you provided as an example (OSM IDs 771347 and 1612748582) are adjacent nodes in the graph, because there is no other node between those two off-ramps along this expressway. Hence it is a path comprising only those two nodes.
"I need to find the routes between all these points (along with the coordinates that form the route)"
First off, you are using a very old version of OSMnx with several deprecated or obsolete functions in it. Your code can be much more efficient if you refactor it, and even more efficient still if you upgrade to the latest version (v1.2.2 as of this writing) to use its newer functionality. For example, you can find all your nearest nodes at once, using an efficient index, with the ox.nearest_nodes function. And you can solve all your shortest paths in parallel, using multiprocessing, with the ox.shortest_path function.
import osmnx as ox
ox.settings.use_cache = True
ox.settings.log_console = True
# get graph and add free-flow travel times to its edges
G = ox.graph_from_place("Sydney, New South Wales, Australia", network_type="drive")
G = ox.add_edge_travel_times(ox.add_edge_speeds(G))
# randomly select 1,000 origin/destination points, just for example
n = 1000
points = ox.utils_geo.sample_points(G, n)
orig_points = points[:int(n / 2)]
dest_points = points[int(n / 2):]
# find nearest node to each origin/dest point, then solve paths in parallel
orig_nodes = ox.nearest_nodes(G, X=orig_points.x, Y=orig_points.y)
dest_nodes = ox.nearest_nodes(G, X=dest_points.x, Y=dest_points.y)
paths = ox.shortest_path(G, orig_nodes, dest_nodes, weight="travel_time", cpus=None)
# convert node paths to lat-lng paths
paths_latlng = [[(G.nodes[node]["y"], G.nodes[node]["x"]) for node in path] for path in paths]
Note that if you have unsolvable routes in your graph (e.g., due to one-way edges and artificial perimeter effects), you'll need to handle (ignore/skip) those nulls when you convert nodes to lat-lng coordinates.
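One way to handle those nulls, as a minimal sketch: it assumes unsolved routes show up as None in the list of paths, and the helper name paths_to_latlng is made up here. A tiny hand-built graph stands in for the OSMnx graph so the snippet runs without downloading anything.

```python
import networkx as nx

def paths_to_latlng(G, paths):
    """Convert node-id paths to lists of (lat, lng); unsolved paths (None) stay None."""
    result = []
    for path in paths:
        if path is None:
            # keep a placeholder so results stay aligned with the origin/destination pairs
            result.append(None)
        else:
            result.append([(G.nodes[n]["y"], G.nodes[n]["x"]) for n in path])
    return result

# stand-in for an OSMnx graph: nodes carry x (longitude) and y (latitude)
G = nx.MultiDiGraph()
G.add_node(1, y=-33.86, x=151.20)
G.add_node(2, y=-33.87, x=151.21)
G.add_edge(1, 2)

converted = paths_to_latlng(G, [[1, 2], None])
print(converted)
```

Keeping None in place (rather than dropping it) means converted[i] still corresponds to the i-th origin/destination pair.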

Add node between existing edge in Networkx Graph generated by OSMnx

I have sensor location data from Highways England. I want to add these sensor locations to an OSM MultiDiGraph. How can I do that?
import numpy as np
import pandas as pd
import networkx as nx
from shapely.geometry import Point, Polygon, LineString
import geopandas as gpd
import osmnx as ox
Graph data is
graph = ox.graph.graph_from_bbox(52.2, 51.85, -.6, -0.9, network_type='drive', simplify=False)
I want to add sensor = Point(-0.6116768, 51.8508765) to the edge nearest to it. The nearest edge to this sensor is n_edge = ox.distance.nearest_edges(graph, -0.6116768, 51.8508765, return_dist=False). Now I need to bend n_edge so that it passes through the sensor point.
I found a way to do this by creating a new node in the graph with graph.add_node('sensor25', y=51.8508765, x=-0.6116768, street_count=2) and then calling graph.add_edges_from([(n_edge[0], 'sensor25'), ('sensor25', n_edge[1])]). However, the node I created (sensor25) is not identical to the other nodes. How can I make this node similar to the existing nodes?
I have gone through the following questions:
add attribute to node
add new node to existing edge in networkx
add random nodes on edges manually.
I'm not 100% certain what you need. As I understand it, you want to add new edges whose attributes (speed limit, length, street number, one-way) are copied from the edge you delete?
I assume that some of these attributes can be copied 1:1, like one-way, while others will have to be recalculated. For simplicity, let's assume we have a function d(a, b) that takes graph nodes a and b, extracts their positions, and calculates the straight-line distance between them. Define other functions as required.
Then you could e.g. define the new edge like this:
# Get from/to node ids and the key of the closest edge
# (for a single point, nearest_edges returns one (u, v, key) tuple)
f, t, k = ox.distance.nearest_edges(graph, -0.6116768, 51.8508765, return_dist=False)
c = 'sensor25' # Id of new node, c as in 'center'
edge_attrs = dict(graph.edges[f, t, k]) # Copy edge attributes
graph.remove_edge(f, t, k) # Remove edge from graph
graph.add_node(c, y=51.8508765, x=-0.6116768, street_count=2)
# Add new edges, recalculating attributes as required
graph.add_edge(f, c, **{**edge_attrs, 'length': d(f, c)})
graph.add_edge(c, t, **{**edge_attrs, 'length': d(c, t)})
Hope the syntax is clear; otherwise, ask. It copies edge_attrs 1:1, except for attributes you specify afterwards, like length. You will probably have to define multiple functions like d that also calculate the geometry, etc.
The code isn't tested.
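For the distance function d mentioned above, a possible sketch is the haversine (great-circle) formula applied to the x (longitude) and y (latitude) node attributes. Note the assumptions: this version takes the graph as an extra first argument, the Earth radius constant is a chosen mean value, and the two-node graph is only a stand-in for the OSMnx graph.

```python
import math
import networkx as nx

EARTH_RADIUS_M = 6_371_009  # assumed mean Earth radius in meters

def d(graph, a, b):
    """Great-circle distance in meters between nodes a and b (x = lon, y = lat in degrees)."""
    lat1 = math.radians(graph.nodes[a]["y"])
    lon1 = math.radians(graph.nodes[a]["x"])
    lat2 = math.radians(graph.nodes[b]["y"])
    lon2 = math.radians(graph.nodes[b]["x"])
    # haversine formula
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

# stand-in graph: two points on the equator, one degree of longitude apart
g = nx.MultiDiGraph()
g.add_node("a", y=0.0, x=0.0)
g.add_node("b", y=0.0, x=1.0)

print(round(d(g, "a", "b")))  # roughly 111 km, printed in meters
```

To get the two-argument form used in the answer, you could bind the graph first, e.g. with functools.partial(d, graph).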

Is it possible to control the order that nodes are drawn using NetworkX in python?

I have a large graph object with many nodes that I am trying to graph. Due to the large number of nodes, many are being drawn one over another. This in itself is not a problem. However, a small percentage of nodes have node attributes which dictate their colour.
Ideally I would be able to draw the graph in such a way that nodes with this property are drawn last, on top of the other nodes, so that it is possible to see their distribution across the graph.
The code I have so far used to generate the graph is shown below:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import os
import pickle
from pathlib import Path
def openFileAtPath(filePath):
    print('Opening file at: ' + filePath)
    with open(filePath, 'rb') as input:
        file = pickle.load(input)
    return file
# Pre manipulation path
g = openFileAtPath('../initialGraphs/wordNetadj_dictionary1.11.pkl')
# Post manipulation path
# g = openFileAtPath('../manipulatedGraphs/wordNetadj_dictionary1.11.pkl')
print('Fetching SO scores')
scores = list()
for node in g.nodes:
    # g.node was removed in NetworkX 2.4; g.nodes is the current accessor
    scores.append(g.nodes[node]['weight'])
print('Drawing network')
nx.draw(g,
        with_labels=False,
        cmap=plt.get_cmap('RdBu'),
        node_color=scores,
        node_size=40,
        font_size=8)
plt.show()
And currently the output is as shown:
This graph object itself has taken a relatively long time to generate and is computationally intensive, so ideally I wouldn't have to remake the graph from scratch.
However, I am fairly sure that the graph is drawn in the same order that the nodes were added to the graph object. I have searched for a way of changing the order that the nodes are stored within the graph object, but given directional graphs actually have an order, my searches always end up with answers showing me how to reverse the direction of a graph.
So, is there a way to dictate the order in which nodes are drawn, or alternatively, to change the order in which nodes are stored inside a graph object?
Potentially worthy of a second question, but the edges are also blocked out by the large number of nodes. Is there a way to draw the edges above the nodes behind them?
Piggybacking off Paul Brodersen's answer, if you want different nodes to be in the foreground and background, I think you should do the following:
For all nodes that belong in the same layer, draw the subgraph corresponding to the nodes, and set the zorder, as follows:
pos = {...} # some dictionary of node positions, required for the function below
H = G.subgraph(nbunch)
collection = nx.draw_networkx_nodes(H, pos)
collection.set_zorder(zorder)
Do this for every group of nodes that belong in the same level. It's tedious, but it will do the trick. Here is a toy example that I created after looking up this question as part of my own research:
import matplotlib as mpl
mpl.use('agg')
import pylab
import networkx as nx
G = nx.Graph()
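nx.add_path(G, [1, 2, 3, 4])  # G.add_path was removed in NetworkX 2.4
pos = {1 : (0, 0), 2 : (0.5, 0), 3 : (1, 0), 4 : (1.5, 0)}
for node in G.nodes():
    # draw each node as its own artist so it can get its own zorder
    H = G.subgraph([node])
    collection = nx.draw_networkx_nodes(H, pos)
    collection.set_zorder(node)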
pylab.plot([0, 2], [0, 0], zorder=2.5)
pylab.savefig('nodes_zorder.pdf', format='pdf')
pylab.close()
This makes a graph and then puts each node at a successively higher level going from left to right, so the leftmost node is farthest in the background and the rightmost node is farthest in the foreground. It then draws a straight line whose zorder is 2.5. As a result, the line comes in front of the two left nodes and behind the two right nodes. Here is the result.
draw is a wrapper around draw_networkx_nodes and draw_networkx_edges.
Unlike draw, the two functions return their respective artists (PathCollection and LineCollection, IIRC). These are your standard matplotlib artists, and as such their relative draw order can be controlled via their zorder attribute.
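Applied to the question, that could look roughly like the sketch below: background nodes, edges, and highlighted nodes are drawn as three separate artists and stacked with zorder. The "highlight" attribute, the colours, and the fixed positions are all made up for illustration, and it assumes an undirected graph, for which draw_networkx_edges is expected to return a single LineCollection.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import networkx as nx

G = nx.path_graph(6)
# hypothetical attribute marking the nodes that should end up on top
nx.set_node_attributes(G, {1: True, 4: True}, name="highlight")
pos = {n: (n % 3, n // 3) for n in G}  # fixed positions, just for the example

highlighted = [n for n, flag in G.nodes(data="highlight", default=False) if flag]
background = [n for n in G if n not in highlighted]

fig, ax = plt.subplots()
bg_nodes = nx.draw_networkx_nodes(G, pos, nodelist=background,
                                  node_color="lightgray", ax=ax)
edge_lines = nx.draw_networkx_edges(G, pos, ax=ax)
fg_nodes = nx.draw_networkx_nodes(G, pos, nodelist=highlighted,
                                  node_color="red", ax=ax)

# higher zorder is drawn later, i.e. on top
bg_nodes.set_zorder(1)
edge_lines.set_zorder(2)
fg_nodes.set_zorder(3)
```

Since the edges sit between the two node layers here, this also touches on the second part of the question (edges hidden behind nodes); swap the zorder values to taste.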

Counting node attributes using successors on a directed graph

I am currently working on a power distribution reliability index tool for radial networks for my engineering dissertation using NetworkX and Python. I am struggling to write a command which will add to my accumulator all node attributes downstream of a particular edge which meets a certain condition. I've tried using the successors feature NetworkX offers; however, it only counts the first successor after the edge that meets the condition instead of all nodes downstream along the directed path. I'm seeking guidance as this is confusing me and I can't seem to work my way around this simple task.
import networkx as nx
import matplotlib.pyplot as plt
H=nx.DiGraph()
H.add_node(1, loads=2)
H.add_node(2, loads=2)
H.add_node(3, loads=5)
H.add_node(4, loads=5)
H.add_edge(1,2,fault=True, switch=True)
H.add_edge(2,3,fault=False, switch=True)
H.add_edge(3,4,fault=False, switch=True)
nx.draw(H)
plt.show()
a=0
for n1,n2 in H.edges():
    if H[n1][n2]['fault']==True:
        a=a+H.nodes[n2]['loads']
        for n in H.successors(n2):
            a=a+H.nodes[n]['loads']
My algorithm returns a=7, whereas the correct answer would be a=12, and so on for all edges that meet the criteria. Obviously I am the one writing the wrong instruction.
If I understand your question correctly, you want all of the successors, so you can use a breadth-first search like this:
import networkx as nx
H=nx.DiGraph()
H.add_node(1, loads=2)
H.add_node(2, loads=2)
H.add_node(3, loads=5)
H.add_node(4, loads=5)
H.add_edge(1,2,fault=True, switch=True)
H.add_edge(2,3,fault=False, switch=True)
H.add_edge(3,4,fault=False, switch=True)
source = 1
# H.node and H.edge were removed in NetworkX 2.4; use H.nodes[...] and H[s][t]
a = H.nodes[source]['loads']
nofault = [t for s,t in nx.bfs_edges(H,source=source) if not H[s][t]['fault']]
a += sum(H.nodes[t]['loads'] for t in nofault)
print(a) #12
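A different way to phrase "everything downstream of a faulted edge" is nx.descendants, which returns all nodes reachable from a given node. The sketch below is not the BFS approach above but a swapped-in alternative; the helper name loads_downstream_of_faults is made up, and it reuses the question's toy network.

```python
import networkx as nx

H = nx.DiGraph()
H.add_node(1, loads=2)
H.add_node(2, loads=2)
H.add_node(3, loads=5)
H.add_node(4, loads=5)
H.add_edge(1, 2, fault=True, switch=True)
H.add_edge(2, 3, fault=False, switch=True)
H.add_edge(3, 4, fault=False, switch=True)

def loads_downstream_of_faults(G):
    """Map each faulted edge (u, v) to the total loads at v and everything reachable from v."""
    totals = {}
    for u, v, fault in G.edges(data="fault", default=False):
        if fault:
            downstream = {v} | nx.descendants(G, v)
            totals[(u, v)] = sum(G.nodes[n]["loads"] for n in downstream)
    return totals

totals = loads_downstream_of_faults(H)
print(totals)  # {(1, 2): 12}
```

Here the faulted edge is (1, 2), so the sum covers nodes 2, 3 and 4 (2 + 5 + 5 = 12) and leaves out the upstream node 1.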

Python: Network Spring Layout with different color nodes

I create a spring layout network of the shortest path from a given node. In this case firm1. I want to have a different color for each degree of separation. For instance, all the first edge connecting firm1 and the other firms, say firm2 and firm3, I would like to change the node color of firm2 and firm3 (same color for both). Then all the firms connected from firm2 and firm3, say firm4 and firm5 I want to change their node colors. But I don't know how to change the colors of the node for each degree of separation starting from firm1. Here's my code:
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
graph = nx.Graph()
with open('C:\\file.txt') as f: #Here, I load a text file with two columns indicating the connections between each firm
    for line in f:
        tic_1, tic_2 = line.split()
        graph.add_edge(tic_1, tic_2)
paths_from_1 = nx.shortest_path(graph, "firm1") #I get the shortest path starting from firm1
x = pd.DataFrame(paths_from_1.values()) #I convert the dictionary of the shortest path into a dataframe
tic_0=x[0].tolist() #there are 7 columns in my dataframe x and I convert each columns into a list. tic_0 is a list of `firm1` string
tic_1=x[1].tolist() #tic_1 is list of all the firms directly connected to firm1
tic_2=x[2].tolist() #tic_2 are the firms indirectly connected to firm1 via the firms in tic_1
tic_3=x[3].tolist() #and so on...
tic_4=x[4].tolist()
tic_5=x[5].tolist()
tic_6=x[6].tolist()
l = len(tic_0)
graph = nx.Graph()
for i in range(len(tic_0)):
    graph.add_edge(tic_0[i], tic_1[i])
    graph.add_edge(tic_1[i], tic_2[i])
    graph.add_edge(tic_2[i], tic_3[i])
    graph.add_edge(tic_3[i], tic_4[i])
    graph.add_edge(tic_4[i], tic_5[i])
    graph.add_edge(tic_5[i], tic_6[i])
pos = nx.spring_layout(graph_short, iterations=200, k=)
nx.draw(graph_short, pos, font_size='6',)
plt.savefig("network.png")
plt.show()
How can I have different node colors for each degree of separation? In other words, all the firms in tic_1 should have a blue node, all the firms in tic_2 should have a yellow node, etc.
The generic way to do this is to run the shortest path length algorithm from a source node to assign the colors. Here is an example:
import matplotlib.pyplot as plt
import networkx as nx
G = nx.balanced_tree(2,5)
length = nx.shortest_path_length(G, source=0)
nodelist,hops = zip(*length.items())
# graphviz_layout now lives in nx.nx_agraph (requires pygraphviz)
positions = nx.nx_agraph.graphviz_layout(G, prog='twopi', root=0)
nx.draw(G, positions, nodelist = nodelist, node_color=hops, cmap=plt.cm.Blues)
plt.axis('equal')
plt.show()
You could use
positions = nx.spring_layout(G)
instead. I used the graphviz twopi layout since it does a better job of drawing the balanced tree I used.
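If you want specific named colours per degree of separation (as in the question) rather than a colormap, the hop counts can be mapped through a palette. This is a sketch under assumptions: the helper name colors_by_hops, the palette, and the small firm graph are all invented for illustration.

```python
import networkx as nx

def colors_by_hops(graph, source, palette, default="lightgray"):
    """Return (nodelist, colors) where each node's colour is palette[hops from source]."""
    hops = nx.single_source_shortest_path_length(graph, source)
    nodelist = list(graph.nodes)
    colors = [
        palette[hops[n]] if n in hops and hops[n] < len(palette) else default
        for n in nodelist
    ]
    return nodelist, colors

# hypothetical firm network
graph = nx.Graph([("firm1", "firm2"), ("firm1", "firm3"),
                  ("firm2", "firm4"), ("firm3", "firm5")])
palette = ["red", "blue", "yellow"]  # index = degree of separation from the source

nodelist, colors = colors_by_hops(graph, "firm1", palette)
print(dict(zip(nodelist, colors)))

# To draw it:
# nx.draw(graph, nx.spring_layout(graph), nodelist=nodelist, node_color=colors)
```

Nodes that are unreachable from the source, or farther away than the palette is long, fall back to the default colour.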
