How to make a for loop with if conditions faster - Python

I have written a piece of code, but it takes an enormous amount of time to run and I don't know how to make it faster. Can anyone help me?
Here is the code:
import networkx as nx
import matplotlib.pyplot as plt
import osmnx as ox
import pandas as pd
from shapely.wkt import loads as load_wkt
import numpy as np
import matplotlib.cm as cm
ox.config(log_console=True, use_cache=True)
import matplotlib as mpl
import random as rd
distrito = ['Lisbon District', 'Setúbal District']
G = ox.graph_from_place(distrito, network_type='all_private')
hospitals = ox.pois_from_place(distrito, amenities=['hospital'])
coord_1 = (38.74817825481225, -9.160815118526642)
coord_2 = (38.74110711410615, -9.152159572392323)
coord_3 = (38.7287248180068, -9.139114834357233)
target_1 = ox.get_nearest_node(G, coord_1)
target_2 = ox.get_nearest_node(G, coord_2)
target_3 = ox.get_nearest_node(G, coord_3)
nodes, edges = ox.graph_to_gdfs(G, nodes=True, edges=True) # Transform nodes and edges into Geodataframes
travel_speed = 20 # km/h
meters_per_minute = travel_speed * 1000 / 60
nodes['shortest_route_length_to_target'] = 0
route_lengths = []
list_nodes = []
i = 0
# print(G.edges(data=True))
for u, v, k, data in G.edges(data=True, keys=True):
    data['time'] = data['length'] / meters_per_minute

for node in G.nodes:
    try:
        route_length_1 = nx.shortest_path_length(G, node, target_1, weight='time')
        route_length_2 = nx.shortest_path_length(G, node, target_3, weight='time')
        if route_length_1 < route_length_2:
            route_lengths.append(route_length_1)
            nodes['shortest_route_length_to_target'][node] = route_length_1
            list_nodes.append(node)
        elif route_length_1 > route_length_2:
            route_lengths.append(route_length_2)
            nodes['shortest_route_length_to_target'][node] = route_length_2
            list_nodes.append(node)
    except nx.exception.NetworkXNoPath:
        continue
Up until the for node in G.nodes: loop the code runs quite fast; it is this loop that is taking far too long.
Thank you in advance.

You should profile your code to identify exactly what is slow. It is almost certainly not the if conditions inside the loop (though they could be tidied up a little): nearly all of the runtime comes from solving two shortest-path problems on every iteration of the loop, which is inherently slow.
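For example, a minimal profiling sketch using only the standard library's cProfile, wrapped around just the slow loop, will confirm where the time goes:
import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()
# ... run the "for node in G.nodes:" loop from the question here ...
profiler.disable()
pstats.Stats(profiler).sort_stats('cumulative').print_stats(10)  # ten most expensive calls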
You could try calculating your shortest paths with something like igraph, which would be faster.
Note also that you do not need to manually calculate edge traversal times as of OSMnx v0.13.0.
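For illustration only, here is one possible rework (not the only one, and reusing the coordinates from the question): let OSMnx add edge speeds and travel times itself (available since v0.13.0, as used in the igraph question below), and replace the per-node shortest-path calls with a single-source Dijkstra from each target on the reversed graph, which yields the travel time from every node to that target in one pass.
import networkx as nx
import osmnx as ox
G = ox.graph_from_place(['Lisbon District', 'Setúbal District'], network_type='all_private')
ox.speed.add_edge_speeds(G, hwy_speeds=20, fallback=20)  # requires OSMnx >= 0.13
ox.speed.add_edge_travel_times(G)
target_1 = ox.get_nearest_node(G, (38.74817825481225, -9.160815118526642))
target_3 = ox.get_nearest_node(G, (38.7287248180068, -9.139114834357233))
# Travel time *to* a target equals travel time *from* it on the reversed graph,
# so one Dijkstra per target covers every node at once.
G_rev = G.reverse()
to_target_1 = nx.single_source_dijkstra_path_length(G_rev, target_1, weight='travel_time')
to_target_3 = nx.single_source_dijkstra_path_length(G_rev, target_3, weight='travel_time')
shortest_route_length_to_target = {
    node: min(to_target_1.get(node, float('inf')), to_target_3.get(node, float('inf')))
    for node in G.nodes
}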

Related

Calculation of the pressure force on the wing profile via SALOME and python

I am trying to calculate the pressure force F = pressure * length (because it is 2D) on a wing profile.
I used the MEDCOUPLING agitator example, but I can't adapt it correctly.
It seems to me that I have to:
- retrieve the mesh from the corresponding .med file
- retrieve the corresponding pressure field
- turn it into an array
The problem is that:
- I can't get the edge of the airfoil in order to measure it
- I can't determine and select the pressures that apply to this edge
In your opinion, should I first extract the cells corresponding to the edge and then extract the corresponding pressure, or do you see other possibilities?
Thanks
import sys
import salome
import medcoupling as mc
import numpy as np
salome.salome_init()
import salome_notebook
notebook = salome_notebook.NoteBook()
sys.path.insert(0, r'/home/b72882/Documents')
import GEOM
from salome.geom import geomBuilder
import math
import SALOMEDS
import SMESH, SALOMEDS
from salome.smesh import smeshBuilder
ailMesh2DMc=mc.ReadUMeshFromFile("naca10.med", "MeshFlow10", 0)
m2DSurf,desc,descI,revDesc,revDescI = ailMesh2DMc.buildDescendingConnectivity()
nbOf2DCellSharing = revDescI.deltaShiftIndex()
ids2 = nbOf2DCellSharing.findIdsEqual(1)
ailSkinMc = m2DSurf[ids2]
f = mc.MEDCouplingFieldDouble.New(mc.ON_CELLS, mc.ONE_TIME)
f.setTime(1,200,201)
f.setArray(ailMesh2DMc.computeCellCenterOfMass())
f.setMesh(ailMesh2DMc)
f.setName("Pressure")
mc.WriteField("naca10.med",f,True)
f2 = mc.ReadFieldCell("naca10.med", f.getMesh().getName(), 0, f.getName(), 200, 201)
arr = f2.getArray()
arr=mc.DataArrayDouble(ailMesh2DMc.getNumberOfCells(),1)
ids = arr.findIdsInRange(-6.,6.)
pressOnAilMc = f2[ids]
offsetsOfTupleIdsInField = revDescI[ids]
tupleIdsInField = revDesc[offsetsOfTupleIdsInField]
pressOnSkinAilMc = pressOnAilMc[tupleIdsInField]
pressOnSkinAilMc.setMesh(ailSkinMc)
pressSkin = pressOnAilMc.getArray()
pressSkin *= 1e5
areaSkin = ailSkinMc.getMeasureField(True).getArray()
forceSkin = pressSkin*areaSkin
normalSkin = ailSkinMc.buildOrthogonalField().getArray()
forceVectSkin = forceSkin*normalSkin
# mesh1d,lskin,di,r,ri = ailMesh2DMc.explodeIntoEdges()
force = forceVectSkin/lSkin  # note: lSkin is undefined as posted; it was presumably meant to come from the commented-out explodeIntoEdges() call above

Is there a better way to solve this MINLP in pyscipopt?

I'm trying to solve the following MINLP, basically attempting to maximize the likelihood of a certain portfolio reaching a "ceiling" performance. My first attempt at the code is below.
EDIT: Math says maximize, should say minimize
from pyscipopt import Model, quicksum
import numpy as np
import pandas as pd
from random import uniform, normalvariate
model=Model()
t=20000
stocks_portfolio = {}
stocks_df = pd.DataFrame(np.zeros((150,4)), columns=['ids','Mean','cost','stdev'])
noptions = len(stocks_df)
stocks_df['ids'] = [i for i in range(noptions)]
stocks_df['Mean'] = [uniform(500,2500) for i in range(noptions)]
stocks_df['cost'] = [stocks_df.loc[i,'Mean']*uniform(50,250) for i in range(noptions)]
stocks_df['stdev'] = [stocks_df.loc[i,'Mean']*uniform(0.2,0.5) for i in range(noptions)]
cov_mat = np.array([[normalvariate(0,0.3) for i in range(noptions)] for j in range(noptions)])
for i in range(len(stocks_df)):
    stocks_portfolio[i] = model.addVar(vtype='B')
model.addCons(quicksum(stocks_portfolio[i] for i in range(noptions))==15)
model.addCons(quicksum(stocks_df.loc[i, 'cost']*stocks_portfolio[i] for i in range(noptions)) <= 600000)
stand_in = model.addVar(vtype='C')
model.addCons(stand_in>=(t-quicksum(stocks_df.loc[i,'Mean']*stocks_portfolio[i] for i in range(noptions)))/((quicksum(stocks_portfolio[i]*stocks_df.loc[i,'stdev']**2 for i in range(noptions))+quicksum(2*stocks_portfolio[i]*stocks_portfolio[j]*cov_mat[i,j] for i in range(noptions) for j in range(noptions)))**0.5))
model.setObjective(stand_in,'minimize')
model.optimize()
model.getCondition()
portfolios = []
for i in range(noptions):
    if model.getVal(stocks_portfolio[i]) > 0.9:
        portfolios.append(i)
The performance here has been slow and unwieldy, and I was wondering if I'm thinking about the question all wrong.

How to optimize Shapely and Sklearn code?

I am working with a dataset of 4.2 million points and my code already takes a while to process, but the code below is taking several hours (it was provided in another public question; basically it finds the linestring nearest to a point, finds the nearest point on that linestring, and computes the distance).
The code actually does an awesome job, but it takes too long for its purpose. How can I optimize it, or do the same thing in less time?
import geopandas as gpd
import numpy as np
from shapely.geometry import Point, LineString
from shapely.ops import nearest_points
from sklearn.neighbors import DistanceMetric
EARTH_RADIUS_IN_MILES = 3440.1 #NAUTICAL MILES
panama = gpd.read_file("/Users/Danilo/Documents/Python/panama_coastline/panama_coastline.shp")
# b, data and dtc are assumed to be defined earlier: the number of points,
# the table of point coordinates, and the output list of distances
for c in range(b):
    # p = Point(-77.65325423107359, 9.222038196656131)
    p = Point(data['longitude'][c], data['latitude'][c])
    def closest_line(point, linestrings):
        return np.argmin([p.distance(linestring) for linestring in panama.geometry])
    closest_linestring = panama.geometry[closest_line(p, panama.geometry)]
    closest_point = nearest_points(p, closest_linestring)
    dist = DistanceMetric.get_metric('haversine')
    points_as_floats = [np.array([pt.x, pt.y]) for pt in closest_point]
    haversine_distances = dist.pairwise(np.radians(points_as_floats), np.radians(points_as_floats))
    haversine_distances *= EARTH_RADIUS_IN_MILES
    dtc1 = haversine_distances[0][1]
    dtc.append(dtc1)
Edit: Simplify to single calculation with BallTree
Imports
import pandas as pd
import geopandas as gpd
import numpy as np
from shapely.geometry import Point, LineString
from shapely.ops import nearest_points
Read Panama
panama = gpd.read_file("panama_coastline/panama_coastline.shp")
Get all points, long,lat format:
def get_points_as_numpy(geom):
    work_list = []
    for g in geom:
        work_list.append(np.array(g.coords))
    return np.concatenate(work_list)
all_coastline_points = get_points_as_numpy(panama.geometry)
Create Balltree
from sklearn.neighbors import BallTree
import numpy as np
panama_radians = np.radians(np.flip(all_coastline_points,axis=1))
tree = BallTree(panama_radians, leaf_size=12, metric='haversine')
Create 1M random points:
mean = [8.5,-80]
cov = [[1,0],[0,5]] # diagonal covariance, points lie on x or y-axis
random_gps = np.random.multivariate_normal(mean,cov,(10**6))
random_points = pd.DataFrame( {'lat' : random_gps[:,0], 'long' : random_gps[:,1]})
random_points.head()
Calculate closest coast point (<30 Seconds on my machine)
distances, index = tree.query( np.radians(random_gps), k=1)
Put results in DataFrame
EARTH_RADIUS_IN_MILES = 3440.1  # Earth radius in nautical miles, as above
random_points['distance_to_coast'] = distances * EARTH_RADIUS_IN_MILES
random_points['closest_lat'] = all_coastline_points[index][:,0,1]
random_points['closest_long'] = all_coastline_points[index][:,0,0]

Igraph shortest path gives an infinite value

I am trying to calculate the distance between a node and two targets; afterwards I compare the lengths of the calculated routes and save the smaller one in a list. I know I can use networkx.shortest_path(), however that solution takes a long time and the code takes too long to run. For this reason I opted to use igraph. Here is the code:
import networkx as nx
import matplotlib.pyplot as plt
import osmnx as ox
import pandas as pd
from shapely.wkt import loads as load_wkt
import numpy as np
import matplotlib.cm as cm
import igraph as ig
import matplotlib as mpl
import random as rd
ox.config(log_console=True, use_cache=True)
city = 'Portugal, Lisbon'
G = ox.graph_from_place(city, network_type='drive')
G_nx = nx.relabel.convert_node_labels_to_integers(G)
ox.speed.add_edge_speeds(G_nx, hwy_speeds=20, fallback=20)
ox.speed.add_edge_travel_times(G_nx)
weight = 'travel_time'
coord_1 = (38.74817825481225, -9.160815118526642) # Coordenada Hospital Santa Maria
coord_2 = (38.74110711410615, -9.152159572392323) # Coordenada Hopstial Curry Cabral
coord_3 = (38.7287248180068, -9.139114834357233) # Hospital Dona Estefania
coord_4 = (38.71814053423293, -9.137885476529883) # Hospital Sao Jose
target_1 = ox.get_nearest_node(G_nx, coord_1)
target_2 = ox.get_nearest_node(G_nx, coord_2)
target_3 = ox.get_nearest_node(G_nx, coord_3)
target_4 = ox.get_nearest_node(G_nx, coord_4)
G_ig = ig.Graph(directed=True)
G_ig.add_vertices(list(G_nx.nodes()))
G_ig.add_edges(list(G_nx.edges()))
G_ig.vs['osmid'] = list(nx.get_node_attributes(G_nx, 'osmid').values())
G_ig.es[weight] = list(nx.get_edge_attributes(G_nx, weight).values())
assert len(G_nx.nodes()) == G_ig.vcount()
assert len(G_nx.edges()) == G_ig.ecount()
route_length=[]
list_nodes=[]
for node in G_nx.nodes:
    length_1 = G_ig.shortest_paths(source=node, target=target_1, weights=weight)[0][0]
    length_2 = G_ig.shortest_paths(source=node, target=target_2, weights=weight)[0][0]
    if length_1 < length_2:
        route_length.append(length_1)
    else:
        route_length.append(length_2)
    list_nodes.append(node)
If you print the list with the lengths of the routes some values will be 'inf' which obviously doesn't make sense. Can anyone help me understand why the length would be inf?
As Vincent Traag said, the distance between two disconnected nodes is inf. So for such results the node and the target are simply not connected.
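A minimal sketch, reusing the names from the code above, that skips those unreachable nodes instead of recording an infinite length:
import math
for node in G_nx.nodes:
    length_1 = G_ig.shortest_paths(source=node, target=target_1, weights=weight)[0][0]
    length_2 = G_ig.shortest_paths(source=node, target=target_2, weights=weight)[0][0]
    best = min(length_1, length_2)
    if math.isinf(best):
        continue  # the node cannot reach either target, so there is no finite route length
    route_length.append(best)
    list_nodes.append(node)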

How do I use the shortest_paths_dijkstra function properly from igraph?

This is what I have tried
def weighted_path(g, u, v):
    x = g.shortest_paths_dijkstra(source=u, target=v, weights=True)
    eff = 1/x
    return eff
How do I use it properly? I have no idea as to how to use igraph properly and can't really find the documentation.
Assuming that you want the nodal efficiency for all nodes, you can do this:
import numpy as np
from igraph import *
np.seterr(divide='ignore')
# Example using a random graph with 20 nodes
g = Graph.Erdos_Renyi(20,0.5)
# Assign weights on the edges. Here 1s everywhere
g.es["weight"] = np.ones(g.ecount())
def nodal_eff(g):
    weights = g.es["weight"][:]
    sp = 1.0 / np.array(g.shortest_paths_dijkstra(weights=weights))
    np.fill_diagonal(sp, 0)
    N = sp.shape[0]
    ne = (1.0/(N-1)) * np.apply_along_axis(sum, 0, sp)
    return ne
eff = nodal_eff(g)
print(eff)
#[0.68421053 0.81578947 0.73684211 0.76315789 0.76315789 0.71052632
# 0.81578947 0.81578947 0.81578947 0.73684211 0.71052632 0.68421053
# 0.71052632 0.81578947 0.84210526 0.76315789 0.68421053 0.68421053
# 0.78947368 0.76315789]
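If you only need the efficiency between a single pair of nodes, a corrected sketch of the helper from the question (assuming the weights are stored in the 'weight' edge attribute, as above) would be:
def weighted_path(g, u, v):
    # weights expects an edge attribute name or a list, not True, and
    # shortest_paths_dijkstra returns a matrix (list of lists), so take [0][0]
    x = g.shortest_paths_dijkstra(source=u, target=v, weights="weight")[0][0]
    return 1.0 / x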
