In the following working, finished code, I would like to understand why the update is written as pq_update[neighbour][1] = dist.
When I write pq_update[neighbour] = dist instead (which is how I did it), nothing seems to change, so why is the [1] included?
Thank you
import heapq

def dijkstra(graph, start):
    distances = {vertex:float('inf') for vertex in graph}
    pq = []
    pq_update = {}
    distances[start] = 0
    for vertex, value in distances.items():
        entry = [vertex, value]
        heapq.heappush(pq, entry)
        pq_update[vertex] = entry
    while pq:
        getmin = heapq.heappop(pq)[0]
        for neighbour, distance_neigh in graph[getmin].items():
            dist = distances[getmin] + distance_neigh
            if dist < distances[neighbour]:
                distances[neighbour] = dist
                pq_update[neighbour][1] = dist # THIS LINE !!!
    print(distances)
    return distances

if __name__ == '__main__':
    example_graph = {
        'U': {'V': 2, 'W': 5, 'X': 1},
        'V': {'U': 2, 'X': 2, 'W': 3},
        'W': {'V': 3, 'U': 5, 'X': 3, 'Y': 1, 'Z': 5},
        'X': {'U': 1, 'V': 2, 'W': 3, 'Y': 1},
        'Y': {'X': 1, 'W': 1, 'Z': 1},
        'Z': {'W': 5, 'Y': 1},
    }
    dijkstra(example_graph, 'X')
Note: the implementation you have is broken and doesn't correctly implement Dijkstra. More on that below.
The pq_update dictionary contains lists, each with two entries:
    for vertex, value in distances.items():
        entry = [vertex, value]
        heapq.heappush(pq, entry)
        pq_update[vertex] = entry
So pq_update[neighbour] is a list with both the vertex and the distance. You want to update the distance, not replace the [vertex, value] list, so pq_update[neighbour][1] is used.
Note that the entry list is also shared with the heap. The pq heap holds a reference to the same list object, so changes to pq_update[neighbour][1] are also visible in the entries still waiting to be processed on the heap!
When you assign directly to pq_update[neighbour], you replace the dictionary value with a new object and break that connection; the list on the heap is no longer updated.
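To see the sharing in isolation, here is a small sketch. The two-vertex input and the variable name b_entry are made up for illustration; only the pq / pq_update pattern comes from your code.

```python
import heapq

pq = []
pq_update = {}
for vertex, value in {'A': 0, 'B': float('inf')}.items():
    entry = [vertex, value]
    heapq.heappush(pq, entry)
    pq_update[vertex] = entry

b_entry = pq_update['B']

# Mutating through the dict changes the very list object the heap holds.
pq_update['B'][1] = 7
shared_after_mutation = any(item is b_entry and item[1] == 7 for item in pq)

# Rebinding the dict value leaves the heap's list untouched.
pq_update['B'] = 3
heap_entry_after_rebind = [item for item in pq if item[0] == 'B'][0]

print(shared_after_mutation)    # True
print(heap_entry_after_rebind)  # ['B', 7]
```

So the [1] index is what keeps the dict and the heap looking at the same object.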
The reason you don't see any difference is that the implementation of the algorithm is actually broken: the heap is not used correctly. The heap is ordered by the first value in the list items you pushed. In your code that's the node name, not the distance, and the heap order is never updated when the distances in the list items are altered. Because the heap is not used correctly, you always traverse the nodes in alphabetical order.
To use heapq correctly, you need to put the distance first, and you must not alter the values on the heap; if you use tuples you can't accidentally do this. You really only need to push nodes onto the heap that you have reached. You'll end up with multiple entries for some of the nodes (reached by multiple paths), but the heap will still present the shortest path to each node first. Just keep a set of visited nodes so you know to skip any longer paths. The point is that you visit the shorter path to a given node before the longer path, and you don't need to alter the heap items in place to achieve that.
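As a minimal illustration of that idea (the three pushed entries are invented for the demo), duplicate entries for the same node come off the heap shortest-first, and the visited set discards the stale one:

```python
import heapq

queue = []
heapq.heappush(queue, (5, 'W'))   # W reached by a long path
heapq.heappush(queue, (1, 'Y'))
heapq.heappush(queue, (2, 'W'))   # W reached again by a shorter path

visited = set()
order = []
while queue:
    dist, node = heapq.heappop(queue)   # smallest distance first
    if node in visited:
        continue                        # stale, longer entry for this node
    visited.add(node)
    order.append((dist, node))

print(order)  # [(1, 'Y'), (2, 'W')]
```

The (5, 'W') entry is still popped eventually, but it is skipped because W was already visited via the shorter path.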
You could re-write your function (with better variable names) to:
def dijkstra(graph, start):
    """Visit all nodes and calculate the shortest paths to each from start"""
    queue = [(0, start)]
    distances = {start: 0}
    visited = set()

    while queue:
        _, node = heapq.heappop(queue)  # (distance, node), ignore distance
        if node in visited:
            continue
        visited.add(node)
        dist = distances[node]

        for neighbour, neighbour_dist in graph[node].items():
            if neighbour in visited:
                continue
            neighbour_dist += dist
            if neighbour_dist < distances.get(neighbour, float('inf')):
                heapq.heappush(queue, (neighbour_dist, neighbour))
                distances[neighbour] = neighbour_dist

    return distances
Related
I am trying to get the following code working. After every for loop ends, heappop gives me an integer instead of a Vertex. In addition, when I did get it running by replacing the Vertex in the priority queue with an integer, I got the wrong result. Please help.
Thanks in advance
import heapq

class Vertex:
    def __init__(self, id):
        self.id = id
        self.adjList = []
        self.adjWeights = []

def shortestPath(vertices, N, source, destination):
    distTo = [float('inf') for _ in range(N+1)]
    edgeTo = [float('inf') for _ in range(N+1)]
    # Set initial distance from source
    # to the highest value
    distTo[source] = 0.0
    edgeTo[source] = float('inf')
    pq = [vertices[source]]
    heapq.heapify(pq)
    while True:
        closest = heapq.heappop(pq)
        for i in range(len(closest.adjList)):
            # Checks if the edges are decreasing and
            # whether the current directed edge will
            # create a shorter path
            if closest.adjWeights[i] < edgeTo[closest.id] and distTo[closest.id] + closest.adjWeights[i] < distTo[closest.adjList[i]]:
                edgeTo[closest.adjList[i]] = closest.adjWeights[i]
                distTo[closest.adjList[i]] = closest.adjWeights[i] + distTo[closest.id]
                heapq.heappush(pq, closest.adjList[i])
    print(distTo)
    print(distTo[destination])

def main():
    N = 6
    M = 9
    '''
    edges = {{0, 2, 1.1}, {0, 4, 2}, {0, 5, 3.3}, {1, 4, 2.7},
             {2, 3, 2}, {2, 4, 1.1}, {3, 1, 2.3}, {4, 5, 2.4}, {5, 1, 3}}
    '''
    # Create an array of vertices
    vertices = [Vertex(i) for i in range(0, N)]
    i=0
    vertices[0].adjList.append(2)
    vertices[0].adjWeights.append(1.1)
    vertices[0].adjList.append(4)
    vertices[0].adjWeights.append(2.0)
    vertices[0].adjList.append(5)
    vertices[0].adjWeights.append(3.3)
    vertices[1].adjList.append(4)
    vertices[1].adjWeights.append(2.7)
    vertices[2].adjList.append(3)
    vertices[2].adjWeights.append(2.0)
    vertices[2].adjList.append(4)
    vertices[2].adjWeights.append(1.1)
    vertices[3].adjList.append(1)
    vertices[3].adjWeights.append(2.3)
    vertices[4].adjList.append(5)
    vertices[4].adjWeights.append(2.4)
    vertices[5].adjList.append(1)
    vertices[5].adjWeights.append(3.0)
    # Source and destination vertices
    src = 0
    target = 1
    print(shortestPath(vertices, N, src, target))
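A likely reason heappop returns an integer is that the loop pushes closest.adjList[i], which is a neighbour id (an int), not a Vertex. The sketch below reproduces that on a three-vertex toy graph invented for the demo, and then shows one possible alternative: pushing (distance, id) tuples and looking the Vertex up after popping. It only illustrates the queue handling and does not attempt the decreasing-edge condition from the question.

```python
import heapq

class Vertex:
    def __init__(self, id):
        self.id = id
        self.adjList = []      # neighbour ids (ints)
        self.adjWeights = []   # edge weights, parallel to adjList

# Toy data, made up for this demo.
vertices = [Vertex(i) for i in range(3)]
vertices[0].adjList += [1, 2]
vertices[0].adjWeights += [2.0, 1.1]

# What the question's loop does: it pushes neighbour ids, so ints come back.
pq = [vertices[0]]
closest = heapq.heappop(pq)
for neighbour_id in closest.adjList:
    heapq.heappush(pq, neighbour_id)
popped = heapq.heappop(pq)
print(type(popped).__name__)   # int

# Alternative: push (distance, id) tuples and resolve the Vertex on pop.
pq2 = []
for neighbour_id, weight in zip(closest.adjList, closest.adjWeights):
    heapq.heappush(pq2, (weight, neighbour_id))
dist, vertex_id = heapq.heappop(pq2)
nearest = vertices[vertex_id]
print(dist, nearest.id)        # 1.1 2
```

Ordering by the tuple's first element also means the heap is sorted by distance rather than by vertex id, which pushing bare ids cannot give you.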
I have n items in a dict, each of which has an x and a y coordinate:
items = {1: {"x": 0, "y": 0}, 2: {"x": 1, "y": 0}, ...
(The initial coordinates don't matter. The data structure is also just an example - could also be items = [(0, 0), (1, 0), ... for example. Whatever works best.)
Additionally, I have a list that contains pairs of item indices to define which items are related to each other:
relation = [{1, 2}, {1, 7}, {2, 5}, ...
(I'm using set instead of tuple here to show that the order doesn't matter. The relation is symmetric. But I can use whichever data structure works best.)
I'd like to fill an xy-grid with the items 1, 2, ... n in such a fashion that related items are as close as possible to each other. The coordinates are integers only. And no grid space can be occupied by more than one item.
A useful metric to define "as close as possible" would be the Manhattan metric for example:
def manhattan(i, j):
    return abs(items[i]["x"] - items[j]["x"]) + abs(items[i]["y"] - items[j]["y"])
Is there an obvious way to do this? Does an algorithm exist for this problem?
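To make "as close as possible" concrete, one possible objective is the sum of Manhattan distances over all related pairs, combined with a check that no cell is used twice. This is only a sketch of a cost function that a search or optimisation procedure could minimise, not a placement algorithm; total_cost and is_valid are names made up here, and the four-item layout is invented for the demo.

```python
def manhattan(items, i, j):
    return abs(items[i]["x"] - items[j]["x"]) + abs(items[i]["y"] - items[j]["y"])

def total_cost(items, relation):
    """Sum of Manhattan distances over all related pairs (lower is better)."""
    return sum(manhattan(items, *tuple(pair)) for pair in relation)

def is_valid(items):
    """True if no grid cell is occupied by more than one item."""
    cells = [(pos["x"], pos["y"]) for pos in items.values()]
    return len(cells) == len(set(cells))

# Hypothetical layout for four items on a 2x2 block.
items = {
    1: {"x": 0, "y": 0},
    2: {"x": 1, "y": 0},
    5: {"x": 1, "y": 1},
    7: {"x": 0, "y": 1},
}
relation = [{1, 2}, {1, 7}, {2, 5}]

print(is_valid(items), total_cost(items, relation))  # True 3
```

Because the metric is symmetric, the arbitrary element order inside each set does not affect the result.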
Given the following function, what would be the correct and Pythonic way of achieving the same result, but faster?
My code is not efficient and I believe I'm missing something that is staring at me.
The idea is to find a pattern that is [[A,B],[A,C],[C,B]] without having to generate additional permutations (since this would result in a higher processing time for the comparisons).
In real life, the dictionary fed into find_path would have approximately 10,000 keys, so iterating over that amount with the current version of the code below is not efficient.
from time import perf_counter
from typing import List, Generator, Dict
def find_path(data: Dict) -> Generator:
    for first_pair in data:
        pair1: List[str] = first_pair.split("/")
        for second_pair in data:
            pair2: List[str] = second_pair.split("/")
            if pair2[0] == pair1[0] and pair2[1] != pair1[1]:
                for third_pair in data:
                    pair3: List[str] = third_pair.split("/")
                    if pair3[0] == pair2[1] and pair3[1] == pair1[1]:
                        amount_pair_1: int = data.get(first_pair)[
                            "amount"
                        ]
                        id_pair_1: int = data.get(first_pair)["id"]
                        amount_pair_2: int = data.get(second_pair)[
                            "amount"
                        ]
                        id_pair_2: int = data.get(second_pair)["id"]
                        amount_pair_3: int = data.get(third_pair)[
                            "amount"
                        ]
                        id_pair_3: int = data.get(third_pair)["id"]
                        yield (
                            pair1,
                            amount_pair_1,
                            id_pair_1,
                            pair2,
                            amount_pair_2,
                            id_pair_2,
                            pair3,
                            amount_pair_3,
                            id_pair_3,
                        )
raw_data = {
"EZ/TC": {"id": 1, "amount": 9},
"LM/TH": {"id": 2, "amount": 8},
"CD/EH": {"id": 3, "amount": 7},
"EH/TC": {"id": 4, "amount": 6},
"LM/TC": {"id": 5, "amount": 5},
"CD/TC": {"id": 6, "amount": 4},
"BT/TH": {"id": 7, "amount": 3},
"BT/TX": {"id": 8, "amount": 2},
"TX/TH": {"id": 9, "amount": 1},
}
processed_data = list(find_path(raw_data))
for i in processed_data:
    print(("The path to traverse is:", i))
>> ('The path to traverse is:', (['CD', 'TC'], 4, 6, ['CD', 'EH'], 7, 3, ['EH', 'TC'], 6, 4))
>> ('The path to traverse is:', (['BT', 'TH'], 3, 7, ['BT', 'TX'], 2, 8, ['TX', 'TH'], 1, 9))
>> ('Time to complete', 5.748599869548343e-05)
# Timing for a simple ref., as mentioned above, the raw_data is a dict containing about 10,000 keys
You can't do that efficiently with this representation of the graph. The current algorithm has O(|E|^3) time complexity, because it loops over all edges three levels deep. It is better to store the graph as adjacency lists, where each vertex has a list containing only its adjacent vertices. Then what you need is easy to do. Fortunately, you can convert the graph to that representation in O(|E|) time.
How to do that
We will store the graph as an array of vertices (in this case a dictionary, because the vertex values are strings). We want to be able to access all neighbours of a given vertex, so for each vertex we store a list of all its neighbours.
Now we just need to construct this structure from the set of edges (raw_data).
How do we add an edge to the graph? Easy! We look up the "from" vertex in our structure and add the "to" vertex to its list of neighbours.
So, the construct_graph function could be like:
from collections import defaultdict

def construct_graph(raw_data):  # here we will change representation
    graph = defaultdict(list)   # our graph
    for pair in raw_data:       # go through every edge
        u, v = pair.split("/")  # get from and to vertexes
        graph[u].append(v)      # and add this edge in our structure
    return graph                # return our new graph to other functions
How to find a path of length 2
We will use DFS on our graph.
def dfs(g, u, dist):               # this is a simple dfs function
    if dist == 2:                  # we are 'dist' edges away from our start
        return [u]                 # and if we have reached length 2, return it
    for v in g.get(u, []):         # otherwise check all neighbours of current vertex
        ans = dfs(g, v, dist + 1)  # run dfs in every neighbour with dist+1
        if ans:                    # and if that dfs found something
            ans.append(u)          # store it in our answer
            return ans             # and return it
    return []                      # otherwise we found nothing
And then we just try it for every vertex.
def main():
    graph = construct_graph(raw_data)
    for v in graph.keys():              # here we will try to find path
        ans = dfs(graph, v, 0)          # starting with 0 dist
        if ans:                         # and if we found something
            print(list(reversed(ans)))  # print it; the answer is built in reverse
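The DFS above reports any path of length 2; it does not check that the direct edge from the first to the last vertex also exists, which the [[A,B],[A,C],[C,B]] pattern in the question requires. A possible variant, sketched below with a made-up find_triangles helper and sets instead of lists for the neighbours, adds that check. It assumes the same "X/Y" key format as raw_data in the question.

```python
from collections import defaultdict

def find_triangles(raw_data):
    """Yield (A/B, A/C, C/B) key triples, assuming keys look like 'X/Y'."""
    successors = defaultdict(set)
    for pair in raw_data:
        u, v = pair.split("/")
        successors[u].add(v)
    for a, outgoing in successors.items():
        for c in outgoing:                       # edge A -> C
            for b in successors.get(c, ()):      # edge C -> B
                if b != c and b in outgoing:     # direct edge A -> B exists
                    yield f"{a}/{b}", f"{a}/{c}", f"{c}/{b}"

raw_data = {
    "EZ/TC": {"id": 1, "amount": 9},
    "LM/TH": {"id": 2, "amount": 8},
    "CD/EH": {"id": 3, "amount": 7},
    "EH/TC": {"id": 4, "amount": 6},
    "LM/TC": {"id": 5, "amount": 5},
    "CD/TC": {"id": 6, "amount": 4},
    "BT/TH": {"id": 7, "amount": 3},
    "BT/TX": {"id": 8, "amount": 2},
    "TX/TH": {"id": 9, "amount": 1},
}

triangles = sorted(find_triangles(raw_data))
for first, second, third in triangles:
    print(first, second, third)
```

The yielded keys can be used to look up "id" and "amount" in raw_data directly. The work done is roughly proportional to the number of two-edge paths rather than to the cube of the number of edges.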
I am implementing a Bellman-Ford shortest path algorithm. Based on the source and destination node, it outputs the shortest distance and the path through a network.
Now, I need to add a capacity component to the algorithm. So if the demand is 2 but the capacity is 1, that path is no longer usable.
My initial idea was to add a dictionary for the capacity and a variable for the demand. Then, if the demand exceeded the capacity of a node, the length of the path would be made arbitrarily large. I was thinking of something like:
if capacity[neighbour] < demand:
    distance[neighbour], predecessor[neighbour] = 999
This gives me the following error message:
TypeError: '<' not supported between instances of 'dict' and 'int'
Is there a workaround for this issue, or could I potentially add the demand-constraint in a smarter way?
Full code:
source = 'e'
destination = 'd'
demand = 2
def bellman_ford(graph, source, capacity):
    # Step 1: Prepare the distance and predecessor for each node
    distance, predecessor = dict(), dict()
    for node in capacity:
        for node in graph:
            distance[node], predecessor[node] = float('inf'), None
    distance[source] = 0
    # Step 2: Relax the edges
    for _ in range(len(graph) - 1):
        for node in graph:
            for neighbour in graph[node]:
                # If the distance between the node and the neighbour is lower than the current, store it
                if distance[neighbour] > distance[node] + graph[node][neighbour]:
                    distance[neighbour], predecessor[neighbour] = distance[node] + graph[node][neighbour], node
                if capacity[node] < demand:
                    distance[neighbour], predecessor[neighbour] = 100
    # Step 3: Check for negative weight cycles
    for node in graph:
        for neighbour in graph[node]:
            assert distance[neighbour] <= distance[node] + graph[node][neighbour], "Negative weight cycle."
    return distance, predecessor
#Initial graph
graph = {
'a': {'b': 1, 'd': 1},
'b': {'c': 1, 'd': 2},
'c': {},
'd': {'b': 1, 'c': 8, 'e': 1},
'e': {'a': 2, 'd': 7}
}
capacity = {
'a': {'b': 4, 'd': 1},
'b': {'c': 5, 'd': 4},
'c': {},
'd': {'b': 1, 'c': 3, 'e': 3},
'e': {'a': 5, 'd': 3}
}
distance, predecessor = bellman_ford(graph, source, capacity)
print("The cost of shipping from", source, "to", destination, "is", distance[destination])
for i in graph:
    print("node",i,"is reached through node", predecessor[i])
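The TypeError appears to come from capacity[node] being a dict of per-edge capacities, so it cannot be compared with an int; the value for a single edge is capacity[node][neighbour]. One possible way to add the constraint, sketched below, is to skip edges whose capacity is below the demand during relaxation instead of assigning a large distance. This assumes capacities are per edge, as in the capacity dict from the question; the changed signature with demand as a parameter is my own choice, and the negative-cycle check is left out for brevity.

```python
def bellman_ford(graph, capacity, source, demand):
    distance = {node: float('inf') for node in graph}
    predecessor = {node: None for node in graph}
    distance[source] = 0

    for _ in range(len(graph) - 1):
        for node in graph:
            for neighbour, cost in graph[node].items():
                # Per-edge capacity: ignore edges that cannot carry the demand.
                if capacity[node][neighbour] < demand:
                    continue
                if distance[node] + cost < distance[neighbour]:
                    distance[neighbour] = distance[node] + cost
                    predecessor[neighbour] = node
    return distance, predecessor

graph = {
    'a': {'b': 1, 'd': 1},
    'b': {'c': 1, 'd': 2},
    'c': {},
    'd': {'b': 1, 'c': 8, 'e': 1},
    'e': {'a': 2, 'd': 7},
}
capacity = {
    'a': {'b': 4, 'd': 1},
    'b': {'c': 5, 'd': 4},
    'c': {},
    'd': {'b': 1, 'c': 3, 'e': 3},
    'e': {'a': 5, 'd': 3},
}

distance, predecessor = bellman_ford(graph, capacity, 'e', demand=2)
print(distance['d'], predecessor['d'])  # 5 b
```

With demand 2 the edge a -> d (capacity 1) is excluded, so d is reached via e -> a -> b -> d at cost 5 instead of e -> a -> d at cost 3.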
I have a cyclical directed graph. Below is the representation of the graph as a python dict
graph = {
'A': {'B': 5, 'D': 5, 'E': 7 },
'B': {'C': 4},
'C': {'D': 8, 'E': 2},
'D': {'C': 8, 'E': 6},
'E': {'B': 3}
}
I have written a simple implementation of Dijkstra's shortest path algorithm, which seems to work for two given points. Below is my implementation.
    def shortestpath(self, start, end, visited=[],distances={},predecessors={}):
        # initialize a big number
        maxint = 10000
        if start==end:
            path=[]
            while end != None:
                path.append(end)
                end=predecessors.get(end, None)
            return distances[start], path[::-1]
        # detect if it's the first time through, set current distance to zero
        if not visited: distances[start]=0
        # process neighbors as per algorithm, keep track of predecessors
        for neighbor in self.graph[start]:
            if neighbor not in visited:
                neighbordist = distances.get(neighbor,maxint)
                tentativedist = distances[start] + self.graph[start][neighbor]
                if tentativedist < neighbordist:
                    distances[neighbor] = tentativedist
                    predecessors[neighbor]=start
        # neighbors processed, now mark the current node as visited
        visited.append(start)
        # finds the closest unvisited node to the start
        unvisiteds = dict((k, distances.get(k,maxint)) for k in self.graph if k not in visited)
        closestnode = min(unvisiteds, key=unvisiteds.get)
        # now we can take the closest node and recurse, making it current
        return self.shortestpath(closestnode,end,visited,distances,predecessors)
Now, this simple implementation seems to work. For example, if I do something like this
shortestpath('A', 'C')
it will give me the path and the shortest weight
(9, ['A', 'B', 'C'])
in this case.
However, whenever I call shortestpath('B', 'B'), the program breaks.
There is a shortest path from B to B, since the graph is cyclic: the path is B-C-E-B. I just don't know how to check for that and modify Dijkstra's algorithm accordingly to handle cyclic cases like this one. Any suggestion is greatly appreciated. Thanks :)
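One possible approach, sketched here as a standalone heap-based function rather than a change to the recursive method above, is to seed the queue with the edges leaving start instead of with start itself. Then start is not considered reached at distance 0, and a request for start == end returns the shortest non-empty cycle. The function name shortest_path and the path-carrying tuples are my own choices; the sketch assumes non-negative weights and comparable node labels such as strings.

```python
import heapq

def shortest_path(graph, start, end):
    """Return (distance, path) using at least one edge, or None if unreachable."""
    # Seed with the edges leaving start, so start itself is not yet "reached".
    queue = [(weight, neighbour, [start, neighbour])
             for neighbour, weight in graph[start].items()]
    heapq.heapify(queue)
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == end:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + weight, neighbour, path + [neighbour]))
    return None

graph = {
    'A': {'B': 5, 'D': 5, 'E': 7},
    'B': {'C': 4},
    'C': {'D': 8, 'E': 2},
    'D': {'C': 8, 'E': 6},
    'E': {'B': 3},
}

print(shortest_path(graph, 'A', 'C'))  # (9, ['A', 'B', 'C'])
print(shortest_path(graph, 'B', 'B'))  # (9, ['B', 'C', 'E', 'B'])
```

Using fresh local state on every call also sidesteps the mutable default arguments (visited=[], distances={}, predecessors={}), which keep their contents between calls in the original method.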