I'm trying to find the shortest path on a weighted graph given the constraint that the path must have a total distance less than some parameter (let's say 1000).
I tried the following but I don't know why my code is wrong.
def directedDFS(digraph, start, end, maxTotalDist):
    visited = []
    if not (digraph.hasNode(start) and digraph.hasNode(end)):
        raise ValueError('Start or end not in graph.')
    path = [str(start)]
    if start == end:
        return path
    shortest = None
    for node in digraph.childrenOf(start):
        if (str(node) not in visited):
            visited = visited + [str(node)]
            firststep_distance = digraph.childrenOf(start)[node][0]
            firststep_outer_distance = digraph.childrenOf(start)[node][1]
            if (firststep_distance <= maxTotalDist):
                newPath = directedDFS(digraph, node, end, maxTotalDist-firststep_distance)
                if newPath == None:
                    continue
                if (shortest == None or TotalDistance(digraph,newPath) < TotalDistance(digraph,shortest)):
                    shortest = newPath
    if shortest != None:
        path = path + shortest
    else:
        path = None
    return path
Another thing is that I don't want to compare based on the distance of the path starting from the given node but based on the distance of the ENTIRE PATH from the original starting point. I don't know the best way to do that here though.
I really can't make heads or tails of the code you provided (firststep_distance? firststep_outer_distance?). Could you provide the name of the algorithm you're trying to implement?
If you're just making something up on the fly, and you're not doing it with the goal of reinventing graph theory for educational purposes, I'd suggest looking up a standard shortest-path algorithm. If you can guarantee that your weights are non-negative, then the standard is Dijkstra's algorithm. Wikipedia will report an improved asymptotic runtime if you back it with a Fibonacci heap, but don't fall for that trap: apparently, Fibonacci heaps have horrible performance in practice.
If Dijkstra's isn't good enough, take a look at A* search methods. Here, as in all algorithm questions, CLR (Cormen, Leiserson and Rivest's Introduction to Algorithms) is your best guide, but Wikipedia is damn close. Hope that helps!
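To make the Dijkstra suggestion concrete, here is a minimal sketch, not the asker's digraph class. It assumes the graph is a plain dict of the form {node: {neighbor: weight}} with non-negative weights; the function name shortest_path_within and that dict layout are made up for illustration. The distance limit is applied as "at most max_total_dist", mirroring the <= in the question's code.

```python
import heapq

def shortest_path_within(graph, start, end, max_total_dist):
    """Dijkstra's algorithm that ignores any route longer than max_total_dist.

    graph is assumed to look like {node: {neighbor: weight}} (hypothetical layout).
    Returns (path, total_distance), or None if no route fits the limit.
    """
    dist = {start: 0}          # best known distance from start
    prev = {}                  # predecessor on the best known route
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue           # stale heap entry
        if u == end:
            break
        for v, weight in graph.get(u, {}).items():
            nd = d + weight
            # Skip edges that would exceed the limit or not improve v.
            if nd <= max_total_dist and nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if end not in dist:
        return None
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[end]
```

Because distances are always measured from the original start node, this also sidesteps the problem of comparing partial paths that the question raises.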
I also can't really tell what's going on without more code or info, but this is concerning:
if (firststep_distance <= maxTotalDist):
    newPath = directedDFS(digraph, node, end, maxTotalDist-firststep_distance)
If you are decreasing the maxTotalDistance in each recursive call, then firststep_distance (which I assume is the weight of the path) must be greater than the remaining distance, not less.
I have a weighted graph with around 800 nodes, each with a number of connections ranging from 1 to around 300. I need to find the shortest (lowest cost) path between two nodes with some extra criteria:
The path must contain exactly five nodes.
Each node has an attribute (called position in the example code) that takes one of five values; the five nodes in the path must all have unique values for this attribute.
The algorithm needs to allow for 1-2 required nodes to be specified that the path must contain at some point in any order.
The algorithm needs to take less than 10 seconds to run, preferably as little time as possible while losing as little accuracy as possible.
My current solution in Python is to run a Depth-Limited Depth-First Search which recursively searches every possible path. To make this algorithm run in reasonable time I have introduced a limit to the number of neighbour nodes that are searched at each recursion level. This number can be lowered to decrease the computation time but at the cost of accuracy. Currently this algorithm is far too slow, with my most recent test coming in at 75 seconds with a neighbour limit of 30. If I decrease this neighbour limit any more, my testing shows that the accuracy of the algorithm begins to suffer badly. I am out of ideas on how to solve this problem while satisfying all of the above criteria. My code is as follows:
# The path must go from start -> end, be of length 5 and contain all nodes in middle
# Each node is represented as a tuple: (value, position)
def dfs(start, end, middle=[], path=Path(), best=Path([], math.inf)):
    # If this is the first level of recursion, initialise the path variable
    if len(path) == 0:
        path = Path([start])
    # If the max depth has been exceeded, check if the current node is the goal node
    if len(path) >= depth:
        # If it is, save the path
        # Check that all required nodes have been visited
        if len(path) == depth and start == end and path.cost < best.cost and all(x in path.path for x in middle):
            # Store the new best path
            best.path = path.path
            best.cost = path.cost
        return
    # Run DFS on all of the neighbors of the node that haven't been searched already
    # Use the weights of the neighbors as a heuristic; sort by lowest weight first
    neighbors = sorted([x for x in graph.get(*start).connected_nodes], key=lambda x: graph.weight(start, x))
    # Make sure that all neighbors haven't been visited yet and that their positions aren't already accounted for
    positions = [x[1] for x in path.path]
    # Only visit neighbouring nodes with unique positions and ids
    filtered = [x for x in neighbors if x not in path.path and x[1] not in positions]
    for neighbor in filtered[:neighbor_limit]:
        if neighbor not in path.path:
            dfs(neighbor, end, middle, Path(path.path + [neighbor], path.cost + graph.weight(start, neighbor)), best)
    return best
Path Class:
class Path:
    def __init__(self, path=[], cost=0):
        self.path = path
        self.cost = cost
    def __len__(self):
        return len(self.path)
Any help in improving this algorithm or even suggestions on a better approach to the problem would be much appreciated, thanks in advance!
You should iterate over all possible orderings of the 'position' attribute, and for each one use Dijkstra's algorithm or BFS to find the shortest path that respects that ordering.
Since you know the position of the first and last nodes, there are only 3! = 6 different orderings for the intermediate nodes, so you only have to run Dijkstra's algorithm 6 times.
Even in Python, this shouldn't take more than a couple hundred milliseconds to run, based on the input sizes you provided.
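As a rough sketch of this idea (not the asker's graph class): because the path has exactly five nodes, each ordering of the three middle positions can be handled with a simple layer-by-layer pass instead of a full Dijkstra run. The function name best_five_node_path, the {node: {neighbor: weight}} graph layout and the separate position dict are assumptions made for illustration. It also assumes the start and end nodes have different positions and that required nodes have distinct positions; a required node then simply pins the layer for its position.

```python
from itertools import permutations
import math

def best_five_node_path(graph, position, start, end, required=()):
    """graph: {node: {neighbor: weight}}, position: {node: label} (hypothetical layout).

    Returns (cost, path); cost is math.inf and path is None if nothing qualifies.
    """
    middle_positions = set(position.values()) - {position[start], position[end]}
    # A required node is the only node allowed in the layer for its position.
    pinned = {position[n]: n for n in required}
    pinned[position[end]] = end
    best_cost, best_path = math.inf, None
    for order in permutations(middle_positions):
        # layer maps node -> (cheapest cost so far, path so far)
        layer = {start: (0, [start])}
        for pos in order + (position[end],):
            next_layer = {}
            for u, (cost, path) in layer.items():
                for v, weight in graph.get(u, {}).items():
                    if position[v] != pos:
                        continue
                    if pos in pinned and pinned[pos] != v:
                        continue
                    if v not in next_layer or cost + weight < next_layer[v][0]:
                        next_layer[v] = (cost + weight, path + [v])
            layer = next_layer
        if end in layer and layer[end][0] < best_cost:
            best_cost, best_path = layer[end]
    return best_cost, best_path
```

Since every layer has a different position, the five nodes are automatically distinct, so no explicit visited check is needed.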
I am looking for help with Python code to find the distance from a source to a destination. The inputs to the function are the number of rows, the number of columns, and an area given as a (number of rows) x (number of columns) matrix.
We can move one cell at a time: up, down, left, or right.
Accessible cells are represented by 1, inaccessible cells by 0, and the destination by 9.
Sample input
numRows=3
numCols=3
alist=[[1,0,0],[1,0,0],[1,9,1]]
Output: Should be an integer representing total distance to destination or -1 if there is no path
For the sample input, the path traversed will be (0,0)->(1,0)->(2,0)->(2,1) and the function should return an output of 3
Here's the pseudo code I could get to but need help figuring out the complete solution.
def findthepath(numRows,numCols,alist):
    visited=[]
    if alist[0][0] == 9:
        return 0
    for i in range(numRows):
        for j in range(numCols):
            if alist[i][j] == 1:
                visited.append(alist[i][j])
You want a Shortest Path algorithm. The most commonly used one is Dijkstra's Algorithm. Slightly difficult to understand, but fairly easy to implement. Example: https://gist.github.com/econchick/4666413
I've done this for C#, but not Python. It could be done fairly easily as the above link demonstrates, though.
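For what it's worth, since every move on the grid costs the same, a plain breadth-first search gives the same answer as Dijkstra's algorithm here and is shorter to write. The following is only a sketch: the name find_the_path is made up, and it assumes the search starts at cell (0, 0), which is what the sample input suggests.

```python
from collections import deque

def find_the_path(num_rows, num_cols, area):
    """Breadth-first search from (0, 0); returns steps to the 9 cell, or -1."""
    if area[0][0] == 9:
        return 0
    if area[0][0] == 0:
        return -1                      # the start cell itself is blocked
    visited = {(0, 0)}
    queue = deque([(0, 0, 0)])         # (row, col, distance from start)
    while queue:
        row, col, dist = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if not (0 <= r < num_rows and 0 <= c < num_cols):
                continue
            if (r, c) in visited or area[r][c] == 0:
                continue
            if area[r][c] == 9:
                return dist + 1
            visited.add((r, c))
            queue.append((r, c, dist + 1))
    return -1
```

BFS visits cells in order of increasing distance, so the first time the destination is reached the distance is minimal.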
I am trying to learn genetic algorithms and AI development and I copied this code from a book, but I don't know if it is a proper genetic algorithm.
Here is the code (main.py):
import random
geneSet = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!.,1234567890-_=+!##$%^&*():'[]\""
target = input()
def generate_parent(length):
    genes = []
    while len(genes) < length:
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    parent = ""
    for i in genes:
        parent += i
    return parent
def get_fitness(guess):
    total = 0
    for i in range(len(target)):
        if target[i] == guess[i]:
            total = total + 1
    return total
    """
    return sum(1 for expected, actual in zip(target, guess)
               if expected == actual)
    """
def mutate(parent):
    index = random.randrange(0, len(parent))
    childGenes = list(parent)
    newGene, alternate = random.sample(geneSet, 2)
    if newGene == childGenes[index]:
        childGenes[index] = alternate
    else:
        childGenes[index] = newGene
    child = ""
    for i in childGenes:
        child += i
    return child
random.seed()
bestParent = generate_parent(len(target))
bestFitness = get_fitness(bestParent)
print(bestParent)
while True:
    child = mutate(bestParent)
    childFitness = get_fitness(child)
    if bestFitness >= childFitness:
        continue
    print(str(child) + "\t" + str(get_fitness(child)))
    if childFitness >= len(bestParent):
        break
    bestFitness = childFitness
    bestParent = child
I saw that it has the fitness function and the mutate function, but it doesn't generate a population and I don't understand why. I thought that a genetic algorithm needs a population generation and a crossover from the best population members to the new generation. Is this a proper genetic algorithm?
Although there are a lot of ambiguous definitions in the field of AI, my understanding is that:
An evolutionary algorithm (EA) is an algorithm that has a (set of) solution(s) and by mutating them somehow (crossover is here also seen as "mutating"), you eventually end up with better solution(s).
A genetic algorithm (GA) supports the concept of a crossover where two or more "solutions" produce new solutions.
But the terms are sometimes mixed. Mind however that crossover is definitely not the only way to produce new individuals (there are more ways than genetic algorithms to produce better solutions), like:
Simulated Annealing (SA);
Tabu Search (TS);
...
But as said earlier, there is always a lot of discussion about what the terms really mean, and most papers on probabilistic combinatorial optimization state clearly what they mean by the terms.
So according to the above definition, your program is an evolutionary algorithm, but not a genetic one: it always has a population of one after each iteration. Furthermore your program only accepts a new child if it is better than its parent making it a Local Search (LS) algorithm. The problem with local search algorithms is that - if the mutation space of some/all solutions is a subset of the solution space - local search algorithms can get stuck forever in a local optimum. Even if that is not the case, they can get stuck in a local optimum for a very long time.
Here that is not a problem since there are no local optima (but this is of course an easy problem). Harder (and more interesting) problems usually have a lot of local optima.
Local Search is not a bad technique if it collaborates with other techniques that help get the system out of the local optimum again. Other evolutionary techniques such as simulated annealing will accept a worse solution with small probability (depending how much worse the solution is, and how far we are in the evolutionary process).
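To illustrate the difference, here is a minimal sketch of what the same string-matching task might look like with a population and crossover, which is what would make it a genetic algorithm under the definition above. None of this is from the book: the names GENES, crossover and evolve, the parameter values, and the single-point crossover are all choices made for illustration, and the target is assumed to have at least two characters.

```python
import random

GENES = "abcdefghijklmnopqrstuvwxyz "   # hypothetical, smaller gene set

def fitness(guess, target):
    """Number of positions where guess matches target."""
    return sum(1 for a, b in zip(guess, target) if a == b)

def crossover(mother, father, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(mother))
    return mother[:cut] + father[cut:]

def mutate(individual, rng, rate=0.05):
    """Replace each gene with a random one with probability rate."""
    return "".join(rng.choice(GENES) if rng.random() < rate else gene
                   for gene in individual)

def evolve(target, pop_size=50, generations=200, seed=0):
    rng = random.Random(seed)
    population = ["".join(rng.choice(GENES) for _ in target)
                  for _ in range(pop_size)]
    history = []                         # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, target), reverse=True)
        history.append(fitness(population[0], target))
        if population[0] == target:
            break
        parents = population[:pop_size // 2]   # the fitter half breeds
        children = []
        while len(children) < pop_size - 1:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        population = [population[0]] + children  # keep the best (elitism)
    best = max(population, key=lambda ind: fitness(ind, target))
    return best, history
```

Keeping the best individual in every generation means the best fitness never decreases, while the rest of the population is free to contain worse solutions.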
I'm using a version of Dijkstra's algorithm written in Python which I found online, and it works great. But because this is for bus routes, changing 10 times might be the shortest route, but probably not the quickest and definitely not the easiest. I need to modify it somehow to return the path with the least number of changes, regardless of distance to be honest (obviously if 2 paths have equal number of changes, choose the shortest one). My current code is as follows:
from priodict import priorityDictionary
def Dijkstra(stops,start,end=None):
    D = {} # dictionary of final distances
    P = {} # dictionary of predecessors
    Q = priorityDictionary() # est.dist. of non-final vert.
    Q[start] = 0
    for v in Q:
        D[v] = Q[v]
        print v
        if v == end: break
        for w in stops[v]:
            vwLength = D[v] + stops[v][w]
            if w in D:
                if vwLength < D[w]:
                    raise ValueError, "Dijkstra: found better path to already-final vertex"
            elif w not in Q or vwLength < Q[w]:
                Q[w] = vwLength
                P[w] = v
    return (D,P)
def shortestPath(stops,start,end):
    D,P = Dijkstra(stops,start,end)
    Path = []
    while 1:
        Path.append(end)
        if end == start: break
        end = P[end]
    Path.reverse()
    return Path
stops = MASSIVE DICTIONARY WITH VALUES (7800 lines)
print shortestPath(stops,'Airport-2001','Comrie-106')
I must be honest - I aint no mathematician so I don't quite understand the algorithm fully, despite all my research on it.
I have tried changing a few things but I don't get even close.
Any help? Thanks!
Here is a possible solution:
1) Run breadth-first search from the start vertex. It will find a path with the least number of changes, but not necessarily the shortest among them. Let's assume that after running breadth-first search, dist[i] is the distance (in number of edges) between the start and vertex i.
2) Now run Dijkstra's algorithm on a modified graph (keep only those edges from the initial graph which satisfy this condition: dist[from] + 1 == dist[to]). The shortest path in this graph is the one you are looking for.
P.S. If you don't want to use breadth-first search, you can use Dijkstra's algorithm after making all edge weights equal to 1.
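The two steps above could be sketched roughly as follows. This assumes stops has the same {stop: {neighbor: distance}} shape that the question's Dijkstra function reads, and it uses the standard heapq module rather than priodict; the function name fewest_hops_then_shortest is made up for illustration.

```python
import heapq
from collections import deque

def fewest_hops_then_shortest(stops, start, end):
    """Returns (path, distance) with the fewest edges, ties broken by distance."""
    # Step 1: breadth-first search gives the hop count of every reachable stop.
    hops = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in stops.get(v, {}):
            if w not in hops:
                hops[w] = hops[v] + 1
                queue.append(w)
    if end not in hops:
        return None
    # Step 2: Dijkstra, using only edges that advance exactly one BFS level.
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                   # stale heap entry
        if v == end:
            break
        for w, length in stops.get(v, {}).items():
            if hops[w] != hops[v] + 1:
                continue
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                prev[w] = v
                heapq.heappush(heap, (nd, w))
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[end]
```

Every path that survives the edge filter has the minimum number of edges, so the second step only has to pick the shortest among them.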
What I would do is add an offset to the actual cost whenever you have to change lines. For example, if your edge weights represent the time needed between two stations, I would add the average waiting time between Line1 and Line2 at station X (e.g. 0.5*maxWaitingTime) during the search. Of course, this is a heuristic solution to the problem. If your timetables are known, you can calculate an "exact" solution, or at least a solution that satisfies the model, because in reality you can't assume that every bus is always on time.
The solution is simple: instead of using the distances as weights, use a weight of 1 for each stop. Dijkstra's algorithm will minimize the number of changes as you requested (the total path weight is the number of rides, which is the number of changes +1). If you want to use the distance to break ties, use something like
vwLength = D[v] + 1+ alpha*stops[v][w]
where alpha<<1, e.g. alpha=0.0001
Practically, I think your approach is exaggerated. You don't want to fly from Boston to Toronto through Paris even if two flights are the minimum. I would play with alpha to get an approximation of the total traveling time, which is probably what matters.
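As a variation on the alpha trick, the tie-breaking can be made exact by using a (rides, distance) tuple as the cost, since Python compares tuples element by element. This is a sketch under the same assumption about the shape of stops ({stop: {neighbor: distance}}); it uses heapq instead of priodict, and the function name fewest_rides_path is hypothetical.

```python
import heapq

def fewest_rides_path(stops, start, end):
    """Dijkstra with (rides, distance) costs: fewest edges first, then shortest."""
    best = {start: (0, 0)}             # stop -> (rides, distance)
    prev = {}
    heap = [(0, 0, start)]
    while heap:
        rides, dist, v = heapq.heappop(heap)
        if (rides, dist) > best[v]:
            continue                   # stale heap entry
        if v == end:
            break
        for w, length in stops.get(v, {}).items():
            candidate = (rides + 1, dist + length)
            if w not in best or candidate < best[w]:
                best[w] = candidate
                prev[w] = v
                heapq.heappush(heap, candidate + (w,))
    if end not in best:
        return None
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[end]
```

Unlike a small alpha, the tuple cost can never let a large distance outweigh an extra ride, but it also cannot be tuned toward total travel time.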
If f = g + h, then where in the code below would I add g?
Also, besides adding the movement cost from my initial position, how else can I make this code more efficient?
def a_star(initial_node):
    open_set, closed_set = dict(), list()
    open_set[initial_node] = heuristic(initial_node)
    while open_set:
        onode = get_next_best_node(open_set)
        if onode == GOAL_STATE:
            return reconstruct_path(onode)
        del open_set[onode]
        closed_set.append(onode)
        for snode in get_successor_nodes(onode):
            if snode in closed_set:
                continue
            if snode not in open_set:
                open_set[snode] = heuristic(snode)
                self.node_rel[snode] = onode
    return False
In the last if, if snode is not in open_set (no pun intended!) you shouldn't set just the heuristic, but the heuristic plus the cost of the current node. And if snode is in the open set, you should check the minimum between the present value and the current one (if there are two or more ways to reach the same node, the least costly one should be considered).
That means you need to store both the node's "actual" cost and the "estimated" cost. The actual cost of the initial node is zero. For every new node, it's the minimum for every incoming arc between the cost of the other vertex plus the cost of the arc (in other words, the cost of the last node plus the cost to move from that to the current one). The estimated cost would have to sum both values: the actual cost so far plus the heuristic function.
I don't know how the nodes are represented in your code, so I can't give advice more specific than that. If you still have doubts, please edit your question to provide more details.
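Without knowing the real node representation, the idea can only be sketched generically. In the version below, successors and heuristic are hypothetical callables passed in by the caller: successors(node) is assumed to yield (neighbor, step_cost) pairs, and the heuristic is assumed to be consistent so that closed nodes never need reopening. It also swaps the dict and list for a heap and a set, which is one way to address the efficiency question.

```python
import heapq

def a_star(start, goal, successors, heuristic):
    """Returns (path, cost) or None. g holds actual costs; the heap is ordered by f = g + h."""
    g = {start: 0}                     # actual cost from start to each node
    parent = {}
    open_heap = [(heuristic(start), start)]
    closed = set()                     # set membership is O(1), unlike a list
    while open_heap:
        f, node = heapq.heappop(open_heap)
        if node in closed:
            continue                   # an older, worse entry for this node
        if node == goal:
            path = [node]
            while path[-1] in parent:
                path.append(parent[path[-1]])
            path.reverse()
            return path, g[goal]
        closed.add(node)
        for neighbor, step_cost in successors(node):
            if neighbor in closed:
                continue
            tentative = g[node] + step_cost
            # Keep only the cheapest known way of reaching neighbor.
            if tentative < g.get(neighbor, float("inf")):
                g[neighbor] = tentative
                parent[neighbor] = node
                heapq.heappush(open_heap, (tentative + heuristic(neighbor), neighbor))
    return None
```

For example, on an open 3x3 grid with unit step costs and a Manhattan-distance heuristic, the call could look like this:

```python
def grid_successors(cell):
    row, col = cell
    for r, c in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
        if 0 <= r < 3 and 0 <= c < 3:
            yield (r, c), 1

path, cost = a_star((0, 0), (2, 2), grid_successors,
                    lambda cell: abs(2 - cell[0]) + abs(2 - cell[1]))
```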