Finding a way through a matrix using Python

I'm trying to solve a problem for my university homework. The task is to find the cheapest path through an NxN matrix, where every point in the matrix stores a random integer between 0 and 9. The start is at (0,0) and the end at (N,N). The output should consist of the cheapest path as a list of tuples and the cost of the path (adding up the values of each step).
I have tried using a tree where (0,0) is the root and the children are its neighbours in the matrix, the children of the children are their neighbours, and so on. Then I wanted to add up all the paths that end with (N,N) as the last child, but I didn't get the tree working in the first place. We haven't had trees in our lectures yet, so I'm open to any other solution for this problem. Thank you :)
import random
import math

def Matrix_gen(n):
    # Generate an n*n matrix with random values
    matrix = []
    for i in range(n):
        matrix.append([])
        for j in range(n):
            matrix[i].append(random.randint(0, 9))
    return matrix

MATRIX = Matrix_gen(5)
MATRIX = Matrix_gen(5)
def get_neighbour(i, j, matrix):
    neighbours = []
    n = len(matrix) - 1
    for x in range(len(matrix)-1):
        for y in range(len(matrix)-1):
            if x != n:
                if matrix[x+1][y] == matrix[i][j]:
                    neighbours.append((x + 1, y))
            if x != 0:
                if matrix[x-1][y] == matrix[i][j]:
                    neighbours.append((x - 1, y))
            if y != n:
                if matrix[x][y + 1] == matrix[i][j]:
                    neighbours.append((x, y + 1))
            if y != 0:
                if matrix[x][y - 1] == matrix[i][j]:
                    neighbours.append((x, y - 1))
    if matrix[i][j] == matrix[n][n]:
        return []
    return neighbours
# create a class that stores a Tree
class Tree:
    def __init__(self, value, Children = []):
        self.value = value
        self.Children = Children

    # the root of the tree is the first element of the matrix
    def root(self):
        # add (0,0) as the value of the root
        self.value = (0,0)
        return self.value

    # add the neighbours of the root as the children of the root
    def add_children(self, matrix):
        # add the neighbours of the lowest node as the children of the lowest node
        # until a node has no neighbours
        while get_neighbour(self.value[0], self.value[1], matrix) != []:
            self.Children.append(get_neighbour(self.value[0], self.value[1], matrix))
            self.value = self.Children[-1]
        return self.Children

    # print the tree
    def print_tree(self):
        print(self.value)
        for i in self.Children:
            print(i)
        return

# Create the tree in the class Tree
Tree = Tree((0,0))
Tree.add_children(MATRIX)
Tree.print_tree()

Please read the open letter to students before copying and pasting any of this. Seek help from your tutor if things are unclear.
Disclaimer: Because this is homework, this is (intentionally) not a complete answer. The answer works under the assumption that we are NOT allowed to move diagonally. Allowing diagonal movements adds additional complexity to the path generation and is left as an exercise (the needed flexibility is there).
The code will take longer and longer the bigger N is, because of the definition of the problem: the number of path combinations on a grid grows combinatorially. See the benchmark below...
I tried to keep the code readable and understandable; there are more compressed and probably also better-optimized ways to do this (happy to take comments, given that readability is maintained).
Let's start with a set of functions.
from itertools import permutations
import numpy as np

DOWN = 'D'
RIGHT = 'R'

def random_int_matrix(size: int) -> np.ndarray:
    """Generates a size x size matrix with random integers from 0 to 9"""
    mat = np.random.random((size, size)) * 10
    return mat.astype(int)

def find_all_paths(size: int):
    """Creates all possible paths going down and right.
    Note: permutations() treats the repeated 'D'/'R' symbols as distinct,
    so the same path shows up multiple times (hence the counts in the benchmark below)."""
    return [gen_path(perm) for perm in permutations([DOWN] * (size-1) + [RIGHT] * (size-1))]

def gen_path(permutation: tuple) -> list:
    track = [(0, 0)]
    for entry in permutation:
        if entry == DOWN:
            track.append((track[-1][0] + 1, track[-1][1]))
        else:
            track.append((track[-1][0], track[-1][1] + 1))
    return track

def sum_track_values(mat: np.ndarray, track: list) -> int:
    """Computes the value sum for the given path"""
    return sum([mat[e[0], e[1]] for e in track])
OK, now we can run the program:
MATRIX_SIZE = 4
matrix = random_int_matrix(MATRIX_SIZE)
print('Randomly generated matrix:\n', matrix)
paths = find_all_paths(MATRIX_SIZE)
costs = np.array([sum_track_values(matrix, p) for p in paths])
min_idx = costs.argmin()
print('Best path:', paths[min_idx])
print('Costs:', costs[min_idx])
In my case the result was
Randomly generated matrix:
[[3 8 6 6]
[2 4 1 4]
[7 4 0 4]
[9 6 8 4]]
Best path: [(0, 0), (1, 0), (1, 1), (1, 2), (2, 2), (2, 3), (3, 3)]
Costs: 18
Small benchmark:
Runtime for N=1: 0.0000 sec (1 possible paths)
Runtime for N=2: 0.0000 sec (2 possible paths)
Runtime for N=3: 0.0001 sec (24 possible paths)
Runtime for N=4: 0.0016 sec (720 possible paths)
Runtime for N=5: 0.1344 sec (40,320 possible paths)
Runtime for N=6: 19.9810 sec (3,628,800 possible paths)
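For contrast: the explosion in the benchmark comes purely from enumerating every down/right path. If only the minimum cost is needed, a dynamic-programming pass computes it in O(N^2). This sketch is my addition, not part of the answer above; min_path_cost is a hypothetical name, and recovering the actual tuple list would need an extra backtracking step over the cost table:

import numpy as np

def min_path_cost(mat: np.ndarray) -> int:
    """Cheapest cost from (0,0) to (N-1,N-1) using only down/right moves."""
    n = len(mat)
    cost = np.zeros_like(mat)
    cost[0, 0] = mat[0, 0]
    for i in range(1, n):  # first column is only reachable by going down
        cost[i, 0] = cost[i-1, 0] + mat[i, 0]
    for j in range(1, n):  # first row is only reachable by going right
        cost[0, j] = cost[0, j-1] + mat[0, j]
    for i in range(1, n):
        for j in range(1, n):  # cheapest way in is from above or from the left
            cost[i, j] = mat[i, j] + min(cost[i-1, j], cost[i, j-1])
    return int(cost[-1, -1])

For example, min_path_cost(random_int_matrix(5)) returns the cost that the brute force above would find for N=5, without enumerating the 40,320 candidate paths.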

Related

Optimising slow code: recursive program for getting android lock patterns

I am in the midst of a project and would like to find all solutions to the Android pattern unlock. If you have not seen it before, here it is, with a link to a Stack Overflow post discussing it in more detail.
The base rules are:
- Only visit a node 0 or 1 times
- No jumping over unvisited nodes
- No cyclic paths
My implementation deals with solving the problem for an N by M grid, with a cap on the max length of a pattern. Here it is:
import math

def get_all_sols(grid_size: (int, int), max_len: int) -> list:
    """
    Return all solutions to the android problem as a list
    :param grid_size: (x, y) size of the grid
    :param max_len: maximum number of nodes in the solution
    """
    sols = []

    def r_sols(current_sol):
        # The solution values are stored as ids,     0, 1, 2
        # e.g. for a 3x3 grid:                       3, 4, 5
        #                                            6, 7, 8
        current_y = current_sol[-1] // grid_size[1]
        current_x = current_sol[-1] - current_y * grid_size[1]  # Cache x and y of last visited node
        grid = {}  # Prepping a dict to store options for travelling
        grid_id = -1
        for y in range(grid_size[1]):
            for x in range(grid_size[0]):
                grid_id += 1
                if grid_id in current_sol:  # Avoid using the same node twice
                    continue
                # Find some kind of distance, no need to root since all values will be like this
                dist = (x - current_x) ** 2 + (y - current_y) ** 2
                # Likely very slow, but need to hold some kind of slope value,
                # so that jumping over a point can be detected
                slope = math.atan2((y - current_y), (x - current_x))
                # If the option table doesn't have the slope, add a new entry with distance and id;
                # if it does, check distances and pick the closer one
                grid[slope] = (dist, grid_id) if grid.get(slope) is None or grid[slope][0] > dist else grid[slope]
        # The code matches the android login criteria, since:
        # - Each node is visited either 0 or 1 time(s)
        # - The path cannot jump over unvisited nodes, but can over visited ones
        # - The path is not a cycle
        r_sol = [current_sol]
        if len(current_sol) == max_len:  # Stop if hit the max length and return
            return r_sol
        for _, opt in grid.values():  # Else recurse for each possible choice
            r_sol += r_sols(current_sol + [opt])
        return r_sol

    for start in range(grid_size[0] * grid_size[1]):
        sols += r_sols([start])
    return sols
My current issue is the runtime as the paths or grid get bigger. Could I get some help optimizing the function?
For verification, a 4x4 grid should have these path stats:
1 nodes: 16 paths
2 nodes: 172 paths
3 nodes: 1744 paths
4 nodes: 16880 paths
5 nodes: 154680 paths
6 nodes: 1331944 paths
7 nodes: 10690096 paths
Assuming the algorithm is correct, you can apply some small optimizations. The biggest one is to cut the recursion earlier by moving the len(current_sol) == max_len check to the top. Then you can build set(current_sol) to speed up the membership test. You can also replace val**2 with val*val and store some temporary results so as not to recompute them. In fact, every basic operation is slow with CPython, and it performs almost no optimization. Here is the resulting code:
def get_all_sols_faster(grid_size: (int, int), max_len: int) -> list:
    sols = []

    def r_sols(current_sol):
        r_sol = [current_sol]
        if len(current_sol) == max_len:
            return r_sol
        current_y = current_sol[-1] // grid_size[1]
        current_x = current_sol[-1] - current_y * grid_size[1]
        grid = {}
        grid_id = -1
        current_sol_set = set(current_sol)
        for y in range(grid_size[1]):
            for x in range(grid_size[0]):
                grid_id += 1
                if grid_id in current_sol_set:
                    continue
                diff_x, diff_y = x - current_x, y - current_y
                dist = diff_x * diff_x + diff_y * diff_y
                slope = math.atan2(diff_y, diff_x)
                tmp = grid.get(slope)
                grid[slope] = (dist, grid_id) if tmp is None or tmp[0] > dist else tmp
        for _, opt in grid.values():
            r_sol += r_sols(current_sol + [opt])
        return r_sol

    for start in range(grid_size[0] * grid_size[1]):
        sols += r_sols([start])
    return sols
This code is about 3 times faster.
Honestly, for such a brute-force algorithm, CPython is a mess. I think you should use a natively compiled language to get much faster code (certainly at least an order of magnitude faster). Note that counting results instead of producing all the solutions should also be faster.
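To illustrate that last point, here is a sketch (my addition) of the counting variant: the recursion returns how many patterns extend the current prefix instead of building a list of them, so no millions of intermediate lists are allocated. count_all_sols is a hypothetical name; the traversal logic is unchanged from get_all_sols_faster:

import math

def count_all_sols(grid_size: (int, int), max_len: int) -> int:
    def r_count(current_sol):
        count = 1  # the current prefix is itself a valid pattern
        if len(current_sol) == max_len:
            return count
        current_y = current_sol[-1] // grid_size[1]
        current_x = current_sol[-1] - current_y * grid_size[1]
        grid = {}
        grid_id = -1
        current_sol_set = set(current_sol)
        for y in range(grid_size[1]):
            for x in range(grid_size[0]):
                grid_id += 1
                if grid_id in current_sol_set:
                    continue
                diff_x, diff_y = x - current_x, y - current_y
                dist = diff_x * diff_x + diff_y * diff_y
                slope = math.atan2(diff_y, diff_x)
                tmp = grid.get(slope)
                grid[slope] = (dist, grid_id) if tmp is None or tmp[0] > dist else tmp
        for _, opt in grid.values():
            count += r_count(current_sol + [opt])
        return count
    return sum(r_count([start]) for start in range(grid_size[0] * grid_size[1]))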

Stack overflow using A* to solve 8-puzzle

The goal for this code is to solve an N-puzzle (see here: https://en.wikipedia.org/wiki/15_puzzle)
The function is meant to take in a scrambled list of integers 0 through (N**2-1), and return the list of moves to reach the solved state, where a move is a list of the x and y coordinates (e.g. [2, 2] means moving the bottom right piece in a 3x3 puzzle). I know only certain boards are solvable, and the board I am testing my code with is definitely solvable.
I tried implementing an A* search, but there seems to be something wrong with it, as for many boards I'm getting a stack overflow. The board I use in the code below results in a stack overflow, but this board will return the correct sequence of moves after about 20 seconds (which seems way too long):
[7, 2, 8, 1, 5, 6, 0, 3, 4]
The basic idea is to start with a list of paths of moves, find the lowest-cost path that has not been expanded (cost = current length of path + remaining Manhattan distance), expand that path, and keep going until the solution is found.
I understand that for an 8-puzzle brute-forcing might be just as viable as A*, if not more so, but I'd like my code to work for a 15-puzzle as well.
From my debugging, it seems like the code is kind of working as expected; I'm just not sure why it's taking so long and resulting in a stack overflow for most boards. If I had to guess, I'd say that maybe I could eliminate more paths to speed things up, but I'm not sure how to do that without possibly eliminating the best path.
I'm really new to programming, so is there a simple bug in my code, or do I have a fundamental misunderstanding of the algorithm? I am fairly confident my helper functions are working as intended, and that the issue is in the solve function. Any advice would be appreciated.
import math
import copy

# Making a move on the board
def makeMove(board, move):
    L = int(math.sqrt(len(board)))
    copyBoard = copy.copy(board)
    zI = copyBoard.index(L**2-1)
    # Calculating move index in 1D list based off of x/y coords
    moveI = move[0] + L*move[1]
    copyBoard[zI], copyBoard[moveI] = copyBoard[moveI], copyBoard[zI]
    return copyBoard

# Function to find the board based off of the given path
def makeMoves(board, path):
    newBoard = copy.deepcopy(board)
    for move in path:
        newBoard = makeMove(newBoard, move)
    return newBoard

def mDist(board):  # Calculating manhattan distance of board
    totalDist = 0
    L = int(math.sqrt(len(board)))
    for i in range(int(L**2)):
        # Finding sum of differences between x and y coordinates
        cX, cY = i % L, i // L
        fX, fY = board[i] % L, board[i] // L
        dX, dY = abs(cX-fX), abs(cY - fY)
        totalDist += (dX+dY)
    return totalDist

def nDisp(board):
    score = 0
    for i in range(len(board)):
        if i != board[i]:
            score += 1
    return score

def getLegalMoves(board):
    finalMoves = []
    L = int(math.sqrt(len(board)))
    # Finding the coordinates of the blank
    zI = board.index(L**2-1)
    zX, zY = zI % L, zI // L
    # Finding possible moves from that blank
    testMoves = [[zX+1, zY], [zX-1, zY], [zX, zY+1], [zX, zY-1]]
    for move in testMoves:
        if isLegalMove(move, L):
            finalMoves.append(move)
    return finalMoves

def isLegalMove(move, L):
    # Move is legal if on board
    for i in move:
        if i < 0 or i >= L:
            return False
    return True

def solve(board):
    queue = []  # List of paths, where a path is a sequence of moves
    for move in getLegalMoves(board):  # Getting initial paths
        queue.append([move])

    def search(queue, board):
        bestScore = math.inf
        bestPath = None
        for path in queue:
            # Score based off of A* = estimated distance to finish + current distance
            dist = mDist(makeMoves(board, path))
            score = dist + len(path)
            # Checking if solved
            if dist == 0:
                return path
            # Finding best path that has not already been expanded
            if score < bestScore:
                bestScore = score
                bestPath = path
        # Removing the path since it is going to be expanded
        queue.remove(bestPath)
        bestPathBoard = makeMoves(board, bestPath)
        # Expanding the path
        for move in getLegalMoves(bestPathBoard):
            newPath = bestPath + [move]
            queue.append(newPath)
        # Recursing
        return search(queue, board)

    return search(queue, board)

print(solve([8, 0, 1, 6, 7, 3, 2, 5, 4]))
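Both the stack overflow and the slowness come from search(): it recurses once per expansion and rescans the whole queue on every call. The standard shape for A* is an iterative loop with the frontier in a heapq priority queue plus a visited set. A minimal sketch of that, reusing mDist, getLegalMoves, and makeMove from above (solve_heap is a hypothetical name):

import heapq

def solve_heap(board):
    # Iterative A*: frontier entries are (f, g, tie-breaker, path, board),
    # where f = g + Manhattan distance. The counter breaks ties so the heap
    # never has to compare two board lists.
    counter = 0
    frontier = [(mDist(board), 0, counter, [], board)]
    seen = set()
    while frontier:
        f, g, _, path, cur = heapq.heappop(frontier)
        key = tuple(cur)
        if key in seen:  # already expanded via a cheaper route
            continue
        seen.add(key)
        if mDist(cur) == 0:  # solved
            return path
        for move in getLegalMoves(cur):
            nxt = makeMove(cur, move)
            if tuple(nxt) not in seen:
                counter += 1
                heapq.heappush(frontier, (g + 1 + mDist(nxt), g + 1, counter,
                                          path + [move], nxt))
    return None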

What is wrong with my Dividing Sequences code

I am trying to code this problem:
This problem is about sequences of positive integers a1, a2, ..., aN. A subsequence of a sequence is anything obtained by dropping some of the elements. For example, 3,7,11,3 is a subsequence of 6,3,11,5,7,4,3,11,5,3, but 3,3,7 is not a subsequence of 6,3,11,5,7,4,3,11,5,3.
Given a sequence of integers, your aim is to find the length of the longest fully dividing subsequence of this sequence. A fully dividing sequence is a sequence a1, a2, ..., aN where ai divides aj whenever i < j. For example, 3, 15, 60, 720 is a fully dividing sequence.
My code is:
n=input()
ar=[]
temp=0
for i in range (0,n):
    temp=input()
    ar.append(temp)

def best(i):
    if i==0:
        return (1)
    else:
        ans =1
        for j in range (0,i):
            if (ar[j]%ar[i]==0):
                ans=max(ans,(best(j)+1))
        return (ans)

an=[]
for i in range (0,n):
    temp=best(i)
    an.append(temp)
print max(an)
the input was
9
2
3
7
8
14
39
145
76
320
and I should get 3 as output (because of 2, 8, 320), but I am getting 1.
As j < i, you need to check whether ar[j] is a divisor of ar[i], not vice versa. So this means you need to put this condition (and only this one, not combined with the inverse):
    if (ar[i]%ar[j]==0):
With this change the output for the given sample data is 3.
The confusion comes from the definition, in which i < j, while in your code j < i.
This solves your problem without using any recursion :)
n = int(input())
ar = []
bestvals = []
best_stored = []
for x in range(n):
    ar.append(int(input()))
    best_stored.append(0)
best_stored[0] = 1
for i in range(n):
    maxval = 1
    for j in range(i):
        if ar[i] % ar[j] == 0:
            maxval = max(maxval,(best_stored[j])+1)
    best_stored[i] = maxval
print(max(best_stored))
For the graph theory solution I alluded to in a comment:
class Node(object):
    def __init__(self, x):
        self.x = x
        self.children = []

    def add_child(self, child_x):
        # Not actually used here, but a useful alternate constructor!
        new = self.__class__(child_x)
        self.children.append(new)
        return new

    def how_deep(self):
        """Does a DFS to return how deep the deepest branch goes."""
        maxdepth = 1
        for child in self.children:
            maxdepth = max(maxdepth, child.how_deep()+1)
        return maxdepth

nums = [9, 2, 3, 7, 8, 14, 39, 145, 76, 320]
nodes = [Node(num) for num in nums]
for i, node in enumerate(nodes):
    for other in nodes[i+1:]:  # strictly later elements, so a node is never its own child
        if other.x % node.x == 0:
            node.children.append(other)

# graph built, rooted at nodes[0]
result = max([n.how_deep() for n in nodes])
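For the sample input above this evaluates to the expected answer, since the deepest chain is 2 -> 8 -> 320:

print(result)  # -> 3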

Faster algorithm for finding number of paths between two nodes

I am trying to answer a question on an online judge in Python, but I am exceeding both the time limit and memory limit. The question is pretty much asking for the number of all paths from a start node to an end node. Full question specifications can be seen here.
This is my code:
import sys

lines = sys.stdin.read().strip().split('\n')
n = int(lines[0])
dict1 = {}
for i in xrange(1, n+1):
    dict1[i] = []
for i in xrange(1, len(lines) - 1):
    numbers = map(int, lines[i].split())
    num1 = numbers[0]
    num2 = numbers[1]
    dict1[num2].append(num1)

def pathfinder(start, graph, count):
    new = []
    if start == []:
        return count
    for i in start:
        numList = graph[i]
        for j in numList:
            if j == 1:
                count += 1
            else:
                new.append(j)
    return pathfinder(new, graph, count)

print pathfinder([n], dict1, 0)
What the code does is start at the end node and work its way up to the top by exploring all neighbouring nodes. I made essentially a breadth-first-search-style algorithm, but it's taking up too much space and time. How can I improve this code to make it more efficient? Is my approach wrong, and how should I fix it?
Since the graph is acyclic there is a topological ordering, which we can immediately see to be 1, 2, ..., n. So we can use dynamic programming the same way it is used to solve the longest path problem. In a list paths, the element paths[i] stores how many paths there are from 1 to i. The update is simple: for each edge (i, j), taking i from our topological order, we do paths[j] += paths[i].
from collections import defaultdict

graph = defaultdict(list)
n = int(input())
while True:
    tokens = input().split()
    a, b = int(tokens[0]), int(tokens[1])
    if a == 0:
        break
    graph[a].append(b)

paths = [0] * (n+1)
paths[1] = 1
for i in range(1, n+1):
    for j in graph[i]:
        paths[j] += paths[i]
print(paths[n])
Note that what you are implementing is not actually BFS, since you don't mark which vertices you have visited, which makes your start list grow out of proportion. Test with the graph:
for i in range(1, n+1):
    dict1[i] = list(range(i-1, 0, -1))
If you print the size of start, you can see that the max value it gets for a given n grows exactly as binomial(n, floor(n/2)), which is ~4^n/sqrt(n). Note also that BFS is not what you want, since it is not possible to count the number of paths that way.
import sys
from collections import defaultdict

def build_matrix(filename, x):
    # A[i] stores number of paths from node x to node i.
    # O(n) to build parents_of_node
    parents_of_node = defaultdict(list)
    with open(filename) as infile:
        num_nodes = int(infile.readline())
        A = [0] * (num_nodes + 1)  # A[0] is dummy variable. Not used.
        for line in infile:
            if line == "0 0":
                break
            u, v = map(int, line.strip().split())
            parents_of_node[v].append(u)
            # Initialize all direct descendants of x to 1
            if u == x:
                A[v] = 1
    # Number of paths from x to i = sum(number of paths from x to parent of i)
    for i in xrange(1, num_nodes + 1):  # O(n)
        A[i] += sum(A[p] for p in parents_of_node[i])  # O(max fan-in of graph), assuming O(1) for accessing dict.
    # Total time complexity to build A is O(n * (max fan-in of graph))
    return A

def main():
    filename = sys.argv[1]
    x = 1  # Find number of paths from x
    y = 4  # to y
    A = build_matrix(filename, x)
    print(A[y])
What you are doing is a DFS (not a BFS) in that code...
Here's a link to a good solution...
EDITED:
Use this approach instead...
http://www.geeksforgeeks.org/find-paths-given-source-destination/
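The linked approach is a depth-first backtracking enumeration of simple paths. A rough sketch of the counting version (my code, not the page's; it assumes the graph is a dict mapping each node to its successors):

def count_paths(graph, u, target, visited=None):
    """Count simple paths from u to target via DFS backtracking."""
    if visited is None:
        visited = set()
    if u == target:
        return 1
    visited.add(u)
    total = 0
    for v in graph.get(u, []):
        if v not in visited:
            total += count_paths(graph, v, target, visited)
    visited.remove(u)  # backtrack so other branches may reuse u
    return total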

Code is taking too much time

I wrote code to arrange numbers after taking user input. The ordering requires that the sum of adjacent numbers is prime. Up to an input of 10 the code works fine; if I go beyond that, the system hangs. Please let me know how to optimize it.
For example, for the input 8 the answer should be: (1, 2, 3, 4, 7, 6, 5, 8)
Code as follows:
import itertools

x = raw_input("please enter a number")
range_x = range(int(x)+1)
del range_x[0]
result = list(itertools.permutations(range_x))

def prime(x):
    for i in xrange(1,x,2):
        if i == 1:
            i = i+1
        if x%i==0 and i < x :
            return False
        else:
            return True

def is_prime(a):
    for i in xrange(len(a)):
        print a
        if i < len(a)-1:
            if prime(a[i]+a[i+1]):
                pass
            else:
                return False
        else:
            return True

for i in xrange(len(result)):
    if i < len(result)-1:
        if is_prime(result[i]):
            print 'result is:'
            print result[i]
            break
    else:
        print 'result is'
        print result[i-1]
For posterity ;-), here's one more, based on finding a Hamiltonian path. It's Python 3 code. As written, it stops upon finding the first path, but it can easily be changed to generate all paths. On my box, it finds a solution for all n in 1 through 900 inclusive in about one minute total. For n somewhat larger than 900, it exceeds the maximum recursion depth.
The prime generator (psieve()) is vast overkill for this particular problem, but I had it handy and didn't feel like writing another ;-)
The path finder (ham()) is a recursive backtracking search, using what's often (but not always) a very effective ordering heuristic: of all the vertices adjacent to the last vertex in the path so far, look first at those with the fewest remaining exits. For example, this is "the usual" heuristic applied to solving Knight's Tour problems. In that context, it often finds a tour with no backtracking needed at all. Your problem appears to be a little tougher than that.
def psieve():
    import itertools
    yield from (2, 3, 5, 7)
    D = {}
    ps = psieve()
    next(ps)
    p = next(ps)
    assert p == 3
    psq = p*p
    for i in itertools.count(9, 2):
        if i in D:      # composite
            step = D.pop(i)
        elif i < psq:   # prime
            yield i
            continue
        else:           # composite, = p*p
            assert i == psq
            step = 2*p
            p = next(ps)
            psq = p*p
        i += step
        while i in D:
            i += step
        D[i] = step

def build_graph(n):
    primes = set()
    for p in psieve():
        if p > 2*n:
            break
        else:
            primes.add(p)
    np1 = n+1
    adj = [set() for i in range(np1)]
    for i in range(1, np1):
        for j in range(i+1, np1):
            if i+j in primes:
                adj[i].add(j)
                adj[j].add(i)
    return set(range(1, np1)), adj

def ham(nodes, adj):
    class EarlyExit(Exception):
        pass

    def inner(index):
        if index == n:
            raise EarlyExit
        avail = adj[result[index-1]] if index else nodes
        for i in sorted(avail, key=lambda j: len(adj[j])):
            # Remove vertex i from the graph. If this isolates
            # more than 1 vertex, no path is possible.
            result[index] = i
            nodes.remove(i)
            nisolated = 0
            for j in adj[i]:
                adj[j].remove(i)
                if not adj[j]:
                    nisolated += 1
                    if nisolated > 1:
                        break
            if nisolated < 2:
                inner(index + 1)
            nodes.add(i)
            for j in adj[i]:
                adj[j].add(i)

    n = len(nodes)
    result = [None] * n
    try:
        inner(0)
    except EarlyExit:
        return result

def solve(n):
    nodes, adj = build_graph(n)
    return ham(nodes, adj)
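A quick check against the example from the question (my addition; the exact path returned depends on how the heuristic breaks ties):

print(solve(8))  # a valid ordering, e.g. [1, 2, 3, 4, 7, 6, 5, 8]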
This answer is based on @Tim Peters' suggestion about Hamiltonian paths.
There are many possible solutions. To avoid excessive memory consumption for intermediate solutions, a random path can be generated. This also makes it easy to utilize multiple CPUs (each CPU generates its own paths in parallel).
import multiprocessing as mp
import sys

def main():
    number = int(sys.argv[1])
    # directed graph, vertices: 1..number (including ends)
    # there is an edge between i and j if (i+j) is prime
    vertices = range(1, number+1)
    G = {}  # vertex -> adjacent vertices
    is_prime = sieve_of_eratosthenes(2*number+1)
    for i in vertices:
        G[i] = []
        for j in vertices:
            if is_prime[i + j]:
                G[i].append(j)  # there is an edge from i to j in the graph
    # utilize multiple cpus
    q = mp.Queue()
    for _ in range(mp.cpu_count()):
        p = mp.Process(target=hamiltonian_random, args=[G, q])
        p.daemon = True  # do not survive the main process
        p.start()
    print(q.get())

if __name__=="__main__":
    main()
where Sieve of Eratosthenes is:
def sieve_of_eratosthenes(limit):
    is_prime = [True]*limit
    is_prime[0] = is_prime[1] = False  # zero and one are not primes
    for n in range(int(limit**.5 + .5)):
        if is_prime[n]:
            for composite in range(n*n, limit, n):
                is_prime[composite] = False
    return is_prime
and:
import random

def hamiltonian_random(graph, result_queue):
    """Build random paths until a Hamiltonian path is found."""
    vertices = list(graph.keys())
    while True:
        # build a random path
        path = [random.choice(vertices)]  # start with a random vertex
        while True:  # until path can be extended with a random adjacent vertex
            neighbours = graph[path[-1]]
            random.shuffle(neighbours)
            for adjacent_vertex in neighbours:
                if adjacent_vertex not in path:
                    path.append(adjacent_vertex)
                    break
            else:  # can't extend path
                break
        # check whether it is hamiltonian
        if len(path) == len(vertices):
            assert set(path) == set(vertices)
            result_queue.put(path)  # found a hamiltonian path
            return
Example
$ python order-adjacent-prime-sum.py 20
Output
[19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
The output is a random sequence that satisfies the conditions:
- it is a permutation of the range from 1 to 20 (inclusive)
- the sum of adjacent numbers is prime
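Both conditions are easy to verify mechanically (my addition, reusing sieve_of_eratosthenes from above):

path = [19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
is_prime = sieve_of_eratosthenes(2 * len(path) + 1)
assert sorted(path) == list(range(1, len(path) + 1))         # a permutation of 1..20
assert all(is_prime[a + b] for a, b in zip(path, path[1:]))  # adjacent sums are prime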
Time performance
It takes around 10 seconds on average to get a result for n = 900; extrapolating the time as an exponential function, it should take around 20 seconds for n = 1000:
The image is generated using this code:
import numpy as np

figname = 'hamiltonian_random_noset-noseq-900-900'
Ns, Ts = np.loadtxt(figname+'.xy', unpack=True)

# use polyfit to fit the data
#   y = c*a**n
#   log y = log (c * a ** n)
#   log Ts = log c + Ns * log a
coeffs = np.polyfit(Ns, np.log2(Ts), deg=1)
poly = np.poly1d(coeffs, variable='Ns')

# use curve_fit to fit the data
from scipy.optimize import curve_fit

def func(x, a, c):
    return c*a**x

popt, pcov = curve_fit(func, Ns, Ts)
aa, cc = popt
a, c = 2**coeffs

# plot it
import matplotlib.pyplot as plt
plt.figure()
plt.plot(Ns, np.log2(Ts), 'ko', label='time measurements')
plt.plot(Ns, np.polyval(poly, Ns), 'r-',
         label=r'$time = %.2g\times %.4g^N$' % (c, a))
plt.plot(Ns, np.log2(func(Ns, *popt)), 'b-',
         label=r'$time = %.2g\times %.4g^N$' % (cc, aa))
plt.xlabel('N')
plt.ylabel('log2(time in seconds)')
plt.legend(loc='upper left')
plt.show()
Fitted values:
>>> c*a**np.array([900, 1000])
array([ 11.37200806, 21.56029156])
>>> func([900, 1000], *popt)
array([ 14.1521409 , 22.62916398])
Dynamic programming, to the rescue:
def is_prime(n):
    return all(n % i != 0 for i in range(2, n))

def order(numbers, current=[]):
    if not numbers:
        return current
    for i, n in enumerate(numbers):
        if current and not is_prime(n + current[-1]):
            continue
        result = order(numbers[:i] + numbers[i + 1:], current + [n])
        if result:
            return result
    return False

result = order(list(range(500)))
for i in range(len(result) - 1):
    assert is_prime(result[i] + result[i + 1])
You can force it to work for even larger lists by increasing the maximum recursion depth.
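For example (my addition; the recursion goes one level deeper per placed number, so the limit must exceed the list length):

import sys
sys.setrecursionlimit(10000)  # CPython's default is around 1000
result = order(list(range(1000)))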
Here's my take on a solution. As Tim Peters pointed out, this is a Hamiltonian path problem.
So the first step is to generate the graph in some form.
Well, the zeroth step in this case is to generate prime numbers. I'm going to use a sieve, but whatever prime test is fine. We need primes up to 2*n, since that is the largest sum any two numbers can reach.
m = 8
n = m + 1  # Just so I don't have to worry about zero indexes and random +/- 1's
primelen = 2 * m
prime = [True] * primelen
prime[0] = prime[1] = False
for i in range(4, primelen, 2):
    prime[i] = False
for i in range(3, primelen, 2):
    if not prime[i]:
        continue
    for j in range(i * i, primelen, i):
        prime[j] = False
Ok, now we can test for primality with prime[i]. Now it's easy to make the graph edges: if I have a number i, what numbers can come next? I'll also make use of the fact that i and j have opposite parity.
pairs = [set(j for j in range(i%2+1, n, 2) if prime[i+j])
         for i in range(n)]
So here pairs[i] is a set object whose elements are integers j such that i+j is prime.
Now we need to walk the graph. This is really where the time consuming part is and all further optimizations will be done here.
chains = [
    ([], set(range(1, n)))
]
chains is going to keep track of the valid paths as we walk them. The first element in the tuple will be your result. The second element is all the unused numbers, or unvisited nodes. The idea is to take one chain out of the queue, take a step down the path and put it back.
while chains:
    chain, unused = chains.pop()
    if not chain:
        # we haven't even started, all unused are valid
        valid_next = unused
    else:
        # We need numbers that are both unused and paired with the last node
        # Using sets makes this easy
        valid_next = unused & pairs[chain[-1]]
    for num in valid_next:
        # Take a step to the new node and add the new path back to chains
        # Reminder, it's important not to mutate anything here, always make new objs
        newchain = chain + [num]
        newunused = unused - set([num])
        chains.append( (newchain, newunused) )
        # are we done?
        if not newunused:
            print newchain
            chains = False
Notice that if there is no valid next step, the path is removed without a replacement.
This is really memory-inefficient, but runs in a reasonable time. The biggest performance bottleneck is walking the graph, so the next optimization would be popping and inserting paths in intelligent places to prioritize the most likely paths. It might be helpful to use a collections.deque or a different container for your chains in that case.
EDIT
Here is an example of how you can implement your path priority. We will assign each path a score and keep the chains list sorted by this score. For a simple example, I will suggest that paths containing "harder to use" nodes are worth more; that is, for each step on a path the score will increase by n - len(valid_next). The modified code will look something like this:
import bisect

chains = ...
chains_score = [0]
while chains:
    chain, unused = chains.pop()
    score = chains_score.pop()
    ...
    for num in valid_next:
        newchain = chain + [num]
        newunused = unused - set([num])
        newscore = score + n - len(valid_next)
        index = bisect.bisect(chains_score, newscore)
        chains.insert(index, (newchain, newunused))
        chains_score.insert(index, newscore)
Remember that insertion is O(n), so the overhead of adding this can be rather large. It's worth doing some analysis on your score algorithm to keep the queue length len(chains) manageable.
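If those O(n) inserts do become the bottleneck, a heap gives O(log n) pushes and pops. A sketch of the same prioritized search with heapq (my addition; order_heap is a hypothetical name assuming the pairs table built earlier; scores are negated because heapq is a min-heap and we want the highest score first, and a counter breaks ties so the sets are never compared):

import heapq

def order_heap(n, pairs):
    counter = 0
    heap = [(0, counter, [], frozenset(range(1, n)))]  # (-score, tie-breaker, chain, unused)
    while heap:
        negscore, _, chain, unused = heapq.heappop(heap)
        if not unused:
            return chain  # every number placed: done
        valid_next = unused if not chain else unused & pairs[chain[-1]]
        for num in valid_next:
            counter += 1
            heapq.heappush(heap, (negscore - (n - len(valid_next)), counter,
                                  chain + [num], unused - {num}))
    return None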
