Speeding up the following function

I am writing code to find all the paths from the top-left to the bottom-right cell of an n*m matrix.
Here is my code:
# Python3 program to print all possible paths from
# top left to bottom right of an m x n matrix
import numpy as np

'''
mat: the m x n matrix
i, j: Current position of the robot
(For the first call use 0, 0)
m, n: Dimensions of the given matrix
pi: Next index to be filled in the path array
path[0..pi-1]: The path traversed by the robot till now
(The array holding the path needs to have
space for at least m+n elements)
'''
def printAllPathsUtil(mat, i, j, m, n, path, pi):
    # Reached the bottom of the matrix
    # so we are left with only option to move right
    if (i == m - 1):
        for k in range(j, n):
            path[pi + k - j] = mat[i][k]
        for l in range(pi + n - j):
            print(path[l], end = " ")
        print()
        return
    # Reached the right corner of the matrix
    # we are left with only the downward movement.
    if (j == n - 1):
        for k in range(i, m):
            path[pi + k - i] = mat[k][j]
        for l in range(pi + m - i):
            print(path[l], end = " ")
        print()
        return
    # Add the current cell
    # to the path being generated
    path[pi] = mat[i][j]
    # Print all the paths
    # that are possible after moving down
    printAllPathsUtil(mat, i + 1, j, m, n, path, pi + 1)
    # Print all the paths
    # that are possible after moving right
    printAllPathsUtil(mat, i, j + 1, m, n, path, pi + 1)
    # Print all the paths
    # that are possible after moving diagonal
    # printAllPathsUtil(mat, i+1, j+1, m, n, path, pi + 1);

# The main function that prints all paths
# from top left to bottom right
# in a matrix 'mat' of size mXn
def printAllPaths(mat, m, n):
    path = [0 for i in range(m + n)]
    printAllPathsUtil(mat, 0, 0, m, n, path, 0)

matrix = np.random.rand(150, 150)
printAllPaths(matrix, 150, 150)
I would like to find all the paths for a 150 by 150 matrix, but this takes a very long time. Is there a good way to make it faster? Any suggestions for speeding up the algorithm itself would also be great.

Since you are talking about paths, I think a graph is a good fit. My idea is to build a directed graph that contains every allowed move and then ask networkx for the paths. Each node is a coordinate pair (x, y), and the following prints all the paths:
import networkx as nx
X = Y = 150
G = nx.DiGraph()
edges = []
for x in range(X):
    for y in range(Y):
        if x<X-1:
            edges.append(((x,y),(x+1,y)))
        if y<Y-1:
            edges.append(((x,y),(x,y+1)))
G.add_edges_from(edges)
print(len(G.nodes()))
print(len(G.edges()))
for path in nx.all_simple_paths(G,(0,0),(X-1,Y-1)):
    print(path)
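Before optimizing either version, it may help to check how many paths there are to print. With only right and down moves, every path in an m x n grid consists of m - 1 down steps and n - 1 right steps in some order, so the count is the binomial coefficient C(m + n - 2, m - 1). The sketch below only counts the paths; the function name count_monotone_paths is made up for this illustration.

```python
import math

def count_monotone_paths(m, n):
    """Number of right/down paths from the top-left to the bottom-right
    cell of an m x n grid: choose which of the m + n - 2 steps go down."""
    return math.comb(m + n - 2, m - 1)

print(count_monotone_paths(3, 3))      # 6
print(count_monotone_paths(150, 150))  # a very large integer
```

For a 150 by 150 matrix the count is far beyond anything that can be printed or stored, so no implementation detail will make full enumeration finish; a faster approach would have to avoid listing every path.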

Related

Find sum of elements within the area of 45 degree rotated rectangle

Given a matrix of integers, we'd like to consider the sum of the elements within the area of a 45° rotated rectangle.
More formally, the area is bounded by two diagonals parallel to the main diagonal and two diagonals parallel to the secondary diagonal.
The dimensions of the rotated rectangle are defined by the number of elements along the borders of the rectangle. Given integers a and b representing the dimensions of the rotated rectangle, and matrix (a matrix of integers), your task is to find the greatest sum of integers contained within an a x b rotated rectangle.
Note: The order of the dimensions is not important - consider all a x b and b x a rectangles.
Example: for
matrix = [[1, 2, 3, 4, 0],
          [5, 6, 7, 8, 1],
          [3, 2, 4, 1, 4],
          [4, 3, 5, 1, 6]]
a = 2, and b = 3, the output should be rotatedRectSum(matrix, a, b) = 36.
I need help to understand how range(w - 1, rows - h + 1) and range(0, cols - (h + w - 1) + 1) are calculated?
def rotatedRectSum(matrix, a, b):
    rows, cols = len(matrix), len(matrix[0])
    maxArea = float("-inf")
    # go through possible rectangles along both diagonals
    for w, h in [(a, b), (b, a)]:
        # go through possible "anchors", which is the left top coordinate of the rectangle candidate
        for i in range(w - 1, rows - h + 1):
            for j in range(0, cols - (h + w - 1) + 1):
                area = 0
                # sum up the long diagonals
                for p in range(w):  # go to next long diagonal
                    for q in range(h):  # go down current diagonal
                        area += matrix[i - p + q][j + p + q]
                # sum up the short diagonals
                k, l = i, j + 1  # note that short diagonals have one less element than long diagonals
                for p in range(w - 1):
                    for q in range(h - 1):
                        area += matrix[k - p + q][l + p + q]
                if (area > maxArea): maxArea = area
    return maxArea

I need help to understand how range(w - 1, rows - h + 1) and range(0, cols - (h + w - 1) + 1) are calculated?
What the author calls the anchor is the leftmost cell of the rotated rectangle under consideration. Below is an orange-marked anchor and the rectangle that goes with it (width is 2, height is 3):
i is the row in which the anchor can be. When the width is 2, we cannot have the anchor in the top row, as we need at least one row above it to make room for the top corner of the rectangle. As the row in which the anchor resides already counts as 1, we need w - 1 more rows above it to have enough room for the upper part of the rotated rectangle. This explains why the first range starts with w - 1. It means that the anchor cannot be in the first w - 1 rows, and the least possible row where the anchor can be placed is the one with index w - 1.
With similar reasoning we can deduce how many rows there should at least be below the anchor's row. There should be at least h - 1 rows below it, to have enough room for the bottom part of the rotated rectangle. This means the greatest possible row index for the anchor's row is rows - h. That means we should have a range(w - 1, rows - h + 1) to iterate over all possible row indices for the anchor (remember that the second value given to range is not included in the values it yields).
The reasoning for the second range is the same, but now it concerns the columns where the anchor could possibly be (i.e. the value for j). The anchor can always be in the very first column, as no part of the rectangle lies to the left of it. So valid column numbers for the anchor start at column index 0, hence range(0, ...).
Finally, the number of columns that follow the anchor's column is w - 1 (for reaching the column with the top peak of the rectangle) and another h - 1 (for reaching from there the rightmost cell in the rectangle). That gives a total of w + h - 2 columns that need to be available to the right of the anchor's column. This means the last possible column index for an anchor is cols - (w + h - 1), and so the range for possible column indices should be defined as range(0, cols - (w + h - 1) + 1).
I hope this clarifies it.
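The bounds derived above can also be checked mechanically. The following is a small sketch, not part of the original solution: the helper names anchor_ranges and rectangle_cells are invented here, and the cell formulas are copied from the index expressions in the question's loops. It lists every cell touched for each allowed anchor and confirms that none falls outside the matrix.

```python
def anchor_ranges(rows, cols, w, h):
    """Row and column ranges for the anchor, as derived in the answer."""
    return range(w - 1, rows - h + 1), range(0, cols - (w + h - 1) + 1)

def rectangle_cells(i, j, w, h):
    """All (row, col) cells summed for anchor (i, j), using the same
    index expressions as the long- and short-diagonal loops."""
    cells = [(i - p + q, j + p + q) for p in range(w) for q in range(h)]
    cells += [(i - p + q, j + 1 + p + q)
              for p in range(w - 1) for q in range(h - 1)]
    return cells

rows, cols, w, h = 4, 5, 2, 3  # dimensions from the example matrix
row_range, col_range = anchor_ranges(rows, cols, w, h)
for i in row_range:
    for j in col_range:
        cells = rectangle_cells(i, j, w, h)
        assert all(0 <= r < rows and 0 <= c < cols for r, c in cells)
print(list(row_range), list(col_range))  # [1] [0, 1]
```

For the 4 x 5 example with w = 2 and h = 3, the only valid anchor row is 1, and the rectangle then spans rows 0 through 3, which is exactly the w - 1 rows above and h - 1 rows below described in the answer.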


Dynamic Time Warping returns small value for far away curves

I have Python code that implements Dynamic Time Warping (DTW), which I use to compare the predicted curve to my actual curve. I care about the shape of the curve but also about the distance between the two curves. I z-normalized both curves before calling the function that returns the cost. However, I got weird results. For example:
I got cost of 0.28 for this example:
While I got 0.38 for the below example:
In the first plot, the prediction is much farther from the actual curve than in the second plot. I even got the same value of 0.28 with a prediction that is far more distant, such as 5000 points further away. What is wrong here?
Below is my code from this source:
# Dynamic Time Warping algorithm
import numpy

def dp(dist_mat):
    N, M = dist_mat.shape
    # Initialize the cost matrix
    cost_mat = numpy.zeros((N + 1, M + 1))
    for i in range(1, N + 1):
        cost_mat[i, 0] = numpy.inf
    for i in range(1, M + 1):
        cost_mat[0, i] = numpy.inf
    # Fill the cost matrix while keeping traceback information
    traceback_mat = numpy.zeros((N, M))
    for i in range(N):
        for j in range(M):
            penalty = [
                cost_mat[i, j],      # match (0)
                cost_mat[i, j + 1],  # insertion (1)
                cost_mat[i + 1, j]]  # deletion (2)
            i_penalty = numpy.argmin(penalty)
            cost_mat[i + 1, j + 1] = dist_mat[i, j] + penalty[i_penalty]
            traceback_mat[i, j] = i_penalty
    # Traceback from bottom right
    i = N - 1
    j = M - 1
    path = [(i, j)]  # The traceback is commented out because I am not interested in the path
    # while i > 0 or j > 0:
    #     tb_type = traceback_mat[i, j]
    #     if tb_type == 0:
    #         # Match
    #         i = i - 1
    #         j = j - 1
    #     elif tb_type == 1:
    #         # Insertion
    #         i = i - 1
    #     elif tb_type == 2:
    #         # Deletion
    #         j = j - 1
    #     path.append((i, j))
    # Strip infinity edges from cost_mat before returning
    cost_mat = cost_mat[1:, 1:]
    return (path[::-1], cost_mat)
I use the above code as follows:
from scipy import stats

z_actual = stats.zscore(actual)
z_pred = stats.zscore(mean_predictions)
N = actual.shape[0]
M = mean_predictions.shape[0]
dist_mat = numpy.zeros((N, M))
for i in range(N):
    for j in range(M):
        dist_mat[i, j] = abs(z_actual[i] - z_pred[j])
path, cost_mat = dp(dist_mat)
mape = cost_mat[N - 1, M - 1] / (N + M)
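One thing worth checking, offered as an observation rather than a confirmed diagnosis of the plots above: z-normalization subtracts each curve's own mean, so a constant vertical offset between the two curves disappears before DTW ever sees the data. The sketch below uses only the standard library; the zscore helper is written for this illustration and uses the population standard deviation, which matches the default ddof=0 of scipy.stats.zscore. The sample values are made up.

```python
import statistics

def zscore(values):
    """Standardize a sequence: subtract the mean, divide by the
    population standard deviation (ddof=0)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

actual = [1.0, 2.0, 4.0, 3.0, 5.0]          # made-up sample curve
shifted = [v + 5000.0 for v in actual]      # same shape, 5000 units higher

z_actual = zscore(actual)
z_shifted = zscore(shifted)
max_diff = max(abs(a - b) for a, b in zip(z_actual, z_shifted))
print(max_diff)  # effectively zero: the offset is gone after z-scoring
```

If the vertical distance between the curves matters, one option under this reading would be to normalize both curves with the same mean and standard deviation (for example those of the actual curve) instead of normalizing each one independently.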

Index Out of Range in finding ways for binary maze

I want to count the unique paths through a binary maze. The code below runs on my first maze but finds only 1 unique path, whereas the answer should be 2: at maze[3][0] it could take the "right-down" move straight to the destination instead of going down to maze[4][0] and then taking the "right" move.
# Check if cell (x, y) is valid or not
def isValidCell(x, y, N, M):
    return not (x < 0 or y < 0 or x >= N or y >= M)

def countPaths(maze, i, j, dest, visited):
    # `N × M` matrix
    N = len(maze)
    M = len(maze[0])
    # if destination (x, y) is found, return 1
    if (i, j) == dest:
        return 1
    # stores number of unique paths from source to destination
    count = 0
    # mark the current cell as visited
    visited[i][j] = True
    # if the current cell is a valid and open cell
    if isValidCell(i, j, N, M) and maze[i][j] == 1:
        print(i)
        # go down (i, j) -> (i + 1, j)
        if i + 1 < N and not visited[i + 1][j]:
            print("down")
            count += countPaths(maze, i + 1, j, dest, visited)
        # go up (i, j) -> (i - 1, j)
        elif i - 1 >= 0 and not visited[i - 1][j]:
            print("up")
            count += countPaths(maze, i - 1, j, dest, visited)
        # go right (i, j) -> (i, j + 1)
        elif j + 1 < M and not visited[i][j + 1]:
            print("right")
            count += countPaths(maze, i, j + 1, dest, visited)
        # go right-down (diagonal) (i, j) -> (i + 1, j + 1)
        elif j + 1 < M and i + 1 < N and not visited[i + 1][j + 1]:
            print("right down")
            count += countPaths(maze, i + 1, j + 1, dest, visited)
    # backtrack from the current cell and remove it from the current path
    visited[i][j] = False
    return count

def findCount(maze, src, dest):
    # get source cell (i, j)
    i, j = src
    # get destination cell (x, y)
    x, y = dest
    # base case: invalid input
    if not maze or not len(maze) or not maze[i][j] or not maze[x][y]:
        return 0
    # `N × M` matrix
    N = len(maze)
    M = len(maze[0])
    print(M)
    # 2D matrix to keep track of cells involved in the current path
    visited = [[False for k in range(M)] for l in range(N)]
    # start from source cell (i, j)
    return countPaths(maze, i, j, dest, visited)

if __name__ == '__main__':
    maze = [
        [1, 0],
        [1, 0],
        [1, 0],
        [1, 0],
        [1, 1]
    ]
    # source cell
    src = (0, 0)
    # destination cell
    dest = (4, 1)
    print("The total number of unique paths are", findCount(maze, src, dest))

The code assumes a square matrix
# `N × N` matrix
N = len(maze)
You are feeding it a rectangular matrix maze = 15x5 (didn't count the lines, guesstimated)
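Separately from the matrix-shape point, the question's code chains the four moves with elif, so from any cell only the first move whose condition holds is ever explored. A minimal sketch of one possible fix follows, assuming the intent is to try every move from every cell; the name count_paths and the moves list are introduced here for illustration and are not from the original post.

```python
def count_paths(maze, src, dest):
    """Count simple paths from src to dest that only step on cells equal to 1.
    Allowed moves (taken from the question): down, up, right, right-down."""
    rows, cols = len(maze), len(maze[0])
    moves = [(1, 0), (-1, 0), (0, 1), (1, 1)]
    visited = [[False] * cols for _ in range(rows)]

    def walk(i, j):
        if (i, j) == dest:
            return 1
        visited[i][j] = True
        total = 0
        # Try every move independently (no elif), so each branch is explored.
        for di, dj in moves:
            ni, nj = i + di, j + dj
            in_bounds = 0 <= ni < rows and 0 <= nj < cols
            if in_bounds and maze[ni][nj] == 1 and not visited[ni][nj]:
                total += walk(ni, nj)
        visited[i][j] = False  # backtrack
        return total

    if not maze[src[0]][src[1]] or not maze[dest[0]][dest[1]]:
        return 0
    return walk(*src)

maze = [[1, 0], [1, 0], [1, 0], [1, 0], [1, 1]]
print(count_paths(maze, (0, 0), (4, 1)))  # 2
```

The bounds are taken from len(maze) and len(maze[0]) separately, so the sketch does not depend on the maze being square.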

Random tridiagonal matrix from matlab to python

I want to try to implement the following code from Matlab to Python (I am not familiar with Python in general, but I try to translate it from Matlab using basics)
% n is random integer from 1 to 10
% first set the random seed (because we want our results to be reproducible;
% the seed sets a starting point in the sequence of random numbers the program
rng(n)
% Generate random columns
a = rand(n, 1);
b = rand(n, 1);
c = rand(n, 1);
% Convert to a matrix
A = zeros(n);
for i = 1:n
    if i ~= n
        A(i + 1, i) = a(i + 1);
        A(i, i + 1) = c(i);
    end
    A(i, i) = b(i);
end
This is my attempt in Python:
import numpy as np
## n is random integer from 1 to 10
np.random.seed(n)
### generate random columns:
a = np.random.rand(n)
b = np.random.rand(n)
c = np.random.rand(n)
A = np.zeros((n, n)) ## create zero n-by-n matrix
for i in range(0, n):
    if (i != n):
        A[i + 1, i] = a[i + 1]
        A[i, i + 1] = c[i]
    A[i, i] = b[i]
I run into an error on the line A[i + 1, i] = a[i + 1]. Is there any structure in Python that I am missing here?

As the comments above clearly point out the indexing error, here is a numpy way of doing it based on np.diag:
import numpy as np
# for reproducibility
np.random.seed(42)
# n is a random integer from 1 to 10 (the upper bound of randint is exclusive)
n = np.random.randint(low=1, high=11)
# first diagonal below main diag: k = -1
a = np.random.rand(n-1)
# main diag: k = 0
b = np.random.rand(n)
# first diagonal above main diag: k = 1
c = np.random.rand(n-1)
# sum all 2-d arrays in order to obtain A
A = np.diag(a, k=-1) + np.diag(b, k=0) + np.diag(c, k=1)

Short answer is that for i = 1:n iterates [1, n], inclusive on both bounds, while for i in range(n): iterates [0, n), exclusive on the right bound. Therefore, the check if i ~= n correctly tests if you are at the right edge, while if (i!=n): does not. Replace it with
if i != n - 1:
The long answer is that you don't need any of that code in either language, since both MATLAB and numpy are intended to be used with vectorized operations. In MATLAB, you can write
A = diag(a(2:end), -1) + diag(b, 0) + diag(c(1:end-1), +1)
In numpy, it's very similar:
A = np.diag(a[1:], -1) + np.diag(b, 0) + np.diag(c[:-1], +1)
There are other tricks you can use, especially if you just want random numbers in the matrix:
A = np.random.rand(n, n)
A[np.tril_indices(n, -2)] = A[np.triu_indices(n, 2)] = 0
You can use other index-based approaches:
i, j = np.diag_indices(n)
i = np.concatenate((i[:-1], i, i[1:]))
j = np.concatenate((j[1:], j, j[:-1]))
A = np.zeros((n, n))
A[i, j] = np.random.rand(3 * n - 2)
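The off-by-one described in the short answer can also be checked without numpy. The sketch below is only an illustration of the corrected loop using plain lists and the standard random module; the function name tridiagonal and its seed parameter are made up here, and the random values will not match MATLAB's rng output.

```python
import random

def tridiagonal(n, seed=0):
    """Build an n x n tridiagonal matrix as a list of lists, mirroring the
    MATLAB loop but with 0-based indices."""
    rng = random.Random(seed)
    a = [rng.random() for _ in range(n)]  # a[1:] feeds the sub-diagonal
    b = [rng.random() for _ in range(n)]  # main diagonal
    c = [rng.random() for _ in range(n)]  # c[:-1] feeds the super-diagonal
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):              # i runs over 0 .. n-1
        if i != n - 1:              # last index is n - 1, not n
            A[i + 1][i] = a[i + 1]
            A[i][i + 1] = c[i]
        A[i][i] = b[i]
    return A

for row in tridiagonal(4):
    print(row)
```

With the original test i != n, the branch would also run for i = n - 1 and index row and column n, which is what raises the IndexError.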

Minimal cost path of 2d matrix

I am trying to find the minimal cost path from point (0, 0) to a point in the set {(u, v) | u + v <= 100} in a 2d matrix of data I have generated.
My algorithm is pretty simple, and so far it produces the following (visualized) results, which tell me that something in the algorithm is badly off.
# each cell of path_arr contains a tuple of (i,j) of the next cell in path.
# data contains the "cost" of stepping on its cell
# total_cost_arr is used to assist reconstructing the path.
def min_path(data, m=100, n=100):
    total_cost_arr = np.array([np.array([0 for x in range(0, m)]).astype(float) for x in range(0, n)])
    path_arr = np.array([np.array([(0, 0) for x in range(0, m)], dtype='i,i') for x in range(0, n)])
    total_cost_arr[0, 0] = data[0][0]
    for i in range(0, m):
        total_cost_arr[i, 0] = total_cost_arr[i - 1, 0] + data[i][0]
    for j in range(0, n):
        total_cost_arr[0, j] = total_cost_arr[0, j - 1] + data[0][j]
    for i in range(1, m):
        for j in range(1, n):
            total_cost_arr[i, j] = min(total_cost_arr[i - 1, j - 1], total_cost_arr[i - 1, j], total_cost_arr[i, j - 1]) + data[i][j]
            if total_cost_arr[i, j] == total_cost_arr[i - 1, j - 1] + data[i][j]:
                path_arr[i - 1, j - 1] = (i, j)
            elif total_cost_arr[i, j] == total_cost_arr[i - 1, j] + data[i][j]:
                path_arr[i - 1, j] = (i, j)
            else:
                path_arr[i, j - 1] = (i, j)
I think that placing (i,j) in previous cell is causing some conflicts which lead to this behavior.

I don't think an array is the best structure for your problem.
You should use a graph data structure (with networkx, for example) and an algorithm such as Dijkstra's or A* (which is derived from Dijkstra's).
Dijkstra's algorithm is implemented in networkx (see its shortest-path functions).
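To make the suggestion concrete, here is a minimal sketch of Dijkstra's algorithm run directly on the grid. It uses the standard-library heapq rather than networkx, and it rests on assumptions that are not confirmed by the question: moves are down, right, and diagonal (as in the question's recurrence), all costs are non-negative, the cost of the start cell is included, and the goal lies below and/or to the right of the start. The name dijkstra_grid is invented for this example.

```python
import heapq

def dijkstra_grid(data, start, goal):
    """Return (total_cost, path) of a cheapest path from start to goal,
    where stepping on cell (i, j) costs data[i][j]."""
    rows, cols = len(data), len(data[0])
    moves = [(1, 0), (0, 1), (1, 1)]  # assumed moves: down, right, diagonal
    best = {start: data[start[0]][start[1]]}
    prev = {}  # predecessor of each reached cell on its cheapest known path
    heap = [(best[start], start)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale heap entry
        for di, dj in moves:
            ni, nj = cell[0] + di, cell[1] + dj
            if ni < rows and nj < cols:
                new_cost = cost + data[ni][nj]
                if new_cost < best.get((ni, nj), float("inf")):
                    best[(ni, nj)] = new_cost
                    prev[(ni, nj)] = cell
                    heapq.heappush(heap, (new_cost, (ni, nj)))
    # Walk the predecessor links back from the goal to rebuild the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return best[goal], path[::-1]

grid = [[1, 3, 1],
        [1, 5, 1],
        [4, 2, 1]]
print(dijkstra_grid(grid, (0, 0), (2, 2)))
# (5, [(0, 0), (1, 0), (2, 1), (2, 2)])
```

Storing the predecessor of each cell, rather than the next cell as path_arr does in the question, avoids the conflicts suspected there: a cell can have many possible successors but only one best predecessor.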
