Recurrence relation and time complexity of finding the next larger element in a generic tree - Python

Question: Given a generic tree and an integer n, find and return the node with the next larger element in the tree, i.e. the node whose value is just greater than n.
Although I was able to solve it in O(n) by removing the second for loop and doing the comparisons while calling the recursion, I am a bit curious about the time complexity of the following version of the code.
I came up with the recurrence relation T(n) = T(n-1) + (n-1) = O(n^2), where T(n-1) is the time taken by the children and (n-1) is for finding the next larger value (the second for loop). Have I done it right, or am I missing something?
import sys

def nextLargestHelper(root, n):
    """
    root => reference to root node
    n => integer
    Returns the node and value just larger than n (not merely the first value found that is larger than n).
    """
    # Special case
    if root is None:
        return None, None
    # list to store all values > n
    largers = list()
    # Induction step
    if root.data > n:
        largers.append([root, root.data])
    # Induction step and base case; if there are no children the recursion is not called
    for child in root.children:
        # Induction hypothesis; the recursive call returns the node and value just larger than n
        node, value = nextLargestHelper(child, n)
        # If a larger value was found in the subtree
        if node:
            largers.append([node, value])
    # Initialize node to None, and value to +infinity
    node = None
    value = sys.maxsize
    # traverse all larger values and return the smallest value greater than n
    for item in largers:  # each item has the structure [node, value]
        # this is why value is initialized to +infinity; so the test is true the first time
        if item[1] < value:
            node = item[0]
            value = item[1]
    return node, value

First of all: please use different characters for the O-notation and for the input values.
You "touch" every node exactly once, so the result should be O(n). What is a bit special about your algorithm is that it finds the minimum afterwards. You could include this in your go-through-all-children loop for an easier recurrence estimation. As it is, you have to do a recurrence estimation for the minimum of the list as well.
Your recurrence equation should look more like T(n) = a*T(n/a) + c = O(n), since in each step you have a children, each forming a subtree of size (n-1)/a. In each step you have, next to some constant factors, also the computation of the minimum of a list with at most a+1 elements. You could write it as a*T(n/a) + a*c1 + c2, which is the same as a*T(n/a) + c. The actual formula would look more like T(n) = a*T((n-1)/a) + c, but the n-1 makes it harder to apply the master theorem.
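For illustration, here is a minimal sketch of that single-pass O(n) variant, with the minimum folded into the children loop. It assumes the same .data/.children node interface as above and, unlike the original, returns just the node rather than a (node, value) pair:

def next_largest(root, n):
    if root is None:
        return None  # an empty subtree contributes no candidate
    best = root if root.data > n else None
    for child in root.children:
        candidate = next_largest(child, n)
        # keep the smaller of the two candidate values that are still greater than n
        if candidate is not None and (best is None or candidate.data < best.data):
            best = candidate
    return best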

Analyzing the complexity of matrix path-finding

Recently in my homework, I was assigned to solve the following problem:
Given an n x n matrix of zeros and ones, find the number of paths from [0,0] to [n-1,n-1] that go only through zeros (the paths are not necessarily disjoint), where you may only walk down or to the right, never up or left. Return a matrix of the same order where the [i,j] entry is the number of paths in the original matrix that go through [i,j]. The solution has to be recursive.
My solution in python:
def find_zero_paths(M):
    n,m = len(M),len(M[0])
    dict = {}
    for i in range(n):
        for j in range(m):
            M_top,M_bot = blocks(M,i,j)
            X,Y = find_num_paths(M_top),find_num_paths(M_bot)
            dict[(i,j)] = X*Y
    L = [[dict[(i,j)] for j in range(m)] for i in range(n)]
    return L[0][0],L

def blocks(M,k,l):
    n,m = len(M),len(M[0])
    assert k<n and l<m
    M_top = [[M[i][j] for i in range(k+1)] for j in range(l+1)]
    M_bot = [[M[i][j] for i in range(k,n)] for j in range(l,m)]
    return [M_top,M_bot]

def find_num_paths(M):
    dict = {(1, 1): 1}
    X = find_num_mem(M, dict)
    return X

def find_num_mem(M,dict):
    n, m = len(M), len(M[0])
    if M[n-1][m-1] != 0:
        return 0
    elif (n,m) in dict:
        return dict[(n,m)]
    elif n == 1 and m > 1:
        new_M = [M[0][:m-1]]
        X = find_num_mem(new_M,dict)
        dict[(n,m-1)] = X
        return X
    elif m == 1 and n>1:
        new_M = M[:n-1]
        X = find_num_mem(new_M, dict)
        dict[(n-1,m)] = X
        return X
    new_M1 = M[:n-1]
    new_M2 = [M[i][:m-1] for i in range(n)]
    X,Y = find_num_mem(new_M1, dict),find_num_mem(new_M2, dict)
    dict[(n-1,m)],dict[(n,m-1)] = X,Y
    return X+Y
My code is based on the idea that the number of paths that go through [i,j] in the original matrix is equal to the product of the number of paths from [0,0] to [i,j] and the number of paths from [i,j] to [n-1,n-1]. Another idea is that the number of paths from [0,0] to [i,j] is the sum of the number of paths from [0,0] to [i-1,j] and from [0,0] to [i,j-1]. Hence I decided to use a dictionary whose keys are matrices of the form [[M[i][j] for j in range(k)] for i in range(l)] or [[M[i][j] for j in range(k+1,n)] for i in range(l+1,n)] for some 0<=k,l<=n-1, where M is the original matrix, and whose values are the number of paths from the top of the matrix to the bottom. After analyzing the complexity of my code I arrived at the conclusion that it is O(n^6).
Now, my instructor said this code is exponential (for find_zero_paths); however, I disagree.
The size of the recursion tree (for find_num_paths) is bounded by the number of submatrices of the form above, which is O(n^2). Also, each time we add a new matrix to the dictionary we do it in polynomial time (only slicing lists), so the total complexity is polynomial (poly * poly = poly). Also, the function 'blocks' runs in polynomial time, and hence 'find_zero_paths' runs in polynomial time (two lists of polynomial size times a function which runs in polynomial time), so all in all the code runs in polynomial time.
My question: Is the code polynomial and my O(n^6) bound is wrong or is it exponential and I am missing something?
Unfortunately, your instructor is right.
There is a lot to unpack here:
Before we start, a quick note: please don't use dict as a variable name. It hurts ^^. dict is the built-in dictionary constructor in Python, and it is bad practice to shadow it with your own variable.
First, your approach of counting M_top * M_bottom is good if you were to compute only one cell in the matrix. The way you go about it, you are unnecessarily computing some blocks over and over again - that is why I wondered about the recursion; I would use dynamic programming for this one. Once from the start to the end, once from the end to the start, then I would compute the products and be done with it. No need for O(n^6) of separate computations. Since you have to use recursion, I would recommend caching the partial results and reusing them wherever possible.
Second, the root of the issue and the cause of your invisible-ish exponent: it is hidden in the find_num_mem function. Say you compute the last element in the matrix - the result[N][N] field - and let us consider the simplest case, where the matrix is full of zeroes so every possible path exists.
In the first step, your recursion creates the branches [N][N-1] and [N-1][N].
In the second step, these create [N-1][N-1], [N][N-2], [N-2][N], and [N-1][N-1] again - note that [N-1][N-1] is already being computed twice.
In the third step, you once again create two branches from every branch of the previous step - a beautiful example of an exponential explosion.
Now how to go about it: You will quickly notice that some of the branches are being duplicated over and over. Cache the results.
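To illustrate that caching advice, here is a rough sketch (not the poster's code; the function names and structure are only illustrative) that indexes cells by (i, j) instead of slicing submatrices, so each partial count is computed exactly once:

from functools import lru_cache

def paths_through_each_cell(M):
    n, m = len(M), len(M[0])

    @lru_cache(maxsize=None)
    def from_start(i, j):          # paths from (0, 0) to (i, j) through zeros
        if M[i][j] != 0:
            return 0
        if i == 0 and j == 0:
            return 1
        total = 0
        if i > 0:
            total += from_start(i - 1, j)
        if j > 0:
            total += from_start(i, j - 1)
        return total

    @lru_cache(maxsize=None)
    def to_end(i, j):              # paths from (i, j) to (n-1, m-1) through zeros
        if M[i][j] != 0:
            return 0
        if i == n - 1 and j == m - 1:
            return 1
        total = 0
        if i < n - 1:
            total += to_end(i + 1, j)
        if j < m - 1:
            total += to_end(i, j + 1)
        return total

    return [[from_start(i, j) * to_end(i, j) for j in range(m)] for i in range(n)]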

Calculate height of an arbitrary (non-binary) tree

I'm currently taking an online data structures course and this is one of the homework assignments; please guide me towards the answer rather than giving me the answer.
The prompt is as follows:
Task. You are given a description of a rooted tree. Your task is to compute and output its height. Recall that the height of a (rooted) tree is the maximum depth of a node, or the maximum distance from a leaf to the root. You are given an arbitrary tree, not necessarily a binary tree.
Input Format. The first line contains the number of nodes n. The second line contains integer numbers from −1 to n−1 parents of nodes. If the i-th one of them (0 ≤ i ≤ n−1) is −1, node i is the root, otherwise it’s 0-based index of the parent of i-th node. It is guaranteed that there is exactly one root. It is guaranteed that the input represents a tree.
Constraints. 1 ≤ n ≤ 10^5.
My current solution works, but is very slow when n > 10^2. Here is my code:
# python3
import sys
import threading

# In Python, the default limit on recursion depth is rather low,
# so raise it here for this problem. Note that to take advantage
# of the bigger stack, we have to launch the computation in a new thread.
sys.setrecursionlimit(10**7)   # max depth of recursion
threading.stack_size(2**27)    # the new thread will get a stack of this size

# returns all indices of item in seq
def listOfDupes(seq, item):
    start = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start+1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start = loc
    return locs

def compute_height(node, parents):
    if node not in parents:
        return 1
    else:
        return 1 + max(compute_height(i, parents) for i in listOfDupes(parents, node))

def main():
    n = int(input())
    parents = list(map(int, input().split()))
    print(compute_height(parents.index(-1), parents))

threading.Thread(target=main).start()
Example input:
>>> 5
>>> 4 -1 4 1 1
This will yield a solution of 3, because the root is node 1, nodes 3 and 4 branch off of 1, and nodes 0 and 2 branch off of 4, which gives this tree a height of 3.
How can I improve this code to get it under the time benchmark of 3 seconds? Also, would this have been easier in another language?
Python will be fine as long as you get the algorithm right. Since you're only looking for guidance, consider:
1) We know the depth of a node if the depth of its parent is known; and
2) We're not interested in the tree's structure, so we can throw irrelevant information away.
The root node pointer has the value -1. Suppose that we replaced its children's pointers to the root node with the value -2, their children's pointers with -3, and so forth. The greatest absolute value of these is the height of the tree.
If we traverse the tree from an arbitrary node N(0), we can stop as soon as we encounter a negative value at node N(k), at which point we can replace each node on the path with the value of its parent, less one, i.e. N(k-1) = N(k) - 1, N(k-2) = N(k-1) - 1, ..., N(0) = N(1) - 1. As more and more pointers are replaced by their depth, each traversal becomes more likely to terminate by encountering a node whose depth is already known. In fact, this algorithm takes basically linear time.
So: load your data into an array, start with the first element and traverse the pointers until you encounter a negative value. Build another array of the nodes traversed as you go. When you encounter a negative value, use the second array to replace the original values in the first array with their depth. Do the same with the second element and so forth. Keep track of the greatest depth you've encountered: that's your answer.
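A rough sketch of that marking idea (the function name is illustrative; it assumes the parent array uses -1 for the root, so negative entries encode a node's depth, negated):

def tree_height(parents):
    marks = list(parents)          # -1 at the root, parent indices elsewhere
    height = 1
    for start in range(len(marks)):
        path = []
        node = start
        while marks[node] >= 0:    # follow parent pointers until a known depth
            path.append(node)
            node = marks[node]
        depth = -marks[node]       # this node already holds its (negated) depth
        for prev in reversed(path):
            depth += 1
            marks[prev] = -depth   # record the depth for later traversals
        height = max(height, depth)
    return height

print(tree_height([4, -1, 4, 1, 1]))   # prints 3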
The structure of this question looks like it would be better solved bottom up rather than top down. Your top-down approach spends time seeking, which is unnecessary, e.g.:
def height(tree):
    for n in tree:
        l = 1
        while n != -1:
            l += 1
            n = tree[n]
        yield l
In []:
tree = '4 -1 4 1 1'
max(height(list(map(int, tree.split()))))
Out[]:
3
Or if you don't like a generator:
def height(tree):
    d = [1]*len(tree)
    for i, n in enumerate(tree):
        while n != -1:
            d[i] += 1
            n = tree[n]
    return max(d)
In []:
tree = '4 -1 4 1 1'
height(list(map(int, tree.split())))
Out[]:
3
The above is brute force as it doesn't take advantage of reusing parts of the tree you've already visited; it shouldn't be too hard to add that - see the sketch below.
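One way to add that reuse, as a sketch (the name height_cached is illustrative): remember the depth of every node once it is known, so each parent chain is only walked one time.

def height_cached(tree):
    depth = {-1: 0}                 # the "parent" of the root has depth 0
    def node_depth(n):
        # note: for very deep trees this recursive helper still needs a raised recursion limit
        if n not in depth:
            depth[n] = 1 + node_depth(tree[n])
        return depth[n]
    return max(node_depth(i) for i in range(len(tree)))

height_cached(list(map(int, '4 -1 4 1 1'.split())))   # -> 3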
Your algorithm spends a lot of time searching the input for the locations of numbers. If you just iterate over the input once, you can record the locations of each number as you come across them, so you don't have to keep searching over and over later. Consider what data structure would be effective for recording this information.
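For instance (a hedged sketch with illustrative names): one pass over the input can record each node's children, after which the height follows from a single traversal that never rescans the parent list.

def height_from_children(parents):
    children = [[] for _ in parents]
    root = -1
    for node, parent in enumerate(parents):
        if parent == -1:
            root = node
        else:
            children[parent].append(node)
    # iterative depth-first walk, so there are no recursion limit issues
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        stack.extend((child, depth + 1) for child in children[node])
    return best

height_from_children([4, -1, 4, 1, 1])   # -> 3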

How do I write a recursive function that returns the minimum element in an array?

Write a recursive function that returns the minimum element in an array, where c is the array and s is the size. This is my code:
c = [2,5,6,4,3]

def min (c, s):
    smallest = c[0]
    if c[s] < smallest:
        smallest = c[s]
    else:
        return min

print min (c,s)
Error: s is not defined.
Apparently the computer doesn't know what s stands for in the line print min (c,s).
You need to tell the computer what you want the variable s to be. I propose you use 0 instead of s in the function call; that way you will start looking for the minimum number from index 0.
That being said, there are other issues with the code, but this will fix your error and you will be able to move forward with your task.
First of all, I'd caution against the use of min as a function name.
min is a Python built-in, so using it as your own function name may cause unwanted results in your code.
If I'm understanding the code and your question correctly, it seems as if you are trying to get the minimum of the list's length and the smallest integer in the list.
c = [2,5,6,4,3]

# The function takes the array as its parameter; in this case c.
def minimum(array):
    '''This function compares the length of the array to the smallest integer in the array'''
    # count the elements; length is 5 in this case
    length = 0
    for i in array:
        length += 1
    # smallest is 2 in this case
    smallest = min(array)
    print min(smallest, length)
However if all you want is to get the minimum value of an array, just do:
def minval(array):
    print min(array)
A recursive function divides the task into smaller parts that can be solved by the function itself. The function gets repeatedly called with these smaller and smaller tasks until the tasks become so easy, that they can be computed easily.
Thus the smallest element of a list (when computed recursively) is the minimum of the first element and the smallest element of the rest of the list. The task becomes trivial when there is only one element. In Python 3:
def smallest(lst):
    """recursive for learning, not efficient"""
    first, *rest = lst
    return first if not rest else min(first, smallest(rest))
I agree with others that you should avoid using min as a function name, so that you don't collide with python's builtin implementation. I'm also operating under the assumption that you're not supposed to use min, because otherwise the solution is trivial.
Here's a recursive implementation that doesn't require a second argument, since the list's length can be determined via the len function.
def smallest(lst):
    l = len(lst)
    if l > 1:
        mid = l // 2
        m1 = smallest(lst[:mid])
        m2 = smallest(lst[mid:])
        return m1 if m1 < m2 else m2
    return lst[0]
This checks to see whether the argument list has 2 or more values. If so, it splits the list into two halves, determines the smallest value in each half, then returns the smaller of the results. If there's only one element, it's trivially the smallest and gets returned.
Halving the list on each recursive call bounds the depth of the call stack to O(log n), where n is the original list's length. This prevents stack overflow from occurring with any list you could actually create in Python. Another proposed solution whittles the list down one-by-one, and will fail on lists with more than a thousand or so values.

Too slow queries in interval tree

I have a list of intervals and I need to return the ones that overlap with an interval passed in a query. What is special is that in a typical query around a third or even half of the intervals will overlap with the one given in the query. Also, the ratio of the shortest interval to the longest is not more than 1:5. I implemented my own interval tree (augmented red-black tree) - I did not want to use existing implementations because I needed support for closed intervals and some special features. I tested the query speed with 6000 queries in a tree with 6000 intervals (so n=6000 and m=3000 (app.)). It turned out that brute force is just as good as using the tree:
Computation time - loop: 125.220461 s
Tree setup: 0.05064 s
Tree Queries: 123.167337 s
Let me use asymptotic analysis. n: number of queries; n: number of intervals; app. n/2: number of intervals returned in a query:
time complexity brute force: n*n
time complexity tree: n*(log(n) + n/2) = (1/2)*n^2 + n*log(n) = O(n^2)
So the result is saying that the two should be roughly the same for a large n. Still, one would somehow expect the tree to be noticeably faster given the constant 1/2 in front of n^2. So there are three possible reasons I can imagine for the results I got:
a) My implementation is wrong. (Should I be using BFS like below?)
b) My implementation is right, but I made things cumbersome for Python so it needs more time to deal with the tree than to deal with brute force.
c) everything is OK - it is just how things should behave for a large n
My query function looks like this:
from collections import deque
def query(self,low,high):
    result = []
    q = deque([self.root])  # this is the root node in the tree
    append_result = result.append
    append_q = q.append
    pop_left = q.popleft
    while q:
        node = pop_left()  # look at the next node
        if node.overlap(low,high):  # some overlap?
            append_result(node.interval)
        if node.low != None and low <= node.get_low_max():  # en-q left node
            append_q(node.low)
        if node.high != None and node.get_high_min() <= high:  # en-q right node
            append_q(node.high)
    return result
I build the tree like this:
def build(self, intervals):
    """
    Function which is recursively called to build the tree.
    """
    if intervals is None:
        return None
    if len(intervals) > 2:  # intervals is always sorted in increasing order
        mid = len(intervals)//2
        # split intervals into three parts:
        # central element (median)
        center = intervals[mid]
        # left half (<= median)
        new_low = intervals[:mid]
        # right half (>= median)
        new_high = intervals[mid+1:]
        # compute max on the lower side (left):
        max_low = max([n.get_high() for n in new_low])
        # store min on the higher side (right):
        min_high = new_high[0].get_low()
    elif len(intervals) == 2:
        center = intervals[1]
        new_low = [intervals[0]]
        new_high = None
        max_low = intervals[0].get_high()
        min_high = None
    elif len(intervals) == 1:
        center = intervals[0]
        new_low = None
        new_high = None
        max_low = None
        min_high = None
    else:
        raise Exception('The tree is not behaving as it should...')
    return Node(center, self.build(new_low), self.build(new_high),
                max_low, min_high)
EDIT:
A node is represented like this:
class Node:
    def __init__(self, interval, low, high, max_low, min_high):
        self.interval = interval  # pointer to corresponding interval object
        self.low = low            # pointer to node containing intervals to the left
        self.high = high          # pointer to node containing intervals to the right
        self.max_low = max_low    # maximum value on the left side
        self.min_high = min_high  # minimum value on the right side
All the nodes in a subtree can be obtained like this:
def subtree(current):
    node_list = []
    if current.low != None:
        node_list += subtree(current.low)
    node_list += [current]
    if current.high != None:
        node_list += subtree(current.high)
    return node_list
P.S. Note that by exploiting the facts that there is so much overlap and that all intervals have comparable lengths, I managed to implement a simple method based on sorting and bisection that completed in 80 s, but I would say this is over-fitting... Amusingly, by using asymptotic analysis, I found it should have approximately the same runtime as using the tree...
If I correctly understand your problem, you are trying to speed up your process.
If that is the case, try to create a real tree instead of manipulating lists.
Something that looks like this:
class IntervalTreeNode():
    def __init__(self, parent, min, max):
        self.value = (min, max)
        self.parent = parent
        self.leftBranch = None
        self.rightBranch = None

    def insert(self, interval):
        ...

    def asList(self):
        """ return the list that is this node and all the subtree nodes """
        left = []
        if (self.leftBranch != None):
            left = self.leftBranch.asList()
        right = []
        if (self.rightBranch != None):
            right = self.rightBranch.asList()
        return [self.value] + left + right
And then at the start create an IntervalTreeNode and insert all your intervals into it.
This way, if you really need a list you can build it once when you need a result, and not at each step of your recursive iteration using [x:] or [:x], since list manipulation is a costly operation in Python. It is also possible to work directly on the nodes instead of on a list, which will greatly speed up the process, as you just have to return a reference to the node instead of doing some list concatenation.
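To illustrate the "work directly on the nodes" idea, a hedged sketch (it assumes the IntervalTreeNode layout above and does no pruning; pruning would additionally require storing subtree endpoint bounds, as in the poster's augmented Node class):

def overlapping(node, low, high):
    """Yield every stored interval that overlaps the closed query interval [low, high]."""
    if node is None:
        return
    lo, hi = node.value
    if lo <= high and low <= hi:   # closed-interval overlap test
        yield node.value
    yield from overlapping(node.leftBranch, low, high)
    yield from overlapping(node.rightBranch, low, high)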

Recursing over each path in a tree

I'm stuck on a programming question involving a tree for a project.
The problem itself is only a subproblem of the larger question (but I won't post that here as it's not really relevant). Anyway, the problem is:
I'm trying to go over each path in the tree and calculate the associated value.
The situation is for instance like in this tree:
    a
   / \
  b   b
Now the result I should get is the multiplications as follows:
leaf1 = a * b
leaf2 = a * (1-b)
leaf3 = (1-a) * b
leaf4 = (1-a) * (1-b)
And so the leaves one level lower in the tree would basically be the results (note that they do not exist in reality, it's just conceptual).
Now, I want to do this recursively, but there are a few problems:
The values for a and b are generated during the traversal, but the value for b, for instance, should only be generated one time. All values are either 0 or 1.
If you take the left child of a node A, you use the value A in the multiplication; for the right path you use the value 1-A.
Furthermore, the tree is always perfect, i.e. complete and balanced.
Now what I have (I program in Python, but it's more the algorithm in general I'm interested in with this question):
def f(n):
    if n == 1:
        return [1]
    generate value  # (a, b or whatever one it is)
    g = f(n/2)
    h = scalarmultiply(value,g)
    return h.append(g - h)
Note that g and h are lists.
This code was given by one of my professors as possible help, but I don't think it does what I want. At least, it won't give me as a result a list h which has the result for each path. In particular, I don't think it differentiates between b and 1-b. Am I seeing this wrong, and how should I do this?
I'm not very experienced at programming, so try and explain easy if you can :-)
Try something like this:
def f(element_count):
    if element_count == 1: #<-------------------------------A
        return [1,] # at the root node - it has to be 1, otherwise this is pointless
    current_value = get_the_value_however_you_need_to() #<--B
    result_so_far = f(element_count/2) #<--------------------C
    result = []
    for i in result_so_far: #<-------------------------------D
        result.append(i*current_value)
        result.append(i*(1-current_value))
        result.append((1-i)*current_value)
        result.append((1-i)*(1-current_value))
    return result
Here's how it works:
Say you wanted to work with a three layer pyramid. Then the element_count would be the number of elements on the third layer so you would call f(4). The condition at A fails so we continue to B where the next value is generated. Now at C we call f(2).
The process in f(2) is similar, f(2) calls f(1) and f(1) returns [1,] to f(2).
Now we start working our way back to the widest part of the tree...
I'm not sure what your lecturer was getting at with the end of the function. The for loop does the multiplication you explained and builds up a list, which is then returned.
If I'm understanding correctly, you want to build up a binary tree like this:
        A
       / \
      /   \
     /     \
    B       C
   / \     / \
  D   E   F   G
Where the Boolean values (1, 0, or their Python equivalents, True and False) of lower level nodes are calculated from the values of their parent and grandparent using the following rules:
D = A and B
E = A and not B
F = not A and C
G = not A and not C
That is, each node's right descendants calculate their values based on its inverse. You further stated that the tree is defined by a single root value (a) and another value that is used for both of the root's children (b).
Here's a function that will calculate the value of any node of such a tree. The tree positions are defined by an integer index in the same way a binary heap often is, with the parent of a node N being N//2 and its children being 2*N and 2*N+1 (with the root node being 1). It uses a memoization dictionary to avoid recomputing the same values repeatedly.
def tree_value(n, a, b, memo=None):
    if memo is None:
        memo = {1:a, 2:b, 3:b}  # this initialization covers our base cases
    if n not in memo:  # if our value is unknown, compute it
        parent, parent_dir = divmod(n, 2)
        parent_val = tree_value(parent, a, b, memo)  # recurse
        grandparent, grandparent_dir = divmod(parent, 2)
        grandparent_val = tree_value(grandparent, a, b, memo)  # recurse again
        if parent_dir:  # we're a right child, so invert our parent's value
            parent_val = not parent_val
        if grandparent_dir:  # our parent is grandparent's right child, so invert
            grandparent_val = not grandparent_val
        memo[n] = parent_val and grandparent_val
    return memo[n]
You could probably improve performance slightly by noticing that the grandparent's value will always be in the memo dict after the parent's value has been calculated, but I've left that out so the code is clearer.
If you need to efficiently compute many values from the same tree (rather than just one), you probably want to keep a permanent memo dictionary somewhere, perhaps as a value in a global dict, keyed by an (a, b) tuple.
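For example, evaluating the four leaves of the three-level tree above (heap indices 4 through 7) with a = True and b = False reproduces the D/E/F/G rules:

leaves = {node: tree_value(node, True, False) for node in range(4, 8)}
print(leaves)   # {4: False, 5: True, 6: False, 7: False}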
