Height of BST +1 more than expected - python

I wrote a function that determines the height of a BST, but when the height of the tree is, e.g., 2, the result I get is 3, and so on. I don't know what I should change in my code. If you need the whole code to answer, tell me and I'll post it.
def maxDepth(self, node):
    if node is None:
        return 0
    else:
        # Compute the depth of each subtree
        lDepth = self.maxDepth(node.left)
        rDepth = self.maxDepth(node.right)
        # Use the larger one
        if (lDepth > rDepth):
            return lDepth + 1
        else:
            return rDepth + 1

Instead of return 0, return -1 for an empty tree; every result then comes out smaller by 1, which is the height you expect. The corrected code is below:
def maxDepth(self, node):
    if node is None:
        return -1
    else:
        # Compute the depth of each subtree
        lDepth = self.maxDepth(node.left)
        rDepth = self.maxDepth(node.right)
        # Use the larger one
        if (lDepth > rDepth):
            return lDepth + 1
        else:
            return rDepth + 1
You can also use the built-in max() function to make the code shorter:
def maxDepth(self, node):
    if node is None:
        return -1
    return max(self.maxDepth(node.left), self.maxDepth(node.right)) + 1
Note: the OP is correct. Height should be edge-based, i.e., a tree with a single node (say, 5) has height 0, and an empty tree (None) has height -1. Two sources support this:
The Wikipedia article on trees defines height in terms of edges and says: "Conventionally, an empty tree (tree with no nodes, if such are allowed) has height −1."
Another source is the well-known book Introduction to Algorithms by T. H. Cormen et al., which also counts edges when defining height.
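The edge-based convention can be checked with a small, self-contained sketch. The `Node` class and the free function `height` below are hypothetical stand-ins for the OP's tree (which was not posted in full), not the original code.

```python
class Node:
    """Hypothetical minimal tree node used only for this check."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    # An empty tree has height -1, so a single node gets max(-1, -1) + 1 == 0.
    if node is None:
        return -1
    return max(height(node.left), height(node.right)) + 1


single = Node(5)                    # one node, zero edges
chain = Node(5, Node(3, Node(1)))   # 5 -> 3 -> 1, two edges on the longest path

print(height(None), height(single), height(chain))  # -1 0 2
```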

Related

Depth of a generic tree - Python

I know how to recursively calculate the depth of a binary tree, however I'm not sure I know the best way to calculate the depth of a generic tree where any node n could have k children. I think I've implemented a correct solution below, but I'm wondering if it can be optimized in any way?
def depth_binary(node):
    if node is None:
        return 0
    return max(depth_binary(node.left), depth_binary(node.right)) + 1
def depth_tree(node):
    if node is None:
        return 0
    max_val = 0
    # take the deepest child subtree
    for n in node.adjacent:
        d = depth_tree(n)
        if d > max_val:
            max_val = d
    return max_val + 1
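Asymptotically there is little to optimize: every node has to be visited once, so any correct solution is O(n). The loop can be written more compactly, though. The sketch below assumes a hypothetical `TreeNode` class whose `adjacent` attribute is a list of children, mirroring the attribute name used in the question.

```python
class TreeNode:
    """Hypothetical node type: `adjacent` holds the list of child nodes."""
    def __init__(self, value, adjacent=None):
        self.value = value
        self.adjacent = adjacent or []


def depth_tree(node):
    # Node-based depth, as in the question: empty tree -> 0, single node -> 1.
    if node is None:
        return 0
    # default=0 covers leaves, whose child list is empty.
    return 1 + max((depth_tree(child) for child in node.adjacent), default=0)


leaf = TreeNode("c")
root = TreeNode("r", [TreeNode("a"), TreeNode("b", [leaf])])
print(depth_tree(root))  # 3
```

For very deep trees, the recursion could be replaced by an explicit stack to avoid Python's recursion limit.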

Python TSP Held-Karp algorithm

I'm trying to implement Held-Karp in Python but it doesn't seem to be working. My problem differs from the classical TSP (as used in the description of the H-K algorithm I found on the web) in two ways:
- I don't need to return to the original node. I think this is called a Hamiltonian path, but I'm not very familiar with graph algorithms so I'm not entirely sure. Each city still needs to be visited exactly once, as in TSP
- Some edges in the graph are missing
I used networkx and created this function:
def hk(nodes,Graph,start_node, total_ops, min_weight = 9999999,min_result = []):
    nodes.remove(start_node)
    # removes the current node from the set of nodes
    for next_node in nodes:
        total_ops += 1
        current_weight = 0
        try:
            current_weight += Graph[start_node][next_node]["weight"]
            # checks if there's an edge between the current node and the next node
            # If there's an edge, adds the edge weight
        except:
            continue
        sub_result = []
        if len(nodes) > 1:
            new_nodes = set(nodes)
            sub_result,sub_weight,total_ops = hk(new_nodes,Graph,next_node, total_ops)
            #calculates the minimum weight of the remaining tree
            current_weight += sub_weight
        if current_weight < min_weight:
            # if the weight of the tree is below the minimum weight, update minimum weight
            min_weight = current_weight
            min_result = [next_node] + sub_result
    return min_result,min_weight,total_ops
But something is clearly wrong as I'm expecting O(n ** 2 * 2 ** n) complexity but am getting O(n!) instead, same as for the brute force method (trying all combinations one by one). Clearly, there's an error in my implementation.
Thank you.
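One likely reason for the O(n!) behaviour is that the recursion above never reuses results: the same subproblem (a current node plus a set of unvisited nodes) is solved again for every order in which the earlier nodes were visited. Held-Karp gets its O(n^2 * 2^n) bound by caching on exactly that pair. The sketch below illustrates the idea under some assumptions: `weights` is a hypothetical plain dict-of-dicts standing in for the networkx graph, a missing key means a missing edge, and the function name `held_karp_path` is made up for this example.

```python
import math
from functools import lru_cache


def held_karp_path(weights, start):
    """Cheapest path from `start` that visits every node once (no return trip).

    `weights[u][v]` is the weight of edge u-v; an absent key means no edge.
    Returns (cost, path), or (math.inf, None) if no such path exists.
    """
    nodes = frozenset(weights)

    @lru_cache(maxsize=None)
    def best(current, remaining):
        # Cheapest way to visit every node in `remaining`, starting at `current`.
        if not remaining:
            return 0, ()
        best_cost, best_path = math.inf, ()
        for nxt in remaining:
            w = weights.get(current, {}).get(nxt)
            if w is None:          # no edge current -> nxt
                continue
            sub_cost, sub_path = best(nxt, remaining - {nxt})
            if w + sub_cost < best_cost:
                best_cost, best_path = w + sub_cost, (nxt,) + sub_path
        return best_cost, best_path

    cost, path = best(start, nodes - {start})
    if cost == math.inf:
        return math.inf, None
    return cost, (start,) + path


# Small example graph with the edge a-d missing.
weights = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
print(held_karp_path(weights, "a"))  # (4, ('a', 'b', 'c', 'd'))
```

There are at most n * 2^n distinct (current, remaining) pairs and each does O(n) work, which gives the expected bound, at the price of O(n * 2^n) memory for the cache.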

Python - Get largest number in heap that is less than n

I'm having trouble finding the following functionality in Python:
Given a set of numbers, return the largest number less than or equal to n or return None if no such number exists.
For example, given the list [1, 3, 7, 10] and n = 9, the function would return 7.
I'm looking for Python functionality similar to Java's TreeSet.lower.
I can use another data structure. A heap seems appropriate.
The O(n) solution is too slow for the scale of the problem. I'm looking for an O(log n) solution.
Background
I'm working on https://www.hackerrank.com/challenges/maximise-sum. The possible values range from 1 - 10^14, so using a sorted list with binary search is too slow.
My current thought is to iterate on Python's heapq backing array directly. I was hoping there might be something more Pythonic.
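I think you can use the bintrees library for this: https://bitbucket.org/mozman/bintrees/src
Example:
import bintrees
tree = bintrees.RBTree()
tree.insert(5, 1)
tree.insert(6, 1)
tree.insert(10, 1)
tree.ceiling_item(5)  # -> (5, 1)
ceiling_item returns the smallest key greater than or equal to the argument; for the largest key less than or equal to n, the corresponding call is tree.floor_item(n).
The complexity of this operation is O(log N).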
nextLowest = lambda seq, x: min([(x - i, i) for i in seq if x >= i] or [(0, None)])[1]
Usage:
t = [10, 20, 50, 200, 100, 300, 250, 150]
print(nextLowest(t, 55))
> 50
I took the above solution from a similar question.
If you can't make any assumptions about the ordering of the array, then I think the best you can do is O(n):
def largest_less_than(numlist, n):
    answer = min(numlist, key=lambda x: n-x if n>=x else float('inf'))
    if answer > n:
        answer = None
    return answer
If the question is about repeatedly getting the largest-less-than for different n values on the same dataset, then maybe one solution is using bucket sort to get your list sorted in O(n), and then use bisect repeatedly.
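The "sort once, then bisect" approach could look like the sketch below. It uses the standard-library `bisect` module; the helper name `largest_at_most` is made up for this example, and the list is assumed to be sorted in ascending order before any query.

```python
import bisect


def largest_at_most(sorted_values, n):
    """Return the largest element <= n, or None if every element is > n."""
    # bisect_right gives the index just past the last element <= n.
    i = bisect.bisect_right(sorted_values, n)
    return sorted_values[i - 1] if i else None


values = sorted([1, 3, 7, 10])
print(largest_at_most(values, 9))   # 7
print(largest_at_most(values, 0))   # None
```

Each query is O(log n). For a strict "less than n" lookup, `bisect.bisect_left` can be used in place of `bisect_right`.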
You can also do a simple linear scan. The version below assumes the list is sorted in ascending order and runs in O(n):
numbers = [1, 3, 7, 10]
n = 9
largest_number = None
for number in numbers:
    if number <= n:
        largest_number = number
    else:
        # the list is sorted, so no later element can qualify
        break
if largest_number is not None:
    print('value found ' + str(largest_number))
else:
    print('value not found')
If you don't have to support dynamic additions and removals from the list, then just sort it and use binary search to find the largest < n in O(log N) time.
ig-melnyk's answer is probably the right way to complete this question. But since HackerRank doesn't have a way to use libraries, here's an implementation of a Left-Leaning Red Black Tree that I used for the problem.
class LLRB(object):
    # Note: this tree is ordered with larger values in the left subtree.
    class Node(object):
        RED = True
        BLACK = False
        __slots__ = ['value', 'left', 'right', 'color']
        def __init__(self, value):
            self.value = value
            self.left = None
            self.right = None
            self.color = LLRB.Node.RED
        def flip_colors(self):
            self.color = not self.color
            self.left.color = not self.left.color
            self.right.color = not self.right.color
    def __init__(self):
        self.root = None
    def search_higher(self, value):
        """Return the smallest item greater than or equal to value. If no such value
        can be found, return 0.
        """
        x = self.root
        best = None
        while x is not None:
            if x.value == value:
                return value
            elif x.value < value:
                x = x.left
            else:
                best = x.value if best is None else min(best, x.value)
                x = x.right
        return 0 if best is None else best
    @staticmethod
    def is_red(node):
        if node is None:
            return False
        else:
            return node.color == LLRB.Node.RED
    def insert(self, value):
        self.root = LLRB.insert_at(self.root, value)
        self.root.color = LLRB.Node.BLACK
    @staticmethod
    def insert_at(node, value):
        if node is None:
            return LLRB.Node(value)
        if LLRB.is_red(node.left) and LLRB.is_red(node.right):
            node.flip_colors()
        if node.value == value:
            node.value = value
        elif node.value < value:
            node.left = LLRB.insert_at(node.left, value)
        else:
            node.right = LLRB.insert_at(node.right, value)
        if LLRB.is_red(node.right) and not LLRB.is_red(node.left):
            node = LLRB.rotate_left(node)
        if LLRB.is_red(node.left) and LLRB.is_red(node.left.left):
            node = LLRB.rotate_right(node)
        return node
    # rotate_left and rotate_right are called above but not shown in the listing;
    # the two methods below are an assumed reconstruction of the standard
    # left-leaning red-black rotations.
    @staticmethod
    def rotate_left(node):
        x = node.right
        node.right = x.left
        x.left = node
        x.color = node.color
        node.color = LLRB.Node.RED
        return x
    @staticmethod
    def rotate_right(node):
        x = node.left
        node.left = x.right
        x.right = node
        x.color = node.color
        node.color = LLRB.Node.RED
        return x
You can decrease the number you're looking for until it is found.
This function finds the position of the largest number <= n in fs, a sorted list of integers.
If no number is smaller than or equal to n, it returns -1.
def findmaxpos(n):
    if n < fs[0]: return -1
    while True:
        if n in fs: return fs.index(n)
        n -= 1

How can I restrict a KDTree query to a subset of the nodes?

tl;dr
I need a way to find "Foreign Nearest Neighbors" using a KDTree or some other spatial data structure, i.e., find the nearest neighbor within a subset of the tree.
I built an MST algorithm that uses a KDTree to find nearest neighbors. However, eventually it needs to look beyond nearest neighbors and into "Nearest Foreign Neighbors" in order to connect distant nodes. My first approach simply increases the k-nn parameter iteratively until the query returns a node in the subset. I cache k because each time the function is called the breadth of its search expands, and there is no point in searching the previous k < k_cache again.
def FNNd(kdtree, A, b):
    """
    kdtree -> nodes in subnet -> coord of b -> index of a
    returns nearest foreign neighbor a∈A of b
    """
    a = None
    b = cartesian_projection(b)
    k = k_cache[str(b)] if str(b) in k_cache else 2
    while a not in A:
        #scipy kdtree where query -> [dist], [idx]
        _, nn = kdtree.query(b, k=k)
        a = nn[-1][k-1]
        k += 1
    k_cache[str(b)] = k-1
    #return NN a ∈ A of b
    return a
However, this is quite 'hacky' and inefficient, so I was thinking I could implement a KDTree myself that stops traversing when doing so would lead into subtrees that don't include the restricted subset. Then the nearest neighbor in the subset would have to be in that left or right branch. After many attempts I can't seem to get this to actually work. Is there a flaw in my logic? A better way to do this? A better data structure?
Here's my KDTree:
class KDTree(object):
    def __init__(self, data, depth=0, make_idx=True):
        self.n, self.k = data.shape
        if make_idx:
            # index the data
            data = np.column_stack((data, np.arange(self.n)))
        else:
            # subtract the indexed dimension in later calls
            self.k -= 1
        self.build(data, depth)
    def build(self, data, depth):
        if data.size > 0:
            # get the axis to pivot on
            self.axis = depth % self.k
            # sort the data
            s_data = data[np.argsort(data[:, self.axis])]
            # find the pivot point
            point = s_data[len(s_data) // 2]
            # point coord
            self.point = point[:-1]
            # point index
            self.idx = int(point[-1])
            # all nodes below this node
            self.children = s_data[np.all(s_data[:, :-1] != self.point, axis=1)]
            # branches
            self.left = KDTree(s_data[: len(s_data) // 2], depth+1, False)
            self.right = KDTree(s_data[len(s_data) // 2 + 1: ], depth+1, False)
        else:
            # empty node
            self.axis=0
            self.point = self.idx = self.left = self.right = None
            self.children = np.array([])
    def query(self, point, best=None):
        if self.point is None:
            return best
        if best is None:
            best = (self.idx, self.point)
        # check if current node is closer than best
        if distance(self.point, point) < distance(best[1], point):
            best = (self.idx, self.point)
        # continue traversing the tree
        best = self.near_tree(point).query(point, best)
        # traverse the away branch if the orthogonal distance is less than best
        if self.orthongonal_dist(point) < distance(best[1], point):
            best = self.away_tree(point).query(point, best)
        return best
    def orthongonal_dist(self, point):
        orth_point = np.copy(point)
        orth_point[self.axis] = self.point[self.axis]
        return distance(point, self.point)
    def near_tree(self, point):
        if point[self.axis] < self.point[self.axis]:
            return self.left
        return self.right
    def away_tree(self, point):
        if self.near_tree(point) == self.left:
            return self.right
        return self.left
[EDIT] Updated attempt, however this doesn't guarantee a return
def query_subset(self, point, subset, best=None):
    # if point in subset, update best
    if self.idx in subset:
        # if closer than current best, or best is none update
        if best is None or distance(self.point, point) < distance(best[1], point):
            best = (self.idx, self.point)
    # Dead end backtrack up the tree
    if self.point is None:
        return best
    near = self.near_tree(point)
    far = self.away_tree(point)
    # what nodes are in the near branch
    if near.children.size > 1:
        near_set = set(np.append(near.children[:, -1], near.idx))
    else: near_set = {near.idx}
    # check the near branch, if its nodes intersect with the queried subset
    # otherwise move to the away branch
    if any(x in near_set for x in subset):
        best = near.query_subset(point, subset, best)
    else:
        best = far.query_subset(point, subset, best)
    # validate best, by ensuring closer point doesn't exist just beyond partition
    if best is not None:
        if self.orthongonal_dist(point) < distance(best[1], point):
            best = far.query_subset(point, subset, best)
    return best
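One way to get a query that always returns a result when the subset is non-empty is to keep the ordinary nearest-neighbor traversal and simply refuse to accept nodes outside the subset. The far branch is then visited whenever nothing has been found yet, or when the splitting plane is closer than the best match so far, where the distance to the plane is measured along the split axis only. The sketch below is a pure-Python illustration under stated assumptions, not a drop-in replacement for the class above: `Node`, `build` and the free function `query_subset` are hypothetical names, points are plain tuples, and `math.dist` requires Python 3.8+. It does not skip subtrees that contain no subset members, so it trades some speed for simplicity.

```python
import math


class Node:
    """Hypothetical KD-tree node holding one point and its original index."""
    def __init__(self, point, idx, axis, left, right):
        self.point, self.idx, self.axis = point, idx, axis
        self.left, self.right = left, right


def build(points, indices=None, depth=0):
    # Recursively split on the median along alternating axes.
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(points[indices[mid]], indices[mid], axis,
                build(points, indices[:mid], depth + 1),
                build(points, indices[mid + 1:], depth + 1))


def query_subset(node, target, allowed, best=None):
    """Return (distance, idx) of the nearest point whose idx is in `allowed`."""
    if node is None:
        return best
    if node.idx in allowed:
        d = math.dist(node.point, target)
        if best is None or d < best[0]:
            best = (d, node.idx)
    # Signed distance from the target to this node's splitting plane.
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = query_subset(near, target, allowed, best)
    # The far side can only help if nothing was found yet, or if the plane
    # is closer than the best candidate found so far.
    if best is None or abs(diff) < best[0]:
        best = query_subset(far, target, allowed, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(query_subset(tree, (9, 2), {0, 1, 3})[1])  # 1
```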

How to find the kth smallest node in BST? (revisited)

I asked a similar question yesterday, but I have since reached a different solution from the one posted in the original question, so I am reposting with new code. I am now keeping track of the number of right and left children of each node. The code works fine for some cases, but it fails when finding the 6th smallest element. The problem is that I somehow need to carry the number of children down the tree. For example, for node 5, I need to carry over the rank of node 4, and I am not able to do that.
This is not homework; I am preparing for interviews, and this is one of the classic questions that I can't solve.
class Node:
    """docstring for Node"""
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.numLeftChildren = 0
        self.numRightChildren = 0
class BSTree:
    def __init__(self):
        # initializes the root member
        self.root = None
    def addNode(self, data):
        # creates a new node and returns it
        return Node(data)
    def insert(self, root, data):
        # inserts a new data
        if root == None:
            # it there isn't any data
            # adds it and returns
            return self.addNode(data)
        else:
            # enters into the tree
            if data <= root.data:
                root.numLeftChildren += 1
                # if the data is less than the stored one
                # goes into the left-sub-tree
                root.left = self.insert(root.left, data)
            else:
                # processes the right-sub-tree
                root.numRightChildren += 1
                root.right = self.insert(root.right, data)
            return root
    def getRankOfNumber(self, root, x, rank):
        if root == None:
            return 0
        if rank == x:
            return root.data
        else:
            if x > rank:
                return self.getRankOfNumber(root.right, x, rank+1+root.right.numLeftChildren)
            if x <= rank:
                return self.getRankOfNumber(root.left, x, root.left.numLeftChildren+1)
# main
btree = BSTree()
root = btree.addNode(13)
btree.insert(root, 3)
btree.insert(root, 14)
btree.insert(root, 1)
btree.insert(root, 4)
btree.insert(root, 18)
btree.insert(root, 2)
btree.insert(root, 12)
btree.insert(root, 10)
btree.insert(root, 5)
btree.insert(root, 11)
btree.insert(root, 8)
btree.insert(root, 7)
btree.insert(root, 9)
btree.insert(root, 6)
print(btree.getRankOfNumber(root, 8, rank=root.numLeftChildren+1))
You have the rank of a node. You need to find the rank of its left or right child. Well, how many nodes are between the node and its child?
      a
     / \
    /   \
   b     c
  / \   / \
 W   X Y   Z
Here's an example BST. Lowercase letters are nodes; uppercase are subtrees. The number of nodes between a and b is the number of nodes in X. The number of nodes between a and c is the number of nodes in Y. Thus, you can compute the rank of b or c from the rank of a and the size of X or Y.
rank(b) == rank(a) - size(X) - 1
rank(c) == rank(a) + size(Y) + 1
You had the c formula, but the wrong b formula.
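The two rank formulas can be turned into a k-th smallest lookup along the following lines. This is a sketch under assumptions rather than a patch to the code in the question: it stores a single subtree `size` per node instead of separate left and right counters, and the names `size`, `insert` and `kth_smallest` are chosen for this example.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def size(node):
    return node.size if node is not None else 0


def insert(root, data):
    if root is None:
        return Node(data)
    if data <= root.data:
        root.left = insert(root.left, data)
    else:
        root.right = insert(root.right, data)
    root.size += 1
    return root


def kth_smallest(root, k):
    """Return the k-th smallest value (1-based), or None if k is out of range."""
    node = root
    rank = size(root.left) + 1 if root is not None else 0
    while node is not None:
        if k == rank:
            return node.data
        if k < rank:
            node = node.left
            if node is not None:
                # rank(b) == rank(a) - size(X) - 1, where X is b's right subtree
                rank = rank - size(node.right) - 1
        else:
            node = node.right
            if node is not None:
                # rank(c) == rank(a) + size(Y) + 1, where Y is c's left subtree
                rank = rank + size(node.left) + 1
    return None


values = [13, 3, 14, 1, 4, 18, 2, 12, 10, 5, 11, 8, 7, 9, 6]
root = None
for v in values:
    root = insert(root, v)
print(kth_smallest(root, 6), kth_smallest(root, 8))  # 6 8
```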
