Count number of leaves in a tree - Failed edge case? - python

In an online assessment I was asked to count the number of leaves in a tree. The tree is given in parent-array representation, meaning the tree has n nodes with labels 0, 1, 2, .., n-1, and you are passed a length n array p, where p[i] returns the label of the parent of node i, except when i is the root of the tree in which case p[i] is -1.
I guess one thing to note is that the problem was as stated above, so there were no extra conditions such as e.g. it being a binary tree.
I thought this was a fairly straightforward problem, but the code that I submitted failed a "Small Tree Case" on the testing platform (which does not let you see the test cases). It passed the other tests, including a performance test on a large tree. I've thought about it for a while, but I still cannot see the flaw in my algorithm or in my handling of some edge case.
def countLeaves(p):
    n = len(p)
    if p is None or n == 0 : return 0
    if n == 1 or n == 2 : return 1
    leaves = set(range(n))
    for i in range(n):
        if p[i] == -1: # i is root of tree with >1 node, can't be a leaf
            leaves.discard(i)
        else: # p[i] is parent of node i, can't be a leaf
            leaves.discard(p[i])
    return len(leaves)
In trying to fix the failed "Small Tree Case" I also tried returning None if p is None, returning None if n == 0, and both modifications together, but without success. If anyone could point out what the error in my code may have been I would greatly appreciate it. Thank you.

I would try this:
def countLeaves(p):
    # Check for None before calling len(); len(None) raises a TypeError
    if p is None or len(p) < 2 : return 0
    n = len(p)
    leaves = set(range(n))
    for i in range(n):
        if p[i] == -1: # i is root of tree with >1 node, can't be a leaf
            leaves.discard(i)
        else: # p[i] is parent of node i, can't be a leaf
            leaves.discard(p[i])
    return len(leaves)
The only real change is that it considers a tree with a single node to have no leaves. (The None check is also moved ahead of len(p), which would otherwise raise a TypeError for p = None.)
According to Wolfram MathWorld:
A leaf of an unrooted tree is a node of vertex degree 1. Note that for a rooted or planted tree, the root vertex is generally not considered a leaf node, whereas all other nodes of degree 1 are.
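For comparison, the same count can be sketched with a set difference: a node is a leaf exactly when its label never appears in the parent array. This is only a sketch under the convention quoted above (a lone root is not counted as a leaf); whether the hidden test expects 0 or 1 for a single-node tree is an assumption, and count_leaves is a hypothetical name.

```python
def count_leaves(p):
    """Count leaves of a tree given as a parent array (root has parent -1)."""
    # Assumption: an empty tree or a lone root has no leaves.
    if not p or len(p) < 2:
        return 0
    # Every label that occurs in p is somebody's parent, so it is not a leaf.
    # The -1 entry of the root is not a node label and is simply ignored.
    return len(set(range(len(p))) - set(p))


print(count_leaves([-1]))           # lone root
print(count_leaves([-1, 0, 0]))     # root with two children
print(count_leaves([1, -1, 1, 0]))  # root 1, children 0 and 2, grandchild 3
```

With at least two nodes the root always has a child, so it is excluded automatically and needs no special case.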

Related

Problem with understanding the code of 968. Binary Tree Cameras

I am studying algorithms and trying to solve the LeetCode problem 968. Binary Tree Cameras:
You are given the root of a binary tree. We install cameras on the tree nodes where each camera at a node can monitor its parent, itself, and its immediate children.
Return the minimum number of cameras needed to monitor all nodes of the tree.
I got stuck on it, and after checking the discussion I better understood the logic, but I am still struggling to understand the code:
def minCameraCover(self, root):
    self.res = 0
    def dfs(root):
        if not root: return 2
        l, r = dfs(root.left), dfs(root.right)
        if l == 0 or r == 0:
            self.res += 1
            return 1
        return 2 if l == 1 or r == 1 else 0
    return (dfs(root) == 0) + self.res
I don't understand how l, r can be 0, 0 in the DFS function when the base case is set as if not root: return 2.
What are the mechanics behind this that make dfs(root.left) and dfs(root.right) return 0?
So far I understood that a node has three states:
0: it's a leaf
1: it has a camera and the node is parent of a leaf
2: it's being covered, but does not have a camera on it.
The base case is set for a None, i.e. the absence of a node. Such a virtual position is never a problem, so we can count it as "covered", but there is no camera there. This is why the base case returns 2.
Now when a leaf node is encountered, then obviously both recursive calls will get None as argument and return 2.
Then the expression 2 if l == 1 or r == 1 else 0 will evaluate to 0, as neither l nor r is 1 (they are both 2): the else clause kicks in.
I hope this clarifies that for leaf nodes the return value will be 0, but also for other nodes this can happen: every time both recursive calls return 2, the caller will itself return 0.
Therefore the better explanation of the three states is:
1: the node has a camera
2: the node has no camera, but it is covered from below
0: the node has no camera yet and is not covered from below. If it has a parent, that parent should get a camera so as to cover this node. If it is the root, it must get a camera itself.
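To watch these three states appear, here is a small standalone sketch of the same DFS. TreeNode and min_camera_cover are hypothetical names introduced for illustration (LeetCode supplies its own node class), and the counter is a nonlocal variable instead of self.res.

```python
class TreeNode:
    """Minimal stand-in for the binary tree node class (hypothetical)."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def min_camera_cover(root):
    cameras = 0

    def dfs(node):
        nonlocal cameras
        if node is None:
            return 2              # absent node: counts as covered, no camera
        l, r = dfs(node.left), dfs(node.right)
        if l == 0 or r == 0:
            cameras += 1          # a child is uncovered: put a camera here
            return 1
        # covered from below if a child has a camera, otherwise uncovered
        return 2 if l == 1 or r == 1 else 0

    # If the root itself ends up uncovered (state 0), it needs one more camera.
    return (dfs(root) == 0) + cameras


leaf = TreeNode()
print(min_camera_cover(leaf))                                # single node
print(min_camera_cover(TreeNode(TreeNode(TreeNode()))))      # chain of 3
```

For the single node, both recursive calls hit None and return 2, so dfs returns 0 and the final line adds the one camera the root needs.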

Recurrence relation and time complexity of finding next larger in Generic tree

Question: Given a generic tree and an integer n. Find and return the node with next larger element in the tree i.e. find a node with value just greater than n.
Although I was able to solve it in O(n) by removing the second for loop and doing the comparisons while recursing, I am curious about the time complexity of the following version of the code.
I came up with the recurrence relation T(n) = T(n-1) + (n-1) = O(n^2), where T(n-1) is the time taken by the children and (n-1) is the cost of finding the next larger element (the second for loop). Have I done this right, or am I missing something?
import sys

def nextLargestHelper(root, n):
    """
    root => reference to root node
    n => integer
    Returns the node, and its value, that is just larger than n
    (the smallest value greater than n, not the first larger one found).
    """
    # Special case
    if root is None:
        return None, None
    # list to store all values > n
    largers = list()
    # Induction step
    if root.data > n:
        largers.append([root, root.data])
    # Induction step and base case; if there are no children, do not recurse
    for child in root.children:
        # Induction hypothesis; my function returns the node and value just larger than 'n'
        node, value = nextLargestHelper(child, n)
        # If a larger value was found in the subtree
        if node:
            largers.append([node, value])
    # Initialize node to None, and value to +infinity
    node = None
    value = sys.maxsize
    # traverse all larger values and return the smallest value greater than n
    for item in largers: # structure of item is [Node, value]
        # this is why value is initialized to +infinity; so the test is true the first time
        if item[1] < value:
            node = item[0]
            value = item[1]
    return node, value
First of all: please use different characters for the O-notation and the input values.
You "touch" every node exactly once, so the result should be O(n). The slightly unusual part is that your algorithm finds the minimum afterwards. You could fold this into the loop over the children for an easier recurrence estimate. As it is, you have to do a recurrence estimate for the minimum of the list as well.
Your recurrence equation should look more like T(n) = a*T(n/a) + c = O(n), since (assuming each node has a children) in each step you have a children, each forming a subtree of size (n-1)/a. In each step you have, next to some constant factors, the computation of the minimum of a list with at most a + 1 elements (the node itself plus one candidate per child). You could write it as a*T(n/a) + a*c1 + c2, which is the same as a*T(n/a) + c. The actual formula would look more like this: T(n) = a*T((n-1)/a) + c, but the n-1 makes it harder to apply the master theorem.
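Folding the minimum into the loop over the children might look like the sketch below. Node and next_larger are hypothetical names used only for illustration; the point is that each node does a constant amount of work per child, so the total is proportional to the number of nodes.

```python
class Node:
    """Minimal generic-tree node (hypothetical, for illustration only)."""
    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


def next_larger(root, n):
    """Return the node with the smallest value greater than n, or None."""
    if root is None:
        return None
    # Start with the current node as the candidate if it qualifies.
    best = root if root.data > n else None
    for child in root.children:
        candidate = next_larger(child, n)
        # Keep the smaller of the current best and the child's candidate.
        if candidate is not None and (best is None or candidate.data < best.data):
            best = candidate
    return best


tree = Node(10, [Node(5), Node(20, [Node(15)]), Node(30)])
found = next_larger(tree, 12)
print(found.data if found else None)
```

Here no intermediate list is built, so there is no separate minimum step to account for in the recurrence.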

Calculate height of an arbitrary (non-binary) tree

I'm currently taking an online data structures course and this is one of the homework assignments; please guide me towards the answer rather than giving me the answer.
The prompt is as follows:
Task. You are given a description of a rooted tree. Your task is to compute and output its height. Recall that the height of a (rooted) tree is the maximum depth of a node, or the maximum distance from a leaf to the root. You are given an arbitrary tree, not necessarily a binary tree.
Input Format. The first line contains the number of nodes n. The second line contains integer numbers from −1 to n−1 parents of nodes. If the i-th one of them (0 ≤ i ≤ n−1) is −1, node i is the root, otherwise it’s 0-based index of the parent of i-th node. It is guaranteed that there is exactly one root. It is guaranteed that the input represents a tree.
Constraints. 1 ≤ n ≤ 10^5.
My current solution works, but is very slow when n > 10^2. Here is my code:
# python3
import sys
import threading
# In Python, the default limit on recursion depth is rather low,
# so raise it here for this problem. Note that to take advantage
# of bigger stack, we have to launch the computation in a new thread.
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size
# returns all indices of item in seq
def listOfDupes(seq, item):
    start = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start+1)
        except ValueError: # item does not occur after position start
            break
        else:
            locs.append(loc)
            start = loc
    return locs
def compute_height(node, parents):
    if node not in parents:
        return 1
    else:
        return 1 + max(compute_height(i, parents) for i in listOfDupes(parents, node))
def main():
    n = int(input())
    parents = list(map(int, input().split()))
    print(compute_height(parents.index(-1), parents))
# start the thread only after main is defined, otherwise this raises a NameError
threading.Thread(target=main).start()
Example input:
>>> 5
>>> 4 -1 4 1 1
This will yield a solution of 3, because the root is 1, 3 and 4 branch off of 1, then 0 and 2 branch off of 4 which gives this tree a height of 3.
How can I improve this code to get it under the time benchmark of 3 seconds? Also, would this have been easier in another language?
Python will be fine as long as you get the algorithm right. Since you're only looking for guidance, consider:
1) We know the depth of a node if and only if the depth of its parent is known; and
2) We're not interested in the tree's structure, so we can throw irrelevant information away.
The root node pointer has the value -1. Suppose that we replaced its children's pointers to the root node with the value -2, their children's pointers with -3, and so forth. The greatest absolute value of these is the height of the tree.
If we traverse the tree from an arbitrary node N(0) we can stop as soon as we encounter a negative value at node N(k), at which point we can replace each node with the value of its parent, less one. I.e., N(k-1) = N(k) - 1, N(k-2) = N(k-1) - 1, ..., N(0) = N(1) - 1. As more and more pointers are replaced by their depth, each traversal is more likely to terminate by encountering a node whose depth is already known. In fact, this algorithm takes basically linear time.
So: load your data into an array, start with the first element and traverse the pointers until you encounter a negative value. Build another array of the nodes traversed as you go. When you encounter a negative value, use the second array to replace the original values in the first array with their depth. Do the same with the second element and so forth. Keep track of the greatest depth you've encountered: that's your answer.
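The procedure described above could be sketched roughly as follows. tree_height is a hypothetical name, and the sketch works on a copy of the input rather than overwriting it, which is a choice made here and not part of the description.

```python
def tree_height(parents):
    """Height of a tree given as a parent array (root has parent -1)."""
    # Entries >= 0 are still parent pointers; entries < 0 mean -(depth),
    # so the root's -1 already encodes depth 1.
    depth = list(parents)
    best = 0
    for start in range(len(depth)):
        path = []
        node = start
        # Walk up until reaching a node whose depth is already known.
        while depth[node] >= 0:
            path.append(node)
            node = depth[node]
        value = depth[node]
        # Unwind: each node on the path is one level deeper than its parent.
        for visited in reversed(path):
            value -= 1
            depth[visited] = value
        best = max(best, -depth[start])
    return best


print(tree_height([4, -1, 4, 1, 1]))
```

Each node is appended to a path at most once over the whole run, because its entry is negative afterwards; that is where the roughly linear running time comes from.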
The structure of this question looks like it would be better solved bottom up rather than top down. Your top-down approach spends time seeking, which is unnecessary, e.g.:
def height(tree):
    for n in tree:
        l = 1
        while n != -1:
            l += 1
            n = tree[n]
        yield l
In []:
tree = '4 -1 4 1 1'
max(height(list(map(int, tree.split()))))
Out[]:
3
Or if you don't like a generator:
def height(tree):
    d = [1]*len(tree)
    for i, n in enumerate(tree):
        while n != -1:
            d[i] += 1
            n = tree[n]
    return max(d)
In []:
tree = '4 -1 4 1 1'
height(list(map(int, tree.split())))
Out[]:
3
The above is brute force, as it doesn't reuse the parts of the tree you've already visited, but it shouldn't be too hard to add that.
Your algorithm spends a lot of time searching the input for the locations of numbers. If you just iterate over the input once, you can record the locations of each number as you come across them, so you don't have to keep searching over and over later. Consider what data structure would be effective for recording this information.
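One possible reading of that hint, as a sketch only: record each node's children in a dictionary during a single pass, then walk the tree level by level without recursion. The name height_from_children and the choice of a breadth-first walk are assumptions made for this example.

```python
from collections import defaultdict, deque


def height_from_children(parents):
    """Height of a tree given as a parent array (root has parent -1)."""
    children = defaultdict(list)
    root = None
    # One pass over the input records where every node's children are.
    for node, parent in enumerate(parents):
        if parent == -1:
            root = node
        else:
            children[parent].append(node)
    height = 0
    queue = deque([(root, 1)])   # (node, depth), the root has depth 1
    while queue:
        node, depth = queue.popleft()
        height = max(height, depth)
        for child in children.get(node, ()):
            queue.append((child, depth + 1))
    return height


print(height_from_children([4, -1, 4, 1, 1]))
```

Because the walk is iterative, it does not depend on the recursion limit or on a larger thread stack.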

Why does validating a binary tree require +1 and -1 at the final stage?

This is an efficient solution for checking whether a binary tree is a BST:
# Return true if the given tree is a BST and its values
# >= min and <= max
def isBSTUtil(node, mini, maxi):
    # An empty tree is BST
    if node is None:
        return True
    # False if this node violates min/max constraint
    if node.data < mini or node.data > maxi:
        return False
    # Otherwise check the subtrees recursively
    # tightening the min or max constraint
    return (isBSTUtil(node.left, mini, node.data -1) and
            isBSTUtil(node.right, node.data+1, maxi))
Taken from: http://www.geeksforgeeks.org/a-program-to-check-if-a-binary-tree-is-bst-or-not/
I'm testing it against a tree from https://www.hackerrank.com/challenges/ctci-is-binary-search-tree
and of course, it works.
However, I really don't understand why the final stage replaces maxi with node.data -1 and mini with node.data+1. Why does it add or subtract 1? In the previous conditional we have already checked that it is lower or higher.
I've tried removing them, and the hackerrank automated tests won't pass. But I really don't understand why the extra addition of 1 and subtraction of 1 is needed on top of the replacement. I tried doing a full stack trace with a pencil and paper on the far left-most branch which looks like:
isBST(4, -99999, 99999)
(ok, ok, ok)
isBST(2, -99999, 4)
(ok, ok, ok)
isBST(1, -99999, 2)
(ok, ok, ok)
True
What am I missing here?
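One way to see what the +1 and -1 change is to run both variants on a tree that contains a duplicate key. The sketch below assumes integer keys and that duplicates should make the tree invalid; Node, is_bst and the strict flag are hypothetical names added for the comparison.

```python
class Node:
    """Minimal binary tree node (hypothetical, for illustration only)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def is_bst(node, mini, maxi, strict=True):
    if node is None:
        return True
    # The range check is inclusive: a value equal to mini or maxi passes.
    if node.data < mini or node.data > maxi:
        return False
    # With strict=True the bounds shrink by 1, so a child equal to its
    # ancestor falls outside the inclusive range; with strict=False it fits.
    step = 1 if strict else 0
    return (is_bst(node.left, mini, node.data - step, strict) and
            is_bst(node.right, node.data + step, maxi, strict))


duplicate = Node(2, left=Node(2))
print(is_bst(duplicate, -99999, 99999, strict=True))
print(is_bst(duplicate, -99999, 99999, strict=False))
```

Since the comparison uses < and >, passing node.data unchanged would let an equal value through, which is the case the pencil-and-paper trace above never exercises.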

A node in a 2-hop neighborhood of a node has zero degree?

I have a networkx graph G and I computed the nodes that are 2 hops away from a particular node using this code:
import networkx as nx

def node_neighborhood(G, node, n=2):
    """
    Returns a list of nodes which are the n-neighborhood of the input node.
    Parameters
    ----------
    G: networkx graph object.
    node: the node to get the neighborhood for.
    n: the neighborhood degree.
    """
    path_lengths = nx.single_source_dijkstra_path_length(G, node)
    # .items() works on Python 3 (iteritems() exists only on Python 2 dicts)
    return [node for node, length in path_lengths.items()
            if length == n]
So this code returns the nodes that are 2 hops away from the specified node.
Next I removed from G all nodes that are NOT in the list returned by node_neighborhood, using this code:
# iterate over a copy of the node list, since nodes are removed inside the loop
for n in list(G.nodes()):
    if (n not in node_2_neiborhood) and (n != node):
        G.remove_node(n)
    else:
        if G.degree(n) == 0:
            raise Exception("a node has zero degree!!!")
However, the problem is that the exception for a zero-degree node gets thrown. Why is that? If a node is 2 hops away from X, then that node must have at least one edge! So how is it possible that a node in the neighborhood has zero degree?!
This behavior is normal and expected: with that approach I'm also removing the nodes that are 1 hop away from the target node, so I'm removing the paths between the 2-hop nodes and the target node.
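The effect can be reproduced on a three-node path X - a - b. The sketch below deliberately uses a plain adjacency dict instead of networkx so that it stays self-contained; hop_distances and the node labels are hypothetical.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: number of hops from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Path graph X - a - b: "b" is exactly 2 hops from "X", via "a".
adj = {"X": {"a"}, "a": {"X", "b"}, "b": {"a"}}
dist = hop_distances(adj, "X")
two_hop = {v for v, d in dist.items() if d == 2}

# Keep only the target and its 2-hop nodes, dropping the 1-hop node "a".
keep = two_hop | {"X"}
pruned = {u: nbrs & keep for u, nbrs in adj.items() if u in keep}
degrees = {u: len(nbrs) for u, nbrs in pruned.items()}
print(two_hop, degrees)
```

Once "a" is gone, neither "X" nor "b" has any neighbor left, so both end up with degree zero even though "b" really was 2 hops away.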
