Traversing a non-binary tree - python

I've made a custom class for nodes:
class NodeTree(object):
    def __init__(self, name=None, children=None):
        self.name = name
        self.children = children
and defined a function that makes a tree (a node containing its child nodes):
def create_tree(d):
    x = NodeTree()
    for a in d.keys():
        if type(d[a]) == str:
            x.name = d[a]
        if type(d[a]) == list:
            if d[a] != []:
                for b in d[a]:
                    x.add_child(create_tree(b))
    return x
The input is a dict with one entry for the node name and a list of its children, each in the same form as the parent.
The function works fine, and I've written a method that proves it, but I can't find a way to traverse the tree correctly and get its height. I don't know if "height" is the right term because I know it may be ambiguous; I need to count nodes as the unit of measure, like this:
      parent
        |
        |
    ---------
    |       |
  child   child
The height of this tree is 2. I've tried everything, from counters to tags in the class, but everything seems to degenerate and I never get the right height.
How should I approach that?

You can write a recursive height method for your tree that determines the height of a node (that is, the maximum number of nodes on a path from that node down to a leaf):
def height(self):
    if not self.children:  # base case: a leaf (children is None or empty)
        return 1
    else:  # recursive case
        return 1 + max(child.height() for child in self.children)
Other tree traversals can also be done recursively. For example, here's a generator method that yields the names of the tree's nodes in "pre-order" (that is, with each parent preceding its children and descendants):
def preorder(self):
    yield self.name
    for child in self.children or ():  # children may be None on a leaf
        yield from child.preorder()  # Python 3.3+
The yield from syntax in that loop was added in Python 3.3. You can get the same results in earlier versions with this:
    for descendant in child.preorder():
        yield descendant
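To see both methods working together, here is a minimal runnable sketch. It makes a few assumptions that are not in the question: add_child is a hypothetical helper (the question calls it but does not show it), children is stored as a fresh list rather than None, and the input dict uses the keys "name" and "children".

```python
class NodeTree(object):
    def __init__(self, name=None, children=None):
        self.name = name
        # Assumption: store a fresh list so add_child can append to it
        self.children = children if children is not None else []

    def add_child(self, child):
        # Hypothetical helper; the question calls it but does not show it
        self.children.append(child)

    def height(self):
        if not self.children:  # a leaf counts as one node
            return 1
        return 1 + max(child.height() for child in self.children)

    def preorder(self):
        yield self.name
        for child in self.children:
            yield from child.preorder()


def create_tree(d):
    # Assumed input shape: {"name": "...", "children": [ ...same shape... ]}
    node = NodeTree(d["name"])
    for child in d.get("children", []):
        node.add_child(create_tree(child))
    return node


tree = create_tree({
    "name": "parent",
    "children": [
        {"name": "child1", "children": []},
        {"name": "child2", "children": [{"name": "grandchild"}]},
    ],
})
print(tree.height())          # 3
print(list(tree.preorder()))  # ['parent', 'child1', 'child2', 'grandchild']
```

A parent with two leaf children, as in the question's diagram, gives a height of 2 under the same counting.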

Related

How to increment height of each path in a suffix trie

I have created a suffix trie and I am trying to find a way to compute heights in reverse order (counted from the bottom up) and store them in each node. Example:
with strings ['abcd', 'abc', 'aa']
I have implemented this in two classes: the Node class and the Trie class. Each vertex is a node, so representing the heights means visiting each vertex and retrieving its value,
e.g. a.height = 4, b.height = 3, c.height = 2 instead of the normal (easy) a.height = 1, b.height = 2, c.height = 3
The nodes don't have to be the same height away from the root node, and the $ sign represents the end of a string. The height is added up from the bottom ($), as indicated in the image.
I have been able to store the frequency of how often each character gets repeated in the Trie by simply initializing self.freq = 1 in the TrieNode class and updating it during insert. But this is not the height and I'm out of ideas. Any suggestions would be welcome.
Here's my code without the frequency update:
class TrieNode:
    # Trie node class
    def __init__(self):
        self.children = [None]*26
        # isEndOfWord is True if node represent the end of the word
        self.isEndOfWord = False
class Trie:
    # Trie data structure class
    def __init__(self):
        self.root = self.getNode()
    def getNode(self):
        # Returns new trie node (initialized to NULLs)
        return TrieNode()
    def _charToIndex(self,ch):
        # private helper function
        # Converts key current character into index
        # use only 'a' through 'z' and lower case
        return ord(ch)-ord('a')
    def insert(self,key):
        # If not present, inserts key into trie
        # If the key is prefix of trie node,
        # just marks leaf node
        pCrawl = self.root
        length = len(key)
        for level in range(length):
            index = self._charToIndex(key[level])
            # if current character is not present
            if not pCrawl.children[index]:
                pCrawl.children[index] = self.getNode()
            pCrawl = pCrawl.children[index]
        # mark last node as leaf
        pCrawl.isEndOfWord = True
Thanks
Doing this recursively is probably the best way. You'll need to update your TrieNode class to include a 'height' attribute for this.
def compute_heights(self, node=None):
    if node is None:
        node = self.root  # Default value
    # Base case: the end marker ($) counts as height 0, so a node with no
    # children ends up with height 1.
    max_so_far = 0  # Use this to track the maximum height among all children
    # Iterate over all children, skipping the empty (None) slots
    for child in node.children:
        if child is not None:
            # Recursively compute the child's height
            max_so_far = max(max_so_far, self.compute_heights(child))
    node.height = 1 + max_so_far  # Update the current node's height
    return node.height
You can simply call this method on a Trie instance; the node defaults to the root if none is provided. It also returns the height of the node it is called on.
You can run this after completely defining a trie. To allow insertions and update these values live, you will need to search upwards from the newly-created node for which you will probably need to keep track of each node's parent. With that, you can iteratively check at each step on the way from the inserted leaf to the root if the height needs to be updated.
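The live update described above could be sketched roughly as follows. The parent attribute and the default height of 0 are assumptions added for illustration (they are not in the question's TrieNode), and the end marker is counted as height 0 so that 'abcd' gives a.height = 4 as in the question.

```python
class TrieNode:
    def __init__(self, parent=None):
        self.children = [None] * 26
        self.isEndOfWord = False
        self.parent = parent  # assumption: back-pointer, not in the original class
        self.height = 0       # 0 until the node lies on a path to an end marker


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for ch in key:
            index = ord(ch) - ord('a')  # lower-case 'a'..'z' only
            if node.children[index] is None:
                node.children[index] = TrieNode(parent=node)
            node = node.children[index]
        node.isEndOfWord = True

        # Walk back up from the last node. It sits one step above the end
        # marker, so its height is at least 1; each ancestor is one higher.
        height = 1
        while node is not None and node.height < height:
            node.height = height
            node = node.parent
            height += 1


trie = Trie()
for word in ['abcd', 'abc', 'aa']:
    trie.insert(word)

a = trie.root.children[0]
b = a.children[1]
c = b.children[2]
print(a.height, b.height, c.height)  # 4 3 2
```

The loop can stop at the first ancestor whose stored height is already large enough, because every node above it was raised when that earlier, longer path was inserted.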

Sum of nodes in a subtree (not binary)

I'm currently trying to find the sum of all nodes in a specified subtree. For example if I have a tree
          A(5)
         /    \
     B(5)      C(6)
     /         /  \
  D(3)      E(3)  F(7)
                   |
                  G(1)
and I want to know sum(C), which should return 17.
This is the code I came up with using recursion, but I can't seem to reach a subtree which has more than 2 levels. E.g. my algorithm doesn't seem to reach G. I'm trying to get better at recursion, but I can't seem to fix this.
def navigate_tree(node,key): #node of the root of subtree, along with its key
    children = node.get_children()
    if (len(children) ==0):
        return node.key
    else:
        for child in children: #not a binary tree so trying to loop through siblings
            key += navigate_tree(child,key) #summing up key recursively
    return key
You would be better off with a simpler interface that leans on the features of collections (looping over an empty list of children simply does nothing, so no special case for leaves is needed):
def navigate_tree(node):
    children = node.get_children()
    key = node.key
    for child in children:
        key += navigate_tree(child)
    return key
# class Node and data A..G elided
print(navigate_tree(C))
Output:
17
Your code appeared not to work because you were passing the previous key down to the next level of recursion, so the running total was counted again at each level, while an inner node's own key was never added. The recursion itself was fine, however. If you had added a print(node.key), you would have seen that you were visiting all the correct nodes.
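Since the Node class is elided above, here is one way the example could be made runnable. The Node class below is a hypothetical stand-in: only the key attribute and the get_children() method are taken from the question, and everything else is an assumption.

```python
class Node:
    # Hypothetical node class; only `key` and `get_children()` come from the question
    def __init__(self, key, children=None):
        self.key = key
        self._children = list(children) if children else []

    def get_children(self):
        return self._children


def navigate_tree(node):
    # Sum of this node's key plus the sums of all of its subtrees
    total = node.key
    for child in node.get_children():
        total += navigate_tree(child)
    return total


# The tree from the question
G = Node(1)
F = Node(7, [G])
E = Node(3)
C = Node(6, [E, F])
D = Node(3)
B = Node(5, [D])
A = Node(5, [B, C])

print(navigate_tree(C))  # 17
print(navigate_tree(A))  # 30
```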
You can use recursion with sum:
class Node:
    def __init__(self, n, v, c=None):
        # c defaults to None rather than [] to avoid a shared mutable default
        self.name, self.val, self.children = n, v, c or []
def get_sum(node):
    return node.val+sum(map(get_sum, node.children))
tree = Node('A', 5, [Node('B', 5, [Node('D', 3)]), Node('C', 6, [Node('E', 3), Node('F', 7, [Node('G', 1)])])])
print(get_sum(tree.children[-1]))
Output:
17
However, if you do not have access to the exact node C, you can apply a simple search as part of the recursive function:
def get_sum(t, node):
    def inner_sum(d, s=False):
        return d.val*(s or d.name == t)+sum(inner_sum(i, s or d.name == t) for i in d.children)
    return inner_sum(node)
print(get_sum('C', tree))
Output:
17

common_ancestor function for nested tree dictionary in python

I am trying to create a function called "common_ancestor()" that takes two inputs: the first a list of string taxa names, and the second a phylogenetic tree dictionary. It should return a string giving the name of the taxon that is the closest common ancestor of all the species in the input list. I have already written a separate function called "list_ancestors" that gives me the general ancestors of the elements in the list. I also have a dictionary I am working with.
tax_dict = {
    'Pan troglodytes': 'Hominoidea', 'Pongo abelii': 'Hominoidea',
    'Hominoidea': 'Simiiformes', 'Simiiformes': 'Haplorrhini',
    'Tarsius tarsier': 'Tarsiiformes', 'Haplorrhini': 'Primates',
    'Tarsiiformes': 'Haplorrhini', 'Loris tardigradus':'Lorisidae',
    'Lorisidae': 'Strepsirrhini', 'Strepsirrhini': 'Primates',
    'Allocebus trichotis': 'Lemuriformes', 'Lemuriformes': 'Strepsirrhini',
    'Galago alleni': 'Lorisiformes', 'Lorisiformes': 'Strepsirrhini',
    'Galago moholi': 'Lorisiformes'
}
import random
def halfroot(tree):
    taxon = random.choice(list(tree))
    result = [taxon]
    for i in range(0,len(tree)):
        result.append(tree.get(taxon))
        taxon = tree.get(taxon)
    return result
def root(tree):
    rootlist = halfroot(tree)
    rootlist2 = rootlist[::-1]
    newlist = []
    for e in range(0,len(rootlist)):
        if rootlist2[e] != None:
            newlist.append(rootlist2[e])
    return newlist[0]
def list_ancestors(taxon, tree):
    result = [taxon]
    while taxon != root(tree):
        result.append(tree.get(taxon))
        taxon = tree.get(taxon)
    return result
def common_ancestor(inputlist, tree):
    biglist1 = []
    for i in range(0, len(inputlist)):
        biglist1.append(list_ancestors(inputlist[i], tree))
    # continue so that I get three separate lists where I can cross-reference
    # all elements from the first list to every other list to find a common ancestor
The result should look something like
print(common_ancestor(['Hominoidea', 'Pan troglodytes', 'Lorisiformes'], tax_dict))
Output: 'Primates'
One way would be to collect all the ancestors of each species, place them in a set, and then take the intersection to get what they have in common:
def common_ancestor(species_list, tree):
    result = None  # initiate a `None` result
    for species in species_list:  # loop through each species in the species_list
        ancestors = {species}  # initiate the ancestors set with the species itself
        while True:  # rinse & repeat until we reach the root, which has no entry in the tree
            try:
                species = tree[species]  # get the species' ancestor
                ancestors.add(species)  # store it in the ancestors set
            except KeyError:
                break
        # initiate the result or intersect it with ancestors from the previous species
        result = ancestors if result is None else result & ancestors
    # the shared ancestors form a chain up to the root; the closest one is the
    # only member that is not the parent of another member
    parents = {tree.get(taxon) for taxon in result or ()}
    return next((taxon for taxon in result or () if taxon not in parents), None)
print(common_ancestor(["Hominoidea", "Pan troglodytes", "Lorisiformes"], tax_dict))
# Primates
You can use the 'middle' part of this function for the list_ancestors(), too - there is no need to complicate it by trying to find the tree's root:
def list_ancestors(species, tree, include_self=True):
    ancestors = [species] if include_self else []
    while True:
        try:
            species = tree[species]
            ancestors.append(species)
        except KeyError:
            break
    return ancestors
Of course, both rely on a valid ancestral tree dictionary: if some of the ancestors were to refer back to themselves (the loop would never end) or if there is a break in the chain (the climb would stop early), it won't work. Also, if you were to do a lot of these operations, it might be worth turning your flat dictionary into a proper tree.
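As a rough idea of what turning the flat dictionary into a tree could look like, the sketch below inverts the child-to-parent mapping into a parent-to-children mapping. The helper names build_children and find_roots are made up for this illustration, and the dictionary is a subset of the one in the question, repeated so the snippet runs on its own.

```python
from collections import defaultdict

# A subset of the question's child -> parent mapping
tax_dict = {
    'Pan troglodytes': 'Hominoidea', 'Pongo abelii': 'Hominoidea',
    'Hominoidea': 'Simiiformes', 'Simiiformes': 'Haplorrhini',
    'Haplorrhini': 'Primates', 'Strepsirrhini': 'Primates',
    'Lorisiformes': 'Strepsirrhini',
}


def build_children(tree):
    # Hypothetical helper: invert child -> parent into parent -> [children]
    children = defaultdict(list)
    for child, parent in tree.items():
        children[parent].append(child)
    return children


def find_roots(tree):
    # A root appears as a parent but never as a key
    return sorted(set(tree.values()) - set(tree))


children = build_children(tax_dict)
print(find_roots(tax_dict))          # ['Primates']
print(sorted(children['Primates']))  # ['Haplorrhini', 'Strepsirrhini']
```

With the children mapping in hand, top-down traversals (for example listing every descendant of a taxon) no longer require scanning the whole dictionary.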

Python - Flat list tree implementation: Given child, get parent?

I'm creating a Python class for a tree in which each node has a number of children given by "order" (but each child has only one parent). I have a method, children(self, i), which returns the children of the node at index i. I need to implement parent(self, i), which will get the parent of the child at index i.
Here's what I have so far:
class Tree:
    def __init__(self, order=2, l=[]):
        self._tree = l
        self._order = order
    def children(self, i):
        left = self._tree[(i+1)*self._order-1]
        right = self._tree[(i+1)*self._order]
        return [left, right]
    def parent(self, i):
        if i>len(self._tree):
            return ValueError
        elif i==0:
            return None
        else:
            #get parent of node i
An example tree represented by order=2 and list [45, 2, 123, 1, 8, 40, 456] would look like this:
        45
      /    \
     2     123
    / \    /  \
   1   8  40  456
I know that there's probably a way I can reverse the method I used for children(self, i) but I'm not sure how.
You would do the inverse operation:
        else:
            #get parent of node i
            return self._tree[(i-1)//self._order]
Note that your implementation only works for binary trees (you return two children, not n). You can correct it like this:
    def children(self, i):
        return self._tree[(i*self._order+1):((i+1)*self._order+1)]
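Putting the two pieces together, a self-contained version might look like the sketch below. Raising IndexError for an out-of-range index and taking the list as an items argument (defaulting to None instead of a shared []) are choices made for this illustration and are not part of the question's code.

```python
class Tree:
    def __init__(self, order=2, items=None):
        self._tree = list(items) if items is not None else []
        self._order = order

    def children(self, i):
        # Children of index i occupy indices order*i + 1 .. order*i + order
        start = i * self._order + 1
        return self._tree[start:start + self._order]

    def parent(self, i):
        if not 0 <= i < len(self._tree):
            # Assumption: raise instead of returning an exception class
            raise IndexError("node index out of range")
        if i == 0:
            return None  # the root has no parent
        return self._tree[(i - 1) // self._order]


t = Tree(order=2, items=[45, 2, 123, 1, 8, 40, 456])
print(t.children(0))  # [2, 123]
print(t.parent(3))    # 2
print(t.parent(6))    # 123
```

The same index arithmetic holds for any order: with order=3, the children of index 1 sit at indices 4, 5 and 6, and (5 - 1) // 3 maps index 5 back to 1.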

Finding a node in a tree

I am having trouble finding a node in a tree with an arbitrary branching factor. Each Node carries data and has zero or more children. The search method is inside the Node class; it checks whether that Node carries the data and then checks all of that Node's children. I keep ending up with infinite loops in my recursive method. Any help?
def find(self, x):
    _level = [self]
    _nextlevel = []
    if _level == []:
        return None
    else:
        for node in _level:
            if node.data is x:
                return node
            _nextlevel += node.children
        _level = _nextlevel
        return self.find(x) + _level
The find method is in the Node class and checks whether data x is in the node the method is called from, then checks all of that node's children. I keep getting an infinite loop and am really stuck at this point; any insight would be appreciated.
There are a few issues with this code. First, note that the method starts with _level = [self]. That means the if _level == [] check that follows will always be false.
The second issue is that your for loop goes over everything in _level, but, as noted above, that will always be [self] because of that first assignment.
The third issue is the return statement. You have return self.find(x) + _level. That gets evaluated in two parts: first, call self.find(x), then concatenate what that returns with the contents of _level. But when you call self.find(x), that calls the same method with the same arguments, which in turn hits the same return self.find(x) + _level line, which calls the same method again, and so on forever.
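One way to address all three points while keeping the level-by-level idea is to replace the recursion with an explicit queue. The sketch below is an illustration and not the asker's class: the Node constructor is assumed, and the comparison uses == where the question uses is.

```python
from collections import deque


class Node:
    # Minimal stand-in for the question's class: `data` plus a list of children
    def __init__(self, data, children=None):
        self.data = data
        self.children = list(children) if children else []

    def find(self, x):
        # Breadth-first search with an explicit queue instead of recursion
        queue = deque([self])
        while queue:
            node = queue.popleft()
            if node.data == x:  # assumption: compare by equality, not identity
                return node
            queue.extend(node.children)
        return None  # nothing matched


root = Node(1, [Node(2, [Node(4)]), Node(3)])
print(root.find(4).data)  # 4
print(root.find(99))      # None
```

The queue shrinks by one node per iteration and only grows by that node's children, so the loop ends as long as the tree contains no cycles.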
A simple pattern for recursive searches is to use a generator. That makes it easy to pass up the answers to calling code without managing the state of the recursion yourself.
class Example(object):
    def __init__(self, datum, *children):
        self.Children = list(children) # < assumed to be of the same or duck-similar class
        self.Datum = datum
    def GetChildren(self):
        for item in self.Children:
            for subitem in item.GetChildren():
                yield subitem
            yield item
    def FindInChildren(self, query): # where query is an expression that is true for desired data
        for item in self.GetChildren():
            if query(item):
                yield item
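A short usage sketch of this pattern follows. The class is repeated so the snippet runs on its own, and the sample data is invented for illustration. Note that GetChildren yields every descendant with the deepest first, and that the node the search starts from is not itself tested.

```python
class Example(object):
    def __init__(self, datum, *children):
        self.Children = list(children)
        self.Datum = datum

    def GetChildren(self):
        # Yields every descendant, deepest first
        for item in self.Children:
            for subitem in item.GetChildren():
                yield subitem
            yield item

    def FindInChildren(self, query):
        for item in self.GetChildren():
            if query(item):
                yield item


tree = Example('root', Example('a', Example('a1'), Example('a2')), Example('b'))

print([node.Datum for node in tree.GetChildren()])  # ['a1', 'a2', 'a', 'b']

# next() with a default returns the first match, or None when nothing matches
first = next(tree.FindInChildren(lambda node: node.Datum.startswith('a')), None)
print(first.Datum)  # a1
print(next(tree.FindInChildren(lambda node: node.Datum == 'zzz'), None))  # None
```

Because the generator is lazy, asking only for the first match stops the traversal as soon as one is found.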
