Implementation of a Trie in Python - python

I programmed a Trie as a class in python. The search and insert function are clear, but now i tried to programm the python function __str__, that i can print it on the screen. But my function doesn't work!
class Trie(object):
def __init__(self):
self.children = {}
self.val = None
def __str__(self):
s = ''
if self.children == {}: return ' | '
for i in self.children:
s = s + i + self.children[i].__str__()
return s
def insert(self, key, val):
if not key:
self.val = val
return
elif key[0] not in self.children:
self.children[key[0]] = Trie()
self.children[key[0]].insert(key[1:], val)
Now if I create a Object of Trie:
tr = Trie()
tr.insert('hallo', 54)
tr.insert('hello', 69)
tr.insert('hellas', 99)
And when i now print the Trie, occures the problem that the entries hello and hellas aren't completely.
print tr
hallo | ellas | o
How can i solve that problem?.

Why not have str actually dump out the data in the format that it is stored:
def __str__(self):
if self.children == {}:
s = str(self.val)
else:
s = '{'
comma = False
for i in self.children:
if comma:
s = s + ','
else:
comma = True
s = s + "'" + i + "':" + self.children[i].__str__()
s = s + '}'
return s
Which results in:
{'h':{'a':{'l':{'l':{'o':54}}},'e':{'l':{'l':{'a':{'s':99},'o':69}}}}}

There are several issues you're running into. The first is that if you have several children at the same level, you'll only be prefixing one of them with the initial part of the string, and just showing the suffix of the others. Another issue is that you're only showing leaf nodes, even though you can have terminal values that are not at a leaf (consider what happens when you use both "foo" and "foobar" as keys into a Trie). Finally, you're not outputting the values at all.
To solve the first issue, I suggest using a recursive generator that does the traversal of the Trie. Separating the traversal from __str__ makes things easier since the generator can simply yield each value we come across, rather than needing to build up a string as we go. The __str__ method can assemble the final result easily using str.join.
For the second issue, you should yield the current node's key and value whenever self.val is not None, rather than only at leaf nodes. As long as you don't have any way to remove values, all leaf nodes will have a value, but we don't actually need any special casing to detect that.
And for the final issue, I suggest using string formatting to make a key:value pair. (I suppose you can skip this if you really don't need the values.)
Here's some code:
def traverse(self, prefix=""):
if self.val is not None:
yield "{}:{}".format(prefix, self.val)
for letter, child in self.children.items():
yield from child.traverse(prefix + letter)
def __str__(self):
return " | ".join(self.traverse())
If you're using a version of Python before 3.3, you'll need to replace the yield from statement with an explicit loop to yield the items from the recursive calls:
for item in child.traverse(prefix + letter)
yield item
Example output:
>>> t = Trie()
>>> t.insert("foo", 5)
>>> t.insert("bar", 10)
>>> t.insert("foobar", 100)
>>> str(t)
'bar:10 | foo:5 | foobar:100'

You could go with a simpler representation that just provides a summary of what the structure contains:
class Trie:
def __init__(self):
self.__final = False
self.__nodes = {}
def __repr__(self):
return 'Trie<len={}, final={}>'.format(len(self), self.__final)
def __getstate__(self):
return self.__final, self.__nodes
def __setstate__(self, state):
self.__final, self.__nodes = state
def __len__(self):
return len(self.__nodes)
def __bool__(self):
return self.__final
def __contains__(self, array):
try:
return self[array]
except KeyError:
return False
def __iter__(self):
yield self
for node in self.__nodes.values():
yield from node
def __getitem__(self, array):
return self.__get(array, False)
def create(self, array):
self.__get(array, True).__final = True
def read(self):
yield from self.__read([])
def update(self, array):
self[array].__final = True
def delete(self, array):
self[array].__final = False
def prune(self):
for key, value in tuple(self.__nodes.items()):
if not value.prune():
del self.__nodes[key]
if not len(self):
self.delete([])
return self
def __get(self, array, create):
if array:
head, *tail = array
if create and head not in self.__nodes:
self.__nodes[head] = Trie()
return self.__nodes[head].__get(tail, create)
return self
def __read(self, name):
if self.__final:
yield name
for key, value in self.__nodes.items():
yield from value.__read(name + [key])

Instead of your current strategy for printing, I suggest the following strategy instead:
Keep a list of all characters in order that you have traversed so far. When descending to one of your children, push its character on the end of its list. When returning, pop the end character off of the list. When you are at a leaf node, print the contents of the list as a string.
So say you have a trie built out of hello and hellas. This means that as you descend to hello, you build a list h, e, l, l, o, and at the leaf node you print hello, return once to get (hell), push a, s and at the next leaf you print hellas. This way you re-print letters earlier in the tree rather than having no memory of what they were and missing them.
(Another possiblity is to just descend the tree, and whenever you reach a leaf node go to your parent, your parent's parent, your parent's parent's parent... etc, keeping track of what letters you encounter, reversing the list you make and printing that out. But it may be less efficient.)

Related

adding childern to tree stucture and print

I am trying to implement an n-arry tree based on this post :
[here][1]
and I am getting an error when trying to define a function that adds children:
class node(object):
def __init__(self, value, children = []):
self.value = value
self.children = children
def __str__(self, level=0):
ret = "\t"*level+repr(self.value)+"\n"
for child in self.children:
ret += child.__str__(level+1)
return ret
# trying to implement this method si that I can get rid of
# calling root.children[0].children
def add_child(self, obj):
self.children.append(obj)
def __repr__(self):
return '<tree node representation>'
root = node('grandmother')
root.children = [node('daughter'), node('son')]
root.children[0].children = [node('granddaughter'), node('grandson')]
root.children[1].children = [node('granddaughter'), node('grandson')]
root.add_child([node((1)), node(2)]) # error
print (root)
I want to be able to to create a tree and print it.
[1]: Printing a Tree data structure in Python
If you name a method add_child, it should add a child, not children. And if you add children, you should extend the list, not just append the given list on its end.
Working example:
class Node(object):
def __init__(self, value, children=None):
if children is None:
children = []
self.value = value
self.children = children
def __str__(self, level=0):
ret = "\t" * level + repr(self.value) + "\n"
for child in self.children:
ret += child.__str__(level + 1)
return ret
def add_children(self, obj):
self.children.extend(obj)
root = Node('grandmother')
root.children = [Node('daughter'), Node('son')]
root.children[0].children = [Node('granddaughter'), Node('grandson')]
root.children[1].children = [Node('granddaughter'), Node('grandson')]
root.add_children([Node(1), Node(2)])
print(root)
Output:
'grandmother'
'daughter'
'granddaughter'
'grandson'
'son'
'granddaughter'
'grandson'
1
2
You call add_child with an entire list object. Within add_child you use the method list.append which adds the entire list object to the list itself.
Solution 1: call add_child by specifying the nodes directly:
root.add_child(node((1))
root.add_child(node((2))
Solution 2: change the implementation of add_child by using list.extend instead of list.append. The former adds each element within the supplied argument to the list, while the latter adds the entire argument to the list.
def add_child(self, obj):
self.children.extend(obj)

Cannot write a function to retrieve all words in a trie

I have a following Trie implementation:
class TrieNode:
def __init__(self):
self.nodes = defaultdict(TrieNode)
self.is_fullpath = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
curr = self.root
for char in word:
curr = curr.nodes[char]
curr.is_fullpath = True
I'm trying to write a method to retrieve a list of all words in my trie.
t = Trie()
t.insert('a')
t.insert('ab')
print(t.paths()) # ---> ['a', 'ab']
My current implementation looks like this:
def paths(self, node=None):
if node is None:
node = self.root
result = []
for k, v in node.nodes.items():
if not node.is_fullpath:
for el in self.paths(v):
result.append(str(k) + el)
else:
result.append('')
return result
But it does not seem to return full list of words.
Here are the issues in your code:
It doesn't look further when is_fullpath is True. But you should also look deeper (for longer words) in that case.
It should not check node.is_fullpath but v.is_fullpath.
result.append('') is not correct. It should be result.append(str(k))
So your for loop body could look like this:
if v.is_fullpath:
result.append(str(k))
for el in self.paths(v):
result.append(str(k) + el)
I would however do it like this:
Define this recursive generator method on your TrieNode class:
def paths(self, prefix=""):
if self.is_fullpath:
yield prefix
for chr, node in self.nodes.items():
yield from node.paths(prefix + chr)
Note how this passes the collected characters on the path to the recursive call. If at any time the is_fullpath boolean is True, we yield that path. Always we continue the search recursively via child nodes.
The method on the Trie class is then quite simple:
def paths(self):
return list(self.root.paths())

Python list of classes , index() not working

Not sure what i'm doing wrong here. I have this class:
class Node:
'''
Class to contain the lspid seq and all data.
'''
def __init__(self, name,pseudonode,fragment,seq_no,data):
self.name = name
self.data = {}
self.pseudonode = pseudonode
self.seq_no = seq_no
self.fragment = fragment
def __unicode__(self):
full_name = ('%s-%d-%d') %(self.name,self.pseudonode,self.fragment)
return str(full_name)
def __cmp__(self, other):
if self.name > other.name:
return 1
elif self.name < other.name:
return -1
return 0
def __repr__(self):
full_name = ('%s-%d-%d') %(self.name,self.pseudonode,self.fragment)
#print 'iside Node full_name: {} \n\n\n ------'.format(full_name)
return str(full_name)
and putting some entries in a list :
nodes = []
node = Node('0000.0000.0001',0,0,100,{})
nodes.append(node)
>>> nodes
[0000.0000.0001-0-0]
node = Node('0000.0000.0001',1,0,100,{})
nodes.append(node)
>>> nodes
[0000.0000.0001-0-0, 0000.0000.0001-1-0]
i'm trying to get the index of a node in list nodes[]
>>> node
0000.0000.0001-1-0
>>> nodes.index(node)
0
0 is not what i was expecting. Not sure why this is happening.
edit
i'm after getting the index of the list where '0000.0000.0001-1-0' is.
The index function, when used on a container, relies on its element's __cmp__ function to return the index of the first element that it thinks is equal to the input-object. You probably know as much, since you implemented it for the node. But what you are expecting is that __cmp__ considers not only the name, but also the pseudonode and the fragment, right?
A straight-forward approach would be to consider them a tuple, which performs a comparison of elements from left to right, until the first inequality was found:
def __cmp__(self, other):
self_tuple = (self.name, self.pseudonode, self.fragment)
other_tuple = (other.name, other.pseudonode, other.fragment)
if self_tuple > other_tuple:
return 1
elif self_tuple < other_tuple:
return -1
return 0
If you want another order, you can use the tuples-ordering to define it.

Sometimes None is printed - and sometimes it doesn't, can't get why?

I got this school assignment, here is my code:
class Doubly_linked_node():
def __init__(self, val):
self.value = val
self.next = None
self.prev = None
def __repr__(self):
return str(self.value)
class Deque():
def __init__(self):
self.header = Doubly_linked_node(None)
self.tailer = self.header
self.length = 0
def __repr__(self):
string = str(self.header.value)
index = self.header
while not (index.next is None):
string+=" " + str(index.next.value)
index = index.next
return string
def head_insert(self, item):
new = Doubly_linked_node(item)
new.next=self.header
self.header.prev=new
self.header=new
self.length+=1
if self.tailer.value==None:
self.tailer = self.header
def tail_insert(self, item):
new = Doubly_linked_node(item)
new.prev=self.tailer
self.tailer.next=new
self.tailer=new
self.length+=1
if self.header.value==None:
self.header = self.tailer
it builds a stack, allowing you to add and remove items from the head or tail (I didn't include all the code only the important stuff).
When I initiate an object, if I return self.next it prints None, but if I return self.prev, it prints nothing, just skips, I don't understand why since they are both defined exactly the same as you see, and if I insert only head several times for example for i in range(1,5): D.head_insert(i) and then I print D it prints 5 4 3 2 1 None but if I do tail insert for example for i in range(1,5): D.tail_insert(i) and print D it prints 1 2 3 4 5"as it should without the None. Why is that?
I have included an image:
Keep in mind that you create a Deque which is not empty. You're initializing it with a Node with value None
You're interchanging the value and the Node object. When you're checking if self.tailer.value==None: it's probably not what you're meaning
Following to point 2 is a special handling for the empty Deque, where header and tailer is None
Here is what I have in mind, if I would implement the Deque. I'm slightly changed the return value of __repr__.
class Deque():
def __init__(self):
self.header = None
self.tailer = None
self.length = 0
def __repr__(self):
if self.header is None:
return 'Deque<>'
string = str(self.header.value)
index = self.header.next
while index!=None:
string+=" " + str(index.value)
index = index.next
return 'Deque<'+string+'>'
def head_insert(self, item):
new = Doubly_linked_node(item)
new.next=self.header
if self.length==0:
self.tailer=new
else:
self.header.prev=new
self.header=new
self.length+=1
def tail_insert(self, item):
new = Doubly_linked_node(item)
new.prev=self.tailer
if self.length==0:
self.header=new
else:
self.tailer.next=new
self.tailer=new
self.length+=1
Following Günthers advice, I have modified the __repr__ to this:
def __repr__(self):
string = str(self.header.value)
index = self.header
while not (str(index.next) == "None"):
string += (" " + str(index.next.value))
index = index.next
return string
that did solve the problem, but it is the ugliest solution I have ever seen.
does anyone know a better way?
Following to the question of a better __repr__ method here my proposal. Extend the Deque class with an __iter__ method. So you can iterate over the Deque which is nice to have, e.g.:
for item in D:
print item
Based on that the __repr__ method is easy. Here is the whole change:
def __repr__(self):
return 'Deque<'+' '.join([str(item.value) for item in self])+'>'
def __iter__(self):
index=self.header
while index is not None:
yield index.value
index=index.next

Trees in Python

Please help me to understand trees in Python. This is an example of tree implementation I found in the Internet.
from collections import deque
class EmptyTree(object):
"""Represents an empty tree."""
# Supported methods
def isEmpty(self):
return True
def __str__(self):
return ""
def __iter__(self):
"""Iterator for the tree."""
return iter([])
def preorder(self, lyst):
return
def inorder(self, lyst):
return
def postorder(self, lyst):
return
class BinaryTree(object):
"""Represents a nonempty binary tree."""
# Singleton for all empty tree objects
THE_EMPTY_TREE = EmptyTree()
def __init__(self, item):
"""Creates a tree with
the given item at the root."""
self._root = item
self._left = BinaryTree.THE_EMPTY_TREE
self._right = BinaryTree.THE_EMPTY_TREE
def isEmpty(self):
return False
def getRoot(self):
return self._root
def getLeft(self):
return self._left
def getRight(self):
return self._right
def setRoot(self, item):
self._root = item
def setLeft(self, tree):
self._left = tree
def setRight(self, tree):
self._right = tree
def removeLeft(self):
left = self._left
self._left = BinaryTree.THE_EMPTY_TREE
return left
def removeRight(self):
right = self._right
self._right = BinaryTree.THE_EMPTY_TREE
return right
def __str__(self):
"""Returns a string representation of the tree
rotated 90 degrees to the left."""
def strHelper(tree, level):
result = ""
if not tree.isEmpty():
result += strHelper(tree.getRight(), level + 1)
result += " " * level
result += str(tree.getRoot()) + "\n"
result += strHelper(tree.getLeft(), level + 1)
return result
return strHelper(self, 0)
def __iter__(self):
"""Iterator for the tree."""
lyst = []
self.inorder(lyst)
return iter(lyst)
def preorder(self, lyst):
"""Adds items to lyst during
a preorder traversal."""
lyst.append(self.getRoot())
self.getLeft().preorder(lyst)
self.getRight().preorder(lyst)
def inorder(self, lyst):
"""Adds items to lyst during
an inorder traversal."""
self.getLeft().inorder(lyst)
lyst.append(self.getRoot())
self.getRight().inorder(lyst)
def postorder(self, lyst):
"""Adds items to lystduring
a postorder traversal."""
self.getLeft().postorder(lyst)
self.getRight().postorder(lyst)
lyst.append(self.getRoot())
def levelorder(self, lyst):
"""Adds items to lyst during
a levelorder traversal."""
# levelsQueue = LinkedQueue()
levelsQueue = deque ([])
levelsQueue.append(self)
while levelsQueue != deque():
node = levelsQueue.popleft()
lyst.append(node.getRoot())
left = node.getLeft()
right = node.getRight()
if not left.isEmpty():
levelsQueue.append(left)
if not right.isEmpty():
levelsQueue.append(right)
This is programm that makes the small tree.
"""
File: testbinarytree.py
Builds a full binary tree with 7 nodes.
"""
from binarytree import BinaryTree
lst = ["5", "+", "2"]
for i in range(len(lst)):
b = BinaryTree(lst[0])
d = BinaryTree(lst[1])
f = BinaryTree(lst[2])
# Build the tree from the bottom up, where
# d is the root node of the entire tree
d.setLeft(b)
d.setRight(f)
def size(tree):
if tree.isEmpty():
return 0
else:
return 1 + size(tree.getLeft()) + size(tree.getRight())
def frontier(tree):
"""Returns a list containing the leaf nodes
of tree."""
if tree.isEmpty():
return []
elif tree.getLeft().isEmpty() and tree.getRight().isEmpty():
return [tree.getRoot()]
else:
return frontier(tree.getLeft()) + frontier(tree.getRight())
print ("Size:", size(d))
print ("String:")
print (d)
How can I make a class that will count the value of the expression, such that the answer = 7 (5+2). I really want to understand the concept with a small example.
It sounds like your problem isn't trees, which are a much more general (and simple) concept, but in how to properly populate and/or evaluate an expression tree.
If you have your operators specified in post-fix order, it becomes a lot easier.
See this wikipedia article on how to deal with infix notation when parsing input to a desktop calculator. It is called the shunting-yard algorithm.
You should do function that walks a tree in depth first order, calculating value of each node, either just taking value of it (if it is "5" for example), or making calculation (if it is "+" for example) - by walking the tree in depth first order you are sure that all subnodes of given node will be calculated when you are calculating that node (for example "5" and "2" will be calculated when you are calculating "+").
Then, at the root of the tree you'll get the result of the whole tree.
First of all, I'm not going to give much detail in case this is homework, which it sounds a bit like.
You need a method on your tree class that evaluates the tree. I suppose it'll assume that the "root" value of each tree node is a number (when the node is a leaf, i.e. when it has no children) or the name of an operator (When the node has children).
Your method will be recursive: the value of a tree-node with children is determined by (1) the value of its left subtree, (2) the value of its right subtree, and (3) the operator in its "root".
You'll probably want a table -- maybe stored in a dict -- mapping operator names like "+" to actual functions like operator.add (or, if you prefer, lambda x,y: x+y).

Categories