How do I represent binary search trees in Python?
class Node(object):
    def __init__(self, payload):
        self.payload = payload
        # None means "no descendants on that side"
        self.left = self.right = None
    # this concludes the "how to represent" asked in the question. Once you
    # represent a BST tree like this, you can of course add a variety of
    # methods to modify it, "walk" over it, and so forth, such as:
    def insert(self, othernode):
        "Insert Node `othernode` under Node `self`."
        # payloads less than or equal to this node's go left, greater ones go right
        if othernode.payload <= self.payload:
            if self.left: self.left.insert(othernode)
            else: self.left = othernode
        else:
            if self.right: self.right.insert(othernode)
            else: self.right = othernode
    def inorderwalk(self):
        "Yield this Node and all under it in increasing-payload order."
        if self.left:
            for x in self.left.inorderwalk(): yield x
        yield self
        if self.right:
            for x in self.right.inorderwalk(): yield x
    def sillywalk(self):
        "Tiny, silly subset of `inorderwalk` functionality as requested."
        if self.left:
            self.left.sillywalk()
        print(self.payload)
        if self.right:
            self.right.sillywalk()
etc, etc -- basically like in any other language which uses references rather than pointers (such as Java, C#, etc).
Edit:
Of course, the very existence of sillywalk is silly indeed, because exactly the same functionality is a single-liner external snippet on top of the inorderwalk method:
for x in tree.inorderwalk(): print(x.payload)
and with inorderwalk you can obtain just about any other functionality on the nodes-in-order stream, while, with sillywalk, you can obtain just about diddly-squat. But, hey, the OP says yield is "intimidating" (I wonder how many of Python 2.6's other 30 keywords deserve such scare words in the OP's judgment?-) so I'm hoping print isn't!
This is all completely beyond the actual question, on representing BSTs: that question is entirely answered in the __init__ -- a payload attribute to hold the node's payload, left and right attributes to hold either None (meaning, this node has no descendants on that side) or a Node (the top of the sub-tree of descendants on the appropriate side). Of course, the BST constraint is that every left descendant of each node (if any) has a payload less than or equal to that of the node in question, every right one (again, if any) has a greater payload -- I added insert just to show how trivial it is to maintain that constraint, inorderwalk (and now sillywalk) to show how trivial it is to get all nodes in increasing order of payloads. Again, the general idea is just identical to the way you'd represent a BST in any language which uses references rather than pointers, like, for example, C# and Java.
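To make the representation concrete, here is a minimal usage sketch, assuming Python 3 (it uses `yield from`, which the 2.6-era code above does not have). The `build_tree` helper is hypothetical and only exists for this illustration; the class mirrors the one above, with None for a missing child.

```python
class Node:
    """A BST node: a payload plus optional left/right child nodes."""

    def __init__(self, payload):
        self.payload = payload
        self.left = self.right = None  # None means "no subtree on that side"

    def insert(self, othernode):
        # Payloads <= this node's payload go left, greater ones go right.
        if othernode.payload <= self.payload:
            if self.left:
                self.left.insert(othernode)
            else:
                self.left = othernode
        else:
            if self.right:
                self.right.insert(othernode)
            else:
                self.right = othernode

    def inorderwalk(self):
        # Left subtree, then this node, then right subtree.
        if self.left:
            yield from self.left.inorderwalk()
        yield self
        if self.right:
            yield from self.right.inorderwalk()


def build_tree(payloads):
    """Hypothetical helper: build a tree from a non-empty iterable."""
    it = iter(payloads)
    root = Node(next(it))
    for p in it:
        root.insert(Node(p))
    return root


tree = build_tree([23, 10, 124, 101, 1, 40])
sorted_payloads = [n.payload for n in tree.inorderwalk()]
print(sorted_payloads)  # [1, 10, 23, 40, 101, 124]
```

Because the walk is a generator, anything else you want from the nodes-in-order stream (sum, max, filtering) is a one-liner on top of it.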
I am working on this problem that was asked by Google (not to me):
Given the root to a binary tree, implement serialize(root), which
serializes the tree into a string, and deserialize(s), which
deserializes the string back into the tree.
This is what I have so far, but I cannot seem to make the function serialize store the results (from bottom of the tree and up) into a string. So I'm able to print the results, just not store it...
class Tree:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.ser_str = None
    def insert(self, data):
        if self.data:
            if data < self.data:
                if self.left is None:
                    self.left = Tree(data)
                else:
                    self.left.insert(data)
            elif data >= self.data:
                if self.right is None:
                    self.right = Tree(data)
                else:
                    self.right.insert(data)
        else:
            self.data = data
    def serialize(self):
        if self.left:
            self.left.serialize()
        print(self.data)
        if self.right:
            self.right.serialize()
root = Tree(23)
root.insert(10);root.insert(124);root.insert(101);root.insert(1);root.insert(40)
print("here comes the sun")
test = root.serialize()
As noted in comments, the problem with your method is that you just print the data, but never join it to a string and return it. However, that format would also not allow for an unambiguous deserialization of the tree.
If you do not want to use libraries like json, which make this trivial, you could resort to an easily parseable format like Polish Notation, where a + b is written as + a b. In the tree case, the data corresponds to the operator and the left and right branch to the operands.
def serialize(t):
    if t is None:
        return "-"
    else:
        return f"{t.data} {serialize(t.left)} {serialize(t.right)}"
def deserialize(s):
    if isinstance(s, str): s = iter(s.split())
    # using an iterator of chunks makes this easier
    d = next(s)
    if d == "-":
        return None
    else:
        return Tree(d, deserialize(s), deserialize(s))
(Note that I made those functions, not methods, and added a few optional parameters to the Tree constructor to make the code simpler.)
When testing with your tree, this serialized the tree to 23 10 1 - - - 124 101 40 - - - -. (I then deserialized the tree and serialized it again and got the same format, so deserialization should work, too.) You can add parens to better see the tree structure in the string: (23 (10 (1 - -) -) (124 (101 (40 - -) -) -)), but the format is unambiguous even without parens.
This basically corresponds to a simple pre-order traversal of the tree, whereas you are doing an in-order traversal, which, like infix notation a + b, is ambiguous without parentheses. In-order traversal returns the elements in sorted order, which is nice for some uses, but not here, as it means that differently structured trees holding the same elements will serialize to the same sorted list. You could, of course, just add parens, but that will make parsing/deserialization much harder. With parens and in-order, your tree would be (((- 1 -) 10 -) 23 (((- 40 -) 101 -) 124 -)).
(Note: Even if the serialization is ambiguous, you could recreate a binary tree from that, but the form of that tree will be different; in particular, it will be a degenerate binary tree if you just insert the elements in sorted order, as they come out of your in-order traversal.)
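For completeness, here is a self-contained sketch of the round trip. The `left` and `right` constructor parameters are my guess at the "optional parameters" mentioned above, and the sketch assumes that the data contains no whitespace and is never the string "-". Note that deserialized values come back as strings, not ints.

```python
class Tree:
    def __init__(self, data, left=None, right=None):
        # left/right are optional so that leaves can be written as Tree(x)
        self.data = data
        self.left = left
        self.right = right


def serialize(t):
    """Pre-order (Polish notation) dump; "-" marks a missing child."""
    if t is None:
        return "-"
    return f"{t.data} {serialize(t.left)} {serialize(t.right)}"


def deserialize(s):
    """Rebuild a tree from the output of serialize()."""
    if isinstance(s, str):
        s = iter(s.split())  # consume tokens one by one across recursive calls
    d = next(s)
    if d == "-":
        return None
    left = deserialize(s)    # the left subtree is consumed first ...
    right = deserialize(s)   # ... then the right one
    return Tree(d, left, right)


# Same shape as the tree built by the inserts in the question.
root = Tree(23, Tree(10, Tree(1)), Tree(124, Tree(101, Tree(40))))
text = serialize(root)
print(text)  # 23 10 1 - - - 124 101 40 - - - -
copy = deserialize(text)
print(serialize(copy) == text)  # True
```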
Trying to build a binary search tree in Python and came across this weird bug. After deleting nodes using my delete_node function, the deleted nodes are still being printed. The only nodes that are deleted properly are the ones with two child nodes attached (and those are supposed to be the hardest to delete).
Here's the code:
class Node:
    def __init__(self, data):
        self.Left = self.Right = None
        self.T_data = data
    # function that deletes nodes from the tree
    def delete_node(self, item):
        if self is None:
            return self
        elif item < self.T_data:
            self.Left.delete_node(item)
        elif item > self.T_data:
            self.Right.delete_node(item)
        else:
            # case when the node we want to delete has no leaves attached to it
            if self.Right is None and self.Left is None:
                self = None
            # cases when a node has either left or right leaf node
            elif self.Left is None:
                temp = self.Right
                self = None
                return temp
            elif self.Right is None:
                temp = self.Left
                self = None
                return temp
            else: #case when a node has two leaf nodes attached
                temp = self.Right.min_node()
                self.T_data = temp.T_data
                self.Right = self.Right.delete_node(temp.T_data)
        return self
As you can see the way nodes are deleted is using a recursion, so for double-branched nodes to get deleted, the single-branch node deletion should work properly, but it does not.
Here's the print function and how the functions are called:
    # function that prints contents of the tree in preorder fashion
    def print_tree_preorder(self):
        if self is None:
            return
        print("%s" % self.T_data)
        if self.Left is not None:
            self.Left.print_tree_preorder()
        if self.Right is not None:
            self.Right.print_tree_preorder()
x = int(input("Which element would you like to delete?\n"))
root = root.delete_node(x)
root.print_tree_preorder()
What you're doing right now, when you have:
self = None
Is not actually deleting the object itself. What you're doing is assigning self to a different value.
I think a good way to illustrate this problem is thinking of self and other variables as a tag.
When you say:
a = 3
You are essentially putting the tag a on the entity 3. 3 resides somewhere in memory, and a "points" to 3 (although C++ pointers are not quite the same thing as references in Python, so be careful if you're going to make that comparison).
When you point self to None, what you wanted to say was:
So I want to remove this object, and all things that point to this object will point to None instead.
However, what you're currently saying is:
So I want to set my self tag to point to None.
Which is completely different. Just because you set your self tag to None does not mean you set the node's parents .Right or .Left members to None as well.
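A tiny sketch of the difference, using a hypothetical `Box` class that has nothing to do with your tree: rebinding `self` inside a method changes nothing outside that method, while setting an attribute on `self` is visible through every reference to the object.

```python
class Box:
    def __init__(self, value):
        self.value = value

    def try_to_vanish(self):
        # Rebinds the local name `self` only; the object itself is untouched.
        self = None

    def change_value(self, value):
        # Mutates the object that every reference points to.
        self.value = value


box = Box(1)
alias = box                 # a second "tag" on the same object
box.try_to_vanish()
still_there = box is not None and alias is box
box.change_value(2)
print(still_there, alias.value)  # True 2
```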
The solution? Well, you're not gonna like this, but you're gonna have to either:
have a pointer to the parent for each node, and set the parent's child (this child specifically) to None, or
check one level deeper in your tree, so you can delete the child node instead of deleting the node itself.
The reason the case for 2 node children works is because you're setting the attribute of the object here, instead of setting self=None. What this means is that you're still pointing to the same object here, specifically on this line:
self.T_data = temp.T_data
It's the difference between "Coloring an object does not make it a different object. Its traits are just different" vs. "replacing an object with another object makes it a different object".
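A related variation, sketched here as one possible approach rather than the only fix: let `delete_node` return the root of the updated subtree and have the parent store that result in its own `.Left` or `.Right`. That way it is always the parent's attribute that changes, never the name `self`. The `insert` and `preorder` helpers are assumptions added only to make the example runnable.

```python
class Node:
    def __init__(self, data):
        self.Left = self.Right = None
        self.T_data = data

    def insert(self, item):
        # Hypothetical helper, only here so the demo can build a tree.
        if item < self.T_data:
            if self.Left is None:
                self.Left = Node(item)
            else:
                self.Left.insert(item)
        else:
            if self.Right is None:
                self.Right = Node(item)
            else:
                self.Right.insert(item)

    def min_node(self):
        node = self
        while node.Left is not None:
            node = node.Left
        return node

    def delete_node(self, item):
        """Return the root of this subtree after removing `item`."""
        if item < self.T_data:
            if self.Left is not None:
                self.Left = self.Left.delete_node(item)    # parent re-links child
        elif item > self.T_data:
            if self.Right is not None:
                self.Right = self.Right.delete_node(item)  # parent re-links child
        else:
            if self.Left is None:
                return self.Right   # covers the leaf case too (returns None)
            if self.Right is None:
                return self.Left
            # Two children: copy the in-order successor, then remove it.
            successor = self.Right.min_node()
            self.T_data = successor.T_data
            self.Right = self.Right.delete_node(successor.T_data)
        return self

    def preorder(self):
        # Hypothetical helper: collect values instead of printing them.
        out = [self.T_data]
        if self.Left is not None:
            out += self.Left.preorder()
        if self.Right is not None:
            out += self.Right.preorder()
        return out


root = Node(50)
for value in (30, 70, 20, 40, 60, 80):
    root.insert(value)

root = root.delete_node(20)      # leaf
after_leaf = root.preorder()     # [50, 30, 40, 70, 60, 80]
root = root.delete_node(30)      # node with one child
after_single = root.preorder()   # [50, 40, 70, 60, 80]
root = root.delete_node(50)      # node with two children
after_double = root.preorder()   # [60, 40, 70, 80]
print(after_leaf, after_single, after_double)
```

One caveat: if the last remaining node is deleted, `root` becomes None, so the caller has to check for that before calling any method on it.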
I need to write a recursion for a min-heap binary tree to check if this tree is min-heap. One of the test cases is just NONE.
Is None considered a min-heap tree (so the check returns True), or should None return False?
The reason I am asking is that I will reach leaves at some point and their nodes are None and if base case is True then it will return True.
I believe that a none type will be vacuously true as it does not violate the definition of a min-heap tree.
Yes, None is considered a mean-heap tree.
You must mean a Min Heap. When we are dealing with any tree structure the children of a Node are most commonly initialized as None. One of the reasons is that we can easily escape the recursion as such:
def find_node(node, data):
    # a None child ends the recursion
    if node is None:
        return
    if node.data == data:
        print("Node found")
    find_node(node.left, data)
    find_node(node.right, data)
class Node(object):
    def __init__(self, data):
        self.left = None
        self.right = None
        self.data = data
In your case you want to check if a tree is a min-heap by traversing it. You would do something like this:
def is_min_heap(root):
    # ..... check here and then do
    return is_min_heap(root.left) and is_min_heap(root.right)
But it depends on how you want to handle it. A single node with no children is trivially both a min-heap and a max-heap, although that tells you very little. If you want to call is_min_heap(None), you are free to do so, but it is up to you whether that should return True or not.
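As a rough sketch of the "None is vacuously True" choice discussed above: the `Node` constructor with optional children is an assumption made for brevity, and the check covers only the heap-order property (each parent is less than or equal to its children), not whether the tree is complete.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def is_min_heap(root):
    """Check the heap-order property only; an empty tree counts as a heap."""
    if root is None:
        return True  # base case: nothing here can violate the property
    for child in (root.left, root.right):
        if child is not None and child.data < root.data:
            return False
    return is_min_heap(root.left) and is_min_heap(root.right)


good = Node(1, Node(2, Node(5)), Node(3))
bad = Node(5, Node(2))
print(is_min_heap(None), is_min_heap(good), is_min_heap(bad))  # True True False
```

Returning True for None is what lets the recursion bottom out at the leaves without special-casing them.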
This question is for school (homework) so I am not asking for code, and I don't want any, just an idea. I have to write a function that returns two lists, a list of the leaves and a list of the internal nodes of a binary tree. My algorithm is:
1) If both the left and the right subtrees are None, it is a leaf, and so I add it to the leaves list.
2) If they are not, then I add it to the internals list, and call the function on the left subtree, and then on the right, if they exist.
This is the code I have written:
def leaves_and_internals(self):
    leaves = []
    internals = []
    if self.left is None and self.right is None:
        leaves.append(self.item)
    else:
        internals.append(self.item)
        if self.left != None:
            leaves_and_internals(self.left)
        else:
            leaves_and_internals(self.right)
    return internals, leaves
I'm pretty sure that the algorithm is correct, but I think that every time I recurse on the Nodes, the lists will get reset. How can I get around this?
Any help is greatly appreciated. Thanks
I have not looked into the algorithm of your code; I am merely suggesting an answer to the problem you're stuck on. You could pass leaves and internals as arguments to the recursive function, so that their contents are retained across the recursive calls.
In Python, if you pass a mutable object to a function/method, the function/method gets a reference to the object. So as long as you still treat it as the same mutable object (i.e. do not assign something else directly to the parameter), any changes you make to the object are also visible to the caller. Since list is a mutable type, this behavior is very helpful for the case you're interested in.
And make sure to initialize the lists to [] before calling the leaves_and_internals function from outside.
def leaves_and_internals(self, leaves, internals):
    if self.left is None and self.right is None:
        leaves.append(self.item)
    else:
        internals.append(self.item)
        # same branching as in the question: only one child is visited here
        if self.left is not None:
            self.left.leaves_and_internals(leaves, internals)
        else:
            self.right.leaves_and_internals(leaves, internals)

# Somewhere outside
leaves = []
internals = []
myobj.leaves_and_internals(leaves, internals)
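A quick illustration of why passing the lists works, using two hypothetical helper functions: appending mutates the list the caller owns, whereas assigning to the parameter only rebinds a local name.

```python
def add_in_place(items, value):
    items.append(value)       # mutates the caller's list


def rebind(items, value):
    items = items + [value]   # builds a new list; the caller never sees it


shared = []
add_in_place(shared, "a")
rebind(shared, "b")
print(shared)  # ['a']
```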
UPDATE:
Since the OP mentions he cannot change the signature of the method nor use instance variables, this is an alternate solution I can think of which returns the leaves and internals to the caller. BTW, I assume some nodes in your tree can have both left and right, so you would need to check both (i.e. use 2 separate if instead of an if...else).
def leaves_and_internals(self):
    leaves = []
    internals = []
    if self.left is None and self.right is None:
        leaves = [self.item]
    else:
        # call the method on each existing child and merge the results
        if self.left is not None:
            leaves, internals = self.left.leaves_and_internals()
        if self.right is not None:
            templeaves, tempinternals = self.right.leaves_and_internals()
            leaves += templeaves
            internals += tempinternals
        internals.append(self.item)
    return leaves, internals
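Here is a self-contained sketch of the same idea that can be run as is. The `BTNode` class and its constructor are assumptions, since the question does not show the node class, and this version records an internal node before its children rather than after them.

```python
class BTNode:
    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right

    def leaves_and_internals(self):
        """Return (leaves, internals) for the subtree rooted here."""
        if self.left is None and self.right is None:
            return [self.item], []
        leaves, internals = [], [self.item]
        for child in (self.left, self.right):
            if child is not None:
                # Each call builds fresh lists; results are merged on the way up.
                child_leaves, child_internals = child.leaves_and_internals()
                leaves += child_leaves
                internals += child_internals
        return leaves, internals


tree = BTNode(1, BTNode(2, BTNode(4)), BTNode(3))
leaves, internals = tree.leaves_and_internals()
print(leaves, internals)  # [4, 3] [1, 2]
```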
I am creating a doubly linked structure and am having some issues with comparing if two nodes are equal. The structure is fairly complex in that it has multiple attributes including name, row, column, right, left, up, and down. If two nodes are equal they must agree on all of these attributes. I know in my __eq__ method I could simply hard code checking each attribute versus the other but I figured there would be an easier way to do it and found a way that works most of the time. Thus I have the following:
def __init__(self, row, col, name=None, up=None, down=None, left=None, right=None):
    self.name = name
    self.row = row
    self.col = col
    self.up = up
    self.down = down
    self.left = left
    self.right = right
def __eq__(self, other):
    return vars(self) == vars(other)
And various other methods that aren't really important to this. So my shortcut for determining whether two Nodes are equal was to basically look at the dictionary of their variables and let Python compare the two dictionaries for equivalence.
This works great! As long as the two nodes are actually equal. It returns True and I go on my merry way with my code. BUT if the two nodes are actually not equal it falls apart. I get
File "*filename*", line 35 in __eq__ return vars(self) == vars(self)
written to the screen many times until it finally says
RuntimeError: maximum recursion depth exceeded
I know there are some ways around this, i.e. I could explicitly check each attribute, but that's lame and I want to know why this isn't working, and if it can be easily fixed. I have tested this method with other simpler dictionaries and it works so my thought is that the issue has something to do with determining if objects are equal but I have no idea what I could do here. I realize I could also just catch the error and make that return False, but something other than those two solutions would be appreciated.
It looks like your up, down, etc are pointing to other instances of your class.
Your comparison code is basically saying, to test if self == other, does self.up == other.up? does self.up.up == other.up.up? etc. And then recursing until it runs out of space.
You may instead want to use
def __eq__(self, other):
    return self.name == other.name \
        and self.row == other.row \
        and self.col == other.col \
        and self.up is other.up \
        and self.down is other.down \
        and self.left is other.left \
        and self.right is other.right
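To see this on a structure with a cycle, here is a small sketch. The `Cell` class is a stand-in for your node class (an assumption, since the full class isn't shown). Plain values are compared with ==, links are compared with `is`, so `__eq__` never recurses into neighbours.

```python
class Cell:
    def __init__(self, row, col, name=None):
        self.name, self.row, self.col = name, row, col
        self.up = self.down = self.left = self.right = None

    def __eq__(self, other):
        if not isinstance(other, Cell):
            return NotImplemented
        return (
            # plain values: compare by value
            (self.name, self.row, self.col) == (other.name, other.row, other.col)
            # links: compare by identity, so no recursive __eq__ calls
            and self.up is other.up
            and self.down is other.down
            and self.left is other.left
            and self.right is other.right
        )


a = Cell(0, 0, "a")
b = Cell(0, 1, "b")
a.right, b.left = b, a          # a <-> b form a cycle

twin = Cell(0, 0, "a")
twin.right = b                  # same values, same neighbour object

print(a == twin, a == b)  # True False
```

Keep in mind that a class defining `__eq__` without `__hash__` becomes unhashable in Python 3, which matters if the nodes are used in sets or as dict keys.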
I have no Python at hand, but I guess this is what happens:
In the __dict__ of self and in the __dict__ of other there is a reference to one of your nodes.
Now this node is compared for equality (once the one from self, once the one from other), which causes your comparison method to be called.
If you now have a loop (e.g. a common parent) you get infinite recursion:
in original comparison:
    compare self.parent to other.parent
in parent comparison:
    compare self.parent.child to other.parent.child
(parent and child refer to your up and down)
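Try (untested):
def __eq__(self, other):
    # compare attribute values pairwise; zipping vars() alone would only pair up the attribute names
    for s, o in zip(vars(self).values(), vars(other).values()):
        if s is not o and s != o:
            return False
    return True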
basically what Hugh Bothwell suggested, just in a loop. First check if you have the same object in memory, if so don't compare them, otherwise test.