Python pass by object reference value - python

I'm trying to write an algorithm in python to print out all paths from the root of a (binary) tree to each leaf. Here's my code:
def fb_problem(node, curr_trav):
    curr_trav = curr_trav + [node]
    if node.left is None and node.right is None:
        for path_node in curr_trav:
            print path_node.data
        print "XXX"
    if node.left is not None:
        fb_problem(node.left, curr_trav)
    if node.right is not None:
        fb_problem(node.right, curr_trav)

fb_problem(root, [])
I keep a list of nodes in the current traversal, and when I've reached a leaf, I print out the list. I'm misunderstanding something about the way python passes objects though. I thought that as each recursive call completes and is popped off the stack, the original curr_trav variable would not be affected by what the recursive call did. However, it seems as if the line
curr_trav += [node]
Is mutating the original list. The += operator returns a new list, as opposed to .append(), which actually mutates the original object. So shouldn't this call just be reassigning the name given to the object in the function, not mutating the original object? When I change the line to something like
t_trav = curr_trav + [node]
Everything works fine, but I don't understand what the problem with the original line was. Please let me know if my question is unclear.

Python is neither strictly pass-by-value nor pass-by-reference. Every argument is passed as a reference to an object (often called "pass by object reference"), whatever its type. What differs is what the function can do with that object: a mutable object such as a dict or list can be changed in place, and the caller sees the change, whereas an immutable object such as a str cannot be changed at all, so it behaves as if it had been passed by value. A good read on this subject is by Jeff Knupp.
The issue with your original code curr_trav += [node] is that it extends the existing list in place with the items of [node] and then rebinds curr_trav to that same list. Because the function received a reference to the caller's list, the change is visible in every subsequent call.

Your understanding of += is not quite correct. All operators in Python are really just shortcuts for method calls. For example, a + b is a.__add__(b) if a has an __add__ method. If a does not (or the method returns NotImplemented), it is b.__radd__(a). If b doesn't have that method either, a TypeError is raised. Usually, a += b behaves quite like a = a + b, but for mutable objects it usually doesn't. That is because a += b is a = a.__iadd__(b) if a has an __iadd__ method. If a does not, it is the same as a = a.__add__(b). If a doesn't have that either, it is the same as a = b.__radd__(a). Since lists do have an __iadd__ method, which extends the list in place and returns it, the actual list object is changed rather than curr_trav being rebound to a new list.
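To make the difference concrete, here is a minimal Python 3 sketch. The helper names extend_in_place and extend_copy are made up for illustration and do not appear in the question.

```python
def extend_in_place(path, node):
    # list.__iadd__ extends the caller's list in place,
    # then rebinds the local name `path` to that same list.
    path += [node]
    return path


def extend_copy(path, node):
    # list.__add__ builds a brand-new list; the caller's list is untouched.
    path = path + [node]
    return path


original = ["root"]
same = extend_in_place(original, "left")
print(same is original, original)   # True ['root', 'left']

original = ["root"]
fresh = extend_copy(original, "left")
print(fresh is original, original)  # False ['root']
```

With the second form, each recursive call works on its own list, which is why the traversal no longer leaks nodes between branches.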

Related

Understanding Mutability and Multiple Variable Assignment to Class Objects in Python

I'm looking for some clarification regarding mutability and class objects. From what I understand, variables in Python are about assigning a variable name to an object.
If that object is immutable then when we set two variables to the same object, it'll be two separate copies (e.g. a = b = 3 so a changing to 4 will not affect b because 3 is a number, an example of an immutable object).
However, if an object is mutable, then changing the value in one variable assignment will naturally change the value in the other (e.g. a = b = [] -> a.append(1) so now both a and b will refer to "[1]")
Working with classes, it seems even more fluid than I believed. I wrote a quick example below to show the differences. The first class is a typical Node class with a next pointer and a value. Setting two variables, "slow" and "fast", to the same instance of the Node object ("head"), and then changing the values of both "slow" and "fast" won't affect the other. That is, "slow", "fast", and "head" all refer to different objects (verified by checking their id() as well).
The second example class doesn't have a next pointer and only has a self.val attribute. This time changing one of the two variables, "p1" and "p2", both of which are set to the same instance, "start", will affect the other. This is despite that self.val in the "start" instance is an immutable number.
'''
The below will have two variable names (slow, fast) assigned to a head Node.
Changing one of them will NOT change the other reference as well.
'''
class Node:
    def __init__(self, x, next=None):
        self.x = x
        self.next = next
    def __str__(self):
        return str(self.x)

n3 = Node(3)
n2 = Node(2, n3)
n1 = Node(1, n2)
head = n1
slow = fast = head
print(f"Printing before moving...{head}, {slow}, {fast}") # 1, 1, 1
while fast and fast.next:
    fast = fast.next.next
    slow = slow.next
print(f"Printing after moving...{head}, {slow}, {fast}") # 1, 2, 3
print(f"Checking the ids of each variable {id(head)}, {id(slow)}, {id(fast)}") # all different

'''
The below will have two variable names (p1, p2) assigned to a start Dummy.
Changing one of them will change the other reference as well.
'''
class Dummy:
    def __init__(self, val):
        self.val = val
    def __str__(self):
        return str(self.val)

start = Dummy(100)
p1 = p2 = start
print(f"Printing before changing {p1}, {p2}") # 100, 100
p1.val = 42
print(f"Printing after changing {p1}, {p2}") # 42, 42
This is a bit murky for me to understand what is actually going on under the hood and I'm seeking clarification so I can feel confident in setting multiple variable assignments to the same object expecting a true copy (without resorting to "import copy; copy.deepcopy(x);")
Thank you for your help
This isn't a matter of immutability vs mutability. This is a matter of mutating an object vs reassigning a reference.
If that object is immutable then when we set two variables to the same object, it'll be two separate copies
This isn't true. A copy won't be made. If you have:
a = 1
b = a
You have two references to the same object, not a copy of the object. This is fine though because integers are immutable. You can't mutate 1, so the fact that a and b are pointing to the same object won't hurt anything.
Python will never make implicit copies for you. If you want a copy, you need to copy it yourself explicitly (using copy.copy, or some other method like slicing on lists). If you write this:
a = b = some_obj
a and b will point to the same object, regardless of the type of some_obj and whether or not it's mutable.
So what's the difference between your examples?
In your first Node example, you never actually alter any Node objects. They may as well be immutable.
slow = fast = head
That initial assignment makes both slow and fast point to the same object: head. Right after that though, you do:
fast = fast.next.next
This reassigns the fast reference, but never actually mutates the object fast is looking at. All you've done is change what object the fast reference is looking at.
In your second example however, you directly mutate the object:
p1.val = 42
While this looks like reassignment, it isn't. This is actually:
p1.__setattr__("val", 42)
And __setattr__ alters the internal state of the object.
So, reassignment changes what object is being looked at. It will always take the form:
a = b # Maybe chained as well.
Contrast with these that look like reassignment, but are actually calls to mutating methods of the object:
l = [0]
l[0] = 5 # Actually l.__setitem__(0, 5)
d = Dummy(100)
d.val = 42 # Actually d.__setattr__("val", 42)
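A short Python 3 sketch of the two cases side by side, reusing a cut-down version of the Dummy class from the question. It is only an illustration of rebinding versus mutation, not a full treatment.

```python
class Dummy:
    def __init__(self, val):
        self.val = val


start = Dummy(100)
p1 = p2 = start          # three names, one object

p1.val = 42              # mutation: changes the one shared object
print(p2.val)            # 42

p1 = Dummy(7)            # rebinding: p1 now names a different object
print(p2.val, p2 is start, p1 is start)  # 42 True False
```

After the rebinding, p2 and start still name the original object, so nothing done through p1 from that point on can affect them.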
You overcomplicate things. The fundamental, simple rule is: each time you use = to assign an object to a variable, you make the variable name refer to that object, that's all. The object being mutable or not makes no difference.
With a = b = 3, you make the names a and b refer to the object 3. If you then make a = 4, you make the name a refer to the object 4, and the name b still refers to 3.
With a = b = [], you've created two names a and b that refer to the same list object. When doing a.append(1), you append 1 to this list. You haven't assigned anything to a or b in the process (you didn't write any a = ... or b = ...). So, whether you access the list through the name a or b, it's still the same list that you manipulate. It can just be called by two different names.
The same happens in your example with classes: when you write fast = fast.next.next, you make the name fast refer to a new object.
When you do p1.val = 42, you don't make p1 refer to a new, different instance; you change the val attribute of the existing instance. p1 and p2 are still two names for this one instance, so using either name lets you refer to the same instance.
Mutable and Immutable Objects
When a program is run, data objects in the program are stored in the computer’s memory for processing. While some of these objects can be modified at that memory location, other data objects can’t be modified once they are stored in memory. The property of whether or not data objects can be modified in the same memory location where they are stored is called mutability.
We can check the mutability of an object by checking its memory location before and after it is modified. If the memory location remains the same when the data object is modified, it means it is mutable. To check the memory location where a data object is stored, we use the function id(). Consider the following example:
a = [5, 10, 15]   # Assigning values to the list a.
id(a)             # Using the function id() to get the memory location of a.
# 1906292064      (The ID of the memory location where a is stored.)
a[1] = 20         # Replacing the second item in the list, 10, with a new item, 20.
print(a)          # Using the print() function to verify the new value of a.
id(a)             # Using the function id() to get the memory location of a.
# 1906292064      (The ID of the memory location where a is stored.)
The memory location has not changed: the ID (1906292064) is the same before and after the list is modified. This indicates that the list is mutable, i.e., it can be modified at the same memory location where it is stored.
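For contrast, here is a small Python 3 sketch that runs the same check on a list and then tries the same modification on a tuple. The variable names are illustrative, and the actual id values will differ from run to run (in CPython, id() happens to return the object's memory address).

```python
numbers = [5, 10, 15]
before = id(numbers)
numbers[1] = 20                  # in-place modification of the list
print(id(numbers) == before)     # True: still the same object

frozen = (5, 10, 15)
try:
    frozen[1] = 20               # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)
```

An immutable object offers no way to change it in place, so the only way to get a "changed" value is to build a new object and bind a name to it.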

Why does recursive function for travelling to front of doubly linked list not work?

I am a novice who has just finished edX's introductory course MIT 6.00.1x; the following is related to a problem on that course's final exam (now concluded, so I can seek help). Let
class DLLNode(object):
    def __init__(self, cargo):
        self.cargo = cargo
        self.before = None
        self.after = None
    def setBefore(self, before): self.before = before
    def setAfter(self, after): self.after = after
    def getBefore(self): return self.before
    def getAfter(self): return self.after
    def getCargo(self): return self.cargo
be used to create a doubly linked list. Suppose node is an instance of class DLLNode that appears in a doubly linked list. Then node.getBefore() returns that node's immediate predecessor in the list, except that it returns None if node is at the front of the list and so has no predecessor.
I have written a recursive function
def firstInList(nodeInList):
    """ Prints out the cargo carried by the first node in that doubly linked list
    of which nodeInList is a part. Returns that first node. """
    if nodeInList.getBefore() == None:
        firstnode = nodeInList
        print firstnode.getCargo()
        return firstnode
    # nodeInList.getBefore() is not None, so nodeInList has an immediate predecessor
    # on which firstInList can be called.
    firstInList(nodeInList.getBefore())
that I wish to return the first node in a doubly linked list, given as argument a known node nodeInList in the list.
My problem: firstInList arrives at the correct first node, as evidenced by its printing the first node's cargo regardless of the specific nodeInList used. But whenever nodeInList is not the first node in the linked list, the return value of firstInList(node) turns out to be None rather than the desired first node. This conclusion is based on the following: If, for example, the list's first node node1 has cargo 1 and is followed by node2 with cargo 2, then firstInList(node2) == None evaluates as True but firstInList(node2) == node1 evaluates as False. A call firstInList(node2).getCargo() will return an error message
AttributeError: 'NoneType' object has no attribute 'getCargo'
Another datum is that firstInList(node1) == node1 evaluates as True; that, at least, is as I would expect.
This suggests the firstnode found is not being returned back up the chain of recursive calls in the way I have imagined. Can anyone explain why?
(Please do not suggest that I use iteration instead of recursion. I know how to do that. I am trying to understand Python 2.7's behavior for the code as written.)
Well, it would appear that you're not returning the result of the recursive call, so in every case but the base case the function falls off the end and implicitly returns None.
The last line should be:
return firstInList(nodeInList.getBefore())
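A minimal Python 3 sketch of the corrected function. The DLLNode here is a stripped-down stand-in without the getter and setter methods, and the link helper is hypothetical, added only to build a small list.

```python
class DLLNode:
    def __init__(self, cargo):
        self.cargo = cargo
        self.before = None
        self.after = None


def link(first, second):
    # Hypothetical helper: place `second` directly after `first`.
    first.after = second
    second.before = first


def first_in_list(node):
    if node.before is None:
        return node
    # Without this `return`, the result of the inner call is discarded
    # and the outer call implicitly returns None.
    return first_in_list(node.before)


node1, node2, node3 = DLLNode(1), DLLNode(2), DLLNode(3)
link(node1, node2)
link(node2, node3)
print(first_in_list(node3) is node1)  # True
```

Each level of the recursion has to hand the value it received back to its own caller; a return in the base case alone only reaches the call directly above it.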
Many thanks to Nathan Tuggy. At first I misunderstood what you were saying, but in fact you were correct.
My firstInList function worked perfectly once I changed the last line
firstInList(nodeInList.getBefore())
to read
return firstInList(nodeInList.getBefore()) .
Given the ridiculous number of hours I've spent worrying about this, I think this is a type of mistake I'm not likely to make in the future. Or if I do, I'll be able to discover the problem myself.

leaves_and_internals Function

This question is for school (homework) so I am not asking for code, and I don't want any, just an idea. I have to write a function that returns two lists, a list of the leaves and a list of the internal nodes of a binary tree. My algorithm is:
1) If both the left and the right subtrees are None, it is a leaf, and so I add it to the leaves list.
2) If they are not, then I add it to the internals list, and call the function on the left subtree, and then on the right, if they exist.
This is the code I have written:
def leaves_and_internals(self):
    leaves = []
    internals = []
    if self.left is None and self.right is None:
        leaves.append(self.item)
    else:
        internals.append(self.item)
        if self.left != None:
            leaves_and_internals(self.left)
        else:
            leaves_and_internals(self.right)
    return internals, leaves
I'm pretty sure that the algorithm is correct, but I think that every time I recurse on the Nodes, the lists will get reset. How can I get around this?
Any help is greatly appreciated. Thanks
I have not looked into the algorithm in your code; I am only suggesting an answer to the problem you're stuck on. You could pass leaves and internals as arguments to the recursive function, so that their contents are retained across the recursive calls.
In python, if you pass a mutable object to a function/method, the function/method gets a reference to the object. So as long as you still treat it as the same mutable object (i.e. not assign the parameter with something else directly), any changes you make to the object are also visible to the caller. Since list is a mutable type, this behavior is very much helpful for the case you're interested in.
And make sure to initialize the lists to [] before calling the leaves_and_internals function from outside.
def leaves_and_internals(self, leaves, internals):
    if self.left is None and self.right is None:
        leaves.append(self.item)
    else:
        internals.append(self.item)
        # It is a method, so call it on the child node; check each child separately.
        if self.left is not None:
            self.left.leaves_and_internals(leaves, internals)
        if self.right is not None:
            self.right.leaves_and_internals(leaves, internals)
    return

# Somewhere outside
leaves = []
internals = []
myobj.leaves_and_internals(leaves, internals)
UPDATE:
Since the OP mentions he cannot change the signature of the method nor use instance variables, this is an alternate solution I can think of which returns the leaves and internals to the caller. BTW, I assume some nodes in your tree can have both left and right, so you would need to check both (i.e. use 2 separate if instead of an if...else).
def leaves_and_internals(self):
    leaves = []
    internals = []
    if self.left is None and self.right is None:
        leaves = [self.item]
    else:
        # It is a method, so call it on the child node rather than as a bare function.
        if self.left is not None:
            leaves, internals = self.left.leaves_and_internals()
        if self.right is not None:
            templeaves, tempinternals = self.right.leaves_and_internals()
            leaves += templeaves
            internals += tempinternals
        internals.append(self.item)
    return leaves, internals
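For reference, a self-contained Python 3 sketch of the same idea. The TreeNode class is a hypothetical minimal node (the question does not show its own class), and this version lists each internal node before its descendants, which is a choice made here rather than something the question requires.

```python
class TreeNode:
    # Hypothetical minimal node; the question's own class is not shown.
    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right

    def leaves_and_internals(self):
        if self.left is None and self.right is None:
            return [self.item], []
        leaves, internals = [], [self.item]
        for child in (self.left, self.right):
            if child is not None:
                # Each call builds its own lists and hands them back up.
                child_leaves, child_internals = child.leaves_and_internals()
                leaves += child_leaves
                internals += child_internals
        return leaves, internals


tree = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
print(tree.leaves_and_internals())  # ([4, 3], [1, 2])
```

Because every call returns its results instead of relying on shared state, the method signature stays unchanged and no instance variables are needed.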

Why does this function print a Linked List backwards?

I'm working through Downey's How To Think Like a Computer Scientist and I have a question regarding his print_backward() function for Linked List.
First, here's Downey's implementation of a Linked List in Python:
class Node:
    #initialize with cargo (stores the value of the node)
    #and the link. These are set to None initially.
    def __init__(self, cargo = None, next = None):
        self.cargo = cargo
        self.next = next
    def __str__(self):
        return str(self.cargo)
We give this class the following cargo and link values:
#cargo
node1 = Node('a')
node2 = Node('b')
node3 = Node('c')
#link them
node1.next = node2
node2.next = node3
To print the linked list, we use another of Downey's functions.
def printList(node):
    while node:
        print node,
        node = node.next

>>> printList(node1)
a b c
All very straightforward. But I don't understand how the recursive call in the following function allows one to print the linked list backwards.
def print_backward(list):
    if list == None : return
    print_backward(list.next)
    print list,

>>> print_backward(node1)
c b a
Wouldn't calling "list.next" as the value of print_backward simply give you "b c"?
Note: Several people below have pointed out that this function is badly designed since, given any list, we cannot show that it will always reach the base case. Downey also points out this problem later in the same chapter.
The forward-printing version prints each node before moving on to the next one. The backward-printing version prints each node only after the recursive call on the rest of the list has returned.
This is not coincidental.
Both functions walk to the end of the list. The difference is whether printing happens on the way there or afterward.
Function calls use a stack, a last-in first-out data structure that remembers where the computer was executing code when the function call was made. What is put on the stack in one order comes off in the opposite order. Thus, the recursion is "unwound" in the reverse order of the original calls. The printing occurs during the unwinding process, i.e., after each recursive call has completed.
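Here is a small Python 3 sketch of that ordering. It collects cargo into a list instead of printing so the result is easy to inspect. The function names are illustrative, and collect_forward is a hypothetical recursive counterpart to Downey's loop-based printList, included only for comparison.

```python
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next


def collect_forward(node, out):
    if node is None:
        return
    out.append(node.cargo)            # runs before recursing
    collect_forward(node.next, out)


def collect_backward(node, out):
    if node is None:
        return
    collect_backward(node.next, out)
    out.append(node.cargo)            # runs only after the call above returns


head = Node("a", Node("b", Node("c")))
forward, backward = [], []
collect_forward(head, forward)
collect_backward(head, backward)
print(forward, backward)  # ['a', 'b', 'c'] ['c', 'b', 'a']
```

The two functions differ only in whether the append sits before or after the recursive call, and that alone reverses the order.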
def print_backward(list):
    if list == None : return
    print_backward(list.next)
    print list,
Wouldn't calling "list.next" as the value of print_backward simply give you "b c"?
No; picture what happens when a->b->c gets passed to print_backward:
"[b c]" is passed to print_backward and then "a" is printed.
But "print_backward", before "a" is printed, calls itself. So:
[ a b c ] is not None, so b->c gets passed to print_backward
    [ b c ] is passed to print_backward
        [ c ] is passed to print_backward
            None is passed to print_backward
            which returns
        and then "c" is printed
    and then "b" is printed
and then "a" is printed
quit.
If list isn't None, it calls print_backward, then prints the first member of the list. Expanded, this is essentially what happens. You can see that when the calls start returning, 'c' is printed, then 'b', then 'a'.
It looks like when actually printing a list, it prints the first node
print_backward(list='a','b','c')
    print_backward(list='b','c')
        print_backward(list='c')
            print_backward(list=None)
                list is None, so return
            print 'c'
        print 'b','c'
    print 'a','b','c'
Sometimes I find it easier to think of recursion as merely constructing a list of calls to be made in a certain order. As the function continues, it builds up a bunch of calls until it finally gets to the base case. The base case is the situation where no further breaking down of the problem is necessary; in this function, the base case is when there is nothing to print, in which case we just leave without doing anything, using return.
The cool stuff usually happens on the way back as we unwind the recursive stack of function calls. At this point, print_backward has been called on each element of the list, and it will now 'unwind', finishing the most recent calls first and the earlier calls last. This means that the 'instance' of print_backward created when you call it on the last element is the first one to finish, and thus the last element is the first one to be printed, followed by the second to last, third to last, etc., until the original function finally exits.
Take a look at this representation of what happened:
print_backward(node1) #first call to print_backward
    print_backward(node2) #calls itself on next node
        print_backward(node3) #calls itself on next node
            print_backward(None) #calls itself on None. We can now start unwinding as this is the base case:
            print Node3 #now the third invocation finishes...
        print Node2 #and the second...
    print Node1 #and the first.
While the function is called first on the earlier elements, the part that actually prints that element comes after the recursive call, so it won't actually execute until that recursive call finishes. In this case, that means that the print list part won't execute until all of the later elements have been printed first (in reverse order), thus giving you the list elements printed backwards. :D
It's using recursion. It "zips" all the way down until it gets to the end, then it prints every element as each call returns. Since the first one to get to print is the most recent called, it prints the list backwards.
No. There are two kinds of recursion:
Tail recursion: there is nothing left to do after the recursive call returns except return its value, so the function calls need not be stacked (CPython does not perform this optimization, though).
Recursion that reaches the base case first (in this case None) and then processes the list backward. Each function call is pushed onto the stack for later processing. In your example, the calls are stacked as 'a' -> 'b' -> 'c' -> None; then, as the stack is popped, the elements are printed backward: on None return, then print 'c' -> print 'b' -> print 'a'.
In your case, the author only demonstrated a different concept of recursion, and used that to print the list backwards.
Your nodes look something like this:
node1 node2 node3
'a' => 'b' => 'c' => None
At the first call to print_backward, the variable list has the value 'a', subsequent calls to print_backward move one further down the line. Note that none of them print anything until you hit the guard (None) at which time, things get printed from the back to front as the print_backward that received node 'c' must return before the print_backward that received node 'b' can print (because the print statement is after the function call) and so on.
While I recognize that this is somebody else's code, there are a few things in here which are bad practice, and it's best I tell you now while you're learning rather than later. First, don't use list as a variable name, since it is the name of a builtin type in Python. Second, the equality test if obj == None is better written as if obj is None. Finally, in Python 2 it's always a good idea to have your classes inherit from object (class Node(object):), as that makes them new-style classes.

Concatenate Python Linked List

I am attempting to concatenate a Python linked list without copying the data contained within the nodes of the list. I have a function that will concatenate the list using copies of the nodes passed in, but I can't seem to get the function that doesn't use copies to work.
These functions are for testing and timing purposes; I know that Python's built-in list is awesome!
Here is the class I have been working with and the concatenate function.
class Cell:
    def __init__( self, data, next = None ):
        self.data = data
        self.next = next
    def print_list(self):
        node = self
        while node != None:
            print node.data
            node = node.next
The concatenation function is not meant to be a member function of the Cell class.
def list_concat(A, B):
    while A.next != None:
        A = A.next
    A.next = B
    return A
This function overwrites the first element of a list if the parameter A has more than one node. I understand why that is happening, but am not sure how to go about fixing it.
Here is the testing code I've been using for this function.
e = Cell(5)
test = Cell(3, Cell(4))
test2 = list_concat(test, e)
test2.print_list()
Any insight or help would be greatly appreciated.
*edited to fix code formatting
Try this instead:
def list_concat(A, B):
    current = A
    while current.next != None:
        current = current.next
    current.next = B
    return A
Assigning new values to a function's parameters is a bad programming practice, and the code in your question shows why: You used A for iterating over the original list, and by doing so, you lost the reference to its first element.
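A runnable Python 3 sketch of that fix, using the question's test values. The to_list helper is hypothetical and exists only to inspect the result. Like the version above, it assumes the first list has at least one cell.

```python
class Cell:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def list_concat(a, b):
    current = a                    # walk with a separate name; keep `a` as the head
    while current.next is not None:
        current = current.next
    current.next = b               # link the lists; no cells are copied
    return a


def to_list(cell):
    # Hypothetical helper: gather the data of each cell into a Python list.
    out = []
    while cell is not None:
        out.append(cell.data)
        cell = cell.next
    return out


e = Cell(5)
test = Cell(3, Cell(4))
joined = list_concat(test, e)
print(to_list(joined))  # [3, 4, 5]
```

The returned head is the same object as test, and the last cell of the result is e itself, so no data has been copied.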
I'm not sure whether extend performs a copy or not, but in case it doesn't, just use
A.extend(B)
