Sorry if this is a general question but I am a beginner in Python and many times when I see other people code using recursion, they create a helper function for the main function and then call that helper function which itself is recursive.
This seems a bit different from the simplest cases of recursion (for example, summing a list or computing a factorial), where the function only calls itself.
Can someone explain this technique more carefully perhaps with examples?
Much appreciated.
Example 1: (Reversing linked list using recursion)
def revert_list(self):
    self.head = self._revert_helper(self.head)

def _revert_helper(self, node):
    temp = None
    if node.forward is None:
        return node
    else:
        temp = self._revert_helper(node.forward)
        node.forward.forward = node
        node.forward = None
        return temp
Example 2: (Binary Search Tree)
def __contains__(self, key):
    return self._bstSearch(self._root, key)

# Returns the value associated with the key.
def valueOf(self, key):
    node = self._bstSearch(self._root, key)
    assert node is not None, "Invalid map key."
    return node.value

# Helper method that recursively searches the tree for a target key:
# returns a reference to the Node. This allows
# us to use the same helper method to implement
# both the contains and valueOf() methods of the Map class.
def _bstSearch(self, subtree, target):
    if subtree is None:  # base case
        return None
    elif target < subtree.key:  # target is left of the subtree root
        return self._bstSearch(subtree.left, target)
    elif target > subtree.key:  # target is right of the subtree root
        return self._bstSearch(subtree.right, target)
    else:  # base case
        return subtree
This is actually used more often in other languages, because Python can usually emulate that behavior with optional arguments. The idea is that the recursion takes a number of initial arguments that the user doesn't need to provide, which help keep track of the problem.
def sum(lst):
    return sumhelper(lst, 0)

def sumhelper(lst, acc):
    if lst:
        acc += lst[0]
        return sumhelper(lst[1:], acc)
    return acc
Here it's used to set a starting parameter to 0, so the user doesn't have to provide it. However, in python you can emulate it by making acc optional:
def sum(lst, acc=0):
    if lst:
        acc += lst[0]
        return sum(lst[1:], acc)
    return acc
Usually when I do this, it is because the recursive function is tricky or annoying to call, so I have a wrapper that is more convenient. For example, imagine a maze solver function. The recursive function needs a data structure to keep track of visited spots inside the maze, but for the caller's convenience I want the caller to pass in only the maze to solve. You can maybe handle this with a default argument in Python.
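To make the maze idea concrete, here is a minimal sketch (Python 3 syntax). The grid format, the "#" wall marker, and the names solve_maze and _solve are assumptions made up for this illustration, not part of any library. The wrapper creates the visited set once, and the recursive helper threads it through every call.

```python
def solve_maze(maze, start, goal):
    """Public wrapper: the caller only supplies the maze and the endpoints."""
    return _solve(maze, start, goal, set())


def _solve(maze, pos, goal, visited):
    """Recursive helper: `visited` is bookkeeping the caller never sees."""
    row, col = pos
    # Reject positions outside the grid, walls, and already-visited cells.
    if not (0 <= row < len(maze) and 0 <= col < len(maze[0])):
        return None
    if maze[row][col] == "#" or pos in visited:
        return None
    if pos == goal:
        return [pos]
    visited.add(pos)
    # Try the four neighbours; the first one that leads to the goal wins.
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        path = _solve(maze, (row + d_row, col + d_col), goal, visited)
        if path is not None:
            return [pos] + path
    return None


# Hypothetical 3x3 maze: "." is open floor, "#" is a wall.
maze = [
    "..#",
    ".##",
    "...",
]
print(solve_maze(maze, (0, 0), (2, 2)))
```

The caller never has to know that a visited set exists, which is the convenience the wrapper buys.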
The other major reason I have done this is for speed. The recursive function is very trusting, and assumes its arguments are all valid; it just goes full speed ahead with the recursion. Then the wrapper function carefully checks all the arguments before making the first call to the recursive function. As a trivial example, factorial:
def _fact(n):
    if n == 0:  # still need to handle the basis case
        return 1
    return n*_fact(n-1)

def fact(n):
    n0 = int(n)
    if n0 != n:
        raise ValueError("argument must make sense as an int")
    if n0 < 0:
        raise ValueError("negative numbers not allowed")
    return _fact(n0)  # pass the coerced integer to the trusting helper
I have edited this from the original, and now it's actually a pretty reasonable example. We coerce the argument to an integer ("duck typing") but we require that the != operator not indicate it to have changed in value by this coercion; if converting it to int changes the value (for example, a float value that had a fractional part truncated) we reject the argument. Likewise, we check for negative and reject the argument. Then the actual recursive function is very trusting and contains no checks at all.
I could give less vague answers if you posted an example you have seen of code that inspired this question.
EDIT: Okay, discussion of your examples.
Example 1: (Reversing linked list using recursion)
Pretty simple: the "helper" function is a general recursive function that will work on any node of a linked list. The wrapper is a method that knows how to find self.head, the head of the list. This "helper" is a method of the class, but it could also be a plain function in a general data-structures library. (This makes more sense in Python than in languages like C, because a function like this could work with any linked list class that uses an attribute called forward as its "next pointer". So you really could write this once and then use it with multiple classes that implement linked lists.)
Example 2: (Binary Search Tree)
The actual recursive function returns None if no node can be found with the specified key. Then there are two wrappers: one that implements __contains__(), which works just fine if it returns None; and valueOf(), which raises an exception if the key is not found. As the comment notes, two wrappers lets us solve two different problems with a single recursive function.
Also, just as with the first example, the two wrappers kick off the search in a specific location: self._root, the root of the tree. The actual recursive function can be started anywhere inside a tree.
If __contains__() were implemented with a default argument for the node to search, and the default was set to some unique value, it could check for that special value and start at the root in that case. Then when __contains__() is called normally, the unique value would be used as the default, and the recursive function would know that it needs to start at self._root. (You can't just use self._root as the default value, because default values are evaluated once, when the def statement runs, before any instance exists, so self is not available there.)
class UniqueValue:
    pass

def __contains__(self, key, subtree=UniqueValue):
    if subtree is UniqueValue:
        subtree = self._root
    if subtree is None:  # base case
        return None
    elif key < subtree.key:  # target is left of the subtree root
        return self.__contains__(key, subtree.left)
    elif key > subtree.key:  # target is right of the subtree root
        return self.__contains__(key, subtree.right)
    else:  # base case
        return subtree
Note that while I said it could be implemented as I show here, I didn't say I prefer it. Actually I prefer the two wrappers version. This is a little bit tricky, and it wastes time on every recursive call checking to see if subtree is UniqueValue. More complex and wastes time... not a win! Just write the two wrappers, which start it off in the right place. Simple.
From my experience (and my experience only), I use this style of coding when:
- The recursion is only useful in the larger function (not very recommended, but I have some bad habits)
- There needs to be preparation done for the function, but only once (instead of a flag or other switch)
One way I use it is for logging purposes, while avoiding re-logging levels
def _factorial(x):
    return 1 if x == 0 else x*_factorial(x - 1)

@log  # assuming some logging decorator "log"
def factorial(x):
    return _factorial(x)
Otherwise, log would be called for each recursive level of the factorial function, something I may not desire.
Another usage would be to resolve default arguments.
def some_function(x = None):
    x = x or set()  # or whatever else
    # some stuff
    return some_function()
Would check if x is falsey for every iteration, while what I actually need is a decorator, or as an alternative:
def some_function(x = None):
    return _some_function(x if x else set())
where _some_function is the helper function.
Specifically with Example 2, it allows for some freedom of abstraction. If for some reason you didn't want to use _bstSearch, you could just swap it for some other function in __contains__ (and you'd also be able to reuse code in different places).
I have a script that attempts to read the begin and end point for a subset via a binary search, these values are then used to create a slice for further processing.
I noticed that when these variables did not get set (the search returned None) the code would still run and in the end I noticed that a slice spanning from None to None works as if examining the entire list (see example below).
#! /usr/bin/env python
list = [1,2,3,4,5,6,7,8,9,10]
for x in list[None:None]:
    print x
Does anyone know why the choice was made to treat list[None:None] simply as list[:]? At least that's what I think happens (correct me if I'm wrong). I personally would think that throwing a TypeError would be desirable in such a case.
Because None is the default for slice positions. You can use either None or omit the value altogether, at which point None is passed in for you.
None is the default because you can use a negative stride, at which point the default start and end positions change. Compare list[0:len(list):-1] to list[None:None:-1], for example.
Python uses None for 'value not specified' throughout the standard library; this is no exception.
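The comparison above can be checked directly. This is a small sketch (Python 3 syntax) with a made-up list called numbers; slice.indices() is the standard method that reports the concrete start, stop and step a slice resolves to for a given length.

```python
numbers = list(range(10))

# With a negative step, explicit 0 and len() bounds describe an empty range...
print(numbers[0:len(numbers):-1])      # []

# ...while None lets the slice pick the defaults that suit the step.
print(numbers[None:None:-1])           # the whole list, reversed

# slice.indices() shows what the None defaults resolve to for this length.
print(slice(None, None, 1).indices(len(numbers)))    # (0, 10, 1)
print(slice(None, None, -1).indices(len(numbers)))   # (9, -1, -1)
```

So the defaults depend on the sign of the step, which a fixed numeric default could not express.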
Note that if your class implements the object.__getitem__ hook, you'll get passed a slice() object with the start, stop and step attributes set to None as well:
>>> class Foo(object):
...     def __getitem__(self, key):
...         print key
...
>>> Foo()[:]
slice(None, None, None)
Since Foo() doesn't even implement a __len__, having the defaults use None is entirely logical here.
I also think that list[None:None] is interpreted as list[:]. This is handy behavior because you can do something like this:
return list[some_params.get('start'):some_params.get('end')]
If list slicing didn't work with None, you would have to check whether start and end were None yourself:
start = some_params.get('start')
end = some_params.get('end')
if start is not None and end is not None:
    return list[start:end]
elif start is not None:
    return list[start:]
elif end is not None:
    return list[:end]
else:
    return list[:]
Fortunately this is not the case in Python :).
None is the usual representation for "parameter not given", so you can communicate that fact to a function. You will often see functions or methods declared like this:
def f(p=None):
    if p is None:
        p = some_default_value()
I guess this makes the choice clear: by using None you can tell the slicer to use its default values.
There were several discussions on "returning multiple values in Python", e.g. 1, 2.
This is not the "multiple-value-return" pattern I'm trying to find here.
No matter what you use (tuple, list, dict, an object), it is still a single return value and you need to parse that return value (structure) somehow.
The real benefit of multiple return values is in the upgrade process. For example, originally you have
def func():
    return 1
print func() + func()
Then you decide that func() can return some extra information, but you don't want to break previous code (or modify each call site one by one). It looks like
def func():
    return 1, "extra info"
value, extra = func()
print value # 1 (expected)
print extra # extra info (expected)
print func() + func() # (1, 'extra info', 1, 'extra info') (not expected, we want the previous behaviour, i.e. 2)
The previous code (func() + func()) is broken. You have to fix it.
I don't know whether I made the question clear... You can see the CLISP example. Is there an equivalent way to implement this pattern in Python?
EDIT: I put the above clisp snippets online for your quick reference.
Let me put two use cases here for the multiple-return-value pattern. Perhaps someone has alternative solutions for them:
1. Better support for smooth upgrades. This is shown in the above example.
2. Simpler client-side code. See the alternative solutions I have so far below. Using exceptions can make the upgrade process smooth, but it costs more code.
Current alternatives: (they are not "multi-value-return" constructions, but they can be engineering solutions that satisfy some of the points listed above)
1. tuple, list, dict, an object. As said above, the client side needs to do some parsing, e.g. if ret.success == True: blabla. You need ret = func() before that. It's much cleaner to write if func() == True: blabla.
2. Use an Exception. As discussed in this thread, when the "False" case is rare, it's a nice solution. Even in this case, the client-side code is still too heavy.
3. Use an arg, e.g. def func(main_arg, detail=[]). The detail can be a list, a dict, or even an object, depending on your design. func() returns only the original simple value. Details go into the detail argument. The problem is that the client needs to create a variable before the invocation in order to hold the details.
4. Use a "verbose" indicator, e.g. def func(main_arg, verbose=False). When verbose == False (the default, and the way existing clients use func()), return the original simple value. When verbose == True, return an object which contains the simple value and the details.
5. Use a "version" indicator. Same as "verbose", but extending the idea. In this way, you can upgrade the returned object multiple times.
6. Use a global detail_msg. This is like the old C-style error_msg. In this way, functions can always return simple values. The client side can refer to detail_msg when necessary. One can put detail_msg in global scope, class scope, or object scope depending on the use case.
7. Use a generator. yield simple_return and then yield detailed_return. This solution is nice on the callee's side. However, the caller has to do something like func().next() and func().next().next(). You can wrap it with an object and override __call__ to simplify it a bit, e.g. func()(), but it looks unnatural from the caller's side.
8. Use a wrapper class for the return value. Override the class's methods to mimic the behaviour of the original simple return value. Put the detailed data in the class. We have adopted this alternative in our project for dealing with the bool return type. See the relevant commit: https://github.com/fqj1994/snsapi/commit/589f0097912782ca670568fe027830f21ed1f6fc (I don't have enough reputation to put more links in the post... -_-//)
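As an illustration of the "verbose" indicator alternative, here is a minimal sketch (Python 3 syntax). The doubling logic and the contents of details are placeholders invented for the example; only the shape of the signature comes from the list above.

```python
def func(main_arg, verbose=False):
    """Return a plain value by default, or (value, details) when verbose=True."""
    value = main_arg * 2                      # hypothetical "simple" result
    if not verbose:
        return value                          # existing callers are unaffected
    details = {"input": main_arg, "note": "doubled"}
    return value, details


print(func(1) + func(1))                      # old call sites still work
value, details = func(1, verbose=True)        # new call sites opt in
print(value, details)
```

The cost is that the return type now depends on an argument, which is why this is listed as an engineering workaround rather than a true multiple-value return.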
Here are some solutions:
Based on @yupbank's answer, I formalized it into a decorator, see github.com/hupili/multiret
The 8th alternative above says we can wrap a class. This is the current engineering solution we adopted. In order to wrap more complex return values, we may use a metaclass to generate the required wrapper class on demand. I have not tried it, but this sounds like a robust solution.
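A minimal sketch of the wrapper-class idea for an int return value (Python 3 syntax). IntWithExtra and its extra attribute are hypothetical names chosen for this illustration; this is not the code from the linked commit.

```python
class IntWithExtra(int):
    """An int that also carries additional data in an `extra` attribute."""

    def __new__(cls, value, extra=None):
        # int is immutable, so the value must be set in __new__, not __init__.
        obj = super().__new__(cls, value)
        obj.extra = extra
        return obj


def func():
    return IntWithExtra(1, extra="extra info")


result = func()
print(result.extra)        # the added information
print(func() + func())     # arithmetic behaves like plain int arithmetic
```

Note that the result of func() + func() is an ordinary int, so the extra data does not survive arithmetic. That matches the upgrade goal (old call sites keep working) but means the details must be read from the returned object itself.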
Try inspect?
I gave it a try. It's not very elegant, but at least it's doable, and it works :)
import inspect
from functools import wraps
import re

def f1(*args):
    return 2

def f2(*args):
    return 3, 3

# Map a regex describing the calling line to the function that should run.
PATTERN = dict()
PATTERN[re.compile(r'(\w+) f\(\)')] = f1
PATTERN[re.compile(r'(\w+), (\w+) = f\(\)')] = f2

def execute_method_for(call_str):
    for regex, f in PATTERN.iteritems():
        if regex.findall(call_str):
            return f()

def multi(f1, f2):
    def liu(func):
        @wraps(func)
        def _(*args, **kwargs):
            # Inspect the source line of the caller to see how f() is used.
            frame,filename,line_number,function_name,lines,index=\
                inspect.getouterframes(inspect.currentframe())[1]
            call_str = lines[0].strip()
            return execute_method_for(call_str)
        return _
    return liu

@multi(f1, f2)
def f():
    return 1

if __name__ == '__main__':
    print f()
    a, b = f()
    print a, b
Your case does need code editing. However, if you need a hack, you can use function attributes to return extra values, without modifying the return values.
def attr_store(varname, value):
    def decorate(func):
        setattr(func, varname, value)
        return func
    return decorate

@attr_store('extra',None)
def func(input_str):
    func.extra = {'hello':input_str + " ,How r you?", 'num':2}
    return 1

print(func("John")+func("Matt"))
print(func.extra)
Demo : http://codepad.org/0hJOVFcC
However, be aware that function attributes behave like static variables, so you need to assign values to them with care: appends and other in-place modifications act on the previously saved values.
The trick is not to apply the actual operation to the result directly, but to pass the operation in as a parameter. For your case, you can use the following code:
def x():
    #return 1
    return 1, 'x'*1

def f(op, f1, f2):
    print eval(str(f1) + op + str(f2))

f('+', x(), x())
If you want a generic solution for more complicated situations, you can extend the f function and specify the operation via the op parameter.
I am running into a problem writing recursive member functions in Python. I can't initialize the default value of a function parameter to be the same value as a member variable. I am guessing that Python doesn't support that capability as it says self isn't defined at the time I'm trying to assign the parameter. While I can code around it, the lack of function overloading in Python knocks out one obvious solution I would try.
For example, trying to recursively print a linked list I get the following code for my display function;
def display(self,head = -1):
    if head == -1:
        head = self.head
    if not head:
        return
    print head,
    self.display(head.link)
While this code works, it is ugly.
The main function looks like this:
def main():
    l = List()
    l.insert(3)
    l.insert(40)
    l.insert(43)
    l.insert(45)
    l.insert(65)
    l.insert(76)
    l.display()

if __name__ == "__main__":
    main()
If I could set the display function parameter to default to self.head if it is called without parameters then it would look much nicer. I initially tried to create two versions of the function, one that takes two parameters and one that takes one but as I said, Python doesn't support overloading. I could pass in an argument list and check for the number of arguments but that would be pretty ugly as well (it would make it look like Perl!). The trouble is, if I put the line
head = self.head
inside the function body, it will be called during every recursive call, that's definitely not the behavior I need. None is also a valid value for the head variable so I can't pass that in as a default value. I am using -1 to basically know that I'm in the initial function call and not a recursive call. I realize I could write two functions, one driving the other but I'd rather have it all self contained in one recursive function. I'm pretty sure I'm missing some basic pythonic principle here, could someone help me out with the pythonic approach to the problem?
Thanks!
I don't really see what's wrong with your code. If you chose a falsy default value for head, you could do: head = head or self.head which is more concise.
Otherwise, this is pretty much what you have to do to handle default arguments. Alternatively, use kwargs:
def display(self,**kwargs):
    head = kwargs.get("head", self.head)
    if not head:
        return
    print head,
    self.display(head=head.link) # you should always name an optional argument,
                                 # and you must name it if **kwargs is used.
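For comparison, here is a sketch of the two-function approach the question mentions (Python 3 syntax). Node, LinkedList and the prepending insert() are stand-ins invented for this example, since the original List class is not shown, and the method returns a string instead of printing so that it is easy to check.

```python
class Node:
    def __init__(self, value, link=None):
        self.value = value
        self.link = link


class LinkedList:
    def __init__(self):
        self.head = None

    def insert(self, value):
        # Prepend for brevity; the question's insert() is not shown.
        self.head = Node(value, self.head)

    def display(self):
        """Public entry point: always starts at self.head."""
        return " ".join(self._display(self.head))

    def _display(self, node):
        """Recursive helper: None simply means 'end of list'."""
        if node is None:
            return []
        return [str(node.value)] + self._display(node.link)


numbers = LinkedList()
for value in (3, 40, 43):
    numbers.insert(value)
print(numbers.display())
```

With the starting point fixed in the wrapper, the helper needs no sentinel such as -1, and None keeps its natural meaning as the end of the list.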
I don't get it: how can I return a list instead of None?
class foo():
    def recursion(aList):
        if isGoal(aList[-1]):
            return aList
        for item in anotherList:
            newList = list(aList)
            newList.append(item)
            recursion(newList)

    someList = [0]
    return recursion(someList)
Basically the code is to record all paths (start at 0). Whoever gets 100 first will be returned. isGoal() is to check if last item of the path is 100. And anotherList is a small list of random numbers (from 0 to 100).
return statement
This problem actually took me quite a while to grasp when I first started learning recursion.
One thing to keep in mind when dealing with Python functions/methods is that they always return a value no matter what. So say you forget to declare a return statement in the body of your function/method, Python takes care of it for you then and does return None at the end of it.
What this means is that if you screw up in the body of the function and misplace the return or omit it, instead of an expected return your print type(messed_up_function()) will print NoneType.
Recursion fix
Now with that in mind, when dealing with recursion, first make sure you have a base case besides the inductive case, i.e. something to prevent an infinite recursive loop.
Next, make sure you're returning on both cases, so something like this:
def recur(val):
    """
    takes a string
    returns it back-to-front
    """
    assert type(val) == str
    # the base case (<= 1 also covers the empty string)
    if len(val) <= 1:
        return val
    # the inductive case
    else:
        return val[-1] + recur(val[:-1])  # reverses a string char by char
So what this does is it always returns and is 100% infinite recursion proof because it has a valid base case and a decremented length at each inductive step.
Stack Viewer to debug recursive functions
In case we would run recur('big') with the added assert False at the start of the base case, we would have this stack structure:
From it we can see that at each recursive step, we have val, which is the only parameter of this function, getting smaller and smaller until it hits len(val) == 1 and then it reaches the final return, or in this case assert False. So this is just a handy way to debug your recursive functions/methods. In IDLE you can access such a view by calling Debug > Stack Viewer in the shell.
The function is this:
def recursion(aList):
    if isGoal(aList[-1]):
        return aList
    for item in anotherList:
        newList = list(aList)
        newList.append(item)
        recursion(newList)  # here you ignore what recursion returns
    # when execution reaches this point, nothing is returned
If execution reaches my added comment, after the for loop completes, the function exits and nothing is returned. And when you exit a function without having executed a return statement, None is returned. You must make sure that you return something from the recursive function.
I can't advise you on how to re-write the function since I don't know what it is trying to do. It's very far from obvious to me how it needs to be changed. However, what I can say with complete confidence is that you must return something!
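To show what "returning something" can look like in a search of this general shape, here is a hedged sketch (Python 3 syntax). It is not a rewrite of the asker's function: find_path, steps and goal are hypothetical, and the sketch assumes all steps are positive so that every branch eventually overshoots the goal and the recursion terminates.

```python
def find_path(path, steps, goal):
    """Return the first path whose last element equals `goal`, else None."""
    if path[-1] == goal:
        return path
    if path[-1] > goal:
        return None                      # dead end: stop exploring this branch
    for step in steps:
        result = find_path(path + [path[-1] + step], steps, goal)
        if result is not None:
            return result                # pass the successful path upwards
    return None                          # explicit: no branch reached the goal


print(find_path([0], [3, 5], 11))
```

The key difference from the code in the question is that the value of the recursive call is captured and returned as soon as it is a real result, instead of being discarded.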
I made a program that extracts the text from an HTML file. It recurses down the HTML document and returns the list of tags. For example,
input < li >no way < b > you < /b > are doing this < /li >
output ['no','way','you','are'...].
Here is a highly simplified pseudocode for this:
def get_leaves(node):
    kids=getchildren(node)
    for i in kids:
        if leafnode(i):
            get_leaves(i)
        else:
            a=process_leaf(i)
            list_of_leaves.append(a)

def calling_fn():
    list_of_leaves=[] #which is now in global scope
    get_leaves(rootnode)
    print list_of_leaves
I am now using list_of_leaves in a global scope from the calling function. The calling_fn() declares this variable, get_leaves() appends to this.
My question is, how do I modify my function so that I am able to do something like list_of_leaves=get_leaves(rootnode), i.e. without using a global variable?
I don't want each instance of the function to duplicate the list, as the list can get quite big.
Please don't criticize the design of this particular pseudocode, as I simplified it. It is meant for another purpose: extracting tokens along with associated tags using BeautifulSoup.
You can pass the result list as an optional argument.
def get_leaves(node, list_of_leaves=None):
    list_of_leaves = [] if list_of_leaves is None else list_of_leaves
    kids=getchildren(node)
    for i in kids:
        if leafnode(i):
            get_leaves(i, list_of_leaves)
        else:
            a=process_leaf(i)
            list_of_leaves.append(a)

def calling_fn():
    result = []
    get_leaves(rootnode, list_of_leaves=result)
    print result
Python always passes references to objects; arguments are never copied. This has been discussed before here. Some of the built-in types are immutable (e.g. int, str), so you cannot modify them in place (a new string is created when you concatenate two strings and assign the result to a variable). Instances of mutable types (e.g. list) can be modified in place. We take advantage of this by passing the original list for accumulating the result in our recursive calls.
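The difference between mutating an argument and rebinding a name can be seen in a few lines. This is only an illustrative sketch (Python 3 syntax); append_item and rebind_text are names invented for the example.

```python
def append_item(items):
    items.append("new")        # mutates the caller's list in place


def rebind_text(text):
    text = text + " new"       # builds a new string; the caller's name is untouched
    return text


shared = []
append_item(shared)
print(shared)                  # ['new']

original = "old"
rebind_text(original)
print(original)                # old
```

This is why the accumulator list can be handed down through the recursion without any copying.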
For extracting text from HTML in a real application, using a mature library like BeautifulSoup or lxml.html is always a much better option (as others have suggested).
No need to pass an accumulator to the function or access it through a global name if you turn get_leaves() into a generator:
def get_leaves(node):
    for child in getchildren(node):
        if leafnode(child):
            for each in get_leaves(child):
                yield each
        else:
            yield process_leaf(child)

def calling_fn():
    list_of_leaves = list(get_leaves(rootnode))
    print list_of_leaves
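On Python 3.3 and later, the inner for loop can be replaced with yield from. The sketch below assumes a toy tree in which a node is a list of children and any non-list child is a leaf; that representation stands in for getchildren(), leafnode() and process_leaf(), which are not shown in the question.

```python
def get_leaves(node):
    """Yield every leaf under `node`, depth first.

    Assumption for this sketch: a node is a list of children and any
    non-list child is a leaf.
    """
    for child in node:
        if isinstance(child, list):
            yield from get_leaves(child)   # delegate to the sub-generator
        else:
            yield child


tree = ["no", "way", ["you"], "are", ["doing", ["this"]]]
print(list(get_leaves(tree)))
```

As with the original generator, no list is built until the caller asks for one with list().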
Use a decent HTML parser like BeautifulSoup instead of trying to be smarter than existing software.
@pillmincher's generator answer is the best, but as another alternative, you can turn your function into a class:
class TagFinder:
    def __init__(self):
        self.list_of_leaves = []

    def get_leaves(self, node):
        kids = getchildren(node)
        for i in kids:
            if leafnode(i):
                self.get_leaves(i)
            else:
                a = process_leaf(i)
                self.list_of_leaves.append(a)

def calling_fn():
    finder = TagFinder()
    finder.get_leaves(rootnode)
    print finder.list_of_leaves
Your code likely involves a number of helper functions anyway, like leafnode, so a class also helps group them all together into one unit.
As a general question about recursion, this is a good one. It is common to have a recursive function that accumulates data into some collection. Either the collection needs to be a global variable (bad) or it is passed to the recursive function. In almost every language, passing a collection passes only a reference, so you do not have to worry about space. Someone just posted an answer showing how to do this.