How to stop python recursion - python

I made a function that searches files recursively and I want it to stop recursion when the first file is found:
def search_file(path):
    for name in os.listdir(path):
        sub = os.path.join(path, name)
        if os.path.isfile(sub):
            return sub  # And break recursion
        else:
            search_file(sub)

Return a flag that says whether the file was found. When you call search_file, return if the return value is True.
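A minimal sketch of that flag-based approach. The `found` list parameter is an assumption made here so the path has somewhere to go while the function itself returns only True or False; `sorted` is used just to make the scan order predictable.

```python
import os


def search_file(path, found):
    """Return True as soon as a file is found under path.

    The path of that file is appended to the caller-supplied list `found`
    (a hypothetical parameter, not part of the original question).
    """
    for name in sorted(os.listdir(path)):
        sub = os.path.join(path, name)
        if os.path.isfile(sub):
            found.append(sub)
            return True
        if os.path.isdir(sub) and search_file(sub, found):
            return True  # a deeper call succeeded, so stop scanning here too
    return False  # nothing found in this directory or below it
```

Each level checks the flag returned by the level below it and returns immediately when it is True, which is what unwinds the whole chain of calls.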

You are close. You already break out of the recursion when you find the file; the problem is that you didn't propagate that result all the way up the chain. A well-placed print statement would show what went wrong.
import os
def search_file(path):
    for name in os.listdir(path):
        sub = os.path.join(path, name)
        print('peek at: {}'.format(sub))
        if os.path.isfile(sub):
            return sub  # And break recursion
        else:
            sub = search_file(sub)
            if sub:
                return sub
print(search_file('a'))

Note that you need to be able to return a false result in case the search goes all the way down without finding anything. However, you do not want to leave the for loop just because one subdirectory turned up nothing; the next subdirectory still has to be checked. The return exits the function with the result, so no break is needed.
import os
def search_file(path):
    # Initialize results to no file found
    val = False
    sub = None
    for name in os.listdir(path):
        sub = os.path.join(path, name)
        val = os.path.isfile(sub)
        if val:
            return val, sub  # And break recursion
        else:
            # check the subdirectory.
            val, sub = search_file(sub)
            # Break out if a valid file was found.
            if val:
                return val, sub
    # No files found for this directory, return a failure
    return val, sub
On the other hand, if you want to have only one return, you can use a break as follows:
import os
def search_file(path):
    # Initialize results to no file found
    val = False
    sub = None
    for name in os.listdir(path):
        sub = os.path.join(path, name)
        val = os.path.isfile(sub)
        if val:
            break  # break out of the for loop and return
        else:
            # check the subdirectory.
            val, sub = search_file(sub)
            # Break out if a valid file was found.
            if val:
                break
    # Return the True or False results
    return val, sub

The problem I see is that nothing other than None can be returned once execution enters the else: branch. Could you provide broader details describing what you are trying to do?
There is no way to simply exit recursion when you accomplish a task. Each recursive step that was opened must be closed before moving on. The function must return something (None or a value) to its caller.
I'm imagining this as a class method that stores the result in an attribute, because the original doesn't return anything once recursion has begun. Here's what I would do if this is a class method.
def search_file(self, path):
    for name in os.listdir(path):
        sub = os.path.join(path, name)
        if os.path.isfile(sub):
            self.random_attr = sub
            return True  # And break recursion
        elif self.search_file(sub):
            return True  # a deeper call found the file, so stop here as well
    return False

Related

Recursively construct a tree without using a `return` statement?

I would like to inquire about the scope in Python of an object that is a class variable.
import numpy as np
class treeNode:
    def __init__(self,key):
        self.leftChild = None
        self.rightChild = None
        self.value = key
def insert(root,key):
    if root is None:
        return treeNode(key)
    else:
        if root.value == key:
            return root
        elif root.value<key:
            root.rightChild = insert(root.rightChild,key)
        else:
            root.leftChild = insert(root.leftChild,key)
    return root
def insert_1(root,key):
    if root is None:
        root = treeNode(key)
    else:
        if root.value<key:
            insert_1(root.rightChild,key)
        elif root.value>key:
            insert_1(root.leftChild,key)
def construct_tree(a):
    def insert_1(root,key):
        if root is None:
            root = treeNode(key)
        else:
            if root.value<key:
                insert_1(root.rightChild,key)
            elif root.value>key:
                insert_1(root.leftChild,key)
    root = treeNode(a[0])
    for k in a:
        insert_1(root,k)
    return root
if __name__ == '__main__':
    np.random.seed(1)
    a = np.random.rand(12)
    tree = treeNode(a[0])
    for k in a:
        insert(tree,k)
    for k in a:
        insert_1(tree,k)
    tree_1 = construct_tree(a)
The insert() function produces the whole tree while insert_1() and construct_tree() which do not return anything fail to do so. Is there a function to recursively construct the whole tree without using a return statement? Thank you very much.
In insert, the base case of the recursion is when you're inserting into an empty subtree, represented by None being passed in as root. It works because you can create and return a new treeNode in that case, and the caller will do the right thing with the return value.
If you don't want to be using return, you need to push that base case up to the calling code, so it avoids making a call when a leaf node is going to be added:
def insert_no_return(root, key):
    assert root is not None  # we can't handle empty trees
    if root.value == key:
        return  # no value here, just quit early
    elif root.value < key:
        if root.rightChild is None:  # new base case
            root.rightChild = treeNode(key)
        else:
            insert_no_return(root.rightChild, key)  # regular recursive case, with no assignment
    elif root.value > key:
        if root.leftChild is None:  # new base case for the other child
            root.leftChild = treeNode(key)
        else:
            insert_no_return(root.leftChild, key)  # no assignment here either
That's a bit more repetitive than the version with return, since the base case needs to be repeated for each possible new child, but the recursive lines are a bit shorter since they don't need to assign a value anywhere.
As the assert says at the top, you can't usefully call this on an empty tree (represented by None), since it has no way to change your existing reference to the None root. So construct_tree probably needs special logic to construct empty trees. Your current version of that function doesn't handle empty input at all (and redundantly tries to add the root value to the tree a second time):
def construct_tree(a):
    if len(a) == 0:  # special case to construct an empty tree
        return None
    it = iter(a)  # use an iterator to avoid redundant insertion of a[0]
    root = treeNode(next(it))
    for k in it:
        insert_no_return(root, k)
    return root  # hand the finished tree back to the caller
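To check the idea end to end, here is a compact, self-contained sketch. It reuses the question's attribute names (`leftChild`, `rightChild`, `value`); the `TreeNode` spelling, the `getattr`/`setattr` shortcut and the `in_order` helper are assumptions added only for this illustration.

```python
class TreeNode:
    """Minimal BST node; attribute names mirror the question's class."""
    def __init__(self, key):
        self.leftChild = None
        self.rightChild = None
        self.value = key


def insert_no_return(root, key):
    """Insert key below root by mutating child links; returns nothing."""
    assert root is not None  # an empty tree cannot be modified in place
    if key == root.value:
        return  # duplicate key, nothing to do
    side = "rightChild" if key > root.value else "leftChild"
    child = getattr(root, side)
    if child is None:
        setattr(root, side, TreeNode(key))  # base case: attach a new leaf
    else:
        insert_no_return(child, key)  # recursive case, no assignment needed


def construct_tree(keys):
    """Build a tree from an iterable of keys; None represents an empty tree."""
    it = iter(keys)
    try:
        root = TreeNode(next(it))
    except StopIteration:
        return None
    for k in it:
        insert_no_return(root, k)
    return root


def in_order(node):
    """Hypothetical helper: return the keys in sorted order."""
    if node is None:
        return []
    return in_order(node.leftChild) + [node.value] + in_order(node.rightChild)


tree = construct_tree([5, 2, 8, 1, 9, 2])
print(in_order(tree))  # [1, 2, 5, 8, 9]
```

The in-order listing comes out sorted and without the duplicate 2, which suggests every insertion landed in the tree even though `insert_no_return` never returns a node.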

Return or not return in a recursive function

Before asking, I searched some old questions and got the idea of putting "return" in front of the recursive call inside the function to get the expected result.
Some of them:
How to stop python recursion
Python recursion and return statements
But when I do the same thing with my problem, it gets worse.
I have a Binary Search Tree and want to get the TreeNode instance for a given node key. It looks like a simple traversal requirement, and I had already written similar functions such as the one below, in which I did NOT put return in front of the recursive call:
#preorder_List=[]
def preorder(treeNode):
    if treeNode:
        preorder_List.append(treeNode.getKey())
        preorder(treeNode.has_left_child())
        preorder(treeNode.has_right_child())
    return preorder_List
So for my new requirement, I first wrote it like this:
def getNode(treeNode,key):
    if(treeNode):
        if(treeNode.key==key):
            print("got it=",treeNode.key)
            return treeNode
        else:
            getNode(treeNode.left_child(),key)
            getNode(treeNode.right_child(),key)
Then the issue occurs: it finds the key/node but keeps running and finally reports a None error. So I put return in front of both the left and the right branch, like below:
def getNode(treeNode,key):
    if(treeNode):
        if(treeNode.key==key):
            print("got it=",treeNode.key)
            return treeNode
        else:
            return getNode(treeNode.left_child(),key)
            return getNode(treeNode.right_child(),key)
But this makes things worse: it does reach the key, yet None has already been returned earlier.
Then I tried removing one "return" from either branch, right or left. It worked. (Update: this worked only when my test case contained 3 nodes; with more nodes it didn't. To be precise, if the expected node is on the right, putting return in front of the right-branch call works, but for the left one it doesn't.) What's the better solution?
You need to be able to return the results of your recursive calls, but you don't always need to do so unconditionally. Sometimes you'll not get the result you need from the first recursion, so you need to recurse on the other one before returning anything.
The best way to deal with this is usually to assign the results of the recursion to a variable, which you can then test. So if getNode either returns a node (if it found the key), or None (if it didn't), you can do something like this:
result = getNode(treeNode.left_child(),key)
if result is not None:
    return result
return getNode(treeNode.right_child(),key)
In this specific case, since None is falsey, you can use the or operator to do the "short-circuiting" for you:
return getNode(treeNode.left_child(),key) or getNode(treeNode.right_child(),key)
The second recursive call will only be made if the first one returned a falsey value (such as None).
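A self-contained sketch of that short-circuit pattern. The `Node` class with plain `left`/`right` attributes is a hypothetical stand-in, since the question does not show its tree class; the function is named `get_node` here to keep it apart from the asker's `getNode`.

```python
class Node:
    """Hypothetical stand-in for the question's tree node class."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def get_node(node, key):
    """Return the node holding key, or None if it is not in the tree."""
    if node is None:
        return None  # base case: fell off the end of a branch
    if node.key == key:
        return node  # base case: found it
    # Search left first; only search right if left came back empty.
    return get_node(node.left, key) or get_node(node.right, key)


root = Node(10, left=Node(5, left=Node(3)), right=Node(20, left=Node(18)))
print(get_node(root, 18).key)  # 18
print(get_node(root, 99))      # None
```

This relies on node objects being truthy, which holds for a plain class like this one but would not if the class defined `__bool__` or `__len__`; in that case the explicit `is not None` test is the safer form.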
Note that for some recursive algorithms, you may need to recurse multiple times unconditionally, then combine the results together before returning them. For instance, a function to add up the (numeric) key values in a tree might look something like this:
def sum_keys(node):
    if node is None:  # base case
        return 0
    left_sum = sum_keys(node.left_child())  # first recursion
    right_sum = sum_keys(node.right_child())  # second recursion
    return left_sum + right_sum + node.key  # add recursive results to our key and return
Without knowing more about your objects:
Three base cases:
current node is None --> return None
current node matches the key --> return it
current node does not match, is the end of the branch --> return None
If none of the base cases applies, recurse. Short-circuit the recursion with or: return the left branch result if it is a match, otherwise return the right branch result (which might also be None).
def getNode(treeNode,key):
    if treeNode is None:
        return None
    elif treeNode.key == key:
        print("got it=",treeNode.key)
        return treeNode
    elif not any([treeNode.has_left_child(), treeNode.has_right_child()]):  # any() takes a single iterable
        return None
    #left_branch = getNode(treeNode.left_child(),key)
    #right_branch = getNode(treeNode.right_child(),key)
    #return left_branch or right_branch
    return getNode(treeNode.left_child(),key) or getNode(treeNode.right_child(),key)
Instead of return, use yield:
class Tree:
    def __init__(self, **kwargs):
        self.__dict__ = {i:kwargs.get(i) for i in ['left', 'key', 'right']}
t = Tree(key=10, right=Tree(key=20, left=Tree(key=18)), left=Tree(key=5))
def find_val(tree, target):
    if tree.key == target:
        yield target
        print('found')
    else:
        if getattr(tree, 'left', None) is not None:
            yield from find_val(tree.left, target)
        if getattr(tree, 'right', None) is not None:
            yield from find_val(tree.right, target)
print(list(find_val(t, 18)))
Output:
found
[18]
However, you could also implement the lookup as part of your binary tree class by defining a __contains__ method:
class Tree:
    def __init__(self, **kwargs):
        self.__dict__ = {i:kwargs.get(i) for i in ['left', 'key', 'right']}
    def __contains__(self, _val):
        if self.key == _val:
            return True
        _l, _r = self.left, self.right
        return _val in [[], _l][bool(_l)] or _val in [[], _r][bool(_r)]
t = Tree(key=10, right=Tree(key=20, left=Tree(key=18)), left=Tree(key=5))
print({i:i in t for i in [10, 14, 18]})
Output:
{10: True, 14: False, 18: True}

Using a Function From a Module I Built Returns The First Value Despite Calling it Again With Different Arguments

I built the following module for identifying if a file exists in a directory based on its size and name, all so I can use it in a different part of a project. When I try to use the function for the first time, it works great. But when I call it again, different variables with different parameters return the first answer. What am I missing?
Module:
import os
from stat import *
import math
CORRECT_PATH = ''
FLAG = 1
def check_matching(pathname, desired_file_name):
    global FLAG
    if desired_file_name in pathname:
        FLAG = 0
        return pathname
def walktree(dirz, desired_file_name, size):
    global CORRECT_PATH
    global FLAG
    for f in os.listdir(dirz):
        try:
            if FLAG:
                pathname = os.path.join(dirz, f)
                mode = os.stat(pathname)[ST_MODE]
                if S_ISDIR(mode):
                    # It's a directory, recurse into it
                    walktree(pathname, desired_file_name, size)
                elif S_ISREG(mode):
                    # It's a file, call the callback function
                    new_size = int(os.path.getsize(pathname))
                    if (new_size - int(size)) < math.fabs(0.95*int(size)):
                        CORRECT_PATH = check_matching(pathname, desired_file_name)
                else:
                    # Unknown file type, print a message
                    print 'Skipping %s' % pathname
            else:
                try:
                    CORRECT_PATH = CORRECT_PATH.replace('\\', '/')
                    return True, CORRECT_PATH
                except WindowsError as w:
                    #print w
                    if w[0] == 5:
                        return True, CORRECT_PATH
        except WindowsError as e:
            pass
            # print e
            # if e[0] == 5:
            # return True, CORRECT_PATH # add correct path now
    return False, ''
Now when I call this code (this is an example; I'm using two different text files, with different sizes and different names, which are saved on my local computer):
import LS_FINAL_FOUR
ans = LS_FINAL_FOUR.walktree("C:/", "a_test", 38)
print ans # (True, 'C:/a_test.txt')
ans1 = LS_FINAL_FOUR.walktree("C:/", "a_sample", 1000000)
print ans1 # (True, 'C:/a_test.txt')
Both return the same output. Very frustrating. Does anyone know the reason behind this and how to solve it?
Edit: I'm almost certain it's a problem with the module, but I just can't lay my finger on it.
You need to reset FLAG to 1 each time walktree is called. As it is, FLAG is set to zero once, when you find the first file. On each subsequent call you get that initial result back, because FLAG is still false and so this branch runs again:
            else:
                try:
                    CORRECT_PATH = CORRECT_PATH.replace('\\', '/')
                    return True, CORRECT_PATH
Since CORRECT_PATH is stored as a global, the initial answer persists after your walktree method returns.
I'd suggest fixing up the code by not using globals. Just make check_matching return a result directly to walktree, and don't cache the path result after walktree returns.
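A minimal sketch of what a global-free version could look like. The name `find_file`, the `tolerance` parameter, the use of `abs` in the size check and matching on the file name rather than the full path are all assumptions made for this illustration, not the asker's code.

```python
import os


def find_file(top, wanted_name, size, tolerance=0.95):
    """Return the path of the first matching regular file under top, or None.

    A file matches when wanted_name occurs in its file name and its size is
    within tolerance * size of the requested size (an assumed reading of the
    question's size check).
    """
    try:
        entries = sorted(os.listdir(top))
    except OSError:
        return None  # unreadable directory: treat as "nothing found here"
    for entry in entries:
        path = os.path.join(top, entry)
        if os.path.isdir(path):
            found = find_file(path, wanted_name, size, tolerance)
            if found is not None:
                return found  # propagate the hit up the call chain
        elif os.path.isfile(path):
            close_enough = abs(os.path.getsize(path) - size) < tolerance * size
            if close_enough and wanted_name in entry:
                return path.replace("\\", "/")
    return None  # nothing matched in this directory or below it
```

Because all state lives in local variables and return values, a second call such as `find_file("C:/", "a_sample", 1000000)` starts from scratch and cannot see the result of an earlier call.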

Checking for None when accessing nested attributes

I am currently implementing an ORM that stores data defined in an XSD handled with a DOM generated by PyXB.
Many of the respective elements contain sub-elements and so forth, which each have a minOccurs=0 and thus may resolve to None in the DOM.
Hence when accessing some element hierarchy containing optional elements I now face the problem whether to use:
with suppress(AttributeError):
    wanted_subelement = root.subelement.sub_subelement.wanted_subelement
or rather
if root.subelement is not None:
    if root.subelement.sub_subelement is not None:
        wanted_subelement = root.subelement.sub_subelement.wanted_subelement
While both styles work perfectly fine, which is preferable? (I am not Dutch, btw.)
This also works:
if root.subelement and root.subelement.sub_subelement:
    wanted_subelement = root.subelement.sub_subelement.wanted_subelement
The if statement evaluates None as False and will check from left to right. So if the first element evaluates to false it will not try to access the second one.
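One caveat with the truthiness test, sketched below: an object that defines `__len__` (or `__bool__`) can be falsy without being None, so `if x and ...` may skip an element that is actually present. The `Element` class here is hypothetical; whether the PyXB-generated classes behave this way is not stated in the question.

```python
class Element:
    """Hypothetical element whose truthiness depends on its children."""
    def __init__(self, children=()):
        self.children = list(children)
        self.wanted = "value"

    def __len__(self):
        # Python falls back to __len__ for truth testing when __bool__ is absent.
        return len(self.children)


empty = Element()
print(bool(empty))        # False, although empty is not None
print(empty is not None)  # True
```

If the elements can be falsy in this way, the explicit `is not None` comparison is the more reliable check.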
If you have quite a few such lookups to perform, better to wrap this up in a more generic lookup function:
# use a sentinel object distinct from None
# in case None is a valid value for an attribute
notfound = object()
# resolve a python attribute path
# - mostly, a `getattr` that supports
# arbitrary sub-attributes lookups
def resolve(element, path):
    parts = path.split(".")
    while parts:
        # `head` rather than `next`, to avoid shadowing the builtin
        head, parts = parts[0], parts[1:]
        element = getattr(element, head, notfound)
        if element is notfound:
            break
    return element
# just to test the whole thing
class Element(object):
    def __init__(self, name, **attribs):
        self.name = name
        for k, v in attribs.items():
            setattr(self, k, v)
e = Element(
    "top",
    sub1=Element("sub1"),
    nested1=Element(
        "nested1",
        nested2=Element(
            "nested2",
            nested3=Element("nested3")
        )
    )
)
tests = [
    "notthere",
    "does.not.exists",
    "sub1",
    "sub1.sub2",
    "nested1",
    "nested1.nested2",
    "nested1.nested2.nested3"
]
for path in tests:
    sub = resolve(e, path)
    if sub is notfound:
        print("%s : not found" % path)
    else:
        print("%s : %s" % (path, sub.name))

Python Skipping 'for' loop

I'm making a program that searches a file for code snippets. However, in my search procedure, it skips the for loop entirely (inside the search_file procedure). I have looked through my code and have been unable to find a reason. Python seems to just skip all of the code inside the for loop.
import linecache
def load_file(name,mode,dest):
    try:
        f = open(name,mode)
    except IOError:
        pass
    else:
        dest = open(name,mode)
def search_file(f,title,keyword,dest):
    found_dots = False
    dest.append("")
    dest.append("")
    dest.append("")
    print "hi"
    for line in f:
        print line
        if line == "..":
            if found_dots:
                print "Done!"
                found_dots = False
            else:
                print "Found dots!"
                found_dots = True
        elif found_dots:
            if line[0:5] == "title=" and line [6:] == title:
                dest[0] = line[6:]
            elif line[0:5] == "keywd=" and line [6:] == keyword:
                dest[1] = line[6:]
            else:
                dest[2] += line
f = ""
load_file("snippets.txt",'r',f)
search = []
search_file(f,"Open File","file",search)
print search
In Python, arguments are not passed by reference. That is, if you pass in an argument and the function changes that argument (not to be confused with data of that argument), the variable passed in will not be changed.
You're giving load_file an empty string, and that argument is referenced within the function as dest. You do assign dest, but that just assigns the local variable; it does not change f. If you want load_file to return something, you'll have to explicitly return it.
Since f was never changed from an empty string, an empty string is passed to search_file. Looping over a string will loop over the characters, but there are no characters in an empty string, so it does not execute the body of the loop.
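The difference between rebinding a parameter and mutating the object it refers to can be sketched as follows. The helper names `rebind` and `mutate` are made up for the illustration, and the rewritten `load_file` is only one possible shape for the fix, returning the file object instead of assigning to an argument.

```python
def rebind(dest):
    # Assigning to the parameter only changes the local name `dest`.
    dest = "opened file"


def mutate(dest):
    # Mutating the object the parameter refers to is visible to the caller.
    dest.append("line")


f = ""
rebind(f)
print(repr(f))  # '' (unchanged)

lines = []
mutate(lines)
print(lines)    # ['line']


def load_file(name, mode="r"):
    """Return an open file object, or None if the file cannot be opened."""
    try:
        return open(name, mode)
    except IOError:
        return None
```

With this version the caller writes `f = load_file("snippets.txt")` and checks for None before passing `f` to `search_file`.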
Alternatively, add global f inside each function so that f is treated as a global variable; then you don't have to pass f into the functions at all.
