Here is my simple Python function:
def f(b):
    return b and f(b)
To me this must be an infinite loop, no matter the value of b, because b is always the same. But:
With f(True) I get an overflow, which is what I expected
When I run f(False) I get False!
I'm very curious about that.
Recursion stops when b is falsey, i.e. False (or an empty string, or zero).
In some more detail,
x and y
does not evaluate y if x is False (or, as explicated above, another value which is interpreted as that when converted into a bool value, which Python calls "falsey"), as it is then impossible for y to affect the outcome of the expression. This is often called "short-circuiting".
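A minimal sketch of the short-circuit behaviour described above. The helper noisy and the list evaluated are hypothetical names introduced only for illustration; they record which operands actually get evaluated. Note that and returns one of its operands rather than a strict bool, so f returns whatever falsey value it was given.

```python
evaluated = []  # hypothetical log of which operands were evaluated

def noisy(label, value):
    """Record that this operand was evaluated, then return its value."""
    evaluated.append(label)
    return value

# The right-hand operand is never evaluated because the left one is falsey.
result = noisy("left", False) and noisy("right", True)

def f(b):
    # For a falsey b, `and` returns b immediately and f(b) is never called.
    return b and f(b)

zero_result = f(0)
empty_result = f("")
```

Here result is False and only "left" ends up in evaluated. Calling f with any truthy argument would still recurse until Python raises RecursionError.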
I was wondering why Python 3.7 functions behave in a rather strange way. I think it's kinda weird and contradictory to the whole notion of hashability. Let me clarify what I encounter with a simple example code. Knowing that tuples are hashable, consider the following:
a = (-1, 20, 8)
b = (-1, 20, 8)
def f(x):
    return min(x), max(x)
Now let us examine:
>>> print(a is b, a.__hash__() == b.__hash__())
False True
>>> print((-1, 20, 8) is (-1, 20, 8))
True
This is odd enough, but I guess "naming" hashable objects makes them something different (their id()'s change during variable definition). How about functions? Functions are hashable, right? Let's see:
>>> print(f(a) is f(b))
False
>>> print(id(f(a)) == id(f(b)), f(a).__hash__() == f(b).__hash__())
True True
Now this is the climax of my confusion. You should be surprised that even f(a) is f(a) is False. But how so? Don't you think this kind of behavior is incorrect and should be addressed and fixed by the Python community?
You can't guarantee that two identical calls return the same object: functions are objects in Python too, so they can maintain state. Even setting state aside, you shouldn't rely on is evaluating to True just because two objects have the same contents.
There are cases in which Python will optimize the code to reuse the same object, but you shouldn't assume anything about this.
For example, CPython caches the small integers from -5 through 256, so two computations that both produce 256 yield the same object, while two that both produce 257 generally do not. This is an implementation detail. If you only care about equality of values, use ==; is is designed for object identity checks.
c = 40
def f(x):
    return c + x
a = 1
f(a)
# 41
c += 1
f(a)
# 42
f(a) is f(a)
# True
c += 500
f(a) is f(a)
# False
f(a) is f(a) can result in the same object. For instance, CPython keeps the integers from -5 through 256 as singletons, so the first test returns True. Once we are outside that range (c += 500), each call instantiates its own object to return, and f(a) is f(a) returns False.
The is keyword in Python checks whether both operands refer to the same object. Python provides the id() function to return an identifier for an object that is unique among the objects alive at the same time. So a is b does not compare whether the objects contain the same value; it only tells you whether a and b are the same object.
__hash__() function returns a value based on the content/value of the object.
>>> a = (-1, 20, 8)
>>> b = (-1, 20, 8)
>>> id(a)
2347044252768
>>> id(b)
2347044252336
>>> hash(a)
-3789721413161926883
>>> hash(b)
-3789721413161926883
Now the last question: f(a) is f(b) checks whether the results returned by f(a) and f(b) are the same object in memory.
Your function executes return min(x), max(x), which builds a new tuple containing the min and max of x on every call. Therefore, print(f(a) is f(b)) prints False.
f(a).__hash__() == f(b).__hash__() is True because this compares the hashes of the returned values, not the hash of the function as you think.
If you want the hash of the function, use f.__hash__() or hash(f), since a function in Python is just a callable object.
The only interesting part is that print(id(f(a)) == id(f(b))) shows True. In CPython, id() returns the object's memory address. The tuple returned by f(a) is discarded as soon as id() returns, so the tuple returned by f(b) can be allocated at the same address. Two objects whose lifetimes do not overlap may have the same id().
If you do it separately, it returns False.
>>> c = f(a)
>>> d = f(b)
>>> print(id(f(a)) == id(f(b)))
True
>>> print(id(c) == id(d))
False
It looks like an odd inconsistency, but it is not a bug: id() values are only guaranteed to be unique among objects that exist at the same time. BTW, I'm using Python 3.7.2 on Windows 64-bit. The behavior might be different on a different Python version or implementation.
If you replace integer values with strings, the behavior also changes due to Python's string interning optimization.
Therefore, the lesson here is the same as the general guideline in other languages: avoid comparing object references/pointers if possible, as you might be depending on implementation details such as how objects are referenced, which optimizations apply, and possibly how the GC works.
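As a sketch of the distinction between value, hash, and identity, the snippet below binds both results to names so that neither is discarded before the comparison. It reuses f, a, and b from the question; the remaining variable names are only illustrative, and the identity result is what CPython does rather than a language guarantee.

```python
def f(x):
    return min(x), max(x)

a = (-1, 20, 8)
b = (-1, 20, 8)

# Binding the results to names keeps both tuples alive at the same time.
c = f(a)
d = f(b)

same_value = c == d              # equal contents
same_hash = hash(c) == hash(d)   # equal tuples of ints hash equally
same_object = c is d             # two separately built tuples in CPython
function_hash = hash(f)          # the hash of the function object itself
```

With both tuples alive, same_value and same_hash are True while same_object is False, which matches the id(c) == id(d) result shown earlier.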
Here's an interesting related article: Python Optimization: How it Can Make You a Better Programmer
Say I have these two functions:
def s(x,y,z):
    if x <= 0:
        return y
    return z

def f(a,b):
    return s(b, a+1, f(a,b-1)+1)
If I were to try and find f(5,2) in my head, it would go like this:
f(5,2) = s(2,6,f(5,1)+1)
f(5,1) = s(1,6,f(5,0)+1)
f(5,0) = s(0,6,f(5,-1)+1) = 6
f(5,1) = 7
f(5,2) = 8
I never evaluate f(5,-1) because it is not needed. The s function is going to return 6, since argument x is zero, thus evaluation of argument z is unnecessary.
If I were to run this in Python, however, it would keep recursing until I get a maximum recursion depth error, presumably because Python wants to evaluate all the arguments before executing the s function.
My question is, how would I go about implementing these functions, or any similar scenario, in such a way that the recursion stops when it is no longer needed? Would it be possible to delay the evaluation of each argument until it is used in the function?
Your mind is working with 'insider knowledge' of how s() works. Python can't, so it can only follow the strict rules that all argument expressions to a call will have to be evaluated before the call can be made.
Python is a highly dynamic language, and at every step of execution, both s and f can be rebound to point to a different object. This means Python can't optimise recursion or inline function logic. It can't hoist the if x <= 0 test out of s() to avoid evaluating the value for z first.
If you as a programmer know the third expression needs to be avoided in certain circumstances, you need to make this optimisation yourself. Either merge the logic in s into f manually:
def f(a, b):
    if b <= 0:
        return a + 1
    return f(a, b - 1) + 1
or postpone evaluating the third expression until s() has determined whether it needs to be calculated at all, by passing in a callable and making s responsible for calling it:
def s(x, y, z):
    if x <= 0:
        return y
    return z()  # evaluate the value for z late

def f(a, b):
    # make the third argument a function so it is not evaluated until called
    return s(b, a+1, lambda: f(a, b - 1) + 1)
When a function is called, all arguments are fully evaluated before they're passed to the function. In other words, f(5,-1) is being executed before s is even started.
Fortunately there's an easy way to evaluate expressions on demand: functions. Instead of passing the result of f(a,b-1) to z, pass it a function that computes that result:
def s(x,y,z):
    if x <= 0:
        return y
    return z()  # z is a function now

def f(a,b):
    return s(b, a+1, lambda:f(a,b-1)+1)

print(f(5,2))  # output: 8
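One way to check that the deferred version really stops early is to record every call to f. The list calls below is a hypothetical addition for illustration and is not part of the original answer.

```python
calls = []  # hypothetical log: the value of b for every invocation of f

def s(x, y, z):
    if x <= 0:
        return y
    return z()  # the callable is invoked only when its value is needed

def f(a, b):
    calls.append(b)
    # The lambda wraps the recursive call, so building it does not recurse.
    return s(b, a + 1, lambda: f(a, b - 1) + 1)

result = f(5, 2)
```

After this runs, result is 8 and calls is [2, 1, 0]: f(5, -1) is never invoked, which matches the hand evaluation in the question.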
Please help me to understand how this works. Output is 4
a=4
b=7
x=lambda: a if 1 else b
lambda x: 'big' if x > 100 else 'small'
print(x())
First, let's remove this line as it doesn't do anything:
lambda x: 'big' if x > 100 else 'small'
This lambda expression is defined but never called. The fact that its argument is also called x has nothing to do with the rest of the code.
Let's look at what remains:
a = 4
b = 7
x = lambda: a if 1 else b
print(x())
Here x is bound to a function, because a lambda expression creates one. The lambda form can only contain expressions, not statements, so it has to use the expression form of if, which reads backwards compared to the statement form:
true-result if condition else false-result
In this case the condition is 1, which is always true, so the result of the function x() is always the value of a, assigned to 4 earlier in the code. Effectively, x() acts like:
def x():
    return a
Understanding the differences between expressions and statements is key to understanding code like this.
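To make the expression/statement distinction concrete, here is a sketch that writes the same logic both ways. The names x_as_def and pick are hypothetical and only serve the comparison.

```python
a = 4
b = 7

# Conditional expression: the whole "a if 1 else b" is the lambda's body.
x = lambda: a if 1 else b

def x_as_def():
    # The same logic written with an if statement.
    if 1:
        return a
    else:
        return b

# With a real condition instead of the constant 1, both branches are reachable.
pick = lambda flag: a if flag else b
```

Both x() and x_as_def() return 4, while pick(0) returns 7 because the condition is falsey.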
Calling your x always returns 4, as it takes no arguments and the condition 1 is always true.
Then you have a lambda expression that is neither assigned to a variable nor used anywhere else.
Eventually, you print out x(), which is always 4 as I said above.
P.S. I strongly suggest you read Using lambda Functions from Dive into Python
Let me translate that for you.
You assign to x a lambda function with no arguments. Because 1 always evaluates as true, you always return the externally defined variable a, which evaluates as 4.
Then, you create a lambda function with one argument x, which you don't assign to a variable/access name, so it is lost forever.
Then, you call function x, which always returns a. Output is 4.
I want to understand how the return statement works. I am familiar with the return statement but not aware of the return in statement. Below is an example of a class method that uses it and I would like to know what it does.
def a(self, argv):
    some = self.fnc("Format Specifier")
    return argv in some
value in values means "True if value is in values otherwise False"
a simple example:
In [1]: "foo" in ("foo", "bar", "baz")
Out[1]: True
In [2]: "foo" in ("bar", "baz")
Out[2]: False
So in your case return argv in some means "return True if argv is in some otherwise return False"
It tests whether argv is an element of some and yields a Boolean value. some could be a list, tuple, dict, etc.
It may be clearer if you know what happens in the background. When you use x in y, that is a shortcut for y.__contains__(x) [1]. When you define a class, you can define your own __contains__ method. It can return anything you want, but the in operator converts the result to True or False. Therefore, argv in some will be the result of some.__contains__(argv), converted to either True or False. You then return that.
[1] If y does not have the __contains__ method, Python iterates over y and checks each item for equality with x.
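A small sketch of a user-defined __contains__. The class EvenNumbers is hypothetical; it deliberately returns strings instead of booleans to show that the in operator converts the method's result to True or False.

```python
class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, item):
        # Return a non-bool on purpose: "yes" is truthy, "" is falsey.
        if isinstance(item, int) and item % 2 == 0:
            return "yes"
        return ""

evens = EvenNumbers()

in_result = 4 in evens              # `in` converts "yes" to True
not_in_result = 3 in evens          # `in` converts "" to False
direct = evens.__contains__(4)      # calling the method directly gives "yes"
```

Note the operand order: 4 in evens calls evens.__contains__(4), not the other way around.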
The return itself is of no importance here: you can interpret this as:
return (argv in some)
Now the in keyword means:
The operators in and not in test for collection membership. x in s evaluates to true if x is a member of the collection s, and false otherwise. x not in s returns the negation of x in s. The collection membership test has traditionally been bound to sequences; an object is a member of a collection if the collection is a sequence and contains an element equal to that object. However, it makes sense for many other object types to support membership tests without being a sequence. In particular, dictionaries (for keys) and sets support membership testing.
Python uses a fallback mechanism where it will check whether some (in this case) supports one of the following methods:
First it checks whether some is a list or tuple:
For the list and tuple types, x in y is true if and only if there exists an index i such that either x is y[i] or x == y[i] is true.
Next it checks whether both argv and some are strings:
For the Unicode and string types, x in y is true if and only if x is a substring of y. An equivalent test is y.find(x) != -1. Note, x and y need not be the same type; consequently, u'ab' in 'abc' will return True. Empty strings are always considered to be a substring of any other string, so "" in "abc" will return True.
Now every object that implements a __contains__ method supports such in and not in test as is further described in the documentation:
For user-defined classes which define the __contains__() method, x in y is true if and only if y.__contains__(x) is true.
So besides implemented usages for dictionaries, tuples and lists, you can define your own __contains__ method for arbitrary objects.
Another way to support this functionality is the following:
For user-defined classes which do not define __contains__() but do define __iter__(), x in y is true if some value z with x == z is produced while iterating over y. If an exception is raised during the iteration, it is as if in raised that exception.
And finally:
Lastly, the old-style iteration protocol is tried: if a class defines __getitem__(), x in y is true if and only if there is a non-negative integer index i such that x == y[i], and all lower integer indices do not raise IndexError exception. (If any other exception is raised, it is as if in raised that exception).
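The two fallbacks quoted above can be sketched with classes that define only __iter__ or only __getitem__. OnlyIter and OnlyGetItem are hypothetical names used for illustration.

```python
class OnlyIter:
    """Defines __iter__ but not __contains__."""

    def __iter__(self):
        return iter([1, 2, 3])

class OnlyGetItem:
    """Defines __getitem__ only (the old-style iteration protocol)."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # Raises IndexError past the end, which ends the membership search.
        return self._items[index]

found_by_iter = 2 in OnlyIter()
missing_by_iter = 5 in OnlyIter()
found_by_getitem = "b" in OnlyGetItem("abc")
missing_by_getitem = "z" in OnlyGetItem("abc")
```

In both classes the membership test works even though neither defines __contains__; Python walks the items and compares each one with the left operand.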
I understand the concept of recursion; where I get confused is the flow control. I've seen this presented two ways: one I get, kind of, and the other I don't. Example one:
def fact(n):
    if n == 0:
        return 1
    else:
        return n * fact(n-1)
So in this example, if we run fact(3), the following occurs:
fact(3) = 3*fact(3-1)
fact(2) = 2*fact(2-1)
fact(1) = 1*fact(1-1)
fact(0) = 1
or combined: 3*2*1*1 = 6
Now for the following below, where I get tripped up is in how the flow control works. I have it ingrained in my head that when a function is called, everything else is suspended until that function completes, at which time the program returns to main. Here is what my brain thinks is happening below:
def factorial(n):
    if n == 0:
        return 1
    else:
        recurse = factorial(n-1)
        result = n * recurse
        return result
We call factorial(3):
factorial(3)=factorial(2)=factorial(1)=factorial(0)=1
The reason I think this is occurring is that result is assigned after the call, and in my mind the code never gets there because flow control suspends main prior to result being assigned. I think of this function as just running the test of n==0 until 1 is returned, and then the program exits.
Help me understand why I can't seem to conceptualize this.
Here's an outline of the flow of the program. It might be a little confusing to look at, but it could possibly help. Here, each nesting level represents a different stack frame and each line is a step the program executes.
factorial(3)
| factorial(2)
| | factorial(1)
| | | factorial(0)
| | | RETURNS 1
| | recurse = 1
| | result = 1 * 1 [since n=1]
| | RETURNS 1 [returning result]
| recurse = 1 [catching returned result of 1]
| result = 2 * 1 [since n=2]
| RETURNS 2 [returning result]
recurse = 2 [catching returned result of 2]
result = 3 * 2 [since n=3]
RETURNS 6 [returning result]
Think of the meaning of the word reentrant - it means that the function can be entered more than once. Calling the function blocks that particular spot in the flow, but it doesn't block that piece of code from executing again in the next call. When the last call in the chain returns, that unblocks the one before it, and everything gets unblocked in a chain reaction.
You're somewhat incorrect in your understanding that everything else stops before the function returns. That is not true if say, function A calls function B. In that situation, A will run until it calls B, at which point it will pause, B will run and, assuming it doesn't call other functions, return, and A will resume.
In the case of recursion, this means going deeper at each level (A, B, C, D...) until the if clause allows function N to complete without calling any other functions. From that point on, the parent functions will resume one by one, until you're back into "main", as you call it.
I'm on my phone, so typing an example is kinda cumbersome at best. I will make sure to write one once I get home. Perhaps if you wrote a function that printed "This is function X" (and maybe add some indentation) you would visualize it better.
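A sketch of the kind of tracing function suggested above. The extra parameters depth and trace are assumptions added purely for illustration; they are not part of the original factorial.

```python
def factorial(n, depth=0, trace=None):
    """Factorial that records when each call is entered and left."""
    if trace is None:
        trace = []
    indent = "  " * depth  # deeper calls are indented further
    trace.append(f"{indent}enter factorial({n})")
    if n == 0:
        result = 1
    else:
        # This call pauses here until the inner call has returned.
        recurse = factorial(n - 1, depth + 1, trace)
        result = n * recurse
    trace.append(f"{indent}leave factorial({n}) -> {result}")
    return result

trace = []
answer = factorial(3, trace=trace)
print("\n".join(trace))
```

The printed lines nest inward until factorial(0) is reached and then unwind outward, each "leave" line appearing only after all the deeper calls have finished.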
There is much less magic involved than you think. Each call to factorial is independent from each other, with each having its own set of parameters and local variables.
Let's look at just one level of recursion:
factorial(2)
Control will enter the function and the else block. Here, the inner call to factorial happens. So control will enter the function again, running into the if block and return 1. This ends the inner call.
Meanwhile, the outer call has been suspended. After the inner call returns, the result is stored in the recurse variable as control continues. Eventually, the outer call returns with 2.
def factorial(n):
    if n == 0:
        return 1
    else:
        recurse = factorial(n-1)
        result = n * recurse
        return result
So I guess you have an idea that n is local, such that the call to factorial(2) inside factorial(3) doesn't mess with the first call's n, so that result = n * recurse is result = 3 * 2 instead of result = 1 * 1 * 1 * 1.
The scoping rule of Python is that every variable assigned inside a function is local by default. That means that result and recurse each exist in a separate incarnation for every call to factorial where n > 0.
Another thing that people have trouble grasping is that every call to factorial is a totally different incarnation than the last. Actually, it's no different than calling a totally different function:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial2(n-1)
So what happens is that this completely different function returns the factorial of one less than what you need. The original function then resumes by multiplying n with the result of factorial2(n-1) and returns that.
The code for factorial2 is very similar, but it calls factorial3, which in turn calls factorial4. That way there is no recursion, only n different factorial functions that each multiply their own argument with the result of a totally different function they have to wait for in order to calculate their own value. This is of course silly, because you end up with a lot of identical functions with different names; a function calling itself (recursion) is no different than calling a totally different function that does the same thing. In fact, you can think of every call to the same function as a completely separate instance that has nothing to do with the previous call, since they share no data other than what is passed from caller to callee as arguments and what is returned from callee to caller as the result, just as with any other arithmetic function. As long as the answer is calculated only from the arguments, this will always be true (every call with the same value always yields the same answer, i.e. it is functional).
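The thought experiment can be sketched literally for n up to 3. The function names factorial0 through factorial3 are hypothetical, written out only to show that a chain of distinct functions gives the same answer as the recursive one.

```python
def factorial0(n):
    # Bottom of the chain: plays the role of the n == 0 case.
    return 1

def factorial1(n):
    return n * factorial0(n - 1)

def factorial2(n):
    return n * factorial1(n - 1)

def factorial3(n):
    return n * factorial2(n - 1)

def factorial(n):
    # The ordinary recursive version for comparison.
    if n == 0:
        return 1
    return n * factorial(n - 1)

chained = factorial3(3)
recursive = factorial(3)
```

Both chained and recursive are 6. The recursive version simply reuses one definition where the chain needs a separately named function per level.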
As Duffymo said, both sections of code are equivalent; the second is just separated into three lines instead of one. If you can understand the first, then what is there not to understand about the second?
recurse = factorial(n-1)
result = n * recurse
return result
is the same as
result = n * factorial(n-1)
return result
is the same as
return (n * factorial(n-1))
The function starts at the number passed in (let's say 3) and recursively moves down by calling itself with n-1, giving the chain of pending calls (3,2,1,0). At 0 it returns 1. Then we are at (3,2,1); we compute 1*1 = 1 and return that. We now have (3,2), compute 2*1 = 2, and return that. Now we have (3), where we compute 3*2 = 6 and return that. With () [nothing left], the recursion is done.