Why implicitly check for emptiness in Python? [closed] - python

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
The Zen of Python says that explicit is better than implicit.
Yet the Pythonic way of checking a collection c for emptiness is:
if not c:
# ...
and checking if a collection is not empty is done like:
if c:
# ...
ditto for anything that can have "zeroness" or "emptiness" (tuples, integers, strings, None, etc)
What is the purpose of this? Will my code be buggier if I don't do this? Or does it enable more use cases (i.e: some kind of polymorphism) since people can override these boolean coercions?

This best practice is not without reason.
When testing if object: you are basically calling the objects __bool__ method, which can be overridden and implemented according to object behavior.
For example, the __bool__ method on collections (__nonzero__ in Python 2) will return a boolean value based on whether the collection is empty or not.
(Reference: http://docs.python.org/reference/datamodel.html)

Possibly trumped by the first and third items:
Beautiful is better than ugly.
Simple is better than complex.

"Simple is better than complex."
"Readability counts."
Take for example if users -- it is more readable that if len(users) == 0

if c:
# ...
is explicit. The "if" explicitly states that the expression following will be evaluated in boolean context.
It is fair to argue that
if c == 0:
# ...
is clearer, and a programming novice might agree.
But to an experienced programmer (this one at least) the latter is not clearer, it is redundant. I already know c will be compared with 0 because of the "if"; I don't need to be told twice.

Implicit boolean polymorphism is a part of many languages. In fact, the idiom
if msg:
print str(len(msg)) + ' char message: ' + msg
arguably comes from c!
void print_msg(char * msg) {
if (msg) {
printf("%d char message: %s", (int) strlen(msg), msg);
}
}
An unallocated pointer should contain NULL, which conveniently evaluates to false, allowing us to avoid a segmentation fault.
This is such a fundamental concept that it would hamper python to reject it. In fact, given the fairly clumsy version of boolean polymorphism available in c, it's immediately clear (to me at least) that python has made this behavior much more explicit by allowing any type to have a customizable, well-defined boolean conversion via __nonzero__.

If you prefer you can type if len(lst) > 0, but what would be the point? Python does use some conventions, one of which is that empty containers are considered falseish.
You can customize this with object.__nonzero__(self):
Called to implement truth value
testing and the built-in operation
bool(); should return False or True,
or their integer equivalents 0 or 1.
When this method is not defined,
__len__() is called, if it is
defined, and the object is considered
true if its result is nonzero. If a
class defines neither __len__() nor
__nonzero__(), all its instances are
considered true.

Others have already noted that this is a logical extension of the tradition (at least as early as C, perhaps earlier) that zero is false in a Boolean context, and nonzero is true. Others have also already noted that in Python you can achieve some special effects with customized magic methods and such. (And this does directly answer your question "does it enable more use cases?") Others have already noted that experienced Python programmers will have learned this convention and be used to the convention, so it is perfectly obvious and explicit to them.
But honestly, if you don't like that convention, and you are just programming for your own enjoyment and benefit, simply don't follow it. It is much, much better to be explicit, clear, and redundant than it is to be obfuscated, and it's virtually always better to err on the side of redundancy than obfuscation.
That said, when it comes down to it, it's not a difficult convention to learn, and if you are coding as part of a group project, it is best to stick with the project's conventions, even if you disagree with them.

Empty lists evaluating to False and non-empty lists evaluating to True is such a central Python convention that it is explicit enough. If the context of the if-statement even communicates the fact that c is a list, len(c) == 0 wouldn't be much more explicit.
I think len(c) == 0 only is more explicit if you really want to check the length and not for magic zeroness or emptiness because their evaluation may differ from the length check.

Related

Zen of Python 'Explicit is better than implicit'

I'm trying to understand what 'implicit' and 'explicit' really means in the context of Python.
a = []
# my understanding is that this is implicit
if not a:
print("list is empty")
# my understanding is that this is explicit
if len(a) == 0:
print("list is empty")
I'm trying to follow the Zen of Python rules, but I'm curious to know if this applies in this situation or if I am over-thinking it?
The two statements have very different semantics. Remember that Python is dynamically typed.
For the case where a = [], both not a and len(a) == 0 are equivalent. A valid alternative might be to check not len(a). In some cases, you may even want to check for both emptiness and listness by doing a == [].
But a can be anything. For example, a = None. The check not a is fine, and will return True. But len(a) == 0 will not be fine at all. Instead you will get TypeError: object of type 'NoneType' has no len(). This is a totally valid option, but the if statements do very different things and you have to pick which one you want.
(Almost) everything has a __bool__ method in Python, but not everything has __len__. You have to decide which one to use based on the situation. Things to consider are:
Have you already verified whether a is a sequence?
Do you need to?
Do you mind if your if statement crashed on non-sequences?
Do you want to handle other falsy objects as if they were empty lists?
Remember that making the code look pretty takes second place to getting the job done correctly.
Though this question is old, I'd like to offer a perspective.
In a dynamic language, my preference would be to always describe the expected type and objective of a variable in order to offer more purpose understanding. Then use the knowledge of the language to be succinct and increase readability where possible (in python, an empty list's boolean result is false). Thus the code:
lst_colours = []
if not lst_colours:
print("list is empty")
Even better to convey meaning is using a variable for very specific checks.
lst_colours = []
b_is_list_empty = not lst_colours
if b_is_list_empty:
print("list is empty")
Checking a list is empty would be a common thing to do several times in a code base. So even better such things in a separate file helper function library. Thus isolating common checks, and reducing code duplication.
lst_colours = []
if b_is_list_empty(lst_colours):
print("list is empty")
def b_is_list_empty (lst):
......
Most importantly, add meaning as much as possible, have an agreed company standard to chose how to tackle the simple things, like variable naming and implicit/explicit code choices.
Try to think of:
if not a:
...
as shorthand for:
if len(a) == 0:
...
I don't think this is a good example of a gotcha with Python's Zen rule of "explicit" over "implicit". This is done rather mostly because of readability. It's not that the second one is bad and the other is good. It's just that the first one is more skillful. If one understands boolean nature of lists in Python, I think you find the first is more readable and readability counts in Python.

How to correctly determine a type?

this has probably been asked a hundred times before, but you are a product of your own success!
In Python I see this in other peoples code
if type(number) != type(1):
Is there any good reason to do this, should not
if type(number) != int
be the better way to do this?
Is it a historical thing?
Neither is a the correct way.
The correct way is to use isinstance():
if not isinstance(number, int):
This makes sure that subclasses are not allowed either.
In the extremely rare case you must absolutely only bar int and allow subclasses, you use type() with not is; types are singletons:
if type(number) is not int:
Still more pythonic is to use ask for forgiveness rather than for permission; treat the value as an integer until proven otherwise (via an exception for example).
There is no good reason to do any of this. It is probably a symptom of someone used to other strongly typed languages writing Python code.
If you must (but again, think very hard about this) you should use isinstance():
>>> isinstance('1', str)
True
Python is a language for "consenting adults", which is a fancy way of saying we expect you to know what you are doing; just like that famous "with great power comes great responsibility" quote from your favorite superhero movie.
In Python, objects are expected to behave nicely and you are not supposed to have such surprises to have to do explicit type checking.
Normally such type checking is done to make sure you can do something to the value. A naive example would be to add two things (use the + sign), you might want to check that they are both some sort of number. Obviously 'hello' + 5 is not going to work.
So you might be tempted to write code like this:
def add_things(first, second):
if type(first) == int and type(second) == int:
print(first+second)
Then after reading these answers you'd change the above to:
def add_things_smarter(first, second):
if isinstance(first, int) and isinstance(second, int):
print(first+second)
However, its probably best to try to do the operation, then handle the error:
def add_things_better(first, second):
try:
print(first+second)
except ValueError:
print("I can't add {} + {}".format(first, second))
This is known as:
EAFP
Easier to ask for forgiveness than permission. This common Python
coding style assumes the existence of valid keys or attributes and
catches exceptions if the assumption proves false. This clean and fast
style is characterized by the presence of many try and except
statements. The technique contrasts with the LBYL style common to many
other languages such as C.
This is from the glossary section of the documentation, where you will find the definition of LBYL (Look Before You Leap).
Martijn Pieters and Burhan Khalid have given good answers, but neither are the whole story.
In Python, normally the best thing to do with a value is to assume it's the right type. You should avoid using type-based dispatch (that's what the type(x) is type(y) check is) unless it's important.
So instead of doing:
if type(number) is int:
do_int_stuff
else:
not_an_int
you should do:
try:
do_int_stuff
except NotTheRightTypeForThatException:
not_an_int
Better yet, don't get into that position in the first place; just have the right type. Although languages like Java let you overload function by their type, in Python it's normally best to assume that a function takes a mandatory non-optional interface (like requiring addition or being iterable) and never give them something they don't support.
But there are times where it's appropriate to do type-based dispatch, and both kinds (isinstance and type(x) is type(y)) have their place, although the former is far more common.
One time to use isinstance is when implementing operators (like +, -, *, /) for a new type. You want
X() * X()
to work, but you don't want to say what
X() * Y()
means unless you know you're doing the right thing. The reasons for this are nontrivial, but involve the fact that both types need to cooperate to decide what gets run. You would normally do an isinstance check, because if you implement
int(4) * int(2)
and someone subclasses int:
class MyInt(int): ...
you don't want * to break!
The uses of type(X) is type(Y) are much rarer. One practical case is where you are interfacing two type-systems; when converting one type to another language's analogue (or serializing it in something like JSON), it can make more sense to require exact types. In particular, this keeps the round-trip (converting and converting back) lossless.
Normally you'll know when you need this, and it's also far rarer than the others.

Why must Python list addition be homogenous?

Can anyone familiar with Python's internals (CPython, or other implementations) explain why list addition is required to be homogenous:
In [1]: x = [1]
In [2]: x+"foo"
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
C:\Users\Marcin\<ipython-input-2-94cd84126ddc> in <module>()
----> 1 x+"foo"
TypeError: can only concatenate list (not "str") to list
In [3]: x+="foo"
In [4]: x
Out[4]: [1, 'f', 'o', 'o']
Why shouldn't the x+"foo" above return the same value as the final value of x in the above transcript?
This question follows on from NPE's question here: Is the behaviour of Python's list += iterable documented anywhere?
Update: I know it is not required that heterogenous += work (but it does) and likewise, it is not required that heterogenous + be an error. This question is about why that latter choice was made.
It is too much to say that the results of adding a sequence to a list are uncertain. If that were a sufficient objection, it would make sense to prevent heterogenous +=. Update2: In particular, python always delegates operator calls to the lefthand operand, so no issue "what is the right thing to do" arises": the left-hand object always governs (unless it delegates to the right).
Update3: For anyone arguing that this is a design decision, please explain (a) why it is not documented; or (b) where it is documented.
Update4: "what should [1] + (2, ) return?" It should return a result value equal with the value of a variable x initially holding [1] immediately after x+=(2, ). This result is well-defined.
From the Zen of Python:
In the face of ambiguity, refuse the temptation to guess.
Let's look at what happens here:
x + y
This gives us a value, but of what type? When we add things in real life, we expect the type to be the same as the input types, but what if they are disparate? Well, in the real world, we refuse to add 1 and "a", it doesn't make sense.
What if we have similar types? In the real world, we look at context. The computer can't do this, so it has to guess. Python picks the left operand and lets that decide. Your issue occurs because of this lack of context.
Say a programmer wants to do ["a"] + "bc" - this could mean they want "abc" or ["a", "b", "c"]. Currently, the solution is to either call "".join() on the first operand or list() on the second, which allows the programmer to do what they want and is clear and explicit.
Your suggestion is for Python to guess (by having a built-in rule to pick a given operand), so the programmer can do the same thing by doing the addition - why is that better? It just means it's easier to get the wrong type by mistake, and we have to remember an arbitrary rule (left operand picks type). Instead, we get an error so we can give Python the information it needs to make the right call.
So why is += different? Well, that's because we are giving Python that context. With the in-place operation we are telling Python to modify a value, so we know that we are dealing with something of the type the value we are modifying is. This is the context Python needs to make the right call, so we don't need to guess.
When I talk about guessing, I'm talking about Python guessing the programmer's intent. This is something Python does a lot - see division in 3.x. / does float division, correcting the error of it being integer division in 2.x.
This is because we are implicitly asking for float division when we try to divide. Python takes this into account and it's operations are done according to that. Likewise, it's about guessing intent here. When we add with + our intent is unclear. When we use +=, it is very clear.
These bug reports suggest that this design quirk was a mistake.
Issue12318:
Yes, this is the expected behavior and yes, it is inconsistent.
It's been that way for a long while and Guido said he wouldn't do it again (it's in his list of regrets). However, we're not going to break code by changing it (list.__iadd__ working like list.extend).
Issue575536:
The intent was that list.__iadd__ correspond exactly to
list.extend(). There's no need to hypergeneralize
list.__add__() too: it's a feature that people who don't
want to get surprised by Martin-like examples can avoid
them by using plain + for lists.
(Of course, there are those of us who find this behaviour quite surprising, including the developer who opened that bug report).
(Thanks to #Mouad for finding these).
I believe Python designers made addition this way so that '+' operator stays consistently commutative with regard to result type: type(a + b) == type(b + a)
Everybody expects that 1 + 2 has the same result as 2 + 1. Would you expect [1] + 'foo' to be the same as 'foo' + [1]? If yes, what should be the result?
You have 3 choices, you either pick left operand as result type, right operand as result type, or raise an error.
+= is not commutative because it contains assignment. In this case you either pick left operand as result type or throw. The surprise here is that a += b is not the same as a = a + b. a += b does not translate in English to "Add a to b and assign result to a". It translates to "Add a to b in place". That's why it doesn't work on immutables such as string or tuple.
Thanks for the comments. Edited the post.
My guess is that Python is strongly typed, and there's not a clear indication of the right thing to do here. Are you asking Python to append the string itself, or to cast the string to a list (which is what you indicated you'd like it to do)?
Remember explicit is better than implicit. In the most common case, neither of those guesses is correct and you're accidentally trying to do something you didn't intent. Raising a TypeError and letting you sort it out is the safest, most Pythonic thing to do here.

Why does python not support ++i or i++ [duplicate]

Why are there no ++ and -- operators in Python?
It's not because it doesn't make sense; it makes perfect sense to define "x++" as "x += 1, evaluating to the previous binding of x".
If you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (eg. Guido), but it's easy enough to justify after the fact:
Simple increment and decrement aren't needed as much as in other languages. You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).
Since it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.
It's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).
This is all redundant with += and -=, so it would become a net loss.
This original answer I wrote is a myth from the folklore of computing: debunked by Dennis Ritchie as "historically impossible" as noted in the letters to the editors of Communications of the ACM July 2012 doi:10.1145/2209249.2209251
The C increment/decrement operators were invented at a time when the C compiler wasn't very smart and the authors wanted to be able to specify the direct intent that a machine language operator should be used which saved a handful of cycles for a compiler which might do a
load memory
load 1
add
store memory
instead of
inc memory
and the PDP-11 even supported "autoincrement" and "autoincrement deferred" instructions corresponding to *++p and *p++, respectively. See section 5.3 of the manual if horribly curious.
As compilers are smart enough to handle the high-level optimization tricks built into the syntax of C, they are just a syntactic convenience now.
Python doesn't have tricks to convey intentions to the assembler because it doesn't use one.
I always assumed it had to do with this line of the zen of python:
There should be one — and preferably only one — obvious way to do it.
x++ and x+=1 do the exact same thing, so there is no reason to have both.
Of course, we could say "Guido just decided that way", but I think the question is really about the reasons for that decision. I think there are several reasons:
It mixes together statements and expressions, which is not good practice. See http://norvig.com/python-iaq.html
It generally encourages people to write less readable code
Extra complexity in the language implementation, which is unnecessary in Python, as already mentioned
Because, in Python, integers are immutable (int's += actually returns a different object).
Also, with ++/-- you need to worry about pre- versus post- increment/decrement, and it takes only one more keystroke to write x+=1. In other words, it avoids potential confusion at the expense of very little gain.
Clarity!
Python is a lot about clarity and no programmer is likely to correctly guess the meaning of --a unless s/he's learned a language having that construct.
Python is also a lot about avoiding constructs that invite mistakes and the ++ operators are known to be rich sources of defects.
These two reasons are enough not to have those operators in Python.
The decision that Python uses indentation to mark blocks rather
than syntactical means such as some form of begin/end bracketing
or mandatory end marking is based largely on the same considerations.
For illustration, have a look at the discussion around introducing a conditional operator (in C: cond ? resultif : resultelse) into Python in 2005.
Read at least the first message and the decision message of that discussion (which had several precursors on the same topic previously).
Trivia:
The PEP frequently mentioned therein is the "Python Enhancement Proposal" PEP 308. LC means list comprehension, GE means generator expression (and don't worry if those confuse you, they are none of the few complicated spots of Python).
My understanding of why python does not have ++ operator is following: When you write this in python a=b=c=1 you will get three variables (labels) pointing at same object (which value is 1). You can verify this by using id function which will return an object memory address:
In [19]: id(a)
Out[19]: 34019256
In [20]: id(b)
Out[20]: 34019256
In [21]: id(c)
Out[21]: 34019256
All three variables (labels) point to the same object. Now increment one of variable and see how it affects memory addresses:
In [22] a = a + 1
In [23]: id(a)
Out[23]: 34019232
In [24]: id(b)
Out[24]: 34019256
In [25]: id(c)
Out[25]: 34019256
You can see that variable a now points to another object as variables b and c. Because you've used a = a + 1 it is explicitly clear. In other words you assign completely another object to label a. Imagine that you can write a++ it would suggest that you did not assign to variable a new object but ratter increment the old one. All this stuff is IMHO for minimization of confusion. For better understanding see how python variables works:
In Python, why can a function modify some arguments as perceived by the caller, but not others?
Is Python call-by-value or call-by-reference? Neither.
Does Python pass by value, or by reference?
Is Python pass-by-reference or pass-by-value?
Python: How do I pass a variable by reference?
Understanding Python variables and Memory Management
Emulating pass-by-value behaviour in python
Python functions call by reference
Code Like a Pythonista: Idiomatic Python
It was just designed that way. Increment and decrement operators are just shortcuts for x = x + 1. Python has typically adopted a design strategy which reduces the number of alternative means of performing an operation. Augmented assignment is the closest thing to increment/decrement operators in Python, and they weren't even added until Python 2.0.
I'm very new to python but I suspect the reason is because of the emphasis between mutable and immutable objects within the language. Now, I know that x++ can easily be interpreted as x = x + 1, but it LOOKS like you're incrementing in-place an object which could be immutable.
Just my guess/feeling/hunch.
To complete already good answers on that page:
Let's suppose we decide to do this, prefix (++i) that would break the unary + and - operators.
Today, prefixing by ++ or -- does nothing, because it enables unary plus operator twice (does nothing) or unary minus twice (twice: cancels itself)
>>> i=12
>>> ++i
12
>>> --i
12
So that would potentially break that logic.
now if one needs it for list comprehensions or lambdas, from python 3.8 it's possible with the new := assignment operator (PEP572)
pre-incrementing a and assign it to b:
>>> a = 1
>>> b = (a:=a+1)
>>> b
2
>>> a
2
post-incrementing just needs to make up the premature add by subtracting 1:
>>> a = 1
>>> b = (a:=a+1)-1
>>> b
1
>>> a
2
I believe it stems from the Python creed that "explicit is better than implicit".
First, Python is only indirectly influenced by C; it is heavily influenced by ABC, which apparently does not have these operators, so it should not be any great surprise not to find them in Python either.
Secondly, as others have said, increment and decrement are supported by += and -= already.
Third, full support for a ++ and -- operator set usually includes supporting both the prefix and postfix versions of them. In C and C++, this can lead to all kinds of "lovely" constructs that seem (to me) to be against the spirit of simplicity and straight-forwardness that Python embraces.
For example, while the C statement while(*t++ = *s++); may seem simple and elegant to an experienced programmer, to someone learning it, it is anything but simple. Throw in a mixture of prefix and postfix increments and decrements, and even many pros will have to stop and think a bit.
The ++ class of operators are expressions with side effects. This is something generally not found in Python.
For the same reason an assignment is not an expression in Python, thus preventing the common if (a = f(...)) { /* using a here */ } idiom.
Lastly I suspect that there operator are not very consistent with Pythons reference semantics. Remember, Python does not have variables (or pointers) with the semantics known from C/C++.
as i understood it so you won't think the value in memory is changed.
in c when you do x++ the value of x in memory changes.
but in python all numbers are immutable hence the address that x pointed as still has x not x+1. when you write x++ you would think that x change what really happens is that x refrence is changed to a location in memory where x+1 is stored or recreate this location if doe's not exists.
Other answers have described why it's not needed for iterators, but sometimes it is useful when assigning to increase a variable in-line, you can achieve the same effect using tuples and multiple assignment:
b = ++a becomes:
a,b = (a+1,)*2
and b = a++ becomes:
a,b = a+1, a
Python 3.8 introduces the assignment := operator, allowing us to achievefoo(++a) with
foo(a:=a+1)
foo(a++) is still elusive though.
Maybe a better question would be to ask why do these operators exist in C. K&R calls increment and decrement operators 'unusual' (Section 2.8page 46). The Introduction calls them 'more concise and often more efficient'. I suspect that the fact that these operations always come up in pointer manipulation also has played a part in their introduction.
In Python it has been probably decided that it made no sense to try to optimise increments (in fact I just did a test in C, and it seems that the gcc-generated assembly uses addl instead of incl in both cases) and there is no pointer arithmetic; so it would have been just One More Way to Do It and we know Python loathes that.
This may be because #GlennMaynard is looking at the matter as in comparison with other languages, but in Python, you do things the python way. It's not a 'why' question. It's there and you can do things to the same effect with x+=. In The Zen of Python, it is given: "there should only be one way to solve a problem." Multiple choices are great in art (freedom of expression) but lousy in engineering.
I think this relates to the concepts of mutability and immutability of objects. 2,3,4,5 are immutable in python. Refer to the image below. 2 has fixed id until this python process.
x++ would essentially mean an in-place increment like C. In C, x++ performs in-place increments. So, x=3, and x++ would increment 3 in the memory to 4, unlike python where 3 would still exist in memory.
Thus in python, you don't need to recreate a value in memory. This may lead to performance optimizations.
This is a hunch based answer.
I know this is an old thread, but the most common use case for ++i is not covered, that being manually indexing sets when there are no provided indices. This situation is why python provides enumerate()
Example : In any given language, when you use a construct like foreach to iterate over a set - for the sake of the example we'll even say it's an unordered set and you need a unique index for everything to tell them apart, say
i = 0
stuff = {'a': 'b', 'c': 'd', 'e': 'f'}
uniquestuff = {}
for key, val in stuff.items() :
uniquestuff[key] = '{0}{1}'.format(val, i)
i += 1
In cases like this, python provides an enumerate method, e.g.
for i, (key, val) in enumerate(stuff.items()) :
In addition to the other excellent answers here, ++ and -- are also notorious for undefined behavior. For example, what happens in this code?
foo[bar] = bar++;
It's so innocent-looking, but it's wrong C (and C++), because you don't know whether the first bar will have been incremented or not. One compiler might do it one way, another might do it another way, and a third might make demons fly out of your nose. All would be perfectly conformant with the C and C++ standards.
(EDIT: C++17 has changed the behavior of the given code so that it is defined; it will be equivalent to foo[bar+1] = bar; ++bar; — which nonetheless might not be what the programmer is expecting.)
Undefined behavior is seen as a necessary evil in C and C++, but in Python, it's just evil, and avoided as much as possible.

Use of OR as branch control in FP

I undertook an interview last week in which I learnt a few things about python I didn't know about (or rather realise how they could be used), first up and the content of this question is the use of or for the purposes of branch control.
So, for example, if we run:
def f():
# do something. I'd use ... but that's actually a python object.
def g():
# something else.
f() or g()
Then if f() evaluates to some true condition then that value is returned, if not, g() is evaluated and whatever value it produces is returned, whether true or false. This gives us the ability to implement an if statement using or keywords.
We can also use and such that f() and g() will return the value of g() if f() is true and the value of f() if g() is false.
I am told that this (the use of or for branch control) is a common thing in languages such as lisp (hence the lisp tag). I'm currently following SICP learning Scheme, so I can see that (or (f x) (g x)) would return the value of (g x) assuming (f x) is #f.
I'm confused as to whether there is any advantage of this technique. It clearly achieves branch control but to me the built in keywords seem more self-explanatory.
I'm also confused as to whether or not this is "functional"? My understanding of pure functional programming is that you use constructs like this (an example from my recent erlang experiments):
makeeven(N,1) -> N+1;
makeeven(N,0) -> N;
makeeven(N) -> makeeven(N,N rem 2).
Or a better, more complicated example using template meta-programming in C++ (discovered via cpp-next.com). My thought process is that one aspect of functional programming boils down the use of piecewise defined functions in code for branch control (and if you can manage it, tail recursion).
So, my questions:
Is this "functional"? It appears that way and my interviewers said they had backgrounds in functional programming, but it didn't match what I thought was functional. I see no reason why you couldn't have a logical operator as part of a function - it seems to lend itself nicely to the concept of higher order functions. I just hadn't thought that the use of logical operators was how functional programmers achieved branch control. Right? Wrong? I can see that circuits use logic gates for branch control so I guess this is a similar (related) concept?
Is there some advantage to using this technique? Is it just language conciseness/a syntax issue, or are there implications in terms of building an interpreter to using this construct?
Are there any use cases for this technique? Or is it not used very often? Is it used at all? As a self-taught guy I'd never seen it before although that in itself isn't necessarily surprising.
I apologise for jumping over so many languages; I'm simply trying to tie together my understanding across them. Feel free to answer in any language mentioned. I also apologise if I've misunderstood any definitions or am missing something vital here, I've never formally studied computer science.
Your interviewers must have had a "functional background" way back. It used to be common to write
(or (some-condition) (some-side-effect))
but in CL and in Scheme implementation that support it, it is much better written with unless. Same goes for and vs when.
So, to be more concrete -- it's not more functional (and in fact the common use of these things was for one-sided conditionals, which are not functional to begin with); there is no advantage (which becomes very obvious in these languages when you know that things are implemented as macros anyway -- for example, most or and and implementations expand to an if); and any possible use cases should use when and unless if you have them in your implementation, otherwise it's better to define them as macros than to not use them.
Oh, and you could use a combination of them instead of a two sided if, but that would be obfuscatingly ugly.
I'm not aware of any issues with the way this code will execute, but it is confusing to read for the uninitiated. In fact, this kind of syntax is like a Python anti-pattern: you can do it, but it is in no way Pythonic.
condition and true_branch or false_branch works in all languages that have short circuting logical operators. On the other hand it's not really a good idea to use in a language where values have a boolean value.
For example
zero = (1==0) and 0 or 1 # (1==0) -> False
zero = (False and 0) or 1 # (False and X) -> X
zero = 0 or 1 # 0 is False in most languages
zero = False or 1
zero = 1
As Eli said; also, performing control flow purely with logical operators tends to be taught in introductory FP classes -- more as a mind exercise, really, not something that you necessarily want to use IRL. It's always good to be able to translate any control operator down to if.
Now, the big difference between FPs and other languages is that, in more functional languages, if is actually an expression, not a statement. An if block always has a value! The C family of languages has a macro version of this -- the test? consequent : alternative construct -- but it gets really unreadable if you nest more expressions.
Prior to Python 2.5, if you want to have a control-flow expression in Python you might have to use logical operators. In Python 2.5, though, there is an FP-like if-expression syntax, so you can do something like this:
(42 if True else 7) + 35
See PEP 308
You only mention the case where there are exactly 2 expressions to evaluate. What happens if there are 5?
;; returns first true value, evaluating only as many as needed
(or (f x) (g x) (h x) (i x) (j x))
Would you nest if-statements? I'm not sure how I'd do this in Python. It's almost like this:
any(c(x) for c in [f, g, h, i, j])
except Python's any throws away the value and just returns True. (There might be a way to do it with itertools.dropwhile, but it seems a little awkward to me. Or maybe I'm just missing the obvious way.)
(As an aside: I find that Lisp's builtins don't quite correspond to what their names are in other languages, which can be confusing. Lisp's IF is like C's ternary operator ?: or Python's conditional expressions, for example, not their if-statements. Likewise, Lisp's OR is in some ways more like (but not exactly like) Python's any(), which only takes 2 expressions. Since the normal IF returns a value already, there's no point in having a separate kind of "if" that can't be used like this, or a separate kind of "or" that only takes two values. It's already as flexible as the less common variant in other languages.)
I happen to be writing code like this right now, coincidentally, where some of the functions are "go ask some server for an answer", and I want to stop as soon as I get a positive response. I'd never use OR where I really want to say IF, but I'd rather say:
(setq did-we-pass (or (try-this x)
(try-that x)
(try-some-other-thing x)
(heck-maybe-this-will-work x))
than make a big tree of IFs. Does that qualify as "flow control" or "functional"? I guess it depends on your definitions.
It may be considered "functional" in the sense of style of programming that is/was preferred in functional language. There is nothing functional in it otherwise.
It's just syntax.
It may be sometimes more readable to use or, for example:
def foo(bar=None):
bar = bar or []
...
return bar
def baz(elems):
print "You have %s elements." % (len(elems) or "no")
You could use bar if bar else [], but it's quite elaborate.

Categories