Evaluation order of augmented operators (delimiters) in Python

If I evaluate the following minimal example in python
a = [1, 2, 3]
a[-1] += a.pop()
I get
[1, 6]
So it seems that this is evaluated as
a[-1] = a[-1] + a.pop()
where each expression/operand would be evaluated in the order
third = first + second
so that on the left-hand side a[-1] is the 2nd element (index 1, since the pop has already shortened the list), while on the right-hand side it is the 3rd (index 2):
a[1] = a[2] + a.pop()
Can someone explain to me how one could infer this from the docs? Apparently '+=' is lexically a delimiter that also performs an operation (see here). What does that imply for its evaluation order?
EDIT:
I tried to clarify my question in a comment. I'll include it here for reference.
I want to understand if augmented operators have to be treated in a
special way (i.e. by expanding them) during lexical analysis, because
you kind of have to duplicate an expression and evaluate it twice.
This is not clear in the docs and I want to know where this behaviour
is specified. Other lexical delimiters (e.g. '}') behave differently.

Let me start with the question you asked at the end:
I want to understand if augmented operators have to be treated in a special way (i.e. by expanding them) during lexical analysis,
That one is simple; the answer is "no". A token is just a token and the lexical analyser just divides the input into tokens. As far as the lexical analyser is concerned, += is just a token, and that's what it returns for it.
By the way, the Python docs make a distinction between "operators" and "punctuation", but that's not really a significant difference for the current lexical analyser. It might have made sense in some previous incarnation of the parser based on operator-precedence parsing, in which an "operator" is a lexeme with associated precedence and associativity. But I don't know if Python ever used that particular parsing algorithm; in the current parser, both "operators" and "punctuation" are literal lexemes which appear as such in syntax rules. As you might expect, the lexical analyser is more concerned with the length of the tokens (<= and += are both two-character tokens) than with the eventual use inside the parser.
"Desugaring" -- the technical term for source tranformations which convert some language construct into a simpler construct -- is not usually performed either in the lexer or in the parser, although the internal workings of compilers are not subject to a Code of Conduct. Whether a language even has a desugaring component is generally considered an implementation detail, and may not be particularly visible; that's certainly true of Python. Python doesn't expose an interface to its tokeniser, either; the tokenizer module is a reimplementation in pure Python which does not produce exactly the same behaviour (although it's close enough to be a useful exploratory tool). But the parser is exposed in the ast module, which provides direct access to Python's own parser (at least in the CPython implementation), and that let's us see that no desugaring is done up to the point that the AST is constructed (note: requires Python3.9 for the indent option):
>>> import ast
>>> def showast(code):
...     print(ast.dump(ast.parse(code), indent=2))
...
>>> showast('a[-1] += a.pop()')
Module(
  body=[
    AugAssign(
      target=Subscript(
        value=Name(id='a', ctx=Load()),
        slice=UnaryOp(
          op=USub(),
          operand=Constant(value=1)),
        ctx=Store()),
      op=Add(),
      value=Call(
        func=Attribute(
          value=Name(id='a', ctx=Load()),
          attr='pop',
          ctx=Load()),
        args=[],
        keywords=[]))],
  type_ignores=[])
This produces exactly the syntax tree you would expect from the grammar, in which "augmented assignment" statements are represented as a specific production within assignment:
assignment:
    | single_target augassign ~ (yield_expr | star_expressions)
single_target is a single assignable expression (such as a variable or, as in this case, a subscripted array); augassign is one of the augmented assignment operators, and the rest are alternatives for the right-hand side of the assignment. (You can ignore the "fence" grammar operator ~.) The parse tree produced by ast.dump is pretty close to the grammar, and shows no desugaring at all:
                AugAssign
          ------------------
          |        |       |
      Subscript   Add     Call
      ---------        -----------------
      |       |        |       |       |
      a      -1    Attribute  [ ]     [ ]
                   ---------
                   |       |
                   a     'pop'
The magic happens afterwards, and we can see it because the Python standard library also includes a disassembler:
>>> import dis
>>> dis.dis(compile('a[-1] += a.pop()', '--', 'exec'))
  1           0 LOAD_NAME                0 (a)
              2 LOAD_CONST               0 (-1)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_NAME                0 (a)
             10 LOAD_METHOD              1 (pop)
             12 CALL_METHOD              0
             14 INPLACE_ADD
             16 ROT_THREE
             18 STORE_SUBSCR
             20 LOAD_CONST               1 (None)
             22 RETURN_VALUE
As can be seen, trying to summarize the evaluation order of augmented assignment as "left-to-right" is just an approximation. Here's what actually happens, as revealed in the virtual machine code above:
1. The target aggregate and its index are "computed" (offsets 0 and 2), and then these two values are duplicated (offset 4). (The duplication means that neither the target nor its subscript is evaluated twice.)
2. The duplicated values are then used to look up the value of the element (offset 6). So it is at this point that the value of a[-1] is evaluated.
3. The right-hand side expression (a.pop()) is then evaluated (offsets 8 through 12).
4. These two values (both 3, in this case) are combined with INPLACE_ADD because this is an ADD augmented assignment. In the case of integers, there's no difference between INPLACE_ADD and ADD, because integers are immutable values. But the compiler doesn't know that the first operand is an integer. a[-1] could be anything, including another list. So it emits an opcode which will trigger the use of the __iadd__ method instead of __add__, in case there is a difference.
5. The original target and subscript, which have been patiently waiting on the stack since step 1, are then used to perform a subscripted store (offsets 16 and 18). The subscript is still the subscript computed at offset 2, -1. But at this point a[-1] refers to a different element of a.
6. The rotate at offset 16 is needed to get the arguments for STORE_SUBSCR into the correct order. Because the normal order of evaluation for assignment is to evaluate the right-hand side first, the virtual machine assumes that the new value will be at the bottom of the stack, followed by the object and its subscript.
7. Finally, None is returned as the value of the statement.
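The ordering revealed by the bytecode can also be observed from pure Python. Here is a small sketch (a hypothetical LoggingList subclass, not part of the question) that logs each container operation:

```python
class LoggingList(list):
    """Hypothetical list subclass that logs reads, writes and pops."""
    def __getitem__(self, index):
        value = super().__getitem__(index)
        print(f"read  [{index}] -> {value}")
        return value

    def __setitem__(self, index, value):
        print(f"write [{index}] = {value}")
        super().__setitem__(index, value)

    def pop(self, *args):
        value = super().pop(*args)
        print(f"pop() -> {value}, list is now {list(self)}")
        return value

a = LoggingList([1, 2, 3])
a[-1] += a.pop()
# Logged order: the read of a[-1] (-> 3) happens first, then pop()
# (-> 3, leaving [1, 2]), then the write of 6 to index -1.
print(list(a))  # [1, 6]
```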
The precise workings of assignment and augmented assignment statements are documented in the Python reference manual. Another important source of information is the description of the __iadd__ special method. Evaluation (and evaluation order) for augmented assignment operations is sufficiently confusing that there is a Programming FAQ dedicated to it, which is worth reading carefully if you want to understand the exact mechanism.
Interesting though that information is, it's worth adding that writing programs which depend on details of the evaluation order inside an augmented assignment is not conducive to producing readable code. In almost all cases, augmented assignment which relies on non-obvious details of the procedure should be avoided, including statements such as the one that is the target of this question.

rici did a great job showing what's happening under the hood in the CPython reference interpreter, but there's a much simpler "source of truth" here in the language spec, which guarantees this behavior for any Python interpreter (not just CPython, but PyPy, Jython, IronPython, Cython, etc.). In the language spec, under Chapter 6: Expressions, section 6.16, Evaluation Order, it specifies:
Python evaluates expressions from left to right. Notice that while evaluating an assignment, the right-hand side is evaluated before the left-hand side.
That second sentence sounds like an exception to the general rule, but it isn't: assignment with = (including augmented assignment with += or the like) is not an expression in Python; it's a statement, and the assignment statement has its own rules for order of evaluation. (The walrus operator := introduced in 3.8 is an expression, but it can only assign to bare names, so there is never anything to "evaluate" on the left side; it purely stores there, never reading from it.) Those rules for assignment specify:
An assignment statement evaluates the expression list (remember that this can be a single expression or a comma-separated list, the latter yielding a tuple) and assigns the single resulting object to each of the target lists, from left to right.
This confirms the second sentence from the Expression Evaluation Order documentation; the expression list (the thing to be assigned) is evaluated first, then assignments to the targets proceed from there. So by the language spec itself, a[-1] += a.pop() must completely evaluate a.pop() first (the "expression list"), then perform assignment.
This behavior is required by the language spec, and has been for some time, so it can be relied on no matter what Python interpreter you're using.
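The guarantee can be checked directly with the question's example; the comments trace the steps:

```python
a = [1, 2, 3]
# a.pop() returns 3 and leaves a == [1, 2];
# the augmented store then targets a[-1] of the shortened list.
a[-1] += a.pop()
assert a == [1, 6]
```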
That said, I'd recommend against code that relies on these guarantees from Python. For one, when you switch to other languages, the rules differ (and in some cases, e.g. many similar cases in C and C++, varying by version of the standard, there are no "rules", and trying to mutate the same object in multiple parts of an expression produces undefined behavior), so growing to rely on Python's behavior will hamper your ability to use other languages. Beyond that, it's still going to be confusing as hell, and just slight changes will avoid the confusion, for example, in your case, changing:
a[-1] += a.pop()
to just:
x = a.pop()
a[-1] += x
which, while admittedly a two-liner and therefore inferior!!!, achieves the same result, with negligible overhead, and greater clarity.
TL;DR: The Python language spec guarantees that the right-hand side of += is fully evaluated before the augmented assignment operation begins and any of the code on the left-hand side is evaluated. But for code clarity, any code that relies on that guarantee should probably be refactored to avoid said reliance.

Related

Python slice/index evaluation order (before list definition)

I was wondering if there is a reason why Python evaluates the slicing/indexing after the definition of a list/tuple?
(This question concerns Python Code Golf so I know it's not readable or "good practice", it's just about the accepted syntax and the fundamental behavior of the language.)
We can use index to mimic the ternary operator's behavior:
a if x > 9 else b
(b,a)[x>9] # way shorter
But this has an issue: the content of the tuple is evaluated before the condition in the index.
I created an example to illustrate the point: a function f that reduces the size of a string by one recursively till the string is empty, then returns 0
f = lambda s: f(s) if len(s:=s[:-1]) else 0
print(f("abc")) # works fine
f = lambda s: (0, f(s))[len(s:=s[:-1])==0]
print(f("abc")) # max recursion depth error
The recursion depth error occurs because the tuple definition is evaluated before the index. It means that what is in the slice/index doesn't matter, the function will be called again and again.
I don't really understand why python doesn't evaluate the slice/index before because even an obvious case like the following fails:
f = lambda: (0, f())[0]
f() # max recursion depth error
On top of that, it could benefit in terms of memory usage and runtime if we just evaluate the single element (or the slice) we want from the array and not every single element:
x = 2
print([long_computation(), other_long_computation(), 0][x])
Is there any reason not to evaluate the slice/index before the tuple definition?
Humans read left to right, the parser reads left to right, and the compiler just converts whatever is parsed into bytecode. It would make sense (and it's easier) to just evaluate left to right rather than adding special cases for something that can already be done properly in left-to-right fashion. And you don't actually need the feature: you can already get the lazy behaviour with the current parser and compiler. The complexity of such a special case, combined with how rarely it would ever be used, is reason enough not to evaluate the slice before the tuple definition.
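If lazy selection is really wanted, it can be simulated today by wrapping each tuple element in a lambda and calling only the selected one. A sketch (expensive() is a hypothetical stand-in for long_computation()):

```python
def expensive():  # hypothetical stand-in for long_computation()
    raise RuntimeError("should not be evaluated")

x = 2
# Plain tuple indexing would evaluate every element, including expensive().
# Deferring each element behind a lambda evaluates only the chosen one:
result = (lambda: expensive(), lambda: expensive(), lambda: 0)[x]()
assert result == 0
```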

Why is the last number returned when putting "and" in python? [duplicate]

First, the code:
>>> False or 'hello'
'hello'
This surprising behavior lets you check if x is not None and check the value of x in one line:
>>> x = 10 if randint(0,2) == 1 else None
>>> (x or 0) > 0
# depend on x value...
Explanation: or functions like this:
if x is false, then y, else x
No language that I know lets you do this. So, why does Python?
It sounds like you're combining two issues into one.
First, there's the issue of short-circuiting. Marcin's answer addresses this issue perfectly, so I won't try to do any better.
Second, there's or and and returning the last-evaluated value, rather than converting it to bool. There are arguments to be made both ways, and you can find many languages on either side of the divide.
Returning the last-evaluated value allows the functionCall(x) or defaultValue shortcut, avoids a possibly wasteful conversion (why convert an int 2 into a bool 1 if the only thing you're going to do with it is check whether it's non-zero?), and is generally easier to explain. So, for various combinations of these reasons, languages like C, Lisp, Javascript, Lua, Perl, Ruby, and VB all do things this way, and so does Python.
Always returning a boolean value from an operator helps to catch some errors (especially in languages where the logical operators and the bitwise operators are easy to confuse), and it allows you to design a language where boolean checks are strictly-typed checks for true instead of just checks for nonzero, it makes the type of the operator easier to write out, and it avoids having to deal with conversion for cases where the two operands are different types (see the ?: operator in C-family languages). So, for various combinations of these reasons, languages like C++, Fortran, Smalltalk, and Haskell all do things this way.
In your question (if I understand it correctly), you're using this feature to be able to write something like:
if (x or 0) < 1:
When x could easily be None. This particular use case isn't very useful, both because the more-explicit x if x else 0 (in Python 2.5 and later) is just as easy to write and probably easier to understand (at least Guido thinks so), but also because None < 1 is the same as 0 < 1 anyway (at least in Python 2.x, so you've always got at least one of the two options)… But there are similar examples where it is useful. Compare these two:
return launchMissiles() or -1
return launchMissiles() if launchMissiles() else -1
The second one will waste a lot of missiles blowing up your enemies in Antarctica twice instead of once.
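The double evaluation can be demonstrated with a counting stand-in for launchMissiles() (hypothetical; the real function exists only in the example above):

```python
calls = 0

def launch_missiles():  # hypothetical stand-in for launchMissiles()
    global calls
    calls += 1
    return 1            # truthy result: the launch "succeeded"

assert (launch_missiles() or -1) == 1
assert calls == 1       # `or` evaluated the call once

calls = 0
assert (launch_missiles() if launch_missiles() else -1) == 1
assert calls == 2       # the conditional form launched twice
```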
If you're curious why Python does it this way:
Back in the 1.x days, there was no bool type. You've got falsy values like None, 0, [], (), "", etc., and everything else is true, so who needs explicit False and True? Returning 1 from or would have been silly, because 1 is no more true than [1, 2, 3] or "dsfsdf". By the time bool was added (gradually over two 2.x versions, IIRC), the current logic was already solidly embedded in the language, and changing would have broken a lot of code.
So, why didn't they change it in 3.0? Many Python users, including BDFL Guido, would suggest that you shouldn't use or in this case (at the very least because it's a violation of "TOOWTDI"); you should instead store the result of the expression in a variable, e.g.:
missiles = launchMissiles()
return missiles if missiles else -1
And in fact, Guido has stated that he'd like to ban launchMissiles() or -1, and that's part of the reason he eventually accepted the ternary if-else expression that he'd rejected many times before. But many others disagree, and Guido is a benevolent DFL. Also, making or work the way you'd expect everywhere else, while refusing to do what you want (but Guido doesn't want you to want) here, would actually be pretty complicated.
So, Python will probably always be on the same side as C, Perl, and Lisp here, instead of the same side as Java, Smalltalk, and Haskell.
No language that I know lets you do this. So, why does Python?
Then you don't know many languages. I can't think of one language that I do know that does not exhibit this "shortcircuiting" behaviour.
It does it because it is useful to say:
a = b or K
such that a becomes b if b is not None (or otherwise falsy), and otherwise gets the default value K.
Actually a number of languages do. See Wikipedia about Short-Circuit Evaluation
For the reason why short-circuit evaluation exists, wikipedia writes:
If both expressions used as conditions are simple boolean variables,
it can be actually faster to evaluate both conditions used in boolean
operation at once, as it always requires a single calculation cycle,
as opposed to one or two cycles used in short-circuit evaluation
(depending on the value of the first).
This behavior is not surprising, and it's quite straightforward if you consider Python has the following features regarding or, and and not logical operators:
Short-circuit evaluation: it only evaluates operands up to where it needs to.
Non-coercing result: the result is one of the operands, not coerced to bool.
And, additionally:
The Truth Value of an object is False only for None, False, 0, "", [], {}. Everything else has a truth value of True (this is a simplification; the correct definition is in the official docs)
Combine those features, and it leads to:
or : if the first operand evaluates as True, short-circuit there and return it. Or return the 2nd operand.
and: if the first operand evaluates as False, short-circuit there and return it. Or return the 2nd operand.
It's easier to understand if you generalize to a chain of operations:
>>> a or b or c or d
>>> a and b and c and d
Here is the "rule of thumb" I've memorized to help me easily predict the result:
or : returns the first "truthy" operand it finds, or the last one.
and: returns the first "falsy" operand it finds, or the last one.
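Those two rules of thumb can be verified directly:

```python
# or: the first truthy operand, or the last one if none is truthy
assert (0 or "" or 42 or "unreached") == 42
assert (0 or "" or []) == []

# and: the first falsy operand, or the last one if none is falsy
assert (1 and "x" and 0 and "unreached") == 0
assert (1 and "x" and [3]) == [3]
```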
As for your question, on why python behaves like that, well... I think because it has some very neat uses, and it's quite intuitive to understand. A common use is a series of fallback choices, the first "found" (ie, non-falsy) is used. Think about this silly example:
drink = getColdBeer() or pickNiceWine() or random.anySoda or "meh, water :/"
Or this real-world scenario:
username = cmdlineargs.username or configFile['username'] or DEFAULT_USERNAME
Which is much more concise and elegant than the alternative.
As many other answers have pointed out, Python is not alone, and many other languages have the same behavior, both for short-circuiting (I believe most current languages do) and for non-coercion.
"No language that i know lets you do this. So, why Python do?" You seem to assume that all languages should be the same. Wouldn't you expect innovation in programming languages to produce unique features that people value?
You've just pointed out why it's useful, so why wouldn't Python do it? Perhaps you should ask why other languages don't.
You can take advantage of the special features of the Python or operator out of Boolean contexts. The rule of thumb is still that the result of your Boolean expressions is the first true operand or the last in the line.
Notice that the logical operators (or included) are evaluated before the assignment operator =, so you can assign the result of a Boolean expression to a variable in the same way you do with a common expression:
>>> a = 1
>>> b = 2
>>> var1 = a or b
>>> var1
1
>>> a = None
>>> b = 2
>>> var2 = a or b
>>> var2
2
>>> a = []
>>> b = {}
>>> var3 = a or b
>>> var3
{}
Here, the or operator works as expected, returning the first true operand or the last operand if both are evaluated to false.

Why does the yield function not require parentheses in Python?

In Python, I have many times seen the yield function used to create a generator. Both this and the print function technically perform the action of methods because they return a value. However, during the change from Python 2 to Python 3, the print function gained parentheses like a normal method call, but yield stayed the same. Also, yield gains the yellowish color of a reserved keyword while print is the purple of a reserved method. Why is yield not considered a method and colored this way, along with not using parentheses syntax?
(In a similar vein, why does return also lack parentheses?)
Let me add some more stuff, yield and continue are not given parentheses in many other languages as well. I just wanted to know what makes it different other than it is reserved. There are many other reserved methods out there which get parentheses.
So I went digging for an answer. And it turns out, there is one. From PEP 255, the pep that gave us the yield keyword
Q. Why a new keyword for "yield"? Why not a builtin function instead?
A. Control flow is much better expressed via keyword in Python, and
yield is a control construct. It's also believed that efficient
implementation in Jython requires that the compiler be able to
determine potential suspension points at compile-time, and a new
keyword makes that easy. The CPython reference implementation also
exploits it heavily, to detect which functions are generator-
functions (although a new keyword in place of "def" would solve that
for CPython -- but people asking the "why a new keyword?" question
don't want any new keyword).
Q: Then why not some other special syntax without a new keyword? For
example, one of these instead of "yield 3":
return 3 and continue
return and continue 3
return generating 3
continue return 3
return >> , 3
from generator return 3
return >> 3
return << 3
>> 3
<< 3
* 3
A: Did I miss one ? Out of hundreds of messages, I counted three
suggesting such an alternative, and extracted the above from them.
It would be nice not to need a new keyword, but nicer to make yield
very clear -- I don't want to have to deduce that a yield is
occurring from making sense of a previously senseless sequence of
keywords or operators. Still, if this attracts enough interest,
proponents should settle on a single consensus suggestion, and Guido
will Pronounce on it.
print wasn't a function that gained parentheses: it went from being a statement to being a function. yield is still a statement, like return. Syntax highlighting is specific to your development environment.
You can find more information about the difference between expressions and statements here, and more about the difference between functions and statements here. Also see the documentation on simple statements and compound statements.
yield is not a function, it's a keyword, and it does not require parentheses according to its grammar:
yield_atom       ::=  "(" yield_expression ")"
yield_expression ::=  "yield" [expression_list]
print used to be a statement in Python 2, but it was changed to a built-in function in Python 3 by PEP 3105.
print was a keyword defined by the language specification in Python 2, and became a builtin function (defined by the standard library specification) in Python 3. yield was, and still is, a keyword.
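A minimal sketch of both uses of the keyword (plain statement form, and parenthesized expression form):

```python
def countdown(n):
    while n > 0:
        yield n      # keyword statement form: no parentheses required
        n -= 1

assert list(countdown(3)) == [3, 2, 1]

# As an expression, yield may be parenthesized, e.g. to receive sent values:
def echo():
    received = (yield "ready")
    yield received

g = echo()
assert next(g) == "ready"
assert g.send("hello") == "hello"
```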

Limitations of variables in python

I realize this may be a bit broad, and thought this was an interesting question that I haven't really seen an answer to. It may be hidden in the python documentation somewhere, but as I'm new to python haven't gone through all of it yet.
So.. are there any general rules about things that we cannot set to be variables? Everything in Python is an object, and we can use variables for the typical purposes: storing strings, integers and lists, aliasing other variables, holding references to classes, etc. If we're clever, we can even do something along the lines of the following (off the top of my head), wherever it may be useful:
var = lambda: some_function()
storing comparison operators to clean code up such as:
var = some_value < some_value ...
So, that being said I've never come across anything that I couldn't store as a variable if I really wanted to, and was wondering if there really are any limitations?
You can't store syntactical constructs in a variable. For example, you can't do
command = break
while condition:
    if other_condition:
        command
or
operator = +
three = 1 operator 2
You can't really store expressions and statements as objects in Python.
Sure, you can wrap an expression in a lambda, and you can wrap a series of statements in a code object or callable, but you can't easily manipulate them. For instance, changing all instances of addition to multiplication is not readily possible.
To some extent, this can be worked around with the ast module, which provides for parsing Python code into abstract syntax trees. You can then manipulate the trees, instead of the code itself, and pass it to compile() to turn it back into a code object.
However, this is a form of indirection, compensating for a feature Python itself lacks. ast can't really compare to the anything-goes flexibility of (say) Lisp macros.
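For instance, the "addition to multiplication" rewrite mentioned above can be sketched with ast.NodeTransformer (a minimal illustration, not a robust tool):

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Rewrite every binary + into * in the parsed tree."""
    def visit_BinOp(self, node):
        self.generic_visit(node)      # transform nested operands first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("result = 2 + 3 + 4")
tree = AddToMult().visit(tree)
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
assert namespace["result"] == 24      # (2 * 3) * 4, not 2 + 3 + 4 == 9
```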
According to the Language Reference, the right hand side of an assignment statement can be an 'expression list' or a 'yield expression'. An expression list is a comma-separated list of one or more expressions. You need to follow this through several more tokens to come up with anything concrete, but ultimately you can find that an 'expression' is any number of objects (literals or variable names, or the result of applying a unary operator such as not, ~ or - to a nested expression_list) chained together by any binary operator (such as the arithmetic, comparison or bitwise operators, or logical and and or) or the ternary a if condition else b.
You can also note in other parts of the language reference that an 'expression' is exactly something you can use as an argument to a function, or as the first part (before the for) of a list comprehension or generator expression.
This is a fairly broad definition - in fact, it amounts to "anything Python resolves to an object". But it does leave out a few things - for example, you can't directly store the less-than operator < in a variable, since it isn't a valid expression by itself (it has to be between two other expressions) and you have to put it in a function that uses it instead. Similarly, most of the Python keywords aren't expressions (the exceptions are True, False and None, which are all canonical names for certain objects).
Note especially that functions are also objects, and hence the name of a function (without calling it) is a valid expression. This means that your example:
var = lambda: some_function()
can be written as:
var = some_function
By definition, a variable is something which can vary, or change. In its broadest sense, a variable is no more than a way of referring to a location in memory in your given program. Another way to think of a variable is as a container to place your information in.
Unlike popular statically typed languages, variable declaration in Python is not required. You can place pretty much anything in a variable so long as you can come up with a name for it. Furthermore, in addition to the value of a variable in Python being capable of changing, its type often can as well.
To address your question, I would say the limitations on a variable in Python relate only to a few basic necessary attributes:
A name
A scope
A value
(Usually) a type
As a result, things like operators (+ or * for instance) cannot be stored in a variable as they do not meet these basic requirements, and in general you cannot store expressions themselves as variables (unless you're wrapping them in a lambda expression).
As mentioned by Kevin, it's also worth noting that it is possible to sort of store an operator in a variable using the operator module; however, even doing so, you cannot perform the kinds of manipulations that a variable is otherwise subject to, as really you are just making a value assignment. An example of the operator module:
import operator

operations = {"+": operator.add,
              "-": operator.sub}

operator_variable_string = input('Give me an operator:')
operator_function = operations[operator_variable_string]
result = operator_function(8, 4)

Why does python not support ++i or i++ [duplicate]

Why are there no ++ and -- operators in Python?
It's not because it doesn't make sense; it makes perfect sense to define "x++" as "x += 1, evaluating to the previous binding of x".
If you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (e.g. Guido), but it's easy enough to justify after the fact:
Simple increment and decrement aren't needed as much as in other languages. You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).
Since it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.
It's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).
This is all redundant with += and -=, so it would become a net loss.
The original answer I wrote below repeats a myth from the folklore of computing, debunked by Dennis Ritchie as "historically impossible" in the letters to the editors of Communications of the ACM, July 2012 (doi:10.1145/2209249.2209251):
The C increment/decrement operators were invented at a time when the C compiler wasn't very smart and the authors wanted to be able to specify the direct intent that a machine language operator should be used which saved a handful of cycles for a compiler which might do a
load memory
load 1
add
store memory
instead of
inc memory
and the PDP-11 even supported "autoincrement" and "autoincrement deferred" instructions corresponding to *++p and *p++, respectively. See section 5.3 of the manual if horribly curious.
As compilers are smart enough to handle the high-level optimization tricks built into the syntax of C, they are just a syntactic convenience now.
Python doesn't have tricks to convey intentions to the assembler because it doesn't use one.
I always assumed it had to do with this line of the zen of python:
There should be one — and preferably only one — obvious way to do it.
x++ and x+=1 do the exact same thing, so there is no reason to have both.
Of course, we could say "Guido just decided that way", but I think the question is really about the reasons for that decision. I think there are several reasons:
It mixes together statements and expressions, which is not good practice. See http://norvig.com/python-iaq.html
It generally encourages people to write less readable code
Extra complexity in the language implementation, which is unnecessary in Python, as already mentioned
Because, in Python, integers are immutable (int's += actually returns a different object).
Also, with ++/-- you need to worry about pre- versus post- increment/decrement, and it takes only one more keystroke to write x+=1. In other words, it avoids potential confusion at the expense of very little gain.
Clarity!
Python is a lot about clarity and no programmer is likely to correctly guess the meaning of --a unless s/he's learned a language having that construct.
Python is also a lot about avoiding constructs that invite mistakes and the ++ operators are known to be rich sources of defects.
These two reasons are enough not to have those operators in Python.
The decision that Python uses indentation to mark blocks rather than syntactical means such as some form of begin/end bracketing or mandatory end marking is based largely on the same considerations.
For illustration, have a look at the discussion around introducing a conditional operator (in C: cond ? resultif : resultelse) into Python in 2005.
Read at least the first message and the decision message of that discussion (which had several precursors on the same topic previously).
Trivia:
The PEP frequently mentioned therein is the "Python Enhancement Proposal" PEP 308. LC means list comprehension, GE means generator expression (and don't worry if those confuse you, they are none of the few complicated spots of Python).
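For context, the syntax that eventually came out of that discussion is the conditional expression of PEP 308, added in Python 2.5:

```python
# Conditional expression syntax from PEP 308 (Python 2.5+),
# Python's counterpart to C's  cond ? resultif : resultelse
x = 10
parity = "even" if x % 2 == 0 else "odd"
print(parity)
```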
My understanding of why Python does not have a ++ operator is the following: when you write a=b=c=1 in Python, you get three variables (labels) pointing at the same object (whose value is 1). You can verify this with the id function, which returns an object's memory address:
In [19]: id(a)
Out[19]: 34019256
In [20]: id(b)
Out[20]: 34019256
In [21]: id(c)
Out[21]: 34019256
All three variables (labels) point to the same object. Now increment one of variable and see how it affects memory addresses:
In [22]: a = a + 1
In [23]: id(a)
Out[23]: 34019232
In [24]: id(b)
Out[24]: 34019256
In [25]: id(c)
Out[25]: 34019256
You can see that variable a now points to a different object than variables b and c. Because you've used a = a + 1, this is explicitly clear: you assign a completely different object to the label a. If you could write a++, it would suggest that you did not bind a to a new object but rather incremented the old one. All this is, IMHO, for minimization of confusion. For a better understanding, see how Python variables work:
In Python, why can a function modify some arguments as perceived by the caller, but not others?
Is Python call-by-value or call-by-reference? Neither.
Does Python pass by value, or by reference?
Is Python pass-by-reference or pass-by-value?
Python: How do I pass a variable by reference?
Understanding Python variables and Memory Management
Emulating pass-by-value behaviour in python
Python functions call by reference
Code Like a Pythonista: Idiomatic Python
It was just designed that way. Increment and decrement operators are just shortcuts for x = x + 1. Python has typically adopted a design strategy which reduces the number of alternative means of performing an operation. Augmented assignment is the closest thing to increment/decrement operators in Python, and they weren't even added until Python 2.0.
I'm very new to Python, but I suspect the reason is the emphasis on the distinction between mutable and immutable objects within the language. Now, I know that x++ can easily be interpreted as x = x + 1, but it LOOKS like you're incrementing in place an object which could be immutable.
Just my guess/feeling/hunch.
To complete already good answers on that page:
Let's suppose we decided to add a prefix ++i; that would clash with the unary + and - operators.
Today, prefixing with ++ or -- does nothing, because it applies the unary plus operator twice (a no-op) or the unary minus operator twice (the two negations cancel each other):
>>> i=12
>>> ++i
12
>>> --i
12
So that would potentially break that logic.
Now, if one needs it in list comprehensions or lambdas, from Python 3.8 it's possible with the new := assignment operator (PEP 572).
Pre-incrementing a and assigning it to b:
>>> a = 1
>>> b = (a:=a+1)
>>> b
2
>>> a
2
Post-incrementing just needs to compensate for the premature increment by subtracting 1:
>>> a = 1
>>> b = (a:=a+1)-1
>>> b
1
>>> a
2
I believe it stems from the Python creed that "explicit is better than implicit".
First, Python is only indirectly influenced by C; it is heavily influenced by ABC, which apparently does not have these operators, so it should not be any great surprise not to find them in Python either.
Secondly, as others have said, increment and decrement are supported by += and -= already.
Third, full support for a ++ and -- operator set usually includes supporting both the prefix and postfix versions of them. In C and C++, this can lead to all kinds of "lovely" constructs that seem (to me) to be against the spirit of simplicity and straight-forwardness that Python embraces.
For example, while the C statement while(*t++ = *s++); may seem simple and elegant to an experienced programmer, to someone learning it, it is anything but simple. Throw in a mixture of prefix and postfix increments and decrements, and even many pros will have to stop and think a bit.
The ++ class of operators are expressions with side effects. This is something generally not found in Python.
For the same reason an assignment is not an expression in Python, thus preventing the common if (a = f(...)) { /* using a here */ } idiom.
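Since Python 3.8, the walrus operator := does provide an expression-form assignment, so the idiom can at least be written deliberately and explicitly. A minimal sketch, where f is a hypothetical function standing in for C's f(...):

```python
def f():
    # hypothetical function for illustration
    return 42

# Explicit Python 3.8+ counterpart of C's  if (a = f(...)) { /* using a */ }
if (a := f()):
    result = a * 2
```

The extra parentheses and the distinct := token make the assignment impossible to mistake for a comparison, which is exactly the confusion the original restriction was guarding against.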
Lastly, I suspect that these operators are not very consistent with Python's reference semantics. Remember, Python does not have variables (or pointers) with the semantics known from C/C++.
As I understood it, it's so you won't think the value in memory has changed.
In C, when you do x++, the value of x in memory changes.
But in Python all numbers are immutable, so the address that x pointed to still holds the old value. If you could write x++, you would think that x itself changed; what really happens is that the name x is rebound to a location in memory where x+1 is stored (or that object is created if it doesn't exist yet).
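The rebinding described above can be observed directly with id():

```python
x = 3
before = id(x)   # id of the int object 3
x += 1           # rebinds the name x to a different object, 4
after = id(x)

assert x == 4
assert before != after   # 3 and 4 are distinct objects in memory
```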
Other answers have described why it's not needed for iterators, but sometimes it is useful when assigning to increase a variable in-line, you can achieve the same effect using tuples and multiple assignment:
b = ++a becomes:
a,b = (a+1,)*2
and b = a++ becomes:
a,b = a+1, a
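A quick sanity check of both rewrites, starting from a = 1 in each case:

```python
# b = ++a  equivalent: both names end up with the incremented value
a = 1
a, b = (a + 1,) * 2
assert (a, b) == (2, 2)

# b = a++  equivalent: b receives the old value, a the incremented one
a = 1
a, b = a + 1, a   # the right-hand tuple is evaluated before any assignment
assert (a, b) == (2, 1)
```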
Python 3.8 introduces the assignment operator :=, allowing us to achieve foo(++a) with
foo(a:=a+1)
foo(a++) is still elusive though.
Maybe a better question would be to ask why these operators exist in C. K&R calls the increment and decrement operators "unusual" (Section 2.8, page 46). The Introduction calls them "more concise and often more efficient". I suspect that the fact that these operations always come up in pointer manipulation also played a part in their introduction.
In Python it has been probably decided that it made no sense to try to optimise increments (in fact I just did a test in C, and it seems that the gcc-generated assembly uses addl instead of incl in both cases) and there is no pointer arithmetic; so it would have been just One More Way to Do It and we know Python loathes that.
This may be because @GlennMaynard is looking at the matter in comparison with other languages, but in Python, you do things the Python way. It's not a "why" question. It's there, and you can achieve the same effect with x += 1. The Zen of Python says: "There should be one -- and preferably only one -- obvious way to do it." Multiple choices are great in art (freedom of expression) but lousy in engineering.
I think this relates to the concepts of mutability and immutability of objects. 2, 3, 4, 5 are immutable in Python. Refer to the image below: 2 has a fixed id for the lifetime of the Python process.
x++ would essentially mean an in-place increment, like C. In C, x++ performs an in-place increment: with x = 3, x++ would increment the 3 in memory to 4, unlike Python, where the 3 would still exist in memory.
Thus in python, you don't need to recreate a value in memory. This may lead to performance optimizations.
This is a hunch based answer.
I know this is an old thread, but the most common use case for ++i is not covered: manually indexing sets when no indices are provided. This situation is why Python provides enumerate().
Example: in any given language, when you use a construct like foreach to iterate over a set -- for the sake of the example, we'll even say it's an unordered set -- and you need a unique index for everything to tell the items apart, you might write:
i = 0
stuff = {'a': 'b', 'c': 'd', 'e': 'f'}
uniquestuff = {}
for key, val in stuff.items():
    uniquestuff[key] = '{0}{1}'.format(val, i)
    i += 1
In cases like this, Python provides the enumerate function instead, e.g.
for i, (key, val) in enumerate(stuff.items()):
    uniquestuff[key] = '{0}{1}'.format(val, i)
In addition to the other excellent answers here, ++ and -- are also notorious for undefined behavior. For example, what happens in this code?
foo[bar] = bar++;
It's so innocent-looking, but it's wrong C (and C++), because you don't know whether the first bar will have been incremented or not. One compiler might do it one way, another might do it another way, and a third might make demons fly out of your nose. All would be perfectly conformant with the C and C++ standards.
(EDIT: C++17 has changed the behavior of the given code so that it is defined; it will be equivalent to foo[bar+1] = bar; ++bar; — which nonetheless might not be what the programmer is expecting.)
Undefined behavior is seen as a necessary evil in C and C++, but in Python, it's just evil, and avoided as much as possible.
