Why does python not support ++i or i++ [duplicate]


Why are there no ++ and -- operators in Python?

It's not because it doesn't make sense; it makes perfect sense to define "x++" as "x += 1, evaluating to the previous binding of x".
If you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (eg. Guido), but it's easy enough to justify after the fact:
Simple increment and decrement aren't needed as much as in other languages. You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).
Since it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.
It's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).
This is all redundant with += and -=, so it would become a net loss.

This original answer I wrote is a myth from the folklore of computing: debunked by Dennis Ritchie as "historically impossible" as noted in the letters to the editors of Communications of the ACM July 2012 doi:10.1145/2209249.2209251
The C increment/decrement operators were invented at a time when the C compiler wasn't very smart and the authors wanted to be able to specify the direct intent that a machine language operator should be used which saved a handful of cycles for a compiler which might do a
load memory
load 1
add
store memory
instead of
inc memory
and the PDP-11 even supported "autoincrement" and "autoincrement deferred" instructions corresponding to *++p and *p++, respectively. See section 5.3 of the manual if horribly curious.
As compilers are smart enough to handle the high-level optimization tricks built into the syntax of C, they are just a syntactic convenience now.
Python doesn't have tricks to convey intentions to the assembler because it doesn't use one.

I always assumed it had to do with this line of the zen of python:
There should be one — and preferably only one — obvious way to do it.
x++ and x+=1 do the exact same thing, so there is no reason to have both.

Of course, we could say "Guido just decided that way", but I think the question is really about the reasons for that decision. I think there are several reasons:
It mixes together statements and expressions, which is not good practice. See http://norvig.com/python-iaq.html
It generally encourages people to write less readable code
Extra complexity in the language implementation, which is unnecessary in Python, as already mentioned

Because, in Python, integers are immutable (int's += actually returns a different object).
Also, with ++/-- you need to worry about pre- versus post- increment/decrement, and it takes only one more keystroke to write x+=1. In other words, it avoids potential confusion at the expense of very little gain.
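A small sketch of the rebinding point above, contrasting an int with a list (the variable names are only illustrative):

```python
# Integers are immutable: += builds a new int and rebinds the name.
x = 1
y = x          # y refers to the same int object as x
x += 1         # x is rebound to a different object; y is untouched
print(x, y)    # 2 1

# Lists are mutable: += extends the existing object in place.
a = [1]
b = a          # b refers to the same list object as a
a += [2]       # the list is modified in place, so b sees the change
print(a, b)    # [1, 2] [1, 2]
print(a is b)  # True
```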

Clarity!
Python is a lot about clarity and no programmer is likely to correctly guess the meaning of --a unless s/he's learned a language having that construct.
Python is also a lot about avoiding constructs that invite mistakes and the ++ operators are known to be rich sources of defects.
These two reasons are enough not to have those operators in Python.
The decision that Python uses indentation to mark blocks rather
than syntactical means such as some form of begin/end bracketing
or mandatory end marking is based largely on the same considerations.
For illustration, have a look at the discussion around introducing a conditional operator (in C: cond ? resultif : resultelse) into Python in 2005.
Read at least the first message and the decision message of that discussion (which had several precursors on the same topic previously).
Trivia:
The PEP frequently mentioned therein is the "Python Enhancement Proposal" PEP 308. LC means list comprehension, GE means generator expression (and don't worry if those confuse you, they are none of the few complicated spots of Python).

My understanding of why python does not have ++ operator is following: When you write this in python a=b=c=1 you will get three variables (labels) pointing at same object (which value is 1). You can verify this by using id function which will return an object memory address:
In [19]: id(a)
Out[19]: 34019256
In [20]: id(b)
Out[20]: 34019256
In [21]: id(c)
Out[21]: 34019256
All three variables (labels) point to the same object. Now increment one of variable and see how it affects memory addresses:
In [22]: a = a + 1
In [23]: id(a)
Out[23]: 34019232
In [24]: id(b)
Out[24]: 34019256
In [25]: id(c)
Out[25]: 34019256
You can see that variable a now points to a different object than variables b and c. Because you've used a = a + 1, this is explicit: you assign a completely different object to label a. If you could write a++, it would suggest that you did not assign a new object to variable a but rather incremented the old one. All this is, IMHO, to minimize confusion. For a better understanding, see how Python variables work:
In Python, why can a function modify some arguments as perceived by the caller, but not others?
Is Python call-by-value or call-by-reference? Neither.
Does Python pass by value, or by reference?
Is Python pass-by-reference or pass-by-value?
Python: How do I pass a variable by reference?
Understanding Python variables and Memory Management
Emulating pass-by-value behaviour in python
Python functions call by reference
Code Like a Pythonista: Idiomatic Python

It was just designed that way. Increment and decrement operators are just shortcuts for x = x + 1. Python has typically adopted a design strategy which reduces the number of alternative means of performing an operation. Augmented assignment is the closest thing to increment/decrement operators in Python, and they weren't even added until Python 2.0.

I'm very new to python but I suspect the reason is because of the emphasis between mutable and immutable objects within the language. Now, I know that x++ can easily be interpreted as x = x + 1, but it LOOKS like you're incrementing in-place an object which could be immutable.
Just my guess/feeling/hunch.

To complete already good answers on that page:
Suppose we decided to add this: the prefix form (++i) would break the unary + and - operators.
Today, prefixing with ++ or -- does nothing, because it applies the unary plus operator twice (which does nothing) or the unary minus twice (which cancels itself out):
>>> i=12
>>> ++i
12
>>> --i
12
So that would potentially break that logic.
Now, if one needs it for list comprehensions or lambdas, from Python 3.8 it's possible with the new := assignment operator (PEP 572).
Pre-incrementing a and assigning it to b:
>>> a = 1
>>> b = (a:=a+1)
>>> b
2
>>> a
2
post-incrementing just needs to make up the premature add by subtracting 1:
>>> a = 1
>>> b = (a:=a+1)-1
>>> b
1
>>> a
2

I believe it stems from the Python creed that "explicit is better than implicit".

First, Python is only indirectly influenced by C; it is heavily influenced by ABC, which apparently does not have these operators, so it should not be any great surprise not to find them in Python either.
Secondly, as others have said, increment and decrement are supported by += and -= already.
Third, full support for a ++ and -- operator set usually includes supporting both the prefix and postfix versions of them. In C and C++, this can lead to all kinds of "lovely" constructs that seem (to me) to be against the spirit of simplicity and straight-forwardness that Python embraces.
For example, while the C statement while(*t++ = *s++); may seem simple and elegant to an experienced programmer, to someone learning it, it is anything but simple. Throw in a mixture of prefix and postfix increments and decrements, and even many pros will have to stop and think a bit.

The ++ class of operators are expressions with side effects. This is something generally not found in Python.
For the same reason an assignment is not an expression in Python, thus preventing the common if (a = f(...)) { /* using a here */ } idiom.
Lastly, I suspect that these operators are not very consistent with Python's reference semantics. Remember, Python does not have variables (or pointers) with the semantics known from C/C++.

As I understand it, it's so you won't think the value in memory is changed.
In C, when you do x++, the value of x in memory changes.
But in Python all numbers are immutable, hence the address that x pointed at still holds x, not x+1. When you write x++ you would think that x changes; what really happens is that the reference x is changed to a location in memory where x+1 is stored, or that location is created if it doesn't exist.

Other answers have described why it's not needed for iterators, but sometimes it is useful to increase a variable in-line while assigning. You can achieve the same effect using tuples and multiple assignment:
b = ++a becomes:
a,b = (a+1,)*2
and b = a++ becomes:
a,b = a+1, a
Python 3.8 introduces the := assignment operator, allowing us to achieve foo(++a) with
foo(a:=a+1)
foo(a++) is still elusive though.
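For plain numbers, one possible workaround is to increment with := and subtract the step again, so the call still sees the old value. A minimal sketch, assuming Python 3.8+ and a hypothetical foo that just records its argument:

```python
seen = []

def foo(value):
    # Hypothetical stand-in for any function; it records what it was given.
    seen.append(value)

a = 1
foo((a := a + 1) - 1)  # behaves like foo(a++): passes 1, leaves a == 2
print(seen, a)         # [1] 2
```

This only works for types where subtracting 1 undoes adding 1, so it is a trick for numbers rather than a general post-increment.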

Maybe a better question would be to ask why these operators exist in C. K&R calls increment and decrement operators 'unusual' (Section 2.8, page 46). The Introduction calls them 'more concise and often more efficient'. I suspect that the fact that these operations always come up in pointer manipulation also has played a part in their introduction.
In Python it has been probably decided that it made no sense to try to optimise increments (in fact I just did a test in C, and it seems that the gcc-generated assembly uses addl instead of incl in both cases) and there is no pointer arithmetic; so it would have been just One More Way to Do It and we know Python loathes that.

This may be because @GlennMaynard is looking at the matter in comparison with other languages, but in Python, you do things the Python way. It's not a 'why' question. It's there and you can do things to the same effect with x+=. In The Zen of Python, it is given: "There should be one-- and preferably only one --obvious way to do it." Multiple choices are great in art (freedom of expression) but lousy in engineering.

I think this relates to the concepts of mutability and immutability of objects. 2, 3, 4, 5 are immutable in Python. 2 has a fixed id for the lifetime of the Python process.
x++ would essentially mean an in-place increment like C. In C, x++ performs in-place increments. So, x=3, and x++ would increment 3 in the memory to 4, unlike python where 3 would still exist in memory.
Thus in python, you don't need to recreate a value in memory. This may lead to performance optimizations.
This is a hunch based answer.

I know this is an old thread, but the most common use case for ++i is not covered, that being manually indexing sets when there are no provided indices. This situation is why python provides enumerate()
Example : In any given language, when you use a construct like foreach to iterate over a set - for the sake of the example we'll even say it's an unordered set and you need a unique index for everything to tell them apart, say
i = 0
stuff = {'a': 'b', 'c': 'd', 'e': 'f'}
uniquestuff = {}
for key, val in stuff.items():
    uniquestuff[key] = '{0}{1}'.format(val, i)
    i += 1
In cases like this, python provides an enumerate method, e.g.
for i, (key, val) in enumerate(stuff.items()):
    uniquestuff[key] = '{0}{1}'.format(val, i)

In addition to the other excellent answers here, ++ and -- are also notorious for undefined behavior. For example, what happens in this code?
foo[bar] = bar++;
It's so innocent-looking, but it's wrong C (and C++), because you don't know whether the first bar will have been incremented or not. One compiler might do it one way, another might do it another way, and a third might make demons fly out of your nose. All would be perfectly conformant with the C and C++ standards.
(EDIT: C++17 has changed the behavior of the given code so that it is defined; it will be equivalent to foo[bar+1] = bar; ++bar; — which nonetheless might not be what the programmer is expecting.)
Undefined behavior is seen as a necessary evil in C and C++, but in Python, it's just evil, and avoided as much as possible.
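By contrast, Python's language reference fixes the evaluation order: the right-hand side of an assignment is evaluated before the target. A rough Python 3.8+ analogue of the C snippet, with := standing in for ++, therefore has one defined result:

```python
foo = [None, None]
bar = 0

# Right-hand side first: bar becomes 1 and the expression yields the old value 0.
# Only then is the target foo[bar] evaluated, with bar already equal to 1.
foo[bar] = (bar := bar + 1) - 1

print(foo, bar)  # [None, 0] 1
```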

Related

Limitations of variables in python

I realize this may be a bit broad, and thought this was an interesting question that I haven't really seen an answer to. It may be hidden in the python documentation somewhere, but as I'm new to python haven't gone through all of it yet.
So.. are there any general rules of things that we cannot set to be variables? Everything in python is an object and we can use variables for the typical standard usage of storing strings, integers, aliasing variables, lists, calling references to classes, etc and if we're clever even something along the lines as the below that I can think of off the top of my head, wherever this may be useful
var = lambda: some_function()
storing comparison operators to clean code up such as:
var = some_value < some_value ...
So, that being said I've never come across anything that I couldn't store as a variable if I really wanted to, and was wondering if there really are any limitations?

You can't store syntactical constructs in a variable. For example, you can't do
command = break
while condition:
    if other_condition:
        command
or
operator = +
three = 1 operator 2

You can't really store expressions and statements as objects in Python.
Sure, you can wrap an expression in a lambda, and you can wrap a series of statements in a code object or callable, but you can't easily manipulate them. For instance, changing all instances of addition to multiplication is not readily possible.
To some extent, this can be worked around with the ast module, which provides for parsing Python code into abstract syntax trees. You can then manipulate the trees, instead of the code itself, and pass it to compile() to turn it back into a code object.
However, this is a form of indirection, compensating for a feature Python itself lacks. ast can't really compare to the anything-goes flexibility of (say) Lisp macros.
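As a sketch of that workaround, the following uses ast.NodeTransformer to turn every addition in an expression into a multiplication. The class name AddToMult and the sample expression are made up for illustration:

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Rewrite every binary + into * (illustrative helper, not a stdlib class)."""

    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested operations first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("2 + 3 + 4", mode="eval")            # source text -> syntax tree
tree = ast.fix_missing_locations(AddToMult().visit(tree))
code = compile(tree, "<ast>", "eval")                 # syntax tree -> code object
result = eval(code)
print(result)  # 24, i.e. (2 * 3) * 4
```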

According to the Language Reference, the right hand side of an assignment statement can be an 'expression list' or a 'yield expression'. An expression list is a comma-separated list of one or more expressions. You need to follow this through several more tokens to come up with anything concrete, but ultimately you can find that an 'expression' is any number of objects (literals or variable names, or the result of applying a unary operator such as not, ~ or - to a nested expression_list) chained together by any binary operator (such as the arithmetic, comparison or bitwise operators, or logical and and or) or the ternary a if condition else b.
You can also note in other parts of the language reference that an 'expression' is exactly something you can use as an argument to a function, or as the first part (before the for) of a list comprehension or generator expression.
This is a fairly broad definition - in fact, it amounts to "anything Python resolves to an object". But it does leave out a few things - for example, you can't directly store the less-than operator < in a variable, since it isn't a valid expression by itself (it has to be between two other expressions) and you have to put it in a function that uses it instead. Similarly, most of the Python keywords aren't expressions (the exceptions are True, False and None, which are all canonical names for certain objects).
Note especially that functions are also objects, and hence the name of a function (without calling it) is a valid expression. This means that your example:
var = lambda: some_function()
can be written as:
var = some_function

By definition, a variable is something which can vary, or change. In its broadest sense, a variable is no more than a way of referring to a location in memory in your given program. Another way to think of a variable is as a container to place your information in.
Unlike popular statically typed languages, variable declaration in Python is not required. You can place pretty much anything in a variable so long as you can come up with a name for it. Furthermore, in addition to the value of a variable in Python being capable of changing, the type often can as well.
To address your question, I would say the limitations on a variable in Python relate only to a few basic necessary attributes:
A name
A scope
A value
(Usually) a type
As a result, things like operators (+ or * for instance) cannot be stored in a variable as they do not meet these basic requirements, and in general you cannot store expressions themselves as variables (unless you're wrapping them in a lambda expression).
As mentioned by Kevin, it's also worth noting that it is possible to sort of store an operator in a variable using the operator module. However, even then you cannot perform the kinds of manipulations that a variable is otherwise subject to, as really you are just making a value assignment. An example of the operator module:
import operator
operations = {"+": operator.add,
              "-": operator.sub}
operator_variable_string = input('Give me an operator: ')
operator_function = operations[operator_variable_string]
result = operator_function(8, 4)

Enum vs String as a parameter in a function

I noticed that many libraries nowadays seem to prefer the use of strings over enum-type variables for parameters.
Where people would previously use enums, e.g. dateutil.rrule.FR for a Friday, it seems that this has shifted towards using string (e.g. 'FRI').
Same in numpy (or pandas for that matter), where searchsorted for example uses strings (e.g. side='left', or side='right') rather than a defined enum. For the avoidance of doubt, before Python 3.4 this could easily have been implemented as an enum like so:
class SIDE:
    RIGHT = 0
    LEFT = 1
And the advantages of enums-type variable are clear: You can't misspell them without raising an error, they offer proper support for IDEs, etc.
So why use strings at all, instead of sticking to enum types? Doesn't this make the programs much more prone to user errors? It's not like enums create an overhead - if anything they should be slightly more efficient. So when and why did this paradigm shift happen?

I think enums are safer especially for larger systems with multiple developers.
As soon as the need arises to change the value of such an enum, looking up and replacing a string in many places is not my idea of fun :-)
The most important criterion IMHO is the usage: for use in a module or even a package a string seems to be fine; in a public API I'd prefer enums.

[update]
As of today (2019) Python introduced dataclasses - combined with optional type annotations and static type analyzers like mypy I think this is a solved problem.
As for efficiency, attribute lookup is somewhat expensive in Python compared to most computer languages, so I guess some libraries may still choose to avoid it for performance reasons.
[original answer]
IMHO it is a matter of taste. Some people like this style:
def searchsorted(a, v, side='left', sorter=None):
    ...
    assert side in ('left', 'right'), "Invalid side '{}'".format(side)
    ...
numpy.searchsorted(a, v, side='right')
Yes, if you call searchsorted with side='foo' you may get an AssertionError way later at runtime - but at least the bug will be pretty easy to spot looking at the traceback.
While other people may prefer (for the advantages you highlighted):
numpy.searchsorted(a, v, side=numpy.CONSTANTS.SIDE.RIGHT)
I favor the first because I think seldom used constants are not worth the namespace cruft. You may disagree, and people may align with either side due to other concerns.
If you really care, nothing prevents you from defining your own "enums":
class SIDE(object):
    RIGHT = 'right'
    LEFT = 'left'
numpy.searchsorted(a, v, side=SIDE.RIGHT)
I think it is not worth it, but again it is a matter of taste.
[update]
Stefan made a fair point:
As soon as the need arises to change the value of such an enum, looking up and replacing a string in many places is not my idea of fun :-)
I can see how painful this can be in a language without named parameters - using the example you have to search for the string 'right' and get a lot of false positives. In Python you can narrow it down searching for side='right'.
Of course if you are dealing with an interface that already has a defined set of enums/constants (like an external C library) then yes, by all means mimic the existing conventions.
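If the string style is used, one option is to validate with an explicit exception rather than assert, since assert statements are skipped when Python runs with -O. A minimal sketch; check_side and VALID_SIDES are hypothetical names, not part of numpy:

```python
VALID_SIDES = ('left', 'right')

def check_side(side='left'):
    # Hypothetical helper: reject unknown option strings up front.
    if side not in VALID_SIDES:
        raise ValueError(
            "side must be one of {}, got {!r}".format(VALID_SIDES, side)
        )
    return side

print(check_side('right'))  # right
try:
    check_side('foo')
except ValueError as exc:
    print(exc)              # the message names the rejected value
```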

I understand this question has already been answered, but there is one thing that has not at all been addressed: the fact that Python Enum objects must be explicitly called for their value when using values stored by Enums.
>>> from enum import Enum
>>> class Test(Enum):
...     WORD = 'word'
...     ANOTHER = 'another'
...
>>> str(Test.WORD.value)
'word'
>>> str(Test.WORD)
'Test.WORD'
One simple solution to this problem is to offer an implementation of __str__()
>>> class Test(Enum):
...     WORD = 'word'
...     ANOTHER = 'another'
...     def __str__(self):
...         return self.value
...
>>> Test.WORD
<Test.WORD: 'word'>
>>> str(Test.WORD)
'word'
Yes, adding .value is not a huge deal, but it is an inconvenience nonetheless. Using regular strings requires zero extra effort, no extra classes, or redefinition of any default class methods. Still, there must be explicit casting to a string value in many cases, where a simple str would not have a problem.
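One way to soften this, sketched here under the assumption that the values are strings, is to mix str into the Enum so that members compare equal to plain strings. The class name Side is made up for illustration:

```python
from enum import Enum

class Side(str, Enum):
    # Members are real str instances, so they compare equal to plain strings.
    LEFT = 'left'
    RIGHT = 'right'

print(Side.LEFT == 'left')          # True
print(Side('right') is Side.RIGHT)  # True: lookup by value validates the string
print(Side.LEFT.value)              # left

try:
    Side('lfet')                    # a misspelled value is rejected
except ValueError as exc:
    print(exc)
```

Callers can then pass either Side.LEFT or 'left', and the function can normalise its argument with Side(side).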

I prefer strings for debugging. Compare an object like
side=1, opt_type=0, order_type=6
to
side='BUY', opt_type='PUT', order_type='FILL_OR_KILL'
I also like "enums" where the values are strings:
class Side(object):
    BUY = 'BUY'
    SELL = 'SELL'
    SHORT = 'SHORT'

Strictly speaking Python does not have enums - or at least it didn't prior to v3.4
https://docs.python.org/3/library/enum.html
I prefer to think of your example as programmer defined constants.
In argparse, one set of constants have string values. While the code uses the constant names, users more often use the strings.
e.g. argparse.ZERO_OR_MORE = '*'
argparse.OPTIONAL = '?'
numpy is one of the older 3rd party packages (at least its roots like numeric are). String values are more common than enums. In fact I can't off hand think of any enums (as you define them).

Use of OR as branch control in FP

I undertook an interview last week in which I learnt a few things about python I didn't know about (or rather realise how they could be used), first up and the content of this question is the use of or for the purposes of branch control.
So, for example, if we run:
def f():
    pass  # do something. I'd use ... but that's actually a python object.
def g():
    pass  # something else.
f() or g()
Then if f() evaluates to some true condition then that value is returned, if not, g() is evaluated and whatever value it produces is returned, whether true or false. This gives us the ability to implement an if statement using or keywords.
We can also use and such that f() and g() will return the value of g() if f() is true and the value of f() if f() is false.
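A quick way to check these rules at the prompt (f and g here are stand-ins that return fixed values):

```python
def f():
    return 0            # falsy

def g():
    return 'fallback'   # truthy

print(f() or g())   # fallback: f() is falsy, so g() is evaluated and returned
print(f() and g())  # 0: f() is falsy, so it is returned and g() is never called
print(g() or f())   # fallback: g() is truthy, so f() is never called
print(g() and f())  # 0: g() is truthy, so the value of f() is returned
```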
I am told that this (the use of or for branch control) is a common thing in languages such as lisp (hence the lisp tag). I'm currently following SICP learning Scheme, so I can see that (or (f x) (g x)) would return the value of (g x) assuming (f x) is #f.
I'm confused as to whether there is any advantage of this technique. It clearly achieves branch control but to me the built in keywords seem more self-explanatory.
I'm also confused as to whether or not this is "functional"? My understanding of pure functional programming is that you use constructs like this (an example from my recent erlang experiments):
makeeven(N,1) -> N+1;
makeeven(N,0) -> N;
makeeven(N) -> makeeven(N,N rem 2).
Or a better, more complicated example using template meta-programming in C++ (discovered via cpp-next.com). My thought process is that one aspect of functional programming boils down the use of piecewise defined functions in code for branch control (and if you can manage it, tail recursion).
So, my questions:
Is this "functional"? It appears that way and my interviewers said they had backgrounds in functional programming, but it didn't match what I thought was functional. I see no reason why you couldn't have a logical operator as part of a function - it seems to lend itself nicely to the concept of higher order functions. I just hadn't thought that the use of logical operators was how functional programmers achieved branch control. Right? Wrong? I can see that circuits use logic gates for branch control so I guess this is a similar (related) concept?
Is there some advantage to using this technique? Is it just language conciseness/a syntax issue, or are there implications in terms of building an interpreter to using this construct?
Are there any use cases for this technique? Or is it not used very often? Is it used at all? As a self-taught guy I'd never seen it before although that in itself isn't necessarily surprising.
I apologise for jumping over so many languages; I'm simply trying to tie together my understanding across them. Feel free to answer in any language mentioned. I also apologise if I've misunderstood any definitions or am missing something vital here, I've never formally studied computer science.

Your interviewers must have had a "functional background" way back. It used to be common to write
(or (some-condition) (some-side-effect))
but in CL and in Scheme implementations that support it, it is much better written with unless. Same goes for and vs when.
So, to be more concrete -- it's not more functional (and in fact the common use of these things was for one-sided conditionals, which are not functional to begin with); there is no advantage (which becomes very obvious in these languages when you know that things are implemented as macros anyway -- for example, most or and and implementations expand to an if); and any possible use cases should use when and unless if you have them in your implementation, otherwise it's better to define them as macros than to not use them.
Oh, and you could use a combination of them instead of a two sided if, but that would be obfuscatingly ugly.

I'm not aware of any issues with the way this code will execute, but it is confusing to read for the uninitiated. In fact, this kind of syntax is like a Python anti-pattern: you can do it, but it is in no way Pythonic.

condition and true_branch or false_branch works in all languages that have short-circuiting logical operators. On the other hand, it's not really a good idea in a language where every value has a boolean value, because it breaks when true_branch is falsy.
For example
zero = (1==1) and 0 or 1 # (1==1) -> True
zero = (True and 0) or 1 # (True and X) -> X
zero = 0 or 1 # 0 is False in most languages
zero = False or 1
zero = 1
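The same pitfall in runnable Python, next to the conditional expression that avoids it (a sketch; the variable names are illustrative):

```python
condition = True

# Intended result: 0, because the condition is true.
old_style = condition and 0 or 1   # 0 is falsy, so "or" falls through to 1
new_style = 0 if condition else 1  # conditional expression (Python 2.5+)

print(old_style)  # 1
print(new_style)  # 0
```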

As Eli said; also, performing control flow purely with logical operators tends to be taught in introductory FP classes -- more as a mind exercise, really, not something that you necessarily want to use IRL. It's always good to be able to translate any control operator down to if.
Now, the big difference between FPs and other languages is that, in more functional languages, if is actually an expression, not a statement. An if block always has a value! The C family of languages has a macro version of this -- the test? consequent : alternative construct -- but it gets really unreadable if you nest more expressions.
Prior to Python 2.5, if you want to have a control-flow expression in Python you might have to use logical operators. In Python 2.5, though, there is an FP-like if-expression syntax, so you can do something like this:
(42 if True else 7) + 35
See PEP 308

You only mention the case where there are exactly 2 expressions to evaluate. What happens if there are 5?
;; returns first true value, evaluating only as many as needed
(or (f x) (g x) (h x) (i x) (j x))
Would you nest if-statements? I'm not sure how I'd do this in Python. It's almost like this:
any(c(x) for c in [f, g, h, i, j])
except Python's any throws away the value and just returns True. (There might be a way to do it with itertools.dropwhile, but it seems a little awkward to me. Or maybe I'm just missing the obvious way.)
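One possible way to keep the value is next() over a lazy generator. This is only a sketch: first_truthy and make are made-up helpers, and unlike Lisp's or it returns None when nothing is truthy.

```python
def first_truthy(funcs, x):
    # Calls each function in turn and returns the first truthy result;
    # later functions are never called. Returns None if nothing is truthy.
    return next((value for value in (func(x) for func in funcs) if value), None)

calls = []

def make(name, result):
    # Build a function that logs its name and returns a fixed result.
    def func(x):
        calls.append(name)
        return result
    return func

f, g, h = make('f', None), make('g', 'answer from g'), make('h', 'answer from h')

print(first_truthy([f, g, h], 42))  # answer from g
print(calls)                        # ['f', 'g']: h was never called
```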
(As an aside: I find that Lisp's builtins don't quite correspond to what their names are in other languages, which can be confusing. Lisp's IF is like C's ternary operator ?: or Python's conditional expressions, for example, not their if-statements. Likewise, Lisp's OR is in some ways more like (but not exactly like) Python's any() than like a binary or operator, which only takes 2 expressions. Since the normal IF returns a value already, there's no point in having a separate kind of "if" that can't be used like this, or a separate kind of "or" that only takes two values. It's already as flexible as the less common variant in other languages.)
I happen to be writing code like this right now, coincidentally, where some of the functions are "go ask some server for an answer", and I want to stop as soon as I get a positive response. I'd never use OR where I really want to say IF, but I'd rather say:
(setq did-we-pass (or (try-this x)
                      (try-that x)
                      (try-some-other-thing x)
                      (heck-maybe-this-will-work x)))
than make a big tree of IFs. Does that qualify as "flow control" or "functional"? I guess it depends on your definitions.

It may be considered "functional" in the sense of style of programming that is/was preferred in functional language. There is nothing functional in it otherwise.
It's just syntax.
It may be sometimes more readable to use or, for example:
def foo(bar=None):
    bar = bar or []
    ...
    return bar
def baz(elems):
    print("You have %s elements." % (len(elems) or "no"))
You could use bar if bar else [], but it's quite elaborate.
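One caveat worth knowing: bar or [] replaces any falsy argument, including an empty list the caller passed on purpose. A small sketch of the difference (both function names are made up):

```python
def with_or(bar=None):
    bar = bar or []        # replaces ANY falsy argument, including an empty list
    bar.append(1)
    return bar

def with_none_check(bar=None):
    if bar is None:        # replaces only a missing argument
        bar = []
    bar.append(1)
    return bar

mine = []
with_or(mine)
print(mine)            # []: the function appended to a fresh list instead
with_none_check(mine)
print(mine)            # [1]: the caller's list was used
```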

side effect gotchas in python/numpy? horror stories and narrow escapes wanted

I am considering moving from Matlab to Python/numpy for data analysis and numerical simulations. I have used Matlab (and SML-NJ) for years, and am very comfortable in the functional environment without side effects (barring I/O), but am a little reluctant about the side effects in Python. Can people share their favorite gotchas regarding side effects, and if possible, how they got around them? As an example, I was a bit surprised when I tried the following code in Python:
lofls = [[]] * 4  # an accident waiting to happen!
lofls[0].append(7)  # not what I was expecting...
print(lofls)  # gives [[7], [7], [7], [7]]
# instead, I should have done this (I think)
lofls = [[] for x in range(4)]
lofls[0].append(7)  # only appends to the first list
print(lofls)  # gives [[7], [], [], []]
thanks in advance

Confusing references to the same (mutable) object with references to separate objects is indeed a "gotcha" (suffered by all non-functional languages, ones which have mutable objects and, of course, references). A frequently seen bug in beginners' Python code is misusing a default value which is mutable, e.g.:
def addone(item, alist=[]):
    alist.append(item)
    return alist
This code may be correct if the purpose is to have addone keep its own state (and return the one growing list to successive callers), much as static data would work in C; it's not correct if the coder is wrongly assuming that a new empty list will be made at each call.
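For the case where a new list per call is wanted, the usual idiom is a None sentinel. A minimal sketch comparing the two behaviours (the function names addone_shared and addone_fresh are illustrative):

```python
def addone_shared(item, alist=[]):
    # The default list is created once, when the def statement runs.
    alist.append(item)
    return alist

def addone_fresh(item, alist=None):
    # None is used as a sentinel so that each call gets its own new list.
    if alist is None:
        alist = []
    alist.append(item)
    return alist

print(addone_shared(1), addone_shared(2))  # [1, 2] [1, 2]
print(addone_fresh(1), addone_fresh(2))    # [1] [2]
```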
Raw beginners used to functional languages can also be confused by the command-query separation design decision in Python's built-in containers: mutating methods that don't have anything in particular to return (i.e., the vast majority of mutating methods) return nothing (specifically, they return None) -- they're doing all their work "in-place". Bugs coming from misunderstanding this are easy to spot, e.g.
alist = alist.append(item)
is pretty much guaranteed to be a bug -- it appends an item to the list referred to by name alist, but then rebinds name alist to None (the return value of the append call).
While the first issue I mentioned is about an early-binding that may mislead people who think the binding is, instead, a late one, there are issues that go the other way, where some people's expectations are for an early binding while the binding is, instead, late. For example (with a hypothetical GUI framework...):
for i in range(10):
    Button(text="Button #%s" % i,
           click=lambda: say("I'm #%s!" % i))
this will show ten buttons saying "Button #0", "Button #1", etc, but, when clicked, each and every one of them will say it's #9 -- because the i within the lambda is late bound (with a lexical closure). A fix is to take advantage of the fact that default values for arguments are early-bound (as I pointed out about the first issue!-) and change the last line to
click=lambda i=i: say("I'm #%s!" % i))
Now lambda's i is an argument with a default value, not a free variable (looked up by lexical closure) any more, and so the code works as intended (there are other ways too, of course).
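The same effect can be reproduced without any GUI framework. In this sketch (Python 3), say is a hypothetical stand-in that just returns its text, and functools.partial is shown as one of the "other ways" of binding the value early; it is an assumption about what the answer had in mind, not something the answer names.

```python
from functools import partial


def say(text):
    # Stand-in for the hypothetical GUI callback; it simply returns the text.
    return text


# Late binding: every lambda looks up i when it is called, after the loop ended.
late = [lambda: say("I'm #%s!" % i) for i in range(3)]

# Early binding via a default argument value, as described above.
early = [lambda i=i: say("I'm #%s!" % i) for i in range(3)]

# Early binding via functools.partial: the message is formatted immediately.
bound = [partial(say, "I'm #%s!" % i) for i in range(3)]

print([f() for f in late])   # ["I'm #2!", "I'm #2!", "I'm #2!"]
print([f() for f in early])  # ["I'm #0!", "I'm #1!", "I'm #2!"]
print([f() for f in bound])  # ["I'm #0!", "I'm #1!", "I'm #2!"]
```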

I stumbled upon this one again recently (after years of Python) while trying to remove a small dependency on numpy.
If you come from Matlab you should use and trust numpy functions for mono-type array handling. Along with matplotlib, they are some very convenient packages for a smooth transition.
import numpy as np
np.zeros((4,)) # to make an array full of zeros [0,0,0,0]
np.zeros((4,1)) # another one full of zeros but 2 dimensions [[0],[0],[0],[0]]
np.zeros((4,0)) # an empty array like [[],[],[],[]]
np.zeros((0,4)) # another empty array, which can not be represented with python lists o_O
etc.

What are the important language features (idioms) of Python to learn early on [duplicate]

This question already has answers here:
The Zen of Python [closed]
(22 answers)
Python: Am I missing something? [closed]
(16 answers)
Closed 8 years ago.
I would be interested in knowing what the StackOverflow community thinks are the important language features (idioms) of Python. Features that would define a programmer as Pythonic.
Python (pythonic) idiom - "code expression" that is natural or characteristic to the language Python.
Plus, which idioms should all Python programmers learn early on?
Thanks in advance
Related:
Code Like a Pythonista: Idiomatic Python
Python: Am I missing something?

Python is a language that can be described as:
"rules you can fit in the palm of your hand with a huge bag of hooks".
Nearly everything in python follows the same simple standards. Everything is accessible, changeable, and tweakable. There are very few language level elements.
Take for example, the len(data) builtin function. len(data) works by simply checking for a data.__len__() method, and then calls it and returns the value. That way, len() can work on any object that implements a __len__() method.
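As a minimal sketch of that hook (Python 3; the Deck class is hypothetical and used only for illustration), any class that defines __len__ works with len(). The same method is also used for truth testing when the class does not define __bool__.

```python
class Deck:
    """Hypothetical container used only for illustration."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        # len(deck) ends up calling this method.
        return len(self._cards)


deck = Deck(["ace", "king", "queen"])
print(len(deck))       # 3
print(bool(Deck([])))  # False, because __len__ returns 0
```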
Start by learning about the types and basic syntax:
Dynamic Strongly Typed Languages
bool, int, float, string, list, tuple, dict, set
statements, indenting, "everything is an object"
basic function definitions
Then move on to learning about how python works:
imports and modules (really simple)
the python path (sys.path)
the dir() function
__builtins__
Once you have an understanding of how to fit pieces together, go back and cover some of the more advanced language features:
iterators
overrides like __len__ (there are tons of these)
list comprehensions and generators
classes and objects (again, really simple once you know a couple rules)
python inheritance rules
And once you have a comfort level with these items (with a focus on what makes them pythonic), look at more specific items:
Threading in python (note the Global Interpreter Lock)
context managers
database access
file IO
sockets
etc...
And never forget The Zen of Python (by Tim Peters)
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

This page covers all the major python idioms: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html

An important idiom in Python is docstrings.
Every object has a __doc__ attribute that can be used to get help on that object. You can set the __doc__ attribute on modules, classes, methods, and functions like this:
# this is m.py
""" module docstring """
class c:
    """class docstring"""
    def m(self):
        """method docstring"""
        pass
def f(a):
    """function f docstring"""
    return
Now, when you type help(m), help(m.f) etc. it will print the docstring as a help message.
Because it's just part of normal object introspection, this can be used by documentation-generating systems like epydoc or used for testing purposes by unittest.
It can also be put to more unconventional (i.e. non-idiomatic) uses such as grammars in Dparser.
What I find even more interesting is that, even though __doc__ is a read-only attribute on most objects, you can place docstring-style strings anywhere, like this:
x = 5
""" pseudo docstring for x """
and documentation tools like epydoc can pick them up and format them properly (as opposed to a normal comment, which stays inside the code formatting).
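As a small sketch of the __doc__ attribute this answer describes (Python 3; area and Shape are hypothetical names, not from the original), a docstring can be read like any other attribute, and on plain functions it can also be reassigned:

```python
def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


class Shape:
    """Hypothetical class used only for illustration."""


print(area.__doc__)   # Return the area of a width-by-height rectangle.
print(Shape.__doc__)  # Hypothetical class used only for illustration.

# On a plain function the attribute is writable.
area.__doc__ = "Replaced at runtime."
print(area.__doc__)   # Replaced at runtime.
```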

Decorators get my vote. Where else can you write something like:
def trace(num_args=0):
    def wrapper(func):
        def new_f(*a, **k):
            print_args = ''
            if num_args > 0:
                print_args = str.join(',', [str(x) for x in a[0:num_args]])
            print('entering %s(%s)' % (func.__name__, print_args))
            rc = func(*a, **k)
            if rc is not None:
                print('exiting %s(%s)=%s' % (func.__name__, print_args, rc))
            else:
                print('exiting %s(%s)' % (func.__name__, print_args))
            return rc
        return new_f
    return wrapper
@trace(1)
def factorial(n):
    if n < 2:
        return 1
    return n * factorial(n-1)
factorial(5)
and get output like:
entering factorial(5)
entering factorial(4)
entering factorial(3)
entering factorial(2)
entering factorial(1)
exiting factorial(1)=1
exiting factorial(2)=2
exiting factorial(3)=6
exiting factorial(4)=24
exiting factorial(5)=120

Everything connected to list usage.
Comprehensions, generators, etc.

Personally, I really like Python syntax defining code blocks by using indentation, and not by the words "BEGIN" and "END" (as in Microsoft's Basic and Visual Basic - I don't like these) or by using left- and right-braces (as in C, C++, Java, Perl - I like these).
This really surprised me because, although indentation has always been very important to me, I didn't make too much "noise" about it. I lived with it, and being able to read other people's "spaghetti" code is considered a skill. Furthermore, I had never heard another programmer suggest making indentation a part of a language. Until Python! I only wish I had thought of this idea first.
To me, it is as if Python's syntax forces you to write good, readable code.
Okay, I'll get off my soap-box. ;-)

From a more advanced viewpoint, understand how dictionaries are used internally by Python. Classes, functions, modules, and references are all just entries in a dictionary. Once this is understood, it's easy to see how to monkey patch and how to use the powerful __getattr__, __setattr__, and __call__ methods.
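A short sketch of what that looks like in practice (Python 3). Greeter, shout, and Defaulting are hypothetical names chosen for illustration; the point is only that instance and class attributes live in __dict__ mappings, that rebinding a class attribute patches every instance, and that __getattr__ is consulted only when normal lookup fails.

```python
class Greeter:
    """Hypothetical class used only for illustration."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


g = Greeter("Ann")
print(g.__dict__)                   # {'name': 'Ann'}
print("greet" in Greeter.__dict__)  # True


def shout(self):
    return "HELLO, " + self.name.upper() + "!"


# Monkey patch: rebinding the class attribute affects existing instances too.
Greeter.greet = shout
print(g.greet())                    # HELLO, ANN!


class Defaulting:
    def __getattr__(self, name):
        # Called only when the attribute is not found the normal way.
        return "<no %s>" % name


d = Defaulting()
d.x = 1
print(d.x, d.y)                     # 1 <no y>
```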

Here's one that can help. What's the difference between:
[ foo(x) for x in range(0, 5) ][0]
and
( foo(x) for x in range(0, 5) ).next()
answer:
in the second example, foo is called only once. This may be important if foo has a side effect, or if the iterable being used to construct the list is large.
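To check the claim, here is a sketch (Python 3) that records each call. The body of foo is an assumption made for illustration, and the built-in next() is used instead of the Python 2 .next() method so that it runs on current interpreters.

```python
calls = []


def foo(x):
    # Record every call so the two approaches can be compared.
    calls.append(x)
    return x * 10


first_from_list = [foo(x) for x in range(0, 5)][0]
list_calls = len(calls)

del calls[:]  # reset the call log
first_from_gen = next(foo(x) for x in range(0, 5))
gen_calls = len(calls)

print(first_from_list, list_calls)  # 0 5
print(first_from_gen, gen_calls)    # 0 1
```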

Two things that struck me as especially Pythonic were dynamic typing and the various flavors of lists used in Python, particularly tuples.
Python's list obsession could be said to be LISP-y, but it's got its own unique flavor. A line like:
return HandEvaluator.StraightFlush, (PokerCard.longFaces[index + 4],
PokerCard.longSuits[flushSuit]), []
or even
return False, False, False
just looks like Python and nothing else. (Technically, you'd see the latter in Lua as well, but Lua is pretty Pythonic in general.)

Using string substitutions:
name = "Joe"
age = 12
print "My name is %s, I am %s" % (name, age)
When I'm not programming in python, that simple use is what I miss most.

Another thing you cannot start early enough is probably testing. Here especially doctests are a great way of testing your code by explaining it at the same time.
A doctest can be a simple text file containing an interactive interpreter session plus explanatory text, like this:
Let's instantiate our class::

    >>> a = Something(text="yes")
    >>> a.text
    'yes'

Now call this method and check the results::

    >>> a.canify()
    >>> a.text
    'yes, I can'
If e.g. a.text returns something different the test will fail.
doctests can live inside docstrings or in standalone text files and are executed using the doctest module. Of course, the better-known unit tests are also available.
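Here is a runnable sketch of the docstring variant (Python 3). The Something class is a hypothetical implementation invented to match the session above. In a script one would typically just call doctest.testmod(); the docstring is parsed and run explicitly here only so that the result can be inspected.

```python
import doctest


class Something:
    """Hypothetical class matching the session above.

    >>> a = Something(text="yes")
    >>> a.text
    'yes'
    >>> a.canify()
    >>> a.text
    'yes, I can'
    """

    def __init__(self, text):
        self.text = text

    def canify(self):
        self.text += ", I can"


# Build a DocTest from the class docstring and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(Something.__doc__, {"Something": Something},
                          "Something", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

Note that the expected output is the repr of the value, which is why the strings appear in quotes.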

I think that online tutorials and books mostly talk about how to do things, not how to do them in the best way. Along with the Python syntax, I think that speed is important in some cases.
Python provides a way to benchmark functions, actually two!!
One way is to use the profile module, like so:
import profile
def foo(x, y, z):
    return x**y % z # Just an example.
profile.run('foo(5, 6, 3)')
Another way to do this is to use the timeit module, like this:
import timeit
def foo(x, y, z):
    return x**y % z # Can also be 'pow(x, y, z)' which is way faster.
timeit.timeit('foo(5, 6, 3)', 'from __main__ import *', number = 100)
# timeit.timeit(testcode, setupcode, number = number_of_iterations)
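timeit.timeit also accepts a callable instead of a code string, which avoids the 'from __main__ import *' setup. The sketch below (Python 3) compares the two spellings mentioned in the comment above; foo_pow is a hypothetical name, and the measured times depend on the machine and the argument sizes, so no particular result is claimed.

```python
import timeit


def foo(x, y, z):
    return x**y % z


def foo_pow(x, y, z):
    # Three-argument pow computes the same modular power.
    return pow(x, y, z)


same_result = foo(5, 6, 3) == foo_pow(5, 6, 3)

# Passing a callable: timeit calls it `number` times and returns total seconds.
slow = timeit.timeit(lambda: foo(5, 6, 3), number=1000)
fast = timeit.timeit(lambda: foo_pow(5, 6, 3), number=1000)

print("x**y %% z:    %.6f s" % slow)
print("pow(x, y, z): %.6f s" % fast)
```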
