I didn't really pay as much attention to Python 3's development as I would have liked, and only just noticed some interesting new syntax changes. Specifically, from this SO answer, function parameter annotation:
def digits(x:'nonnegative number') -> "yields number's digits":
    # ...
Not knowing anything about this, I thought it could maybe be used for implementing static typing in Python!
After some searching, there seemed to be a lot of discussion regarding (entirely optional) static typing in Python, such as that mentioned in PEP 3107, and "Adding Optional Static Typing to Python" (and part 2)...
...but I'm not clear how far this has progressed. Are there any implementations of static typing using the parameter annotation? Did any of the parameterised-type ideas make it into Python 3?
Thanks for reading my code!
Indeed, it's not hard to create a generic annotation enforcer in Python. Here's my take:
'''Very simple enforcer of type annotations.

This toy super-decorator can decorate all functions in a given module that have
annotations, so that the types of input and output are enforced; an AssertionError
is raised on mismatch.

This module also has a test function func() which should fail, and a logging
facility log which defaults to print.

Since this is a test module, I cut corners by only checking *keyword* arguments.
'''
import sys

log = print

def func(x:'int' = 0) -> 'str':
    '''An example function that fails type checking.'''
    return x

# For simplicity, I only do keyword args.
def check_type(*args):
    param, value, assert_type = args
    log('Checking {0} = {1} of {2}.'.format(*args))
    if not isinstance(value, assert_type):
        raise AssertionError(
            'Check failed - parameter {0} = {1} not {2}.'
            .format(*args))
    return value

def decorate_func(func):
    def newf(*args, **kwargs):
        for k, v in kwargs.items():
            check_type(k, v, ann[k])
        return check_type('<return_value>', func(*args, **kwargs), ann['return'])
    ann = {k: eval(v) for k, v in func.__annotations__.items()}
    newf.__doc__ = func.__doc__
    newf.__type_checked = True
    return newf

def decorate_module(module = '__main__'):
    '''Enforces type from annotation for all functions in module.'''
    d = sys.modules[module].__dict__
    for k, f in list(d.items()):  # list() lets us rebind entries while iterating
        if getattr(f, '__annotations__', {}) and not getattr(f, '__type_checked', False):
            log('Decorated {0!r}.'.format(f.__name__))
            d[k] = decorate_func(f)

if __name__ == '__main__':
    decorate_module()
    # This will raise AssertionError.
    func(x = 5)
Given this simplicity, it's strange at first sight that this thing is not mainstream. However, I believe there are good reasons why it's not as useful as it might seem. Generally, type checking helps because if you add an integer and a dictionary, chances are you made some obvious mistake (and if you meant something reasonable, it's still better to be explicit than implicit).
But in real life you often mix quantities that have the same computer type as seen by the compiler, but clearly different human types. For example, the following snippet contains an obvious mistake:
height = 1.75 # Bob's height in meters.
length = len(sys.modules) # Number of modules imported by program.
area = height * length # What's that supposed to mean???
Any human should immediately see the mistake in the last line, provided they know the 'human types' of the variables height and length, even though to the computer it looks like a perfectly legal multiplication of a float by an int.
There's more that can be said about possible solutions to this problem, but enforcing 'computer types' is apparently a half-solution, so, at least in my opinion, it's worse than no solution at all. It's the same reason Systems Hungarian is a terrible idea while Apps Hungarian is a great one. There's more in Joel Spolsky's very informative post.
Now if somebody were to implement some kind of Pythonic third-party library that would automatically assign real-world data its human type, and then take care to transform that type (like width * height -> area) and enforce those checks with function annotations, I think that would be type checking people could really use!
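To make that concrete, here's a toy sketch of the idea (the class, the RULES table, and all names are made up for illustration, not a real library):

class Quantity(object):
    # Declare which 'human type' combinations make sense.
    RULES = {('m', 'm'): 'm^2'}  # width * height -> area

    def __init__(self, value, dim):
        self.value, self.dim = value, dim

    def __mul__(self, other):
        dim = Quantity.RULES.get((self.dim, other.dim))
        if dim is None:
            raise TypeError('cannot multiply %s by %s' % (self.dim, other.dim))
        return Quantity(self.value * other.value, dim)

width, height = Quantity(2.0, 'm'), Quantity(1.75, 'm')
length = Quantity(42, 'modules')
area = width * height    # fine: 3.5 'm^2'
oops = height * length   # raises TypeError - the 'obvious mistake' is caught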
As mentioned in that PEP, static type checking is one of the possible applications that function annotations can be used for, but they're leaving it up to third-party libraries to decide how to do it. That is, there isn't going to be an official implementation in core Python.
As far as third-party implementations are concerned, there are some snippets (such as http://code.activestate.com/recipes/572161/), which seem to do the job pretty well.
EDIT:
As a note, I want to mention that checking behavior is preferable to checking type, and therefore I think static type checking is not so great an idea. My answer above aims to answer the question; it's not because I would do type checking myself in such a way.
"Static typing" in Python can only be implemented so that the type checking is done in run-time, which means it slows down the application. Therefore you don't want that as a generality. Instead you want some of your methods to check it's inputs. This can be easily done with plain asserts, or with decorators if you (mistakenly) think you need it a lot.
There is also an alternative to static type checking, and that is to use an aspect oriented component architecture like The Zope Component Architecture. Instead of checking the type, you adapt it. So instead of:
assert isinstance(theobject, myclass)
you do this:
theobject = IMyClass(theobject)
If theobject already implements IMyClass, nothing happens. If it doesn't, an adapter that wraps whatever theobject is into IMyClass will be looked up and used instead of theobject. If no adapter is found, you get an error.
This combines the dynamism of Python with the desire to have a specific type in a specific way.
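Here's a rough sketch of what that looks like with zope.interface and zope.component; treat the registration details as illustrative rather than authoritative, and see the ZCA docs for the real recipe:

from zope.interface import Interface, implementer
from zope.component import getGlobalSiteManager

class IMyClass(Interface):
    """What the rest of the code expects theobject to provide."""

@implementer(IMyClass)
class MyClassAdapter(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped

# Register an adapter from plain strings to IMyClass.
getGlobalSiteManager().registerAdapter(MyClassAdapter, (str,), IMyClass)

theobject = IMyClass("not an IMyClass yet")  # adapter is looked up and applied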
This is not an answer to the question directly, but I found a Python fork that adds static typing: mypy-lang.org. Of course one can't rely on it, as it's still a small endeavor, but it's interesting.
Sure, static typing seems a bit "unpythonic" and I don't use it all the time. But there are cases (e.g. nested classes, as in domain-specific language parsing) where it can really speed up your development.
Then I prefer using beartype, explained in this post*. It comes with a git repo, tests, and an explanation of what it can and can't do... and I like the name ;)
* Please don't pay attention to Cecil's rant about why Python doesn't come with batteries included in this case.
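For reference, a minimal sketch of how beartype is typically used - decorate a function and its annotations are enforced at call time (the exact exception class raised is beartype's own; check its docs):

from beartype import beartype

@beartype
def double(x: int) -> int:
    return x * 2

double(21)     # fine
double("21")   # raises a beartype type-checking exception at call time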
Related
I'm looking for the best recipe to allow inline definition of functions, or multi-line lambdas, in Python.
For example, I'd like to do the following:
def callfunc(func):
    func("Hello")
>>> callfunc(define('x', '''
... print x, "World!"
... '''))
Hello World!
I've found an example for the define function in this answer:
def define(arglist, body):
    g = {}
    exec("def anonfunc({0}):\n{1}".format(
        arglist,
        "\n".join("    {0}".format(line) for line in body.splitlines())), g)
    return g["anonfunc"]
This is one possible solution, but it is not ideal. Desirable features would be:
be smarter about indentation,
hide the innards better (e.g. don't have anonfunc in the function's scope)
provide access to variables in the surrounding scope / captures
better error handling
and some things I haven't thought of. I had a really nice implementation once that did most of the above, but I lost it, unfortunately. I'm wondering if someone else has made something similar.
Disclaimer:
I'm well aware this is controversial among Python users, and regarded as a hack or unpythonic. I'm also aware of the discussions regarding multi-line lambdas on the python-dev mailing list, and that a similar feature was omitted on purpose. However, from the same discussions I've learned that there is also interest in such a function by many others.
I'm not asking whether this is a good idea or not, but instead: Given that one has decided to implement this, (either out of fun and curiosity, madness, genuinely thinking this is a nice idea, or being held at gunpoint) how to make anonymous define work as close as possible to def using python's (2.7 or 3.x) current facilities?
Examples:
A bit more as to why: this can be really handy for callbacks in GUIs:
# gtk example:
self.ntimes = 0
button.connect('clicked', define('*a', '''
    self.ntimes += 1
    label.set_text("Button has been clicked %d times" % self.ntimes)
'''))
The benefit over defining a function with def is that your code is in a more logical order. This is simplified code taken from a Twisted application:
# twisted example:
def sayHello(self):
    d = self.callRemote(HelloCommand)
    def handle_response(response):
        # do something, this happens after (x)!
        pass
    d.addCallback(handle_response) # (x)
Note how it seems out of order. I usually break stuff like this up, to keep the code order == execution order:
def sayHello_d(self):
    d = self.callRemote(HelloCommand)
    d.addCallback(self._sayHello_2)
    return d

def _sayHello_2(self, response):
    # handle response
    pass
This is better wrt. ordering but more verbose. Now, with the anonymous functions trick:
d = self.callRemote(HelloCommand)
d.addCallback(define('response', '''
    print "callback"
    print "got response from", response["name"]
'''))
If you come from a JavaScript or Ruby background, Python's abilities to deal with anonymous functions may indeed seem limited, but this is for a reason. Python's designers decided that clarity of code is more important than conciseness. If you don't like that, you probably don't like Python at all. There's nothing wrong with that; there are many other choices - why not try a language that tastes better to you?
Putting chunks of code into strings and interpreting them on the fly is definitely the wrong way to "extend" a language, simply because none of the tools you're working with - from syntax highlighters to the Python interpreter itself - will be able to deal with "stringified" code in a sensible way.
To answer the question as asked: what you're doing there is essentially an attempt to construct some better-than-Python programming language and compile it to Python on the fly. The idea is not new in the world of scripting languages and can be productive or not (CoffeeScript is an example of a successful implementation), but your very approach is wrong. format() is not the tool you're looking for when working with code. If you're writing a compiler, do it properly: use a parser (e.g. pyparsing) to read your code into an AST, walk through the AST to generate Python code (or even bytecode), catch syntax errors as you go, and take measures to provide better runtime feedback (e.g. error context, line numbers, etc.). Finally, make sure your compiler works across different Python versions and implementations.
Or just use ruby.
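That said, if you insist on staying in Python, here is a rough stdlib-only sketch of the "do it properly" route for Python 3 - parse with ast so you get real SyntaxErrors with line numbers, dedent the body, and keep the helper name out of the caller's namespace (all names here are made up):

import ast
import textwrap

def define(arglist, body, env=None):
    src = "def _anon({0}):\n{1}".format(
        arglist, textwrap.indent(textwrap.dedent(body), "    "))
    tree = ast.parse(src, filename="<define>")  # SyntaxError with line info
    code = compile(tree, "<define>", "exec")
    namespace = dict(env or {})
    exec(code, namespace)
    return namespace.pop("_anon")               # hide the innards

greet = define('name', '''
    return "Hello, %s!" % name
''')
print(greet("World"))   # Hello, World!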
I'm not sure if I like Python's dynamism. It often results in me forgetting to check a type, trying to call an attribute, and getting a "'NoneType' object has no attribute x" error (or the same for any other type). A lot of them are pretty harmless, but if not handled correctly they can bring down your entire app/process/etc.
Over time I got better at predicting where these could pop up and adding explicit type checking, but because I'm only human I miss one occasionally, and then some end-user finds it.
So I'm interested in your strategy to avoid these. Do you use type-checking decorators? Maybe special object wrappers?
Please share...
forgetting to check a type
This doesn't make much sense. You so rarely need to "check" a type. You simply run unit tests and if you've provided the wrong type object, things fail. You never need to "check" much, in my experience.
trying to call an attribute and getting the NoneType (or any other) has no attribute x error.
Unexpected None is a plain-old bug. 80% of the time, I omitted the return. Unit tests always reveal these.
Of those that remain, 80% of the time, they're plain old bugs due to an "early exit" which returns None because someone wrote an incomplete return statement. These if foo: return structures are easy to detect with unit tests. In some cases, they should have been if foo: return somethingMeaningful, and in still other cases, they should have been if foo: raise Exception("Foo").
The rest are dumb mistakes from misreading the APIs. Generally, mutator functions don't return anything. Sometimes I forget. Unit tests find these quickly, since basically nothing works right.
That covers the "unexpected None" cases pretty solidly. Easy to unit test for. Most of the mistakes involve fairly trivial-to-write tests for some pretty obvious species of mistakes: wrong return; failure to raise an exception.
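A minimal sketch of the kind of test that catches these (the function and numbers are invented):

import unittest

def apply_tax(amount):
    total = amount * 1.2   # Bug: the return statement was omitted.

class TestApplyTax(unittest.TestCase):
    def test_returns_a_number(self):
        # Fails immediately: apply_tax(100) is None, not 120.0.
        self.assertEqual(apply_tax(100), 120.0)

if __name__ == '__main__':
    unittest.main()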
Other "has no attribute X" errors are really wild mistakes where a totally wrong type was used. That's either really wrong assignment statements or really wrong function (or method) calls. They always fail elaborately during unit testing, requiring very little effort to fix.
A lot of them are pretty harmless but if not handled correctly they can bring down your entire app/process/etc.
Um... Harmless? If it's a bug, I pray that it brings down my entire app as quickly as possible so I can find it. A bug that doesn't crash my app is the most horrible situation imaginable. "Harmless" isn't a word I'd use for a bug that fails to crash my app.
If you write good unit tests for all of your code, you should find the errors very quickly when testing code.
You can also use decorators to enforce the type of attributes.
>>> @accepts(int, int, int)
... @returns(float)
... def average(x, y, z):
...     return (x + y + z) / 2
...
>>> average(5.5, 10, 15.0)
TypeWarning: 'average' method accepts (int, int, int), but was given
(float, int, float)
15.25
>>> average(5, 10, 15)
TypeWarning: 'average' method returns (float), but result is (int)
15
I'm not really a fan of them, but I can see their usefulness.
One tool to try to help you keep your pieces fitting together well is interfaces. zope.interface is the most notable package in the Python world for using interfaces. Check out http://wiki.zope.org/zope3/WhatAreInterfaces and http://glyph.twistedmatrix.com/2009/02/explaining-why-interfaces-are-great.html to start to get an idea how interfaces and z.i in particular work. Interfaces can prove very useful in a large Python codebases.
Interfaces are no substitute for testing. Reasonably comprehensive testing is especially important in highly dynamic languages like Python, where there are types of bugs that could not exist in a statically typed language. Tests will also help you catch the sorts of bugs that are not unique to dynamic languages. Fortunately, developing in Python means that testing is easy (due to the flexibility), and you have plenty of time to write tests with the time you saved by using Python.
One advantage of TDD is that you end up writing code that is easier to write tests for.
Writing code first and then the tests can result in code that superficially works the same, but is much harder to write 100% coverage tests for.
Each case is likely to be different
It might make sense to have a decorator to check whether a particular parameter is None (or some other unexpected value) if you use it in a bunch of places.
Maybe it is appropriate to use the Null pattern - if the code is blowing up because you are setting the initial value to None, you could instead set the initial value to a null version of the object.
More and more wrappers can add up to quite a performance hit, though, so it's always better to write code from the start that avoids the corner cases.
forgetting to check a type
With duck typing, it shouldn't be necessary to check a type. But that's the theory; in reality you will often want to validate input parameters (e.g. checking a UUID with a regex). For that purpose, I created some handy decorators for simple type and return-type checking, which are called like this:
@decorators.params(0, int, 2, str)    # first parameter must be an integer / third a string
@decorators.returnsOrNone(int, long)  # must return an int/long value or None
def doSomething(integerParam, noMatterWhatParam, stringParam):
    ...
For everything else I mostly use assertions. Of course one often forgets to check a parameter, so it's necessary to test and to test often.
trying to call an attribute
Happens to me very seldom. Actually I often use methods instead of direct access to attributes (the "good" old getter/setter approach sometimes).
because I'm only human I miss one occasionally and then some end-user finds it
"Software is always completed at the customers'." - An anti-pattern which you should solve with unit tests that handle all possible cases in a function. Easier said than done, but it helps...
As for other common Python mistakes (mistyped names, wrong imports, ...), I'm using Eclipse with PyDev for projects (not for small scripts). PyDev warns you about most of the simple kinds of mistakes.
I haven’t done a lot of Python programming, but I’ve done no programming at all in statically typed languages, so I don’t tend to think about things in terms of variable types. That might explain why I haven’t come across this problem much. (Although the small amount of Python programming I’ve done might explain that too.)
I do enjoy Python 3’s revised handling of strings (i.e. all strings are unicode, everything else is just a stream of bytes), because in Python 2 you might not notice TypeErrors until dealing with unusual real world string values.
You can hint your IDE via function docstrings, for example: http://www.pydev.org/manual_adv_type_hints.html; in JavaScript, JSDoc helps in a similar way.
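A sketch of what such docstring hints look like (Sphinx-style fields; check the PyDev manual linked above for the exact flavor it understands - the names here are invented):

def send(packet, interface):
    """Send a packet out of the given interface.

    :type packet: bytes
    :type interface: NetworkInterface
    :rtype: bool
    """
    ...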
But at some point you will face errors that a typed language would have caught immediately, without unit tests (via the IDE's compilation and the types/inference).
Of course this does not remove the benefit of unit tests, static analysis, and assertions. For larger projects I tend to use statically typed languages, because they have very good IDE support (excellent autocompletion, heavy refactoring...). You can still use scripting or a DSL for some subpart of the project.
Something you can use to simplify your code is using the Null Object Design Pattern (to which I was introduced in Python Cookbook).
Roughly, the goal with Null objects is to provide an 'intelligent' replacement for the often used primitive data type None in Python or Null (or Null pointers) in other languages. These are used for many purposes, including the important case where one member of some group of otherwise similar elements is special for whatever reason. Most often this results in conditional statements to distinguish between ordinary elements and the primitive Null value.
This object just eats the lack-of-attribute errors, so you can avoid checking for their existence.
It's nothing more than
class Null(object):
    def __init__(self, *args, **kwargs):
        "Ignore parameters."
        return None

    def __call__(self, *args, **kwargs):
        "Ignore method calls."
        return self

    def __getattr__(self, mname):
        "Ignore attribute requests."
        return self

    def __setattr__(self, name, value):
        "Ignore attribute setting."
        return self

    def __delattr__(self, name):
        "Ignore deleting attributes."
        return self

    def __repr__(self):
        "Return a string representation."
        return "<Null>"

    def __str__(self):
        "Convert to a string and return it."
        return "Null"
With this, if you do Null("any", "params", "you", "want").attribute_that_doesnt_exists() it won't explode, but just silently become the equivalent of pass.
Normally you'd do something like
if obj.attr:
    obj.attr()
With this, you just do:
obj.attr()
and forget about it. Beware, though, that extensive use of the Null object can potentially hide bugs in your code.
I tend to use
if x is None:
    raise ValueError('x cannot be None')
But this will only work with the actual None value.
A more general approach is to test for the necessary attributes before you try to use them. For example:
def write_data(f):
    # Here we expect f is a file-like object. But what if it's not?
    if not hasattr(f, 'write'):
        raise ValueError('write_data requires a file-like object')
    # Now we can do stuff with f that assumes it is a file-like object
The point of this code is that instead of getting an error message like "NoneType has no attribute write", you get "write_data requires a file-like object". The actual bug isn't in write_data(), and isn't really a problem with NoneType at all. The actual bug is in the code that calls write_data(). The key is to communicate that information as directly as possible.
I try to make my code fool-proof, but I've noticed that it takes a lot of time to type things out, and that it takes more time to read the code.
Instead of:
class TextServer(object):
    def __init__(self, text_values):
        self.text_values = text_values
        # <more code>
    # <more methods>
I tend to write this:
class TextServer(object):
    def __init__(self, text_values):
        for text_value in text_values:
            assert isinstance(text_value, basestring), u'All text_values should be str or unicode.'
            assert 2 <= len(text_value), u'All text_values should be at least two characters long.'
        self.__text_values = frozenset(text_values)  # <They shouldn't change.>
        # <more code>

    @property
    def text_values(self):
        # <'text_values' shouldn't be replaced.>
        return self.__text_values

    # <more methods>
Is my python coding style too paranoid? Or is there a way to improve readability while keeping it fool-proof?
Note 1: I've added the comments between < and > just for clarification.
Note 2: The main fool I'm trying to prevent from abusing my code is my future self.
Here's some good advice on Python idioms from this page:
Catch errors rather than avoiding them to avoid cluttering your code with special cases. This idiom is called EAFP ('easier to ask forgiveness than permission'), as opposed to LBYL ('look before you leap'). This often makes the code more readable. For example:
Worse:
# Check whether int conversion will raise an error.
if not isinstance(s, str) or not s.isdigit():
    return None
elif len(s) > 10:    # too many digits for int conversion
    return None
else:
    return int(s)
Better:
try:
    return int(s)
except (TypeError, ValueError, OverflowError):    # int conversion failed
    return None
(Note that in this case, the second version is much better, since it correctly handles leading + and -, and also values between 2 and 10 billion (for 32-bit machines). Don't clutter your code by anticipating all the possible failures: just try it and use appropriate exception handling.)
"Is my python coding style too paranoid? Or is there a way to improve readability while keeping it fool-proof?"
Who's the fool you're protecting yourself from?
You? Are you worried that you didn't remember the API you wrote?
A peer? Are you worried that someone in the next cubicle will actively make an effort to pass the wrong things through an API? You can talk to them to resolve this problem. It saves a lot of code if you provide documentation.
A total sociopath who will download your code, refuse to read the API documentation, and then call all the methods with improper arguments? What possible help can you provide to them?
The "fool-proof" coding isn't really very helpful, since all of these scenarios are more easily addressed another way.
If you're fool-proofing against yourself, perhaps that's not really sensible.
If you're fool-proofing for a co-worker or peer, you should -- perhaps -- talk to them and make sure they understand the API docs.
If you're fool-proofing against some hypothetical sociopathic programmer who's out to subvert the API, there's nothing you can do. It's Python. They have the source. Why would they go through the effort to misuse the API when they can just edit the source to break things?
It's unusual in Python to use a private instance attribute, and then expose it through a property as you have. Just use self.text_values.
Your code is too paranoid (especially when you want to protect only against yourself).
In Python circles, LBYL is generally (but not always) frowned upon. But there's also the (often unstated) assumption that one has (good) unit tests.
Me personally? I think readability is of paramount importance. I mean, if you yourself think it's hard to read, what will others think? Less readable code is also more likely to harbor bugs, not to mention that it makes working on the code harder and more time-consuming (you have to dig to find what the code actually does in all that LBYLing).
If you try to make your code totally foolproof, someone will invent a better fool. Seriously, a good rule of thumb is to guard against likely errors, but don't clutter your code trying to think of every conceivable way a caller could break you.
Instead of spending time on asserts and private variables, I prefer to spend time on documentation and test cases. I prefer to read documentation, and when I have to read code, I prefer to read tests. This becomes more true as the code grows. At the same time, tests give you fool-proof code and useful use cases.
I base the need for error checking code on the consequences of the errors that it's checking for. If crap data gets into my system, how long will it be before I discover it, how hard will it be to determine that the problem was crap data, and how difficult will it be to fix? For cases like the one you've posted, the answers are generally "not long," "not hard," and "not difficult."
But data that's going to be persisted somewhere and then used as the input to a complicated algorithm in six weeks? I'll check the hell out of that.
I don't think this is specific to python. I'm a firm believer in Design by Contract: Ideally all functions should have clear pre- and post-conditions; unfortunately most languages (Eiffel being the canonical exception) don't provide particularly convenient ways to achieve this which contributes to the apparent conflict between clarity and correctness.
As a practical matter, one approach is to write a 'checkValues' method so as to avoid cluttering __init__. You can even compress it to:
def __init__(self, text_values):
    self.text_values = checkValues(text_values)

def checkValues(text_values):
    for text_value in text_values:
        assert isinstance(text_value, basestring), u'All text_values should be str or unicode.'
        assert 2 <= len(text_value), u'All text_values should be at least two characters long.'
    return frozenset(text_values)
Another approach would be using a folding text editor that can hide/show the pre-conditions with the aid of some commenting conventions; this would also be useful for auto-generating documentation.
Take all the energy you're putting into argument checking and channel it instead into writing clear, concise doc-strings.
Unless you're writing code for nuclear reactors; in which case I would appreciate you doing both.
Another way to think is: you only need to catch errors that you can correct. If you're checking the input just for aborting with an AssertionError, you're better off just allowing the code to raise the appropriate exception later so you can debug correctly.
This line in particular is pretty bad since it stops duck-typing:
assert isinstance(text_value, basestring), u'All text_values should be str or unicode.'
I have programming experience with statically typed languages. Now, writing code in Python, I feel difficulties with its readability. Let's say I have a class Host:
class Host(object):
    def __init__(self, name, network_interface):
        self.name = name
        self.network_interface = network_interface
I don't understand from this definition what "network_interface" should be. Is it a string, like "eth0", or is it an instance of a class NetworkInterface? The only way I can think of to solve this is documenting the code with a docstring. Something like this:
class Host(object):
    '''Attributes:
    @name: a string
    @network_interface: an instance of class NetworkInterface'''
Or maybe there are naming conventions for things like that?
Using dynamic languages will teach you something about static languages: all the help you got from the static language that you now miss in the dynamic language, it wasn't all that helpful.
To use your example, in a static language, you'd know that the parameter was a string, and in Python you don't. So in Python you write a docstring. And while you're writing it, you realize you had more to say about it than, "it's a string". You need to say what data is in the string, and what format it should have, and what the default is, and something about error conditions.
And then you realize you should have written all that down for your static language as well. Sure, Java would force you to know that it was a string, but there are all these other details that need to be specified, and you have to manually do that work in any language.
The docstring conventions are at PEP 257.
The example there follows this format for specifying arguments, you can add the types if they matter:
def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    if imag == 0.0 and real == 0.0:
        return complex_zero
    ...
There was also a rejected PEP for docstrings for attributes (rather than constructor arguments).
The most pythonic solution is to document with examples. If possible, state what operations an object must support to be acceptable, rather than a specific type.
class Host(object):
    def __init__(self, name, network_interface):
        """Initialise host with given name and network_interface.

        network_interface -- must support the same operations as NetworkInterface

        >>> network_interface = NetworkInterface()
        >>> host = Host("my_host", network_interface)
        """
        ...
At this point, hook your source up to doctest to make sure your doc examples continue to work in future.
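The usual way to hook a module up to doctest is a couple of lines at the bottom of the file:

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # re-runs every >>> example found in this module's docstrings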
Personally, I've found it very useful to use pylint to validate my code. If you follow pylint's suggestions, your code almost automatically becomes more readable, you improve your Python writing skills, and you respect naming conventions. You can also define your own naming conventions, and so on. It's very useful, especially for a Python beginner. I suggest you use it.
Python, though not as overtly typed as C or Java, is still typed and will throw exceptions if you're doing things with types that simply do not play nice together.
To that end, if you're concerned about your code being used correctly, maintained correctly, etc. simply use docstrings, comments, or even more explicit variable names to indicate what the type should be.
Better yet, include code that will allow it to handle whichever type it may be passed, as long as it yields a usable result.
One benefit of static typing is that types are a form of documentation. When programming in Python, you can document more flexibly and fluently. Of course, in your example you want to say that network_interface should implement NetworkInterface, but in many cases the type is obvious from the context, the variable name, or convention; in those cases, omitting the obvious can produce more readable code. It is common to describe the meaning of a parameter while implicitly giving its type.
For example:
def Bar(foo, count):
    """Bar the foo the given number of times."""
    ...
This describes the function tersely and precisely. What foo is and what barring it means will be obvious from context, and that count is a (positive) integer is implicit.
For your example, I'd just mention the type in the document string:
"""Create a named host on the given NetworkInterface."""
This is shorter, more readable, and contains more information than a listing of the types.
I would be interested in knowing what the StackOverflow community thinks are the important language features (idioms) of Python. Features that would define a programmer as Pythonic.
Python (pythonic) idiom - a "code expression" that is natural or characteristic of the Python language.
Plus, which idioms should all Python programmers learn early on?
Thanks in advance
Related:
Code Like a Pythonista: Idiomatic Python
Python: Am I missing something?
Python is a language that can be described as "rules you can fit in the palm of your hand with a huge bag of hooks".
Nearly everything in Python follows the same simple standards. Everything is accessible, changeable, and tweakable. There are very few language-level elements.
Take, for example, the len(data) builtin function. len(data) works by simply checking for a data.__len__() method, then calling it and returning the value. That way, len() can work on any object that implements a __len__() method.
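A quick illustration with a made-up class: implement __len__ and the builtin just works.

class Deck(object):
    def __init__(self):
        self.cards = list(range(52))
    def __len__(self):
        return len(self.cards)

print(len(Deck()))   # 52 -- len() simply calls Deck().__len__()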
Start by learning about the types and basic syntax:
Dynamic Strongly Typed Languages
bool, int, float, string, list, tuple, dict, set
statements, indenting, "everything is an object"
basic function definitions
Then move on to learning about how python works:
imports and modules (really simple)
the python path (sys.path)
the dir() function
__builtins__
Once you have an understanding of how to fit pieces together, go back and cover some of the more advanced language features:
iterators
overrides like __len__ (there are tons of these)
list comprehensions and generators
classes and objects (again, really simple once you know a couple rules)
python inheritance rules
And once you have a comfort level with these items (with a focus on what makes them pythonic), look at more specific items:
Threading in python (note the Global Interpreter Lock)
context managers
database access
file IO
sockets
etc...
And never forget The Zen of Python (by Tim Peters)
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
This page covers all the major python idioms: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
An important idiom in Python is docstrings.
Every object has a __doc__ attribute that can be used to get help on that object. You can set the __doc__ attribute on modules, classes, methods, and functions like this:
# this is m.py
""" module docstring """

class c:
    """class docstring"""
    def m(self):
        """method docstring"""
        pass

def f(a):
    """function f docstring"""
    return
Now, when you type help(m), help(m.f) etc. it will print the docstring as a help message.
Because it's just part of normal object introspection, this can be used by documentation-generating systems like epydoc, or for testing purposes by unittest.
It can also be put to more unconventional (i.e. non-idiomatic) uses such as grammars in Dparser.
Where it gets even more interesting to me is that, even though __doc__ is a read-only attribute on most objects, you can use them anywhere like this:
x = 5
""" pseudo docstring for x """
and documentation tools like epydoc can pick them up and format them properly (as opposed to a normal comment, which stays inside the code formatting).
Decorators get my vote. Where else can you write something like:
def trace(num_args=0):
    def wrapper(func):
        def new_f(*a, **k):
            print_args = ''
            if num_args > 0:
                print_args = ','.join(str(x) for x in a[0:num_args])
            print('entering %s(%s)' % (func.__name__, print_args))
            rc = func(*a, **k)
            if rc is not None:
                print('exiting %s(%s)=%s' % (func.__name__, print_args, str(rc)))
            else:
                print('exiting %s(%s)' % (func.__name__, print_args))
            return rc
        return new_f
    return wrapper

@trace(1)
def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)

factorial(5)
and get output like:
entering factorial(5)
entering factorial(4)
entering factorial(3)
entering factorial(2)
entering factorial(1)
entering factorial(0)
exiting factorial(0)=1
exiting factorial(1)=1
exiting factorial(2)=2
exiting factorial(3)=6
exiting factorial(4)=24
exiting factorial(5)=120
Everything connected to list usage.
Comprehensions, generators, etc.
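A few stock illustrations:

squares = [x * x for x in range(10)]                    # list comprehension
evens = (x for x in range(10**6) if x % 2 == 0)         # generator: lazy, no big list
first = next(evens)                                     # items produced on demand
lengths = {w: len(w) for w in ('ham', 'spam', 'eggs')}  # dict comprehension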
Personally, I really like Python's syntax of defining code blocks by using indentation, and not with the words "BEGIN" and "END" (as in Microsoft's Basic and Visual Basic - I don't like those) or with left and right braces (as in C, C++, Java, Perl - I like those).
This really surprised me because, although indentation has always been very important to me, I didn't make too much "noise" about it - I lived with it, and it is considered a skill to be able to read other people's "spaghetti" code. Furthermore, I never heard another programmer suggest making indentation a part of a language. Until Python! I only wish I had realized this idea first.
To me, it is as if Python's syntax forces you to write good, readable code.
Okay, I'll get off my soap-box. ;-)
From a more advanced viewpoint, understand how dictionaries are used internally by Python. Classes, functions, modules, and references are all just properties on a dictionary. Once this is understood, it's easy to understand how to monkey-patch and use the powerful __getattr__, __setattr__, and __call__ methods.
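A small illustration (with a hypothetical class): methods are just entries in the class's __dict__, so monkey patching is plain dictionary surgery.

class Greeter(object):
    def hello(self):
        return "hello"

g = Greeter()
print(Greeter.__dict__['hello'])    # the plain function object behind the method
Greeter.hello = lambda self: "hi"   # monkey patch: rebind the class attribute
print(g.hello())                    # "hi" -- lookup goes through the class dict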
Here's one that can help. What's the difference between:
[ foo(x) for x in range(0, 5) ][0]
and
next(foo(x) for x in range(0, 5))
answer:
in the second example, foo is called only once. This may be important if foo has a side effect, or if the iterable being used to construct the list is large.
Two things that struck me as especially Pythonic were dynamic typing and the various flavors of lists used in Python, particularly tuples.
Python's list obsession could be said to be LISP-y, but it's got its own unique flavor. A line like:
return HandEvaluator.StraightFlush, (PokerCard.longFaces[index + 4],
                                     PokerCard.longSuits[flushSuit]), []
or even
return False, False, False
just looks like Python and nothing else. (Technically, you'd see the latter in Lua as well, but Lua is pretty Pythonic in general.)
Using string substitutions:
name = "Joe"
age = 12
print "My name is %s, I am %s" % (name, age)
When I'm not programming in python, that simple use is what I miss most.
Another thing you cannot start early enough is probably testing. Here especially, doctests are a great way of testing your code while explaining it at the same time.
doctests are simple text files containing an interactive interpreter session plus text, like this:
Let's instantiate our class::

    >>> a = Something(text="yes")
    >>> a.text
    'yes'

Now call this method and check the results::

    >>> a.canify()
    >>> a.text
    'yes, I can'
If, e.g., a.text returns something different, the test will fail.
doctests can live inside docstrings or in standalone text files, and are executed using the doctest module. Of course the better-known unit tests are also available.
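Running a standalone doctest file takes two lines (the filename here is hypothetical):

import doctest
doctest.testfile("our_class_session.txt")   # replays the >>> examples in the file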
I think that online tutorials and books only talk about doing things, not about doing things in the best way. Along with the Python syntax, I think that speed, in some cases, is important.
Python provides a way to benchmark functions, actually two!!
One way is to use the profile module, like so:
import profile

def foo(x, y, z):
    return x**y % z  # Just an example.

profile.run('foo(5, 6, 3)')
Another way to do this is to use the timeit module, like this:
import timeit

def foo(x, y, z):
    return x**y % z  # Can also be 'pow(x, y, z)', which is way faster.

timeit.timeit('foo(5, 6, 3)', 'from __main__ import *', number=100)
# timeit.timeit(testcode, setupcode, number=number_of_iterations)