Ruby's tap idiom in Python

Ruby's tap idiom in Python - python

There is a useful Ruby idiom that uses tap which allows you to create an object, do some operations on it and return it (I use a list here only as an example, my real code is more involved):
def foo
[].tap do |a|
b = 1 + 2
# ... and some more processing, maybe some logging, etc.
a << b
end
end
>> foo
=> [1]
With Rails there's a similar method called returning, so you can write:
def foo
returning([]) do |a|
b = 1 + 2
# ... and some more processing, maybe some logging, etc.
a << b
end
end
which speaks for itself. No matter how much processing you do on the object, it's still clear that it's the return value of the function.
In Python I have to write:
def foo():
a = []
b = 1 + 2
# ... and some more processing, maybe some logging, etc.
a.append(b)
return a
and I wonder if there is a way to port this Ruby idiom into Python. My first thought was to use with statement, but return with is not valid syntax.

Short answer: Ruby encourages method chaining, Python doesn't.
I guess the right question is: What is Ruby's tap useful for?
Now I don't know a lot about Ruby, but by googling I got the impression that tap is conceptually useful as method chaining.
In Ruby, the style: SomeObject.doThis().doThat().andAnotherThing() is quite idiomatic. It underlies the concept of fluent interfaces, for example. Ruby's tap is a special case of this where instead of having SomeObject.doThis() you define doThis on the fly.
Why I am explaining all this? Because it tells us why tap doesn't have good support in Python. With due caveats, Python doesn't do call chaining.
For example, Python list methods generally return None rather than returning the mutated list. Functions like map and filter are not list methods. On the other hand, many Ruby array methods do return the modified array.
Other than certain cases like some ORMs, Python code doesn't use fluent interfaces.
In the end it is the difference between idiomatic Ruby and idiomatic Python. If you are going from one language to the other you need to adjust.

You can implement it in Python as follows:
def tap(x, f):
f(x)
return x
Usage:
>>> tap([], lambda x: x.append(1))
[1]
However it won't be so much use in Python 2.x as it is in Ruby because lambda functions in Python are quite restrictive. For example you can't inline a call to print because it is a keyword, so you can't use it for inline debugging code. You can do this in Python 3.x although it isn't as clean as the Ruby syntax.
>>> tap(2, lambda x: print(x)) + 3
2
5

If you want this bad enough, you can create a context manager
class Tap(object):
def __enter__(self, obj):
return obj
def __exit__(*args):
pass
which you can use like:
def foo():
with Tap([]) as a:
a.append(1)
return a
There's no getting around the return statement and with really doesn't do anything here. But you do have Tap right at the start which clues you into what the function is about I suppose. It is better than using lambdas because you aren't limited to expressions and can have pretty much whatever you want in the with statement.
Overall, I would say that if you want tap that bad, then stick with ruby and if you need to program in python, use python to write python and not ruby. When I get around to learning ruby, I intend to write ruby ;)

I had an idea to achieve this using function decorators, but due to the distinction in python between expressions and statements, this ended up still requiring the return to be at the end.
The ruby syntax is rarely used in my experience, and is far less readable than the explicit python approach. If python had implicit returns or a way to wrap multiple statements up into a single expression then this would be doable - but it has neither of those things by design.
Here's my - somewhat pointless - decorator approach, for reference:
class Tapper(object):
def __init__(self, initial):
self.initial = initial
def __call__(self, func):
func(self.initial)
return self.initial
def tap(initial):
return Tapper(initial)
if __name__ == "__main__":
def tapping_example():
#tap([])
def tapping(t):
t.append(1)
t.append(2)
return tapping
print repr(tapping_example())

I partly agree with others in that it doesn't make much sense to implement this in Python. However, IMHO, Mark Byers's way is the way, but why lambdas(and all that comes with them)? can't you write a separate function to be called when needed?
Another way to do basically the same could be
map(afunction(), avariable)
but this beautiful feature is not a built-in in Python 3, I hear.

Hardly any Ruby programmers use tap in this way. In fact, all top Ruby programmers i know say tap has no use except in debugging.
Why not just do this in your code?
[].push(1)
and remember Array supports a fluent interface, so you can even do this:
[].push(1).push(2)

Related

Reason for parentheses after read, truncate and close commands?

Could someone explain to me what the parentheses after the
.read(), .truncate()and .close() commands are doing?
I'm confused by why they're empty and why it isn't written as:
read(filename), truncate(filename), and close(filename) rather than
filename.read(), filename.truncate(), and filename.close()?
But open is written as open(filename) and not filename.open()

Those parentheses () are for actually calling a certain function instead of merely referring to it.
Consider for example:
def func():
return "I am being called!"
print(func)
# <function func at 0x7f41f59a6b70>
print(func())
# I am being called!
Regarding the: func(x) versus x.func() syntax, this is a design / stylistic choice.
The first syntax is associated to procedural design / style, while the second syntax is associated to explicitly object-oriented design / style.
Python (as well as many other general purpose languages) support both.
When func was designed as:
def func(x):
...
you would use the procedural style.
When func is designed as:
class MyX(object):
def func(self):
...
you would use the object-oriented style.
Note that this would require x to be of type MyX (or a sub-class of it).
More info is available in Python tutorials, e.g. here.
Final note: in some programming languages (notably D and nim) there is the concept of Uniform Function Call Syntax, which basically allows you to write function and then have it called with whichever syntax (procedural or object-oriented) you prefer.

Is there a built-in "apply" function (like "lambda f: f()") as there used to be in Python 2?

Looking at this question, I realised that it is kind of awkward to use multiprocessing's Pool.map if you want is to run a list of functions in parallel:
from multiprocessing import Pool
def my_fun1(): return 1
def my_fun2(): return 2
def my_fun3(): return 3
with Pool(3) as p:
one, two, three = p.map(lambda f: f(), [my_fun1, my_fun2, my_fun3])
I'm not saying it is exactly cryptic, but I think I expected some conventional name for this, even if only within functools or something, similarly to apply/call in JavaScript (yes, I know JavaScript didn't have lambdas at the time those functions were defined, and no, I'm not saying JavaScript is an exemplary programming language, just an example). In fact, I definitely think something like this should be present in operator, but (unless my eyes deceive me) it seems to be absent. I read that in the case of the identity function the resolution was to let people define their own trivial functions, and I understand it better in that case because there are a couple of different variations you may want, but this one feels like a missing bit to me.
EDIT: As pointed out in the comments, Python 2 used to have an apply function for this purpose.

First, let's look at the practical question.
For any Python from 2.3 on, you can trivially write not just your no-argument apply, but a perfect-forwarding apply, as a one-liner, as explained in the 2.x docs for apply:
The use of apply() is equivalent to function(*args, **keywords)
In other words:
def apply(function, *args, **keywords):
return function(*args, **keywords)
… or, as an inline lambda:
lambda f, *a, **k: f(*a, **kw)
Of course the C implementation was a bit faster, but this is almost never relevant.1
If you're going to be using this more than once, I think defining the function out-of-line and reusing it by name is probably clearer, but the lamdba version is simple and obvious enough (even more so for your no-args use case) that I can't imagine anyone complaining about it.
Also, notice that this is actually more trivial than identity if you understand what you're doing, not less. With identity, it's ambiguous what you should return with multiple arguments (or keyword arguments), so you have to decide which behavior you want; with apple, there's only one obvious answer, and it's pretty much impossible to get wrong.
As for the history:
Python, like JavaScript, originally had no lambda. It's hard to dig up linkable docs for versions before 2.6, and hard to even find them before 2.3, but I think lambda was added in 1.5, and eventually reached the point where it could be used for perfect forwarding around 2.2. Before then, the docs recommended using apply for forwarding, but after that, the docs recommended using lambda in place of apply. In fact, there was no longer any recommended use of apply.
So in 2.3, the function was deprecated.2
During the Python-3000 discussions that led to 3.0, Guido suggested that all of the "functional programming" functions except maybe map and filter were unnecessary.3 Others made good cases for reduce and partial.4 But a big part of the case was that they're actually not trivial to write (in fully-general form), and easy to get wrong. That isn't true for apply. Also, people were able to find relevant uses of reduce and partial in real-world codebases, but the only uses of apply anyone could find were old pre-2.3 code. In fact, it was so rare that it wasn't even worth making the 2to3 tool transform calls to apply.
The final rationale for removing it was summarized in PEP 3100:
apply(): use f(*args, **kw) instead [2]
That footnote links to an essay by Guido called "Python Regrets", which is now a 404 link. The accompanying PowerPoint presentation is still available, however, or you can view an HTML flipbook of the presentation he wrote it for. But all it really says is the same one-liner, and IIRC, the only further discussion was "We already effectively got rid of it in 2.3."
1. In most idiomatic Python code that has to apply a function, the work inside that function is pretty heavy. In your case, of course, the overhead of calling the functions (pickling arguments and passing them over a pipe) is even heavier. The one case where it would matter is when you're doing "Haskell-style functional programming" instead of "Lisp-style"—that is, very few function definitions, and lots of functions made by transforming functions and composing the results. But that's already so slow (and stack-heavy) in Python that it's not a reasonable thing to do. (Flat use of decorators to apply a wrapper or three works great, but a potentially unbounded chain of wrappers will kill your performance.)
2. The formal deprecation mechanism didn't exist yet, so it was just moved to a "Non-essential Built-in Functions" section in the docs. But it was retroactively considered to be deprecated since 2.3, as you can see in the 2.7 docs.
3. Guido originally wanted to get rid of even them; the argument was that list comprehensions can do the same job better, as you can see in the "Regrets" flipbook. But promoting itertools.imap in place of map means it could be made lazy, like the new zip, and therefore better than comprehensions. I'm not sure why Guido didn't just make the same argument with generator expressions.
4. I'm not sure Guido himself was ever convinced for reduce, but the core devs as a whole were.

It sort of is in operator if you do one line of extra work:
>>> def foo():
... print 'hi'
...
>>> from operator import methodcaller
>>> call = methodcaller('__call__')
>>> call(foo)
hi
Of course, call = lambda f: f() is only one line as well...

How to check if a function is pure in Python?

A pure function is a function similar to a Mathematical function, where there is no interaction with the "Real world" nor side-effects. From a more practical point of view, it means that a pure function can not:
Print or otherwise show a message
Be random
Depend on system time
Change global variables
And others
All this limitations make it easier to reason about pure functions than non-pure ones. The majority of the functions should then be pure so that the program can have less bugs.
In languages with a huge type-system like Haskell the reader can know right from the start if a function is or is not pure, making the successive reading easier.
In Python this information may be emulated by a #pure decorator put on top of the function. I would also like that decorator to actually do some validation work. My problem lies in the implementation of such a decorator.
Right now I simply look the source code of the function for buzzwords such as global or random or print and complains if it finds one of them.
import inspect
def pure(function):
source = inspect.getsource(function)
for non_pure_indicator in ('random', 'time', 'input', 'print', 'global'):
if non_pure_indicator in source:
raise ValueError("The function {} is not pure as it uses `{}`".format(
function.__name__, non_pure_indicator))
return function
However it feels like a weird hack, that may or may not work depending on your luck, could you please help me in writing a better decorator?

I kind of see where you are coming from but I don't think this can work. Let's take a simple example:
def add(a,b):
return a + b
So this probably looks "pure" to you. But in Python the + here is an arbitrary function which can do anything, just depending on the bindings in force when it is called. So that a + b can have arbitrary side effects.
But it's even worse than that. Even if this is just doing standard integer + then there's more 'impure' stuff going on.
The + is creating a new object. Now if you are sure that only the caller has a reference to that new object then there is a sense in which you can think of this as a pure function. But you can't be sure that, during the creation process of that object, no reference to it leaked out.
For example:
class RegisteredNumber(int):
numbers = []
def __new__(cls,*args,**kwargs):
self = int.__new__(cls,*args,**kwargs)
self.numbers.append(self)
return self
def __add__(self,other):
return RegisteredNumber(super().__add__(other))
c = RegisteredNumber(1) + 2
print(RegisteredNumber.numbers)
This will show that the supposedly pure add function has actually changed the state of the RegisteredNumber class. This is not a stupidly contrived example: in my production code base we have classes which track each created instance, for example, to allow access via key.
The notion of purity just doesn't make much sense in Python.

(not an answer, but too long for a comment)
So if a function can return different values for the same set of arguments, it is not pure?
Remember that functions in Python are objects, so you want to check the purity of an object...
Take this example:
def foo(x):
ret, foo.x = x*x+foo.x, foo.x+1
return ret
foo.x=0
calling foo(3) repeatedly gives:
>>> foo(3)
9
>>> foo(3)
10
>>> foo(3)
11
...
Moreover, reading globals does not require to use the global statement, or the global() builtin inside your function. Global variables might change somewhere else, affecting the purity of your function.
All the above situation might be difficult to detect at runtime.

Recipe for anonymous functions in python?

I'm looking for the best recipie to allow inline definition of functions, or multi-line lambda, in python.
For example, I'd like to do the following:
def callfunc(func):
func("Hello")
>>> callfunc(define('x', '''
... print x, "World!"
... '''))
Hello World!
I've found an example for the define function in this answer:
def define(arglist, body):
g = {}
exec("def anonfunc({0}):\n{1}".format(
arglist,
"\n".join(" {0}".format(line) for line in body.splitlines())), g)
return g["anonfunc"]
This is one possible solution, but it is not ideal. Desireable features would be:
be smarter about indentation,
hide the innards better (e.g. don't have anonfunc in the function's scope)
provide access to variables in the surrounding scope / captures
better error handling
and some things I haven't thought of. I had a really nice implementation once that did most of the above, but I lost in unfortunately. I'm wondering if someone else has made something similar.
Disclaimer:
I'm well aware this is controversial among Python users, and regarded as a hack or unpythonic. I'm also aware of the discussions regaring multi-line-lambdas on the python-dev mailing list, and that a similar feature was omitted on purpose. However, from the same discussions I've learned that there is also interest in such a function by many others.
I'm not asking whether this is a good idea or not, but instead: Given that one has decided to implement this, (either out of fun and curiosity, madness, genuinely thinking this is a nice idea, or being held at gunpoint) how to make anonymous define work as close as possible to def using python's (2.7 or 3.x) current facilities?
Examples:
A bit more as to why, this can be really handy for callbacks in GUIs:
# gtk example:
self.ntimes = 0
button.connect('clicked', define('*a', '''
self.ntimes += 1
label.set_text("Button has been clicked %d times" % self.ntimes)
''')
The benefit over defining a function with def is that your code is in a more logical order. This is simplified code taken from a Twisted application:
# twisted example:
def sayHello(self):
d = self.callRemote(HelloCommand)
def handle_response(response):
# do something, this happens after (x)!
pass
d.addCallback(handle_response) # (x)
Note how it seems out of order. I usually break stuff like this up, to keep the code order == execution order:
def sayHello_d(self):
d = self.callRemote(HelloCommand)
d.addCallback(self._sayHello_2)
return d
def _sayHello_2(self, response):
# handle response
pass
This is better wrt. ordering but more verbose. Now, with the anonymous functions trick:
d = self.callRemote(HelloCommand)
d.addCallback(define('response', '''
print "callback"
print "got response from", response["name"]
'''))

If you come from a javascript or ruby background, python's abilities to deal with anonymous functions may indeed seem limited, but this is for a reason. Python designers decided that clarity of code is more important than conciseness. If you don't like that, you probably don't like python at all. There's nothing wrong about that, there are many other choices - why not to try a language that tastes better to you?
Putting chunks of code into strings and interpreting them on the fly is definitely a wrong way to "extend" a language, just because none of tools you're working with - from syntax highlighters to the python interpreter itself - would be able to deal with "stringified" code in a sensible way.
To answer the question as asked: what you're doing there is essentially an attempt to construct some better-than-python programming language and compile it to python on the fly. The idea is not new in the world of scripting languages and can be productive or not (CoffeeScript is an example of a successful implementation), but your very approach is wrong. format() not the tool you're looking for when working with code. If you're writing a compiler, do it properly: use a parser (e.g. pyparsing) to read your code in an AST, walk through the AST to generate python code (or even bytecode), catch syntax errors as you go and take measures to provide better runtime feedback (e.g. error context, line numbers etc). Finally, make sure your compiler works across different python versions and implementations.
Or just use ruby.

What are the important language features (idioms) of Python to learn early on [duplicate]

This question already has answers here:
The Zen of Python [closed]
(22 answers)
Python: Am I missing something? [closed]
(16 answers)
Closed 8 years ago.
I would be interested in knowing what the StackOverflow community thinks are the important language features (idioms) of Python. Features that would define a programmer as Pythonic.
Python (pythonic) idiom - "code expression" that is natural or characteristic to the language Python.
Plus, Which idioms should all Python programmers learn early on?
Thanks in advance
Related:
Code Like a Pythonista: Idiomatic Python
Python: Am I missing something?

Python is a language that can be described as:
"rules you can fit in the
palm of your hand with a huge bag of
hooks".
Nearly everything in python follows the same simple standards. Everything is accessible, changeable, and tweakable. There are very few language level elements.
Take for example, the len(data) builtin function. len(data) works by simply checking for a data.__len__() method, and then calls it and returns the value. That way, len() can work on any object that implements a __len__() method.
Start by learning about the types and basic syntax:
Dynamic Strongly Typed Languages
bool, int, float, string, list, tuple, dict, set
statements, indenting, "everything is an object"
basic function definitions
Then move on to learning about how python works:
imports and modules (really simple)
the python path (sys.path)
the dir() function
__builtins__
Once you have an understanding of how to fit pieces together, go back and cover some of the more advanced language features:
iterators
overrides like __len__ (there are tons of these)
list comprehensions and generators
classes and objects (again, really simple once you know a couple rules)
python inheritance rules
And once you have a comfort level with these items (with a focus on what makes them pythonic), look at more specific items:
Threading in python (note the Global Interpreter Lock)
context managers
database access
file IO
sockets
etc...
And never forget The Zen of Python (by Tim Peters)
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

This page covers all the major python idioms: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html

An important idiom in Python is docstrings.
Every object has a __doc__ attribute that can be used to get help on that object. You can set the __doc__ attribute on modules, classes, methods, and functions like this:
# this is m.py
""" module docstring """
class c:
"""class docstring"""
def m(self):
"""method docstring"""
pass
def f(a):
"""function f docstring"""
return
Now, when you type help(m), help(m.f) etc. it will print the docstring as a help message.
Because it's just part of normal object introspection this can be used by documention generating systems like epydoc or used for testing purposes by unittest.
It can also be put to more unconventional (i.e. non-idiomatic) uses such as grammars in Dparser.
Where it gets even more interesting to me is that, even though doc is a read-only attribute on most objects, you can use them anywhere like this:
x = 5
""" pseudo docstring for x """
and documentation tools like epydoc can pick them up and format them properly (as opposed to a normal comment which stays inside the code formatting.

Decorators get my vote. Where else can you write something like:
def trace(num_args=0):
def wrapper(func):
def new_f(*a,**k):
print_args = ''
if num_args > 0:
print_args = str.join(',', [str(x) for x in a[0:num_args]])
print('entering %s(%s)' %(f.__name__,print_args))
rc = f(*a,**k)
if rc is not None:
print('exiting %s(%s)=%s' %(f.__name__,str(rc)))
else:
print('exiting %s(%s)' %(f.__name__))
return rc
return new_f
return wrapper
#trace(1)
def factorial(n):
if n < 2:
return 1
return n * factorial(n-1)
factorial(5)
and get output like:
entering factorial(5)
entering factorial(4)
entering factorial(3)
entering factorial(2)
entering factorial(1)
entering factorial(0)
exiting factorial(0)=1
exiting factorial(1)=1
exiting factorial(2)=2
exiting factorial(3)=6
exiting factorial(4)=24
exiting factorial(5)=120

Everything connected to list usage.
Comprehensions, generators, etc.

Personally, I really like Python syntax defining code blocks by using indentation, and not by the words "BEGIN" and "END" (as in Microsoft's Basic and Visual Basic - I don't like these) or by using left- and right-braces (as in C, C++, Java, Perl - I like these).
This really surprised me because, although indentation has always been very important to me, I didn't make to much "noise" about it - I lived with it, and it is considered a skill to be able to read other peoples, "spaghetti" code. Furthermore, I never heard another programmer suggest making indentation a part of a language. Until Python! I only wish I had realized this idea first.
To me, it is as if Python's syntax forces you to write good, readable code.
Okay, I'll get off my soap-box. ;-)

From a more advanced viewpoint, understanding how dictionaries are used internally by Python. Classes, functions, modules, references are all just properties on a dictionary. Once this is understood it's easy to understand how to monkey patch and use the powerful __gettattr__, __setattr__, and __call__ methods.

Here's one that can help. What's the difference between:
[ foo(x) for x in range(0, 5) ][0]
and
( foo(x) for x in range(0, 5) ).next()
answer:
in the second example, foo is called only once. This may be important if foo has a side effect, or if the iterable being used to construct the list is large.

Two things that struck me as especially Pythonic were dynamic typing and the various flavors of lists used in Python, particularly tuples.
Python's list obsession could be said to be LISP-y, but it's got its own unique flavor. A line like:
return HandEvaluator.StraightFlush, (PokerCard.longFaces[index + 4],
PokerCard.longSuits[flushSuit]), []
or even
return False, False, False
just looks like Python and nothing else. (Technically, you'd see the latter in Lua as well, but Lua is pretty Pythonic in general.)

Using string substitutions:
name = "Joe"
age = 12
print "My name is %s, I am %s" % (name, age)
When I'm not programming in python, that simple use is what I miss most.

Another thing you cannot start early enough is probably testing. Here especially doctests are a great way of testing your code by explaining it at the same time.
doctests are simple text file containing an interactive interpreter session plus text like this:
Let's instantiate our class::
>>> a=Something(text="yes")
>>> a.text
yes
Now call this method and check the results::
>>> a.canify()
>>> a.text
yes, I can
If e.g. a.text returns something different the test will fail.
doctests can be inside docstrings or standalone textfiles and are executed by using the doctests module. Of course the more known unit tests are also available.

I think that tutorials online and books only talk about doing things, not doing things in the best way. Along with the python syntax i think that speed in some cases is important.
Python provides a way to benchmark functions, actually two!!
One way is to use the profile module, like so:
import profile
def foo(x, y, z):
return x**y % z # Just an example.
profile.run('foo(5, 6, 3)')
Another way to do this is to use the timeit module, like this:
import timeit
def foo(x, y, z):
return x**y % z # Can also be 'pow(x, y, z)' which is way faster.
timeit.timeit('foo(5, 6, 3)', 'from __main__ import *', number = 100)
# timeit.timeit(testcode, setupcode, number = number_of_iterations)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.