Is generator faster than while loop in python?

Is generator faster than while loop in python? - python

Question is simple, I have following code that does the same thing in python2:
for _ in range(n): # or xrange(),they have similar performance according to my test
pass
i = 0
while i < n:
i+=1
pass
the for loop is faster than the while loop, when n = 1000000, each takes roughly 0.105544 and 0.2389421
on the surface it looks like while loop is doing the increment and boundary check, but as far as I know, the generator or iterator has to perform the same amount of hard work, so if the work done is the same, why is one faster than another?
from python generator wiki
def generator(n):
i = 0
while i < n:
yield i
i += 1
in the case of an iterator, there is usually a member function called next, and every time it is called, it will return the "next item in the iterable", to me this means a lot of function calls, thus huge overhead on stack (more assembly code to do push and pop stack) and based on my knowledge on coroutine (generator), it trys to circumvent this by creating a new separated stack (just like thread, it manages its own program counter), although it will no longer deal with tons of function calls, it bears the same problem as thread, namely overhead of context switch.
How can the while loop be slower when it does not face any of the overheads I mentioned above?

I expect the performance difference you're seeing has to do with what parts of the code are defined in Python and which are defined inside the interpreter (in C, for cpython). The calls to next in the for loop case, for instance, are going to be handled in C, and for a range or other built-in iterable, the implementation of the function will also be in C, so it may be pretty fast. The bounds check on the while loop on the other hand is a Python expression, which needs to be evaluated on each pass of the loop. Python code is almost always going to be slower than C code, so it's not too shocking that a for loop may be faster than a while loop in some situations.
Note however that both kinds of loops are probably much faster than any sort of useful work you might be doing inside of them. It is almost never worth focusing your efforts on the very small performance differences between different kinds of loops like this, rather than on larger issues like the complexity of your algorithms or the efficiency of your data structures.
The only exception might be if you've done a bunch of profiling of your code and found that a specific loop is the greatest performance bottleneck for your particular program. If that's the case, micro-optimize to your heart's content.

Related

What actually happens when setting parallel=True in #njit numba?

Could someone please explain roughly what happens when one runs a #njit-ted python function which contains a nested for loop (each iteration from each of the loops is independent of the others) and sets parallel=True and puts prange instead of range?
#njit(parallel=True)
def f():
C = np.empty((80, 20, 18), dtype=np.complex128)
for i in prange(80):
for j in prange(20):
for k in range(18):
C[i, j, k] = do_smth(i, j, k) # where do_smth(i, j, k) is #njit-ted and will further call other functions
Similarly, what happens when using prange only for the outermost loop? (i.e. letting for j in range(20): ... )
I understand what a thread is and I put NUMBA_NUM_THREADS (the environmental variable) to be the number of cores of the processor.
I did some profiling using the timeit module and it seems that the parallel=True keyword only slows the execution of the f() function when the .py script is called on a machine with 20 cores (by a considerable amount (even 4 times slower)).
f() above further calls more functions (first one being do_smth()) also having their structure resembling the f()'s (nested for loops which, at each of their iterations, call other #njit-ted functions) structure.
I checked them as above. Is my approach good? I.e. to profile them timeit and changing the keywords params inside their #njit decorator (I played with parallel, fastmath and nogil) and creating a table in which I note the execution times. My aim was to find the best execution time from the results I obtain.

Could someone please explain roughly what happens when one runs a #njit-ted python function which contains a nested for loop and sets parallel=True and puts prange instead of range?
This is explained in the documentation, but basically, when parallel=True is set, prange split the loop iteration in blocks so they are executed in multiple threads. The exact scheduling is dependent of the underlying parallel runtime (eg. TBB, OpenMP, etc.). The loops is analyzed by Numba so to know whether a reduction is needed or not (not all patterns are allowed). It can also fuse parallel loops if needed (though it does not work on my machine with Numba 0.55.2, even on trivial reduction loops: only the outer loop is parallelized). Note that its takes time to create threads and the bigger the number of core, the slower it is. This is why multi-threaded computations should last for a relatively long time so for multiple threads to be useful.
Similarly, what happens when using prange only for the outermost loop? (i.e. letting for j in range(20): ... )
In theory, it is generally better to specify more parallelism. In practice, it is not always useful and sometimes even detrimental because the runtime can use inefficient methods (loop fusion can cause slow modulus to be used with some OpenMP backends).
If you use it only on the outer i-based loop, then only this loop is parallelized (using all the cores by default, so 4 iterations per loop if a static schedule is selected by the backend).
I did some profiling using the timeit module and it seems that the parallel=True keyword only slows the execution of the f() function when the .py script is called on a machine with 20 cores (by a considerable amount (even 4 times slower)).
Parallel programming is not easy. At least, far more than most people think. This is why researcher teams worked on it for decades and it is still an active field of research.
There are many effects that can be responsible for this, including:
Allocator contention (very frequent)
Undefined behavior in the code (frequent): typically a race condition (example)
False-sharing (quite frequent)
NUMA effects (eg. access to remote pages)
Other resource saturation (eg. memory) though it generally make the code barely scale and do not cause a slowdown (unless there is a contention)
A bug in Numba (quite rare)
Also note that the first call cause the function to be compiled so it is slower (and parallel codes are even slower to compile).

Benefits of using enumerate?

I'm a beginner at Python.
I'm wondering is enumerate a more efficient way of doing this? Or does it not matter so much here, and really only comes into play when doing more complex things?
My code without enumerate:
for x in thing:
if thing.index(x) % 2 == 0:
x += 7
print (x)
else:
print (x)
and my code using enumerate:
for index,x in enumerate(thing):
if index % 2 == 0:
x += 7
print (x)
else:
print (x)

list.index has a complexity of O(n) which means you'll be traversing the list more than twice (also considering the for loop itself), and it returns the first index of a given item, which means you'll get incorrect results for lists with duplicate items.
enumerate solves this by simply generating the indices and the items on the fly; I don't think you can get more performance than what the builtin enumerate provides.
Also keep in mind that enumerate is evaluated lazily; a huge plus for large lists. On the contrary, you don't want to call the index method of a large list, even if there were no duplicates in the list and the results were correct, you'll still be making unnecesary traversals across the list.

If you're wondering about efficiency, there are several tools you can use to check which solution/algorithm is more efficient. This is called profiling.
The first aim of profiling is to test a representative system to identify what’s slow (or using too much RAM, or causing too much disk I/O or network I/O).
Profiling typically adds an overhead (10x to 100x slowdowns can be typical), and you still want your code to be used as similarly to in a real-world situation as possible. Extract a test case and isolate the piece of the system that you need to test. Preferably, it’ll have been written to be in its own set of modules already.
Basic techniques include the %timeit magic in IPython, time.time(), and a timing decorator (see example below). You can use these techniques to understand the behavior of statements and functions.
Then you have cProfile which will give you a high-level view of the problem so you can direct your attention to the critical functions.
Next, look at line_profiler, which will profile your chosen functions on a line-by-line basis. The result will include a count of the number of times each line is called and the percentage of time spent on each line. This is exactly the information you need to understand what’s running slowly and why.
perf stat helps you understand the number of instructions that are ultimately executed on a CPU and how efficiently the CPU’s caches are utilized. This allows for advanced-level tuning of matrix operations.
heapy can track all of the objects inside Python’s memory. This is great for hunting down strange memory leaks. If you’re working with long-running systems,
then dowser will interest you: it allows you to introspect live objects in a long-running process via a web browser interface.
To help you understand why your RAM usage is high, check out memory_profiler. It is particularly useful for tracking RAM usage over time on a labeled chart, so you can explain to colleagues (or yourself) why certain functions use more RAM than expected.
Example: Defining a decorator to automate timing measurements
from functools import wraps
def timefn(fn):
#wraps(fn)
def measure_time(*args, **kwargs):
t1 = time.time()
result = fn(*args, **kwargs)
t2 = time.time()
print ("#timefn:" + fn.func_name + " took " + str(t2 - t1) + " seconds")
return result
return measure_time
#timefn
def your_func(var1, var2):
...
For more information, I suggest reading High performance Python (Micha Gorelick; Ian Ozsvald) from which the above was sourced.

Understanding len function with iterators

Reading the documentation I have noticed that the built-in function len doesn't support all iterables but just sequences and mappings (and sets). Before reading that, I always thought that the len function used the iteration protocol to evaluate the length of an object, so I was really surprised reading that.
I read the already-posted questions (here and here) but I am still confused, I'm still not getting the real reason why not allow len to work with all iterables in general.
Is it a more conceptual/logical reason than an implementational one? I mean when I'm asking the length of an object, I'm asking for one property (how many elements it has), a property that objects as generators don't have because they do not have elements inside, the produce elements.
Furthermore generator objects can yield infinite elements bring to an undefined length, something that can not happen with other objects as lists, tuples, dicts, etc...
So am I right, or are there more insights/something more that I'm not considering?

The biggest reason is that it reduces type safety.
How many programs have you written where you actually needed to consume an iterable just to know how many elements it had, throwing away anything else?
I, in quite a few years of coding in Python, never needed that. It's a non-sensical operation in normal programs. An iterator may not have a length (e.g. infinite iterators or generators that expects inputs via send()), so asking for it doesn't make much sense. The fact that len(an_iterator) produces an error means that you can find bugs in your code. You can see that in a certain part of the program you are calling len on the wrong thing, or maybe your function actually needs a sequence instead of an iterator as you expected.
Removing such errors would create a new class of bugs where people, calling len, erroneously consume an iterator, or use an iterator as if it were a sequence without realizing.
If you really need to know the length of an iterator, what's wrong with len(list(iterator))? The extra 6 characters? It's trivial to write your own version that works for iterators, but, as I said, 99% of the time this simply means that something with your code is wrong, because such an operation doesn't make much sense.
The second reason is that, with that change, you are violating two nice properties of len that currently hold for all (known) containers:
It is known to be cheap on all containers ever implemented in Python (all built-ins, standard library, numpy & scipy and all other big third party libraries do this on both dynamically sized and static sized containers). So when you see len(something) you know that the len call is cheap. Making it work with iterators would mean that suddenly all programs might become inefficient due to computations of the length.
Also note that you can, trivially, implement O(1) __len__ on every container. The cost to pre-compute the length is often negligible, and generally worth paying.
The only exception would be if you implement immutable containers that have part of their internal representation shared with other instances (to save memory). However, I don't know of any implementation that does this, and most of the time you can achieve better than O(n) time anyway.
In summary: currently everybody implements __len__ in O(1) and it's easy to continue to do so. So there is an expectation for calls to len to be O(1). Even if it's not part of the standard. Python developers intentionally avoid C/C++'s style legalese in their documentation and trust the users. In this case, if your __len__ isn't O(1), it's expected that you document that.
It is known to be not destructive. Any sensible implementation of __len__ doesn't change its argument. So you can be sure that len(x) == len(x), or that n = len(x);len(list(x)) == n.
Even this property is not defined in the documentation, however it's expected by everyone, and currently, nobody violates it.
Such properties are good, because you can reason and make assumptions about code using them.
They can help you ensure the correctness of a piece of code, or understand its asymptotic complexity. The change you propose would make it much harder to look at some code and understand whether it's correct or what would be it's complexity, because you have to keep in mind the special cases.
In summary, the change you are proposing has one, really small, pro: saving few characters in very particular situations, but it has several, big, disadvantages which would impact a huge portion of existing code.
An other minor reason. If len consumes iterators I'm sure that some people would start to abuse this for its side-effects (replacing the already ugly use of map or list-comprehensions). Suddenly people can write code like:
len(print(something) for ... in ...)
to print text, which is really just ugly. It doesn't read well. Stateful code should be relagated to statements, since they provide a visual cue of side-effects.

How to improve Python code speed

I was solving this python challenge http://coj.uci.cu/24h/problem.xhtml?abb=2634 and this is my answer
c = int(input())
l = []
for j in range(c) :
i = raw_input().split()[1].split('/')
l.append(int(i[1]))
for e in range(1,13) :
print e , l.count(e)
But it was not the fastest python solution, so i tried to find how to improve the speed and i found that xrange was faster than range. But when i tried the following code it was actually slower
c = int(input())
l = []
for j in xrange(c):
i = raw_input().split()[1].split('/')[1]
l.append(i)
for e in xrange(1,13) :
print e , l.count(`e`)
so i have 2 questions :
How can i improve the speed of my script
Where can i find information on how to improve python speed
When i was looking for this info i found sites like this one https://wiki.python.org/moin/PythonSpeed/PerformanceTips
but it doesn't specify for example, if it is faster/slower to split a string multiple times in a single line or in multiple lines, for example using part of the script mentioned above :
i = raw_input().split()[1].split('/')[1]
vs
i = raw_input().split()
i = i[1].split('/')
i = i[1]
Edit : I have tried all your suggestions but my first answer is still the fastest and i don't know why. My firs answer was 151ms and #Bakuriu's answer was 197ms and my answer using collections.Counter was 188ms.
Edit 2 : Please disregard my last edit, i just found out that the method for checking your code performance in the site mentioned above does not work, if you upload the same code more times the performance is different each time, some times it's slower and sometimes faster

Assuming you are using CPython, the golden rule is to push as much work as possible into built-in functions, since these are written in C and thus avoid the interpreter overhead.
This means that you should:
Avoid explicit loops when there is a function/method that already does what you want
Avoid expensive lookups in inner loops. In rare circumstances you may go as far as use local variables to store built-in functions.
Use the right data structures. Don't simply use lists and dicts. The standard library contains other data types, and there are many libraries out there. Consider which should be the efficient operations to solve your problem and choose the correct data structure
Avoid meta-programming. If you need speed you don't want a simple attribute lookup to trigger 10 method calls with complex logic behind the scenes. (However where you don't really need speed metaprogramming is really cool!)
Profile your code to find the bottleneck and optimize the bottleneck. Often what we think about performance of some concrete code is completely wrong.
Use the dis module to disassemble the bytecode. This gives you a simple way to see what the interpreter will really do. If you really want to know how the interpreter works you should try to read the source for PyEval_EvalFrameEx which contains the mainloop of the interpreter (beware: hic sunt leones!).
Regarding CPython you should read An optimization anecdote by Guido Van Rossum. It gives many insights as to how performance can change with various solutions. An other example could be this answer (disclaimer: it's mine) where the fastest solution is probably very counter intuitive for someone not used to CPython workings.
An other good thing to do is to study all most used built-in and stdlib data types, since each one has both positive and negative proporties. In this specific case calling list.count() is an heavy operation, since it has to scan the whole list every time it is performed. That's probably were a lot of the time is consumed in your solution.
One way to minimize interpreter overhead is to use collections.Counter, which also avoids scanning the data multiple times:
from collections import Counter
counts = Counter(raw_input().split('/')[-2] for _ in range(int(raw_input())))
for i in range(1, 13):
print(i, counts[str(i)])
Note that there is no need to convert the month to an integer, so you can avoid those function calls (assuming the months are always written in the same way. No 07 and 7).
Also I don't understand why you are splitting on whitespace and then on the / when you can simply split by the / and take the one-to-last element from the list.
An other (important) optimization could be to read all stdin to avoid multiple IO calls, however this may not work in this situation since the fact that they tell you how many employees are there probably means that they are not sending an EOF.
Note that different versions of python have completely different ways of optimizing code. For example PyPy's JIT works best when you perform simply operations in loops that the JIT is able to analyze and optimize. So it's about the opposite of what you would do in CPython.

Why program functionally in Python?

At work we used to program our Python in a pretty standard OO way. Lately, a couple guys got on the functional bandwagon. And their code now contains lots more lambdas, maps and reduces. I understand that functional languages are good for concurrency but does programming Python functionally really help with concurrency? I am just trying to understand what I get if I start using more of Python's functional features.

Edit: I've been taken to task in the comments (in part, it seems, by fanatics of FP in Python, but not exclusively) for not providing more explanations/examples, so, expanding the answer to supply some.
lambda, even more so map (and filter), and most especially reduce, are hardly ever the right tool for the job in Python, which is a strongly multi-paradigm language.
lambda main advantage (?) compared to the normal def statement is that it makes an anonymous function, while def gives the function a name -- and for that very dubious advantage you pay an enormous price (the function's body is limited to one expression, the resulting function object is not pickleable, the very lack of a name sometimes makes it much harder to understand a stack trace or otherwise debug a problem -- need I go on?!-).
Consider what's probably the single most idiotic idiom you sometimes see used in "Python" (Python with "scare quotes", because it's obviously not idiomatic Python -- it's a bad transliteration from idiomatic Scheme or the like, just like the more frequent overuse of OOP in Python is a bad transliteration from Java or the like):
inc = lambda x: x + 1
by assigning the lambda to a name, this approach immediately throws away the above-mentioned "advantage" -- and doesn't lose any of the DISadvantages! For example, inc doesn't know its name -- inc.__name__ is the useless string '<lambda>' -- good luck understanding a stack trace with a few of these;-). The proper Python way to achieve the desired semantics in this simple case is, of course:
def inc(x): return x + 1
Now inc.__name__ is the string 'inc', as it clearly should be, and the object is pickleable -- the semantics are otherwise identical (in this simple case where the desired functionality fits comfortably in a simple expression -- def also makes it trivially easy to refactor if you need to temporarily or permanently insert statements such as print or raise, of course).
lambda is (part of) an expression while def is (part of) a statement -- that's the one bit of syntax sugar that makes people use lambda sometimes. Many FP enthusiasts (just as many OOP and procedural fans) dislike Python's reasonably strong distinction between expressions and statements (part of a general stance towards Command-Query Separation). Me, I think that when you use a language you're best off using it "with the grain" -- the way it was designed to be used -- rather than fighting against it; so I program Python in a Pythonic way, Scheme in a Schematic (;-) way, Fortran in a Fortesque (?) way, and so on:-).
Moving on to reduce -- one comment claims that reduce is the best way to compute the product of a list. Oh, really? Let's see...:
$ python -mtimeit -s'L=range(12,52)' 'reduce(lambda x,y: x*y, L, 1)'
100000 loops, best of 3: 18.3 usec per loop
$ python -mtimeit -s'L=range(12,52)' 'p=1' 'for x in L: p*=x'
100000 loops, best of 3: 10.5 usec per loop
so the simple, elementary, trivial loop is about twice as fast (as well as more concise) than the "best way" to perform the task?-) I guess the advantages of speed and conciseness must therefore make the trivial loop the "bestest" way, right?-)
By further sacrificing compactness and readability...:
$ python -mtimeit -s'import operator; L=range(12,52)' 'reduce(operator.mul, L, 1)'
100000 loops, best of 3: 10.7 usec per loop
...we can get almost back to the easily obtained performance of the simplest and most obvious, compact, and readable approach (the simple, elementary, trivial loop). This points out another problem with lambda, actually: performance! For sufficiently simple operations, such as multiplication, the overhead of a function call is quite significant compared to the actual operation being performed -- reduce (and map and filter) often forces you to insert such a function call where simple loops, list comprehensions, and generator expressions, allow the readability, compactness, and speed of in-line operations.
Perhaps even worse than the above-berated "assign a lambda to a name" anti-idiom is actually the following anti-idiom, e.g. to sort a list of strings by their lengths:
thelist.sort(key=lambda s: len(s))
instead of the obvious, readable, compact, speedier
thelist.sort(key=len)
Here, the use of lambda is doing nothing but inserting a level of indirection -- with no good effect whatsoever, and plenty of bad ones.
The motivation for using lambda is often to allow the use of map and filter instead of a vastly preferable loop or list comprehension that would let you do plain, normal computations in line; you still pay that "level of indirection", of course. It's not Pythonic to have to wonder "should I use a listcomp or a map here": just always use listcomps, when both appear applicable and you don't know which one to choose, on the basis of "there should be one, and preferably only one, obvious way to do something". You'll often write listcomps that could not be sensibly translated to a map (nested loops, if clauses, etc), while there's no call to map that can't be sensibly rewritten as a listcomp.
Perfectly proper functional approaches in Python often include list comprehensions, generator expressions, itertools, higher-order functions, first-order functions in various guises, closures, generators (and occasionally other kinds of iterators).
itertools, as a commenter pointed out, does include imap and ifilter: the difference is that, like all of itertools, these are stream-based (like map and filter builtins in Python 3, but differently from those builtins in Python 2). itertools offers a set of building blocks that compose well with each other, and splendid performance: especially if you find yourself potentially dealing with very long (or even unbounded!-) sequences, you owe it to yourself to become familiar with itertools -- their whole chapter in the docs makes for good reading, and the recipes in particular are quite instructive.
Writing your own higher-order functions is often useful, especially when they're suitable for use as decorators (both function decorators, as explained in that part of the docs, and class decorators, introduced in Python 2.6). Do remember to use functools.wraps on your function decorators (to keep the metadata of the function getting wrapped)!
So, summarizing...: anything you can code with lambda, map, and filter, you can code (more often than not advantageously) with def (named functions) and listcomps -- and usually moving up one notch to generators, generator expressions, or itertools, is even better. reduce meets the legal definition of "attractive nuisance"...: it's hardly ever the right tool for the job (that's why it's not a built-in any more in Python 3, at long last!-).

FP is important not only for concurrency; in fact, there's virtually no concurrency in the canonical Python implementation (maybe 3.x changes that?). in any case, FP lends itself well to concurrency because it leads to programs with no or fewer (explicit) states. states are troublesome for a few reasons. one is that they make distributing the computation hard(er) (that's the concurrency argument), another, far more important in most cases, is the tendency to inflict bugs. the biggest source of bugs in contemporary software is variables (there's a close relationship between variables and states). FP may reduce the number of variables in a program: bugs squashed!
see how many bugs can you introduce by mixing the variables up in these versions:
def imperative(seq):
p = 1
for x in seq:
p *= x
return p
versus (warning, my.reduce's parameter list differs from that of python's reduce; rationale given later)
import operator as ops
def functional(seq):
return my.reduce(ops.mul, 1, seq)
as you can see, it's a matter of fact that FP gives you fewer opportunities to shoot yourself in the foot with a variables-related bug.
also, readability: it may take a bit of training, but functional is way easier to read than imperative: you see reduce ("ok, it's reducing a sequence to a single value"), mul ("by multiplication"). wherease imperative has the generic form of a for cycle, peppered with variables and assignments. these for cycles all look the same, so to get an idea of what's going on in imperative, you need to read almost all of it.
then there's succintness and flexibility. you give me imperative and I tell you I like it, but want something to sum sequences as well. no problem, you say, and off you go, copy-pasting:
def imperative(seq):
p = 1
for x in seq:
p *= x
return p
def imperative2(seq):
p = 0
for x in seq:
p += x
return p
what can you do to reduce the duplication? well, if operators were values, you could do something like
def reduce(op, seq, init):
rv = init
for x in seq:
rv = op(rv, x)
return rv
def imperative(seq):
return reduce(*, 1, seq)
def imperative2(seq):
return reduce(+, 0, seq)
oh wait! operators provides operators that are values! but.. Alex Martelli condemned reduce already... looks like if you want to stay within the boundaries he suggests, you're doomed to copy-pasting plumbing code.
is the FP version any better? surely you'd need to copy-paste as well?
import operator as ops
def functional(seq):
return my.reduce(ops.mul, 1, seq)
def functional2(seq):
return my.reduce(ops.add, 0, seq)
well, that's just an artifact of the half-assed approach! abandoning the imperative def, you can contract both versions to
import functools as func, operator as ops
functional = func.partial(my.reduce, ops.mul, 1)
functional2 = func.partial(my.reduce, ops.add, 0)
or even
import functools as func, operator as ops
reducer = func.partial(func.partial, my.reduce)
functional = reducer(ops.mul, 1)
functional2 = reducer(ops.add, 0)
(func.partial is the reason for my.reduce)
what about runtime speed? yes, using FP in a language like Python will incur some overhead. here i'll just parrot what a few professors have to say about this:
premature optimization is the root of all evil.
most programs spend 80% of their runtime in 20% percent of their code.
profile, don't speculate!
I'm not very good at explaining things. Don't let me muddy the water too much, read the first half of the speech John Backus gave on the occasion of receiving the Turing Award in 1977. Quote:
5.1 A von Neumann Program for Inner Product
c := 0
for i := I step 1 until n do
c := c + a[i] * b[i]
Several properties of this program are
worth noting:
Its statements operate on an invisible "state" according to complex
rules.
It is not hierarchical. Except for the right side of the assignment
statement, it does not construct
complex entities from simpler ones.
(Larger programs, however, often do.)
It is dynamic and repetitive. One must mentally execute it to
understand it.
It computes word-at-a-time by repetition (of the assignment) and by
modification (of variable i).
Part of the data, n, is in the program; thus it lacks generality and
works only for vectors of length n.
It names its arguments; it can only be used for vectors a and b.
To become general, it requires a
procedure declaration. These involve
complex issues (e.g., call-by-name
versus call-by-value).
Its "housekeeping" operations are represented by symbols in
scattered places (in the for statement
and the subscripts in the assignment).
This makes it impossible to
consolidate housekeeping operations,
the most common of all, into single,
powerful, widely useful operators.
Thus in programming those operations
one must always start again at square
one, writing "for i := ..." and
"for j := ..." followed by
assignment statements sprinkled with
i's and j's.

I program in Python everyday, and I have to say that too much 'bandwagoning' toward OO or functional could lead toward missing elegant solutions. I believe that both paradigms have their advantages to certain problems - and I think that's when you know what approach to use. Use a functional approach when it leaves you with a clean, readable, and efficient solution. Same goes for OO.
And that's one of the reasons I love Python - the fact that it is multi-paradigm and lets the developer choose how to solve his/her problem.

This answer is completely re-worked. It incorporates a lot of observations from the other answers.
As you can see, there is a lot of strong feelings surrounding the use of functional programming constructs in Python. There are three major groups of ideas here.
First, almost everybody but the people who are most wedded to the purest expression of the functional paradigm agree that list and generator comprehensions are better and clearer than using map or filter. Your colleagues should be avoiding the use of map and filter if you are targeting a version of Python new enough to support list comprehensions. And you should be avoiding itertools.imap and itertools.ifilter if your version of Python is new enough for generator comprehensions.
Secondly, there is a lot of ambivalence in the community as a whole about lambda. A lot of people are really annoyed by a syntax in addition to def for declaring functions, especially one that involves a keyword like lambda that has a rather strange name. And people are also annoyed that these small anonymous functions are missing any of the nice meta-data that describes any other kind of function. It makes debugging harder. Lastly the small functions declared by lambda are often not terribly efficient as they require the overhead of a Python function call each time they are invoked, which is frequently in an inner loop.
Lastly, most (meaning > 50%, but most likely not 90%) people think that reduce is a little strange and obscure. I myself admit to having print reduce.__doc__ whenever I want to use it, which isn't all that often. Though when I see it used, the nature of the arguments (i.e. function, list or iterator, scalar) speak for themselves.
As for myself, I fall in the camp of people who think the functional style is often very useful. But balancing that thought is the fact that Python is not at heart a functional language. And overuse of functional constructs can make programs seem strangely contorted and difficult for people to understand.
To understand when and where the functional style is very helpful and improves readability, consider this function in C++:
unsigned int factorial(unsigned int x)
{
int fact = 1;
for (int i = 2; i <= n; ++i) {
fact *= i;
}
return fact
}
This loop seems very simple and easy to understand. And in this case it is. But its seeming simplicity is a trap for the unwary. Consider this alternate means of writing the loop:
unsigned int factorial(unsigned int n)
{
int fact = 1;
for (int i = 2; i <= n; i += 2) {
fact *= i--;
}
return fact;
}
Suddenly, the loop control variable no longer varies in an obvious way. You are reduced to looking through the code and reasoning carefully about what happens with the loop control variable. Now this example is a bit pathological, but there are real-world examples that are not. And the problem is with the fact that the idea is repeated assignment to an existing variable. You can't trust the variable's value is the same throughout the entire body of the loop.
This is a long recognized problem, and in Python writing a loop like this is fairly unnatural. You have to use a while loop, and it just looks wrong. Instead, in Python you would write something like this:
def factorial(n):
fact = 1
for i in xrange(2, n):
fact = fact * i;
return fact
As you can see, the way you talk about the loop control variable in Python is not amenable to fooling with it inside the loop. This eliminates a lot of the problems with 'clever' loops in other imperative languages. Unfortunately, it's an idea that's semi-borrowed from functional languages.
Even this lends itself to strange fiddling. For example, this loop:
c = 1
for i in xrange(0, min(len(a), len(b))):
c = c * (a[i] + b[i])
if i < len(a):
a[i + 1] = a[a + 1] + 1
Oops, we again have a loop that is difficult to understand. It superficially resembles a really simple and obvious loop, and you have to read it carefully to realize that one of the variables used in the loop's computation is being messed with in a way that will effect future runs of the loop.
Again, a more functional approach to the rescue:
from itertools import izip
c = 1
for ai, bi in izip(a, b):
c = c * (ai + bi)
Now by looking at the code we have some strong indication (partly by the fact that the person is using this functional style) that the lists a and b are not modified during the execution of the loop. One less thing to think about.
The last thing to be worried about is c being modified in strange ways. Perhaps it is a global variable and is being modified by some roundabout function call. To rescue us from this mental worry, here is a purely function approach:
from itertools import izip
c = reduce(lambda x, ab: x * (ab[0] + ab[1]), izip(a, b), 1)
Very concise, and the structure tells us that x is purely an accumulator. It is a local variable everywhere it appear. The final result is unambiguously assigned to c. Now there is much less to worry about. The structure of the code removes several classes of possible error.
That is why people might choose a functional style. It is concise and clear, at least if you understand what reduce and lambda do. There are large classes of problems that could afflict a program written in a more imperative style that you know won't afflict your functional style program.
In the case of factorial, there is a very simple and clear way to write this function in Python in a functional style:
import operator
def factorial(n):
return reduce(operator.mul, xrange(2, n+1), 1)

The question, which seems to be mostly ignored here:
does programming Python functionally really help with concurrency?
No. The value FP brings to concurrency is in eliminating state in computation, which is ultimately responsible for the hard-to-grasp nastiness of unintended errors in concurrent computation. But it depends on the concurrent programming idioms not themselves being stateful, something that doesn't apply to Twisted. If there are concurrency idioms for Python that leverage stateless programming, I don't know of them.

Here's a short summary of positive answers when/why to program functionally.
List comprehensions were imported from Haskell, a FP language. They are Pythonic. I'd prefer to write
y = [i*2 for i in k if i % 3 == 0]
than to use an imperative construct (loop).
I'd use lambda when giving a complicated key to sort, like list.sort(key=lambda x: x.value.estimate())
It's cleaner to use higher-order functions than to write code using OOP's design patterns like visitor or abstract factory
People say that you should program Python in Python, C++ in C++ etc. That's true, but certainly you should be able to think in different ways at the same thing. If while writing a loop you know that you're really doing reducing (folding), then you'll be able to think on a higher level. That cleans your mind and helps to organize. Of course lower-level thinking is important too.
You should NOT overuse those features - there are many traps, see Alex Martelli's post. I'd subjectively say the most serious danger is that excessive use of those features will destroy readability of your code, which is a core attribute of Python.

The standard functions filter(), map() and reduce() are used for various operations on a list and all of the three functions expect two arguments: A function and a list
We could define a separate function and use it as an argument to filter() etc., and its probably a good idea if that function is used several times, or if the function is too complex to be written in a single line. However, if it's needed only once and it's quite simple, it's more convenient to use a lambda construct to generate a (temporary) anonymous function and pass it to filter().
This helps in readability and compact code.
Using these function, would also turn out to be efficient, because the looping on the elements of the list is done in C, which is a little bit faster than looping in python.
And object oriented way is forcibly needed when states are to be maintained, apart from abstraction, grouping, etc., If the requirement is pretty simple, I would stick with functional than to Object Oriented programming.

Map and Filter have their place in OO programming. Right next to list comprehensions and generator functions.
Reduce less so. The algorithm for reduce can rapidly suck down more time than it deserves; with a tiny bit of thinking, a manually-written reduce-loop will be more efficient than a reduce which applies a poorly-thought-out looping function to a sequence.
Lambda never. Lambda is useless. One can make the argument that it actually does something, so it's not completely useless. First: Lambda is not syntactic "sugar"; it makes things bigger and uglier. Second: the one time in 10,000 lines of code that think you need an "anonymous" function turns into two times in 20,000 lines of code, which removes the value of anonymity, making it into a maintenance liability.
However.
The functional style of no-object-state-change programming is still OO in nature. You just do more object creation and fewer object updates. Once you start using generator functions, much OO programming drifts in a functional direction.
Each state change appears to translate into a generator function that builds a new object in the new state from old object(s). It's an interesting world view because reasoning about the algorithm is much, much simpler.
But that's no call to use reduce or lambda.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.