What is the 'pythonic' equivalent to the 'fold' function from functional programming? - python

What is the most idiomatic way to achieve something like the following, in Haskell:
foldl (+) 0 [1,2,3,4,5]
--> 15
Or its equivalent in Ruby:
[1,2,3,4,5].inject(0) {|m,x| m + x}
#> 15
Obviously, Python provides the reduce function, which is an implementation of fold, exactly as above, however, I was told that the 'pythonic' way of programming was to avoid lambda terms and higher-order functions, preferring list-comprehensions where possible. Therefore, is there a preferred way of folding a list, or list-like structure in Python that isn't the reduce function, or is reduce the idiomatic way of achieving this?

The Pythonic way of summing an array is using sum. For other purposes, you can sometimes use some combination of reduce (from the functools module) and the operator module, e.g.:
def product(xs):
return reduce(operator.mul, xs, 1)
Be aware that reduce is actually a foldl, in Haskell terms. There is no special syntax to perform folds, there's no builtin foldr, and actually using reduce with non-associative operators is considered bad style.
Using higher-order functions is quite pythonic; it makes good use of Python's principle that everything is an object, including functions and classes. You are right that lambdas are frowned upon by some Pythonistas, but mostly because they tend not to be very readable when they get complex.

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), which gives the possibility to name the result of an expression, we can use a list comprehension to replicate what other languages call fold/foldleft/reduce operations:
Given a list, a reducing function and an accumulator:
items = [1, 2, 3, 4, 5]
f = lambda acc, x: acc * x
accumulator = 1
we can fold items with f in order to obtain the resulting accumulation:
[accumulator := f(accumulator, x) for x in items]
# accumulator = 120
or in a condensed formed:
acc = 1; [acc := acc * x for x in [1, 2, 3, 4, 5]]
# acc = 120
Note that this is actually also a "scanleft" operation as the result of the list comprehension represents the state of the accumulation at each step:
acc = 1
scanned = [acc := acc * x for x in [1, 2, 3, 4, 5]]
# scanned = [1, 2, 6, 24, 120]
# acc = 120

Haskell
foldl (+) 0 [1,2,3,4,5]
Python
reduce(lambda a,b: a+b, [1,2,3,4,5], 0)
Obviously, that is a trivial example to illustrate a point. In Python you would just do sum([1,2,3,4,5]) and even Haskell purists would generally prefer sum [1,2,3,4,5].
For non-trivial scenarios when there is no obvious convenience function, the idiomatic pythonic approach is to explicitly write out the for loop and use mutable variable assignment instead of using reduce or a fold.
That is not at all the functional style, but that is the "pythonic" way. Python is not designed for functional purists. See how Python favors exceptions for flow control to see how non-functional idiomatic python is.

In Python 3, the reduce has been removed: Release notes. Nevertheless you can use the functools module
import operator, functools
def product(xs):
return functools.reduce(operator.mul, xs, 1)
On the other hand, the documentation expresses preference towards for-loop instead of reduce, hence:
def product(xs):
result = 1
for i in xs:
result *= i
return result

Not really answer to the question, but one-liners for foldl and foldr:
a = [8,3,4]
## Foldl
reduce(lambda x,y: x**y, a)
#68719476736
## Foldr
reduce(lambda x,y: y**x, a[::-1])
#14134776518227074636666380005943348126619871175004951664972849610340958208L

You can reinvent the wheel as well:
def fold(f, l, a):
"""
f: the function to apply
l: the list to fold
a: the accumulator, who is also the 'zero' on the first call
"""
return a if(len(l) == 0) else fold(f, l[1:], f(a, l[0]))
print "Sum:", fold(lambda x, y : x+y, [1,2,3,4,5], 0)
print "Any:", fold(lambda x, y : x or y, [False, True, False], False)
print "All:", fold(lambda x, y : x and y, [False, True, False], True)
# Prove that result can be of a different type of the list's elements
print "Count(x==True):",
print fold(lambda x, y : x+1 if(y) else x, [False, True, True], 0)

The actual answer to this (reduce) problem is: Just use a loop!
initial_value = 0
for x in the_list:
initial_value += x #or any function.
This will be faster than a reduce and things like PyPy can optimize loops like that.
BTW, the sum case should be solved with the sum function

I believe some of the respondents of this question have missed the broader implication of the fold function as an abstract tool. Yes, sum can do the same thing for a list of integers, but this is a trivial case. fold is more generic. It is useful when you have a sequence of data structures of varying shape and want to cleanly express an aggregation. So instead of having to build up a for loop with an aggregate variable and manually recompute it each time, a fold function (or the Python version, which reduce appears to correspond to) allows the programmer to express the intent of the aggregation much more plainly by simply providing two things:
A default starting or "seed" value for the aggregation.
A function that takes the current value of the aggregation (starting with the "seed") and the next element in the list, and returns the next aggregation value.

Related

Better way to call a chain of functions in python?

I have a chain of operations which needs to occur one after the other and each depends on the previous function's output.
Like this:
out1 = function1(initial_input)
out2 = function2(out1)
out3 = function3(out2)
out4 = function4(out3)
and so on about 10 times. It looks a little ugly in the code.
What is the best way to write it? Is there someway to handle it using some functional programming magic? Is there a better way to call and execute this function chain?
You can use functools.reduce:
out = functools.reduce(lambda x, y : y(x), [f1, f2, f3, f4], initial_value)
Quoting functools.reduce documentation:
Apply a function of two arguments cumulatively to the items of a sequence,
from left to right, so as to reduce the sequence to a single value.
For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). If initial is present, it is placed before the items
of the sequence in the calculation, and serves as a default when the
sequence is empty.
Here, we use the fact that functions can be treated as any variable in Python, and an anonymous functions which simply does "apply x to function y".
This "reduce" operation is part of a very general pattern which have been applied successfully to parallelize tasks (see http://en.wikipedia.org/wiki/MapReduce).
Use a loop:
out = initial_input
for func in [function1, function2, function3, function4]:
out = func(out)
To propagate functional programming a bit:
In [1]: compose = lambda f, g: lambda arg: f(g(arg))
In [2]: from functools import reduce
In [3]: funcs = [lambda x:x+1, lambda x:x*2]
In [4]: f = reduce(compose, funcs)
In [5]: f(1)
Out[5]: 3
In [6]: f(3)
Out[6]: 7
You can pass the return values directly to the next function:
out4 = function4(function3(function2(function1(initial_input))))
But this isn't necessarily better, and is perhaps less readable.

Is there a 'foreach' function in Python 3?

When I meet the situation I can do it in javascript, I always think if there's an foreach function it would be convenience. By foreach I mean the function which is described below:
def foreach(fn,iterable):
for x in iterable:
fn(x)
they just do it on every element and didn't yield or return something,i think it should be a built-in function and should be more faster than writing it with pure Python, but I didn't found it on the list,or it just called another name?or I just miss some points here?
Maybe I got wrong, cause calling an function in Python cost high, definitely not a good practice for the example. Rather than an out loop, the function should do the loop in side its body looks like this below which already mentioned in many python's code suggestions:
def fn(*args):
for x in args:
dosomething
but I thought foreach is still welcome base on the two facts:
In normal cases, people just don't care about the performance
Sometime the API didn't accept iterable object and you can't rewrite its source.
Every occurence of "foreach" I've seen (PHP, C#, ...) does basically the same as pythons "for" statement.
These are more or less equivalent:
// PHP:
foreach ($array as $val) {
print($val);
}
// C#
foreach (String val in array) {
console.writeline(val);
}
// Python
for val in array:
print(val)
So, yes, there is a "foreach" in python. It's called "for".
What you're describing is an "array map" function. This could be done with list comprehensions in python:
names = ['tom', 'john', 'simon']
namesCapitalized = [capitalize(n) for n in names]
Python doesn't have a foreach statement per se. It has for loops built into the language.
for element in iterable:
operate(element)
If you really wanted to, you could define your own foreach function:
def foreach(function, iterable):
for element in iterable:
function(element)
As a side note the for element in iterable syntax comes from the ABC programming language, one of Python's influences.
Other examples:
Python Foreach Loop:
array = ['a', 'b']
for value in array:
print(value)
# a
# b
Python For Loop:
array = ['a', 'b']
for index in range(len(array)):
print("index: %s | value: %s" % (index, array[index]))
# index: 0 | value: a
# index: 1 | value: b
map can be used for the situation mentioned in the question.
E.g.
map(len, ['abcd','abc', 'a']) # 4 3 1
For functions that take multiple arguments, more arguments can be given to map:
map(pow, [2, 3], [4,2]) # 16 9
It returns a list in python 2.x and an iterator in python 3
In case your function takes multiple arguments and the arguments are already in the form of tuples (or any iterable since python 2.6) you can use itertools.starmap. (which has a very similar syntax to what you were looking for). It returns an iterator.
E.g.
for num in starmap(pow, [(2,3), (3,2)]):
print(num)
gives us 8 and 9
The correct answer is "python collections do not have a foreach". In native python we need to resort to the external for _element_ in _collection_ syntax which is not what the OP is after.
Python is in general quite weak for functionals programming. There are a few libraries to mitigate a bit. I helped author one of these infixpy
pip install infixpy https://pypi.org/project/infixpy/
from infixpy import Seq
(Seq([1,2,3]).foreach(lambda x: print(x)))
1
2
3
Also see: Left to right application of operations on a list in Python 3
Here is the example of the "foreach" construction with simultaneous access to the element indexes in Python:
for idx, val in enumerate([3, 4, 5]):
print (idx, val)
Yes, although it uses the same syntax as a for loop.
for x in ['a', 'b']: print(x)
This does the foreach in python 3
test = [0,1,2,3,4,5,6,7,8,"test"]
for fetch in test:
print(fetch)
Look at this article. The iterator object nditer from numpy package, introduced in NumPy 1.6, provides many flexible ways to visit all the elements of one or more arrays in a systematic fashion.
Example:
import random
import numpy as np
ptrs = np.int32([[0, 0], [400, 0], [0, 400], [400, 400]])
for ptr in np.nditer(ptrs, op_flags=['readwrite']):
# apply random shift on 1 for each element of the matrix
ptr += random.choice([-1, 1])
print(ptrs)
d:\>python nditer.py
[[ -1 1]
[399 -1]
[ 1 399]
[399 401]]
If I understood you right, you mean that if you have a function 'func', you want to check for each item in list if func(item) returns true; if you get true for all, then do something.
You can use 'all'.
For example: I want to get all prime numbers in range 0-10 in a list:
from math import sqrt
primes = [x for x in range(10) if x > 2 and all(x % i !=0 for i in range(2, int(sqrt(x)) + 1))]
If you really want you can do this:
[fn(x) for x in iterable]
But the point of the list comprehension is to create a list - using it for the side effect alone is poor style. The for loop is also less typing
for x in iterable: fn(x)
I know this is an old thread but I had a similar question when trying to do a codewars exercise.
I came up with a solution which nests loops, I believe this solution applies to the question, it replicates a working "for each (x) doThing" statement in most scenarios:
for elements in array:
while elements in array:
array.func()
If you're just looking for a more concise syntax you can put the for loop on one line:
array = ['a', 'b']
for value in array: print(value)
Just separate additional statements with a semicolon.
array = ['a', 'b']
for value in array: print(value); print('hello')
This may not conform to your local style guide, but it could make sense to do it like this when you're playing around in the console.
In short, the functional programming way to do this is:
def do_and_return_fn(og_fn: Callable[[T], None]):
def do_and_return(item: T) -> T:
og_fn(item)
return item
return do_and_return
# where og_fn is the fn referred to by the question.
# i.e. a function that does something on each element, but returns nothing.
iterable = map(do_and_return_fn(og_fn), iterable)
All of the answers that say "for" loops are the same as "foreach" functions are neglecting the point that other similar functions that operate on iters in python such as map, filter, and others in itertools are lazily evaluated.
Suppose, I have an iterable of dictionaries coming from my database and I want to pop an item off of each dictionary element when the iterator is iterated over. I can't use map because pop returns the item popped, not the original dictionary.
The approach I gave above would allow me to achieve this if I pass lambda x: x.pop() as my og_fn,
What would be nice is if python had a built-in lazy function with an interface like I constructed:
foreach(do_fn: Callable[[T], None], iterable: Iterable)
Implemented with the function given before, it would look like:
def foreach(do_fn: Callable[[T], None], iterable: Iterable[T]) -> Iterable[T]:
return map(do_and_return_fn(do_fn), iterable)
# being called with my db code.
# Lazily removes the INSERTED_ON_SEC_FIELD on every element:
doc_iter = foreach(lambda x: x.pop(INSERTED_ON_SEC_FIELD, None), doc_iter)
No there is no from functools import foreach support in python. However, you can just implement in the same number of lines as the import takes, anyway:
foreach = lambda f, iterable: (*map(f, iterable),)
Bonus:
variadic support: foreach = lambda f, iterable, *args: (*map(f, iterable, *args),) and you can be more efficient by avoiding constructing the tuple of Nones

Map list by partial function vs lambda

I was wondering whether for most examples it is more 'pythonic' to use lambda or the partial function?
For example, I might want to apply imap on some list, like add 3 to every element using:
imap(lambda x : x + 3, my_list)
Or to use partial:
imap(partial(operator.add, 3), my_list)
I realize in this example a loop could probably accomplish it easier, but I'm thinking about more non-trivial examples.
In Haskell, I would easily choose partial application in the above example, but I'm not sure for Python. To me, the lambda seems the the better choice, but I don't know what the prevailing choice is for most python programmers.
To be truly equivalent to imap, use a generator expression:
(x + 3 for x in mylist)
Like imap, this doesn't immediately construct an entire new list, but instead computes elements of the resulting sequence on-demand (and is thus much more efficient than a list comprehension if you're chaining the result into another iteration).
If you're curious about where partial would be a better option than lambda in the real world, it tends to be when you're dealing with variable numbers of arguments:
>>> from functools import partial
>>> def a(*args):
... return sum(args)
...
>>> b = partial(a, 2, 3)
>>> b(6, 7, 8)
26
The equivalent version using lambda would be...
>>> b = lambda *args: a(2, 3, *args)
>>> b(6, 7, 8)
26
which is slightly less concise - but lambda does give you the option of out-of-order application, which partial does not:
>>> def a(x, y, z):
... return x + y - z
...
>>> b = lambda m, n: a(m, 1, n)
>>> b(2, 5)
-2
In the given example, lambda seems most appropriate. It's also easier on the eyes.
I have never seen the use of partial functions in the wild.
lambda is certainly many times more common. Unless you're doing functional programming in an academic setting, you should probably steer away from functools.
This is pythonic. No library needed, or even builtins, just a simple generator expression.
( x + 3 for x in my_list )
This creates a generator, similar to imap.
If you're going to make a list out of it anyway, use a list comprehension instead:
[ x + 3 for x in my_list ]

Finding maximum of a list of lists by sum of elements in Python

What's the idiomatic way to do maximumBy (higher order function taking a comparison function for the test), on a list of lists, where the comparison we want to make is the sum of the list, in Python?
Here's a Haskell implementation and example output:
> maximumBy (compare `on` sum) [[1,2,3],[4,5,6],[1,3,5]]
> [4,5,6]
And implementations of those base library functions, just for completeness (in case you want to use reduce or something :)
maximumBy cmp xs = foldl1 maxBy xs
where
maxBy x y = case cmp x y of GT -> x; _ -> y
k `on` f = \x y -> f x `k` f y
sum = foldl' (+) 0
Since Python 2.5 you can use max with a key parameter:
>>> max(a, key=sum)
[4, 5, 6]
It isn't terribly efficient, but:
reduce(lambda x,y: x if sum(x)>sum(y) else y, [[1,2,3],[4,5,6],[1,3,5]])
If max didn't have the key parameter you could code the DSU pattern explicitly:
max(izip(imap(sum,a),a))[1]
izip and imap are from the itertools module in python 2 and do what zip and map do, but lazily using Python generators, to avoid consing an intermediate list. In Python 3, the map and zip builtins are lazy.

Why is there no first(iterable) built-in function in Python?

I'm wondering if there's a reason that there's no first(iterable) in the Python built-in functions, somewhat similar to any(iterable) and all(iterable) (it may be tucked in a stdlib module somewhere, but I don't see it in itertools). first would perform a short-circuit generator evaluation so that unnecessary (and a potentially infinite number of) operations can be avoided; i.e.
def identity(item):
return item
def first(iterable, predicate=identity):
for item in iterable:
if predicate(item):
return item
raise ValueError('No satisfactory value found')
This way you can express things like:
denominators = (2, 3, 4, 5)
lcd = first(i for i in itertools.count(1)
if all(i % denominators == 0 for denominator in denominators))
Clearly you can't do list(generator)[0] in that case, since the generator doesn't terminate.
Or if you have a bunch of regexes to match against (useful when they all have the same groupdict interface):
match = first(regex.match(big_text) for regex in regexes)
You save a lot of unnecessary processing by avoiding list(generator)[0] and short-circuiting on a positive match.
In Python 2, if you have an iterator, you can just call its next method. Something like:
>>> (5*x for x in xrange(2,4)).next()
10
In Python 3, you can use the next built-in with an iterator:
>>> next(5*x for x in range(2,4))
10
There's a Pypi package called “first” that does this:
>>> from first import first
>>> first([0, None, False, [], (), 42])
42
Here's how you would use to return the first odd number, for example:
>> first([2, 14, 7, 41, 53], key=lambda x: x % 2 == 1)
7
If you just want to return the first element from the iterator regardless of whether is true or not, do this:
>>> first([0, None, False, [], (), 42], key=lambda x: True)
0
It's a very small package: it only contains this function, it has no dependencies, and it works on Python 2 and 3. It's a single file, so you don't even have to install it to use it.
In fact, here's almost the entire source code (from version 2.0.1, by Hynek Schlawack, released under the MIT licence):
def first(iterable, default=None, key=None):
if key is None:
for el in iterable:
if el:
return el
else:
for el in iterable:
if key(el):
return el
return default
I asked a similar question recently (it got marked as a duplicate of this question by now). My concern also was that I'd liked to use built-ins only to solve the problem of finding the first true value of a generator. My own solution then was this:
x = next((v for v in (f(x) for x in a) if v), False)
For the example of finding the first regexp match (not the first matching pattern!) this would look like this:
patterns = [ r'\d+', r'\s+', r'\w+', r'.*' ]
text = 'abc'
firstMatch = next(
(match for match in
(re.match(pattern, text) for pattern in patterns)
if match),
False)
It does not evaluate the predicate twice (as you would have to do if just the pattern was returned) and it does not use hacks like locals in comprehensions.
But it has two generators nested where the logic would dictate to use just one. So a better solution would be nice.
There is a "slice" iterator in itertools. It emulates the slice operations that we're familiar with in python. What you're looking for is something similar to this:
myList = [0,1,2,3,4,5]
firstValue = myList[:1]
The equivalent using itertools for iterators:
from itertools import islice
def MyGenFunc():
for i in range(5):
yield i
mygen = MyGenFunc()
firstValue = islice(mygen, 0, 1)
print firstValue
There's some ambiguity in your question. Your definition of first and the regex example imply that there is a boolean test. But the denominators example explicitly has an if clause; so it's only a coincidence that each integer happens to be true.
It looks like the combination of next and itertools.ifilter will give you what you want.
match = next(itertools.ifilter(None, (regex.match(big_text) for regex in regexes)))
Haskell makes use of what you just described, as the function take (or as the partial function take 1, technically). Python Cookbook has generator-wrappers written that perform the same functionality as take, takeWhile, and drop in Haskell.
But as to why that's not a built-in, your guess is as good as mine.

Categories