Python itertools—takewhile(): multiple predicates - python

Suppose a generator yields the below tuples one by one (from left to right)
(1, 2, 3), (2, 5, 6), (3, 7, 10), (4, 5, 11), (3, 5, 15), (4, 5, 9), (4, 6, 12)
...
and suppose I'd like to iterate as long as the predicate is true. Let that predicate be sum(yielded_value) < 20. Then the iterator will stop by (3, 5, 15). I can do it with, say:
list(itertools.takewhile(lambda x: sum(x) < 20, some_generator()))
Question: how do I write a similar expression with two predicates? Suppose I want:
list(itertools.takewhile(lambda x: sum(x) < 20 and first_value_of_tuple > 3, some_generator()))
(which, in this case, stop by (4, 6, 12).)

You can access the elements of each tuple by index.
list(itertools.takewhile(lambda x: sum(x) < 20 and x[0] > 3, some_generator()))

Since everything in itertools is lazily iterated, and you are using and for two predicates, you can simply use two takewhile iterators. Sometimes I find this more readable than putting both predicates in a single predicate function or lambda:
lessthan20 = itertools.takewhile(lambda x: sum(x) < 20, some_generator())
greaterthan3 = itertools.takewhile(lambda x: x[0] > 3, lessthan20)
list(greaterthan3)
It also makes it so that you don't have a single huge one liner if you need to add even more predicates in the future.
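As a quick check that the chained form behaves like the single combined predicate, here is a minimal sketch. The some_generator below is a hypothetical stand-in (the question does not show the real one), and its values are made up so that the first few tuples pass both tests.

```python
import itertools

def some_generator():
    # Hypothetical stand-in for the generator in the question.
    yield from [(4, 5, 9), (5, 6, 7), (6, 1, 2), (2, 1, 1), (7, 8, 9)]

# One takewhile with both conditions joined by `and`.
combined = list(itertools.takewhile(
    lambda x: sum(x) < 20 and x[0] > 3, some_generator()))

# Two chained takewhile iterators, one condition each.
less_than_20 = itertools.takewhile(lambda x: sum(x) < 20, some_generator())
greater_than_3 = itertools.takewhile(lambda x: x[0] > 3, less_than_20)
chained = list(greater_than_3)

print(combined)  # [(4, 5, 9), (5, 6, 7), (6, 1, 2)]
print(chained)   # same result; iteration stops at (2, 1, 1)
```

Both versions stop at the first tuple that fails either condition, so the results match.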

If you have additional predicates and need to access all the elements of your tuples, you can also unpack them in your lambda function. Note that this works in Python 2 only; tuple parameter unpacking was removed in Python 3 (PEP 3113):
list(itertools.takewhile(lambda (x, y, z): x+y+z < 20 and x > 3 and y < 7 and z > 1, some_generator()))
This also asserts that all your tuples have length 3. If you get a tuple with 4 values, it fails hard, as opposed to continuing silently. Obviously this is only useful in some contexts.
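In Python 3 the lambda (x, y, z): ... form is a syntax error, because tuple parameter unpacking was removed (PEP 3113). A minimal sketch of the same idea with a named function follows; the name keep and the sample data are assumptions made for illustration.

```python
import itertools

def keep(item):
    # Unpacking raises ValueError unless the tuple has exactly 3 elements,
    # which preserves the "fail hard" behaviour described above.
    x, y, z = item
    return x + y + z < 20 and x > 3 and y < 7 and z > 1

# Hypothetical sample data.
data = [(4, 5, 9), (5, 6, 2), (4, 8, 3), (6, 1, 5)]
result = list(itertools.takewhile(keep, data))
print(result)  # [(4, 5, 9), (5, 6, 2)]
```

Iteration stops at (4, 8, 3) because its second element is not below 7.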

Related

Functional programming vs list comprehension

Mark Lutz in his book "Learning Python" gives an example:
>>> [(x,y) for x in range(5) if x%2==0 for y in range(5) if y%2==1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
>>>
a bit later he remarks that 'a map and filter equivalent' of this is possible though complex and nested.
The closest one I ended up with is the following:
>>> list(map(lambda x:list(map(lambda y:(y,x),filter(lambda x:x%2==0,range(5)))), filter(lambda x:x%2==1,range(5))))
[[(0, 1), (2, 1), (4, 1)], [(0, 3), (2, 3), (4, 3)]]
>>>
The order of tuples is different and nested list had to be introduced. I'm curious what would be the equivalent.
A note to append to @Kasramvd's explanation.
Readability is important in Python. It's one of the features of the language. Many will consider the list comprehension the only readable way.
Sometimes, however, especially when you are working with multiple iterations of conditions, it is clearer to separate your criteria from logic. In this case, using the functional method may be preferable.
from itertools import product
def even_and_odd(vals):
    return (vals[0] % 2 == 0) and (vals[1] % 2 == 1)
n = range(5)
res = list(filter(even_and_odd, product(n, n)))
One important point to notice is that your nested list comprehension is O(n^2): it loops over the product of two ranges. If you want to use map and filter, you have to create all the combinations. You can do that before or after filtering, but whatever you do, you can't get all those combinations with those two functions alone, unless you change the ranges and/or modify something else.
One completely functional approach is to use itertools.product() and filter as follows:
In [16]: from itertools import product
In [17]: list(filter(lambda x: x[0]%2==0 and x[1]%2==1, product(range(5), range(5))))
Out[17]: [(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
Also note that a nested list comprehension with two iterations is generally more readable than multiple map/filter calls. As for performance, using built-in functions is faster than a list comprehension only when the functions you pass in are themselves built-ins, so that everything runs at C level. When you break the chain with something like a lambda function, which is a Python-level (higher-level) operation, your code won't be faster than a list comprehension.
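If you want to check the performance remark on your own machine, a rough timing sketch with timeit could look like the following. The function names are made up for illustration, and no timings are quoted here because they depend on the interpreter and hardware.

```python
import timeit
from itertools import product

def with_comprehension():
    return [(x, y) for x in range(5) if x % 2 == 0
            for y in range(5) if y % 2 == 1]

def with_filter():
    # The lambda runs at Python level for every candidate pair.
    return list(filter(lambda p: p[0] % 2 == 0 and p[1] % 2 == 1,
                       product(range(5), range(5))))

# Both approaches must agree before timing them makes sense.
assert with_comprehension() == with_filter()

t_comp = timeit.timeit(with_comprehension, number=1000)
t_filter = timeit.timeit(with_filter, number=1000)
print("comprehension: %.4fs, filter+lambda: %.4fs" % (t_comp, t_filter))
```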
I think the only confusing part in the expression [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1] is that an implicit flatten operation is hidden there.
Let's consider the simplified version of the expression first:
def even(x):
    return x % 2 == 0
def odd(x):
    return not even(x)
# list() is needed in Python 3, where map returns a lazy iterator
c = list(map(lambda x: list(map(lambda y: [x, y],
                                filter(odd, range(5)))),
             filter(even, range(5))))
print(c)
# i.e. for each even X we have a list of odd Ys:
# [
# [[0, 1], [0, 3]],
# [[2, 1], [2, 3]],
# [[4, 1], [4, 3]]
# ]
However, we need much the same thing, but as a flattened list: [(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)].
From the official Python docs we can borrow the flatten example:
from itertools import chain
flattened = list(chain.from_iterable(c)) # we need list() here to unroll an iterator
print(flattened)
This is basically equivalent to the following list comprehension:
flattened = [x for sublist in c for x in sublist]
print(flattened)
# ... which is basically equivalent to:
# result = []
# for sublist in c:
#     for x in sublist:
#         result.append(x)
range supports a step argument, so I came up with this solution, which uses itertools.chain.from_iterable to flatten the inner lists:
from itertools import chain
list(chain.from_iterable(
    map(
        lambda x:
            list(map(lambda y: (x, y), range(1, 5, 2))),
        range(0, 5, 2)
    )
))
Output:
Out[415]: [(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

Eliminating tuples from list of tuples based on a given criterion

So the problem is essentially this: I have a list of tuples made up of n ints that have to be eliminated if they don't fit certain criteria. The criterion boils down to this: each element of the tuple must be less than or equal to the int at the same position in another list (let's call this list f).
So, an example:
Assuming I have a list of tuples called wk, made up of tuples of ints of length 3, and a list f composed of 3 ints. Like so:
wk = [(1,3,8),(8,9,1),(1,1,1)]
f = [2,5,8]
=== After applying the function ===
wk_result = [(1,3,8),(1,1,1)]
The rationale is that, looking at the first tuple of wk ((1,3,8)), its first element is smaller than the first element of f. The second element also complies with the rule, and the same applies to the third. This does not hold for the second tuple, though, given that its first and second elements (8 and 9) are bigger than the first and second elements of f (2 and 5).
Here's the code I have:
for i,z in enumerate(wk):
    for j,k in enumerate(z):
        if k <= f[j]:
            pass
        else:
            del wk[i]
When I run this it is not eliminating the tuples from wk. What could I be doing wrong?
EDIT
One of the answers, provided by user @James, actually made it a whole lot simpler to do what I need to do:
[t for t in wk if t<=tuple(f)]
#returns:
[(1, 3, 8), (1, 1, 1)]
The thing is in my particular case it is not getting the job done, so I assume it might have to do with the previous steps of the process which I will post below:
max_i = max(f)
siz = len(f)
flist = [i for i in range(1,max_i +1)]
def cartesian_power(seq, p):
    if p == 0:
        return [()]
    else:
        result = []
        for x1 in seq:
            for x2 in cartesian_power(seq, p - 1):
                result.append((x1,) + x2)
        return result
wk = cartesian_power(flist, siz)
wk = [i for i in wk if i <= tuple(f) and max(i) == max_i]
What is happening is the following: I cannot use the itertools library to do permutations, that is why I am using a function that gets the job done. Once I produce a list of tuples (wk) with all possible permutations, I filter this list using two parameters: the one that brought me here originally and another one not relevant for the discussion.
I'll show an example of the results with numbers, given f = [2,5,8]:
[(1, 1, 8), (1, 2, 8), (1, 3, 8), (1, 4, 8), (1, 5, 8), (1, 6, 8), (1, 7, 8), (1, 8, 1), (1, 8, 2), (1, 8, 3), (1, 8, 4), (1, 8, 5), (1, 8, 6), (1, 8, 7), (1, 8, 8), (2, 1, 8), (2, 2, 8), (2, 3, 8), (2, 4, 8), (2, 5, 8)]
As you can see, there are instances where the ints in the tuple are bigger than the corresponding position in the f list, like (1,6,8) where the second position of the tuple (6) is bigger than the number in the second position of f (5).
You can use a list comprehension with a (short-circuiting) predicate over each tuple zipped with the list f.
wk = [(1, 3, 8), (8, 9, 1), (1, 1, 1), (1, 9, 1)]
f = [2, 5, 8] # In this contrived example, f could preferably be a 3-tuple as well.
filtered = [t for t in wk if all(a <= b for (a, b) in zip(t, f))]
print(filtered) # [(1, 3, 8), (1, 1, 1)]
Here, all() has been used to specify a predicate that all tuple members must be less or equal to the corresponding element in the list f; all() will short-circuit its testing of a tuple as soon as one of its members does not pass the tuple member/list member <= sub-predicate.
Note that I added a (1, 9, 1) tuple for an example where the first tuple element passes the sub-predicate (<= corresponding element in f) whereas the 2nd tuple element does not (9 > 5).
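For comparison with the explicit loop in the question: deleting from a list while iterating over it shifts the positions of the remaining items, so the wrong tuples can be removed or skipped. A minimal sketch that keeps the nested-loop shape but builds a new list instead (the name kept is an assumption for illustration) might look like this:

```python
wk = [(1, 3, 8), (8, 9, 1), (1, 1, 1), (1, 9, 1)]
f = [2, 5, 8]

kept = []
for t in wk:
    for j, k in enumerate(t):
        if k > f[j]:
            # One element is too large: reject this tuple and stop checking it.
            break
    else:
        # The inner loop finished without break, so every element passed.
        kept.append(t)

print(kept)  # [(1, 3, 8), (1, 1, 1)]
```

The for/else form plays the same role as all() in the comprehension: the else branch runs only when no element failed.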
You can do this with a list comprehension. It iterates over the list of tuples and checks that all of the elements of each tuple are less than or equal to the corresponding elements in f. Note that comparing tuples directly (t <= tuple(f)) is lexicographic, not element-wise, so the check has to be done pairwise with zip:
[t for t in wk if all(x<=y for x,y in zip(t,f))]
# returns:
[(1, 3, 8), (1, 1, 1)]
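The difference between direct tuple comparison and the pairwise check explains why tuples such as (1, 6, 8) survived the t <= tuple(f) filter in the question. A small sketch using that tuple:

```python
f = (2, 5, 8)
t = (1, 6, 8)

# Tuples compare lexicographically: the first differing position decides.
# Here 1 < 2, so the comparison is already True and 6 vs 5 is never looked at.
lexicographic = t <= f

# The pairwise check looks at every position, and 6 > 5 fails.
elementwise = all(a <= b for a, b in zip(t, f))

print(lexicographic)  # True
print(elementwise)    # False
```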
Here is a recursive solution without an explicit loop, which compares each element of the tuple:
wk_1 = [(1,3,8),(8,9,1),(1,1,1)]
f = [2,5,8]
final_input=[]
def comparison(wk, target):
    if not wk:
        return 0
    else:
        data=wk[0]
        if data[0]<=target[0] and data[1]<=target[1] and data[2]<=target[2]:
            final_input.append(data)
        comparison(wk[1:],target)
comparison(wk_1,f)
print(final_input)
output:
[(1, 3, 8), (1, 1, 1)]
P.S.: I don't know whether you want a less-than-or-equal or a strictly-less-than condition, so modify it according to your needs.

How to implement this using map and filter?

How to write a statement using map and filter to get same result as this list comprehension expression:
[(x,y) for x in range(10) if x%5==0
for y in range(10) if y%5==1]
result:
[(0, 1), (0, 6), (5, 1), (5, 6)]
I know it seems to be pointless, but I'm just really curious
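This is how I did it without comprehensions (this relies on Python 2, where map returns a list; in Python 3 the inner map objects cannot be added to a list by sum):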
sum(map(lambda x: map(lambda y: (x,y), filter(lambda y: y%5==1,range(10))), filter(lambda x: x%5==0,range(10))),[])
Executing:
>>> sum(map(lambda x: map(lambda y: (x,y), filter(lambda y: y%5==1,range(10))), filter(lambda x: x%5==0,range(10))),[])
[(0, 1), (0, 6), (5, 1), (5, 6)]
The last, and (maybe) nasty, trick is using sum to flatten the list. Without it I was getting [[(0, 1), (0, 6)], [(5, 1), (5, 6)]].
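On Python 3, map returns a lazy iterator, so the sum trick needs each inner map wrapped in list() before the lists can be concatenated. A sketch of that adaptation (the variable names are assumptions for illustration):

```python
# Outer values: multiples of 5 below 10.
xs = filter(lambda x: x % 5 == 0, range(10))

# For each x, build a concrete list of (x, y) pairs; list() is what lets
# sum() concatenate the pieces onto the empty-list start value.
nested = map(
    lambda x: list(map(lambda y: (x, y),
                       filter(lambda y: y % 5 == 1, range(10)))),
    xs,
)

pairs = sum(nested, [])
print(pairs)  # [(0, 1), (0, 6), (5, 1), (5, 6)]
```

Using sum to concatenate lists is quadratic in the number of pieces, so itertools.chain.from_iterable is usually the better choice for anything larger than a toy example.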
Judging by the comments to one of the other answers, you want the cartesian product part of this - that is, this bit:
[(x,y) for x in range(10) for y in range(10)]
also done using just map and filter. This is impossible. The output of map has exactly as many elements as the iterable you input into it. The output of filter has length of no more than the length of the input. The length of a cartesian product is the product of the lengths of the inputs, which no combination of map and filter can give you.
To do the cartesian product part, you need some kind of nested loops. You can write them out yourself, as you have in the list comprehension, or you can use itertools.product:
product(range(10), range(10))
After that, you just need two applications of filter - to directly translate from a listcomp to map/filter, you need one application of filter for each if-part of the listcomp. The ifs are attached to each variable iteration rather than to the result, so you put them around each argument to product - which also avoids having to nest them creatively. Finally, we need to pass it through list since product gives an iterator.
It looks like this:
list(product(filter(lambda x: (x%5 == 0), range(10)), filter(lambda y: (y % 5 == 1), range(10))))

List comprehension and function returning multiple values

I wanted to use list comprehension to avoid writing a for loop appending to some lists. But can it work with a function that returns multiple values? I expected this (simplified example) code to work...
def calc(i):
    a = i * 2
    b = i ** 2
    return a, b
steps = [1,2,3,4,5]
ay, be = [calc(s) for s in steps]
... but it doesn't :(
The for-loop appending to each list works:
def calc(i):
    a = i * 2
    b = i ** 2
    return a, b
steps = [1,2,3,4,5]
ay, be = [],[]
for s in steps:
    a, b = calc(s)
    ay.append(a)
    be.append(b)
Is there a better way or do I just stick with this?
Use zip with *:
>>> ay, by = zip(*(calc(x) for x in steps))
>>> ay
(2, 4, 6, 8, 10)
>>> by
(1, 4, 9, 16, 25)
The horrendous "space efficient" version that returns iterators:
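from itertools import tee
from operator import itemgetter
# map(itemgetter(i), ...) binds i immediately; a bare generator expression (r[i] for r in results) would look i up lazily, so both iterators would end up using the last value of i.
ay, by = [map(itemgetter(i), results) for i, results in enumerate(tee(map(calc, steps), 2))]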
But basically just use zip because most of the time it's not worth the ugly.
Explanation:
zip(*(calc(x) for x in steps))
will do (calc(x) for x in steps) to get an iterator of [(2, 1), (4, 4), (6, 9), (8, 16), (10, 25)].
When you unpack, you do the equivalent of
zip((2, 1), (4, 4), (6, 9), (8, 16), (10, 25))
so all of the items are stored in memory at once. Proof:
def return_args(*args):
    return args
return_args(*(calc(x) for x in steps))
#>>> ((2, 1), (4, 4), (6, 9), (8, 16), (10, 25))
Hence all items are in memory at once.
So how does mine work?
map(calc, steps) is the same as (calc(x) for x in steps) (Python 3). This is an iterator. On Python 2, use imap or (calc(x) for x in steps).
tee(..., 2) gets two iterators that store the difference in iteration. If you iterate in lockstep the tee will take O(1) memory. If you do not, the tee can take up to O(n). So now we have a usage that lets us have O(1) memory up to this point.
enumerate obviously will keep this in constant memory.
(r[i] for r in results) returns an iterator that takes the ith item from each of the results. This means it receives, in this case, a pair (so r=(2,1), r=(4,4), etc. in turn). It returns the specific iterator.
Hence if you iterate ay and by in lockstep constant memory will be used. The memory usage is proportional to the distance between the iterators. This is useful in many cases (imagine diffing a file or suchwhat) but as I said most of the time it's not worth the ugly. There's an extra constant-factor overhead, too.
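To make the lockstep idea concrete, here is a small sketch for Python 3. It uses operator.itemgetter to pick each column, which is my substitution rather than something taken from the answer: itemgetter(i) is bound when map() is called, which sidesteps the late lookup of i that a generator expression inside a list comprehension would perform.

```python
from itertools import tee
from operator import itemgetter

def calc(i):
    return i * 2, i ** 2

steps = [1, 2, 3, 4, 5]

# tee() splits one lazy stream of (a, b) pairs into two independent iterators.
first, second = tee(map(calc, steps), 2)

# Each stream lazily picks its own column.
ay = map(itemgetter(0), first)
by = map(itemgetter(1), second)

# Consuming both in lockstep keeps tee's internal buffer small, because
# neither iterator runs far ahead of the other.
lockstep = list(zip(ay, by))
print(lockstep)  # [(2, 1), (4, 4), (6, 9), (8, 16), (10, 25)]
```

If ay were fully consumed before by, tee would have to buffer every pair, which is the O(n) case mentioned above.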
You should have shown us what
[calc(s) for s in xrange(5)]
does give you, i.e.
[(0, 0), (2, 1), (4, 4), (6, 9), (8, 16)]
While it isn't the 2 lists that you want, it is still a list of tuples. Furthermore, doesn't that look just like the result of the following?
zip((0, 2, 4, 6, 8), (0, 1, 4, 9, 16))
zip repackages a set of lists. Usually it is illustrated with 2 longer lists, but it works just as well with many short lists.
The third step is to remember that fn(*[arg1,arg2, ...]) = fn(arg1,arg2, ...), that is, the * unpacks a list.
Put it all together to get hcwhsa's answer.
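The three steps can be tried out directly. This sketch uses range rather than xrange so that it runs on Python 3:

```python
def calc(i):
    return i * 2, i ** 2

# Step 1: the comprehension gives a list of (a, b) pairs.
pairs = [calc(s) for s in range(5)]

# Steps 2 and 3: * spreads the pairs into separate arguments, and zip
# regroups them by position, i.e. it transposes rows into columns.
ay, be = zip(*pairs)

print(ay)  # (0, 2, 4, 6, 8)
print(be)  # (0, 1, 4, 9, 16)
```

Zipping the two columns back together, list(zip(ay, be)), returns the original list of pairs.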

Return a sequence of a variable length whose summation is equal to a given integer

In the form f(x,y,z) where x is a given integer sum, y is the minimum length of the sequence, and z is the maximum length of the sequence. But for now let's pretend we're dealing with a sequence of a fixed length, because it will take me a long time to write the question otherwise.
So our function is f(x,r) where x is a given integer sum and r is the length of a sequence in the list of possible sequences.
For x = 10, and r = 2, these are the possible combinations:
1 + 9
2 + 8
3 + 7
4 + 6
5 + 5
Let's store that in Python as a list of pairs:
[(1,9), (2,8), (3,7), (4,6), (5,5)]
So usage looks like:
>>> f(10,2)
[(1,9), (2,8), (3,7), (4,6), (5,5)]
Back to the original question, where a sequence is returned for each length in the range (y, z). In the form f(x,y,z), defined earlier, and leaving out sequences of length 1 (where y-z == 0), this would look like:
>>> f(10,1,3)
[{1: [(1,9), (2,8), (3,7), (4,6), (5,5)],
2: [(1,1,8), (1,2,7), (1,3,6) ... (2,4,4) ...],
3: [(1,1,1,7) ...]}]
So the output is a list of dictionaries where the value is a list of pairs. Not exactly optimal.
So my questions are:
Is there a library that handles this already?
If not, can someone help me write both of the functions I mentioned? (fixed sequence length first)?
Because of the huge gaps in my knowledge of fairly trivial math, could you ignore my approach to integer storage and use whatever structure makes the most sense?
Sorry about all of these arithmetic questions today. Thanks!
The itertools module will definitely be helpful as we're dealing with permutations. However, this looks suspiciously like a homework task...
Edit: Looks like fun though, so I'll do an attempt.
Edit 2: This what you want?
from itertools import combinations_with_replacement
from pprint import pprint
f = lambda target_sum, length: [sequence for sequence in combinations_with_replacement(range(1, target_sum+1), length) if sum(sequence) == target_sum]
def f2(target_sum, min_length, max_length):
    sequences = {}
    for length in range(min_length, max_length + 1):
        sequence = f(target_sum, length)
        if len(sequence):
            sequences[length] = sequence
    return sequences
if __name__ == "__main__":
    print("f(10,2):")
    print(f(10,2))
    print()
    print("f(10,1,3)")
    pprint(f2(10,1,3))
Output:
f(10,2):
[(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)]
f(10,1,3)
{1: [(10,)],
2: [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)],
3: [(1, 1, 8),
(1, 2, 7),
(1, 3, 6),
(1, 4, 5),
(2, 2, 6),
(2, 3, 5),
(2, 4, 4),
(3, 3, 4)]}
The problem is known as Integer Partitions, and has been widely studied.
Here you can find a paper comparing the performance of several algorithms (and proposing a particular one), but there are a lot of references all over the Net.
I just wrote a recursive generator function, you should figure out how to get a list out of it yourself...
def f(x,y):
    if y == 1:
        yield (x, )
    elif y > 1:
        for head in range(1, x-y+2):
            for tail in f(x-head, y-1):
                yield tuple([head] + list(tail))
def f2(x,y,z):
    for u in range(y, z+1):
        for v in f(x, u):
            yield v
EDIT: I just saw that this is not exactly what you wanted: my version also generates duplicates where only the ordering differs. But you can simply filter them out by sorting each result and checking for duplicate tuples.
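One way to do that filtering, as a sketch: sort each generated tuple and keep only the first occurrence of every sorted form. The names compositions and unique_partitions are made up for illustration; compositions repeats the recursive generator from this answer so that the block runs on its own.

```python
def compositions(x, y):
    # Same recursion as f above: ordered sequences of y positive ints summing to x.
    if y == 1:
        yield (x,)
    elif y > 1:
        for head in range(1, x - y + 2):
            for tail in compositions(x - head, y - 1):
                yield (head,) + tail

def unique_partitions(x, y):
    # Sort each sequence so that reorderings collapse to one canonical tuple.
    seen = set()
    for seq in compositions(x, y):
        key = tuple(sorted(seq))
        if key not in seen:
            seen.add(key)
            yield key

print(list(unique_partitions(10, 2)))
# [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)]
```

This still enumerates every ordering before discarding duplicates, so for larger inputs it is cheaper to generate only non-decreasing sequences in the first place.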
