How to convert recursive function to generator? - python

I have the following recursive function to generate a list of valid configurations for a (named) list of positions, where each position can be used only once:
def generate_configurations(configurations, named_positions, current):
    if len(current) == len(named_positions):
        configurations.append(current)
        return configurations
    name, positions = named_positions[len(current)]
    for x in positions:
        if x not in current:
            generate_configurations(configurations, named_positions, current + (x,))
    return configurations
Here is an example of how I call it:
named_positions = [('a', [0,1,2]),
                   ('b', [1,3]),
                   ('c', [1,2])]
for comb in generate_configurations([], named_positions, ()):
    print comb
Which gives the following output:
(0, 1, 2)
(0, 3, 1)
(0, 3, 2)
(1, 3, 2)
(2, 3, 1)
Also, it is possible there are no valid combinations, e.g. for named_positions = [('a', [3]), ('b', [3])].
Now depending on the input named_positions, the configurations list can quickly become huge, resulting in a MemoryError. I believe this function could be re-written as a generator, so I tried the following:
def generate_configurations(named_positions, current):
    if len(current) == len(named_positions):
        yield current
    name, positions = named_positions[len(current)]
    for x in positions:
        if x not in current:
            generate_configurations(named_positions, current + (x,))
named_positions = [('a', [0,1,2]),
                   ('b', [1,3]),
                   ('c', [1,2])]
for comb in generate_configurations(named_positions, ()):
    print comb
but this doesn't generate any results at all. What am I doing wrong?

You need to yield values back up the recursive call stack, or the inner yields never run and their results are discarded. Since this is tagged Python 2.7, the recursive calls would be handled by changing:
if x not in current:
    # Creates the generator, but doesn't run it out to get and yield values
    generate_configurations(named_positions, current + (x,))
to:
if x not in current:
    # Actually runs the generator and yields values up the call stack
    for y in generate_configurations(named_positions, current + (x,)):
        yield y
In Python 3.3 and higher, you can delegate directly with:
if x not in current:
    yield from generate_configurations(named_positions, current + (x,))
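One more fix is needed in either version: after yield current, execution falls through to named_positions[len(current)] and raises an IndexError, because the generator version dropped the return that the list-building version had. A complete Python 3 sketch with both fixes applied:
def generate_configurations(named_positions, current=()):
    if len(current) == len(named_positions):
        yield current
        return  # the list version returned here; without this we index past the end
    name, positions = named_positions[len(current)]
    for x in positions:
        if x not in current:
            yield from generate_configurations(named_positions, current + (x,))
With the example input, this lazily yields the same five configurations as the list-building version, without ever materializing the full list.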

When you are using generators, you need to make sure that your sub-generator recursive calls pass their values back up to the calling method.
def recur_generator(n):
    yield my_thing        # my_thing and my_other_thing are placeholder values
    yield my_other_thing
    if n > 0:
        yield from recur_generator(n-1)
Notice here that the yield from is what passes the yield calls back up to the parent call.
You should change the recursive call line to
yield from generate_configurations(named_positions, current + (x,))
Otherwise, your generator is fine.
EDIT: Didn't notice that this was Python 2. You can use
for x in recur_generator(n-1):
    yield x
instead of yield from.


Generator expressions vs generator functions and surprisingly eager evaluation

For reasons which are not relevant here, I am combining some data structures in a certain way, whilst also replacing Python 2.7's default dict with OrderedDict. The data structures use tuples as keys in dictionaries. Please ignore those details (the replacement of the dict type is not useful below, but it is in the real code).
import __builtin__
import collections
import contextlib
import itertools

def combine(config_a, config_b):
    return (dict(first, **second) for first, second in itertools.product(config_a, config_b))

@contextlib.contextmanager
def dict_as_ordereddict():
    dict_orig = __builtin__.dict
    try:
        __builtin__.dict = collections.OrderedDict
        yield
    finally:
        __builtin__.dict = dict_orig
This works as expected initially (dict can take non-string keyword arguments as a special case):
print 'one level nesting'
with dict_as_ordereddict():
    result = combine(
        [{(0, 1): 'a', (2, 3): 'b'}],
        [{(4, 5): 'c', (6, 7): 'd'}]
    )
print list(result)
print
Output:
one level nesting
[{(0, 1): 'a', (4, 5): 'c', (2, 3): 'b', (6, 7): 'd'}]
However, when nesting calls to the combine generator expression, it can be seen that the dict reference is treated as OrderedDict, lacking the special behaviour of dict to use tuples as keyword arguments:
print 'two level nesting'
with dict_as_ordereddict():
    result = combine(combine(
            [{(0, 1): 'a', (2, 3): 'b'}],
            [{(4, 5): 'c', (6, 7): 'd'}]
        ),
        [{(8, 9): 'e', (10, 11): 'f'}]
    )
print list(result)
print
Output:
two level nesting
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    [{(8, 9): 'e', (10, 11): 'f'}]
  File "test.py", line 8, in combine
    return (dict(first, **second) for first, second in itertools.product(config_a, config_b))
  File "test.py", line 8, in <genexpr>
    return (dict(first, **second) for first, second in itertools.product(config_a, config_b))
TypeError: __init__() keywords must be strings
Furthermore, implementing via yield instead of a generator expression fixes the problem:
def combine_yield(config_a, config_b):
    for first, second in itertools.product(config_a, config_b):
        yield dict(first, **second)

print 'two level nesting, yield'
with dict_as_ordereddict():
    result = combine_yield(combine_yield(
            [{(0, 1): 'a', (2, 3): 'b'}],
            [{(4, 5): 'c', (6, 7): 'd'}]
        ),
        [{(8, 9): 'e', (10, 11): 'f'}]
    )
print list(result)
print
Output:
two level nesting, yield
[{(0, 1): 'a', (8, 9): 'e', (2, 3): 'b', (4, 5): 'c', (6, 7): 'd', (10, 11): 'f'}]
Questions:
Why does some item (only the first?) from the generator expression get evaluated before required in the second example, or what is it required for?
Why is it not evaluated in the first example? I actually expected this behaviour in both.
Why does the yield-based version work?
Before going into the details, note the following: itertools.product evaluates (materializes) its iterator arguments up front in order to compute the product. This can be seen from the equivalent Python implementation in the docs (the first line is relevant):
def product(*args, **kwds):
    pools = map(tuple, args) * kwds.get('repeat', 1)
    ...
You can also try this with a custom class and a short test script:
import itertools

class Test:
    def __init__(self):
        self.x = 0

    def __iter__(self):
        return self

    def next(self):
        print('next item requested')
        if self.x < 5:
            self.x += 1
            return self.x
        raise StopIteration()

t = Test()
itertools.product(t, t)
Creating the itertools.product object will show in the output that all the iterator's items are immediately requested. This means that as soon as you call itertools.product, the iterator arguments are evaluated. This is important because in the first case the arguments are just two lists, so there's no problem. You then evaluate the final result via list(result) after the context manager dict_as_ordereddict has returned, so all calls to dict are resolved as the normal builtin dict.
Now for the second example: the inner call to combine still works fine, returning a generator expression which is then used as one of the arguments to the second combine's call to itertools.product. As we've seen above, these arguments are evaluated immediately, and so the generator object is asked to generate its values. In order to do so, it needs to resolve dict. However, we're still inside the context manager dict_as_ordereddict, and for that reason dict is resolved as OrderedDict, which doesn't accept non-string keys as keyword arguments.
It is important to notice here that the first version which uses return needs to create the generator object in order to return it. That involves creating the itertools.product object. That means this version is as lazy as itertools.product.
Now to the question of why the yield version works. By using yield, invoking the function returns a generator. This is a truly lazy version in the sense that execution of the function body doesn't start until items are requested. This means neither the inner nor the outer call to combine_yield will start executing the function body, and thus invoking itertools.product, until the items are requested via list(result). You can check that by putting an additional print statement inside that function and right after the context manager:
def combine(config_a, config_b):
    print 'start'
    # return (dict(first, **second) for first, second in itertools.product(config_a, config_b))
    for first, second in itertools.product(config_a, config_b):
        yield dict(first, **second)

with dict_as_ordereddict():
    result = combine(combine(
            [{(0, 1): 'a', (2, 3): 'b'}],
            [{(4, 5): 'c', (6, 7): 'd'}]
        ),
        [{(8, 9): 'e', (10, 11): 'f'}]
    )

print 'end of context manager'
print list(result)
print
With the yield version we'll notice that it prints the following:
end of context manager
start
start
I.e. the generators are started only when the results are requested via list(result). This is different from the return version (uncomment in the above code). Now you'll see
start
start
and before the end of the context manager is reached the error is already raised.
On a side note: for your code to work, the replacement of dict needs to be ineffective (and it is for the first version), so I don't see why you would use that context manager at all. Secondly, dict literals are not ordered in Python 2, and neither are keyword arguments, so that also defeats the purpose of using OrderedDict. Also note that in Python 3 the non-string keyword argument behavior of dict has been removed, and the clean way to update dictionaries with keys of any type is to use dict.update.
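For illustration, here is how the merge inside combine could be written for Python 3 with copy-and-update instead of the keyword-argument trick (a sketch, not part of the original code):
first = {(0, 1): 'a'}
second = {(2, 3): 'b'}
merged = dict(first)   # copy, then update: works for keys of any hashable type
merged.update(second)
print(merged)          # {(0, 1): 'a', (2, 3): 'b'}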

Unpacking and re-packing a tuple (Python 2.x)

I've written a function that accepts, works with, and returns simple, non-nested tuples, e.g.:
myfun((1,2,3,4)):
    ... -> logic
    return (1,2,3,4) -> the numbers can change, but the shape will be the same
The logic works only with one-dimensional tuples, but it is conceptually the same for each level of nesting. So I was wondering if there's a way to convert a nested tuple like ((1,2,(3,)),(4,)) into the plain (1,2,3,4) and then convert it back to ((1,2,(3,)),(4,)).
Basically what I want is to unpack a generic input tuple, work with it, and then pack the results in the same shape of the given one.
Is there a Pythonic way to accomplish such a task?
Probably the unpacking could be solved with recursion; however, I'm not sure about the "re-packing" part.
The unpacking is not that hard:
def unpack(parent):
    for child in parent:
        if type(child) == tuple:
            yield from unpack(child)
        else:
            yield child
for example, can do the trick.
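Since the question targets Python 2.x, where yield from is unavailable (it arrived in Python 3.3), here is a sketch of the same generator with an explicit re-yield loop instead:
def unpack(parent):
    # Python 2 variant: no "yield from", so re-yield each element explicitly
    for child in parent:
        if type(child) == tuple:
            for grandchild in unpack(child):
                yield grandchild
        else:
            yield child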
Repacking is a bit trickier. I came up with the following, which works but is not very pythonic, I'm afraid:
def repack(structured, flat):
    output = []
    global flatlist
    flatlist = list(flat)
    for child in structured:
        if type(child) == tuple:
            output.append(repack(child, flatlist))
        else:
            output.append(flatlist.pop(0))
    return tuple(output)
Example usage is:
nested = ((1, 2, (3,)), (4,))
plain = tuple(unpack(nested))
renested = repack(nested, plain)
Hope this helps!
This should work for the repacking:
x = (1,(2,3),(4,(5,6)))
y = (9,8,7,6,5,4)

def map_shape(x, y, start=0):
    if type(x) == tuple:
        l = []
        for item in x:
            mapped, n_item = map_shape(item, y[start:])
            start += n_item
            l.append(mapped)
        return tuple(l), start
    else:
        return y[start], start+1

map_shape(x,y)[0]
Output:
(9, (8, 7), (6, (5, 4)))
I submit my version. It uses the same function both to flatten and to reconstruct the list: if flat is None it flattens; otherwise it reconstructs by yielding tuples.
from collections.abc import Iterable  # plain collections.Iterable was removed in Python 3.10

def restructure(original, flat=None):
    for el in original:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            if flat:
                yield tuple(restructure(el, flat))
            else:
                yield from restructure(el)
        else:
            yield next(flat) if flat else el
def gen():
    i = 0
    while True:
        yield i
        i += 1

def myfun(iterable):
    flat = tuple(restructure(iterable))
    # your transformation ..
    flat = gen()  # assigning infinite number generator for testing
    return restructure(iterable, flat=iter(flat))

x = (1, (2, 3), (4, (5, 6)))
print(tuple(y for y in myfun(x)))  # (0, (1, 2), (3, (4, 5)))

Can generators be recursive?

I naively tried to create a recursive generator. Didn't work. This is what I did:
def recursive_generator(lis):
    yield lis[0]
    recursive_generator(lis[1:])

for k in recursive_generator([6,3,9,1]):
    print(k)
All I got was the first item 6.
Is there a way to make such code work? Essentially transferring the yield command to the level above in a recursion scheme?
Try this:
def recursive_generator(lis):
    yield lis[0]
    yield from recursive_generator(lis[1:])

for k in recursive_generator([6,3,9,1]):
    print(k)
I should point out that this doesn't quite work because of a bug in your function: once lis is empty, yield lis[0] raises an IndexError. It should include a check that lis isn't empty, as shown below:
def recursive_generator(lis):
    if lis:
        yield lis[0]
        yield from recursive_generator(lis[1:])
In case you are on Python 2.7 and don't have yield from, check this question out.
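For reference, a sketch of the Python 2.7 spelling, with the delegation replaced by a loop:
def recursive_generator(lis):
    if lis:
        yield lis[0]
        for item in recursive_generator(lis[1:]):  # Python 2 replacement for "yield from"
            yield item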
Why your code didn't do the job
In your code, the generator function:
returns (yields) the first value of the list
then it creates a new iterator object calling the same generator function, passing a slice of the list to it
and then stops
The second instance of the iterator, the one recursively created, is never being iterated over. That's why you only got the first item of the list.
A generator function is useful to automatically create an iterator object (an object that implements the iterator protocol), but then you need to iterate over it: either manually calling the next() method on the object or by means of a loop statement that will automatically use the iterator protocol.
So, can we recursively call a generator?
The answer is yes. Now back to your code, if you really want to do this with a generator function, I guess you could try:
def recursive_generator(some_list):
    """
    Return some_list items, one at a time, recursively iterating over a slice of it...
    """
    if len(some_list) > 1:
        # some_list has more than one item, so iterate over it
        for i in recursive_generator(some_list[1:]):
            # recursively call this generator function to iterate over a slice of some_list.
            # return one item from the list.
            yield i
        else:
            # the iterator returned StopIteration, so the for loop is done.
            # to finish, return the only value not included in the slice we just iterated on.
            yield some_list[0]
    else:
        # some_list has only one item, no need to iterate on it.
        # just return the item.
        yield some_list[0]

some_list = [6,3,9,1]

for k in recursive_generator(some_list):
    print(k)
Note: the items are returned in reversed order, so you might want to use some_list.reverse() before calling the generator the first time.
The important thing to note in this example is: the generator function recursively calls itself in a for loop, which sees an iterator and automatically uses the iteration protocol on it, so it actually gets values from it.
This works, but I think this is really not useful. We are using a generator function to iterate over a list and just get the items out, one at a time, but... a list is an iterable itself, so no need for generators!
Of course I get it, this is just an example, maybe there are useful applications of this idea.
Another example
Let's recycle the previous example (for laziness). Let's say we need to print the items in a list, adding to every item the count of previous items (just a random example, not necessarily useful).
The code would be:
def recursive_generator(some_list):
    """
    Return some_list items, one at a time, recursively iterating over a slice of it...
    and adding to every item the count of previous items in the list
    """
    if len(some_list) > 1:
        # some_list has more than one item, so iterate over it
        for i in recursive_generator(some_list[1:]):
            # recursively call this generator function to iterate over a slice of some_list.
            # return one item from the list, but add 1 first.
            # Every recursive iteration will add 1, so we basically add the count of iterations.
            yield i + 1
        else:
            # the iterator returned StopIteration, so the for loop is done.
            # to finish, return the only value not included in the slice we just iterated on.
            yield some_list[0]
    else:
        # some_list has only one item, no need to iterate on it.
        # just return the item.
        yield some_list[0]

some_list = [6,3,9,1]

for k in recursive_generator(some_list):
    print(k)
Now, as you can see, the generator function is actually doing something before returning list items AND the use of recursion starts to make sense. Still, just a stupid example, but you get the idea.
Note: of course, in this stupid example the list is expected to contain only numbers. If you really want to go try and break it, just put a string in some_list and have fun. Again, this is only an example, not production code!
Recursive generators are useful for traversing non-linear structures. For example, let a binary tree be either None or a tuple of value, left tree, right tree. A recursive generator is the easiest way to visit all nodes. Example:
tree = (0, (1, None, (2, (3, None, None), (4, (5, None, None), None))),
        (6, None, (7, (8, (9, None, None), None), None)))
def visit(tree):
    if tree is not None:
        try:
            value, left, right = tree
        except ValueError:  # wrong number to unpack
            print("Bad tree:", tree)
        else:  # The following is one of 3 possible orders.
            yield from visit(left)
            yield value  # Put this first or last for different orders.
            yield from visit(right)

print(list(visit(tree)))
# prints nodes in the correct order for 'yield value' in the middle.
# [1, 3, 2, 5, 4, 0, 6, 9, 8, 7]
Edit: replace if tree with if tree is not None to catch other false values as errors.
Edit 2: about putting the recursive calls in the try: clause (comment by @jpmc26).
For bad nodes, the code above just logs the ValueError and continues. If, for instance, (9,None,None) is replaced by (9,None), the output is
Bad tree: (9, None)
[1, 3, 2, 5, 4, 0, 6, 8, 7]
More typical would be to reraise after logging, making the output be
Bad tree: (9, None)
Traceback (most recent call last):
  File "F:\Python\a\tem4.py", line 16, in <module>
    print(list(visit(tree)))
  File "F:\Python\a\tem4.py", line 14, in visit
    yield from visit(right)
  File "F:\Python\a\tem4.py", line 14, in visit
    yield from visit(right)
  File "F:\Python\a\tem4.py", line 12, in visit
    yield from visit(left)
  File "F:\Python\a\tem4.py", line 12, in visit
    yield from visit(left)
  File "F:\Python\a\tem4.py", line 7, in visit
    value, left, right = tree
ValueError: not enough values to unpack (expected 3, got 2)
The traceback gives the path from the root to the bad node. One could wrap the original visit(tree) call to reduce the traceback to the path: (root, right, right, left, left).
If the recursive calls are included in the try: clause, the error is recaught, relogged, and reraised at each level of the tree.
Bad tree: (9, None)
Bad tree: (8, (9, None), None)
Bad tree: (7, (8, (9, None), None), None)
Bad tree: (6, None, (7, (8, (9, None), None), None))
Bad tree: (0, (1, None, (2, (3, None, None), (4, (5, None, None), None))), (6, None, (7, (8, (9, None), None), None)))
Traceback (most recent call last):
... # same as before
The multiple logging reports are likely more noise than help. If one wants the path to the bad node, it might be easiest to wrap each recursive call in its own try: clause and raise a new ValueError at each level, with the constructed path so far.
Conclusion: if one is not using an exception for flow control (as may be done with IndexError, for instance), the presence and placement of try: statements depend on the error reporting one wants.
The reason your recursive call only executes once is that you are essentially creating nested generators. That is, you are creating a new generator inside a generator each time you call the function recursive_generator recursively.
Try the following and you will see.
def recursive_generator(lis):
    yield lis[0]
    yield recursive_generator(lis[1:])

for k in recursive_generator([6,3,9,1]):
    print(type(k))
One simple solution, like others mention, is to use yield from.
Up to Python 3.4, a generator function could signal that it was done by raising a StopIteration exception. For the recursive case, other exceptions (e.g. IndexError) are raised earlier than StopIteration, therefore we add it manually.
def recursive_generator(lis):
    if not lis: raise StopIteration
    yield lis[0]
    yield from recursive_generator(lis[1:])

for k in recursive_generator([6, 3, 9, 1]):
    print(k)
def recursive_generator(lis):
    if not lis: raise StopIteration
    yield lis.pop(0)
    yield from recursive_generator(lis)

for k in recursive_generator([6, 3, 9, 1]):
    print(k)
Note that the for loop will catch the StopIteration exception.
More about this here
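A caveat worth adding: since PEP 479 (enforced from Python 3.7), a StopIteration raised inside a generator is converted into a RuntimeError, so on modern Python use a plain return instead:
def recursive_generator(lis):
    if not lis:
        return  # on Python 3.7+, don't raise StopIteration inside a generator
    yield lis[0]
    yield from recursive_generator(lis[1:])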
Yes you can have recursive generators. However, they suffer from the same recursion depth limit as other recursive functions.
def recurse(x):
    yield x
    yield from recurse(x)

for (i, x) in enumerate(recurse(5)):
    print(i, x)
This loop gets to about 3000 (for me) before crashing.
However, with some trickery, you can create a function that feeds a generator to itself. This allows you to write generators like they are recursive but are not: https://gist.github.com/3noch/7969f416d403ba3a54a788b113c204ce
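The gist linked above feeds a generator to itself; another common way to dodge the depth limit, shown here as a sketch on a hypothetical flatten helper, is to manage an explicit stack instead of recursing:
def flatten(nested):
    # depth-first traversal with an explicit stack of iterators,
    # so deeply nested input cannot exhaust the call stack
    stack = [iter(nested)]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()
            continue
        if isinstance(item, list):
            stack.append(iter(item))
        else:
            yield item

print(list(flatten([1, [2, [3, [4]]]])))  # [1, 2, 3, 4]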

Reversing a nested tuple in Python using the function reversed

I have a tuple and would like to reverse it in Python.
The tuple looks like this : (2, (4, (1, (10, None)))).
I tried reversing in Python by:
a = (2, (4, (1, (10, None))))
b = reversed(a)
It returns me this:
<reversed object at 0x02C73270>
How do I get the reverse of a? Or must I write a function to do this?
The result should look like this:
((((None, 10), 1), 4), 2)
def my_reverser(x):
    try:
        x_ = x[::-1]
    except TypeError:
        return x
    else:
        return x if len(x) == 1 else tuple(my_reverser(e) for e in x_)
Try this deep-reverse function:
def deep_reverse(t):
    return tuple(deep_reverse(x) if isinstance(x, tuple) else x
                 for x in reversed(t))
This will handle arbitrarily nested tuples, not just two-tuples.
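For example, applied to the tuple from the question:
a = (2, (4, (1, (10, None))))
print(deep_reverse(a))  # ((((None, 10), 1), 4), 2)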
As explained in the documentation, the reversed function returns an iterator (hence the <reversed object at ...>). If you want to get a list or a tuple out of it, just use list(reversed(...)) or tuple(reversed(...)).
However, that's only part of your problem: you'd be reversing the initial object (2, (...)) into (..., 2) while the ... stays the same. You have to implement a recursive reverse: if an element of your input tuple is itself iterable, you need to reverse it too.
It does not make sense to do this with reversed, sorry. But a simple recursive function would return what you want:
def reversedLinkedTuple(t):
    if t is None:
        return t
    a, b = t
    return reversedLinkedTuple(b), a
reversed is usable only on reversible iterable objects like lists, tuples and the like. What you are using (a linked list) isn't iterable in the sense of the Python built-in iter.
You could write a wrapping class for your linked list which implements this and then offers a reverse iterator, but I think that would be overkill and would not really suit your needs.
def reverse(x):
    while x >= 0:
        print(x)
        x = x - 1

reverse(x)

transitive closure python tuples

Does anyone know if there's a python builtin for computing transitive closure of tuples?
I have tuples of the form (1,2),(2,3),(3,4) and I'm trying to get (1,2),(2,3),(3,4),(1,3),(2,4)
Thanks.
There's no builtin for transitive closures.
They're quite simple to implement though.
Here's my take on it:
def transitive_closure(a):
    closure = set(a)
    while True:
        new_relations = set((x,w) for x,y in closure for q,w in closure if q == y)
        closure_until_now = closure | new_relations
        if closure_until_now == closure:
            break
        closure = closure_until_now
    return closure
call:
transitive_closure([(1,2),(2,3),(3,4)])
result:
set([(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (2, 4)])
call:
transitive_closure([(1,2),(2,1)])
result:
set([(1, 2), (1, 1), (2, 1), (2, 2)])
Just a quick attempt:
def transitive_closure(elements):
    elements = set([(x,y) if x < y else (y,x) for x,y in elements])
    relations = {}
    for x,y in elements:
        if x not in relations:
            relations[x] = []
        relations[x].append(y)
    closure = set()
    def build_closure(n):
        def f(k):
            for y in relations.get(k, []):
                closure.add((n, y))
                f(y)
        f(n)
    for k in relations.keys():
        build_closure(k)
    return closure
Executing it, we'll get
In [3]: transitive_closure([(1,2),(2,3),(3,4)])
Out[3]: set([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
We can perform the "closure" operation from a given "start node" by repeatedly taking a union of "graph edges" from the current "endpoints" until no new endpoints are found. We need to do this at most (number of nodes - 1) times, since this is the maximum length of a path. (Doing things this way avoids getting stuck in infinite recursion if there is a cycle; it will waste iterations in the general case, but avoids the work of checking whether we are done i.e. that no changes were made in a given iteration.)
from collections import defaultdict
def transitive_closure(elements):
    edges = defaultdict(set)
    # map from first element of input tuples to "reachable" second elements
    for x, y in elements:
        edges[x].add(y)
    for _ in range(len(elements) - 1):
        edges = defaultdict(set, (
            # use .get so a missing key doesn't insert into (and mutate)
            # the old mapping while we're still iterating over its items
            (k, v.union(*(edges.get(i, set()) for i in v)))
            for (k, v) in edges.items()
        ))
    return set((k, i) for (k, v) in edges.items() for i in v)
(I actually tested it for once ;) )
Suboptimal, but conceptually simple solution:
def transitive_closure(a):
    closure = set()
    for x, _ in a:
        closure |= set((x, y) for y in dfs(x, a))
    return closure
def dfs(x, a):
    """Yields single elements from a in depth-first order, starting from x"""
    for y in [y for w, y in a if w == x]:
        yield y
        for z in dfs(y, a):
            yield z
This won't work when there's a cycle in the relation, i.e. a reflexive point.
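If cycles matter for your data, a sketch of a cycle-safe variant threads a visited set through the recursion (the seen parameter is an addition, not part of the answer above):
def dfs(x, a, seen=None):
    # track visited nodes so a cycle such as (1, 2), (2, 1) cannot recurse forever
    if seen is None:
        seen = set()
    for y in [y for w, y in a if w == x]:
        if y not in seen:
            seen.add(y)
            yield y
            for z in dfs(y, a, seen):
                yield z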
Here's one, essentially the same as the one from @soulcheck, that works on adjacency lists rather than edge lists:
def inplace_transitive_closure(g):
    """g is an adjacency list graph implemented as a dict of sets"""
    done = False
    while not done:
        done = True
        for v0, v1s in g.items():
            old_len = len(v1s)
            for v2s in [g[v1] for v1 in v1s]:
                v1s |= v2s
            done = done and len(v1s) == old_len
If you have a lot of tuples (more than 5000), you might want to consider using the scipy code for matrix powers (see also http://www.ics.uci.edu/~irani/w15-6B/BoardNotes/MatrixMultiplication.pdf)
from scipy.sparse import csr_matrix as csr

def get_closure(tups, n):  # n is the maximum path length of your relation
    index2id = list(set([tup[0] for tup in tups]) | set([tup[1] for tup in tups]))
    id2index = {index2id[i]: i for i in xrange(len(index2id))}
    # Unfortunately you have to make the relation reflexive first -
    # you could also add the diagonal to M instead
    tups_re = tups + [(index2id[i], index2id[i]) for i in xrange(len(index2id))]
    M = csr(([True for tup in tups_re],
             ([id2index[tup[0]] for tup in tups_re],
              [id2index[tup[1]] for tup in tups_re])),
            shape=(len(index2id), len(index2id)), dtype=bool)
    M_ = M ** n
    temp = M_.nonzero()
    # TODO: You might want to remove the added reflexivity tuples again
    return [(index2id[temp[0][i]], index2id[temp[1][i]]) for i in xrange(len(temp[0]))]
In the best case, you can choose n wisely if you know a bit about your relation/graph -- that is, how long the longest path can be. Otherwise you have to use M.shape[0], which might blow up in your face.
This detour also has its limits; in particular, you should be sure the closure does not get too large (that the connectivity is not too strong), but you would have the same problem in the pure-Python implementations.
You can create a graph from those tuples and then use a connected components algorithm on the created graph. NetworkX is a library that supports connected components.
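For what it's worth, NetworkX also ships a transitive_closure function for directed graphs, which matches the question even more directly; a minimal sketch, assuming networkx is installed:
import networkx as nx

g = nx.DiGraph([(1, 2), (2, 3), (3, 4)])
closure = nx.transitive_closure(g)
print(sorted(closure.edges()))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]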
