Generator comprehensions for look-ahead algorithm in Python

I called for help yesterday on how to look ahead in Python. My problem was to iterate through all the possible edges to add to a network, and for each possible network with an edge added, look at all the possible edges to add, and so on (n depth). At the end, compare all networks produced at depth n with the root network, and actually add the best first step (best first edge to add to accomplish the best result at depth n). When that first edge is added, do the depth search again, and so on until a good network is found. Like a moving window, I may say (see lookahead algorithm in Python for a more thorough explanation of the problem).
Unfortunately for the clarity of the question, the code requires igraph, which is available here: http://igraph.org/python/#downloads
@Peter Gibson promptly answered, guiding me through the logic of generator comprehensions, and helped me produce this code:
from igraph import * # http://igraph.org/python/

def delta(g, gOld): # evaluates the improvement of the graph from one generation to the next
    print "delta"
    return g.diameter() - gOld.diameter()

def possible_new_edges(G):
    print "Possible new edges"
    allPossibleNewEdges = []
    for n1 in range(50):
        for n2 in range(n1, 50):
            if G.are_connected(G.vs[n1], G.vs[n2]) == False and n1 != n2:
                allPossibleNewEdges.append(G.vs[n1], G.vs[n2])
    return allPossibleNewEdges

def add_optimal_edge(graph, n=3):
    print "Add optimal edge"
    paths = [[graph]] # start off with just one graph path, which contains a single graph
    for generation in range(n):
        print "Generation:", generation
        # path[-1] is the latest graph for each generation
        paths = (path + path[-1].add_edge(e) for path in paths for e in path[-1].possible_new_edges())
    # select best path by comparison of final generation against original graph
    best = max(paths, lambda path: comp_delta(path[-1], graph))
    return best[1] # returns the first generation graph

graph = Graph.Erdos_Renyi(50, .15, directed=False, loops=False) # create a random root graph of density 0.15
add_optimal_edge(graph)
The generator is concise and elegant (let's say a little too elegant for my unwieldy Python style), and there are a few things I need to understand to make it work. The code fails with this error:
return best[1] # returns the first generation graph
TypeError: 'generator' object has no attribute '__getitem__'
I think it's because of a wrong use of functions with the generator...
So, my question is: what's the proper way to use functions in such a generator? I need to call possible_new_edges() and delta(), what do I need to pass them (the graph?) and how to do so?
Thanks so much!

Trying the code from your gist, I found several fairly minor errors that were preventing the code from running. I've included fixed code below. However, this doesn't really solve the problem. That's because your algorithm needs to consider a truly vast number of potential graphs, which it cannot do in any sort of reasonable time.
In my testing, looking one step ahead works perfectly well, but looking two steps takes a very long time (tens of minutes, at least; I've never waited for it to finish), and three steps would probably take days. This is because your possible_new_edges function returns more than a thousand possible edges. Each one is added to a copy of your initial graph. Then, for each succeeding step, the process repeats on each of the expanded graphs from the previous step. This results in an exponential explosion of graphs, as you have to evaluate something on the order of 1000**n graphs to see which is the best.
So, to get a practical result you'll still need to change things. I don't know graph theory or your problem domain well enough to suggest what.
Anyway, here are the changed parts of the "working" code (with the original comments removed so that my notes on what I've changed are more clear):
def possible_new_edges(G):
    print("Possible new edges")
    allPossibleNewEdges = []
    for n1 in range(50):
        for n2 in range(n1, 50):
            if G.are_connected(G.vs[n1], G.vs[n2]) == False and n1 != n2:
                allPossibleNewEdges.append((G.vs[n1], G.vs[n2])) # append a tuple
    return allPossibleNewEdges

def add_optimal_edge(graph, n=3):
    print("Add optimal edge")
    paths = [[graph]]
    for generation in range(n):
        print("Generation:", generation)
        paths = (path + [path[-1] + e] # use + to add an edge, and to extend the path
                 for path in paths
                 for e in possible_new_edges(path[-1])) # call this function properly
    best = max(paths, key=lambda path: delta(path[-1], graph)) # pass the scoring function via key=
    return best[1]
If the generator expression in the loop confuses you, it might help to replace it with a list comprehension (by replacing the outermost parentheses with square brackets). You can then inspect the paths list inside the loop (and do things like print its len()). The logic of the code is the same either way, the generator expressions just put off computing the expanded results until the max function starts iterating over paths in order to find the best scoring one.
Using list comprehensions will certainly work for n=1, but you may start running out of memory as you try n=2 (and you certainly will for n=3 or more). The version above won't make you run out of memory (as the generator expression only expands O(n) graphs at a time), but that doesn't mean it runs fast enough to inspect billions of graphs in a sensible amount of time.
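To see the expansion pattern without igraph at all, here is the same generator structure on plain integers (a toy of my own, where "adding an edge" is just addition, and the names are made up):

```python
def possible_steps(x):
    return [1, 2]  # pretend each "graph" offers two possible new edges

def best_path(start, n=3):
    paths = [[start]]  # one path, containing just the starting value
    for generation in range(n):
        # each path forks once per possible step; the generator
        # expression delays the actual work until max() iterates it
        paths = (path + [path[-1] + e]
                 for path in paths
                 for e in possible_steps(path[-1]))
    # score each path by its final value, just like comparing diameters
    return max(paths, key=lambda path: path[-1])

print(best_path(0))  # -> [0, 2, 4, 6]
```

Replacing the outer parentheses with square brackets turns this into a list comprehension you can print and measure at each generation.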

Related

Converting recursive function to completely iterative function without using extra space

Is it possible to convert a recursive function like the one below to a completely iterative function?
def fact(n):
    if n <= 1:
        return
    for i in range(n):
        fact(n-1)
        doSomethingFunc()
It seems pretty easy to do given extra space like a stack or a queue, but I was wondering if we can do this in O(1) space complexity?
Note, we cannot do something like:
def fact(n):
    for i in range(factorial(n)):
        doSomethingFunc()
since it takes a non-constant amount of memory to store the result of factorial(n).
Well, generally speaking, no.
I mean, the space taken in the stack by recursive functions is not just an inconvenience of this programming style. It is memory needed for the computation.
So, sure, for a lot of algorithms, that space is unnecessary and could be spared. For a classical factorial, for example:
def fact(n):
    if n <= 1:
        return 1
    else:
        return n * fact(n-1)
the stacking of all the n, n-1, n-2, ..., 1 arguments is not really necessary.
So, sure, you can find an implementation that gets rid of it. But that is an optimization (for example, in the specific case of tail recursion; but I am pretty sure you added that "doSomething" to make clear that you don't want to focus on that specific case).
You cannot assume in general that an algorithm that doesn't need all those values exists, recursive or iterative. Otherwise, that would be saying that every algorithm exists in an O(1) space complexity version.
Example: base representation of a positive integer
def baseRepr(num, base):
    if num >= base:
        s = baseRepr(num//base, base)
    else:
        s = ''
    return s + chr(48 + num%base)
I'm not claiming it is optimal, or even well written.
But the stacking of the arguments is needed: it is how you implicitly store the digits that you compute in reverse order.
An iterative function would also need some memory to store those digits, since you have to compute the last one first.
Well, I am pretty sure that for this simple example you could find a way to compute from left to right, for example using a log computation to know the number of digits in advance. But that's not the point. Just imagine that no algorithm is known other than the one computing digits from right to left. Then you need to store them: either implicitly in the stack using recursion, or explicitly in allocated memory. So again, memory used in the stack is not just an inconvenience of recursion; it is the way recursive algorithms store things that would otherwise be stored explicitly in an iterative algorithm.
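For contrast, here is the same right-to-left digit computation written iteratively (my sketch, not part of the original answer); the explicit digits list plays exactly the role the call stack played above:

```python
def baseReprIter(num, base):
    digits = []  # explicit storage replacing the recursion stack
    while num >= base:
        digits.append(num % base)
        num //= base
    digits.append(num)
    # the digits were produced last-first, so emit them in reverse
    return ''.join(chr(48 + d) for d in reversed(digits))

print(baseReprIter(10, 2))  # -> 1010
```

Same algorithm, same O(number of digits) storage; only where that storage lives has changed.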
Note, we cannot do something like:

def fact(n):
    for i in range(factorial(n)):
        doSomethingFunc()

since it takes a non-constant amount of memory to store the result of factorial(n).
Yes.
I was wondering if we can do this in O(1) space complexity?
So, no.

Returning list of different results that are created recursively in Python

Lately I've been working with some recursive problems in Python where I have to generate a list of possible configurations (i.e. a list of permutations of a given string, a list of substrings, etc.) using recursion. I'm having a very hard time finding the best practice, and also in understanding how to manage this sort of variable in recursion.
I'll give the example of the generate binary trees problem. I more-or-less know what I have to implement in the recursion:
If n=1, return just one node.
If n=3, return the only possible binary tree.
For n>3, create one node and then explore the possibilities: left node is childless, right node is childless, or neither node is childless. Explore these possibilities recursively.
Now the thing I'm having the most trouble visualising is how exactly I am going to arrive at the list of trees. Currently the practice I follow is to pass a list along in the function call (as an argument), with the function returning this list; but the problem in case 3 is that the recursive call exploring the possibilities for the nodes would be returning a list, not appending nodes to a tree that I am building. When I picture the recursion tree in my head I imagine a "tree" variable that is unique to each of the tree's leaves, with these trees added to a list which is returned by the "root" (i.e. first) call. But I don't know if that is possible. I also thought of a global list, with the recursive function not returning anything (just appending to it), but the problem, I believe, is that at each call the function would receive a copy of the variable.
How can I deal with generating combinations and returning lists of configurations in these cases in recursion? While I gave an example, the more general the answer the better. I would also like to know if there is a "best practice" when it comes to that.
Currently the practice I do is pass along a list in the function call (as an argument) and the function would return this list
This is not the purest way to attack a recursive problem. It would be better if you can make the recursive function such that it solves the sub problem without an extra parameter variable that it must use. So the recursive function should just return a result as if it was the only call that was ever made (by the testing framework). So in the example, that recursive call should return a list with trees.
Alternatively the recursive function could be a sub-function that doesn't return a list, but yields the individual values (in this case: trees). The caller can then decide whether to pack that into a list or not. This is more pythonic.
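For instance, here is what the yielding style looks like on a simpler problem, string permutations, chosen so the sketch stays self-contained (this example is mine, not from the question):

```python
def permutations(s):
    """Yield each permutation of the string s, one at a time."""
    if len(s) <= 1:
        yield s
        return
    for i, ch in enumerate(s):
        # recurse on the remaining characters and prefix each result
        for rest in permutations(s[:i] + s[i+1:]):
            yield ch + rest

# the caller decides whether to pack the values into a list
print(list(permutations("abc")))  # -> ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
```

The recursive function itself never touches a shared list; each level simply yields values built from the values yielded below it.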
As to the example problem, it is also important to identify some invariants. For instance, it is clear that there are no solutions when n is even. As to the recursive aspect: once you have decided to create a root, both its left and right subtrees will have an odd number of nodes. Of course, this is an observation that is specific to this problem, but it is important to look for such problem properties.
Finally, it is equally important to see if the same sub problems can reoccur multiple times. This surely is the case in the example problem: for instance, the left subtree may sometimes have the same number of nodes as the right subtree. In such cases memoization will improve efficiency (dynamic programming).
When the recursive function returns a list, the caller can then iterate that list to retrieve its elements (trees in the example), and use them to build an extended result that satisfies the caller's task. In the example case that means that the tree taken from the recursively retrieved list, is appended as a child to a new root. Then this new tree is appended to a new list (not related to the one returned from the recursive call). This new list will in many cases be longer, although this depends on the type of problem.
To further illustrate the way to tackle these problems, here is a solution for the example problem: one which uses the main function for the recursive calls, and using memoization:
class Solution:
    memo = { 1: [TreeNode()] }

    def allPossibleFBT(self, n: int) -> List[Optional[TreeNode]]:
        # If we didn't solve this problem before...
        if n not in self.memo:
            # Create a list for storing the results (the trees)
            results = []
            # Before creating any root node,
            # decide the size of the left subtree.
            # It must be odd
            for num_left in range(1, n, 2):
                # Make the recursive call to get all shapes of the
                # left subtree
                left_shapes = self.allPossibleFBT(num_left)
                # The remainder of the nodes must be in the right subtree
                num_right = n - 1 - num_left # The root also counts as 1
                right_shapes = self.allPossibleFBT(num_right)
                # Now iterate the results we got from recursion and
                # combine them in all possible ways to create new trees
                for left in left_shapes:
                    for right in right_shapes:
                        # We have a combination. Now create a new tree from it
                        # by putting a root node on top of the two subtrees:
                        tree = TreeNode(0, left, right)
                        # Append this possible shape to our results
                        results.append(tree)
            # All done. Save this for later re-use
            self.memo[n] = results
        return self.memo[n]
This code can be made more compact using list comprehension, but it may make the code less readable.
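For the record, here is what that compaction could look like, with a minimal TreeNode filled in since the original assumes LeetCode's definition (the three nested loops collapse into one comprehension; behavior is the same):

```python
from typing import List, Optional

class TreeNode:
    # minimal stand-in for LeetCode's TreeNode
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

class Solution:
    memo = {1: [TreeNode()]}

    def allPossibleFBT(self, n: int) -> List[Optional[TreeNode]]:
        if n not in self.memo:
            # one comprehension: odd left sizes x left shapes x right shapes
            self.memo[n] = [
                TreeNode(0, left, right)
                for num_left in range(1, n, 2)
                for left in self.allPossibleFBT(num_left)
                for right in self.allPossibleFBT(n - 1 - num_left)
            ]
        return self.memo[n]

print(len(Solution().allPossibleFBT(7)))  # -> 5
```

Note that, exactly as in the longer version, memoized subtrees are shared between results rather than copied.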
Don't pass information into the recursive calls, unless they need that information to compute their local result. It's much easier to reason about recursion when you write without side effects. So instead of having the recursive call put its own results into a list, write the code so that the results from the recursive calls are used to create the return value.
Let's take a trivial example, converting a simple loop to recursion, and using it to accumulate a sequence of increasing integers.
def recursive_range(n):
    if n == 0:
        return []
    return recursive_range(n - 1) + [n]
We are using functions in the natural way: we put information in with the arguments, and get information out using the return value (rather than mutation of the parameters).
In your case:
Now the thing I'm having the most trouble visualising is how exactly I am going to arrive to the list of trees.
So you know that you want to return a list of trees at the end of the process. So the natural way to proceed, is that you expect each recursive call to do that, too.
How can I deal with generating combinations and returning lists of configurations in these cases in recursion? While I gave an example, the more general the answer the better.
The recursive calls return their lists of results for the sub-problems. You use those results to create the list of results for the current problem.
You don't need to think about how recursion is implemented in order to write recursive algorithms. You don't need to think about the call stack. You do need to think about two things:
What are the base cases?
How does the problem break down recursively? (Alternately: why is recursion a good fit for this problem?)
The thing is, recursion is not special. Making the recursive call is just like calling any other function that would happen to give you the correct answer for the sub-problem. So all you need to do is understand how solving the sub-problems helps you to solve the current one.

Recursively operating on a tree structure: How do I get the state of the "entire" tree?

First, context:
As a side project, I'm building a computer algebra system in Python that yields the steps it takes to solve an equation.
So far, I've been able to parse algebraic expressions and equations into an expression tree. It's structured something like this (not the actual code—may not be running):
# Other operators and math functions are based off this.
# Numbers and symbols also have their own classes with 'parent' attributes.
class Operator(object):
    def __init__(self, *args):
        self.children = args
        for child in self.children:
            child.parent = self

# the parser does something like this:
expr = Add(1, Mult(3, 4), 5)
On top of this, I have a series of functions that operate recursively to simplify expressions. They're not purely functional, but I'm trying to avoid relying on mutability for operations, instead returning a modified copy of the node I'm working with. Each function looks something like this:
def simplify(node):
    for index, child in enumerate(node.children):
        if isinstance(child, Operator):
            node.children[index] = simplify(child)
        else:
            # perform some operations to simplify numbers and symbols
            pass
    return node
The challenge comes in the "step by step" part. I'd like for my "simplification" functions to all be nested generators that "yield" the steps it takes to solve something. So basically, every time each function performs an operation, I'd like to be able to do something like this: yield (deepcopy(node), expression, "Combined like terms.") so that whatever is relying on this library can output something like:
5x + 3*4x + 3
5x + 12x + 3 Simplified product 3*4x into 12x
17x + 3 Combined like terms 5x + 12x = 17x
However, each function only has knowledge about the node it's operating on, but has no idea what the overall expression looks like.
So this is my question: What would be the best way of maintaining the "state" of the entire expression tree so that each "step" has knowledge of the entire expression?
Here are the solutions I've come up with:
Do every operation in place and either use a global variable or an instance variable in a class to store a pointer to the equation. I don't like this because unit testing is tougher, since now I have to set up the class first. You also lose other advantages of a more functional approach.
Pass through the root of the expression to every function. However, this either means I have to repeat every operation to also update the expression or that I have to rely on mutability.
Have the top level function 'reconstruct' the expression tree based on each step I yield. For example, if I yield 5x + 4x = 9x, have the top level function find the (5x + 4x) node and replace it with '9x'. This seems like the best solution, but how best to 'reconstruct' each step?
Two final, related questions: Does any of this make sense? I have a lot of caffeine in my system right now and have no idea if I'm being clear.
Am I worrying too much about mutability? Is this a case of premature optimization?
You might be asking about tree zippers. Check: Functional Pearl: Weaving a Web and see if it applies to what you want. From reading your question, I think you're asking to do recursion on a tree structure, but be able to navigate back to the top as necessary. Zippers act as a "breadcrumb" to let you get back to the ancestors of the tree.
I have an implementation of one in JavaScript.
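For a rough idea of the shape in Python (a minimal sketch under my own naming, not a full zipper library): the focus is the node you are currently working on, and the crumbs list records how to climb back up, reattaching possibly modified subtrees along the way.

```python
class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def down(focus, crumbs, i):
    """Move focus to the i-th child, remembering how to get back."""
    return focus.children[i], crumbs + [(focus, i)]

def up(focus, crumbs):
    """Move focus back to the parent, reattaching the subtree."""
    parent, i = crumbs[-1]
    parent.children[i] = focus
    return parent, crumbs[:-1]

def to_root(focus, crumbs):
    """Climb all the way back up to see the whole expression."""
    while crumbs:
        focus, crumbs = up(focus, crumbs)
    return focus

expr = Node('+', [Node(1), Node('*', [Node(3), Node(4)])])
focus, crumbs = down(expr, [], 1)    # focus is now the '*' node
print(to_root(focus, crumbs).value)  # -> +
```

At any step, (focus, crumbs) is enough to both operate locally and render the entire expression, which is exactly the state your "step by step" yields need.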
Are you using Polish notation to construct the tree?
For the step by step simplification you can just use a loop until no modifications (operations) can be made in the tree.
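That loop can be sketched as a fixed-point iteration; simplify_once here is a hypothetical single-pass function returning the (possibly unchanged) expression plus a changed flag:

```python
def simplify_fully(expr, simplify_once):
    """Apply simplify_once repeatedly until it reports no change."""
    changed = True
    while changed:
        expr, changed = simplify_once(expr)
    return expr

def shrink_once(x):
    # toy stand-in: "simplify" by decrementing until zero
    return (x - 1, True) if x > 0 else (x, False)

print(simplify_fully(5, shrink_once))  # -> 0
```

Each pass through the loop is one "step" you could yield to the caller.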

Multi-recursive functions

I’d like to be pointed toward a reference that could better explain recursion when a function employs multiple recursive calls. I think I get how Python handles memory when a function employs a single instance of recursion. I can use print statements to track where the data is at any given point while the function processes the data. I can then walk each of those steps back to see how the resultant return value was achieved.
Once multiple instances of recursion are firing off during a single function call I am no longer sure how the data is actually being processed. The previously illuminating method of well-placed print statements reveals a process that looks quantum, or at least more like voodoo.
To illustrate my quandary here are two basic examples: the Fibonacci and Hanoi towers problems.
def getFib(n):
    if n == 1 or n == 2:
        return 1
    return getFib(n-1) + getFib(n-2)
The Fibonacci example features two inline recursive calls. Is getFib(n-1) resolved all the way down its own stack of calls first, then getFib(n-2) resolved similarly, with the two results added together at each level, and those sums totaled for the final result?
def hanoi(n, s, t, b):
    assert n > 0
    if n == 1:
        print 'move ', s, ' to ', t
    else:
        hanoi(n-1, s, b, t)
        hanoi(1, s, t, b)
        hanoi(n-1, b, t, s)
Hanoi presents a different problem, in that the function calls are in successive lines. When the function gets to the first call, does it resolve it to n=1, then move to the second call which is already n=1, then to the third until n=1?
Again, just looking for reference material that can help me get smart on what’s going on under the hood here. I’m sure it’s likely a bit much to explain in this setting.
http://www.pythontutor.com/visualize.html
There's even a Hanoi link there so you can follow the flow of code.
This is a link to the Hanoi code that they show on their site, but it may have to be adapted to visualize your exact code.
http://www.pythontutor.com/visualize.html#code=%23+move+a+stack+of+n+disks+from+stack+a+to+stack+b,%0A%23+using+tmp+as+a+temporary+stack%0Adef+TowerOfHanoi(n,+a,+b,+tmp)%3A%0A++++if+n+%3D%3D+1%3A%0A++++++++b.append(a.pop())%0A++++else%3A%0A++++++++TowerOfHanoi(n-1,+a,+tmp,+b)%0A++++++++b.append(a.pop())%0A++++++++TowerOfHanoi(n-1,+tmp,+b,+a)%0A++++++++%0Astack1+%3D+%5B4,3,2,1%5D%0Astack2+%3D+%5B%5D%0Astack3+%3D+%5B%5D%0A++++++%0A%23+transfer+stack1+to+stack3+using+Tower+of+Hanoi+rules%0ATowerOfHanoi(len(stack1),+stack1,+stack3,+stack2)&mode=display&cumulative=false&heapPrimitives=false&drawParentPointers=false&textReferences=false&showOnlyOutputs=false&py=2&curInstr=0
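Short of the visualizer, you can also watch the evaluation order of the Fibonacci example in your own terminal by threading a depth parameter through just for printing (a quick sketch of mine):

```python
def getFib(n, depth=0):
    print("  " * depth + "getFib(%d)" % n)  # indent by recursion depth
    if n == 1 or n == 2:
        return 1
    # The left call runs to completion (all the way down its own
    # subtree of calls) before the right call even starts; the two
    # results are then added in this stack frame.
    return getFib(n - 1, depth + 1) + getFib(n - 2, depth + 1)

getFib(4)  # prints an indented call tree; returns 3
```

The indentation makes it visible that the calls form a tree explored depth-first, left branch before right, rather than anything "quantum".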

Pygraph - path between two nodes with specific weight

I want to find a path in a graph that has connects two nodes and does not use the same node twice. The sum of the weights of the edges must be within a certain range.
I need to implement this in pygraph. I'm not sure if there is already an algorithm that I can use for this purpose or not. What's the best way to achieve this?
EDIT: I misunderstood the question initially. I've corrected my answer. This functionality isn't built into the pygraphlib library, but you can easily implement it. Consider something like this, which basically gets the shortest path, decides if it's in a predefined range, then removes the edge with the smallest weight, and computes the new shortest path, and repeats.
from pygraphlib import pygraph, algo

edges = [(1,2),(2,3),(3,4),(4,6),(6,7),(3,5),(4,5),(7,1),(2,5),(5,7)]
graph = pygraph.from_list(edges)
pathList = []
shortestPath = algo.shortest_path(graph, startNode, endNode)
cost = shortestPath[len(shortestPath)-1][1]
while cost <= maxCost:
    if cost >= minCost:
        pathList.append(shortestPath)
    minEdgeWt = float('inf')
    for i in range(len(shortestPath)-1):
        if shortestPath[i+1][1] - shortestPath[i][1] < minEdgeWt:
            minEdgeWt = shortestPath[i+1][1] - shortestPath[i][1]
            edgeNodes = (shortestPath[i][0], shortestPath[i+1][0])
    # Not sure of the syntax here; edgeNodes is a tuple, and hide_edge requires an edge.
    graph.hide_edge(edgeNodes)
    shortestPath = algo.shortest_path(graph, startNode, endNode)
    cost = shortestPath[len(shortestPath)-1][1]
return pathList
Note that I couldn't find a copy of pygraphlib, seeing as it is no longer under development, so I couldn't test the above code. It should work, modulo the syntax uncertainty. Also, if possible, I would recommend using networkx for any kind of graph manipulation in Python, as it is more complete, under active development, and more thoroughly documented than pygraphlib. Just a suggestion.
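As a rough illustration of the brute-force alternative, independent of any graph library: a depth-first search over a plain adjacency dict that never revisits a node, pruning a branch as soon as its running weight exceeds the maximum (this assumes non-negative edge weights; all names here are my own):

```python
def paths_in_range(adj, start, end, min_cost, max_cost):
    """adj maps node -> {neighbor: weight}. Yields (path, cost) pairs
    for every simple path from start to end whose total weight lies
    in [min_cost, max_cost]."""
    def dfs(node, path, cost):
        if cost > max_cost:
            return  # prune: with non-negative weights, cost only grows
        if node == end:
            if cost >= min_cost:
                yield (path, cost)
            return
        for nbr, w in adj.get(node, {}).items():
            if nbr not in path:  # never use the same node twice
                yield from dfs(nbr, path + [nbr], cost + w)
    yield from dfs(start, [start], 0)

adj = {'a': {'b': 1, 'c': 4}, 'b': {'c': 1}, 'c': {}}
print(sorted(paths_in_range(adj, 'a', 'c', 0, 10)))
# -> [(['a', 'b', 'c'], 2), (['a', 'c'], 4)]
```

This enumerates every qualifying path rather than repeatedly recomputing shortest paths, at the usual exponential worst-case cost of simple-path enumeration.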
