Making code that is both iterative and recursive purely iterative - python

I have huge lists (>1,000,000 in most cases) that are partitioned into n x n grids. These lists are of some coordinates. I define a neighbouring relation between points - if they are within range of each other, they are 'neighbours' and put into a 'cluster'. We can add cells to these clusters, so cell A is a neighbour of cell B, B a neighbour of C, C is not a neighbour of A. A, B, and C would be in the same cluster.
Given that background, I have the following code that tries to assign points to clusters in Python 3.6:
for i in range(n):
    for j in range(n):
        tile = image[i][j]
        while tile:
            cell = tile.pop()
            cluster = create_and_return_cluster(cell, image_clusters[i][j], (i, j))
            clusterise(cell, cluster, (i, j), image, n, windows, image_clusters)

def clusterise(cell, cluster, tile_index, image, n, windows, image_clusters):
    neighbouring_windows, neighbouring_indices = get_neighbouring_windows(tile_index[0], tile_index[1], n, windows)
    neighbours = get_and_remove_all_neighbours(cell, image, tile_index, neighbouring_windows, neighbouring_indices)
    if neighbours:
        for neighbour, (n_i, n_j) in neighbours:
            add_to_current_cluster(cluster, image_clusters, neighbour, (n_j, n_j))
            clusterise(neighbour, cluster, (n_i, n_j), image, n, windows, image_clusters)
Because of the massive size of the lists, I've had issues with RecursionError and have been scouring the internet for suggestions on tail-recursion. The problem is that this algorithm needs to branch from neighbours to pick up neighbours of those neighbours, and so on. As you can imagine, this gets pretty big pretty quickly in terms of stack frames.
My question is: is it possible to make this algorithm use tail-recursion, or how would one go about making it tail-recursive? I know the cluster argument is essentially an accumulator in this case, but given how the lists shrink and the nasty for-loop in clusterise() I am not sure how to successfully convert to tail-recursion. Does anyone have any ideas? Ideally supported with an explanation.
NB: I am well aware that Python does not optimise tail-recursion by default, yes I am aware that other languages optimise tail-recursion. My question is whether it can be done in this case using Python. I don't want to change if I don't absolutely have to and much of my other code is already in Python.

Just use a queue or stack to track which neighbours to process next; the following function does exactly the same work as your recursive function, iteratively:
from collections import deque

def clusterise(cell, cluster, tile_index, image, n, windows, image_clusters):
    # Explicit stack of (cell, tile_index) pairs that still need processing
    to_process = deque([(cell, tile_index)])
    while to_process:
        cell, tile_index = to_process.pop()
        neighbouring_windows, neighbouring_indices = get_neighbouring_windows(tile_index[0], tile_index[1], n, windows)
        neighbours = get_and_remove_all_neighbours(cell, image, tile_index, neighbouring_windows, neighbouring_indices)
        if neighbours:
            for neighbour, (n_i, n_j) in neighbours:
                # (n_j, n_j) is copied from the question; (n_i, n_j) may have been intended
                add_to_current_cluster(cluster, image_clusters, neighbour, (n_j, n_j))
                to_process.append((neighbour, (n_i, n_j)))
So instead of using the Python call stack to track what still needs to be processed, we move the varying arguments (cell and tile_index) to a deque stack managed by the function itself, which isn't bounded by the recursion limit the way the Python call stack is. You can also use it as a queue (pop from the beginning instead of the end with to_process.popleft()) for a breadth-first processing order. Note that your recursive solution processes cells depth-first.
As a side note: yes, you can use a regular Python list as a stack too, but due to the nature of how a list object is grown and shrunk dynamically, for a stack the deque linked-list implementation is more efficient. And it's easier to switch between a stack and a queue this way. See Raymond Hettinger's remarks on deque performance.
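To make the idea concrete, here is a minimal, self-contained sketch of the same explicit-stack pattern on a toy problem. The function cluster_points and its brute-force neighbour scan are hypothetical stand-ins for the question's tile and window helpers, which are not shown in the question; only the deque-driven loop mirrors the answer.

```python
from collections import deque


def cluster_points(points, max_dist):
    """Group 2D points into clusters of transitively connected neighbours.

    Two points are neighbours when they are at most max_dist apart.
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    unassigned = set(range(len(points)))
    clusters = []
    while unassigned:
        seed = unassigned.pop()
        cluster = [seed]
        to_process = deque([seed])  # explicit stack instead of recursion
        while to_process:
            current = to_process.pop()  # use popleft() for breadth-first order
            cx, cy = points[current]
            # Brute-force scan; the question uses neighbouring tiles instead.
            neighbours = [
                i for i in unassigned
                if (points[i][0] - cx) ** 2 + (points[i][1] - cy) ** 2 <= max_dist ** 2
            ]
            for i in neighbours:
                unassigned.remove(i)  # claim the point so it is visited once
                cluster.append(i)
                to_process.append(i)
        clusters.append(sorted(cluster))
    return sorted(clusters)


if __name__ == "__main__":
    # A-B and B-C are neighbours, A-C are not: all three share one cluster.
    print(cluster_points([(0, 0), (1, 0), (2, 0), (10, 10)], max_dist=1))
```

Because the pending work lives in a deque rather than in stack frames, a chain of several thousand connected points is handled without hitting the recursion limit.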

Related

handle trajectory singularity: delete photons causing error

I am coding a black hole (actually photons orbiting a black hole) and I need to handle an exception for radius values that are smaller than the limit distance.
I've tried using if and while True:
def Hamiltonian(r, pt, pr, pphi):
    H = (-((1-rs/r)**-1)*(pt**2)/2 + (1-rs/r)*(pr**2)/2 + (pphi**2)/(2* (r**2)))
    if np.amax(H) < 10e-08:
        print("Your results are correct")
    else:
        print("Your results are wrong")
    return(H)

def singularity(H, r):
    if (r).any < 1.5*rs:
        print(H)
    else:
        print("0")
    return(H, r)

print(Hamiltonian(r00, pt00, pr00, pphi00))
I'd like to handle the case r < 1.5*rs so that I no longer get the division error message and weird orbits. So far my code doesn't change anything about the problem; I still get this error message:
"RuntimeWarning: divide by zero encountered in true_divide
H = (-((1-rs/r)**-1)*(pt**2)/2 + (1-rs/r)*(pr**2)/2 + (pphi**2)/(2*(r**2)))"
and my orbits are completely wrong (for example, my photon should go right into the black hole, but since there's a singularity at r < 1.5*rs, the photons go in and leave again, and so on).
I'd like to delete the photons causing the problem but I don't know how. Can anyone please help me?
I think you mention that we do not know exactly what goes on at a singularity, so any answer provided here most likely would not be accurate, but let's just assume that you know the behavior/dynamics in the neighborhood near 0. I do not know how you are calling the Hamiltonian function, but I can imagine you are doing it one of two ways:
1) You have a pre-defined loop that goes through each value you are passing into the function and outputs the resulting H for each value.
2) You are passing in vectors of the same length and doing element-by-element math in the H function.
In the first case, you could do a pre-check and call a new function written for the near-0 behavior whenever you are in the near-0 neighborhood. Alternatively, you could put the check in the Hamiltonian function itself and call the new function from there. I would prefer this latter method, as it keeps the front-end (the best word I have for it) fairly clean and encapsulates the logic/math in your Hamiltonian function. In your comments you say you want to delete the photon within some radius; that would be extremely easy to do with this method: terminate the for loop by breaking out at that time step and plot what you had up to that point.
In the second case, you would have to build the new vector manually, checking each element of your vectors to see whether it falls within the singularity neighborhood. This would be a little more difficult and would depend on the shape of your inputs; element-by-element math is also more difficult to debug, in my opinion.
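The first case (a loop over time steps that breaks once the photon crosses the limit radius) could look roughly like the sketch below. It is only an illustration under stated assumptions: the trajectory is taken to be a plain list of (r, pt, pr, pphi) tuples, rs is passed explicitly instead of being a global, and truncate_at_capture is a hypothetical helper name, not something from the question.

```python
def hamiltonian(r, pt, pr, pphi, rs):
    """Hamiltonian from the question, for one photon state (scalar inputs)."""
    f = 1.0 - rs / r
    return -(pt ** 2) / (2.0 * f) + f * (pr ** 2) / 2.0 + (pphi ** 2) / (2.0 * r ** 2)


def truncate_at_capture(trajectory, rs, limit_factor=1.5):
    """Keep the states before the photon first gets closer than limit_factor * rs.

    trajectory is assumed to be a list of (r, pt, pr, pphi) tuples,
    one per time step.
    """
    kept = []
    for state in trajectory:
        r = state[0]
        if r < limit_factor * rs:
            break  # treat the photon as captured and stop following it
        kept.append(state)
    return kept


if __name__ == "__main__":
    rs = 1.0
    # Made-up sample states; the last one would "leave again" without the guard.
    path = [(3.0, 1.0, 0.0, 0.0), (2.0, 1.0, 0.0, 0.0),
            (1.4, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0),
            (2.0, 1.0, 0.0, 0.0)]
    safe = truncate_at_capture(path, rs)
    print([hamiltonian(r, pt, pr, pphi, rs) for r, pt, pr, pphi in safe])
```

Since the Hamiltonian is only evaluated on the truncated trajectory, r never reaches rs and the division by (1 - rs/r) stays well defined.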

Python performance questions for: toggling +/-1, set instantiation, set membership check

I've been working on the following code which sort of maximizes the number of unique (in lowest common denominator) p by q blocks with some constraints. It is working perfectly. For small inputs. E.g. input 50000, output 1898.
I need to run it on numbers greater than 10^18, and while I have a different solution that gets the job done, this particular version gets super slow (made my desktop reboot at one point), and this is what my question is about.
I'm trying to figure out what is causing the slowdown in the following code, and roughly how slow (to what order of magnitude) each candidate is.
The candidates for slowness:
1) the (-1)**(i+1) term? Does Python do this efficiently, or is it literally multiplying out -1 by itself a ton of times?
[EDIT: still looking for how operation.__pow__ works, but having tested setting j=-j: this is faster.]
2) set instantiation/size? Is the set getting too large? Obviously this would impact membership check if the set can't get built.
3) set membership check? This indicates O(1) behavior, although I suppose the constant continues to change.
Thanks in advance for insight into these processes.
import math
import time

a=10**18
ti=time.time()
setfrac=set([1])
x=1
y=1
k=2
while True:
    k+=1
    t=0
    for i in xrange(1,k):
        mo = math.ceil(k/2.0)+((-1)**(i+1))*(math.floor(i/2.0))
        if (mo/(k-mo) not in setfrac) and (x+(k-mo) <= a and y+mo <= a):
            setfrac.add(mo/(k-mo))
            x+=k-mo
            y+=mo
            t+=1
    if t==0:
        break
print len(setfrac)+1
print x
print y
to=time.time()-ti
print to
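As a small illustration of candidate 1 and the j=-j edit above, the sketch below builds the sign sequence both ways and checks that they agree. It is written for Python 3 (the question's code is Python 2), and the function names are made up for this example.

```python
def signs_by_power(count):
    """Signs for i = 1..count computed as (-1)**(i+1), as in the question."""
    return [(-1) ** (i + 1) for i in range(1, count + 1)]


def signs_by_toggle(count):
    """Same sequence produced by flipping a variable, as in the j = -j edit."""
    signs = []
    j = 1
    for _ in range(count):
        signs.append(j)
        j = -j
    return signs


if __name__ == "__main__":
    print(signs_by_power(6))
    print(signs_by_toggle(6))
```

To compare the cost of the two on a given machine, the standard-library timeit module can time each function; no timings are claimed here.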

Timeout issues with Python itertools.combinations()

We've got a script that uses itertools.combinations() and it seems to hang with a large input size.
I'm a relatively inexperienced Python programmer so I'm not sure how to fix this problem. Is there a more suitable library? Or is there a way to enable verbose logging so that I can debug why the method call is hanging?
Any help is much appreciated.
[Edit]
def findsubsets(S,m):
    return set( itertools.combinations(S, m) )

for s in AllSearchTerms:
    S.append(itemsize)
    itemsize = itemsize + 1

for i in range (1,6):
    Subset = findsubsets(S,i)
    for sub in Subset:
        for s in sub:
            sublist.append(AllSearchTerms[s])
        PComb.append(sublist)
        sublist = []
You have two things in your code that will hang for large input sizes.
First, your function findsubsets calls itertools.combinations and then converts the result to a set. The result of itertools.combinations is a lazy iterator (it behaves like a generator): it yields each combination one at a time, without calculating or storing them all at once. When you convert that to a set, you force Python to calculate and store them all at once. Therefore the line return set( itertools.combinations(S, m) ) is almost certainly where your program hangs. You can check that by placing print statements (or some other kind of logging statements) immediately before and after that line; if you see the preceding print and the program hangs before you see the succeeding one, you have found the problem. The solution is not to convert the combinations to a set. Leave the result as an iterator, and your program can grab one combination at a time, as needed.
Second, even if you do what I just suggested, your loop for sub in Subset: is a fairly tight loop that uses every combination. If the input size is large, that loop will take a very long time and implementing my previous paragraph will not help. You probably should reorganize your program to avoid the large input sizes, or at least show some kind of progress during that loop. The combinations function has a predictable output size so you can even show the percent done in a progress bar.
There is no logging inside itertools.combinations since it is not needed when used properly, and there is no logging in the conversion of the generator to a set. You can implement logging in your own tight loop, if needed.
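A minimal sketch of both suggestions, assuming Python 3.8+ for math.comb: combinations are consumed lazily, and the total count is known up front so a progress figure can be shown. The names iter_term_combinations and count_combinations are hypothetical, and the loop combines the search terms directly instead of going through an index list as the question does.

```python
import itertools
import math


def iter_term_combinations(terms, max_size):
    """Lazily yield every combination of terms with 1..max_size elements."""
    for size in range(1, max_size + 1):
        for combo in itertools.combinations(terms, size):
            yield list(combo)


def count_combinations(n_terms, max_size):
    """Number of combinations iter_term_combinations will yield."""
    return sum(math.comb(n_terms, size) for size in range(1, max_size + 1))


if __name__ == "__main__":
    terms = ["red", "green", "blue", "cyan"]
    total = count_combinations(len(terms), 3)
    for done, combo in enumerate(iter_term_combinations(terms, 3), start=1):
        # Replace this print with real work; done / total gives the progress.
        print(f"{done}/{total}: {combo}")
```

The count also shows why large inputs appear to hang: with 100 search terms and sizes 1 to 5, count_combinations(100, 5) is 79,375,495 combinations.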

How to try all possible paths?

I need to try all possible paths, branching every time I hit a certain point. There are <128 possible paths for this problem, so no need to worry about exponential scaling.
I have a player that can take steps through a field. The player takes a step, and on a step there could be an encounter.
There are two options when an encounter is found: i) Input 'B' or ii) Input 'G'.
I would like to try both and continue repeating this until the end of the field is reached. The end goal is to have tried all possibilities.
Here is the template, in Python, for what I am talking about (Step object returns the next step using next()):
from row_maker_inlined import Step

def main():
    initial_stats = {'n':1,'step':250,'o':13,'i':113,'dng':0,'inp':'Empty'}
    player = Step(initial_stats)
    end_of_field = 128
    # Walk until reaching an encounter:
    while player.step['n'] < end_of_field:
        player.next()
        if player.step['enc']:
            print 'An encounter has been reached.'
            # Perform an input on an encounter step:
            player.input = 'B'
            # Make a branch of player?
            # perform this on the branch:
            # player.input = 'G'
            # Keep doing this, and branching on each encounter, until the end is reached.
As you can see, the problem is rather simple. As a beginner programmer, I just have no idea how to solve such a problem.
I believe I may need to use recursion in order to keep branching. But I really just do not understand how one 'makes a branch' using recursion, or anything else.
What kind of solution should I be looking at?
You should be looking at search algorithms like breadth-first search (BFS) and depth-first search (DFS).
Wikipedia has this as the pseudo-code implementation of BFS:
procedure BFS(G, v) is
    let Q be a queue
    Q.enqueue(v)
    label v as discovered
    while Q is not empty
        v ← Q.dequeue()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered
                Q.enqueue(w)
                label w as discovered
Essentially, when you reach an "encounter" you want to add this point to your queue at the end. Then you pick your FIRST element off of the queue and explore it, putting all its children into the queue, and so on. It's a non-recursive solution that is simple enough to do what you want.
DFS is similar, but instead of picking the FIRST element from the queue, you pick the LAST. This makes it so that you explore a single path all the way to a dead end before coming back to explore another.
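Applied to the branching question, the queue-based idea might be sketched as follows. The Step class from the question is not available here, so this toy version assumes the encounter positions are known in advance (encounter_steps) and tracks only the step number and the inputs chosen so far; with the real object, each pending entry would instead hold its own copy of the player (for example via copy.deepcopy).

```python
from collections import deque


def all_choice_paths(field_length, encounter_steps, choices=("B", "G")):
    """Return every sequence of inputs, one input per encounter reached."""
    finished = []
    # Each pending state is (current step number, inputs chosen so far).
    pending = deque([(0, ())])
    while pending:
        step, inputs = pending.pop()  # popleft() here gives breadth-first order
        # Walk forward until the next encounter or the end of the field.
        step += 1
        while step < field_length and step not in encounter_steps:
            step += 1
        if step >= field_length:
            finished.append(inputs)  # this branch reached the end
            continue
        # Branch: queue one continuation per possible input.
        for choice in choices:
            pending.append((step, inputs + (choice,)))
    return finished


if __name__ == "__main__":
    for path in all_choice_paths(10, {3, 7}):
        print(path)
```

With two encounters this yields four input sequences; in general there are 2 to the power of the number of encounters, which matches the small bound mentioned in the question.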
Good luck!

Solving a programming-contest using Python [closed]

Closed 11 years ago.
I found this problem, which I thought would be interesting to solve but couldn't really come up with a correct solution-
Inside a room, there is a monster with N heads, and a human (with 1 head). The human has two laser guns. The first gun, A, destroys C1 heads when fired and the second gun, B, destroys C2 heads when fired [The guns may destroy both the monster's as well as human heads, but the guns prioritize monster heads over human ones].
Also, if after the firing of the gun the monster has a positive non-zero number of heads left, the monster will grow additional heads. The monster grows G1 heads when gun A is used and it grows G2 heads when gun B is used.
The problem is to input N, C1, C2, G1 and G2, then find out what would be the shortest combination of gun-choice (A or B) the human must use to kill the monster (the monster dies when No. of heads=0).
[Note- this problem is from a programming contest that has already ended]
I tried approaching this problem using recursion but found myself clueless about how to actually come up with the solution. So, if you could give some hints how to approach the problem, that'd be great.
First of all: Dijkstra's is not the optimal solution :)
Given this sentence: "The guns may destroy both the monster's as well as human heads"
I take it if you can shoot 10 heads and the monster only has 5 heads, you can't kill it because that would kill you too. Is that correct?
In any case, any solution would be of the form:
ABABAABBABBB... (some string of A's and B's)
On the final hit you kill C1 or C2 heads. On every other hit you kill C1 - G1 or C2 - G2 heads.
If the final hit is from A, you have to destroy N - C1 heads beforehand with shots doing (C1-G1) or (C2-G2) damage.
If the final hit is from B, you have to destroy N - C2 heads beforehand with shots doing (C1-G1) or (C2-G2) damage.
Any K can be represented in the form:
X*i + Y*j = K
provided that gcd(X, Y) divides K (for instance, when X and Y are coprime); i and j must also come out non-negative to count as shots.
K heads can be destroyed by i shots of damage X and j shots of damage Y.
You can find out the values of i and j with the extended greatest common divisor algorithm.
Solve for X = (C1-G1), Y = (C2-G2) and K = (N-C1)
Also solve for X = (C1-G1), Y = (C2-G2) and K = (N-C2)
The smallest answer is the correct one :)
That's it :)
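For reference, the extended greatest common divisor step could be sketched like this. It only finds one integer solution of X*i + Y*j = K; the returned i and j may be negative, so they would still need shifting by multiples of (Y/g, -X/g) to get valid shot counts, and the sketch does not check the order of the shots or the rule about the human's own head. The function names are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, s, t) such that a*s + b*t == g, where g is gcd(a, b)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def solve_linear(x, y, k):
    """Return one integer pair (i, j) with x*i + y*j == k, or None if impossible."""
    g, s, t = extended_gcd(x, y)
    if g == 0 or k % g != 0:
        return None  # k is not a multiple of gcd(x, y): no integer solution
    scale = k // g
    return s * scale, t * scale


if __name__ == "__main__":
    # Example with net damage 3 and 5 and 14 heads to remove before the last shot.
    print(solve_linear(3, 5, 14))
```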
Ah, I see you have found a solution in C++ already using Dijkstra's algorithm: http://hashsamrat.blogspot.com/2010/10/surviving-monster-programming-problem.html
However, you seem to be thinking about 'recursion' and other methods.
The solution is separate from the implementation. Thus what you really want to do is use the same algorithm (Dijkstra's, which is just breadth-first search carefully done so you visit the shortest paths first), but in Python rather than C++.
You could just copy the C++ line-by-line, using python idioms to make the code cleaner and more elegant. But you'll still be using the same algorithm. (Alternatively, you could Google for the hundreds of ways people have implemented Dijkstra's in python. Or you could write it yourself; all you need is a priority queue (see wikipedia), and if time isn't an issue, you can write a poorly-performing priority queue in the form of a dictionary-of-lists.)
edit: Thinking about it, if by "shortest set of choices" you just mean "fewest gunshots", you don't really need Dijkstra's at all; it's just breadth-first-search (which is equivalent to Dijkstra's when all edges have weight 1).
In particular, the logic to generate a new node is as follows:
def howManyHeadsLeft(currentHeads, damage, regen):
    newHeads = currentHeads - damage
    if {this results in blowing off our own head} and newHeads>0: #modify depending on assumptions
        # we killed ourselves without taking monster down with us
        return set() # the empty set of possible search nodes
    else:
        newHeads += regen
        # we could just say return {newHeads} here,
        # but that would be terribly slow to keep on searching the same
        # state over and over again, so we use a cache to avoid doing that
        # this is called dynamic programming
        if {we have never seen newHeads before}:
            return {newHeads}
        else:
            return set()

def newSearchNodes(currentHeads):
    return howManyHeadsLeft(currentHeads, atypeDamage, atypeRegen) | howManyHeadsLeft(currentHeads, btypeDamage, btypeRegen)
The 'goal' condition for the search is having just enough damage to kill the hydra without killing yourself (modify as appropriate depending on assumptions):
heads==1+atypeDamage or heads==1+btypeDamage
Of course it is also possible that no solution exists (regen > damage for both types of guns), in which case this algorithm might run forever, but could probably be modified to terminate.
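Putting the pieces together, a runnable breadth-first search might look like the sketch below. It rests on assumptions that the problem statement leaves open: a shot is only allowed when its damage does not exceed the monster's current heads (otherwise the surplus would hit the human), the monster dies when its heads reach exactly 0, and head counts above an arbitrary cap (max_heads) are not explored so that the search always terminates. The function name and the cap are inventions for this example.

```python
from collections import deque


def shortest_gun_sequence(n, c1, g1, c2, g2, max_heads=None):
    """Return the shortest string of 'A'/'B' shots that kills the monster, or None."""
    guns = (("A", c1, g1), ("B", c2, g2))
    if max_heads is None:
        max_heads = 2 * (n + c1 + c2 + g1 + g2)  # arbitrary cap, an assumption
    if n == 0:
        return ""
    seen = {n}  # head counts already queued, so each state is expanded once
    queue = deque([(n, "")])
    while queue:
        heads, shots = queue.popleft()  # FIFO order gives fewest shots first
        for name, damage, regen in guns:
            if damage > heads:
                continue  # assumed to hit the human as well, so not allowed
            remaining = heads - damage
            if remaining == 0:
                return shots + name  # monster is dead, no regrowth
            new_heads = remaining + regen
            if new_heads > max_heads or new_heads in seen:
                continue
            seen.add(new_heads)
            queue.append((new_heads, shots + name))
    return None  # no sequence found within the explored states


if __name__ == "__main__":
    print(shortest_gun_sequence(10, 7, 2, 5, 0))
```

The seen set plays the role of the cache described above, and together with the cap it makes the search stop even when regrowth outpaces damage for both guns.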
