Hash multiple iterations of a value over itself - python

I am trying to write a function that computes multiple iterations of a hash of a specific value (and outputs each iteration along the way).
However, I can't get my head around how to apply, for instance, the md5 hash function to its own output multiple times. For instance:
import hashlib

a = hashlib.md5('fun').hexdigest()
b = hashlib.md5(a).hexdigest()
c = hashlib.md5(b).hexdigest()
d = hashlib.md5(c).hexdigest()
.......
I think recursion is the solution, but I just can't seem to implement it properly. This is the classic factorial recursion example, but how do I adapt it to hashing?
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

This is a classic application of generators. Python's default recursion limit is only 1000, because its stack frames are comparatively heavyweight. For anything that might be executed anywhere near that many times, iteration will usually also be faster. Using a generator lets you stop after any desired number of steps and keeps the logic flat in your code. The following example prints the output of 10 such iterations.
import hashlib
from itertools import islice

def hashes(n):
    while True:
        n = hashlib.md5(n.encode()).hexdigest()
        yield n

for h in islice(hashes('fun'), 10):
    print(h)

In general, you are looking for a loop like
while True:
    x = f(x)
where you repeatedly replace the input with the result of the most recent application.
For your specific example,
def iterated_hash(x):
    while True:
        x = hashlib.md5(x.encode()).hexdigest()
    return x  # never reached
However, since you don't really want to do this an infinite number of times, you need to supply a count:
def iterated_hash(x, n):
    while True:
        if n == 0:
            return x
        x = hashlib.md5(x.encode()).hexdigest()
        n -= 1
or with a for loop,
def iterated_hash(x, n):
    for _ in range(n):
        x = hashlib.md5(x.encode()).hexdigest()
    return x
(Practically speaking, you want to use the for loop, but it's nice to see how the for loop is just a finite special case of the more general infinite loop.)
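For completeness, here is what a recursive version modeled on the factorial template might look like; a sketch only, since it will hit Python's recursion limit for large n, which is another reason to prefer the loop:
import hashlib

def iterated_hash_rec(x, n):
    # base case mirrors factorial's `if n == 0: return 1`
    if n == 0:
        return x
    # recursive case: hash once, then recurse with a smaller count
    return iterated_hash_rec(hashlib.md5(x.encode()).hexdigest(), n - 1)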

Just iterate as many times as needed:
def make_hash(text, iterations):
    a = text
    for _ in range(iterations):
        a = hashlib.md5(a.encode()).hexdigest()
    return a

a = make_hash('fun', 5)  # 5 iterations

Related

How do I locate the recursion conditions?

My code is as follows. I tried coding out each case first, so given n = 4, it looks like this:
a = overlay_frac(0,blank_bb,scale(1/4,rune))
b = overlay_frac(1/4,blank_bb,scale(1/2,rune))
c = overlay_frac(1/2,blank_bb,scale(3/4,rune))
d = overlay_frac(3/4,blank_bb,scale(1,rune))
show (overlay(a,(overlay(b,(overlay(c,d))))))
My understanding is that the recursion pattern is:
a = overlay_frac((1/n)-(1/n),blank_bb,scale(1/n,rune))
b = overlay_frac((2/n)-(1/n),blank_bb,scale(2/n,rune))
c = overlay_frac((3/n)-(1/n),blank_bb,scale(3/n,rune))
d = overlay_frac((4/n)-(1/n),blank_bb,scale(4/n,rune))
Hence, the recursion pattern that I came up with is:
def tree(n,rune):
    if n==1:
        return rune
    else:
        for i in range(n+1):
            return overlay(overlay_frac(1-(1/n),blank_bb,scale(i/n,rune)),tree(n-1,rune))
When I hardcode this, everything turns out just fine, but I suspect I'm not doing the recursion properly. Where have I gone wrong?
You are in fact trying to do an iteration within a recursive call. Instead of using a loop, you can use an inner function to carry the state. The coefficient you defined actually changes with both n and i, but for a given n it changes with i only. The state you need to carry in the inner function is therefore i, which plays the same role as looping through i.
You can still achieve your goal like this:
def f(i, n):
    return overlay_frac((i/n)-(1/n), blank_bb, scale(i/n, rune))

# For each iteration, check whether i is equal to n.
# If yes, return the result (base case).
# Otherwise, apply the next coefficient to the previous result.
# Start with i = 0 and increase by one every iteration until i reaches n (base case).
# Notice how similar this recursive call looks to a loop;
# the only difference is that the state is updated within the function call itself,
# so you will not have the problem of the early return.
def recursion(n):
    def iteration(i, out):
        if i == n:
            return out
        else:
            return iteration(i+1, overlay(f(n-1, n), out))
    return iteration(0, f(n, n))
Here, n is assumed to be the number of overlay applications you want. When n = 0, no function is applied and you just get the last coefficient f(n, n). When n = 1, the output is overlay applied once to the coefficient with i = n - 1 and the coefficient with i = n.
This approach avoids the early return inside your loop.
In fact, you can omit the inner function by adding an additional argument to your outer function; you then need to give i a default initial value. The inner function is not really necessary here. The key is to use a function argument to carry the state (the variable i in this case).
def f(i, n):
    return overlay_frac((i/n)-(1/n), blank_bb, scale(i/n, rune))

def recursion(n, i=0):
    if i == n:
        return f(n, n)
    else:
        return overlay(f(n-1, n), recursion(n, i+1))
Your first two code blocks don't correspond to the same operations. This would be equivalent to your first block (in Python 3).
def overlayer(n, rune):
    def layer(k):
        # Scale decreases linearly with k
        return overlay_frac((1 - (k+1)/n), blank_bb, scale(1 - k/n, rune))
    result = layer(0)
    for i in range(1, n):
        # Overlay on top of previous layers
        result = overlay(layer(i), result)
    return result

show(overlayer(4, rune))
Let's look at your equations again:
a = overlay_frac(0,blank_bb,scale(1/4,rune))
b = overlay_frac(1/4,blank_bb,scale(1/2,rune))
c = overlay_frac(1/2,blank_bb,scale(3/4,rune))
d = overlay_frac(3/4,blank_bb,scale(1,rune))
show (overlay(a,(overlay(b,(overlay(c,d))))))
What you wrote as "recursion" is not a recursion formula. If you compare your formulas for the recursion with the ones you gave us, you can only infer n=4, which makes no sense as a general pattern. For a recursion pattern you need to describe your inner variables as manifestations of the same expression, differing only in a parameter. That is, you should replace them with:
f_n = overlay_frac((1/4)*(n-1),blank_bb,scale(n/4,rune))
such that f_1=a, f_2=b, etc.
Then the recursion formula that you want to calculate translates to:
show(overlay(f_1, overlay(f_2, overlay(f_3, f_4))))
You can write the function f_n as f(n) (and maybe other parameters) in your code and then do:
def recurse(n):
    if n == 4:
        return f(4)
    else:
        return overlay(f(n), recurse(n+1))
then call:
show(recurse(1))
You need to assert that n < 5 and that n is an integer, otherwise you'll end up in an infinite loop.
There may still be some mistakes, but it should be along those lines. Once you've actually written it like this, however, it (maybe) doesn't really make sense to use recursion anyway, at least if you only want to go up to n_max = 4. Just call the function in one line, replacing a, b, c, d with f_1, f_2, f_3, f_4.

Why doesn't Python's filter overflow when it processes an infinite sequence?

def _odd_iter():
    n = 1
    while True:
        n = n + 2
        yield n

def filt(n):
    return lambda x: x % n > 0

def primes():
    yield 2
    it = _odd_iter()
    while True:
        n = next(it)
        yield n
        it = filter(filt(n), it)
For example, take the sequence [3, 5, 7, 9, 11, 13, 15, ...].
To take the number 7 from this sequence and judge whether it is prime, it must be checked for divisibility by 3 and 5, so the information about 3 and 5 has to be stored somewhere. Even with lazy evaluation, I expected this accumulated state to keep growing, making the calculation slower and slower over time. But in actual experiments the speed of generating primes does not drop and memory does not explode. I want to know the internal principles behind this.
In Python 3, as your post is tagged, filter is a lazily-evaluated generator-type object. If you tried to evaluate that entire filter object with e.g. it = list(filter(filt(n),it)), you would have a bad time. You would have an equally bad time if you ran your code in Python 2, in which filter() automatically returns a list.
A filter on an infinite iterable is not inherently problematic, though, because you can use it in a perfectly acceptable way, like a for loop:
it = filter(filt(n), it)
for iteration in it:
    if input():
        print(iteration)
    else:
        break
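To see the laziness concretely, you can also pull just a finite prefix from the infinite primes() generator defined in the question; a small sketch using itertools.islice:
from itertools import islice

# Only the requested primes are ever computed; the stacked filter
# objects are evaluated lazily, one candidate at a time.
for p in islice(primes(), 10):
    print(p)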

Avoid variable recomputation?

I have a line of code like this -
while someMethod(n) < length and List[someMethod(n)] == 0:
    # do something
    n += 1
where someMethod(arg) does some computation on the number n. The problem with this code is that I'm doing the same computation twice, which is something I need to avoid.
One option is to do this -
x = someMethod(n)
while x < length and List[x] == 0:
    # do something
    n += 1
    x = someMethod(n)
I am storing the value of someMethod(n) in a variable x and then using it later. However, the problem with this approach is that the code is inside a recursive method which is called many times. As a result, a lot of excess instances of the variable x are created, which slows the code down.
Here's a snippet of the code -
def recursion(x, n, i):
    while someMethod(n) < length and List[someMethod(n)] == 0:
        # do something
        n += 1
    # some condition
    recursion(x - 1, n, someList(i + 1))
and this recursion method is called many times throughout the code and the recursion is quite deep.
Is there some alternative available to deal with a problem like this?
Please try to be language independent if possible.
You can use memoization with the decorator technique:
def memoize(f):
    memo = dict()
    def wrapper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]
    return wrapper

@memoize
def someMethod(x):
    return <your computations with x>
If I understand your code correctly, you are looking for some sort of memoization:
https://en.wikipedia.org/wiki/Memoization
It means that on every recursive call you save as much as possible of past calculations and reuse them in the current calculation.
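For reference (not part of the original answers): Python's standard library already ships a memoizing decorator, functools.lru_cache, which does the same bookkeeping as the hand-written memoize above; the function body here is a placeholder:
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every distinct argument ever seen
def someMethod(n):
    # placeholder for your expensive computation on n
    return n * n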

Random prime number in Python

I currently have the following set as my randprime(p,q) function. Is there any way to condense this, via something like a genexp or listcomp? Here's my function:
n = randint(p, q)
while not isPrime(n):
    n = randint(p, q)
It's better to just generate the list of primes, and then choose from that list.
As is, with your code there is a slim chance that it will hit an infinite loop: if there are no primes in the interval, or if randint keeps picking non-primes, the while loop will never end.
So this is probably shorter and less troublesome:
import random
primes = [i for i in range(p,q) if isPrime(i)]
n = random.choice(primes)
The other advantage of this is that there is no chance of looping forever if there are no primes in the interval (random.choice simply raises an error on an empty list). As stated, this can be slow depending on the range, so it would be quicker if you cached the primes ahead of time:
# initialising primes
minPrime = 0
maxPrime = 1000
cached_primes = [i for i in range(minPrime,maxPrime) if isPrime(i)]
#elsewhere in the code
import random
n = random.choice([i for i in cached_primes if p<i<q])
Again, further optimisations are possible, but they very much depend on your actual code... and you know what they say about premature optimisation.
Here is a script written in Python to generate n random prime integers between two given integers:
import numpy as np

def getRandomPrimeInteger(bounds):
    for i in range(len(bounds) - 1):
        if bounds[i + 1] > bounds[i]:
            x = bounds[i] + np.random.randint(bounds[i+1] - bounds[i])
            if isPrime(x):
                return x
        else:
            if isPrime(bounds[i]):
                return bounds[i]
            if isPrime(bounds[i + 1]):
                return bounds[i + 1]
    # no prime found: subdivide every interval and try again
    newBounds = [0 for i in range(2*len(bounds) - 1)]
    newBounds[0] = bounds[0]
    for i in range(1, len(bounds)):
        newBounds[2*i - 1] = int((bounds[i-1] + bounds[i]) / 2)
        newBounds[2*i] = bounds[i]
    return getRandomPrimeInteger(newBounds)
def isPrime(x):
    count = 0
    for i in range(int(x/2)):
        if x % (i+1) == 0:
            count = count + 1
    return count == 1
# ex: get 50 random prime integers between 100 and 10000:
bounds = [100, 10000]
for i in range(50):
    x = getRandomPrimeInteger(bounds)
    print(x)
It would be great if you could use an iterator that yields the integers from p to q in random order (without replacement). I haven't been able to find a way to do that, so the following generates random integers in the range and skips anything it has already tested.
import random

fail = False
tested = set()
n = random.randint(p, q)
while not isPrime(n):
    tested.add(n)
    if len(tested) == q - p + 1:  # every number in the range has been tried
        fail = True
        break
    while n in tested:
        n = random.randint(p, q)
if fail:
    print 'I failed'
else:
    print n, ' is prime'
The big advantage of this is that if, say, the range you're testing is just (14, 15), your code would run forever. This code is guaranteed to produce an answer if such a prime exists, and to tell you there isn't one if it does not. You can obviously make this more compact, but I'm trying to show the logic.
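Incidentally, the "random order without replacement" iteration wished for above is straightforward with random.shuffle; a sketch (assuming the isPrime from this thread), trading O(q-p) memory for guaranteed termination:
import random

def rand_prime(p, q):
    # visit the integers p..q in random order, without replacement
    candidates = list(range(p, q + 1))
    random.shuffle(candidates)
    for n in candidates:
        if isPrime(n):
            return n
    return None  # no prime in the range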
next(i for i in itertools.imap(lambda x: random.randint(p, q) | 1, itertools.count()) if isPrime(i))
This starts with itertools.count() - this gives an infinite sequence.
Each number is mapped to a new random number in the range by itertools.imap(). imap is like map, but returns an iterator rather than a list - we don't want to generate a list of infinitely many random numbers!
Then, the first matching number is found, and returned.
Works efficiently even if p and q are very far apart - e.g. 1 and 10**30, for which generating a full list won't do!
By the way, this is not more efficient than your code above, and it is a lot more difficult to understand at a glance - please have some consideration for the next programmer who has to read your code, and just do it as you did above. That programmer might be you in six months, when you've forgotten what this code was supposed to do!
P.S. - in practice, you might want to replace count() with xrange (NOT range!), e.g. xrange(int((q-p)**1.5) + 20), to make no more than that number of attempts (balanced between limited tests for small ranges and large ranges, with under a 0.5% chance of failing when it could succeed); otherwise, as was suggested in another post, you might loop forever.
P.P.S. - improvement: replaced random.randint(p,q) with random.randint(p,q)|1 - this makes the code roughly twice as efficient, but eliminates the possibility that the result will be 2.
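Putting the P.S. and P.P.S. together, a bounded-attempts variant might look like the following sketch (Python 3 here, and assuming the thread's isPrime; the attempt budget is the heuristic suggested above):
import random

def rand_prime_bounded(p, q, attempts=None):
    if attempts is None:
        # heuristic attempt budget from the P.S. above
        attempts = int((q - p) ** 1.5) + 20
    # |1 forces candidates odd, halving the work (and excluding 2)
    candidates = (random.randint(p, q) | 1 for _ in range(attempts))
    # first prime candidate found, or None if the budget runs out
    return next((n for n in candidates if isPrime(n)), None)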

Running average in Python

Is there a pythonic way to build up a list that contains a running average of some function?
After reading a fun little piece about Martians, black boxes, and the Cauchy distribution, I thought it would be fun to calculate a running average of the Cauchy distribution myself:
import math
import random

def cauchy(location, scale):
    p = 0.0
    while p == 0.0:
        p = random.random()
    return location + scale*math.tan(math.pi*(p - 0.5))

# is this next block of code a good way to populate running_avg?
sum = 0
count = 0
max = 10
running_avg = []
while count < max:
    num = cauchy(3,1)
    sum += num
    count += 1
    running_avg.append(sum/count)
print running_avg  # or do something else with it, besides printing
I think that this approach works, but I'm curious if there might be a more elegant approach to building up that running_avg list than using loops and counters (e.g. list comprehensions).
There are some related questions, but they address more complicated problems (small window size, exponential weighting) or aren't specific to Python:
calculate exponential moving average in python
How to efficiently calculate a running standard deviation?
Calculating the Moving Average of a List
You could write a generator:
def running_average():
    sum = 0
    count = 0
    while True:
        sum += cauchy(3,1)
        count += 1
        yield sum/count
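For example (Python 3 print syntax), you can take just the first few values, since the generator is infinite:
from itertools import islice

# pull only the first 10 running averages from the infinite generator
for avg in islice(running_average(), 10):
    print(avg)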
Or, given a generator for Cauchy numbers and a utility function for a running sum generator, you can have a neat generator expression:
# Cauchy numbers generator
def cauchy_numbers():
    while True:
        yield cauchy(3,1)

# running sum utility function
def running_sum(iterable):
    sum = 0
    for x in iterable:
        sum += x
        yield sum

# Running averages generator expression (** the neat part **)
running_avgs = (sum/(i+1) for (i,sum) in enumerate(running_sum(cauchy_numbers())))

# goes on forever
for avg in running_avgs:
    print avg

# alternatively, take just the first 10
import itertools
for avg in itertools.islice(running_avgs, 10):
    print avg
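As a side note (not in the original answer): since Python 3.2, itertools.accumulate provides the running-sum utility out of the box, so the generator expression can be written without a hand-rolled running_sum:
from itertools import accumulate, islice

# enumerate(..., start=1) pairs each running sum with its count
running_avgs = (total / count
                for count, total in enumerate(accumulate(cauchy_numbers()), start=1))

for avg in islice(running_avgs, 10):
    print(avg)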
You could use coroutines. They are similar to generators, but allow you to send in values. Coroutines were added in Python 2.5, so this won't work in versions before that.
def running_average():
    sum = 0.0
    count = 0
    value = yield(float('nan'))
    while True:
        sum += value
        count += 1
        value = yield(sum/count)

ravg = running_average()
next(ravg)  # advance the coroutine to the first yield
for i in xrange(10):
    avg = ravg.send(cauchy(3,1))
    print 'Running average: %.6f' % (avg,)
As a list comprehension:
ravg = running_average()
next(ravg)
ravg_list = [ravg.send(cauchy(3,1)) for i in xrange(10)]
Edits:
Using the next() function instead of the it.next() method, so that it also works with Python 3. The next() function has also been back-ported to Python 2.6+.
In Python 2.5, you can either replace the calls with it.next(), or define a next() function yourself.
(Thanks Adam Parkin)
I've got two possible solutions here for you. Both are just generic running-average functions that work on any list of numbers (and could be made to work with any iterable).
Generator based:
nums = [cauchy(3,1) for x in xrange(10)]

def running_avg(numbers):
    for count in xrange(1, len(numbers)+1):
        yield sum(numbers[:count])/count

print list(running_avg(nums))
List comprehension based (really the same logic as above):
nums = [cauchy(3,1) for x in xrange(10)]
print [sum(nums[:count])/count for count in xrange(1, len(nums)+1)]
Generator-compatible generator based:
Edit: I just tested this one to see whether I could easily make my solution compatible with generators, and what its performance would be. This is what I came up with.
def running_avg(numbers):
    sum = 0
    for count, number in enumerate(numbers):
        sum += number
        yield sum/(count+1)
See the performance stats below, well worth it.
Performance characteristics:
Edit: I also decided to test Orip's interesting use of multiple generators to see the impact on performance.
Using timeit and the following (1,000,000 iterations 3 times):
print "Generator based:", ', '.join(str(x) for x in Timer('list(running_avg(nums))', 'from __main__ import nums, running_avg').repeat())
print "LC based:", ', '.join(str(x) for x in Timer('[sum(nums[:count])/count for count in xrange(1, len(nums)+1)]', 'from __main__ import nums').repeat())
print "Orip's:", ', '.join(str(x) for x in Timer('list(itertools.islice(running_avgs, 10))', 'from __main__ import itertools, running_avgs').repeat())
print "Generator-compatabile Generator based:", ', '.join(str(x) for x in Timer('list(running_avg(nums))', 'from __main__ import nums, running_avg').repeat())
I get the following results:
Generator based: 17.653908968, 17.8027219772, 18.0342400074
LC based: 14.3925321102, 14.4613749981, 14.4277560711
Orip's: 30.8035550117, 30.3142540455, 30.5146529675
Generator-compatible Generator based: 3.55352187157, 3.54164409637, 3.59098005295
See comments for code:
Orip's genEx based: 4.31488609314, 4.29926609993, 4.30518198013
Results are in seconds, and show the new generator-compatible generator method to be consistently fastest; your results may vary, though. I expect the massive difference between my original generator and the new one is that the new one maintains the sum on the fly instead of recomputing it from scratch for every element.
