counting back generations of a number - python

I am trying to reverse engineer a pair of numbers given to me (f, m). I need to find how many generations it takes to reach them starting from 1, 1, using the algorithm below for each generation:
x = 1
y = 1
new_generation = y+x
x OR y = new_generation
That is, I do not know whether x or y is changed; the other variable is left the same. A list of possible outputs would look like this for the ending values of 4 and 7:
f = 4
m = 7
[1, 1]
[2, 1, 1, 2]
[3, 1, 2, 3, 3, 2, 1, 3]
[4, 1, 3, 4, 5, 3, 2, 5, 5, 2, 3, 5, 4, 3, 1, 4]
[5, 1, 4, 5, **7, 4**, 3, 7, 7, 5, 2, 7, 7, 2, 5, 7, 7, 3, **4, 7**, 5, 4, 1, 5]
Where every pair of numbers, such as (2, 1) and (1, 2), is a possible output. Note the ** marks denote the answer (in this case the order doesn't matter, so long as both m and f have their value in the list).
Clearly there is exponential growth here, so I can't (or at least it is less efficient to) make a list and then find the answer; instead I am using the following code to reverse this process...
def answer(m,f):
    #the variables will be sent to me as a string, so here I convert them...
    m = (int(m))
    f = (int(f))
    counter = 0
    #While I have not reduced my given numbers to my starting numbers....
    while m != 1 or f != 1:
        counter +=1
        #If M is greater, I know the last generation added F to M, so remove it
        if m > f:
            m = m-f
        #If F is greater, I know the last generation added M to F, so remove it
        elif f > m:
            f = f-m
        else:
            #They can never be the same (one must always be bigger, so if they are the same and NOT 1, it can't be done in any generation)
            return "impossible"
    return str(counter)
print(answer("23333","30000000000"))
This returns the correct answer (for instance, 4,7 returns "4", which is correct) but it takes too long when I pass larger numbers (I must be able to handle 10^50, an insane amount, I know!).
My thought was that I should be able to apply some mathematical equation to the numbers to reduce them and then multiply the generations, but I'm having trouble finding a way to do this that also preserves the integrity of the answer. For instance, if I divide the bigger by the smaller, on small numbers (7, 300000) I get a very close (but wrong) answer, whereas on closer numbers such as (23333, 300000) the answer is nowhere near close, which makes sense given the differences in the generation path. Note that I have also tried this in a recursive function (to find generations) and using a non-reversed method (building the list and checking the answer, which was significantly slower for obvious reasons).
Here are some test cases with their answers:
f = "1"
m = "2"
Output: "1"
f = "4"
m = "7"
Output: "4"
f = "4"
m = "2"
Output: "impossible"
Any help is much appreciated! P.S. I am running Python 2.7.6
[EDIT]
The below code is working as desired.
from fractions import gcd
def answer(m,f):
    #Convert strings to ints...
    m = (int(m))
    f = (int(f))
    #If they share a common denominator (GCD) return impossible
    if gcd(m,f) != 1:
        return "impossible"
    counter = 0
    #While there is still a remainder...
    while m != 0 and f != 0:
        if m > f:
            counter += m // f
            #M now equals the remainder.
            m %= f
        else:
            #Also covers m == f == 1, which would otherwise loop forever
            counter += f // m
            f %= m
    return str(counter - 1)
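For cross-checking a fast solution on small inputs, a brute-force forward search can be handy. The following is only a sketch: the name generations_brute and the max_generations cutoff are made up for illustration, and the search is exponential, so it is useful for tiny test cases only.

```python
def generations_brute(m, f, max_generations=12):
    """Forward search: expand every reachable pair one generation at a time."""
    target = (int(m), int(f))
    frontier = {(1, 1)}
    for generation in range(max_generations + 1):
        if target in frontier:
            return str(generation)
        # Each pair (x, y) produces the two children (x + y, y) and (x, x + y).
        frontier = {child
                    for x, y in frontier
                    for child in ((x + y, y), (x, x + y))}
    # Here "impossible" only means "not reached within max_generations".
    return "impossible"

print(generations_brute("7", "4"))
```

Because the frontier doubles every generation, this confirms the exponential growth described in the question and is no substitute for the reversed approach.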

This is not a Python question, nor is it really a programming question. This is a problem designed to make you think. As such, if you just get the answer from somebody else, you will gain no knowledge or insight from the exercise.
Just add a print(m, f) in your while loop and watch how the numbers evolve for small inputs. For example, try with something like (3, 100): don't you see any way you could speed things up, rather than repeatedly removing 3 from the bigger number?

You are on the right track with the top-down approach you posted. You can speed it up by a huge factor if you use integer division instead of repeated subtraction.
def answer(m, f):
    m = int(m)
    f = int(f)
    counter = 0
    while m != 0 and f != 0:
        if f > m:
            m, f = f, m
        print(m, f, counter, sep="\t")
        if f != 1 and m % f == 0:
            return "impossible"
        counter += m // f
        m %= f
    return str(counter - 1)
Using the above, answer(23333, 30000000000) yields
30000000000 23333 0
23333 15244 1285732
15244 8089 1285733
8089 7155 1285734
7155 934 1285735
934 617 1285742
617 317 1285743
317 300 1285744
300 17 1285745
17 11 1285762
11 6 1285763
6 5 1285764
5 1 1285765
1285769
and answer(4, 7) yields
7 4 0
4 3 1
3 1 2
4
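For reference, the same idea can be written without the debugging print. This is a minimal sketch rather than a definitive implementation; the name count_generations is an assumption, and it returns strings only because the question's test cases do.

```python
def count_generations(m, f):
    """Count generations from (1, 1) to (m, f), batching subtractions by division."""
    big, small = max(int(m), int(f)), min(int(m), int(f))
    if small < 1:
        return "impossible"
    counter = 0
    while small != 1:
        if big % small == 0:
            # The two values share a factor greater than 1, so (1, 1) is unreachable.
            return "impossible"
        counter += big // small          # this many subtractions happen in one go
        big, small = small, big % small
    # With small == 1, the remaining big - 1 subtractions lead to (1, 1).
    return str(counter + big - 1)

print(count_generations("4", "7"))
print(count_generations("23333", "30000000000"))
```

The loop runs once per step of the Euclidean algorithm, so even inputs around 10^50 need only a few hundred iterations at most.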

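Try dividing in a loop instead of subtracting one step at a time (the helper function below undoes several generations per call):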
(Python 2.7.6)
def back():
    global f,m,i
    if f<m:
        #(m-1)//f keeps m at least 1, so (1, m) ends at (1, 1) instead of (1, 0)
        s=(m-1)//f
        i+=s
        m-=s*f
    elif m<f:
        s=(f-1)//m
        i+=s
        f-=s*m
    else:
        return False
    return True
while True:
    f=int(raw_input('f = '))
    m=int(raw_input('m = '))
    i=0
    while True:
        if f==m==1:
            print 'Output:',str(i)
            break
        else:
            if not back():
                print 'Output: impossible'
                break
    print
(Python 3.5.2)
def back():
    global f,m,i
    if f<m:
        #(m-1)//f keeps m at least 1, so (1, m) ends at (1, 1) instead of (1, 0)
        s=(m-1)//f
        i+=s
        m-=s*f
    elif m<f:
        s=(f-1)//m
        i+=s
        f-=s*m
    else:
        return False
    return True
while True:
    f=int(input('f = '))
    m=int(input('m = '))
    i=0
    while True:
        if f==m==1:
            print('Output:',str(i))
            break
        else:
            if not back():
                print('Output: impossible')
                break
    print()
Note: I am a Python 3.5 coder so I have tried to backdate my code, please let me know if there is something wrong with it.
The input format is also different: instead of f = "some_int" it is now f = some_int, and the output is formatted similarly.

Related

"Homemade" infinite precision sum function

How would you go about this:
Write your own infinite precision "sum", "product", and "to the power of" functions, that represent numbers as lists of
digits between 0 and 9 with least significant digit first.
Thus: 0 is represented as the empty list [], and 10 is represented as [0,1].
You may assume that numbers are non-negative (no need for negative numbers, or for subtraction).
I have functions to convert to and from.
eg:
iint(5387) == [7, 8, 3, 5] and pint([7, 8, 3, 5]) == 5387
def iint(n):
    # list of all digits in the int
    digits = [int(x) for x in str(n)]
    # reverse the list
    digits.reverse()
    return digits
def pint(I):
    # new int c
    c = 0
    # iterates through list
    for i in range(len(I)):
        # add to c the digit multiplied by 10 to the power of its position in the list: 1, 10, 100, 1000, etc.
        c = c + I[i] * (10 ** i)
    return c
# add two infinite integers
def iadd(I, J):
    pass
My first thought would be to just convert back to int, do the calculation, and then convert back again, but that would "gut the question".
I'm not looking for a complete solution, just some pointers on where to start for iadd(), because I am completely stumped. I assume after you get iadd() the rest should be simple enough.
For writing your iadd function, one way is to use test-driven development: write your test inputs and your expected outputs, assert that they're equal, then rewrite your function so it passes the test case.
In the particular case of needing to add two lists of numbers together, "how would you do that by hand?" (as noted by a comment) is an excellent place to start. You might:
1. start from the least-significant digit
2. add individual digits together (including a carry from the previous digit, if any)
3. carry the "high" digit if the result is > 9
4. record the result of that addition
5. loop from step 2 until you exhaust the shorter number
6. if you have a carry digit "left over," handle that properly
7. if one of the input numbers has more digits than the other, properly handle the "left over" digits
Here's a code snippet that should help give some ideas:
carry = 0
for d in range(min(len(I),len(J))):
    added = I[d] + J[d] + carry
    digit = added%10
    carry = added//10
And some testcases to try:
assert iadd([1], [1]) == [2] # 1 + 1 == 2
assert iadd([0,1], [1]) == [1,1] # 10 + 1 == 11
assert iadd([9,1], [1]) == [0,2] # 19 + 1 == 20
assert iadd([9,9,9,9,9], [2]) == [1,0,0,0,0,1] # 99,999 + 2 == 100,001
assert iadd([4,0,2], [9,2,3,4,1]) == [3,3,5,4,1] # 204 + 14,329 == 14,533
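Putting the by-hand steps together, one possible shape for iadd is sketched below. It is only one way to satisfy the test cases above; it assumes both inputs are well-formed lists of digits 0 to 9 with the least significant digit first, and it treats a missing digit in the shorter number as 0.

```python
def iadd(I, J):
    """Add two little-endian digit lists the way you would on paper."""
    result = []
    carry = 0
    for d in range(max(len(I), len(J))):
        a = I[d] if d < len(I) else 0   # shorter number is padded with zeros
        b = J[d] if d < len(J) else 0
        carry, digit = divmod(a + b + carry, 10)
        result.append(digit)
    if carry:
        result.append(carry)            # leftover carry becomes a new top digit
    return result

print(iadd([9, 9, 9, 9, 9], [2]))
```

Padding with zeros folds the "left over digits" step into the main loop, so only the final carry needs special handling.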
Itertools's zip_longest should be very useful to implement the addition operation.
For example:
def iint(N): return [int(d) for d in reversed(str(N))]
def pint(N): return int("".join(map(str,reversed(N))))
from itertools import zip_longest
def iadd(A,B):
    result = [0]
    for a,b in zip_longest(A,B,fillvalue=0):
        result[-1:] = reversed(divmod(a+b+result[-1],10))
    while result and not result[-1]: result.pop(-1) # remove leading zeros
    return result
a = iint(1234)
b = iint(8910)
print(iadd(a,b)) # [4, 4, 1, 0, 1] (10144)
For the multiplication, you should make sure to keep the intermediate results below 100
def iprod(A,B):
    result = []
    for iA,a in enumerate(A):
        if not a: continue
        result = iadd(result,[0]*iA+[a*d for d in B]) # a*d+9 <= 90
    return result
print(iprod(a,b)) # [0, 4, 9, 4, 9, 9, 0, 1] 10994940
For the power operation, you'll want to break down the process into a reasonable number of multiplications. This can be achieved by decomposing the exponent into powers of 2 and multiplying the result by the compounded squares of the base (for 1 bits). But you'll need to make a division by 2 function to implement that.
This strategy is based on the fact that multiplying a base raised to various powers, adds these powers:
A^7 * A^6 = A^13
and that any number can be expressed as the sum of powers of two:
13 = 1 + 4 + 8,
so
A^13 = A^1 * A^4 * A^8.
This reduces the number of multiplications for A^B to roughly 2*log2(B) at most, which is far fewer than multiplying A by itself B-1 times (although we'll be dealing with larger numbers).
def idiv2(N):
    result = N.copy() or [0]
    carry = 0
    for i,d in enumerate(reversed(N),1):
        result[-i],carry = divmod(result[-i]+carry*10,2)
    return result if result[-1] else result[:-1]
def ipow(A,B):
    result, a2n = [1], [] # a2n is compounded squares A, A^2, A^4, A^8, ...
    while B:
        a2n = iprod(a2n,a2n) if a2n else A
        if B[0]%2 : result = iprod(result,a2n)
        B = idiv2(B)
    return result
print(ipow(iint(12),iint(13)))
# [2, 7, 0, 9, 7, 3, 5, 0, 2, 3, 9, 9, 6, 0, 1] 106993205379072
print(len(ipow(a,b))) # 27544 digits (takes a long time)
Further optimization could be achieved by creating a specialized square function and using it instead of iprod(a2n,a2n)
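To see the exponent decomposition in isolation, here is a small sketch of the same square-and-multiply loop using Python's built-in integers instead of digit lists. The function name power_by_squaring is made up for this illustration; it is not part of the digit-list code above.

```python
def power_by_squaring(base, exponent):
    """Compute base**exponent using one squaring per bit of the exponent."""
    result = 1
    square = base                # holds base^(2^k) for the current bit k
    while exponent:
        if exponent % 2:         # bit k of the exponent is set
            result *= square
        square *= square         # move on to base^(2^(k+1))
        exponent //= 2           # same role as idiv2 on a digit list
    return result

print(power_by_squaring(12, 13) == 12 ** 13)
```

For the exponent 13 (binary 1101) the loop runs four times, multiplying the result by A, A^4 and A^8, which matches the decomposition 13 = 1 + 4 + 8.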

Why does the code give me wrong output for an amount? (Change making problem)

I was trying to solve the coin change problem in which we have to find the minimum number of coins that add up to a specific amount.
Here is the solution I came up with:
import sys
denomination = [1,6,10]
amount = 12
def coin_change(amount,denomination):
    coins = 0
    ans = [0]*(amount+1)
    temp = sys.maxsize
    for i in range(len(ans)):
        for j in range(len(denomination)):
            if denomination[j] <= i:
                ans[i] = min(temp, ans[i-denomination[j]]) + 1
    return ans
print(coin_change(amount,denomination))
The output is
[0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3].
Why is the last number in the output a 3 for the amount 12? I have gone through the code so many times, but I still don't understand why this happens. It gives 1 for the amount 6, so it should give 2 for the amount 12 instead of 3.
What's wrong with my code?
The problem is that min(temp, ...) is a useless call, as you never reduce the value of temp. This expression is always going to return the second argument. Obviously you really need to compare alternatives and choose the optimal one, so this is wrong.
And that is the reason you get 3. The last denomination that is tried is 10 (when j is 2). Before that try, ans[12] was actually 2, but it gets overwritten with 3 (10+1+1)!
Here is a correction:
import sys
denomination = [1,6,10]
amount = 12
def coin_change(amount,denomination):
    ans = [sys.maxsize]*(amount+1) # initialise with maximum value
    for i in range(len(ans)):
        for j in range(len(denomination)):
            if denomination[j] <= i:
                if denomination[j] == i:
                    ans[i] = 1 # base case
                else: # see if we can improve what we have
                    ans[i] = min(ans[i], ans[i-denomination[j]] + 1)
    return ans
print(coin_change(amount,denomination))
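A common variant, shown here only as a sketch, sets the entry for amount 0 to 0 coins so that no separate base case is needed. The names min_coins, best and INF are assumptions for this example, and amounts that cannot be formed keep the INF marker.

```python
import sys

def min_coins(amount, denominations):
    """Return a list whose entry i is the fewest coins that add up to i."""
    INF = sys.maxsize                      # marker for "cannot be formed"
    best = [0] + [INF] * amount            # zero coins are needed for amount 0
    for value in range(1, amount + 1):
        for coin in denominations:
            # Only extend sub-amounts that can actually be formed.
            if coin <= value and best[value - coin] != INF:
                best[value] = min(best[value], best[value - coin] + 1)
    return best

print(min_coins(12, [1, 6, 10]))
```

With denominations [1, 6, 10] the last entry comes out as 2 (6 + 6), which is the value the question expected for the amount 12.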

if a number is divisible by all the entries of a list then

This came up while attempting Problem 5 of Project Euler. I'm sorry if this is vague or obvious; I am new to programming.
Suppose I have a list of integers
v = range(1, n+1) = [1, ..., n]
What I want to do is this:
if m is divisible by all the entries of v then I want to set
m/v[i] for i starting at 2 and iterating up
then I want to keep repeating this process until I eventually get something which is not divisible by all the entries of v.
Here is a specific example:
let v=[1,2,3,4] and m = 24
m is divisible by 1, 2, 3, and 4, so we divide m by 2 giving us
m=12 which is divisible by 1, 2, 3, and 4 , so we divide by 3
giving us m=4 which is not divisible by 1, 2, 3, and 4. So we stop here.
Is there a way to do this in python using a combination of loops?
I think this code will solve your problem:
i=1
while(True):
    w=[x for x in v if (m%x)==0]
    if(w==v):
        # integer division keeps m an int
        m//=v[i]
        i+=1
        continue
    else:
        break
Try this on for size; I have a feeling this is what you were asking for:
v = [1,2,3,4]
m = 24
cont = True
c = 1
d = m
while cont:
    d = d//c
    for i in v:
        if d % i != 0:
            cont = False
            result = d
            break
    c+=1
print (d)
Got an output of 4.
I think this piece of code should do what you're asking for:
v = [1,2,3,4]
m = 24
index = 1
done = False
while not done:
    if all([m % x == 0 for x in v]):
        m = m // v[index]
        if index + 1 == len(v):
            print('Exhausted v')
            done = True
        else:
            index += 1
    else:
        done = True
        print('Not all elements in v evenly divide m')
That said, this is not the best way to go about solving Project Euler problem 5. A more straightforward and faster approach would be:
solved = False
num = 2520
while not solved:
    num += 2520
    if all([num % x == 0 for x in [11, 13, 14, 16, 17, 18, 19, 20]]):
        solved = True
print(num)
In this approach, we know that the answer will be a multiple of 2520 (the smallest number divisible by every integer from 1 to 10), so we increment the value we're checking by that amount. We also know that the only values that need to be checked are in [11, 13, 14, 16, 17, 18, 19, 20], because every other number in the range [1, 20] either divides 2520 itself (as 12 and 15 do) or is a factor of at least one of the numbers in the list.
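The claim about the shortened list can be checked mechanically. The snippet below is just a sanity check with made-up names (BASE, CHECKED): it lists the numbers from 1 to 20 that do not divide any entry of the list, then confirms that each of those divides 2520.

```python
BASE = 2520
CHECKED = [11, 13, 14, 16, 17, 18, 19, 20]

def divides_some_checked(k):
    """True if k is a factor of at least one number in CHECKED."""
    return any(c % k == 0 for c in CHECKED)

# Numbers whose divisibility is not implied by the checked list alone.
needs_base = [k for k in range(1, 21) if not divides_some_checked(k)]
# Of those, the ones that are not guaranteed by stepping in multiples of BASE.
uncovered = [k for k in needs_base if BASE % k != 0]

print(needs_base)   # [12, 15]
print(uncovered)    # []
```

So only 12 and 15 rely on the candidate being a multiple of 2520; everything else is covered by the list itself.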

Python MemoryError occurring with large loop

I'm trying to create a script to solve a question for me on Project Euler, but it keeps returning a MemoryError. I have absolutely no idea why.
divisible = True
possible = dict()
for i in xrange(1, 100000000):
    for n in xrange(1, 21):
        if i%n != 0:
            divisible = False
        else:
            if i in possible:
                possible[i].append(n)
            else:
                possible[i] = [n]
        if len(possible[i]) == 20:
            print i
            break
Python seems to think it occurs on this line possible[i] = [n]
The problem is in your line
if len(possible[i]) == 20:
You mean to say
if len(possible) == 20:
As it is, your code will keep on running, and presumably, since the loop count is so large, some stack fills up...
Also - although I don't know for sure what you are trying to achieve, your break command is in the innermost loop - so you break out of it, then go around again... and since the length will only be exactly 20 once, you are still stuck. Check your logic.
for example, the following small change to your code produces useful output (although I don't know if it's useful for you... but it might give you some ideas):
divisible = True
possible = dict()
for i in xrange(1, 100000000):
    for n in xrange(1, 21):
        if i%n != 0:
            divisible = False
        else:
            if i in possible:
                possible[i].append(n)
            else:
                possible[i] = [n]
    if len(possible) == 20:
        print i
        break
    else:
        print i, possible[i]
Output:
1 [1]
2 [1, 2]
3 [1, 3]
4 [1, 2, 4]
5 [1, 5]
6 [1, 2, 3, 6]
7 [1, 7]
8 [1, 2, 4, 8]
9 [1, 3, 9]
10 [1, 2, 5, 10]
11 [1, 11]
12 [1, 2, 3, 4, 6, 12]
13 [1, 13]
14 [1, 2, 7, 14]
15 [1, 3, 5, 15]
16 [1, 2, 4, 8, 16]
17 [1, 17]
18 [1, 2, 3, 6, 9, 18]
19 [1, 19]
20
EDIT reading through the code more carefully, I think what you are trying to do is find the number that has exactly 20 factors; thus your condition was correct. The problem is that you are storing all the other terms as well - and that is a very very large number of lists. If you are only after this last number (after all the only output is i just before the break), then you really don't need to keep all the other terms. The following code does just that - it's been running merrily on my computer, taking about 20 MB of memory for the longest time now (but no answer yet...)
divisible = True
possible = [];
biggest = 0;
bigN = 100000000;
for i in xrange(1, bigN):
    for n in xrange(1, 21):
        if i%n != 0:
            divisible = False
        else:
            if len(possible) > 0:
                possible.append(n)
            else:
                possible = [n]
    if len(possible) >= 20:
        print i
        print possible
        break
    else:
        if bigN < 1000:
            print i, possible; # handy for debugging
        if biggest < len(possible):
            biggest = len(possible);
        possible = []
The "manual" way to calculate what you are doing is finding the prime factors for all numbers from 1 to 20; counting the largest number of times a prime occurs in each; and taking their product:
2 = 2
3 = 3
4 = 2*2
5 = 5
6 = 2*3
7 = 7
8 = 2*2*2
9 = 3*3
10 = 2*5
11 = 11
12 = 2*2*3
13 = 13
14 = 2*7
15 = 3*5
16 = 2*2*2*2
17 = 17
18 = 2*3*3
19 = 19
20 = 2*2*5
Answer: (2*2*2*2)*(3*3)*5*7*11*13*17*19 = 232792560
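The same product can be computed as a running least common multiple. This is a short sketch in Python 3 (math.gcd needs Python 3.5 or later); the helper names lcm and smallest_multiple are assumptions for this example.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b

def smallest_multiple(limit):
    """Smallest positive number divisible by every integer from 1 to limit."""
    return reduce(lcm, range(1, limit + 1), 1)

print(smallest_multiple(20))
```

This needs only 20 gcd computations and no storage beyond a single running value, so neither the run time nor the memory use of the brute-force loop is an issue.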
The memory error is occurring due to the size of:
possible = dict()
Once you keep pushing integers into it, its size keeps on growing, and you get a memory error.
Carefully see if that can be avoided in the solution. E.g., if the answer only requires the count of factors, and not all the factors, then do not store all the values in a list and calculate its length.
Instead increment counters for each number.
I am not sure what the question is, but this check:
if len(possible[i]) == 20:
    print i
    break
can be replaced by a counter:
if i in possible:
    possible[i] += 1
else:
    possible[i] = 1
if possible[i] == 20:
    print i
    break
A quick back of the envelope calculation. You have something like 100000000 integers which if you stored them all would be something on the order of 0.4 Gb (in C). Of course, these are python integers, so each one is more than 4 bytes ... On my system, each is 24(!) bytes which takes your 0.4 Gb up to 2.34 Gb. Now you have each one stored in up to 21 lists ... So that's (up to) an additional 21 pointers to each. Assuming a 4byte pointer to an int, you can see that we're already starting to consume HUGE amounts of memory.
Also note that for performance reasons, lists are over-allocated. It's likely that you're using way more memory than you need because your lists aren't full.
Of course, you're not actually storing them all, and you have an early break condition which (apparently) isn't getting hit. It's likely that you have a logic error in there somewhere.
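To put rough numbers on this kind of estimate for your own interpreter, something like the following can be used. It is a sketch that assumes CPython, since sys.getsizeof is implementation-specific and the figures vary between Python versions and platforms; the variable names are made up for this example.

```python
import sys

count = 100000000

# Size of one integer object of the magnitude used in the loop.
int_size = sys.getsizeof(10**8)
# Cost of one list slot (a pointer), measured as the growth of a one-element list.
pointer_size = sys.getsizeof([None]) - sys.getsizeof([])

# Lower bound: every integer stored once and referenced from a single list slot.
estimate_gib = count * (int_size + pointer_size) / 2.0**30

print("bytes per int object:", int_size)
print("bytes per list slot:", pointer_size)
print("lower-bound estimate: %.2f GiB" % estimate_gib)
```

This ignores the dictionary's own hash table and the per-list overhead, so the real footprint of the original script would be noticeably higher than the printed figure.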
The check should be outside the inner loop for it to terminate properly. Otherwise, the program never stops at the intended exit.
divisible = True
possible = dict()
for i in xrange(1, 100000000):
    for n in xrange(1, 21):
        if i%n != 0:
            divisible = False
        else:
            if i in possible:
                possible[i].append(n)
            else:
                possible[i] = [n]
    if len(possible[i]) == 20:
        print i
        break
BTW, a faster method would be to find LCM than going for a bruteforce method like this.
edit:
One variant which uses almost no memory.
divisible = True
possible = []
# start at 1: 0 is divisible by everything and would be reported as an answer
for i in xrange(1, 1000000000):
    count = 0
    for n in xrange(1, 21):
        if i%n != 0:
            divisible = False
        else:
            count += 1
    if count == 20:
        possible.append(i)
        print i
    else:
        print "\r", "%09d %d %d" % (i, 232792560, count),
print possible
I would say you need to change the approach here. You need solutions which fit under the 1 minute rule. And discussing the solution here defeats the very purpose of Project Euler. So I would suggest that you think of a different approach to solve the problem. An approach which might solve the problem in less than a second.
As for the memory issue, with the current approach, it is almost impossible to get rid of it. So changing the approach will solve this issue too. Though this post does not answer your question, it is in line with Project Euler's principles!

Python - Memoization and Collatz Sequence

When I was struggling to do Problem 14 in Project Euler, I discovered that I could use a thing called memoization to speed up my process (I let it run for a good 15 minutes, and it still hadn't returned an answer). The thing is, how do I implement it? I've tried to, but I get a KeyError (the value being returned is invalid). This bugs me because I am positive I can apply memoization to this and make it faster.
lookup = {}
def countTerms(n):
arg = n
count = 1
while n is not 1:
count += 1
if not n%2:
n /= 2
else:
n = (n*3 + 1)
if n not in lookup:
lookup[n] = count
return lookup[n], arg
print max(countTerms(i) for i in range(500001, 1000000, 2))
Thanks.
There is also a nice recursive way to do this, which probably will be slower than poorsod's solution, but it is more similar to your initial code, so it may be easier for you to understand.
lookup = {}
def countTerms(n):
    if n not in lookup:
        if n == 1:
            lookup[n] = 1
        elif not n % 2:
            lookup[n] = countTerms(n / 2)[0] + 1
        else:
            lookup[n] = countTerms(n*3 + 1)[0] + 1
    return lookup[n], n
print max(countTerms(i) for i in range(500001, 1000000, 2))
The point of memoising, for the Collatz sequence, is to avoid calculating parts of the list that you've already done. The remainder of a sequence is fully determined by the current value. So we want to check the table as often as possible, and bail out of the rest of the calculation as soon as we can.
def collatz_sequence(start, table={}): # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l: # break if we find ourself in a cycle
                      # (don't assume the Collatz conjecture!)
        if n in table:
            l += table[n]
            break
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({n: l[i:] for i, n in enumerate(l) if n not in table})
    return l
Is it working? Let's spy on it to make sure the memoised elements are being used:
class NoisyDict(dict):
    def __getitem__(self, item):
        print("getting", item)
        return dict.__getitem__(self, item)
def collatz_sequence(start, table=NoisyDict()):
    # etc
In [26]: collatz_sequence(5)
Out[26]: [5, 16, 8, 4, 2, 1]
In [27]: collatz_sequence(5)
getting 5
Out[27]: [5, 16, 8, 4, 2, 1]
In [28]: collatz_sequence(32)
getting 16
Out[28]: [32, 16, 8, 4, 2, 1]
In [29]: collatz_sequence.__defaults__[0]
Out[29]:
{1: [1],
2: [2, 1],
4: [4, 2, 1],
5: [5, 16, 8, 4, 2, 1],
8: [8, 4, 2, 1],
16: [16, 8, 4, 2, 1],
32: [32, 16, 8, 4, 2, 1]}
Edit: I knew it could be optimised! The secret is that there are two places in the function (the two return points) that we know l and table share no elements. While previously I avoided calling table.update with elements already in table by testing them, this version of the function instead exploits our knowledge of the control flow, saving lots of time.
[collatz_sequence(x) for x in range(500001, 1000000)] now times around 2 seconds on my computer, while a similar expression with @welter's version clocks in 400ms. I think this is because the functions don't actually compute the same thing - my version generates the whole sequence, while @welter's just finds its length. So I don't think I can get my implementation down to the same speed.
def collatz_sequence(start, table={}): # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l: # break if we find ourself in a cycle
                      # (don't assume the Collatz conjecture!)
        if n in table:
            table.update({x: l[i:] for i, x in enumerate(l)})
            return l + table[n]
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({x: l[i:] for i, x in enumerate(l)})
    return l
PS - spot the bug!
This is my solution to PE14:
memo = {1:1}
def get_collatz(n):
    if n in memo : return memo[n]
    if n % 2 == 0:
        terms = get_collatz(n/2) + 1
    else:
        terms = get_collatz(3*n + 1) + 1
    memo[n] = terms
    return terms
compare = 0
for x in xrange(1, 999999):
    if x not in memo:
        ctz = get_collatz(x)
        if ctz > compare:
            compare = ctz
            culprit = x
print culprit
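For readers on Python 3, here is a non-recursive sketch of the same memoization idea. The names _lengths, collatz_length and longest_start are made up for this example, and the loop assumes every start value eventually reaches an already cached number, which holds for all values anyone has tested but is not proven in general.

```python
_lengths = {1: 1}   # start value -> number of terms in its Collatz sequence

def collatz_length(n):
    """Number of terms in the Collatz sequence starting at n, with memoization."""
    path = []
    while n not in _lengths:             # walk until we hit a known value
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = _lengths[n]
    for value in reversed(path):         # fill in the cache on the way back
        length += 1
        _lengths[value] = length
    return length

def longest_start(limit):
    """Start value below limit with the longest sequence, and that length."""
    best = max(range(1, limit), key=collatz_length)
    return best, collatz_length(best)

print(longest_start(10))
```

Walking forward and then filling the cache backwards avoids deep recursion, and every intermediate value on the path is cached, not just the start value.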
