Down to zero problem - getting time exceeded error - python

Trying to solve hackerrank problem.
You are given Q queries. Each query consists of a single number N. You can perform 2 operations on N in each move. If N=a×b(a≠1, b≠1), we can change N=max(a,b) or decrease the value of N by 1.
Determine the minimum number of moves required to reduce the value of N to 0.
I have used BFS approach to solve this.
a. Generating all prime numbers using seive
b. using prime numbers I can simply avoid calculating the factors
c. I enqueue -1 along with all the factors to get to zero.
d. I have also used previous results to not enqueue encountered data.
This still is giving me time exceeded. Any idea? Added comments also in the code.
import math
#find out all the prime numbers
primes = [1]*(1000000+1)
primes[0] = 0
primes[1] = 0
for i in range(2, 1000000+1):
if primes[i] == 1:
j = 2
while i*j < 1000000:
primes[i*j] = 0
j += 1
n = int(input())
for i in range(n):
memoize= [-1 for i in range(1000000)]
count = 0
n = int(input())
queue = []
queue.append((n, count))
while len(queue):
data, count = queue.pop(0)
if data <= 1:
count += 1
break
#if it is a prime number then just enqueue -1
if primes[data] == 1 and memoize[data-1] == -1:
queue.append((data-1, count+1))
memoize[data-1] = 1
continue
#enqueue -1 along with all the factors
queue.append((data-1, count+1))
sqr = int(math.sqrt(data))
for i in range(sqr, 1, -1):
if data%i == 0:
div = max(int(data/i), i)
if memoize[div] == -1:
memoize[div] = 1
queue.append((div, count+1))
print(count)

There are two large causes of slowness with this code.
Clearing an array is slower than clearing a set
The first problem is this line:
memoize= [-1 for i in range(1000000)]
this prepares 1 million integers and is executed for each of your 1000 test cases. A faster approach is to simply use a Python set to indicate which values have already been visited.
Unnecessary loop being executed
The second problem is this line:
if primes[data] == 1 and memoize[data-1] == -1:
If you have a prime number, and you have already visited this number, you actually do the slow loop searching for prime factors which will never find any solutions (because it is a prime).
Faster code
In fact, the improvement due to using sets is so much that you don't even need your prime testing code and the following code passes all tests within the time limit:
import math
n = int(input())
for i in range(n):
memoize = set()
count = 0
n = int(input())
queue = []
queue.append((n, count))
while len(queue):
data, count = queue.pop(0)
if data <= 1:
if data==1:
count += 1
break
if data-1 not in memoize:
memoize.add(data-1)
queue.append((data-1, count+1))
sqr = int(math.sqrt(data))
for i in range(sqr, 1, -1):
if data%i == 0:
div = max(int(data/i), i)
if div not in memoize:
memoize.add(div)
queue.append((div, count+1))
print(count)

Alternatively, there's a O(n*sqrt(n)) time and O(n) space complexity solution that passes all the test cases just fine.
The idea is to cache minimum counts for each non-negative integer number up to 1,000,000 (the maximum possible input number in the question) !!!BEFORE!!! running any query. After doing so, for each query just return a minimum count for a given number stored in the cache. So, retrieving a result will have O(1) time complexity per query.
To find minimal counts for each number (let's call it down2ZeroCounts), we should consider several cases:
0 and 1 have 0 and 1 minimal counts correspondingly.
Prime number p doesn't have factors other than 1 and itself. Hence, its minimal count is 1 plus a minimal count of p - 1 or more formally down2ZeroCounts[p] = down2ZeroCounts[p - 1] + 1.
For a composite number num it's a bit more complicated. For any pair of factors a > 1,b > 1 such that num = a*b the minimal count of num is either down2ZeroCounts[a] + 1 or down2ZeroCounts[b] + 1 or down2ZeroCounts[num - 1] + 1.
So, we can gradually build minimal counts for each number in ascending order. Calculating a minimal count of each consequent number will be based on optimal counts for lower numbers and so in the end a list of optimal counts will be built.
To better understand the approach please check the code:
from __future__ import print_function
import os
import sys
maxNumber = 1000000
down2ZeroCounts = [None] * 1000001
def cacheDown2ZeroCounts():
down2ZeroCounts[0] = 0
down2ZeroCounts[1] = 1
currentNum = 2
while currentNum <= maxNumber:
if down2ZeroCounts[currentNum] is None:
down2ZeroCounts[currentNum] = down2ZeroCounts[currentNum - 1] + 1
else:
down2ZeroCounts[currentNum] = min(down2ZeroCounts[currentNum - 1] + 1, down2ZeroCounts[currentNum])
for i in xrange(2, currentNum + 1):
product = i * currentNum
if product > maxNumber:
break
elif down2ZeroCounts[product] is not None:
down2ZeroCounts[product] = min(down2ZeroCounts[product], down2ZeroCounts[currentNum] + 1)
else:
down2ZeroCounts[product] = down2ZeroCounts[currentNum] + 1
currentNum += 1
def downToZero(n):
return down2ZeroCounts[n]
if __name__ == '__main__':
fptr = open(os.environ['OUTPUT_PATH'], 'w')
q = int(raw_input())
cacheDown2ZeroCounts()
for q_itr in xrange(q):
n = int(raw_input())
result = downToZero(n)
fptr.write(str(result) + '\n')
fptr.close()

Related

Performance improvement for calculating the powerset of a list of integers

I am trying to compute the powerset of a list of prime numbers. I have already done some research and the prefered way of doing this seems to be using a line like
itertools.chain.from_iterable(itertools.combinations(primes, r) for r in range(2, len(primes) + 1))
and then iterating over all combinations to get the products with math.prod(). All in all, the code currently looks like this:
number = 200
p1 = []
# calculate all primes below specified number
for i in range(2, number + 1):
isPrime = True
for prime in p1:
if i % prime == 0:
isPrime = False
if isPrime:
p1.append(i)
Pp = []
myIterable = itertools.chain.from_iterable(itertools.combinations(p1, r) for r in range(2, len(p1) + 1))
# convert iterable to integer array of products -- The code below is extremely slow and should be improved
for x in myIterable:
newValue = math.prod(x)
if newValue <= number:
Pp.append(newValue)
This works, but it is not feasible for any "number" greater than 100 because of too high execution time. The problem is the last for loop, which takes forever to compute. Everything else performs reasonably well. The powerset has to be constricted to sets, whos products are less or equal to number, as done using the last if statement, or else the memory will explode.
The solution to this problem was to create a pointer array, which crawls through the prime array until the product of the pointed primes gets too high. The needed helper functions can be implemented like this:
def calcProductOfPointers(pointerArray, dataArray):
prod = 1
for pointer in pointerArray:
prod *= dataArray[pointer]
return prod
def incrementPointer(pointerArray, dataArray, threshold):
ret = False
for i in range(1, len(pointerArray) + 1):
index = len(pointerArray) - i
pointerArray[index] += 1
if calcProductOfPointers(pointerArray, dataArray) <= threshold and pointerArray[index] < len(dataArray):
ret = True
break
elif index > 0:
pointerArray[index] = pointerArray[index - 1] + 2
else:
break
return ret
And then the iteration over all powersets can be substituted with this code:
Pp = []
for i in range(2, len(p1) + 1): # start at a minimum of 2 prime factors
primePointers = []
for index in range(i):
primePointers.append(index)
if calcProductOfPointers(primePointers, p1) > number:
break
while calcProductOfPointers(primePointers, p1) <= number:
Pp.append(calcProductOfPointers(primePointers, p1))
if not incrementPointer(primePointers, p1, number):
break

How to count the number of unique numbers in sorted array using Binary Search?

I am trying to count the number of unique numbers in a sorted array using binary search. I need to get the edge of the change from one number to the next to count. I was thinking of doing this without using recursion. Is there an iterative approach?
def unique(x):
start = 0
end = len(x)-1
count =0
# This is the current number we are looking for
item = x[start]
while start <= end:
middle = (start + end)//2
if item == x[middle]:
start = middle+1
elif item < x[middle]:
end = middle -1
#when item item greater, change to next number
count+=1
# if the number
return count
unique([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5])
Thank you.
Edit: Even if the runtime benefit is negligent from o(n), what is my binary search missing? It's confusing when not looking for an actual item. How can I fix this?
Working code exploiting binary search (returns 3 for given example).
As discussed in comments, complexity is about O(k*log(n)) where k is number of unique items, so this approach works well when k is small compared with n, and might become worse than linear scan in case of k ~ n
def countuniquebs(A):
n = len(A)
t = A[0]
l = 1
count = 0
while l < n - 1:
r = n - 1
while l < r:
m = (r + l) // 2
if A[m] > t:
r = m
else:
l = m + 1
count += 1
if l < n:
t = A[l]
return count
print(countuniquebs([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5]))
I wouldn't quite call it "using a binary search", but this binary divide-and-conquer algorithm works in O(k*log(n)/log(k)) time, which is better than a repeated binary search, and never worse than a linear scan:
def countUniques(A, start, end):
len = end-start
if len < 1:
return 0
if A[start] == A[end-1]:
return 1
if len < 3:
return 2
mid = start + len//2
return countUniques(A, start, mid+1) + countUniques(A, mid, end) - 1
A = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,4,5,5,5,5,5,5,5,5,5,5]
print(countUniques(A,0,len(A)))

Is there a way to make this reverse factorial code run more efficiently

I am just starting to learn python and made a program where it calculates the factorial number based on the factorial.
For example if I give the program the number 120 it will tell me it's factorial is 5
anyways my question is how can I make this code more efficient and faster.
Num = int(input())
i=0
for i in range(0,Num):
i = i + 1
x = Num/i
Num = x
if (x==1):
print(i)
Multiplications are much faster than divisions. You should try to reach the number with a factorial instead of dividing it iteratively:
def unfactorial(n):
f,i = 1,1
while f < n:
i += 1
f *= i
return i if f == n else None
unfactorial(120) # 5
A few things you can do:
Num = int(input())
i=0 # your for loop will initialize i, you don't need to do this here
for i in range(0,Num):
i = i + 1 # your for loop will increment i, no need to do this either
x = Num/i # you don't need the extra variable 'x' here
Num = x
if (x==1):
print(i)
You can rewrite this to look something like:
for index in range(1, number): # start range at 1
number /= index # this means; number = number / index
if number==1:
return index
Compute the factorials in ascending order until you reach (or exceed) the factorial you are looking for, using the previous factorial to efficiently compute the next.
def reverse_factorial(num):
i = 1
while num > 1:
i += 1
num /= i
return i
print(reverse_factorial(int(input())))

Trying to write a code that will find the sum of the even numbers of the Fibonacci sequence?

I'm new to programming and I'm trying to write a program in Python that will find the sum of the even numbers of the numbers below 4,000,000 in the Fibonacci sequence. I'm not sure what I'm doing wrong but nothing will print. Thanks for any help.
def fib():
listx = []
for x in range(4000000):
if x == 0:
return 1
elif x == 1:
return 1
else:
listx.append(fib(x - 1) + fib(x - 2))
return listx
def evens(fib):
y = 0
for x in fib():
if x % 2 == 0:
y += x
else:
continue
print (y)
Here's an approach that uses a generator to keep memory usage to a minimum:
def fib_gen(up_to):
n, m = 0, 1
while n <= up_to:
yield n
n, m = m, n + m
total = 0
for f in fib_gen(4000000):
if f % 2 == 0:
total += f
Another option:
def fib_gen(up_to, filter):
n, m = 0, 1
while n <= up_to:
if filter(n):
yield n
n, m = m, n + m
sum(fib_gen(4000000, lambda f: f % 2 == 0)) # sum of evens
sum(fib_gen(4000000, lambda f: f % 2)) # sum of odds
First things first, there appears to be some contention between your requirements and the code you've delivered :-) The text of your question (presumably taken from an assignment, or Euler #2) requests the ...
sum of the even numbers of the numbers below 4,000,000 in the Fibonacci sequence.
Your code is summing the even numbers from the first four million Fibonacci numbers which is vastly different. The four millionth Fibonacci number has, according to Binet's formula, north of 800,000 digits in it (as opposed to the seven digits in the highest one below four million).
So, assuming the text to be more correct than the code, you don't actually need to construct a list and then evaluate every item in it, that's rather wasteful on memory.
The Fibonacci numbers can be generated on the fly and then simply accumulated if they're even. It's also far more useful to be able to use an arbitrary method to accumulate the numbers, something like the following:
def sumFibWithCond(limit, callback):
# Set up initial conditions.
grandparent, parent, child = 0, 0, 1
accum = 0
# Loop until number is at or beyond limit.
while child < limit:
# Add any suitable number to the accumulator.
accum = accum + callback(child)
# Set up next Fibonacci cycle.
grandparent, parent = parent, child
child = grandparent + child
# Return accumulator when done.
return accum
def accumulateEvens(num):
# Return even numbers as-is, zero for odd numbers.
if num % 2 == 0:
return num
return 0
sumEvensBelowFourMillion = sumFibWithCond(4000000, accumulateEvens)
Of special note is the initial conditions. The numbers are initialised to 0, 0, 1 since we want to ensure we check every Fibonacci number (in child) for the accumulating condition. This means the initial value of child should be one assuming, as per the question, that's the first number you want.
This doesn't make any difference in the current scenario since one is not even but, were you to change the accumulating condition to "odd numbers" (or any other condition that allowed for one), it would make a difference.
And, if you'd prefer to subscribe to the Fibonacci sequence starting with zero, the starting values should be 0, 1, 0 instead.
Maybe this will help you.
def sumOfEvenFibs():
# a,b,c in the Fibonacci sequence
a = 1
b = 1
result = 0
while b < 4000000:
if b % 2 == 0:
result += b
c = a + b
a = b
b = c
return result

Python - Why does this prime factorization function get better performance from this?

I have this prime factorization function that I wrote:
def prime_factorization(n):
prime_factors = {}
for i in _prime_candidates(n):
if n % i == 0:
prime_factors[i] = 0
while n % i == 0:
n /= i
prime_factors[i] += 1
if n != 1: prime_factors[int(n)] = 1
return prime_factors
def _prime_candidates(n):
yield 2
for i in range(3, int(n**.5)+1, 2):
yield i
It takes around 0.387 seconds on my machine for n = 10^13. But if I copy the content of the for loop and run it for the number 2 before running the actual for loop, I get the same correct results but with a running time of about 0.003 seconds for n = 10^13. You can see that code below:
def prime_factorization(n):
prime_factors = {}
if n % 2 == 0:
prime_factors[2] = 0
while n % 2 == 0:
n /= 2
prime_factors[2] += 1
for i in _prime_candidates(n):
if n % i == 0:
prime_factors[i] = 0
while n % i == 0:
n /= i
prime_factors[i] += 1
if n != 1: prime_factors[int(n)] = 1
return prime_factors
def _prime_candidates(n):
yield 2
for i in range(3, int(n**.5)+1, 2):
yield i
Why does this cause such a massive performance gain?
Edit: I'm using Python 3.5 and I'm using the clock() function of the time module to benchmark.
In your initial version, _prime_candidates gets passed 10^13, so it generates candidates up to the square root of that.
In your second version, _prime_candidates gets passed 5^13, because all the factors of 2 have been divided out. It generates a much smaller number of candidates to test.
By folding the _prime_candidates logic into prime_factorization and recomputing the upper bound whenever you find a factor, you can get an even better, more general improvement:
def prime_factorization(n):
prime_factors = {}
factor_multiplicity = 0
while n % 2 == 0:
n //= 2
factor_multiplicity += 1
if factor_multiplicity:
prime_factors[2] = factor_multiplicity
factor_bound = n**.5
candidate = 3
while candidate <= factor_bound:
factor_multiplicity = 0
while n % i == 0:
n //= i
factor_multiplicity += 1
if factor_multiplicity:
prime_factors[candidate] = factor_multiplicity
factor_bound = n**.5
candidate += 2
if n != 1:
prime_factors[n] = 1
return prime_factors
Note that for large enough n, the computation of n**.5 eventually generates the wrong bound due to the limits of floating-point precision. You could fix this by comparing candidate * candidate <= n, or by using something like the decimal module to compute the bound to sufficient precision.
The reason is inside _prime_candidates function.
In your first example it generates all numbers 3,5,...,3162277 and you try to divide your n by all these candidates.
In your second example you firstly greatly reduce your n so _prime_candidates generates numbers 3,5,...,34939. it's much less numbers to check.

Categories