SPOJ prime1 wrong answer in python - python

I am getting Wrong Answer with the following Python code for SPOJ's PRIME1 problem at http://www.spoj.com/problems/PRIME1/. I have tested it on various test cases myself but cannot find a failing one. Can someone please spot the problem in my code?
This code prints nothing for test cases whose range contains no primes. First I precompute the primes up to sqrt(1 billion). If the upper end of the requested range is below sqrt(1 billion), I simply print the primes from the precomputed array; otherwise I run sieve() with usePrimes = True, which uses the precomputed primes to rule out the non-primes in the given range.
Thanks.
import math
from bisect import bisect_left
from bisect import bisect_right
primes = []
upper_bound = int(math.sqrt(1000000000)) + 1
usePrimes = False
printNL = False
T = 0
def sieve(lo, hi):
    global usePrimes, primes, printNL
    atleast_one = False
    arr = range(lo,hi+1)
    if usePrimes:
        for p in primes:
            if p*p > hi:
                break
            less = int(lo/p) * p
            if less < lo:
                less += p
            while less <= hi:
                arr[less - lo] = 0
                less += p
        for num in arr:
            if num != 0:
                atleast_one = True
                if printNL:
                    print ''
                    printNL = False
                print num
    else:
        atleast_one = True
        for k in xrange(2,hi):
            if k*k > hi:
                break
            if arr[k] == 0:
                continue
            less = k + k
            while less <= hi:
                arr[less] = 0
                less += k
        for num in arr:
            if num > 1:
                primes.append(num)
    return atleast_one
def printPrimesInRange(lo,hi):
    global primes, printNL
    atleast_one = False
    if hi < upper_bound:
        for p in primes[bisect_left(primes,lo):bisect_right(primes,hi)]:
            atleast_one = True
            if printNL:
                print ''
                printNL = False
            print p
    else:
        atleast_one = sieve(lo,hi)
    return atleast_one
sieve(0, upper_bound)
usePrimes = True
T = input()
while T > 0:
    lo, hi = [eval(y) for y in raw_input().split(' ')]
    atleastOne = printPrimesInRange(lo,hi)
    if atleastOne:
        printNL = True
    T -= 1

If you change upper_bound to upper_bound = int(math.sqrt(1000000000)) + 123456, it will pass all the test cases.
Now, can you figure out why? I'll leave that as an exercise for you.
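For readers who want to check their reasoning: one failure mode visible in the question's sieve() is that the marking loop starts at the first multiple of p that is >= lo, which is p itself whenever p lies inside [lo, hi] (for example lo = 2, hi = 100000 crosses out 2). The sketch below is a Python 3 segmented sieve that avoids this by never starting below p*p. The function names small_primes and primes_in_range are made up for this illustration and are not part of the original code.

```python
import math

def small_primes(limit):
    """Return all primes <= limit (limit >= 1) with a plain sieve of Eratosthenes."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return [i for i, is_p in enumerate(flags) if is_p]

def primes_in_range(lo, hi, base_primes):
    """Return the primes in [lo, hi]; base_primes must cover sqrt(hi)."""
    lo = max(lo, 2)  # 0 and 1 are never prime
    if hi < lo:
        return []
    flags = [True] * (hi - lo + 1)
    for p in base_primes:
        if p * p > hi:
            break
        # Start at p*p at the earliest, so p itself is never crossed out
        # even when p lies inside [lo, hi].
        start = max(p * p, ((lo + p - 1) // p) * p)
        for multiple in range(start, hi + 1, p):
            flags[multiple - lo] = False
    return [lo + i for i, is_p in enumerate(flags) if is_p]

# Base primes up to sqrt(1 billion), the input limit used in the question.
BASE = small_primes(math.isqrt(10**9) + 1)
print(primes_in_range(1, 10, BASE))  # [2, 3, 5, 7]
```

With the enlarged upper_bound from the answer, any range that reaches the sieve branch starts above sqrt(1 billion), so no sieving prime can fall inside it, which is consistent with the fix being accepted.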

Related

Further Optimisation of Project Euler problem 14 (Collatz Sequence)

When I first started trying the question, my code would take over a minute to finish running and give me the answer. I have already tried dynamic programming, storing previous results so the same number is never processed more than once. I have also tried combining (n*3)+1 and n / 2 into a single step, ((n*3)+1) / 2, but together these have only managed to cut the running time to 10 seconds. Is there anything else I can try to speed up my code?
def Collatz(n):
    dic = {a: 0 for a in range(1,1000000)}
    dic[1] = 0
    dic[2] = 1
    number,length = 1,1
    for i in range(3,n,1):
        z = i
        testlength = 0
        loop = "T"
        while loop == "T":
            if z % 2 == 0:
                z = z / 2
                testlength += 1
            else:
                z = ((z*3)+1) / 2
                testlength += 2
            if z < i:
                testlength += dic[z]
                loop = "F"
        dic[i] = testlength
        if testlength > length:
            print(i,testlength)
            number,length = i,testlength
    return number,length
print(Collatz(1000000))
When you calculate the sequence for one input, you find out the sequence length for all the intermediate values. It helps to remember all of these in the dictionary so you never have to calculate a sequence twice of any number < n.
I also started at (n-1)//2, since there's no point testing any number x if 2x is going to be tested later, because 2x will certainly have a longer sequence:
def Collatz(n):
    dic = [-1]*n
    dic[1] = 0
    bestlen = 0
    bestval = 1
    q=[]
    for i in range((n-1)//2,n,1):
        q.clear()
        z = i
        while z >= n or dic[z] < 0:
            q.append(z)
            if z % 2 == 0:
                z = z//2
            else:
                z = z*3+1
        testlen = len(q)+dic[z]
        if testlen > bestlen:
            bestlen = testlen
            bestval = i
            print (bestval, bestlen)
        for j in range(0,len(q)):
            z = q[j]
            if z < n:
                dic[z] = testlen-j
    return bestval, bestlen
print(Collatz(1000000))
Although the answer from Matt Timmermanns is fast, it is not quite as easy to understand as a recursive function. Here is my attempt, which is actually faster for n = 10 million and perhaps easier to understand:
f = 10000000
def collatz(n):
    if n>=collatz.bounds:
        if (n % 4) == 0:
            return collatz(n//4)+2
        if (n % 2) == 0:
            return collatz(n//2)+1
        return collatz((3*n+1)//2)+2
    if collatz.memory[n]>=0:
        return collatz.memory[n]
    if (n % 2) == 0:
        count = collatz(n//2)+1
    else:
        count = collatz((3*n+1)//2)+2
    collatz.memory[n] = count
    return count
collatz.memory = [-1]*f
collatz.memory[1] = 0
collatz.bounds = f
highest = max(collatz(i) for i in range(f//2, f+1))
highest_n = collatz.memory.index(highest)
print(f"collatz({highest_n}) is {highest}")
My results:
$ time /usr/bin/python3 collatz.py
collatz(8400511) is 685
real 0m9.445s
user 0m9.375s
sys 0m0.060s
Compared to
$ time /usr/bin/python3 mattsCollatz.py
(8400511, 685)
real 0m10.672s
user 0m10.599s
sys 0m0.066s

Finding the sum of prime numbers between m and n (m and n included in the sum)

def isPrime(n, i):
    if i == n-1:
        return ("True")
    elif n%i == 0:
        return ("False")
    else:
        return isPrime(n, i+1)
def sumOfPrime(m,n):
    if m > 0 and n > 0 and m <= n:
        if isPrime(m,2)==True:
            temp = temp + m
            return temp
        else:
            return (sumOfPrime(m+1,n))
    else:
        return temp
How can I fix the error "UnboundLocalError: local variable 'temp' referenced before assignment" without using a global variable?
I reviewed your code, and this is my proposal:
def isPrime(n, i=None):
    if n < 2:
        return False  # 0 and 1 are not prime
    if i is None:
        i = n - 1
    while i >= 2:
        if n % i == 0:
            return False
        else:
            return isPrime(n, i-1)
    else:
        return True
def sumOfPrime(m, n):
    sum = 0
    for value in range(m, n+1):
        if isPrime(value):
            sum = sum + value
    return sum
# --- test ---
result = sumOfPrime(1, 9)
print (result) # <-- prints 17
If the difference between m and n is large, it is recommended that you use some type of sieve to filter out the primes in the given range. Otherwise, iterating over the numbers from m to n and checking each one for primality is going to be expensive for large m and n.
def is_prime(n):
    if n <= 2:
        return n > 1
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True
def prime_range_sum(m, n):
    return sum(i for i in range(m, n + 1) if is_prime(i))
print(prime_range_sum(1, 9))
# prints 17
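The code in this answer still tests each number by trial division. For comparison, here is a minimal sketch of the sieve approach recommended in the same answer, assuming n is small enough that a list of n + 1 flags fits in memory. The name prime_sum_sieve is made up for this illustration.

```python
def prime_sum_sieve(m, n):
    """Sum of all primes p with m <= p <= n, using a sieve of Eratosthenes up to n."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i <= n:
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for multiple in range(i * i, n + 1, i):
                is_prime[multiple] = False
        i += 1
    return sum(k for k in range(max(m, 2), n + 1) if is_prime[k])

print(prime_sum_sieve(1, 9))    # 17
print(prime_sum_sieve(10, 45))  # 264
```

The sieve is built once per call and costs roughly O(n log log n), so it pays off when the range [m, n] is wide; for a narrow range far from zero, a segmented sieve would be the usual refinement.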
Here's my version, which I kept as close as possible to the original, while fixing some errors and making some adjustments.
def isPrime(n, i=2): # since you always use 2, just make it default
    if i == n-1:
        return True # return a boolean True instead of a string
    elif n%i == 0:
        return False # return a boolean False instead of a string
    else:
        return isPrime(n, i+1)
def sumOfPrime(m,n,total=0): # we will need to carry the total around, make default to 0
    if 0 < m <= n: # we can simplify this complex condition
        if isPrime(m):
            total += m # if it's prime, increase the total...
        return sumOfPrime(m+1, n, total) # and pass it to the next recursion
    return total # if this is the last recursion, return total
# Example run
total = sumOfPrime(10,45)
print(total) # prints 264

Having trouble with implementing the Miller-Rabin compositeness in Python

I'm not sure if this is the right place to post this question, so if it isn't, let me know! I'm trying to implement the Miller-Rabin test in Python. The task is to find the first witness to the compositeness of N, an odd number. My code works for smaller numbers but stops working when I enter a huge number. (The "challenge" is to find the witness of N := 14779897919793955962530084256322859998604150108176966387469447864639173396414229372284183833167, for which my code reports that it is prime when it isn't.) The first part of the test is to write N - 1 in the form 2^k * q, where q is odd.
Is there some limit in Python that doesn't allow huge numbers for this?
Here is my code for that portion of the test.
def convertN(n): #this turns n into 2^x * q
    placeholder = False
    list = []
    #this will be x in the equation
    count = 1
    while placeholder == False:
        #x = result of division of 2^count
        x = (n / (2**count))
        #y tells if we can divide by 2 again or not
        y = x%2
        #if y != 0, it means that we cannot divide by 2, loop exits
        if y != 0:
            placeholder = True
            list.append(count) #x
            list.append(x) #q
        else:
            count += 1
    #makes list to return
    #print(list)
    return list
The code for the actual test:
def test(N):
    #if even return false
    if N == 2 | N%2 == 0:
        return "even"
    #convert number to 2^k+q and put into said variables
    n = N - 1
    nArray = convertN(n)
    k = nArray[0]
    q = int(nArray[1])
    #this is the upper limit a witness can be
    limit = int(math.floor(2 * (math.log(N))**2))
    #Checks when 2^q*k = 1 mod N
    for a in range(2,limit):
        modu = pow(a,q,N)
        for i in range(k):
            print(a,i,modu)
            if i==0:
                if modu == 1:
                    break
                elif modu == -1:
                    break
            elif i != 0:
                if modu == 1:
                    #print(i)
                    return a
            #instead of recalculating 2^q*k+1, can square old result and modN that.
            modu = pow(modu,2,N)
Any feedback is appreciated!
I don't like unanswered questions, so I decided to give a small update.
As it turns out, I was entering the wrong number from the start. In addition, in the second part my code should have tested not for when the value equaled 1 but for when it equaled -1.
The fixed code for the check:
#Checks when 2^q*k = 1 mod N
for a in range(2,limit):
modu = pow(a,q,N)
witness = True #I couldn't think of a better way of doing this so I decided to go with a boolean value. So if any of values of -1 or 1 when i = 0 pop up, we know it's not a witness.
for i in range(k):
print(a,i,modu)
if i==0:
if modu == 1:
witness = False
break
elif modu == -1:
witness = False
break
#instead of recalculating 2^q*k+1, can square old result and modN that.
modu = pow(modu,2,N)
if(witness == True):
return a
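Two details in the code above are worth a hedged illustration. First, assuming Python 3, n / (2**count) is float division, so for a number of this size the quotient is rounded and the parity check no longer sees the true value; floor division (//) keeps everything an exact int. Second, pow(a, q, N) returns a value between 0 and N - 1, so "-1 mod N" has to be compared as N - 1. The sketch below follows the textbook strong-probable-prime check rather than the poster's exact code; the names decompose, is_witness and first_witness are made up for this illustration, and the search bound 2 * (ln N)^2 is copied from the question.

```python
import math

def decompose(n):
    """Write n as 2**k * q with q odd, using integer arithmetic only."""
    k = 0
    while n % 2 == 0:
        n //= 2  # floor division keeps n an exact int, however large
        k += 1
    return k, n

def is_witness(a, N):
    """Return True if base a proves that the odd number N > 2 is composite."""
    k, q = decompose(N - 1)
    x = pow(a, q, N)
    if x == 1 or x == N - 1:  # pow() yields values in [0, N-1], so -1 shows up as N - 1
        return False
    for _ in range(k - 1):
        x = pow(x, 2, N)
        if x == N - 1:
            return False
    return True

def first_witness(N):
    """Return the smallest witness a >= 2 below the search bound, or None."""
    limit = min(N, int(2 * math.log(N) ** 2))  # bound taken from the question's code
    for a in range(2, limit):
        if is_witness(a, N):
            return a
    return None

print(decompose(560))      # (4, 35)
print(first_witness(561))  # 2
print(first_witness(97))   # None
```

For a prime N no base is a witness, so first_witness returns None; for a composite N the first base that fails the check is returned.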
Mei, I wrote a Miller-Rabin test in Python. The Miller-Rabin part is threaded, so it's very fast, faster than sympy for larger numbers:
import math
def strailing(N):
    return N>>lars_last_powers_of_two_trailing(N)
def lars_last_powers_of_two_trailing(N):
    """ This utilizes a bit trick to find the trailing zeros in a number
        Finding the trailing number of zeros is simply a lookup for most
        numbers and only in the case of 1 do you have to shift to find the
        number of zeros, so there is no need to bit shift in 7 of 8 cases.
        In those 7 cases, it's simply a lookup to find the amount of zeros.
    """
    p,y=1,2
    orign = N
    N = N&15
    if N == 1:
        if ((orign -1) & (orign -2)) == 0: return orign.bit_length()-1
        while orign&y == 0:
            p+=1
            y<<=1
        return p
    if N in [3, 7, 11, 15]: return 1
    if N in [5, 13]: return 2
    if N == 9: return 3
    return 0
def primes_sieve2(limit):
    a = [True] * limit
    a[0] = a[1] = False
    for (i, isprime) in enumerate(a):
        if isprime:
            yield i
            for n in range(i*i, limit, i):
                a[n] = False
def llinear_diophantinex(a, b, divmodx=1, x=1, y=0, offset=0, withstats=False, pow_mod_p2=False):
    """ For the case we use here, using a
        llinear_diophantinex(num, 1<<num.bit_length()) returns the
        same result as a
        pow(num, 1<<num.bit_length()-1, 1<<num.bit_length()). This
        is 100 to 1000x times faster so we use this instead of a pow.
        The extra code is worth it for the time savings.
    """
    origa, origb = a, b
    r=a
    q = a//b
    prevq=1
    #k = powp2x(a)
    if a == 1:
        return 1
    if withstats == True:
        print(f"a = {a}, b = {b}, q = {q}, r = {r}")
    while r != 0:
        prevr = r
        a,r,b = b, b, r
        q,r = divmod(a,b)
        x, y = y, x - q * y
        if withstats == True:
            print(f"a = {a}, b = {b}, q = {q}, r = {r}, x = {x}, y = {y}")
    y = 1 - origb*x // origa - 1
    if withstats == True:
        print(f"x = {x}, y = {y}")
    x,y=y,x
    modx = (-abs(x)*divmodx)%origb
    if withstats == True:
        print(f"x = {x}, y = {y}, modx = {modx}")
    if pow_mod_p2==False:
        return (x*divmodx)%origb, y, modx, (origa)%origb
    else:
        if x < 0: return (modx*divmodx)%origb
        else: return (x*divmodx)%origb
def MillerRabin(arglist):
    """ This is a standard MillerRabin Test, but refactored so it can be
        used with multi threading, so you can run a pool of MillerRabin
        tests at the same time.
    """
    N = arglist[0]
    primetest = arglist[1]
    iterx = arglist[2]
    powx = arglist[3]
    withstats = arglist[4]
    primetest = pow(primetest, powx, N)
    if withstats == True:
        print("first: ",primetest)
    if primetest == 1 or primetest == N - 1:
        return True
    else:
        for x in range(0, iterx-1):
            primetest = pow(primetest, 2, N)
            if withstats == True:
                print("else: ", primetest)
            if primetest == N - 1: return True
            if primetest == 1: return False
    return False
# For trial division, we setup this global variable to hold primes
# up to 100,000 (the limit passed to primes_sieve2 below)
SFACTORINT_PRIMES=list(primes_sieve2(100000))
# Uses MillerRabin in a unique algorithimically deterministic way and
# also uses multithreading so all MillerRabin Tests are performed at
# the same time, speeding up the isprime test by a factor of 5 or more.
# More k tests can be performed than 5, but in my testing i've found
# that's all you need.
def sfactorint_isprime(N, kn=5, trialdivision=True, withstats=False):
    from multiprocessing import Pool
    if N == 2:
        return True
    if N % 2 == 0:
        return False
    if N < 2:
        return False
    # Trial Division Factoring
    if trialdivision == True:
        for xx in SFACTORINT_PRIMES:
            if N%xx == 0 and N != xx:
                return False
    iterx = lars_last_powers_of_two_trailing(N)
    """ This k test is a deterministic algorithmic test builder instead of
        using random numbers. The offset of k, from -2 to +2 produces pow
        tests that fail or pass instead of having to use random numbers
        and more iterations. All you need are those 5 numbers from k to
        get a primality answer. I've tested this against all numbers in
        https://oeis.org/A001262/b001262.txt and all fail, plus other
        exhaustive testing comparing to other isprimes to confirm it's
        accuracy.
    """
    k = llinear_diophantinex(N, 1<<N.bit_length(), pow_mod_p2=True) - 1
    t = N >> iterx
    tests = []
    if kn % 2 == 0: offset = 0
    else: offset = 1
    for ktest in range(-(kn//2), (kn//2)+offset):
        tests.append(k+ktest)
    for primetest in range(len(tests)):
        if tests[primetest] >= N:
            tests[primetest] %= N
    arglist = []
    for primetest in range(len(tests)):
        if tests[primetest] >= 2:
            arglist.append([N, tests[primetest], iterx, t, withstats])
    with Pool(kn) as p:
        s=p.map(MillerRabin, arglist)
    if s.count(True) == len(arglist): return True
    else: return False
sinn=14779897919793955962530084256322859998604150108176966387469447864639173396414229372284183833167
print(sfactorint_isprime(sinn))

Python 3: Optimizing Project Euler Problem #14

I'm trying to solve the Hackerrank Project Euler Problem #14 (Longest Collatz sequence) using Python 3. Following is my implementation.
cache_limit = 5000001
lookup = [0] * cache_limit
lookup[1] = 1
def collatz(num):
    if num == 1:
        return 1
    elif num % 2 == 0:
        return num >> 1
    else:
        return (3 * num) + 1
def compute(start):
    global cache_limit
    global lookup
    cur = start
    count = 1
    while cur > 1:
        count += 1
        if cur < cache_limit:
            retrieved_count = lookup[cur]
            if retrieved_count > 0:
                count = count + retrieved_count - 2
                break
            else:
                cur = collatz(cur)
        else:
            cur = collatz(cur)
    if start < cache_limit:
        lookup[start] = count
    return count
def main(tc):
    test_cases = [int(input()) for _ in range(tc)]
    bound = max(test_cases)
    results = [0] * (bound + 1)
    start = 1
    maxCount = 1
    for i in range(1, bound + 1):
        count = compute(i)
        if count >= maxCount:
            maxCount = count
            start = i
        results[i] = start
    for tc in test_cases:
        print(results[tc])
if __name__ == "__main__":
    tc = int(input())
    main(tc)
There are 12 test cases. The above implementation passes till test case #8 but fails for test cases #9 through #12 with the following reason.
Terminated due to timeout
I'm stuck with this for a while now. Not sure what else can be done here.
What else can be optimized here so that I stop getting timed out?
Any help will be appreciated :)
Note: Using the above implementation, I'm able to solve the actual Project Euler Problem #14. It is giving timeout only for those 4 test cases in hackerrank.
Yes, there are things you can do to your code to optimize it. But I think, more importantly, there is a mathematical observation you need to consider which is at the heart of the problem:
whenever n is odd, then 3 * n + 1 is always even.
Given this, one can always divide (3 * n + 1) by 2. And that saves one a fair bit of time...
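As a small illustration of that observation (a sketch, not the asker's code): when n is odd, the step to 3n + 1 and the following halving can be done at once, as long as the combined move is counted as two steps. The function name collatz_steps is made up for this example.

```python
def collatz_steps(n):
    """Number of Collatz steps needed to reach 1 from n (n >= 1)."""
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n //= 2
            steps += 1
        else:
            # 3n + 1 is even for odd n, so halve immediately
            # and count both moves.
            n = (3 * n + 1) // 2
            steps += 2
    return steps

print(collatz_steps(6))   # 8
print(collatz_steps(27))  # 111
```

This only shortens each individual walk; for the HackerRank limits it would still need to be combined with caching of already computed lengths, as in the question's lookup table.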
Here is an improvement (it takes 1.6 seconds): there is no need to compute the sequence of every number. You can create a dictionary and store the number of the elements of a sequence. If a number that has appeared already comes up, the sequence is computed as dic[original_number] = dic[n] + count - 1. This saves a lot of time.
import time
start = time.time()
def main(n,dic):
    '''Counts the elements of the sequence starting at n and finishing at 1'''
    count = 1
    original_number = n
    while True:
        if n < original_number:
            dic[original_number] = dic[n] + count - 1 #-1 because when n < original_number, n is counted twice otherwise
            break
        if n == 1:
            dic[original_number] = count
            break
        if (n % 2 == 0):
            n = n/2
        else:
            n = 3*n + 1
        count += 1
    return dic
limit = 10**6
dic = {n:0 for n in range(1,limit+1)}
if __name__ == '__main__':
    n = 1
    while n < limit:
        dic=main(n,dic)
        n += 1
    print('Longest chain: ', max(dic.values()))
    print('Number that gives the longest chain: ', max(dic, key=dic.get))
    end = time.time()
    print('Time taken:', end-start)
The trick to solving this question is to compute the answers only up to the largest input and save the results as a lookup for all smaller inputs, rather than calculating up to the extreme upper bound.
Here is my implementation, which passes all the test cases (Python 3).
MAX = int(5 * 1e6)
ans = [0]
steps = [0]*(MAX+1)
def solve(N):
    if N < MAX+1:
        if steps[N] != 0:
            return steps[N]
    if N == 1:
        return 0
    else:
        if N % 2 != 0:
            result = 1+ solve(3*N + 1) # This is recursion
        else:
            result = 1 + solve(N>>1) # This is recursion
    if N < MAX+1:
        steps[N]=result # This is memoization
    return result
inputs = [int(input()) for _ in range(int(input()))]
largest = max(inputs)
mx = 0
collatz=1
for i in range(1,largest+1):
    curr_count=solve(i)
    if curr_count >= mx:
        mx = curr_count
        collatz = i
    ans.append(collatz)
for _ in inputs:
    print(ans[_])
This is my brute-force take:
#counter
C = 0
N = 0
for i in range(1,1000001):
    n = i
    c = 0
    while n != 1:
        if n % 2 == 0:
            _next = n/2
        else:
            _next= 3*n+1
        c = c + 1
        n = _next
    if c > C:
        C = c
        N = i
print(N,C)
Here's my implementation(for the question specifically on Project Euler website):
num = 1
limit = int(input())
seq_list = []
while num < limit:
    sequence_num = 0
    n = num
    if n == 1:
        sequence_num = 1
    else:
        while n != 1:
            if n % 2 == 0:
                n = n / 2
                sequence_num += 1
            else:
                n = 3 * n + 1
                sequence_num += 1
        sequence_num += 1
    seq_list.append(sequence_num)
    num += 1
k = seq_list.index(max(seq_list))
print(k + 1)

Why does set( ) make this code run so much faster?

I wrote some code for Project Euler Problem 35:
#Project Euler: Problem 35
import time
start = time.time()
def sieve_erat(n):
    '''creates list of all primes < n'''
    x = range(2,n)
    b = 0
    while x[b] < int(n ** 0.5) + 1:
        x = filter(lambda y: y % x[b] != 0 or y == x[b], x)
        b += 1
    else:
        return x
def circularPrimes(n):
    '''returns # of circular primes below n'''
    count = 0
    primes = sieve_erat(n)
    b = set(primes)
    for prime in primes:
        inc = 0
        a = str(prime)
        while inc < len(a):
            if int(a) not in b:
                break
            a = a[-1] + a[0:len(a) - 1]
            inc += 1
        else:
            count += 1
    else:
        return count
print circularPrimes(1000000)
elapsed = (time.time() - start)
print "Found in %s seconds" % elapsed
I am wondering why this code (above) runs so much faster when I set b = set(primes) in the circularPrimes function. The running time for this code is about 8 seconds. Initially, I did not set b = set(primes) and my circularPrimes function was this:
def circularPrimes(n):
    '''returns # of circular primes below n'''
    count = 0
    primes = sieve_erat(n)
    for prime in primes:
        inc = 0
        a = str(prime)
        while inc < len(a):
            if int(a) not in primes:
                break
            a = a[-1] + a[0:len(a) - 1]
            inc += 1
        else:
            count += 1
    else:
        return count
My initial code (without b = set(primes)) ran so long that I didn't wait for it to finish. I am curious why there is such a large discrepancy in running time between the two pieces of code, as I do not believe that primes would have had any duplicates that would have made iterating through it take so much longer than iterating through set(primes). Maybe my idea of set( ) is wrong. Any help is welcome.
I believe the culprit here is if int(a) not in b:. Sets are implemented internally as hash tables, meaning that checking for membership is significantly less expensive than with a list: a set lookup hashes the value and inspects only the matching bucket, while a list has to be scanned element by element.
You can check out the innards of sets here.
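A quick way to see the difference for yourself is to time the same membership test against a list and a set. This is a minimal Python 3 sketch with made-up data (100,000 consecutive integers rather than the actual primes); the absolute numbers depend on the machine, so no timings are quoted here.

```python
import timeit

values = list(range(100_000))
as_list = values
as_set = set(values)
probe = 99_999  # worst case for the list: the match is the last element

# Run each membership test 200 times and compare the total time.
list_time = timeit.timeit(lambda: probe in as_list, number=200)
set_time = timeit.timeit(lambda: probe in as_set, number=200)
print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

In circularPrimes the test runs once per rotation of every prime, so a linear scan per test multiplies into a very large number of comparisons, while the set keeps each test at roughly constant cost.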
