How to deal with excessively large inputs in python?

I am a beginner and I was practicing a question on hackerrank.
I wrote this code as part of a problem which timed out for large inputs:
K = int(input())
roomnos = input().split()
setroomnos = set(roomnos)
for r in setroomnos:
    if roomnos.count(r) == 1:
        print(r)
        break
The next one was accepted by the judge for all test cases:
K = int(input())
roomnos = [int(i) for i in input().split()]
setroomnos = set(roomnos)
c = (K * sum(setroomnos) - sum(roomnos)) // (K - 1)
print(c)
Can you please explain why the first one timed out for large inputs and the second one worked fine?
PS: The fundamental operation is to find the number that appears only once in a list in which every other number appears K times.

Your first solution runs an O(n) for loop with an O(n) count call inside it, leading to O(n^2) complexity. Your second example doesn't nest operations in that way, and so is O(n).
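To make the difference concrete, here is a minimal sketch of a linear-time version of the first approach, next to the sum formula from the accepted code. It assumes the room numbers have already been read into a list; the function names find_unique_room and find_unique_room_by_sums are made up for this illustration. The formula works because K * sum(distinct) counts the unique room K times, while sum(rooms) counts it once, so the difference is (K - 1) times the unique room.

```python
from collections import Counter


def find_unique_room(rooms):
    """Return the value that occurs exactly once, or None if there is none.

    Counter makes a single pass over the list, so the whole function is O(n)
    instead of calling list.count() (itself O(n)) once per distinct value.
    """
    counts = Counter(rooms)
    for room, seen in counts.items():
        if seen == 1:
            return room
    return None


def find_unique_room_by_sums(rooms, k):
    """Same answer via the arithmetic used in the accepted solution.

    Assumes every room except one appears exactly k times (k > 1).
    """
    distinct = set(rooms)
    return (k * sum(distinct) - sum(rooms)) // (k - 1)


if __name__ == "__main__":
    sample = [3, 7, 3, 3, 9, 9, 9]  # hypothetical data with K = 3
    print(find_unique_room(sample))
    print(find_unique_room_by_sums(sample, 3))
```

Both functions touch each element a constant number of times, which is why they stay fast as the input grows.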

Related

Coin change problem: difference between these two methods

I am implementing the coin change problem in python in CS50's pset6. When I first tackled the problem, this was the algorithm I used:
import time
while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except ValueError:
        print('Error: Please enter a positive numeric value')
start_time1 = time.time()
# convert money into cents; round() avoids float truncation (0.29 * 100 is 28.999...)
change1 = round(totalChange * 100)
n = 0
while change1 >= 25:
    change1 -= 25
    n += 1
while change1 >= 10:
    change1 -= 10
    n += 1
while change1 >= 5:
    change1 -= 5
    n += 1
while change1 >= 1:
    change1 -= 1
    n += 1
print(f'Method1: {n}')
print("--- %s seconds ---" % (time.time() - start_time1))
Having watched the lecture on dynamic programming, I wanted to implement it into this problem. This was my attempt:
while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except ValueError:
        print('Error: Please enter a positive numeric value')
start_time2 = time.time()
# convert money into cents; round() avoids float truncation (0.29 * 100 is 28.999...)
change2 = round(totalChange * 100)
rowsCoins = [1,5,10,25]
colsCoins = list(range(change2 + 1))
n = len(rowsCoins)
m = len(colsCoins)
matrix = [[i for i in range(m)] for j in range(n)]
for i in range(1,n):
    for j in range(1,m):
        if rowsCoins[i] == j:
            matrix[i][j] = 1
        elif rowsCoins[i] > j:
            matrix[i][j] = matrix[i-1][j]
        else:
            matrix[i][j] = min(matrix[i-1][j], 1 + matrix[i][j-rowsCoins[i]])
print(f'Method2: {matrix[-1][-1]}')
print("--- %s seconds ---" % (time.time() - start_time2))
When I run the program, it gives the correct answers, but it takes much longer.
How could I adjust the second code so that it correctly implements dynamic programming? Is my problem that I am starting the loops from the top-left corner of the matrix instead of the bottom-right?
What are the time complexities of the algorithms in each version that I wrote (as well as for a correct implementation of dynamic programming)? I suspect that the first code is O(n^4), the second code is O(n*m), and a correct implementation of dynamic programming should be O(n). Am I correct to think this?
Any help for a better understanding of these algorithms is much appreciated.

I think both algorithms are basically O(n).
n in this case is the size of the number entered.
In the first algorithm, it's not O(n^4), as that would mean you have 4 nested loops each looping n times. Instead, you have 4 loops that run sequentially. Even if each of them ran n times, that would be O(4n), which is the same as O(n). In fact only the first loop grows with the input (about n/25 iterations); the other three run at most 2, 1 and 4 times respectively.
In the second algorithm, your choice of variable names confuses things a little. n is a constant, and m is based on the size of the input, so it is what would typically be called n. So, if we rename n to c and m to n, we get O(c*n) which, again, is the same as O(n).
The key point here is that for any particular n, an O(n) algorithm isn't necessarily faster than, say, an O(n^2) algorithm. Big O notation just describes how the amount of work done varies with the size of the input. What it does say is that as n gets bigger, the time taken by an O(n) algorithm will increase more slowly than the time taken by an O(n^2) algorithm, so for some large enough n, the algorithm with the lower complexity will be quicker.
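One way to see "same big O, different amount of work" is to count loop iterations instead of timing them. The sketch below is only an illustration: greedy_steps and dp_steps are hypothetical helper names, and dp_steps simply counts how many table cells the question's double loop fills (rows 1 to 3, columns 1 to the amount in cents).

```python
def greedy_steps(cents):
    """Count iterations of the four sequential while loops in Method 1."""
    steps = 0
    for coin in (25, 10, 5, 1):
        while cents >= coin:
            cents -= coin
            steps += 1
    return steps


def dp_steps(cents, num_coins=4):
    """Count cells filled by the nested loops in Method 2.

    The outer loop covers rows 1..num_coins-1 and the inner loop covers
    columns 1..cents, so the body runs (num_coins - 1) * cents times.
    """
    return (num_coins - 1) * cents


if __name__ == "__main__":
    for amount in (1000, 2000, 4000):
        # Both counts double when the amount doubles (linear growth),
        # but the table fill does far more work at every size.
        print(amount, greedy_steps(amount), dp_steps(amount))
```

Doubling the amount doubles both counts, which is what O(n) means here; the ratio between the two counts is the constant factor that big O hides.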

How could I adjust the second code so that it is correctly implementing dynamic programming. Is my problem that I am starting the loops from the top left corner of the matrix instead of the bottom right?
IMHO, this problem is not well suited to dynamic programming, so it is hard to point to a "correct" DP implementation. Check the greedy solution at https://github.com/endiliey/cs50/blob/master/pset6/greedy.py, which should be the best approach.
What are the time complexities of the algorithms for each code that I wrote (as well as for a correct implementation of dynamic programming).
Both of your programs are O(n), but that does not mean they take the same time; as you observed, the DP solution is much slower. That is because they have different constant factors. For example, 4n and 0.25n are both O(n), but the first does 16 times as much work.
The greedy solution should have a time complexity of O(1).
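A constant-time greedy is possible when the repeated subtraction is replaced by integer division. The following is a sketch under that assumption, not a copy of the linked file; the name min_coins and its parameters are invented for this example, and it relies on the coin set 25/10/5/1, for which the greedy choice is known to be optimal.

```python
def min_coins(cents, denominations=(25, 10, 5, 1)):
    """Return the number of coins the greedy strategy uses for `cents`.

    The loop runs once per denomination (four times here) no matter how
    large `cents` is, so the work does not grow with the input.
    """
    count = 0
    for coin in denominations:
        used, cents = divmod(cents, coin)  # coins of this size, remainder
        count += used
    return count


if __name__ == "__main__":
    print(min_coins(41))  # 25 + 10 + 5 + 1
```

For other coin systems the greedy choice is not always optimal, which is where a dynamic programming table becomes necessary.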

Optimizing the run time of the nested for loop

I am just getting started with competitive programming, and after writing a solution to a certain problem I got a "runtime exceeded" error. The quantity to compute is:
max(|a[i] - a[j]| + |i - j|)
where a is a list of elements and i, j are indices; I need the maximum of the above expression over all pairs.
Here is a short but complete code snippet.
t = int(input()) # Number of test cases
for i in range(t):
    n = int(input()) # size of list
    a = list(map(int, str(input()).split())) # getting space separated input
    res = []
    for s in range(n): # These two loops are increasing the run-time
        for d in range(n):
            res.append(abs(a[s] - a[d]) + abs(s - d))
    print(max(res))
1<=t<=100
1<=n<=10^5
0<=a[i]<=10^5
The run time on the leaderboard is 5 sec for C and 35 sec for Python, while this code takes 80 sec.
It is an online judge, so the timing does not depend on my machine. numpy is not available.
Please keep it simple; I am new to Python.
Thanks for reading.

For a given j<=i, |a[i]-a[j]|+|i-j| = max(a[i]-a[j]+i-j, a[j]-a[i]+i-j).
Thus for a given i, the value of j<=i that maximizes |a[i]-a[j]|+|i-j| is either the j that maximizes a[j]-j or the j that minimizes a[j]+j.
Both these values can be computed as you run along the array, giving a simple O(n) algorithm:
def maxdiff(xs):
    mp = mn = xs[0]
    best = 0
    for i, x in enumerate(xs):
        mp = max(mp, x-i)
        mn = min(mn, x+i)
        best = max(best, x+i-mn, -x+i+mp)
    return best
And here's some simple testing against a naive but obviously correct algorithm:
def maxdiff_naive(xs):
    best = 0
    for i in range(len(xs)):
        for j in range(i+1):
            best = max(best, abs(xs[i]-xs[j]) + abs(i-j))
    return best
import random
for _ in range(500):
    r = [random.randrange(1000) for _ in range(50)]
    md1 = maxdiff(r)
    md2 = maxdiff_naive(r)
    if md1 != md2:
        print("%d != %d\n%s" % (md1, md2, r))
        break
It takes a fraction of a second to run maxdiff on an array of size 10^5, which is significantly better than your reported leaderboard scores.

"Competitive programming" is not about saving a few milliseconds by using a different kind of loop; it's about being smart about how you approach a problem, and then implementing the solution efficiently.
Still, one thing that jumps out is that you are wasting time building a list only to scan it to find the max. Your double loop can be transformed to the following (ignoring other possible improvements):
print(max(abs(a[s] - a[d]) + abs(s - d) for s in range(n) for d in range(n)))
But that's small fry. Worry about your algorithm first, and only then turn to even obvious time-wasters like this. You can cut the number of comparisons in half, as @Brett showed you, but I would first study the problem and ask myself: Do I really need to calculate this quantity n^2 times, or even 0.5*n^2 times? That's how you get the times down, not by shaving off milliseconds.

HackerRank Python - some test cases get "Terminated due to timeout", how can i optimize the code?

Currently, on HackerRank, I am trying to solve the Circular Array Rotation problem. Most of the test cases work, but some are "Terminated due to timeout".
How can I change my code to optimise it?
#!/bin/python3
import sys
n,k,q = input().strip().split(' ')
n,k,q = [int(n),int(k),int(q)]
a = [int(a_temp) for a_temp in input().strip().split(' ')]
m = []
for a0 in range(q):
    m.append(int(input().strip()))
for i in range (0, k % n):
    temp = a[n-1] # Stores the last element temporarily
    a.pop(n-1) # Removes the last element
    a = [temp] + a # Appends the temporary element to the start (prepends)
for i in range (0, q):
    print(a[m[i]])

There's no need to transform the list at all. Just subtract k from the index you're passed whenever you do a lookup (perhaps with a modulus if k could be larger than n). This is O(1) per lookup, or O(q) overall.
Even if you wanted to transform the actual list, there's no need to do it one element at a time (which will require k operations that each take O(n) time, so O(n*k) total). You can simply concatenate a[-k:] and a[:-k] (again, perhaps with a modulus to fix the k > n case), taking O(n) time just once.
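Both suggestions can be sketched in a few lines. The helper names rotated_lookup and rotate_right are hypothetical, and the code assumes a non-empty list and a right rotation by k positions, as in the question's loop.

```python
def rotated_lookup(a, k, idx):
    """Return the element at position idx after rotating a right k times.

    No list is built: the element that ends up at idx started at idx - k,
    and the modulus wraps the index around and handles k > len(a).
    """
    return a[(idx - k) % len(a)]


def rotate_right(a, k):
    """Return a new list rotated right by k positions using one slice join."""
    k %= len(a)
    return a[-k:] + a[:-k]


if __name__ == "__main__":
    values = [1, 2, 3]  # made-up sample data
    print(rotate_right(values, 2))
    print([rotated_lookup(values, 2, i) for i in range(len(values))])
```

The lookup version does constant work per query, while the slicing version pays O(n) once and then indexes normally.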

Python Code Optimisation [closed]

Recently I found a puzzle that required me to list all cyclic primes below a number.
In this context cyclic means that if we rotate the digits it is still prime:
eg.
1193 is prime
1931 is prime
9311 is prime
3119 is prime
This is the code I originally wrote:
import math
a=[]
upto=1000000
for x in range(upto):
    a.append([x,0])
print('generated table')
a[1][1]=1
a[0][1]=1
for n in range(2,int(math.sqrt(upto))):
    for k in range(2,(int(upto/n)+2)):
        try:
            a[n*k][1]=1
        except IndexError:
            pass
print('sieve complete')
p=[]
for e in a:
    if (e[1]==0):
        p.append(e[0])
print('primes generated')
s=[]
for e in p:
    pr=True
    w=str(e)
    if all(c not in w for c in ['2','4','6','8','5','0']):
        for x in (w[i:]+w[:i] for i in range(len(w))):
            if int(x) not in p:
                pr=False
        if pr==True:
            s.append(e)
            print('found',e)
print(s)
It was fairly slow (about 12 s)! I know the prime generation isn't perfect, but the final bit is the slowest. I knew that this process for upto=10^6 can be done in under a second, so after some research I removed all string manipulation in favor of this function:
def rotate(n):
    prev=[]
    for l in range(6,0,-1):
        if(n<10**l):
            length=l
    while(n not in prev):
        prev.append(n)
        n=(n // 10) + (n % 10) * 10**(length-1)
        yield n
I also removed the 5,0,2,4,6,8 testing as I didn't know how to implement it. The result? It runs even slower! (over ten minutes, I guess the 5,0,2,4,6,8 testing was a good idea)
I tried using time.time() but I didn't find anything terribly inefficient (in the first code). How is it possible to improve this code? Are there any bad practices I'm currently using?

Here is some optimized code:
import math
upto = 1000000
a = [True] * upto
p = []
for n in range(2,upto):
    if a[n]:
        p.append(n)
        for k in range(2,(upto+n-1)//n):
            a[k*n] = False
print('primes generated')
s = []
p = set(p)
for e in p:
    pr=True
    w=str(e)
    if all(c not in w for c in ['2','4','6','8','5','0']):
        for x in (w[i:]+w[:i] for i in range(len(w))):
            if int(x) not in p:
                pr=False
                break
        if pr:
            s.append(e)
print(s)
Most important optimizations:
simplified the sieve code
converted the list of primes into a set, which makes the test x in p take constant time on average instead of linear time
added a break statement for when a non-prime rotation is found
Here is cleaner (but equivalent) code:
import math
upto=1000000
sieve = [True] * upto
primes = set()
for n in range(2,upto):
    if sieve[n]:
        primes.add(n)
        for k in range(2,(upto+n-1)//n):
            sieve[k*n] = False
def good(e):
    w = str(e)
    for c in w:
        if c not in '1379':
            return False
    for i in range(1,len(w)):
        x = int(w[i:]+w[:i])
        if x not in primes:
            return False
    return True
print(list(filter(good,primes)))

You can cut down on the time required for the first test by doing a set comparison instead of scanning the string once per digit, like so:
flags = set('246850')
if not set(str(e)).intersection(flags):
    # etc...
This replaces six separate scans of the string with a single set intersection. You can speed this up further, and make it a little more elegant, by turning it into a generator that you can then use for the final check, like so:
flags = set('246850')
primes = set(p)
easy_checks = (str(prime) for prime in primes if not set(str(prime)).intersection(flags))
Finally, you can rewrite the final check to get rid of all the appending, which tends to be slow, like so:
# every rotation must be prime, so this needs all(), not any()
test = lambda number: all(int(number[i:]+number[:i]) in primes for i in range(len(number)))
final = [number for number in easy_checks if test(number)]
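Putting the pieces from the answers together, a self-contained Python 3 version might look like the sketch below. It is an illustration rather than any answerer's exact code: the function name circular_primes is made up, it looks rotations up in a bytearray sieve instead of a set, and it drops the digit filter, so the single-digit primes 2 and 5 are included.

```python
def circular_primes(limit):
    """Return the sorted list of primes below limit whose rotations are all prime."""
    # A rotation keeps the digit count, so it can exceed `limit` but never
    # the next power of ten; sieve up to that bound to keep lookups in range.
    bound = 10 ** len(str(max(limit - 1, 1)))
    is_prime = bytearray([1]) * bound
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, int(bound ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, bound, n):
                is_prime[multiple] = 0

    result = []
    for n in range(2, limit):
        s = str(n)
        # all() stops at the first rotation that is not prime.
        if all(is_prime[int(s[i:] + s[:i])] for i in range(len(s))):
            result.append(n)
    return result


if __name__ == "__main__":
    print(circular_primes(100))
```

Indexing a sieve and testing membership in a set are both constant-time lookups, so either choice avoids the linear scan of the original list of primes.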

time complexity of variable loops

I want to calculate the time complexity (big O) of my Python program. There are two problems:
1: I have only a very basic knowledge of O(n) (that is, I know it has to do with time and the number of calculations),
and
2: none of the loops in my program run a fixed number of times; they all depend on the input data.

The n in O(n) means precisely the input size. So, if I have this code:
def findmax(l):
    maybemax = 0
    for i in l:
        if i > maybemax:
            maybemax = i
    return maybemax
Then I'd say that the complexity is O(n) -- how long it takes is proportional to the input size (since the loop loops as many times as the length of l).
If I had
def allbigger(l, m):
    for el in l:
        for el2 in m:
            if el < el2:
                return False
    return True
then, in the worst case (that is, when I return True), I have one loop of length len(l) and inside it, one of length len(m), so I say that it's O(l * m) or O(n^2) if the lists are expected to be about the same length.
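The worst-case claim can be checked by counting comparisons. This is a small sketch with a hypothetical variant named allbigger_counted that returns the comparison count alongside the result.

```python
def allbigger_counted(l, m):
    """Same logic as allbigger, but also report how many comparisons ran."""
    comparisons = 0
    for el in l:
        for el2 in m:
            comparisons += 1
            if el < el2:
                return False, comparisons  # early exit: best case
    return True, comparisons  # every pair was compared: worst case


if __name__ == "__main__":
    # Worst case: the result is True, so all len(l) * len(m) pairs are visited.
    print(allbigger_counted([5, 6, 7], [1, 2]))
    # Best case: the very first pair already fails.
    print(allbigger_counted([0, 6, 7], [1, 2]))
```

Big O describes the first situation: the count grows as len(l) * len(m) even though some inputs finish much sooner.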

Try this out to start, then head to wiki:
Plain English Explanation of Big O Notation
