I'm trying to find the time complexity (Big-O) of some functions and to provide appropriate reasoning.
The first function is:
r = 0
# Assignment is constant time. Executed once. O(1)
for i in range(n):
    for j in range(i+1, n):
        for k in range(i, j):
            r += 1
            # Assignment and access are O(1). Executed n^3 times?
I annotate like this throughout. I see that this is a triple nested loop, so it must be O(n^3), but I think my reasoning here is very weak: I don't really understand what is going on inside the triple nested loop.
Second function is:
i = n
# Assignment is constant time. Executed once. O(1)
while i > 0:
    k = 2 + 2
    i = i // 2
    # i is halved by the line above on each iteration,
    # so the assignment and access, which are O(1), are
    # executed log n times??
I worked this algorithm out to be O(1), but like the first function, I don't see what is going on in the while loop. Can someone explain the time complexity of these two functions thoroughly? Thanks!
For such a simple case, you could find the number of iterations of the innermost loop as a function of n exactly:
\sum_{i=0}^{n-1} \sum_{j=i+1}^{n-1} \sum_{k=i}^{j-1} 1 = \frac{1}{6} n (n^2 - 1)
i.e., Θ(n**3) time complexity (see Big Theta). This assumes that r += 1 is O(1) even when r has O(log n) digits (the model has words of log n bits).
The second loop is even simpler: i //= 2 is i >>= 1. n has Θ(log n) binary digits, and each iteration drops one of them (a right shift), so the whole loop has Θ(log n) time complexity, assuming the i >> 1 shift of log(n) digits is an O(1) operation (the same model as in the first example).
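If you want to sanity-check both results empirically, here is a minimal sketch (the counting helpers are mine, not part of the original code):

import math

def count_triple(n):
    # Count how many times the innermost statement executes.
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(i, j):
                count += 1
    return count

def count_halving(n):
    # Count iterations of the halving while-loop.
    count, i = 0, n
    while i > 0:
        i //= 2
        count += 1
    return count

for n in (4, 8, 16, 32):
    assert count_triple(n) == n * (n * n - 1) // 6   # matches the closed form
    print(n, count_triple(n), count_halving(n))      # halving count is floor(log2(n)) + 1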
Well, first of all, for the first function: the bounds of the inner loops do shrink on each iteration, but that only reduces the constant factor, not the order of growth; as the exact count n(n^2-1)/6 shows, the body still executes Θ(n^3) times.
Also, for the second function, the runtime is O(log2 N). Note that // is floor division: if i == n == 2, then after one run i is 1, and after another it is 0, at which point the loop stops; i never takes fractional values like 0.5.
For a rigorous mathematical approach to each function, you can go to https://www.coursera.org/course/algo. It's a great course for this sort of thing. I was sort of sloppy in my calculations.
I know there are many other questions out there asking for a general guide on how to calculate time complexity, such as this one.
From them I have learnt that when there is a loop, such as the (for... if...) in my Python programme, the time complexity is N * N, where N is the size of the input. (Please correct me if this is also wrong.) (Edited once after being corrected by an answer.)
# greatest common divisor of two integers
a, b = map(int, input().split())
list = []
for i in range(1, a+b+1):
    if a % i == 0 and b % i == 0:
        list.append(i)
n = len(list)
print(list[n-1])
However, do other parts of the code also contribute to the time complexity, making it more than a simple O(n) = N^2? For example, in the second loop, where I am finding the common divisors of both a and b (a % i == 0), is there a way to know how many machine instructions the computer will execute in finding all the divisors, and hence the time complexity of this specific loop?
I hope the question makes sense; apologies if it is not clear enough. Thanks for answering!
First, a few hints:
In your code there is no nested loop. The if-statement does not constitute a loop.
Not all nested loops have quadratic time complexity.
Writing O(n) = N*N doesn't make any sense: what is n and what is N? Why does n appear on the left but N is on the right? You should expect your time complexity function to be dependent on the input of your algorithm, so first define what the relevant inputs are and what names you give them.
Also, O(n) is a set of functions (namely those asymptotically bounded from above by the function f(n) = n), whereas f(N) = N*N is one function. By abuse of notation we conventionally write n*n = O(n) to mean n*n ∈ O(n) (which here is a mathematically false statement), but switching the sides (O(n) = n*n) is undefined. A mathematically correct statement would be n = O(n*n).
You can assume all (fixed bit-length) arithmetic operations to be O(1), since there is a constant upper bound to the number of processor instructions needed. The exact number of processor instructions is irrelevant for the analysis.
Let's look at the code in more detail and annotate it:
a, b = map(int, input().split())   # O(1)
list = []                          # O(1)
for i in range(1, a+b+1):          # O(a+b) multiplied by what's inside the loop
    if a % i == 0 and b % i == 0:  # O(1)
        list.append(i)             # O(1) (amortized)
n = len(list)                      # O(1)
print(list[n-1])                   # O(log(a+b))
So what's the overall complexity? The dominating part is indeed the loop (the stuff before and after is negligible, complexity-wise), so it's O(a+b), if you take a and b to be the input parameters. (If you instead wanted to take the length N of your input input() as the input parameter, it would be O(2^N), since a+b grows exponentially with respect to N.)
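If you want to see that linear dependence on a+b concretely, here is a small counting sketch (the steps counter and the function name are instrumentation I've added, not part of the original code):

def common_divisor_steps(a, b):
    # Same divisor search as above, but counting loop iterations:
    # the loop body always runs exactly a + b times.
    steps = 0
    divisors = []
    for i in range(1, a + b + 1):
        steps += 1
        if a % i == 0 and b % i == 0:
            divisors.append(i)
    return steps, divisors[-1]

for a, b in [(12, 18), (120, 180), (1200, 1800)]:
    steps, largest = common_divisor_steps(a, b)
    print(a, b, steps, largest)  # steps is exactly a + b; largest is the gcd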
One thing to keep in mind, and you have the right idea, is that higher-degree terms take precedence. So if you have a step that is constant time, O(1), but it happens n times, O(N), then the result is O(1) * O(N) = O(N).
Your program is O(N) because the only thing really affecting the time complexity is the loop, and as you know a simple loop like that is O(N) because it increases linearly as n increases.
Now if you had a nested loop that had both loops increasing as n increased, then it would be O(n^2).
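For instance, a minimal sketch of such a nested loop (the function name is just for illustration):

def pair_steps(n):
    # Both loops depend on n, so the body runs n * n times: O(n^2).
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

print(pair_steps(10))   # 100
print(pair_steps(100))  # 10000 -- input grew 10x, work grew 100x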
def hasPairWithSum(arr, target):
    for i in range(len(arr)):
        if (target - arr[i]) in arr[i+1:]:
            return True
    return False
In Python, is the time complexity of this function O(n) or O(n^2)? In other words, is 'if (target - arr[i]) in arr[i+1:]' effectively another for loop or not?
Also, what about the following function: is it also O(n^2), and if not, why?
def hasPairWithSum2(arr, target):
    seen = set()
    for num in arr:
        num2 = target - num
        if num2 in seen:
            return True
        seen.add(num)
    return False
Thanks!
The first version has an O(n²) time complexity:
Indeed, arr[i+1:] creates a new list with n-1-i elements, which is not a constant-time operation. The in operator then scans that new list, and in the worst case visits every value in it.
If we count the number of elements copied into a new list (by arr[i+1:]) and sum those counts over the iterations of the outer loop, we get:
(n-1) + (n-2) + (n-3) + ... + 1 + 0
This is a triangular number, and equals n(n-1)/2, which is O(n²).
Second version
The second version, using a set, runs in O(n) average time complexity.
There is no list slicing here, and the in operator on a set has -- contrary to a list argument -- an average constant time complexity. So now every action within the loop has an (average) constant time complexity, giving the algorithm an average time complexity of O(n).
According to the Python docs, the in operator for a set has a worst-case time complexity of O(n) (the average is O(1)). So you would still get a worst-case time complexity of O(n²) for your algorithm.
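If you want to feel the difference, here is a rough timing sketch (absolute numbers will vary by machine; the two functions are the ones from the question):

import timeit

def hasPairWithSum(arr, target):       # O(n^2) version
    for i in range(len(arr)):
        if (target - arr[i]) in arr[i+1:]:
            return True
    return False

def hasPairWithSum2(arr, target):      # O(n) average version
    seen = set()
    for num in arr:
        if target - num in seen:
            return True
        seen.add(num)
    return False

arr = list(range(2000))
target = -1  # no pair sums to -1, so both functions hit their worst case

print(timeit.timeit(lambda: hasPairWithSum(arr, target), number=3))
print(timeit.timeit(lambda: hasPairWithSum2(arr, target), number=3))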
It's O(n^2)
The outer loop runs n times, and on iteration i the membership test scans the remaining n - i elements. So the whole thing does about n + (n-1) + ... + 1 = n(n+1)/2 steps, which gives a complexity of O(n^2).
But if it's critical to something, you might consult someone who is better at this.
I'm having a really hard time understanding how to calculate worst-case run times, and run times in general. Since there is a while loop, would the run time have to be n+1, because the while loop must run one additional time to check whether the condition still holds? I've also been searching online for a good explanation of, and practice problems for, calculating these run times, but I can't seem to find anything good. A link to something like this would be very much appreciated.
def reverse1(lst):
    rev_lst = []
    i = 0
    while i < len(lst):
        rev_lst.insert(0, lst[i])
        i += 1
    return rev_lst

def reverse2(lst):
    rev_lst = []
    i = len(lst) - 1
    while i >= 0:
        rev_lst.append(lst[i])
        i -= 1
    return rev_lst
Constant factors or added values don't matter for big-O run times, so you're over-complicating this. The run time is O(n) (linear) for reverse2, and O(n**2) (quadratic) for reverse1 (because list.insert(0, x) is itself a O(n) operation, performed O(n) times).
Big-O runtime calculations are about how the algorithm behaves as the input size increases towards infinity, and the smaller factors don't matter here; O(n + 1) is the same as O(n) (as is O(5n) for that matter; as n increases, the constant multiplier of 5 is irrelevant to the change in runtime), O(n**2 + n) is still just O(n**2), etc.
Since the number of iterations is fixed for any given size of the input list for both functions, the "worst" time complexity would be the same as the "best" and the average here.
In reverse1, inserting an item into a list at index 0 costs O(n), because all the existing items have to be shifted one position over; coupled with the while loop that iterates once per element of the input list, the time complexity of reverse1 is O(n^2).
There's no such issue in reverse2, however, since the append method costs just O(1) per call, so its overall time complexity is O(n).
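As an aside, if you do want cheap front-insertion, collections.deque offers an O(1) appendleft. A minimal sketch of a third variant (reverse3 is my name, not from the question):

from collections import deque

def reverse3(lst):
    # deque.appendleft is O(1), unlike list.insert(0, x) which is O(n),
    # so this keeps reverse1's structure but runs in O(n) overall.
    rev = deque()
    for item in lst:
        rev.appendleft(item)
    return list(rev)

print(reverse3([1, 2, 3]))  # [3, 2, 1]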
I'm going to give you a mathematical explanation of why extra iterations and constant-time operations don't matter.
This is O(n), since the definition of Big-O is that f(n) ∈ O(g(n)) iff there exists some constant k such that f(n) < k·g(n) for all sufficiently large n.
Consider an algorithm with runtime represented as f(n) = 10000n + 15000000. You can simplify this by factoring out n: f(n) = n(10000 + 15000000/n). For the worst-case runtime, you only care about the algorithm's behaviour for very large values of n. In that form, as n gets really big, 15000000/n approaches 0, so the coefficient of n approaches 10000. Therefore, for all n beyond some threshold N, there exists a constant k such that f(n) < kn, for example k = 10001. Hence f(n) ∈ O(n): it has linear runtime efficiency.
With that being said, this means you don't need to worry about constant differences in your runtime, even if you loop n+1 times. The only part that matters (for polynomial time) is the highest degree of n in your code. Your loops run O(n) times, and even if you iterated n + 1000 times, that would still be O(n) iterations (reverse1 ends up O(n^2) overall only because each insert at index 0 itself costs O(n), as the other answers note).
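You can watch that lower-order term vanish numerically; here is a quick sketch using the f(n) = 10000n + 15000000 example from above:

def f(n):
    return 10000 * n + 15000000

for n in (10, 1000, 100000, 10000000):
    print(n, f(n) / n)  # the ratio approaches 10000 as n grows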
I am trying to understand Big-O notation, so I made my own example of O(n) using a while loop, since I find while loops a bit confusing in Big-O notation. I defined a function called linear_example that takes in a list; the example is in Python.
So my code is:
def linear_example(l):
    n = 10
    while n > 1:
        n -= 1
        for i in l:
            print(i)
My thought process is that the code in the for loop runs in constant time, O(1), and the code in the while loop runs in O(n) time. Therefore it would be O(1) + O(n), which evaluates to O(n).
Feedback?
Think of a simple for-loop:
for i in l:
print(i)
This will be O(n) since you’re iterating through the list for however many items exist in l. (Where n == len(l))
Now we add a while loop which does the same thing roughly ten times (nine, strictly, since n runs from 10 down to 2), so:
n + n + ... + n (x10)
And the complexity is O(10n).
Since this is still a polynomial with degree one, we can simplify this down to O(n), yes.
Not quite. First of all, n is not an input value here; it is fixed at 10, so O(n) is meaningless. Let's assume a given input value M for it, changing the first two lines:
def linear_example(l, M):
    n = M
The code in the for loop does run in O(1) time per element, provided that printing each element i of l takes bounded time. However, the loop iterates len(l) times, so its complexity is O(len(l)).
Now, that loop runs entirely through once per iteration of the while loop, which executes M - 1 times (n goes from M down to 2). Therefore, the complexity is the product of the loop complexities: O(M * len(l)).
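A quick counting sketch of that parameterized version (the counter and the function name are instrumentation I've added to verify the product):

def linear_example_steps(l, M):
    # Count body executions instead of printing them.
    count = 0
    n = M
    while n > 1:
        n -= 1
        for i in l:
            count += 1
    return count

print(linear_example_steps([1, 2, 3], 10))   # 27  = (10 - 1) * 3
print(linear_example_steps([1, 2, 3], 100))  # 297 = (100 - 1) * 3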
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
Let n be the size of the list L passed to this function. Which of the following most accurately describes how the runtime of this function grows as n grows?
(a) It grows linearly, like n does.
(b) It grows quadratically, like n^2 does.
(c) It grows less than linearly.
(d) It grows more than quadratically.
I don't understand how you figure out the relationship between the runtime of the function and the growth of n. Can someone please explain this to me?
OK, since this is homework: this is the code:
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
It is obviously dependent on len(L).
So let's see, for each line, what it costs:
sum = 0
i = 1
# [...]
return sum
Those are obviously constant time, independent of L.
In the loop we have:
sum = sum + L[i]  # time to look up L[i] (`timelookup(L)`) plus time to add to the sum (obviously constant time)
i = i * 2         # obviously constant time
And how many times is the loop executed?
That is obviously dependent on the size of L.
Let's call it loops(L).
So we get an overall complexity of
loops(L) * (timelookup(L) + const)
Being the nice guy I am, I'll tell you that list lookup is constant time in Python, so this boils down to
O(loops(L)) (constant factors ignored, as the big-O convention implies)
And how often do you loop, based on the len() of L?
(a) as often as there are items in the list? (b) quadratically as often as there are items in the list?
(c) less often than there are items in the list? (d) more often than (b)?
I am not a computer science major and I don't claim to have a strong grasp of this kind of theory, but I thought it might be relevant for someone with my perspective to try to contribute an answer.
Your function will always take time to execute, and if it is operating on a list argument of varying length, then the time it takes to run that function will be relative to how many elements are in that list.
Let's assume it takes 1 unit of time to process a list of length == 1. What the question is asking is the relationship between the size of the list growing and the increase in time for this function to execute.
This link breaks down some basics of Big O notation: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
If it were O(1) complexity (which is not actually one of your A-D options), that would mean the runtime never grows regardless of the size of L. In your example, the while loop depends on the counter i growing in relation to the length of L. Focus on the fact that i is being multiplied: that is what governs how many passes the while loop needs relative to len(L). Try comparing how many iterations the while loop performs at various values of len(L), and that will determine your complexity; one unit of time can be one iteration through the while loop.
Hopefully I have made some form of contribution here, despite my lack of expertise on the subject.
Update
To clarify, based on the comment from ch3ka: if you were doing more than what you currently have inside your while loop, then you would also have to consider the added complexity of each iteration. But because your list lookup L[i] is constant complexity, as is the arithmetic that follows it, we can ignore those in terms of the complexity.
Here's a quick-and-dirty way to find out:
import matplotlib.pyplot as plt

def f2(L):
    sum = 0
    i = 1
    times = 0
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
        times += 1  # track how many times the loop gets called
    return times

def main():
    sizes = range(1200)
    f_i = [f2([1] * n) for n in sizes]
    plt.plot(sizes, f_i)
    plt.show()  # needed for the plot to actually appear when run as a script

if __name__ == "__main__":
    main()
... which results in a plot where the horizontal axis is the size of L and the vertical axis is how many times the function loops; the big-O behaviour should be pretty obvious from this.
Consider what happens with an input of length n=10. Now consider what happens if the input size is doubled to 20. Will the runtime double as well? Then it's linear. If the runtime grows by factor 4, then it's quadratic. Etc.
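Here is a quick sketch applying that doubling test to f2's loop (the counter function is mine, not from the question):

def f2_steps(n):
    # How many times f2's while loop runs for a list of length n.
    steps, i = 0, 1
    while i < n:
        i *= 2
        steps += 1
    return steps

for n in (10, 20, 40, 80):
    print(n, f2_steps(n))  # doubling n adds one step rather than doubling them: logarithmic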
When you look at the function, you have to determine how the size of the list will affect the number of loops that will occur.
In your specific situation, let's increment n and see how many times the while loop will run.
n = 0, loop = 0 times
n = 1, loop = 0 times
n = 2, loop = 1 time
n = 3, loop = 2 times
n = 4, loop = 2 times
n = 5, loop = 3 times
See the pattern? Now answer your own question. Does it:
(a) It grows linearly, like n does. (b) It grows quadratically, like n^2 does.
(c) It grows less than linearly. (d) It grows more than quadratically.
Check out Hugh's answer for an empirical result :)
It's O(log(len(L))), as list lookup is a constant-time operation, independent of the size of the list.