Understanding how to measure the time complexity of a function

Understanding how to measure the time complexity of a function - python

This is the function:
c = []
def badsort(l):
v = 0
m = len(l)
while v<m:
c.append(min(l))
l.remove(min(l))
v+=1
return c
Although I realize that this is a very inefficient way to sort, I was wondering what the time complexity of such a function would be, as although it does not have nested loops, it repeats the loop multiple times.

Terminology
Assume that n = len(l).
Iteration Count
The outer loops runs n times. The min() in the inner loop runs twice over l (room for optimisation here) but for incrementally decreasing numbers (for each iteration of the loop, the length of l decrements because you remove an item from the list every time).
That way the complexity is 2 * (n + (n-1) + (n-2) + ... + (n-n)).
This equals 2 * (n^2 - (1 + 2 + 3 + ... + n)).
The second term in the parenthesis is a triangular number and diverges to n*(n+1)/2.
Therefore your complexity equals 2*(n^2 - n*(n+1)/2)).
This can be expanded to 2*(n^2 - n^2/2 - n/2),
and simplified to n^2 - n.
BigO Notation
BigO notation is interested in the overall growth trend, rather than the precise growth rate of the function.
Drop Constants
In BigO notation, the constants are dropped. This leaves us still with n^2 - n since there are no constants.
Retain Only Dominant Terms
Also, in BigO notation only the dominant terms are considered. n^2 is dominant over n, so n is dropped.
Result
That means the answer in BigO is O(n) = n^2, i.e. quadratic complexity.

Here are a couple of useful points to help you understand how to find the complexity of a function.
Measure the number of iterations
Measure the complexity of each operation at each iteration
For the first point, you see the terminating condition is v < m, where v is 0 initially, and m is the size of the list. Since v increments by one at each iteration, the loop runs at most (at least) N times, where N is the size of the list.
Now, for the second point. Per iteration we have -
c.append(min(l))
Where min is a linear operation, taking O(N) time. append is a constant operation.
Next,
l.remove(min(l))
Again, min is linear, and so is remove. So, you have O(N) + O(N) which is O(N).
In summary, you have O(N) iterations, and O(N) per iteration, making it O(N ** 2), or quadratic.

The time compexity for this problem is O(n^2). While the code itself has only one obvious loop, the while loop, the min and max functions are both O(n) by implementation, because at worst case, it would have to scan the entire list to find the corresponding minimum or maximum value. list.remove is O(n) because it too has to traverse the list until it finds the first target value, which at worst case, could be at the end. list.append is amortized O(1), due to a clever implementation of the method, because list.append is technically O(n)/n = O(1) for n objects pushed:
def badsort(l):
v = 0
m = len(l)
while v<m: #O(n)
c.append(min(l)) #O(n) + O(1)
l.remove(min(l)) #O(n) + O(n)
v+=1
return c
Thus, there is:
Outer(O(n)) * Inner(O(n)+O(n)+O(n)) = Outer(O(n)) * Inner(O(n))
O(n)+O(n)+O(n) can be combined to simply O(n) because big o measures worst case. Thus, by combining the outer and inner compexities, the final complexity is O(n^2).

Related

What is the time complexity of sum in loop?

here is my code:
my_list.sort()
runningSum=0
for idx,query in enumerate(my_list):
runningSum+=sum(my_list[:idx])
as i know :
sort is O(nlogn)
for is O(n)
sum is O(n)
but total is how can I calculate the time complexity?
thanks

As you said sorting is O(n log n) (given an appropriate algorithm is used).
Then the for loop runs n times. In the first iteration there is only one operation, in the second loop there are two, in the third three and so on.
Therefore the computational effort can be described as the sum of the first n natural numbers:
1 + 2 + 3 + ... + n = (n* (n - 1)) / 2 = 0.5 * (n² - n) = O(n²)
The sum of the first n natural number can be calculated using the Gaussian summation formula.
The rules of big O notation dictate:
See Wikipedia for big-O notation
Therefore the overall runtime will be:
O(n log n) + O(n²) = O(n²)
Note: The algorithm you created is not the most efficient as you compute the same sum over and over again.

How do I calculate Time Complexity for this particular algorithm?

I know there are many other questions out there asking for the general guide of how to calculate the time complexity, such as this one.
From them I have learnt that when there is a loop, such as the (for... if...) in my Python programme, the Time complexity is N * N where N is the size of input. (please correct me if this is also wrong) (Edited once after being corrected by an answer)
# greatest common divisor of two integers
a, b = map(int, input().split())
list = []
for i in range(1, a+b+1):
if a % i == 0 and b % i == 0:
list.append(i)
n = len(list)
print(list[n-1])
However, do other parts of the code also contribute to the time complexity, that will make it more than a simple O(n) = N^2 ? For example, in the second loop where I was finding the common divisors of both a and b (a%i = 0), is there a way to know how many machine instructions the computer will execute in finding all the divisors, and the consequent time complexity, in this specific loop?
I wish the question is making sense, apologise if it is not clear enough.
Thanks for answering

First, a few hints:
In your code there is no nested loop. The if-statement does not constitute a loop.
Not all nested loops have quadratic time complexity.
Writing O(n) = N*N doesn't make any sense: what is n and what is N? Why does n appear on the left but N is on the right? You should expect your time complexity function to be dependent on the input of your algorithm, so first define what the relevant inputs are and what names you give them.
Also, O(n) is a set of functions (namely those asymptotically bounded from above by the function f(n) = n, whereas f(N) = N*N is one function. By abuse of notation, we conventionally write n*n = O(n) to mean n*n ∈ O(n) (which is a mathematically false statement), but switching the sides (O(n) = n*n) is undefined. A mathematically correct statement would be n = O(n*n).
You can assume all (fixed bit-length) arithmetic operations to be O(1), since there is a constant upper bound to the number of processor instructions needed. The exact number of processor instructions is irrelevant for the analysis.
Let's look at the code in more detail and annotate it:
a, b = map(int, input().split()) # O(1)
list = [] # O(1)
for i in range(1, a+b+1): # O(a+b) multiplied by what's inside the loop
if a % i == 0 and b % i == 0: # O(1)
list.append(i) # O(1) (amortized)
n = len(list) # O(1)
print(list[n-1]) # O(log(a+b))
So what's the overall complexity? The dominating part is indeed the loop (the stuff before and after is negligible, complexity-wise), so it's O(a+b), if you take a and b to be the input parameters. (If you instead wanted to take the length N of your input input() as the input parameter, it would be O(2^N), since a+b grows exponentially with respect to N.)

One thing to keep in mind, and you have the right idea, is that higher degree take precedence. So you can have a step that’s constant O(1) but happens n times O(N) then it will be O(1) * O(N) = O(N).
Your program is O(N) because the only thing really affecting the time complexity is the loop, and as you know a simple loop like that is O(N) because it increases linearly as n increases.
Now if you had a nested loop that had both loops increasing as n increased, then it would be O(n^2).

What would be the worst case run time of my two algorithms

I'm having a really hard time understanding how to calculate worst case run times and run times in general. Since there is a while loop, would the run time have to be n+1 because the while loop must run 1 additional time to check is the case is still valid?I've also been searching online for a good explanation/ practice on how to calculate these run times but I can't seem to find anything good. A link to something like this would be very much appreciated.
def reverse1(lst):
rev_lst = []
i = 0
while(i < len(lst)):
rev_lst.insert(0, lst[i])
i += 1
return rev_lst
def reverse2(lst):
rev_lst = []
i = len(lst) - 1
while (i >= 0):
rev_lst.append(lst[i])
i -= 1
return rev_lst

Constant factors or added values don't matter for big-O run times, so you're over-complicating this. The run time is O(n) (linear) for reverse2, and O(n**2) (quadratic) for reverse1 (because list.insert(0, x) is itself a O(n) operation, performed O(n) times).
Big-O runtime calculations are about how the algorithm behaves as the input size increases towards infinity, and the smaller factors don't matter here; O(n + 1) is the same as O(n) (as is O(5n) for that matter; as n increases, the constant multiplier of 5 is irrelevant to the change in runtime), O(n**2 + n) is still just O(n**2), etc.

Since the number of iterations is fixed for any given size of the input list for both functions, the "worst" time complexity would be the same as the "best" and the average here.
In reverse1, the operation of inserting an item into a list at index 0 costs O(n) because it has to copy all the items to their following positions, and coupled with the while loop that iterates for the number of times of the size of the input list, the time complexity of reverse1 would be O(n^2).
There's no such an issue in reverse2, however, since the append method costs just O(1) to execute, so its overall time complexity is O(n).

I'm going to give you a mathematical explanation of why extra iterations and operations with constant time doesn't matter.
This is O(n) since the definition of Big-Oh is that for f(n) ∈ O(g(n)) there exists some constant k such that f(n) < kg(n).
Consider an algorithm with runtime represented as f(n) = 10000n + 15000000. A way you could simplify this is by factoring out the n: f(n) = n(10000 + 15000000/n). For the worst case runtime, you only care about the performance of the algorithm for super large values of n. Because in this simplification you're dividing by n, in the second part, as n gets really big, the coefficient of n will approach 10000, since 15000000/n approaches 0 if n is enormous. Therefore, for n > N (this means for a large enough value of n) there must exist a constant k such that f(n) < kn, for example k = 10001. Therefore, f(n) ∈ O(n), it has linear runtime efficiency.
With that being said, this means you don't need to worry about constant differences in your runtime, even if you loop n+1 times. The only part that matter (for polynomial time) is the highest degree of n in your code. Your reverse algorithms are O(n) runtime, and even if you iterated n + 1000 times, it would still be O(n) runtime.

Space-Time complexity in variation of Dutch National Flag

A variation of the DNF is as follows:
def dutch_flag_partition(pivot_index , A):
pivot = A[pivot_index]
# First pass: group elements smaller than pivot.
for i in range(len(A)):
# Look for a smaller element.
for j in range(i + 1, len(A)):
if A[j] < pivot:
A[i], A[j] = A[j], A[i]
break
# Second pass: group elements larger than pivot.
for i in reversed(range(len(A))):
if A[i] < pivot:
break
# Look for a larger element. Stop when we reach an element less than
# pivot , since first pass has moved them to the start of A.
for j in reversed(range(i)):
if A[j] > pivot:
A[i], A[j] = A[j], A[i]
break
The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length? And time complexity, given as O(N^2), is it so due to the nested loops? Thanks

The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length?
No. Swapping, in fact, takes no extra space at all.
More importantly, you can't just look for one thing and say however much that thing takes, that's the complexity. You have to look over all the things, and the largest one determines the complexity. So, look over all the things you're creating:
pivot is just a reference to one of the list members, which is constant size.
a range is constant size.
an iterator over a range is constant-size.
the i and j integer values returned by the range iterator are constant size.1
…
Since nothing is larger than constant size, the total size is constant.
And time complexity, given as O(N^2), is it so due to the nested loops?
Well, yes, but you have to get a bit more detailed than that. Two nested loops don't necessarily mean quadratic. Two nested loops that do linear work inside the nested loop would be cubic. Two nested loops that combine so that the size of the inner loop depends inversely on the outer loop are linear. And so on.
And again, you have to add up everything, not just pick one thing and guess.
So, the first pass does:
A plain list indexing and assignment, constant.
A loop over the input length.
… with a loop over the input length
… with some list indexing, comparisons, and assignments, all constant
… which also breaks early in some cases… which we can come back to.
So, if the break doesn't help at all, that's O(1 + N * N * 1), which is O(N * N).
And the second pass is similarly O(N * (1 + N * 1)), which is again O(N * N).
And if you add O(N * N + N * N), you get O(N * N).
Also, even if the break made the first pass log-linear or something, O(N * log N + N * N) is still O(N * N), so it wouldn't matter.
So the time is quadratic.
1. Technically, this isn't quite true. Integers are variable-sized, and the memory they take is the log of their magnitude. So, i and j, and the stop attributes of the range objects, and probably some other stuff are all log N. But, unless you're dealing with huge-int arithmetic, like in crypto algorithms that multiply huge prime factors, people usually ignore this, and get away with it.

The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length?
As you are "just" swapping there is no new data being created or generated, you are just reassigning values you already have, thus why the space complexity is constant.
And time complexity, given as O(N^2), is it so due to the nested loops?
True. It's a second order polynomial time complexity because you have two for loops nested.
You have a break in them, so in more favorable cases your time complexity will be below N^2. However, as big-O is worst case then it's ok to say it's of degree 2.

Time Complexity of list flattening

I have two functions, both of which flatten an arbitrarily nested list of lists in Python.
I am trying to figure out the time complexity of both, to see which is more efficient, but I haven't found anything definitive on SO anything so far. There are lots of questions about lists of lists, but not to the nth degree of nesting.
function 1 (iterative)
def flattenIterative(arr):
i = 0
while i < len(arr):
while isinstance(arr[i], list):
if not arr[i]:
arr.pop(i)
i -= 1
break
else:
arr[i: i + 1] = arr[i]
i += 1
return arr
function 2 (recursive)
def flattenRecursive(arr):
if not arr:
return arr
if isinstance(arr[0], list):
return flattenRecursive(arr[0]) + flattenRecursive(arr[1:])
return arr[:1] + flattenRecursiveweb(arr[1:])
My thoughts are below:
function 1 complexity
I think that the time complexity for the iterative version is O(n * m), where n is the length of the initial array, and m is the amount of nesting. I think space complexity of O(n) where n is the length of the initial array.
function 2 complexity
I think that the time complexity for the recursive version will be O(n) where n is the length of the input array. I think space complexity of O(n * m) where n is the length of the initial array, and m is the depth of nesting.
summary
So, to me it seems that the iterative function is slower, but more efficient with space. Conversely, the recursive function is faster, but less efficient with space. Is this correct?

I don't think so. There are N elements, so you will need to visit each element at least once. Overall, your algorithm will run for O(N) iterations. The deciding factor is what happens per iteration.
Your first algorithm has 2 loops, but if you observe carefully, it is still iterating over each element O(1) times per iteration. However, as #abarnert pointed out, the arr[i: i + 1] = arr[i] moves every element from arr[i+1:] up, which is O(N) again.
Your second algorithm is similar, but you are adding lists in this case (in the previous case, it was a simple slice assignment), and unfortunately, list addition is linear in complexity.
In summary, both your algorithms are quadratic.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.