A variation of the DNF is as follows:
def dutch_flag_partition(pivot_index , A):
pivot = A[pivot_index]
# First pass: group elements smaller than pivot.
for i in range(len(A)):
# Look for a smaller element.
for j in range(i + 1, len(A)):
if A[j] < pivot:
A[i], A[j] = A[j], A[i]
break
# Second pass: group elements larger than pivot.
for i in reversed(range(len(A))):
if A[i] < pivot:
break
# Look for a larger element. Stop when we reach an element less than
# pivot , since first pass has moved them to the start of A.
for j in reversed(range(i)):
if A[j] > pivot:
A[i], A[j] = A[j], A[i]
break
The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length? And time complexity, given as O(N^2), is it so due to the nested loops? Thanks
The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length?
No. Swapping, in fact, takes no extra space at all.
More importantly, you can't just look for one thing and say however much that thing takes, that's the complexity. You have to look over all the things, and the largest one determines the complexity. So, look over all the things you're creating:
pivot is just a reference to one of the list members, which is constant size.
a range is constant size.
an iterator over a range is constant-size.
the i and j integer values returned by the range iterator are constant size.1
…
Since nothing is larger than constant size, the total size is constant.
And time complexity, given as O(N^2), is it so due to the nested loops?
Well, yes, but you have to get a bit more detailed than that. Two nested loops don't necessarily mean quadratic. Two nested loops that do linear work inside the nested loop would be cubic. Two nested loops that combine so that the size of the inner loop depends inversely on the outer loop are linear. And so on.
And again, you have to add up everything, not just pick one thing and guess.
So, the first pass does:
A plain list indexing and assignment, constant.
A loop over the input length.
… with a loop over the input length
… with some list indexing, comparisons, and assignments, all constant
… which also breaks early in some cases… which we can come back to.
So, if the break doesn't help at all, that's O(1 + N * N * 1), which is O(N * N).
And the second pass is similarly O(N * (1 + N * 1)), which is again O(N * N).
And if you add O(N * N + N * N), you get O(N * N).
Also, even if the break made the first pass log-linear or something, O(N * log N + N * N) is still O(N * N), so it wouldn't matter.
So the time is quadratic.
1. Technically, this isn't quite true. Integers are variable-sized, and the memory they take is the log of their magnitude. So, i and j, and the stop attributes of the range objects, and probably some other stuff are all log N. But, unless you're dealing with huge-int arithmetic, like in crypto algorithms that multiply huge prime factors, people usually ignore this, and get away with it.
The additional space complexity is given as O(1). Is that because the swapping doesn't depend on the input length?
As you are "just" swapping there is no new data being created or generated, you are just reassigning values you already have, thus why the space complexity is constant.
And time complexity, given as O(N^2), is it so due to the nested loops?
True. It's a second order polynomial time complexity because you have two for loops nested.
You have a break in them, so in more favorable cases your time complexity will be below N^2. However, as big-O is worst case then it's ok to say it's of degree 2.
Related
I know there are many other questions out there asking for the general guide of how to calculate the time complexity, such as this one.
From them I have learnt that when there is a loop, such as the (for... if...) in my Python programme, the Time complexity is N * N where N is the size of input. (please correct me if this is also wrong) (Edited once after being corrected by an answer)
# greatest common divisor of two integers
a, b = map(int, input().split())
list = []
for i in range(1, a+b+1):
if a % i == 0 and b % i == 0:
list.append(i)
n = len(list)
print(list[n-1])
However, do other parts of the code also contribute to the time complexity, that will make it more than a simple O(n) = N^2 ? For example, in the second loop where I was finding the common divisors of both a and b (a%i = 0), is there a way to know how many machine instructions the computer will execute in finding all the divisors, and the consequent time complexity, in this specific loop?
I wish the question is making sense, apologise if it is not clear enough.
Thanks for answering
First, a few hints:
In your code there is no nested loop. The if-statement does not constitute a loop.
Not all nested loops have quadratic time complexity.
Writing O(n) = N*N doesn't make any sense: what is n and what is N? Why does n appear on the left but N is on the right? You should expect your time complexity function to be dependent on the input of your algorithm, so first define what the relevant inputs are and what names you give them.
Also, O(n) is a set of functions (namely those asymptotically bounded from above by the function f(n) = n, whereas f(N) = N*N is one function. By abuse of notation, we conventionally write n*n = O(n) to mean n*n ∈ O(n) (which is a mathematically false statement), but switching the sides (O(n) = n*n) is undefined. A mathematically correct statement would be n = O(n*n).
You can assume all (fixed bit-length) arithmetic operations to be O(1), since there is a constant upper bound to the number of processor instructions needed. The exact number of processor instructions is irrelevant for the analysis.
Let's look at the code in more detail and annotate it:
a, b = map(int, input().split()) # O(1)
list = [] # O(1)
for i in range(1, a+b+1): # O(a+b) multiplied by what's inside the loop
if a % i == 0 and b % i == 0: # O(1)
list.append(i) # O(1) (amortized)
n = len(list) # O(1)
print(list[n-1]) # O(log(a+b))
So what's the overall complexity? The dominating part is indeed the loop (the stuff before and after is negligible, complexity-wise), so it's O(a+b), if you take a and b to be the input parameters. (If you instead wanted to take the length N of your input input() as the input parameter, it would be O(2^N), since a+b grows exponentially with respect to N.)
One thing to keep in mind, and you have the right idea, is that higher degree take precedence. So you can have a step that’s constant O(1) but happens n times O(N) then it will be O(1) * O(N) = O(N).
Your program is O(N) because the only thing really affecting the time complexity is the loop, and as you know a simple loop like that is O(N) because it increases linearly as n increases.
Now if you had a nested loop that had both loops increasing as n increased, then it would be O(n^2).
I'm having a really hard time understanding how to calculate worst case run times and run times in general. Since there is a while loop, would the run time have to be n+1 because the while loop must run 1 additional time to check is the case is still valid?I've also been searching online for a good explanation/ practice on how to calculate these run times but I can't seem to find anything good. A link to something like this would be very much appreciated.
def reverse1(lst):
rev_lst = []
i = 0
while(i < len(lst)):
rev_lst.insert(0, lst[i])
i += 1
return rev_lst
def reverse2(lst):
rev_lst = []
i = len(lst) - 1
while (i >= 0):
rev_lst.append(lst[i])
i -= 1
return rev_lst
Constant factors or added values don't matter for big-O run times, so you're over-complicating this. The run time is O(n) (linear) for reverse2, and O(n**2) (quadratic) for reverse1 (because list.insert(0, x) is itself a O(n) operation, performed O(n) times).
Big-O runtime calculations are about how the algorithm behaves as the input size increases towards infinity, and the smaller factors don't matter here; O(n + 1) is the same as O(n) (as is O(5n) for that matter; as n increases, the constant multiplier of 5 is irrelevant to the change in runtime), O(n**2 + n) is still just O(n**2), etc.
Since the number of iterations is fixed for any given size of the input list for both functions, the "worst" time complexity would be the same as the "best" and the average here.
In reverse1, the operation of inserting an item into a list at index 0 costs O(n) because it has to copy all the items to their following positions, and coupled with the while loop that iterates for the number of times of the size of the input list, the time complexity of reverse1 would be O(n^2).
There's no such an issue in reverse2, however, since the append method costs just O(1) to execute, so its overall time complexity is O(n).
I'm going to give you a mathematical explanation of why extra iterations and operations with constant time doesn't matter.
This is O(n) since the definition of Big-Oh is that for f(n) ∈ O(g(n)) there exists some constant k such that f(n) < kg(n).
Consider an algorithm with runtime represented as f(n) = 10000n + 15000000. A way you could simplify this is by factoring out the n: f(n) = n(10000 + 15000000/n). For the worst case runtime, you only care about the performance of the algorithm for super large values of n. Because in this simplification you're dividing by n, in the second part, as n gets really big, the coefficient of n will approach 10000, since 15000000/n approaches 0 if n is enormous. Therefore, for n > N (this means for a large enough value of n) there must exist a constant k such that f(n) < kn, for example k = 10001. Therefore, f(n) ∈ O(n), it has linear runtime efficiency.
With that being said, this means you don't need to worry about constant differences in your runtime, even if you loop n+1 times. The only part that matter (for polynomial time) is the highest degree of n in your code. Your reverse algorithms are O(n) runtime, and even if you iterated n + 1000 times, it would still be O(n) runtime.
I have a question about the runtime complexity of the standard permutation finding algorithm. Consider a list A, find (and print) all permutations of its elements.
Here's my recursive implementation, where printperm() prints every permutation:
def printperm(A, p):
if len(A) == len(p):
print("".join(p))
return
for i in range(0, len(A)):
if A[i] != 0:
tmp = A[i] # remember ith position
A[i] = 0 # mark character i as used
p.append(tmp) # select character i for this permutation
printperm(A, p) # Solve subproblem, which is smaller because we marked a character in this subproblem as smaller
p.pop() # done with selecting character i for this permutation
A[i] = tmp # restore character i in preparation for selecting the next available character
printperm(['a', 'b', 'c', 'd'], [])
The runtime complexity appears to be O(n!) where n is the size of A. This is because at each recursion level, the amount of work decreases by 1. So, the top recursion level is n amount of work, the next level is n-1, and the next level is n-2, and so on. So the total complexity is n*(n-1)*(n-2)...=n!
Now the problem is the print("".join(p)) statement. Every time this line runs, it iterates through the list, which iterates through the entire list, which is complexity n. There are n! number of permutations of a list of size n. So that means the amount of work done by the print("".join(p)) statement is n!*n.
Does the presence of the print("".join(p)) statement then increases the runtime complexity to O(n * n!)?? But this doesn't seem right, because I'm not running the print statement on every recursion call. Where does my logic for getting O(n * n!) break down?
You're basically right! The possible confusion comes in your "... and the next level is n-2, and so on". "And so on" is glossing over that at the very bottom level of the recursion, you're not doing O(1) work, but rather O(n) work to do the print. So the total complexity is proportional to
n * (n-1) * (n-2) ... * 2 * n
which equals n! * n. Note that the .join() doesn't really matter to this. It would also take O(n) work to simply print(p).
EDIT: But that's not really right, for a different reason. At all levels above the print level, you're doing
for i in range(0, len(A)):
and len(A) doesn't change. So every level is doing O(n) work. To be sure, the deeper the level the more zeroes there are in A, and so the less work the loop does, but it's nevertheless still O(n) merely to iterate over range(n) at all.
Permutations can be generated from combinations using divide and conquer technique to achieve better time complexity. However, generated output is not in order.
Suppose we want to generate permutations for n=8 {0,1,2,3,4,5,6,7}.
Below are instances of permutations
01234567
01234576
03215654
30126754
Note first 4 items in each arrangement are of same set {0,1,2,3} and last 4 items are of same set {4,5,6,7}
This means given set A={0,1,2,3} B={4,5,6,7} permutations from A are 4! and from B are 4!.
It follows that taking one instance of permutations from A and one instance of permutations from B we get a correct permutation instance from universal set {0,1,2,3,4,5,6,7}.
Total permutations given sets A and B such that A union B and A equals universal set and B intersection is null set gives universal set are 2*A!*B! (By swapping A and B order
In this case we get 2*4!*4! = 1156.
So to generate all permutations when n=8, we simply need to generate possible combinations of sets A and B that satisfy conditions explained earlier.
Generating combinations of 8 elements taking 4 at time in lexicographical order is pretty fast. We need to generate the first half combinations each assign to respective set A and set B is found using set operation.
For each pair A and B we apply divide and conquer algorithm.
A and B doesn't need to be of equal sizes and the trick can be applied recursive for higher values of n.
In stead of using 8! = 40320 steps
We can use 8!/(2*4!4!)(4!+4!+1) = 1750 steps with this method. I ignore the cross product operation when joining set A and set B permutations because its straight forward operation.
In real implementation this method runs approximately 20 times faster than some naive algorithm and can take advantage of parallelism. For example different pairs of set A and B can be grouped and process on different threads.
I have two functions, both of which flatten an arbitrarily nested list of lists in Python.
I am trying to figure out the time complexity of both, to see which is more efficient, but I haven't found anything definitive on SO anything so far. There are lots of questions about lists of lists, but not to the nth degree of nesting.
function 1 (iterative)
def flattenIterative(arr):
i = 0
while i < len(arr):
while isinstance(arr[i], list):
if not arr[i]:
arr.pop(i)
i -= 1
break
else:
arr[i: i + 1] = arr[i]
i += 1
return arr
function 2 (recursive)
def flattenRecursive(arr):
if not arr:
return arr
if isinstance(arr[0], list):
return flattenRecursive(arr[0]) + flattenRecursive(arr[1:])
return arr[:1] + flattenRecursiveweb(arr[1:])
My thoughts are below:
function 1 complexity
I think that the time complexity for the iterative version is O(n * m), where n is the length of the initial array, and m is the amount of nesting. I think space complexity of O(n) where n is the length of the initial array.
function 2 complexity
I think that the time complexity for the recursive version will be O(n) where n is the length of the input array. I think space complexity of O(n * m) where n is the length of the initial array, and m is the depth of nesting.
summary
So, to me it seems that the iterative function is slower, but more efficient with space. Conversely, the recursive function is faster, but less efficient with space. Is this correct?
I don't think so. There are N elements, so you will need to visit each element at least once. Overall, your algorithm will run for O(N) iterations. The deciding factor is what happens per iteration.
Your first algorithm has 2 loops, but if you observe carefully, it is still iterating over each element O(1) times per iteration. However, as #abarnert pointed out, the arr[i: i + 1] = arr[i] moves every element from arr[i+1:] up, which is O(N) again.
Your second algorithm is similar, but you are adding lists in this case (in the previous case, it was a simple slice assignment), and unfortunately, list addition is linear in complexity.
In summary, both your algorithms are quadratic.
This is the function:
c = []
def badsort(l):
v = 0
m = len(l)
while v<m:
c.append(min(l))
l.remove(min(l))
v+=1
return c
Although I realize that this is a very inefficient way to sort, I was wondering what the time complexity of such a function would be, as although it does not have nested loops, it repeats the loop multiple times.
Terminology
Assume that n = len(l).
Iteration Count
The outer loops runs n times. The min() in the inner loop runs twice over l (room for optimisation here) but for incrementally decreasing numbers (for each iteration of the loop, the length of l decrements because you remove an item from the list every time).
That way the complexity is 2 * (n + (n-1) + (n-2) + ... + (n-n)).
This equals 2 * (n^2 - (1 + 2 + 3 + ... + n)).
The second term in the parenthesis is a triangular number and diverges to n*(n+1)/2.
Therefore your complexity equals 2*(n^2 - n*(n+1)/2)).
This can be expanded to 2*(n^2 - n^2/2 - n/2),
and simplified to n^2 - n.
BigO Notation
BigO notation is interested in the overall growth trend, rather than the precise growth rate of the function.
Drop Constants
In BigO notation, the constants are dropped. This leaves us still with n^2 - n since there are no constants.
Retain Only Dominant Terms
Also, in BigO notation only the dominant terms are considered. n^2 is dominant over n, so n is dropped.
Result
That means the answer in BigO is O(n) = n^2, i.e. quadratic complexity.
Here are a couple of useful points to help you understand how to find the complexity of a function.
Measure the number of iterations
Measure the complexity of each operation at each iteration
For the first point, you see the terminating condition is v < m, where v is 0 initially, and m is the size of the list. Since v increments by one at each iteration, the loop runs at most (at least) N times, where N is the size of the list.
Now, for the second point. Per iteration we have -
c.append(min(l))
Where min is a linear operation, taking O(N) time. append is a constant operation.
Next,
l.remove(min(l))
Again, min is linear, and so is remove. So, you have O(N) + O(N) which is O(N).
In summary, you have O(N) iterations, and O(N) per iteration, making it O(N ** 2), or quadratic.
The time compexity for this problem is O(n^2). While the code itself has only one obvious loop, the while loop, the min and max functions are both O(n) by implementation, because at worst case, it would have to scan the entire list to find the corresponding minimum or maximum value. list.remove is O(n) because it too has to traverse the list until it finds the first target value, which at worst case, could be at the end. list.append is amortized O(1), due to a clever implementation of the method, because list.append is technically O(n)/n = O(1) for n objects pushed:
def badsort(l):
v = 0
m = len(l)
while v<m: #O(n)
c.append(min(l)) #O(n) + O(1)
l.remove(min(l)) #O(n) + O(n)
v+=1
return c
Thus, there is:
Outer(O(n)) * Inner(O(n)+O(n)+O(n)) = Outer(O(n)) * Inner(O(n))
O(n)+O(n)+O(n) can be combined to simply O(n) because big o measures worst case. Thus, by combining the outer and inner compexities, the final complexity is O(n^2).