Checking the time complexity of code - Python

def sort_list(lst):
    result = []
    result.append(lst[0])
    for i in range(1, len(lst)):
        insert_list(lst[i], result)
    return result

def insert_list(x, lst):
    a = search(x, lst)
    lst.insert(a, x)
    return lst

def search(x, seq):
    for i in seq:
        if x < i:
            return seq.index(i)
        elif x == i:
            return seq.index(i)
        elif x > seq[-1]:
            return (seq.index(seq[-1])) + 1
Is the time complexity of this code O(n)?

It depends on the complexity of insert_list. Since insert_list is called inside a for loop over the list, the total complexity is n * (complexity of insert_list).

If you charge n units of time for the loop (there are n elements in the list) and one unit of time for each of the remaining steps, the total time taken is 1 + 1 + n*(time for insert_list) + 1 = n*(time for insert_list) + 3.
So the overall complexity still depends on the insert_list step.
Consider this step:
a = search(x, lst)
It has two parts: the search and the assignment.
search here is a linear search, so its time complexity is O(n).
So the time taken by insert_list is (search) + (assign) + (insert) = n + 1 + 1 = n + 2.
So the total time taken = n*(n+2) + 3 = n^2 + 2n + 3.
Since big-O describes asymptotic behaviour, n is taken to be very large, in which case the 2n and 3 terms can be dropped. So the time complexity is O(n^2), which is quadratic (polynomial) time.
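As a rough sanity check, here is a small sketch (an instrumented re-implementation of the same insertion idea; count_ops is just an illustrative name) that counts how many elements the linear search inspects. The count roughly quadruples each time the input size doubles, as expected for O(n^2):

import random

def count_ops(lst):
    # Insertion sort as above, but counting how many elements the search inspects.
    ops = 0
    result = [lst[0]]
    for x in lst[1:]:
        pos = len(result)          # default: insert at the end
        for j, y in enumerate(result):
            ops += 1
            if x <= y:
                pos = j
                break
        result.insert(pos, x)
    return ops

for n in (100, 200, 400, 800):
    data = list(range(n))
    random.shuffle(data)
    print(n, count_ops(data))      # roughly quadruples each time n doubles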


How to calculate time complexity of these functions

def f1(n):
    for i in range(n):
        k = aux1(n - i)
        while k > 0:
            print(i*k)
            k //= 2

def aux1(m):
    jj = 0
    for j in range(m):
        jj += j
    return m
I am trying to calculate the time complexity of function f1, and it's not really working out for me. I would appreciate any feedback on my work.
What I'm doing: I started by substituting i=1 and working through one iteration: the function calls aux1 with m = n-1, aux1 iterates n-1 times and returns m = n-1, so back in f1 we have k = n-1, and the while k > 0 loop runs about log(n-1) times. So for that first iteration f1 does O(n) work (coming from the call to aux1).
But then the loop keeps calling aux1 with n-1, n-2, n-3, ..., 1, and I am a little confused about how to continue calculating the time complexity from here, or whether I'm on the right track at all.
Thanks in advance for any help and explanation!
This is all very silly, but it can be figured out step by step.
The inner while loop halves k every time, so its time complexity is O(log(aux1(n - i))).
Now what is aux1(n - i)? It simply returns n - i, but running it takes time proportional to n - i because of that superfluous extra loop.
So for the body of the outer loop we have one part that costs n - i and one part that costs log(n - i); using the rules of time complexity we can ignore the smaller part (the log) and focus on the larger part, which is O(n - i).
The outer loop runs i from 0 to n - 1, so the total time complexity is O(n^2), because 1 + 2 + 3 + ... + n = O(n^2).
To find the complexity, I wouldn't suggest the substitution approach for this type of question; instead, try the approach where you work out the order of each function from the number of operations it performs.
Let's analyze it, starting with the line below:
for i in range(n):
This will run O(n) times, without any doubt.
k = aux1(n - i)
The complexity of the above line is O(n * complexity of aux1(n - i)).
Let's find the complexity of aux1(n - i): because it contains only one for loop, it also runs in O(n), hence the complexity of the line above is O(n * n).
Now consider the while loop, whose total cost is O(n * complexity of the while loop body):
while k > 0:
    print(i*k)
    k //= 2
This runs log(k) times, but k equals n - i, which is of order O(n);
hence log(k) is log(n), making the loop body's complexity O(log(n)).
So the while loop, over all iterations of the outer loop, has a complexity of O(n*log(n)).
Now, adding up the overall complexities:
O(n*n) (total cost of the aux1 calls) + O(n*log(n)) (total cost of the while loops).
The above can be described as O(n^2), since big-O keeps only the dominant upper-bound term.
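If you want to verify this empirically, here is a small sketch (f1_ops is just an illustrative name) that counts the loop iterations of f1 without the printing; the count roughly quadruples when n doubles, which matches O(n^2):

def f1_ops(n):
    ops = 0
    for i in range(n):
        m = n - i
        for j in range(m):     # aux1(n - i): its loop runs n - i times
            ops += 1
        k = m
        while k > 0:           # halving loop: about log2(n - i) iterations
            ops += 1
            k //= 2
    return ops

for n in (100, 200, 400, 800):
    print(n, f1_ops(n))        # grows roughly like n^2 / 2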

My program can't run that fast even with memoization

I tried a problem on Project Euler where I needed to find the sum of the Fibonacci terms under 4 million. It took me a long time, so I found out that I can use memoization, but it still seems to take a long time. After a lot of research, I found out that I can use the lru_cache decorator from the built-in functools module. My question is: why isn't my memoization as fast as lru_cache?
Here's my code:
from functools import lru_cache
@lru_cache(maxsize=1000000)
def fibonacci_memo(input_value):
    global value
    fibonacci_cache = {}
    if input_value in fibonacci_cache:
        return fibonacci_cache[input_value]
    if input_value == 0:
        value = 1
    elif input_value == 1:
        value = 1
    elif input_value > 1:
        value = fibonacci_memo(input_value - 1) + fibonacci_memo(input_value - 2)
    fibonacci_cache[input_value] = value
    return value

def sumOfFib():
    SUM = 0
    for n in range(500):
        if fibonacci_memo(n) < 4000000:
            if fibonacci_memo(n) % 2 == 0:
                SUM += fibonacci_memo(n)
    return SUM

print(sumOfFib())
The code works, by the way. It takes less than a second to run when I use lru_cache.
The other answer shows the correct way to calculate the Fibonacci sequence, indeed, but you should also know why your own memoization wasn't working. To be specific:
fibonacci_cache = {}
Because this line is inside the function, you were emptying your cache every time fibonacci_memo was called.
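A minimal sketch of the fix, keeping the question's convention F(0) = F(1) = 1, is to move the cache outside the function so it survives between calls:

fibonacci_cache = {0: 1, 1: 1}   # persists across calls, so the memoization actually works

def fibonacci_memo(input_value):
    if input_value not in fibonacci_cache:
        fibonacci_cache[input_value] = (fibonacci_memo(input_value - 1)
                                        + fibonacci_memo(input_value - 2))
    return fibonacci_cache[input_value]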
You don't actually need to compute the Fibonacci sequence itself, not even by dynamic programming. Since the Fibonacci sequence satisfies a linear recurrence relation with constant coefficients and constant order, so does the sequence of its partial sums.
Definitely don't cache all the values; that is an unnecessary use of memory. When the recurrence has constant order, you only need to remember as many previous terms as the order of the recurrence.
Furthermore, there is a way to turn a recurrence of constant order into a system of recurrences of order one. The solution of the latter is given by a power of a matrix. This gives a faster algorithm for large values of n, although each step is more expensive. So the best method uses a combination of the two: the direct recurrence for small values of n and the matrix method for large inputs.
O(n) using the recurrence for the sum
Denote by S_n = F_0 + F_1 + ... + F_n the sum of the first Fibonacci numbers F_0, F_1, ..., F_n.
Observe that
S_{n+1}-S_n=F_{n+1}
S_{n+2}-S_{n+1}=F_{n+2}
S_{n+3}-S_{n+2}=F_{n+3}
Since F_{n+3}=F_{n+2}+F_{n+1} we get that S_{n+3}-S_{n+2}=S_{n+2}-S_n. So
S_{n+3}=2S_{n+2}-S_n
with the initial conditions S_0=F_0=1, S_1=F_0+F_1=1+1=2, and S_2=S_1+F_2=2+2=4.
One thing that you can do is compute S_n bottom up, remembering only the previous three terms at each step; you don't need to remember all the values of S_k from k=0 to k=n. This gives you an O(n) algorithm with O(1) memory.
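Here is a minimal sketch of that bottom-up idea (the function name sum_fib_upto is illustrative), using the same conventions as above (F_0 = F_1 = 1, so S_0 = 1, S_1 = 2, S_2 = 4):

def sum_fib_upto(n):
    # S_n = F_0 + ... + F_n via S_{k+3} = 2*S_{k+2} - S_k; O(n) time, O(1) memory.
    if n == 0:
        return 1
    if n == 1:
        return 2
    s0, s1, s2 = 1, 2, 4          # S_0, S_1, S_2
    for _ in range(n - 2):
        s0, s1, s2 = s1, s2, 2*s2 - s0
    return s2

print([sum_fib_upto(k) for k in range(7)])   # [1, 2, 4, 7, 12, 20, 33]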
O(ln(n)) by matrix exponentiation
You can also get an O(ln(n)) algorithm in the following way:
Let X_n be the column vector with components S_{n+2}, S_{n+1}, S_n.
Then the recurrence above gives the recurrence
X_{n+1}=AX_n
where A is the matrix
[
[2,0,-1],
[1,0,0],
[0,1,0],
]
Therefore, X_n = A^n X_0. We know X_0, and to compute A^n we can use exponentiation by squaring.
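Here is a rough sketch of that idea in plain Python (the names mat_mul, mat_pow, and sum_fib_fast are illustrative), using the matrix A and the initial vector X_0 = (S_2, S_1, S_0) = (4, 2, 1) from above:

def mat_mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_pow(A, e):
    # exponentiation by squaring: O(log e) matrix multiplications
    R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # identity
    while e:
        if e & 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        e >>= 1
    return R

def sum_fib_fast(n):
    # For n >= 2, the first component of A^(n-2) X_0 is S_n.
    if n < 3:
        return (1, 2, 4)[n]
    A = [[2, 0, -1], [1, 0, 0], [0, 1, 0]]
    top = mat_pow(A, n - 2)[0]
    return top[0]*4 + top[1]*2 + top[2]*1

print([sum_fib_fast(k) for k in range(7)])   # [1, 2, 4, 7, 12, 20, 33]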
For the sake of completeness, here are implementations of the general ideas described in #NotDijkstra's answer, plus my humble optimizations, including the "closed form" solution implemented in integer arithmetic.
We can see that the "smart" methods are not only an order of magnitude faster but also seem to scale better, consistent with the fact (thanks #NotDijkstra) that Python big ints use better-than-naive multiplication.
import numpy as np
import operator as op
from simple_benchmark import BenchmarkBuilder, MultiArgument

B = BenchmarkBuilder()

def pow(b, e, mul=op.mul, unit=1):
    if e == 0:
        return unit
    res = b
    for bit in bin(e)[3:]:
        res = mul(res, res)
        if bit == "1":
            res = mul(res, b)
    return res

def mul_fib(a, b):
    return (a[0]*b[0] + 5*a[1]*b[1]) >> 1, (a[0]*b[1] + a[1]*b[0]) >> 1

def fib_closed(n):
    return pow((1, 1), n+1, mul_fib)[1]

def fib_mat(n):
    return pow(np.array([[1, 1], [1, 0]], 'O'), n, op.matmul)[0, 0]

def fib_sequential(n):
    t1, t2 = 1, 1
    for i in range(n-1):
        t1, t2 = t2, t1+t2
    return t2

def sum_fib_direct(n):
    t1, t2, res = 1, 1, 1
    for i in range(n):
        t1, t2, res = t2, t1+t2, res+t2
    return res

def sum_fib(n, method="closed"):
    if method == "direct":
        return sum_fib_direct(n)
    return globals()[f"fib_{method}"](n+2) - 1

methods = "closed mat sequential direct".split()

def f(method):
    def f(n):
        return sum_fib(n, method)
    f.__name__ = method
    return f

for method in methods:
    B.add_function(method)(f(method))

B.add_arguments('N')(lambda: (2*(1<<k,) for k in range(23)))

r = B.run()
r.plot()

import matplotlib.pylab as P
P.savefig("fib.png")
I am not sure how you are taking anything near a second. Here is the memoized version without fanciness:
class fibs(object):
    def __init__(self):
        self.thefibs = {0: 0, 1: 1}

    def __call__(self, n):
        if n not in self.thefibs:
            self.thefibs[n] = self(n-1) + self(n-2)
        return self.thefibs[n]

dog = fibs()
sum([dog(i) for i in range(40) if dog(i) < 4000000])

Time complexity of O(n) inside O(logn) outer loop

I am trying to figure out the time complexity of this algorithm. A is the input array. The code does not run, by the way; it is for demonstration purposes.
def func(A):
    result = 0
    n = len(A)
    while n > 1:
        n = n/2
        result = result + min(A[1,...,n])
    return result
This assumes array A is of length n.
I would assume the time complexity of this to be O(n log n), as the while loop has complexity O(log n) and the min function has complexity O(n). However, this function apparently has complexity O(n), not O(n log n). I am wondering how that can be?
The total number of iterations depends linearly on n: it is n/2 + n/4 + n/8 + ... = n(1/2 + 1/4 + 1/8 + ...) < n, which is why the total work is O(n).
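One quick way to see this is a small sketch (count_scanned is an illustrative name) that adds up how many elements the min call scans over all iterations of the while loop; the total always stays below n:

def count_scanned(n):
    scanned = 0
    while n > 1:
        n //= 2
        scanned += n             # min over A[1..n] looks at n elements
    return scanned

for n in (16, 1000, 10**6):
    print(n, count_scanned(n))   # always strictly less than the original n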

Why is branching recursion faster than linear recursion (example: list inversion)

Yesterday I wrote two possible reverse functions for lists to demonstrate different ways of doing list inversion. But then I noticed that the function using branching recursion (rev2) is actually faster than the function using linear recursion (rev1), even though the branching function needs more calls to finish, and its non-trivial calls (only one fewer than for the linear version) are individually more computation-intensive than the non-trivial calls of the linearly recursive function. I am not explicitly triggering any parallelism, so where does the performance difference come from that makes a function with more, more involved calls take less time?
from sys import argv
from time import time

nontrivial_rev1_call = 0  # counts number of calls involving concatenation, indexing and slicing
nontrivial_rev2_call = 0  # counts number of calls involving concatenation, len-call, division and slicing

length = int(argv[1])

def rev1(l):
    global nontrivial_rev1_call
    if l == []:
        return []
    nontrivial_rev1_call += 1
    return rev1(l[1:]) + [l[0]]

def rev2(l):
    global nontrivial_rev2_call
    if l == []:
        return []
    elif len(l) == 1:
        return l
    nontrivial_rev2_call += 1
    return rev2(l[len(l)//2:]) + rev2(l[:len(l)//2])

lrev1 = rev1(list(range(length)))
print('Calls to rev1 for a list of length {}: {}'.format(length, nontrivial_rev1_call))
lrev2 = rev2(list(range(length)))
print('Calls to rev2 for a list of length {}: {}'.format(length, nontrivial_rev2_call))
print()

l = list(range(length))
start = time()
for i in range(1000):
    lrev1 = rev1(l)
end = time()
print("Average time taken for 1000 passes on a list of length {} with rev1: {} ms".format(length, (end-start)/1000*1000))

start = time()
for i in range(1000):
    lrev2 = rev2(l)
end = time()
print("Average time taken for 1000 passes on a list of length {} with rev2: {} ms".format(length, (end-start)/1000*1000))
Example call:
$ python reverse.py 996
Calls to rev1 for a list of length 996: 996
Calls to rev2 for a list of length 996: 995
Average time taken for 1000 passes on a list of length 996 with rev1: 7.90629506111145 ms
Average time taken for 1000 passes on a list of length 996 with rev2: 1.3290061950683594 ms
Short answer: it is not so much the number of calls here as the amount of copying of the lists. As a result, the linear recursion has time complexity O(n^2) whereas the branching recursion has time complexity O(n log n).
The recursive call here does not operate in constant time: it operates in time proportional to the length of the list it copies. Indeed, copying a list of n elements requires O(n) time.
Now if we perform the linear recursion, we make O(n) calls (the maximum call depth is O(n)). Each time, we copy the list entirely except for one item, so the time complexity is:
sum_{k=1}^{n} k = n*(n+1)/2
So, given that each call itself does O(1) work apart from the copying, the time complexity of the algorithm is O(n^2).
In the case of branching recursion, we make two copies of the list, each of approximately half the length. Every level of recursion therefore takes O(n) time in total (the halves are copies as well, and summed over one level they amount to a full copy of the list), but the number of levels only grows logarithmically:
sum_{k=1}^{log n} n = n*log n
So the time complexity here is O(n log n) (where log is the base-2 logarithm, but that does not matter in terms of big O).
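As a quick check of these two bounds, here is a small sketch (rev1_copies and rev2_copies are illustrative names) that counts only the elements copied by the slicing in each version; the first count grows quadratically while the second grows like n log n:

def rev1_copies(n):
    # rev1 slices away one element per call: (n-1) + (n-2) + ... + 1 copies
    return sum(range(n))

def rev2_copies(n):
    if n <= 1:
        return 0
    half = n // 2
    # both halves are copied at this level (n elements in total), then recurse
    return n + rev2_copies(half) + rev2_copies(n - half)

for n in (1000, 2000, 4000):
    print(n, rev1_copies(n), rev2_copies(n))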
Using views
Instead of copying lists, we can use views: here we keep a reference to the same list, but use two integers that specify the span of the list. For example:
def rev1(l, frm, to):
    global nontrivial_rev1_call
    if frm >= to:
        return []
    nontrivial_rev1_call += 1
    result = rev1(l, frm+1, to)
    result.append(l[frm])
    return result

def rev2(l, frm, to):
    global nontrivial_rev2_call
    if frm >= to:
        return []
    elif to - frm == 1:
        return [l[frm]]
    nontrivial_rev2_call += 1
    mid = (frm+to)//2
    return rev2(l, mid, to) + rev2(l, frm, mid)
If we now run the timeit module, we obtain:
>>> import timeit
>>> from functools import partial
>>> timeit.timeit(partial(rev1, list(range(966)), 0, 966), number=10000)
2.176353386021219
>>> timeit.timeit(partial(rev2, list(range(966)), 0, 966), number=10000)
3.7402000919682905
This is because we no longer copy the lists, and thus the append(..) call works in O(1) amortized cost, whereas in the branching recursion we concatenate two lists, which takes O(k) time with k the sum of the lengths of the two lists. So now we compare O(n) (linear recursion) with O(n log n) (branching recursion).

How do I check the time complexity of a comprehension

I have gone through many blogs on Python time complexity, and here is my question:
In the case of list comprehensions, how should the time complexity be analysed?
For example:
x = [(i,xyz_list.count(i)) for i in xyz_set]
where xyz_list = [1,1,1,1,3,3,4,5,3,2,2,4,5,3] and xyz_set = set([1, 2, 3, 4, 5])
So, is the complexity of this one line of code O(n*n*n) [i.e., O(n) for the iteration, O(n) for the list creation, and O(n) for the count function]?
This is quadratic, O(n^2):
x = [(i,xyz_list.count(i)) for i in xyz_set]
xyz_list.count(i)  # O(n) operation
For every i in xyz_set, you do an O(n) xyz_list.count(i).
You can write it as a double for loop, which makes it more obvious:
res = []
for i in xyz_set:          # O(n)
    count = 0
    for j in xyz_list:     # O(n)
        if i == j:         # constant operation, O(1)
            count += 1     # constant operation, O(1)
    res.append(count)      # constant operation, O(1)
See also: Python time complexity.
Usually when you see a double for loop, the complexity will be quadratic, unless you are dealing with some constant bound. For instance, if we only check the first 7 elements of xyz_list, the running time would be O(n), presuming we do the same constant amount of work inside the inner loop:
sec = 7
res = []
for i in xyz_set:
    count = 0
    for j in xyz_list[:sec]:
        ......
The complexities are not necessarily multiplied. In many cases they are just added up.
In your case:
O(n) for the iteration, O(n) for the list creation, and for each new item O(n) for count(), which gives n*O(n). The total complexity is O(n) + O(n) + n*O(n) = O(n*n).
A list comprehension is nothing special; it is just a loop. You could rewrite your code as:
x = []
for i in xyz_set:
    item = (i, xyz_list.count(i))
    x.append(item)
So we have a loop, and inside it an O(n) list.count() operation, making the algorithm O(n**2) (quadratic).
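For reference, the same result can be computed in a single O(n) pass with collections.Counter (this is just an alternative sketch, not required for the complexity analysis above):

from collections import Counter

xyz_list = [1, 1, 1, 1, 3, 3, 4, 5, 3, 2, 2, 4, 5, 3]
counts = Counter(xyz_list)                    # one O(n) counting pass
x = [(i, counts[i]) for i in set(xyz_list)]   # overall O(n) instead of O(n^2)
print(x)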
