Can someone please let me know the time complexity of the code below?
from math import sqrt

nums = [1, 2, 4, 6, 180, 290, 1249]
ll = []
l = []
for i in nums:
    for j in range(1, int(sqrt(i)) + 1):
        if i % j == 0:
            l.append(j)
    ll.append(l.copy())
    l.clear()
print(ll)
There are three main operations that are going to determine the time complexity.
The outer loop for i in nums is O(N), where N = len(nums).
The inner loop for j in range(1, int(sqrt(i))+1) runs O(sqrt(i)) times per element.
Within the outer loop we also have ll.append(l.copy()), where l.copy() is an O(k) operation with k == len(l).
Let N = len(nums), M = sqrt(max(nums)), and K = the length of the longest list l being copied.
As M and K are at the same level, this starts us at O(N * (M+K))
However, K is dependent on M, and will always be smaller (K is the number of factors of i <= sqrt(i)), so we can effectively ignore it.
This results in a complexity of O(N * M), where N = len(nums) and M = sqrt(max(nums)).
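As a rough sanity check (this snippet is my addition, not part of the question or answer), we can count the inner-loop iterations directly and compare them against the N * M bound:

```python
from math import sqrt

# Count the actual inner-loop iterations for the question's input and
# compare them against the O(N * M) estimate, where M = sqrt(max(nums)).
nums = [1, 2, 4, 6, 180, 290, 1249]
iterations = sum(int(sqrt(i)) for i in nums)  # work done by the inner loop
bound = len(nums) * sqrt(max(nums))           # N * M
print(iterations, bound)
```

The iteration count always stays at or below the bound, since every sqrt(i) is at most sqrt(max(nums)).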
I have trouble finding the time complexity of my program and the exact formula to compute the number of calls (represented by the length of the list). Here is the Python program I wrote:
from math import floor

def calc(n):
    i = n
    li = []
    while i > 0:
        j = 0
        li.append(1)
        while j < n:
            li.append(1)
            k = j
            while k < n:
                li.append(1)
                k += 1
            j += 1
        i = floor(i / 2)
    return len(li)

for i in range(1, 16):
    print(calc(i))
To compute the time complexity of this program, we start from the innermost loop.
k = j
while k < n:
    li.append(1)
    k += 1
Here k moves forward one step at a time from j to n, so this loop runs (n - j) times. Now back to the loop enclosing this inner loop.
j = 0
while j < n:
    li.append(1)
    k = j
    # inner loop: (n - j) iterations
    j += 1
Here j also iterates one by one from 0 to n-1, n times in total. For every j, the inner loop runs n - j times:
when j=0, then k=0,1,2,3,...,(n-1): n times in total
when j=1, then k=1,2,3,...,(n-1): n-1 times in total
when j=2, then k=2,3,4,...,(n-1): n-2 times in total
...
when j=n-1, then k=(n-1): 1 time in total
Total complexity = 1 + 2 + 3 + ... + (n-2) + (n-1) + n = (n*(n+1))/2
Now the outer loop for i:
i = n
while i > 0:
    # inner work: (n*(n+1))/2
    i = floor(i / 2)
Here i starts from n and is halved every time. If a loop variable is divided by x on each iteration, the loop runs about log base x of n times. Hence, for i the complexity is approximately log base 2 of n.
Hence the total time complexity is approximately log2(n) * (n*(n+1))/2,
which can also be represented as O(n^2 * log(n)).
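Since the question also asks for an exact formula for the number of calls: each outer iteration appends 1 + n + n(n+1)/2 elements, and the outer loop runs floor(log2(n)) + 1 times. Here is a quick check of that closed form against calc itself (my snippet, not from the original answer):

```python
from math import floor, log2

def calc(n):
    # The question's function, reproduced for the comparison below.
    i, li = n, []
    while i > 0:
        j = 0
        li.append(1)
        while j < n:
            li.append(1)
            k = j
            while k < n:
                li.append(1)
                k += 1
            j += 1
        i = floor(i / 2)
    return len(li)

def exact_count(n):
    # The outer loop runs floor(log2(n)) + 1 times; each run appends
    # 1 + n + n*(n+1)/2 elements.
    return (floor(log2(n)) + 1) * (1 + n + n * (n + 1) // 2)

for n in range(1, 16):
    assert calc(n) == exact_count(n)
```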
Maybe plotting your total number of calls against your value for n will help:
import matplotlib.pyplot as plt
x = [i for i in range(1, 16)]
y = [calc(n) for n in x]
plt.plot(x, y)
plt.show()
I was working on a leetcode problem (https://leetcode.com/problems/top-k-frequent-elements/) which is:
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
I solved this using a min-heap (my time-complexity calculations are in the comments - do correct me if I made a mistake):
from collections import Counter
from heapq import heapify, heappushpop
from itertools import islice

def top_k_frequent(nums, k):
    if k == len(nums):
        return nums
    # O(N)
    c = Counter(nums)
    it = iter([(x[1], x[0]) for x in c.items()])
    # O(K)
    result = list(islice(it, k))
    heapify(result)
    # O(N-K)
    for elem in it:
        # O(log K)
        heappushpop(result, elem)
    # O(K)
    return [pair[1] for pair in result]

# O(K) + O(N) + O((N - K) log K) + O(K log K)
# if k < N:
#     O(N log K)
Then I saw a solution using Bucket Sort that supposedly beats the heap solution with O(N):
import collections
import itertools

def top_k_frequent_bucket(nums, k):
    bucket = [[] for _ in nums]
    # O(N)
    c = collections.Counter(nums)
    # O(d) where d is the number of distinct numbers. d <= N
    for num, freq in c.items():
        bucket[-freq].append(num)
    # O(?)
    return list(itertools.chain(*bucket))[:k]
How do we compute the time complexity of the itertools.chain call here?
Does it come from the fact that we will chain at most N elements? Is that enough to deduce that it is O(N)?
In any case, at least on the LeetCode test cases, the first one has better performance - what can be the reason for that?
The time complexity of list(itertools.chain(*bucket)) is O(N) where N is the total number of elements in the nested list bucket. The chain function is roughly equivalent to this:
def chain(*iterables):
    for iterable in iterables:
        for item in iterable:
            yield item
The yield statement dominates the running time, is O(1), and executes N times, hence the result.
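A small demonstration (mine, not from the answer) that chaining visits each element exactly once, in bucket order:

```python
from itertools import chain

# Flattening nested buckets touches each element exactly once, so the
# total work is proportional to the total number of elements N.
bucket = [[3], [], [1, 2], [], [5]]
flat = list(chain(*bucket))
print(flat)  # [3, 1, 2, 5]
assert len(flat) == sum(len(b) for b in bucket)
```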
The reason your O(N log k) algorithm might end up being faster in practice is because log k is probably not very large; LeetCode says k is at most the number of distinct elements in the array, but I suspect for most of the test cases k is much smaller, and of course log k is smaller than that.
The O(N) algorithm probably has a relatively high constant factor because it allocates N lists and then randomly accesses them by index, resulting in a lot of cache misses; the append operations may also cause a lot of those lists to be reallocated as they become larger.
Notwithstanding my comment, using nlargest seems to run more slowly than using heapify etc. (see below). But the Bucket Sort, at least for this input, is definitely more performant. It also seems that, with the Bucket Sort, creating the full list of num elements just to take the first k does not incur much of a penalty.
from collections import Counter
from heapq import heapify, heappushpop, nlargest
from itertools import chain, islice
import timeit

def most_frequent_1a(nums, k):
    if k == len(nums):
        return nums
    # O(N)
    c = Counter(nums)
    it = iter([(x[1], x[0]) for x in c.items()])
    # O(K)
    result = list(islice(it, k))
    heapify(result)
    # O(N-K)
    for elem in it:
        # O(log K)
        heappushpop(result, elem)
    # O(K)
    return [pair[1] for pair in result]

def most_frequent_1b(nums, k):
    if k == len(nums):
        return nums
    c = Counter(nums)
    return [pair[1] for pair in nlargest(k, [(x[1], x[0]) for x in c.items()])]

def most_frequent_2a(nums, k):
    bucket = [[] for _ in nums]
    # O(N)
    c = Counter(nums)
    # O(d) where d is the number of distinct numbers. d <= N
    for num, freq in c.items():
        bucket[-freq].append(num)
    # O(?)
    return list(chain(*bucket))[:k]

def most_frequent_2b(nums, k):
    bucket = [[] for _ in nums]
    # O(N)
    c = Counter(nums)
    # O(d) where d is the number of distinct numbers. d <= N
    for num, freq in c.items():
        bucket[-freq].append(num)
    # O(?)
    # don't create the full list:
    i = 0
    for elem in chain(*bucket):
        yield elem
        i += 1
        if i == k:
            break

nums = [i for i in range(1000)]
nums.append(7)
nums.append(88)
nums.append(723)
print(most_frequent_1a(nums, 3))
print(most_frequent_1b(nums, 3))
print(most_frequent_2a(nums, 3))
print(list(most_frequent_2b(nums, 3)))
print(timeit.timeit(stmt='most_frequent_1a(nums, 3)', number=10000, globals=globals()))
print(timeit.timeit(stmt='most_frequent_1b(nums, 3)', number=10000, globals=globals()))
print(timeit.timeit(stmt='most_frequent_2a(nums, 3)', number=10000, globals=globals()))
print(timeit.timeit(stmt='list(most_frequent_2b(nums, 3))', number=10000, globals=globals()))
Prints:
[7, 723, 88]
[723, 88, 7]
[7, 88, 723]
[7, 88, 723]
3.180169899998873
4.487235299999156
2.710413699998753
2.62860400000136
PROBLEM:
You are given a list of size N, initialized with zeroes. You have to perform M operations on the list and output the maximum of the final values of all the elements in the list. For every operation, you are given three integers a, b and k, and you have to add the value k to all the elements ranging from index a to b (both inclusive).
Input Format
First line will contain two integers N and M separated by a single space.
The next M lines will each contain three integers a, b and k separated by a single space.
Numbers in list are numbered from 1 to N.
Here is the code which I have written:
n, m = map(int, input().split())
arr = []
for i in range(n+1):
    arr.append(0)
for j in range(m):
    a, b, k = map(int, input().split())
    for i in range(a, b+1):
        arr[i] += k
print(max(arr))
When I try to submit my solution I get a "TERMINATED DUE TO TIMEOUT" message. Could you please suggest a strategy to avoid this kind of error, and also a solution to the problem?
Thanks in advance!
Don't loop over the list range; instead, use map again to increment the indicated values. Something like
for j in range(m):
    a, b, k = map(int, input().split())
    arr[a:b+1] = map(lambda x: x + k, arr[a:b+1])
This should let the interpreter's built-in optimizations swoop in and save some time.
You probably need an algorithm that has better complexity than O(M*N).
You can put interval delimiters in a list:
n, m = map(int, input().split())
intervals = []
arr = [0 for i in range(n)]
for j in range(m):
    a, b, k = map(int, input().split())
    intervals.append((a - 1, "begin", k))   # convert to 0-based indices
    intervals.append((b - 1, "end", k))
# sort numerically by index; "begin" sorts before "end" at the same index
intervals.sort(key=lambda x: (x[0], x[1]))
k, i = 0, 0
for op in intervals:
    ind = op[0]
    if op[1] == "begin":
        while ind > i:
            arr[i] += k
            i += 1
        k += op[2]
    else:
        while i <= ind:
            arr[i] += k
            i += 1
        k -= op[2]
print(max(arr))
If the sorting algorithm is O(M log M), this is O(M log M + N).
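The usual way to reach O(N + M) on this problem is a difference array (prefix-sum trick). This sketch is my addition, not from the answers above, and the function name is made up:

```python
# Record +k where each range starts and -k just past where it ends; one
# prefix-sum pass then reconstructs every final value in O(N + M).
def max_after_ops(n, ops):
    diff = [0] * (n + 1)
    for a, b, k in ops:        # a, b are 1-based and inclusive
        diff[a - 1] += k       # values start increasing at index a
        diff[b] -= k           # ...and stop increasing after index b
    best = running = 0
    for d in diff[:n]:
        running += d
        best = max(best, running)
    return best

print(max_after_ops(5, [(1, 2, 100), (2, 5, 100), (3, 4, 100)]))  # 200
```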
So I'm working on some practice problems and having trouble reducing the complexity. I am given an array of distinct integers a[] and a threshold value T. I need to find the number of triplets i,j,k such that a[i] < a[j] < a[k] and a[i] + a[j] + a[k] <= T. I've gotten this down from O(n^3) to O(n^2 log n) with the following python script. I'm wondering if I can optimize this any further.
import sys
import bisect

first_line = sys.stdin.readline().strip().split(' ')
num_numbers = int(first_line[0])
threshold = int(first_line[1])
count = 0
if num_numbers < 3:
    print count
else:
    numbers = sys.stdin.readline().strip().split(' ')
    numbers = map(int, numbers)
    numbers.sort()
    for i in xrange(num_numbers - 2):
        for j in xrange(i+1, num_numbers - 1):
            k_1 = threshold - (numbers[i] + numbers[j])
            if k_1 < numbers[j]:
                break
            else:
                cross_thresh = bisect.bisect(numbers, k_1) - (j+1)
                if cross_thresh > 0:
                    count += cross_thresh
    print count
In the above example, the first input line simply provides the number of numbers and the threshold. The next line is the full list. If the list has fewer than 3 elements, no triplet can exist, so we print 0. Otherwise, we read in the full list of integers, sort it, and process it as follows: we iterate over every pair (i, j) with i < j and compute the highest value of a[k] that would not break a[i] + a[j] + a[k] <= T. We then find the index s of the first element in the list that violates this condition and add all the elements between j and s to the count. For 30,000 elements in a list, this takes about 7 minutes to run. Is there any way to make it faster?
You are performing binary search for each (i,j) pair to find the corresponding value for k. Hence O(n^2 log(n)).
I can suggest an algorithm that will have the worst case time complexity of O(n^2).
Assume the list is sorted from left to right and elements are numbered from 1 to n. Then the pseudo code is:
for i = 1 to n - 2:
    j = i + 1
    find maximal k with binary search
    while j < k:
        j = j + 1
        find maximal k with linear search to the left, starting from the last k position
The reason this has the worst case time complexity of O(n^2) and not O(n^3) is because the position k is monotonically decreasing. Thus even with linear scanning, you are not spending O(n) for each (i,j) pair. Rather, you are spending a total of O(n) time to scan for k for each distinct i value.
O(n^2) version implemented in Python (based on wookie919's answer):
def triplets(N, T):
    N = sorted(N)
    result = 0
    for i in xrange(len(N)-2):
        k = len(N)-1
        for j in xrange(i+1, len(N)-1):
            while k >= 0 and N[i]+N[j]+N[k] > T:
                k -= 1
            result += max(k, j) - j
    return result

import random
sample = random.sample(xrange(1000000), 30000)
print triplets(sample, 500000)
def bubble(lst):
    swap = 'True'
    counter = 0
    n = len(lst)
    m = len(lst)
    while swap == 'True':
        for j in range(n-1):
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]
                counter += 1
                swap = 'True'
            else:
                swap = 'False'
        n = n - 1
    return counter
How do I shorten the time this function takes? I want to use it on a larger list.
Change the algorithm.
Use MergeSort or QuickSort.
BubbleSort is O(n*n).
The only reason it exists is to show students how they should not sort arrays :)
MergeSort is worst case O(n log n).
QuickSort is O(n * n) worst case, average case O(n log n), but with "low constants", so it's usually faster than merge sort.
Search for them on the web.
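Note that the original function returns the swap count, which equals the number of inversions in the list. A merge-sort-based routine (my sketch, not from this answer) computes the same count in O(n log n):

```python
def count_inversions(lst):
    # Returns (sorted_list, inversions); the inversion count equals the
    # number of swaps bubble sort would perform on the same input.
    if len(lst) <= 1:
        return list(lst), 0
    mid = len(lst) // 2
    left, a = count_inversions(lst[:mid])
    right, b = count_inversions(lst[mid:])
    merged, count = [], a + b
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            count += len(left) - i  # all remaining left items are inverted with right[j]
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count

print(count_inversions([5, 2, 9, 1]))  # ([1, 2, 5, 9], 4)
```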
If I'm not wrong (please don't rage at me if I am), I think I understood what you want to do:
def bubble(lst):
    counter = 0
    n = len(lst)
    while True:
        newn = 0
        for i in range(1, n):
            if lst[i-1] > lst[i]:
                lst[i-1], lst[i] = lst[i], lst[i-1]
                newn = i
                counter += 1
        if newn <= 0:
            return counter
        n = newn
The complexity, however, will still be O(n * n), so you will not notice any important difference.
For example:
If your list is 2000 items and you use bubble sort, 2000 * 2000 = 4,000,000 loop steps. This is huge.
With an O(n log n) sort, 2000 * log2(2000) is about 21931 loop steps, and this is manageable.
def bubble(lol):
    lol.sort()
    return lol