Time complexity with a logarithmic recursive function - python

Could someone please explain the time complexity of the following bit of code:
def fn(n):
    if n == 0:
        linear_time_fn(n)  # some function that does work in O(n) time
    else:
        linear_time_fn(n)
        fn(n // 5)
I was under the impression that the complexity of the code is O(n log n), while the actual complexity is O(n). How is this function different from one like merge sort, which has O(n log n) complexity? Thanks.

It's O(n) because n is smaller at each recursive level. So you have O(log n) calls to the function, but you don't do n units of work each time. The first call does O(n) work, the second O(n/5), the next O(n/25), and so on.
When you combine these, you get the geometric series n + n/5 + n/25 + ... < 5n/4, which is O(n).
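For intuition, a small sketch that counts units of work per level, assuming the O(n) helper from the question does exactly n units:

def total_work(n):
    # Work done by fn(n): n units at this level, plus the work of fn(n // 5).
    if n == 0:
        return 0
    return n + total_work(n // 5)

# The totals stay within a small constant factor of n,
# since n + n/5 + n/25 + ... < 5n/4.
for n in (100, 1000, 10000):
    print(n, total_work(n), total_work(n) / n)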

You are correct that this is O(n). The difference between this and merge sort is that this makes one recursive call, while merge sort makes two.
So for this code, you have
One problem of size n
One problem of size n/5
One problem of size n/25
...
With merge sort, you have
One problem of size n
which yields two problems of size n/2
which yields four problems of size n/4
...
which yields n problems of size 1.
In the first case, the total work is n + n/5 + n/25 + ... < 5n/4, which is O(n).
In the second case, every level of the recursion does about n work in total, and after log2(n) levels you reach the bottom, so the total is about n * log2(n).
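A minimal sketch of the merge-sort-style count, assuming every level contributes n units of combining work across its subproblems:

import math

def work_two_calls(n):
    # T(n) = n + 2 * T(n // 2): two subproblems of half the size plus n units
    # to combine them, which comes out to roughly n * log2(n) in total.
    return 0 if n <= 1 else n + 2 * work_two_calls(n // 2)

for n in (256, 1024, 4096):
    print(n, work_two_calls(n), n * int(math.log2(n)))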

Related

What is the time complexity of a while loop that uses random.shuffle (python) inside of it?

First of all, can we even measure it, since we don't know how many times random.shuffle will shuffle the array until it reaches the desired outcome?
def sort(numbers):
    import random
    while not sort(numbers) == numbers:
        random.shuffle(numbers)
    return numbers
First, I assume the function called in the condition is not meant to be sort itself, as that would be trivial and would lead to unconditional infinite recursion. I am assuming this function:
import random

def random_sort(numbers):
    while not sorted(numbers) == numbers:
        random.shuffle(numbers)
    return numbers
Without looking at the implementation too much, I would assume O(n) for the inner shuffle random.shuffle(numbers), where n is the number of elements in numbers.
Then we have the while loop. It stops when the array is sorted. Now, shuffle returns one of all possible permutations of numbers, and the loop only stops for the sorted one, which is just one of those permutations (assuming the elements are distinct).
This stopping is statistical, so technically we need to define which complexity we are speaking of. This is where best case, worst case, and average case come in.
Best case
The numbers we get are already sorted. Then we only pay the cost of sorted(numbers) and the comparison .. == numbers. Sorting an already sorted list is O(n) (Timsort's best case), and the comparison is O(n), so our best-case complexity is O(n).
Worst case
The shuffle never gives us the right permutation. This is definitely possible, so the algorithm would never terminate; the worst case is unbounded (O(∞)).
Average case
This is probably the most interesting case. First we need to establish how many permutations shuffle can give us. Here is a link which discusses that. An approximation is given as e * n!, which is O(n!) (please check).
Now the question is: on average, when does our loop stop? This is answered in this link. They say it's the geometric distribution (please check). The expected number of tries is 1/p, where p is the probability of success on a single try. In our case p = 1 / (e * n!), so we need on average e * n! tries.
For each try we need to sort, O(n log(n)), compare, O(n), and compute the shuffle, O(n). For the shuffle we can say it uses the Fisher-Yates algorithm, which has a complexity of O(n), as shown here.
So we have O(n! * n log(n)) for the average complexity.
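A small empirical sketch of the average case (the element count and trial count are arbitrary choices for illustration); for distinct elements the average number of shuffles should land near n!:

import math
import random

def random_sort(numbers):
    # Bogosort: shuffle until sorted, counting how many shuffles were needed.
    tries = 0
    while sorted(numbers) != numbers:
        random.shuffle(numbers)
        tries += 1
    return tries

n = 5
trials = 2000
avg = sum(random_sort(list(range(n, 0, -1))) for _ in range(trials)) / trials
print(f"average shuffles: {avg:.1f}, n! = {math.factorial(n)}")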

Can you help me with the time complexity of this Python code?

I have written this code and I think its time complexity is O(n+m), since the time depends on both inputs. Am I right? Is there a better algorithm you can suggest?
The function returns the length of the union of both inputs.
class Solution:
    def getUnion(self, a, b):
        p = 0
        lower, greater = a, b
        if len(a) > len(b):
            lower, greater = b, a
        while p < len(lower):  # O(n+m)
            if lower[p] in greater:
                greater.remove(lower[p])
            p += 1
        return len(lower + greater)

print(Solution().getUnion([1, 2, 3, 4, 5], [2, 3, 4, 54, 67]))
Assuming m is the shorter length and n the longer (or both are equal), the while loop will iterate m times.
Inside a single iteration of that loop, an in greater membership test (and possibly a greater.remove) is executed; each of these is O(n) per execution on a list.
So the total time complexity is O(mn).
The correctness of this algorithm depends on whether we can assume that the input lists only contain unique values (each).
You can do better using a set:
return len(set(a + b))
Building a set is O(m + n), and getting its length is a constant-time operation, so this is O(m + n).
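As a drop-in for the class from the question, a minimal sketch:

class Solution:
    def getUnion(self, a, b):
        # set(a + b) is built in O(m + n); len() is O(1).
        return len(set(a + b))

print(Solution().getUnion([1, 2, 3, 4, 5], [2, 3, 4, 54, 67]))  # 7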

Algorithm for finding the SumKSmallest

Can anyone help me solve this question in pseudocode?
Consider the function SumKSmallest(A[0..n βˆ’ 1], k) that returns the sum of the k smallest elements in an unsorted integer array A of size n. For example, given the array A =[6,-6,3,2,1,2,0,4,3,5] and k =3, the function should return -5.
a. Write an algorithm in pseudocode for SumKSmallest using the brute force paradigm. Indicate and justify (within a few sentences) the time complexity of your algorithm.
b. Write another algorithm in pseudocode for SumKSmallest using the transform & conquer paradigm. Your algorithm should strictly run in O(n log n) time. Justify the time complexity of your algorithm.
c. Explain in detail how we could implement another SumKSmallest that strictly runs in less than O(n log n) time, considering k <
Brute force
O(n^2). You got it right.
Second Approach:
Sort the array and sum the first k numbers to get the sum of the k smallest. O(n log(n)) with merge sort.
Third Approach:
Heap: take the first k elements and max-heapify them. Then iterate over the remaining elements one by one, comparing each with the root of the max-heap (arr[0]); if arr[0] > element, remove arr[0] from the heap and add the element to the heap. At the end you are left with the k smallest numbers of the array; sum them.
O(k + (n-k)log(k))
Read about Min-heap and Max-heap
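A minimal Python sketch of the heap approach (heapq is a min-heap, so values are negated to simulate a max-heap; the function name is made up):

import heapq

def sum_k_smallest(a, k):
    # Max-heap of the k smallest elements seen so far, stored negated
    # because heapq is a min-heap.
    heap = [-x for x in a[:k]]
    heapq.heapify(heap)                  # O(k)
    for x in a[k:]:                      # n - k iterations
        if x < -heap[0]:                 # smaller than the current k-th smallest
            heapq.heapreplace(heap, -x)  # O(log k)
    return -sum(heap)                    # sum of the k smallest

print(sum_k_smallest([6, -6, 3, 2, 1, 2, 0, 4, 3, 5], 3))  # -5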

Python generator time complexity log(n)

In Python 3, range is built with the help of generators.
Logarithmic Time: O(log n)
An algorithm is said to have logarithmic time complexity when it reduces the size of the input data in each step. For example, if we are printing the first 10 digits with the help of generators, first we get one element so 9 remaining elements have to be processed, then the second element so 8 remaining elements have to be processed.
for index in range(0, len(data)):
    print(data[index])
When I check the URL python generators time complexity confusion, it says O(n).
Since each step generates only one output (because we need to call __next__), it costs 1 unit of work every time.
Can I get an explanation of this?
That explanation of logarithmic time complexity is wrong.
You get logarithmic complexity if you reduce the size of the input by a constant fraction, not by a fixed amount. For instance, binary search divides the size by 2 on each iteration, so it's O(log n): if the input size is 8 it takes at most 4 iterations, and doubling the size to 16 only increases that to 5.
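A minimal binary search sketch that also counts iterations, to show the halving (the iteration counter is just for demonstration):

def binary_search(data, target):
    lo, hi = 0, len(data) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2      # halve the search range each step
        if data[mid] == target:
            return mid, iterations
        if data[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, iterations

# Doubling the input size adds roughly one iteration: O(log n).
print(binary_search(list(range(8)), 0))    # (0, 3)
print(binary_search(list(range(16)), 0))   # (0, 4)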

Creating a heap with heapify vs heappush. Which one is faster?

Question
I have to create a priority queue storing distances. To build the heap I am thinking about the following two possibilities:
from heapq import heapify, heappush

n = 35000  # input size

# way A: using heapify
dist = []
for i in range(n):
    dist.append(distance)  # distance is computed in O(1) time
heapify(dist)

# way B: using heappush
dist = []
for i in range(n):
    heappush(dist, distance)  # distance is computed in O(1) time
Which one is faster?
Reasoning
According to the docs heapify() runs in linear time, and I'm guessing heappush() runs in O(log n) time. Therefore, the running time for each way would be:
A: O(2n) = O(n)
B: O(n log n)
However, it is counterintuitive to me that A is faster than B. Am I missing something? Is A really faster than B?
EDIT:
I've been testing with different inputs and different sizes of the array, and I am still not sure which one is faster.
After reading the link in the comment by Elisha, I understand how heapify() runs in linear time. However, I still don't know if using heappush() could be faster depending on the input.
I mean, heappush() has a worst-case running time of O(log n) per push, but on average it will probably be smaller, depending on the input; its best-case running time is actually O(1). On the other hand, heapify() has a best-case running time of O(n), and it must be called after filling the array, which also takes O(n). That makes a best case of O(2n).
So heappush() could be as fast as linear or as slow as O(n log n), whereas heapify() is going to take about 2n time in any case. If we look at the worst case, heapify() will be better. But what about the average case?
Can we even be sure that one is faster than the other?
Yes, we can be certain that one is faster than the other.
heappush() builds the heap from the bottom up: each item is added to the end of the array and then "bubbled up" to its correct position. If you were building a min-heap and you presented the items in reverse order, then every item you inserted would require about log(n) comparisons (n being the current size of the heap). So the worst case for building a heap by repeated insertion is O(n log n).
Imagine starting with an empty heap and adding 127 items in reverse order (i.e. 127, 126, 125, 124, etc.). Each new item is smaller than everything already in the heap, so every item requires the maximum number of swaps to bubble up from the last position to the top. The first item added makes zero swaps. The next two items make one swap each. The next four items make two swaps each. Eight items make three swaps, 16 items make four swaps, 32 items make five swaps, and 64 items make six swaps. It works out to:
0 + 2*1 + 4*2 + 8*3 + 16*4 + 32*5 + 64*6
0 + 2 + 8 + 24 + 64 + 160 + 384 = 642 swaps
The worst case for build-heap is about n swaps. Consider that same array of 127 items. The leaf level contains 64 nodes. build-heap starts at the halfway point and works its way backwards, moving items down as required. The next-to-last level has 32 nodes that at worst will move down one level. The next level up has 16 nodes that can't move down more than two levels. If you add it up, you get:
64*0 + 32*1 + 16*2 + 8*3 + 4*4 + 2*5 + 1*6
0 + 32 + 32 + 24 + 16 + 10 + 6 = 120 swaps
That's the absolute worst case for build-heap. It's O(n).
If you profile those two algorithms on an array of, say, a million items, you'll see a huge difference in the running time, with build-heap being much faster.
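A rough timing sketch of that comparison (the array size and reverse-sorted worst-case input are arbitrary choices for illustration):

import heapq
import timeit

n = 1_000_000
data = list(range(n, 0, -1))   # reverse order: worst case for repeated heappush

def way_a():
    dist = data[:]             # copy, then heapify in place: O(n)
    heapq.heapify(dist)
    return dist

def way_b():
    dist = []
    for x in data:             # n pushes, each O(log n) in the worst case
        heapq.heappush(dist, x)
    return dist

print("heapify :", timeit.timeit(way_a, number=5))
print("heappush:", timeit.timeit(way_b, number=5))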
