Time complexity of Python Function Involving List Operations - python

When I plot the time taken by the following algorithm for inputs of different sizes, the time complexity appears to be polynomial. I'm not sure which operations account for this.
I'm assuming it's to do with list(s), del l[i] and l[::-1], but I'm not clear what the complexity of these is individually. Can anyone please explain?
Also, is there a way to optimize the algorithm without completely changing the approach? (I know there is a way to bring it down to linear time complexity by using "double-ended pincer-movement".)
def palindrome_index(s):
    for i, c in enumerate(s):
        l = list(s)
        del l[i]
        if l[::-1] == l:
            return i
    return -1

Your algorithm is indeed quadratic in len(s):
In iteration i, you perform operations that are linear in the length: creating the list, reversing it, and erasing element i (linear on average). Since you do this len(s) times, the total is quadratic in len(s).
I'm assuming it's to do with list(s), del l[i] and l[::-1], but I'm not clear what the complexity of these is individually. Can anyone please explain?
Each of these operations is linear time (at least on average, which is enough to analyze your algorithm). Constructing a list, either from an iterable or by reversing an existing list, is linear in the length of the list. Deleting element i requires about n - i shifts, as each element after position i is moved back once.
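As a quick empirical check (my own sketch, not part of the original answer), timing the function on worst-case inputs of doubling length should show the time roughly quadrupling. "abc" repeated is never one deletion away from a palindrome, so the function scans the whole string and returns -1:

import timeit

for n in (200, 400, 800):
    s = "abc" * n  # palindrome_index(s) returns -1, doing the full work
    t = timeit.timeit(lambda: palindrome_index(s), number=5)
    print(len(s), t)  # expect roughly 4x the time for 2x the length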

All of these operations are linear, O(n):
list(s)
list(s) creates a new list from s. To do that, it has to go through all elements in s, so its time is proportional to the length of s.
l[::-1]
Just like list(s), l[::-1] creates a new list with the same elements as l, but in reverse order. It has to touch each element once, so its time is proportional to the length of l.
del l[i]
In order to delete the element at position i, the element which was at position i+1 has to be moved to position i, then the element which was at i+2 has to be moved to i+1, and so on. So if you delete the first element (del l[0]), all remaining elements of the list have to be moved, while deleting the last one (del l[-1]) only requires removing it. On average, n/2 elements are moved, so deletion is also linear.
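For the optimization part of the question: here is a minimal sketch of the linear "double-ended pincer-movement" approach mentioned above (my own illustration, so treat it as a starting point; note that for a string that is already a palindrome it returns -1, whereas the original function may return a valid index):

def palindrome_index_linear(s):
    # Check whether s[lo:hi+1] is a palindrome.
    def is_palindrome(lo, hi):
        while lo < hi:
            if s[lo] != s[hi]:
                return False
            lo += 1
            hi -= 1
        return True

    # Walk inward from both ends; at the first mismatch, the only
    # candidates are deleting the left or the right character.
    lo, hi = 0, len(s) - 1
    while lo < hi:
        if s[lo] != s[hi]:
            if is_palindrome(lo + 1, hi):
                return lo
            if is_palindrome(lo, hi - 1):
                return hi
            return -1
        lo += 1
        hi -= 1
    return -1

Each character is visited only a constant number of times, so this runs in O(n).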

Related

Why does this function have exponential complexity big O instead of quadratic?

The following function has been given:
def genSubsets(L):
    res = []
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for small in smaller:
        new.append(small + extra)
    return smaller + new
From my understanding, making a copy of a list is O(n), and looping is O(n) as well, which should make this O(n^2). However, it seems that my logic is flawed and the answer is O(2^n). Why?
From my understanding, making a copy of a list is O(n)
You are correct that making a copy of a list of n items takes time O(n). And in this case, each of the lists that's being copied is a subset of the original list, which has length n, so each list copied does take time O(n).
then looping is O(n) as well
Looping over a list of length n takes time O(n). However, in this case, the lists that you're looping over do not have n elements in them. There are 2^n subsets of a set of size n, so at the top-level recursive call, when you recursively generate all subsets of L[:-1], you will end up with a list of 2^(n-1) items. Looping over that list takes time O(2^(n-1)).
More generally, when looking at a loop or a list, it's important to ask "how many times does this loop run?" or "how many elements are in this list?"
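A quick sanity check (my own sketch, assuming the genSubsets definition above): the returned list should double in length with each extra element.

for n in range(6):
    subsets = genSubsets(list(range(n)))
    print(n, len(subsets))  # prints n and 2**n
    assert len(subsets) == 2 ** n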

Space complexity for Python function

Could someone help me work out the space complexity of this Python function?
input: nums = [1, 2, 3, 4, 5, 6, 7, 8....]
m = integer
for i in range(len(nums)):
    temp = nums[i:i+m]
Should the space complexity be O(m) or O(n*m), and why? Thank you!
Not including the input, and since m doesn't seem to be a constant, it should just be O(m): at any given point in time we are only storing one chunk nums[i:i+m], because temp is reassigned a new sublist on every iteration, which makes the previous sublist eligible for garbage collection.
So even if there are 1 million nums and m is only 5, we would only be storing 5 items at a time; each iteration drops the previous 5 items and stores a new set of 5 (depending on the Python implementation, this might even reuse the same memory and overwrite the previous values), and so on.
But if you are storing each sublist such as:
temp_list = []
for i in range(len(nums)):
    temp = nums[i:i+m]
    temp_list.append(temp)
Then it should be O(m * len(nums)) because we will be storing m items for each element in nums.
Ignoring Python's garbage collection, you get a space complexity of O(mn), where n = len(nums). That's because you first allocate a list of n elements, and then allocate n lists of m elements each (note that slicing a list creates a new allocation). That gives a total of n + mn cell allocations, which is asymptotically O(mn).
But each list created in the for loop is referenced only by temp. That means that as soon as a new list is created, the previous one has no references left and becomes eligible for garbage collection. That leaves us, practically, with two lists: nums of length n, and the last temp of length m, which amounts to a space complexity of O(m + n).
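One way to see the difference empirically (my own sketch using the standard tracemalloc module; exact numbers depend on the interpreter):

import tracemalloc

nums = list(range(1_000_000))
m = 5

tracemalloc.start()
for i in range(len(nums)):
    temp = nums[i:i+m]  # the previous slice becomes garbage each iteration
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print("peak without storing:", peak)  # stays small: O(m) extra space

tracemalloc.start()
temp_list = [nums[i:i+m] for i in range(len(nums))]
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print("peak while storing:", peak)  # grows with n * m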

Longest phrase in Tweet - Python timeout

Input - array / list a, constant k
Output - Length of Longest sublist/subarray with sum <=k
E.g. given the tweet
I am Bob
i.e. the array of word lengths [1,2,3] and k=3
Sublists possible are [1],[2],[3],[1,2]
Longest sublist here is [1,2]
Length = 2
Issue - TimeOut error in Python on Hackerrank
Time Complexity - 1 for loop - O(n)
Space complexity O(n)
def maxLength(a, k):
    lenmax = 0
    dummy = []
    for i in a:
        dummy.append(i)
        if sum(dummy) <= k:
            lenmax = max(lenmax, len(dummy))
        else:
            del dummy[0]
    return lenmax
Solved it by replacing the time-intensive operation.
A time-out occurs when the code exceeds the time limit HackerRank sets for each environment ("HackerRank TimeOut").
Solution
Replace the sum() function with a running-total variable.
In the worst case, calling sum(list) on every iteration makes the function O(n^2), since the entire list may be summed each time.
Maintaining a variable instead costs O(1) per update, so the entire function becomes O(n).
def maxLength(a, k):
    lenmax = 0
    dummy = []
    sumdummy = 0
    for i in a:
        dummy.append(i)
        sumdummy += i
        if sumdummy <= k:
            lenmax = max(lenmax, len(dummy))
        else:
            sumdummy -= dummy[0]
            del dummy[0]
    return lenmax
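One caveat worth noting (my addition, not from the original answer): del dummy[0] is itself O(n), since it shifts every remaining element, so the function above can still degrade toward quadratic time. Using collections.deque, whose popleft() is O(1), avoids that; a minimal sketch:

from collections import deque

def max_length_deque(a, k):
    lenmax = 0
    dummy = deque()
    sumdummy = 0
    for i in a:
        dummy.append(i)
        sumdummy += i
        if sumdummy <= k:
            lenmax = max(lenmax, len(dummy))
        else:
            sumdummy -= dummy.popleft()  # O(1), unlike del list[0]
    return lenmax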

What is fastest way to determine numbers are within specific range of each other in Python?

I have a list of numbers as follows -
L = [ 1430185458, 1430185456, 1430185245, 1430185246, 1430185001 ]
I am trying to determine which numbers are within a range of 2 of each other. The list will be unsorted when I receive it.
If there are numbers within a range of 2 of each other, I have to return "1" at the exact position each number was received in.
I was able to achieve the desired result, but the code runs very slowly. My approach involves sorting the list, then iterating over it twice with two pointers, comparing successively. I will have millions of records coming in as separate lists.
Just trying to see what the best possible approach to this problem is.
Edit - Apologies, I was away for a while. The list can have any number of elements, from 1 to n. The idea is to return either 0 or 1 at the exact position each number was received. I cannot post the actual code I implemented, but here is pseudocode.
a. Create a new list as a list of lists, with the second element set to 0 for each entry. We assume that no numbers are within a range of 2 of each other.
[[1430185458,0], [1430185456,0], [1430185245,0], [1430185246,0], [1430185001,0]]
b. Sort the original list.
c. Compare the first element to the second, the second to the third, and so on until the end of the list is reached; whenever the difference is less than or equal to 2, update the corresponding second elements from step a to 1.
[[1430185458,1], [1430185456,1], [1430185245,1], [1430185246,1], [1430185001,0]]
The goal is to be fast, so that presumably means an O(N) algorithm. Building an NxN difference matrix is O(N^2), so that's not good at all. Sorting is O(N*log(N)), so that's out, too. Assuming average case O(1) behavior for dictionary insert and lookup, the following is an O(N) algorithm. It rips through a list of a million random integers in a couple of seconds.
def in_range(numbers):
    result = [0] * len(numbers)
    index = {}
    for idx, number in enumerate(numbers):
        # check whether any earlier number lies within +/-2 of this one
        for offset in range(-2, 3):
            match_idx = index.get(number + offset)
            if match_idx is not None:
                result[match_idx] = result[idx] = 1
        index[number] = idx
    return result
Update
I have to return "1" at exact same position number was received in.
The update to the question asks for a list of the form [[1,1],[2,1],[5,0]] given an input of [1,2,5]. I didn't do that. Instead, my code returns [1,1,0] given [1,2,5]. It's about 15% faster to produce that simple 0/1 list compared to the [[value,in_range],...] list. The desired list can easily be created using zip:
zip(numbers,in_range(numbers)) # Generator
list(zip(numbers,in_range(numbers))) # List of (value,in_range) tuples
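For example, on the sample data from the question (my own quick check, assuming the in_range() definition above):

L = [1430185458, 1430185456, 1430185245, 1430185246, 1430185001]
print(in_range(L))  # -> [1, 1, 1, 1, 0]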
I think this does what you need (process() modifies the list L in place, turning it into 0/1 flags). Very likely it's still optimizable, though:
def process(L):
    # pair each value with its original position, then sort by value
    s = [(v, k) for k, v in enumerate(L)]
    s.sort()
    # reset L to all zeros; matched positions are set to 1 below
    for k in range(len(L)):
        L[k] = 0
    j = 0
    for i, (v, _) in enumerate(s):
        # advance j past values more than 2 below v
        while j < i and v - s[j][0] > 2:
            j += 1
        # everything from s[j] up to s[i-1] is within 2 of v
        while j < i:
            L[s[j][1]] = 1
            L[s[i][1]] = 1
            j += 1
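A quick check against the sample data from the question (my own usage example, assuming the process() definition above):

L = [1430185458, 1430185456, 1430185245, 1430185246, 1430185001]
process(L)
print(L)  # -> [1, 1, 1, 1, 0]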

How should I analyze and or prove this simple sorting algorithm?

I have been racking my head trying to wrap my mind around this thing, but I cannot figure out the proper loop invariant and expected behavior of the following code. Any help would be much appreciated.
def modSwapSort(L):
    """ L is a list of integers """
    print("Original L:", L)
    for i in range(len(L)):
        for j in range(len(L)):
            if L[j] < L[i]:
                # the next line is a short form for swap L[i] and L[j]
                L[j], L[i] = L[i], L[j]
                print(L)
    print("Final L:", L)
It is a form of selection sort.
Its complexity is O(n^2).
It could be improved with some simple optimization of the for ranges.
On every step of the i-loop, the j-loop moves the least element to position i, and the prefix built so far stays sorted in descending order.
So on every step your list will look like (e.g. i=3):
[A1, A2, A3, B1, B2] where A1 > A2 > A3 and the Bs are unordered.
So, shortly: after the i-th step, the smallest element found so far sits at index i, and everything before it is sorted in descending order.
With some tuning of the for ranges you could minimize the number of comparisons and swaps.
But either way, this algorithm will have O(n^2) complexity.
English is not my native language, sorry for the typos.
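To make the invariant concrete, here is a small self-check (my own sketch, not from the answer above): after each pass of the outer loop, the prefix L[:i+1] should be sorted in descending order, which also means the final list comes out in descending order.

def mod_swap_sort_checked(L):
    for i in range(len(L)):
        for j in range(len(L)):
            if L[j] < L[i]:
                L[j], L[i] = L[i], L[j]
        # loop invariant: the prefix is sorted in descending order
        prefix = L[:i + 1]
        assert prefix == sorted(prefix, reverse=True)
    return L

print(mod_swap_sort_checked([3, 1, 4, 1, 5, 9, 2, 6]))  # prints the list in descending order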
