What is the time complexity of this __setitem__ method for an empty dictionary? I think the ._key and ._value accesses are O(n) (I read that somewhere), and the for loop is also O(n). My guess is O(n) * O(n) * O(n) + O(1) = for loop + if + if body + append. But I'm not sure about the "if inside the for loop body" situation, or about ._key and ._value being O(n).
Please help. It was on my school test. The code is in Python.
def __setitem__(self, k, v):
    for item in self._table:
        if k == item._key:
            item._value = v
            return
    self._table.append(self._Item(k, v))
It is O(len(self._table)) (assuming constructing self._Item is O(1)) because in the worst case you need to check every element in the self._table object.
Both the if statement and its body will be O(1) in terms of the input because they're atomic operations. Thus, the for-loop's complexity is O(n) * O(1) * O(1) == O(n) where n is the size of self._table.
append on a list is amortized O(1), but if you reach this operation you've already done O(n) work scanning the table, so the method is O(n) overall.
The complexity is O(n). When analyzing complexity, you always imagine feeding very large inputs to the algorithm. For large n, O(2n) is the same as O(n) (the constant factor doesn't make a "big" difference and is dropped in big-O notation), whereas O(n) versus O(n^2) makes a huge difference in computational cost. Since you only loop through your list once, which is O(n), and there is no nested loop or other expensive operation on the list inside it, the overall complexity stays O(n).
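For context, here is a minimal sketch of the kind of class this method typically lives in. The class name UnsortedTableMap and the exact _Item layout are assumptions inferred from the names self._table and self._Item in the question:

class UnsortedTableMap:
    class _Item:
        def __init__(self, k, v):
            self._key = k
            self._value = v

    def __init__(self):
        self._table = []              # list of _Item objects

    def __setitem__(self, k, v):
        for item in self._table:      # worst case: scan all n items -> O(n)
            if k == item._key:
                item._value = v       # O(1): overwrite existing value
                return
        self._table.append(self._Item(k, v))  # amortized O(1) append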
Related
My textbook says that the following algorithm has an efficiency of O(n):
list = [5, 8, 4, 5]

def check_for_duplicates(list):
    dups = []
    for i in range(len(list)):
        if list[i] not in dups:
            dups.append(list[i])
        else:
            return True
    return False
But why? I ask because the in operation has an efficiency of O(n) as well (according to this resource). If we take list as an example, the program needs to iterate four times over the list. But with each iteration, dups keeps growing. So for the first iteration over list, dups has no elements yet; for the second it has one element, for the third two, and for the fourth three. Wouldn't that make 1 + 2 + 3 = 6 extra iterations for the in operation on top of the list iterations? And if this is true, wouldn't it alter the efficiency significantly, since the sum of the extra iterations grows with every iteration?
You are correct that the runtime of the code that you've posted here is O(n^2), not O(n), for precisely the reason that you've indicated.
Conceptually, the algorithm you're implementing goes like this:
Maintain a collection of all the items seen so far.
For each item in the list:
If that item is in the collection, report a duplicate exists.
Otherwise, add it to the collection.
Report that there are no duplicates.
The reason the code you've posted here is slow is because the cost of checking whether a duplicate exists is O(n) when using a list to track the items seen so far. In fact, if you're using a list of the existing elements, what you're doing is essentially equivalent to just checking the previous elements of the array to see if any of them are equal!
You can speed this up by switching your implementation so that you use a set to track prior elements rather than a list. Sets have (expected) O(1) lookups and insertions, so this will make your code run in (expected) O(n) time overall.
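A minimal sketch of that set-based version (the parameter is renamed to lst here to avoid shadowing the built-in list):

def check_for_duplicates(lst):
    seen = set()
    for x in lst:
        if x in seen:        # expected O(1) membership test
            return True
        seen.add(x)          # expected O(1) insertion
    return False

print(check_for_duplicates([5, 8, 4, 5]))  # True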
I have designed an algorithm but am confused about whether the time complexity is Θ(n) or Θ(n^2).
def prefix_soms(number):
    Array_A = [1, 2, 3, 4, 5, 6]
    Array_B = []
    soms = 0
    for i in range(0, number):
        soms = soms + Array_A[i]
        Array_B.insert(i, soms)
    return Array_B[number-1]
I know the for loop is running n times so that's O(n).
Are the operations inside it O(1)?
For arbitrarily large numbers it is not, since adding two huge numbers takes time logarithmic in the value of those numbers. If we assume the sum does not grow out of control, then we can say it runs in O(n). The .insert(…) here is basically just an .append(…), and the amortized cost of appending n items is O(n).
We can however improve the readability, and the memory usage, by writing this as:
def prefix_soms(number):
    Array_A = [1, 2, 3, 4, 5, 6]
    soms = 0
    for i in range(0, number):
        soms += Array_A[i]
    return soms
or:
def prefix_soms(number):
    Array_A = [1, 2, 3, 4, 5, 6]
    return sum(Array_A[:number])
or we can omit creating a copy of the list, by using islice(..):
from itertools import islice

def prefix_soms(number):
    Array_A = [1, 2, 3, 4, 5, 6]
    return sum(islice(Array_A, number))
We thus do not need to use another list, since we are only interested in the last item.
Given that the insert method doesn't shift your array here (in your algorithm it only ever appends one element to the end of the list), its time complexity is O(1). Moreover, accessing an element by index takes O(1) time as well.
You run the loop number times, with some O(1) operations inside: O(number) * O(1) = O(number).
The complexity of list.insert is O(n), as shown on this wiki page. You can check the blist library which provides an optimized list type that has an O(log n) insert, but in your algorithm I think that the item is always placed at the end of the Array_B, so it is just an append which takes constant amortized time (You can replace the insert with append to make the code more elegant).
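For example, a sketch of the same routine with append in place of insert (the behaviour is unchanged, since the element is always placed at the end):

def prefix_soms(number):
    Array_A = [1, 2, 3, 4, 5, 6]
    Array_B = []
    soms = 0
    for i in range(0, number):
        soms += Array_A[i]
        Array_B.append(soms)      # amortized O(1) per append
    return Array_B[number - 1]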
I have been given a small and simple function to refactor into a function of O(n) complexity. However, I believe the function given already is O(n), unless I am missing something.
Basically the idea of the function is simply to iterate over a list and remove the target item.
for i in self.items:
    if i == item:
        self.items.pop(i)
I know the for loop gives this a O(n) complexity but does the additional if statement add to the complexity? I didn't think it did in the worst-case for this simple piece of code.
If it does, is there a way this can be re-written to be O(n)?
I cannot think of another way to iterate over a list and remove an item without using a for loop and an if statement to do the comparison.
PS. self.items is a list of words
The list.pop method, when given an index, has an average and worst-case time complexity of O(n), so compounded with the loop, the code is O(n^2) in time complexity.
And as @juanpa.arrivillaga has already pointed out in the comments, you can instead use a list comprehension to filter out items of a specific value in O(n) time:
self.items = [i for i in self.items if i != item]
for i in self.items:           # grows with the cardinality n of self.items
    if i == item:
        self.items.pop(i)      # grows with the cardinality n of self.items
So you have a complexity of O(n²).
The list method remove(item) in Python, though, has complexity O(n), so you'd prefer to use it:
self.items.remove(item)
Your current solution has the time complexity of O(n^2).
For an O(n) solution you can, for example, use a list comprehension to filter all the unwanted elements from the list:
self.items = [i for i in self.items if i != item]
First, you'd have to slightly adjust your code. The argument given to pop() is currently an item, where it should be an index (use remove() to remove the first occurrence of an item).
for i, item in enumerate(self.items):
    if item == target:
        self.items.pop(i)
The complexity depends on how often item matches the element in your list.
Using n = len(items) and k for the number of matches, the complexity is O(n) + k * O(n). The first term comes from the fact that we iterate through the entire list; the second corresponds to the individual .pop(i) operations.
The total complexity for k=1 is thus simply O(n), and can go up to O(n^2) for k=n.
Note that the complexity for .pop(i) is O(n-i). Hence, popping the first and last element is O(n) and O(1), respectively.
Lastly, I'd generally recommend not to add/remove items to an object that you're currently iterating over.
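One safe pattern, equivalent to the list-comprehension answers above, is to build a new list inside the same method instead of mutating the one being iterated over (a sketch, assuming target is the word to remove):

kept = []
for word in self.items:
    if word != target:        # keep everything that is not the target
        kept.append(word)
self.items = kept             # single O(n) pass, no O(n) pops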
The question explains it, but what is the time complexity of the set difference operation in Python?
EX:
A = set([...])
B = set([...])
print(A.difference(B)) # What is the time complexity of the difference function?
My intuition tells me O(n) because we can iterate through set A and for each element, see if it's contained in set B in constant time (with a hash function).
Am I right?
(Here is the answer that I came across: https://wiki.python.org/moin/TimeComplexity)
It looks like you're right: difference is performed with O(n) complexity in the average case.
But keep in mind that in the worst case (hash collisions maximized) it can rise to O(n^2), since a single lookup is then O(n): How is set() implemented? In general, though, you can rely on O(1) lookups.
As an aside, speed depends on the type of object in the set. Integers hash well (roughly as themselves, with probably some modulo), whereas strings need more CPU.
https://wiki.python.org/moin/TimeComplexity suggests that it's O(cardinality of set A) in the example you described.
My understanding is that it's O(len(A)) and not O(len(B)) because you only need to check whether each element of A is present in B. Each lookup in B is O(1), so you do len(A) * O(1) lookups on B. Since O(1) is constant, the total is O(len(A)).
Eg:
A = {1,2,3,4,5}
B = {3,5,7,8,9,10,11,12,13}
A-B = {1,2,4}
When A-B is called, iterate through every element in A (only 5 elements), and check for membership in B. If not found in B, then it will be present in the result.
Note: of course all of this is average-case complexity. In practice, each lookup in B could be more than O(1).
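Conceptually, A.difference(B) behaves like the following sketch (not CPython's actual C implementation):

def set_difference(A, B):
    result = set()
    for x in A:              # len(A) iterations
        if x not in B:       # expected O(1) hash lookup in B
            result.add(x)
    return result

A = {1, 2, 3, 4, 5}
B = {3, 5, 7, 8, 9, 10, 11, 12, 13}
print(set_difference(A, B))  # {1, 2, 4}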
I see a lot of questions about the run-time complexity of python's built in methods, and there are a lot of answers for a lot of the methods (e.g. https://wiki.python.org/moin/TimeComplexity , https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt , Cost of len() function , etc.)
What I don't see is anything that addresses enumerate. I know it returns at least one new array (the indexes), but how long does it take to generate that, and is the other array just the original array?
In other words, I'm assuming it's O(n) for creating a new array (iteration) and O(1) for the reuse of the original array, so O(n) in total (I think). Is there another O(n) for the copy, making it O(n^2), or something else?
The enumerate-function returns an iterator. The concept of an iterator is described here.
Basically this means that the iterator gets initialized pointing to the first item of the list and then returning the next element of the list every time its next() method gets called.
So the complexity should be:
Initialization: O(1)
Returning the next element: O(1)
Returning all elements: n * O(1)
Please note that enumerate does NOT create a new data structure (list of tuples or something like that)! It is just iterating over the existing list, keeping the element index in mind.
You can try this out by yourself:
# First, create a list containing a lot of entries:
# (O(n) - executing this line should take some noticeable time)
a = [str(i) for i in range(10000000)]  # a = ["0", "1", ..., "9999999"]

# Then call the enumerate function for a.
# (O(1) - executes very fast because that's just the initialization of the iterator.)
b = enumerate(a)

# Use the iterator.
# (O(n) - retrieving the next element is O(1) and there are n elements in the list.)
for i in b:
    pass  # do nothing
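Conceptually, enumerate behaves roughly like this pure-Python generator (a simplified sketch, not CPython's actual implementation):

def my_enumerate(iterable, start=0):
    index = start
    for element in iterable:      # lazily walk the existing sequence
        yield index, element      # no copy of the underlying list is made
        index += 1

for i, value in my_enumerate(["a", "b", "c"]):
    print(i, value)               # 0 a, 1 b, 2 c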
Assuming the naïve approach (enumerate duplicates the array, then iterates over it), you have O(n) time for duplicating the array, then O(n) time for iterating over it. If that were just n instead of O(n), you would have 2 * n time total, but that's not how O(n) works; all you know is that the amount of time taken will be some multiple of n. That's (basically) what O(n) means anyway, so in any case the enumerate function is O(n) time total.
As martineau pointed out, enumerate() does not make a copy of the array. Instead it returns an object which you use to iterate over the array. The call to enumerate() itself is O(1).