Input: A list of positive integers where one entry occurs exactly once, and all other entries occur exactly twice (for example [1,3,2,5,3,4,1,2,4])
Output: The unique entry (5 in the above example)
The following algorithm is supposed to be O(m) time and O(1) space where m is the size of the list.
def get_unique(intlist):
    unique_val = 0
    # XOR-ing a value with itself gives 0, so paired entries cancel out
    for value in intlist:
        unique_val ^= value
    return unique_val
My analysis: Given a list of length m there will be (m + 1)/2 unique positive integers in the input list, so that the smallest possible maximum integer in the list will be (m+1)/2. If we assume this best case, then when taking an XOR sum the variable unique_val will require ceiling(log((m+1)/2)) bits in memory, so I thought the space complexity should be at least O(log(m)).
Your analysis is certainly one correct answer, particularly in a language like Python which gracefully handles arbitrarily large numbers.
It's important to be clear about what you're trying to measure when thinking about space and time complexity. A reasonable assumption might be that the size of an integer is constant (e.g. you're using 64-bit integers). In that case, the space complexity is certainly O(1), but the time complexity is still O(m).
Now, you could also argue that using a fixed-size integer means you have a constant upper-bound on the size of m, so perhaps the time complexity is also O(1). But in most cases where you need to analyze the running time of this sort of algorithm, you're probably very interested in the difference between a list of length 10 and one of length 1 billion.
I'd say it's important to clarify and state your assumptions when analyzing space- and time-complexity. In this case, I would assume we have a fixed size integer and a value of m much smaller than the maximum integer value. In that case, O(1) space and O(m) time are probably the best answers.
EDIT (based on discussion in other answers)
Since all m gives you is a lower bound on the maximum value in the list, you really can't provide a worst-case estimate of the space, i.e. a number in the list can be arbitrarily large. To have any reasonable answer as to the space complexity of this algorithm, you need to make some assumption about the maximum size of the input values.
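To make the dependence on the input values concrete, here is a minimal sketch that tracks how many bits the XOR accumulator ever needs. The helper name get_unique_with_width is hypothetical and not part of the question. For non-negative integers the accumulator never needs more bits than the widest element, so its size is governed by the largest value in the list rather than by m.

```python
def get_unique_with_width(intlist):
    """Return (unique value, widest bit length the accumulator reached)."""
    unique_val = 0
    widest = 0
    for value in intlist:
        unique_val ^= value
        # int.bit_length() is the number of bits needed to represent the value
        widest = max(widest, unique_val.bit_length())
    return unique_val, widest


# Small values: the accumulator stays small.
small_result = get_unique_with_width([1, 3, 2, 5, 3, 4, 1, 2, 4])

# Short list with one huge paired value: the accumulator is briefly huge too.
huge = 1 << 1000
huge_result = get_unique_with_width([huge, 7, huge])
```

The second call uses a list of only three elements, yet the accumulator temporarily holds a 1001-bit number, which is why a bound on the values is needed before stating a space complexity.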
The (space/time) complexity analysis is usually applied to algorithms on a higher level. While you can drop down to specific language implementation level, it may not be useful in all cases.
Your analysis is both right and possibly wrong. It's right for the current CPython implementation, where integers do not have a maximum value. It's also fine if all your integers are relatively small and fit into the implementation-specific case of small numbers.
But it doesn't have to be valid for all other implementations of Python. For example, you could have an optimizing implementation which figures out that intlist is not used again and, instead of using unique_val, reuses the space of the consumed list elements (basically transforming this function into a space-optimized reduce call).
Then again, can we even talk about space complexity in a GC'd language with allocated integers? Your analysis of the complexity is wrong, because a ^= b will allocate new memory for big value b and the size of that depends on the system, architecture, python version, and luck.
Your original question is however "Why is the following algorithm O(1) space?". If you look at the algorithm itself and assume you have some arbitrary maximum integer limits, or your language can represent any number in a limited space, then the answer is yes. The algorithm itself with those conditions uses constant space.
The complexity of an algorithm is always dependent on the machine model (= platform) you use. E.g. we often say that multiplying and dividing IEEE floating point numbers is of run-time complexity O(1) - which is not always the case (e.g. on an 8086 processor without FPU).
For the above algorithm, the space complexity O(1) only holds as long as your input list has no element > 2147483647 (= sys.maxint on a 32-bit Python 2 build). There, Python stores plain integers as signed 32-bit values. For those datatypes, your processor has all relevant operations already implemented in hardware, and it generally takes only a constant number of clock cycles (in most cases only one) to perform them (= run-time complexity O(1)) and only a constant number of memory addresses (only one) is occupied to store the result (= space complexity O(1)).
However, if your input exceeds that limit, Python uses a software-implemented datatype to store these big integers. Operations on these are no longer O(1), and they require more than constant O(1) space.
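As a rough illustration of the point about big integers, the sketch below measures the size of integer objects with sys.getsizeof. It assumes CPython 3, where sys.maxint no longer exists and every int is arbitrary-precision; the exact byte counts are implementation details, so only the relative growth is meaningful.

```python
import sys


def int_size_bytes(value):
    """Size of the integer object in bytes, as reported by the interpreter."""
    return sys.getsizeof(value)


limit = 2**31 - 1        # 2147483647, the limit mentioned above
big = 2**1000            # far beyond any machine word

size_at_limit = int_size_bytes(limit)
size_big = int_size_bytes(big)
```

On CPython, size_big is noticeably larger than size_at_limit, which matches the claim that storage is no longer constant once values outgrow a machine word.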
Related
I have the following function, that checks for the first duplicate value in an array.
Example: an array input of [1, 2, 3, 3, 4, 4] would return 3.
I believe the space complexity is O(1) since the total space being used remains constant. Is this correct?
#o(n) time.
#o(1) space or o(n) space?
def firstDuplicateValue(array):
    found = set()
    while len(array) > 0:
        x = array.pop(0) #we remove elements from the array as we add them to the set
        if x in found:
            return x
        else:
            found.add(x)
    return -1
Space complexity
I have to answer "no, the space complexity is not O(1)" for two reasons. The first is theoretical and pedantic, the second is more practical.
We typically consider space complexity to be the extra space which your program needs access to in addition to the input it is given.
Some programs read the input but are not allowed to modify it. For these programs, your question isn't raised and the space complexity is exactly the space taken by all the extra variables and data structures created in the program.
Some programs are explicitly allowed to modify the input. Still, for those programs, the space complexity only counts the extra space, not the space already in the input. For these programs it is quite possible to have a space complexity of O(log(n)). A space complexity of O(1) is unheard of in theory, because to iterate over an input consisting of an array of n elements, you need a counter variable that can count up to n, which requires log(n) bits. So, when people say O(1) in practice, they probably mean O(log(n)) in theory. This is my first reason to answer no: if we're being pedantic, the space complexity cannot be better than O(log(n)) in theory.
Your program modifies the input array by deleting elements from it, and stores these elements in an extra data structure found. In theory, this new data structure could fit exactly in the space liberated from the input array, and if that were the case, you would be right that the complexity of your algorithm is better than O(n). In practice, it looks shady because:
there is no disclaimer in your function that warns the user that it might destroy the input;
there is no guarantee by the python interpreter that array.pop() will really free space on the computer.
In practice, Python interpreters typically do not free space when using array.pop() until you've popped about half the values in the array. This is my second reason to say no: you need to pop at least n/2 values from the input array before any space will be freed. Before that happens, you will have used n/2 space in your extra data structure found. Hence the space complexity of your function will not be better than O(n) with a standard Python interpreter.
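One way to observe this behaviour is to record the size of the list object after every pop. This is only a sketch and assumes CPython, where sys.getsizeof on a list includes its allocated slots; other interpreters may behave differently, and the helper name sizes_while_popping is made up for the example.

```python
import sys


def sizes_while_popping(n):
    """Return the list's size in bytes before popping and after each pop()."""
    data = list(range(n))
    sizes = [sys.getsizeof(data)]
    while data:
        data.pop()
        sizes.append(sys.getsizeof(data))
    return sizes


sizes = sizes_while_popping(1000)

# Number of pops after which the reported size actually changed.
resizes = sum(1 for before, after in zip(sizes, sizes[1:]) if before != after)
```

If the interpreter shrinks the list lazily, resizes is far smaller than the 1000 pops performed, and the first entries of sizes are all equal.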
Time complexity
Your first comment in the code says #o(n) time. But that is not correct, for two reasons. First, don't confuse o() and O(), which are very different. Second, this function is O(n^2), not O(n). This is because of the repeated use of array.pop(0). As a good rule of thumb: never use list.pop(0) in Python. Use list.pop() if you can, or find some other way.
list.pop(0) is terribly inefficient, as it removes the first element, then moves every other element one space up to fill the gap. Thus, every call to list.pop(0) has O(n) time complexity, and your function makes up to n calls to list.pop(0), so the function has O(n^2) time complexity.
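For comparison, here is one possible rewrite that avoids list.pop(0) entirely. It is a sketch rather than the asker's code: the name first_duplicate_value is an assumption, and it deliberately leaves the input untouched. It runs in O(n) expected time and still uses O(n) extra space for the set.

```python
def first_duplicate_value(array):
    """Return the first value seen twice while scanning left to right, or -1."""
    seen = set()
    for x in array:
        # Set membership and insertion are O(1) on average
        if x in seen:
            return x
        seen.add(x)
    return -1


values = [1, 2, 3, 3, 4, 4]
result = first_duplicate_value(values)
```

If elements really must be consumed from the front, collections.deque offers popleft(), which is O(1) instead of O(n).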
People coming from other programming languages to Python often ask how they should pre-allocate or initialize their list. This is especially true for people coming from Matlab, where code such as
l = []
for i = 1:100
    l(end+1) = 1;
end
triggers a warning that explicitly suggests initializing the list.
There are several posts on SO explaining (and showing through tests) that list initialization isn't required in python. A good example with a fair bit of discussion is this one (but the list could be very long): Create a list with initial capacity in Python
The other day, however, while looking up the complexity of operations in Python, I stumbled upon this sentence on the official Python wiki:
the largest [cost for list operations] come from growing beyond the current allocation size (because everything must move),
This seems to suggest that lists do indeed have a pre-allocation size and that growing beyond that size causes the whole list to move.
This shook my foundations a bit. Can list pre-allocation reduce the overall complexity (in terms of number of operations) of a code? If not, what does that sentence mean?
EDIT:
Clearly my question regards the (very common) code:
container = ... #some iterable with 1 gazillion elements
new_list = []
for x in container:
    ... #do whatever you want with x
    new_list.append(x) #or something computed using x
In this case the compiler cannot know how many items there are in container, so new_list could potentially require its allocated memory to change an incredible number of times if what is said in that sentence is true.
I know that this is different for list comprehensions.
Can list pre-allocation reduce the overall complexity (in terms of number of operations) of a code?
No, the overall time complexity of the code will be the same, because the time cost of reallocating the list is O(1) when amortised over all of the operations which increase the size of the list.
If not, what does that sentence mean?
In principle, pre-allocating the list could reduce the running time by some constant factor, by avoiding multiple re-allocations. This doesn't mean the complexity is lower, but it may mean the code is faster in practice. If in doubt, benchmark or profile the relevant part of your code to compare the two options; in most circumstances it won't matter, and when it does, there are likely to be better alternatives anyway (e.g. NumPy arrays) for achieving the same goal.
new_list could potentially require its allocated memory to change an incredible number of times
List reallocation follows a geometric progression, so if the final length of the list is n then the list is reallocated only O(log n) times along the way; not an "incredible number of times". The way the maths works out, the average number of times each element gets copied to a new underlying array is a constant regardless of how large the list gets, hence the O(1) amortised cost of appending to the list.
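The number of reallocations can be estimated empirically by watching when the reported size of the list changes during a run of appends. This is a sketch that assumes CPython, where sys.getsizeof reflects the currently allocated capacity; count_reallocations is a hypothetical helper name.

```python
import sys


def count_reallocations(n):
    """Append n items and count how often the list's allocated size changes."""
    items = []
    last_size = sys.getsizeof(items)
    reallocations = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying array was grown to a new capacity
            reallocations += 1
            last_size = size
    return reallocations


reallocations = count_reallocations(100_000)
```

For 100,000 appends the count is tiny compared with the number of appends, in line with the O(log n) reallocations described above.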
I see two sentences:
total amortized cost of a sequence of operations must be an upper bound on the total actual cost of the sequence
When assigning amortized costs to operations on a data structure, you need to ensure that, for any sequence of operations performed, that the sum of the amortized costs is always at least as big as the sum of the actual costs of those operations.
I have two questions:
A) Do both of them mean that amortized cost >= real cost of an operation? I thought the amortized cost was (n * real cost).
B) Is there a real, short example that would make this clearer to me?
The problem that amortization solves is that common operations may trigger occasional slow ones. Therefore if we add up the worst cases, we are effectively looking at how the program would perform if garbage collection is always running and every data structure had to be moved in memory every time. But if we ignore the worst cases, we are effectively ignoring that garbage collection sometimes does run, and large lists sometimes do run out of allocated space and have to be moved to a bigger bucket.
We solve this by gradually writing off occasional big operations over time. We write it off as soon as we realize that it may be needed some day. Which means that the amortized cost is usually bigger than the real cost, because it includes that future work, but occasionally the real cost is way bigger than the amortized. And, on average, they come out to around the same.
The standard example people start with is a list implementation where we allocate 2x the space we currently need, and then reallocate and move it when we use up that space. When I run foo.append(...) in this implementation, usually I just insert. But occasionally I have to copy the whole large list. However, if I just copied and the list had n items, after I append n times I will need to copy 2n items to a bigger space. Therefore my amortized analysis of what it costs to append includes the cost of an insert and of moving 2 items. Over the next n times I call append, my estimate exceeds the real cost n-1 times and is less the nth time, but averages out exactly right.
(Python's real list implementation works like this except that the new list is around 9/8 the size of the old one.)
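The doubling scheme can be simulated without any real memory management by counting how many element copies the reallocations would cause. This is a toy model under the stated 2x growth assumption, starting from a capacity of 1; it is not how CPython's list is implemented.

```python
def simulate_doubling_appends(n):
    """Count element copies caused by reallocation over n appends."""
    capacity = 1
    size = 0
    copies = 0
    for _ in range(n):
        if size == capacity:
            # Full: move every existing element into a buffer twice as large
            copies += size
            capacity *= 2
        size += 1
    return copies


copies_for_8 = simulate_doubling_appends(8)
```

In this model the total number of copies stays below 2n for every n, so charging each append a constant amount covers all the reallocation work, even though an individual append occasionally costs O(n).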
I need a data structure to store positive (not necessarily integer) values. It must support the following two operations in sublinear time:
Add an element.
Remove the largest element.
Also, the largest key may scale as N^2, N being the number of elements. In principle, having an O(N^2) space requirement wouldn't be a big problem, but if a more storage-efficient option exists, it would work better.
I am working in Python, so if such a data structure exists, it would be of help to have an implementation in this language.
There is no such data structure. For example, if there were, sorting would be worst-case linear time: add all N elements in O(N) time, then remove the largest element remaining N times, again in total O(N) time.
The best data structure you can choose for these operations is the heap: https://www.tutorialspoint.com/python_data_structure/python_heaps.htm#:~:text=Heap%20is%20a%20special%20tree,is%20called%20a%20max%20heap.
With this data structure, both adding an element and removing the max are O(log(n)).
This is the most commonly used data structure when you need a lot of operations on the max element; for example, it is commonly used to implement priority queues.
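In Python, the standard library's heapq module provides a min-heap on top of a plain list. A max-heap can be sketched by storing negated values, which works here because the values are numeric. The MaxHeap class and its method names below are illustrative, not an existing API.

```python
import heapq


class MaxHeap:
    """Max-heap of numbers built on heapq by storing negated values."""

    def __init__(self):
        self._items = []

    def add(self, value):
        # O(log n): push the negated value onto the min-heap
        heapq.heappush(self._items, -value)

    def pop_largest(self):
        # O(log n): the smallest negated value is the largest original value
        return -heapq.heappop(self._items)

    def __len__(self):
        return len(self._items)


heap = MaxHeap()
for value in (3.5, 10, 1.25):
    heap.add(value)
largest = heap.pop_largest()
second = heap.pop_largest()
```

Space is O(n) in the number of stored elements, independent of how large the keys are.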
Although constant time may be impossible, depending on your input constraints, you might consider a y-fast-trie, which has O(log log m) time operations and O(n) space, where m is the range, although they work with integers, taking advantage of the bit structure. One of the supported operations is next higher or lower element, which could let you keep track of the highest when the latter is removed.
I have a nested r-tree like datastructure in Python (list of lists). The key is a large number (about 10 digits). On each level there are about x number of items (eg:10) in the list. Then within each list, it recurses and has x items and so on. The height of the tree is h levels (eg: 5). Each level also has an indication of what range of keys it contains (like r-tree).
For a given key, I need to locate the corresponding entry in the tree. This can be trivially done by scanning through each level, check if the given key lies within the range. If so, then step into that layer and recurse till it reaches the leaf.
This can also be done by successively dividing the key by x and taking the quotient as list index.
So the question is, what is more efficient: walking through the list sequentially (complexity = depth * x (eg: 50)) or successively dividing the large number by x to get the actual list indices (complexity = h divisions (eg: 5 divisions))?
(i.e.) 50 range checks or 5 divisions?
This needs to be scalable. So if this code is being accessed in the cloud by a very large number of users, which is more efficient? Maybe division is more expensive to perform at scale than range checks?
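The division approach can be sketched as extracting one base-x digit per level. This assumes, purely for illustration, that each level splits its key range into x equal parts so that the digits of the key map directly to list indices; the function name level_indices is hypothetical and the real tree layout may differ.

```python
def level_indices(key, x, h):
    """Return h list indices (root first) taken from the base-x digits of key."""
    indices = []
    for _ in range(h):
        # divmod gives the quotient for the next round and the current digit
        key, digit = divmod(key, x)
        indices.append(digit)
    # Digits come out least significant first; the root uses the most significant
    indices.reverse()
    return indices


path = level_indices(12345, 10, 5)
```

Only the lowest h digits are used in this sketch, so a 10-digit key with h = 5 would need a different mapping between key ranges and levels.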
You need to benchmark the code in a somewhat realistic scenario.
The reason why it's so hard to say is that you are not just comparing division (by the way, modern compilers avoid divisions with a large number of tricks). On modern CPUs you have large caches so likely the list will fit into L2 or L3 which decreases the run-time dramatically. There's also the fancy vector/SIMD instructions that might be used to speed up all the checks in the linear case.
I would guess that going through the list sequentially will be faster; in addition, the code will be simpler.
But don't take my word for it, take a real example and benchmark the two versions and pick based on the results. Especially if this is critical for your system's performance.
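A benchmark along these lines could start from something like the following sketch, which compares a range scan with a division for a single level using the standard timeit module. The constants X and WIDTH, the equal-width ranges, and the function names are all assumptions made for the example; a real test should use the actual tree and realistic keys.

```python
import timeit

X = 10          # items per level (assumed)
WIDTH = 1000    # width of each item's key range (assumed, equal widths)
RANGES = [(i * WIDTH, (i + 1) * WIDTH) for i in range(X)]


def find_by_scan(key):
    """Linear scan: return the index of the range containing key, or -1."""
    for index, (low, high) in enumerate(RANGES):
        if low <= key < high:
            return index
    return -1


def find_by_division(key):
    """Division: compute the index directly, or -1 if out of range."""
    index = key // WIDTH
    return index if 0 <= index < X else -1


def compare(key=9500, number=100_000):
    """Return (scan_seconds, division_seconds) for repeated lookups of key."""
    scan = timeit.timeit(lambda: find_by_scan(key), number=number)
    division = timeit.timeit(lambda: find_by_division(key), number=number)
    return scan, division
```

Calling compare() returns the two timings in seconds. In pure Python, interpreter overhead tends to dominate both variants, so the numbers say little about how the same choice would behave in a compiled language.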