What is the time complexity of these three solutions? - python

I have these three solutions to a Leetcode problem and do not really understand the difference in time complexity here. Why is the last function twice as fast as the first one?
68 ms:
def numJewelsInStones(J, S):
    count = 0
    for s in S:
        if s in J:
            count += 1
    return count

40 ms:
def numJewelsInStones(J, S):
    return sum(s in J for s in S)

32 ms:
def numJewelsInStones(J, S):
    return len([x for x in S if x in J])

Why is the last function twice as fast as the first one?
The analytical time complexity in big O notation looks the same for all three; however, each is subject to a constant factor. That is, O(n) really means O(c*n), but c is ignored by convention when comparing time complexities.
Each of your functions has a different c. In particular:
loops in general are slower than generator expressions;
sum of a generator is likely executed in C code (the sum part, i.e. the adding of the numbers);
len is a single constant-time lookup of the length stored on the list, whereas sum takes n add operations.
Thus c(for) > c(sum) > c(len), where c(f) is the hypothetical fixed-overhead measurement of function/statement f.
You could check my assumptions by disassembling each function.
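For example, using the dis module from the standard library (a quick sketch; the exact bytecode varies by Python version):

import dis

def numJewelsInStones(J, S):
    return sum(s in J for s in S)

# Prints the bytecode the interpreter executes for this function.
dis.dis(numJewelsInStones)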
Other than that, your measurements are likely influenced by variation due to other processes running on your system. To remove these influences from your analysis, take the average of execution times over at least 1000 calls to each function (you may find that c is smaller than this variation, though I don't expect that).
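A minimal timing harness along those lines (v1, v2 and v3 are stand-ins for your three versions, and the test data is made up):

import timeit

J, S = "abc", "aAbBcC" * 1000

def v1(J, S):
    count = 0
    for s in S:
        if s in J:
            count += 1
    return count

def v2(J, S):
    return sum(s in J for s in S)

def v3(J, S):
    return len([x for x in S if x in J])

# Average over many calls to smooth out noise from other processes.
for f in (v1, v2, v3):
    print(f.__name__, timeit.timeit(lambda: f(J, S), number=1000))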
What is the time complexity of these functions?
Note that while all three functions are structurally alike, their big O time complexity differs depending on the data type you use for J and S. If J, S are of type:
dict, the complexity of your functions will be in O(n);
set, the complexity of your functions will be in O(n);
list, the complexity of your functions will be in O(n*m), where n and m are the sizes of the J and S variables, respectively. Note that if n ~ m this effectively turns into O(n^2). In other words: don't use a list.
Why is the data type important? Because Python's in operator is really just a proxy to membership testing implemented for a particular type. Specifically, dict and set membership testing works in O(1) that is in constant time, while the one for list works in O(n) time. Since in the list case there is a pass on every member of J for each member of S, or vice versa, the total time is in O(n*m). See Python's TimeComplexity wiki for details.
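A quick illustration of the difference (timings are indicative only):

import timeit

J_list = list("abcdefghijklmnopqrstuvwxyz")
J_set = set(J_list)

# Linear scan through the list vs. a constant-time hash lookup.
print(timeit.timeit("'z' in J_list", globals=globals(), number=100_000))
print(timeit.timeit("'z' in J_set", globals=globals(), number=100_000))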

With time complexity, big O notation describes how the solution grows as the input set grows. In other words, how they are relatively related. If your solution is O(n) then as the input grows then the time to complete grows linearly. More concretely, if the solution is O(n) and it takes 10 seconds when the data set is 100, then it should take approximately 100 seconds when the data set is 1000.
Your first solution is O(n). We know this because of the for loop, for s in S, which iterates through the entire data set once. The check s in J, assuming J is a set or a dictionary, will likely be constant time, O(1); the reasoning behind this is a bit beyond the scope of the question. As a result, the first solution overall is O(n), linear time.
The nuanced differences in time between the other solutions are very likely negligible if you ran your tests on multiple data sets and averaged them out, accounting for startup time and other factors that impact the results. Additionally, big O notation discards coefficients, so for example O(3n) ~= O(n).
You'll notice that all of the other solutions follow the same concept: loop over the entire collection and check for existence in the set or dict. As a result, all of these solutions are O(n). The differences in time can be attributed to other processes running at the same time, the fact that some of the built-ins used are pure C, and insufficient testing.

Well, the second function is faster than the first because it uses a generator expression instead of an explicit loop. The third function is faster than the second because the second has to add up the generator's output value by value, while the third just builds a list and takes its length, which is a constant-time operation.

Converting recursive function to completely iterative function without using extra space

Is it possible to convert a recursive function like the one below to a completely iterative function?
def fact(n):
    if n <= 1:
        return
    for i in range(n):
        fact(n-1)
        doSomethingFunc()
It seems pretty easy to do given extra space like a stack or a queue, but I was wondering if we can do this in O(1) space complexity?
Note, we cannot do something like:
def fact(n):
    for i in range(factorial(n)):
        doSomethingFunc()
since it takes a non-constant amount of memory to store the result of factorial(n).
Well, generally speaking no.
I mean, the space taken on the stack by recursive functions is not just an inconvenience of this programming style. It is memory needed for the computation.
So, sure, for a lot of algorithms that space is unnecessary and could be spared. For a classical factorial, for example,
def fact(n):
    if n <= 1:
        return 1
    else:
        return n * fact(n-1)
the stacking of all the n, n-1, n-2, ..., 1 arguments is not really necessary.
So, sure, you can find an implementation that gets rid of it. But that is an optimization (for example, in the specific case of tail recursion; but I am pretty sure that you added that doSomething to make clear that you don't want to focus on that specific case).
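For the factorial above, a minimal iterative sketch that spares the stack entirely:

def fact_iter(n):
    # A single accumulator replaces the call stack: O(1) extra space
    # (ignoring the growing size of the integer result itself).
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result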
But you cannot assume in general that an algorithm that doesn't need all those values exists, recursive or iterative. Otherwise, you would be saying that every algorithm exists in an O(1) space complexity version.
Example: base representation of a positive integer
def baseRepr(num, base):
    if num >= base:
        s = baseRepr(num // base, base)
    else:
        s = ''
    return s + chr(48 + num % base)
Not claiming it is optimal, or even well written.
But, the stacking of the arguments is needed. It is the way you implicitly store the digits that you compute in the reverse order.
An iterative function would also need some memory to store those digits, since you have to compute the last one first.
Well, I am pretty sure that for this simple example you could find a way to compute from left to right, for example using a logarithm to know the number of digits in advance, or something like that. But that's not the point. Just imagine that no algorithm is known other than the one computing digits from right to left. Then you need to store them: either implicitly on the stack using recursion, or explicitly in allocated memory. So again, memory used on the stack is not just an inconvenience of recursion. It is the way a recursive algorithm stores things that would otherwise be stored explicitly in an iterative algorithm.
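A sketch of an equivalent iterative version, just to make that storage explicit (the digits list plays the role of the call stack):

def baseReprIter(num, base):
    digits = []
    while True:
        # Digits come out right-to-left, so they must be buffered
        # explicitly instead of implicitly on the call stack.
        digits.append(chr(48 + num % base))
        num //= base
        if num == 0:
            break
    return ''.join(reversed(digits))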
Note, we cannot do something like:
def fact(n):
    for i in range(factorial(n)):
        doSomethingFunc()
since it takes a non-constant amount of memory to store the result of factorial(n).
Yes.
I was wondering if we can do this in O(1) space complexity?
So, no.

Is it better for performance to use min() multiple times or to store it in a variable?

I have a small Python program that uses min(list) multiple times on the same unchanged list. This got me wondering: if I use the result of functions like min(), len(), etc. multiple times on an unchanged list, is it better for me to store those results in variables? Does it affect memory usage / performance at all?
If I had to guess, I'd say that if a function like min() gets called many times it'd be better for performance to store the result in a variable, but I am not sure of this, since I don't really know how Python gets the value, or whether Python automatically stores it somewhere as long as the list isn't changed.
Speed
It is almost always cheaper to store the result and re-use it rather than re-call the function multiple times.
Python does not cache (store and later remember) results from functions like min(), len(), etc.
Here is a quick speed test:
timeit.timeit("c = min(x) + min(x)", "x = [1, 2, 3]")
0.24990593400000005
timeit.timeit("a = min(x); b = a + a", "x = [1, 2, 3]")
0.1296667110000005
The second is almost twice as fast, because storing a variable is much cheaper than re-calling the min function.
Memory use
If the result is a single number, as with min() or len(), then memory use is negligible.
If the result is something substantial (e.g. a large table of values), then you can remove it when you're done with it using del:
large_object = expensive_function()
do_something(large_object)
do_something_else(large_object)
del large_object
Also, large objects will automatically be deleted from memory when they fall out of scope (e.g. when a function returns) or when garbage collection rounds happen at regular intervals. For this reason, del is only necessary in certain circumstances like when dealing with circular references to an object.
min() is very fast compared to many other operations, such as I/O, so the efficiency improvement can be small for short lists and only a few repeated calls. However, if you cache the results of min(), you can realize some time savings. See the code below for examples of the time you can actually save. As you can see, you need multiple iterations of the loops that contain min() calls to get any substantial time savings.
import timeit

lst = range(2)

def test_min():
    x = [min(lst) for i in range(10)]

def test_cached_min():
    min_lst = min(lst)
    x = [min_lst for i in range(10)]

print(timeit.timeit("test_min()", globals=locals(), number=1000))
print(timeit.timeit("test_cached_min()", globals=locals(), number=1000))

# lst = range(2):
# 0.0027015960000000006
# 0.0010772920000000005

# lst = range(2000):
# 0.5262554810000001
# 0.05257684900000004
Functions like min or max definitely have to traverse the list each time, giving them a complexity of O(n). So yes, especially if your list is large, it's a better idea to store the result in a variable rather than performing the calculation again.
More details about performance are in another question: Time Complexity of Python List Operations.
The table shows that:
the function len (to get the length) has complexity O(1): very fast, because the length is already stored;
the function min (to get the minimum) is O(n): it depends on the size of the list, so it is computed each time.
This means that:
len does not need to be stored for reuse
min should be stored for reuse (especially for large lists)
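In code (a minimal sketch):

data = [5, 3, 8, 1]

n = len(data)  # O(1): the list stores its own length
m = min(data)  # O(n): scans every element; cache this if reused
span = max(data) - m  # reuse the cached minimum instead of calling min again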
If you are only using it one to five times, it doesn't really matter. But if you are going to call it any more than that (and really even less), it is best to just save it as a variable. It will take next to no memory, and very little time to store it and pull it back from memory.

Testing complexity using time module

I have a question regarding time complexity with recursion.
Let's say I need to find the largest number in a list using recursion. What I came up with is this:
def maxer(s):
    if len(s) == 1:
        return s.pop(0)
    else:
        if s[0] > s[1]:
            s.pop(1)
        else:
            s.pop(0)
        return maxer(s)
Now to test the function with many inputs and find out its time complexity, I called the function as follows:
import time
import matplotlib.pyplot as plt

def timer_v3(fx, n):
    x_axis = []
    y_axis = []
    for x in range(1, n):
        z = [x for x in range(x)]
        start = time.time()
        fx(z)
        end = time.time()
        x_axis.append(x)
        y_axis.append(end - start)
    plt.plot(x_axis, y_axis)
    plt.show()
Is there a fundamental flaw in checking complexity like this as a rough estimate? If so, how can we rapidly check the time complexity?
Assuming s is a list, your function's time complexity is O(n^2). When you pop from the start of the list, the remaining elements have to be shifted left one space to "fill in" the gap; that takes O(n) time, and your function pops from the start of the list O(n) times. So the overall complexity is O(n * n) = O(n^2).
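A quick way to see the cost of popping from the front versus the back (an indicative sketch):

import timeit

# Each pop(0) shifts all remaining elements left: O(n) per call.
# Each pop() just removes the last element: O(1) per call.
front = timeit.timeit("s.pop(0)", "s = list(range(200_000))", number=1000)
back = timeit.timeit("s.pop()", "s = list(range(200_000))", number=1000)
print(front, back)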
Your graph doesn't look like a quadratic function, though, because the definition of O(n^2) only requires quadratic behaviour for n > n0, where n0 is an arbitrary number. 1,000 is not a very large number, especially in Python: running times for smaller inputs are mostly interpreter overhead, and the O(n) pop operation is actually very fast because it's written in C. So it's not only possible, but quite likely, that n < 1,000 is too small to observe quadratic behaviour.
The problem is, your function is recursive, so it cannot necessarily be run for large enough inputs to observe quadratic running time. Too-large inputs will overflow the call stack, or use too much memory. So I converted your recursive function into an equivalent iterative function, using a while loop:
def maxer(s):
    while len(s) > 1:
        if s[0] > s[1]:
            s.pop(1)
        else:
            s.pop(0)
    return s.pop(0)
This is strictly faster than the recursive version, but it has the same time complexity. Now we can go much further; I measured the running times up to n = 3,000,000.
This looks a lot like a quadratic function. At this point you might be tempted to say, "ah, @kaya3 has shown me how to do the analysis right, and now I see that the function is O(n^2)." But that is still wrong. Measuring the actual running times, i.e. dynamic analysis, still isn't the right way to analyse the time complexity of a function. However large an n we test, n0 could still be bigger, and we'd have no way of knowing.
So if you want to find the time complexity of an algorithm, you have to do it by static analysis, like I did (roughly) in the first paragraph of this answer. You don't save yourself time by doing a dynamic analysis instead; it takes less than a minute to read your code and see that it does an O(n) operation O(n) times, if you have the knowledge. So, it is definitely worth developing that knowledge.

speed up function based on list comprehension

I'm trying to get the 15 most relevant items for each user, but every function I tried takes an eternity (more than 6 hours; I shut it down after that).
I have 418 unique users and 3718 unique items.
The U2tfifd dict also has 418 entries, and there are 32645 words in tfidf_feature_names.
The shape of my interactions_full_df is (40733, 3).
I tried:
def index_tfidf_users(user_id):
    return [users for users in U2tfifd[user_id].flatten().tolist()]

def get_relevant_items(user_id):
    return sorted(zip(tfidf_feature_names, index_tfidf_users(user_id)), key=lambda x: -x[1])[:15]

def get_tfidf_token(user_id):
    return [words for words, values in get_relevant_items(user_id)]

and then:
interactions_full_df["tags"] = interactions_full_df["user_id"].apply(lambda x : get_tfidf_token(x))
or
def get_tfidf_token(user_id):
    tags = []
    v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
    for words, values in v:
        tags.append(words)
    return tags
or
def get_tfidf_token(user_id):
    v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
    tags = [words for words in v]
    return tags
U2tfifd is a dict with keys = user_id, values = an array
There are several things going on which could cause poor performance in your code. The impact of each of these will depend on things like your Python version (2.x or 3.x), your RAM speed, and whatnot. You'll need to experiment and benchmark the various potential improvements yourself.
1. TFIDF Sparsity (~10x speedup depending on sparsity)
One glaring potential problem is that TFIDF naturally returns sparse data (e.g. a paragraph doesn't use anywhere near as many unique words as an entire book), and working with dense structures like numpy arrays is a strange choice when the data is probably zero almost everywhere.
If you'll be doing this same analysis in the future, it might be helpful to make/use a version of TFIDF with sparse array outputs so that when you extract your tokens you can skip over the zero values. This would likely have the secondary benefit of the entire sparse array for each user fitting in the cache and preventing costly RAM access in your sorts and other operations.
It might be worth sparsifying your data anyway. On my potato, a quick benchmark on data which should be similar to yours indicates that the process can be done in ~30s. The process replaces much of the work you're doing with a highly optimized routine coded in C and wrapped for use in Python. The only real cost is the second pass through the non-zero entries, but unless that pass is pretty efficient to begin with you should be better off working with sparse data.
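As a rough sketch of what working with sparse data looks like using scipy (the shapes and values here are made up):

import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical dense TF-IDF row for one user: mostly zeros.
dense_row = np.zeros(32645)
dense_row[[5, 120, 9000]] = [0.3, 0.7, 0.1]

sparse_row = csr_matrix(dense_row)
# Only the non-zero entries are stored; iterate over just those.
for idx, val in zip(sparse_row.indices, sparse_row.data):
    print(idx, val)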
2. Duplicated Efforts and Memoization (~100x speedup)
If U2tfifd has 418 entries and interactions_full_df has 40733 rows then at least 40315 (or 99.0%) of your calls to get_tfidf_token() are wasted since you've already computed the answer. There are tons of memoization decorators out there, but you don't need anything very complicated for your use case.
def memoize(f):
    _cache = {}
    def _f(arg):
        if arg not in _cache:
            _cache[arg] = f(arg)
        return _cache[arg]
    return _f

@memoize
def get_tfidf_token(user_id):
    ...
Breaking this down, the function memoize() returns another function. The behavior of that function is to check a local cache for the expected return value before computing it and storing it if necessary.
The syntax @memoize is short for something like the following.
def uncached_get_tfidf_token(user_id):
    ...

get_tfidf_token = memoize(uncached_get_tfidf_token)
The @ symbol is used to signify that we want the modified, or decorated, version of get_tfidf_token() instead of the original. Depending on your application, it might be beneficial to chain decorators together.
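As a side note, the standard library ships an equivalent decorator, so a hand-rolled memoize isn't strictly necessary:

from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, like the hand-rolled version above
def get_tfidf_token(user_id):
    ...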
3. Vectorized Operations (varying speedup, benchmarking necessary)
Python doesn't really have a notion of primitive types like other languages, and even integers take 24 bytes in memory on my machine. Lists aren't packed in memory, so you can incur costly cache misses as you're plowing through them. No matter how little work the CPU is doing for sorting and whatnot, clobbering a whole new chunk of memory to turn your array into a list, and only using that brand new, expensive memory once, is going to incur a performance hit.
Many of the things you are trying to do have fast (SIMD vectorized, parallelized, memory-efficient, packed memory, and other fun optimizations) numpy equivalents AND avoid unnecessary array copies and type conversions. It seems you're already using numpy anyway, so you won't have any extra imports or dependencies.
As one example, zip() creates another list in memory in Python 2.x and still does unnecessary work in Python 3.x when you really only care about the indices of tfidf_feature_names. To compute those indices, you can use something like the following, which avoids an unnecessary list creation and uses an optimized routine with slightly better asymptotic complexity as an added bonus.
def get_tfidf_token(user_id):
    temp = U2tfifd[user_id].flatten()
    ind = np.argpartition(temp, len(temp) - 15)[-15:]
    return tfidf_feature_names[ind]  # works if tfidf_feature_names is a numpy array
    # return [tfidf_feature_names[i] for i in ind]  # always works
Depending on the shape of U2tfifd[user_id], you could avoid the costly .flatten() computation by passing an axis argument to np.argsort() and flattening the 15 obtained indices instead.
4. Bonus
The sorted() function supports a reverse argument so that you can avoid extra computations like throwing a negative on every value. Simply use
sorted(..., reverse=True)
Even better, since you really don't care about the sort itself but just the 15 largest values you can get away with
sorted(...)[-15:]
to index the largest 15 instead of reversing the sort and taking the smallest 15. That doesn't really matter if you're using a better function for the application like np.argpartition(), but it could be helpful in the future.
You can also avoid some function calls by replacing .apply(lambda x : get_tfidf_token(x)) with .apply(get_tfidf_token) since get_tfidf_token is already a function which has the intended behavior. You don't really need the extra lambda.
As far as I can see though, most additional gains are fairly nitpicky and system-dependent. You can make most things faster with Cython or straight C with enough time for example, but you already have reasonably fast routines which do what you want out of the box. The extra engineering effort probably isn't worth any potential gains.

Python :: Iteration vs Recursion on string manipulation

In the examples below, both functions have roughly the same number of procedures.
def lenIter(aStr):
    count = 0
    for c in aStr:
        count += 1
    return count
or
def lenRecur(aStr):
    if aStr == '':
        return 0
    return 1 + lenRecur(aStr[1:])
Is picking between the two techniques a matter of style, or is there a most efficient method here?
Python does not perform tail call optimization, so the recursive solution can hit the recursion limit (effectively a stack overflow) on long strings. The iterative method does not have this flaw.
That said, len(str) is faster than both methods.
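For a rough comparison of all three (the string length is kept well under the default recursion limit of 1000):

import timeit

def lenIter(aStr):
    count = 0
    for c in aStr:
        count += 1
    return count

def lenRecur(aStr):
    if aStr == '':
        return 0
    return 1 + lenRecur(aStr[1:])

s = "a" * 500

print(timeit.timeit(lambda: lenIter(s), number=10_000))
print(timeit.timeit(lambda: lenRecur(s), number=10_000))
print(timeit.timeit(lambda: len(s), number=10_000))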
This is not correct: 'functions have roughly the same number of procedures'. You probably mean 'these procedures require the same number of operations', or, more formally, 'they have the same computational time complexity'.
While both are meant to have the same computational time complexity, the recursive one requires additional CPU instructions to create a new stack frame for every call, to switch contexts, and to clean up after returning from each recursion. These operations do not change the theoretical complexity class of the recursion itself, but in most real-life implementations they add significant overhead. (Strictly speaking, the recursive version here is also asymptotically worse, since each aStr[1:] slice copies the rest of the string, making it O(n^2) overall.)
The recursive method will also have higher space complexity, as each new instance of the recursively-called procedure needs new storage for its data.
Surely the first approach is more optimized, as Python doesn't have to perform a function call and a string slice at every step; each of those operations involves extra work for the interpreter and can cause problems when dealing with long strings.
The more Pythonic way is to use the built-in len() function to get the length of a string.
You can also inspect the code object to see the required stack size for each function:
>>> lenRecur.__code__.co_stacksize
4
>>> lenIter.__code__.co_stacksize
3
