What is the computational complexity of the function pow in Python? [duplicate]

This question already has an answer here:
Why is time complexity O(1) for pow(x,y) while it is O(n) for x**y?
(1 answer)
Closed 5 months ago.
I want to know the computational complexity of pow in Python, for the two-argument form (plain exponentiation).
I have this code, and I know the computational complexity of a for loop is O(n), but I don't know whether the pow call affects that complexity.
def function(alpha, beta, p):
    for x in range(1, p):
        beta2 = (pow(alpha, x)) % p
        if beta == beta2:
            return x
        else:
            print("no existe")

As mentioned in the comments, the official Python interpreter (CPython) does a lot of optimization for its built-in math functions. The usual exponentiation A ** B has to dereference two object pointers to reach its operands (in CPython every variable is a pointer to an object, which is why you never have to declare data types), and that dynamic dispatch is a comparatively slow process.
By contrast, inside pow the interpreter can specialize on the operands, treating them as int, and thereby cut down the overhead.
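If you want to check this on your own machine, here is a minimal sketch with timeit; the numbers are machine- and version-dependent, and in practice the two spellings may time almost identically.
from timeit import timeit

# Compare the built-in pow(x, y) with the ** operator on the same operands.
# This only shows how to measure; it does not assert which one wins.
setup = "x, y = 7, 1000"
print(timeit("pow(x, y)", setup=setup, number=100_000))
print(timeit("x ** y", setup=setup, number=100_000))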
You can also read this answer, which should fully address your question:
Why is time complexity O(1) for pow(x,y) while it is O(n) for x**y?
Now that you have posted the code, the problem is clearer. In algorithm analysis we usually treat a single operation as O(1); it doesn't matter how many constant-time operations sit inside a loop body, because constant factors are dropped by the definition of big-O notation.
For a typical program only the loops matter, so for your program the complexity is just O(n):
def function(alpha, beta, p):
    for x in range(1, p):  # only one loop
        beta2 = (pow(alpha, x)) % p
        if beta == beta2:
            return x
        else:
            print("no existe")

Related

Testing complexity using time module

I have a question regarding time complexity with recursion.
Let's say I need to find the largest number in a list using recursion; what I came up with is this:
def maxer(s):
    if len(s) == 1:
        return s.pop(0)
    else:
        if s[0] > s[1]:
            s.pop(1)
        else:
            s.pop(0)
        return maxer(s)
Now to test the function with many inputs and find out its time complexity, I called the function as follows:
import time
import matplotlib.pyplot as plt

def timer_v3(fx, n):
    x_axis = []
    y_axis = []
    for x in range(1, n):
        z = [x for x in range(x)]
        start = time.time()
        fx(z)
        end = time.time()
        x_axis.append(x)
        y_axis.append(end - start)
    plt.plot(x_axis, y_axis)
    plt.show()
Is there a fundamental flaw in checking complexity like this as a rough estimate? If so, how can we rapidly check the time complexity?
Assuming s is a list, then your function's time complexity is O(n²). When you pop from the start of the list, the remaining elements have to be shifted left one space to "fill in" the gap; that takes O(n) time, and your function pops from the start of the list O(n) times. So the overall complexity is O(n * n) = O(n²).
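If you want to see that shifting cost directly, here is a rough sketch comparing pop(0) with pop(); the timings are machine-dependent and only meant as an illustration.
from timeit import timeit

# Popping from the front shifts the remaining elements; popping from the
# end does not, so the gap widens as the list grows.
for n in (10_000, 50_000):
    front = timeit("s.pop(0)", setup=f"s = list(range({n}))", number=n - 1)
    back = timeit("s.pop()", setup=f"s = list(range({n}))", number=n - 1)
    print(n, front, back)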
Your graph doesn't look like a quadratic function, though, because the definition of O(n²) only requires quadratic behaviour for n > n₀, where n₀ can be arbitrarily large. 1,000 is not a very large number, especially in Python, because running times for smaller inputs are mostly interpreter overhead, and the O(n) pop operation is actually very fast because it's written in C. So it's not only possible, but quite likely, that n < 1,000 is too small to observe quadratic behaviour.
The problem is, your function is recursive, so it cannot necessarily be run for large enough inputs to observe quadratic running time. Too-large inputs will overflow the call stack, or use too much memory. So I converted your recursive function into an equivalent iterative function, using a while loop:
def maxer(s):
    while len(s) > 1:
        if s[0] > s[1]:
            s.pop(1)
        else:
            s.pop(0)
    return s.pop(0)
This is strictly faster than the recursive version, but it has the same time complexity. Now we can go much further; I measured the running times up to n = 3,000,000.
This looks a lot like a quadratic function. At this point you might be tempted to say, "ah, #kaya3 has shown me how to do the analysis right, and now I see that the function is O(n²)." But that is still wrong. Measuring the actual running times (i.e. dynamic analysis) still isn't the right way to analyse the time complexity of a function. However large n we test, n₀ could still be bigger, and we'd have no way of knowing.
So if you want to find the time complexity of an algorithm, you have to do it by static analysis, like I did (roughly) in the first paragraph of this answer. You don't save yourself time by doing a dynamic analysis instead; it takes less than a minute to read your code and see that it does an O(n) operation O(n) times, if you have the knowledge. So, it is definitely worth developing that knowledge.
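For comparison, here is a single-pass version whose static analysis takes one glance: constant work per element, so O(n) overall. The name maxer_linear is just for illustration; the built-in max does the same job.
def maxer_linear(s):
    # One pass, no pops, so no element shifting: O(n) overall.
    largest = s[0]
    for value in s[1:]:
        if value > largest:
            largest = value
    return largest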

What is the time complexity of these three solutions?

I have these three solutions to a Leetcode problem and do not really understand the difference in time complexity here. Why is the last function twice as fast as the first one?
68 ms
def numJewelsInStones(J, S):
    count = 0
    for s in S:
        if s in J:
            count += 1
    return count
40 ms
def numJewelsInStones(J, S):
    return sum(s in J for s in S)
32 ms
def numJewelsInStones(J, S):
    return len([x for x in S if x in J])
Why is the last function twice as fast as the first one?
The analytical time complexity in terms of big-O notation looks the same for all three; however, it is subject to constant factors. That is, O(n) really means O(c*n), where c is ignored by convention when comparing time complexities.
Each of your functions has a different c. In particular
loops in general are slower than generators
sum of a generator is likely executed in C code (the sum part, adding numbers)
len is a simple "single operation" length lookup on the list, which is done in constant time, whereas sum performs n additions.
Thus c(for) > c(sum) > c(len) where c(f) is the hypothetical fixed-overhead measurement of function/statement f.
You could check my assumptions by disassembling each function.
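For example, a quick sketch with the dis module; the helper names here are just for illustration, and the exact bytecode varies between Python versions.
import dis

def loop_version(J, S):
    count = 0
    for s in S:
        if s in J:
            count += 1
    return count

def genexpr_version(J, S):
    return sum(s in J for s in S)

# The loop version executes several Python-level instructions per element
# (compare, jump, in-place add); the generator version pushes most of the
# per-element work into sum, which runs in C.
dis.dis(loop_version)
dis.dis(genexpr_version)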
Other than that, your measurements are likely influenced by variation due to other processes running on your system. To remove these influences from your analysis, average the execution times over at least 1000 calls to each function (you may find that the difference in c is smaller than this variation, though I don't expect that).
what is the time complexity of these functions?
Note that while all three functions share the same big-O time complexity, that complexity differs depending on the data type you use for J and S. If J and S are of type:
dict, the complexity of your functions will be in O(n)
set, the complexity of your functions will be in O(n)
list, the complexity of your functions will be in O(n*m), where n,m are the sizes of the J, S variables, respectively. Note if n ~ m this will effectively turn into O(n^2). In other words, don't use list.
Why is the data type important? Because Python's in operator is really just a proxy to the membership test implemented for the particular type. Specifically, dict and set membership testing works in O(1), that is, in constant time, while membership testing for a list works in O(n) time. Since in the list case there is a pass over every member of J for each member of S (or vice versa), the total time is in O(n*m). See Python's TimeComplexity wiki for details.
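A rough way to see the difference yourself; the sizes used here are arbitrary and the numbers are machine-dependent.
from timeit import timeit

# Membership testing: roughly O(1) for a set, O(n) for a list.
n = 100_000
setup = f"J_list = list(range({n})); J_set = set(J_list); probe = {n} - 1"
print(timeit("probe in J_list", setup=setup, number=1_000))  # scans the list
print(timeit("probe in J_set", setup=setup, number=1_000))   # hash lookup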
With time complexity, big-O notation describes how the running time grows as the input set grows; in other words, how the two are related. If your solution is O(n), then as the input grows the time to complete grows linearly. More concretely, if the solution is O(n) and it takes 10 seconds when the data set is 100, then it should take approximately 100 seconds when the data set is 1000.
Your first solution is O(n); we know this because of the for loop, for s in S, which will iterate through the entire data set once. The check s in J, assuming J is a set or a dictionary, will likely be constant time, O(1); the reasoning behind this is a bit beyond the scope of the question. As a result, the first solution overall is O(n), linear time.
The nuanced differences in time between the other solutions is very likely negligible if you ran your tests on multiple data sets and averaged them out over time, accounting for startup time and other factors that impact the test results. Additionally, Big O notation discards coefficients, so for example, O(3n) ~= O(n).
You'll notice in all of the other solutions you have the same concept, loop over the entire collection and check for the existence in the set or dict. As a result, all of these solutions are O(n). The differences in time can be attributed to other processes running at the same time, the fact that some of the built-ins used are pure C, and also to differences as a result of insufficient testing.
Well, the second function is faster than the first because it uses a generator expression instead of an explicit loop. The third function is faster than the second because the second has to sum the values produced by the generator, whereas the third just builds a list and takes its length.

Python :: Iteration vs Recursion on string manipulation

In the examples below, both functions have roughly the same number of procedures.
def lenIter(aStr):
    count = 0
    for c in aStr:
        count += 1
    return count
or
def lenRecur(aStr):
    if aStr == '':
        return 0
    return 1 + lenRecur(aStr[1:])
Is picking between the two techniques just a matter of style, or is one of them clearly more efficient?
Python does not perform tail call optimization, so the recursive solution can hit a stack overflow on long strings. The iterative method does not have this flaw.
That said, len(str) is faster than both methods.
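A small sketch of the failure mode, assuming the two functions above are defined; the exact limit depends on sys.getrecursionlimit(), typically 1000.
import sys

print(sys.getrecursionlimit())   # typically 1000
print(lenIter("a" * 100_000))    # fine: one loop iteration per character

try:
    lenRecur("a" * 100_000)      # one stack frame per character
except RecursionError as exc:
    print("recursion limit hit:", exc)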
This is not correct: 'functions have roughly the same number of procedures'. You probably mean: 'these procedures require the same number of operations', or, more formally, 'they have the same computational time complexity'.
While both have the same computational time complexity, the recursive one requires additional CPU instructions to create a new procedure instance (stack frame) for every recursive call, to switch contexts, and to clean up after each return. These operations do not increase the theoretical computational complexity, but in most real-life implementations they add a significant load.
Also, the recursive method has higher space complexity, as each new instance of the recursively called procedure needs new storage for its data.
The first approach is certainly the more efficient one, since Python doesn't have to perform a function call and a string slice for every character; each of these operations involves further work that is costly for the interpreter, and the recursion can cause problems when dealing with long strings.
As a more Pythonic way, you are better off using the len() function to get the length of a string.
You can also inspect the code objects to see the required stack size for each function:
>>> lenRecur.__code__.co_stacksize
4
>>> lenIter.__code__.co_stacksize
3

Why is python's built in multiplication so fast [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
So the other day I was trying to write a custom multiplication function in Python:
def multi(x, y):
    z = 0
    while y > 0:
        z = z + x
        y = y - 1
    return z
However, when I ran it with extremely large numbers like (1 << 90) and (1 << 45), i.e. computing (2 ** 90) * (2 ** 45), it took forever to compute.
So I looked into different multiplication algorithms, like the Russian peasant multiplication technique implemented below, which was extremely fast but not as readable as multi(x, y):
def russian_peasant(x, y):
    z = 0
    while y > 0:
        if y % 2 == 1:
            z = z + x
        x = x << 1
        y = y >> 1
    return z
What I want you to answer is: how do programming languages like Python multiply numbers?
Your multi version runs in O(N), whereas the russian_peasant version runs in O(log N), which is far better than O(N).
To realize how fast your russian_peasant version is, check this out:
from math import log
print(round(log(100000000, 2)))  # 27
So the loop has to be executed only about 27 times, whereas your multi version's while loop has to be executed 100000000 times when y is 100000000.
To answer your other question:
What I want you to answer is: how do programming languages like Python multiply numbers?
Python uses the O(N^2) grade-school multiplication algorithm for small numbers, but for big numbers it uses the Karatsuba algorithm.
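To give a flavour of the idea, here is a toy Karatsuba sketch in pure Python. It is not CPython's actual implementation (which works on arrays of base-2**30 digits in C); it just illustrates how three half-size multiplications replace four.
def karatsuba(x, y):
    # Toy Karatsuba for non-negative ints: split each operand around bit m,
    # do three recursive multiplications instead of four, and recombine.
    # Asymptotically O(n**1.585) versus O(n**2) for the schoolbook method.
    if x < 10 or y < 10:
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    high_x, low_x = x >> m, x & ((1 << m) - 1)
    high_y, low_y = y >> m, y & ((1 << m) - 1)
    z0 = karatsuba(low_x, low_y)
    z2 = karatsuba(high_x, high_y)
    z1 = karatsuba(low_x + high_x, low_y + high_y) - z0 - z2
    return (z2 << (2 * m)) + (z1 << m) + z0

assert karatsuba(1 << 90, 1 << 45) == (1 << 90) * (1 << 45)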
Basically, multiplication is handled in C code, which is compiled to machine code and so executes much faster.
Programming languages like Python use the multiplication instruction provided by your computer's CPU.
In addition, you have to remember that Python is a very high-level programming language which runs on a virtual machine that itself runs on your computer. As such, it is inherently a few orders of magnitude slower than native code. Translating your algorithm to assembly (or even to C) would result in a massive speedup, although it would still be slower than the CPU's multiplication operation.
On the plus side, unlike naive assembly/C, Python auto-promotes integers to bignums instead of overflowing when your numbers are bigger than 2**32.
The basic answer to your question is this: multiplication using * is handled through C code. In essence, if you write something in pure Python, it is going to be slower than the C implementation. Let me give you an example.
The operator.mul function is implemented in C, but a lambda is implemented in Python. We're going to find the product of all the numbers in a range using functools.reduce, in two ways: one using operator.mul and another using a lambda, both of which do the same thing (on the surface):
from timeit import timeit
setup = """
from functools import reduce
from operator import mul
"""
print(timeit('reduce(mul, range(1, 10))', setup=setup))
print(timeit('reduce(lambda x, y: x * y, range(1, 10))', setup=setup))
Output:
1.48362842561
2.67425475375
operator.mul takes less time, as you can see.
Usually, functional code involving many computations can be made to take less time using memoization. The basic idea is that if you feed a true function (something that always evaluates to the same result for a given argument) the same argument twice or more, you're wasting time, time that could easily be saved by identifying the repeated calls and storing their results in a hash table or other quickly accessible object. See https://en.wikipedia.org/wiki/Memoization for the basic theory. It is well implemented in Common Lisp.
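In Python that usually just means functools.lru_cache; a minimal sketch, where the function name and workload are made up for illustration.
from functools import lru_cache

@lru_cache(maxsize=None)
def slow_square(n):
    # Deliberately slow "true function": same input always gives same output.
    return sum(n for _ in range(n))

slow_square(5_000_000)   # first call does the work
slow_square(5_000_000)   # second call is answered from the cache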

Exponentiation in Python - should I prefer ** operator instead of math.pow and math.sqrt? [duplicate]

This question already has answers here:
Which is faster in Python: x**.5 or math.sqrt(x)?
(15 answers)
Closed 9 years ago.
In my field it's very common to square some numbers, combine them, and take the square root of the result. This is done in the Pythagorean theorem and in RMS calculations, for example.
In numpy, I have done the following:
result = numpy.sqrt(numpy.sum(numpy.power(some_vector, 2)))
And in pure python something like this would be expected:
result = math.sqrt(math.pow(A, 2) + math.pow(B,2)) # example with two dimensions.
However, I have been using this pure python form, since I find it much more compact, import-independent, and seemingly equivalent:
result = (A**2 + B**2)**0.5 # two dimensions
result = (A**2 + B**2 + C**2 + D**2)**0.5
I have heard some people argue that the ** operator is sort of a hack, and that taking a square root by raising to the power 0.5 is not so readable. But what I'd like to ask is:
"Is there any COMPUTATIONAL reason to prefer the former two alternatives over the third one(s)?"
Thanks for reading!
math.sqrt is the C implementation of the square root and is therefore different from the ** operator, which behaves like Python's built-in pow function. Thus, using math.sqrt can actually give a different answer than using the ** operator, and there is indeed a computational reason to prefer the numpy or math module implementation over the built-in one. Specifically, the sqrt functions are probably implemented in the most efficient way possible, whereas ** has to handle a large range of bases and exponents and is probably unoptimized for the specific case of square roots. On the other hand, the built-in pow handles a few extra cases, like "complex numbers, unbounded integer powers, and modular exponentiation".
See this Stack Overflow question for more information on the difference between ** and math.sqrt.
In terms of which is more "Pythonic", I think we need to discuss the very definition of that word. The official Python glossary states that a piece of code or an idea is Pythonic if it "closely follows the most common idioms of the Python language, rather than implementing code using concepts common to other languages." Every other language I can think of has some math module with basic square root functions, but there are languages that lack a power operator like **, e.g. C++. So ** is probably more Pythonic, but whether or not it's objectively better depends on the use case.
Even in base Python you can do the computation in generic form:
result = sum(x**2 for x in some_vector) ** 0.5
x ** 2 is surely not a hack, and the computation performed is the same (I checked the CPython source code). I actually find it more readable (and readability counts).
Using x ** 0.5 to take the square root, on the other hand, doesn't do exactly the same computation as math.sqrt, as the former is (probably) computed using logarithms while the latter (probably) uses the dedicated square-root instruction of the math processor.
I often use x ** 0.5 simply because I don't want to import math just for that. I'd expect, however, a dedicated square-root instruction to be better (more accurate) than a multi-step computation with logarithms.
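If you want a computational answer for your own machine, here is a rough sketch with timeit; the ranking can differ between Python versions and hardware, so treat the output as indicative only.
from timeit import timeit

# Three spellings of the two-dimensional case from the question.
setup = "import math; A, B = 3.0, 4.0"
print(timeit("(A**2 + B**2)**0.5", setup=setup))
print(timeit("math.sqrt(A*A + B*B)", setup=setup))
print(timeit("math.sqrt(math.pow(A, 2) + math.pow(B, 2))", setup=setup))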
