Fourier Transform: Computational Scaling as a function of vector length - python

If I were to use the following function to compute a discrete Fourier transform, how would I show that the computation scales as O(N^2) as a function of the vector length?
import numpy as np

def dft(y):
    N = len(y)
    c = np.zeros(N//2+1, complex)
    for k in range(N//2+1):
        for n in range(N):
            c[k] += y[n]*np.exp(-2j*np.pi*k*n/N)
    return c
From what I understand, an algorithm that scales as O(N^2) is quadratic: the run time of the loops is proportional to the square of N. If N were doubled... then the run time would increase by N*N.
My first thought would be to run a program where I transform an array whose length is equal to N, then double the length (doubling N) and show that the run time difference between the two is N^2. Does this make any sense (or is there a different/better way)? If so, how would I measure the run time in Python?
Thank you.

The runtime? You could just make a counter at the beginning and increase it by 1 each time work is done. So, inside your inner for loop, increment the counter by 1, and when the program finishes print the counter. That shows the number of calculations needed.
import numpy as np

count = 0

def dft(y):
    global count                      # update the module-level counter
    N = len(y)
    c = np.zeros(N//2+1, complex)
    for k in range(N//2+1):
        for n in range(N):
            c[k] += y[n]*np.exp(-2j*np.pi*k*n/N)
            count += 1                # one unit of work per inner-loop pass
    return c

dft(np.random.rand(100))              # e.g. a length-100 input
print(count)
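To see the quadratic growth without timing anything, you could also print the counter for a few doubled sizes and compare it to N^2. A minimal sketch along those lines (the helper name dft_ops is made up here; it only counts the inner-loop passes rather than computing coefficients):
def dft_ops(N):
    # count how many times the coefficient update would run for a length-N input
    ops = 0
    for k in range(N//2 + 1):
        for n in range(N):
            ops += 1              # one unit of work per inner-loop pass
    return ops

for N in (64, 128, 256, 512):
    ops = dft_ops(N)
    print(N, ops, ops / N**2)     # the ratio settles near 0.5, i.e. quadratic growth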

A little depending on what time you want to measure, you could use time.process_time (which I think is closest to what you want here: it measures the CPU time your program actually got to run, and replaces time.clock, which was removed in Python 3.8) or datetime.datetime.now for wall-clock time.
You just get the time before and after your calculation is done. Something like:
import time

t0 = time.process_time()
dft(y)
t1 = time.process_time()
print("Time elapsed: {0}".format(t1 - t0))
Note that what you're looking for when doubling N is a quadrupling of the time.
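For example, a minimal timing sketch of that doubling check (it assumes the dft from the question, uses time.perf_counter for wall-clock timing, and the sizes are illustrative):
import time
import numpy as np

def dft(y):                           # the O(N^2) transform from the question
    N = len(y)
    c = np.zeros(N//2+1, complex)
    for k in range(N//2+1):
        for n in range(N):
            c[k] += y[n]*np.exp(-2j*np.pi*k*n/N)
    return c

for N in (256, 512, 1024):
    y = np.random.rand(N)
    t0 = time.perf_counter()
    dft(y)
    print(N, time.perf_counter() - t0)   # each doubling of N should cost roughly 4x the time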

The line computing the coefficients is repeated (N/2 + 1) * N times. Then you need to show there is a constant M and a value N_0 such that (N/2 + 1) * N <= M * N^2 for all N >= N_0 (i.e. as N approaches infinity). Then you've shown the run time is O(N^2).
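As a worked instance of that bound (my own choice of constants, written in LaTeX; M = 1 and N_0 = 2 are just one valid pick):
\left(\tfrac{N}{2} + 1\right) N = \tfrac{N^2}{2} + N \le N^2 \quad \text{for all } N \ge 2
so M = 1 and N_0 = 2 witness the O(N^2) bound.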

The timeit library is made for exactly this purpose.
https://docs.python.org/2/library/timeit.html
from timeit import timeit
def generate_array(n):                # stand-in: build a length-n test input however you like
    return [0.0] * n
for i in [10, 100, 1000]:             # add larger sizes as needed
    y = generate_array(i)
    print(i, timeit(lambda: dft(y), number=1))   # assumes dft() from the question is defined above

Related

How can I find a check for a converging value more efficiently in python

I'm trying to estimate the value of pi up to 3 d.p., but it feels extremely slow; for example, it takes almost 20 minutes to loop 10,000 times. I'm assuming that's because it keeps individually checking every single value, so I was wondering if there's any way to loop faster, since I need to find a better average.
import random

def est(b, a, d, c):
    total = 0
    count = 0
    dp = False
    while not dp:
        count += 1
        x, y = random.uniform(a, b), random.uniform(c, d)
        if x**2 + y**2 < 1:
            total += 1
        areaest = ((abs(b-a)*abs(d-c))/count)*total
        round = float("{:.3f}".format(areaest))
        if round == 3.142:
            dp = True
    return count
There isn't an O() improvement to be had. Monte Carlo methods rely on trying things many, many times.
There are lots of low-level ways to cut cycles, though, which you pick up by experience. For example, move invariant computations out of loops, multiply once instead of raising to the power 2, use float literals where appropriate instead of integer literals that you know will have to be converted to float every time ... Here's a version with a bunch of those:
def est(b, a, d, c):
    from random import uniform
    box_area = float(abs(b-a) * abs(d-c))
    total = count = 0
    while True:
        count += 1
        x, y = uniform(a, b), uniform(c, d)
        if x*x + y*y < 1.0:
            total += 1
        areaest = box_area / count * total
        if abs(areaest - 3.142) < 0.0005:
            break
    return count
BTW, a meta-comment: why are you checking for convergence to 3.142 at all? In a Monte Carlo application, you typically don't know the result you're looking for in advance. If you did, why bother running Monte Carlo?
More typical: you pick a fixed number of iterations to run in advance, then average over many runs, each using that fixed number of iterations. Timing is at least roughly predictable then.
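A minimal sketch of that fixed-iteration approach (the function name est_pi and the sample counts are illustrative, not from the original code):
from random import uniform

def est_pi(samples):
    # sample the unit square and count hits inside the quarter circle
    hits = 0
    for _ in range(samples):
        x, y = uniform(0.0, 1.0), uniform(0.0, 1.0)
        if x*x + y*y < 1.0:
            hits += 1
    return 4.0 * hits / samples

# average several independent runs of fixed length; the runtime is now predictable
runs = [est_pi(100_000) for _ in range(10)]
print(sum(runs) / len(runs))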

How do I calculate Time Complexity for this particular algorithm?

I know there are many other questions out there asking for the general guide of how to calculate the time complexity, such as this one.
From them I have learnt that when there is a loop, such as the (for... if...) in my Python program, the time complexity is N * N, where N is the size of the input. (Please correct me if this is also wrong.) (Edited once after being corrected by an answer.)
# greatest common divisor of two integers
a, b = map(int, input().split())
list = []
for i in range(1, a+b+1):
    if a % i == 0 and b % i == 0:
        list.append(i)
n = len(list)
print(list[n-1])
However, do other parts of the code also contribute to the time complexity, making it more than a simple O(n) = N^2? For example, in the second loop, where I am finding the common divisors of both a and b (a % i == 0), is there a way to know how many machine instructions the computer will execute in finding all the divisors, and the consequent time complexity, in this specific loop?
I hope the question makes sense; apologies if it is not clear enough.
Thanks for answering
First, a few hints:
In your code there is no nested loop. The if-statement does not constitute a loop.
Not all nested loops have quadratic time complexity.
Writing O(n) = N*N doesn't make any sense: what is n and what is N? Why does n appear on the left but N is on the right? You should expect your time complexity function to be dependent on the input of your algorithm, so first define what the relevant inputs are and what names you give them.
Also, O(n) is a set of functions (namely those asymptotically bounded from above by the function f(n) = n), whereas f(N) = N*N is one function. By abuse of notation, we conventionally write n*n = O(n) to mean n*n ∈ O(n) (which is a mathematically false statement), but switching the sides (O(n) = n*n) is undefined. A mathematically correct statement would be n = O(n*n).
You can assume all (fixed bit-length) arithmetic operations to be O(1), since there is a constant upper bound to the number of processor instructions needed. The exact number of processor instructions is irrelevant for the analysis.
Let's look at the code in more detail and annotate it:
a, b = map(int, input().split())    # O(1)
list = []                           # O(1)
for i in range(1, a+b+1):           # O(a+b) multiplied by what's inside the loop
    if a % i == 0 and b % i == 0:   # O(1)
        list.append(i)              # O(1) (amortized)
n = len(list)                       # O(1)
print(list[n-1])                    # O(log(a+b))
So what's the overall complexity? The dominating part is indeed the loop (the stuff before and after is negligible, complexity-wise), so it's O(a+b), if you take a and b to be the input parameters. (If you instead wanted to take the length N of your input input() as the input parameter, it would be O(2^N), since a+b grows exponentially with respect to N.)
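To make that last point concrete, here is a small sketch (the helper name loop_count is made up) comparing the character length of the input with the number of loop iterations:
def loop_count(a, b):
    # number of iterations of the divisor loop: range(1, a+b+1) has a+b values
    return a + b

for digits in (2, 4, 6, 8):
    a = b = 10 ** digits            # inputs with digits+1 decimal digits each
    N = len(f"{a} {b}")             # length of the line the program would read
    print(N, loop_count(a, b))      # iterations grow exponentially while N grows linearly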
One thing to keep in mind (and you have the right idea) is that higher-degree terms take precedence. So you can have a step that's constant, O(1), but if it happens n times, O(N), then it will be O(1) * O(N) = O(N).
Your program is O(N) because the only thing really affecting the time complexity is the loop, and as you know a simple loop like that is O(N) because it increases linearly as n increases.
Now if you had a nested loop that had both loops increasing as n increased, then it would be O(n^2).
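For contrast, a tiny illustrative sketch that counts the work done by a single loop versus a fully nested loop:
def single_loop(n):
    steps = 0
    for i in range(n):
        steps += 1                  # O(1) body executed n times -> O(n)
    return steps

def nested_loop(n):
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1              # O(1) body executed n*n times -> O(n^2)
    return steps

for n in (10, 100, 1000):
    print(n, single_loop(n), nested_loop(n))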

Experimentally determining computing complexity of matrix determinant

I need help experimentally determining the computing complexity of the determinant of an n x n matrix.
My code:
import numpy as np
import timeit
import time

t0 = time.time()
for n in range(1, 10):
    A = np.random.rand(n, n)
    det = np.linalg.slogdet(A)
    t = timeit.timeit(lambda: det)
    print(t)
But I get the same time for every n, hence a computing complexity of O(N), which is not correct, as it is meant to be O(N^3). Any help would be much appreciated.
For what it's worth, any meaningful benchmarking typically requires sufficiently large N to give the computer something to chew on. A 10x10 matrix is not nearly large enough to start seeing complexity. Start throwing numbers like 100, 1000, 10000, etc, then you'll see your scaling.
For example if I slightly modify your code
import time
import numpy as np

for n in range(1, 14):
    t0 = time.time()
    p = 2**n
    A = np.random.rand(p, p)
    det = np.linalg.slogdet(A)
    print('N={:04d} : {:.2e}s'.format(p, time.time() - t0))
This results in
N=0002 : 4.35e-02s
N=0004 : 0.00e+00s
N=0008 : 0.00e+00s
N=0016 : 5.02e-04s
N=0032 : 0.00e+00s
N=0064 : 5.02e-04s
N=0128 : 5.01e-04s
N=0256 : 1.50e-03s
N=0512 : 8.00e-03s
N=1024 : 3.95e-02s
N=2048 : 2.05e-01s
N=4096 : 1.01e+00s
N=8192 : 7.14e+00s
You can see that for very small values of N, some small-value optimizations and tricks make it hard to see O() complexity, but as the values of N grow, you can start to see the scaling.
There are some possible reasons:
The computer used to generate these numbers was busy doing something else when the "slow" operations like n=2 or n=16 were done.
Particularly for n=2, maybe some caching was done after the first loop, speeding up subsequent runs.
You'd also typically expect n=1 to have the worst ratio of running time to n, simply because of constant overhead like variable initialization.
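If you want a number rather than an eyeball estimate, one option is to fit a line to log(time) against log(N) over the larger sizes; the slope approximates the exponent (around 3 for the LU-based slogdet). A rough sketch along those lines, with illustrative sizes:
import time
import numpy as np

sizes, times = [], []
for p in (256, 512, 1024, 2048, 4096):
    A = np.random.rand(p, p)
    t0 = time.perf_counter()
    np.linalg.slogdet(A)
    times.append(time.perf_counter() - t0)
    sizes.append(p)

# the slope of log(t) vs log(N) estimates k in t ~ N**k
slope, intercept = np.polyfit(np.log(sizes), np.log(times), 1)
print(slope)   # expect a value in the vicinity of 3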

Time complexity of a function

I'm trying to find out the time complexity (Big-O) of functions and to provide appropriate reasoning.
The first function goes like this:
r = 0
# Assignment is constant time. Executed once. O(1)
for i in range(n):
    for j in range(i+1, n):
        for k in range(i, j):
            r += 1
            # Assignment and access are O(1). Executed n^3
I see that this is a triple nested loop, so it must be O(n^3), but I think my reasoning here is very weak. I don't really get what is going on inside the triple nested loop.
Second function is:
i = n
# Assignment is constant time. Executed once. O(1)
while i > 0:
    k = 2 + 2
    i = i // 2
    # i is reduced by the line above each iteration,
    # so the assignments and accesses, which are O(1), are executed
    # log n times ??
I found this algorithm to be O(1), but like the first function, I don't see what is going on in the while-loop. Can someone explain the time complexity of these two functions thoroughly? Thanks!
For such a simple case, you could find the number of iterations of the innermost loop as a function of n exactly:
sum_{i=0}^{n-1} sum_{j=i+1}^{n-1} sum_{k=i}^{j-1} 1 = n(n^2 - 1)/6
i.e., Θ(n**3) time complexity (see Big Theta). This assumes that r += 1 is O(1), which holds because r has O(log n) digits and the model has words with log n bits.
The second loop is even simpler: i //= 2 is i >>= 1. n has Θ(log n) digits and each iteration drops one binary digit (shift right), and therefore the whole loop is Θ(log n) time complexity, if we assume that the i >> 1 shift of log(n) digits is an O(1) operation (same model as in the first example).
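If you want to sanity-check both counts empirically, here is a small sketch (the helper names count_first and count_second are made up) that tallies the iterations and compares them to the closed forms above:
import math

def count_first(n):
    # iterations of the innermost statement in the triple loop
    r = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(i, j):
                r += 1
    return r

def count_second(n):
    # iterations of the halving while-loop
    i, steps = n, 0
    while i > 0:
        i //= 2
        steps += 1
    return steps

for n in (8, 16, 32):
    print(n, count_first(n), n * (n*n - 1) // 6)              # the two columns match
    print(n, count_second(n), math.floor(math.log2(n)) + 1)   # exact count for n >= 1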
Well, first of all, for the first function, the time complexity seems to be closer to O(N log N) because the parameters of each loop decrease each time.
Also, for the second function, the runtime is O(log2 N): say i == n == 8. After one pass i is 4, after another it is 2, then 1, then 0, and the loop stops, so the number of passes grows like log2(n).
For a rigorous mathematical approach to each function, you can go to https://www.coursera.org/course/algo. It's a great course for this sort of thing. I was sort of sloppy in my calculations.

Python Time Complexity (run-time)

def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
Let n be the size of the list L passed to this function. Which of the following most accurately describes how the runtime of this function grows as n grows?
(a) It grows linearly, like n does.
(b) It grows quadratically, like n^2 does.
(c) It grows less than linearly.
(d) It grows more than quadratically.
I don't understand how you figure out the relationship between the runtime of the function and the growth of n. Can someone please explain this to me?
ok, since this is homework:
this is the code:
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
It is obviously dependent on len(L).
So let's see, for each line, what it costs:
sum = 0
i = 1
# [...]
return sum
those are obviously constant time, independent of L.
In the loop we have:
sum = sum + L[i] # time to lookup L[i] (`timelookup(L)`) plus time to add to the sum (obviously constant time)
i = i * 2 # obviously constant time
And how many times is the loop executed?
It's obviously dependent on the size of L.
Let's call that loops(L).
So we get an overall complexity of
loops(L) * (timelookup(L) + const)
Being the nice guy I am, I'll tell you that list lookup is constant time in Python, so it boils down to
O(loops(L)) (constant factors ignored, as the big-O convention implies)
And how often do you loop, based on the len() of L?
(a) as often as there are items in the list (b) quadratically as often as there are items in the list?
(c) less often than there are items in the list (d) more often than (b)?
I am not a computer science major and I don't claim to have a strong grasp of this kind of theory, but I thought it might be relevant for someone from my perspective to try and contribute an answer.
Your function will always take time to execute, and if it is operating on a list argument of varying length, then the time it takes to run that function will be relative to how many elements are in that list.
Let's assume it takes 1 unit of time to process a list of length == 1. What the question is asking is the relationship between the size of the list getting bigger and the increase in time for this function to execute.
This link breaks down some basics of Big O notation: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
If it were O(1) complexity (which is not actually one of your A-D options) then it would mean the complexity never grows regardless of the size of L. Obviously in your example it is doing a while loop dependent on growing a counter i in relation to the length of L. I would focus on the fact that i is being multiplied, to indicate the relationship between how long it will take to get through that while loop vs the length of L. Basically, try to compare how many loops the while loop will need to perform at various values of len(L), and then that will determine your complexity. 1 unit of time can be 1 iteration through the while loop.
Hopefully I have made some form of contribution here, with my own lack of expertise on the subject.
Update
To clarify, based on the comment from ch3ka: if you were doing more than what you currently have inside your while loop, then you would also have to consider the added complexity for each loop. But because your list lookup L[i] is constant complexity, as is the math that follows it, we can ignore those in terms of the complexity.
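To make the doubling of i concrete, here is a tiny illustrative sketch that just lists the values i takes before the loop stops:
def i_values(n):
    # values the loop counter visits for a list of length n
    vals, i = [], 1
    while i < n:
        vals.append(i)
        i *= 2
    return vals

for n in (10, 100, 1000):
    vals = i_values(n)
    print(n, vals, len(vals))   # roughly log2(n) values: 1, 2, 4, 8, ...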
Here's a quick-and-dirty way to find out:
import matplotlib.pyplot as plt

def f2(L):
    sum = 0
    i = 1
    times = 0
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
        times += 1   # track how many times the loop gets called
    return times

def main():
    i = range(1200)
    f_i = [f2([1]*n) for n in i]
    plt.plot(i, f_i)
    plt.show()

if __name__ == "__main__":
    main()
... which produces a plot whose horizontal axis is the size of L and whose vertical axis is how many times the function loops; the big-O behavior should be pretty obvious from it.
Consider what happens with an input of length n=10. Now consider what happens if the input size is doubled to 20. Will the runtime double as well? Then it's linear. If the runtime grows by factor 4, then it's quadratic. Etc.
When you look at the function, you have to determine how the size of the list will affect the number of loops that will occur.
In your specific situation, lets increment n and see how many times the while loop will run.
n = 0, loop = 0 times
n = 1, loop = 0 times
n = 2, loop = 1 time
n = 3, loop = 2 times
n = 4, loop = 2 times
See the pattern? Now answer your question, does it:
(a) It grows linearly, like n does. (b) It grows quadratically, like n^2 does.
(c) It grows less than linearly. (d) It grows more than quadratically.
Check out Hugh's answer for an empirical result :)
It's O(log(len(L))), as list lookup is a constant-time operation, independent of the size of the list.
