So I am not a CS major and have hard time answering questions about a program's big(O) complexity.
I wrote the following routine to output the pairs of numbers in an array which sum to 0:
asd=[-3,-2,-3,2,3,2,4,5,8,-8,9,10,-4]
def sum_zero(asd):
for i in range(len(asd)):
for j in range(i,len(asd)):
if asd[i]+asd[j]==0:
print asd[i],asd[j]
Now if someone asks the complexity of this method I know since the first loop goes thorough all the n items it will be more than (unless I am wrong) but can someone explain how to find the correct complexity?
If there is better more efficient way of solving this?
I won't give you a full solution, but will try to guide you.
You should get a pencil and a paper, and ask yourself:
How many times does the statement print asd[i], asd[j] execute? (in worst case, meaning that you shouldn't really care about the condition there)
You'll find that it really depends on the loop above it, which gets executed len(asd) (denote it by n) times.
The only thing you need to know, how many times is the inner loop executed giving that the outer loop has n iterations? (i runs from 0 up to n)
If you still not sure about the result, just take a real example, say n=20, and calculate how many times is the lowest statement executed, this will give you a very good indication about the answer.
def sum_zero(asd):
for i in range(len(asd)): # len called once = o(1), range called once = o(1)
for j in range(i,len(asd)): # len called once per i times = O(n), range called once per i times = O(n)
if asd[i]+asd[j]==0: # asd[i] called twice per j = o(2*n²)
# adding is called once per j = O(n²)
# comparing with 0 is called once per j = O(n²)
print asd[i],asd[j] # asd[i] is called twice per j = O(2*n²)
sum_zero(asd) # called once, o(1)
Assuming the worst case scenario (the if-condition always being true):
Total:
O(1) * 3
O(n) * 2
O(n²) * 6
O(6n² + 2n + 3)
A simple program to demonstrate the complexity:
target= []
quadraditc = []
linear = []
for x in xrange(1,100):
linear.append(x)
target.append(6*(x**2) + 2*x + 3)
quadraditc.append(x**2)
import matplotlib.pyplot as plt
plt.plot(linear,label="Linear")
plt.plot(target,label="Target Function")
plt.plot(quadraditc,label="Quadratic")
plt.ylabel('Complexity')
plt.xlabel('Time')
plt.legend(loc=2)
plt.show()
EDIT:
As pointed out by #Micah Smith, the above answer is the worst case operations, the Big-O is actually O(n^2), since the constants and lower order terms are omitted.
Related
I have written this code and I think its time complexity is O(n+m) as time depends on both the inputs, am I right? Is there a better algorithm you can suggest?
The function return the length of union of both the inputs.
class Solution :
def getUnion(self,a,b,):
p= 0
lower, greater = a,b
if len(a)>len(b):
lower,greater = b,a
while p< len(lower): # O(n+m)
if lower[p] in greater:
greater.remove(lower[p])
p+=1
return len(lower+greater)
print(Solution().getUnion([1,2,3,4,5],[2,3,4,54,67]))
Assuming 𝑚 is the shorter length, and 𝑛 the longer (or both are equal), then the while loop will iterate 𝑚 times.
Inside a single iteration of that loop an in greater operation is executed, which has a time complexity of O(𝑛) -- for each individual execution.
So the total time complexity is O(𝑚𝑛).
The correctness of this algorithm depends on whether we can assume that the input lists only contain unique values (each).
You can do better using a set:
return len(set(a + b))
Building a set is O(𝑚 + 𝑛), and getting its length is a constant time operation, so this is O(𝑚 + 𝑛)
So this is my line of code so far,
def Adder (i,j,k):
if i<=j:
for x in range (i, j+1):
print(x**k)
else:
print (0)
What it's supposed to do is get inputs (i,j,k) so that each number between [i,j] is multiplied the power of k. For example, Adder(3,6,2) would be 3^2 + 4^2 + 5^2 + 6^2 and eventually output 86. I know how to get the function to output the list of numbers between i and j to the power of K but I don't know how to make it so that the function sums that output. So in the case of my given example, my output would be 9, 16, 25, 36.
Is it possible to make it so that under my if conditional I can generate an output that adds up the numbers in the range after they've been taken to the power of K?
If anyone can give me some advice I would really appreciate it! First week of any coding ever and I don't quite know how to ask this question so sorry for vagueness!
Question now Answered, thanks to everyone who responded so quickly!
You could use built-in function sum()
def adder(i,j,k):
if i <= j:
print(sum(x**k for x in range(i,j+1)))
else:
print(0)
The documentation is here
I'm not sure if this is what you want but
if i<=j:
sum = 0
for x in range (i, j+1):
sum = sum + x**k #sum += x**k for simplicity
this will give you the sum of the powers
Looking at a few of the answers posted, they do a good job of giving you pythonic code for your solution, I thought I could answer your specific questions:
How can I get my function to add together its output?
A perhaps reasonable way is to iteratively and incrementally perform your calculations and store your interim solutions in a variable. See if you can visualize this:
Let's say (i,j,k) = (3,7,2)
We want the output to be: 135 (i.e., the result of the calculation 3^2 + 4^2 + 5^2 + 6^2 + 7^2)
Use a variable, call it result and initialize it to be zero.
As your for loop kicks off with x = 3, perform x^2 and add it to result. So result now stores the interim result 9. Now the loop moves on to x = 4. Same as the first iteration, perform x^2 and add it to result. Now result is 25. You can now imagine that result, by the time x = 7, contains the answer to the calculation 3^2+4^2+5^2+6^2. Let the loop finish, and you will find that 7^2 is also added to result.
Once loop is finished, print result to get the summed up answer.
A thing to note:
Consider where in your code you need to set and initialize the _result_ variable.
If anyone can give me some advice I would really appreciate it! First week of any coding ever and I don't quite know how to ask this question so sorry for vagueness!
Perhaps a bit advanced for you, but helpful to be made aware I think:
Alright, let's get some nuance added to this discussion. Since this is your first week, I wanted to jot down some things I had to learn which have helped greatly.
Iterative and Recursive Algorithms
First off, identify that the solution is an iterative type of algorithm. Where the actual calculation is the same, but is executed over different cumulative data.
In this example, if we were to represent the calculation as an operation called ADDER(i,j,k), then:
ADDER(3,7,2) = ADDER(3,6,2)+ 7^2
ADDER(3,6,2) = ADDER(3,5,2) + 6^2
ADDER(3,5,2) = ADDER(3,4,2) + 5^2
ADDER(3,4,2) = ADDER(3,3,2) + 4^2
ADDER(3,3,2) = 0 + 3^2
Problems like these can be solved iteratively (like using a loop, be it while or for) or recursively (where a function calls itself using a subset of the data). In your example, you can envision a function calling itself and each time it is called it does the following:
calculates the square of j and
adds it to the value returned from calling itself with j decremented
by 1 until
j < i, at which point it returns 0
Once the limiting condition (Point 3) is reached, a bunch of additions that were queued up along the way are triggered.
Learn to Speak The Language before using Idioms
I may get down-voted for this, but you will encounter a lot of advice displaying pythonic idioms for standard solutions. The idiomatic solution for your example would be as follows:
def adder(i,j,k):
return sum(x**k for x in range(i,j+1)) if i<=j else 0
But for a beginner this obscures a lot of the "science". It is far more rewarding to tread the simpler path as a beginner. Once you develop your own basic understanding of devising and implementing algorithms in python, then the idioms will make sense.
Just so you can lean into the above idiom, here's an explanation of what it does:
It calls the standard library function called sum which can operate over a list as well as an iterator. We feed it as argument a generator expression which does the job of the iterator by "drip feeding" the sum function with x^k values as it iterates over the range (1, j+1). In cases when N (which is j-i) is arbitrarily large, using a standard list can result in huge memory overhead and performance disadvantages. Using a generator expression allows us to avoid these issues, as iterators (which is what generator expressions create) will overwrite the same piece of memory with the new value and only generate the next value when needed.
Of course it only does all this if i <= j else it will return 0.
Lastly, make mistakes and ask questions. The community is great and very helpful
Well, do not use print. It is easy to modify your function like this,
if i<=j:
s = 0
for x in range (i, j+1):
s += x**k
return s # print(s) if you really want to
else:
return 0
Usually functions do not print anything. Instead they return values for their caller to either print or further process. For example, someone may want to find the value of Adder(3, 6, 2)+1, but if you return nothing, they have no way to do this, since the result is not passed to the program. A side note, do not capitalize functions. Those are for classes.
Okay, so I'm working on Euler Problem 12 (find the first triangular number with a number of factors over 500) and my code (in Python 3) is as follows:
factors = 0
y=1
def factornum(n):
x = 1
f = []
while x <= n:
if n%x == 0:
f.append(x)
x+=1
return len(f)
def triangle(n):
t = sum(list(range(1,n)))
return t
while factors<=500:
factors = factornum(triangle(y))
y+=1
print(y-1)
Basically, a function goes through all the numbers below the input number n, checks if they divide into n evenly, and if so add them to a list, then return the length in that list. Another generates a triangular number by summing all the numbers in a list from 1 to the input number and returning the sum. Then a while loop continues to generate a triangular number using an iterating variable y as the input for the triangle function, and then runs the factornum function on that and puts the result in the factors variable. The loop continues to run and the y variable continues to increment until the number of factors is over 500. The result is then printed.
However, when I run it, nothing happens - no errors, no output, it just keeps running and running. Now, I know my code isn't the most efficient, but I left it running for quite a bit and it still didn't produce a result, so it seems more likely to me that there's an error somewhere. I've been over it and over it and cannot seem to find an error.
I'd merely request that a full solution or a drastically improved one isn't given outright but pointers towards my error(s) or spots for improvement, as the reason I'm doing the Euler problems is to improve my coding. Thanks!
You have very inefficient algorithm.
If you ask for pointers rather than full solution, main pointers are:
There is a more efficient way to calculate next triangular number. There is an explicit formula in the wiki. Also if you generate sequence of all numbers it is just more efficient to add next n to the previous number. (Sidenote list in sum(list(range(1,n))) makes no sense to me at all. If you want to use this approach anyway, sum(xrange(1,n) will probably be much more efficient as it doesn't require materialization of the range)
There are much more efficient ways to factorize numbers
There is a more efficient way to calculate number of factors. And it is actually called after Euler: see Euler's totient function
Generally Euler project problems (as in many other programming competitions) are not supposed to be solvable by sheer brute force. You should come up with some formula and/or more efficient algorithm first.
As far as I can tell your code will work, but it will take a very long time to calculate the number of factors. For 150 factors, it takes on the order of 20 seconds to run, and that time will grow dramatically as you look for higher and higher number of factors.
One way to reduce the processing time is to reduce the number of calculations that you're performing. If you analyze your code, you're calculating n%1 every single time, which is an unnecessary calculation because you know every single integer will be divisible by itself and one. Are there any other ways you can reduce the number of calculations? Perhaps by remembering that if a number is divisible by 20, it is also divisible by 2, 4, 5, and 10?
I can be more specific, but you wanted a pointer in the right direction.
From the looks of it the code works fine, it`s just not the best approach. A simple way of optimizing is doing until the half the number, for example. Also, try thinking about how you could do this using prime factors, it might be another solution. Best of luck!
First you have to def a factor function:
from functools import reduce
def factors(n):
step = 2 if n % 2 else 1
return set(reduce(list.__add__,
([i, n//i] for i in range(1, int(pow(n,0.5) + 1)) if n % i
== 0)))
This will create a set and put all of factors of number n into it.
Second, use while loop until you get 500 factors:
a = 1
x = 1
while len(factors(a)) < 501:
x += 1
a += x
This loop will stop at len(factors(a)) = 500.
Simple print(a) and you will get your answer.
I have a task to find asymptotic complexity of this python code.
for i in range(x):
if i == 0:
for j in range(x):
for k in range(500):
print("A ")
By what I know it should be 500*x. Because the first cycle only goes once (i==0), second one goes x times and third one 500 times, so therefore it should be 500*x, shouldn´t it? However this isn´t the right answer. Could you help me out?
Asymptotic complexity describes the growth of execution time as the variable factors grow arbitrarily large. In short, constants added or multiplied don't count at all, since they don't change with the variable.
Yes, there are 500*x lines printed. You also have x-1 non-functional loop iterations. Your total time would be computed as something like
(x-1)[loop overhead] + x[loop overhead] + 500*x[loop overhead + print
time]
However, the loop overhead, being a constant, is insignificant, and dropped out of the complexity expression. Likewise, the 500 is merely a scaling factor, and is also dropped from the expression.
The complexity is O(x).
It's 501*x since you also have to check x times the line if i == 0.
As the other answer says, usually we don't include the factor. But sometimes we do.
I want to program the following (I've just start to learn python):
f[i]:=f[i-1]-(1/n)*(1-(1-f[i-1])^n)-(1/n)*(f[i-1])^n+(2*f[0]/n);
with F[0]=x, x belongs to [0,1] and n a constant integer.
My try:
import pylab as pl
import numpy as np
N=20
n=100
h=0.01
T=np.arange(0, 1+h, h)
def f(i):
if i == 0:
return T
else:
return f(i-1)-(1./n)*(1-(1-f(i-1))**n)-(1./n)*(f(i-1))**n+2.*T/n
pl.figure(figsize=(10, 6), dpi=80)
pl.plot(T,f(N), color="red",linestyle='--', linewidth=2.5)
pl.show()
For N=10 (number of iterations) it returns the correct plot fast enough, but for N=20 it keeps running and running (more than 30 minutes already).
The reason why your run time is so slow is the fact that, like the simplistic calculation of the nth fibonacci number it runs in exponential time (in this case 3^n). To see this, before F[i] can return it's value, it has to call f[i-1] 3 times, but then each of those has to call F[i-2] 3 times (3*3 calls), and then each of those has to call F[i-3] 3 times (3*3*3 calls), and so on. In this example, as others have shown, this can be calculated simply in linear time. That you see it slow for N = 20 is because your function has to be called 3^20 = 3486784401 times before you get the answer!
You calculate f(i-1) three times in a single recursion layer - so after the first run you "know" the answer but still calculate it two more times. A naive approach:
fi_1 = f(i-1)
return fi_1-(1./n)*(1-(1-fi_1)**n)-(1./n)*(fi_1)**n+2.*T/n
But of course we can still do better and cache every evaluation of f:
cache = {}
def f_cached(i):
if not i in cache:
cache[i] = f(i)
return(cache[i])
Then replace every every occurence of f with f_cached.
There are also libraries out there that can do that for you automatically (with a decorator).
While recursion often yields nice and easy formulas, python is not that good at evaluating them (see tail recursion). You are probably better off with rewriting it in a iterativ way and calculate that.
First of all you are calculating f[i-1] three times when you can save it's result in some variable and calculate it only once :
t = f(i-1)
return t-(1./n)*(1-(1-t)**n)-(1./n)*(t)**n+2.*T/n
It will increase the speed of the program, but I would also like to recommend to calculate f without using recursion.
fs = T
for i in range(1,N+1):
tmp = fs
fs = (tmp-(1./n)*(1-(1-tmp)**n)-(1./n)*(tmp)**n+2.*T/n)