I have this python code to generate prime numbers. I added a little piece of code (between # Start progress code and # End progress code) to display the progress of the operation but it slowed down the operation.
#!/usr/bin/python
a = input("Enter a number: ")
f = open('data.log', 'w')
for x in range(2, a):
    p = 1
    # Start progress code
    s = (float(x)/float(a))*100
    print '\rProcessing ' + str(s) + '%',
    # End progress code
    for i in range(2, x-1):
        c = x % i
        if c == 0:
            p = 0
            break
    if p != 0:
        f.write(str(x) + ", ")
print '\rData written to \'data.log\'. Press Enter to exit...'
raw_input()
My question is how to show the progress of the operation without slowing down the actual code/loop. Thanks in advance ;-)
To answer your question: I/O is very expensive, so printing your progress on every iteration has a huge impact on performance. I would avoid printing if possible.
If you are concerned about speed, there is a very nice optimization you can use to greatly speed up your code.
For your inner for loop, instead of
for i in range(2, x-1):
    c = x % i
    if c == 0:
        p = 0
        break
use
for i in range(2, int(x**0.5) + 1):
    c = x % i
    if c == 0:
        p = 0
        break
You only need to test divisors from 2 up to the square root of the number you are primality testing: if x has a divisor larger than its square root, it also has one smaller than it.
You can use this optimization to offset any performance loss from printing your progress.
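As a minimal sketch of the square-root bound described above, here is the test wrapped in a function. The name is_prime is made up for illustration and does not appear in the original code; the sketch is written so that it runs on both Python 2 and Python 3.

```python
def is_prime(x):
    """Trial division, testing divisors only up to the square root of x."""
    if x < 2:
        return False
    # int(x ** 0.5) + 1 makes the range include the integer square root itself.
    for i in range(2, int(x ** 0.5) + 1):
        if x % i == 0:
            return False
    return True


print([x for x in range(2, 30) if is_prime(x)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For very large x the floating-point square root can be off by one, so treat this as a sketch for toy-sized inputs.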
Your inner loop is O(n) time. If you're experiencing lag toward huge numbers then it's pretty normal. Also you're converting x and a into float while performing division; as they get bigger, it could slow down your process.
First, I hope this is a toy problem, because (on quick glance) it looks like the whole operation is O(n^2).
You probably want to put this at the top:
from __future__ import division # Make floating point division the default and enable the "//" integer division operator.
Typically for huge loops where each iteration is inexpensive, progress isn't output every iteration because it would take too long (as you say you are experiencing). Try outputting progress either a fixed number of times or with a fixed duration between outputs:
N_OUTPUTS = 100
OUTPUT_EVERY = max(1, (a-2) // N_OUTPUTS)
...
# Start progress code
if x % OUTPUT_EVERY == 0:
    print '\rProcessing {}%'.format(100*x/a),
# End progress code
Or if you want to go by time instead:
import time
UPDATE_DT = 0.5
t = time.time()
...
# Start progress code
if time.time() - t > UPDATE_DT:
    print '\rProcessing {}%'.format(100*x/a),
    t = time.time()
# End progress code
That's going to be a little more expensive, but will guarantee that even as the inner loop slows down, you won't be left in the dark for more than one iteration or 0.5 seconds, whichever takes longer.
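Putting the time-based variant together, a sketch might look like the following. The names find_primes, print_progress and the report parameter are assumptions made for this example, not part of the original script, and the code is written for Python 3.

```python
import sys
import time


def print_progress(percent):
    # '\r' moves back to the start of the line so the figure is overwritten in place.
    sys.stdout.write('\rProcessing %.1f%%' % percent)
    sys.stdout.flush()


def find_primes(limit, update_dt=0.5, report=print_progress):
    """Return the primes below limit, reporting at most once per update_dt seconds."""
    primes = []
    last_report = time.time()
    for x in range(2, limit):
        now = time.time()
        if now - last_report >= update_dt:
            report(100.0 * x / limit)
            last_report = now
        # Trial division up to the square root of x.
        if all(x % i for i in range(2, int(x ** 0.5) + 1)):
            primes.append(x)
    report(100.0)  # always show the final state
    return primes
```

Calling find_primes(200000) should refresh the percentage roughly twice a second instead of once per candidate. Passing a different report callable (for example a list's append method) makes the progress output easy to capture.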
Related
I am implementing the coin change problem in python in CS50's pset6. When I first tackled the problem, this was the algorithm I used:
import time
while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except:
        print('Error: Please enter a positive numeric value')
start_time1 = time.time()
change1 = int(totalChange * 100)  # convert money into cents
n = 0
while change1 >= 25:
    change1 -= 25
    n += 1
while change1 >= 10:
    change1 -= 10
    n += 1
while change1 >= 5:
    change1 -= 5
    n += 1
while change1 >= 1:
    change1 -= 1
    n += 1
print(f'Method1: {n}')
print("--- %s seconds ---" % (time.time() - start_time1))
Having watched the lecture on dynamic programming, I wanted to implement it into this problem. This was my attempt:
while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except:
        print('Error: Please enter a positive numeric value')
start_time2 = time.time()
change2 = int(totalChange*100)
rowsCoins = [1,5,10,25]
colsCoins = list(range(change2 + 1))
n = len(rowsCoins)
m = len(colsCoins)
matrix = [[i for i in range(m)] for j in range(n)]
for i in range(1,n):
    for j in range(1,m):
        if rowsCoins[i] == j:
            matrix[i][j] = 1
        elif rowsCoins[i] > j:
            matrix[i][j] = matrix[i-1][j]
        else:
            matrix[i][j] = min(matrix[i-1][j], 1 + matrix[i][j-rowsCoins[i]])
print(f'Method2: {matrix[-1][-1]}')
print("--- %s seconds ---" % (time.time() - start_time2))
When I run the program, it gives the correct answers, but it takes a much longer time.
How could I adjust the second code so that it correctly implements dynamic programming? Is my problem that I am starting the loops from the top left corner of the matrix instead of the bottom right?
What are the time complexities of the algorithms for each code that I wrote (as well as for a correct implementation of dynamic programming)? I suspect that the first code is O(n^4), the second is O(n*m), and a correct implementation of dynamic programming should be O(n). Am I correct to think this?
Any help for a better understanding of these algorithms is much appreciated.
I think both algorithms are basically O(n).
n in this case is the size of the number entered.
In the first algorithm, it's not O(n^4), as that would suggest you have 4 nested loops each looping n times. Instead, you have 4 loops that run sequentially. Even if each of them ran n times (that is, if they didn't modify change1 at all), that would be O(4n), which is the same as O(n).
In the second algorithm, your choice of variable names confuses things a little. n is a constant, and m is based on the size of the input, so is what would typically be called n. So, if we rename n to c and m to n, we get O(c*n) which, again, is the same as O(n).
The key point here is that for any particular n, an O(n) algorithm isn't necessarily faster than, say, an O(n^2) algorithm. Big O notation just describes how the amount of work done varies with the size of the input. What it does say is that as n gets bigger, the time taken by an O(n) algorithm will increase more slowly than the time taken by an O(n^2) algorithm, so for some large enough n, the algorithm with the lower complexity will be quicker.
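To make the "same O(n), different amount of work" point concrete, here is a rough sketch that counts loop iterations instead of measuring time. The function names subtraction_steps and table_cells are made up for this illustration; they mirror the loop structure of the two programs in the question rather than reproduce them.

```python
def subtraction_steps(cents, coins=(25, 10, 5, 1)):
    """Count the loop iterations of the repeated-subtraction approach."""
    steps = 0
    for coin in coins:
        while cents >= coin:
            cents -= coin
            steps += 1
    return steps


def table_cells(cents, coins=(1, 5, 10, 25)):
    """Count the iterations of the nested loops that fill the matrix:
    rows 1..n-1 times columns 1..m-1, where m = cents + 1."""
    return (len(coins) - 1) * cents


for cents in (100, 1000, 10000):
    print(cents, subtraction_steps(cents), table_cells(cents))
# 100 4 300
# 1000 40 3000
# 10000 400 30000
```

Both counts grow tenfold when the input grows tenfold, which is what O(n) means, yet the table version does 75 times as many iterations for these inputs.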
How could I adjust the second code so that it correctly implements dynamic programming? Is my problem that I am starting the loops from the top left corner of the matrix instead of the bottom right?
IMHO, this problem is not suitable for dynamic programming, so it is hard to implement the correct dp. Check a greedy solution https://github.com/endiliey/cs50/blob/master/pset6/greedy.py which should be the best solution.
What are the time complexities of the algorithms for each code that I wrote (as well as for a correct implementation of dynamic programming)?
Basically both of your programs are O(n), but that does not mean they take the same time to run; as you have said, the dp solution is much slower. That is because they have different constant factors. For example, 4n and 0.25n are both O(n), but the first does 16 times as much work.
The greedy solution should have a time complexity of O(1), provided it uses integer division rather than repeated subtraction.
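A sketch of a greedy solution that does a fixed amount of work regardless of the amount, assuming the US coin set from the question (for which greedy is known to give the minimum). The function name greedy_coin_count is hypothetical, and this is not the code from the linked repository.

```python
def greedy_coin_count(cents, coins=(25, 10, 5, 1)):
    """Return the number of coins needed, taking the largest coin first.

    Each coin is handled with one divmod, so the work depends only on
    the number of coin types, not on the amount.
    """
    count = 0
    for coin in coins:
        used, cents = divmod(cents, coin)
        count += used
    return count


print(greedy_coin_count(41))  # 25 + 10 + 5 + 1 -> 4
```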
I have a small script that calculates something. It uses a primitive brute force algorithm and is inherently slow. I expect it to take about 30 minutes to complete. The script only has one print statement at the end when it is done. I would like to have something to make sure the script is still running. I do not want to include print statements for each iteration of the loop; that seems unnecessary. How can I make sure a script that takes very long to execute is still running at a given time during the script execution? I do not want to cause my script to slow down because of this, though. This is my script.
def triangle_numbers(num):
    numbers = []
    for item in range(1, num):
        if num % item == 0:
            numbers.append(item)
    numbers.append(num)
    return numbers
count = 1
numbers = []
while True:
    if len(numbers) == 501:
        print numbers
        print count
        break
    numbers = triangle_numbers(count)
    count += 1
You could print every 500 loops (or choose another number).
while True:
    if len(numbers) == 501:
        print numbers
        print count
        break
    numbers = triangle_numbers(count)
    count += 1
    # print every 500 loops
    if count % 500 == 0:
        print count
This will let you know not only if it is running (which it obviously is unless it has finished), but how fast it is going (which I think might be more helpful to you).
FYI:
I expect your program will take more like 30 weeks than 30 minutes to compute. Try this:
'''
1. We only need to test for factors up to the square root of num.
2. Unless we are at the end, we only care about the number of numbers,
   not storing them in a list.
3. xrange is better than range in this case.
4. Since 501 is odd, the number must be a perfect square.
'''
def divisors_count(sqrt):
    num = sqrt * sqrt
    # Each divisor below sqrt pairs with one above it; the + 1 counts sqrt itself.
    return sum(2 for item in xrange(1, sqrt) if num % item == 0) + 1
def divisors(sqrt):
    num = sqrt * sqrt
    numbers = []
    for item in xrange(1, sqrt):
        if num % item == 0:
            numbers.append(item)
            numbers.append(num // item)
    numbers.append(sqrt)
    return sorted(numbers)
sqrt = 1
while divisors_count(sqrt) != 501:
    if sqrt % 500 == 0:
        print sqrt * sqrt
    sqrt += 1
print divisors(sqrt)
print sqrt * sqrt
though I suspect this will still take a long time. (In fact, I'm not convinced it will terminate.)
Configure an external tool such as Supervisor:
Supervisor starts its subprocesses via fork/exec and subprocesses don’t daemonize. The operating system signals Supervisor immediately when a process terminates, unlike some solutions that rely on troublesome PID files and periodic polling to restart failed processes.
I typed this code in Python and my computer gets really hot and doesn't print anything! However, when I assigned num = 2**10 it did. How can I calculate approximately how long it will take an average computer to run this code?
the code is:
num = 2**100
cnt = 0
import time
t0 = time.clock()
for i in range(num):
    cnt = cnt+1
print(cnt)
t1 = time.clock()
print("running time: ", t1-t0, " sec")
Use an IPython notebook for this.
It has a magic function called %%timeit that measures how long a cell takes to run.
Maybe 2**30 is too much. The loop runs num times, so for num = 2**n the running time is O(2**n). That means 2**30 would take approximately 2**20 times longer than 2**10. And that's a lot of time.
Look at the times using IPython:
Do the math, and double the time 20 more times, to see how long it would take using 2**30.
That's because your computer doesn't ever finish the computation with 2**30.
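Outside a notebook, the same estimate can be sketched by timing a small sample of the loop and scaling up. The helper names seconds_per_iteration and estimate_seconds are made up for this example, and the result is only a rough figure since it assumes every iteration costs the same.

```python
import time


def seconds_per_iteration(sample_size=2 ** 20):
    """Time a short run of the same loop body and return the average cost."""
    cnt = 0
    t0 = time.perf_counter()
    for i in range(sample_size):
        cnt = cnt + 1
    return (time.perf_counter() - t0) / sample_size


def estimate_seconds(total_iterations, per_iteration):
    """Linear extrapolation: total time = iterations * time per iteration."""
    return total_iterations * per_iteration


per_iter = seconds_per_iteration()
seconds = estimate_seconds(2 ** 100, per_iter)
years = seconds / (365.25 * 24 * 3600)
print("about %.3g seconds, or %.3g years" % (seconds, years))
```

The printed figure depends on the machine, so no expected output is shown here.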
If your indents are accurate to what you're running, your code may not be making it to the print function, which would make it look like the script is not doing anything.
for i in range(num):
    cnt = cnt+1
print(cnt)
is not the same as:
for i in range(num):
    cnt = cnt+1
    print(cnt)
If you want to check the progress of your script, you can use occasional print statements by using the modulo operator. You can play around with the delay amount, depending on your computer's speed, but I would only print progress updates every 2-5 seconds.
delay = 5000
for i in range(num):
    cnt = cnt+1
    if i % delay == 0:
        print("Current iteration: {}".format(cnt))  # Will only print when i is divisible by the delay amount
I have a very long loop, and I would like to check the status every N iterations, in my specific case I have a loop of 10 million elements and I want to print a short report every millionth iteration.
So, currently I am doing just this (n is the iteration counter):
if (n % 1000000==0):
    print('Progress report...')
but I am worried I am slowing down the process by computing the modulus at each iteration, as one iteration lasts just a few milliseconds.
Is there a better way to do this? Or shouldn't I worry at all about the modulus operation?
How about keeping a counter and resetting it to zero when you reach the wanted number? Adding and checking equality is faster than modulo.
printcounter = 0
# Whatever a while loop is in Python
while (...):
    ...
    if (printcounter == 1000000):
        print('Progress report...')
        printcounter = 0
    ...
    printcounter += 1
Although it's quite possible that the compiler is doing some sort of optimization like this for you already... but this may give you some peace of mind.
1. Human-language declarations for x and n:
let x be the number of iterations that have been examined at any given time.
let n be the multiple of iterations upon which your code will be executed.
Example 1: "After x iterations, how many times was n done?"
Example 2: "It is the xth iteration and the action has occurred every nth time, so far."
2. What we're doing:
The first code block (Block A) uses only one variable, x (defined above), and uses 5 (an integer) rather than the variable n (defined above).
The second code block (Block B) uses both of the variables (x and n) that are defined above. The integer, 5, will be replaced by the variable, n. So, Block B literally performs an action at each nth iteration.
Our goal is to do one thing on every iteration and two things on every nth iteration.
We are going through 100 iterations.
m. Easy-to-understand attempt:
Block A, minimal variables:
for x in range(100):
    # what to do every time (100 times in total): replace this line with your every-iteration functions.
    pass
    if x % 5 == 0:
        # what to do every 5th time: replace this line with your nth-iteration functions.
        pass
Block B, generalization:
n = 5
for x in range(100):
    # what to do every time (100 times in total): replace this line with your every-iteration functions.
    pass
    if x % n == 0:
        # what to do every nth time: replace this line with your nth-iteration functions.
        pass
Please, let me know if you have any issues because I haven't had time to test it after writing it here.
3. Exercises
If you've done this properly, see if you can use it with the turtle.Pen() and turtle.forward() functions. For example, move the turtle forward 4 times and then right and forward once?
See if you can use this program with the turtle.circle() function. For example, draw a circle with radius+1 4 times and a circle of a new color with radius+1 once?
Check out the reading (seen below) to attempt to improve the programs from exercise 1 and 2. I can't think of a good reason to be doing this: I just feel like it might be useful!
About modulo and other basic operators:
https://docs.python.org/2/library/stdtypes.html
http://www.tutorialspoint.com/python/python_basic_operators.htm
About turtle:
https://docs.python.org/2/library/turtle.html
https://michael0x2a.com/blog/turtle-examples
Is it really slowing down? You have to try and see for yourself. It won't be much of a slowdown, but if we're talking about nanoseconds it may be considerable. Alternatively you can convert one 10 million loop to two smaller loops:
m = 1000000
for j in range(10):
    for i in range(m):
        pass  # do sth
    print("Progress report")
It's difficult to know how your system will optimize your code without testing.
You could simplify the relational part by realizing that zero is evaluated as false.
if not N % 1000000:
    pass  # do stuff
Something like this?
for n in xrange(1000000,11000000,1000000):
    for i in xrange(n-1000000,n):
        x = 10/2
    print 'Progress at '+str(i)
result
Progress at 999999
Progress at 1999999
Progress at 2999999
Progress at 3999999
Progress at 4999999
Progress at 5999999
Progress at 6999999
Progress at 7999999
Progress at 8999999
Progress at 9999999
EDIT
Better:
for n in xrange(0,10000000,1000000):
    for i in xrange(n,n+1000000):
        x = 10/2
    print 'Progress at '+str(i)
And inspired from pajton:
m = 1000000
for n in xrange(0,10*m,m):
    for i in xrange(n,n+m):
        x = 10/2
    print 'Progress at '+str(i+1)
I prefer this version, which I find more immediately readable than pajton's solution.
It also keeps the displayed value dependent on i.
I'd do some testing to see how much time your modulus calls are consuming. You can use timeit for that. If your results indicate a need for time reduction, another approach which eliminates your modulus calculation:
for m in xrange(m_min, m_max):
    for n in xrange(n_min, n_max):
        pass  # do_n_stuff
    print('Progress report...')
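For the timing test mentioned above, a minimal timeit sketch could look like this. The statement strings and variable names are made up for the example, and the loop is shortened to 100000 iterations so that it finishes quickly; the absolute numbers will differ from machine to machine.

```python
import timeit

# The same loop with and without the modulus check, as source strings for timeit.
loop_with_modulo = """
for n in range(100000):
    if n % 1000000 == 0:
        pass
"""
plain_loop = """
for n in range(100000):
    pass
"""

# number=10 runs each snippet ten times and returns the total time in seconds.
with_check = timeit.timeit(loop_with_modulo, number=10)
without_check = timeit.timeit(plain_loop, number=10)
print("with modulo check:    %.4f s" % with_check)
print("without modulo check: %.4f s" % without_check)
```

The difference between the two figures is the cost of the check itself, which can then be compared with the few milliseconds each real iteration takes.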
It's fast enough that I wouldn't worry about it.
If you really wanted to speed it up, you could do this to avoid the modulus:
if (n == 1000000):
    n = 0
    print('Progress report...')
This makes the inner loop lean, and m does not have to be divisible by interval.
m = 10000000
interval = 1000000
i = 0
while i < m:
    checkpoint = min(m, i+interval)
    for j in xrange(i, checkpoint):
        pass  # do something
    i = checkpoint
    print "progress"
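When I'm doing timing or reports based on iteration counts, I just divide my counter by the desired interval and check whether the result is an integer. Use float division so that this also works on Python 2, where dividing two ints truncates. So:
if n / 1000000.0 == int(n / 1000000.0):
    print(report)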
Once again working on Project Euler, this time my script just hangs there. I'm pretty sure I'm letting it run for long enough, and my hand-trace (as my father calls it) yields no issues. Where am I going wrong?
I'm only including the relevant portion of the code, for once.
def main():
    f, n = 0, 20
    while f != 20:
        f = 0
        for x in range(1,21):
            if n % x != 0: break
            else: ++f
        if f == 20: print n
        n += 20
Thanks in advance!
Python doesn't have an increment operator (++). ++f is interpreted as +(+(f)). + is the unary plus operator, which basically does nothing. Use f += 1 instead.
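A short demonstration of the point, as a sketch that runs unchanged on Python 2 and Python 3:

```python
f = 0
++f        # parsed as +(+f); the value is computed and thrown away
print(f)   # 0
f += 1     # the actual way to increment
print(f)   # 1
```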
In your case f can never reach 20, so the loop never exits:
1) On the first break (when n = 20 and x = 3), control returns to the top of the while loop, which sets f = 0 again.
Similarly, on the next pass n has increased by 20, but the inner loop breaks again at x = 3 and f is still 0.
So this goes into an infinite loop.
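One way to restructure the search so that no counter is needed at all is a for/else loop: the else branch runs only when the inner loop finishes without hitting break. This is a sketch, not the asker's intended solution; the function name smallest_multiple and its parameter k are made up for illustration.

```python
def smallest_multiple(k):
    """Return the smallest positive integer divisible by every number in 1..k."""
    n = k
    while True:
        for x in range(1, k + 1):
            if n % x != 0:
                break
        else:
            # The for loop finished without a break: every x divides n.
            return n
        n += k  # only multiples of k can qualify


print(smallest_multiple(10))  # 2520
```

With k = 20 the same loop has to step through more than eleven million candidates before it returns, so expect a noticeable wait rather than a hang.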