Negative algorithm run time - python

I'm trying to estimate the running time of AES in Python. I'm using the code from here:
https://gist.github.com/jeetsukumaran/1291836
and running it on this:
https://repl.it/languages/python3
Sometimes I get negative run times. Why is that, and how do I measure the time correctly?
Relevant timing loop:
start = timeit.timeit()
r = Rijndael("abcdefg1234567890123456789012345", block_size = 32)
ciphertext = r.encrypt("99999999999999999999999999999995")
plaintext = r.decrypt(ciphertext)
end = timeit.timeit()
The full code is here.

Use time.time(), not timeit.timeit(). Called with no arguments, timeit.timeit() benchmarks the default statement pass a million times and returns that duration; subtracting two such unrelated measurements can easily come out negative. You want timestamps, not benchmark runs:
import time
# unrelated code
start = time.time()
r = Rijndael("abcdefg1234567890123456789012345", block_size = 32)
ciphertext = r.encrypt("99999999999999999999999999999995")
plaintext = r.decrypt(ciphertext)
end = time.time()
elapsed = end - start # will not be negative!
Notes
How does time.time() work?
time.time() returns the number of seconds since the epoch (January 1, 1970, 00:00:00 UTC on most platforms), so the difference of two calls is always the elapsed wall-clock time.
How is timeit.timeit() used?
It benchmarks small snippets by running them many times (1,000,000 by default) and returning the total time in seconds.
>>> import timeit
>>> timeit.timeit('4 + 5') # runs 4 + 5 1,000,000 times; returns the total time in seconds
0.009406077000000401
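If you do want timeit for this, give it the work as a callable (or a statement string) rather than calling timeit.timeit() bare. A minimal sketch, assuming the Rijndael class from the linked gist is defined or imported as in the question:
import timeit

def roundtrip():
    # one full encrypt/decrypt cycle, parameters taken from the question
    r = Rijndael("abcdefg1234567890123456789012345", block_size=32)
    ciphertext = r.encrypt("99999999999999999999999999999995")
    plaintext = r.decrypt(ciphertext)

# timeit returns the TOTAL seconds for `number` runs; divide for a per-run figure
total = timeit.timeit(roundtrip, number=100)
print("per run:", total / 100, "seconds")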

Related

Python async generator function without time drift

I am trying to emulate the generation of a sequence of numbers as shown in the code below. I want to execute this at regular intervals without time drift and without keeping track of the number of times serial_sequence() has been called, i.e. without using the variable num.
How can this be done?
In the example code, delay2 is always 0.1, but it should be slightly less since the event loop is busy elsewhere between calls to serial_sequence(). Thus the elapsed time is over 2.0 seconds.
import asyncio
import time

rx_data = list()
EOT = 20

async def serial_sequence():
    '''async generator function that simulates incoming sequence of serial data'''
    num = 0
    tprev = time.time()
    while True:
        dt = time.time() - tprev
        delay2 = 0.1 - dt  # want to avoid time drift
        print('dt: {}'.format(dt))
        print('delay2: {}'.format(delay2))
        await asyncio.sleep(delay2)  # simulated IO delay
        tprev = time.time()
        num += 1
        yield num

async def read_serial1():
    gen = serial_sequence()
    while True:
        data = await gen.__anext__()
        rx_data.append(data)
        print('read_serial1:', data)
        if data == EOT:
            break
    return rx_data

async def main():
    start = time.time()
    task1 = asyncio.create_task(read_serial1())
    await task1
    stop = time.time()
    print('Elapsed: {}'.format(stop - start))

if __name__ == '__main__':
    asyncio.run(main())
The code in the while loop itself needs some time to run. The time drift in your example accumulates because you base dt on the time of the previous step, tprev.
You could instead use the absolute starting time of serial_sequence as the point of reference, like so:
async def serial_sequence():
    '''async generator function that simulates incoming sequence of serial data'''
    num = 0
    starttime = time.time()
    while True:
        dt = (time.time() - starttime) % 0.1
        delay2 = 0.1 - dt  # want to avoid time drift
        print('dt: {}'.format(dt))
        print('delay2: {}'.format(delay2))
        await asyncio.sleep(delay2)  # simulated IO delay
        num += 1
        yield num
Compare the accumulation by changing EOT: doubling EOT roughly doubles the time drift of your version, while it stays approximately constant with this one.
My answer is largely based on this answer to a very similar question.
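Another drift-free pattern (not from the linked answer, just a common alternative) is to advance an absolute deadline by a fixed step and sleep only for whatever remains until it; a sketch using the same names as the question:
import asyncio
import time

async def serial_sequence():
    '''async generator sketch: schedule each tick against an absolute deadline'''
    num = 0
    next_deadline = time.time() + 0.1
    while True:
        # sleep for whatever is left until the deadline; clamp at 0 if we are late
        await asyncio.sleep(max(0.0, next_deadline - time.time()))
        next_deadline += 0.1  # deadlines advance by a fixed step, so errors do not accumulate
        num += 1
        yield num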

Time delta gives 0.0 output

I'm trying to measure the time delta of a linear search. When I run under the debugger, the delta variable shows the difference, but when the script is run normally the print gives 0.0 even if I change the target, which can't be the real time difference.
import time
import itertools

def linear_search(arry, target):
    for index, value in enumerate(arry):
        if value == target:
            return True
    return False

k = list(itertools.islice(range(2000000), 1, 2000000, 2))

start = time.time()
linear_search(k, 25)
done = time.time()
delta = done - start
print(delta)
Can someone help to find if there's anything wrong in the print statement?
The issue is most likely the resolution of time.time(), particularly on Windows, where the resolution is reportedly about 16 milliseconds.
There is a different module, timeit, that is better suited to timing very short calls.
Here is an example:
import timeit

setup = '''
import itertools

def linear_search(arry, target):
    for index, value in enumerate(arry):
        if value == target:
            return True
    return False

k = list(itertools.islice(range(2000000), 1, 2000000, 2))
n = 25
'''

print(timeit.timeit("linear_search(k, n)", setup=setup, number=1000))
This gives me a total of about 0.000214 seconds for the 1000 runs (timeit returns the total, not the per-run average).
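If you only need a single timed run rather than a benchmark, time.perf_counter() is another option: it is a high-resolution monotonic clock, so the delta will not round down to 0.0 the way a ~16 ms time.time() tick can. A sketch of the original script with it swapped in:
import itertools
import time

def linear_search(arry, target):
    for index, value in enumerate(arry):
        if value == target:
            return True
    return False

k = list(itertools.islice(range(2000000), 1, 2000000, 2))

start = time.perf_counter()
linear_search(k, 25)
delta = time.perf_counter() - start
print(delta)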

Increment list size based on elapsed time

I am creating a program that will measure the execution times of various sorting algorithms (Selection, Bubble, Merge, and Tree sort).
The list sizes used for the test cases should start at 10,000, and go up by 10,000 for each test until the execution time for the test exceeds 60 seconds.
And that is my issue.
I have this probably very wrong (and ugly) code that I have created (I am currently testing with just the Bubble Sort).
import random
import time

def bubbleSort(a_list):
    for passnum in range(len(a_list)-1, 0, -1):
        for i in range(passnum):
            if a_list[i] > a_list[i+1]:
                temp = a_list[i]
                a_list[i] = a_list[i+1]
                a_list[i+1] = temp

a_list = []
for i in range(10000):
    a_list.append(random.randrange(0, 10000))

start = time.perf_counter()
bubbleSort(a_list)
end = time.perf_counter()
elapsed = end - start
print("{0:.8f}".format(elapsed, "\n"))
print(a_list)

if elapsed <= 60:
    for i in range(len(a_list), len(a_list)+10000):
        a_list.append(random.randrange(len(a_list)+10000))
    start = time.perf_counter()
    bubbleSort(a_list)
    end = time.perf_counter()
    elapsed = end - start
    print("{0:.8f}".format(elapsed, "\n"))
    print(a_list)
else:
    pass  # it'll quit
I'm sorry for the very apparent ignorance. The above was my first attempt; then I came up with this loop:
start = time.perf_counter()
while start <= 60:
    for i in range(len(a_list)+10000):
        a_list.append(random.randrange(len(a_list)+10000))
    bubbleSort(a_list)
    end = time.perf_counter()
    elapsed = end - start
    print("{0:.8f}".format(elapsed, "\n"))
    print(a_list)
I would be very grateful if someone can give me a push in the right direction and help me think of the logic behind it. Thank you much in advance.
First, collapse some of the code to improve readability:
The element swap now uses the Python idiom a, b = b, a.
Build the list with a comprehension, not a loop.
Parametrize the list size and increment it each time through the loop.
Do you really need 8 decimal places for the execution time?
Code:
import random
import time

def bubbleSort(a_list):
    for passnum in range(len(a_list)-1, 0, -1):
        for i in range(passnum):
            if a_list[i] > a_list[i+1]:
                a_list[i], a_list[i+1] = a_list[i+1], a_list[i]

elapsed = 0
size = 0
size_inc = 10000

print("Size\tTime")
while elapsed < 60:
    # Add 10,000 numbers to the list
    size += size_inc
    a_list = [random.randrange(0, size) for i in range(size)]

    start = time.perf_counter()
    bubbleSort(a_list)
    end = time.perf_counter()
    elapsed = end - start

    print(size, "\t{0:.8f}".format(elapsed, "sec.\n"))
Output:
Size Time
10000 12.05934826
20000 47.99201040
30000 111.39582218
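If you need the same measurement for each of the four algorithms in the assignment, one way (just a sketch, assuming the bubbleSort above plus your own selectionSort, mergeSort, and treeSort functions) is to wrap the loop in a helper:
import random
import time

def time_algorithm(sort_fn, size_inc=10000, limit=60):
    # Grow the list by size_inc until one run of sort_fn exceeds `limit` seconds.
    size, elapsed = 0, 0.0
    print("Size\tTime ({})".format(sort_fn.__name__))
    while elapsed < limit:
        size += size_inc
        a_list = [random.randrange(0, size) for _ in range(size)]
        start = time.perf_counter()
        sort_fn(a_list)
        elapsed = time.perf_counter() - start
        print(size, "\t{0:.4f}".format(elapsed))

# time_algorithm(bubbleSort); repeat for selectionSort, mergeSort, treeSort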

Algorithm timing in Python

I want to compute how many times my computer can do counter += 1 in one second. A naive approach is the following:
from time import time

counter = 0
startTime = time()
while time() - startTime < 1:
    counter += 1
print counter
The problem is that time() - startTime < 1 may be considerably more expensive than counter += 1 itself.
Is there a way to take a cleaner 1-second sample of my algorithm?
The usual way to time algorithms is the other way around: Use a fixed number of iterations and measure how long it takes to finish them. The best way to do such timings is the timeit module.
print timeit.timeit("counter += 1", "counter = 0", number=100000000)
Note that timing counter += 1 seems rather pointless, though. What do you want to achieve?
Why don't you infer the time instead? You can run something like:
from datetime import datetime

def operation():
    counter = 0
    tbeg = datetime.utcnow()
    for _ in range(10**6):
        counter += 1
    td = datetime.utcnow() - tbeg
    return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10.0**6

def timer(n):
    stack = []
    for _ in range(n):
        stack.append(operation())  # units of musec/increment
    print sum(stack) / len(stack)

if __name__ == "__main__":
    timer(10)
and get the average elapsed microseconds per increment; I get 0.09 (most likely very inaccurate). Now it is simple to infer that if I can make one increment in 0.09 microseconds, then I can make about 11258992 in one second.
I think the measurements are very inaccurate, but maybe it's a sensible approximation?
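For a cross-check, you can let timeit do the same inference: measure the total time for N increments and invert it. A small sketch:
import timeit

N = 10**6
total = timeit.timeit("counter += 1", setup="counter = 0", number=N)
print("seconds per increment:", total / N)
print("increments per second: %.0f" % (N / total))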
I have never worked with the time module, but judging from that code I assume it counts seconds, so what if you do the per-second calculation when Ctrl+C is pressed? It would be something like:
#! /usr/bin/env python
from time import time
import signal
import sys

# The Ctrl+C interruption handler:
def signal_handler(signal, frame):
    counts_per_sec = counter / (time() - startTime)
    print counts_per_sec
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

counter = 0
startTime = time()
while 1:
    counter = counter + 1
Of course, it won't be exact because of the time that passes between the last counted second and the interrupt signal, but the longer you leave the script running, the more precise it will be :)
Here is my approach
import time

m = 0
timeout = time.time() + 1
while True:
    if time.time() > timeout:
        break
    m = m + 1
print(m)

Get timer ticks in Python

I'm just trying to time a piece of code. The pseudocode looks like:
start = get_ticks()
do_long_code()
print "It took " + (get_ticks() - start) + " seconds."
How does this look in Python?
More specifically, how do I get the number of ticks since midnight (or however Python organizes that timing)?
In the time module, there are two timing functions: time and clock. time gives you "wall" time, if that is what you care about.
However, the Python docs say that clock should be used for benchmarking. Note that clock behaves differently on different systems:
on MS Windows, it uses the Win32 function QueryPerformanceCounter(), with "resolution typically better than a microsecond". It has no special meaning, it's just a number (it starts counting the first time you call clock in your process).
# ms windows
t0= time.clock()
do_something()
t= time.clock() - t0 # t is wall seconds elapsed (floating point)
on *nix, clock reports CPU time. Now, this is different, and most probably the value you want, since your program hardly ever is the only process requesting CPU time (even if you have no other processes, the kernel uses CPU time now and then). So, this number, which typically is smaller¹ than the wall time (i.e. time.time() - t0), is more meaningful when benchmarking code:
# linux
t0= time.clock()
do_something()
t= time.clock() - t0 # t is CPU seconds elapsed (floating point)
Apart from all that, the timeit module has the Timer class that is supposed to use what's best for benchmarking from the available functionality.
¹ unless threading gets in the way…
² Python ≥3.3: there are time.perf_counter() and time.process_time(). perf_counter is being used by the timeit module.
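For completeness, a tiny sketch of those two Python 3.3+ counters (do_something is a placeholder workload):
import time

def do_something():
    sum(range(10**6))  # placeholder workload

t0_wall = time.perf_counter()  # high-resolution clock for elapsed (wall) time
t0_cpu = time.process_time()   # CPU time of this process only; sleeps are excluded
do_something()
print("wall:", time.perf_counter() - t0_wall)
print("cpu: ", time.process_time() - t0_cpu)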
What you need is the time() function from the time module:
import time
start = time.time()
do_long_code()
print "it took", time.time() - start, "seconds."
You can use timeit module for more options though.
Here's a solution that I started using recently:
class Timer:
    def __enter__(self):
        self.begin = now()
    def __exit__(self, type, value, traceback):
        print(format_delta(self.begin, now()))
You use it like this (You need at least Python 2.5):
with Timer():
    do_long_code()
When your code finishes, Timer automatically prints out the run time. Sweet! If I'm trying to quickly bench something in the Python Interpreter, this is the easiest way to go.
And here's a sample implementation of 'now' and 'format_delta', though feel free to use your preferred timing and formatting method.
import datetime

def now():
    return datetime.datetime.now()

# Prints one of the following formats*:
# 1.58 days
# 2.98 hours
# 9.28 minutes # Not actually added yet, oops.
# 5.60 seconds
# 790 milliseconds
# *Except I prefer abbreviated formats, so I print d, h, m, s, or ms.
def format_delta(start, end):
    # Time in microseconds
    one_day = 86400000000
    one_hour = 3600000000
    one_second = 1000000
    one_millisecond = 1000
    delta = end - start
    build_time_us = delta.microseconds + delta.seconds * one_second + delta.days * one_day

    days = 0
    while build_time_us > one_day:
        build_time_us -= one_day
        days += 1
    if days > 0:
        time_str = "%.2fd" % (days + build_time_us / float(one_day))
    else:
        hours = 0
        while build_time_us > one_hour:
            build_time_us -= one_hour
            hours += 1
        if hours > 0:
            time_str = "%.2fh" % (hours + build_time_us / float(one_hour))
        else:
            seconds = 0
            while build_time_us > one_second:
                build_time_us -= one_second
                seconds += 1
            if seconds > 0:
                time_str = "%.2fs" % (seconds + build_time_us / float(one_second))
            else:
                ms = 0
                while build_time_us > one_millisecond:
                    build_time_us -= one_millisecond
                    ms += 1
                time_str = "%.2fms" % (ms + build_time_us / float(one_millisecond))
    return time_str
Please let me know if you have a preferred formatting method, or if there's an easier way to do all of this!
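For what it's worth, if the abbreviated d/h/m/s units aren't essential, timedelta already carries most of this; a sketch relying on total_seconds() and timedelta's default string form:
import datetime

def format_delta_simple(start, end):
    # lean on timedelta's built-in helpers instead of manual unit math
    delta = end - start
    return "{:.2f}s ({})".format(delta.total_seconds(), delta)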
The time module in Python gives you access to the clock() function, which returns time in seconds as a floating point number.
Different systems will have different accuracy based on their internal clock setup (ticks per second), but it's generally at least under 20 milliseconds, and in some cases better than a few microseconds.
-Adam
import datetime
start = datetime.datetime.now()
do_long_code()
finish = datetime.datetime.now()
delta = finish - start
print delta.seconds
From midnight:
import datetime
midnight = datetime.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
now = datetime.datetime.now()
delta = now - midnight
print delta.seconds
If you have many statements you want to time, you could use something like this:
from time import clock  # removed in Python 3.8; use time.perf_counter there

class Ticker:
    def __init__(self):
        self.t = clock()
    def __call__(self):
        dt = clock() - self.t
        self.t = clock()
        return 1000 * dt
Then your code could look like:
tick = Ticker()

# first command
print('first took {}ms'.format(tick()))
# second group of commands
print('second took {}ms'.format(tick()))
# third group of commands
print('third took {}ms'.format(tick()))
That way you don't need to type t = time() before each block and 1000 * (time() - t) after it, while still keeping control over formatting (though you could easily put that in Ticker too).
It's a minimal gain, but I think it's kind of convenient.
