While playing around with Python's time module, I found odd behavior when calling time.time() twice within a single statement: there is a very small processing delay between the two calls.
E.g. time.time() - time.time()
In a perfect world, executed instantaneously, this would evaluate to 0.
In the real world, however, it yields a very small nonzero number, because there is a delay between when the processor executes the first time.time() call and the second. But when running this expression twice and comparing the two results, the outcome is skewed in one direction.
See the small code snippet below.
This also holds true for very large sample sizes.
import time

counts = 300000

def at_once():
    first = 0
    second = 0
    x = 0
    while x < counts:
        x += 1
        exec_first = time.time() - time.time()
        exec_second = time.time() - time.time()
        if exec_first > exec_second:
            first += 1
        else:
            second += 1
    print('1sts: %s' % first)
    print('2nds: %s' % second)
prints:
1sts: 39630
2nds: 260370
Unless my logic is incorrect, I would expect the results to be very close to 50:50, but that does not seem to be the case. Can anyone explain what causes this behavior, or point out a flaw in the code logic that skews the results in one direction?
Could it be that exec_first == exec_second? Your if-else would add 1 to second in that case.
Try changing your if-else to something like:
if exec_first > exec_second:
    first += 1
elif exec_second > exec_first:
    second += 1
else:
    pass
You assign all of the ties to one category. Try it with a middle ground:
import time

counts = 300000
first = 0
second = 0
same = 0

for _ in range(counts):
    exec_first = time.time() - time.time()
    exec_second = time.time() - time.time()
    if exec_first == exec_second:
        same += 1
    elif exec_first > exec_second:
        first += 1
    else:
        second += 1

print('1sts: %s' % first)
print('same: %s' % same)
print('2nds: %s' % second)
Output:
$ python3 so.py
1sts: 53099
same: 194616
2nds: 52285
$ python3 so.py
1sts: 57529
same: 186726
2nds: 55745
Also, I'm confused as to why you think a function call might take 0 time. Every invocation requires at least reading the system clock and copying that value to a temporary location of some sort. That isn't free on any current computer.
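To see how often two back-to-back clock reads collide, you can count the ties directly. This is just a sketch; the exact numbers depend entirely on your platform's clock resolution, and count_ties is a helper name made up here:

```python
import time

def count_ties(clock, trials=100_000):
    # Count how often two consecutive reads of `clock` return the same value.
    ties = 0
    for _ in range(trials):
        if clock() - clock() == 0.0:
            ties += 1
    return ties

# time.time() has coarse resolution on some platforms, producing many ties;
# time.perf_counter() usually has much finer resolution, producing fewer.
print('time.time ties:        ', count_ties(time.time))
print('time.perf_counter ties:', count_ties(time.perf_counter))
```

Comparing the two counts gives a feel for how much of the 50:50 skew is really a resolution artifact rather than asymmetric call overhead.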
I am measuring parameters of a battery (current, voltage, etc.) using an analogue-to-digital converter. The while loop also contains the measurement functions, which are not shown here because they are not part of my question.
With the code below, I am attempting to calculate ampere-hours on each iteration of the cycle (Ahinst), simply multiplying the measured current by the time elapsed between two measurements. I am also summing the Ah values to get the cumulative charge (TotAh) drained from the battery. This last value is shown only when the current (P2) is negative (battery not in charging mode). When the current (P2) reverses into charging mode, I clear TotAh and just show 0.
import time
from datetime import datetime

timeMeas = []
currInst = []
Ah = []
TotAh = 0

while True:
    try:
        # measurement routines are also running here
        # ......................
        # Ah() in development
        if P2 < 0:                 # current is negative (discharging)
            Tnow = datetime.now()  # time_start reference for elapsed time until next current measure
            timeMeas.append(Tnow)  # save time_start
            currInst.append(P2)    # save current at time_start
            if len(currInst) > 1:  # we have two current measurements
                elapsed = (timeMeas[1] - timeMeas[0]).total_seconds()  # time elapsed between the two measurements
                Ahinst = currInst[1] / 3600 * elapsed  # Ah for this time interval
                Ah.append(Ahinst)                      # save Ah per time interval
                TotAh = round(sum(Ah), 3) * -1         # update cumulative Ah
                timeMeas = []  # clean up data in array
                currInst = []  # clean up data in array
        elif P2 > 0:
            TotAh = 0
            Ah = []
            time.sleep(1)
    except KeyboardInterrupt:
        break
The code is working but obviously not giving me the correct result, because in the second "if" condition I always clear the two arrays (timeMeas and currInst). Since the calculation requires at least two actual measurements (if len(currInst) > 1) to work, clearing the two arrays loses one measurement on every iteration of the cycle. I have considered shifting the values from position 0 to 1 in the arrays on every iteration, but this would cause calculation mistakes when the cycle restarts after P2 reverses into charging and then discharging mode again.
I am very rusty with coding and doing this as a hobby. I am struggling to find a way to calculate Ahinst on every cycle using the actual values.
Any help is appreciated. Thanks.
If you only want to keep two measurements (current and previous), you can use arrays of size two and do idx = 1 - idx at the end of the loop so the index flip-flops between 0 and 1.
timeMeas = [None, None]
currInst = [None, None]
TotAh = 0.0
idx = 0

while True:  # no need for parentheses
    try:
        if P2 < 0:
            Tnow = datetime.now()
            timeMeas[idx] = Tnow
            currInst[idx] = P2
            if currInst[1] is not None:  # meaning we have at least two measurements
                elapsed = (timeMeas[idx] - timeMeas[1 - idx]).total_seconds()
                TotAh += currInst[idx] / 3600 * elapsed
        elif P2 > 0:  # is "elif" really correct here?
            TotAh = 0.0
            # Do we want to reset these, too?
            timeMeas = [None, None]
            currInst = [None, None]
            # should this really be inside the elif?
            time.sleep(1)
        idx = 1 - idx
    except KeyboardInterrupt:
        break
In some sense, it would be simpler to keep two dict variables, curr and prev, and set prev = None when you start or reset. Then simply set prev = curr at the end of each iteration and populate curr with new values, like curr['when'] = datetime.now() and curr['measurement'] = P2.
I am solving Problem 14 of Project Euler and I wrote two programs, one optimised and the other not. I've imported the time module to measure the time taken, but it's not working properly in one of them. It works fine in the unoptimised code:
import time

start = time.time()

def collatz(n):
    chain = 1
    while n > 1:
        chain += 1
        if n % 2 == 0:
            n /= 2
        else:
            n = 3*n + 1
    return chain

maxChain = 0
num = 0
counter = 10**6
while counter > 13:
    coll = collatz(counter)
    if coll > maxChain:
        maxChain = coll
        num = counter
    counter -= 1

end = time.time()
print("Time taken:", end - start)
print(str(start) + ', ' + str(end))
The output is:
Time taken: 47.83728861808777
1591290440.8452923, 1591290488.682581
But in my other code:
import time

start = time.time()

dict = {n: 0 for n in range(1, 10**6)}
dict[1], dict[2] = 1, 2
for i in range(3, 10**6):
    counter = 0
    start = i
    while i > 1:
        # Have we already encountered this sequence?
        if i < start:
            dict[start] = counter + dict[i]
            break
        if i % 2 == 0:
            i /= 2
        else:
            i = 3*i + 1
        counter += 1

end = time.time()
print('Time taken:', end - start)
print(str(start) + ', ' + str(end))
The output is:
Time taken: 1590290651.4527032
999999, 1591290650.4527032
The start time in the second program is 999999, while the end time is fine. This problem doesn't occur in the first program. Why is this happening?
Translated from comment:
In the second version of the code you shadow/reuse the variable start, assigning it inside the loop (start = i). Hence the 999999 in your output and the strange results.
Renaming it to anything else will fix you right up =)
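For illustration, here is roughly what the second program looks like with the timing variable left alone and the per-sequence value renamed to seq_start. This is a sketch, wrapped in a function name invented here (longest_collatz); the memoization logic is kept essentially as in the question:

```python
import time

def longest_collatz(limit):
    cache = {1: 1, 2: 2}              # chain lengths we already know
    for seq_start in range(3, limit):
        counter = 0
        i = seq_start                 # work on a copy; seq_start stays intact
        while i > 1:
            if i < seq_start:         # dropped into already-computed territory
                cache[seq_start] = counter + cache[i]
                break
            if i % 2 == 0:
                i //= 2               # floor division keeps i an integer
            else:
                i = 3 * i + 1
            counter += 1
    return max(cache, key=cache.get)  # starting number with the longest chain

start = time.time()                   # no loop variable shadows this now
print(longest_collatz(10**6))
end = time.time()
print('Time taken:', end - start)
```

Because start is only ever the timestamp, end - start is now a sensible duration.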
I just wrote up code for problem 1.6, String Compression, from Cracking the Coding Interview. I am wondering how I can condense this code to make it more efficient. I also want to confirm that this code is O(n), since I am not concatenating to a new string.
The problem states:
Implement a method to perform basic string compression using the counts of repeated characters. For example, the string 'aabcccccaaa' would become a2b1c5a3. If the "compressed" string would not become smaller than the original string, your method should return the original string. You can assume the string has only uppercase and lowercase letters (a-z).
My code works. The first if statement after the else checks whether the count for the character is 1, and if so appends just the character. I do this so that when comparing the lengths of the result and the original string I can decide which one to return.
import string

def stringcompress(str1):
    res = []
    d = dict.fromkeys(string.ascii_letters, 0)
    main = str1[0]
    for char in range(len(str1)):
        if str1[char] == main:
            d[main] += 1
        else:
            if d[main] == 1:
                res.append(main)
                d[main] = 0
                main = str1[char]
                d[main] += 1
            else:
                res.append(main + str(d[main]))
                d[main] = 0
                main = str1[char]
                d[main] += 1
    res.append(main + str(d[main]))
    return min(''.join(res), str1)
Again, my code works as expected and does what the question asks. I just want to see if there are certain lines I can take out to make the program more efficient.
I messed around testing different variations with the timeit module. Your variation worked fantastically when I generated test data that did not repeat often, but for short strings my stringcompress_using_string was the fastest method. As the strings grow longer everything flips upside down: your method becomes the fastest, and stringcompress_using_string the slowest.
This just goes to show the importance of testing under different circumstances. My initial conclusions were incomplete, and more test data showed the true story about the effectiveness of these three methods.
import string
import timeit
import random

def stringcompress_original(str1):
    res = []
    d = dict.fromkeys(string.ascii_letters, 0)
    main = str1[0]
    for char in range(len(str1)):
        if str1[char] == main:
            d[main] += 1
        else:
            if d[main] == 1:
                res.append(main)
                d[main] = 0
                main = str1[char]
                d[main] += 1
            else:
                res.append(main + str(d[main]))
                d[main] = 0
                main = str1[char]
                d[main] += 1
    res.append(main + str(d[main]))
    return min(''.join(res), str1, key=len)

def stringcompress_using_list(str1):
    res = []
    count = 0
    for i in range(1, len(str1)):
        count += 1
        if str1[i] is str1[i-1]:
            continue
        res.append(str1[i-1])
        res.append(str(count))
        count = 0
    res.append(str1[i] + str(count+1))
    return min(''.join(res), str1, key=len)

def stringcompress_using_string(str1):
    res = ''
    count = 0
    # we can start at 1 because we already know the first letter is not a repetition of any previous letters
    for i in range(1, len(str1)):
        count += 1
        # we keep going through the for loop until a character does not repeat the previous one
        if str1[i] is str1[i-1]:
            continue
        # add the character along with the number of times it repeated to the final string,
        # reset the count,
        # and start all over with the next character
        res += str1[i-1] + str(count)
        count = 0
    # add the final character + count
    res += str1[i] + str(count+1)
    return min(res, str1, key=len)

def generate_test_data(min_length=3, max_length=300, iterations=3000, repeat_chance=.66):
    assert repeat_chance > 0 and repeat_chance < 1
    data = []
    chr = 'a'
    for i in range(iterations):
        the_str = ''
        # create a random string with a random length between min_length and max_length
        for j in range(random.randrange(min_length, max_length+1)):
            # if we've decided not to repeat by randomization, grab a new character,
            # otherwise continue to use (repeat) the character chosen last time
            if random.random() > repeat_chance:
                chr = random.choice(string.ascii_letters)
            the_str += chr
        data.append(the_str)
    return data

# generate test data beforehand to make sure all of our tests use the same test data
test_data = generate_test_data()

# make sure all of our test functions are doing the algorithm correctly
print('showing that the algorithms all produce the correct output')
print('stringcompress_original: ', stringcompress_original('aabcccccaaa'))
print('stringcompress_using_list: ', stringcompress_using_list('aabcccccaaa'))
print('stringcompress_using_string: ', stringcompress_using_string('aabcccccaaa'))
print()

print('stringcompress_original took', timeit.timeit("[stringcompress_original(x) for x in test_data]", number=10, globals=globals()), 'seconds')
print('stringcompress_using_list took', timeit.timeit("[stringcompress_using_list(x) for x in test_data]", number=10, globals=globals()), 'seconds')
print('stringcompress_using_string took', timeit.timeit("[stringcompress_using_string(x) for x in test_data]", number=10, globals=globals()), 'seconds')
The following results were all taken on an Intel i7-5700HQ CPU @ 2.70GHz, a quad-core processor. Compare the different functions within each blockquote, but don't try to cross-compare results from one blockquote to another, because the size of the test data is different.
Using long strings
Test data generated with generate_test_data(10000, 50000, 100, .66)
stringcompress_original took 7.346990528497378 seconds
stringcompress_using_list took 7.589927956366313 seconds
stringcompress_using_string took 7.713812443264496 seconds
Using short strings
Test data generated with generate_test_data(2, 5, 10000, .66)
stringcompress_original took 0.40272931026355685 seconds
stringcompress_using_list took 0.1525574881739265 seconds
stringcompress_using_string took 0.13842854253813164 seconds
10% chance of repeating characters
Test data generated with generate_test_data(10, 300, 10000, .10)
stringcompress_original took 4.675965586924492 seconds
stringcompress_using_list took 6.081609410376534 seconds
stringcompress_using_string took 5.887430301813865 seconds
90% chance of repeating characters
Test data generated with generate_test_data(10, 300, 10000, .90)
stringcompress_original took 2.6049783549783547 seconds
stringcompress_using_list took 1.9739111725413099 seconds
stringcompress_using_string took 1.9460854974553605 seconds
It's important to build a little framework like this that you can use to test changes to your algorithm. Often changes that don't seem useful will make your code go much faster, so the key to the game when optimizing for performance is to try different things and time the results. I'm sure there are more discoveries to be made if you play around with different changes, but it really depends on the kind of data you want to optimize for -- short strings vs long strings, strings that repeat often vs those that don't.
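One more variation that could be added to the timing harness: delegating the run-length grouping to itertools.groupby, which does the character grouping in C. This is a sketch under the same problem statement, with a function name invented here:

```python
from itertools import groupby

def stringcompress_using_groupby(str1):
    # groupby yields one (character, run) pair per run of equal characters
    compressed = ''.join(ch + str(len(list(run))) for ch, run in groupby(str1))
    # return the original if compression did not make the string shorter
    return min(compressed, str1, key=len)
```

For example, stringcompress_using_groupby('aabcccccaaa') returns 'a2b1c5a3', while stringcompress_using_groupby('abc') returns 'abc' because 'a1b1c1' is longer than the input. Whether it beats the hand-rolled loops would again depend on the string length and repetition rate, so it is worth timing against the same generated test data.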
I have a for loop that iterates a number of times and performs some simple calculations. I am trying to figure out how to print (or log to file) the current value of 'val' every 0.5 to 1 second without having to pause or sleep during the loop. Here is a super simple example:
val_list = []
for i in xrange(iterations):
    val = (i*(1/2)) * pi
    val2 = np.linalg.pinv(val)
    # print or write to log val2 after every half second (or 1 second)
    val_list.append(val2)
Just use time.time to capture the time before starting, then check how long it's been after you calculate val2:
import time

val_list = []
prev_time = time.time()
for i in xrange(iterations):
    val = (i*(1/2)) * pi
    val2 = np.linalg.pinv(val)
    # print or write to log val2 after every half second (or 1 second)
    dt = time.time() - prev_time
    if dt > 1:
        # print or write to log here
        prev_time = time.time()
    val_list.append(val2)
You can use time.time():
from time import time as t

val_list = []
nowTime = t()
for i in xrange(iterations):
    val = (i*(1/2)) * pi
    val2 = np.linalg.pinv(val)
    curTime = t()
    if curTime - nowTime >= 0.5:
        # Do your stuff
        nowTime = curTime
    val_list.append(val2)
You can achieve this using threads.
Here's the documentation on how to use threads: https://docs.python.org/3/library/threading.html (if you're using Python 2.7, change the 3 in the URL to a 2).
Here's a link which is similar to what you want and should point you in the right direction: Python threading.timer - repeat function every 'n' seconds
Basically you have to create a thread that executes only every n seconds. On each iteration it will print the value. The link above should suffice for that. Good luck!
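A minimal sketch of that idea, using a daemon thread plus an Event so the reporter can be stopped cleanly. The shared val dict and the dummy work loop are stand-ins for the real calculation:

```python
import threading
import time

val = {'current': None}        # shared state updated by the main loop
stop = threading.Event()

def reporter(interval=0.5):
    # Event.wait doubles as the sleep: it returns False on timeout
    # (keep reporting) and True once stop.set() is called (exit).
    while not stop.wait(interval):
        print('val is now:', val['current'])

t = threading.Thread(target=reporter, daemon=True)
t.start()

for i in range(10):            # stand-in for the real computation loop
    val['current'] = i * 0.5
    time.sleep(0.1)            # simulate per-iteration work

stop.set()                     # tell the reporter to exit
t.join()
```

Writing a float into a dict slot is effectively atomic under the GIL, so this simple sharing is fine here; for anything more structured you would want a lock or a queue.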
I want to compute how many times my computer can do counter += 1 in one second. A naive approach is the following:
from time import time

counter = 0
startTime = time()
while time() - startTime < 1:
    counter += 1
print counter
The problem is that time() - startTime < 1 may be considerably more expensive than counter += 1 itself.
Is there a way to get a 1-second sample that is less distorted by the timing check?
The usual way to time algorithms is the other way around: Use a fixed number of iterations and measure how long it takes to finish them. The best way to do such timings is the timeit module.
print timeit.timeit("counter += 1", "counter = 0", number=100000000)
Note that timing counter += 1 seems rather pointless, though. What do you want to achieve?
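If you do want the increments-per-second figure, it can be derived from a timeit run over a fixed number of iterations; a sketch in Python 3 syntax:

```python
import timeit

n = 10**6  # fixed number of increments to time
# timeit runs the statement n times with no per-iteration clock check,
# so the loop itself stays cheap and the clock is read only twice
seconds = timeit.timeit('counter += 1', setup='counter = 0', number=n)
print('increments per second: %.0f' % (n / seconds))
```

This inverts the measurement: instead of counting increments in a fixed second, you time a fixed count and divide.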
Why don't you infer the time instead? You can run something like:
from datetime import datetime

def operation():
    counter = 0
    tbeg = datetime.utcnow()
    for _ in range(10**6):
        counter += 1
    td = datetime.utcnow() - tbeg
    return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10.0**6

def timer(n):
    stack = []
    for _ in range(n):
        stack.append(operation())  # units of microseconds/increment
    print sum(stack) / len(stack)

if __name__ == "__main__":
    timer(10)
and get the average elapsed microseconds per increment; I get 0.09 (most likely very inaccurate). It is then a simple matter to infer that if one increment takes 0.09 microseconds, I can make about 11258992 increments in one second.
I think the measurements are very inaccurate, but maybe it is a sensible approximation?
I have never worked with the time() function, but according to that code I assume it counts seconds, so what if you do the per-second calculation when Ctrl+C happens? It would be something like:
#! /usr/bin/env python
from time import time
import signal
import sys

# The Ctrl+C interruption handler:
def signal_handler(signal, frame):
    counts_per_sec = counter / (time() - startTime)
    print counts_per_sec
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

counter = 0
startTime = time()
while 1:
    counter = counter + 1
Of course, it won't be exact because of the time that passes between the last second processed and the interruption signal, but the longer you leave the script running, the more precise it will be :)
Here is my approach:
import time

m = 0
timeout = time.time() + 1
while True:
    if time.time() > timeout:
        break
    m = m + 1
print(m)
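A middle ground between the approaches above is to check the clock only once per batch of increments, so the cost of the comparison is amortized across many counter += 1 operations. A sketch in Python 3, using time.perf_counter for a finer-grained clock (the batch size of 10,000 is an arbitrary choice here):

```python
import time

counter = 0
batch = 10_000                        # increments between clock checks
deadline = time.perf_counter() + 1.0  # run for roughly one second

while time.perf_counter() < deadline:
    for _ in range(batch):
        counter += 1

# counter is a multiple of batch; the run overshoots the one-second
# deadline by at most the duration of one batch
print(counter)
```

The trade-off is granularity: the bigger the batch, the less the clock check distorts the loop, but the further past the deadline the final batch may run.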