use time.time() as Timer failed on `**` - python

I'm a Python 3 learner and recently I was confused by a strange behavior of time.time().
I wrote two pieces of code and timed them with time.time():
pow version
from time import time
t0 = time()
x = pow(2, 1000000000)
t1 = time()
print(t1 - t0)
powimage
** version
from time import time
t0 = time()
x = 2 ** 1000000000
t1 = time()
print(t1 - t0)
**image
pow: time_cost = 1.544sec, output = 1.43625617027
**: time_cost = 1.526sec, output = 0.0
Why time.time() doesn't work for ** version???
More Info:
sys.version='3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit (Intel)]',
sys.winver='3.6-32',
sys.platform='win32',
sys.implementation=namespace(cache_tag='cpython-36', hexversion=50725616, name='cpython', version=sys.version_info(major=3, minor=6, micro=2, releaselevel='final', serial=0))

It has to do with the way Python 3 processes certain constant expressions while processing a file. If you run the program:
print("start")
x = 2**1000000000
print("end")
you will see that there is a long start-up delay, and then "start" and "end" are printed almost simultaneously. While the program:
print("start")
x = pow(2, 1000000000)
print("end")
prints "start", then pauses for a while, and prints "end".
Python "pre-computes" the expression 2**1000000000 while it's initially processing the file (at the byte compilation stage), before the program actually starts running. In contrast, the expression pow(2,1000000000) isn't precomputed; it's byte compiled as a function call and computed when the program actually runs.
Here's another way to see this is happening. If you create two module files:
# starstar.py
x = 2**1000000000
# pow.py
x = pow(2,1000000000)
and import them into another program:
# main.py
import starstar
import pow
and run python main.py, then Python will produce byte compiled versions of the modules (maybe in a subdirectory called __pycache__ or something). You'll see that the byte-compiled version of the starstar.pyc module is huge -- it actually contains a copy of the pre-computed value of 2**1000000000; but the pow.pyc file will be tiny.

Related

Shared Memory between Julia and Python seems very slow (60 micros for round trip)

I am generating strings in Julia to use in Python. I would like to use Shared Memory (InterProcessCommunication.jl and Multiprocessing in Python). Currently, Julia generates strings then sends them to Python, which then reads the first number (so determine string length) before converting the rest into an encoded string.
I thought that shared memory would be much faster, but my method of timing (see below) seems to give 60-65 micros to:
Send the string and string length
Detect change in python, read the message and convert to bytes.
Send back an indication for julia to detect.
I am using Ubuntu. Comparatively, using TCP sockets gives 200 micros (so only a 3x speedup).
GSTiming comes from here:
#How can I get millisecond and microsecond-resolution timestamps in Python?
Julia:
using InterProcessCommunication
using Random
function copy_string_to_shared_memory(Aptr::Ptr{UInt8}, p::Vector{UInt8})
for i = 1:length(p)
unsafe_store!(Aptr, p[i], i + 4)
end
end
function main()
A = SharedMemory("myid"; readonly=false)
Aptr = convert(Ptr{UInt8}, pointer(A))
Bptr = convert(Ptr{UInt32}, pointer(A))
u = []
for i = 1:105
# p = Vector{UInt8}(randstring(rand(50:511)))
p = Vector{UInt8}("Hello Stack Exchange" * randstring(rand(1:430)))
a = time_ns()
copy_string_to_shared_memory(Aptr, p)
unsafe_store!(Bptr, length(p), 1)
# Make sure we can write to it
while unsafe_load(Aptr, 1) != 1
nanosleep(10e-7)
end
b = time_ns()
println((b - a) / 1000) # How i get times
sleep(0.01)
end
end
main()
Python Code:
from multiprocessing import shared_memory
import array
import time
import GSTiming
def main():
shm_a = shared_memory.SharedMemory("myid", create=True, size=512)
shm_a.buf[0] = 1 # Modify single byte at a time
u = []
for i in range(105):
while shm_a.buf[0] == 1:
GSTiming.delayMicroseconds(1)
s = bytes(shm_a.buf[4:shm_a.buf[0]+4])
shm_a.buf[0] = 1
shm_a.close()
shm_a.unlink() # Call unlink only once to release the shared memory
main()

How globally accurate is Python's time.time function? [duplicate]

Is there a way to measure time with high-precision in Python --- more precise than one second? I doubt that there is a cross-platform way of doing that; I'm interesting in high precision time on Unix, particularly Solaris running on a Sun SPARC machine.
timeit seems to be capable of high-precision time measurement, but rather than measure how long a code snippet takes, I'd like to directly access the time values.
The standard time.time() function provides sub-second precision, though that precision varies by platform. For Linux and Mac precision is +- 1 microsecond or 0.001 milliseconds. Python on Windows uses +- 16 milliseconds precision due to clock implementation problems due to process interrupts. The timeit module can provide higher resolution if you're measuring execution time.
>>> import time
>>> time.time() #return seconds from epoch
1261367718.971009
Python 3.7 introduces new functions to the time module that provide higher resolution:
>>> import time
>>> time.time_ns()
1530228533161016309
>>> time.time_ns() / (10 ** 9) # convert to floating-point seconds
1530228544.0792289
Python tries hard to use the most precise time function for your platform to implement time.time():
/* Implement floattime() for various platforms */
static double
floattime(void)
{
/* There are three ways to get the time:
(1) gettimeofday() -- resolution in microseconds
(2) ftime() -- resolution in milliseconds
(3) time() -- resolution in seconds
In all cases the return value is a float in seconds.
Since on some systems (e.g. SCO ODT 3.0) gettimeofday() may
fail, so we fall back on ftime() or time().
Note: clock resolution does not imply clock accuracy! */
#ifdef HAVE_GETTIMEOFDAY
{
struct timeval t;
#ifdef GETTIMEOFDAY_NO_TZ
if (gettimeofday(&t) == 0)
return (double)t.tv_sec + t.tv_usec*0.000001;
#else /* !GETTIMEOFDAY_NO_TZ */
if (gettimeofday(&t, (struct timezone *)NULL) == 0)
return (double)t.tv_sec + t.tv_usec*0.000001;
#endif /* !GETTIMEOFDAY_NO_TZ */
}
#endif /* !HAVE_GETTIMEOFDAY */
{
#if defined(HAVE_FTIME)
struct timeb t;
ftime(&t);
return (double)t.time + (double)t.millitm * (double)0.001;
#else /* !HAVE_FTIME */
time_t secs;
time(&secs);
return (double)secs;
#endif /* !HAVE_FTIME */
}
}
( from http://svn.python.org/view/python/trunk/Modules/timemodule.c?revision=81756&view=markup )
David's post was attempting to show what the clock resolution is on Windows. I was confused by his output, so I wrote some code that shows that time.time() on my Windows 8 x64 laptop has a resolution of 1 msec:
# measure the smallest time delta by spinning until the time changes
def measure():
t0 = time.time()
t1 = t0
while t1 == t0:
t1 = time.time()
return (t0, t1, t1-t0)
samples = [measure() for i in range(10)]
for s in samples:
print s
Which outputs:
(1390455900.085, 1390455900.086, 0.0009999275207519531)
(1390455900.086, 1390455900.087, 0.0009999275207519531)
(1390455900.087, 1390455900.088, 0.0010001659393310547)
(1390455900.088, 1390455900.089, 0.0009999275207519531)
(1390455900.089, 1390455900.09, 0.0009999275207519531)
(1390455900.09, 1390455900.091, 0.0010001659393310547)
(1390455900.091, 1390455900.092, 0.0009999275207519531)
(1390455900.092, 1390455900.093, 0.0009999275207519531)
(1390455900.093, 1390455900.094, 0.0010001659393310547)
(1390455900.094, 1390455900.095, 0.0009999275207519531)
And a way to do a 1000 sample average of the delta:
reduce( lambda a,b:a+b, [measure()[2] for i in range(1000)], 0.0) / 1000.0
Which output on two consecutive runs:
0.001
0.0010009999275207519
So time.time() on my Windows 8 x64 has a resolution of 1 msec.
A similar run on time.clock() returns a resolution of 0.4 microseconds:
def measure_clock():
t0 = time.clock()
t1 = time.clock()
while t1 == t0:
t1 = time.clock()
return (t0, t1, t1-t0)
reduce( lambda a,b:a+b, [measure_clock()[2] for i in range(1000000)] )/1000000.0
Returns:
4.3571334791658954e-07
Which is ~0.4e-06
An interesting thing about time.clock() is that it returns the time since the method was first called, so if you wanted microsecond resolution wall time you could do something like this:
class HighPrecisionWallTime():
def __init__(self,):
self._wall_time_0 = time.time()
self._clock_0 = time.clock()
def sample(self,):
dc = time.clock()-self._clock_0
return self._wall_time_0 + dc
(which would probably drift after a while, but you could correct this occasionally, for example dc > 3600 would correct it every hour)
If Python 3 is an option, you have two choices:
time.perf_counter which always use the most accurate clock on your platform. It does include time spent outside of the process.
time.process_time which returns the CPU time. It does NOT include time spent outside of the process.
The difference between the two can be shown with:
from time import (
process_time,
perf_counter,
sleep,
)
print(process_time())
sleep(1)
print(process_time())
print(perf_counter())
sleep(1)
print(perf_counter())
Which outputs:
0.03125
0.03125
2.560001310720671e-07
1.0005455362793145
You can also use time.clock() It counts the time used by the process on Unix and time since the first call to it on Windows. It's more precise than time.time().
It's the usually used function to measure performance.
Just call
import time
t_ = time.clock()
#Your code here
print 'Time in function', time.clock() - t_
EDITED: Ups, I miss the question as you want to know exactly the time, not the time spent...
Python 3.7 introduces 6 new time functions with nanosecond resolution, for example instead of time.time() you can use time.time_ns() to avoid floating point imprecision issues:
import time
print(time.time())
# 1522915698.3436284
print(time.time_ns())
# 1522915698343660458
These 6 functions are described in PEP 564:
time.clock_gettime_ns(clock_id)
time.clock_settime_ns(clock_id, time:int)
time.monotonic_ns()
time.perf_counter_ns()
time.process_time_ns()
time.time_ns()
These functions are similar to the version without the _ns suffix, but
return a number of nanoseconds as a Python int.
time.clock() has 13 decimal points on Windows but only two on Linux.
time.time() has 17 decimals on Linux and 16 on Windows but the actual precision is different.
I don't agree with the documentation that time.clock() should be used for benchmarking on Unix/Linux. It is not precise enough, so what timer to use depends on operating system.
On Linux, the time resolution is high in time.time():
>>> time.time(), time.time()
(1281384913.4374139, 1281384913.4374161)
On Windows, however the time function seems to use the last called number:
>>> time.time()-int(time.time()), time.time()-int(time.time()), time.time()-time.time()
(0.9570000171661377, 0.9570000171661377, 0.0)
Even if I write the calls on different lines in Windows it still returns the same value so the real precision is lower.
So in serious measurements a platform check (import platform, platform.system()) has to be done in order to determine whether to use time.clock() or time.time().
(Tested on Windows 7 and Ubuntu 9.10 with python 2.6 and 3.1)
The original question specifically asked for Unix but multiple answers have touched on Windows, and as a result there is misleading information on windows. The default timer resolution on windows is 15.6ms you can verify here.
Using a slightly modified script from cod3monk3y I can show that windows timer resolution is ~15milliseconds by default. I'm using a tool available here to modify the resolution.
Script:
import time
# measure the smallest time delta by spinning until the time changes
def measure():
t0 = time.time()
t1 = t0
while t1 == t0:
t1 = time.time()
return t1-t0
samples = [measure() for i in range(30)]
for s in samples:
print(f'time delta: {s:.4f} seconds')
These results were gathered on windows 10 pro 64-bit running python 3.7 64-bit.
The comment left by tiho on Mar 27 '14 at 17:21 deserves to be its own answer:
In order to avoid platform-specific code, use timeit.default_timer()
I observed that the resolution of time.time() is different between Windows 10 Professional and Education versions.
On a Windows 10 Professional machine, the resolution is 1 ms.
On a Windows 10 Education machine, the resolution is 16 ms.
Fortunately, there's a tool that increases Python's time resolution in Windows:
https://vvvv.org/contribution/windows-system-timer-tool
With this tool, I was able to achieve 1 ms resolution regardless of Windows version. You will need to be keep it running while executing your Python codes.
For those stuck on windows (version >= server 2012 or win 8)and python 2.7,
import ctypes
class FILETIME(ctypes.Structure):
_fields_ = [("dwLowDateTime", ctypes.c_uint),
("dwHighDateTime", ctypes.c_uint)]
def time():
"""Accurate version of time.time() for windows, return UTC time in term of seconds since 01/01/1601
"""
file_time = FILETIME()
ctypes.windll.kernel32.GetSystemTimePreciseAsFileTime(ctypes.byref(file_time))
return (file_time.dwLowDateTime + (file_time.dwHighDateTime << 32)) / 1.0e7
GetSystemTimePreciseAsFileTime function
On the same win10 OS system using "two distinct method approaches" there appears to be an approximate "500 ns" time difference. If you care about nanosecond precision check my code below.
The modifications of the code is based on code from user cod3monk3y and Kevin S.
OS: python 3.7.3 (default, date, time) [MSC v.1915 64 bit (AMD64)]
def measure1(mean):
for i in range(1, my_range+1):
x = time.time()
td = x- samples1[i-1][2]
if i-1 == 0:
td = 0
td = f'{td:.6f}'
samples1.append((i, td, x))
mean += float(td)
print (mean)
sys.stdout.flush()
time.sleep(0.001)
mean = mean/my_range
return mean
def measure2(nr):
t0 = time.time()
t1 = t0
while t1 == t0:
t1 = time.time()
td = t1-t0
td = f'{td:.6f}'
return (nr, td, t1, t0)
samples1 = [(0, 0, 0)]
my_range = 10
mean1 = 0.0
mean2 = 0.0
mean1 = measure1(mean1)
for i in samples1: print (i)
print ('...\n\n')
samples2 = [measure2(i) for i in range(11)]
for s in samples2:
#print(f'time delta: {s:.4f} seconds')
mean2 += float(s[1])
print (s)
mean2 = mean2/my_range
print ('\nMean1 : ' f'{mean1:.6f}')
print ('Mean2 : ' f'{mean2:.6f}')
The measure1 results:
nr, td, t0
(0, 0, 0)
(1, '0.000000', 1562929696.617988)
(2, '0.002000', 1562929696.6199884)
(3, '0.001001', 1562929696.620989)
(4, '0.001001', 1562929696.62199)
(5, '0.001001', 1562929696.6229906)
(6, '0.001001', 1562929696.6239917)
(7, '0.001001', 1562929696.6249924)
(8, '0.001000', 1562929696.6259928)
(9, '0.001001', 1562929696.6269937)
(10, '0.001001', 1562929696.6279945)
...
The measure2 results:
nr, td , t1, t0
(0, '0.000500', 1562929696.6294951, 1562929696.6289947)
(1, '0.000501', 1562929696.6299958, 1562929696.6294951)
(2, '0.000500', 1562929696.6304958, 1562929696.6299958)
(3, '0.000500', 1562929696.6309962, 1562929696.6304958)
(4, '0.000500', 1562929696.6314962, 1562929696.6309962)
(5, '0.000500', 1562929696.6319966, 1562929696.6314962)
(6, '0.000500', 1562929696.632497, 1562929696.6319966)
(7, '0.000500', 1562929696.6329975, 1562929696.632497)
(8, '0.000500', 1562929696.633498, 1562929696.6329975)
(9, '0.000500', 1562929696.6339984, 1562929696.633498)
(10, '0.000500', 1562929696.6344984, 1562929696.6339984)
End result:
Mean1 : 0.001001 # (measure1 function)
Mean2 : 0.000550 # (measure2 function)
Here is a python 3 solution for Windows building upon the answer posted above by CyberSnoopy (using GetSystemTimePreciseAsFileTime). We borrow some code from jfs
Python datetime.utcnow() returning incorrect datetime
and get a precise timestamp (Unix time) in microseconds
#! python3
import ctypes.wintypes
def utcnow_microseconds():
system_time = ctypes.wintypes.FILETIME()
#system call used by time.time()
#ctypes.windll.kernel32.GetSystemTimeAsFileTime(ctypes.byref(system_time))
#getting high precision:
ctypes.windll.kernel32.GetSystemTimePreciseAsFileTime(ctypes.byref(system_time))
large = (system_time.dwHighDateTime << 32) + system_time.dwLowDateTime
return large // 10 - 11644473600000000
for ii in range(5):
print(utcnow_microseconds()*1e-6)
References
https://learn.microsoft.com/en-us/windows/win32/sysinfo/time-functions
https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getsystemtimepreciseasfiletime
https://support.microsoft.com/en-us/help/167296/how-to-convert-a-unix-time-t-to-a-win32-filetime-or-systemtime
1. Python 3.7 or later
If using Python 3.7 or later, use the modern, cross-platform time module functions such as time.monotonic_ns(), here: https://docs.python.org/3/library/time.html#time.monotonic_ns. It provides nanosecond-resolution timestamps.
import time
time_ns = time.monotonic_ns()
# or on Unix or Linux you can also use:
time_ns = time.clock_gettime_ns()
# or on Windows:
time_ns = time.perf_counter_ns()
# etc. etc. There are others. See the link above.
From my other answer from 2016, here: How can I get millisecond and microsecond-resolution timestamps in Python?:
You might also try time.clock_gettime_ns() on Unix or Linux systems. Based on its name, it appears to call the underlying clock_gettime() C function which I use in my nanos() function in C in my answer here and in my C Unix/Linux library here: timinglib.c.
2. Python 3.3 or later
On Windows, in Python 3.3 or later, you can use time.perf_counter(), as shown by #ereOn here. See: https://docs.python.org/3/library/time.html#time.perf_counter. This provides roughly a 0.5us-resolution timestamp, in floating point seconds. Ex:
import time
# For Python 3.3 or later
time_sec = time.perf_counter() # Windows only, I think
# or on Unix or Linux (I think only those)
time_sec = time.monotonic()
3. Pre-Python 3.3 (ex: Python 3.0, 3.1, 3.2), or later
Summary:
See my other answer from 2016 here for 0.5-us-resolution timestamps, or better, in Windows and Linux, and for versions of Python as old as 3.0, 3.2 or 3.2 even! We do this by calling C or C++ shared object libraries (.dll on Windows, or .so on Unix or Linux) using the ctypes module in Python.
I provide these functions:
millis()
micros()
delay()
delayMicroseconds()
Download GS_timing.py from my eRCaGuy_PyTime repo, then do:
import GS_timing
time_ms = GS_timing.millis()
time_us = GS_timing.micros()
GS_timing.delay(10) # delay 10 ms
GS_timing.delayMicroseconds(10000) # delay 10000 us
Details:
In 2016, I was working in Python 3.0 or 3.1, on an embedded project on a Raspberry Pi, and which I tested and ran frequently on Windows also. I needed nanosecond resolution for some precise timing I was doing with ultrasonic sensors. The Python language at the time did not provide this resolution, and neither did any answer to this question, so I came up with this separate Q&A here: How can I get millisecond and microsecond-resolution timestamps in Python?. I stated in the question at the time:
I read other answers before asking this question, but they rely on the time module, which prior to Python 3.3 did NOT have any type of guaranteed resolution whatsoever. Its resolution is all over the place. The most upvoted answer here quotes a Windows resolution (using their answer) of 16 ms, which is 32000 times worse than my answer provided here (0.5 us resolution). Again, I needed 1 ms and 1 us (or similar) resolutions, not 16000 us resolution.
Zero, I repeat: zero answers here on 12 July 2016 had any resolution better than 16-ms for Windows in Python 3.1. So, I came up with this answer which has 0.5us or better resolution in pre-Python 3.3 in Windows and Linux. If you need something like that for an older version of Python, or if you just want to learn how to call C or C++ dynamic libraries in Python (.dll "dynamically linked library" files in Windows, or .so "shared object" library files in Unix or Linux) using the ctypes library, see my other answer here.
I created a tiny C-Extension that uses GetSystemTimePreciseAsFileTime to provide an accurate timestamp on Windows:
https://win-precise-time.readthedocs.io/en/latest/api.html#win_precise_time.time
Usage:
>>> import win_precise_time
>>> win_precise_time.time()
1654539449.4548845
def start(self):
sec_arg = 10.0
cptr = 0
time_start = time.time()
time_init = time.time()
while True:
cptr += 1
time_start = time.time()
time.sleep(((time_init + (sec_arg * cptr)) - time_start ))
# AND YOUR CODE .......
t00 = threading.Thread(name='thread_request', target=self.send_request, args=([]))
t00.start()

Why single addition takes longer than single additions plus single assignment?

Here is the python code, I use python 3.5.2/Intel(R) Core(TM) i7-4790K CPU # 4.00GHz :
import time
empty_loop_t = 0.14300823211669922
N = 10000000
def single_addition(n):
a = 1.0
b = 0.0
start_t = time.time()
for i in range(0, n):
a + b
end_t = time.time()
cost_t = end_t - start_t - empty_loop_t
print(n,"iterations single additions:", cost_t)
return cost_t
single_addition(N)
def single_addition_plus_single_assignment(n):
a = 1.0
b = 0.0
c = 0.0
start_t = time.time()
for i in range(0, n):
c = a + b
end_t = time.time()
cost_t = end_t - start_t - empty_loop_t
print(n,"iterations single additions and single assignments:", cost_t)
return cost_t
single_addition_plus_single_assignment(N)
The output is:
10000000 iterations single additions: 0.19701123237609863
10000000 iterations single additions and single assignments: 0.1890106201171875
Normally, to get a more reliable result, it is better to do the test using K-fold. However, since K-fold loop itself has influence on the result, I don't use it in my test. And I'm sure this inequality can be reproduced, at least on my machine. So the question is why this happened?
I run it with pypy (had to set empty_loop_t = 0) and got the following results:
(10000000, 'iterations single additions:', 0.014394044876098633)
(10000000, 'iterations single additions and single assignments:', 0.018398046493530273)
So I guess it's up to what interpreter does with the source code and how interpreter executes it. It might be that deliberate assignment takes less operations and workload than disposing of the result with non-JIT interpreter while JIT-compiler forces the code to perform the actual number of operations.
Furthermore, the use of JIT-interpreter makes your script run ~50 times faster on my configuration. If you general aim is to optimize the running time of your script you are probably to look that way.

Why is node.js faster than python in file reading?

I'm profiling node.js vs python in file (48KB) reading synchronously.
Node.js code
var fs = require('fs');
var stime = new Date().getTime() / 1000;
for (var i=0; i<1000; i++){
var content = fs.readFileSync('npm-debug.log');
}
console.log("Total time took is: " + ((new Date().getTime() / 1000) - stime));
Python Code
import time
stime = time.time()
for i in range(1000):
with open('npm-debug.log', mode='r') as infile:
ax = infile.read();
print("Total time is: " + str(time.time() - stime));
Timings are as follows:
$ python test.py
Total time is: 0.5195660591125488
$ node test.js
Total time took is: 0.25799989700317383
Where is the difference?
In File IO or
Python list ds allocation
Or Am I not comparing apples to apples?
EDIT:
Updated python's readlines() to read() for a good comparison
Changed the iterations to 1000 from 500
PURPOSE:
To understand the truth in node.js is slower than python is slower than C kind of things and if so slow at which place in this context.
readlines returns a list of lines in the file, so it has to read the data char by char, constantly comparing the current character to any of the newline characters, and keep composing a list of lines.
This is more complicated than simple file.read(), which would be the equivalent of what Node.js does.
Also, the length calculated by your Python script is the number of lines, while Node.js gets the number of characters.
If you want even more speed, use os.open instead of open:
import os, time
def Test_os(n):
for x in range(n):
f = os.open('Speed test.py', os.O_RDONLY)
data = ""
t = os.read(f, 1048576).decode('utf8')
while t:
data += t
t = os.read(f, 1048576).decode('utf8')
os.close(f)
def Test_open(n):
for x in range(n):
with open('Speed test.py') as f:
data = f.read()
s = time.monotonic()
Test_os(500000)
print(time.monotonic() - s)
s = time.monotonic()
Test_open(500000)
print(time.monotonic() - s)
On my machine os.open is several seconds faster than open. The output is as follows:
53.68909174999999
58.12600833400029
As you can see, open is 4.4 seconds slower than os.open, although as the number of runs decreases, so does this difference.
Also, you should try tweaking the buffer size of the os.read function as different values may give very different timings:
Here 'operation' means a single call to Test_os.
If you get rid of bytes' decoding and use io.BytesIO instead of mere bytes objects, you'll get a considerable speedup:
def Test_os(n, buf):
for x in range(n):
f = os.open('test.txt', os.O_RDONLY)
data = io.BytesIO()
while data.write(os.read(f, buf)):
...
os.close(f)
Thus, the best result is now 0.038 seconds per call instead of 0.052 (~37% speedup).

How to retrieve the process start time (or uptime) in python

How to retrieve the process start time (or uptime) in python in Linux?
I only know, I can call "ps -p my_process_id -f" and then parse the output. But it is not cool.
By using psutil https://github.com/giampaolo/psutil:
>>> import psutil, os, time
>>> p = psutil.Process(os.getpid())
>>> p.create_time()
1293678383.0799999
>>> time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.create_time()))
'2010-12-30 04:06:23'
>>>
...plus it's cross platform, not only Linux.
NB: I am one of the authors of this project.
If you are doing it from within the python program you're trying to measure, you could do something like this:
import time
# at the beginning of the script
startTime = time.time()
# ...
def getUptime():
"""
Returns the number of seconds since the program started.
"""
# do return startTime if you just want the process start time
return time.time() - startTime
Otherwise, you have no choice but to parse ps or go into /proc/pid. A nice bashy way of getting the elapsed time is:
ps -eo pid,etime | grep $YOUR_PID | awk '{print $2}'
This will only print the elapsed time in the following format, so it should be quite easy to parse:
days-HH:MM:SS
(if it's been running for less than a day, it's just HH:MM:SS)
The start time is available like this:
ps -eo pid,stime | grep $YOUR_PID | awk '{print $2}'
Unfortunately, if your process didn't start today, this will only give you the date that it started, rather than the time.
The best way of doing this is to get the elapsed time and the current time and just do a bit of math. The following is a python script that takes a PID as an argument and does the above for you, printing out the start date and time of the process:
import sys
import datetime
import time
import subprocess
# call like this: python startTime.py $PID
pid = sys.argv[1]
proc = subprocess.Popen(['ps','-eo','pid,etime'], stdout=subprocess.PIPE)
# get data from stdout
proc.wait()
results = proc.stdout.readlines()
# parse data (should only be one)
for result in results:
try:
result.strip()
if result.split()[0] == pid:
pidInfo = result.split()[1]
# stop after the first one we find
break
except IndexError:
pass # ignore it
else:
# didn't find one
print "Process PID", pid, "doesn't seem to exist!"
sys.exit(0)
pidInfo = [result.split()[1] for result in results
if result.split()[0] == pid][0]
pidInfo = pidInfo.partition("-")
if pidInfo[1] == '-':
# there is a day
days = int(pidInfo[0])
rest = pidInfo[2].split(":")
hours = int(rest[0])
minutes = int(rest[1])
seconds = int(rest[2])
else:
days = 0
rest = pidInfo[0].split(":")
if len(rest) == 3:
hours = int(rest[0])
minutes = int(rest[1])
seconds = int(rest[2])
elif len(rest) == 2:
hours = 0
minutes = int(rest[0])
seconds = int(rest[1])
else:
hours = 0
minutes = 0
seconds = int(rest[0])
# get the start time
secondsSinceStart = days*24*3600 + hours*3600 + minutes*60 + seconds
# unix time (in seconds) of start
startTime = time.time() - secondsSinceStart
# final result
print "Process started on",
print datetime.datetime.fromtimestamp(startTime).strftime("%a %b %d at %I:%M:%S %p")
man proc says that the 22nd item in /proc/my_process_id/stat is:
starttime %lu
The time in jiffies the process started after system boot.
Your problem now is, how to determine the length of a jiffy and how to determine when the system booted.
The answer for the latter comes still from man proc: it's in /proc/stat, on a line of its own like this:
btime 1270710844
That's a measurement in seconds since Epoch.
The answer for the former I'm not sure about. man 7 time says:
The Software Clock, HZ, and Jiffies
The accuracy of many system calls and timestamps is limited by the resolution of the software clock, a clock maintained by the kernel which measures time in jiffies. The size of a jiffy is determined by the value of the kernel constant HZ. The value of HZ varies across kernel versions and hardware platforms. On x86 the situation is as follows: on kernels up to and including 2.4.x, HZ was 100, giving a jiffy value of 0.01 seconds; starting with 2.6.0, HZ was raised to 1000, giving a jiffy of 0.001 seconds; since kernel 2.6.13, the HZ value is a kernel configuration parameter and can be 100, 250 (the default) or 1000, yielding a jiffies value of, respectively, 0.01, 0.004, or 0.001 seconds.
We need to find HZ, but I have no idea on how I'd go about that from Python except for hoping the value is 250 (as Wikipedia claims is the default).
ps obtains it thus:
/* sysinfo.c init_libproc() */
if(linux_version_code > LINUX_VERSION(2, 4, 0)){
Hertz = find_elf_note(AT_CLKTCK);
//error handling
}
old_Hertz_hack(); //ugh
This sounds like a job well done by a very small C module for Python :)
Here's code based on badp's answer:
import os
from time import time
HZ = os.sysconf(os.sysconf_names['SC_CLK_TCK'])
def proc_age_secs():
system_stats = open('/proc/stat').readlines()
process_stats = open('/proc/self/stat').read().split()
for line in system_stats:
if line.startswith('btime'):
boot_timestamp = int(line.split()[1])
age_from_boot_jiffies = int(process_stats[21])
age_from_boot_timestamp = age_from_boot_jiffies / HZ
age_timestamp = boot_timestamp + age_from_boot_timestamp
return time() - age_timestamp
I'm not sure if it's right though. I wrote a test program that calls sleep(5) and then runs it and the output is wrong and varies over a couple of seconds from run to run. This is in a vmware workstation vm:
if __name__ == '__main__':
from time import sleep
sleep(5)
print proc_age_secs()
The output is:
$ time python test.py
6.19169998169
real 0m5.063s
user 0m0.020s
sys 0m0.036s
def proc_starttime(pid=os.getpid()):
# https://gist.github.com/westhood/1073585
p = re.compile(r"^btime (\d+)$", re.MULTILINE)
with open("/proc/stat") as f:
m = p.search(f.read())
btime = int(m.groups()[0])
clk_tck = os.sysconf(os.sysconf_names["SC_CLK_TCK"])
with open("/proc/%d/stat" % pid) as f:
stime = int(f.read().split()[21]) / clk_tck
return datetime.fromtimestamp(btime + stime)
you can parse /proc/uptime
>>> uptime, idletime = [float(f) for f in open("/proc/uptime").read().split()]
>>> print uptime
29708.1
>>> print idletime
26484.45
for windows machines, you can probably use wmi
import wmi
c = wmi.WMI()
secs_up = int([uptime.SystemUpTime for uptime in c.Win32_PerfFormattedData_PerfOS_System()][0])
hours_up = secs_up / 3600
print hours_up

Categories