How to find running time of a thread in Python - python

I have a multi-threaded SMTP server. Each thread takes care of one client. I need to set a timeout value of 10 seconds on each server thread to terminate dormant or misbehaving clients.
I have used time.time() to record the start time and a checkpoint time; the difference gives the running time. But I believe this measures system (wall-clock) time, not the time this thread was actually running.
Is there a thread-local timer API in Python?
import threading
stop = 0
def hello():
    stop = 1
t=threading.Timer(10,hello)
t.start()
while stop != 1:
    print stop
print "stop changed"
This prints 0 (initial stop) in a loop and does not come out of the while loop.

Python has progressed in the six years since this question was asked, and version 3.3 introduced a tool for exactly what was being asked for here (available on Unix-like systems):
time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID)
Python 3.7 additionally introduced the analogous time.clock_gettime_ns.
The detailed docs are in the time module reference, but the feature is straightforward to use as is.

In the Python documentation there is no mention of "thread timing". The clocks are either process-wide or system-wide. In particular, time.clock measures process time while time.time returns the system time.
In Python 3.3 the timing API was revised and improved, but I still can't see any timer that would return the process time taken by a single thread.
Also note that, even if it is possible, such a timer is not at all easy to write.
Timers are OS-specific, so you would have to write a different version of the module for every OS. If you want to profile a specific action, just launch it without threads.
When threaded, the code either runs as expected or runs a lot slower because of the OS, in which case there is nothing you can do about it (unless you want to write a patch that "fixes" the GIL or removes it safely).

Python 3.7 added the time.thread_time() function, which seems to do what this question needs. According to the docs, it is thread-specific and excludes time spent sleeping.
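To make the difference between thread CPU time and wall-clock time concrete, here is a minimal sketch, assuming Python 3.7+ on a platform that supports time.thread_time(). The helper names measure and runner are made up for illustration. Note that a dormant client burns almost no CPU, so for the 10-second timeout in the question the wall-clock figure is probably the one you want.

```python
import threading
import time

def measure(work):
    """Run work() in a new thread; return (cpu_seconds, wall_seconds)."""
    result = {}

    def runner():
        wall_start = time.monotonic()
        cpu_start = time.thread_time()  # CPU time of this thread only
        work()
        result["cpu"] = time.thread_time() - cpu_start
        result["wall"] = time.monotonic() - wall_start

    t = threading.Thread(target=runner)
    t.start()
    t.join()
    return result["cpu"], result["wall"]

# A thread that only sleeps uses wall-clock time but almost no CPU time.
cpu, wall = measure(lambda: time.sleep(0.2))
print("cpu=%.3f wall=%.3f" % (cpu, wall))
```

time.monotonic() is used for the wall-clock side because, unlike time.time(), it is not affected by system clock adjustments.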

The hello function's stop value is local, not the global one.
Add the following:
def hello():
    global stop
    stop = 1
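As an alternative to a global flag, a threading.Event can carry the same signal and lets the waiting loop block instead of spinning. This is only a sketch of that idea, not the asker's code; the 0.2-second delay stands in for the 10 seconds in the question so the demo finishes quickly.

```python
import threading

stop = threading.Event()

# The Timer calls stop.set() after the delay (0.2 s here instead of 10 s).
timer = threading.Timer(0.2, stop.set)
timer.start()

# wait() blocks for up to `timeout` seconds and returns True once the event is set,
# so the loop does not burn CPU the way `while stop != 1` does.
while not stop.wait(timeout=0.05):
    pass  # periodic work could go here

timer.join()
print("stop changed")
```

Because the Event is an object that both threads share, no global statement is needed.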

Here is some sample code that measures the running time of a thread; you can adapt it to your own function.
import time
import threading
def hello():
    x = 0
    while x < 100000000:
        x += 1
# time.clock() was removed in Python 3.8; perf_counter() measures elapsed wall-clock time
start = time.perf_counter()
t = threading.Thread(target=hello)
t.start()
t.join()
end = time.perf_counter()
print("The time was {}".format(end - start))
On my system, it gave a time of 8.34 seconds.

Related

Python multithreaded, Sleep / Wait not working

I have been converting my Python OpenCV code to multi-threaded tonight and have become completely stuck.
As far as I'm aware, I must be the only example on the internet where time.sleep and event.wait do not work.
I have 3 threads: the first thread finds boxes, the second thread figures out whether it's a good time to take an action, and the final thread uses the output of both to act on this information.
def click_boxes():
    global list_of_boxes
    global player_obj
    if (player_obj.status == "idle"):
        for box in list_of_boxes:
            if box.status == 'fallen':
                print(time.time())
                time.sleep(1.0)
                print("???? , " + str(time.time()))
    return None
This results in sleep or wait not blocking, as if they had no effect at all.
If I '.join()' this final click_boxes thread that I wish to block, I get a functional sleep/wait BUT it blocks the entire script so I lose all benefit of multi-threading.
click_boxes_t = threading.Thread(target=click_boxes, args=())
click_boxes_t.start()
click_boxes_t.join()
This might sound stupid, but I haven't found an answer on the internet. Why would wait() and sleep() not work in a multi-threaded scenario, and what's the solution?
Fixed using a non-blocking check: 'if time now > time set + time to wait'
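The non-blocking check described in the fix could look roughly like this. It is a sketch under my own assumptions: the Cooldown class and its names are hypothetical, and the short delays are only there to keep the demo fast.

```python
import time

class Cooldown:
    """Hypothetical helper: ready() is True at most once per `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay
        self.ready_at = time.monotonic()  # ready immediately on first check

    def ready(self):
        now = time.monotonic()
        if now >= self.ready_at:      # "time now > time set + time to wait"
            self.ready_at = now + self.delay
            return True
        return False

cooldown = Cooldown(0.1)
clicks = 0
deadline = time.monotonic() + 0.35
while time.monotonic() < deadline:
    if cooldown.ready():
        clicks += 1      # the action runs only when the cooldown has expired
    time.sleep(0.01)     # the loop itself never blocks for the full delay
print(clicks)
```

The thread keeps looping and can do other work between checks, which is the point of replacing the one-second sleep.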

Python multithreading - memory not released when ran using While statement

I built a scraper (worker) launched XX times through multithreading (via Jupyter Notebook, python 2.7, anaconda).
Script is of the following format, as described on python.org:
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()
q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()
for item in source():
    q.put(item)
q.join() # block until all tasks are done
When I run the script as is, there are no issues. Memory is released after script finishes.
However, I want to run the script 20 times (batching of sorts),
so I turned the script into a function and ran the function using the code below:
def multithreaded_script():
    pass # script code from above goes here
x = 0
while x<20:
    x +=1
    multithreaded_script()
Memory builds up with each iteration, and eventually the system starts writing it to disk.
Is there a way to clear out the memory after each run?
I tried:
setting all the variables to None
setting sleep(30) at end of each iteration (in case it takes time for ram to release)
and nothing seems to help.
Any ideas on what else I can try to get the memory to clear out after each run within the While statement?
If not, is there a better way to execute my script XX times, that would not eat up the ram?
Thank you in advance.
TL;DR Solution: Make sure to end each function with return to ensure all local variables are destroyed from RAM.
Per Pavel's suggestion, I used a memory tracker (unfortunately the suggested memory tracker didn't work for me, so I used Pympler).
Implementation was fairly simple:
from pympler.tracker import SummaryTracker
tracker = SummaryTracker()
# ... your code here ...
tracker.print_diff()
The tracker gave a nice output, which made it obvious that local variables generated by functions were not being destroyed.
Adding "return" at the end of every function fixed the issue.
Takeaway:
If you are writing a function that processes info or generates local variables but doesn't pass them to anything else, make sure to end the function with return anyway. In my case this resolved the memory buildup.
Additional notes on memory usage & BeautifulSoup:
If you are using BeautifulSoup / BS4 with multithreading and multiple workers, and have limited amount of free ram, you can also use soup.decompose() to destroy soup variable right after you are done with it, instead of waiting for the function to return/code to stop running.

Is there anything in Python 2.7 akin to Go's `time.Tick` or Netty's `HashedWheelTimer`?

I write a lot of code that relies on precise periodic method calls. I've been using Python's futures library to submit calls onto the runtime's thread pool and sleeping between calls in a loop:
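from concurrent.futures import ThreadPoolExecutor
from multiprocessing import cpu_count
import time
executor = ThreadPoolExecutor(max_workers=cpu_count())
def remote_call():
    # make a synchronous bunch of HTTP requests
    pass
def loop():
    while True:
        # do work here
        executor.submit(remote_call)
        time.sleep(60*5)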
However, I've noticed that this implementation introduces some drift after a long duration of running (e.g. I've run this code for about 10 hours and noticed about 7 seconds of drift). For my work I need this to run on the exact second, and millisecond would be even better. Some folks have pointed me to asyncio ("Fire and forget" python async/await), but I have not been able to get this working in Python 2.7.
I'm not looking for a hack. What I really want is something akin to Go's time.Tick or Netty's HashedWheelTimer.
Nothing like that comes with Python. You'd need to manually adjust your sleep times to account for time spent working.
You could fold that into an iterator, much like the channel of Go's time.Tick:
import itertools
import time
import timeit
def tick(interval, initial_wait=False):
    # time.perf_counter would probably be more appropriate on Python 3
    start = timeit.default_timer()
    if not initial_wait:
        # yield immediately instead of sleeping
        yield
    for i in itertools.count(1):
        # clamp at zero: time.sleep rejects a negative delay if the loop body overran
        time.sleep(max(0, start + i*interval - timeit.default_timer()))
        yield
for _ in tick(300):
    # Will execute every 5 minutes, accounting for time spent in the loop body.
    do_stuff()
Note that the above ticker starts ticking when you start iterating, rather than when you call tick, which matters if you try to start a ticker and save it for later. Also, it doesn't send the time, and it won't drop ticks if the receiver is slow. You can adjust all that on your own if you want.

5 minutes loop in python does it causing issue?

I am using this loop to run something every 5 minutes; it just creates a thread, which then completes.
while True:
    now_plus_5 = now + datetime.timedelta(minutes = 5)
    while datetime.datetime.now()<= now_plus_5:
        new=datetime.datetime.now()
        pass
    now = new
    pass
But when I check my process status, it shows 100% CPU usage while the script runs. Does this cause a problem? Is there a better way?
Does this loop cause the 100% CPU usage?
You might rather use something like time.sleep:
while True:
    # do something
    time.sleep(5*60) # wait 5 minutes
Based on your comment above, you may find a Timer object from the threading module to better suit your needs:
from threading import Timer
def hello():
    print("hello, world")
t = Timer(300.0, hello)
t.start() # after 5 minutes, "hello, world" will be printed
(code snippet modified from docs)
A Timer is a thread subclass, so you can further encapsulate your logic as needed.
This allows the threading subsystem to schedule the execution of your task such that it's not entirely CPU bound like your current implementation.
I should also note that the Timer class is designed to be fired only once. As such, you'd want to design your task to start a new instance upon completion, or create your own Thread subclass with its own smarts.
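One way to get a repeating timer is sketched below. It is not re-arming a Timer; it is a plain worker thread that loops on Event.wait, which doubles as an interruptible sleep. The class name RepeatingTimer and its methods are hypothetical, and the short interval is only for the demo.

```python
import threading
import time

class RepeatingTimer:
    """Hypothetical helper: calls func every `interval` seconds until stop()."""

    def __init__(self, interval, func):
        self.interval = interval
        self.func = func
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # wait() returns False on timeout (run the task) and True once stop() is called.
        while not self._stopped.wait(self.interval):
            self.func()

    def stop(self):
        self._stopped.set()
        self._thread.join()

calls = []
rt = RepeatingTimer(0.05, lambda: calls.append(time.monotonic()))
rt.start()
time.sleep(0.3)   # the main thread is free to do other work here
rt.stop()
print(len(calls))
```

Like time.sleep in a loop, this waits a full interval after each call finishes, so the time spent in func adds to the period.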
While researching this, I noticed that there's also a sched module that provides this functionality as well, but rather than rehash the solution, check out this related question:
Python Equivalent of setInterval()?
timedelta accepts days, seconds, microseconds, milliseconds, minutes, hours, and weeks as input (there are no months or years arguments) and works accordingly:
from datetime import datetime,timedelta
end_time = datetime.now()+timedelta(minutes=5)
while end_time>= datetime.now():
    statements

PYTHON: How to perform set of instruction at predefined time

I have a set of instructions, say {I}, and I would like to perform this set {I}
at predefined times, for instance every minute.
I'm not asking how to insert a delay of 1 minute between two successive executions of
the set {I}; I want to start the instructions {I} every minute, independently of how long {I} takes to execute.
If I understand correctly, the following code
import time
while True:
    {I}
    time.sleep(60)
would simply insert a delay of 60 seconds between the end of one execution of {I} and the start of the next. Is that right? Instead, I would like the set of instructions {I} to start every minute (for instance at 9.00 am, 9.01 am, 9.02 am, etc.).
Is it possible to perform such a task inside Python, or is it preferable to write a script with {I} that I execute every minute, for instance with crontab?
Thank you in advance
Looks like signal.alarm and signal.signal(signal.SIGALRM, handler) should help you.
If you don't need finer resolution than a minute, cron would be the easiest option. Otherwise you'd end up re-writing something like it.
If you need intervals shorter than a minute, you might consider "timeouts" from the glib library. It has Python bindings. The timeout should then probably start the task in a separate process.
Something like APScheduler might meet your needs.
I'm sure there are other similar packages out there as well.
Chances are, you'd have to instantiate separate threads for every instruction to be run concurrently, and simply dispatch them in your delayed while loop.
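If you go the while-loop route, the sleep can be computed so that each run starts on a whole-minute boundary (9.00, 9.01, ...) rather than 60 seconds after the previous run ended. A minimal sketch of that calculation, with seconds_until_next_minute being a made-up helper name:

```python
import datetime

def seconds_until_next_minute(now):
    """Seconds from `now` to the next whole minute (60.0 if exactly on a boundary)."""
    elapsed = now.second + now.microsecond / 1e6
    return 60.0 - elapsed

# 09:00:42.5 is 17.5 seconds before 09:01:00.
print(seconds_until_next_minute(datetime.datetime(2024, 1, 1, 9, 0, 42, 500000)))  # 17.5
```

In the loop you would call time.sleep(seconds_until_next_minute(datetime.datetime.now())) and then start {I} in its own thread, so that a slow run does not delay the next boundary.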
You could spawn a thread every second using threading.Timer:
import threading
import time
def do_stuff(count):
    print(count)
    if count < 10: # Let's build in some way to quit
        t = threading.Timer(1.0, do_stuff, args=[count+1])
        t.start()
t = threading.Timer(0.0, do_stuff, args=[0])
t.start()
t.join()
Using the sched module is another possibility, but note that the sched.scheduler.run method blocks the main process until the event queue is empty. (So if the do_stuff function takes longer than a second, the next event won't run on time.) If you want nonblocking events, use threading.Timer.
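For comparison, here is roughly what the sched approach could look like when events are queued at absolute times with enterabs. This is a sketch only; run_fixed_rate is a hypothetical name and the 0.05-second interval stands in for one minute.

```python
import sched
import time

def run_fixed_rate(action, interval, count):
    """Run action() `count` times, `interval` seconds apart, at absolute times."""
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    start = time.monotonic()
    for i in range(count):
        # Absolute times: a slow action delays the next run but does not shift the schedule.
        scheduler.enterabs(start + i * interval, 1, action)
    scheduler.run()  # blocks the calling thread until the queue is empty

stamps = []
run_fixed_rate(lambda: stamps.append(time.monotonic()), 0.05, 4)
print(len(stamps))
```

Since run() executes the actions one after another in the calling thread, an action that takes longer than the interval still makes the following one late, as noted above.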
