Python3 parallel processes with their own timeouts - python

I have a requirement where I have to launch multiple applications, capture each of their stdout and stderr, and these applications run indefinitely. The processes do not exchange/share data and are independent of each other. To run variable stress tests, their timeouts might be different.
For example:
app1 -> 70sec
app2 -> 30sec
.
.
appN -> 20sec
If these were multiple apps with one common timeout, I would have wrapped it in a timed while loop and killed all processes in the end.
Here are some approaches I think should work:
A timer thread for each app, which reads stdout and, as soon as the timeout expires, kills the process. The process is launched within the thread
One timer thread that loops through a dictionary of pid/process_object:end_time entries and kills a process once the current time passes its end_time
I have tried using asyncio gather, but it doesn't fully meet my needs and I have faced some issues on Windows.
Are there any other approaches that I can use?

The second option is pretty production-ready: have a control loop that polls for processes to complete and kills them when they exceed their timeout.

Here is the code for the second approach (extended from https://stackoverflow.com/a/9745864/286990).
#!/usr/bin/env python
import io
import os
import sys
from subprocess import Popen
import threading
import time
import psutil

def proc_monitor_thread(proc_dict):
    while proc_dict != {}:
        for k, v in list(proc_dict.items()):
            if time.time() > v:
                print("killing " + str(k))
                m = psutil.Process(k)
                m.kill()
                del proc_dict[k]
        time.sleep(2)

pros = {}
ON_POSIX = 'posix' in sys.builtin_module_names

# create a pipe to get data
input_fd, output_fd = os.pipe()

# start several subprocesses
st_time = time.time()
for i in ["www.google.com", "www.amd.com", "www.wix.com"]:
    proc = Popen(["ping", "-t", str(i)], stdout=output_fd,
                 close_fds=ON_POSIX)  # close input_fd in children
    if "google" in i:
        pros[proc.pid] = time.time() + 5
    elif "amd" in i:
        pros[proc.pid] = time.time() + 8
    else:
        pros[proc.pid] = time.time() + 10

os.close(output_fd)

x = threading.Thread(target=proc_monitor_thread, args=(pros,))
x.start()

# read output line by line as soon as it is available
with io.open(input_fd, 'r', buffering=1) as file:
    for line in file:
        print(line, end='')

print("End")
x.join()
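For comparison, the first approach from the question (a watchdog thread per app) can also be built with just the standard library, since Popen.wait() accepts a timeout. This is only a minimal sketch; the ping command and the timeouts are placeholders, not the real workloads:

import subprocess
import threading

def run_with_timeout(cmd, timeout_s):
    # launch the process and read its stdout from this thread
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)

    def watchdog():
        try:
            proc.wait(timeout=timeout_s)   # returns early if the process exits by itself
        except subprocess.TimeoutExpired:
            proc.kill()                    # timeout hit: kill the process

    threading.Thread(target=watchdog, daemon=True).start()
    for line in proc.stdout:               # ends once the process exits or is killed
        print(line, end='')

threads = [threading.Thread(target=run_with_timeout,
                            args=(["ping", "127.0.0.1"], t))
           for t in (5, 8, 10)]
for t in threads:
    t.start()
for t in threads:
    t.join()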

Related

Using both multiprocessing and multithreading in a Python script to speed up execution

I have the following range of subnets: 10.106.44.0/24 - 10.106.71.0/24. I am writing a Python script to ping each IP in all the subnets. To speed up this script I am trying to use both multiprocessing and multithreading. I am creating a new process for each subnet and creating a new thread to ping each host in that subnet. I would like to ask two questions:
Is this the best approach for this problem?
If yes, how would I go about implementing this?
I would first try to use threading. You can try creating a thread pool whose size is the total number of pings you have to do, but ultimately I believe that this will not do much better than a thread pool whose size equals the number of CPU cores you have (explanation below). Here is a comparison of both ways, using threading and multiprocessing:
ThreadPoolExecutor (255 threads)
from concurrent.futures import ThreadPoolExecutor
import os
import platform
import subprocess
import time

def ping_ip(ip_address):
    param = '-n' if platform.system().lower() == 'windows' else '-c'
    try:
        output = subprocess.check_output(f"ping {param} 1 {ip_address}", shell=True, universal_newlines=True)
        if 'unreachable' in output:
            return False
        else:
            return True
    except Exception:
        return False

def main():
    t1 = time.time()
    ip_addresses = ['192.168.1.154'] * 255
    #with ThreadPoolExecutor(os.cpu_count()) as executor: # uses number of CPU cores
    with ThreadPoolExecutor(len(ip_addresses)) as executor:
        results = list(executor.map(ping_ip, ip_addresses))
    #print(results)
    print(time.time() - t1)

if __name__ == '__main__':
    main()
Prints:
2.049474000930786
You can try experimenting with fewer threads (max_workers argument to the ThreadPoolExecutor constructor). See: concurrent.futures
I found that running 8 threads, which is the number of cores I had, did just about as well (timing: 2.2745485305786133). I believe the reason is that, despite pinging being an I/O-related task, the call to subprocess must internally be creating a new process that uses a fair amount of CPU, so the concurrency is somewhat processor-limited.
ProcessPoolExecutor (8 cores)
from concurrent.futures import ProcessPoolExecutor
import os
import platform
import subprocess
import time

def ping_ip(ip_address):
    param = '-n' if platform.system().lower() == 'windows' else '-c'
    try:
        output = subprocess.check_output(f"ping {param} 1 {ip_address}", shell=True, universal_newlines=True)
        if 'unreachable' in output:
            return False
        else:
            return True
    except Exception:
        return False

def main():
    t1 = time.time()
    ip_addresses = ['192.168.1.154'] * 255
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(ping_ip, ip_addresses))
    #print(results)
    print(time.time() - t1)

if __name__ == '__main__':
    main()
Prints:
2.509838819503784
Note that on my Linux system you have to be a superuser to issue a ping command.
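If you still want to try the process-per-subnet plus thread-per-host combination the question asks about, the two executors can be nested. This is only a minimal sketch reusing the same ping_ip() helper and the subnet range given in the question; whether it actually beats a single pool is something to measure:

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import ipaddress
import platform
import subprocess

def ping_ip(ip_address):
    # same helper as above: one ping, True if the host answered
    param = '-n' if platform.system().lower() == 'windows' else '-c'
    try:
        output = subprocess.check_output(f"ping {param} 1 {ip_address}",
                                         shell=True, universal_newlines=True)
        return 'unreachable' not in output
    except Exception:
        return False

def ping_subnet(subnet):
    # one thread per host inside this worker process
    hosts = [str(ip) for ip in ipaddress.ip_network(subnet).hosts()]
    with ThreadPoolExecutor(len(hosts)) as executor:
        return subnet, sum(executor.map(ping_ip, hosts))

def main():
    subnets = [f"10.106.{n}.0/24" for n in range(44, 72)]
    with ProcessPoolExecutor() as executor:   # one worker process per CPU core
        for subnet, alive in executor.map(ping_subnet, subnets):
            print(f"{subnet}: {alive} hosts responded")

if __name__ == '__main__':
    main()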

running two different processes in parallel py 3.8

I want to develop a system that reads input from two devices at the same time. Each process works independently at the moment, but since I need to sync them, I want them both to write their output to the same file.
import multiprocessing as mp
from multiprocessing import Process
from multiprocessing import Pool
import time

# running the data acquisition from the screen
def Screen(fname):
    for x in range(1, 9):
        fname.write(str(x) + '\n')
        fname.flush()
        time.sleep(0.5)
        print(x)

# running the data acquisition from the EEG
def EEG(fname):
    for y in range(10, 19):
        fname.write(str(y) + '\n')
        fname.flush()
        time.sleep(0.3)
        print(y)

# main program body #
# open the common file that the processes write to
fname = open('C:/Users/Yaron/Documents/Python Scripts/research/demofile.txt', 'w+')
pool = Pool(processes=2)
p1 = pool.map_async(Screen, fname)
p2 = pool.map_async(EEG, fname)
print('end')
fname.close()
In multiprocessing, depending on the OS, you may not be able to pass an open file handle to a child process. Here's code that should work on any OS:
import multiprocessing as mp
import time

def Screen(fname, lock):
    with open(fname, 'a') as f:
        for y in range(1, 11):
            time.sleep(0.5)
            with lock:
                print(y)
                print(y, file=f, flush=True)

def EEG(fname, lock):
    with open(fname, 'a') as f:
        for y in range(11, 21):
            time.sleep(0.3)
            with lock:
                print(y)
                print(y, file=f, flush=True)

if __name__ == '__main__':
    fname = 'demofile.txt'
    lock = mp.Lock()
    with open(fname, 'w'): pass  # truncates existing file and closes it
    processes = [mp.Process(target=Screen, args=(fname, lock)),
                 mp.Process(target=EEG, args=(fname, lock))]
    s = time.perf_counter()
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f'end (time={time.perf_counter() - s}s)')
Some notes:
Open the file in each process. Windows, for example, doesn't fork() the process and doesn't inherit the handle. The handle isn't picklable to pass between processes.
Open the file for append. Two processes would have two different file pointers. Append makes sure it seeks to the end each time.
Protect the file accesses with a lock for serialization. Create the lock in the main thread and pass the same Lock to each process.
Use if __name__ == '__main__': to run one-time code in the main thread. Some OSes import the script in other processes and this protects the code from running multiple times.
map_async isn't used correctly: it takes an iterable of arguments to pass to the function (see the sketch below for typical usage). Instead, make the two processes, start them, and join them to wait for completion.
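For reference, a minimal sketch of how map_async is normally used; for the two-device case in this question, the two plain Process objects above remain the simpler fit:

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        async_result = pool.map_async(square, [1, 2, 3, 4])  # iterable of arguments
        print(async_result.get(timeout=10))                  # -> [1, 4, 9, 16]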

Python multiprocessing: kill process if it is taking too long to return

Below is a simple example that freezes because a child process exits without returning anything and the parent keeps waiting forever. Is there a way to timeout a process if it takes too long and let the rest continue? I am a beginner to multiprocessing in python and I find the documentation not very illuminating.
import multiprocessing as mp
import sys
import time

def foo(x):
    if x == 3:
        sys.exit()
    # some heavy computation here
    return result

if __name__ == '__main__':
    pool = mp.Pool(mp.cpu_count())
    results = pool.map(foo, [1, 2, 3])
I had the same problem, and this is how I solved it. Maybe there are better solutions; however, it also solves issues not mentioned here. For example, if the process is using a lot of resources, a normal termination can take a while to get through to it, so I use a forceful termination (kill -9). That part probably only works on Linux, so you may have to adapt the termination if you are using another OS.
It is part of my own code, so it is probably not copy-pasteable.
from multiprocessing import Process, Queue
import os
import time

timeout_s = 5000  # seconds after which you want to kill the process
queue = Queue()   # results can be written in here, if you have return objects

p = Process(target=INTENSIVE_FUNCTION, args=(ARGS_TO_INTENSIVE_FUNCTION, queue))
p.start()

start_time = time.time()
check_interval_s = 5  # regularly check what the process is doing
kill_process = False
finished_work = False
while not kill_process and not finished_work:
    time.sleep(check_interval_s)
    now = time.time()
    runtime = now - start_time
    if not p.is_alive():
        print("finished work")
        finished_work = True
    if runtime > timeout_s and not finished_work:
        print("prepare killing process")
        kill_process = True

if kill_process:
    while p.is_alive():
        # forcefully kill the process, because often (during heavy computations)
        # a graceful termination can be ignored by a process.
        print(f"send SIGKILL signal to process because exceeding {timeout_s} seconds.")
        os.system(f"kill -9 {p.pid}")
        if p.is_alive():
            time.sleep(check_interval_s)
else:
    try:
        p.join(60)  # wait 60 seconds to join the process
        RETURN_VALS = queue.get(timeout=60)
    except Exception:
        # This can happen if a process was killed for other reasons (such as out of memory)
        print("Joining the process and receiving results failed, results are set as invalid.")

Gevent: Using two queues with two consumers without blocking each other at the same time

I have a problem where I need to write values generated by a consumer to disk. I do not want to open a new instance of a file for every write, so I thought I would use a second queue and another consumer that writes to disk from a single greenlet. The problem with my code is that the second queue is not consumed asynchronously alongside the first queue: the first queue finishes first, and only then does the second queue get consumed.
I want to write values to disk at the same time as other values are being generated.
Thanks for the help!
#!/usr/bin/python
# -*- coding: utf-8 -*-

import gevent  # pip install gevent
from gevent.queue import *
import gevent.monkey
from timeit import default_timer as timer
from time import sleep
import cPickle as pickle

gevent.monkey.patch_all()

def save_lineCount(count):
    with open("count.p", "wb") as f:
        pickle.dump(count, f)

def loader():
    for i in range(0, 3):
        q.put(i)

def writer():
    while True:
        task = q_w.get()
        print "writing", task
        save_lineCount(task)

def worker():
    while not q.empty():
        task = q.get()
        if task % 2:
            q_w.put(task)
            print "put", task
        sleep(10)

def asynchronous():
    threads = []
    threads.append(gevent.spawn(writer))
    for i in range(0, 1):
        threads.append(gevent.spawn(worker))
    start = timer()
    gevent.joinall(threads, raise_error=True)
    end = timer()
    #pbar.close()
    print "\n\nTime passed: " + str(end - start)[:6]

q = gevent.queue.Queue()
q_w = gevent.queue.Queue()

gevent.spawn(loader).join()
asynchronous()
In general, that approach should work fine. There are some problems with this specific code, though:
Calling time.sleep will cause all greenlets to block. You either need to call gevent.sleep or monkey-patch the process in order to have just one greenlet block (I see gevent.monkey imported, but patch_all is not called). I suspect that's the major problem here.
Writing to a file is also synchronous and causes all greenlets to block. You can use FileObjectThread if that's a major bottleneck.
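A minimal sketch of the first fix (written Python 3 style, unlike the Python 2 code above): swap the blocking sleep for gevent.sleep so the writer greenlet keeps running while values are still being produced. The queue contents and timings here are only illustrative:

import gevent
from gevent.queue import Queue

q, q_w = Queue(), Queue()

def worker():
    while not q.empty():
        task = q.get()
        if task % 2:
            q_w.put(task)
            print("put", task)
        gevent.sleep(1)        # cooperative sleep: other greenlets keep running

def writer():
    while True:
        task = q_w.get()       # yields while waiting for the next item
        print("writing", task)

for i in range(6):
    q.put(i)
greenlets = [gevent.spawn(writer), gevent.spawn(worker)]
gevent.joinall(greenlets, timeout=10)   # writer never returns, so bound the wait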

Python threads hang and don't close

This is my first try with threads in Python.
I wrote the following program as a very simple example: it just gets a list and prints it using some threads. However, whenever there is an error, the program just hangs in Ubuntu, and I can't seem to do anything to get the control prompt back, so I have to start another SSH session to get back in.
I also have no idea what the issue with my program is.
Is there some kind of error handling I can put in to ensure it doesn't hang?
Also, any idea why Ctrl+C doesn't work? (I don't have a break key.)
from Queue import Queue
from threading import Thread
import HAInstances
import logging

log = logging.getLogger()
logging.basicConfig()

class GetHAInstances:
    def oraHAInstanceData(self):
        log.info('Getting HA instance routing data')
        # HAData = SolrGetHAInstances.TalkToOracle.main()
        HAData = HAInstances.main()
        log.info('Query fetched ' + str(len(HAData)) + ' HA Instances to query')
        # for row in HAData:
        #     print row
        return(HAData)

def do_stuff(q):
    while True:
        print q.get()
        print threading.current_thread().name
        q.task_done()

oraHAInstances = GetHAInstances()
mainHAData = oraHAInstances.oraHAInstanceData()

q = Queue(maxsize=0)
num_threads = 10

for i in range(num_threads):
    worker = Thread(target=do_stuff, args=(q,))
    worker.setDaemon(True)
    worker.start()

for row in mainHAData:
    #print str(row[0]) + ':' + str(row[1]) + ':' + str(row[2]) + ':' + str(row[3])i
    q.put((row[0], row[1], row[2], row[3]))

q.join()
In your thread method, it is recommended to use try ... except ... finally. This structure guarantees that control returns to the main thread even when errors occur.
def do_stuff(q):
    while True:
        try:
            print q.get()                  # do your work here
        except Exception:
            log.exception('worker error')  # log the error
        finally:
            q.task_done()
Also, in case you want to kill your program, find the pid of your main thread and use kill <pid> to kill it. In Ubuntu or Mint, use ps -Ao pid,cmd; in the output, you can find the pid (first column) by searching for the command (second column) you typed to run your Python script.
Your q is hanging because your worker has errored, so your q.task_done() never got called. You also need to import threading in order to use print threading.current_thread().name.
