This question already has answers here:
Timeout on a function call
(23 answers)
Closed 2 years ago.
I am currently writing a function that draws random samples under some restrictions. It is possible that at the last step there is no valid option left to choose, and my function gets stuck.
Is there a way to set a run-time limit for this function? Say, if no result is returned after 2 seconds, run the function again, repeating until it returns a result within 2 seconds.
Thanks in advance.
You can use the threading module.
The following code is an example of a Thread with a timeout:
from threading import Thread

# run myFunc in a separate thread and wait at most 2 seconds for it to finish
p = Thread(target=myFunc, args=[myArg1, myArg2])
p.start()
p.join(timeout=2)
You could add something like the following to your code, as a check that keeps looping until the function finishes properly (note that join(timeout=2) only stops the main thread from waiting; it does not stop the worker thread itself, which keeps running in the background):
shouldFinish = False

def myFunc(myArg1, myArg2):
    global shouldFinish  # needed so the assignment below is visible outside the thread
    ...
    if finishedProperly:
        shouldFinish = True

while not shouldFinish:
    p = Thread(target=myFunc, args=[myArg1, myArg2])
    p.start()
    p.join(timeout=2)
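Since a Python thread cannot be forcibly stopped, every timed-out attempt above keeps running in the background. If the sampling function can run in a separate process, a rough alternative sketch is to terminate and retry the whole attempt; here mySample, myArg1 and myArg2 are placeholder names standing in for the asker's function and arguments, not part of the original answer:

from multiprocessing import Process, Queue

def attempt(q, myArg1, myArg2):
    # mySample stands in for the asker's sampling function
    q.put(mySample(myArg1, myArg2))

result = None
while result is None:
    q = Queue()
    p = Process(target=attempt, args=(q, myArg1, myArg2))
    p.start()
    p.join(timeout=2)
    if p.is_alive():
        p.terminate()  # the attempt is stuck; kill it and retry
        p.join()
    elif not q.empty():
        result = q.get()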
I am learning multiprocessing in Python and thinking about a problem: for a shared list (nums = mp.Manager().list), is there any way to automatically split the list across all the processes, so that they do not compute the squares of the same numbers in parallel?
Current code:
import time
import multiprocessing as mp

# multiple processes
nums = mp.Manager().list(range(10000))
results = mp.Queue()

def get_square(list_of_num, results_sharedlist):
    # simply square every number
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num)))

start = time.time()
process1 = mp.Process(target=get_square, args=(nums, results))
process2 = mp.Process(target=get_square, args=(nums, results))
process1.start()
process2.start()
process1.join()
process2.join()
print(time.time() - start)
for i in range(results.qsize()):
    print(results.get())
Current Behaviour
It computes the squares of the same list twice.
What I want
I want process 1 and process 2 to compute the squares of the nums list exactly once, in parallel, without me defining the split myself.
You can make the function decide which data it operates on. In the current scenario, you want the function to divide the square-calculation work on its own, based on how many processes are running in parallel.
To do that, the function needs to know which process it is running in and how many processes are running alongside it, so that it only touches its own slice of the data. You can simply pass two more parameters, current_process and total_process, that carry this information.
If the length of your list is divisible by 2 and you want to calculate the squares using two processes, your function would look something like this:
def get_square(list_of_num, results_sharedlist, current_process, total_process):
    total_length = len(list_of_num)
    # each process works on its own contiguous slice of the list
    start = (total_length // total_process) * (current_process - 1)
    end = (total_length // total_process) * current_process
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num[start:end])))

TOTAL_PROCESSES = 2
process1 = mp.Process(target=get_square, args=(nums, results, 1, TOTAL_PROCESSES))
process2 = mp.Process(target=get_square, args=(nums, results, 2, TOTAL_PROCESSES))
The assumption I have made here is that the length of the list you work on is a multiple of the number of processes you allocate. If it is not, the current logic will leave some numbers at the end with no output.
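If the length is not an exact multiple, one rough way to handle it (my own sketch, not part of the original answer) is to let the last process also take whatever is left over:

def get_square(list_of_num, results_sharedlist, current_process, total_process):
    chunk = len(list_of_num) // total_process
    start = chunk * (current_process - 1)
    # the last process also takes the remainder at the end of the list
    end = chunk * current_process if current_process < total_process else len(list_of_num)
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num[start:end])))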
Hope this answers your question!
Agree with the answer by Jake here, but as a bonus:
if you are using a multiprocessing.Pool(), it keeps an internal counter of the worker processes it spawns, so you can avoid the parameter that identifies the current_process by reading _identity from multiprocessing's current_process(), like this:
from multiprocessing import current_process, Pool
p = current_process()
print('process counter:', p._identity[0])  # 1-based index of this pool worker
more info from this answer.
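For example, a minimal runnable sketch (my own illustration, not from the original answers) that shows which pool worker handled each item; note that _identity is an undocumented attribute, so this relies on CPython internals:

from multiprocessing import current_process, Pool

def square(x):
    # report which pool worker handled this item
    worker = current_process()._identity[0]  # 1-based worker index
    return worker, x**2

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        for worker, value in pool.map(square, range(8)):
            print('worker %d computed %d' % (worker, value))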
This question already has answers here:
Correct way to pause a Python program
(16 answers)
Closed 1 year ago.
Looking for a way to pause (and then later resume) a Python script every x minutes (with a small random +/-). The script would run over and over, and then every x minutes it would pause for a set amount of time (again with a random +/-), then continue.
Here's an example using random.uniform() to generate a random float and time.sleep() to wait for that amount of time.
import random
import time

BASE_DELAY = 60   # base amount in seconds
RAND_MAX = 30     # high end of random offset in seconds
RAND_MIN = -30    # low end of random offset in seconds

running = True

def do_stuff():
    # do stuff here, maybe setting running to False
    pass

def loop():
    while running:
        time.sleep(BASE_DELAY + random.uniform(RAND_MIN, RAND_MAX))
        do_stuff()

loop()
import time
import random

# this will pause between 1 and 10 minutes, randomly
time.sleep(60 * random.randint(1, 10))
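If the work itself has to keep running between pauses (as the question describes), a rough sketch of that pattern could look like the following, where do_work is a placeholder for the real task:

import time
import random

WORK_INTERVAL = 10 * 60   # run for about 10 minutes at a time
PAUSE_BASE = 60           # then pause for about a minute
JITTER = 15               # +/- seconds of randomness

def do_work():
    pass  # placeholder for the real task

while True:
    deadline = time.time() + WORK_INTERVAL + random.uniform(-JITTER, JITTER)
    while time.time() < deadline:
        do_work()
    time.sleep(PAUSE_BASE + random.uniform(-JITTER, JITTER))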
Please bear with me as this is a bit of a contrived example of my real application. Suppose I have a list of numbers and I wanted to add a single number to each number in the list using multiple (2) processes. I can do something like this:
import multiprocessing

my_list = list(range(100))
my_number = 5
data_line = [{'list_num': i, 'my_num': my_number} for i in my_list]

def worker(data):
    return data['list_num'] + data['my_num']

pool = multiprocessing.Pool(processes=2)
pool_output = pool.map(worker, data_line)
pool.close()
pool.join()
Now, however, there's a wrinkle to my problem. Suppose I want to alternate between adding two numbers (instead of just one): around half the time I want to add my_number1, and the other half of the time my_number2. It doesn't matter which number gets added to which item of the list. The one requirement is that the different processes must never be adding the same number at the same time. Essentially (I think) this boils down to using the first number exclusively in Process 1 and the second number exclusively in Process 2, so that the two processes never simultaneously add the same number. So something like:
my_num1 = 5
my_num2 = 100
data_line = [{'list_num': i, 'my_num1': my_num1, 'my_num2': my_num2} for i in my_list]

def worker(data):
    # if in Process 1:
    return data['list_num'] + data['my_num1']
    # if in Process 2:
    return data['list_num'] + data['my_num2']
    # and so forth
Is there an easy way to specify specific inputs per process? Is there another way to think about this problem?
multiprocessing.Pool lets you pass an initializer function, which is executed in each worker before the actual work function runs.
You can use it together with a global variable to let your function know which process it is running in.
You probably also want to control which initial number each process gets; you can use a Queue to tell each process which index to pick up.
This solution is not optimal, but it works.
import multiprocessing

process_number = None

def initializer(queue):
    global process_number
    process_number = queue.get()  # atomically get the process index

def function(value):
    print("I'm process %s" % process_number)
    return value[process_number]

def main():
    queue = multiprocessing.Queue()
    for index in range(multiprocessing.cpu_count()):
        queue.put(index)
    pool = multiprocessing.Pool(initializer=initializer, initargs=[queue])
    tasks = [{0: 'Process-0', 1: 'Process-1', 2: 'Process-2'}, ...]
    print(pool.map(function, tasks))

if __name__ == '__main__':
    main()
My PC is a dual core, so as you can see only Process-0 and Process-1 do the work.
I'm process 0
I'm process 0
I'm process 1
I'm process 0
I'm process 1
...
['Process-0', 'Process-0', 'Process-1', 'Process-0', ... ]
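Mapping that back to the question (my own adaptation, not part of the original answer), each worker can pick its own number once in the initializer and then use it for every item, so the two processes never add the same number simultaneously:

import multiprocessing

my_num = None

def initializer(queue):
    global my_num
    my_num = queue.get()  # each worker takes a different number

def worker(list_num):
    return list_num + my_num

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    for n in (5, 100):  # my_num1 and my_num2 from the question
        queue.put(n)
    pool = multiprocessing.Pool(processes=2, initializer=initializer, initargs=[queue])
    print(pool.map(worker, range(100)))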
I am new to Python multiprocessing and I want to understand why my code does not terminate (maybe a zombie or a deadlock) and how to fix it. The createChain function also executes a for loop and returns a tuple: (value1, value2). Inside createChain there are calls to other functions, but I don't think posting its code will help, because nothing inside it touches multiprocessing. I tried making the processes daemons, but that still didn't work. The strange thing is that if I decrease the value of maxChains, e.g. to 500 or 100, it works.
I just want the processes to do some heavy tasks and put the results into a data structure.
My version of Python is 2.7.
def createTable(chainsPerCore, q, chainLength):
    for chain in xrange(chainsPerCore):
        q.put(createChain(chainLength, chain))

def initTable():
    maxChains = 1000
    chainLength = 10000
    resultsQueue = JoinableQueue()
    numOfCores = cpu_count()
    chainsPerCore = maxChains / numOfCores
    processes = [Process(target=createTable, args=(chainsPerCore, resultsQueue, chainLength,))
                 for x in range(numOfCores)]
    for p in processes:
        # p.daemon = True
        p.start()
    # Wait for hashing cores to finish
    for p in processes:
        p.join()
    resultsQueue.task_done()
    temp = [resultsQueue.get() for p in processes]
    print temp
Based on the very helpful comments of Tadhg McDonald-Jensen, I understood my needs better, as well as how queues work and what they should be used for: the original code deadlocks because each child blocks while flushing its queue buffer, while the parent blocks in join() before draining the queue.
I changed my code to
from contextlib import closing
from multiprocessing import Pool

def initTable():
    maxChains = 1000
    with closing(Pool(processes=8)) as pool:
        results = pool.map(createChain, xrange(maxChains))
    pool.terminate()
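For reference, the original Process/Queue version can also be unblocked by draining the queue before joining; this is a sketch based on the documented behaviour that a child process does not exit until everything it put on a queue has been flushed (createTable and createChain are the question's own functions):

from multiprocessing import Process, Queue, cpu_count

def initTable():
    maxChains = 1000
    chainLength = 10000
    resultsQueue = Queue()
    numOfCores = cpu_count()
    chainsPerCore = maxChains / numOfCores
    processes = [Process(target=createTable, args=(chainsPerCore, resultsQueue, chainLength))
                 for x in range(numOfCores)]
    for p in processes:
        p.start()
    # drain the queue BEFORE joining: the children block while flushing
    # their queue buffers, so joining first would deadlock
    temp = [resultsQueue.get() for _ in xrange(chainsPerCore * numOfCores)]
    for p in processes:
        p.join()
    print temp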
This question already has answers here:
Python - Join Multiple Threads With Timeout
(4 answers)
Closed 8 years ago.
How do I properly use the join(timeout) function in the following example? The timeout doesn't seem to have an effect on the main thread's execution. From the docs, the main thread is blocked until the threads join or the timeout expires.
import threading, time

class Server(threading.Thread):
    def __init__(self, hostname):
        super(Server, self).__init__()
        self.__hostname = hostname

    def run(self):
        print self.__hostname + ' left'
        time.sleep(5)
        print self.__hostname + ' back'
        sem.release()

# init
sem = threading.BoundedSemaphore(4)
threads = []
for x in xrange(1, 5):
    sem.acquire()
    t = Server('thread ' + str(x))
    threads.append(t)
    t.start()
for t in threads:
    t.join(2)
print 'why is this line executed by main thread 5 seconds after, not 2?'
You have a for loop that tries to join each of the 4 threads with a 2-second timeout.
The first .join() call waits the full 2 seconds and then times out. The second does the same, so we are now 4 seconds in. The third thread finishes after its 5 seconds of sleeping, i.e. 1 second after the third .join(2) call starts, so that join returns early. The fourth thread is already done by then, and its join returns immediately. 2 + 2 + 1 = 5 seconds in total.
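If the goal is a single overall timeout across all threads (as in the linked duplicate), a common sketch is to shrink each join's timeout against one shared deadline:

import time

def join_all(threads, timeout):
    # share one deadline across all joins instead of giving
    # each thread its own full timeout
    deadline = time.time() + timeout
    for t in threads:
        remaining = deadline - time.time()
        if remaining <= 0:
            break
        t.join(remaining)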