Create two child processes using Python (Windows) - python

Use Python programming language to accomplish the following task:
Create two processes (let’s call them P1 and P2). P1 should print “I am P1”, P2 should print “I am P2”. The main process (the process that creates P1 and P2) should wait for them. Then, after P1 and P2 are done, the main process should print “I am the main process, the two processes are done”.

On Windows we don't have the fork system call, so we can use the Python module multiprocessing instead:
from multiprocessing import Process, Lock
import time
import os

def f(lock, id, sleepTime):
    lock.acquire()
    print("I'm P" + str(id) + " Process ID: " + str(os.getpid()))
    lock.release()
    time.sleep(sleepTime)  # sleeps for some time

if __name__ == '__main__':
    print("Main Process ID: " + str(os.getpid()))
    lock = Lock()
    p1 = Process(target=f, args=(lock, 1, 3))  # P1 sleeps for 3 seconds
    p2 = Process(target=f, args=(lock, 2, 5))  # P2 sleeps for 5 seconds
    start = time.time()
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    end = time.time()
    print("I am the main process, the two processes are done")
    print("Time taken:- " + str(end - start) + "secs")  # the main process finishes after ~5 seconds, since the two children run in parallel
The processes as captured in Task Manager (screenshot omitted).
The output was:
Main Process ID: 9804
I'm P1 Process ID: 6088
I'm P2 Process ID: 4656
I am the main process, the two processes are done
Time taken:- 5.15300011635secs
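As an aside, the same task can also be written with concurrent.futures, which handles the starting and joining for you. A minimal sketch, assuming Python 3 (the function name say is mine):
from concurrent.futures import ProcessPoolExecutor

def say(label):
    print("I am " + label)

if __name__ == '__main__':
    with ProcessPoolExecutor() as pool:
        # submit() schedules each call in a worker process;
        # leaving the with-block waits for both to finish
        futures = [pool.submit(say, "P1"), pool.submit(say, "P2")]
        for future in futures:
            future.result()  # re-raises any exception from the child
    print("I am the main process, the two processes are done")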
Hope that helps!!

I didn't notice the Windows tag at first, so I wrote this for UNIX. I'm keeping the answer rather than deleting it, in the hope that it will aid UNIX users too. The code demonstrating the same approach with fork is:
import os
import time

def child(id, sleepTime):
    print("I'm P" + str(id))
    time.sleep(sleepTime)
    os._exit(0)

p1 = os.fork()
if p1 == 0:
    child(1, 3)  # P1 sleeps for 3 seconds
p2 = os.fork()
if p2 == 0:
    child(2, 5)  # P2 sleeps for 5 seconds
if p1 > 0 and p2 > 0:
    os.waitpid(p1, 0)  # waiting for child 1
    os.waitpid(p2, 0)  # waiting for child 2
    print("I am the main process, the two processes are done")  # printed after approx 5 seconds
I executed
time python fork.py
The output was as expected:
I'm P1
I'm P2
I am the main process, the two processes are done
real 0m5.020s
user 0m0.004s
sys 0m0.008s
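If you also want each child's exit status, os.waitpid() returns it alongside the pid. A small sketch of decoding it (assuming the same fork pattern as above):
import os

pid = os.fork()
if pid == 0:
    os._exit(7)  # child exits immediately with status 7
_, status = os.waitpid(pid, 0)  # parent collects the raw status
if os.WIFEXITED(status):
    print("child exited with status", os.WEXITSTATUS(status))  # prints 7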

Related

Why do these two processes behave like this?

I'm creating two instances of a process here, but when I run this program I only get the main function's output.
import multiprocessing
import time

def sleepy_man():
    print("Starting to Sleep")
    time.sleep(1)
    print("Done Sleeping")

tic = time.time()
p1 = multiprocessing.Process(target=sleepy_man)
p2 = multiprocessing.Process(target=sleepy_man)
p1.start()
p2.start()
toc = time.time()
print("Done in {:.4f} seconds".format(toc - tic))
Output
Done in 0.0554 seconds
I was doing it for practice, following this blog:
Source: https://www.analyticsvidhya.com/blog/2021/04/a-beginners-guide-to-multi-processing-in-python/
It is worth noting you would see the same behavior if you had somehow set p1.daemon = p2.daemon = True.
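A quick way to see that for yourself, as a minimal sketch (the function name child is mine; the flag must be set before start()):
import multiprocessing
import time

def child():
    time.sleep(1)
    print("you may never see this")

if __name__ == '__main__':
    p = multiprocessing.Process(target=child)
    p.daemon = True  # daemonic children are terminated when the parent exits
    p.start()
    # no join(): the parent exits right away, killing the daemon child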
It is also possibly due to output buffering rather than a logic error.
Two questions:
If you add a sys.stdout.flush() or flush=True to your print, do you see different behavior?
If you run this with time python foobar.py does it take .02s or 1s to run?
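For the first of those checks, the flushed version would look something like this sketch:
import time

def sleepy_man():
    print("Starting to Sleep", flush=True)  # flush=True bypasses stdout buffering
    time.sleep(1)
    print("Done Sleeping", flush=True)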
Continuing your tutorial and correctly adding .join(), as below, will resolve the issue in the way expected for normal usage.
import multiprocessing as mp
import time

def sleepy_man():
    print("Starting to Sleep")
    time.sleep(1)
    print("Done Sleeping")

# On Windows, which uses spawning to create child processes,
# the entry point must be guarded with __name__ == '__main__'
if __name__ == '__main__':
    tic = time.time()
    processes = [
        mp.Process(target=sleepy_man),
        mp.Process(target=sleepy_man),
    ]
    [p.start() for p in processes]
    # Join the children if you want to see the results of their work;
    # otherwise, if the main process finishes before them, you may get
    # no output from them at all. Non-daemonic processes (the default)
    # are joined automatically at interpreter exit, but an explicit
    # join() makes the parent wait for the children here, and only
    # then does it print the elapsed time.
    [p.join() for p in processes]
    toc = time.time()
    print("Done in {:.4f} seconds".format(toc - tic))

Basic python threading is not working. What am I missing in this?

I am trying to use python threading and am having problems getting the threads to work independently. They seem to be running in sequential order and waiting for one to finish before starting to process the next thread. I have read other posts suggesting that I need to get more work into the threads to differentiate actual CPU work vs the CPU work of starting and managing the threads, and that a sleep timer could be used to simulate this. So I tried that and then measured the task durations.
So my code is below. It first runs three tasks sequentially with a 2-second timer; this takes about 6 seconds, as expected. The next section starts three threads, which should run roughly in parallel if my understanding of threading is correct. I have played with the timers to test the overall duration of this section of code, expecting that if one timer is larger than the other two, the code will execute in an interval close to that larger one. But what I am seeing is that it takes the same amount of time as the three running in sequence, one after the other.
I got onto this because I am writing some code to read an asynchronous queue in the background. After launching the thread to read the queue, my code seems to stop and wait until the queue reader is stopped, which it normally doesn't as it waits for messages to come in. So what happens is that it never executes the next section of code and it seems to be waiting for the thread to complete.
Also I checked the number of threads active and it remains at the same number, and when I check for the thread ID in the code (not shown) I get the same thread number coming back for every thread.
I am new to Python and am using the Jupyter environment. Is there a compile option or some other limitation that I am not aware of that is preventing the threading? Am I just not getting the concept? I don't believe this is related to CPU cores / threading, as it would be done through logical threads within the Python runtime. I also ran a similar program in a command shell environment and got the same sequential performance.
Cut and paste this code to see what it does. What am I missing?
import threading
import logging
import datetime
import time
import random

class Parallel:
    def work(self, interval):
        time.sleep(interval)
        name = self.__repr__()
        print(name, " is complete after ", interval, " seconds")

# SetupLogger()
logging.getLogger().setLevel(logging.DEBUG)
logging.debug("thread program start time is %s", datetime.datetime.now())

thread1 = Parallel()
thread2 = Parallel()
thread3 = Parallel()

print("sequential threads::")
thread1.work(2.0)
thread2.work(2.0)
thread3.work(2.0)

logging.info("parallel threads start time is %s ", datetime.datetime.now())
start = time.time()

work1 = threading.Thread(target=thread1.work(1), daemon=True)
work1.start()
print("thread 1 is started and there are ", threading.activeCount(), " threads active")
work2 = threading.Thread(target=thread2.work(2), daemon=False)
work2.start()
print("thread 2 is started and there are ", threading.activeCount(), " threads active")
work3 = threading.Thread(target=thread3.work(5), daemon=False)
work3.start()
print("thread 3 is started and there are ", threading.activeCount(), " threads active")

# wait for all to complete
print("now wait for all to finish at ", datetime.datetime.now())
work1.join()
work2.join()
work3.join()

end = time.time()
logging.info("parallel threads end time is %s with %s elapsed", datetime.datetime.now(), str(end - start))
print("all threads completed at:", datetime.datetime.now())
In the line that initializes the thread, you are actually executing the function instead of passing a reference to it.
thread1.work() ----> this actually executes the function when the program runs and encounters this statement.
So when your program reaches the line
work1 = threading.Thread(target=thread1.work(1), daemon=True)
and encounters target=thread1.work(1), it simply calls the function right there, and the actual thread does nothing.
thread1.work is a reference to the function, which is what you need to pass to your Thread object.
So remove the parentheses, pass the argument through args, and your code becomes
work1 = threading.Thread(target=thread1.work, daemon=True, args=(1,))
and this will behave as you expect.
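Equivalently, if you prefer not to use args, a lambda also defers the call until the thread runs it (a sketch):
# the lambda is only called inside the new thread, not at construction time
work1 = threading.Thread(target=lambda: thread1.work(1), daemon=True)
work1.start()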

What are the recommended ways to deal with multiprocessing and sleep?

So, I've been playing with Python's multiprocessing module for a few days now, and there's something that I can't understand. Maybe someone can give me a bit of help.
I'm trying to run two methods from the same class in parallel, but apparently there's something that I'm missing:
from multiprocessing import Process
import time

class SomeClass:
    def __init__(self):
        pass

    def meth1(self):
        print(1)
        time.sleep(10)

    def meth2(self):
        print(2)
        time.sleep(5 * 60)

def main():
    while True:
        s = SomeClass()
        p1 = Process(target=s.meth1)  # I want this to run every 10 seconds
        p1.start()
        p2 = Process(target=s.meth2)  # I want this to run every 5 minutes,
                                      # while the first one still does its
                                      # own job every 10 seconds
        p2.start()
        p1.join()
        p2.join()

if __name__ == '__main__':
    main()
What I would expect to happen is:
the first method should print 1;
then the second one should print 2 (without waiting 10s, which does happen and seems to work as expected);
then I should see only 1s for the next 4 minutes and 50s (this isn't happening; the program just waits for that time to pass).
What am I missing? Why does the second step work as expected, but the third one doesn't? How can I make it work?
It's difficult to know exactly what you want, but the code below does what you describe, though it has no convenient way of exiting:
from multiprocessing import Process
import time

class SomeClass:
    def __init__(self):
        pass

    def meth1(self):
        while True:
            print(1)
            time.sleep(10)

    def meth2(self):
        while True:
            print(2)
            time.sleep(5 * 60)

def main():
    s = SomeClass()
    p1 = Process(target=s.meth1)  # I want this to run every 10 seconds
    p1.start()
    p2 = Process(target=s.meth2)  # I want this to run every 5 minutes while the first one still does its own job every 10 seconds
    p2.start()
    p1.join()
    p2.join()

if __name__ == '__main__':
    main()
I have moved the while loop into each of the methods of SomeClass. Note that this code will never exit on its own, hanging at p1.join().
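If you do need a way to stop it, one option (a sketch, not the only approach) is to catch Ctrl-C in main() and terminate the children:
def main():
    s = SomeClass()
    p1 = Process(target=s.meth1)
    p1.start()
    p2 = Process(target=s.meth2)
    p2.start()
    try:
        p1.join()
        p2.join()
    except KeyboardInterrupt:
        # Ctrl-C: forcibly stop both children, then exit
        p1.terminate()
        p2.terminate()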
Think of processes as friends / clones of yours that you can call on the phone, have come in and do something, and then when they are all done they go home, leaving you a note.
The lines:
p1 = Process(target=s.meth1)
p1.start()
call one clone up and have him run s.meth1. He comes over and prints 1 on your screen, waits 10 seconds, then leaves you a note: "all done, no exceptions occurred, I've gone home".
Meanwhile (while your first clone is coming over), the lines:
p2 = Process(target=s.meth2)
p2.start()
call up another clone and have him run s.meth2. He comes over, prints 2 on your screen, waits around for 5 minutes, then leaves you a note: "all done, no exceptions occurred, I've gone home".
While clones #1 and #2 are doing their work, the line:
p1.join()
waits for clone #1 to leave you his note. That happens after 10 seconds. You then go on to:
p2.join()
which waits for clone #2 to leave you his note. That happens after another 4 minutes and 50 seconds. Then you go back around to your while True and start everything over again.
If you want not to wait for clone #2 to finish and leave you his note, don't call p2.join() yet. Eventually, though, you should call p2.join() to make sure that everything went well and that he went home successfully, and isn't lying dead in your living room in a pool of blood, shot by some exception. :-)
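Putting the analogy into code: to keep restarting clone #1 every 10 seconds while clone #2 does his 5-minute job, you could start p2 once and loop p1, joining p2 only at the end. A sketch, assuming the SomeClass definition from the question:
from multiprocessing import Process

def main():
    s = SomeClass()
    p2 = Process(target=s.meth2)
    p2.start()                      # clone #2 works for ~5 minutes
    while p2.is_alive():
        p1 = Process(target=s.meth1)
        p1.start()
        p1.join()                   # each meth1 run takes ~10 seconds
    p2.join()                       # finally collect clone #2's note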

JoinableQueue join() method blocking main thread even after task_done()

In the code below, if I set daemon = True, the consumer quits before reading all the queue entries. If the consumer is non-daemonic, the main thread blocks forever, even after task_done() has been called for all the entries.
from multiprocessing import Process, JoinableQueue
import time

def consumer(queue):
    while True:
        final = queue.get()
        print(final)
        queue.task_done()

def producer1(queue):
    for i in "QWERTYUIOPASDFGHJKLZXCVBNM":
        queue.put(i)

if __name__ == "__main__":
    queue = JoinableQueue(maxsize=100)
    p1 = Process(target=consumer, args=(queue,))
    p2 = Process(target=producer1, args=(queue,))
    #p1.daemon = True
    p1.start()
    p2.start()
    print(p1.is_alive())
    print(p2.is_alive())
    for i in range(1, 10):
        queue.put(i)
        time.sleep(0.01)
    queue.join()
Let's see what (I believe) is happening here:
both processes are being started.
the consumer process starts its loop and blocks until a value is received from the queue.
the producer1 process feeds the queue 26 times with a letter while the main process feeds the queue 9 times with a number. The order in which letters and numbers are fed is not guaranteed; a number could very well show up before a letter.
when both the producer1 and the main processes are done feeding their data, the queue is joined. No problem here: the queue can be joined since all the buffered data has been consumed and task_done() has been called after each read.
the consumer process is still running but is blocked until more data to consume shows up.
Looking at your code, I believe that you are confusing the concept of joining processes with that of joining queues. What you most likely want here is to join processes; you probably don't need a joinable queue at all.
#!/usr/bin/env python3
from multiprocessing import Process, Queue
import time

def consumer(queue):
    for final in iter(queue.get, 'STOP'):
        print(final)

def producer1(queue):
    for i in "QWERTYUIOPASDFGHJKLZXCVBNM":
        queue.put(i)

if __name__ == "__main__":
    queue = Queue(maxsize=100)
    p1 = Process(target=consumer, args=(queue,))
    p2 = Process(target=producer1, args=(queue,))
    p1.start()
    p2.start()
    print(p1.is_alive())
    print(p2.is_alive())
    for i in range(1, 10):
        queue.put(i)
        time.sleep(0.01)
    queue.put('STOP')
    p1.join()
    p2.join()
Also, your producer1 exits on its own after feeding all the letters, but you need a way to tell your consumer process to exit when there won't be any more data for it to process. You can do this by sending a sentinel; here I chose the string 'STOP', but it can be anything.
In fact, this code is not great, since the 'STOP' sentinel could be received before some of the letters, causing some letters to go unprocessed and also a deadlock, because the processes are trying to join even though the queue still contains data. But that is a different problem.
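One way to sidestep that ordering problem (a sketch, not the only fix) is to join producer1 before enqueueing the sentinel, so 'STOP' is guaranteed to be the last item:
if __name__ == "__main__":
    queue = Queue(maxsize=100)
    p1 = Process(target=consumer, args=(queue,))
    p2 = Process(target=producer1, args=(queue,))
    p1.start()
    p2.start()
    for i in range(1, 10):
        queue.put(i)
    p2.join()          # all letters are now in the queue
    queue.put('STOP')  # so the sentinel really is the last item
    p1.join()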

Parallel calls of sleep don't add up

Using the subprocess module, I'm running 1000 calls to sleep(1) in parallel:
import subprocess
import time

start = time.clock()
procs = []
for _ in range(1000):
    proc = subprocess.Popen(["sleep.exe", "1"])
    procs.append(proc)
for proc in procs:
    proc.communicate()
end = time.clock()
print("Executed in %.2f seconds" % (end - start))
On my 4-core machine, this results in an execution time of a couple of seconds, far less than I expected (~ 1000s / 4).
How does it get optimized away? Does it depend on the sleep implementation (this one is taken from the Windows-Git-executables)?
Sleeping doesn't require any processor time, so your OS can run far more than 4 sleep requests at a time, even though it has only 4 cores. Ideally it would be able to process the entire batch of 1000 in only 1 second, but there's lots of overhead in the creation and teardown of the individual processes.
This is because subprocess.Popen(...) is not a blocking call. The thread just triggers the child process creation and moves on; it does not wait for it to finish.
In other words, you are spawning 1000 asynchronous processes in a loop, and then waiting on them one by one later. This asynchronous behavior results in your overall run time of a few seconds.
Calling proc.communicate() waits until the child process is complete (has exited). Now, if you want the sleep times to add up (minus the process creation/destruction overhead), you'd do:
import subprocess
import time

start = time.clock()  # get the start time
procs = []
for _ in range(10):
    proc = subprocess.Popen(["sleep.exe", "1"])
    procs.append(proc)
    proc.communicate()  # wait for each child before starting the next
end = time.clock()  # get the end time
print("Executed in %.2f seconds" % (end - start))
Does it depend on the sleep implementation (this one is taken from the Windows-Git-executables)?
As I've outlined above, this has nothing to do with implementation of sleep.
