This code is supposed to control a servo from stdin
import asyncio
import sys
import threading
from multiprocessing import Process

async def connect_stdin_stdout():
    loop = asyncio.get_event_loop()
    reader = asyncio.StreamReader()
    protocol = asyncio.StreamReaderProtocol(reader)
    await loop.connect_read_pipe(lambda: protocol, sys.stdin)
    w_transport, w_protocol = await loop.connect_write_pipe(asyncio.streams.FlowControlMixin, sys.stdout)
    writer = asyncio.StreamWriter(w_transport, w_protocol, reader, loop)
    return reader, writer

servo0ang = 90

async def main():
    reader, writer = await connect_stdin_stdout()
    while True:
        res = await reader.read(100)
        if not res:
            break
        servo0ang = int(res)

# Main program logic follows:
def runAsync():
    asyncio.run(main())

def servoLoop():
    pwm = Servo()
    while True:
        pwm.setServoPwm('0', servo0ang)

if __name__ == "__main__":
    p = Process(target=servoLoop)
    p.start()
    runAsync()
    p.join()
When I run it, the async function starts but servoLoop doesn't.
It was supposed to turn the servo to the angle specified on stdin. I'm a bit rusty at Python.
The Servo class is from an example program that came with the robot I'm working with, and it works there.
So, as I said in a comment, you are not sharing servo0ang. You have two processes, and each of them has its own variables. They have the same names and the same initial values, as after a fork in other languages, because the new process starts as a copy of the main one. But they are two different Python processes with almost nothing to do with each other (one is the parent of the other, so it can join it).
If you need to share data, you either have to send it through pipes connecting the two processes, or create shared memory that both processes access. The latter seems easy, and in Python it is quite easy, but it also makes it easy to end up with an inefficient polling scheme, like yours appears to be, with its infinite loop polling servo0ang as fast as it can so as not to miss any change. Very often it would be better to wait on pipes. But I won't discuss the principles of your project here, just how to do what you want to do, not whether it is a good idea.
In Python, the multiprocessing module has a Value class that creates memory that can be shared among processes on the same machine (with Manager you could even share values among processes on different machines, but that is slower).
from multiprocessing import Process, Value
import time  # I don't like infinite loops without sleep

v = Value('i', 90)  # Creates an integer, with initial value of 90, in shared memory
x = 90              # Just a normal integer, by comparison

print(v.value, x)   # read it
v.value = 80        # Modify it
x = 80

def f(v):
    global x
    while True:
        time.sleep(1)
        v.value = (v.value+1) % 360
        x = (x+1) % 360

p = Process(target=f, args=(v,))
p.start()

while True:
    print("New val", v.value, x)
    time.sleep(5)
As you can see, the shared value in the main loop increases by about 5 on each iteration, because the process running f incremented it by 1 five times in the meantime.
But x in that same loop never changes, because only the x of the process running f changes (the same global x, but a different process; it is as if you were running the same program twice in two different windows).
Now, applied to your code
import asyncio
import sys
import threading
import time
from multiprocessing import Process, Value

async def connect_stdin_stdout():
    loop = asyncio.get_event_loop()
    reader = asyncio.StreamReader()
    protocol = asyncio.StreamReaderProtocol(reader)
    await loop.connect_read_pipe(lambda: protocol, sys.stdin)
    w_transport, w_protocol = await loop.connect_write_pipe(asyncio.streams.FlowControlMixin, sys.stdout)
    writer = asyncio.StreamWriter(w_transport, w_protocol, reader, loop)
    return reader, writer

servo0ang = Value('i', 90)

async def main():
    reader, writer = await connect_stdin_stdout()
    while True:
        res = await reader.read(100)
        if not res:
            break
        servo0ang.value = int(res)

# Main program logic follows:
def runAsync():
    asyncio.run(main())

class Servo:
    def setServoPwm(self, s, ang):
        time.sleep(1)
        print(f'\033[31m{ang=}\033[m')

def servoLoop():
    pwm = Servo()
    while True:
        pwm.setServoPwm('0', servo0ang.value)

if __name__ == "__main__":
    p = Process(target=servoLoop)
    p.start()
    runAsync()
    p.join()
I used a dummy Servo class that just prints the servo0ang value in red.
Note that I've changed nothing in your code apart from that.
Which means that, no, asyncio.run was not blocking the other process. I still agree with the comments you got that it is rarely a great idea to combine asyncio and processes. Here you have no other concurrent I/O, so your async/await is roughly equivalent to a good old while True: servo0ang.value = int(input()). It is not as if your input could yield to something else; there is nothing else, at least not in this process (if your two processes were communicating through a pipe, that would be different).
But however needlessly convoluted your code may be, it works, and asyncio.run is not blocking the other process. It is just that the other process was endlessly calling setServoPwm with the same constant value of 90, which could never change, since that process did nothing with the variable other than pass it to setServoPwm; it never tried to grab a new value from the main process.
With a shared-memory Value there is still nothing extra to do, but this time, since the memory is shared, it is no longer futile to expect the value to change even though nothing in this process changes it.
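As a side note (your code does not currently need this): a Value carries its own lock, so if the angle were ever updated from several places at once, a read-modify-write can be made atomic. A minimal sketch, purely illustrative:

# Hypothetical: only needed if more than one writer touched servo0ang
with servo0ang.get_lock():
    servo0ang.value = (servo0ang.value + 1) % 360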
import asyncio
from multiprocessing import Queue, Process
import time

task_queue = Queue()

# This is simulating the task
async def do_task(task_number):
    for progress in range(task_number):
        print(f'{progress}/{task_number} doing')
        await asyncio.sleep(10)

# This is the loop that accepts and runs tasks
async def accept_tasks():
    event_loop = asyncio.get_event_loop()
    while True:
        task_number = task_queue.get()  # <-- this blocks the event loop from running do_task()
        event_loop.create_task(do_task(task_number))

# This is the starting point of the process,
# the event loop runs here
def worker():
    event_loop = asyncio.get_event_loop()
    event_loop.run_until_complete(accept_tasks())

# Run a new process
Process(target=worker).start()

# Simulate adding tasks every 1 second
for _ in range(1, 50):
    task_queue.put(_)
    print('added to queue', _)
    time.sleep(1)
I'm trying to run a separate process that runs an event loop to do I/O operations. From the parent process, I'm trying to "queue in" tasks. The problem is that do_task() never runs. The only solution I found that works is polling (i.e., checking whether the queue is empty, then sleeping for X seconds).
After some researching, the problem seems to be that task_queue.get() isn't doing event-loop-friendly IO.
aiopipe provides a solution, but assumes both processes are running in an event loop.
I tried creating this, but the consumer isn't consuming anything:
read_fd, write_fd = os.pipe()
consumer = AioPipeReader(read_fd)
producer = os.fdopen(write_fd, 'w')
A simple workaround for this situation is to change task_number = task_queue.get() to task_number = await event_loop.run_in_executor(None, task_queue.get). That way the blocking Queue.get() function will be off-loaded to a thread pool and the current coroutine suspended, as a good asyncio citizen. Likewise, once the thread pool finishes with the function, the coroutine will resume execution.
This approach is a workaround because it doesn't scale to a large number of concurrent tasks: each blocking call "turned async" that way takes a slot in the thread pool, and those that exceed the pool's maximum number of workers will not even start executing until a thread frees up. For example, rewriting all of asyncio to call blocking functions through run_in_executor would just result in a badly written threaded system. However, if you know that you have a small number of child processes, using run_in_executor is correct and can solve the problem very effectively.
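For reference, the change applies to the accept_tasks coroutine from the question; a minimal sketch, with everything else left as it was:

# Sketch: off-load the blocking Queue.get() to the default thread pool
async def accept_tasks():
    event_loop = asyncio.get_event_loop()
    while True:
        # suspends this coroutine instead of blocking the whole event loop
        task_number = await event_loop.run_in_executor(None, task_queue.get)
        event_loop.create_task(do_task(task_number))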
I finally figured it out. There is a known way to do this with the aiopipe library, but it is made to run with two event loops in two different processes. In my case, only the child process runs an event loop. To solve that, I changed the writing part into an unbuffered normal write using os.fdopen(write_fd, 'wb', 0).
Here is the code without any library.
import asyncio
from asyncio import StreamReader, StreamReaderProtocol
from multiprocessing import Process
import time
import os

# This is simulating the task
async def do_task(task_number):
    for progress in range(task_number):
        print(f'{progress}/{task_number} doing')
        await asyncio.sleep(1)

# This is the loop that accepts and runs tasks
async def accept_tasks(read_fd):
    loop = asyncio.get_running_loop()
    # Setup asynchronous reading
    reader = StreamReader()
    protocol = StreamReaderProtocol(reader)
    transport, _ = await loop.connect_read_pipe(
        lambda: protocol, os.fdopen(read_fd, 'rb', 0))
    while True:
        task_number = int(await reader.readline())
        await asyncio.sleep(1)
        loop.create_task(do_task(task_number))
    transport.close()

# This is the starting point of the process,
# the event loop runs here
def worker(read_fd):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(accept_tasks(read_fd))

# Create read and write pipe
read_fd, write_fd = os.pipe()
# allow inheritance to child
os.set_inheritable(read_fd, True)
Process(target=worker, args=(read_fd, )).start()
# detach from parent
os.close(read_fd)
writer = os.fdopen(write_fd, 'wb', 0)

# Simulate adding tasks every 1 second
for _ in range(1, 50):
    writer.write((f'{_}\n').encode())
    print('added to queue', _)
    time.sleep(1)
Basically, we use asynchronous reading on the child process's end and unbuffered synchronous writes on the parent process's end. To do the former, you need to connect the event loop as shown in the accept_tasks coroutine.
Currently, I am trying to spawn a process in a Python program which again creates threads that continuously update variables in the process address space. So far I came up with this code which runs, but the update of the variable seems not to be propagated to the process level. I would have expected that defining a variable in the process address space and using global in the thread (which shares the address space of the process) would allow the thread to manipulate the variable and propagate the changes to the process.
Below is a minimal example of the problem:
import multiprocessing
import threading
import time
import random

def process1():
    lst = {}
    url = "url"
    thrd = threading.Thread(target=urlCaller, args=(url,))
    print("process alive")
    thrd.start()

    while True:
        # the process does some CPU intense calculation
        print(lst)
        time.sleep(2)

def urlCaller(url):
    global lst
    while True:
        # the thread continuously pulls data from an API
        # this is I/O heavy and therefore done by a thread
        lst = {random.randint(1,9), random.randint(20,30)}
        print(lst)
        time.sleep(2)

prcss = multiprocessing.Process(target=process1)
prcss.start()
The process always prints an empty list while the thread prints, as expected, a list with two integers. I would expect that the process prints a list with two integers as well.
(Note: I am using Spyder as my IDE, and somehow something is only printed to the console if I run this code on Linux/Ubuntu; nothing is printed to the console if I run the exact same code in Spyder on Windows.)
I am aware that the use of global variables is not always a good solution but I think it serves the purpose well in this case.
You might wonder why I want to create a thread within a process. Basically, I need to run the same complex calculation on different data sets that constantly change. Hence, I need multiple processes (one for each data set) to optimize the utilization of my CPU and use threads within the processes to make the I/O process most efficient. The data depreciates very fast, therefore, I cannot just store it in a database or file, which would of course simplify the communication process between data producer (thread) and data consumer (process).
You are defining a local variable lst inside the function process1, so whatever urlCaller does is irrelevant: it cannot change the local variable of a different function. urlCaller does define a global variable, but process1 can never see it because it is shadowed by the local variable you defined.
You need to remove lst = {} from that function and find another way to return a value, or declare the variable global there too:
def process1():
    global lst
    lst = {}
    url = "url"
    thrd = threading.Thread(target=urlCaller, args=(url,))
    print("process alive")
    thrd.start()

    while True:
        # the process does some CPU intense calculation
        print(lst)
        time.sleep(2)
I'd use something like concurrent.futures instead of the threading module directly.
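For instance, a rough sketch of the same fix using an executor instead of a bare Thread (the names are the ones from the question; treat this as illustrative, not a drop-in):

from concurrent.futures import ThreadPoolExecutor
import time

def process1():
    global lst
    lst = {}
    url = "url"
    executor = ThreadPoolExecutor(max_workers=1)
    executor.submit(urlCaller, url)  # the I/O loop runs in a pool thread
    while True:
        # the process does some CPU intense calculation
        print(lst)
        time.sleep(2)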
Thanks to the previous answer, I figured out that it's best to implement a process class and define the "thread functions" within that class. Now the threads can access a shared variable and manipulate it without the need to use thread.join() or to terminate a thread.
Below is a minimal example in which two concurrent threads provide data for a parent process.
import multiprocessing
import threading
import time
import random

class process1(multiprocessing.Process):
    lst = {}
    url = "url"

    def __init__(self, url):
        super(process1, self).__init__()
        self.url = url

    def urlCallerInt(self, url):
        while True:
            self.lst = {random.randint(1,9), random.randint(20,30)}
            time.sleep(2)

    def urlCallerABC(self, url):
        while True:
            self.lst = {"Ab", "cD"}
            time.sleep(5)

    def run(self):
        t1 = threading.Thread(target=self.urlCallerInt, args=(self.url,))
        t2 = threading.Thread(target=self.urlCallerABC, args=(self.url,))
        t1.start()
        t2.start()
        while True:
            print(self.lst)
            time.sleep(1)

p1 = process1("url")
p1.start()
I know there are a few questions and answers related to hanging threads in Python, but my situation is slightly different as the script is hanging AFTER all the threads have been completed. The threading script is below, but obviously the first 2 functions are simplified massively.
When I run the script shown, it works. When I use my real functions, the script hangs AFTER THE LAST LINE. So, all the scenarios are processed (and a message printed to confirm), logStudyData() then collates all the results and writes to a csv. "Script Complete" is printed. And THEN it hangs.
The script with threading functionality removed runs fine.
I have tried enclosing the main script in try...except but no exception gets logged. If I use a debugger with a breakpoint on the final print and then step it forward, it hangs.
I know there is not much to go on here, but short of including the whole 1500-line script, I don't know what else to do. Any suggestions welcome!
def runScenario(scenario):
    # Do a bunch of stuff
    with lock:
        # access global variables
        pass
    pass

def logStudyData():
    # Combine results from all scenarios into a df and write to csv
    pass

def worker():
    global q
    while True:
        next_scenario = q.get()
        if next_scenario is None:
            break
        runScenario(next_scenario)
        print(next_scenario, " is complete")
        q.task_done()

import threading
from queue import Queue

global q, lock

q = Queue()
threads = []
scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']

num_worker_threads = 6

lock = threading.Lock()

for i in range(num_worker_threads):
    print("Thread number ", i)
    this_thread = threading.Thread(target=worker)
    this_thread.start()
    threads.append(this_thread)

for scenario_name in scenario_list:
    q.put(scenario_name)

q.join()
print("q.join completed")
logStudyData()
print("script complete")
As the docs for Queue.get say:
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
In other words, there is no way get can ever return None, except by you calling q.put(None) on the main thread, which you don't do.
Notice that the example directly below those docs does this:
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()
The second one is technically optional; you can usually get away with not doing it.
But the first one is absolutely necessary. You need to either do this, or come up with some other mechanism to tell your workers to quit. Without that, your main thread just tries to exit, which means it tries to join every worker, but those workers are all blocked forever on a get that will never happen, so your program hangs forever.
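Applied to the script in the question, a sketch of the fixed ending (everything before q.join() stays the same):

q.join()
print("q.join completed")

# tell each worker to quit, then wait for the worker threads to exit
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()

logStudyData()
print("script complete")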
Building a thread pool may not be rocket science (if only because rocket scientists tend to need their calculations to be deterministic and hard real-time…), but it's not trivial, either, and there are plenty of things you can get wrong. You may want to consider using one of the two already-built threadpools in the Python standard library, concurrent.futures.ThreadPoolExecutor or multiprocessing.dummy.Pool. This would reduce your entire program to:
import concurrent.futures

def work(scenario):
    runScenario(scenario)
    print(scenario, " is complete")

scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']

with concurrent.futures.ThreadPoolExecutor(max_workers=6) as x:
    results = list(x.map(work, scenario_list))

print("q.join completed")
logStudyData()
print("script complete")
Obviously you'll still need a lock around any mutable variables you change inside runScenario—although if you're only using a mutable variable there because you couldn't figure out how to return values to the main thread, that's trivial with an Executor: just return the values from work, and then you can use them like this:
for result in x.map(work, scenario_list):
    do_something(result)
How do I call a method of a different class (in a different module) using a multiprocessing Pool in Python?
My aim is to start a process which keeps running until some task is provided, and once the task is completed it goes back to waiting mode.
Below is the code, which has three modules. The Reader class is my runtime task; I hand execution of its reader method to ProcessExecutor.
ProcessExecutor is the process pool; it runs its while loop until some task is provided to it.
The main module initiates everything.
Module 1
class Reader(object):
    def __init__(self, message):
        self.message = message

    def reader(self):
        print self.message
Module 2
class ProcessExecutor():
    def run(self, queue):
        print 'Before while loop'
        while True:
            print 'Reached Run'
            try:
                pair = queue.get()
                print 'Running process'
                print pair
                func = pair.get('target')
                arguments = pair.get('args', None)
                if arguments is None:
                    func()
                else:
                    func(arguments)
                queue.task_done()
            except Exception:
                print Exception.message
Main module
from process_helper import ProcessExecutor
from reader import Reader
import multiprocessing
import Queue

if __name__ == '__main__':
    queue = Queue.Queue()
    myReader = Reader('Hi')
    ps = ProcessExecutor()
    pool = multiprocessing.Pool(2)
    pool.apply_async(ps.run, args=(queue, ))
    param = {'target': myReader.reader}
    queue.put(param)
The code executes without any error:

C:\Python27\python.exe C:/Users/PycharmProjects/untitled1/main/main.py

Process finished with exit code 0
The code gets executed but it never reaches the run method. I am not sure whether it is possible to call a method of a different class using multiprocessing or not.
I tried apply_async, map, and apply, but none of them works.
All the examples I found online call the target method from the script where the main method is implemented.
I am using Python 2.7.
Please help.
Your first problem is that you just exit without waiting on anything. You have a Pool, a Queue, and an AsyncResult, but you just ignore all of them and exit as soon as you've created them. You should be able to get away with only waiting on the AsyncResult (after that, there's no more work to do, so who cares what you abandon), except for the fact that you're trying to use Queue.task_done, which doesn't make any sense without a Queue.join on the other side, so you need to wait on that as well.
Your second problem is that you're using the Queue from the Queue module, instead of the one from the multiprocessing module. The Queue module only works across threads in the same process.
Also, you can't call task_done on a plain Queue; that's only a method for the JoinableQueue subclass.
Once you've gotten to the point where the pool tries to actually run a task, you will get the problem that bound methods can't be pickled unless you write a pickler for them. Doing that is a pain, even though it's the right way. The traditional workaround—hacky and cheesy, but everyone did it, and it works—is to wrap each method you want to call in a top-level function. The modern solution is to use the third-party dill or cloudpickle libraries, which know how to pickle bound methods, and how to hook into multiprocessing. You should definitely look into them. But, to keep things simple, I'll show you the workaround.
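As a quick illustration of the modern route (this assumes the third-party dill package is installed; the fixed code below does not rely on it), dill can round-trip a bound method that plain pickle rejects:

import dill  # third-party: pip install dill

payload = dill.dumps(myReader.reader)  # a bound method; pickle.dumps would fail here
restored = dill.loads(payload)
restored()                             # prints 'Hi'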
Notice that, because you've created an extra queue to pass methods onto, in addition to the one built into the pool, you'll need the workaround for both targets.
With these problems fixed, your code looks like this:
from process_helper import ProcessExecutor
from reader import Reader
import multiprocessing

def call_run(ps):
    ps.run(queue)

def call_reader(reader):
    return reader.reader()

if __name__ == '__main__':
    queue = multiprocessing.JoinableQueue()
    myReader = Reader('Hi')
    ps = ProcessExecutor()
    pool = multiprocessing.Pool(2)
    res = pool.apply_async(call_run, args=(ps,))
    param = {'target': call_reader, 'args': myReader}
    queue.put(param)
    print res.get()
    queue.join()
You have additional bugs beyond this in your ProcessExecutor, but I'm not going to debug everything for you. This gets you past the initial hurdles, and shows the answer to the specific question you were asking. Also, I'm not sure what the point of all that code is. You seem to be trying to rebuild, on top of Pool, what Pool already does, only in a more complicated but less powerful way, but I'm not entirely sure.
Meanwhile, here's a program that does what I think you want, with no problems, by just throwing away that ProcessExecutor and everything that goes with it:
from reader import Reader
import multiprocessing

def call_reader(reader):
    return reader.reader()

if __name__ == '__main__':
    myReader = Reader('Hi')
    pool = multiprocessing.Pool(2)
    res = pool.apply_async(call_reader, args=(myReader,))
    print res.get()
This is crazy! I heard Python threads are slow but this is beyond normal.
Here is the pseudo-code:
class ReadThread:
    v = []

    def __init__(self, threaded=True):
        self.v = MySocket('127.0.0.1')
        if threaded:
            thread.start_new_thread(self._scan, ())

    def read(self):
        t0 = datetime.now()
        self.v.read('SomeVariable')
        t = datetime.now()
        dt = (t-t0).total_seconds()
        print dt

    def _scan(self):
        while True:
            self.read()
If I run read() in a while loop in the main thread like this:
r = ReadThread(threaded = False)
while True:
    r.read()
dt is about 78 ms with small variation. Now if I run it in a new thread like this:
r = ReadThread(threaded = True)
while True:
    pass
dt is about 130 ms with +-10ms variance!
Why is it so slow? Am I doing something really wrong? It's the same thing just in a new thread!
MySocket is an object that uses a socket to read/write variables to a server, and read() just gets some variable for the test.
It is hard to reproduce this problem locally without knowing what MySocket is and without the full example. However, I can guess that the problem is this loop:
while True:
    pass
It is VERY CPU-consuming. It spins constantly, hogging CPU cycles for itself and not letting the socket thread do its work.
By contrast, socket read operations usually block, idling until data arrives, so they consume almost no CPU.
In the first example, you run your socket while nothing else eats CPU. In the second example, the main thread consumes one full CPU core.
Try replacing this loop with an ordinary idle operation, e.g. time.sleep(60), so the main thread idles for 60 seconds while the socket thread reads and processes data.
r = ReadThread(threaded = True)
time.sleep(60)
What do the measurements look like in that case?