Why doesn't this async code stop? - python

The following code snippet has two coroutines, one for the server and one for the client. The client coroutine has logic to break its while loop after 10 seconds, and the server should stop after 15 seconds.
When I run the script, it doesn't stop; ideally, it should stop after 15 seconds, but this is not happening.
import asyncio
import time
import zmq
import zmq.asyncio

zmq.asyncio.install()
ctx = zmq.asyncio.Context()
server_socket = ctx.socket(zmq.REP)
client_socket = ctx.socket(zmq.REQ)

server_socket.bind("tcp://127.0.0.1:8899")
client_socket.connect("tcp://127.0.0.1:8899")

t0 = time.time()


@asyncio.coroutine
def server_coroutine():
    while True:
        msg = yield from server_socket.recv_string()
        print(msg)
        msg = "Server:: {}".format(msg)
        yield from server_socket.send_string(msg)
        t1 = time.time()
        elapsed_time = t1 - t0
        # print('elapsed time is {}'.format(elapsed_time))
        if elapsed_time > 15:
            print("Breaking Server loop")
            break


@asyncio.coroutine
def client_coroutine():
    counter = 0
    while True:
        yield from asyncio.sleep(2)
        msg = 'Message: {}'.format(counter)
        yield from client_socket.send_string(msg)
        res = yield from client_socket.recv_string()
        print(res)
        t1 = time.time()
        elapsed_time = t1 - t0
        print('elapsed time is {}'.format(elapsed_time))
        if elapsed_time > 10:
            print("Breaking Client loop")
            break
        counter += 1


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.gather(
        asyncio.ensure_future(server_coroutine()),
        asyncio.ensure_future(client_coroutine())
    ))

If you run the code you will see something like this:
Server:: Message: 4
elapsed time is 10.022311687469482
Breaking Client loop
OK, client_coroutine finished successfully, but what is the state of server_coroutine at this moment? It is stuck at the line msg = yield from server_socket.recv_string(), waiting to receive a string from server_socket, but that will never happen since there is no longer a client to send one! And since your event loop runs until both coroutines are done, it will run forever.
Here's the simplest fix:
@asyncio.coroutine
def server_coroutine():
    while True:
        msg = yield from server_socket.recv_string()
        if msg == 'CLOSE':  # LOOK HERE 1
            break
        print(msg)
        msg = "Server:: {}".format(msg)
        yield from server_socket.send_string(msg)
        t1 = time.time()
        elapsed_time = t1 - t0
        # print('elapsed time is {}'.format(elapsed_time))
        if elapsed_time > 15:
            print("Breaking Server loop")
            break


@asyncio.coroutine
def client_coroutine():
    counter = 0
    while True:
        yield from asyncio.sleep(2)
        msg = 'Message: {}'.format(counter)
        yield from client_socket.send_string(msg)
        res = yield from client_socket.recv_string()
        print(res)
        t1 = time.time()
        elapsed_time = t1 - t0
        print('elapsed time is {}'.format(elapsed_time))
        if elapsed_time > 10:
            print("Breaking Client loop")
            yield from client_socket.send_string('CLOSE')  # LOOK HERE 2
            break
        counter += 1
Note that this fix only demonstrates the issue and one possible way to solve it.
In real life I think you would want to do something different: probably set timeouts on your coroutines to guarantee they won't get stuck forever if the client/server stops responding.
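For example, here is a minimal sketch (not part of the original answer, and the 5-second timeout is an arbitrary assumption) of what a timeout on the server side could look like, using asyncio.wait_for around the receive call:

@asyncio.coroutine
def server_coroutine():
    while True:
        try:
            # Give up if no message arrives within 5 seconds (assumed value)
            msg = yield from asyncio.wait_for(server_socket.recv_string(), timeout=5)
        except asyncio.TimeoutError:
            print("No message within 5 seconds, breaking Server loop")
            break
        yield from server_socket.send_string("Server:: {}".format(msg))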

Related

python async - functions not starting at the same time

I have the following program here:
import datetime
import asyncio
import time
import math


async def count1():
    s = 0
    print('start time count 1: ' + str(datetime.datetime.now()))
    for i in range(100000000):
        s += math.cos(i)
    print('end time count 1: ' + str(datetime.datetime.now()))
    return s


async def count2():
    s = 0
    print('start time count 2: ' + str(datetime.datetime.now()))
    for i in range(1000000):
        s += math.cos(i)
    print('end time count 2: ' + str(datetime.datetime.now()))
    return s


async def main():
    start_time = time.time()
    task = asyncio.gather(count1(), count2())
    results = await task
    end_time = time.time()
    print(f"Result 1: {results[0]}")
    print(f"Result 2: {results[1]}")
    print(f"Total time taken: {end_time - start_time:.2f} seconds")

asyncio.run(main())
The output is
start time count 1: 2023-02-16 12:26:19.322523
end time count 1: 2023-02-16 12:26:40.866866
start time count 2: 2023-02-16 12:26:40.868166
end time count 2: 2023-02-16 12:26:41.055005
Result 1: 1.534369444774577
Result 2: -0.28870546796843
Total time taken: 21.73 seconds
I am trying to get count1() and count2() to start working at the same time, but as seen in the output, that is not happening. count2() only starts after count1() has ended, and I am not sure why.
I also tried replacing the lines in main() with:
task1 = asyncio.create_task(count1())
task2 = asyncio.create_task(count2())
result1 = await task1
result2 = await task2
but this also does not result in count1() and count2() starting at the same time.
async is essentially cooperative multitasking. Neither function awaits, so they hog the entire process for as long as they run and don't yield to other functions.
You could add a strategic
if i % 10000 == 0:
    await asyncio.sleep(0)
in the loop so they "give way" to other async coroutines.
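As a sketch of what that could look like, applied to count1 from the question (the 10000 interval is an arbitrary choice): this makes the two coroutines interleave, but it does not make the total computation faster, since they still share a single thread.

import asyncio
import math

async def count1():
    s = 0
    for i in range(100000000):
        s += math.cos(i)
        # Yield control back to the event loop every 10000 iterations
        # so other coroutines get a chance to run.
        if i % 10000 == 0:
            await asyncio.sleep(0)
    return s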
Async doesn’t mean “running at the same time”. Async means that when one of the functions waits (typically for some I/O), the control can be passed to another function.
In practice async can be used e.g. to have one process wait for several responses from web at the same time. It won’t allow you to e.g. run the same computation on several CPU cores (this stuff is generally hard to achieve in Python due to the infamous GIL problem).
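If the goal is to actually use two CPU cores for this kind of CPU-bound work, a process pool is the usual tool. Here is a minimal sketch (reusing the function names from the question, but as plain synchronous functions):

import concurrent.futures
import math
import time

def count1():
    return sum(math.cos(i) for i in range(100000000))

def count2():
    return sum(math.cos(i) for i in range(1000000))

if __name__ == '__main__':
    start = time.time()
    # Each submitted function runs in its own worker process,
    # so the two computations really do run at the same time.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        future1 = executor.submit(count1)
        future2 = executor.submit(count2)
        print(f"Result 1: {future1.result()}")
        print(f"Result 2: {future2.result()}")
    print(f"Total time taken: {time.time() - start:.2f} seconds")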

Asyncio with locks not working as expected with add_done_callback

I have an async method, as shown below.
I pass in lists of 1000 numbers, where the method will pass in each number to a helper function which will return something from a website.
I have a global variable called count, which I surround with locks to make sure it doesn't get changed by anything else.
I use add_done_callback with the task to make this method async.
The goal is to keep sending a number from the list of 1000 numbers to the server, and only when the server returns data (which can take anywhere from 0.1 to 2 seconds) to pause, write the data to a SQL database, and then continue.
The code works as expected without locks, or without making the callback function (named 'function' below) asynchronous. But adding locks gives me an error:
RuntimeWarning: coroutine 'function' was never awaited
  self._context.run(self._callback, *self._args)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
I am super new to async in Python, so any help/advice is greatly appreciated.
My code is shown below. It is just a simple draft:
import time
import random
import asyncio
# from helper import get_message_from_server


async def get(number):
    # get_message_from_server(number), which takes somewhere between 0.1 to 2 seconds
    await asyncio.sleep(random.uniform(0.1, 2))
    s = 'Done with number ' + number
    return s


async def function(future, lock):
    global count
    print(future.result())
    # write future.result() to db
    acquired = await lock.acquire()
    count -= 1 if (count > 1) else 0
    lock.release()


async def main(numbers, lock):
    global count
    count = 0
    for i, number in enumerate(numbers):
        print('number:', number, 'count:', count)
        acquired = await lock.acquire()
        count += 1
        lock.release()
        task = asyncio.create_task(get(number))
        task.add_done_callback(
            lambda x: function(x, lock)
        )
        if (count == 50):
            print('Reached 50')
            await task
            acquired = await lock.acquire()
            count = 0
            lock.release()
        if (i == len(numbers) - 1):
            await task


def make_numbers():
    count = []
    for i in range(1001):
        count.append(str(i))
    return count


if __name__ == '__main__':
    numbers = make_numbers()
    loop = asyncio.get_event_loop()
    lock = asyncio.Lock()
    try:
        loop.run_until_complete(main(numbers, lock))
    except Exception as e:
        pass
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.stop()
The above comment helped a lot
This is what the final working code looks like:
import time
import random
import asyncio
from functools import partial
# from helper import get_message_from_server


async def get(number):
    # get_message_from_server(number), which takes somewhere between 0.1 to 2 seconds
    await asyncio.sleep(random.uniform(0.1, 2))
    s = 'Done with number ' + number
    return s


def function(result, lock):
    print(result.result())

    async def count_decrement(lock):
        global count
        print('in count decrement')
        acquired = await lock.acquire()
        count -= 1 if (count > 1) else 0
        lock.release()

    asyncio.create_task(count_decrement(lock))


async def main(numbers, lock):
    global count
    count = 0
    for i, number in enumerate(numbers):
        print('number:', number, 'count:', count)
        acquired = await lock.acquire()
        count += 1
        lock.release()
        task = asyncio.create_task(get(number))
        task.add_done_callback(partial(function, lock=lock))
        if (count == 50):
            print('Reached 50')
            await task
            acquired = await lock.acquire()
            count = 0
            lock.release()
        if (i == len(numbers) - 1):
            await task


def make_numbers():
    count = []
    for i in range(1001):
        count.append(str(i))
    return count


if __name__ == '__main__':
    numbers = make_numbers()
    loop = asyncio.get_event_loop()
    lock = asyncio.Lock()
    try:
        loop.run_until_complete(main(numbers, lock))
    except Exception as e:
        pass
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.stop()
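The underlying reason (based on how asyncio callbacks work, not spelled out in the referenced comment): add_done_callback invokes its callback as a plain synchronous function, so an async def callback just creates a coroutine object that is never awaited, which is exactly the RuntimeWarning above. The working code therefore keeps the callback synchronous and schedules the async work itself. In isolation, the pattern looks something like this (hypothetical names):

import asyncio

def on_done(task):
    # Runs synchronously in the event loop when `task` finishes.
    # Schedule any async follow-up work instead of awaiting it here.
    asyncio.create_task(follow_up(task.result()))

async def follow_up(result):
    print('handling', result)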

Should I use a for loop or a while loop to make a break timer in Python?

I have this code to stop a function after a specific amount of time. I would loop through the function and then break out of it if it takes too long. Is there a better way to do it?
import time

def function_one():
    rec = (time.time())
    print("im starting")
    ans = str(time.time() - rec)
    ans = (round(float(ans), 15))
    print("this is where im doing something code")
    while ans < 10:
        return function_one()
        break
You can make it simpler like this:
import time

def function_one():
    start_time = time.time()
    while True:
        print('Function doing something ...')
        if time.time() - start_time > 10:
            break

function_one()
Here, I'm using a while loop just to keep the function running, but that depends on the details of your function.
In general, what you need is:
1. set the start time;
2. do whatever the function is supposed to be doing;
3. check if it's been running for too long and, in case it has, simply return.
So, something like:
import time
def function_one():
start_time = time.time()
# do your job
if time.time() - start_time > 10:
return something
function_one()
If you want to stop a function after a set amount of time has passed, I would use a while loop and do something like this.
import time

def function_one():
    start = (time.time())  # start time
    limit = 1  # time limit
    print("im starting")
    while (time.time() - start) < limit:
        # input code to do here
        pass
    print(f"finished after {time.time() - start} seconds")

function_one()

Do I need to pass variables to threads through Thread?

Is there any real difference between these two?
1. Variables are accessed in the threaded function, but not passed to it through Thread(args).
2. Variables are passed to the threaded function through Thread(args).
# 1
def do_something():
    event = Event()
    tracker = MyObject.instance()
    start_time = datetime.timestamp(datetime.now())

    def threaded_function():
        current_time = datetime.timestamp(datetime.now())
        while True:
            if current_time - start_time > 30:
                tracker.put("too long")
                break
            elif event.is_set():
                break
            else:
                time.sleep(1)

    thread = Thread(target=threaded_function)
    thread.start()
    # Do something that may take more than 30 seconds
    event.set()


# 2
def do_something():
    event = Event()

    def threaded_function(tracker, start_time):
        current_time = datetime.timestamp(datetime.now())
        while True:
            if current_time - start_time > 30:
                tracker.put("too long")
                break
            elif event.is_set():
                break
            else:
                time.sleep(1)

    tracker = MyTracker.instance()
    start_time = datetime.timestamp(datetime.now())
    thread = Thread(target=threaded_function, args=(tracker, start_time))
    thread.start()
    # Do something that may take more than 30 seconds
    event.set()
Practically speaking, there is little difference between the two, since you only start one thread.
In #1, threaded_function is a closure over two local variables of do_something. There would be no way to reliably vary those values between two or more threads, as any change to tracker or start_time would be visible to every thread started from that call to do_something.
In #2, tracker and start_time are local to threaded_function. They can be initialized to different values in each thread that runs threaded_function, and the values in one function are independent of values in another.
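A minimal sketch (hypothetical, not from the question) of that difference: the closure reads whatever the variable holds when the thread gets to it, while args freezes the value at the moment the thread is created.

import threading
import time

def demo():
    results = []
    value = 0

    def closure_worker():
        time.sleep(0.1)
        results.append(('closure', value))  # reads the variable itself -> 1

    def args_worker(v):
        time.sleep(0.1)
        results.append(('args', v))  # uses the value captured at start -> 0

    t1 = threading.Thread(target=closure_worker)
    t2 = threading.Thread(target=args_worker, args=(value,))
    t1.start()
    t2.start()
    value = 1  # changed after the threads were started
    t1.join()
    t2.join()
    print(results)  # [('closure', 1), ('args', 0)], order may vary

demo()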

How to call a pool with sleep between executions within a multiprocessing process in Python?

In the main function, I am starting a process to run the imp_workload() method in parallel for each DP_WORKLOAD.
#!/usr/bin/env python
import multiprocessing
import subprocess

if __name__ == "__main__":
    for DP_WORKLOAD in DP_WORKLOAD_NAME:
        p1 = multiprocessing.Process(target=imp_workload, args=(DP_WORKLOAD, DP_DURATION_SECONDS, DP_CONCURRENCY, ))
        p1.start()
However, inside this imp_workload() method, I need the import_command_run() method to run a number of processes (the number is equal to the variable DP_CONCURRENCY), but with a sleep of 60 seconds before each new execution.
This is the sample code I have written.
def imp_workload(DP_WORKLOAD, DP_DURATION_SECONDS, DP_CONCURRENCY):
    while DP_DURATION_SECONDS > 0:
        pool = multiprocessing.Pool(processes = DP_CONCURRENCY)
        for j in range(DP_CONCURRENCY):
            pool.apply_async(import_command_run, args=(DP_WORKLOAD, dp_workload_cmd, j,)
            # Sleep for 1 minute
            time.sleep(60)
        pool.close()
        # Clean the schemas after import is completed
        clean_schema(DP_WORKLOAD)
        # Sleep for 1 minute
        time.sleep(60)


def import_command_run(DP_WORKLOAD):
    abccmd = 'impdp admin/DP_PDB_ADMIN_PASSWORD#DP_PDB_FULL_NAME SCHEMAS=ABC'
    defcmd = 'impdp admin/DP_PDB_ADMIN_PASSWORD#DP_PDB_FULL_NAME SCHEMAS=DEF'
    # any of the above commands
    run_imp_cmd(eval(dp_workload_cmd))


def run_imp_cmd(cmd):
    output = subprocess.Popen([cmd], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    stdout, stderr = output.communicate()
    return stdout
When I tried running it in this format, I got the following error:
time.sleep(60)
^
SyntaxError: invalid syntax
So, how can I kick off the 'abccmd' job DP_CONCURRENCY times in parallel, with a sleep of 1 minute between each job, and with each of these pools running in its own multiprocessing process?
I am working on Python 2.7.5 (due to restrictions I can't use Python 3.x, so I will appreciate answers specific to Python 2.x).
P.S. This is a very large and complex script, so I have tried to post only the relevant excerpts. Please ask for more details if necessary (or if anything is unclear from this much).
Let me offer two possibilities:
Possibility 1
Here is an example of how you would kick off a worker function in parallel with DP_CURRENCY == 4 possible arguments (0, 1, 2 and 3), cycling over and over for up to DP_DURATION_SECONDS seconds with a pool size of DP_CURRENCY. As soon as a job completes, it is restarted, but with the guarantee that at least TIME_BETWEEN_SUBMITS == 60 seconds have elapsed between successive restarts.
from __future__ import print_function
from multiprocessing import Pool
import time
from queue import SimpleQueue

TIME_BETWEEN_SUBMITS = 60


def worker(i):
    print(i, 'started at', time.time())
    time.sleep(40)
    print(i, 'ended at', time.time())
    return i  # the argument


def main():
    q = SimpleQueue()

    def callback(result):
        # every time a job finishes, put result (the argument) on the queue
        q.put(result)

    DP_CURRENCY = 4
    DP_DURATION_SECONDS = TIME_BETWEEN_SUBMITS * 10
    pool = Pool(DP_CURRENCY)
    t = time.time()
    expiration = t + DP_DURATION_SECONDS
    # kick off initial tasks:
    start_times = [None] * DP_CURRENCY
    for i in range(DP_CURRENCY):
        pool.apply_async(worker, args=(i,), callback=callback)
        start_times[i] = time.time()
    while True:
        i = q.get()  # wait for a job to complete
        t = time.time()
        if t >= expiration:
            break
        time_to_wait = TIME_BETWEEN_SUBMITS - (t - start_times[i])
        if time_to_wait > 0:
            time.sleep(time_to_wait)
        pool.apply_async(worker, args=(i,), callback=callback)
        start_times[i] = time.time()
    # wait for all jobs to complete:
    pool.close()
    pool.join()


# required by Windows:
if __name__ == '__main__':
    main()
Possibility 2
This is closer to what you had, in that TIME_BETWEEN_SUBMITS == 60 seconds of sleeping is done between successive submissions of any two jobs. But to me this doesn't make as much sense. If, for example, the worker function only took 50 seconds to complete, you would not be doing any parallel processing at all. In fact, each job would need to take at least 180 (i.e. (DP_CURRENCY - 1) * TIME_BETWEEN_SUBMITS) seconds to complete in order to have all 4 processes in the pool busy running jobs at the same time.
from __future__ import print_function
from multiprocessing import Pool
import time
from queue import SimpleQueue

TIME_BETWEEN_SUBMITS = 60


def worker(i):
    print(i, 'started at', time.time())
    # A task must take at least 180 seconds to run to have 4 tasks running in parallel if
    # you wait 60 seconds between starting each successive task:
    # take 182 seconds to run
    time.sleep(3 * TIME_BETWEEN_SUBMITS + 2)
    print(i, 'ended at', time.time())
    return i  # the argument


def main():
    q = SimpleQueue()

    def callback(result):
        # every time a job finishes, put result (the argument) on the queue
        q.put(result)

    # at most 4 tasks at a time but only if worker takes at least 3 * TIME_BETWEEN_SUBMITS
    DP_CURRENCY = 4
    DP_DURATION_SECONDS = TIME_BETWEEN_SUBMITS * 10
    pool = Pool(DP_CURRENCY)
    t = time.time()
    expiration = t + DP_DURATION_SECONDS
    # kick off initial tasks:
    for i in range(DP_CURRENCY):
        if i != 0:
            time.sleep(TIME_BETWEEN_SUBMITS)
        pool.apply_async(worker, args=(i,), callback=callback)
    time_last_job_submitted = time.time()
    while True:
        i = q.get()  # wait for a job to complete
        t = time.time()
        if t >= expiration:
            break
        time_to_wait = TIME_BETWEEN_SUBMITS - (t - time_last_job_submitted)
        if time_to_wait > 0:
            time.sleep(time_to_wait)
        pool.apply_async(worker, args=(i,), callback=callback)
        time_last_job_submitted = time.time()
    # wait for all jobs to complete:
    pool.close()
    pool.join()


# required by Windows:
if __name__ == '__main__':
    main()
