I am using this asynchronous project (called Broker - see git) with the following code:
proxies = asyncio.Queue()
broker = Broker(proxies)
tasks = asyncio.gather(
    broker.find(types = ['HTTPS'], strict = True, limit = 10),
    self.storeProxies(proxies)
)
loop = asyncio.get_event_loop()
loop.run_until_complete(tasks)
where self.storeProxies is an async function that contains
while True:
    proxy = await proxies.get()
    if proxy is None: break
    self.proxies['http://{0}:{1}'.format(proxy.host, proxy.port)] = 1
However, I do not think that it is related to the Broker part (not sure of course).
Every time I run this code, after a fairly random number of successes, it fails with [WinError 10038] An operation was attempted on something that is not a socket. I then did some research and ended up at this answer. But when trying that code, I get this error:
ValueError: loop argument must agree with Future
Any ideas on how to deal with this?
Details that might be useful:
OS: Windows 10 (version 1809)
Python version: 3.7.1
Based on the answer from Mikhail Gerasimov:
async def storeProxies(self, proxies):
    logger.info("IN IT!")
    while True:
        proxy = await proxies.get()
        if proxy is None: break
        self.proxies['http://{0}:{1}'.format(proxy.host, proxy.port)] = 1
async def getfetching(self, amount):
    from proxybroker import Broker
    proxies = asyncio.Queue()
    broker = Broker(proxies)
    return await asyncio.gather(
        broker.find(
            types = ['HTTP', 'HTTPS'],
            strict = True,
            limit = 10
        ),
        self.storeProxies(proxies)
    )
def fetchProxies(self, amount):
    if os.name == 'nt':
        loop = asyncio.SelectorEventLoop()  # for subprocess' pipes on Windows
        asyncio.set_event_loop(loop)
    else:
        loop = asyncio.get_event_loop()
    loop.run_until_complete(self.getfetching(amount))
    logger.info("FETCHING COMPLETE!!!!!!")
    logger.info('Proxies: {}'.format(self.proxies))
where fetchProxies is called at certain intervals from another place. This works perfectly the first time, but then "fails" with the warnings:
2019-04-06 21:04:21 [py.warnings] WARNING: d:\users\me\anaconda3\lib\site-packages\proxybroker\providers.py:78: DeprecationWarning: The object should be created from async function
headers=get_headers(), cookies=self._cookies, loop=self._loop
2019-04-06 21:04:21 [py.warnings] WARNING: d:\users\me\anaconda3\lib\site-packages\aiohttp\connector.py:730: DeprecationWarning: The object should be created from async function
loop=loop)
2019-04-06 21:04:21 [py.warnings] WARNING: d:\users\me\anaconda3\lib\site-packages\aiohttp\cookiejar.py:55: DeprecationWarning: The object should be created from async function
super().__init__(loop=loop)
followed by behavior that looks like an infinite loop (hard stuck, outputting nothing). Notably, this also happened to me in the example from Gerasimov (in the comments) when importing Broker globally. Finally, I am starting to see the light at the end of the tunnel.
Take a look at this part:
proxies = asyncio.Queue()
# ...
loop.run_until_complete(tasks)
asyncio.Queue(), when created, is bound to the current event loop (the default event loop, or one that was set as current with asyncio.set_event_loop()). Usually, only the loop an object is bound to can manage it. If you change the current loop, you should recreate the object. The same is true for many other asyncio objects.
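For example, here is a minimal self-contained sketch of the failure mode (hypothetical code; on Python 3.7 the last line fails with an error about the Future being attached to a different loop):

import asyncio

async def consume(q):
    await q.get()  # the getter Future is created on the loop q was bound to

loop_a = asyncio.new_event_loop()
asyncio.set_event_loop(loop_a)
q = asyncio.Queue()  # bound to loop_a at creation time (Python < 3.10)

loop_b = asyncio.new_event_loop()
asyncio.set_event_loop(loop_b)
loop_b.run_until_complete(consume(q))  # fails: q still belongs to loop_a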
To make sure each object is bound to the new event loop, it's better to create asyncio-related objects only after you set the new event loop as current. It'll look like this:
async def main():
    proxies = asyncio.Queue()
    broker = Broker(proxies)
    return await asyncio.gather(
        broker.find(
            types = ['HTTPS'],
            strict = True,
            limit = amount
        ),
        self.storeProxies(proxies)
    )

if os.name == 'nt':
    loop = asyncio.ProactorEventLoop()  # for subprocess' pipes on Windows
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()

loop.run_until_complete(main())
Sometimes (rarely) you'll have to move some imports inside main() too.
Upd:
The problem happens because you change the event loop between fetchProxies calls, but Broker is imported only once (Python caches imported modules).
Reloading Broker didn't work for me, so I couldn't find a better way than to reuse the event loop you set once.
Replace this code
if os.name == 'nt':
    loop = asyncio.SelectorEventLoop()  # for subprocess' pipes on Windows
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()
with this
if os.name == 'nt':
    loop = asyncio.get_event_loop()
    if not isinstance(loop, asyncio.SelectorEventLoop):
        loop = asyncio.SelectorEventLoop()  # for subprocess' pipes on Windows
        asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()
P.S.
BTW you don't have to do this in the first place:
if os.name == 'nt':
    loop = asyncio.SelectorEventLoop()
    asyncio.set_event_loop(loop)
Windows already uses SelectorEventLoop as the default, and it doesn't support pipes (doc).
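You can verify which loop class you get by default, for example:

import asyncio
print(type(asyncio.get_event_loop()))  # on Windows + Python 3.7 this is a SelectorEventLoop subclass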
I'm trying to convert a low-level library, currently targeted at asyncio, to anyio.
However, I'm having a hard time figuring out the best way to do so, since the library uses
asyncio.Future futures to represent asynchronous interaction with two worker threads.
Since the logic in the threads is much more complicated than what I'm showing here, converting them to async code is not an option for me at this point. It's also not standard network communication, so I cannot just use an existing anyio based library instead.
The only solution I can come up with is using a thread-safe result queue (queue.Queue) that gets created for every sent message. SendMsgAsync would create the return queue, store a copy of the queue and the message in pending_msgs, and send the message via send_queue to the send_thread. Then it would try to get the result from the result queue, async-sleeping in between.
Once a reply is received, the recv_thread would put the reply into the result queue belonging to the original message (fetched from pending_msgs), causing SendMsgAsync to finish.
But polling the queue in SendMsgAsync doesn't seem like the right thing to do.
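For concreteness, a rough sketch of that polling idea (assuming import queue and import anyio; the names mirror the code below, and the poll interval is arbitrary):

async def SendMsgAsync_polling(themsg):
    result_queue = queue.Queue(1)           # thread-safe, one slot per message
    pending_msgs[themsg.uuid] = (themsg, result_queue)
    send_queue.put(themsg)                  # hand the message to send_thread
    while True:                             # poll the thread-safe queue...
        try:
            return result_queue.get_nowait()
        except queue.Empty:
            await anyio.sleep(0.01)         # ...yielding to the event loop in between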
anyio does have anyio.create_memory_object_stream(), which seems to be a form of async queue, but the documentation doesn't state whether these streams are thread-safe, so I'm doubtful that I can use them between the event loop and my threads.
With futures this would be much more elegant.
I was also wondering whether I could use concurrent.futures, but I could not find any examples where those can be used with anyio after manually creating them. It seems anyio can return and check them, but apparently only when they are bound to a started task. Since I do not need a new task running in the event loop (just a pseudo-task whose result is monitored), I don't know how to solve this elegantly. In a nutshell, a way to make anyio await a concurrent.futures object I created myself would solve my issue, but I have the feeling this is not compatible with the anyio paradigm of doing async.
Any ideas how to interface this code with anyio are highly appreciated.
Here is a simplification of the code I have:
import asyncio
import queue
import threading
from functools import partial

send_queue: queue.Queue = queue.Queue(10)  ## used to send messages to send_thread_fun
pending_msgs: dict = dict()                ## stored messages waiting for replies

## message classes
class msg_class:
    def __init__(self, uuid) -> None:
        self.uuid: str = uuid

class reply_class(msg_class):
    def __init__(self, uuid, success: bool) -> None:
        super().__init__(uuid)
        self.success = success

## container class for stored messages
class stored_msg_class:
    def __init__(self, a_msg: msg_class, future: asyncio.Future) -> None:
        self.msg = a_msg
        self.future = future
## async send function as interface to outside async world
async def SendMsgAsyncAndGetReply(themsg: msg_class, loop: asyncio.AbstractEventLoop):
    afuture: asyncio.Future = SendMsg(themsg, loop)
    return await afuture

## this send function is only called internally
def SendMsg(themsg: msg_class, loop: asyncio.AbstractEventLoop):
    msg_future = loop.create_future()
    ## remove the command from the pending list if the future is cancelled externally;
    ## this also runs on normal completion, so it must be harmless in that case too
    msg_future.add_done_callback(partial(RemoveMsg_WhenFutureDone, uuid=themsg.uuid))
    pending_asyncmsg = stored_msg_class(themsg, msg_future)
    pending_msgs[themsg.uuid] = pending_asyncmsg
    send_queue.put(themsg)  ## hand the message to send_thread
    return pending_asyncmsg.future
## message status updates
def CompleteMsg(pendingmsg: stored_msg_class, result) -> None:
    future = pendingmsg.future
    future.get_loop().call_soon_threadsafe(future.set_result, result)

def FailMsg(pendingmsg: stored_msg_class, exception: Exception):
    future = pendingmsg.future
    future.get_loop().call_soon_threadsafe(future.set_exception, exception)

def CancelMsg(pendingmsg: stored_msg_class):
    future = pendingmsg.future
    future.get_loop().call_soon_threadsafe(future.cancel)

def RemoveMsg_WhenFutureDone(future: asyncio.Future, uuid):
    ## called by the future's done callback once a pending msg's future is
    ## cancelled, or once a result or an exception is set
    s_msg: stored_msg_class = pending_msgs.pop(uuid, None)
## the thread functions:
def send_thread_fun():
    while True:
        a_msg: msg_class = send_queue.get()
        send(a_msg)
        ## ...

def recv_thread_fun():
    while True:
        a_reply: reply_class = receive()
        pending_msg: stored_msg_class = pending_msgs.pop(a_reply.uuid, None)
        if pending_msg is not None:
            if a_reply.success:
                CompleteMsg(pending_msg, a_reply)
            else:
                FailMsg(pending_msg, Exception(a_reply))
        ## ...
## low level functions
def send(a_msg: msg_class):
    hardware_send(a_msg)

def receive() -> msg_class:
    return hardware_recv()
## using the async message interface:
def main():
    tx_thread = threading.Thread(target=send_thread_fun, name="send_thread", daemon=True)
    rx_thread = threading.Thread(target=recv_thread_fun, name="recv_thread", daemon=True)
    rx_thread.start()
    tx_thread.start()
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    msg1 = msg_class("123")
    msg2 = msg_class("456")
    m1 = SendMsgAsyncAndGetReply(msg1, loop)
    m2 = SendMsgAsyncAndGetReply(msg2, loop)
    r12 = loop.run_until_complete(asyncio.gather(m1, m2))
Objective:
I am trying to scrape multiple URLs simultaneously. I don't want to make too many requests at the same time so I am using this solution to limit it.
Problem:
Requests are being made for ALL tasks instead of for a limited number at a time.
Stripped-down Code:
async def download_all_product_information():
    # TO LIMIT THE NUMBER OF CONCURRENT REQUESTS
    async def gather_with_concurrency(n, *tasks):
        semaphore = asyncio.Semaphore(n)

        async def sem_task(task):
            async with semaphore:
                return await task

        return await asyncio.gather(*(sem_task(task) for task in tasks))

    # FUNCTION TO ACTUALLY DOWNLOAD INFO
    async def get_product_information(url_to_append):
        url = 'https://www.amazon.com.br' + url_to_append
        print('Product Information - Page ' + str(current_page_number) + ' for category ' +
              str(category_index) + '/' + str(len(all_categories)) + ' in ' + gender)
        source = await get_source_code_or_content(url, should_render_javascript=True)
        time.sleep(random.uniform(2, 5))
        return source

    # LOOP WHERE STUFF GETS DONE
    for current_page_number in range(1, 401):
        for gender in os.listdir(base_folder):
            all_tasks = []
            # check all products in the current page
            all_products_in_current_page = open_list(os.path.join(base_folder, gender, category, current_page))
            for product_specific_url in all_products_in_current_page:
                current_task = asyncio.create_task(get_product_information(product_specific_url))
                all_tasks.append(current_task)
            await gather_with_concurrency(random.randrange(8, 15), *all_tasks)
async def main():
    await download_all_product_information()

# just to make sure there are not any problems caused by two event loops
if asyncio.get_event_loop().is_running():  # only patch if needed (i.e. running in Notebook, Spyder, etc)
    import nest_asyncio
    nest_asyncio.apply()

# for asynchronous functionality
if __name__ == '__main__':
    asyncio.run(main())
What am I doing wrong? Thanks!
What is wrong is this line:
current_task = asyncio.create_task(get_product_information(product_specific_url))
When you create a "task", it is immediately scheduled for execution. As soon as your code yields execution to the asyncio loop (at any "await" expression), asyncio will loop through and execute all of your tasks.
The semaphore, in the original snippet you pointed to, guarded the creation of the tasks themselves, ensuring only "n" tasks would be active at a time. What is passed to gather_with_concurrency in that snippet are coroutines.
Coroutines, unlike tasks, are objects that are ready to be awaited but are not yet scheduled. They can be passed around for free, just like any other object - they will only be executed when they are either awaited or wrapped by a task (and then only when control passes to the asyncio loop).
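A tiny self-contained illustration of the difference:

import asyncio

async def job(n):
    print('started', n)
    await asyncio.sleep(1)

async def main():
    coro = job(1)                        # inert coroutine object: nothing runs yet
    task = asyncio.create_task(job(2))   # scheduled immediately
    await asyncio.sleep(0)               # yield to the loop: "started 2" prints here
    await coro                           # only now does job(1) start
    await task

asyncio.run(main())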
In your code, you are creating the coroutine with the get_product_information call and immediately wrapping it in a task. By the time the await on the gather_with_concurrency line runs, they are all already running at once.
The fix is simple: do not create a task at this point; that happens just inside the code guarded by your semaphore. Add just the raw coroutines to your list:
...
all_coroutines = []
# check all products in the current page
all_products_in_current_page = open_list(os.path.join(base_folder, gender, category, current_page))
for product_specific_url in all_products_in_current_page:
    current_coroutine = get_product_information(product_specific_url)
    all_coroutines.append(current_coroutine)
await gather_with_concurrency(random.randrange(8, 15), *all_coroutines)
There is still an unrelated incorrectness in this code that will make concurrency fail: you are making a synchronous call to time.sleep inside get_product_information. This will stall the asyncio loop at that point until the sleep is over. The correct thing to do is to use await asyncio.sleep(...).
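That is, inside get_product_information, keeping everything else unchanged:

source = await get_source_code_or_content(url, should_render_javascript=True)
await asyncio.sleep(random.uniform(2, 5))  # yields control back to the loop while waiting
return source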
I have an async function and need to run it with apscheduler every N minutes.
There is a python code below
import asyncio
import os

from aiohttp import ClientSession
from apscheduler.schedulers.asyncio import AsyncIOScheduler

URL_LIST = ['<url1>',
            '<url2>',
            '<url2>',
            ]
def demo_async(urls):
    """Fetch list of web pages asynchronously."""
    loop = asyncio.get_event_loop()                  # event loop
    future = asyncio.ensure_future(fetch_all(urls))  # tasks to do
    loop.run_until_complete(future)                  # loop until done

async def fetch_all(urls):
    tasks = []  # list of fetch tasks
    async with ClientSession() as session:
        for url in urls:
            task = asyncio.ensure_future(fetch(url, session))
            tasks.append(task)               # create list of tasks
        _ = await asyncio.gather(*tasks)     # gather task responses

async def fetch(url, session):
    """Fetch a url, using specified ClientSession."""
    async with session.get(url) as response:
        resp = await response.read()
        print(resp)

if __name__ == '__main__':
    scheduler = AsyncIOScheduler()
    scheduler.add_job(demo_async, args=[URL_LIST], trigger='interval', seconds=15)
    scheduler.start()
    print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))

    # Execution will block here until Ctrl+C (Ctrl+Break on Windows) is pressed.
    try:
        asyncio.get_event_loop().run_forever()
    except (KeyboardInterrupt, SystemExit):
        pass
But when I tried to run it, I got the following error:
Job "demo_async (trigger: interval[0:00:15], next run at: 2017-10-12 18:21:12 +04)" raised an exception.....
..........\lib\asyncio\events.py", line 584, in get_event_loop
% threading.current_thread().name)
RuntimeError: There is no current event loop in thread '<concurrent.futures.thread.ThreadPoolExecutor object at 0x0356B150>_0'.
Could you please help me with this?
Python 3.6, APScheduler 3.3.1.
In your def demo_async(urls), try to replace:
loop = asyncio.get_event_loop()
with:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
The important thing that hasn't been mentioned is why the error occurs. For me personally, knowing why the error occurs is as important as solving the actual problem.
Let's take a look at the implementation of get_event_loop in BaseDefaultEventLoopPolicy:
class BaseDefaultEventLoopPolicy(AbstractEventLoopPolicy):
    ...
    def get_event_loop(self):
        """Get the event loop.

        This may be None or an instance of EventLoop.
        """
        if (self._local._loop is None and
                not self._local._set_called and
                isinstance(threading.current_thread(), threading._MainThread)):
            self.set_event_loop(self.new_event_loop())
        if self._local._loop is None:
            raise RuntimeError('There is no current event loop in thread %r.'
                               % threading.current_thread().name)
        return self._local._loop
You can see that the self.set_event_loop(self.new_event_loop()) is only executed if all of the below conditions are met:
self._local._loop is None - _local._loop is not set
not self._local._set_called - set_event_loop hasn't been called yet
isinstance(threading.current_thread(), threading._MainThread) - current thread is the main one (this is not True in your case)
Therefore the exception is raised, because no loop is set in the current thread:
if self._local._loop is None:
    raise RuntimeError('There is no current event loop in thread %r.'
                       % threading.current_thread().name)
Just pass fetch_all to scheduler.add_job() directly. The asyncio scheduler supports coroutine functions as job targets.
If the target callable is not a coroutine function, it will be run in a worker thread (due to historical reasons), hence the exception.
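A sketch of that change, reusing fetch_all and URL_LIST from the question:

if __name__ == '__main__':
    scheduler = AsyncIOScheduler()
    # pass the coroutine function itself; AsyncIOScheduler runs it on its event loop
    scheduler.add_job(fetch_all, args=[URL_LIST], trigger='interval', seconds=15)
    scheduler.start()
    try:
        asyncio.get_event_loop().run_forever()
    except (KeyboardInterrupt, SystemExit):
        pass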
I had a similar issue where I wanted my asyncio module to be callable from a non-asyncio script (which was running under gevent... don't ask...). The code below resolved my issue because it tries to get the current event loop, but will create one if there isn't one in the current thread. Tested in python 3.9.11.
try:
    loop = asyncio.get_event_loop()
except RuntimeError as e:
    if str(e).startswith('There is no current event loop in thread'):
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        raise
Use asyncio.run() instead of directly using the event loop.
It creates a new loop and closes it when finished.
This is what run() looks like:
if events._get_running_loop() is not None:
    raise RuntimeError(
        "asyncio.run() cannot be called from a running event loop")
if not coroutines.iscoroutine(main):
    raise ValueError("a coroutine was expected, got {!r}".format(main))

loop = events.new_event_loop()
try:
    events.set_event_loop(loop)
    loop.set_debug(debug)
    return loop.run_until_complete(main)
finally:
    try:
        _cancel_all_tasks(loop)
        loop.run_until_complete(loop.shutdown_asyncgens())
    finally:
        events.set_event_loop(None)
        loop.close()
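Typical usage is just (a generic sketch, not tied to the scheduler setup from the question):

import asyncio

async def main():
    await asyncio.sleep(1)
    print('done')

asyncio.run(main())  # creates a fresh loop, runs main(), then closes the loop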
Since this question continues to appear on the first page, I will write my problem and my answer here.
I had a RuntimeError: There is no current event loop in thread 'Thread-X' when using flask-socketio and Bleak.
Edit: well, I refactored my file and made a class.
I initialized the loop in the constructor, and now everything is working fine:
class BLE:
    def __init__(self):
        self.loop = asyncio.get_event_loop()

    # function example, improvement of
    # https://github.com/hbldh/bleak/blob/master/examples/discover.py :
    def list_bluetooth_low_energy(self) -> list:
        async def run() -> list:
            BLElist = []
            devices = await bleak.discover()
            for d in devices:
                BLElist.append(d.name)
            return 'success', BLElist
        return self.loop.run_until_complete(run())
Usage:
ble = path.to.lib.BLE()
list = ble.list_bluetooth_low_energy()
Original answer:
The solution was stupid. I did not pay attention to what I did, but I had moved some imports out of a function, like this:
import asyncio, platform
from bleak import discover

def listBLE() -> dict:
    async def run() -> dict:
        # my code that kept throwing exceptions.
    loop = asyncio.get_event_loop()
    ble_list = loop.run_until_complete(run())
    return ble_list
So I thought that I needed to change something in my code, and I created a new event loop using this piece of code just before the line with get_event_loop():
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
At this moment I was pretty happy, since I had a loop running. But the loop was not responding, and my code relied on a timeout to return some values, so it was pretty bad for my app.
It took me nearly two hours to figure out that the problem was the import, and here is my (working) code:
def list() -> dict:
    import asyncio, platform
    from bleak import discover

    async def run() -> dict:
        # my code running perfectly
    loop = asyncio.get_event_loop()
    ble_list = loop.run_until_complete(run())
    return ble_list
Reading the given answers, I only managed to fix my websocket thread by using the hint (try replacing) in https://stackoverflow.com/a/46750562/598513 on this page:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
The documentation of BaseDefaultEventLoopPolicy explains:
Default policy implementation for accessing the event loop.
In this policy, each thread has its own event loop. However, we
only automatically create an event loop by default for the main
thread; other threads by default have no event loop.
So when using a thread, one has to create the loop. And I had to reorder my code, so my final code is:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
# !!! Place code after setting the loop !!!
server = Server()
start_server = websockets.serve(server.ws_handler, 'localhost', port)
In my case the line was like this:
asyncio.get_event_loop().run_until_complete(test())
I replaced the above line with this line, which solved my problem:
asyncio.run(test())
I've been trying to make a bot in Slack that remains responsive even if it hasn't finished processing earlier commands, so it could go and do something that takes some time without locking up. It should return whatever is finished first.
I think I'm getting part of the way there: it now doesn't ignore stuff that's typed in before an earlier command is finished running. But it still doesn't allow threads to "overtake" each other - a command called first will return first, even if it takes much longer to complete.
import asyncio
from slackclient import SlackClient
import time, datetime as dt

token = "my token"
sc = SlackClient(token)

@asyncio.coroutine
def sayHello(waitPeriod = 5):
    yield from asyncio.sleep(waitPeriod)
    msg = 'Hello! I waited {} seconds.'.format(waitPeriod)
    return msg

@asyncio.coroutine
def listen():
    yield from asyncio.sleep(1)
    x = sc.rtm_connect()
    info = sc.rtm_read()
    if len(info) == 1:
        if r'/hello' in info[0]['text']:
            print(info)
            try:
                waitPeriod = int(info[0]['text'][6:])
            except:
                print('Can not read a time period. Using 5 seconds.')
                waitPeriod = 5
            msg = yield from sayHello(waitPeriod = waitPeriod)
            print(msg)
            chan = info[0]['channel']
            sc.rtm_send_message(chan, msg)
    asyncio.async(listen())

def main():
    print('here we go')
    loop = asyncio.get_event_loop()
    asyncio.async(listen())
    loop.run_forever()

if __name__ == '__main__':
    main()
When I type /hello 12 and /hello 2 into the Slack chat window, the bot does respond to both commands now. However it doesn't process the /hello 2 command until it's finished doing the /hello 12 command. My understanding of asyncio is a work in progress, so it's quite possible I'm making a very basic error. I was told in a previous question that things like sc.rtm_read() are blocking functions. Is that the root of my problem?
Thanks a lot,
Alex
What is happening is your listen() coroutine is blocking at the yield from sayHello() statement. Only once sayHello() completes will listen() be able to continue on its merry way. The crux is that the yield from statement (or await from Python 3.5+) is blocking. It chains the two coroutines together and the 'parent' coroutine can't complete until the linked 'child' coroutine completes. (However, 'neighbouring' coroutines that aren't part of the same linked chain are free to proceed in the meantime).
The simple way to release sayHello() without holding up listen() in this case is to use listen() as a dedicated listening coroutine and to offload all subsequent actions into their own Task wrappers instead, thus not hindering listen() from responding to subsequent incoming messages. Something along these lines.
@asyncio.coroutine
def sayHello(waitPeriod, sc, chan):
    yield from asyncio.sleep(waitPeriod)
    msg = 'Hello! I waited {} seconds.'.format(waitPeriod)
    print(msg)
    sc.rtm_send_message(chan, msg)

@asyncio.coroutine
def listen():
    # connect once only if possible:
    x = sc.rtm_connect()
    # use a while True block instead of repeatedly calling a new Task at the end
    while True:
        yield from asyncio.sleep(0)  # use 0 unless you need to wait a full second?
        #x = sc.rtm_connect()  # probably not necessary to reconnect each loop?
        info = sc.rtm_read()
        if len(info) == 1:
            if r'/hello' in info[0]['text']:
                print(info)
                try:
                    waitPeriod = int(info[0]['text'][6:])
                except:
                    print('Can not read a time period. Using 5 seconds.')
                    waitPeriod = 5
                chan = info[0]['channel']
                asyncio.async(sayHello(waitPeriod, sc, chan))
I am learning python-asyncio, and I'm trying to write a simple model.
I have a function handling tasks. While handling, it goes to another remote service for data and then prints a message.
My code:
import asyncio
import time
import threading as th

dd = 0

@asyncio.coroutine
def slow_operation():
    global dd
    dd += 1
    print('Future is started!', dd)
    yield from asyncio.sleep(10 - dd)  # request to another server
    print('Future is done!', dd)

def add():
    while True:
        time.sleep(1)
        asyncio.ensure_future(slow_operation(), loop=loop)

loop = asyncio.get_event_loop()
future = asyncio.Future()
asyncio.ensure_future(slow_operation(), loop=loop)
th.Thread(target=add).start()
loop.run_forever()
But this code doesn't switch the context while sleeping in:
yield from asyncio.sleep(10 - dd)
How can I fix that?
asyncio.ensure_future is not thread-safe, which is why the slow_operation tasks are not started when they should be. Use loop.call_soon_threadsafe:
callback = lambda: asyncio.ensure_future(slow_operation(), loop=loop)
loop.call_soon_threadsafe(callback)
Or asyncio.run_coroutine_threadsafe if you're running Python 3.5.1 or newer:
asyncio.run_coroutine_threadsafe(slow_operation(), loop)
However, you should probably keep the use of threads to a minimum. Unless you use a library that runs tasks in its own thread, all of the code should run inside the event loop (or inside an executor; see loop.run_in_executor).
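For completeness, here is a minimal run_in_executor sketch (blocking_call is a placeholder for a blocking function like sc.rtm_read):

import asyncio
import time

def blocking_call():
    time.sleep(1)  # stands in for any blocking I/O
    return 'result'

async def main():
    loop = asyncio.get_event_loop()
    # runs blocking_call in the default ThreadPoolExecutor without blocking the loop
    result = await loop.run_in_executor(None, blocking_call)
    print(result)

asyncio.get_event_loop().run_until_complete(main())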