Async FIFO throttler - python

I'm having an issue using a throttler for the Telegram API.
The problem is that if the number of requests goes over my throttler's limit, then once the minute passes the messages are sent in random order.
Here's the code for the throttler I'm using (found it on GitHub):
import asyncio
import time
from collections import deque
class Throttler:
    def __init__(self, rate_limit, period=1.0, retry_interval=0.01):
        self.rate_limit = rate_limit
        self.period = period
        self.retry_interval = retry_interval
        self._task_logs = deque()
    def flush(self):
        # Drop timestamps that are older than one period.
        now = time.time()
        while self._task_logs:
            if now - self._task_logs[0] > self.period:
                self._task_logs.popleft()
            else:
                break
    async def acquire(self):
        # Poll until fewer than rate_limit calls remain inside the current period.
        while True:
            self.flush()
            if len(self._task_logs) < self.rate_limit:
                break
            await asyncio.sleep(self.retry_interval)
        self._task_logs.append(time.time())
    async def __aenter__(self):
        await self.acquire()
    async def __aexit__(self, exc_type, exc, tb):
        pass
I can use it as follows:
throttler = Throttler(rate_limit=30, period=10)
async with throttler:
    await sendmessage(message)

I found that the best way around this was to use a different algorithm for the throttler.
The throttler I was using above will always deliver messages in random order once the limit is hit: after the initial burst, the remaining messages get stuck waiting, and when the period has passed, asyncio releases all of them at once.
The way around this is to use what's called a leaky bucket algorithm. I used the following answer to implement a leaky bucket myself: https://stackoverflow.com/a/45502319/7055234
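For illustration, here is a minimal sketch of an order-preserving throttler. It is not the implementation from the linked answer: it is a simplified variant that spaces calls evenly (one every period / rate_limit seconds, so no initial burst) and relies on asyncio.Lock, which wakes waiters in the order they started waiting. The names FifoThrottler, demo and send are made up for this example.

```python
import asyncio
import time


class FifoThrottler:
    """Lets callers through one at a time, in arrival order, at a steady rate."""

    def __init__(self, rate_limit, period=1.0):
        self._interval = period / rate_limit  # seconds between two releases
        self._lock = asyncio.Lock()           # waiters are woken first come, first served
        self._next_slot = 0.0                 # monotonic time of the next free slot

    async def __aenter__(self):
        async with self._lock:
            delay = self._next_slot - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)    # only the head of the queue sleeps here
            self._next_slot = time.monotonic() + self._interval

    async def __aexit__(self, exc_type, exc, tb):
        pass


async def demo():
    # Created inside the running loop so the lock belongs to that loop.
    throttler = FifoThrottler(rate_limit=50, period=0.1)
    sent = []

    async def send(i):  # hypothetical stand-in for sendmessage()
        async with throttler:
            sent.append(i)

    await asyncio.gather(*(send(i) for i in range(10)))
    return sent


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because every caller queues on the same lock and only the caller holding it sleeps, messages leave in the order they arrived instead of all waking up together.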

Related

Can I wait for a callback to have finished using futures?

I am trying to use the pika AMQP client with RabbitMQ in Python, but am struggling to wait for a connection to open properly. I want to be able to simply await this async function, but it hangs. The furthest it gets is printing "Future is done" from one of the helper functions. Ultimately I am trying to avoid many nested callbacks and would like to use async/await where possible. I thought this was on the right path, but I can't get it to work, so any advice would be appreciated. There is probably a better way, but I am not very familiar with async patterns in Python and am basing most of my architecture on my familiarity with Node.
My class
import asyncio
from asyncio import Future
import pika
from src.services.pika_helpers import init_select_connection_async
asyncio.get_event_loop()
EXCHANGE_NAME = 'ACTIVE_LEARNING'
class AsyncRabbitMQ():
    def __init__(self):
        self.connection = None
        self.channel = None
        self.queue_name = None
        asyncio.run(self.initialize())
    async def update_future_cb(self):
        self.future_res.set_result("Yep!")
        self.future_res.done()
    async def initialize(self):
        selectConn = await init_select_connection_async()
        print(selectConn)
        print("Sleeping!")
        await asyncio.sleep(1)
        print("Done sleeping!")
        print("Calling callback!")
        return
My helper functions
def on_open_callback(connection, future):
    print("Calling callback")
    future.set_result(connection)
    future.done()
    print("Future is done")
    return True
async def init_select_connection_async():
    future_result = Future()
    print("Going to open connection")
    connection = pika.SelectConnection(on_open_callback=lambda v: on_open_callback(v, future_result), on_open_error_callback=print)
    connection.ioloop.start()
    print("Connection opened")
    connection = await future_result
    return connection
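As a rough sketch of the general pattern (not pika-specific): a callback can complete a future that a coroutine awaits, provided the callback fires on the same event loop and nothing blocks before the await. In the helper above, connection.ioloop.start() looks like the likely culprit, since as far as I know it runs pika's own I/O loop and does not return until that loop is stopped, so the await would never be reached. The open_connection function below is a made-up stand-in for a callback-style API, not a real pika call.

```python
import asyncio


def open_connection(on_open, loop):
    """Hypothetical callback-style API: calls on_open(connection) a little later."""
    loop.call_later(0.01, on_open, "fake-connection")


async def connect():
    loop = asyncio.get_running_loop()
    opened = loop.create_future()
    # set_result() alone completes the future; done() only reports its state.
    open_connection(opened.set_result, loop)
    # Nothing blocking happens here, so the loop stays free to run the callback.
    return await opened


if __name__ == "__main__":
    print(asyncio.run(connect()))
```

If the callback is invoked from another thread, it would need to go through loop.call_soon_threadsafe instead of calling set_result directly.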

Python Async Functions won't Give up the CPU

I have two async functions that both need to run constantly and one of them just hogs all of the CPU.
The first function handles receiving websocket messages from a client.
async def handle_message(self, ws):
    """Handles a message from the websocket."""
    logger.info('awaiting message')
    while True:
        msg = await ws.receive()
        logger.debug('received message: %s', msg)
        jmsg = json.loads(msg['text'])
        logger.info('received message: {}'.format(jmsg))
        param = jmsg['parameter']
        val = jmsg['data']['value']
        logger.info('setting parameter {} to {}'.format(param, val))
        self.camera.change_parameter(param, val)
The second function grabs images from a camera and sends them to the frontend client. This is the one that won't give the other one any time.
async def send_image(self, ws):
    """Sends an image to the websocket."""
    for im in self.camera:
        await asyncio.sleep(1000)
        h, w = im.shape[:2]
        resized = cv2.resize(im, (w // 4, h // 4))
        await ws.send_bytes(image_to_bytes(resized))
I'm executing these coroutines using asyncio.gather(). The decorator is from FastAPI and Backend() is my class that contains the two async coroutines.
@app.websocket('/ws')
async def websocket_endpoint(websocket: WebSocket):
    """Handle a WebSocket connection."""
    backend = Backend()
    logger.info('Started backend.')
    await websocket.accept()
    try:
        aws = [backend.send_image(websocket), backend.handle_message(websocket)]
        done, pending = await asyncio.gather(*aws)
    except WebSocketDisconnect:
        await websocket.close()
Both of these coroutines work separately, but if I try to run them together, send_image() never gives any time to handle_message(), so none of the messages are ever received (or at least that's what I think is going on).
I thought this is what asyncio was trying to solve, but I'm probably using it wrong. I thought about using multiprocessing, but I'm pretty sure FastAPI expects awaitables here. I also read about using the return variables from gather(), but I didn't really understand. Something about canceling the pending tasks and adding them back to the event loop.
Can anyone show me the correct (and preferably modern pythonic) way to make these async coroutines run concurrently?
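One thing that stands out (an assumption, since the camera class isn't shown) is that `for im in self.camera` is a plain synchronous loop: if grabbing a frame blocks, nothing else on the event loop can run in the meantime. A possible direction, sketched below with made-up stand-ins (blocking_frames, send_frames, handle_messages), is to fetch each frame in a worker thread with asyncio.to_thread, which needs Python 3.9+.

```python
import asyncio
import time

_DONE = object()  # sentinel, because StopIteration cannot travel through a future


def blocking_frames(count):
    """Hypothetical stand-in for a camera iterator that blocks while grabbing."""
    for i in range(count):
        time.sleep(0.05)
        yield i


async def send_frames(frames, sent):
    while True:
        # next() runs in a worker thread, so the event loop stays responsive.
        frame = await asyncio.to_thread(next, frames, _DONE)
        if frame is _DONE:
            break
        sent.append(frame)  # stands in for ws.send_bytes(...)


async def handle_messages(received, stop):
    while not stop.is_set():
        await asyncio.sleep(0.01)  # stands in for awaiting ws.receive()
        received.append("tick")


async def main():
    sent, received = [], []
    stop = asyncio.Event()
    receiver = asyncio.create_task(handle_messages(received, stop))
    await send_frames(blocking_frames(3), sent)
    stop.set()
    await receiver
    return sent, received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Also note that asyncio.gather() returns a list of results; the (done, pending) pair comes from asyncio.wait().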

Discord.py avoid blocking on_message method

I'm working on a Discord bot in which I mainly process images. So far it's working but when multiple images are sent at once, I experience a lot of blocking and inconsistency.
It goes like this:
User uploads image > bot places 'eyes' emoji on the message > bot processes the image > bot responds with result.
However, sometimes it can handle multiple images at once (the bot places the eyes emoji on the first few images) but usually it just puts emoji on the first image and then after finishing that one it will process the next 2-3 images etc.
The process which takes most of the time is the OCR reading the image.
Here is some abstract code:
main.py
@client.event
async def on_message(message):
    ...
    if len(message.attachments) > 0: await message_service.handle_image(message)
    ...
message_service.py
async def handle_image(self, message):
    supported_attachments = filter_out_unsupported(message.attachments)
    images = []
    await message.reply(f"{random_greeting()} {message.author.mention}, I'm processing your upload(s) please wait a moment, this could take up to 30 seconds.")
    await message.add_reaction('👀')
    for a in supported_attachments:
        async with aiohttp.ClientSession() as session:
            async with session.get(a) as res:
                if res.status == 200:
                    buffer = io.BytesIO(await res.read())
                    arr = np.asarray(bytearray(buffer.read()), dtype=np.uint8)
                    images.append(cv2.imdecode(arr, -1))
    for image in images:
        result = await self.image_handler.handle_image(image, message.author)
        await message.remove_reaction('👀', message.author)
        if result == None:
            await message.reply(f"{message.author.mention} I can't process your image. It's incorrect, unclear or I'm just not smart enough... :(")
            await message.add_reaction('❌')
        else:
            await message.reply(result)
image_handler
async def handle_image(self, image, author):
    try:
        if image is None: return None
        governor_id = str(self.__get_governor_id_by_discord_id(author.id))
        if governor_id == None:
            return f"{author.mention} there was no account registered under your discord id, please register by using this format: `$register <governor_id> <in game name>`, for example: `$register ... ...`. After that repost the screenshot.\n As for now multiple accounts are not supported."
        # This part is most likely the bottleneck !!
        read_result = self.reader.read_image_task(image)
        if self.__no_values_are_found(...):
            return None
        return self.sheets_client.update_player_row_in_sheets(...)
    except:
        return None
def __no_values_are_found(self, *args):
    return all(v is None for v in [*args])
def __get_governor_id_by_discord_id(self, id):
    return self.sheets_client.get_governor_id_by_discord_id(id)
I'm new to Python and Discord bots in general, but is there a clean way to handle this?
I was thinking about threading but can't seem to find many solutions within this context, which makes me believe I am missing something or doing something inefficiently.
There is actually a clean way: you can create your own to_thread decorator and decorate your blocking functions with it. The decorated functions cannot be coroutines; they must be normal, synchronous functions.
import asyncio
from functools import partial, wraps
def to_thread(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_event_loop()
        callback = partial(func, *args, **kwargs)
        return await loop.run_in_executor(None, callback) # if using python 3.9+ use `await asyncio.to_thread(callback)`
    return wrapper
# usage
@to_thread
def handle_image(self, image, author): # notice how it's *not* an async function
    ...
# calling
await handle_image(...) # not a coroutine, yet I'm awaiting it (cause of the wrapper function)
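To see the effect, here is a small self-contained sketch built on the same decorator. slow_ocr is a made-up stand-in for the blocking OCR step and the timings are only illustrative. Three blocking calls are started together, and because each runs in the default thread pool they overlap instead of running back to back.

```python
import asyncio
import time
from functools import partial, wraps


def to_thread(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        # Run the blocking function in the loop's default thread pool.
        return await loop.run_in_executor(None, partial(func, *args, **kwargs))
    return wrapper


@to_thread
def slow_ocr(name):  # hypothetical blocking work, e.g. OCR on one image
    time.sleep(0.2)
    return f"text from {name}"


async def main():
    start = time.perf_counter()
    # The three calls overlap because each one runs in its own worker thread.
    results = await asyncio.gather(*(slow_ocr(f"img{i}") for i in range(3)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Run one after another, the three calls would take about 0.6 seconds; with the decorator they should finish in roughly the time of a single call.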

Trying to understand how to use multithreaded websockets in Python but seem to be stuck with one thread

I have this basic exchange monitor script. I'm trying to create one thread per symbol, apart from the main thread which is handling other work, and have them listen to public Gemini websocket endpoints. I'm getting the first thread running and printing exchange data to the console, but not the second one. I had expected to see data from both threads being printed at approximately the same time. I've tried using the threading library instead of asyncio and encountered the same situation.
I realize my two public API MarketWebsocket classes could be combined to be cleaner, I'm still trying to work out a way to easily add other symbols to the list. Thanks for any nudges in the right direction!
import asyncio
from websockets import connect
symbols_to_watch = [
    "BTCUSD",
    "ETHUSD"
]
class BTCMarketWebsocket:
    disable = False
    async def __aenter__(self):
        symbol = symbols_to_watch[0]
        self._conn = connect("wss://api.gemini.com/v1/marketdata/{}".format(symbol))
        self.websocket = await self._conn.__aenter__()
        return self
    async def __aexit__(self, *args, **kwargs):
        await self._conn.__aexit__(*args, **kwargs)
    async def receive(self):
        return await self.websocket.recv()
class ETHMarketWebsocket:
    disable = False
    async def __aenter__(self):
        symbol = symbols_to_watch[1]
        self._conn = connect("wss://api.gemini.com/v1/marketdata/{}".format(symbol))
        self.websocket = await self._conn.__aenter__()
        return self
    async def __aexit__(self, *args, **kwargs):
        await self._conn.__aexit__(*args, **kwargs)
    async def receive(self):
        return await self.websocket.recv()
async def btcMarketWebsocket():
    async with BTCMarketWebsocket() as btcMarketWebsocket:
        while not btcMarketWebsocket.disable:
            print(await btcMarketWebsocket.receive())
async def ethMarketWebsocket():
    async with ETHMarketWebsocket() as ethMarketWebsocket:
        while not ethMarketWebsocket.disable:
            print(await ethMarketWebsocket.receive())
if __name__ == '__main__':
    asyncio.run(btcMarketWebsocket())
    asyncio.run(ethMarketWebsocket())
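asyncio.run() blocks until the coroutine it is given finishes, and btcMarketWebsocket() loops forever, so the second asyncio.run() call is never reached. You can run both coroutines in the same event loop instead:
async def multiple_tasks():
    tasks = []
    tasks.append(btcMarketWebsocket())
    tasks.append(ethMarketWebsocket())
    await asyncio.gather(*tasks, return_exceptions=True)
if __name__ == '__main__':
    asyncio.run(multiple_tasks())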
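The same idea also removes the need for one class per symbol. Below is a network-free sketch: watch is a made-up stand-in for a receive loop, with asyncio.sleep(0) playing the role of awaiting websocket.recv(), so it only shows how gather interleaves one coroutine per symbol.

```python
import asyncio


async def watch(symbol, ticks, log):
    """Hypothetical receive loop for one symbol."""
    for n in range(ticks):
        await asyncio.sleep(0)   # yield to the event loop, as awaiting recv() would
        log.append((symbol, n))  # stands in for print(message)


async def main(symbols):
    log = []
    # One coroutine per symbol, all running in the same event loop.
    await asyncio.gather(*(watch(symbol, 2, log) for symbol in symbols))
    return log


if __name__ == "__main__":
    for entry in asyncio.run(main(["BTCUSD", "ETHUSD"])):
        print(entry)
```

Adding a symbol then only means adding a string to the list passed to main.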

aiohttp ClientSession.get() method failing silently - Python3.7

I'm making a small application that attempts to find company website URLs by searching for their names via Bing. It takes in a big list of company names, uses the Bing Search API to obtain the 1st URL, & saves those URLs back in the list.
I'm having a problem with aiohttp's ClientSession.get() method, specifically, it fails silently & I can't figure out why.
Here's how I'm initializing the script. Keep an eye out for worker.perform_mission():
async def _execute(workers,*, loop=None):
    if not loop:
        loop = asyncio.get_event_loop()
    [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
def main():
    filepth = 'c:\\SOME\\FILE\\PATH.xlsx'
    cache = pd.read_excel(filepth)
    # CHANGE THE NUMBER IN range(<here>) TO ADD MORE WORKERS.
    workers = (Worker(cache) for i in range(1))
    loop = asyncio.get_event_loop()
    loop.run_until_complete(_execute(workers, loop=loop))
    ...<MORE STUFF>...
The worker.perform_mission() method does the following (scroll to the bottom and look at _split_up_request_like_they_do_in_the_docs()):
class Worker(object):
    def __init__(self, shared_cache):
        ...<MORE STUFF>...
    async def perform_mission(self, verbose=False):
        while not self.mission_complete:
            if not self.company_name:
                await self.find_company_name()
                if verbose:
                    print('Obtained Company Name')
            if self.company_name and not self.website:
                print('Company Name populated but no website found yet.')
                data = await self.call_bing() #<<<<< THIS IS SILENTLY FAILING.
                if self.website and ok_to_set_website(self.shared_cache, self):
                    await self.try_set_results(data)
                    self.mission_complete = True
                else:
                    print('{} worker failed at setting website.'.format(self.company_name))
                    pass
            else:
                print('{} worker failed at obtaining data from Bing.'.format(self.company_name))
                pass
    async def call_bing(self):
        async with aiohttp.ClientSession() as sesh:
            sesh.headers = self.headers
            sesh.params = self.params
            return await self._split_up_request_like_they_do_in_the_docs(sesh)
    async def _split_up_request_like_they_do_in_the_docs(self, session):
        print('_bing_request() successfully called.') #<<<THIS CATCHES
        async with session.get(self.search_url) as resp:
            print('Session.get() successfully called.') #<<<THIS DOES NOT.
            return await resp.json()
And finally my output is:
Obtained Company Name
Company Name populated but no website found yet.
_bing_request() successfully called.
Process finished with exit code 0
Can anyone help me figure out why print('Session.get() successfully called.'), isn't triggering?...or maybe help me ask this question better?
Take a look at this part:
async def _execute(workers,*, loop=None):
    # ...
    [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
You create a bunch of tasks, but you never wait for them to finish. That means _execute itself returns right after the tasks are created, long before they are done. And since you run the event loop only until _execute is done, the loop stops shortly after it starts.
To fix this, use asyncio.gather to wait until all of the awaitables have finished:
async def _execute(workers,*, loop=None):
    # ...
    tasks = [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
    await asyncio.gather(*tasks)
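A small self-contained sketch of the difference, with made-up names (this perform_mission only sleeps to imitate the HTTP request). It uses asyncio.run, which cancels any tasks still pending when the main coroutine returns; with run_until_complete the loop simply stops, but either way the unawaited tasks never finish.

```python
import asyncio


async def perform_mission(worker_id, finished):
    await asyncio.sleep(0.01)  # stands in for the HTTP request
    finished.append(worker_id)


async def execute_without_waiting(finished):
    for i in range(3):
        asyncio.ensure_future(perform_mission(i, finished))
    # Returns immediately: the tasks have been scheduled but have not finished.


async def execute_and_wait(finished):
    tasks = [asyncio.ensure_future(perform_mission(i, finished)) for i in range(3)]
    await asyncio.gather(*tasks)  # suspends here until every task is done


def run(execute):
    finished = []
    asyncio.run(execute(finished))
    return finished


if __name__ == "__main__":
    print(run(execute_without_waiting))
    print(run(execute_and_wait))
```

The first call reports no finished workers, while the second reports all three.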
