I'm trying to learn asyncio.
I have a list of sensors that should be polled. Each sensor takes about 1 second to poll, so asyncio seems like the right tool for the job. The list of sensors may change dynamically, and the asyncio code should adapt to that.
I already have the following code, but now I don't know how to store the fetched values into a result dict. Since the consumer only stores the value into a dict, this very fast task could also be done in the producer, so there would be no need for a consumer at all; but I assumed that the asyncio paradigm requires a consumer.
Perhaps I'm also overcomplicating this and a much simpler programming approach with less code could be used here?
Please also see the comments in the code for further detailed questions and suggestions.
#!/usr/bin/python3
import asyncio
import random
sensors = {  # This list changes often
    "sensor1": "http://abc.example.org",
    "sensor2": "http://outside.example.org",
    "temperature": "http://xe.example.com",
    "outdoor": "http://anywhere.example.org"
}

results = dict()  # Result from sensor query should go here

async def queryAll(sensors):
    q = asyncio.Queue()
    queries = [asyncio.create_task(querySensor(sensorname, q)) for sensorname in sensors]
    # Since value is only stored to dict, one consumer should be sufficient,
    # or no consumer at all, since producer could also store variable into dict
    process = [asyncio.create_task(storeValues(sensorname, q)) for sensorname in sensors]
    await asyncio.gather(*queries)
    await q.join()
    for c in process:
        c.cancel()

async def querySensor(sensorname: str, q: asyncio.Queue):
    res = str(random.randint(0, 100))
    resString = "Result for " + sensorname + " is " + res
    await q.put(resString)

async def storeValues(sensorname: str, q: asyncio.Queue):
    while True:
        res = await q.get()
        print("Value: ", res)
        q.task_done()

if __name__ == "__main__":
    asyncio.run(queryAll(sensors))
    for result in results:  # Now results should be in results
        print(result, "measured:", results[result])
Solution
Thanks for both answers. The resulting code is now:
#!/usr/bin/python3
import asyncio
import random
sensors = {  # This list changes often
    "sensor1": "http://abc.example.org",
    "sensor2": "http://outside.example.org",
    "temperature": "http://xe.example.com",
    "outdoor": "http://anywhere.example.org"
}

results = dict()  # Result from sensor query should go here

async def queryAll(sensors):
    queries = [asyncio.create_task(querySensor(sensorname)) for sensorname in sensors]
    await asyncio.gather(*queries)

async def querySensor(sensorname: str):
    res = str(random.randint(0, 100))
    resString = "Result for " + sensorname + " is " + res
    results[sensorname] = resString

if __name__ == "__main__":
    asyncio.run(queryAll(sensors))
    for result in results:  # Now results should be in results
        print(result, "measured:", results[result])
There is some more complex code behind fetching the actual value, so this was only an example; I wanted to hide that additional layer of code.
Both answers are very valuable to me. To support a new user's reputation, I accept Sreejith's answer to mark this question as solved.
It's not clear what exactly you're looking to achieve. If you only want to update the dictionary, the code could be much simpler. Let me know if you were expecting anything else.
sensors = {  # This list changes often
    "sensor1": "http://abc.example.org",
    "sensor2": "http://outside.example.org",
    "temperature": "http://xe.example.com",
    "outdoor": "http://anywhere.example.org"
}

results = dict()  # Result from sensor query should go here

async def queryAll(sensors):
    queries = [asyncio.create_task(querySensor(sensorname, results)) for sensorname in sensors]
    await asyncio.gather(*queries)

async def querySensor(sensorname: str, q: dict):
    res = str(random.randint(0, 100))
    resString = "Result for " + sensorname + " is " + res
    q[sensorname] = resString

if __name__ == "__main__":
    asyncio.run(queryAll(sensors))
    print(results)
The main problem with this code is that you never actually store the sensor values into your results dict. If the code in storeValues included the line results.setdefault(sensorname, []).append(res), you'd see your results already. (The dictionary .setdefault method is a utility that creates a value in the dictionary if it does not exist and returns it, or returns the existing value: therefore we create an empty list on the first call for each sensor, and keep appending to it.)
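For illustration, a tiny standalone example of how .setdefault behaves (not part of the original answer):

d = {}
d.setdefault("sensor1", []).append(42)  # first call: creates the empty list, then appends
d.setdefault("sensor1", []).append(43)  # later calls: return the existing list
print(d)  # {'sensor1': [42, 43]}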
But, as you noted, there is no need for a separate producer/consumer pattern in this code (whatever code ends up consuming the "results" dict is actually the consumer).
...
from aiohttp_requests import requests

results = dict()  # Result from sensor query should go here

async def queryAll(sensors):
    queries = [asyncio.create_task(querySensor(sensorname)) for sensorname in sensors]
    await asyncio.gather(*queries)

async def querySensor(sensorname: str):
    res = str(random.randint(0, 100))
    # important: when writing the actual call to read the sensor, use
    # an async expression and await it.
    response = await requests.get(sensors[sensorname], ...)
    text = await response.text()
    resString = f"Result for {sensorname} is {text}"
    results.setdefault(sensorname, []).append(resString)

if __name__ == "__main__":
    asyncio.run(queryAll(sensors))
    for result in results:  # Now results should be in results
        print(result, "measured:", results[result])
This example is using https://pypi.org/project/aiohttp-requests/
Related
I am trying to create an asyncio task, perform a DB query and a for loop over the rows, and get the result back in the task. However, in the code sample below, it seems like my result is not being put into total_result.result() but rather just into total_result.
I'm not sure if there is some misunderstanding on my part regarding my implementation of asyncio below?
class DatabaseHandler:
    def __init__(self):
        self.loop = get_event_loop()
        self.engine = create_engine("postgres stuffs here")
        self.conn = self.engine.connect()

    async def _fetch_sql_data(self, query):
        return self.conn.execute(query)

    async def get_all(self, item):
        total_result = []
        if item == "all":
            data = create_task(self._fetch_sql_data("select col1 from table1;"))
        else:
            data = create_task(self._fetch_sql_data(f"select col1 from table1 where quote = '{item}';"))
        await data
        for i in data.result().fetchall():
            total_result.append(i[0])
        return total_result

    async def update(self):
        total_result = create_task(self.get_all("all"))
        print(await total_result)  # prints out the result immediately and not the task object.
                                   # this means that `total_result.result()` produces an error

loop = get_event_loop()
a = DatabaseHandler()
loop.run_until_complete(a.update())
I have a feeling that it is because total_result is a list object, but I'm not sure how to resolve this.
task.result() returns the result of your task (the return value of the wrapped coro) and not another Task. This means this
task = asyncio.create_task(coro())
await task
result = task.result()
is actually equivalent to
result = await coro()
Using tasks is especially useful if you want to execute multiple coroutines concurrently (see the sketch after the simplified version below). But as you are not doing that here, your code is a bit overcomplicated. You can just do:
async def get_all(self, item):
    total_result = []
    if item == "all":
        result = await self._fetch_sql_data("select col1 from table1;")
    else:
        result = await self._fetch_sql_data(f"select col1 from table1 where quote = '{item}';")
    for i in result.fetchall():
        total_result.append(i[0])
    return total_result  # holds the results of your db query, just as if called from sync code
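For the concurrent case, here is a minimal sketch of what tasks plus asyncio.gather buy you, assuming the same _fetch_sql_data method and an imported asyncio module (the second query is hypothetical):

async def get_both(self):
    # both queries are started together and awaited together,
    # instead of one after the other
    res_a, res_b = await asyncio.gather(
        self._fetch_sql_data("select col1 from table1;"),
        self._fetch_sql_data("select col1 from table2;"),
    )
    return ([i[0] for i in res_a.fetchall()],
            [i[0] for i in res_b.fetchall()])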
Can anyone help me out? I'm trying to get the program to pause if the condition is met, but as of now it's not sleeping at all, and I can't wrap my head around why. I'm completely new to asyncio.
time.sleep() doesn't really work either, so I would prefer to use asyncio. Thanks a lot!
from python_graphql_client import GraphqlClient
import asyncio
import os
import requests

loop = asyncio.get_event_loop()

def print_handle(data):
    print(data["data"]["liveMeasurement"]["timestamp"] + " " + str(data["data"]["liveMeasurement"]["power"]))
    tall = (data["data"]["liveMeasurement"]["power"])
    if tall >= 1000:
        print("OK")
        # schedule async task from sync code
        asyncio.create_task(send_push_notification(data))
        print("msg sent")
        asyncio.create_task(sleep())

client = GraphqlClient(endpoint="wss://api.tibber.com/v1-beta/gql/subscriptions")

query = """
subscription{
  liveMeasurement(homeId:"fd73a8a6ca"){
    timestamp
    power
  }
}
"""

query2 = """
mutation{
  sendPushNotification(input: {
    title: "Advarsel! Høyt forbruk",
    message: "Du bruker 8kw eller mer",
    screenToOpen: CONSUMPTION
  }){
    successful
    pushedToNumberOfDevices
  }
}
"""

async def sleep():
    await asyncio.sleep(10)

async def send_push_notification(data):
    # maybe update your query with the received data here
    await client.execute_async(query=query2, headers={'Authorization': "2bTCaFx74"})

async def main():
    await client.subscribe(query=query, headers={'Authorization': "2bTCaFxDiYdHlxBSt074"}, handle=print_handle)

asyncio.run(main())
If I understand correctly, you want to observe broadcasts of some data and react to those broadcasts, while retaining the ability to pause those reactions. Something like:
async def monitor(read_broadcast):
    while True:
        data = await read_broadcast()
        print(data["data"]["liveMeasurement"]["timestamp"] + " " + str(data["data"]["liveMeasurement"]["power"]))
        tall = data["data"]["liveMeasurement"]["power"]
        if tall >= 1000:
            print("OK")
            await send_push_notification(data)
            print("msg sent")
            # sleep for a while before sending another one
            await asyncio.sleep(10)
To implement read_broadcast, we can use a "future":
# client, query, query2, send_push_notification defined as before

async def main():
    broadcast_fut = None

    def update_broadcast_fut(_fut=None):
        nonlocal broadcast_fut
        broadcast_fut = asyncio.get_event_loop().create_future()
        broadcast_fut.add_done_callback(update_broadcast_fut)

    update_broadcast_fut()

    def read_broadcast():
        return broadcast_fut

    asyncio.create_task(monitor(read_broadcast))

    await client.subscribe(
        query=query, headers={'Authorization': "2bTCaFxDiYdHlxBSt074"},
        handle=lambda data: broadcast_fut.set_result(data),
    )

asyncio.run(main())
Note that I haven't tested the above code, so there could be typos.
I think the easiest way to reduce the number of messages being sent is to define a minimum interval during which no further notification is sent while the value is still over the threshold.
import time

last_notification_timestamp = 0
NOTIFICATION_INTERVAL = 5 * 60  # 5 min

def print_handle(data):
    global last_notification_timestamp
    print(
        data["data"]["liveMeasurement"]["timestamp"]
        + " "
        + str(data["data"]["liveMeasurement"]["power"])
    )
    tall = data["data"]["liveMeasurement"]["power"]
    current_time = time.time()
    if (
        tall >= 1000
        and current_time - NOTIFICATION_INTERVAL > last_notification_timestamp
    ):
        print("OK")
        # schedule async task from sync code
        asyncio.create_task(send_push_notification(data))
        last_notification_timestamp = current_time
        print("msg sent")
The timestamp of the last message sent needs to be stored somewhere, so we define a variable in the global scope to hold it and use the global keyword within print_handle() to be able to write to it from within the function. In the function we then check whether the value is above the threshold and whether enough time has passed since the last message. This way you still keep your subscription alive and also limit the number of notifications you receive. This is simple enough, but you will probably soon want to extend what you do with the received data. Just keep in mind that print_handle() is a sync callback and should be as short as possible.
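If the per-reading work grows, one possible pattern (a sketch, not from the original answer; the names are illustrative) is to let the sync callback only enqueue the data and move the heavy lifting into an async consumer:

measurements = asyncio.Queue()

def print_handle(data):
    # keep the sync callback short: just hand the reading over
    measurements.put_nowait(data)

async def consume():
    while True:
        data = await measurements.get()
        # inspect the reading, apply rate limiting, send notifications, etc.
        measurements.task_done()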
I have a function that adds items to a list and returns the list. The items come from an async function. Right now it creates each item and adds it to the list one by one.
I want to create the items in parallel, add them to the list, and then return the list from the function. How can I solve this?
Thank you in advance!
async def __create_sockets(self):
    rd_data = []
    for s in self.symbols.index:
        try:
            print(f'Collecting data of {s}')
            socket = DepthCacheManager(self.client, s, refresh_interval=None)
            rd_data.append(await socket.__aenter__())
        except:
            continue
    return rd_data
An easy solution to your problem is to gather the results asynchronously and compile the list of results at the same time.
This is provided by the asyncio.gather() call as explained in the asyncio documentation. Have a look at the excellent example given there.
In your case it might roughly look like this (obviously I cannot test it):
async def create_socket(self, s):
    print(f'Collecting data of {s}')
    socket = DepthCacheManager(self.client, s, refresh_interval=None)
    return await socket.__aenter__()

async def __create_sockets(self):
    rd_data = await asyncio.gather(
        *[self.create_socket(s) for s in self.symbols.index]
    )
    return rd_data
There is a problem here with missing exception handling. You may return None in case of an exception and then clean up the list later like this:
async def create_socket(self, s):
    try:
        print(f'Collecting data of {s}')
        socket = DepthCacheManager(self.client, s, refresh_interval=None)
        return await socket.__aenter__()  # await is important here
    except:
        return None

async def __create_sockets(self):
    rd_data = await asyncio.gather(
        *[self.create_socket(s) for s in self.symbols.index]
    )
    return [i for i in rd_data if i is not None]
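An alternative worth noting (a sketch under the same assumptions, not part of the original answer) is to let gather collect the exceptions itself instead of catching them inside each task:

async def __create_sockets(self):
    results = await asyncio.gather(
        *[self.create_socket(s) for s in self.symbols.index],
        return_exceptions=True,  # failed calls come back as exception objects instead of raising
    )
    # keep only the successful entries
    return [r for r in results if not isinstance(r, Exception)]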
I have been attempting to generate a ping scan that uses a limited number of processes. I tried as_completed without success and switched to asyncio.wait with asyncio.FIRST_COMPLETED.
The following complete script works if the offending line is commented out. I'd like to collect the tasks into a set in order to get rid of pending = list(pending); however, pending_set.union(task) throws "await wasn't used with future".
"""Test simultaneous pings, limiting processes."""
import asyncio
from time import asctime
pinglist = [
'127.0.0.1', '192.168.1.10', '192.168.1.20', '192.168.1.254',
'192.168.177.20', '192.168.177.100', '172.17.1.1'
]
async def ping(ip):
"""Run external ping."""
p = await asyncio.create_subprocess_exec(
'ping', '-n', '-c', '1', ip,
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL
)
return await p.wait()
async def run():
"""Run the test, uses some processes and will take a while."""
iplist = pinglist[:]
pending = []
pending_set = set()
tasks = {}
while len(pending) or len(iplist):
while len(pending) < 3 and len(iplist):
ip = iplist.pop()
print(f"{asctime()} adding {ip}")
task = asyncio.create_task(ping(ip))
tasks[task] = ip
pending.append(task)
pending_set.union(task) # comment this line and no error
done, pending = await asyncio.wait(
pending, return_when=asyncio.FIRST_COMPLETED
)
pending = list(pending)
for taskdone in done:
print(' '.join([
asctime(),
('BAD' if taskdone.result() else 'good'),
tasks[taskdone]
]))
if __name__ == '__main__':
asyncio.run(run())
There are two problems with pending_set.union(task):
First, union doesn't update the set in place; it returns a new set consisting of the original one and the one it receives as an argument.
Second, it accepts an iterable collection (such as another set), not a single element. Thus union attempts to iterate over task, which doesn't make sense. To make things more confusing, task objects are technically iterable in order to be usable in yield from expressions, but they detect iteration attempts in non-async contexts and report the error you've observed.
To fix both issues, you should use the add method instead, which operates by side effect and accepts a single element to add to the set:
pending_set.add(task)
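For illustration, a tiny standalone comparison (not part of the original answer):

s = {1, 2}
s.union({3})  # returns a new set {1, 2, 3}; s itself is unchanged
s.add(3)      # mutates s in place; s is now {1, 2, 3}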
Note that a more idiomatic way to limit concurrency in asyncio is using a Semaphore. For example (untested):
async def run():
    limit = asyncio.Semaphore(3)

    async def wait_and_ping(ip):
        async with limit:
            print(f"{asctime()} adding {ip}")
            result = await ping(ip)
            print(asctime(), ip, ('BAD' if result else 'good'))

    await asyncio.gather(*[wait_and_ping(ip) for ip in pinglist])
Use await asyncio.gather(*pending_set)
asyncio.gather() accepts any number of awaitables and also returns one
* unpacks the set
>>> "{} {} {}".format(*set((1,2,3)))
'1 2 3'
Example from the docs
await asyncio.gather(
    factorial("A", 2),
    factorial("B", 3),
    factorial("C", 4),
)
I solved this without queuing the ping targets in my original application, which simplified things. This answer handles a gradually received list of targets and incorporates the useful pointers from @user4815162342. This completes the answer to the original question.
import asyncio
import time

pinglist = ['127.0.0.1', '192.168.1.10', '192.168.1.20', '192.168.1.254',
            '192.168.177.20', '192.168.177.100', '172.17.1.1']

async def worker(queue):
    limit = asyncio.Semaphore(4)  # restrict the rate of work

    async def ping(ip):
        """Run external ping."""
        async with limit:
            print(f"{time.time():.2f} starting {ip}")
            p = await asyncio.create_subprocess_exec(
                'ping', '-n', '1', ip,
                stdout=asyncio.subprocess.DEVNULL,
                stderr=asyncio.subprocess.DEVNULL
            )
            return (ip, await p.wait())

    async def get_assign():
        return await queue.get()

    assign = {asyncio.create_task(get_assign())}
    pending = set()
Maintaining two distinct pending sets proved key. One set holds a single task that receives assigned addresses; it completes and needs to be restarted each time. The other set is for the ping tasks, which run once and are then complete.
    while len(assign) + len(pending) > 0:  # stop condition
        done, pending = await asyncio.wait(
            set().union(assign, pending),
            return_when=asyncio.FIRST_COMPLETED
        )
        for job in done:
            if job in assign:
                if job.result() is None:
                    assign = set()  # for stop condition
                else:
                    pending.add(asyncio.create_task(ping(job.result())))
                    assign = {asyncio.create_task(get_assign())}
            else:
                print(
                    f"{time.time():.2f} result {job.result()[0]}"
                    f" {['good', 'BAD'][job.result()[1]]}"
                )
The remainder is pretty straightforward.
async def assign(queue):
    """Assign tasks as if these are arriving gradually."""
    print(f"{time.time():.2f} start assigning")
    for task in pinglist:
        await queue.put(task)
        await asyncio.sleep(0.1)
    await queue.put(None)  # to stop nicely

async def main():
    queue = asyncio.Queue()
    await asyncio.gather(worker(queue), assign(queue))

if __name__ == '__main__':
    asyncio.run(main())
The output of this is (on my network with 172 failing to respond):
1631611141.70 start assigning
1631611141.70 starting 127.0.0.1
1631611141.71 result 127.0.0.1 good
1631611141.80 starting 192.168.1.10
1631611141.81 result 192.168.1.10 good
1631611141.91 starting 192.168.1.20
1631611142.02 starting 192.168.1.254
1631611142.03 result 192.168.1.254 good
1631611142.13 starting 192.168.177.20
1631611142.23 starting 192.168.177.100
1631611142.24 result 192.168.177.100 good
1631611142.34 starting 172.17.1.1
1631611144.47 result 192.168.1.20 good
1631611145.11 result 192.168.177.20 good
1631611145.97 result 172.17.1.1 BAD
So I made this program that I want to loop forever until closed. At the moment I use this piece of code:
while True:
    a = start()
    for aaa in a:
        check(a[aaa], 0)
But that is pretty slow. How can I multithread this? Here is my attempt (it's incorrect, of course):
pool = ThreadPool(threads)
results = pool.map(check, a, 0)
I tried that code with threads = 1, and it just gave nothing. Could anyone help me with this?
==== EDIT ====
Start function:
def start():
    global a
    url = "URL_WAS_HERE"  # receives a JSON like {"a":56564356, "b":654653453} etc.
    r = requests.get(url)
    a = json.loads(r.text)
    return a
Check function:
def check(idd, tries):
    global checked
    global snipe
    global notworking
    if tries < 1:
        checked = checked + 1
    url = "URL_WAS_HERE" + str(idd)  # receives JSON with extra information about the id
    r = requests.get(url)
    try:
        b = json.loads(r.text)
        if b['rap'] > b['best_price']:
            difference = b['rap'] - b['best_price']
            print(str(idd) + " has a " + str(difference) + "R$ difference. Price: " + str(b['best_price']) + " //\\ Rap: " + str(b['rap']))
            snipe = snipe + 1
    except:
        time.sleep(1)
        tries = tries + 1
        notworking = notworking + 1
        check(idd, tries)
    settitle("Snipes; " + str(snipe) + " //\\ Checked; " + str(checked) + " //\\ Errors; " + str(notworking))
I hope this helps a bit
Perhaps start by using a documented class, concurrent.futures.ThreadPoolExecutor; ThreadPool is essentially undocumented.
The docs offer minimal examples to get you started. For your example try the following construction:
from concurrent.futures import ThreadPoolExecutor, as_completed

values_to_test = a()
result_container = []

with ThreadPoolExecutor(max_workers=2) as executor:  # set `max_workers` as appropriate
    pool = {executor.submit(check, val, tries=0): val for val in values_to_test}
    for future in as_completed(pool):
        try:
            result_container.append(future.result())
        except:
            pass  # handle exceptions here
If you are set on using the map method, you cannot pass 0 as an argument because it is not an iterable; see the method signature.
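If you do want map, here is a minimal sketch assuming the same check function and a list of ids built from the start() result (the ids_to_test name is illustrative); functools.partial binds tries=0 so map only has to iterate over the ids:

from concurrent.futures import ThreadPoolExecutor
from functools import partial

ids_to_test = list(a.values())  # hypothetical: the values from the dict returned by start()

with ThreadPoolExecutor(max_workers=2) as executor:
    # check() returns None, so this mainly drives the calls concurrently
    list(executor.map(partial(check, tries=0), ids_to_test))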