I'm using SimPy in Python to create a Discrete Event Simulation that requires resources to be available based on a schedule input by the user, in my case from a csv file. The aim is to represent different numbers of the same resource (e.g. staff) being available at different times of day. As far as I can tell this isn't something that is available in base SimPy, unlike resource priorities.
I have managed to get this working and have included the code below to show how. However, I wanted to ask the community: is there a better way to achieve this functionality in SimPy?
The code below works by requesting the resources at the start of each day for the times they are not supposed to be available, with a much higher priority to ensure they get the resource. The resources are then released at the appropriate times for use by other events/processes. As I say, it works, but it seems wasteful, with a lot of dummy processes running just to ensure the correct availability of resources. Any comments which would lead to improvements would be welcome.
So the csv looks like:
Number time
0 23
50 22
100 17
50 10
20 8
5 6
where Number represents the number of staff that become available at the defined time. For example: there will be 5 staff from 6-8, 20 from 8-10, 50 from 10-17, and so on until the end of the day.
The code:
import csv
import simpy

# empty list ready to hold the input data from the csv
input_list = []

# a dummy process that "uses" staff until the end of the current day
def take_res():
    req = staff.request(priority=-100)  # request a staff resource at a very high priority
    yield req
    yield test_env.timeout(24 - test_env.now)

# a dummy process that "uses" staff for the time those staff should not
# be available to the real processes
def request_res(delay, avail_time):
    req = staff.request(priority=-100)  # request a staff resource at a very high priority
    yield req
    yield test_env.timeout(delay)
    yield staff.release(req)
    # pass the time the resource is available for
    yield test_env.timeout(avail_time)
    test_env.process(take_res())

# used to print current levels of resource usage
def print_usage():
    print('At time %0.2f %d res are in use' % (test_env.now, staff.count))
    yield test_env.timeout(0.5)
    test_env.process(print_usage())

# open the csv and read the data into a list
with open('staff_schedule.csv', mode="r") as infile:
    reader = csv.reader(infile)
    next(reader, None)  # ignore the header
    for row in reader:
        input_list.append(row[:2])

# calculate the time the current number of resources will be
# available for and append it to each row
for i, row in enumerate(input_list):
    if i == 0:
        row.append(24 - int(input_list[i][1]))
    else:
        row.append(int(input_list[i - 1][1]) - int(input_list[i][1]))

# convert the list to a tuple of tuples to prevent any accidental
# edits from this point on
staff_tuple = tuple(tuple(row) for row in input_list)
print(staff_tuple)

# define the environment and create the resources
test_env = simpy.Environment()
staff = simpy.PriorityResource(test_env, capacity=sum(int(row[0]) for row in staff_tuple))

# for each row in the tuple, run dummy processes to hold resources
# according to the schedule in the csv
for item in staff_tuple:
    print(item[0])
    for i in range(int(item[0])):
        test_env.process(request_res(int(item[1]), int(item[2])))

# run a process to print usage over time
test_env.process(print_usage())

# run until time 25 to cover one full day
test_env.run(until=25)
I tried something else: I subclassed the resource class, adding only one method, and while I don't fully understand the source code, it seems to work properly. You can tell the resource to change its capacity somewhere in your simulation.
from simpy.resources.resource import Resource, Request, Release
from simpy.core import BoundClass
from simpy.resources.base import BaseResource

class VariableResource(BaseResource):

    def __init__(self, env, capacity):
        super(VariableResource, self).__init__(env, capacity)
        self.users = []
        self.queue = self.put_queue

    @property
    def count(self):
        return len(self.users)

    request = BoundClass(Request)
    release = BoundClass(Release)

    def _do_put(self, event):
        if len(self.users) < self.capacity:
            self.users.append(event)
            event.usage_since = self._env.now
            event.succeed()

    def _do_get(self, event):
        try:
            self.users.remove(event.request)
        except ValueError:
            pass
        event.succeed()

    def _change_capacity(self, capacity):
        self._capacity = capacity
I think this should work, but I'm not 100% confident about how the triggers work.
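One thing to watch: if the capacity is increased while requests are queued, nothing re-evaluates those queued events by itself. A minimal sketch of how _change_capacity could handle that, assuming SimPy's internal _trigger_put hook behaves as it does in the built-in Resource (this is an assumption about the internals, not a documented API):

def _change_capacity(self, capacity):
    grew = capacity > self._capacity
    self._capacity = capacity
    if grew:
        # assumption: re-run the put queue so waiting requests can now
        # succeed against the larger capacity
        self._trigger_put(None)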
I solved it by creating a Resource for each time window. Each arrival is processed in the function service, and each customer is assigned to a resource depending on the arrival time. If a customer has to wait in the queue and must be re-assigned to the next time window, it is removed from the current Resource and re-assigned to the next Resource. This is done by modifying the request as:
with self.Morning.request() as req1:
    yield req1 | self.env.timeout(self.durationMorning)
The code:
import simpy
import itertools

class Queue():
    def __init__(self, env, N_m, N_e):
        self.Arrival = {}
        self.StartService = {}
        self.FinishService = {}
        self.Morning = simpy.Resource(env, N_m)
        self.Evening = simpy.Resource(env, N_e)
        self.env = env
        self.durationMorning = 30

    # arrivals/second
    def t_arrival(self, t):
        if t < self.durationMorning:
            return 1
        else:
            return 2

    def t_service(self):
        return 5

    def service(self, i):
        arrival_time = self.env.now
        if arrival_time == self.durationMorning:
            yield self.env.timeout(0.0001)
        # Add arrival
        self.Arrival[i] = arrival_time
        # Morning shift
        if self.env.now < self.durationMorning:
            with self.Morning.request() as req1:
                yield req1 | self.env.timeout(self.durationMorning)
                if self.env.now < self.durationMorning:
                    self.StartService[i] = self.env.now
                    yield self.env.timeout(self.t_service())
                    print(f'{i} arrived at {self.Arrival[i]} done at {self.env.now} by 1')
                    self.FinishService[i] = self.env.now
        # Evening shift
        if (self.env.now >= self.durationMorning) and (i not in self.FinishService):
            with self.Evening.request() as req2:
                yield req2
                self.StartService[i] = self.env.now
                yield self.env.timeout(self.t_service())
                print(f'{i} arrived at {self.Arrival[i]} done at {self.env.now} by 2')
                self.FinishService[i] = self.env.now

    def arrivals(self):
        for i in itertools.count():
            self.env.process(self.service(i))
            t = self.t_arrival(self.env.now)
            yield self.env.timeout(t)

N_morning = 2  # capacities inferred to be consistent with the sample output below
N_evening = 1
env = simpy.Environment()
system = Queue(env, N_morning, N_evening)
system.env.process(system.arrivals())
system.env.run(until=60)

The output:
0 arrived at 0 done at 5 by 1
1 arrived at 1 done at 6 by 1
2 arrived at 2 done at 10 by 1
3 arrived at 3 done at 11 by 1
4 arrived at 4 done at 15 by 1
5 arrived at 5 done at 16 by 1
6 arrived at 6 done at 20 by 1
7 arrived at 7 done at 21 by 1
8 arrived at 8 done at 25 by 1
9 arrived at 9 done at 26 by 1
10 arrived at 10 done at 30 by 1
11 arrived at 11 done at 31 by 1
12 arrived at 12 done at 35 by 2
13 arrived at 13 done at 40 by 2
14 arrived at 14 done at 45 by 2
15 arrived at 15 done at 50 by 2
16 arrived at 16 done at 55 by 2
SimPy related
Maybe you can use PreemptiveResource (see this example). With this, you would only need one blocker process per resource, since it can just "kick" less important processes.
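A minimal sketch of that idea (the capacity, priority, and times here are made up for illustration):

import simpy

def blocker(env, staff, until):
    # hold one staff slot at a very high priority until `until`,
    # preempting any lower-priority process that currently holds it
    with staff.request(priority=-100, preempt=True) as req:
        yield req
        yield env.timeout(until - env.now)

env = simpy.Environment()
staff = simpy.PreemptiveResource(env, capacity=5)
for _ in range(3):  # keep 3 of the 5 slots blocked until time 8
    env.process(blocker(env, staff, 8))

Note that preempted processes receive a simpy.Interrupt and must handle it.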
Python related
Document your code. What’s the purpose of take_res() and request_res()? (Why do both functions use priority=-100, anyway?)
Use better names. the_list or the_tuple is not very helpful.
Instead of the_list.append([row[0], row[1]]) you can do the_list.append(row[:2]).
Why do you convert the list of lists into a tuple of tuples? As far as I can see there is no benefit, but it adds extra code and thus extra confusion and extra possibilities for programming errors.
You should leave the with open(file) block as soon as possible (after the first four lines, in your case). There's no need to keep the file open longer than necessary; once you are done iterating over all lines, you no longer need it.
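For example, the reading step could shrink to something like this (a sketch using the same csv module; `schedule` is my name for the result):

import csv

with open('staff_schedule.csv', mode='r') as infile:
    reader = csv.reader(infile)
    next(reader, None)  # skip the header
    schedule = [row[:2] for row in reader]
# the file is closed here; everything that follows works on `schedule`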
This is how I solved it for my application. It's not perfect, but it was the best I could do given my basic level of skill with Python and SimPy.
The result is that the correct number of Advisers are available at the desired times.
First I define a store and set the capacity to be equal to the total number of adviser instances that will exist within the simulation.
self.adviser_store = simpy.FilterStore(self.env, capacity=self.total_ad_instances)
The instances of the Adviser class are created in an initialization step which, for brevity, I have not included. I actually use a JSON file to customize the individual adviser instances, which are then placed in a list.
The run parameter in the class definition below is actually another class that contains all the info related to the current run of the simulation; for example, it contains the start and end dates for the simulation. self.start_date therefore defines the date an adviser starts working, while self.run.start_date is the start date for the simulation.
import datetime

class Adviser(object):
    def __init__(self, run, id_num, start_time, end_time, start_date, end_date, ad_type):
        self.env = run.env
        self.run = run
        self.id_num = id_num
        self.start_time = start_time
        self.end_time = end_time
        self.start_date = datetime.datetime.strptime(start_date, '%Y, %m, %d')
        self.end_date = datetime.datetime.strptime(end_date, '%Y, %m, %d')
        self.ad_type = ad_type  # ad_type added as a parameter; it was an unbound name in the original snippet
        self.avail = False
        self.run.env.process(self.set_availability())
So as you can see, creating an Adviser instance also starts the process that sets its availability. In the example below I've simplified it to set the same availability each day for a given date range. You could of course set different availabilities depending on the date/day etc.
from simpy.util import start_delayed  # start_delayed comes from simpy.util

def set_availability(self):
    # the delays below are the times in hours until the resource becomes
    # available/unavailable; they are applied via start_delayed in the loop
    start_delay = self.start_time + (self.start_date - self.run.start_date).total_seconds() / 3600
    end_delay = self.end_time + (self.start_date - self.run.start_date).total_seconds() / 3600
    repeat = (self.end_date - self.start_date).days + 1  # defines how many days to repeat it for
    for i in range(repeat):
        start_delayed(self.run.env, self.add_to_store(), start_delay)
        start_delayed(self.run.env, self.remove_from_store(), end_delay)
        start_delay += 24
        end_delay += 24
    yield self.run.env.timeout(0)

def add_to_store(self):
    self.run.ad_avail.remove(self)    # take the adviser from a list
    self.run.adviser_store.put(self)  # and put it in the store
    yield self.run.env.timeout(0)

def remove_from_store(self):
    # get itself from the store
    current_ad = yield self.run.adviser_store.get(lambda item: item.id_num == self.id_num)
    self.run.ad_avail.append(current_ad)  # and put it back in the list
    yield self.run.env.timeout(0)
So essentially customers can only request advisers from the store, and the advisers will only be in the store at certain times. The rest of the time they are in the list attached to the current run of the simulation.
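For illustration, a customer process in this setup might look something like the following sketch (the service time and names are assumptions, not taken from my actual model):

def customer(env, run):
    # take any available adviser out of the store...
    adviser = yield run.adviser_store.get()
    yield env.timeout(0.25)  # assumed service time in hours
    # ...and return the adviser so other customers (or remove_from_store) can get them
    yield run.adviser_store.put(adviser)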
I think there is still a pitfall here: the adviser object may be in use when it is due to become unavailable. I haven't yet noticed whether this happens, or what the impact would be if it did.
Related
The code is for a simulation of a registration desk with 2 personnel and 10 students, where it takes 5 min to complete the registration and there is a 3 min waiting period before the next student can start registration.
I'm new to simulation and I fail to understand the purpose/use/working of the code tagged as #line1 and #line2.
Also, why doesn't the for loop before #line2 execute first?
import simpy

class Student(object):
    def __init__(self, env, reg):
        self.env = env
        self.action = env.process(self.run())  #line 1
        self.reg = reg

    def run(self):
        with reg.request() as r:
            yield r
            print("Start registration at %d" % env.now)
            yield env.timeout(5)
            print("Start waiting at %d" % env.now)
            yield env.timeout(3)

env = simpy.Environment()
reg = simpy.Resource(env, capacity=2)
student = Student(env, reg)
for i in range(9):
    env.process(student.run())
env.run(until=80)  #line2
#Line 2 is what actually starts the simulation, running it for 80 units of time. The rest of the code sets up the simulation's starting state. This env.run() is not the run() defined in the Student class.
#Line 1 is more interesting. There are a couple of patterns for doing simulations. One pattern uses classes: when a class object is instantiated/created, it also starts its simulation process, so you do not need to call student.run() yourself. (The __init__() gets called when you create a class object.) However, if you use this pattern, then the loop before #line 2 should be changed from
student = Student(env, reg)
for i in range(9):
    env.process(student.run())
to
students = [Student(env,reg) for _ in range(10)]
This will create 10 students, each waiting to register. The advantage of creating 10 students is that each one can have individual state, like a unique id number, and can collect stats about itself. This is very useful in more complex simulations.
This is how this code should have been written.
The other pattern is where you do call Student.run() yourself. To do that here, you would comment out #line 1 and change the loop counter in the loop above #line 2 from 9 to 10. The disadvantage here is that all your students are using the same class variables. A more common way of using this pattern is not to use a class at all: just write a def function and call it 10 times, passing env and reg directly to the function, as shown below.
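A minimal sketch of that function-based pattern (the name parameter is added here just for illustration):

import simpy

def student(env, reg, name):
    with reg.request() as r:
        yield r
        print("%s starts registration at %d" % (name, env.now))
        yield env.timeout(5)
        print("%s starts waiting at %d" % (name, env.now))
        yield env.timeout(3)

env = simpy.Environment()
reg = simpy.Resource(env, capacity=2)
for i in range(10):
    env.process(student(env, reg, "student %d" % i))
env.run(until=80)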
As written, what this code is really doing is creating one student, and that one student registers 10 times: once when it is created, and then 9 more times in the loop. While this may demonstrate how a resource is used, I agree this code can be a bit confusing.
I'd like to simulate spikey traffic, so that for example:
in the first 5 minutes there are only 50 users (instant hatch of 50 at time T0)
then from 5th to 10th minute we have 100 users (instant hatch +50 at T+5)
then 150 (instant hatch +50 at T+10)
etc.
Is it possible to create an equal number of users, but instead of doing that every second change that to every xx minutes?
There is no such built-in feature (https://github.com/locustio/locust/issues/1353 might solve this if it is ever implemented)
One way to do a workaround is to spawn all your users right away (using a spawn rate of something like 100/s), and have them sleep until it is time to run:
import time
from locust import HttpUser, task

start = time.time()

class User1(HttpUser):
    @task
    def mytask(self):
        # do actual task
        pass

class User2(HttpUser):
    @task
    def mytask(self):
        while time.time() - start < 300:
            time.sleep(1)
        # do actual task
        pass

class User3(HttpUser):
    @task
    def mytask(self):
        while time.time() - start < 600:
            time.sleep(1)
        # do actual task
        pass
...
You can probably do something clever and put it all in one class, but I'll leave that as an exercise :)
Locust 2.8.6 Update
Now you can benefit from using custom shapes. Read more in the Locust documentation.
You should use the tick function and return a tuple of your user limit and spawn rate.
Here is a code example:
from locust import LoadTestShape

class SharpStepShape(LoadTestShape):
    increase_delay = 300  # 5 minutes per increase
    increase_size = 50    # number of extra users per increase

    def tick(self):
        run_time = self.get_run_time()
        step_number = int(run_time / self.increase_delay) + 1
        user_limit = int(step_number * self.increase_size)
        return user_limit, self.increase_size
Then just import this shape into your locustfile and it will be used for your load test.
from locust import User
from sharp_step_shape import SharpStepShape

class PerformanceUser(User):
    pass
I have a problem in which I process documents from files using Python generators. The number of files I need to process is not known in advance. Each file contains records which consume a considerable amount of memory. Because of that, generators are used to process records. Here is a summary of the code I am working on:
def process_all_records(files):
    for f in files:
        fd = open(f, 'r')
        recs = read_records(fd)
        recs_p = (process_records(r) for r in recs)
        write_records(recs_p)
My process_records function checks the content of each record and only returns the records which have a specific sender. My problem is the following: I want to have a count of the number of elements returned by read_records. I have been keeping track of the number of records in the process_records function using a list:
def process_records(r):
    if r.sender('sender_of_interest'):
        records_list.append(1)
    else:
        records_list.append(0)
    ...
The problem with this approach is that records_list could grow without bounds depending on the input. I want to be able to consume the content of records_list once it grows to a certain point and then restart the process. For example, after 20 records have been processed, I want to find out how many records are from 'sender_of_interest' and how many are from other sources, and then empty the list. Can I do this without using a lock?
You could make your generator a class with an attribute that contains a count of the number of records it has processed. Something like this:
class RecordProcessor(object):
    def __init__(self, recs):
        self.recs = recs
        self.processed_rec_count = 0

    # __iter__ (rather than __call__) lets write_records iterate the
    # instance directly, as in process_all_records below
    def __iter__(self):
        for r in self.recs:
            if r.sender('sender_of_interest'):
                self.processed_rec_count += 1
            # process record r...
            yield r  # processed record

def process_all_records(files):
    for f in files:
        fd = open(f, 'r')
        recs_p = RecordProcessor(read_records(fd))
        write_records(recs_p)
        print('records processed:', recs_p.processed_rec_count)
Here's the straightforward approach. Is there some reason why something this simple won't work for you?
seen = 0
matched = 0

def process_records(r):
    global seen, matched  # module-level counters, updated in place
    seen = seen + 1
    if r.sender('sender_of_interest'):
        matched = matched + 1
        records_list.append(1)
    else:
        records_list.append(0)
    if seen > 1000 or someOtherTimeBasedCriteria:
        print("%d of %d total records had the sender of interest" % (matched, seen))
        seen = 0
        matched = 0
If you have the ability to close your stream of messages and re-open them, you might want one more total-seen variable, so that if you had to close the stream and re-open it later, you could go back to the last record you processed and pick up there.
In this code, someOtherTimeBasedCriteria might be a timestamp check: record the current time in milliseconds when you begin processing, and if the current time is more than 20,000 ms (20 sec) later, reset the seen/matched counters.
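A sketch of that timestamp variant (using seconds via time.time() rather than milliseconds; the 20-second window is the example figure from above):

import time

seen = 0
matched = 0
window_start = time.time()

def process_records(r):
    global seen, matched, window_start
    seen += 1
    if r.sender('sender_of_interest'):
        matched += 1
    if seen > 1000 or time.time() - window_start > 20:
        print("%d of %d total records had the sender of interest" % (matched, seen))
        seen = 0
        matched = 0
        window_start = time.time()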
I have a system that accepts messages that contain urls; if certain keywords are in the messages, an api call is made with the url as a parameter.
In order to conserve processing and keep my end presentation efficient, I don't want duplicate urls being submitted within a certain time range.
so if this url ---> http://instagram.com/p/gHVMxltq_8/ comes in and it's submitted to the api
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse
import requests

url = incoming.msg['urls']
url = urlparse(url)
if url.netloc == "instagram.com":
    r = requests.get("http://api.some.url/show?url=%s" % url)
and then 3 secs later the same url comes in, I don't want it submitted to the api.
What programming method might I deploy to eliminate/limit duplicate messages from being submitted to the api based on time?
UPDATE USING TIM PETERS METHOD:
limit = DecayingSet(86400)
l = limit.add(longUrl)
if l == False:
    pass
else:
    r = requests.get("http://api.some.url/show?url=%s" % url)
This snippet is inside a long-running process that is accepting streaming messages via tcp.
Every time I pass the same url in, l returns True.
But when I try it in the interpreter everything is good: it returns False when the set time hasn't expired.
Does it have to do with the fact that the script is running while the set is being added to?
Instance issues?
Maybe overkill, but I like creating a new class for this kind of thing. You never know when requirements will get fancier ;-) For example,
from collections import deque
from time import time

class DecayingSet:
    def __init__(self, timeout):  # timeout in seconds
        self.timeout = timeout
        self.d = deque()
        self.present = set()

    def add(self, thing):
        # Return True if `thing` not already in set,
        # else return False.
        result = thing not in self.present
        if result:
            self.present.add(thing)
            self.d.append((time(), thing))
        self.clean()
        return result

    def clean(self):
        # forget stuff added >= `timeout` seconds ago
        now = time()
        d = self.d
        while d and now - d[0][0] >= self.timeout:
            _, thing = d.popleft()
            self.present.remove(thing)
As written, it checks for expirations whenever an attempt is made to add a new thing. Maybe that's not what you want, but it should be a cheap check: since the deque holds items in order of addition, it bails out at once if no items are expiring. Lots of possibilities.
Why a deque? Because deque.popleft() is a lot faster than list.pop(0) when the number of items becomes non-trivial.
Suppose your desired interval is 1 hour. Keep 2 counters that increment every hour, but offset 30 minutes from each other, i.e. counter A goes 1, 2, 3, 4 at 11:17, 12:17, 13:17, 14:17 and counter B goes 1, 2, 3, 4 at 11:47, 12:47, 13:47, 14:47.
Now if a link comes in with either of the two counters the same as an earlier link, consider it a duplicate.
The benefit of this scheme over explicit timestamps is that you can hash url+counterA and url+counterB to quickly check whether the url exists.
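A rough sketch of the basic idea, using epoch-based counters rather than wall-clock times (all names here are illustrative):

import time

def counters(now=None):
    # counter A ticks on the hour; counter B ticks offset by 30 minutes
    now = time.time() if now is None else now
    return int(now // 3600), int((now + 1800) // 3600)

seen = set()

def is_duplicate(url):
    a, b = counters()
    dup = (url, a) in seen or (url, b) in seen
    seen.add((url, a))
    seen.add((url, b))
    return dup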
Update: You need two data stores: one, a regular database table (slow) with columns (url, counterA, counterB); and two, a chunk of n bits of memory (fast). Given a url so.com, counterA 17 and counterB 18, first hash "17,so.com" into the range 0 to n - 1 and see if the bit at that address is turned on. Similarly, hash "18,so.com" and see if its bit is turned on.
If the bit is not turned on in either case, you can be sure it is a fresh URL within an hour, so we are done (quickly).
If the bit is turned on in either case, look up the url in the database table to check whether it was that url indeed or some other URL that hashed to the same bit.
Further update: Bloom filters are an extension of this scheme.
I'd recommend keeping an in-memory cache of the most-recently-used URLs. Something like a dictionary:
urls = {}
and then for each URL:
if url in urls and (time.time() - urls[url]) < SOME_TIMEOUT:
    pass  # Don't submit the data
else:
    urls[url] = time.time()
    # Submit the data
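One caveat: entries in this dictionary are never evicted, so in a long-running process it grows with every new url. A possible pruning helper (my sketch, not part of the original suggestion):

import time

def prune(urls, timeout):
    # drop entries older than `timeout` seconds so the cache stays bounded
    for u in [u for u, t in urls.items() if t < time.time() - timeout]:
        del urls[u]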
I have some simple code that runs a GET request for each item in a generator, and I'm trying to speed it up:
def stream(self, records):
    # type(records) = <type 'generator'>
    for record in records:
        # record = OrderedDict([('_time', '1518287568'), ('data', '5552267792')])
        output = rest_api_lookup(record[self.input_field])
        record.update(output)
        yield record
Right now this runs on a single thread and takes forever, since each REST call waits for the previous one to finish.
I have used multithreading in Python on a list before, following this great answer (https://stackoverflow.com/a/28463266/1150923), but I'm not sure how to re-use the same strategy on a generator instead of a list.
I had some advice from a fellow developer who recommended that I break the generator out into 100-element lists and then close the pool, but I don't know how to create these lists from the generator.
I also need to keep the original order, since I need to yield the records in the right order.
I assume you don't want to turn your generator records into a list first. One way to speed up your processing is to pass the records into a ThreadPoolExecutor chunk-wise. The executor will process your rest_api_lookup concurrently for all items of the chunk. Then you just need to "unchunk" your results. Here's some running sample code (which does not use classes, sorry, but I hope it shows the principle):
from concurrent.futures import ThreadPoolExecutor
from time import sleep

pool = ThreadPoolExecutor(8)  # 8 threads, adjust to taste and # of cores

def records():
    # simulates the records generator
    for i in range(100):
        yield {'a': i}

def rest_api_lookup(a):
    # simulates the REST call :)
    sleep(0.1)
    return {'b': -a}

def stream(records):
    def update_fun(record):
        output = rest_api_lookup(record['a'])
        record.update(output)
        return record
    chunk = []
    for record in records:
        # submit update_fun(record) to the pool, keep the resulting Future
        chunk.append(pool.submit(update_fun, record))
        if len(chunk) == 8:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def unchunk(chunk_gen):
    """Flattens a generator of Future chunks into a generator of Future results."""
    for chunk in chunk_gen:
        for f in chunk:
            yield f.result()  # get the result from the Future

# Now iterate over all results in the same order as generated by records()
for result in unchunk(stream(records())):
    print(result)
HTH!
Update: I added a sleep to the simulated REST call to make it more realistic. This chunked version finishes on my machine in about 1.5 seconds; the sequential version takes 10 seconds (as is to be expected: 100 * 0.1s = 10s).
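As an aside, if building the chunks by hand feels clumsy, a generic helper based on itertools.islice can produce fixed-size lists from any generator (a sketch; the helper name is made up). ThreadPoolExecutor.map would also preserve input order, but note that it submits every item immediately, so it buffers futures for the whole generator in memory:

from itertools import islice

def chunks(iterable, size):
    # yield lists of up to `size` items until the iterable is exhausted
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk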
Here's an example how you can do it with concurrent.futures:
from concurrent import futures
from concurrent.futures import ThreadPoolExecutor

class YourClass(object):

    def stream(self, records):
        for record in records:
            output = rest_api_lookup(record[self.input_field])
            record.update(output)
            # process your record and yield back the result
            yield {"result_key": "whatever the result is"}

    def run_parallel(self):
        """Use this method to do the parallel processing."""
        # The important part - concurrent futures
        # - set the number of workers to the number of jobs to process (4 suggested here,
        #   but this depends on how many threads you want to run in parallel)
        with ThreadPoolExecutor(4) as executor:
            # Use the list `jobs` for the futures and `parallel_results` for the results
            jobs = []
            parallel_results = []
            # Pass some keyword arguments if needed - per job
            record1 = {}  # your values for record1 - create more if needed
            record2 = {}  # your values for record2
            record3 = {}  # your values for record3
            record4 = {}  # your values for record4
            list_of_records = [[record1, record2], [record3, record4]]
            for records in list_of_records:
                # Here we submit one job per chunk of records; it could be a
                # different function per call. Note that stream() is a generator,
                # so it is wrapped in list() to force it to run in the worker thread.
                jobs.append(executor.submit(lambda recs=records: list(self.stream(recs))))
            # Once parallel processing is complete, iterate over the results
            # and collect them for final processing, without any networking
            for job in futures.as_completed(jobs):
                # Read the chunk's results from the future and append them
                parallel_results.extend(job.result())
            # Use sorted to sort by key to preserve order
            parallel_results = sorted(parallel_results, key=lambda k: k['result_key'])
            # Iterate over the streamed results and do whatever is needed
            for result in parallel_results:
                print("Do something with me {}".format(result))
The answer by dnswlt works well but can still be improved: if the requests to the REST API (or whatever else is done with each record) take a variable amount of time, some workers may sit idle while the slowest request of each batch is still running.
The following solution takes a generator and a function as input and applies the function to each element produced by the generator, while maintaining a given number of running threads (each of which applies the function to one element). At the same time, it still returns the results in the order of the input.
from concurrent.futures import ThreadPoolExecutor
import os
import random
import time

def map_async(iterable, func, max_workers=os.cpu_count()):
    # Generator that applies func to the input using max_workers concurrent jobs
    def async_iterator():
        iterator = iter(iterable)
        pending_results = []
        has_input = True
        thread_pool = ThreadPoolExecutor(max_workers)
        while True:
            # Submit jobs for remaining input until max_workers jobs are running
            while (has_input and
                   len([e for e in pending_results if e.running()]) < max_workers):
                try:
                    e = next(iterator)
                    print('Submitting task...')
                    pending_results.append(thread_pool.submit(func, e))
                except StopIteration:
                    print('Submitted all task.')
                    has_input = False
            # If there are no pending results, the generator is done
            if not pending_results:
                return
            # If the oldest job is done, return its value
            if pending_results[0].done():
                yield pending_results.pop(0).result()
            # Otherwise, yield the CPU, then continue starting new jobs
            else:
                time.sleep(.01)
    return async_iterator()

def example_generator():
    for i in range(20):
        print('Creating task', i)
        yield i

def do_work(i):
    print('Starting to work on', i)
    time.sleep(random.uniform(0, 3))
    print('Done with', i)
    return i

random.seed(42)
for i in map_async(example_generator(), do_work):
    print('Got result:', i)
The commented output of a possible execution (on a machine with 8 logical CPUs):
Creating task 0
Submitting task...
Starting to work on 0
Creating task 1
Submitting task...
Starting to work on 1
Creating task 2
Submitting task...
Starting to work on 2
Creating task 3
Submitting task...
Starting to work on 3
Creating task 4
Submitting task...
Starting to work on 4
Creating task 5
Submitting task...
Starting to work on 5
Creating task 6
Submitting task...
Starting to work on 6
Creating task 7
Submitting task...
Starting to work on 7 # This point is reached quickly: 8 jobs are started before any of them finishes
Done with 1 # Job 1 is done, but since job 0 is not, the result is not returned yet
Creating task 8 # Job 1 finished, so a new job can be started
Submitting task...
Creating task 9
Starting to work on 8
Submitting task...
Done with 7
Starting to work on 9
Done with 9
Creating task 10
Submitting task...
Creating task 11
Starting to work on 10
Submitting task...
Done with 3
Starting to work on 11
Done with 2
Creating task 12
Submitting task...
Creating task 13
Starting to work on 12
Submitting task...
Done with 12
Starting to work on 13
Done with 10
Creating task 14
Submitting task...
Creating task 15
Starting to work on 14
Submitting task...
Done with 8
Starting to work on 15
Done with 13 # Several other jobs are started and completed
Creating task 16
Submitting task...
Creating task 17
Starting to work on 16
Submitting task...
Done with 0 # Finally, job 0 is completed
Starting to work on 17
Got result: 0
Got result: 1
Got result: 2
Got result: 3 # The result of all completed jobs are returned in input order until the job of the next one is still running
Done with 5
Creating task 18
Submitting task...
Creating task 19
Starting to work on 18
Submitting task...
Done with 16
Starting to work on 19
Done with 11
Submitted all task.
Done with 19
Done with 4
Got result: 4
Got result: 5
Done with 6
Got result: 6 # Job 6 must have been a very long job; now that it's done, its result and the result of many subsequent jobs can be returned
Got result: 7
Got result: 8
Got result: 9
Got result: 10
Got result: 11
Got result: 12
Got result: 13
Done with 14
Got result: 14
Done with 15
Got result: 15
Got result: 16
Done with 17
Got result: 17
Done with 18
Got result: 18
Got result: 19
The above run took about 4.7s while the sequential execution (setting max_workers=1) took about 23.6s. Without the optimization that avoids waiting for the slowest execution per batch, the execution takes about 5.3s. Depending on the variation of the individual job times and max_workers, the effect of the optimization may be even larger.