I have two functions that I want to run concurrently for performance. At the moment I run one after the other, and it takes quite some time.
Here is how I'm running them:
import pandas as pd
import threading
df = pd.read_csv('data/Detalhado_full.csv', sep=',', dtype={'maquina':str})
def gerar_graph_36():
    df_ordered = df.query(f'maquina=="3.6"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    return oee
def gerar_graph_31():
    df_ordered = df.query(f'maquina=="3.1"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    return oee
oee_36 = gerar_graph_36()
oee_31 = gerar_graph_31()
print(oee_36, oee_31)
I tried to apply threading with the statements below, but it doesn't return the values; instead it prints None:
print(oee_31, oee_36) -> Expecting: 106.3 99.7 // Returning None None
oee_31 = threading.Thread(target=gerar_graph_31, args=()).start()
oee_36 = threading.Thread(target=gerar_graph_36, args=()).start()
print(oee_31, oee_36)
As a check, if I use the command below it returns 3, as expected:
print(threading.active_count())
I need the oee value returned from the function, something like 103.8.
Thanks in advance!!
Creating a new thread and starting it is not like calling an ordinary function that returns a value: the Thread.start() call just "starts the code of the other thread" and returns immediately.
To collect results from the other threads you have to communicate the computed results back to the main thread using some data structure. An ordinary list or dictionary will do, or you could use a queue.Queue.
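A tiny sketch of the queue.Queue option, using a small wrapper (run_and_store is just an illustrative name) so the original functions stay untouched:
import queue
import threading

results_q = queue.Queue()

def run_and_store(fn, label):
    results_q.put((label, fn()))   # fn() is the original, unmodified function

threads = [threading.Thread(target=run_and_store, args=(gerar_graph_36, "3.6")),
           threading.Thread(target=run_and_store, args=(gerar_graph_31, "3.1"))]
for t in threads:
    t.start()
for t in threads:
    t.join()
while not results_q.empty():
    print(results_q.get())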
If you want something closer to a function call, and to avoid modifying the gerar_graph() functions, you can use the concurrent.futures module instead of threading: it is higher-level code that wraps each call in a "future" object, and you can check when each future is done and fetch the value returned by the function.
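For illustration, a minimal sketch of that concurrent.futures route, using the unmodified gerar_graph_36/gerar_graph_31 from the question:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    future_36 = executor.submit(gerar_graph_36)   # schedules the call, returns a Future immediately
    future_31 = executor.submit(gerar_graph_31)
    oee_36 = future_36.result()                   # blocks until gerar_graph_36 has returned
    oee_31 = future_31.result()
print(oee_36, oee_31)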
Otherwise, simply have a top-level variable containing a list, wait for your threads to finish running (they stop when the function passed as "target" returns), and collect the results:
import pandas as pd
import threading
df = pd.read_csv('data/Detalhado_full.csv', sep=',', dtype={'maquina':str})
results = []
def gerar_graph_36():
    df_ordered = df.query(f'maquina=="3.6"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    results.append(oee)
def gerar_graph_31():
    df_ordered = df.query(f'maquina=="3.1"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    results.append(oee)
# We need to keep a reference to the threads themselves
# so that we can call both ".start()" (which always returns None)
# and ".join()" on them.
oee_31 = threading.Thread(target=gerar_graph_31); oee_31.start()
oee_36 = threading.Thread(target=gerar_graph_36); oee_36.start()
oee_31.join() # will block and return only when the task is done, but oee_36 will be running concurrently
oee_36.join()
print(results)
If you need more than 2 threads (like all 36...), I strongly suggest using concurrent.futures: you can limit the number of workers to a number comparable to the logical CPUs you have. And, of course, manage your tasks and calls in a list or dictionary instead of having a separate variable name for each.
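A hedged sketch of that idea, assuming a generic helper gerar_graph(maquina) (a hypothetical name) that takes the machine code as a parameter instead of hard-coding "3.1"/"3.6":
from concurrent.futures import ThreadPoolExecutor

maquinas = ["3.1", "3.6"]   # extend with the other machine codes
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = {m: executor.submit(gerar_graph, m) for m in maquinas}   # gerar_graph is hypothetical
    oee_por_maquina = {m: fut.result() for m, fut in futures.items()}
print(oee_por_maquina)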
Related
I have a script that loops over a pandas dataframe and outputs GIS data to a geopackage based on some searches and geometry manipulation. It works when I use a for loop, but with over 4k records it takes a while. Since I have it built as its own function that returns what I need based on a row iteration, I tried to run it with multiprocessing:
import pandas as pd, bwe_mapping
from multiprocessing import Pool
#Sample dataframe
bwes = [['id', 7216],['item_id', 3277841], ['Date', '2019-01-04T00:00:00.000Z'], ['start_lat', -56.92], ['start_lon', 45.87], ['End_lat', -59.87], ['End_lon', 44.67]]
bwedf = pd.read_csv(bwes)
geopackage = "datalocation\geopackage.gpkg"
tracklayer = "tracks"
if __name__=='__main__':
    def task(item):
        bwe_mapping.map_bwe(item, geopackage, tracklayer)
    pool = Pool()
    for index, row in bwedf.iterrows():
        task(row)
    with Pool() as pool:
        for results in pool.imap_unordered(task, bwedf.iterrows()):
            print(results)
When I run this, my Task Manager populates with 16 new Python tasks, but there is no sign that anything is being done. Would it be better to use numpy.array_split() to break my pandas df into 4 or 8 smaller ones and run the for index, row in bwedf.iterrows(): loop for each dataframe on its own processor?
The processes don't need to finish in any particular order; as long as I can store the outputs, which are geopandas dataframes, in a list to concatenate into geopackage layers at the end.
Should I have put the for loop in the function and just passed it the whole dataframe and gis data to search?
If you are running on Windows/macOS, then multiprocessing is going to use spawn to create the workers, which means that any child MUST be able to find the function it is going to execute when it imports your main script.
Your code has the function definition inside your if __name__=='__main__': block, so the children don't have access to it.
Simply moving the function def to before if __name__=='__main__': will make it work.
What is happening is that each child crashes when it tries to run the function, because it never saw its definition.
Minimal code to reproduce the problem:
from multiprocessing import Pool
if __name__ == '__main__':
    def task(item):
        print(item)
        return item
    pool = Pool()
    with Pool() as pool:
        for results in pool.imap_unordered(task, range(10)):
            print(results)
The solution is to move the function definition to before the if __name__=='__main__': line.
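For reference, a sketch of the same minimal example with the definition moved out of the guard, which then runs as expected:
from multiprocessing import Pool

def task(item):
    print(item)
    return item

if __name__ == '__main__':
    with Pool() as pool:
        for results in pool.imap_unordered(task, range(10)):
            print(results)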
Edit: to iterate over rows in a dataframe, this simple example demonstrates how to do it. Note that iterrows returns an index and a row, which is why the item is unpacked.
import os
import pandas as pd
from multiprocessing import Pool
import time
# Sample dataframe
bwes = [['id', 7216], ['item_id', 3277841], ['Date', '2019-01-04T00:00:00.000Z'], ['start_lat', -56.92],
['start_lon', 45.87], ['End_lat', -59.87], ['End_lon', 44.67]]
bwef = pd.DataFrame(bwes)
def task(item):
    time.sleep(1)
    index, row = item
    # print(os.getpid(), tuple(row))
    return str(os.getpid()) + " " + str(tuple(row))
if __name__ == '__main__':
    with Pool() as pool:
        for results in pool.imap_unordered(task, bwef.iterrows()):
            print(results)
The time.sleep(1) is only there because there is only a small amount of work and one worker might grab it all, so I am forcing every worker to wait for the others; you should remove it. The result is as follows:
13228 ('id', 7216)
11376 ('item_id', 3277841)
15580 ('Date', '2019-01-04T00:00:00.000Z')
10712 ('start_lat', -56.92)
11376 ('End_lat', -59.87)
13228 ('start_lon', 45.87)
10712 ('End_lon', 44.67)
It seems like your "example" dataframe is transposed; you just have to construct the dataframe correctly (one possible construction is sketched below). I'd recommend you first run the code serially with iterrows before running it across multiple cores.
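As an aside, a hedged sketch of one way the sample record could be built so that each call to task() receives a whole row rather than a single (name, value) pair; the column names are taken from the pairs in the question:
import pandas as pd

bwes = [['id', 7216], ['item_id', 3277841], ['Date', '2019-01-04T00:00:00.000Z'],
        ['start_lat', -56.92], ['start_lon', 45.87], ['End_lat', -59.87], ['End_lon', 44.67]]
bwedf = pd.DataFrame([dict(bwes)])   # one row, one column per field name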
Obviously, sending data to the workers and back takes time, so make sure each worker is doing a substantial amount of computational work and not just sending data back to the parent process.
Suppose I have two independent functions. I'd like to call them concurrently, using python's concurrent.futures.ThreadPoolExecutor. Is there a way to call them using Executor and ensure they are returned in order of submission?
I understand this is possible with Executor.map, but I am looking to parallelize two separate functions, not one function with an iterable input.
I have example code below, but it doesn't guarantee that fn_a will return first (by design of the wait function).
from concurrent.futures import ThreadPoolExecutor, wait
import time
def fn_a():
    t_sleep = 0.5
    print("fn_a: Wait {} seconds".format(t_sleep))
    time.sleep(t_sleep)
    ret = t_sleep * 5  # Do unique work
    return "fn_a: return {}".format(ret)

def fn_b():
    t_sleep = 1.0
    print("fn_b: Wait {} seconds".format(t_sleep))
    time.sleep(t_sleep)
    ret = t_sleep * 10  # Do unique work
    return "fn_b: return {}".format(ret)

if __name__ == "__main__":
    with ThreadPoolExecutor() as executor:
        futures = []
        futures.append(executor.submit(fn_a))
        futures.append(executor.submit(fn_b))
        complete_futures, incomplete_futures = wait(futures)
        for f in complete_futures:
            print(f.result())
I'm also interested in knowing whether there is a way to do this with joblib.
I think I found a reasonable option using lambda and partials. The partials allow me to pass arguments to some functions in the parallelized iterable, but not to others.
from functools import partial
import concurrent.futures

fns = [partial(fn_a), partial(fn_b)]
data = []
with concurrent.futures.ThreadPoolExecutor() as executor:
    for result in executor.map(lambda x: x(), fns):
        data.append(result)
Since it is using executor.map, it returns in order.
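For completeness, a hedged alternative that keeps the submit calls from the question: if you skip wait() and iterate the futures list in the order you submitted them, each result() call blocks until that particular future is done, so the values come back in submission order even though both functions run concurrently:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    futures = [executor.submit(fn_a), executor.submit(fn_b)]
    for f in futures:           # iterate in submission order
        print(f.result())       # fn_a's result prints first, then fn_b's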
I am new to Python and have tried a number of approaches to multiprocessing in Python without seeing much benefit.
I have a task of implementing 3 methods: x, y, and z. What I have tried so far is:
Def foo:
    Iterate over the lines in a text file:
        Call_method_x()
        Result from method x say x1
        Call_method_y()  # this uses x1
        Result from method y say y1
        For i in range(4):
            Multiprocessing.Process(target=Call_method_z())  # this uses y1
I used multiprocessing here on method_z as it is the most CPU-intensive.
I also tried it another way:
def foo():
    method_x()
    method_y()
    method_z()

def main():
    import concurrent.futures
    with concurrent.futures.ProcessPoolExecutor() as executor:
        executor.map(foo())
Which one seems more appropriate? I checked the execution time, but there was not much of a difference. The thing is that method_x(), then method_y(), and then method_z() have to run in that order, since each uses the output of the previous one. Both of these approaches work, but there is no significant benefit from using multiprocessing in either of them.
Please let me know if I am missing something here.
You can use multiprocessing.Pool from Python, something like:
from multiprocessing import Pool

with open(<path-to-file>) as f:
    data = f.readlines()

def method_x(line):
    # do something
    pass

def method_y(line):
    x1 = method_x(line)
    # do something

def method_z(line):
    y1 = method_y(line)
    # do something

def call_home():
    p = Pool(6)
    p.map(method_z, data)
First you read all the lines into the variable data. Then you spin up 6 processes and let each line be processed by any of the 6 processes.
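A small hedged usage sketch to go with it; the if __name__ guard matters on Windows/macOS, where workers are spawned by re-importing the script (as discussed in the answer above):
if __name__ == '__main__':
    call_home()   # maps method_z over every line read into data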
I have a Python function that requires the user to pass in an array of data; the function works on the data and produces two arrays which are returned to the main program. In this question I am including a greatly simplified example from which I hope I can solicit some advice or help. I have created a function titled "Test_Function" that requires the programmer to supply an array of data titled "Array", which in this case has a length of 5000. The function works on the data and produces two arrays titled "Result1" and "Result2", which are returned to the main program as the variables "Res1" and "Res2". I would like to thread "Test_Function" so that one thread works on half of the input array and the other thread works on the other half, and then combine the results back together in the main program for both output arrays ("Result1"/"Res1" and "Result2"/"Res2"). I described a scenario with two threads, but I would like to make it generic enough that it could run a user-defined number of threads. How do I do this with the thread functionality?
import numpy as np
def Test_Function(Array):
    Result1 = Array*np.pi*(1-Array)
    Result2 = Array+478.5 + (1/Array)
    return(np.array(Result1,dtype=float), np.array(Result2,dtype=float))
#---------------------------------------------------------------------------
if __name__ == "__main__":
    Dependent_Array = np.linspace(1.0,5000.0,num=5000)
    Res1, Res2 = Test_Function(Dependent_Array)
# eof
You could use a ThreadPool, to which you can assign async tasks:
import numpy as np
from multiprocessing.pool import ThreadPool
def sub1(Array):
    return Array * np.pi * (1-Array)

def sub2(Array):
    return Array + 478.5 + (1/Array)

def Test_Function(Array):
    pool = ThreadPool(processes=2)
    res1 = pool.apply_async(sub1, (Array,))
    res2 = pool.apply_async(sub2, (Array,))
    return (np.array(res1.get(),dtype=float), np.array(res2.get(),dtype=float))

if __name__ == "__main__":
    Dependent_Array = np.linspace(1.0,5000.0,num=5000)
    Res1, Res2 = Test_Function(Dependent_Array)
This module is not very well documented, but basically it creates a pool of workers to which you can submit subroutines to execute. The get() method of the result object created by apply_async returns the result only when the corresponding thread has finished its work.
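The question also asks for a user-defined number of threads working on slices of the same array. A minimal sketch of that variant (the helper name test_function_chunked is just for illustration), assuming the element-wise Test_Function above, np.array_split to cut the input, and np.concatenate to stitch the pieces back together:
import numpy as np
from multiprocessing.pool import ThreadPool

def test_function_chunked(Array, n_threads=2):
    chunks = np.array_split(Array, n_threads)      # n_threads roughly equal slices
    with ThreadPool(processes=n_threads) as pool:
        parts = pool.map(Test_Function, chunks)    # each thread handles one slice
    res1 = np.concatenate([p[0] for p in parts])   # reassemble Result1
    res2 = np.concatenate([p[1] for p in parts])   # reassemble Result2
    return res1, res2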
I have a dataset df of trader transactions.
I have 2 levels of for loops as follows:
smartTrader = []
for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    # I have some more calculations here
    for trader in range(len(df['TraderID'])):
        # I have some calculations here, If trader is successful, I add his ID
        # to the list as follows
        smartTrader.append(df['TraderID'][trader])
    # some more calculations here which are related to the first for loop.
I would like to parallelise the calculations for each asset in Assets, and I also want to parallelise the calculations for each trader for every asset. After ALL these calculations are done, I want to do additional analysis based on the smartTrader list.
This is my first attempt at parallel processing, so please be patient with me, and I appreciate your help.
If you use pathos, which provides a fork of multiprocessing, you can easily nest parallel maps. pathos is built for easily testing combinations of nested parallel maps -- which are direct translations of nested for loops.
It provides a selection of maps that are blocking, non-blocking, iterative, asynchronous, serial, parallel, and distributed.
>>> from pathos.pools import ProcessPool, ThreadPool
>>> amap = ProcessPool().amap
>>> tmap = ThreadPool().map
>>> from math import sin, cos
>>> print(amap(tmap, [sin,cos], [range(10),range(10)]).get())
[[0.0, 0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282, -0.9589242746631385, -0.27941549819892586, 0.6569865987187891, 0.9893582466233818, 0.4121184852417566], [1.0, 0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354, -0.9111302618846769]]
This example uses a processing pool and a thread pool, where the thread map call is blocking, while the processing map call is asynchronous (note the get at the end of the last line).
Get pathos here: https://github.com/uqfoundation
or with:
$ pip install git+https://github.com/uqfoundation/pathos.git#master
Nested parallelism can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.
Assume you want to parallelize the following nested program
def inner_calculation(asset, trader):
    return trader

def outer_calculation(asset):
    return asset, [inner_calculation(asset, trader) for trader in range(5)]

inner_results = []
outer_results = []
for asset in range(10):
    outer_result, inner_result = outer_calculation(asset)
    outer_results.append(outer_result)
    inner_results.append(inner_result)
# Then you can filter inner_results to get the final output.
Below is the Ray code parallelizing the above code:
Use the @ray.remote decorator for each function that we want to execute concurrently in its own process. A remote function returns a future (i.e., an identifier of the result) rather than the result itself.
When invoking a remote function f(), use the remote modifier, i.e., f.remote().
Use the ids_to_vals() helper function to convert a nested list of ids to values.
Note that the program structure is identical. You only need to add the remote decorators and calls, and then convert the futures (ids) returned by the remote functions to values using the ids_to_vals() helper function.
import ray
ray.init()

# Define inner calculation as a remote function.
@ray.remote
def inner_calculation(asset, trader):
    return trader

# Define outer calculation to be executed as a remote function.
@ray.remote(num_return_vals = 2)
def outer_calculation(asset):
    return asset, [inner_calculation.remote(asset, trader) for trader in range(5)]

# Helper to convert a nested list of object ids to a nested list of corresponding objects.
def ids_to_vals(ids):
    if isinstance(ids, ray.ObjectID):
        ids = ray.get(ids)
        if isinstance(ids, ray.ObjectID):
            return ids_to_vals(ids)
    if isinstance(ids, list):
        results = []
        for id in ids:
            results.append(ids_to_vals(id))
        return results
    return ids

outer_result_ids = []
inner_result_ids = []
for asset in range(10):
    outer_result_id, inner_result_id = outer_calculation.remote(asset)
    outer_result_ids.append(outer_result_id)
    inner_result_ids.append(inner_result_id)

outer_results = ids_to_vals(outer_result_ids)
inner_results = ids_to_vals(inner_result_ids)
There are a number of advantages of using Ray over the multiprocessing module. In particular, the same code will run on a single machine as well as on a cluster of machines. For more advantages of Ray see this related post.
Probably threading, from the standard Python library, is the most convenient approach:
import threading

def worker(id):
    # Do your calculations here
    return

threads = []
for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    for trader in range(len(df['TraderID'])):
        t = threading.Thread(target=worker, args=(trader,))
        threads.append(t)
        t.start()
# add a semaphore here if you need to synchronize results for all traders.
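One small addition worth sketching: since the follow-up analysis depends on all traders being processed, the main thread should wait for the workers before continuing (a minimal sketch, reusing the threads list above):
for t in threads:
    t.join()   # block until every worker has finished
# all worker threads are done; results can now be aggregated for the smartTrader analysis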
Instead of using for, use map:
smartTrader = []

m = map(calculations_as_a_function,
        [df[df['Assets'] == asset]
         for asset in range(len(Assets))])
smartTrader.extend(m)   # collect the result for every asset
From then on, you can try different parallel map implementations, such as multiprocessing's or Stackless's.
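As a rough illustration (assuming calculations_as_a_function, df, and Assets are defined at module level so worker processes can import them), the same pattern with multiprocessing's parallel map might look like:
from multiprocessing import Pool

if __name__ == '__main__':
    frames = [df[df['Assets'] == asset] for asset in range(len(Assets))]
    with Pool() as pool:
        smartTrader = pool.map(calculations_as_a_function, frames)   # one result per asset, in order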