I am trying to run a task in parallel using a multiprocessing Pool in Python. Basically, there are some static parameters to a function and a bunch of variable parameters for different hyperparameters. For example:
def simulate(static1, static2, iter1, iter2):
    # do some math in a for loop
    return output
Now the thing is, the nth component of iter2 goes only with the nth component of iter1. Say:
iter1 = [1,2,3,4]
iter2 = [x,y,z,w]
So during iteration the parameters should be (1, x), (2, y), etc., and in the end I expect to get 4 different outputs. So I am trying to implement:
partial_function = partial(simulate, static1=s1, static2=s2)
output = pool.map(partial_function, (iter1, iter2))
I am stuck on how to use multiple iterables, given that Python raises a TypeError saying simulate() is missing 1 positional argument. Any suggestions?
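For reference, the pairing described above (nth with nth) is what zip produces; a minimal, self-contained sketch of one way this is commonly handled, using Pool.starmap with the static arguments fixed positionally (the toy simulate body and values stand in for the real ones):

from functools import partial
from multiprocessing import Pool

def simulate(static1, static2, iter1, iter2):
    # toy stand-in for the real math
    return (static1, static2, iter1, iter2)

if __name__ == '__main__':
    s1, s2 = 'a', 'b'
    iter1 = [1, 2, 3, 4]
    iter2 = ['x', 'y', 'z', 'w']
    # fix the static arguments positionally, then unpack each (iter1, iter2) pair with starmap
    partial_function = partial(simulate, s1, s2)
    with Pool() as pool:
        output = pool.starmap(partial_function, zip(iter1, iter2))
    print(output)  # 4 results, one per (iter1[n], iter2[n]) pair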
Say I have a program that contains a simulation-based function which takes some time to compute.
def foo_func(args):
    # some calculations
    return foo  # df

res = {}  # will be a dictionary of dfs
for i in range(n):
    res[i] = foo_func(args)
Problem: Calculating foo using foo_func n times takes too long
Question: how do I implement multiprocessing/multithreading within the program and store the results in res?
Note that:
foo_func takes in args
order does not matter within res: the order in which the jobs finish does not matter, as long as all of the jobs are correctly stored in res
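As a point of reference (not from the question), one minimal self-contained sketch of this pattern uses concurrent.futures from the standard library; the toy foo_func body and the values of n and args are placeholders:

from concurrent.futures import ProcessPoolExecutor, as_completed

def foo_func(args):
    # toy stand-in for the real simulation
    return args * 2

if __name__ == '__main__':
    n = 8
    args = 21
    res = {}
    with ProcessPoolExecutor() as executor:
        # submit one job per index and remember which index each future belongs to
        futures = {executor.submit(foo_func, args): i for i in range(n)}
        for future in as_completed(futures):  # completion order does not matter
            res[futures[future]] = future.result()
    print(res)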
I have been using parfor in MATLAB to run parallel for loops for quite some time. I need to do something similar in Python but I cannot find any simple solution. This is my code:
t = list(range(1, 3, 1))
G = list(range(0, 3, 2))
results = pandas.DataFrame(columns=['tau', 'p_value', 'G', 't_i'],
                           index=range(0, len(G) * len(t)))
counter = 0
for iteration_G in list(range(0, len(G))):
    for iteration_t in list(range(0, len(t))):
        matrix_1, matrix_2 = bunch of code
        tau, p_value = scipy.stats.kendalltau(matrix_1, matrix_2)
        results.loc[counter, 'tau'] = tau
        results.loc[counter, 'p_value'] = p_value
        results.loc[counter, 'G'] = G[iteration_G]
        results.loc[counter, 't_i'] = t[iteration_t]
        counter = counter + 1
I would like to use the parfor equivalent in the first loop.
I'm not familiar with parfor, but you can use the joblib package to run functions in parallel.
In this simple example there is a function that prints its argument, and we use Parallel to execute it multiple times in parallel with a for-loop:
import multiprocessing
from joblib import Parallel, delayed

# function that you want to run in parallel
def foo(i):
    print(i)

# define the number of cores (this is how many processes will run)
num_cores = multiprocessing.cpu_count()

# execute the function in parallel - `return_list` is a list of the results of the function
# in this case it will just be a list of None's
return_list = Parallel(n_jobs=num_cores)(delayed(foo)(i) for i in range(20))
If this doesn't work for what you want to do, you can try to use numba. It might be a bit more difficult to set up, but in theory with numba you can just add @njit(parallel=True) as a decorator to your function and numba will try to parallelise it for you.
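A minimal sketch of what that might look like (an illustrative function of my own, not from the question; it assumes numba is installed and the loop body is numba-compatible):

from numba import njit, prange
import numpy as np

@njit(parallel=True)
def sum_of_squares(values):
    total = 0.0
    # prange marks the loop that numba may split across threads
    for i in prange(values.shape[0]):
        total += values[i] ** 2
    return total

print(sum_of_squares(np.arange(1_000_000, dtype=np.float64)))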
I found a solution using the parfor package. It is still a bit more complicated than MATLAB's parfor, but it's pretty close to what I am used to.
from parfor import parfor

t = list(range(1, 16, 1))
G = list(range(0, 62, 2))
for iteration_t in list(range(0, len(t))):
    @parfor(list(range(0, len(G))))
    def fun(iteration_G):
        result = pandas.DataFrame(columns=['tau', 'p_value'], index=range(0, 1))
        matrix_1, matrix_2 = bunch of code
        tau, p_value = scipy.stats.kendalltau(matrix_1, matrix_2)
        result['tau'] = tau
        result['p_value'] = p_value
        out = numpy.array([tau, p_value])
        return out
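For anyone trying this: as far as I understand the parfor package (installed with pip install parfor), the decorated name is replaced by the list of per-iteration return values. A minimal self-contained example, with an illustrative function that is not from the answer above:

from parfor import parfor

@parfor(range(5), (2,))
def scaled(i, a):
    return a * i * i

print(scaled)  # the name now holds the list of results: [0, 2, 8, 18, 32]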
In Python I am using itertools.product to iterate over all possible combinations of a list of characters, which produces a very large result.
However, when I look at the Windows 10 Task Manager, the Python process executing this task only uses 13.5% CPU. I looked into multiprocessing in Python and found that with pool.map I can map an instance of a function to a pool and have multiple instances of the function running in parallel. This is great, but since I am iterating over a single (very large) list inside one instance of a function that takes a large amount of time, this doesn't help me.
So the way I see it, the only way to speed this up is to split the result of itertools.product into groups and iterate over the groups in parallel. If I can get the length of the result of itertools.product, I can divide it into groups by the number of processor cores I have available, and then using multiprocessing I can iterate over all these groups in parallel.
So my question is can this be done, and what is the best approach?
Maybe there is a module out there for this sort of thing?
The concept is something like this (the following actually works, but gives a MemoryError when I try to scale it up to the full character set that is commented out):
#!/usr/bin/env python3.5

import sys, itertools, multiprocessing, functools

def process_group(iIterationNumber, iGroupSize, sCharacters, iCombinationLength, iCombinationsListLength, iTotalIterations):
    iStartIndex = 0
    if iIterationNumber > 1: iStartIndex = (iIterationNumber - 1) * iGroupSize
    iStopIndex = iGroupSize * iIterationNumber
    if iIterationNumber == iTotalIterations: iStopIndex = iCombinationsListLength

    aCombinations = itertools.product(sCharacters, repeat=iCombinationLength)
    lstCombinations = list(aCombinations)

    print("Iteration#", iIterationNumber, "StartIndex:", iStartIndex, "StopIndex:", iStopIndex)
    for iIndex in range(iStartIndex, iStopIndex):
        aCombination = lstCombinations[iIndex]
        print("Iteration#", iIterationNumber, ''.join(aCombination))

if __name__ == '__main__':
    #_sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!@#$%^&*()_-+={[}]|\"""':;?/>.<,"
    _sCharacters = "123"
    _iCombinationLength = 4
    aCombinations = itertools.product(_sCharacters, repeat=_iCombinationLength)
    lstCombinations = list(aCombinations)
    _iCombinationsListLength = len(lstCombinations)
    iCPUCores = 4
    _iGroupSize = round(_iCombinationsListLength / iCPUCores)
    print("Length", _iCombinationsListLength)
    pool = multiprocessing.Pool()
    pool.map(functools.partial(process_group, iGroupSize=_iGroupSize, sCharacters=_sCharacters, iCombinationLength=_iCombinationLength, iCombinationsListLength=_iCombinationsListLength, iTotalIterations=iCPUCores), range(1, iCPUCores + 1))
Thanks for your time.
You can't share the product() output among subprocesses; there is no good way to break this up into chunks per process. Instead, have each subprocess generate new values but give them a prefix to start from.
Pull the outer loop out of the product() call and create groups from it. For example, you could create len(sCharacters) groups by decreasing iCombinationLength by one and passing in each element of sCharacters as a prefix:
for prefix in sCharacters:
    # create group for iCombinationLength - 1 results.
    # pass in the prefix
Each group can then loop over product(sCharacters, repeat=iCombinationLength - 1) itself and combine the results with its prefix. So group 1 starts with '0', group 2 starts with '1', etc.
You can extend this by using prefixes of 2, 3 or more characters. For your 10 input characters, that'd create 100 or 1000 groups, respectively. The generic version is:
prefix_length = 3
for prefix in product(sCharacters, repeat=prefix_length):
    # create group for iCombinationLength - prefix_length
    # pass in the prefix
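To make the idea concrete, here is a minimal self-contained sketch of my own (process_prefix, prefix_length and the example values are illustrative, not from the answer); each worker generates its own suffixes rather than receiving a pre-built chunk:

from itertools import product
from multiprocessing import Pool

sCharacters = "0123456789"
iCombinationLength = 4
prefix_length = 1

def process_prefix(prefix):
    # each worker generates the suffixes for its own prefix
    results = []
    for suffix in product(sCharacters, repeat=iCombinationLength - prefix_length):
        results.append(''.join(prefix) + ''.join(suffix))
    return results

if __name__ == '__main__':
    with Pool() as pool:
        groups = pool.map(process_prefix, product(sCharacters, repeat=prefix_length))
    # groups[0] holds every combination starting with '0', groups[1] with '1', etc.
    print(len(groups), len(groups[0]))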
I have a dataset df of trader transactions.
I have 2 levels of for loops as follows:
smartTrader = []

for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    # I have some more calculations here
    for trader in range(len(df['TraderID'])):
        # I have some calculations here, If trader is successful, I add his ID
        # to the list as follows
        smartTrader.append(df['TraderID'][trader])
    # some more calculations here which are related to the first for loop.
I would like to parallelise the calculations for each asset in Assets, and I also want to parallelise the calculations for each trader for every asset. After ALL these calculations are done, I want to do additional analysis based on the smartTrader list.
This is my first attempt at parallel processing, so please be patient with me, and I appreciate your help.
If you use pathos, which provides a fork of multiprocessing, you can easily nest parallel maps. pathos is built for easily testing combinations of nested parallel maps -- which are direct translations of nested for loops.
It provides a selection of maps that are blocking, non-blocking, iterative, asynchronous, serial, parallel, and distributed.
>>> from pathos.pools import ProcessPool, ThreadPool
>>> amap = ProcessPool().amap
>>> tmap = ThreadPool().map
>>> from math import sin, cos
>>> print(amap(tmap, [sin, cos], [range(10), range(10)]).get())
[[0.0, 0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282, -0.9589242746631385, -0.27941549819892586, 0.6569865987187891, 0.9893582466233818, 0.4121184852417566], [1.0, 0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354, -0.9111302618846769]]
Here this example uses a processing pool and a thread pool, where the thread map call is blocking, while the processing map call is asynchronous (note the get at the end of the last line).
Get pathos here: https://github.com/uqfoundation
or with:
$ pip install git+https://github.com/uqfoundation/pathos.git#master
Nested parallelism can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.
Assume you want to parallelize the following nested program
def inner_calculation(asset, trader):
    return trader

def outer_calculation(asset):
    return asset, [inner_calculation(asset, trader) for trader in range(5)]

inner_results = []
outer_results = []
for asset in range(10):
    outer_result, inner_result = outer_calculation(asset)
    outer_results.append(outer_result)
    inner_results.append(inner_result)
# Then you can filter inner_results to get the final output.
Below is the Ray code parallelizing the above code:
Use the @ray.remote decorator for each function that we want to execute concurrently in its own process. A remote function returns a future (i.e., an identifier of the result) rather than the result itself.
When invoking a remote function f(), use the remote modifier, i.e., f.remote().
Use the ids_to_vals() helper function to convert a nested list of ids to values.
Note that the program structure is identical. You only need to add the remote decorators and calls, and then convert the futures (ids) returned by the remote functions to values using the ids_to_vals() helper function.
import ray

ray.init()

# Define inner calculation as a remote function.
@ray.remote
def inner_calculation(asset, trader):
    return trader

# Define outer calculation to be executed as a remote function.
@ray.remote(num_return_vals=2)
def outer_calculation(asset):
    return asset, [inner_calculation.remote(asset, trader) for trader in range(5)]

# Helper to convert a nested list of object ids to a nested list of corresponding objects.
def ids_to_vals(ids):
    if isinstance(ids, ray.ObjectID):
        ids = ray.get(ids)
        if isinstance(ids, ray.ObjectID):
            return ids_to_vals(ids)
    if isinstance(ids, list):
        results = []
        for id in ids:
            results.append(ids_to_vals(id))
        return results
    return ids

outer_result_ids = []
inner_result_ids = []
for asset in range(10):
    outer_result_id, inner_result_id = outer_calculation.remote(asset)
    outer_result_ids.append(outer_result_id)
    inner_result_ids.append(inner_result_id)

outer_results = ids_to_vals(outer_result_ids)
inner_results = ids_to_vals(inner_result_ids)
There are a number of advantages of using Ray over the multiprocessing module. In particular, the same code will run on a single machine as well as on a cluster of machines. For more advantages of Ray see this related post.
Probably threading, from the standard Python library, is the most convenient approach:
import threading

def worker(id):
    # do your calculations here
    return

threads = []
for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    for trader in range(len(df['TraderID'])):
        t = threading.Thread(target=worker, args=(trader,))
        threads.append(t)
        t.start()
# add a semaphore here if you need to synchronize results for all traders.
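As a hedged addition to the sketch above: to actually wait for the workers and read what they computed, you can join the threads and let each worker append to a shared list (list.append is atomic in CPython). A small self-contained illustration with a made-up success check:

import threading

smartTrader = []  # shared result list; list.append is atomic in CPython

def worker(trader_id):
    # hypothetical check standing in for the real calculations
    if trader_id % 2 == 0:
        smartTrader.append(trader_id)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every worker before reading smartTrader
print(smartTrader)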
Instead of using a for loop, use map:
smartTrader = []

m = map(calculations_as_a_function,
        [df[df['Assets'] == asset]
         for asset in range(len(Assets))])

smartTrader.extend(m)
From then on, you can try different parallel map implementations, such as multiprocessing's or stackless's.
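A minimal sketch of that swap (the function body and the example data below are placeholders of mine, since calculations_as_a_function is not shown in the question):

from multiprocessing import Pool

def calculations_as_a_function(asset_df):
    # hypothetical stand-in for the real per-asset calculations
    return asset_df

if __name__ == '__main__':
    asset_dfs = [[1, 2], [3, 4], [5, 6]]  # hypothetical stand-ins for df[df['Assets'] == asset]
    with Pool() as pool:
        smartTrader = pool.map(calculations_as_a_function, asset_dfs)
    print(smartTrader)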
OK, I am using different taggers to tag a text: default, unigram, bigram and trigram.
I have to check which combination of three of those four taggers is the most accurate.
To do that I have to loop through all the possible combinations, which I do like this:
permutaties = list(itertools.permutations(['default_tagger', 'unigram_tagger',
                                           'bigram_tagger', 'trigram_tagger'], 3))
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element))
So each element is a tuple of three tag methods, for example: ('default_tagger', 'bigram_tagger', 'trigram_tagger').
In the accuracy function I now have to dynamically call the three accompanying tagger methods of each element; the problem is, I don't know how to do this.
The tagger functions are as follows:
unigram_tagger = nltk.UnigramTagger(brown_train, backoff=backofff)
bigram_tagger = nltk.BigramTagger(brown_train, backoff=backofff)
trigram_tagger = nltk.TrigramTagger(brown_train, backoff=backofff)
default_tagger = nltk.DefaultTagger('NN')
So for the example the code should become:
t0 = nltk.DefaultTagger('NN')
t1 = nltk.BigramTagger(brown_train, backoff=t0)
t2 = nltk.TrigramTagger(brown_train, backoff=t1)
t2.evaluate(brown_test)
So in essence the problem is how to iterate through all 24 permutations of three taggers drawn from that list of 4.
Any Python masters who can help me?
Not sure if I understood what you need, but you can use the methods you want to call themselves instead of strings, so your code could become something like:
permutaties = itertools.permutations([nltk.UnigramTagger, nltk.BigramTagger,
                                      nltk.TrigramTagger, nltk.DefaultTagger], 3)
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element, brown_train, brown_test))

def accuracy(element, brown_train, brown_test):
    if element is nltk.DefaultTagger:
        evaluator = element("NN")
    else:
        evaluator = element(brown_train, backoff=XXX)  # maybe insert more elif
        # clauses to retrieve the proper backoff parameter -- or you could
        # use a tuple in the call to permutations so the appropriate backoff
        # is available for each function to be called
    return evaluator.evaluate(brown_test)  # ? I am not sure from your code if this is your intent
Starting with jsbueno's code, I suggest writing a wrapper function for each of the taggers to give them the same signature. And since you only need them once, I suggest using a lambda.
permutaties = itertools.permutations([lambda: nltk.DefaultTagger("NN"),
                                      lambda: nltk.UnigramTagger(brown_train, backoff=backoff),
                                      lambda: nltk.BigramTagger(brown_train, backoff=backoff),
                                      lambda: nltk.TrigramTagger(brown_train, backoff=backoff)], 3)
This would allow you to call each directly, without a special function that figures out which function you're calling and employs the appropriate signature.
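A brief usage sketch continuing the snippet above: since every wrapper now takes no arguments, each element of a permutation can be called the same way (the loop body here is only an illustration):

for element in permutaties:
    taggers = [make_tagger() for make_tagger in element]  # call each zero-argument factory
    # taggers now holds three constructed tagger objects for this permutation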
Basing on jsbueno's code, I think you want to reuse evaluator as the backoff argument, so the code should be:
permutaties = itertools.permutations([nltk.UnigramTagger, nltk.BigramTagger,
                                      nltk.TrigramTagger, nltk.DefaultTagger], 3)
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element, brown_train, brown_test))

def accuracy(element, brown_train, brown_test):
    evaluator = "NN"
    for e in element:
        if evaluator == "NN":
            evaluator = e("NN")
        else:
            evaluator = e(brown_train, backoff=evaluator)  # maybe insert more elif
            # clauses to retrieve the proper backoff parameter -- or you could
            # use a tuple in the call to permutations so the appropriate backoff
            # is available for each function to be called
    return evaluator.evaluate(brown_test)  # ? I am not sure from your code if this is your intent