Suppose I have some kind of Monte Carlo simulation f(x), where x is some parameter that I want to hold constant for N trials. The obvious way to parallelize this and collect the output is:
results = pool.map(f,x * np.ones(N))
However, the meaning of x in my program is really not sequence-like, so typing x * np.ones(N) seems kind of silly. None of the map variants seem to do exactly this, but maybe there exists some kind of function like:
results = pool.const_map(f,x)
Such a function could return a multiset rather than a list because the idea "repeatedly do something" does not have a notion of "order". Or maybe the developers have some interest in adding such a function? I could submit a pull request eventually when I find the time.
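For what it's worth, the closest I can get to "the same argument N times" without building a throwaway array is itertools.repeat with an explicit count; a minimal sketch, where f, x and N below are stand-ins for the real simulation and parameters:
from itertools import repeat
from multiprocessing import Pool

def f(x):
    return x ** 2  # stand-in for the real Monte Carlo simulation

if __name__ == "__main__":
    x, N = 3.0, 100  # hypothetical parameter and trial count
    with Pool() as pool:
        results = pool.map(f, repeat(x, N))  # repeat(x, N) yields x exactly N times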
I had no idea how to phrase the title of this question, so apologies for any confusion there. I am using the pymanopt package for optimization and would like to be able to create some sort of function/method that allows for generalized input (a variable number of input arrays). To use pymanopt, one has to provide a cost function defined in terms of the arrays that are to be optimized to minimize the cost.
For example, a cost function could be:
@pymanopt.function.Autograd
def f(A, B):
    return ((X - A @ B.T) ** 2).sum()
To do the optimization, the variable X is defined prior to f, then f is supplied as the cost function to the pymanopt solver. Optimization is done with respect to the arguments of f and these arrays are returned by pymanopt with values that minimize the cost function.
Ideally, I would like to be able to make this definition more dynamic: instead of defining a function in terms of hard-coded arrays, I want to be able to supply a list of variables to be optimized. So if my cost function were instead:
@pymanopt.function.Autograd
def f(L):
    return ((X - np.linalg.multi_dot(L)) ** 2).sum()
where the arrays A, B, ..., C would be stored in a list L. However, as far as I can tell, the variables to be optimized have to be defined directly as individual arrays in the cost function supplied to the solver.
The only thing I can think of doing is to define the cost function by building a string that contains the 'hard-coded' function and executing it via exec(), with something like this:
args = ','.join(['A{}'.format(i) for i in range(len(L))])
exec('@pymanopt.function.Autograd\ndef f({}):\n\treturn ((X - np.linalg.multi_dot([{}]))**2).sum()'.format(args, args))
but I understand that using this method should be avoided if possible. Any advice for navigating this sort of problem is greatly appreciated - thanks! Please let me know if anything is unclear/doesn't make sense.
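For illustration, the exec-free version of what I am after would look like the sketch below, assuming (and I have not verified this against the pymanopt API) that the decorator accepts a variadic *args signature:
import numpy as np
import pymanopt

X = np.random.rand(10, 10)  # hypothetical target matrix

@pymanopt.function.Autograd
def f(*arrays):
    # arrays plays the role of the list L: the matrices A, B, ..., C in order
    return ((X - np.linalg.multi_dot(arrays)) ** 2).sum()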
I am trying to parallelize an interaction with a Python object that is computationally expensive. I would like to use Ray to do this but so far my best efforts have failed.
The object is a CPLEX model object and I'm trying to add a set of constraints for a list of conditions.
Here's my setup:
import numpy as np
import docplex.mp.model as cpx
import ray
m = cpx.Model(name="mymodel")
def mask_array(arr, mask_val):
    array_mask = np.argwhere(arr == mask_val)
    arg_slice = [i[0] for i in array_mask]
    return arg_slice
weeks = [1,3,7,8,9]
const = 1.5
r = rate = np.array(df['r'].tolist(), dtype=np.float)
x1 = m.integer_var_list(data_indices, lb=lower_bound, ub=upper_bound)
x2 = m.dot(x1, r)
@ray.remote
def add_model_constraint(m, x2, x2sum, const):
    m.add_constraint(x2sum <= x2 * const)
    return m
x2sums = []
for w in weeks:
    arg_slice = mask_array(x2, w)
    x2sum = m.dot([x2[i] for i in arg_slice], r[arg_slice])
    x2sums.append(x2sum)

#: this is the expensive part
for x2sum in x2sums:
    add_model_constraint.remote(m, x2, x2sum, const)
In a nutshell, what I'm doing is creating a model object and some variables, and then looping over a set of weeks in order to build a constraint for each. I subset my variable, compute some dot products, and apply the constraint. I would like to be able to create the constraints in parallel because it takes a while, but so far my code just hangs and I'm not sure why.
I don't know if I should return the model object from my function, because by default the m.add_constraint method modifies the object in place. But at the same time I know Ray returns references to the remote value, so I'm not sure what's supposed to happen there.
Is this at all a valid use of Ray? Is it reasonable to expect to be able to modify a CPLEX object in this way (or any other arbitrary Python object)?
I am new to Ray so I may be structuring this all wrong, or maybe this will never work for X, Y, and Z reason which would also be good to know.
The Model object is not designed to be used in parallel. You cannot add constraints from multiple threads at the same time; this will result in undefined behavior. You would have to at least use a lock to make sure that only one thread at a time adds constraints.
Note that parallel model building may not be a good idea at all: the order of constraints will be more or less random. On the other hand, behavior of the solver may depend on the order of constraints (this is called performance variability). So you may have a hard time reproducing certain results/behavior.
I understand the primary issue was the performance of model building.
From the code you sent, I have two suggestions to address this:
Post constraints in batches, that is, store constraints in a list and add them all at once using Model.add_constraints(); this should be more efficient than adding them one at a time (see the sketch after the dotf example below).
Experiment with Model.dotf() (functional-style scalar product). It avoids building auxiliary lists by passing instead a function of the key that returns the coefficient.
This method is new in Docplex version 2.12.
For example, assuming a list of 3 variables:
abc = m.integer_var_list(3, name=["a", "b", "c"])
m.dotf(abc, lambda k: k+2)
This returns docplex.mp.LinearExpression(a+2b+3c).
Model.dotf() is usually faster than Model.dot()
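Applied to the code in the question, the batching suggestion would look roughly like this (a sketch that reuses the m, x2sums, x2 and const names from the question):
# build all constraint objects first, then post them in a single call
cts = [x2sum <= x2 * const for x2sum in x2sums]
m.add_constraints(cts)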
I have a few questions that have been bothering me for a few days. I'm a beginner Python/Django programmer, so I just want to clear up a few things before I dive into real product development (for Python 2.7.*).
1) Saving a value in a variable before using it in a function
for x in some_list/tuple:
    func(do_something(x))

for x in some_list/tuple:
    y = do_something(x)
    func(y)
Which one is faster, or which one SHOULD I use?
2) Creating a new object of a model in Django
def myview(request):
    u = User(username="xyz12", city="TA", name="xyz", ...)
    u.save()

def myview(request):
    d = {'username': "xyz12", 'city': "TA", 'name': "xyz", ...}
    u = User(**d)
    u.save()
3) Creating a dictionary
var = dict(key1=val1, key2=val2, ...)
var = {'key1': val1, 'key2': val2, ...}
4) I know .append() is faster than +=, but what if I want to append one list's elements to another?
a = [1, 2, 3]; b = [4, 5, 6]
a += b
or
for i in b:
    a.append(i)
This is a very interesting question, but I think you are not asking it for the right reason. The performance gained by such optimisations is negligible, especially if you're working with a small number of elements.
On the other hand, what is really important is the ease of reading the code and its clarity.
def myview(request):
    d = {'username': "xyz12", 'city': "TA", 'name': "xyz", ...}
    u = User(**d)
    u.save()
This code, for example, isn't "easy" to read and understand at first sight. It requires some thought before you work out what it actually does. Unless you need the intermediate step, don't do it.
For the 4th point, I'd go for the first solution, which is much clearer (and it avoids the function-call overhead of calling the same function in a loop). You could also use more specialised functions for better performance, such as reduce (see this answer: https://stackoverflow.com/a/11739570/3768672 and this thread as well: What is the fastest way to merge two lists in python?).
The 1st and 3rd points are usually up to what you prefer, as both are really similar and will probably be optimised when compiled to bytecode anyway.
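For the 1st point, if you are curious about what each variant actually compiles to, the standard dis module will show the bytecode; a sketch with hypothetical stand-in functions:
import dis

def variant_with_temp(some_list, do_something, func):
    for x in some_list:
        y = do_something(x)  # result held in a temporary variable
        func(y)

def variant_direct(some_list, do_something, func):
    for x in some_list:
        func(do_something(x))  # result passed straight through

dis.dis(variant_with_temp)
dis.dis(variant_direct)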
If you really want to optimise your code further, I advise you to check this out: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
PS: Ultimately, you can still do your own tests. Write two functions doing exactly the same thing with the two different methods you want to test, measure the execution times of these methods and compare them (be careful to run the tests multiple times to reduce the uncertainty).
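A minimal sketch of such a timing test with the standard timeit module, here for the 4th point (the list sizes and repetition counts are arbitrary):
import timeit

setup = "a = list(range(1000)); b = list(range(1000))"
t_inplace = timeit.timeit("c = a[:]; c += b", setup=setup, number=10000)
t_append = timeit.timeit("c = a[:]\nfor i in b:\n    c.append(i)", setup=setup, number=10000)
print(t_inplace, t_append)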
I'm trying to use scipy.optimize.minimize to minimize a complicated function. I noticed in hindsight that the minimize function takes the objective and derivative functions as separate arguments. Unfortunately, I've already defined a function which returns the objective function value and first-derivative values together -- because the two are computed simultaneously in a for loop. I don't think there is a good way to separate my function into two without the program essentially running the same for loop twice.
Is there a way to pass this combined function to minimize?
(FYI, I'm writing an artificial neural network backpropagation algorithm, so the for loop is used to loop over training data. The objective and derivatives are accumulated concurrently.)
Yes, you can pass them in a single function:
import numpy as np
from scipy.optimize import minimize

def f(x):
    return np.sin(x) + x**2, np.cos(x) + 2*x

sol = minimize(f, [0], jac=True, method='L-BFGS-B')
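With jac=True, minimize treats the first value returned by f as the objective and the second as its gradient, so the optimum can then be read off the result object:
print(sol.x, sol.fun)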
Something that might work: you can memoize the function, meaning that if it gets called with the same inputs a second time, it will simply return the same outputs corresponding to those inputs without doing any actual work the second time. What is happening behind the scenes is that the results are getting cached. In the context of a nonlinear program, there could be thousands of calls, which implies a large cache. Often with memoizers you can specify a cache limit, and the population will be managed FIFO. In other words, you still benefit fully in your particular case, because the inputs will be the same only when you need to return the function value and the derivative around the same point in time. So what I'm getting at is that a small cache should suffice.
You don't say whether you are using py2 or py3. In Py 3.2+, you can use functools.lru_cache as a decorator to provide this memoization. Then, you write your code like this:
import functools

@functools.lru_cache()
def original_fn(x):
    # ... the expensive loop that computes both values goes here ...
    # (note: lru_cache requires hashable arguments, so x would have to be
    # e.g. a tuple rather than a NumPy array)
    return fnvalue, fnderiv

def new_fn_value(x):
    fnvalue, fnderiv = original_fn(x)
    return fnvalue

def new_fn_deriv(x):
    fnvalue, fnderiv = original_fn(x)
    return fnderiv
Then you pass each of the new functions to minimize. You still have a penalty because of the second call, but it will do no work if x is unchanged. You will need to research what unchanged means in the context of floating point numbers, particularly since the change in x will fall away as the minimization begins to converge.
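In terms of the earlier minimize call, that would look something like this (a sketch; x0 is a hypothetical starting point):
from scipy.optimize import minimize

sol = minimize(new_fn_value, x0, jac=new_fn_deriv, method='L-BFGS-B')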
There are lots of recipes for memoization in py2.x if you look around a bit.
Did I make any sense at all?
I have spent many hours trying to figure out what is going on here.
The function 'grad_logp' in the code below is called many times in my program, and cProfile together with RunSnakeRun (to visualize the results) reveals that grad_logp spends about .00004s 'locally' on every call, i.e. not in any of the functions it calls, while the function 'n' spends about .00006s locally on every call. Together these two times make up about 30% of the program time that I care about. It doesn't seem to be function-call overhead, as other Python functions spend far less time 'locally', and merging 'grad_logp' and 'n' does not make my program faster, yet the operations these two functions perform seem rather trivial. Does anyone have any suggestions on what might be happening?
Have I done something obviously inefficient? Am I misunderstanding how cProfile works?
def grad_logp(self, variable, calculation_set):
    p = params(self.p, self.parents)
    return self.n(variable, self.p)

def n(self, variable, p):
    gradient = self.gg(variable, p)
    return np.reshape(gradient, np.shape(variable.value))

def gg(self, variable, p):
    if variable is self:
        gradient = self._grad_logps['x'](x=self.value, **p)
    else:
        gradient = __builtin__.sum([self._pgradient(variable, parameter, value, p) for parameter, value in self.parents.iteritems()])
    return gradient
Functions coded in C are not instrumented by profiling; so, for example, any time spent in sum (which you're spelling __builtin__.sum) will be charged to its caller. Not sure what np.reshape is, but if it's numpy.reshape, the same applies there.
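As a small illustration of that (not from the original post), here the element-wise arithmetic runs in C inside NumPy but is not a Python-level function call, so cProfile charges its time to caller() itself:
import cProfile
import numpy as np

def caller(a, b):
    # all the work happens in NumPy's C operator slots
    return (a * b + a) ** 2

a = np.random.rand(2000000)
b = np.random.rand(2000000)
cProfile.run("for _ in range(50): caller(a, b)")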
Your "many hours" might be better spent making your code less like a maze of twisty little passages and also documenting it.
The first method's arg calculation_set is NOT USED.
Then it does p = params(self.p,self.parents) but that p is NOT USED.
variable is self???
__builtin__.sum???
Get it understandable first and correct second. Then, and only then, worry about speed.