Passing variables to parallelized function

I'm parallelizing the generation of a matrix where each element is computed by a function fun. It works if the only thing I pass into this function is the pair of indices i and j. However, I also want to pass another variable, say x, into this function. How do I do this?
I'm using Python 2.7.
import numpy as np
import multiprocess as mp
import itertools
p = mp.Pool()
def fun((i,j)):
    print i,j
    prod = i * j
    # what if I want to have a variable x in this function
    # prod = i * j * x
    return prod
combs = ((i,j) for i,j in itertools.product(xrange(5), repeat=2) if i <= 5)
result = p.map(fun, combs)
p.close()
p.join()
newresult = np.array(result).reshape(5,5)
print newresult

def fun((i,j,x)):
    print i,j,x
    prod = i * j * x
    return prod
Why this works: you are actually passing just one object into the function, and that object happens to be a tuple. def fun((i,j)) simply unpacks that tuple again. So to answer your question, you can add another element to the tuple and it works fine.
A clearer representation of what you are doing:
def fun(data):
    i,j,x = data
    print i,j,x
    prod = i * j * x
    return prod
data = (2,4,10)
print(fun(data))
Or you can do this (note that p.map passes exactly one argument per call, so this two-argument form needs a wrapper before it can be handed to p.map):
def fun((i,j), x):
    print i,j, x
    prod = i * j * x
    return prod
print(fun((2,4), 10))
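For readers on Python 3, where tuple parameters such as def fun((i, j)) are no longer valid syntax, one possible way to pass the extra variable is functools.partial. The sketch below is an illustration rather than the asker's code: build_matrix is a hypothetical helper name, and it uses the thread-backed multiprocessing.pool.ThreadPool (which has the same map() API as a process pool) only so that it runs unchanged on any platform.

```python
from functools import partial
from itertools import product
from multiprocessing.pool import ThreadPool  # thread-backed pool, same map() API


def fun(ij, x):
    """Compute one matrix element from an (i, j) pair and the extra value x."""
    i, j = ij
    return i * j * x


def build_matrix(n, x):
    """Return an n-by-n nested list whose (i, j) entry is fun((i, j), x)."""
    combs = list(product(range(n), repeat=2))
    with ThreadPool() as pool:
        # partial() freezes x, so map() still passes one (i, j) tuple per call
        flat = pool.map(partial(fun, x=x), combs)
    # map() preserves input order, so the flat list can be regrouped into rows
    return [flat[row * n:(row + 1) * n] for row in range(n)]


matrix = build_matrix(5, x=10)
```

The same partial(fun, x=x) object can be given to a process pool's map() as well, provided fun is defined at module top level so that it can be pickled.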

Related

How do I run this function for multiple values of N?

I am trying to run the code below for N = np.linspace(20,250,47), but I get multiple errors when I change N. I am new to Python and am not sure how to evaluate this function for multiple values of N. Below is the code with N = 400, which works, but I am not sure how to make it work for several values of N at once.
import matplotlib.pyplot as plt
import numpy as np
S0 = 9
K = 10
T = 3
r = 0.06
sigma = 0.3
N = 400
dt = T / N
u = exp(sigma*sqrt(dt)+(r-0.5*sigma**2)*dt)
d = exp(-sigma*sqrt(dt)+(r-0.5*sigma**2)*dt)
p = 0.5
def binomial_tree_put(N, T, S0, sigma, r, K, array_out=False):
    dt = T / N
    u = exp(sigma*sqrt(dt)+(r-0.5*sigma**2)*dt)
    d = exp(-sigma*sqrt(dt)+(r-0.5*sigma**2)*dt)
    p = 0.5
    price_tree = np.zeros([N+1,N+1])
    for i in range(N+1):
        for j in range(i+1):
            price_tree[j,i] = S0*(d**j)*(u**(i-j))
    option = np.zeros([N+1,N+1])
    option[:,N] = np.maximum(np.zeros(N+1), K - price_tree[:,N])
    for i in np.arange(N-1, -1, -1):
        for j in np.arange(0, i+1):
            option[j, i] = np.exp(-r*dt)*(p*option[j, i+1]+(1-p)*option[j+1, i+1])
    if array_out:
        return [option[0,0], price_tree, option]
    else:
        return option[0,0]

Suppose you have a list of values for N, e.g. N = [400, 300, 500, 800]. You then need to call the function once for every value, which you can do with a loop.
For example,
for num in N:
    binomial_tree_put(num, *other arguments*)

np.linspace() creates an np.array, but the function expects a single integer. If you want to execute a function for each element of an array or list, you can do that inside a loop like this:
# your code as defined above goes here
for num in np.linspace(20,250,47):
    N = int(num) # you could convert directly in the loop header - this is just to illustrate
    binomial_tree_put(N, T, S0, sigma, r, K, array_out=False)
Be aware that, depending on how long your function takes to execute and how many elements are in your iterable (47 in your case), this may take a while to run.
Edit: I also noticed you seem to be missing an import in your example code. exp() and sqrt() are part of the math module.
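Putting the loop and the missing imports together, a self-contained sketch might look like the following. It is not the asker's exact code: the function drops the array_out option and keeps only the current layer of option values instead of the full (N+1) x (N+1) arrays, which is an assumption made here to keep the example short and fast. The recurrence, the parameters and the values of N are taken from the question.

```python
from math import exp, sqrt

import numpy as np


def binomial_tree_put(N, T, S0, sigma, r, K):
    """Price a put on an N-step binomial tree (same recurrence as the question)."""
    dt = T / N
    u = exp(sigma * sqrt(dt) + (r - 0.5 * sigma**2) * dt)
    d = exp(-sigma * sqrt(dt) + (r - 0.5 * sigma**2) * dt)
    p = 0.5
    j = np.arange(N + 1)
    # terminal stock prices: j down-moves and N - j up-moves
    prices = S0 * d**j * u**(N - j)
    option = np.maximum(0.0, K - prices)
    # step back through the tree one layer at a time
    for _ in range(N):
        option = exp(-r * dt) * (p * option[:-1] + (1 - p) * option[1:])
    return float(option[0])


# linspace returns floats, so round and convert to integers before use
N_values = np.rint(np.linspace(20, 250, 47)).astype(int)
prices = [binomial_tree_put(int(N), T=3, S0=9, sigma=0.3, r=0.06, K=10)
          for N in N_values]
```

Collecting the results in a list keeps each price next to the N that produced it, for example when plotting prices against N_values.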

You can also use a partial function, like this:
from functools import partial
N = [1, 2, ...] # all your N values
binom_fct = partial(binomial_tree_put, T=T, S0=S0, sigma=sigma, r=r, K=K, array_out=False)
for num in N:
    binom_fct(num)
See the functools.partial documentation for details.

Finding all the roots of the function x = a*sin(x) in Python

I have to find the number of solutions depending on the parameter a. When solving the equation numerically with scipy.optimize.root, I get some numbers which aren't roots of the function. For example, for
x = 7*sin(x) I get the numbers -7.71046524 and 7.71046524. My code is:
a = np.linspace(-5, 5)
def fun(x):
    return x - b*np.sin(x)
for i in a:
    solutions = []
    b = i
    c = abs(int(round(i)))
    for j in range(-c, c+1):
        y = root(fun, j)
        if (round(y.x[0], 3) not in solutions):
            solutions.append(round(y.x[0], 3))
    print(len(solutions))

If you use scipy.optimize.root, the return value contains x, the solution array, and success, a boolean flag. You need to filter out any result where success is False.
import numpy as np
from scipy.optimize import root
a = np.linspace(-7, 7)
def fun(x):
    return x - b*np.sin(x)
for i in a:
    solutions = []
    b = i
    c = abs(int(round(i)))
    for j in range(-c, c+1):
        y = root(fun, j)
        # round once so the membership test and the stored value agree
        r = round(y.x[0], 3)
        if y.success and (r not in solutions):
            solutions.append(r)
    print(i, solutions)
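A variant of the same idea, sketched here under a few assumptions: the parameter is passed through the args argument of scipy.optimize.root instead of the global b, and, as an extra safeguard that is not part of the answer above, a result is kept only if its reported residual is also close to zero. The names residual and find_roots are hypothetical.

```python
import numpy as np
from scipy.optimize import root


def residual(x, a):
    """Left-hand side minus right-hand side of x = a*sin(x)."""
    return x - a * np.sin(x)


def find_roots(a, decimals=3):
    """Collect distinct roots found from integer starting points in [-|a|, |a|]."""
    c = abs(int(round(a)))
    found = set()
    for x0 in range(-c, c + 1):
        sol = root(residual, x0, args=(a,))
        # keep only converged results whose residual really is close to zero
        if sol.success and np.allclose(sol.fun, 0.0, atol=1e-8):
            found.add(round(float(sol.x[0]), decimals))
    return sorted(found)


roots = find_roots(7.0)
```

Since |a*sin(x)| can never exceed |a|, any reported value with |x| > |a|, such as 7.71 for a = 7, cannot be a true root, which is a quick sanity check on the filtered list.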

Python - is there a way to automate this? itertools.product

I have the following code:
import numpy as np
from itertools import product
x = np.arange(-1, 2)
a = np.array([i for i in product(x,x,x,x)])
I also need np.array([i for i in product(x,x)]) and np.array([i for i in product(x,x,x)]), and so on. So I would like to automate product so that I only have to give an argument for the number of repetitions.
I tried passing product a list and a tuple, but that does not work.
Any ideas?

Write your own product for NumPy with an argument repeat that says how often x has to be repeated:
def np_product(x, repeat):
    # one axis per repetition, plus a last axis holding the tuple entries
    result = np.empty((len(x),)*repeat + (repeat,), dtype=x.dtype)
    for n in range(repeat):
        # index that broadcasts x along axis n
        index = (None,) * n + (slice(None),) + (None,) * (repeat-n-1)
        result[..., n] = x[index]
    return result.reshape(-1, repeat)
a = np_product(x, repeat=4)

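product takes an optional integer keyword argument, repeat, specifying how many times to repeat the iterable argument. Wrap the result in list() first, because np.array does not consume an iterator:
np.array(list(product(x, repeat=2)))
np.array(list(product(x, repeat=3)))
np.array(list(product(x, repeat=4)))
# etc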
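As a small sketch of how the repeat keyword can be wrapped so that only the number of repetitions has to be supplied (product_array is a hypothetical helper name, not part of itertools or NumPy):

```python
from itertools import product

import numpy as np


def product_array(x, repeat):
    """Return all length-`repeat` tuples over x as a 2-D array, one tuple per row."""
    # list() materializes the iterator so np.array sees a sequence of tuples
    return np.array(list(product(x, repeat=repeat)))


x = np.arange(-1, 2)
arrays = {n: product_array(x, n) for n in (2, 3, 4)}
```

Each array has len(x)**repeat rows and repeat columns, so arrays[4] here has shape (81, 4).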

n = 4
lst = [x for _ in range(n)]
[i for i in product(*lst)]
You can use * followed by a list to pass its items as separate positional arguments (*args).

What do I get from Queue.get() (Python)

Overall question: How do I know what I am getting from a Queue object when I call Queue.get()? How do I sort it, or identify it? Can you get specific items from the Queue and leave others?
Context:
I wanted to learn a little about multiprocessing (threading?) to make solving a matrix equation more efficient.
To illustrate, below is my working code for solving the matrix equation Ax = b without taking advantage of multiple cores. The solution is [1,1,1].
def jacobi(A, b, x_k):
    N = len(x_k)
    x_kp1 = np.copy(x_k)
    E_rel = 1
    iteration = 0
    if (N != A.shape[0] or N != A.shape[1]):
        raise ValueError('Matrix/vector dimensions do not match.')
    while E_rel > ((10**(-14)) * (N**(1/2))):
        for i in range(N):
            sum = 0
            for j in range(N):
                if j != i:
                    sum = sum + A[i,j] * x_k[j]
            x_kp1[i] =(1 / A[i,i]) * (b[i] - sum)
        E_rel = 0
        for n in range(N):
            E_rel = E_rel + abs(x_kp1[n] - x_k[n]) / ((abs(x_kp1[n]) + abs(x_k[n])) / 2)
        iteration += 1
        # print("relative error for this iteration:", E_rel)
        if iteration < 11:
            print("iteration ", iteration, ":", x_kp1)
        x_k = np.copy(x_kp1)
    return x_kp1
if __name__ == '__main__':
    A = np.matrix([[12.,7,3],[1,5,1],[2,7,-11]])
    b = np.array([22.,7,-2])
    x = np.array([1.,2,1])
    print("Jacobi Method:")
    x_1 = jacobi(A, b, x)
Ok, so I wanted to convert this code following this nice example: https://p16.praetorian.com/blog/multi-core-and-distributed-programming-in-python
So I got some code that runs and converges to the correct solution in the same number of iterations! That's really great, but what is the guarantee that this happens? It seems like Queue.get() just grabs whatever result from whatever process finished first (or last?). I was actually very surprised when my code ran, as I expected
for i in range(N):
    x_update[i] = q.get(True)
to jumble up the elements of the vector.
Here is my code updated using the multi-processing library:
import numpy as np
import multiprocessing as mu
np.set_printoptions(precision=15)
def Jacobi_step(index, initial_vector, q):
    N = len(initial_vector)
    sum = 0
    for j in range(N):
        if j != i:
            sum = sum + A[i, j] * initial_vector[j]
    # this result is the updated element at given index of our solution vector.
    q.put((1 / A[index, index]) * (b[index] - sum))
if __name__ == '__main__':
    A = np.matrix([[12.,7,3],[1,5,1],[2,7,-11]])
    b = np.array([22.,7,-2])
    x = np.array([1.,2,1])
    q = mu.Queue()
    N = len(x)
    x_update = np.copy(x)
    p = []
    error = 1
    iteration = 0
    while error > ((10**(-14)) * (N**(1/2))):
        # assign a process to each element in the vector x,
        # update one element with a single Jacobi step
        for i in range(N):
            process = mu.Process(target=Jacobi_step(i, x, q))
            p.append(process)
            process.start()
        # fill in the updated vector with each new element acquired by the last step
        for i in range(N):
            x_update[i] = q.get(True)
        # check for convergence
        error = 0
        for n in range(N):
            error = error + abs(x_update[n] - x[n]) / ((abs(x_update[n]) + abs(x[n])) / 2)
            p[i].join()
        x = np.copy(x_update)
        iteration += 1
        print("iteration ", iteration, ":", x)
        del p[:]

A Queue is first-in, first-out: the first element inserted is the first element retrieved, in order of insertion.
Since you cannot control the order in which the processes finish and insert their results, I suggest you put tuples in the Queue, each containing the value and some identifying object that can be used to sort the results or relate them to the original computation.
result = (1 / A[index, index]) * (b[index] - sum)
q.put((index, result))
This example puts the index in the Queue together with the result, so that when you .get() later you get the index too and use it to know which computation this is for:
i, x_i = q.get(True)
x_update[i] = x_i
Or something like that.
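One detail worth noting about the question's code: target=Jacobi_step(i, x, q) calls the function immediately in the parent process and passes its return value (None) as the target, so the results are put on the queue one after another in index order. That is why nothing got jumbled. To actually run the steps concurrently, the callable and its arguments have to be passed separately (target=..., args=...), and then the index tag becomes necessary. The sketch below illustrates this with threading.Thread and queue.Queue, chosen here only so that the example runs unchanged on any platform; multiprocessing.Process and multiprocessing.Queue follow the same target/args and put/get pattern. The names jacobi_step and jacobi_sweep are hypothetical.

```python
import queue
import threading

import numpy as np


def jacobi_step(index, A, b, x, q):
    """Compute the updated element x[index] and put it on q tagged with its index."""
    s = sum(A[index, j] * x[j] for j in range(len(x)) if j != index)
    q.put((index, (b[index] - s) / A[index, index]))


def jacobi_sweep(A, b, x):
    """Run one Jacobi iteration with one worker per vector element."""
    q = queue.Queue()
    # pass the callable and its arguments separately; do not call it here
    workers = [threading.Thread(target=jacobi_step, args=(i, A, b, x, q))
               for i in range(len(x))]
    for w in workers:
        w.start()
    x_new = np.empty_like(x)
    for _ in workers:
        # results may arrive in any order; the tag says where each one belongs
        i, x_i = q.get()
        x_new[i] = x_i
    for w in workers:
        w.join()
    return x_new


A = np.array([[12., 7, 3], [1, 5, 1], [2, 7, -11]])
b = np.array([22., 7, -2])
x = np.array([1., 2, 1])
x_next = jacobi_sweep(A, b, x)
```

Because every result carries its own index, the assembled vector is the same regardless of which worker finishes first.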

How to optimize data generation for numpy call

I'd like to know how to make the following code shorter and/or more efficient. Could I (or should I) get rid of the for loop by using a functional method, or is there a method from numpy I should be using?
The code calculates the expected value of an array of integers.
vals = np.arange(self.n+1)
# array of probability of each value in vals
parr = np.ones(len(vals))
for i in range(len(vals)):
    parr[i] *= self.prob(vals[i])
return np.dot(vals,parr)
As requested in comments, the implementation of the method prob():
def prob(self, x):
    """Computes probability of removing x items
    :param x: number of items to remove
    :returns: probability of removing x items
    """
    # p is the probability of removing an item
    # sl.choose computes n choose x
    return sl.choose(self.n, x) * (self.p**x) * \
        (1-self.p)**(self.n-x)

I think this will be the fastest:
vals = np.arange(self.n+1)
# array of probability of each value in vals
parr = self.prob(vals)
return np.dot(vals,parr)
and the function:
def prob(self, list_of_x):
    """Computes probability of removing x items
    :param list_of_x: numbers of items to remove
    :returns: array of probabilities, one per entry of list_of_x
    """
    # p is the probability of removing an item
    # sl.choose computes n choose x
    return np.asarray([sl.choose(self.n, e) for e in list_of_x]) * (self.p ** list_of_x) * \
        (1-self.p)**(self.n - list_of_x)
Because NumPy's vectorized operations are faster than a Python-level loop:
import timeit
import numpy as np
list_a = [1, 2, 3] * 1000
list_b = [4, 5, 6] * 1000
np_list_a = np.asarray(list_a)
np_list_b = np.asarray(list_b)
print(timeit.timeit('[a * b for a, b in zip(list_a, list_b)]', 'from __main__ import list_a, list_b', number=1000))
print(timeit.timeit('np_list_a * np_list_b', 'from __main__ import np_list_a, np_list_b', number=1000))
Result:
0.19378583212707723
0.004333830584755033
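For a version that can be run and checked on its own, here is a sketch written as a plain function instead of a method. It assumes Python 3.8+ and uses math.comb as a stand-in for sl.choose, whose library is not shown in the question; expected_removed is a hypothetical name.

```python
from math import comb

import numpy as np


def expected_removed(n, p):
    """Expected number of removed items when each of n items is removed with probability p."""
    vals = np.arange(n + 1)
    # binomial coefficients n choose k for every k in vals
    binom = np.array([comb(n, int(k)) for k in vals], dtype=float)
    # vectorized binomial probabilities for all k at once
    probs = binom * p**vals * (1 - p)**(n - vals)
    return float(np.dot(vals, probs))


result = expected_removed(10, 0.3)
```

Since prob() as defined in the question is the binomial probability mass function, the expected value should equal n * p, which gives a simple way to validate the result (3.0 for n = 10 and p = 0.3, up to floating-point error).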

The loop can be reduced to a list comprehension:
vals = np.arange(self.n+1)
# array of probability of each value in vals
parr = [self.prob(v) for v in vals]
return np.dot(vals, parr)
