How to submit map_blocks operation to a Client - python

I am trying to evaluate a function chunk-wise on a dask array. I can successfully do the following:
import numpy as np
import dask.array as da

arr = np.random.randint(0, 9, (4, 4))
darr = da.from_array(arr, chunks=(2, 2))

def block_mean(block):
    return np.array([[block.mean()]])

r = darr.map_blocks(block_mean)
out_arr = r.compute()
How can I replicate this in a Client? I tried the following:
import numpy as np
import dask.array as da
from dask.distributed import Client  # needed for Client below

arr = np.random.randint(0, 9, (4, 4))
darr = da.from_array(arr, chunks=(2, 2))

def block_mean(block):
    return np.array([[block.mean()]])

r = darr.map_blocks(block_mean)
c = Client()
# The following does not work
out = c.submit(r)

To keep this very short: you need not do anything at all. Once you create a Client, it becomes the default scheduler, and .compute() uses the default scheduler unless told otherwise.
from dask.distributed import Client

r = darr.map_blocks(block_mean)
c = Client()
out_arr = r.compute()  # runs on the distributed scheduler provided by the Client
If you read the docs for submit, you'll see that it expects a function; there are plenty of examples there of how to use it.
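For completeness, a minimal sketch of what submit actually expects: a callable plus its arguments, returning a Future (block_mean and arr are the objects defined in the question):

# submit schedules block_mean(arr) on a worker and returns a Future
fut = c.submit(block_mean, arr)
print(fut.result())  # a 1x1 array holding the mean of arr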


efficient way to get all numpy slices for different ranges

I want to slice the same numpy array (data_ar) multiple times, each time finding the values in a different range.
data_ar shape: (203,)
range_ar shape: (1000,)
I implemented it with a for loop, but it takes way too long since I have a lot of data_arrays:
# create results array
results_ar = np.zeros(shape=(1000,), dtype=object)

for i, r in enumerate(range_ar):
    results_ar[i] = data_ar[(data_ar >= (r - delta)) & (data_ar < (r + delta))].values
so for example:
data_ar = [1, 3, 4, 6, 10, 12]
range_ar = [7, 4, 2]
delta = 3
expected output (note results_ar shape=(3,), dtype=object, each element is an array):
results_ar = [[6, 10],
              [1, 3, 4, 6],
              [1, 3, 4]]
Any ideas on how to tackle this?
You can use numba to speed up the computations.
import numpy as np
import numba
from numba.typed import List
import timeit

data_ar = np.array([1, 3, 4, 6, 10, 12])
range_ar = np.array([7, 4, 2])
delta = 3

def foo(data_ar, range_ar):
    results_ar = list()
    for i in range_ar:
        results_ar.append(data_ar[(data_ar >= (i - delta)) & (data_ar < (i + delta))])

print(timeit.timeit(lambda: foo(data_ar, range_ar)))

@numba.njit(parallel=True, fastmath=True)
def foo(data_ar, range_ar):
    results_ar = List()
    for i in range_ar:
        results_ar.append(data_ar[(data_ar >= (i - delta)) & (data_ar < (i + delta))])

print(timeit.timeit(lambda: foo(data_ar, range_ar)))
15.53519330600102
1.6557575029946747
Almost a 9.4x speedup.
You could use np.searchsorted like this (note that this requires data_ar to be sorted, as it is in your example):
import numpy as np

data_ar = np.array([1, 3, 4, 6, 10, 12])
range_ar = np.array([7, 4, 2])
delta = 3

# each row of bounds is [r - delta, r + delta); searchsorted (side='left')
# finds the slice indices, reproducing the >= lower / < upper test of the loop
bounds = range_ar[:, None] + delta * np.array([-1, 1])
result = [data_ar[slice(*row)] for row in np.searchsorted(data_ar, bounds)]
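As a quick sanity check, the searchsorted result matches the boolean-mask loop from the question:

# verify against the straightforward boolean-mask version
expected = [data_ar[(data_ar >= r - delta) & (data_ar < r + delta)] for r in range_ar]
assert all(np.array_equal(a, b) for a, b in zip(result, expected))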

How to map a function to an array of dataframes in parallel in python?

My code has an array of dataframes, each of which I want to apply a function to. The dataframes are all in the same format; here's an example of one: [example dataframe not reproduced here]. The code I have for regular mapping is this:
def return_stat(df):
    return np.random.choice(df.iloc[:, 1], p=df.iloc[:, 0])

weather_df_list = [weather_df1, weather_df2, weather_df3, weather_df4]
expected_values = list(map(return_stat, weather_df_list))
but I have 16 cores on my computer and want to make use of them to make this code super fast.
How would I implement this same code using parallel computing in Python?
Thanks!
Using multiprocessing.Pool can help to occupy all your cores.
import pandas as pd, numpy as np, multiprocessing

def return_stat(df):
    return np.random.choice(df.iloc[:, 1], p=df.iloc[:, 0])

# the __main__ guard is required because worker processes re-import this module
if __name__ == '__main__':
    weather_df = pd.DataFrame({'rain_probability': [0.1, 0.2, 0.7], 'rain_inches': [1, 2, 3]})
    weather_df_list = [weather_df, weather_df, weather_df, weather_df]
    with multiprocessing.Pool() as pool:
        expected_values = pool.map(return_stat, weather_df_list)
    print(expected_values)
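If you prefer the standard library's futures API, an equivalent sketch (my own variant, not from the original answer, reusing return_stat and weather_df_list from above) would be:

from concurrent.futures import ProcessPoolExecutor

if __name__ == '__main__':
    # defaults to one worker per CPU core
    with ProcessPoolExecutor() as executor:
        expected_values = list(executor.map(return_stat, weather_df_list))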
Another fancy and also efficient way to solve the problem is to use Numba. It compiles Python into efficient machine code and also has a parallelization feature. However, it has no choice() variant that supports a probabilities array, hence I had to implement choice() myself. You need to install numba once through python -m pip install numba.
import pandas as pd, numpy as np
from numba import njit

@njit(parallel=True, fastmath=True)
def choices(l):
    rnds = np.random.random((len(l),))
    # hand-rolled replacement for np.random.choice with a probabilities array
    def choice(i, a, p):
        assert p.shape == a.shape
        p = p.cumsum()
        p = p / p[-1]
        r = rnds[i]
        i = np.sum((p <= r).astype(np.int64))
        return a[i]
    res = np.empty((len(l),), dtype=np.float64)
    for i in range(len(l)):
        res[i] = choice(i, l[i][:, 1], l[i][:, 0])
    return res

weather_df = pd.DataFrame({'rain_probability': [0.1, 0.2, 0.3, 0.4], 'rain_inches': [0, 1, 2, 3]})
weather_df_list = [weather_df] * 8
weather_df_arrays = [e.values[:, :2] for e in weather_df_list]
print(choices(weather_df_arrays))
You may try the Numba variant on your side and tell me how fast it is; if it is not faster than the multiprocessing variant, I have some extra ideas for improving its speed.

Creating a vector of values based off a test using a for loop

This feels like it should be a simple problem, but I am newer to Python; in R I would use a foreach loop that gave me an option to combine.
I have tried a for loop that lets me print out all the values I need, but I want them collected into a vector of values that I can use later.
from scipy.stats import gamma
import scipy.stats as stats
import numpy as np
import random

data2 = np.random.gamma(1, 2, size=500)
gammT = np.log(data2 + 1)
mean = np.mean(gammT)
sd = np.std(gammT)
a = (mean / sd)**2
b = (sd**2) / mean

for i in range(1, 100):
    gammT = random.sample(list(gammT), 500)
    gamm = np.random.gamma(a, b, size=len(gammT))
    s = stats.anderson_ksamp([gammT, gamm])
    s = s[2]
    print(s)
So I am able to print all the values I want, but I want them all gathered together in a vector of values. I have tried to append and make lists but am not able to get them together.
from scipy.stats import gamma
import scipy.stats as stats
import numpy as np
import random

data2 = np.random.gamma(1, 2, size=500)  # stand-in for data2.iScore so the example runs
gammT = np.log(data2 + 1)
mean = np.mean(gammT)
sd = np.std(gammT)
a = (mean / sd)**2
b = (sd**2) / mean

# initialize empty list
result = []
for i in range(100):
    # changed range(1,100) to range(100): you only need range(100) for 100 elements
    gammT = random.sample(list(gammT), 500)
    gamm = np.random.gamma(a, b, size=len(gammT))
    s = stats.anderson_ksamp([gammT, gamm])
    s = s[2]
    # append calculation to list
    result.append(s)
    print(s)
print(result)
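If you specifically want a NumPy vector rather than a Python list, convert the collected values at the end (my addition, not part of the answer above):

result_vector = np.array(result)
print(result_vector.shape)  # (100,)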

TabPy - No return value

I am working in TabPy inside Tableau and want to perform normal statistical calculations.
I am stuck with Cp calculation. Here is the code that I wrote -
SCRIPT_REAL("
import pandas as pd
import numpy as np
from scipy import stats
# Calculate Cp
def Cp(list,_arg2,_arg3):
arr = np.array(list)
arr = arr.ravel()
sigma = np.std(arr)
Cp = float(_arg2 - _arg3) / (6*sigma)
return Cp
",FLOAT([USL - Param]), FLOAT([LSL - Param]))
The error that I am getting is -
No Return Value
although I am clearly returning Cp. What could be the issue?
Please help.
Something like the below would solve some of the issues you're seeing: your script defines Cp() but never calls it, so the script itself produces no return value. (In TabPy, the values you pass after the script string are available inside it as _arg1, _arg2, and so on.)
I haven't checked the validity of your Cp function, or whether this would work with lists or single values.
SCRIPT_REAL("
import pandas as pd
import numpy as np
from scipy import stats
# Define Cp
def Cp(argu_1,argu_2):
arr = np.array(list)
arr = arr.ravel()
sigma = np.std(arr)
Cp_value = float(argu_1 - argu_2) / (6*sigma)
return Cp_value
# Call function with variables from Tableau, and return the Cp_value
return Cp(<Argument 1>, <Argument 2>)
",FLOAT([USL - Param]), FLOAT([LSL - Param]))

cuda code error within numbapro

import numpy
import numpy as np
from numbapro import cuda

@cuda.autojit
def foo(aryA, aryB, out):
    d_ary1 = cuda.to_device(aryA)
    d_ary2 = cuda.to_device(aryB)
    #dd = numpy.empty(10, dtype=np.int32)
    d_ary1.copy_to_host(out)

griddim = 1, 2
blockdim = 3, 4
aryA = numpy.arange(10, dtype=np.int32)
aryB = numpy.arange(10, dtype=np.int32)
out = numpy.empty(10, dtype=np.int32)
foo[griddim, blockdim](aryA, aryB, out)
Exception: Caused by input line 11:
can only get attribute from globals, complex numbers or arrays
I am new to numbapro, hints are needed!
The @cuda.autojit decorator marks and compiles foo() as a CUDA kernel. The memory transfer operations should be placed outside of the kernel. It should look like the following code:
import numpy
from numbapro import cuda

@cuda.autojit
def foo(aryA, aryB, out):
    # do something here
    i = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    out[i] = aryA[i] + aryB[i]

griddim = 1, 2
blockdim = 3, 4
aryA = numpy.arange(10, dtype=numpy.int32)
aryB = numpy.arange(10, dtype=numpy.int32)
out = numpy.empty(10, dtype=numpy.int32)

# transfer memory to the device
d_ary1 = cuda.to_device(aryA)
d_ary2 = cuda.to_device(aryB)
d_out = cuda.device_array_like(aryA)  # like numpy.empty_like() but for GPU

# launch kernel
foo[griddim, blockdim](aryA, aryB, d_out)

# transfer memory device to host
d_out.copy_to_host(out)
print(out)
I recommend that new NumbaPro users look at the examples in https://github.com/ContinuumIO/numbapro-examples.
