How to use numba.SmartArrays for vector addition? - python

I wrote the code below for vector addition using numba.SmartArray. This is the first time I have used SmartArray, and I am not sure how it is meant to be used.
The code does not work and throws the errors shown below.
import numpy as np
from numba import SmartArray,cuda, jit, uint32
li1=np.uint32([1,2,3,4])
li=np.uint32([1,2,3,4])
b=SmartArray(li,where="host",copy=True)
a=SmartArray(li1,where="host",copy=True)
c=np.uint32([1,1,1,1])
print type(li)
print type(a)
@cuda.jit('void(uint32[:],uint32[:],uint32[:])',type="gpu")
def additionG(c,a,b):
    idx=cuda.threadIdx.x+cuda.blockDim.x*cuda.blockIdx.x
    if idx< len(a):
        a[idx]=c[idx]+b[idx]
dA=cuda.to_device(a)
dB=cuda.to_device(b)
dC=cuda.to_device(c)
additionG[1, 128](c,a,b)
print a.__array__()
Errors:
<type 'numpy.ndarray'>
<class 'numba.smartarray.SmartArray'>
Traceback (most recent call last):
File "C:\Users\hp-pc\My Documents\LiClipse Workspace\cuda\blowfishgpu_smart_arrays.py", line 20, in <module>
dA=cuda.to_device(a)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devices.py", line 257, in _require_cuda_context
return fn(*args, **kws)
File "C:\Anaconda\lib\site-packages\numba\cuda\api.py", line 55, in to_device
to, new = devicearray.auto_device(obj, stream=stream, copy=copy)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 403, in auto_device
devobj.copy_to_device(obj, stream=stream)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 148, in copy_to_device
sz = min(_driver.host_memory_size(ary), self.alloc_size)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1348, in host_memory_size
s, e = host_memory_extents(obj)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1333, in host_memory_extents
return mviewbuf.memoryview_get_extents(obj)
TypeError: expected a readable buffer object

It's been a while since I posted this question, but I am still posting the answer in case someone finds it helpful in the future.
import numpy as np
from numba import SmartArray,cuda, jit, uint32,autojit
li1=np.uint32([6,7,8,9])
li=np.uint32([1,2,3,4])
a=SmartArray(li1,where='host',copy=True)
b=SmartArray(li,where="host",copy=True)
c=np.uint32([1,1,1,1])
def additionG(a,c):
    idx=cuda.threadIdx.x+cuda.blockDim.x*cuda.blockIdx.x
    if idx < len(c):
        a[idx]=a[idx]+c[idx]
    cuda.syncthreads()
bpg=1
tpb=128
dC=cuda.to_device(c)
cfunc = cuda.jit()(additionG)
cfunc[bpg, tpb](a,dC)
print a.__array__()

It looks to me like cuda.to_device doesn't handle smart arrays, which would sort of make sense, because smart arrays are supposed to do away with explicit copy management.
If my reading of the documentation is correct (I have never tried SmartArray before), you should just be able to change this
dA=cuda.to_device(a)
dB=cuda.to_device(b)
dC=cuda.to_device(c)
additionG[1, 128](c,a,b)
to just
dC=cuda.to_device(c)
additionG[1, 128](dC,a.gpu(),b.gpu())
The .gpu() method should return a GPU resident object that the kernel can understand and access.

Related

mpi4py not letting me send and receive array with dtype = object

I have a numpy array of dtype = object that I am trying to send and receive using comm.Send() and comm.Recv, but I'm running into errors and can't seem to debug it. The array I'm trying to send consists of 2 columns: 1 column of strings and 1 column of integers.
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
data_array = np.empty((100, 2), dtype=object)
data_array[:,0] = var_1
data_array[:,1] = var_2
(data_array_0, data_array_1) = np.array_split(data_array, 2)
data_array_0 = np.ascontiguousarray(data_array_0, dtype = object)
data_array_1 = np.ascontiguousarray(data_array_1, dtype = object)
if rank == 0:
    comm.Send(data_array_1, dest=1)
elif rank == 1:
    data_array_1 = np.empty([data_array_row_1, data_array_col], dtype = object)
    comm.Recv(data_array_1, source=0) # <--- the line that's causing the error
I get the following error message:
Traceback (most recent call last):
File "data_clean_parallel_1.py", line 156, in <module>
comm.Recv(data_array_1, source=0)
File "mpi4py/MPI/Comm.pyx", line 283, in mpi4py.MPI.Comm.Recv
File "mpi4py/MPI/msgbuffer.pxi", line 402, in mpi4py.MPI.message_p2p_recv
File "mpi4py/MPI/msgbuffer.pxi", line 388, in mpi4py.MPI._p_msg_p2p.for_recv
File "mpi4py/MPI/msgbuffer.pxi", line 155, in mpi4py.MPI.message_simple
File "mpi4py/MPI/msgbuffer.pxi", line 101, in mpi4py.MPI.message_basic
KeyError: 'O'
I don't really understand what's causing this issue. Is there an alternative way to send and receive string data using mpi4py?
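For NumPy arrays with dtype=object you will have to use send and recv (lowercase). The uppercase Send and Recv operate on buffer-like objects whose element type maps to an MPI datatype, and dtype=object has no such mapping, which is where the KeyError: 'O' comes from.
As stated at https://mpi4py.readthedocs.io/en/stable/overview.html:
The variants MPI.Comm.send(), MPI.Comm.recv() and MPI.Comm.sendrecv()
can communicate general Python objects.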
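The lowercase methods work on arbitrary Python objects because they serialize them with pickle before sending. The sketch below shows only that serialization step, without any MPI call, so it runs in a single process. The column contents are made-up stand-ins for var_1 and var_2, which the question does not show.

```python
import pickle

import numpy as np

# Hypothetical stand-in for the question's data:
# one column of strings and one column of integers.
data_array = np.empty((4, 2), dtype=object)
data_array[:, 0] = ["a", "b", "c", "d"]
data_array[:, 1] = [1, 2, 3, 4]

# The half that rank 0 would send to rank 1.
_, data_array_1 = np.array_split(data_array, 2)

# This is roughly what the lowercase send/recv pair does under the hood:
# pickle on the sending side, unpickle on the receiving side.
payload = pickle.dumps(data_array_1)
received = pickle.loads(payload)

print(received.dtype, received.shape)
```

With mpi4py, the corresponding calls would be comm.send(data_array_1, dest=1) on rank 0 and data_array_1 = comm.recv(source=0) on rank 1. Note that recv returns the object instead of filling a preallocated buffer, so the np.empty call on the receiving side is no longer needed.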

Keyword error when implementing Scatterv in mpi4py

I'm trying to use Scatterv to distribute parts of an array to each of my processors, but the line where I run the Scatterv call fails, and I get this error:
Traceback (most recent call last):
File "<ipython-input-16-e1f960b94347>", line 1, in <module>
comm.Scatterv([init_data, (sendcount,split)], init_data_local, root=0)
File "mpi4py/MPI/Comm.pyx", line 626, in mpi4py.MPI.Comm.Scatterv
File "mpi4py/MPI/msgbuffer.pxi", line 538, in mpi4py.MPI._p_msg_cco.for_scatter
File "mpi4py/MPI/msgbuffer.pxi", line 440, in mpi4py.MPI._p_msg_cco.for_cco_send
File "mpi4py/MPI/msgbuffer.pxi", line 266, in mpi4py.MPI.message_vector
File "mpi4py/MPI/msgbuffer.pxi", line 100, in mpi4py.MPI.message_basic
KeyError: '38w'
I have no idea what I'm doing wrong or how to fix this error. Any help would be appreciated!
EDIT: Here is a reproducible example of the code. Changing the data type of the init_data array changes the number following KeyError, but still gives the same error. My choice of '<U38' as the dtype is because that is what np.loadtxt uses when loading the array within my actual code.
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
if rank==0:
    init_data=np.ones((5187,3), dtype='<U38')
    length=len(init_data[:,0])
else:
    length=None
    init_data=None
length=comm.bcast(length, root=0)
sendcount=[]
split=[]
for r in range(size):
    split.append(r*length//size)
    if r<size-1:
        sendcount.append(length//size)
    else:
        sendcount.append(length-(r*length//size))
sendcount=tuple(sendcount)
split=tuple(split)
init_data_local=np.empty((sendcount[rank], 3),dtype=str)
comm.Scatterv([init_data, (sendcount,split)], init_data_local, root=0)
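Separately from the KeyError, the count and offset bookkeeping in the loop above can be checked without MPI. The sketch below reproduces that loop and compares it with a hypothetical helper, row_partition (not part of mpi4py), that gives the first length % size ranks one extra row. It only concerns how the rows are divided and does not address the KeyError itself; the value size=4 is an assumption chosen for illustration.

```python
def question_partition(length, size):
    """The loop from the question, reproduced unchanged for comparison."""
    sendcount, split = [], []
    for r in range(size):
        split.append(r * length // size)
        if r < size - 1:
            sendcount.append(length // size)
        else:
            sendcount.append(length - (r * length // size))
    return tuple(sendcount), tuple(split)


def row_partition(length, size):
    """Hypothetical helper: contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(length, size)
    counts = [base + (1 if r < extra else 0) for r in range(size)]
    # Each offset is the sum of all earlier counts, so the chunks tile exactly.
    displs = [sum(counts[:r]) for r in range(size)]
    return tuple(counts), tuple(displs)


q_counts, q_displs = question_partition(5187, 4)
counts, displs = row_partition(5187, 4)

print(q_counts, q_displs)
print(counts, displs)
```

For 5187 rows on 4 ranks, the question's loop yields counts that sum to 5185 rather than 5187, because the offsets r*length//size and the counts length//size do not line up. The helper's counts sum to the full length and each chunk starts where the previous one ends.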

Numba Indexing Error: TypeError: Can't index at [0] in i8*

I'm learning how to use Numba to speed up functions with jit and vectorize. I didn't have any issues with the jit version of this code, but I am getting an index error with vectorize. I suspect this question's answer is getting at the right idea that there is a type error, but I'm not confident about which direction to take in changing the indexing. Included below is the function I've been playing around with, which outputs the Fibonacci numbers up to a chosen index of the sequence. What is going wrong with the indexing, and how can I correct my code to account for it?
from numba import vectorize
import numpy as np
from timeit import timeit
@vectorize
def fib(n):
    '''
    Adjusted from:
    https://lectures.quantecon.org/py/numba.html
    https://en.wikipedia.org/wiki/Fibonacci_number
    https://www.geeksforgeeks.org/program-for-nth-fibonacci-number/
    '''
    if n == 1:
        return np.ones(1)
    elif n > 1:
        x = np.empty(n)
        x[0] = 1
        x[1] = 1
        for i in range(2,n):
            x[i] = x[i-1] + x[i-2]
        return x
    else:
        print('WARNING: Check validity of input.')
print(timeit('fib(10)', globals={'fib':fib}))
Which results in the following error output.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/instructions.py", line 619, in __init__
typ = typ.elements[i]
AttributeError: 'PointerType' object has no attribute 'elements'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/galen/Projects/myjekyllblog/test_code/quantecon_2.py", line 27, in <module>
print(timeit('fib(10)', globals={'fib':fib}))
File "/usr/lib/python3.6/timeit.py", line 233, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.6/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/dufunc.py", line 166, in _compile_for_args
return self._compile_for_argtys(tuple(argtys))
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/dufunc.py", line 188, in _compile_for_argtys
cres, actual_sig)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/ufuncbuilder.py", line 157, in _build_element_wise_ufunc_wrapper
cres.objectmode, cres)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 220, in build_ufunc_wrapper
env=envptr)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 130, in build_fast_loop_body
env=env)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 23, in _build_ufunc_loop_body
store(retval)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 126, in store
out.store_aligned(retval, ind)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 276, in store_aligned
self.context.pack_value(self.builder, self.fe_type, value, ptr)
File "/usr/local/lib/python3.6/dist-packages/numba/targets/base.py", line 482, in pack_value
dataval = self.data_model_manager[ty].as_data(builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 558, in as_data
elems = self._as("as_data", builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 530, in _as
self.get(builder, value, i)))
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 558, in as_data
elems = self._as("as_data", builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 530, in _as
self.get(builder, value, i)))
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 624, in get
name="extracted." + self._fields[pos])
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/builder.py", line 911, in extract_value
instr = instructions.ExtractValue(self.block, agg, idx, name=name)
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/instructions.py", line 622, in __init__
% (list(indices), agg.type))
TypeError: Can't index at [0] in i8*
The error occurs because you are trying to vectorize a function that is essentially not vectorizable. I think you are confusing how @jit and @vectorize work. To speed up a function, you use @jit, while @vectorize is used to create NumPy universal functions. From the official documentation:
Using vectorize(), you write your function as operating over input
scalars, rather than arrays. Numba will generate the surrounding loop
(or kernel) allowing efficient iteration over the actual inputs.
So it is essentially not possible to create a NumPy universal function with the same behavior as your Fibonacci function, because it takes a scalar and returns an array whose length depends on that scalar. The official NumPy documentation on universal functions has more detail if you are interested.
So in order to use @vectorize, you need a function that can act as a NumPy universal function, that is, one that takes scalars and returns a scalar. For your purpose of speeding up your code, you simply need to use @jit.
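To make the "scalar in, scalar out" requirement concrete, here is a minimal sketch of a Fibonacci function in the shape a universal function expects. It uses np.frompyfunc from plain NumPy instead of Numba's vectorize, so it runs without Numba installed and gives no speed-up. The name fib_scalar is made up for this illustration.

```python
import numpy as np


def fib_scalar(n):
    """Return the n-th Fibonacci number (1-indexed): one scalar in, one scalar out."""
    a, b = 1, 1
    for _ in range(n - 2):
        a, b = b, a + b
    return b


# np.frompyfunc wraps a scalar Python function as a ufunc with
# 1 input and 1 output; the result array has dtype=object.
fib_ufunc = np.frompyfunc(fib_scalar, 1, 1)

indices = np.arange(1, 11)
sequence = fib_ufunc(indices)  # the loop over `indices` is supplied by the ufunc

print(sequence)
```

The surrounding loop over the input array is generated for you, which is the part the quoted documentation describes. A function like the original fib, which builds and returns a whole array, does not fit this shape and is a better match for @jit.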

My created function won't accept an array as one of the arguments

I've written a function that takes two arguments, one for the number of dimensions and another for the number of simulations. The function does exactly what is needed (calculating the volume of a unit hypersphere). However, when I try to plot the function over a range of dimensions, it raises an error: 'list' object cannot be interpreted as an integer.
My function is the following,
def hvolume(ndim, nsim):
    ob = [np.random.uniform(0.0,1.0,(nsim, ndim))]
    ob = np.concatenate(ob)
    i = 0
    res = []
    while i <= nsim-1:
        arr = np.sqrt(np.sum(np.square(ob[i])))
        i += 1
        res.append(arr)
    N = nsim
    n = ndim
    M = len([i for i in res if i <= 1])
    return ((2**n)*M/N)
The error traceback is:
Traceback (most recent call last):
File "<ipython-input-192-4c4a2c778637>", line 1, in <module>
runfile('H:/Documents/Python Scripts/Q4ATTEMPT.py', wdir='H:/Documents/Python Scripts')
File "C:\Users\u1708511\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 668, in runfile
execfile(filename, namespace)
File "C:\Users\u1708511\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "H:/Documents/Python Scripts/Q4ATTEMPT.py", line 20, in <module>
print(hvolume(d, 2))
File "H:/Documents/Python Scripts/Q4ATTEMPT.py", line 4, in hvolume
ob = [np.random.uniform(0.0,1.0,(nsim, ndim))]
File "mtrand.pyx", line 1307, in mtrand.RandomState.uniform
File "mtrand.pyx", line 242, in mtrand.cont2_array_sc
TypeError: 'list' object cannot be interpreted as an integer
I really have no idea where to go from here, and have searched thoroughly online for how to resolve this. Unfortunately I'm a beginner with this!
Any help is appreciated.
If you simply try the first line of your function:
ob = [np.random.uniform(0.0,1.0,(nsim, ndim))]
with a list as one of the variables, like so:
[np.random.uniform(0.0,1.0,([1,2], 2))]
you will get the error:
TypeError: 'list' object cannot be interpreted as an integer
This is because uniform is looking for an integer in each position of the size tuple, not a list. You will need a for loop if you would like to handle lists.
One pattern I use for situations like this is to begin the function with a block that handles the case where the arguments are iterables. Something like this, for example:
from collections.abc import Iterable
def hvolume(ndim, nsim):
    outputs = []
    if isinstance(ndim, Iterable):
        for ndim_arg in ndim:
            outputs.append(hvolume(ndim_arg, nsim))
    if isinstance(nsim, Iterable):
        for nsim_arg in nsim:
            outputs.append(hvolume(ndim, nsim_arg))
    if len(outputs) == 0: # neither above is an Iterable
        # ... the rest of the function but it appends to outputs
    return outputs
Check the arguments you pass to hvolume. It seems that you are passing a list as either nsim or ndim, and both should be integers. That is what makes uniform raise the TypeError.
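To plot over a range of dimensions, the function can keep taking a single integer ndim and simply be called once per dimension. Below is a minimal sketch of that approach, with the inner while loop replaced by NumPy array operations. The rng parameter, the seed, the sample count, and the dims list are assumptions added for reproducibility and illustration; they are not in the original function.

```python
import numpy as np


def hvolume(ndim, nsim, rng):
    """Monte Carlo estimate of the volume of the unit hypersphere in ndim dimensions."""
    # Sample nsim points uniformly in the cube [0, 1)^ndim (one orthant).
    points = rng.uniform(0.0, 1.0, size=(nsim, ndim))
    # Count the points whose Euclidean norm is at most 1.
    inside = np.count_nonzero(np.linalg.norm(points, axis=1) <= 1.0)
    # Scale by 2**ndim because only one of the 2**ndim orthants was sampled.
    return (2 ** ndim) * inside / nsim


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
dims = [1, 2, 3, 4, 5]          # hypothetical range of dimensions to plot
volumes = [hvolume(d, 200_000, rng) for d in dims]

print(list(zip(dims, volumes)))
```

The resulting dims and volumes lists have the same length, so they can be passed directly to a plotting call such as matplotlib's plt.plot(dims, volumes).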

How do I fix the 'TypeError: hasattr(): attribute name must be string' error?

I have the following code:
import pymc as pm
from matplotlib import pyplot as plt
from pymc.Matplot import plot as mcplot
import numpy as np
from matplotlib import rc
res = [18.752, 12.450, 11.832]
v = pm.Uniform('v', 0, 20)
errors = pm.Uniform('errors', 0, 100, size = 3)
taus = 1/(errors ** 2)
mydist = pm.Normal('mydist', mu = v, tau = taus, value = res, observed = True)
model=pm.Model([mydist, errors, taus, v, res])
mcmc=pm.MCMC(model) # This is line 19 where the TypeError originates
mcmc.sample(20000,10000)
mcplot(mcmc.trace('mydist'))
For some reason it doesn't work, I get the 'TypeError: hasattr(): attribute name must be string' error, with the following trace:
Traceback (most recent call last):
File "<ipython-input-49-759ebaf4321c>", line 1, in <module>
runfile('C:/Users/Paul/.spyder2-py3/temp.py', wdir='C:/Users/Paul/.spyder2-py3')
File "C:\Users\Paul\Miniconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Users\Paul\Miniconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Paul/.spyder2-py3/temp.py", line 19, in <module>
mcmc=pm.MCMC(model)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\MCMC.py", line 82, in __init__
**kwds)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Model.py", line 197, in __init__
Model.__init__(self, input, name, verbose)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Model.py", line 99, in __init__
ObjectContainer.__init__(self, input)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Container.py", line 606, in __init__
conservative_update(self, input_to_file)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Container.py", line 549, in conservative_update
if not hasattr(obj, k):
TypeError: hasattr(): attribute name must be string
How do I make it work and output "mydist"?
Edit: I posted a wrong trace at first by accident.
Edit2: It all must be because res doesn't have a name, because it's an array, but I don't know how to assign a name to it, so it'll make this work.
I must admit that I'm not familiar with pymc, but changing it to the following at least made the application run:
mydist = pm.Normal('mydist', mu = v, tau = taus, value = res, observed = False)
mcmc=pm.MCMC([mydist, errors, taus, v, res])
This seems to be because you were wrapping everything in a Model, which is an extension of ObjectContainer. Since you passed it a list, file_items in Container.py tried to assign index 4 of the list to something using replace, but because Model is an ObjectContainer, it assigned the key 4 in its __dict__, causing the weird TypeError you got. Removing the wrapping Model caused MCMC to correctly use a ListContainer instead.
Now, there's probably a bug in Model.py on line 543, where observable stochastics aren't stored in the database. The expression is for object in self.stochastics | self.deterministics: but I suspect it should include self.observable_stochastics too. That is why I needed to change observed to False; otherwise the last line would throw a KeyError.
I'm not familiar enough with pymc to determine whether it's actually a bug or desired behaviour, so I leave it up to you to submit an issue about it.
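The TypeError itself can be reproduced without pymc. The sketch below is only an illustration of the mechanism described above, using a made-up Container class rather than pymc's own code: once an integer ends up as a key in an object's __dict__, any loop that calls hasattr with those keys fails in the same way.

```python
class Container:
    """Hypothetical stand-in for an object that stores its items in __dict__."""


container = Container()
# A list index used as a key, mirroring the integer key 4 described above.
container.__dict__[4] = "res"

error_message = None
try:
    for key in container.__dict__:
        hasattr(container, key)  # key is the int 4, not a str
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message differs slightly between Python versions, but it is the same "attribute name must be string" TypeError shown in the traceback.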
You simply need to define res as a numpy array:
res = np.array([18.752, 12.450, 11.832])
Then you'll get an error at mcmc.trace('mydist'), because mydist is observed data and is therefore not sampled. You probably want to plot other variables...
