I'm trying to run Keras/Theano using the GPU in a Jupyter notebook; my system is macOS High Sierra 10.13 with an NVIDIA GeForce GT 330M.
I followed the instructions on this site: I installed CUDA and cuDNN, then edited ~/.bash_profile. I can't use the command $ THEANO_FLAGS=mode=FAST_RUN python imdb_cnn.py since I'm using a Jupyter notebook.
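(For reference, a minimal sketch of setting the flags from inside a notebook instead; the exact flag string is an assumption, and the key point is that it must run before anything imports theano, because Theano reads THEANO_FLAGS once, at import time.)
# Minimal sketch: put this in the first notebook cell, before importing
# theano or keras; Theano reads THEANO_FLAGS at import time.
import os
os.environ['THEANO_FLAGS'] = 'mode=FAST_RUN,device=cuda,floatX=float32'

import theano  # picks up the flags set above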
Moreover, I've also edited my .theanorc file, and it looks like this:
[global]
floatX = float32
device = gpu
force_device = True
optimizer_including=cudnn
[nvcc]
fastmath = True
[cuda]
root=Users/jack/cuda
Then I tried to run this code to check whether I was using the GPU:
from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
...and I was using the CPU instead!
What should I put at the top of this code to use the GPU?
I've also tried adding this:
import os
os.environ['THEANO_FLAGS'] = "device=cuda,force_device=True,floatX=float32"
but it didn't work, because I always get this output:
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 2.618842 seconds
Result is [1.2317803 1.6187934 1.5227807 ... 2.2077181 2.2996776 1.6232328]
Used the cpu
EDIT: I tried to run the previous "test" code in the terminal using
THEANO_FLAGS=mode=FAST_RUN,device=cuda0,floatX=float32 python gpuocpu.py
I get this error:
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
File "/Users/jack/miniconda3/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
use(config.device)
File "/Users/jack/miniconda3/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 214, in use
init_dev(device, preallocate=preallocate)
File "/Users/jack/miniconda3/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
**args)
File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
I found this thread and ran set DEVICE=cuda0 and set GPUARRAY_CUDA_VERSION=80 in the terminal, but I still get the same error, plus the output saying that I'm using the CPU.
EDIT2: I re-installed CUDA (using the .dmg file, not from the terminal), followed the NVIDIA Installation Guide, and applied these tips:
Uncheck System Preferences > Energy Saver > Automatic graphics switching
Drag the Computer sleep bar to Never in System Preferences > Energy Saver
Now I get this new error: Segmentation fault: 11
>>> import pygpu
>>> pygpu.test()
pygpu is installed in /Users/jack/miniconda3/lib/python3.6/site-packages/pygpu
NumPy version 1.15.4
NumPy relaxed strides checking option: True
NumPy is installed in /Users/jack/miniconda3/lib/python3.6/site-packages/numpy
Python version 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 14:01:38) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
nose version 1.3.7
Segmentation fault: 11
I want to compile arbitrary nn modules using the JIT.
In my code I check whether a value is a dict, and that check throws an error.
"""
if type(json_data) is dict:
~~~~ <--- HERE
To Reproduce
Simple: any code that compares a value's type against dict:
class Node(object):
    def __init__(self, data=None, children=None):
        # store the parsed payload and child nodes
        self.data = data
        self.children = children

    @classmethod
    def from_json(cls, json_data):
        if type(json_data) is dict:
            node_data = next(iter(json_data))
            assert type(json_data[node_data]) is list
            node_children = [cls.from_json(child) for child in json_data[node_data]]
            return Node(node_data, node_children)
        else:
            return Node(json_data)
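(An illustrative aside, not part of the original repro: in plain Python the two checks below behave almost identically, but TorchScript only understands the isinstance form, since it cannot treat the builtin dict as a runtime value.)
# Illustrative only: the check the JIT rejects vs. the scriptable spelling.
json_data = {'root': [1, 2, 3]}

print(type(json_data) is dict)      # True, but TorchScript rejects this form
print(isinstance(json_data, dict))  # True, and TorchScript understands this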
Expected behavior
The JIT compiles my checkpoint.
Environment
PyTorch / torchvision Version (e.g., 1.0 / 0.4.0): 1.7.1
OS (e.g., Linux): mac os x
How you installed PyTorch / torchvision (conda, pip, source): conda
Build command you used (if compiling from source): conda
Python version: 3.8
CUDA/cuDNN version: CPU
GPU models and configuration: CPU
Any other relevant information: CPU
Additional context
Compiling arbitrary custom nn modules to jit
error:
/Users/brando/anaconda3/envs/coq_gym/bin/python /Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py --cmd-line --multiproc --qt-support=auto --client 127.0.0.1 --port 59213 --file /Users/brando/ML4Coq/playground/running_pytorch_ocaml/treenn2jit_ckpt.py
Connected to pydev debugger (build 203.7148.72)
1.7.1
Traceback (most recent call last):
File "/Users/brando/anaconda3/envs/coq_gym/lib/python3.7/site-packages/torch/jit/_recursive.py", line 680, in compile_unbound_method
create_methods_and_properties_from_stubs(concrete_type, (stub,), ())
File "/Users/brando/anaconda3/envs/coq_gym/lib/python3.7/site-packages/torch/jit/_recursive.py", line 304, in create_methods_and_properties_from_stubs
concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
File "/Users/brando/anaconda3/envs/coq_gym/lib/python3.7/site-packages/torch/jit/annotations.py", line 330, in try_ann_to_type
torch.jit._script._recursive_compile_class(ann, loc)
File "/Users/brando/anaconda3/envs/coq_gym/lib/python3.7/site-packages/torch/jit/_script.py", line 1056, in _recursive_compile_class
_compile_and_register_class(obj, rcb, _qual_name)
File "/Users/brando/anaconda3/envs/coq_gym/lib/python3.7/site-packages/torch/jit/_script.py", line 64, in _compile_and_register_class
torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError:
builtin cannot be used as a value:
File "/Users/brando/ML4Coq/ml4coq-proj/embeddings_zoo/extract_tactic_from_lasse_data.py", line 56
term = string
"""
if type(json_data) is dict:
~~~~ <--- HERE
node_data = next(iter(json_data))
assert type(json_data[node_data]) is list
'Node.from_json' is being compiled since it was called from '__torch__.embeddings_zoo.extract_tactic_from_lasse_data.Node'
related:
https://github.com/pytorch/vision/issues/3392
https://github.com/pytorch/vision/issues/1675
I attempted to install Spark 2.4.0 on my PC, which runs Windows 7 x64.
However, when I try to run some simple code to check whether Spark is ready to work:
code:
import os
from pyspark import SparkConf, SparkContext
conf = SparkConf().setMaster('local[*]').setAppName('word_count')
sc = SparkContext(conf=conf)
d = ['a b c d', 'b c d e', 'c d e f']
d_rdd = sc.parallelize(d)
rdd_res = d_rdd.flatMap(lambda x: x.split(' ')).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a+b)
print(rdd_res)
print(rdd_res.collect())
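(For reference, on a working installation the final collect() should return the word counts below, up to ordering.)
# Expected output of rdd_res.collect(), order may vary:
[('a', 1), ('b', 2), ('c', 3), ('d', 3), ('e', 2), ('f', 1)]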
Instead, I get this error (posted as a screenshot, "error1"):
I opened the worker.py file to check the code.
I found that in version 2.4.0 the code is:
(screenshot: worker.py v2.4.0)
However, in version 2.3.2 the code is:
(screenshot: worker.py v2.3.2)
Then I reinstalled spark-2.3.2-bin-hadoop2.7, and the code works fine.
Also, I found this question:
ImportError: No module named 'resource'
So I think spark-2.4.0-bin-hadoop2.7 cannot work on Windows 7, because worker.py imports the resource module, which is Unix-specific.
I hope someone can fix this problem in Spark.
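(A quick way to confirm that diagnosis; this minimal sketch succeeds on Linux/macOS and prints the ImportError on Windows.)
# 'resource' is a Unix-only standard-library module.
try:
    import resource
    print('resource available:', resource.RLIMIT_AS)
except ImportError as e:
    print('no resource module here:', e)  # what happens on Windows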
I got this error too, and I have Spark 2.4.0, JDK 11 and Kafka 2.11 on Windows.
I was able to resolve it by doing the following:
1) cd spark_home\python\lib, e.g. cd C:\myprograms\spark-2.4.0-bin-hadoop2.7\python\lib
2) unzip pyspark.zip
3) Edit worker.py: comment out 'import resource' and the block below, then save the file. That block only sets an optional memory limit and is not core code, so it is fine to comment it out.
4) Remove the older pyspark.zip and create a new one (a scripted version of this step is sketched after this list).
5) In the Jupyter notebook, restart the kernel.
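Step 4 can be scripted; here is a minimal sketch, assuming the paths from step 1 (adjust them to your installation):
import os
import zipfile

# Assumed layout from step 1; adjust to your installation.
lib_dir = r'C:\myprograms\spark-2.4.0-bin-hadoop2.7\python\lib'
pkg_dir = os.path.join(lib_dir, 'pyspark')  # the folder unzipped in step 2

# Rebuild pyspark.zip from the edited sources, replacing the old archive.
with zipfile.ZipFile(os.path.join(lib_dir, 'pyspark.zip'), 'w',
                     zipfile.ZIP_DEFLATED) as zf:
    for root, _dirs, files in os.walk(pkg_dir):
        for name in files:
            full = os.path.join(root, name)
            zf.write(full, os.path.relpath(full, lib_dir))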
The commented-out block in worker.py:
# set up memory limits
# memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
# total_memory = resource.RLIMIT_AS
# try:
#     if memory_limit_mb > 0:
#         (soft_limit, hard_limit) = resource.getrlimit(total_memory)
#         msg = "Current mem limits: {0} of max {1}\n".format(soft_limit, hard_limit)
#         print(msg, file=sys.stderr)
#
#         # convert to bytes
#         new_limit = memory_limit_mb * 1024 * 1024
#
#         if soft_limit == resource.RLIM_INFINITY or new_limit < soft_limit:
#             msg = "Setting mem limits to {0} of max {1}\n".format(new_limit, new_limit)
#             print(msg, file=sys.stderr)
#             resource.setrlimit(total_memory, (new_limit, new_limit))
# except (resource.error, OSError, ValueError) as e:
#     # not all systems support resource limits, so warn instead of failing
#     print("WARN: Failed to set memory limit: {0}\n".format(e), file=sys.stderr)
Python has a compatibility issue with the newly released Spark 2.4.0. I faced a similar issue. If you download and configure Spark 2.3.2 on your system (and update the environment variables), the problem will be resolved.
EDIT (begin)
After spending time with @eryksun we dug really deep and found the underlying issue. These two examples, fail1.py and fail2.py, have the same effect as the original, lengthier example I posted.
It turns out the issue has to do with copying 4-byte C ints into an 8-byte stack location without zeroing the memory first, potentially leaving garbage in the upper 4 bytes. This was confirmed using a debugger (again, props to @eryksun).
This was one of those weird bugs (a heisenbug) where you add some printf statements and the bug no longer exists.
The fix
At the top of ffi_prep_args, on the line before argp = stack;, add a call to memset(stack, 0, ecif->cif->bytes);. This zeroes the entire stack buffer.
This is the location to fix:
https://github.com/python/cpython/blob/v2.7.13/Modules/_ctypes/libffi_msvc/ffi.c#L47
fail1.py
import ctypes

kernel32 = ctypes.WinDLL('kernel32')
hStdout = kernel32.GetStdHandle(-11)  # -11 is STD_OUTPUT_HANDLE
written = ctypes.c_ulong()  # receives the number of bytes written
kernel32.WriteFile(hStdout, b'spam\n', 5, ctypes.byref(written), None)
fail2.py
import ctypes
import msvcrt
import sys
kernel32 = ctypes.WinDLL('kernel32')
hStdout = msvcrt.get_osfhandle(sys.stdout.fileno())
written = ctypes.c_ulong()
kernel32.WriteFile(hStdout, b'spam\n', 5, ctypes.byref(written), None)
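(An added aside, not part of the original post: the real fix is the memset above. As a hedged mitigation sketch, giving ctypes full prototypes makes it marshal 64-bit handles and pointers instead of default 32-bit C ints, which are the values that pick up garbage in the unzeroed upper stack bytes.)
import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

# Full-width prototypes: HANDLE and pointers are 8 bytes on win64.
kernel32.GetStdHandle.restype = wintypes.HANDLE
kernel32.GetStdHandle.argtypes = (wintypes.DWORD,)
kernel32.WriteFile.restype = wintypes.BOOL
kernel32.WriteFile.argtypes = (
    wintypes.HANDLE,                  # hFile
    ctypes.c_void_p,                  # lpBuffer
    wintypes.DWORD,                   # nNumberOfBytesToWrite
    ctypes.POINTER(wintypes.DWORD),   # lpNumberOfBytesWritten
    ctypes.c_void_p,                  # lpOverlapped
)

hStdout = kernel32.GetStdHandle(wintypes.DWORD(-11))  # STD_OUTPUT_HANDLE
written = wintypes.DWORD()
if not kernel32.WriteFile(hStdout, b'spam\n', 5, ctypes.byref(written), None):
    raise ctypes.WinError(ctypes.get_last_error())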
EDIT (end)
I built my own Python 2.7.13 for Windows because I'm using bindings I created to a 3rd party application which must be built with a specific build of Visual Studio 2012.
I started to code up some stuff using the "click" module and found an error when using click.echo with any kind of unicode echo.
Python 2.7.13 (default, Mar 27 2017, 11:11:01) [MSC v.1700 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import click
>>> click.echo('Hello World')
Hello World
>>> click.echo(u'Hello World')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\eric\my_env\lib\site-packages\click\utils.py", line 259, in echo
file.write(message)
File "C:\Users\eric\my_env\lib\site-packages\click\_winconsole.py", line 180, in write
return self._text_stream.write(x)
File "C:\Users\eric\my_env\lib\site-packages\click\_compat.py", line 63, in write
return io.TextIOWrapper.write(self, x)
File "C:\Users\eric\my_env\lib\site-packages\click\_winconsole.py", line 164, in write
raise OSError(self._get_error_message(GetLastError()))
OSError: Windows error 6
If I download and install the Python 2.7.13 64-bit installer, I don't get this issue. It echoes just fine.
I have looked into this a lot and am at a loss right now.
I'm not too familiar with Windows, Visual Studio, or ctypes.
I spent some time looking at the code path to produce the smallest file (without click) that demonstrates this problem (see below).
It produces the same "Windows error 6"... and again, this works fine with the Python installed from the 2.7.13 64-bit MSI installer.
Can someone share the process used to create the Windows installers? Is this a manual process or is it automated?
Maybe I'm missing some important switch to msbuild or something. Any help or ideas are appreciated.
I cannot use a downloaded copy of Python... it needs to be built with a specific version, update, patch, etc of Visual Studio.
All I did was:
clone cpython from GitHub and check out 2.7.13
edit some XP-specific stuff out of Tk to get it to compile on Windows Server 2003
In externals\tk-8.5.15.0\win\Makefile.in remove ttkWinXPTheme.$(OBJEXT) line
In externals\tk-8.5.15.0\win\makefile.vc remove $(TMP_DIR)\ttkWinXPTheme.obj line
In externals\tk-8.5.15.0\win\ttkWinMonitor.c remove 2 TtkXPTheme_Init lines
In PCbuild\tcltk.props change VC9 to VC11 at the bottom
PCbuild\build.bat -e -p x64 "/p:PlatformToolset=v110"
After that I created an "install" by copying .exe, .pyd, .dll files, ran get-pip.py, then python -m pip install virtualenv, then virtualenv my_env, then activated it, then did a pip install click.
But with this stripped down version you don't need pip, virtualenv or click... just ctypes.
You could probably even build it without the -e switch to build.bat.
from ctypes import byref, POINTER, py_object, pythonapi, Structure, windll
from ctypes import c_char, c_char_p, c_int, c_ssize_t, c_ulong, c_void_p

c_ssize_p = POINTER(c_ssize_t)

kernel32 = windll.kernel32
STDOUT_HANDLE = kernel32.GetStdHandle(-11)
PyBUF_SIMPLE = 0
MAX_BYTES_WRITTEN = 32767

class Py_buffer(Structure):
    _fields_ = [
        ('buf', c_void_p),
        ('obj', py_object),
        ('len', c_ssize_t),
        ('itemsize', c_ssize_t),
        ('readonly', c_int),
        ('ndim', c_int),
        ('format', c_char_p),
        ('shape', c_ssize_p),
        ('strides', c_ssize_p),
        ('suboffsets', c_ssize_p),
        ('internal', c_void_p)
    ]
    _fields_.insert(-1, ('smalltable', c_ssize_t * 2))

bites = u"Hello World".encode('utf-16-le')
bytes_to_be_written = len(bites)
buf = Py_buffer()
pythonapi.PyObject_GetBuffer(py_object(bites), byref(buf), PyBUF_SIMPLE)
buffer_type = c_char * buf.len
buf = buffer_type.from_address(buf.buf)
code_units_to_be_written = min(bytes_to_be_written, MAX_BYTES_WRITTEN) // 2
code_units_written = c_ulong()
kernel32.WriteConsoleW(STDOUT_HANDLE, buf, code_units_to_be_written,
                       byref(code_units_written), None)
bytes_written = 2 * code_units_written.value
if bytes_written == 0 and bytes_to_be_written > 0:
    raise OSError('Windows error %s' % kernel32.GetLastError())
I woke up today and all of a sudden I get
C:\Python27\lib\site-packages\pyopencl\__init__.py:61: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
"to see more.", CompilerWarning)
C:\Python27\lib\site-packages\pyopencl\cache.py:101: UserWarning: could not obtain cache lock--delete 'c:\users\User\appdata\local\temp\pyopencl-compiler-cache-v2-uiduser-py2.7.3.final.0\lock' if necessary
% self.lock_file)
when I run any sort of PyOpenCL code, e.g.:
import numpy
import pyopencl as cl
import pyopencl.array as clarray
from pyopencl.reduction import ReductionKernel

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

krnl = ReductionKernel(ctx, numpy.float32, neutral="0",
                       reduce_expr="a+b", map_expr="x[i]*y[i]",
                       arguments="__global float *x, __global float *y")

x = clarray.arange(queue, 400, dtype=numpy.float32)
y = clarray.arange(queue, 400, dtype=numpy.float32)
m = krnl(x, y).get()
The sample and part of the solution came from here.
That solution suggested rolling back numpy, which I did (from 1.8.0 to 1.7.2), but I still have the same problem.
Edit 1
Added as per suggestion
import os
os.environ['PYOPENCL_COMPILER_OUTPUT'] = '1'
C:\Python27\lib\site-packages\pyopencl\__init__.py:57: CompilerWarning: From-source build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Intel(R) HD Graphics 4000' on 'Intel(R) OpenCL' at 0x51eadff0> succeeded, but said:
fcl build 1 succeeded.
fcl build 2 succeeded.
bcl build succeeded.
warn(text, CompilerWarning)
import os
os.environ['PYOPENCL_COMPILER_OUTPUT'] = '1'
Do this to see the compiler output; I've gotten the same message before. It was just the Intel OpenCL compiler saying it had vectorized/optimized the OpenCL kernel.
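If the output really is just that informational chatter, you can also silence it; a small sketch (CompilerWarning is the class shown in the warning text above):
import warnings
from pyopencl import CompilerWarning

# Suppress pyopencl's "non-empty compiler output" warnings.
warnings.simplefilter('ignore', CompilerWarning)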
Example code:
import pycuda.autoinit
import pycuda.driver as drv
import numpy

from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
multiply_them(
    drv.Out(dest), drv.In(a), drv.In(b),
    block=(400, 1, 1), grid=(1, 1))

print dest - a * b
Results:
Traceback (most recent call last):
File "test.py", line 12, in <module>
""")
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 238, in __init__
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 223, in compile
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 149, in _find_pycuda_include_path
ImportError: No module named pycuda
Sounds simple enough, so let's test this.
Python 2.7.1 (r271:86832, Feb 17 2011, 14:13:40)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycuda
>>> pycuda
<module 'pycuda' from '/home/abolster/lib/python2.7/site-packages/pycuda-0.94.2-py2.7-linux-x86_64.egg/pycuda/__init__.pyc'>
>>>
OK, that's weird...
Long story short: even stepping through the file line by line in the Python console, nothing goes wrong until the actual execution of the mod = SourceModule(...) line.
(Final Traceback, I promise)
/home/abolster/lib/python2.7/site-packages/pycuda-0.94.2-py2.7-linux-x86_64.egg/pycuda/compiler.pyc in _find_pycuda_include_path()
147 def _find_pycuda_include_path():
148 from imp import find_module
--> 149 file, pathname, descr = find_module("pycuda")
150
151 # Who knew Python installation is so uniform and predictable?
ImportError: No module named pycuda
So it looks like pycuda is getting different include dirs than the runtime Python, which shouldn't happen (as I understand it).
Any ideas? (Sorry for the long question)
Talonmies brought up a point about nvcc not being found; unless Python is getting its environment variables from somewhere I can't think of, there's no reason it shouldn't be found:
[bolster#dellgpu src]$ which nvcc
~/cuda/bin/nvcc
Changing to Python 2.6 and reinstalling relevant modules fixed the problem for the OP.
There is nothing wrong with the code you are trying to run - it should work. My guess is that nvcc cannot be found. Make sure that the path to the nvcc executable is set in your environment before you try using pycuda.compiler.
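A minimal sketch of that suggestion, assuming nvcc lives in ~/cuda/bin as the question's `which nvcc` output shows:
import os

# Prepend the CUDA bin directory so pycuda's compiler can find nvcc.
os.environ['PATH'] = os.pathsep.join(
    [os.path.expanduser('~/cuda/bin'), os.environ.get('PATH', '')])

from pycuda.compiler import SourceModule  # nvcc should now be discoverable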
I think you did not install the CUDA toolkit from NVIDIA and add /usr/local/cuda/lib/ to LD_LIBRARY_PATH.
Find the .so of the pycuda module and give us the output of:
> ldd pycuda.so