PyCuda: Can import module, then I can't... (PyCUDA Samples)

Example code:
import pycuda.autoinit
import pycuda.driver as drv
import numpy
from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] * b[i];
}
""")
multiply_them = mod.get_function("multiply_them")
a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)
dest = numpy.zeros_like(a)
multiply_them(
    drv.Out(dest), drv.In(a), drv.In(b),
    block=(400, 1, 1), grid=(1, 1))
print dest-a*b
Results:
Traceback (most recent call last):
File "test.py", line 12, in <module>
""")
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 238, in __init__
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 223, in compile
File "build/bdist.linux-x86_64/egg/pycuda/compiler.py", line 149, in _find_pycuda_include_path
ImportError: No module named pycuda
Sounds simple enough, so let's test this.
Python 2.7.1 (r271:86832, Feb 17 2011, 14:13:40)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycuda
>>> pycuda
<module 'pycuda' from '/home/abolster/lib/python2.7/site-packages/pycuda-0.94.2-py2.7-linux-x86_64.egg/pycuda/__init__.pyc'>
>>>
OK, that's weird...
Long story short, even stepping through the file line by line in the Python console, nothing goes wrong until the actual execution of the mod = SourceModule(...) line.
(Final Traceback, I promise)
/home/abolster/lib/python2.7/site-packages/pycuda-0.94.2-py2.7-linux-x86_64.egg/pycuda/compiler.pyc in _find_pycuda_include_path()
147 def _find_pycuda_include_path():
148 from imp import find_module
--> 149 file, pathname, descr = find_module("pycuda")
150
151 # Who knew Python installation is so uniform and predictable?
ImportError: No module named pycuda
So it looks like pycuda is resolving different include dirs than the runtime Python, which shouldn't happen (as I understand it).
Any ideas? (Sorry for the long question)
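For context, the old imp.find_module only scans plain directories on sys.path, so a package installed as a zipped .egg (as pycuda is here) can be importable through the full import machinery yet invisible to find_module. A rough sketch of the difference (naive_find is hypothetical, mimicking find_module's directory scan):

```python
import importlib.util
import os
import sys

def naive_find(name):
    """Mimic imp.find_module: scan sys.path entries as plain directories.
    Zip/egg entries are skipped, so a package living inside an .egg is
    'not found' here even though a normal import succeeds."""
    for entry in sys.path:
        if not os.path.isdir(entry):
            continue  # an .egg zipfile fails this test -- the pycuda case
        pkg = os.path.join(entry, name)
        if os.path.isfile(os.path.join(pkg, '__init__.py')):
            return pkg
        if os.path.isfile(pkg + '.py'):
            return pkg + '.py'
    return None

# The full import machinery also handles zip archives and import hooks:
spec = importlib.util.find_spec('json')
print(spec is not None)           # True: 'json' is importable
print(naive_find('no_such_pkg'))  # None: the naive scan can miss things
```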
Talonmies brought up a point about nvcc not being found; unless Python is getting its environment variables from somewhere I can't think of, there's no reason it shouldn't be found:
[bolster#dellgpu src]$ which nvcc
~/cuda/bin/nvcc

Changing to Python 2.6 and reinstalling relevant modules fixed the problem for the OP.

There is nothing wrong with the code you are trying to run - it should work. My guess is that nvcc cannot be found. Make sure that the path to the nvcc executable is set in your environment before you try using pycuda.compiler.
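Note that `which nvcc` in a shell only proves the shell's environment; the Python process compiling the module may see a different PATH. A quick in-process check (a sketch, nothing pycuda-specific):

```python
import os
import shutil

# Ask, from inside the same interpreter that will run pycuda, whether
# nvcc is visible on this process's PATH.
print(shutil.which('nvcc'))  # a path if found, None otherwise

# And inspect the first few PATH entries this process actually sees:
print(os.environ.get('PATH', '').split(os.pathsep)[:3])
```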

I think you did not install the CUDA toolkit from NVIDIA, or did not add
/usr/local/cuda/lib/
to
LD_LIBRARY_PATH
Find the .so of the pycuda module and give us the output of:
$ ldd pycuda.so


Is it possible to specify the pickle protocol when writing pandas to HDF5?

Is there a way to tell Pandas to use a specific pickle protocol (e.g. 4) when writing an HDF5 file?
Here is the situation (much simplified):
Client A is using python=3.8.1 (as well as pandas=1.0.0 and pytables=3.6.1). A writes some DataFrame using df.to_hdf(file, key).
Client B is using python=3.7.1 (and, as it happened, pandas=0.25.1 and pytables=3.5.2 --but that's irrelevant). B tries to read the data written by A using pd.read_hdf(file, key), and fails with ValueError: unsupported pickle protocol: 5.
Mind you, this doesn't happen with a purely numerical DataFrame (e.g. pd.DataFrame(np.random.normal(size=(10, 10)))). So here is a reproducible example:
(base) $ conda activate py38
(py38) $ python
Python 3.8.1 (default, Jan 8 2020, 22:29:32)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> df = pd.DataFrame(['hello', 'world'])
>>> df.to_hdf('foo', 'x')
>>> exit()
(py38) $ conda deactivate
(base) $ python
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> df = pd.read_hdf('foo', 'x')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 407, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 782, in select
return it.get_result()
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 1639, in get_result
results = self.func(self.start, self.stop, where)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 766, in func
return s.read(start=_start, stop=_stop, where=_where, columns=columns)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 3206, in read
"block{idx}_values".format(idx=i), start=_start, stop=_stop
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 2737, in read_array
ret = node[0][start:stop]
File "/opt/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 681, in __getitem__
return self.read(start, stop, step)[0]
File "/opt/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 825, in read
outlistarr = [atom.fromarray(arr) for arr in listarr]
File "/opt/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 825, in <listcomp>
outlistarr = [atom.fromarray(arr) for arr in listarr]
File "/opt/anaconda3/lib/python3.7/site-packages/tables/atom.py", line 1227, in fromarray
return six.moves.cPickle.loads(array.tostring())
ValueError: unsupported pickle protocol: 5
>>>
Note: I tried also reading using pandas=1.0.0 (and pytables=3.6.1) in python=3.7.4. That fails too, so I believe it is simply the Python version (3.8 writer vs 3.7 reader) that causes the problem. This makes sense since pickle protocol 5 was introduced as PEP-574 for Python 3.8.
PyTables uses the highest protocol by default, which is hardcoded here: https://github.com/PyTables/PyTables/blob/50dc721ab50b56e494a5657e9c8da71776e9f358/tables/atom.py#L1216
As a workaround, you can monkey-patch the pickle module on client A, the one writing the HDF file. Do this before importing pandas:
import pickle
pickle.HIGHEST_PROTOCOL = 4
import pandas
df.to_hdf(file, key)
Now the HDF file has been created using pickle protocol version 4 instead of version 5.
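To check which protocol an existing pickle payload was written with: for protocol 2 and above, the first opcode of the stream records the protocol number. A small sketch using the stdlib pickletools module:

```python
import pickle
import pickletools

data = pickle.dumps(['hello', 'world'], protocol=4)

# For protocol >= 2 the stream starts with a PROTO opcode carrying the
# version number; a reader older than that protocol raises
# "unsupported pickle protocol".
opcode, arg, _pos = next(pickletools.genops(data))
print(opcode.name, arg)  # PROTO 4
```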
Update: I was wrong to assume this was not possible. In fact, based on the excellent monkey-patch suggestion of @PiotrJurkiewicz, here is a simple context manager that lets us temporarily change the highest pickle protocol. It:
Hides the monkey-patching, and
Has no side-effect outside of the context; it can be used at any time, whether pickle was previously imported or not, before or after pandas, no matter.
Here is the code (e.g. in a file pickle_prot.py):
import importlib
import pickle

class PickleProtocol:
    def __init__(self, level):
        self.previous = pickle.HIGHEST_PROTOCOL
        self.level = level

    def __enter__(self):
        importlib.reload(pickle)
        pickle.HIGHEST_PROTOCOL = self.level

    def __exit__(self, *exc):
        importlib.reload(pickle)
        pickle.HIGHEST_PROTOCOL = self.previous

def pickle_protocol(level):
    return PickleProtocol(level)
Usage example in a writer:
import pandas as pd
from pickle_prot import pickle_protocol
pd.DataFrame(['hello', 'world']).to_hdf('foo_0.h5', 'x')
with pickle_protocol(4):
    pd.DataFrame(['hello', 'world']).to_hdf('foo_1.h5', 'x')
pd.DataFrame(['hello', 'world']).to_hdf('foo_2.h5', 'x')
And, using a simple test reader:
import pandas as pd
from glob import glob
for filename in sorted(glob('foo_*.h5')):
    try:
        df = pd.read_hdf(filename, 'x')
        print(f'could read {filename}')
    except Exception as e:
        print(f'failed on {filename}: {e}')
Now, trying to read in py37 after having written in py38, we get:
failed on foo_0.h5: unsupported pickle protocol: 5
could read foo_1.h5
failed on foo_2.h5: unsupported pickle protocol: 5
But, using the same version (37 or 38) to read and write, we of course get no exception.
Note: issue 33087 is still open on the pandas issue tracker.
I'm (was) facing the same problem... I "know" how to solve it and I think you do too...
The solution is to re-export the whole data set to a pickle (or CSV) and then re-import it in Python 3.7, writing it back to an HDF5 file (which then only knows protocol 4).
The flow is something like this:
python3.8 -> hdf5 -> python3.8 -> csv/pickle -> python3.7 -> hdf5 (compatible with both versions)
I avoided this route because I have chunks of a DataFrame being exported, creating a large number of files.
Are you actually limited to Python 3.7? I was limited by TensorFlow, which as of now only officially supports up to 3.7, but you can install the TensorFlow nightly build and it works with Python 3.8.
Check if you can make the move to 3.8; that would surely solve your problem. :)

NameError: name 'sv' is not defined while importing python module

I am facing a problem in Python. Though the error is quite common, since I am a bit new to Python I am unable to comprehend the source, hence asking you all. There are two modules: session.py and objects.py.
session.py
import copy
import pymongo
import spacy
import tweepy
import objects
objects.py:
import re
def refresh(sv=sv, obj=''):
    return 0
Now, in the Python shell, I am getting the error before even executing objects.py:
$ python
Python 2.7.13 (default, Sep 26 2018, 18:42:22)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import session
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "session.py", line 6, in <module>
import objects
File "objects.py", line 3, in <module>
def refresh (sv = sv, obj = ''):
NameError: name 'sv' is not defined
>>>
I come from a Perl background, so I may be missing something very common, but I am still able to do this:
>>> def ff(t): print t
...
In the above, without defining t, it works; so in objects.py, how can I define sv without starting execution?
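For what it's worth, the difference is that def ff(t) only names a parameter, while sv=sv evaluates the name sv at the moment the def statement runs, i.e. at import time. The usual workaround is a sentinel default resolved inside the body (a sketch; what sv should actually default to is unknown here, so the empty dict is hypothetical):

```python
# Default values are evaluated once, when `def` executes at import time,
# so every name used in a default must already exist at that point.
# A parameter name like `t` in `def ff(t)` is only bound at call time.
def refresh(sv=None, obj=''):
    if sv is None:
        sv = {}  # hypothetical stand-in for the real default
    return sv, obj

print(refresh())  # ({}, '')
```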

Why do I get a "symbol not found" for a found symbol in Pykd?

I'm working on a dump which I'm trying to investigate using PYKD.
The output of x /2 *!*`vtable' contains the following line:
745293b8 mfc110u!CPtrList::`vftable'
However, when I try to get more information about this class, I get a "symbol not found" exception:
Python source code:
dprintln("name=[%s]" % type_stats.name)
if not type_stats.name in typesize_by_type:
try:
type_info = typeInfo(type_stats.name)
except Exception, e:
dprintln("text=[%s]" % (str(e)))
Output:
name=[mfc110u!CPtrList]
text=['CPtrList' - symbol not found]
The result of the lm command returns the mfc110u symbols, as you can see here:
0:000> lm
start end module name
...
744f0000 74930000 mfc110u (pdb symbols) C:\ProgramData\dbg\sym\mfc110u.i386.pdb\...\mfc110u.i386.pdb
...
For your information, I'm now working with the last version of PYKD:
0:000> .chain
Extension DLL search Path:
...
Extension DLL chain:
pykd.pyd: image 0.3.3.4, API 1.0.0, built Mon May 14 11:14:43 2018
[path: C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\winext\pykd.pyd]
Meanwhile I've found a very simple way of reproducing the issue without needing to launch the whole script (using the WinDbg prompt):
0:000> !py
Python 2.7.10 (default, May 23 2015, 09:40:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> typeInfo("mfc110u!CPtrList")
Traceback (most recent call last):
File "<console>", line 1, in <module>
SymbolException: 'CPtrList' - symbol not found
In addition to ussrhero's answer, there is the following extra information:
The result of x /2 *!CPtrList* contains (amongst many others) the following results:
009530c4 <application>!CPtrList::~CPtrList
009530be <application>!CPtrList::CPtrList
... <application>!CPtrList::...
009abc5c <application>!CPtrList::`RTTI Base Class Array'
009abc4c <application>!CPtrList::`RTTI Class Hierarchy Descriptor'
009bcd18 <application>!CPtrList `RTTI Type Descriptor'
009abc30 <application>!CPtrList::`RTTI Base Class Descriptor at (0,-1,0,64)'
7464e9cb mfc110u!CPtrList::~CPtrList
74544a04 mfc110u!CPtrList::CPtrList
... mfc110u!CPtrList::...
745293b8 mfc110u!CPtrList::`vftable'
747510da mfc110u!CPtrList::`vector deleting destructor'
745293cc mfc110u!CPtrList::`RTTI Complete Object Locator'
7452940c mfc110u!CPtrList::`RTTI Base Class Array'
745293fc mfc110u!CPtrList::`RTTI Class Hierarchy Descriptor'
74795778 mfc110u!CPtrList `RTTI Type Descriptor'
745293e0 mfc110u!CPtrList::`RTTI Base Class Descriptor at (0,-1,0,64)'
746fdc68 mfc110u!CPtrList::classCPtrList
The script I'm using (heap_stat.py) browses through the results of !heap -h 0 and searches for the entry corresponding with mfc110u!CPtrList::`vftable'.
The result of dt CPtrList starts with the following:
0:000> dt CPtrList
<application>!CPtrList => in other words, no 'mfc110u' entry
+0x000 __VFN_table : Ptr32
I've been wondering for a long time: what's the difference between the mfc110u!CPtrList and <application>!CPtrList entries, and what's the exact role of the vtable entry in the x /2 result?
Any ideas?
Thanks
Meanwhile I've found the solution:
Apparently for some objects, the module prefix needs to be removed:
>>> typeInfo("mfc110u!CPtrList")
-> SymbolException
>>> typeInfo("CPtrList")
-> this is working!!!
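A defensive lookup that tries both forms can paper over the quirk. This is only a sketch: resolve_type and the fake symbol table below are hypothetical, with the injected lookup standing in for pykd.typeInfo:

```python
def resolve_type(lookup, name, modules=('mfc110u',)):
    """Try module-qualified names first, then the bare name; any
    exception from `lookup` is treated as 'symbol not found'."""
    for cand in ['%s!%s' % (m, name) for m in modules] + [name]:
        try:
            return cand, lookup(cand)
        except Exception:
            continue
    raise KeyError('%s - symbol not found' % name)

# Simulated symbol table that, like the question, only knows the bare name:
fake = {'CPtrList': 'type-info'}
print(resolve_type(fake.__getitem__, 'CPtrList'))  # ('CPtrList', 'type-info')
```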
Try to see how WinDBG locates this type:
dt CPtrList
Maybe mfc110u does not contain type information for CPtrList.

Windows / ctypes issue with custom build

EDIT (begin)
After spending time with @eryksun we dug really deep and found the underlying issue. These two examples, fail1.py and fail2.py, have the same effect as the original, lengthier example I posted.
It turns out the issue has to do with copying 4-byte C ints into an 8-byte stack location without zeroing out the memory first, potentially leaving garbage in the upper 4 bytes. This was confirmed using a debugger (again, props to @eryksun).
This was one of those weird bugs (a heisenbug) where you can add some printf statements and then the bug doesn't exist any more.
the fix
At the top of ffi_prep_args, on the line before argp = stack;, add a call to memset(stack, 0, ecif->cif->bytes);. This will zero the entire stack.
This is the location to fix:
https://github.com/python/cpython/blob/v2.7.13/Modules/_ctypes/libffi_msvc/ffi.c#L47
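The effect of the missing memset can be simulated from plain Python with ctypes. This is just an illustration, assuming a little-endian machine; the 8-byte buffer plays the role of the un-zeroed stack slot:

```python
import ctypes

# An 8-byte "stack slot" full of garbage:
slot = ctypes.create_string_buffer(b'\xff' * 8)

# Copy only a 4-byte int into it, as ffi_prep_args did:
ctypes.memmove(slot, ctypes.byref(ctypes.c_int32(5)), 4)
dirty = ctypes.cast(slot, ctypes.POINTER(ctypes.c_uint64)).contents.value
print(hex(dirty))  # 0xffffffff00000005 -- garbage left in the upper 4 bytes

# The fix: zero the slot first, which is what the added memset does:
ctypes.memset(slot, 0, 8)
ctypes.memmove(slot, ctypes.byref(ctypes.c_int32(5)), 4)
clean = ctypes.cast(slot, ctypes.POINTER(ctypes.c_uint64)).contents.value
print(clean)  # 5
```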
fail1.py
import ctypes
kernel32 = ctypes.WinDLL('kernel32')
hStdout = kernel32.GetStdHandle(-11)
written = ctypes.c_ulong()
kernel32.WriteFile(hStdout, b'spam\n', 5, ctypes.byref(written), None)
fail2.py
import ctypes
import msvcrt
import sys
kernel32 = ctypes.WinDLL('kernel32')
hStdout = msvcrt.get_osfhandle(sys.stdout.fileno())
written = ctypes.c_ulong()
kernel32.WriteFile(hStdout, b'spam\n', 5, ctypes.byref(written), None)
EDIT (end)
I built my own Python 2.7.13 for Windows because I'm using bindings I created to a 3rd party application which must be built with a specific build of Visual Studio 2012.
I started to code up some stuff using the "click" module and found an error when using click.echo with any kind of unicode output.
Python 2.7.13 (default, Mar 27 2017, 11:11:01) [MSC v.1700 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import click
>>> click.echo('Hello World')
Hello World
>>> click.echo(u'Hello World')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\eric\my_env\lib\site-packages\click\utils.py", line 259, in echo
file.write(message)
File "C:\Users\eric\my_env\lib\site-packages\click\_winconsole.py", line 180, in write
return self._text_stream.write(x)
File "C:\Users\eric\my_env\lib\site-packages\click\_compat.py", line 63, in write
return io.TextIOWrapper.write(self, x)
File "C:\Users\eric\my_env\lib\site-packages\click\_winconsole.py", line 164, in write
raise OSError(self._get_error_message(GetLastError()))
OSError: Windows error 6
If I download and install the Python 2.7.13 64-bit installer I don't get this issue. It echoes just fine.
I have looked into this a lot and am at a loss right now.
I'm not too familiar with Windows, Visual Studio, or ctypes.
I spent some time looking at the code path to produce the smallest file (without click) which demonstrates this problem (see below)
It produces the same "Windows error 6"... again, this works fine with the python installed from the 2.7.13 64 bit MSI installer.
Can someone share the process used to create the Windows installers? Is this a manual process or is it automated?
Maybe I'm missing some important switch to msbuild or something. Any help or ideas are appreciated.
I cannot use a downloaded copy of Python... it needs to be built with a specific version, update, patch, etc of Visual Studio.
All I did was
clone cpython from github and checkout 2.7.13
edit some xp stuff out of tk stuff to get it to compile on Windows Server 2003
In externals\tk-8.5.15.0\win\Makefile.in remove ttkWinXPTheme.$(OBJEXT) line
In externals\tk-8.5.15.0\win\makefile.vc remove $(TMP_DIR)\ttkWinXPTheme.obj line
In externals\tk-8.5.15.0\win\ttkWinMonitor.c remove 2 TtkXPTheme_Init lines
In PCbuild\tcltk.props change VC9 to VC11 at the bottom
PCbuild\build.bat -e -p x64 "/p:PlatformToolset=v110"
After that I created an "install" by copying .exe, .pyd, .dll files, ran get-pip.py, then python -m pip install virtualenv, then virtualenv my_env, then activated it, then did a pip install click.
But with this stripped down version you don't need pip, virtualenv or click... just ctypes.
You could probably even build it without the -e switch to build.bat.
from ctypes import byref, POINTER, py_object, pythonapi, Structure, windll
from ctypes import c_char, c_char_p, c_int, c_ssize_t, c_ulong, c_void_p
c_ssize_p = POINTER(c_ssize_t)
kernel32 = windll.kernel32
STDOUT_HANDLE = kernel32.GetStdHandle(-11)
PyBUF_SIMPLE = 0
MAX_BYTES_WRITTEN = 32767
class Py_buffer(Structure):
    _fields_ = [
        ('buf', c_void_p),
        ('obj', py_object),
        ('len', c_ssize_t),
        ('itemsize', c_ssize_t),
        ('readonly', c_int),
        ('ndim', c_int),
        ('format', c_char_p),
        ('shape', c_ssize_p),
        ('strides', c_ssize_p),
        ('suboffsets', c_ssize_p),
        ('internal', c_void_p)
    ]
    _fields_.insert(-1, ('smalltable', c_ssize_t * 2))
bites = u"Hello World".encode('utf-16-le')
bytes_to_be_written = len(bites)
buf = Py_buffer()
pythonapi.PyObject_GetBuffer(py_object(bites), byref(buf), PyBUF_SIMPLE)
buffer_type = c_char * buf.len
buf = buffer_type.from_address(buf.buf)
code_units_to_be_written = min(bytes_to_be_written, MAX_BYTES_WRITTEN) // 2
code_units_written = c_ulong()
kernel32.WriteConsoleW(STDOUT_HANDLE, buf, code_units_to_be_written, byref(code_units_written), None)
bytes_written = 2 * code_units_written.value
if bytes_written == 0 and bytes_to_be_written > 0:
    raise OSError('Windows error %s' % kernel32.GetLastError())

Python os.environ throws key error?

I'm accessing an environment variable in a script with os.environ.get and it's throwing a KeyError. It doesn't throw the error from the Python prompt. This is running on OS X 10.11.6, and is Python 2.7.10.
What is going on?
$ python score.py
Traceback (most recent call last):
File "score.py", line 4, in <module>
setup_logging()
File "/score/log.py", line 29, in setup_logging
config = get_config()
File "/score/log.py", line 11, in get_config
environment = os.environ.get('NODE_ENV')
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/UserDict.py", line 23, in __getitem__
raise KeyError(key)
KeyError: 'NODE_ENV'
$ python -c "import os; os.environ.get('NODE_ENV')"
$
As requested, here's the source code for score.py
from __future__ import print_function
from log import get_logger, setup_logging
setup_logging()
log = get_logger('score')
And here's log.py
import json
import os
import sys
from iron_worker import IronWorker
from logbook import Logger, Processor, NestedSetup, StderrHandler, SyslogHandler
IRON_IO_TASK_ID = IronWorker.task_id()
def get_config():
environment = os.environ.get('NODE_ENV')
if environment == 'production':
filename = '../config/config-production.json'
elif environment == 'integration':
filename = '../config/config-integration.json'
else:
filename = '../config/config-dev.json'
with open(filename) as f:
return json.load(f)
def setup_logging():
# This defines a remote Syslog handler
# This will include the TASK ID, if defined
app_name = 'scoreworker'
if IRON_IO_TASK_ID:
app_name += '-' + IRON_IO_TASK_ID
config = get_config()
default_log_handler = NestedSetup([
StderrHandler(),
SyslogHandler(
app_name,
address = (config['host'], config['port']),
level = 'ERROR',
bubble = True
)
])
default_log_handler.push_application()
def get_logger(name):
return Logger(name)
Try running:
find . -name \*.pyc -delete
To delete your .pyc files.
Researching your problem I came across this question, where a user was experiencing the same thing: .get() seemingly raising a KeyError. In that case it was caused, according to the accepted answer, by a stale .pyc file containing code where a dict value was accessed by key (i.e., mydict['potentially_nonexistent_key']), while the traceback showed the code from the updated .py file where .get() was used. I had never heard of this happening, where the traceback references current code from a .py file but shows an error raised by an outdated .pyc file, but it seems to have happened at least once in the history of Python...
It is a long shot, but worth a try I thought.
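As a sanity check, os.environ.get genuinely cannot raise KeyError for a missing key, which is what makes the stale-.pyc theory plausible. A quick demonstration (the variable name is arbitrary):

```python
import os

MISSING = 'SURELY_NOT_SET_xyz123'
os.environ.pop(MISSING, None)  # make sure the key really is absent

# .get() returns a default instead of raising:
print(os.environ.get(MISSING))         # None
print(os.environ.get(MISSING, 'dev'))  # dev

# Only subscript access raises -- which is what compiled-but-stale
# bytecode doing os.environ[...] would produce:
try:
    os.environ[MISSING]
except KeyError as e:
    print('KeyError:', e)
```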
I encountered a similar error when I set the environment variable without exporting it. So if you do this:
me@host:/# NODE_ENV=foo
You will get this:
me@host:/# python3
Python 3.8.2 (default, Apr 27 2020, 15:53:34)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> node_env = os.environ['NODE_ENV']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'NODE_ENV'
>>>
But if you do this:
me@host:/# NODE_ENV=foo
me@host:/# export NODE_ENV
It works:
me@host:/# python3
Python 3.8.2 (default, Apr 27 2020, 15:53:34)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> node_env = os.environ['NODE_ENV']
>>> print(node_env)
foo
>>>
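The same behaviour can be seen without a shell: a child process only receives variables that are present in its environment mapping, which is exactly what export arranges. A sketch using subprocess:

```python
import os
import subprocess
import sys

child = [sys.executable, '-c', "import os; print(os.environ.get('NODE_ENV'))"]

# Not exported: build a child environment without the variable.
env_without = {k: v for k, v in os.environ.items() if k != 'NODE_ENV'}
out1 = subprocess.run(child, env=env_without,
                      capture_output=True, text=True).stdout.strip()
print(out1)  # None

# Exported: the variable is part of the environment handed to the child.
env_with = dict(env_without, NODE_ENV='foo')
out2 = subprocess.run(child, env=env_with,
                      capture_output=True, text=True).stdout.strip()
print(out2)  # foo
```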
Command for windows to delete the .pyc files:
del /S *.pyc
I had the same problem. I solved it by making some corrections to the .env file:
Before:
Key = Value
After my correction:
Key=Value
Without the blank spaces around =, it worked!
I was getting this error while trying to source from a .env file.
I didn't explicitly export the env vars, so I had to change this:
ENVIRONMENT=DEV
to this
export ENVIRONMENT=DEV
Use export a=10 instead of a=10 when setting an environment variable. Add the same line to ~/.bashrc so the variable is set whenever you log in.
Doing this resolved the issue.
I'd recommend you start by debugging os.py; for instance, on Windows this implementation is used:
def get(self, key, failobj=None):
    print self.data.__class__
    print key
    return self.data.get(key.upper(), failobj)
And if I test it with this:
import os

try:
    os.environ.get('NODE_ENV')
except Exception as e:
    print("-->{0}".format(e.__class__))

os.environ['NODE_ENV'] = "foobar"

try:
    os.environ.get('NODE_ENV')
except Exception as e:
    print("{0}".format(e.__class__))
The output will be:
<type 'dict'>
PYTHONUSERBASE
<type 'dict'>
APPDATA
<type 'dict'>
NODE_ENV
<type 'dict'>
NODE_ENV
So it makes sense that the exception is not raised; see the dict.get docs.
In any case, if you don't want to mess with or debug the Python modules, try cleaning up the *.pyc files and setting NODE_ENV properly. If all that doesn't work, restart your terminal.
