I am looking for a simple way (2 or 3 lines of code) to generate a Phi(k) correlation matrix in Python.
This should be possible, since pandas_profiling does it and it works fine.
But I want to do it without pandas_profiling, which is too heavy and computes things I don't need.
pandas_profiling uses the phik library, so I tried phik directly (I didn't find anything else).
I don't understand the error I get:
TypeError: sequence item 0: expected str instance, int found
I have no int in my dataframe.
This looks like a bug in phik, but then how does pandas_profiling manage, since it uses phik too?
What's happening here?
Many thanks
I have this code:
import numpy as np
import pandas as pd
import phik
NB_SAMPLES = 200
NB_VARIABLES = 3
rand_mat = np.random.uniform(low=0.5, high=15, size=(NB_SAMPLES,NB_VARIABLES))
df = pd.DataFrame(rand_mat)
df['cat_column'] = pd.cut(df[0], bins=5, labels=['F1','F2','F3','F4','F5'])
print(df)
df.phik_matrix()
Result :
0 1 2 cat_column
0 0.911098 8.549206 9.270484 F1
1 13.591250 9.161498 5.614470 F5
2 3.308305 1.589402 5.394675 F1
3 12.031064 9.968686 7.519628 F5
4 14.427813 1.533533 2.352659 F5
.. ... ... ... ...
195 10.556285 3.541869 4.804826 F4
196 5.721784 11.783908 13.104844 F2
197 7.336637 14.512256 14.993096 F3
198 4.375895 11.881784 1.129816 F2
199 0.519900 6.624423 9.239070 F1
[200 rows x 4 columns]
interval_cols not set, guessing: [0, 1, 2]
---------------------------------------------------------------------------
_RemoteTraceback Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
r = call_item()
File "/opt/conda/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
return self.fn(*self.args, **self.kwargs)
File "/opt/conda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 608, in __call__
return self.func(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 256, in __call__
for func, args, kwargs in self.items]
File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 256, in <listcomp>
for func, args, kwargs in self.items]
File "/opt/conda/lib/python3.7/site-packages/phik/phik.py", line 162, in _calc_phik
combi = ':'.join(comb)
TypeError: sequence item 0: expected str instance, int found
"""
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-31-398c72b34799> in <module>
11 df['cat_column'] = pd.cut(df[0], bins=5, labels=['F1','F2','F3','F4','F5'])
12 print(df)
---> 13 df.phik_matrix()
/opt/conda/lib/python3.7/site-packages/phik/phik.py in phik_matrix(df, interval_cols, bins, quantile, noise_correction, dropna, drop_underflow, drop_overflow)
215 data_binned, binning_dict = bin_data(df_clean, cols=interval_cols_clean, bins=bins, quantile=quantile, retbins=True)
216 return phik_from_rebinned_df(data_binned, noise_correction, dropna=dropna, drop_underflow=drop_underflow,
--> 217 drop_overflow=drop_overflow)
218
219
/opt/conda/lib/python3.7/site-packages/phik/phik.py in phik_from_rebinned_df(data_binned, noise_correction, dropna, drop_underflow, drop_overflow)
145
146 phik_list = Parallel(n_jobs=NCORES)(delayed(_calc_phik)(co, data_binned[list(co)], noise_correction)
--> 147 for co in itertools.combinations_with_replacement(data_binned.columns.values, 2))
148
149 phik_overview = create_correlation_overview_table(dict(phik_list))
/opt/conda/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
1015
1016 with self._backend.retrieval_context():
-> 1017 self.retrieve()
1018 # Make sure that we get a last message telling us we are done
1019 elapsed_time = time.time() - self._start_time
/opt/conda/lib/python3.7/site-packages/joblib/parallel.py in retrieve(self)
907 try:
908 if getattr(self._backend, 'supports_timeout', False):
--> 909 self._output.extend(job.get(timeout=self.timeout))
910 else:
911 self._output.extend(job.get())
/opt/conda/lib/python3.7/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
560 AsyncResults.get from multiprocessing."""
561 try:
--> 562 return future.result(timeout=timeout)
563 except LokyTimeoutError:
564 raise TimeoutError()
/opt/conda/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
433 raise CancelledError()
434 elif self._state == FINISHED:
--> 435 return self.__get_result()
436 else:
437 raise TimeoutError()
/opt/conda/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result
TypeError: sequence item 0: expected str instance, int found
Try reinstalling the phik module, pinned to this version:
pip install phik==0.10.0
With that version, your code runs, and the resulting matrix can be plotted with sns.heatmap.
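The traceback also hints at why the original call failed: `_calc_phik` runs `':'.join(comb)` on a pair of column labels, and `pd.DataFrame(rand_mat)` labels its columns with the integers 0, 1 and 2. So the integers are in the column index, not in the data. Below is a minimal sketch of a workaround that does not depend on the phik version, assuming string labels are all the library needs. The phik call itself is left as a comment because it is not verified here.

```python
import itertools

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(low=0.5, high=15, size=(200, 3)))
df["cat_column"] = pd.cut(df[0], bins=5, labels=["F1", "F2", "F3", "F4", "F5"])

# Reproduce the failing step from the traceback: joining integer labels.
first_pair = next(itertools.combinations_with_replacement(df.columns.values, 2))
try:
    ":".join(first_pair)
    join_failed = False
except TypeError:
    join_failed = True

# Workaround: make every column label a string before computing the matrix.
df.columns = df.columns.astype(str)
labels = [
    ":".join(pair)
    for pair in itertools.combinations_with_replacement(df.columns.values, 2)
]

# With string labels, the original call should no longer hit the join error
# (assumption, not run here):
# import phik
# df.phik_matrix(interval_cols=["0", "1", "2"])
```

Passing `interval_cols` explicitly also avoids the "interval_cols not set, guessing" message shown in the output above.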
I have installed Cython with conda.
After restarting the kernel, I also loaded it without error using %reload_ext Cython.
However, when I try to run the following code
%%cython
import numpy as np
cdef int a = 0
cdef int g[10]
cdef int i
for i in range(10):
    g[i] = a
    a += i
print(g)
I get the error "command 'gcc' failed with exit status 1".
I am fairly new to Cython and have no idea how to solve this, or even what to search for.
The complete error log:
---------------------------------------------------------------------------
DistutilsExecError Traceback (most recent call last)
~/opt/anaconda3/lib/python3.7/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
117 self.spawn(compiler_so + cc_args + [src, '-o', obj] +
--> 118 extra_postargs)
119 except DistutilsExecError as msg:
~/opt/anaconda3/lib/python3.7/distutils/ccompiler.py in spawn(self, cmd)
908 def spawn(self, cmd):
--> 909 spawn(cmd, dry_run=self.dry_run)
910
~/opt/anaconda3/lib/python3.7/distutils/spawn.py in spawn(cmd, search_path, verbose, dry_run)
35 if os.name == 'posix':
---> 36 _spawn_posix(cmd, search_path, dry_run=dry_run)
37 elif os.name == 'nt':
~/opt/anaconda3/lib/python3.7/distutils/spawn.py in _spawn_posix(cmd, search_path, verbose, dry_run)
158 "command %r failed with exit status %d"
--> 159 % (cmd, exit_status))
160 elif os.WIFSTOPPED(status):
DistutilsExecError: command 'gcc' failed with exit status 1
During handling of the above exception, another exception occurred:
CompileError Traceback (most recent call last)
<ipython-input-25-16f694f6508b> in <module>
----> 1 get_ipython().run_cell_magic('cython', '', '\nimport numpy as np\n\ncdef int a = 0\ncdef int g[10]\ncdef int i\n\nfor i in range(10):\n g[i] = a\n a += i\nprint(g)\n')
~/opt/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2357 if getattr(fn, "needs_local_scope", False):
2358 kwargs['local_ns'] = self.user_ns
-> 2359
2360 with self.builtin_trap:
2361 args = (magic_arg_s, cell)
</Users/w849277/opt/anaconda3/lib/python3.7/site-packages/decorator.py:decorator-gen-128> in cython(self, line, cell)
~/opt/anaconda3/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~/opt/anaconda3/lib/python3.7/site-packages/Cython/Build/IpythonMagic.py in cython(self, line, cell)
331 extension = None
332 if need_cythonize:
--> 333 extensions = self._cythonize(module_name, code, lib_dir, args, quiet=args.quiet)
334 if extensions is None:
335 # Compilation failed and printed error message
~/opt/anaconda3/lib/python3.7/site-packages/Cython/Build/IpythonMagic.py in _build_extension(self, extension, lib_dir, temp_dir, pgo_step_name, quiet)
441 force=True,
442 )
--> 443 if args.language_level is not None:
444 assert args.language_level in (2, 3)
445 opts['language_level'] = args.language_level
~/opt/anaconda3/lib/python3.7/distutils/command/build_ext.py in run(self)
338
339 # Now actually compile and link everything.
--> 340 self.build_extensions()
341
342 def check_extensions_list(self, extensions):
~/opt/anaconda3/lib/python3.7/distutils/command/build_ext.py in build_extensions(self)
447 self._build_extensions_parallel()
448 else:
--> 449 self._build_extensions_serial()
450
451 def _build_extensions_parallel(self):
~/opt/anaconda3/lib/python3.7/distutils/command/build_ext.py in _build_extensions_serial(self)
472 for ext in self.extensions:
473 with self._filter_build_errors(ext):
--> 474 self.build_extension(ext)
475
476 @contextlib.contextmanager
~/opt/anaconda3/lib/python3.7/distutils/command/build_ext.py in build_extension(self, ext)
532 debug=self.debug,
533 extra_postargs=extra_args,
--> 534 depends=ext.depends)
535
536 # XXX outdated variable, kept here in case third-part code
~/opt/anaconda3/lib/python3.7/distutils/ccompiler.py in compile(self, sources, output_dir, macros, include_dirs, debug, extra_preargs, extra_postargs, depends)
572 except KeyError:
573 continue
--> 574 self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
575
576 # Return *all* object filenames, not just the ones we just built.
~/opt/anaconda3/lib/python3.7/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
118 extra_postargs)
119 except DistutilsExecError as msg:
--> 120 raise CompileError(msg)
121
122 def create_static_lib(self, objects, output_libname,
CompileError: command 'gcc' failed with exit status 1
Also, my Mac has been updated to macOS 10.15.3 Catalina, while my friend's Mac, which stayed on 10.14.6 Mojave, ran the same code without any problems. I have heard of compatibility issues with Anaconda when Catalina first came out, but I don't know if this has anything to do with my error.
Just answering my own question so I can close this.
Thank you for the two comments on my question! I was able to find the meaningful error message in my terminal by following the answer provided here: What does CompileError/LinkerError: "command 'gcc' failed with exit status 1" mean, when running %%cython-magic cell in IPython
It turns out the error was
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
which was resolved simply by running this command in the terminal:
xcode-select --install
For a more detailed solution to the missing xcrun problem, check here https://apple.stackexchange.com/questions/254380/why-am-i-getting-an-invalid-active-developer-path-when-attempting-to-use-git-a
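Since the notebook only reports "command 'gcc' failed", it can help to ask the compiler for its own message from Python. The sketch below is one way to do that; the helper name `probe_compiler` is made up for illustration, and it only runs `gcc --version`, which on a Mac with a broken developer path would be expected to print the same xcrun error.

```python
import shutil
import subprocess


def probe_compiler(name="gcc"):
    """Return (path, message): where the compiler lives and what it prints."""
    path = shutil.which(name)
    if path is None:
        return None, f"{name} not found on PATH"
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Errors such as a missing xcrun are written to stderr, so keep both streams.
    return path, (result.stdout + result.stderr).strip()


path, message = probe_compiler("gcc")
print(path)
print(message)
```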
I'm trying to figure out how to work with lists/arrays in Cython, which seems impossibly complex, so I would prefer to just use C++ lists, as I saw some people do on SO. However, when I run their code, I get a gcc compile error in my ipynb notebook. Cython data structures are infuriating.
When run alone in a cell, this gives the error below. I've tried importing with and without the %%cython magic call, and both fail...
'''
%%cython
from libcpp.list cimport list as cpplist
'''
def main(int t):
    cdef cpplist[int] temp
    for x in range(t):
        if x > 0:
            temp.push_back(x)
    cdef int N = temp.size()
    cdef list OutputList = N*[0]
    for i in range(N):
        OutputList[i] = temp.front()
        temp.pop_front()
    return OutputList
'''
'''
---------------------------------------------------------------------------
DistutilsExecError Traceback (most recent call last)
/anaconda3/lib/python3.6/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
117 self.spawn(compiler_so + cc_args + [src, '-o', obj] +
--> 118 extra_postargs)
119 except DistutilsExecError as msg:
/anaconda3/lib/python3.6/distutils/ccompiler.py in spawn(self, cmd)
908 def spawn(self, cmd):
--> 909 spawn(cmd, dry_run=self.dry_run)
910
/anaconda3/lib/python3.6/distutils/spawn.py in spawn(cmd, search_path, verbose, dry_run)
35 if os.name == 'posix':
---> 36 _spawn_posix(cmd, search_path, dry_run=dry_run)
37 elif os.name == 'nt':
/anaconda3/lib/python3.6/distutils/spawn.py in _spawn_posix(cmd, search_path, verbose, dry_run)
158 "command %r failed with exit status %d"
--> 159 % (cmd, exit_status))
160 elif os.WIFSTOPPED(status):
DistutilsExecError: command 'gcc' failed with exit status 1
During handling of the above exception, another exception occurred:
CompileError Traceback (most recent call last)
<ipython-input-6-70891eecfa66> in <module>()
----> 1 get_ipython().run_cell_magic('cython', '--cplus', '\n# distutils: language = c++\nfor i in range(10):\n print(i)\n \n\n\n#from libc.math cimport log\nfrom libcpp.list cimport list as cpplist')
/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2165 magic_arg_s = self.var_expand(line, stack_depth)
2166 with self.builtin_trap:
-> 2167 result = fn(magic_arg_s, cell)
2168 return result
2169
<decorator-gen-127> in cython(self, line, cell)
/anaconda3/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
/anaconda3/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in cython(self, line, cell)
327
328 self._build_extension(extension, lib_dir, pgo_step_name='use' if args.pgo else None,
--> 329 quiet=args.quiet)
330
331 module = imp.load_dynamic(module_name, module_path)
/anaconda3/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in _build_extension(self, extension, lib_dir, temp_dir, pgo_step_name, quiet)
437 if not quiet:
438 old_threshold = distutils.log.set_threshold(distutils.log.DEBUG)
--> 439 build_extension.run()
440 finally:
441 if not quiet and old_threshold is not None:
/anaconda3/lib/python3.6/distutils/command/build_ext.py in run(self)
337
338 # Now actually compile and link everything.
--> 339 self.build_extensions()
340
341 def check_extensions_list(self, extensions):
/anaconda3/lib/python3.6/distutils/command/build_ext.py in build_extensions(self)
446 self._build_extensions_parallel()
447 else:
--> 448 self._build_extensions_serial()
449
450 def _build_extensions_parallel(self):
/anaconda3/lib/python3.6/distutils/command/build_ext.py in _build_extensions_serial(self)
471 for ext in self.extensions:
472 with self._filter_build_errors(ext):
--> 473 self.build_extension(ext)
474
475 @contextlib.contextmanager
/anaconda3/lib/python3.6/distutils/command/build_ext.py in build_extension(self, ext)
531 debug=self.debug,
532 extra_postargs=extra_args,
--> 533 depends=ext.depends)
534
535 # XXX outdated variable, kept here in case third-part code
/anaconda3/lib/python3.6/distutils/ccompiler.py in compile(self, sources, output_dir, macros, include_dirs, debug, extra_preargs, extra_postargs, depends)
572 except KeyError:
573 continue
--> 574 self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
575
576 # Return *all* object filenames, not just the ones we just built.
/anaconda3/lib/python3.6/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
118 extra_postargs)
119 except DistutilsExecError as msg:
--> 120 raise CompileError(msg)
121
122 def create_static_lib(self, objects, output_libname,
CompileError: command 'gcc' failed with exit status 1
In Cython, I get either "object of type 'None' has no length" (the only error message in the Cython language) or "invalid syntax".
Please advise; Cython has me ready to rip my hair out two days in.
EDITS: I've tried using:
%%cython --cplus
#distutils: language = c++
and get the same error message.
Also, just running '%%cython --cplus' gives me the same error message, with anything in the cell, even a simple print. I think something is wrong with my C++ extension setup. How do I resolve this?
In terminal (using runipy -- don't know how else to run ipynb in terminal aside from compiling via a setup.py and distutils Build)
zacharys-mbp:Cython zoakes$ runipy CSTL.ipynb
08/08/2019 08:47:47 PM INFO: Reading notebook CSTL.ipynb
08/08/2019 08:47:49 PM INFO: Running cell:
%load_ext cython
08/08/2019 08:47:49 PM INFO: Cell returned
08/08/2019 08:47:49 PM INFO: Running cell:
%%cython
#distutils: language = c++
from libcpp.list cimport list as cpplist
warning: include path for stdlibc++ headers not found; pass '-stdlib=libc++' on
the command line to use the libc++ standard library instead
[-Wstdlibcxx-not-found]
/Users/zoakes/.ipython/cython/_cython_magic_5a0764b273da2aafc5775e4dd20b1249.cpp:592:10: fatal error:
'ios' file not found
#include "ios"
^~~~~
1 warning and 1 error generated.
08/08/2019 08:47:50 PM INFO: Cell raised uncaught exception:
---------------------------------------------------------------------------
DistutilsExecError Traceback (most recent call last)
/anaconda3/lib/python3.6/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
117 self.spawn(compiler_so + cc_args + [src, '-o', obj] +
--> 118 extra_postargs)
119 except DistutilsExecError as msg:
/anaconda3/lib/python3.6/distutils/ccompiler.py in spawn(self, cmd)
908 def spawn(self, cmd):
--> 909 spawn(cmd, dry_run=self.dry_run)
910
/anaconda3/lib/python3.6/distutils/spawn.py in spawn(cmd, search_path, verbose, dry_run)
35 if os.name == 'posix':
---> 36 _spawn_posix(cmd, search_path, dry_run=dry_run)
37 elif os.name == 'nt':
/anaconda3/lib/python3.6/distutils/spawn.py in _spawn_posix(cmd, search_path, verbose, dry_run)
158 "command %r failed with exit status %d"
--> 159 % (cmd, exit_status))
160 elif os.WIFSTOPPED(status):
DistutilsExecError: command 'gcc' failed with exit status 1
During handling of the above exception, another exception occurred:
CompileError Traceback (most recent call last)
<ipython-input-2-e4f283bb7389> in <module>()
----> 1 get_ipython().run_cell_magic('cython', '', '\n#distutils: language = c++\nfrom libcpp.list cimport list as cpplist')
/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2165 magic_arg_s = self.var_expand(line, stack_depth)
2166 with self.builtin_trap:
-> 2167 result = fn(magic_arg_s, cell)
2168 return result
2169
<decorator-gen-127> in cython(self, line, cell)
/anaconda3/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
/anaconda3/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in cython(self, line, cell)
327
328 self._build_extension(extension, lib_dir, pgo_step_name='use' if args.pgo else None,
--> 329 quiet=args.quiet)
330
331 module = imp.load_dynamic(module_name, module_path)
/anaconda3/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in _build_extension(self, extension, lib_dir, temp_dir, pgo_step_name, quiet)
437 if not quiet:
438 old_threshold = distutils.log.set_threshold(distutils.log.DEBUG)
--> 439 build_extension.run()
440 finally:
441 if not quiet and old_threshold is not None:
/anaconda3/lib/python3.6/distutils/command/build_ext.py in run(self)
337
338 # Now actually compile and link everything.
--> 339 self.build_extensions()
340
341 def check_extensions_list(self, extensions):
/anaconda3/lib/python3.6/distutils/command/build_ext.py in build_extensions(self)
446 self._build_extensions_parallel()
447 else:
--> 448 self._build_extensions_serial()
449
450 def _build_extensions_parallel(self):
/anaconda3/lib/python3.6/distutils/command/build_ext.py in _build_extensions_serial(self)
471 for ext in self.extensions:
472 with self._filter_build_errors(ext):
--> 473 self.build_extension(ext)
474
475 @contextlib.contextmanager
/anaconda3/lib/python3.6/distutils/command/build_ext.py in build_extension(self, ext)
531 debug=self.debug,
532 extra_postargs=extra_args,
--> 533 depends=ext.depends)
534
535 # XXX outdated variable, kept here in case third-part code
/anaconda3/lib/python3.6/distutils/ccompiler.py in compile(self, sources, output_dir, macros, include_dirs, debug, extra_preargs, extra_postargs, depends)
572 except KeyError:
573 continue
--> 574 self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
575
576 # Return *all* object filenames, not just the ones we just built.
/anaconda3/lib/python3.6/distutils/unixccompiler.py in _compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts)
118 extra_postargs)
119 except DistutilsExecError as msg:
--> 120 raise CompileError(msg)
121
122 def create_static_lib(self, objects, output_libname,
CompileError: command 'gcc' failed with exit status 1
08/08/2019 08:47:50 PM INFO: Shutdown kernel
08/08/2019 08:47:50 PM WARNING: Exiting with nonzero exit status
I think this is a Mac issue, which limits my ability to help. However, the key error message seems to be:
warning: include path for stdlibc++ headers not found; pass '-stdlib=libc++' on
the command line to use the libc++ standard library instead
[-Wstdlibcxx-not-found]
If you search for (part of) this message, it looks like it's related to Xcode. At some point Apple switched the compiler from GCC to Clang, and this changed which implementation of the C++ standard library it uses.
I think the best solution is to install the "stdlibc++" headers for Xcode. Unfortunately, I have no idea how you'd do this in practice.
The second-best solution is to add the suggested command line argument for Cython. I think this is second-best because it uses a slightly mismatched implementation of the C++ standard library.
%%cython --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
I'm not sure if it needs to be in both compile and link args or just compile args.
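For a build driven by setup.py rather than the notebook magic, the same flag could be attached to the extension definition. This is only a sketch under assumptions: the module name `cstl` and the source file `cstl.pyx` are hypothetical, and whether the link flag is required is as unclear here as it is for the magic.

```python
from setuptools import Extension

# Hypothetical module and source names, used only for illustration.
STDLIB_FLAG = "-stdlib=libc++"

ext = Extension(
    "cstl",
    sources=["cstl.pyx"],
    language="c++",                    # compile the generated code as C++
    extra_compile_args=[STDLIB_FLAG],  # same flag as --compile-args above
    extra_link_args=[STDLIB_FLAG],     # same flag as --link-args above
)

# A real setup.py would then call something like:
#   from setuptools import setup
#   from Cython.Build import cythonize
#   setup(ext_modules=cythonize([ext]))
```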
I'm taking an online Python course (EpiSkills, which uses the Jupyter notebook) that was written for Python 2.7, and I'm on Python 3.6.4, so I have run into a few compatibility issues along the way. Most of the time I've been able to stumble through, but I can't figure this one out, so I was hoping someone might be able to help.
I start with the following packages:
import pandas as pd
import epipy
import seaborn as sns
%pylab inline
import statsmodels.api as sm
from scipy import stats
import numpy as np
And use the following code to create a pandas series and model:
multivar_model = sm.formula.glm('age ~ onset_to_hospital + onset_to_death + sex',
data=my_data).fit()
new_data = pd.Series([6, 8, 'male'], index=['onset_to_hospital', 'onset_to_death', 'sex'])
When I pass this to the following call, I get the error shown below:
multivar_model.predict(new_data)
The intended output is meant to be this:
array([ 60.6497459])
I know that a lot of NameErrors are because something has been specified in the local, not global, environment but I'm unsure how to correct it in this instance. Any help is much appreciated.
Thanks!
C
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\compat.py in call_and_wrap_exc(msg, origin, f, *args, **kwargs)
116 try:
--> 117 return f(*args, **kwargs)
118 except Exception as e:
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\eval.py in eval(self, expr, source_name, inner_namespace)
165 return eval(code, {}, VarLookupDict([inner_namespace]
--> 166 + self._namespaces))
167
<string> in <module>()
NameError: name 'onset_to_death' is not defined
The above exception was the direct cause of the following exception:
PatsyError Traceback (most recent call last)
<ipython-input-79-e0364e267da7> in <module>()
----> 1 multivar_model.predict(new_data)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\statsmodels\base\model.py in predict(self, exog, transform, *args, **kwargs)
774 exog_index = exog.index
775 exog = dmatrix(self.model.data.design_info.builder,
--> 776 exog, return_type="dataframe")
777 if len(exog) < len(exog_index):
778 # missing values, rows have been dropped
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\highlevel.py in dmatrix(formula_like, data, eval_env, NA_action, return_type)
289 eval_env = EvalEnvironment.capture(eval_env, reference=1)
290 (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 291 NA_action, return_type)
292 if lhs.shape[1] != 0:
293 raise PatsyError("encountered outcome variables for a model "
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
167 return build_design_matrices(design_infos, data,
168 NA_action=NA_action,
--> 169 return_type=return_type)
170 else:
171 # No builders, but maybe we can still get matrices
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\build.py in build_design_matrices(design_infos, data, NA_action, return_type, dtype)
886 for factor_info in six.itervalues(design_info.factor_infos):
887 if factor_info not in factor_info_to_values:
--> 888 value, is_NA = _eval_factor(factor_info, data, NA_action)
889 factor_info_to_isNAs[factor_info] = is_NA
890 # value may now be a Series, DataFrame, or ndarray
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\build.py in _eval_factor(factor_info, data, NA_action)
61 def _eval_factor(factor_info, data, NA_action):
62 factor = factor_info.factor
---> 63 result = factor.eval(factor_info.state, data)
64 # Returns either a 2d ndarray, or a DataFrame, plus is_NA mask
65 if factor_info.type == "numerical":
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\eval.py in eval(self, memorize_state, data)
564 return self._eval(memorize_state["eval_code"],
565 memorize_state,
--> 566 data)
567
568 __getstate__ = no_pickling
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\eval.py in _eval(self, code, memorize_state, data)
549 memorize_state["eval_env"].eval,
550 code,
--> 551 inner_namespace=inner_namespace)
552
553 def memorize_chunk(self, state, which_pass, data):
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\compat.py in call_and_wrap_exc(msg, origin, f, *args, **kwargs)
122 origin)
123 # Use 'exec' to hide this syntax from the Python 2 parser:
--> 124 exec("raise new_exc from e")
125 else:
126 # In python 2, we just let the original exception escape -- better
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\patsy\compat.py in <module>()
PatsyError: Error evaluating factor: NameError: name 'onset_to_death' is not defined
age ~ onset_to_hospital + onset_to_death + sex
^^^^^^^^^^^^^^
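One thing worth trying, offered as an assumption rather than a confirmed fix: the formula machinery looks variables up by name in the data it is given, and this statsmodels version may not treat a Series indexed by variable name as a one-row table. The sketch below rebuilds the same values as a one-row DataFrame, so each variable becomes a column with its own dtype. The predict call is left as a comment because the fitted model and `my_data` are not available here.

```python
import pandas as pd

new_data = pd.Series([6, 8, "male"],
                     index=["onset_to_hospital", "onset_to_death", "sex"])

# One row, one column per predictor; each column keeps its own dtype
# (the integers stay numeric, 'sex' stays a string).
new_frame = pd.DataFrame({name: [value] for name, value in new_data.items()})
print(new_frame.dtypes)

# Untested here, since the fitted model is not available:
# multivar_model.predict(new_frame)
```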
I'm trying to reproduce the coal mining example with a deterministic function for the switchpoint instead of using Theano's switch function. Code:
%matplotlib inline
import matplotlib.pyplot as plt
import pymc3
import numpy as np
import theano.tensor as t
import theano
data = np.hstack((np.random.poisson(15,1000),np.random.poisson(2,100)))
plt.plot(data)
@theano.compile.ops.as_op(itypes=[t.lscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def rate1(sw, mu1, mu2):
    n = len(data)
    out = np.empty(n)
    out[:sw] = mu1
    out[sw:] = mu2
    return out
with pymc3.Model() as dis:
    switchpoint = pymc3.DiscreteUniform('switchpoint', lower=0, upper=len(data)-1)
    mu1 = pymc3.Exponential('mu1', lam=1.)
    mu2 = pymc3.Exponential('mu2', lam=1.)
    disasters = pymc3.Poisson('disasters', mu=rate1, observed=data)
But this code raises an error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
c:\program files\git\theano\theano\tensor\type.py in dtype_specs(self)
266 'complex64': (complex, 'theano_complex64', 'NPY_COMPLEX64')
--> 267 }[self.dtype]
268 except KeyError:
KeyError: 'object'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
c:\program files\git\theano\theano\tensor\basic.py in constant_or_value(x, rtype, name, ndim, dtype)
407 rval = rtype(
--> 408 TensorType(dtype=x_.dtype, broadcastable=bcastable),
409 x_.copy(),
c:\program files\git\theano\theano\tensor\type.py in __init__(self, dtype, broadcastable, name, sparse_grad)
49 self.broadcastable = tuple(bool(b) for b in broadcastable)
---> 50 self.dtype_specs() # error checking is done there
51 self.name = name
c:\program files\git\theano\theano\tensor\type.py in dtype_specs(self)
269 raise TypeError("Unsupported dtype for %s: %s"
--> 270 % (self.__class__.__name__, self.dtype))
271
TypeError: Unsupported dtype for TensorType: object
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
c:\program files\git\theano\theano\tensor\basic.py in as_tensor_variable(x, name, ndim)
201 try:
--> 202 return constant(x, name=name, ndim=ndim)
203 except TypeError:
c:\program files\git\theano\theano\tensor\basic.py in constant(x, name, ndim, dtype)
421 ret = constant_or_value(x, rtype=TensorConstant, name=name, ndim=ndim,
--> 422 dtype=dtype)
423
c:\program files\git\theano\theano\tensor\basic.py in constant_or_value(x, rtype, name, ndim, dtype)
416 except Exception:
--> 417 raise TypeError("Could not convert %s to TensorType" % x, type(x))
418
TypeError: ('Could not convert FromFunctionOp{rate1} to TensorType', )
During handling of the above exception, another exception occurred:
AsTensorError Traceback (most recent call last)
in ()
14 mu2 = pymc3.Exponential('mu2',lam=1.)
15 #rate1 = pymc3.switch(switchpoint >= np.arange(len(data)), mu1,mu2)
---> 16 disasters=pymc3.Poisson('disasters', mu=rate1, observed = data)
C:\Users\User\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in __new__(cls, name, *args, **kwargs)
19 if isinstance(name, str):
20 data = kwargs.pop('observed', None)
---> 21 dist = cls.dist(*args, **kwargs)
22 return model.Var(name, dist, data)
23 elif name is None:
C:\Users\User\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in dist(cls, *args, **kwargs)
32 def dist(cls, *args, **kwargs):
33 dist = object.__new__(cls)
---> 34 dist.__init__(*args, **kwargs)
35 return dist
36
C:\Users\User\Anaconda3\lib\site-packages\pymc3\distributions\discrete.py in __init__(self, mu, *args, **kwargs)
185 super(Poisson, self).__init__(*args, **kwargs)
186 self.mu = mu
--> 187 self.mode = floor(mu).astype('int32')
188
189 def random(self, point=None, size=None, repeat=None):
c:\program files\git\theano\theano\gof\op.py in __call__(self, *inputs, **kwargs)
598 """
599 return_list = kwargs.pop('return_list', False)
--> 600 node = self.make_node(*inputs, **kwargs)
601
602 if config.compute_test_value != 'off':
c:\program files\git\theano\theano\tensor\elemwise.py in make_node(self, *inputs)
540 using DimShuffle.
541 """
--> 542 inputs = list(map(as_tensor_variable, inputs))
543 shadow = self.scalar_op.make_node(
544 *[get_scalar_type(dtype=i.type.dtype).make_variable()
c:\program files\git\theano\theano\tensor\basic.py in as_tensor_variable(x, name, ndim)
206 except Exception:
207 str_x = repr(x)
--> 208 raise AsTensorError("Cannot convert %s to TensorType" % str_x, type(x))
209
210 # this has a different name, because _as_tensor_variable is the
AsTensorError: ('Cannot convert FromFunctionOp{rate1} to TensorType', )
How do I handle this?
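The last error names `FromFunctionOp{rate1}` itself, which suggests the wrapped function was handed to `Poisson` without being called. Inside the model one would presumably write `mu=rate1(switchpoint, mu1, mu2)` instead; that is an assumption based on the traceback and is not tested against this PyMC3 version. The NumPy-only sketch below shows what `rate1` computes once it is called, and that it matches a `where`-style switch on `index < switchpoint`. The `n` parameter is added here for illustration; the original reads `len(data)` from the global.

```python
import numpy as np


def rate1(sw, mu1, mu2, n):
    """Piecewise-constant rate: mu1 before index sw, mu2 from sw onwards."""
    out = np.empty(n)
    out[:sw] = mu1
    out[sw:] = mu2
    return out


n = 6
rate = rate1(2, 15.0, 2.0, n)  # the function is called, not passed as an object
switched = np.where(np.arange(n) < 2, 15.0, 2.0)

print(rate)
```

Note that a comparison written as `switchpoint >= np.arange(n)` would also assign `mu1` to index `switchpoint` itself, so the two formulations differ by one position.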
Second, when I use the pymc3.switch function like this:
with pymc3.Model() as dis:
    switchpoint = pymc3.DiscreteUniform('switchpoint', lower=0, upper=len(data)-1)
    mu1 = pymc3.Exponential('mu1', lam=1.)
    mu2 = pymc3.Exponential('mu2', lam=1.)
    rate1 = pymc3.switch(switchpoint >= np.arange(len(data)), mu1, mu2)
    disasters = pymc3.Poisson('disasters', mu=rate1, observed=data)
and then try to sample:
with dis:
    step1 = pymc3.NUTS([mu1, mu2])
    step2 = pymc3.Metropolis([switchpoint])
    trace = pymc3.sample(10000, step=[step1, step2])
I get an error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
c:\program files\git\theano\theano\compile\function_module.py in __call__(self, *args, **kwargs)
858 try:
--> 859 outputs = self.fn()
860 except Exception:
TypeError: expected type_num 9 (NPY_INT64) got 7
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-4-3247d908f897> in <module>()
2 step1 = pymc3.NUTS([mu1, mu2])
3 step2 = pymc3.Metropolis([switchpoint])
----> 4 trace = pymc3.sample(10000, step = [step1,step2])
C:\Users\User\Anaconda3\lib\site-packages\pymc3\sampling.py in sample(draws, step, start, trace, chain, njobs, tune, progressbar, model, random_seed)
153 sample_args = [draws, step, start, trace, chain,
154 tune, progressbar, model, random_seed]
--> 155 return sample_func(*sample_args)
156
157
C:\Users\User\Anaconda3\lib\site-packages\pymc3\sampling.py in _sample(draws, step, start, trace, chain, tune, progressbar, model, random_seed)
162 progress = progress_bar(draws)
163 try:
--> 164 for i, strace in enumerate(sampling):
165 if progressbar:
166 progress.update(i)
C:\Users\User\Anaconda3\lib\site-packages\pymc3\sampling.py in _iter_sample(draws, step, start, trace, chain, tune, model, random_seed)
244 if i == tune:
245 step = stop_tuning(step)
--> 246 point = step.step(point)
247 strace.record(point)
248 yield strace
C:\Users\User\Anaconda3\lib\site-packages\pymc3\step_methods\compound.py in step(self, point)
11 def step(self, point):
12 for method in self.methods:
---> 13 point = method.step(point)
14 return point
C:\Users\User\Anaconda3\lib\site-packages\pymc3\step_methods\arraystep.py in step(self, point)
116 bij = DictToArrayBijection(self.ordering, point)
117
--> 118 apoint = self.astep(bij.map(point))
119 return bij.rmap(apoint)
120
C:\Users\User\Anaconda3\lib\site-packages\pymc3\step_methods\metropolis.py in astep(self, q0)
123
124
--> 125 q_new = metrop_select(self.delta_logp(q,q0), q, q0)
126
127 if q_new is q:
c:\program files\git\theano\theano\compile\function_module.py in __call__(self, *args, **kwargs)
869 node=self.fn.nodes[self.fn.position_of_error],
870 thunk=thunk,
--> 871 storage_map=getattr(self.fn, 'storage_map', None))
872 else:
873 # old-style linkers raise their own exceptions
c:\program files\git\theano\theano\gof\link.py in raise_with_op(node, thunk, exc_info, storage_map)
312 # extra long error message in that case.
313 pass
--> 314 reraise(exc_type, exc_value, exc_trace)
315
316
C:\Users\User\Anaconda3\lib\site-packages\six.py in reraise(tp, value, tb)
656 value = tp()
657 if value.__traceback__ is not tb:
--> 658 raise value.with_traceback(tb)
659 raise value
660
c:\program files\git\theano\theano\compile\function_module.py in __call__(self, *args, **kwargs)
857 t0_fn = time.time()
858 try:
--> 859 outputs = self.fn()
860 except Exception:
861 if hasattr(self.fn, 'position_of_error'):
TypeError: expected type_num 9 (NPY_INT64) got 7
Apply node that caused the error: Elemwise{Composite{Switch(GE(i0, i1), i2, i3)}}(InplaceDimShuffle{x}.0, TensorConstant{[ 0 1..1098 1099]}, InplaceDimShuffle{x}.0, InplaceDimShuffle{x}.0)
Toposort index: 11
Inputs types: [TensorType(int64, (True,)), TensorType(int32, vector), TensorType(float64, (True,)), TensorType(float64, (True,))]
Inputs shapes: [(1,), (1100,), (1,), (1,)]
Inputs strides: [(4,), (4,), (8,), (8,)]
Inputs values: [array([549]), 'not shown', array([ 1.07762995]), array([ 1.01502801])]
Outputs clients: [[Elemwise{eq,no_inplace}(Elemwise{Composite{Switch(GE(i0, i1), i2, i3)}}.0, TensorConstant{(1,) of 0}), Elemwise{Composite{Switch(GE(i0, i1), ((Switch(i2, i3, (i4 * log(i0))) - i5) - i0), i3)}}[(0, 0)](Elemwise{Composite{Switch(GE(i0, i1), i2, i3)}}.0, TensorConstant{(1,) of 0}, InplaceDimShuffle{x}.0, TensorConstant{(1,) of -inf}, TensorConstant{[ 13. 13... 0. 1.]}, TensorConstant{[ 22.55216... ]})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
As a simple analyst, do I need to learn all this Theano machinery just to be able to work on my statistical problems? Is the new gradient-based MCMC sampler the only thing that should motivate me to switch from pymc2 to pymc3?
For your first question, it looks like you're trying to pass a theano function as a variable. You need to call the function with the other variables as arguments, which will then return a theano variable. Try changing your line to
disasters=pymc3.Poisson('disasters', mu=rate1(switchpoint, mu1, mu2), observed = data)
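The difference between passing the function and passing its result can be shown without Theano. This is only a plain-Python analogy: `rate1` and `as_value` below are hypothetical stand-ins, not PyMC3 or Theano APIs. A consumer that needs values rejects the function object itself but accepts what the call returns.

```python
def rate1(switchpoint, mu1, mu2):
    # Hypothetical stand-in for the question's function:
    # mu1 up to and including the switchpoint, mu2 afterwards (4 time steps).
    return [mu1 if switchpoint >= i else mu2 for i in range(4)]


def as_value(x):
    # Hypothetical consumer that needs values, not a function object.
    if callable(x):
        raise TypeError("Cannot convert %r to a value; call it first" % (x,))
    return list(x)


try:
    as_value(rate1)  # passing the function object itself is rejected
    passed_function = True
except TypeError:
    passed_function = False

# Passing the result of the call is accepted.
rates = as_value(rate1(1, 0.5, 2.0))
```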
I couldn't reproduce the error in your second part; the sampling worked just fine for me.
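For reference, the `switch` expression in the second model picks `mu1` for every index up to and including `switchpoint` and `mu2` for the rest. Here is a minimal plain-Python sketch of that elementwise selection. The helper name `switch_rates` and the sample numbers are illustrative assumptions and not part of PyMC3.

```python
def switch_rates(switchpoint, n_obs, mu1, mu2):
    # Build the boolean condition first, like switchpoint >= arange(n_obs).
    condition = [switchpoint >= i for i in range(n_obs)]
    # Then select mu1 where the condition holds and mu2 elsewhere.
    return [mu1 if c else mu2 for c in condition]


# Illustrative values only: 6 observations, switch after index 2.
example_rates = switch_rates(switchpoint=2, n_obs=6, mu1=3.0, mu2=1.0)
```

Each observation therefore gets its own Poisson rate, and only the position of the change depends on `switchpoint`.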