Why does a Jupyter notebook only print the Cython result once?

I am new to Cython (I'm only using it for a bit of homework right now).
I used the following code in a Jupyter notebook to get a general idea of it:
%load_ext Cython
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a
print(cfunc(10))
However, it only prints out the result 45 once. When I run the cell again, it doesn't show 45 anymore.
Is there any problem with the code? How can I make the cell print 45 every time, the same as normal Python code would? Thanks.

When running the %%cython magic, a lot happens under the hood. One can see parts of it by calling the magic in verbose mode, i.e. %%cython --verbose:
A file called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.pyx is generated. b599dcf313706e8c6031a4a7058da2a2 is the sha1 hash of the %%cython cell, which is needed for example to be able to reload a %%cython cell (see this SO post).
This file is cythonized and built into a C extension called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.
This extension gets imported - this is the moment your code prints 45 - and everything from this module is added to the global namespace.
When you execute the cell again, nothing of the above happens: given the sha1 hash, the machinery can see that this cell was already built and loaded, so there is nothing to do. Only when the content of the cell changes, and with it its hash, will the cache be skipped and the three steps above executed.
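The naming scheme can be sketched roughly as follows (this is not IPython's exact implementation - the real cache key may also mix in things like the Cython version and compiler arguments - but it conveys the idea):

import hashlib

# Rough, hypothetical illustration: the cell body is hashed, and the hash
# names the generated extension module.
cell_source = "def cfunc(int n):\n    cdef int a = 0\n    ..."
key = hashlib.sha1(cell_source.encode("utf-8")).hexdigest()
print("_cython_magic_" + key)  # same source -> same name -> cached build is reused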
To force the steps above to be performed again, one has to pass the --force (or -f) option to the %%cython magic cell, i.e.:
%%cython --force
...
# 45 is printed
However, because building the extension anew is quite time-consuming, one would probably prefer the following:
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a

# put the code of __main__ into a function
def cython_main():
    print(cfunc(10))

# execute the old main
cython_main()
and then calling cython_main() in a new cell, so it gets re-evaluated the same way normal Python code would.
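For example, in a fresh cell (plain Python, no magic needed, since cython_main was added to the global namespace when the extension was imported):

cython_main()  # prints 45 on every run, without rebuilding the extension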

Related

Issues with accessing PyObjects after writing

I am trying to do some fairly simple list manipulation, using a Jupyter notebook that calls a DLL function. I'd like my Jupyter notebook/Python code to pass in a Python list to a C++ function, which modifies the list, and then I'd like the Python code to be able to access the new list values.
I can actually read (in Jupyter) the items that were not edited by the C++ code, so there must be some issue with how I'm writing, but every example I can find looks just like my code. When I try to access the item in the list that the C++ code writes, my Jupyter kernel dies with no explanation; I've tried to run the same Python code in the terminal, and the terminal session just exits, again with no explanation.
Running on Windows 10, environment with Python 3.9.2. Here's the Python:
import os
import ctypes
import _ctypes
# Import the DLL
mydll = ctypes.cdll.LoadLibrary(*path to DLL*)
# Set up
data_in = [3,6,9]
mydll.testChange.argtypes = [ctypes.py_object]
mydll.testChange.restype = ctypes.c_float
mydll.testChange(data_in)
# Returns 0.08
After running this and closing the DLL, running data_in[1] returns 6, data_in[2] returns 9, and data_in[0] causes my kernel to die.
C code for the DLL:
float testChange(PyObject *data_out) {
    Py_SetPythonHome(L"%user directory%\\anaconda3");
    Py_Initialize();
    PyList_SetItem(data_out, 0, PyLong_FromLong(1L));
    return 0.08;
}
I can also insert a number of print statements in this code that show that I can read out all three items in the DLL both before and after the call to PyList_SetItem using calls like PyLong_AsLong(PyList_GetItem(data_out, 1)). It's not clear to me that any reference counts need changing or anything like that, but perhaps I misunderstand the idea. Any ideas you all have would be greatly appreciated.

Intel Vtune cannot find python source file

This is an old problem, as demonstrated in https://community.intel.com/t5/Analyzers/Unable-to-view-source-code-when-analyzing-results/td-p/1153210. I have tried all the listed methods, none of them works, and I cannot find any more solutions on the internet. Basically, VTune cannot find the custom Python source file no matter what I try. I am using the most recent version as of this writing. Please let me know whether there is a solution.
For example, run the following program:
def myfunc(*args):
    # Do a lot of things.

if __name__ == '__main__':
    # Do something and call myfunc
Call this script main.py. Now use the newest VTune version (I am using Ubuntu 18.04), run vtune-gui, and do a basic hotspots analysis. You will not find any information on this file. However, a huge pile of information on Python and its other code is found (related to your Python environment). In theory, you should be able to find the source of main.py as well as the cost of each line in that script. However, that is simply not happening.
Desired behavior: I would really like to find the source file and function in the top-down view (or any view, really). Any advice is welcome.
VTune offers full support for profiling Python code, and the tool should be able to display the source code in your Python file as you expect. Could you please check whether the function you are expecting to see in the VTune results ran long enough?
Just to confirm that everything is working fine, I wrote the matrix-multiplication code shown below (don't worry about the accuracy of the code itself):
def matrix_mul(X, Y):
    result_matrix = [[1 for i in range(len(X))] for j in range(len(Y[0]))]
    # iterate through rows of X
    for i in range(len(X)):
        # iterate through columns of Y
        for j in range(len(Y[0])):
            # iterate through rows of Y
            for k in range(len(Y)):
                result_matrix[i][j] += X[i][k] * Y[k][j]
    return result_matrix
Then I called this function (matrix_mul) on my Ubuntu machine with matrices large enough that the overall execution time was on the order of a few seconds.
I used the below command to start profiling (you can also see the VTune version I used):
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -knob enable-stack-collection=true -data-limit=500 -ring-buffer=10 -app-working-dir /usr/bin -- python3 /home/johnypau/MyIntel/temp/Python_matrix_mul/mat_mul_method.py
Now open the VTune results in the GUI and, under the bottom-up tab, group by "Module / Function / Call Stack" (or whatever grouping you prefer).
You should be able to see the module (mat_mul_method.py in my case) and the function matrix_mul. If you double-click, VTune should be able to load the sources too.

How to execute a python function as a whole in VSCode (it splits it and sends just the first line to the interpreter)

I'm getting used to VSCode in my daily data-science remote workflow thanks to the Live Share feature.
However, upon executing a function it just executes the first line of code; if I mark the whole region it does work, but that's a cumbersome way of dealing with the issue.
I've tried a number of extensions, but none of them seems to solve the problem.
from sklearn.metrics import roc_auc_score

def gini_normalized(test, pred):
    """Simple normalized Gini based on Scikit-Learn's roc_auc_score"""
    gini = lambda a, p: 2 * roc_auc_score(a, p) - 1
    return gini(test, pred)
Executing the beginning of the function results in an error:
def gini_normalized(test, pred):...
File "", line 1
def gini_normalized(test, pred):
^
SyntaxError: unexpected EOF while parsing
There's a solution for PyCharm: Python Smart Execute - https://plugins.jetbrains.com/plugin/11945-python-smart-execute. Atom's Hydrogen doesn't have this issue either.
Any ideas regarding VSCode?
Thanks!
I'm a developer on the VSCode Data Science features. Just to make sure that I'm understanding correctly: you would like the shift-enter command to send the entire function to the Interactive Window when you run it on the definition of the function?
If so, then yes, we don't currently support that. Shift-enter can run line by line, or run a section of code that you manually highlight. If you want, you can use #%% lines in your code to put functions into code cells. Then, when you are in a cell, shift-enter will run that entire cell; that might be the best current approach for you.
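For instance, using the asker's function, the cell markers would look roughly like this (a sketch; the #%% comment lines are the cell delimiters the Python extension recognizes, and y_true/y_pred are hypothetical arrays):

#%% cell 1: define the function (shift-enter runs this whole cell)
from sklearn.metrics import roc_auc_score

def gini_normalized(test, pred):
    """Simple normalized Gini based on Scikit-Learn's roc_auc_score"""
    gini = lambda a, p: 2 * roc_auc_score(a, p) - 1
    return gini(test, pred)

#%% cell 2: call it (shift-enter here runs just this cell)
gini_normalized(y_true, y_pred)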
That smart execute does look interesting; if you would like to file it as a suggestion, you can use our GitHub to get it on our backlog:
https://github.com/Microsoft/vscode-python
Hi, you could click the symbol before the first line of the function and turn it into > (the indented code of the function is then folded and hidden). Then, if you select that whole line together with the next line, shift+enter can run them together.

How to get interactive R output in Jupyter (IPython, rpy2), e.g. for a progress bar?

I am trying to use the built-in R progress bar (txtProgressBar) with the %%R magic in Jupyter. While it produces a nice animation when executed in the R console or RStudio, it does not produce the desired output in Jupyter (notebook or lab) with the rpy2 extension; instead it prints all the steps at once after finishing (which makes the progress bar useless). Two questions:
How could I make it work?
If it is not possible yet, how do I approach implementing this functionality on the rpy2 side (I already know how to make the interactive output/widgets on the Jupyter/IPython side)?
Here is a simple snippet of a progress bar from rfunction.com:
%%R
SEQ <- seq(1,100)
pb <- txtProgressBar(1, 100, style=3)
TIME <- Sys.time()
for(i in SEQ){
    Sys.sleep(0.02)
    setTxtProgressBar(pb, i)
}
For the folks new to rpy2: it needs to be installed with pip install rpy2, and the magic needs to be loaded in Jupyter with %load_ext rpy2.ipython.
Edit: The workaround I use for now is to manually invoke the code via robjects.r:
from rpy2.robjects import r
r("""
SEQ <- seq(1,100)
pb <- txtProgressBar(1, 100, style=3)
TIME <- Sys.time()
for(i in SEQ){
    Sys.sleep(0.02)
    setTxtProgressBar(pb, i)
}
""")
however, this is not ideal - I would prefer to keep all the benefits of rpy2's R magic.
There should be a way to achieve this, as the R magic is calling robjects.r() (as you are in your workaround).
In short, the following happens when you submit an %%R Jupyter cell for evaluation:
Parameters on the %%R line are evaluated, and any setup prior to the evaluation of the R code is done (e.g., use a local converter, convert input parameters, etc.)
The R code in the rest of the %%R cell is evaluated in the R "Global Environment" as a string of code
Exit setup is run and results are returned
The second step is essentially a call to the R C API, which the GIL makes the only activity happening in that process. However, rpy2 defines default callbacks that reroute R's printing to the terminal/console through Python's own print(), which is why you see the prints as the code is running in your call to robjects.r().
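As a sketch of how one could approach this on the rpy2 side (assuming the rpy2 3.x layout, where the console-write hook lives in rpy2.rinterface_lib.callbacks), the callback can be replaced with one that forwards and flushes immediately:

import sys
from rpy2.rinterface_lib import callbacks

def consolewrite_immediate(s):
    # Forward R's console output to Python's stdout and flush right away,
    # so in-place updates such as a progress bar appear as they happen.
    sys.stdout.write(s)
    sys.stdout.flush()

callbacks.consolewrite_print = consolewrite_immediate

Whether the notebook front end renders txtProgressBar's carriage returns in place is a separate question; this only addresses when the output reaches Python.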
I see that the R magic is caching the R output, and while there is an attribute cache_display_data that should control this, it is not used. This is a bug, both for the reason you are asking about on Stack Overflow and because an R code block that prints a lot would use more memory than needed (and could even exhaust all RAM). I do not know whether it has always been present or was introduced during a code refactoring; it is now tracked here: https://bitbucket.org/rpy2/rpy2/issues/543
Edit: The fix is now in the repository, and will be part of rpy2-3.0.3 (likely released today).

python: import numpy as np from outer code gets lost within my own user defined module

I'm doing simulations for scientific computing, and I almost always want to be in the interactive interpreter to poke around at the output of my simulations. I'm trying to write classes to define simulated objects (neural populations), and I'd like to formalize my testing of these classes by calling a script with %run test_class_WC.py in IPython. Since the module/file containing the class changes as I try to debug it and add features, I'm reloading it each time.
./test_class_WC.py:
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
import numpy as np
from WC_class import WC_unit # put the class into my global namespace?
E1 = WC_unit(Iapp=100)
E1.update() # see if it works
print E1.r
So right off the bat I'm using reload to make sure I've got the most current version of the module loaded, so I've got the freshest class definition. I'm sure this is clunky as heck (and maybe more sinister?), but it saves me the trouble of doing %run WC_class.py and then a separate call to %run test_WC.py.
and ./WC_class.py:
class WC_unit:
    nUnits = 0
    def __init__(self, **kwargs):
        self.__dict__.update(dict(  # a bunch of params
            gee=.6,                 # I need to be able to change
            ke=.1, the=.2,          # these in test_class_WC.py
            tau=100., dt=.1, r=0., Iapp=1.), **kwargs)
        WC_unit.nUnits += 1
    def update(self):
        def f(x, k=self.ke, th=self.the):   # a function I define inside a method
            return 1/(1+np.exp(-(x-th)/k))  # using some of those params
        x = self.Iapp + self.gee * self.r
        self.r += self.dt/self.tau * (-self.r + f(x))
WC_unit basically defines a bunch of default parameters and an ODE that updates using basic Euler integration. I expect test_class_WC to set up a global namespace containing np (and WC_unit, and WC_class).
When I run it, I get the following error:
In [14]: %run test_class_WC.py
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/Users/steeles/Desktop/science/WC_sequence/test_class_WC.py in <module>()
8
9 E1 = WC_unit(Iapp=100)
---> 10 E1.update()
11
12 # if bPlot:
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in update(self)
19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
---> 21 self.r += self.dt/self.tau * (-self.r + f(x))
22
23 # #class_method
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in f(x, k, th)
17 def update(self):
18 def f(x,k=self.ke,th=self.the):
---> 19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
21 self.r += self.dt/self.tau * (-self.r + f(x))
NameError: global name 'np' is not defined
Now I can get around this by just importing numpy as np at the top of the WC_class module, or even by doing from numpy import exp in test_class_WC and changing the update() method to use exp() instead of np.exp()... but I'm not doing this because it's easy; I want to learn how all this namespace/module stuff works so I stop being a Python idiot. Why is np getting lost in the WC_unit namespace? Is it because I'm dealing with two different files/modules? Does the call to np.exp being inside a function have anything to do with it?
I'm also open to suggestions regarding improving my workflow and file structure, as it seems to be not particularly pythonic. My background is in MATLAB if that helps anyone understand. I'm editing my .py files in SublimeText2. Sorry the code is not very minimal, I've been having a hard time reproducing the problem.
The correct approach is to do an import numpy as np at the top of your sub-module as well. Here's why:
The key thing to note is that in Python, global actually means "shared at module level", and the namespaces of modules are distinct from each other except when a module explicitly imports from another module. An imported module definitely cannot reach out to its 'parent' module's namespace, which is probably a good thing, all things considered; otherwise you'd have modules whose behavior depends entirely on the variables defined in whichever module imports them.
So when the stack trace says global name 'np' is not defined, it's talking about it at the module level. Python does not let the WC_class module access objects in its parent module by default.
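A two-file sketch of exactly this situation (the file names are hypothetical):

# helper.py (hypothetical): np is never imported in this module
def f(x):
    return np.exp(x)   # looks up 'np' in helper.py's own namespace

# main.py (hypothetical)
import numpy as np     # binds 'np' in main.py's namespace only
import helper
helper.f(1.0)          # raises NameError: name 'np' is not defined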
(As an aside, effbot has a quick note on how to do inter-module globals)
Another key thing to note is that even if you have import numpy as np in various modules of your code, the module actually only gets loaded (i.e. executed) once. Once loaded, modules (being Python objects themselves) can be found in the dictionary sys.modules, and if a module already exists in this dictionary, any import module_to_import statement simply lets the importing module access names in the namespace of module_to_import. So having import numpy as np scattered across multiple modules in your codebase isn't wasteful.
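A quick demonstration of that caching behavior:

import sys
import numpy as np              # first import: numpy's code runs and is cached

print('numpy' in sys.modules)   # True: the module object now lives in the cache

import numpy as np2             # second import: no re-execution, just a new name
print(np2 is np)                # True: both names refer to the same module object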
Edit: On deeper digging, effbot has an even deeper (but still pretty quick and simple) exploration of what actually happens in module imports. For deeper exploration of the topic, you may want to check the import system discussion newly added in the Python 3 documentation.
It is normal in Python to import each module that is needed within each module. Don't count on any 'global' imports; in fact, there isn't such a thing - with one exception. I discovered in
Do I have to specify import when Python script is being run in Ipython?
that %run -i myscript runs the script in the IPython interactive namespace. So for quick test scripts this can save a bunch of imports.
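For example, a minimal sketch of that workflow:

In [1]: import numpy as np        # np now lives in the interactive namespace

In [2]: %run -i test_class_WC.py  # with -i the script sees that namespace,
                                  # so it can use np without importing it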
I don't see the need for this triple import
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
...
from WC_class import WC_unit
If WC_unit is all you are using from WC_class, just use the last line.
