When I try to call a file and its method using Jython, it shows the following error, even though NumPy, Python, and NLTK are correctly installed and everything works properly when I run it directly from the Python shell:
File "C:\Python26\Lib\site-packages\numpy\core\__init__.py", line 5, in <module>
import multiarray
ImportError: No module named multiarray
The code that I am using is a simple one:
PyInstance hello = ie.createClass("PreProcessing", "None");
PyString str = new PyString("my name is abcd");
PyObject po = hello.invoke("preprocess", str);
System.out.println(po);
When I run the Python file containing the class PreProcessing on its own and call the method preprocess, it works fine, but through Jython it throws the error above.
Jython seems unable to import libraries for which only a compiled version, not the Python source, is present in the folder. For example, instead of multiarray.py there is only multiarray.pyd, the compiled version, so Jython does not detect it.
Why is it showing this behaviour? How can I resolve it?
Please help!
I know this is an old thread, but I recently ran into this same problem and was able to solve it, and I figure the solution should be here in case anyone runs into it in the future. As said above, Jython cannot deal with NumPy's pre-compiled C files, but within NLTK the use of NumPy is very limited, and it's fairly straightforward to rewrite the affected bits of code. That's what I did, and I'm sure it's not the most computationally efficient solution, but it works. This code is found in nltk.metrics.segmentation; I will only paste the relevant code, but it will still be a little much.
def _init_mat(nrows, ncols, ins_cost, del_cost):
    mat = [[4.97232652e-299 for x in xrange(ncols)] for x in xrange(nrows)]
    for x in range(0, ncols):
        mat[0][x] = x * ins_cost
    for x in range(0, nrows):
        mat[x][0] = x * del_cost
    return mat
def _ghd_aux(mat, rowv, colv, ins_cost, del_cost, shift_cost_coeff):
    for i, rowi in enumerate(rowv):
        for j, colj in enumerate(colv):
            shift_cost = shift_cost_coeff * abs(rowi - colj) + mat[i][j]
            if rowi == colj:
                # boundaries are at the same location, no transformation required
                tcost = mat[i][j]
            elif rowi > colj:
                # boundary match through a deletion
                tcost = del_cost + mat[i][j + 1]
            else:
                # boundary match through an insertion
                tcost = ins_cost + mat[i + 1][j]
            mat[i + 1][j + 1] = min(tcost, shift_cost)
Also at the end of ghd, change the return statement to
return mat[-1][-1]
I hope this helps someone! I don't know if there are other places where this is an issue, but this is the only one I have encountered. Any other issues of this sort can be solved the same way (using a list of lists instead of a NumPy array); again, you probably lose some efficiency, but it works.
Jython is Java. Parts of NumPy are implemented as C extensions to Python (.pyd files). Some parts are implemented as .py files, which will work just fine in Jython; however, they cannot function without access to the C-level code. Currently, there is no way to use NumPy in Jython. See:
Using NumPy and Cpython with Jython
Or
Is there a good NumPy clone for Jython?
For recent discussions on alternatives.
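If you end up taking the pure-Python route described above, one way to structure it is to attempt the NumPy import and fall back to plain lists when the C extension cannot load. A minimal sketch, assuming a helper of your own (zeros_2d is just an illustrative name, not part of NumPy or NLTK):
try:
    import numpy as np
    HAVE_NUMPY = True
except ImportError:   # e.g. under Jython, where multiarray cannot be loaded
    HAVE_NUMPY = False

def zeros_2d(nrows, ncols):
    # stand-in for np.zeros((nrows, ncols)) when NumPy is unavailable
    if HAVE_NUMPY:
        return np.zeros((nrows, ncols))
    return [[0.0] * ncols for _ in range(nrows)]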
I'm currently using a Python module called petsc4py (https://pypi.org/project/petsc4py/). My main issue is that none of the typical intellisense features seem to work with this module.
I'm guessing it might have something to do with it being a C extension module, but I am not sure exactly why this happens. I initially thought that intellisense was unable to look inside ".so" files, but it seems that numpy is able to do this with the array object, which in my case is inside a file called multiarray.cpython-37m-x86_64-linux-gnu (see the example below).
Does anyone know why I see this behaviour with the petsc4py module? Is there anything that I (or the developers of petsc4py) can do to get intellisense to work?
Example:
import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc
x_p = PETSc.Vec().create()
x_p.setSizes(10)
x_p.setFromOptions()
u_p = x_p.duplicate()
import numpy as np
x_n = np.array([1,2,3])
u_n = x_n.copy()
In this example, when trying to work with a Vec object from petsc4py, typing u_p.duplicate() gives no completion: intellisense cannot find the function, and the suggestion is simply a repetition of the function called immediately before. However, with an array from numpy, u_n.copy() autocompletes perfectly.
If you're compiling in-place then you're bumping up against https://github.com/microsoft/python-language-server/issues/197.
I'm working on a big Python code base that grows and grows and grows. It's not a single application - more of a bunch of experiments that share some common code.
Every so often, I want to make a public release of a given experiment. I don't want to release my entire awful codebase, just the parts required to run a given experiment. So basically I'd like something to crawl through all the imports and copy whatever functions are called (or at least all the modules imported) into a single file, which I can release as a demo. I'd of course like to only do this for files defined in the current project (not a dependent package like numpy).
I'm using PyCharm now, and haven't been able to find that functionality. Is there any tool that does this?
Edit: I created the public-release package to solve this problem. Given a main module, it crawls through dependent modules and copies them into a new repo.
If you just want the modules, you could run the code in a new session and go through sys.modules for any module in your package.
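A rough sketch of that idea (the package name myproject and the entry-point module are hypothetical):
import sys
import myproject.experiment_42   # run the code you want to release first

own_modules = sorted(name for name in sys.modules
                     if name == 'myproject' or name.startswith('myproject.'))
print(own_modules)
Anything not in that list either comes from the standard library or from an installed dependency, which you would declare rather than copy.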
To move all the dependencies with PyCharm, you could make a macro that moves a highlighted object to a predefined file, attach the macro to a keyboard shortcut, and then quickly move any in-project imports recursively. For instance, I made a macro called export_func that moves a function to to_export.py and added an F10 shortcut for it.
Given a function that I want to move, in a file like
from utils import factorize

def my_func():
    print(factorize(100))
and utils.py looking something like
import numpy as np
from collections import Counter
import sys

if sys.version_info.major >= 3:
    from functools import lru_cache
else:
    from functools32 import lru_cache

PREPROC_CAP = int(1e6)

@lru_cache(10)
def get_primes(n):
    n = int(n)
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool)
    for i in range(1, int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[k * k // 3::2 * k] = False
            sieve[k * (k - 2 * (i & 1) + 4) // 3::2 * k] = False
    return list(map(int, np.r_[2, 3, ((3 * np.nonzero(sieve)[0][1:] + 1) | 1)]))

@lru_cache(10)
def _get_primes_set(n):
    return set(get_primes(n))

@lru_cache(int(1e6))
def factorize(value):
    if value == 1:
        return Counter()
    if value < PREPROC_CAP and value in _get_primes_set(PREPROC_CAP):
        return Counter([value])
    for p in get_primes(PREPROC_CAP):
        if p ** 2 > value:
            break
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    for p in range(PREPROC_CAP + 1, int(value ** .5) + 1, 2):
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    return Counter([value])
I can highlight my_func and press F10 to create to_export.py:
from utils import factorize

def my_func():
    print(factorize(100))
Highlighting factorize in to_export.py and hitting F10 gets
from collections import Counter
from functools import lru_cache
from utils import PREPROC_CAP, _get_primes_set, get_primes

def my_func():
    print(factorize(100))

@lru_cache(int(1e6))
def factorize(value):
    if value == 1:
        return Counter()
    if value < PREPROC_CAP and value in _get_primes_set(PREPROC_CAP):
        return Counter([value])
    for p in get_primes(PREPROC_CAP):
        if p ** 2 > value:
            break
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    for p in range(PREPROC_CAP + 1, int(value ** .5) + 1, 2):
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    return Counter([value])
Then highlighting each of PREPROC_CAP, _get_primes_set, and get_primes and pressing F10 gets
from collections import Counter
from functools import lru_cache
import numpy as np

def my_func():
    print(factorize(100))

@lru_cache(int(1e6))
def factorize(value):
    if value == 1:
        return Counter()
    if value < PREPROC_CAP and value in _get_primes_set(PREPROC_CAP):
        return Counter([value])
    for p in get_primes(PREPROC_CAP):
        if p ** 2 > value:
            break
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    for p in range(PREPROC_CAP + 1, int(value ** .5) + 1, 2):
        if value % p == 0:
            factors = factorize(value // p).copy()
            factors[p] += 1
            return factors
    return Counter([value])

PREPROC_CAP = int(1e6)

@lru_cache(10)
def _get_primes_set(n):
    return set(get_primes(n))

@lru_cache(10)
def get_primes(n):
    n = int(n)
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool)
    for i in range(1, int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[k * k // 3::2 * k] = False
            sieve[k * (k - 2 * (i & 1) + 4) // 3::2 * k] = False
    return list(map(int, np.r_[2, 3, ((3 * np.nonzero(sieve)[0][1:] + 1) | 1)]))
It goes pretty fast even if you have a lot of code that you're copying over.
Jamming all your code into a single module isn't a good idea. A good example of why: one of your experiments might depend on two modules that define different functions with the same name, as illustrated below. With separate modules, it's easy for your code to distinguish between them; to stuff them into the same module, the editor would have to do some kind of hacky function renaming (e.g., prefix them with the old module name), and the situation gets even worse if some other function in the module calls the one with the conflicting name. You effectively have to replace the whole module scoping mechanism to do this.
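A concrete illustration with two standard-library modules that happen to share a function name:
import math
import cmath

# Both modules define sqrt(); module scoping lets the caller pick explicitly,
# with no renaming needed.
print(math.sqrt(2))     # 1.4142135623730951
print(cmath.sqrt(-1))   # 1j
Collapsing both into one file would force exactly the kind of renaming described above.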
Building a list of module dependencies is also a non-trivial task. Consider an experiment that depends on a module that depends on numpy. You almost certainly want your end users to actually install the numpy package rather than bundle it, so now the editor has to have some way of distinguishing which modules to include and which ones you expect to be installed some other way. On top of this, you have to consider things like functions that import a module inline rather than at the top of your module, and other out-of-the-ordinary cases.
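If you do want a rough automated crawl, the standard library's modulefinder can produce the raw list, though you still have to filter out everything that lives outside your project tree yourself; a sketch (the script name and project path are placeholders):
from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('experiment_main.py')   # hypothetical entry point

for name, mod in finder.modules.items():
    path = getattr(mod, '__file__', None)
    # keep only modules that live inside the project, not site-packages or the stdlib
    if path and path.startswith('/path/to/my/project'):
        print(name, path)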
You're asking too much of your editor. You really have two problems:
Separate your experimental code from your release-ready code.
Package your stable code.
Separating your experimental code
Source control is the answer to your first problem. It will allow you to create whatever experimental code you wish on your local machine, and as long as you don't commit it, you won't pollute your code base with experimental code. If you do want to commit this code for backup, tracking, or sharing purposes, you can use branching. Identify one branch as your stable branch (typically trunk in SVN and master in git), and only commit experimental code to other branches. You can then merge experimental feature branches into the stable branch as they become mature enough to publish. Such a branching setup has the added benefit of letting you segregate your experiments from each other, if you choose.
A server hosted source control system will generally make things simpler and safer, but if you're the sole developer, you could still use git locally without a server. A server hosted repository also makes it easier to coordinate with others if you're not the sole developer.
Packaging your stable code
One very simple option to consider is to just tell your users to check out the stable branch from the repository. Distributing this way is far from unheard of, and it is still a little better than your current situation, since you no longer need to gather all your files manually; you may need to write a little documentation, though. Alternatively, you could use your source control's built-in feature to export an entire commit as a zip file or similar (export in SVN, archive in git) if you don't want to make your repository publicly available; the archive can be uploaded anywhere.
If that doesn't seem like enough and you can spare the time, setuptools is probably a good answer. It will let you generate a wheel containing your stable code. You can have a setup.py script for each package of code you want to release; the setup.py script identifies which packages and modules to include. You do have to manage this script manually, but if you configure it to include whole packages and directories and then establish good project conventions for organizing your code, you shouldn't have to change it very often. This also has the benefit of giving your end users a standard install mechanism for your code. You could even publish it on PyPI if you wish to share it broadly.
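A minimal setup.py along those lines might look like this (the project name, version, and package names are placeholders):
from setuptools import setup, find_packages

setup(
    name='my-experiment',
    version='0.1.0',
    packages=find_packages(include=['experiment_pkg', 'experiment_pkg.*']),
    install_requires=['numpy'],   # declared as a dependency, not bundled
)
Running python setup.py bdist_wheel then produces the wheel you hand to users (or upload to PyPI).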
If you go so far as to use setuptools, you may also want to consider a build server, which can pick up on new commits and can run scripts to repackage and potentially publish your code.
So in the end, to solve this problem, I made a tool called public-release, which collects all the dependencies for the module you want to release and throws them into a separate repo, with setup scripts and all, so that your code can be easily run later.
Unfortunately, the dynamic features of Python make this impossible in general. (For example, you can call functions by names that come from an arbitrary source.)
You can approach it from the opposite direction: remove all the unused parts of the code instead.
According to this question, PyCharm does not support this. The vulture package provides dead-code detection.
Therefore, I propose making a copy of the project in which you collect the required functions into a module, then detecting all unused parts of the demo code and removing them.
In PyCharm you can select the code you wish to move into a new module and, from the main menu, choose Refactor -> Copy (F6 on mine, but I can't remember if that's a customised shortcut). This gives you the option to copy the code to a new (or existing) file in a directory of your choosing. It will also add all the relevant imports.
This question concerns Matlab 2014b, Python 3.4 and Mac OS 10.10.
I have the following Python file tmp.py:
from statsmodels.tsa.arima_process import ArmaProcess
import numpy as np
def generate_AR_time_series():
    arparams = np.array([-0.8])
    maparams = np.array([])
    ar = np.r_[1, -arparams]
    ma = np.r_[1, maparams]
    arma_process = ArmaProcess(ar, ma)
    return arma_process.generate_sample(100)
I want to call generate_AR_time_series from Matlab so I used:
py.tmp.generate_AR_time_series()
which gave a vague error message
Undefined variable "py" or class "py.tmp.generate_AR_time_series".
To look into the problem further, I tried
tmp = py.eval('__import__(''tmp'')', struct);
which gave me a detailed but still obscure error message:
Python Error:
dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/scipy/special/_ufuncs.so, 2): Symbol
not found: __gfortran_stop_numeric_f08
Referenced from: /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/scipy/special/_ufuncs.so
Expected in: /Applications/MATLAB_R2014b.app/sys/os/maci64/libgfortran.3.dylib
in /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/scipy/special/_ufuncs.so
I can call the function from within Python just fine, so I guess the problem is with Matlab. From the detailed message, it seems the symbol is expected in a library inside the Matlab installation path, but of course the Matlab installation does not contain it, since these are third-party libraries for Python.
How to solve this problem?
Edit 1:
libgfortran.3.dylib can be found in a lot of places:
/Applications/MATLAB_R2014a.app/sys/os/maci64/libgfortran.3.dylib
/Applications/MATLAB_R2014b.app/sys/os/maci64/libgfortran.3.dylib
/opt/local/lib/gcc48/libgfortran.3.dylib
/opt/local/lib/gcc49/libgfortran.3.dylib
/opt/local/lib/libgcc/libgfortran.3.dylib
/Users/wdg/Documents/MATLAB/mcode/nativelibs/macosx/bin/libgfortran.3.dylib
Try:
setenv('DYLD_LIBRARY_PATH', '/usr/local/bin/');
For me, using the setenv approach from within MATLAB did not work. Also, MATLAB modifies the DYLD_LIBRARY_PATH variable during startup to include necessary libraries.
First, you have to find out which version of gfortran scipy was linked against: in Terminal.app, enter otool -L /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/scipy/special/_ufuncs.so and look for 'libgfortran' in the output.
It worked for me to copy $(MATLABROOT)/bin/.matlab7rc.sh to my home directory and change the line LDPATH_PREFIX='' in the mac section (around line 195 in my case) to LDPATH_PREFIX='/opt/local/lib/gcc49', or whatever path to libgfortran you found above.
This ensures that /opt/local/lib/gcc49/libgfortran.3.dylib is found before the MATLAB version, but leaves other paths intact.
I'm doing simulations for scientific computing, and I'm almost always going to want to be in the interactive interpreter to poke around at the output of my simulations. I'm trying to write classes to define simulated objects (neural populations) and I'd like to formalize my testing of these classes by calling a script %run test_class_WC.py in ipython. Since the module/file containing the class is changing as I try to debug it/add features, I'm reloading it each time.
./test_class_WC.py:
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
import numpy as np
from WC_class import WC_unit # put the class into my global namespace?
E1 = WC_unit(Iapp=100)
E1.update() # see if it works
print E1.r
So right off the bat I'm using reload to make sure I've got the most current version of the module loaded, with the freshest class definition. I'm sure this is clunky as heck (and maybe more sinister?), but it saves me the trouble of doing %run WC_class.py and then a separate %run test_WC.py.
and ./WC_class.py:
class WC_unit:
    nUnits = 0
    def __init__(self, **kwargs):
        self.__dict__.update(dict(  # a bunch of params
            gee=.6,                 # I need to be able to change
            ke=.1, the=.2,          # in test_class_WC.py
            tau=100., dt=.1, r=0., Iapp=1.), **kwargs)
        WC_unit.nUnits += 1
    def update(self):
        def f(x, k=self.ke, th=self.the):   # a function I define inside a method
            return 1/(1+np.exp(-(x-th)/k))  # using some of those params
        x = self.Iapp + self.gee * self.r
        self.r += self.dt/self.tau * (-self.r + f(x))
WC_unit basically defines a bunch of default parameters and an ODE that updates using basic Euler integration. I expect test_class_WC to set up a global namespace containing np (and WC_unit, and WC_class).
When I run it, I get the following error:
In [14]: %run test_class_WC.py
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/Users/steeles/Desktop/science/WC_sequence/test_class_WC.py in <module>()
8
9 E1 = WC_unit(Iapp=100)
---> 10 E1.update()
11
12 # if bPlot:
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in update(self)
19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
---> 21 self.r += self.dt/self.tau * (-self.r + f(x))
22
23 # #class_method
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in f(x, k, th)
17 def update(self):
18 def f(x,k=self.ke,th=self.the):
---> 19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
21 self.r += self.dt/self.tau * (-self.r + f(x))
NameError: global name 'np' is not defined
Now I can get around this by importing numpy as np at the top of the WC_class module, or even by doing from numpy import exp in test_class_WC and changing the update() method to use exp() instead of np.exp()... but I'm not doing this because it's easy; I want to learn how all this namespace/module stuff works so I stop being a Python idiot. Why is np getting lost in the WC_unit namespace? Is it because I'm dealing with two different files/modules? Does the call to np.exp inside a function have anything to do with it?
I'm also open to suggestions regarding improving my workflow and file structure, as it seems to be not particularly pythonic. My background is in MATLAB if that helps anyone understand. I'm editing my .py files in SublimeText2. Sorry the code is not very minimal, I've been having a hard time reproducing the problem.
The correct approach is to do an import numpy as np at the top of your sub-module as well. Here's why:
The key thing to note is that in Python, global actually means "shared at a module-level", and the namespaces for each module exist distinct from each other except when a module explicitly imports from another module. An imported module definitely cannot reach out to its 'parent' module's namespace, which is probably a good thing all things considered, otherwise you'll have modules whose behavior depends entirely on the variables defined in the module that imports it.
So when the stack trace says global name 'np' is not defined, it's talking about it at the module level. Python does not let the WC_class module access objects in its 'parent' module by default.
(As an aside, effbot has a quick note on how to do inter-module globals)
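A tiny demonstration of that point, using two throwaway files (the names are arbitrary):
# helper.py
def shout():
    return GREETING.upper()   # looked up in helper's own module globals

# main.py
import helper
GREETING = 'hi'               # lives in main's globals, not helper's
helper.shout()                # NameError: name 'GREETING' is not defined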
Another key thing to note is that even if you have multiple import numpy as np in various modules of your code, the module actually only gets loaded (i.e. executed) once. Once loaded, modules (being Python objects themselves) can be found in the dictionary sys.modules, and if a module already exists in this dictionary, any import module_to_import statement simply lets the importing module access names in the namespace of module_to_import. So having import numpy as np scattered across multiple modules in your codebase isn't wasteful.
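You can see the caching directly:
import sys
import numpy as np    # first import actually executes the package
import numpy as np2   # second import just reuses the cached module object

print(np is np2)                   # True
print(np is sys.modules['numpy'])  # True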
Edit: On deeper digging, effbot has an even deeper (but still pretty quick and simple) exploration of what actually happens in module imports. For deeper exploration of the topic, you may want to check the import system discussion newly added in the Python 3 documentation.
It is normal in Python to import each module that is needed within each file. Don't count on any 'global' imports; in fact, there isn't such a thing, with one exception. I discovered in
Do I have to specify import when Python script is being run in Ipython?
that %run -i myscript runs the script in the Ipython interactive namespace. So for quick test scripts this can save a bunch of imports.
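A small illustration, with a throwaway script (the file name is arbitrary):
# quick_check.py -- relies on np already existing in the interactive namespace
print(np.exp(1))
After import numpy as np in the IPython session, %run -i quick_check.py works, while a plain %run quick_check.py fails with the same kind of NameError as above.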
I don't see the need for this triple import
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
...
from WC_class import WC_unit
If WC_unit is all you are using from WC_class, just use the last line.
I'm currently writing a python script which plots a numpy matrix containing some data (which I'm not having any difficulty computing). For complicated reasons having to do with how I'm creating that data, I have to go through terminal. I've done problems like this a million times in Spyder using imshow(). So, I thought I'd try to do the same in terminal. Here's my code:
from numpy import *
from matplotlib import *
def make_picture():
    f = open("DATA2.txt")
    arr = zeros((200, 200))
    l = f.readlines()
    for i in l:
        j = i[:-1]
        k = j.split(" ")
        arr[int(k[0])][int(k[1])] = float(k[2])
    f.close()
    imshow(arr)

make_picture()
Suffice it to say, the array stuff works just fine. I've tested it, and it extracts the data perfectly well. So, I've got this 200 by 200 array of numbers floating around my RAM and I'd like to display it. When I run this code in Spyder, I get exactly what I expected. However, when I run this code in Terminal, I get an error message:
Traceback (most recent call last):
File "DATAmine.py", line 15, in <module>
make_picture()
File "DATAmine.py", line 13, in make_picture
imshow(arr)
NameError: global name 'imshow' is not defined
(My program's called DATAmine.py) What's the deal here? Is there something else I should be importing? I know I had to configure my Spyder paths, so I wonder if I don't have access to those paths or something. Any suggestions would be greatly appreciated. Thanks!
P.S. Perhaps I should mention I'm using Ubuntu. Don't know if that's relevant.
To make your life easier you can use
from pylab import *
This imports the pylab interface, which pulls matplotlib's plotting functions (including imshow and show) and numpy into a single namespace.
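If you prefer to keep the namespaces explicit rather than star-importing, the equivalent for the function above would be something like this (arr being the array built in make_picture):
import matplotlib.pyplot as plt

plt.imshow(arr)
plt.show()   # needed in a plain terminal session so the figure window actually appears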
Cheers