I'm using scikit-learn on a supercomputer. I'm using the presently installed versions of the packages I need, and it works just fine. I have a local copy of the code for development.
Due to some apparent discrepancy in the sklearn and numpy versions on the system, I get the following warnings:
python3.7/site-packages/sklearn/utils/__init__.py:4: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
from collections import Sequence
python3.7/site-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
from numpy.core.umath_tests import inner1d
I don't need to see these warnings or fix anything, because I'm not developing on this system; I just need the code to run. But if I use this code in production, the warning will land in my error file millions of times and I won't be able to find the errors that actually matter.
In my python scripts I have:
import warnings
warnings.simplefilter(action='ignore', category=DeprecationWarning)
import numpy as np
from sklearn import ensemble
from sklearn import metrics
In my bash script I have:
jsrun -n $2 python3 -W ignore::DeprecationWarning myscript.py $args &
I've read a lot of the answers to similar questions but I don't see why I'm still getting this warning. How can I make it go away?
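For completeness, here is a minimal sketch of the suppression pattern I'm aiming for, with the filter active exactly while the imports run, since that is when these DeprecationWarnings are emitted. I've also seen the PYTHONWARNINGS environment variable suggested (e.g. export PYTHONWARNINGS="ignore::DeprecationWarning" in the bash script before jsrun) for cases where the per-process -W flag doesn't reach every launched process:
import warnings
with warnings.catch_warnings():
    # filters installed here apply while the imports execute
    warnings.simplefilter("ignore", category=DeprecationWarning)
    import numpy as np
    from sklearn import ensemble
    from sklearn import metrics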
PyCharm is giving me an
AttributeError: module 'http' has no attribute 'client'
when trying to load pandas in a specific file. Strangely, I only get this error if I also have the following import:
from sklearn import svm
pandas comes with the Anaconda package suite, and it loads and runs fine without the sklearn import, as well as in the console.
Python 3.7, GitHub extension, Anaconda suite. I have tried reinstalling pandas and sklearn to no avail, and I have tried reordering my file structure to avoid Python path issues.
import pandas
from sklearn import svm
I expect the code to run, as there are no noticeable syntax errors, and even with the rest of the code commented out this still happens.
Edit: it runs when launched from the console, so there appears to be an issue with the Python path. Is there a way to investigate this more closely or control the Python path directly? Also, it works when I use
import sklearn
but does not with
from sklearn import svm
if that helps narrow down the issue at all...
The problem appears to be that my file was named ssl.py. I don't know whether there is another module by that name that it was shadowing and causing the path issues, but it seems to be fixed now...
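For anyone who hits the same thing: a quick way to confirm this kind of shadowing is to check where Python actually resolves the module from. A minimal check (ssl being the stdlib module my file was presumably shadowing):
import ssl
print(ssl.__file__)  # if this points at your own ssl.py, the stdlib module is shadowed
import sys
print(sys.path[0])   # the script's directory, which Python searches before site-packages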
I am using VS Code to write my Python code, which uses the pandas, numpy, and requests libraries. The code itself runs fine, but in the VS Code Problems pane I always see these messages:
Unable to import 'numpy' (pylint import error)
Unable to import 'pandas' (pylint import error)
Unable to import 'requests' (pylint import error)
I searched Stack Overflow questions for an answer to this problem; the suggestion there is to install pandas using pip, which I did, but I am still facing the same problem. How do I fix this in the VS Code editor?
This is not telling you that numpy or pandas is not installed. It is telling you that pylint can't verify your numpy and pandas calls. Most of numpy and pandas is written in C, not Python.
The pylint documentation says
Linting C extension modules is not supported out of the box,
especially since pylint has no way to get an AST object out of the
extension module.
So there is no problem with your code, even though VS Code flags one; it is a technical limitation of pylint. If it bothers you, disable pylint message E0401 (import-error) for these import statements by putting # pylint: disable=E0401 on the same line as each import.
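Concretely, with the imports from the question, that looks like:
import numpy as np  # pylint: disable=E0401
import pandas as pd  # pylint: disable=E0401
import requests  # pylint: disable=E0401
Alternatively, pylint's --extension-pkg-whitelist option (for example --extension-pkg-whitelist=numpy, settable in .pylintrc or in VS Code's pylint arguments) tells pylint it is allowed to load those C extension modules in order to introspect them.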
I am trying to visualize my topic modeling results with the pyLDAvis Python module. But when I try to import:
import pyLDAvis
import pyLDAvis.sklearn
I then get the following warnings:
...\site-packages\msgpack_numpy.py:77: DeprecationWarning: The binary mode of fromstring
is deprecated, as it behaves surprisingly on unicode inputs.
Use frombuffer instead dtype=np.dtype(descr)).reshape(obj[b'shape'])
Many of these warnings appear, always from the same file. Does somebody know how I could fix this? Thanks!
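One workaround I'm considering, until msgpack_numpy moves from fromstring to frombuffer upstream, is filtering only that module's DeprecationWarnings before the import (the module argument of filterwarnings is a regex matched against the module that raises the warning):
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning, module="msgpack_numpy")
import pyLDAvis
import pyLDAvis.sklearn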
I am using f2py to wrap my PETSc-based Fortran analysis code for use in OpenMDAO (as suggested in this post). Rather than use f2py directly, I'm using it to generate the relevant .c, .pyf, etc. files and then linking them myself with mpif90.
In a simple python environment, I can import my .so and run the code without any problems:
>>> import module_name
>>> module_name.execute()
expected code output...
However, when trying to do the same thing in an OpenMDAO component, I get the following error:
At line 72 of file driver.F90
Internal Error: list_formatted_write(): Bad type
This happens even when running in serial, and the error appears at the first place in the Fortran code where I use write(*,*). What could be different about running under OpenMDAO that might cause this issue? Might it have something to do with the need to pass a comm object, as mentioned in the answer to my original question? I am not doing that at the moment, as it was not clear to me from the relevant OpenMDAO example how that should be done in my case.
When I try to find specific information about the error I'm getting, search results almost always point to the mpif90 or gfortran libraries and to possibly needing to recompile or update them. However, that doesn't explain why my analysis works perfectly well in a plain Python script but not in OpenMDAO.
UPDATE: Per others' suggestions, I've tried a few more things. First, I get the error regardless of whether I run with mpiexec python <script> or merely python <script>. I do have the PETSc implementation set up, assuming that doesn't refer to anything beyond the if MPI block in this example.
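For reference, the if MPI block I'm referring to looks roughly like this in the OpenMDAO 1.x examples (treat the exact import paths here as my reading of that example rather than gospel):
from openmdao.core.mpi_wrap import MPI

if MPI:
    # running under mpirun/mpiexec: use the PETSc data-passing implementation
    from openmdao.core.petsc_impl import PetscImpl as impl
else:
    # serial fallback
    from openmdao.core.basic_impl import BasicImpl as impl

# impl is then handed to the top-level problem, e.g. Problem(impl=impl)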
In my standalone test, I am able to successfully import a handful of things, including
from mpi4py import MPI
from petsc4py import PETSc
from openmdao.core.system import System
from openmdao.core.component import Component
from openmdao.core.basic_impl import BasicImpl
from openmdao.core._checks import check_connections, _both_names
from openmdao.core.driver import Driver
from openmdao.core.mpi_wrap import MPI, under_mpirun, debug
from openmdao.components.indep_var_comp import IndepVarComp
from openmdao.solvers.ln_gauss_seidel import LinearGaussSeidel
from openmdao.units.units import get_conversion_tuple
from openmdao.util.string_util import get_common_ancestor, nearest_child, name_relative_to
from openmdao.util.options import OptionsDictionary
from openmdao.util.dict_util import _jac_to_flat_dict
Not too much rhyme or reason to what I tested; I just went down a few random rabbit holes (more direction would be fantastic). Here are some of the things that do result in the error if they are imported in the same script:
from openmdao.core.group import Group
from openmdao.core.parallel_group import ParallelGroup
from openmdao.core.parallel_fd_group import ParallelFDGroup
from openmdao.core.relevance import Relevance
from openmdao.solvers.scipy_gmres import ScipyGMRES
from openmdao.solvers.ln_direct import DirectSolver
So it doesn't seem that the MPI imports are the problem? However, not knowing the OpenMDAO code well, I am having trouble seeing the common thread among the problematic imports.
UPDATE 2: I should add that I'm becoming particularly suspicious of the networkx package. If my script is simply
import networkx as nx
import module_name
module_name.execute()
then I get the error. If I import my module before networkx, however (i.e. switch lines 1 and 2 in the above block), I don't get the error. More strangely, if I also import PETSc:
from petsc4py import PETSc
import networkx as nx
import module_name
module_name.execute()
Then everything works...
UPDATE 3: I'm running OS X El Capitan 10.11.6. I genuinely don't remember how I installed the Python 2.7 I was using (I need 2.7 rather than 3.x at the moment); it was installed years ago and lives in /usr/local/bin. However, I switched to an Anaconda installation, re-installed networkx, and still get the same error.
I've discovered that if I compile the f2py-wrapped stuff using gfortran (I assume this is what you guys do, yes?) rather than mpif90, I don't get the errors. Unfortunately, this causes the PETSc parts of my Fortran code to yield some strange errors, probably because those .f90/.F90 files are, per the PETSc compilation rules, compiled by mpif90 even when I force the final link to use gfortran.
UPDATE 4: I was finally able to solve the Internal Error: list_formatted_write() issue. Using mpif90 --showme, I could see what flags mpif90 passes (since it's essentially just gfortran plus some flags). It turns out that omitting the flag -Wl,-flat_namespace got rid of those print-related errors.
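For anyone else chasing this, the sequence was roughly as follows (--showme is Open MPI's flag for printing the underlying compiler command; the exact flags will differ per system):
mpif90 --showme
# prints the gfortran command line mpif90 wraps, including -Wl,-flat_namespace
# then repeat the final link step by hand with the same flags, minus -Wl,-flat_namespace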
Now I can import most things and run my code without a problem, with one important exception. If I have a PETSc-based Fortran module (pc_fort_mod), then also importing PETSc into the Python environment, i.e.
from petsc4py import PETSc
import pc_fort_mod
pc_fort_mod.execute()
results in PETSc errors in the Fortran analysis (invalid matrices, unsuccessful preallocation). This seems plausible to me, since both would appear to be using the same PETSc libraries. Is there a way to do this so that the pc_fort_mod PETSc and the petsc4py PETSc don't clash? I guess a workaround may be to have two PETSc builds...
SOLVED: I'm told that the situation described in Update 4 ultimately should not be a problem; it is possible to use PETSc from Python and Fortran simultaneously. I was able to resolve my error by using a self-compiled PETSc build rather than the Homebrew recipe.
I've never quite seen anything like this before, and we've used networkx with compiled Fortran wrapped in f2py, running under MPI, many times.
I suggest that you remove and re-install your networkx package.
Which Python are you using, and what OS are you running on? We've had very good luck with the Anaconda Python installation. You do have to be a bit careful when installing PETSc, though: building from source and running the PETSc tests is the safest way.
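In outline, a from-source build looks something like this (the configure options here are placeholders; configure prints the exact make line for your machine):
./configure --with-cc=mpicc --with-fc=mpif90
make all   # or the exact 'make PETSC_DIR=... PETSC_ARCH=... all' line that configure prints
make test  # run the PETSc test suite before building petsc4py or your own code against it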
I installed scikit-learn with pip (pip install -U scikit-learn). I then opened IPython and ran import sklearn, but if I try to use any of its submodules, they aren't found. In particular, the tab completion on sklearn doesn't look right:
In [2]: sklearn.
sklearn.base sklearn.clone sklearn.externals sklearn.re sklearn.setup_module sklearn.sys sklearn.test sklearn.warnings
Any idea what's going on here? Other modules load fine. For example, numpy works normally.
Import the submodule you want to use explicitly:
import sklearn.<submodule>
print sklearn.<submodule>.function()
or
from sklearn.<submodule> import function
print function()
In large Python packages, the submodules often need to be imported explicitly. This lets the user pick and choose what to import without pulling in the entire package (which can hurt startup time).
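For example, with concrete sklearn submodules (the particular class and function here are just illustrative picks):
import sklearn.ensemble
print(sklearn.ensemble.RandomForestClassifier)  # resolves once the submodule is imported

from sklearn.metrics import accuracy_score
print(accuracy_score([0, 1, 1], [0, 1, 0]))  # 0.666...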