Python: How do I find which pip package a library belongs to?

I got a script handed over from someone else, and it imports a module. I'm wondering what the best way is to find out which pip package installed this library (other than searching online).
I tried importing the package and calling help() on it, but didn't get much information. Is there a reliable and Pythonic way to achieve this?
For example:
In the script it has a line
from impala.dbapi import connect
Without searching the internet, how can I find out that the following package installs this library? As you can see, in this case the name used in the import is different from the name used with pip.
pip install impyla

Short answer: You can't.
The core of the reason is that the name to import the package and the name to install the package come from different namespaces. When you run the import command, Python is looking for the package in the local environment. When you tell pip to install a package, it's looking for something to install from PyPI (or somewhere else if you told pip to look elsewhere).
It would make sense for these two names to always be the same, but there's no guarantee. Installation is a matter of choosing which set of files to download and install, while importing is a matter of choosing what installed files to run, but the names for those files do not have to stay the same during the installation process. And that's how we get confusion like pip install impyla, import impala.
If you had access to the setup.py file for the package, you could look in there for the name (if you look at the GitHub for impyla/impala, you'll see a name='impyla', line inside the call to setup), but if the package was installed via pip, you won't have the setup.py file locally, so this option is pretty much right out.
It's not a great state of affairs. There's simply no guarantee that you can find the PyPI name for a package from just having local access to the package code and the "real" import name for the package. That said, if you're unfamiliar with the package, you're probably going to want to look up documentation and more info on the internet anyway. Just one more thing to look up, I guess.

If you do not have the package installed, you'll have to search for it: pip search, poetry search, or a web search. (Note that pip search no longer works against PyPI, which has disabled the search API it relied on.)
If you already have the package installed (i.e. you can import it), you can get the package name with importlib.metadata:
In Python >=3.8, importlib.metadata is part of the standard library. The packages_distributions() function used below was added in Python 3.10.
For older Python versions, there is the importlib_metadata backport (link to documentation). Replace importlib.metadata with importlib_metadata in the examples below.
Example usage:
>>> from importlib.metadata import packages_distributions
>>> packages_distributions()
{'asttokens': ['asttokens'],
'backcall': ['backcall'],
'bitarray': ['bitarray'],
'colorama': ['colorama'],
'decorator': ['decorator'],
'executing': ['executing'],
'importlib_metadata': ['importlib-metadata'],
'impala': ['impyla'],
'IPython': ['ipython'],
'jedi': ['jedi'],
'matplotlib_inline': ['matplotlib-inline'],
'parso': ['parso'],
'pickleshare': ['pickleshare'],
'pip': ['pip'],
'prompt_toolkit': ['prompt-toolkit'],
'pure_eval': ['pure-eval'],
'puresasl': ['pure-sasl'],
'pygments': ['Pygments'],
'_distutils_hack': ['setuptools'],
'pkg_resources': ['setuptools'],
'setuptools': ['setuptools'],
'six': ['six'],
'stack_data': ['stack-data'],
'thrift': ['thrift'],
'thrift_sasl': ['thrift-sasl'],
'traitlets': ['traitlets'],
'wcwidth': ['wcwidth'],
'zipp': ['zipp']}
In this case, you would look for the "impala" entry in the packages_distributions() dictionary:
>>> packages_distributions()['impala']
['impyla']
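As a minimal sketch of the lookup above, the helper below accepts a dotted import path such as "impala.dbapi" and returns the distributions that provide its top-level package. The function name distributions_for is made up for illustration, and the fallback import assumes the importlib_metadata backport is installed on Python older than 3.10.

```python
try:
    # Standard library location (packages_distributions needs Python 3.10+).
    from importlib.metadata import packages_distributions
except ImportError:
    # Assumption: the importlib_metadata backport is installed instead.
    from importlib_metadata import packages_distributions


def distributions_for(import_name):
    """Return the installed distribution names providing `import_name`.

    `distributions_for` is a hypothetical helper, not part of any library.
    Only the top-level package is looked up, so "impala.dbapi" is treated
    as "impala". An empty list means nothing installed claims that name.
    """
    top_level = import_name.split(".")[0]
    return packages_distributions().get(top_level, [])


if __name__ == "__main__":
    # With impyla installed this would be expected to list 'impyla'.
    print(distributions_for("impala.dbapi"))
```

An empty result does not prove the module is missing: standard-library modules, for example, are importable but belong to no distribution.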
You may also check, for example, the version of the package:
>>> from importlib.metadata import version
>>> # Note: Using the name of the package, not the "import name"
>>> version('impyla')
'0.18.0'
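If the distribution name is wrong (for example, the import name was passed by mistake), version() raises PackageNotFoundError. A small sketch that turns that into None; the helper name installed_version is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if not installed.

    `installed_version` is an illustrative helper. It expects the
    distribution (pip) name, e.g. 'impyla', not the import name 'impala'.
    """
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version("impyla"))
```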
Bonus: Checking the name of the module from a function or class name
>>> connect.__module__.split('.')[0]
'impala'
Note that this does not guarantee that a package was installed from PyPI. You could, if you wanted, create your own package called "impyla", "matplotlib", or anything else, install it, and use it as you wish.

Related

How import package from PyPI with hyphen in name?

There is a package in PyPI called neat-python (yes, with a hyphen). I can install it just fine but can't import it into Python. I've tried underscores, parentheses, and making the name a string but of course the import statement doesn't allow them. Does PyPI actually accept packages with illegal Python names or is there a solution I'm overlooking?
A hyphen is not allowed in import syntax. In the case of 'neat-python' the package is simply installed as 'neat':
import neat
You can check this yourself by looking in your site-packages directory (for me, that is /usr/local/lib/python3.7/site-packages).
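Rather than browsing the directory by hand, you can ask Python where site-packages is and where a given import name would be loaded from. A rough sketch, with where_is being a made-up helper name and "json" standing in for whatever module you are curious about:

```python
import importlib.util
import sysconfig


def where_is(import_name):
    """Return the file a top-level import would load, or None if not found.

    Hypothetical helper built on importlib.util.find_spec. For namespace
    packages the origin may be None even though the package exists.
    """
    spec = importlib.util.find_spec(import_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Directory where pip normally installs pure-Python packages.
    print(sysconfig.get_paths()["purelib"])
    # Location of an importable package (stdlib 'json' used as a stand-in).
    print(where_is("json"))
```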
Edit: and yes, this is allowed for PyPI packages, and it can be annoying. Usually the actual package name will be some very similar variant of the name used to install from PyPI.
In Python 3 you can use importlib.import_module for a module that really does install with a hyphen in its name. I will use neat-python as an example even though I have been informed that it actually installs as neat:
--myscript.py--
import importlib
neat = importlib.import_module("neat-python")
# to then call "mymodule" in neat
neat.mymodule(someobject)

How to know which .whl module is suitable for my system with so many?

We have so many versions of the wheel.
How can we know which version should be installed on my system?
I remember there is a certain command that can check my system environment.
Or are there any other ways?
---------------------Example Below this line -----------
scikit_learn-0.17.1-cp27-cp27m-win32.whl
scikit_learn-0.17.1-cp27-cp27m-win_amd64.whl
scikit_learn-0.17.1-cp34-cp34m-win32.whl
scikit_learn-0.17.1-cp34-cp34m-win_amd64.whl
scikit_learn-0.17.1-cp35-cp35m-win32.whl
scikit_learn-0.17.1-cp35-cp35m-win_amd64.whl
scikit_learn-0.18rc2-cp27-cp27m-win32.whl
scikit_learn-0.18rc2-cp27-cp27m-win_amd64.whl
scikit_learn-0.18rc2-cp34-cp34m-win32.whl
scikit_learn-0.18rc2-cp34-cp34m-win_amd64.whl
scikit_learn-0.18rc2-cp35-cp35m-win32.whl
scikit_learn-0.18rc2-cp35-cp35m-win_amd64.whl
In case this is still an issue, the following should tell you the information you need to know about your architecture to choose a wheel:
import platform
print(platform.architecture())
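In the filenames listed in the question, cp35 stands for CPython 3.5 and win32 / win_amd64 for 32-bit / 64-bit Windows. The sketch below gathers the matching facts about the running interpreter. The function name interpreter_summary is invented here, and the "cp" prefix assumes you are running CPython.

```python
import platform
import struct
import sys


def interpreter_summary():
    """Collect the interpreter facts that matter when picking a wheel.

    Illustrative helper, not an official API. The 'cpXY' tag is only
    meaningful for CPython; other implementations use other prefixes.
    """
    return {
        # e.g. 'cp35' for CPython 3.5
        "python_tag": "cp{}{}".format(*sys.version_info[:2]),
        # Pointer size of the interpreter itself: 32 or 64
        "bits": struct.calcsize("P") * 8,
        "architecture": platform.architecture()[0],
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    print(interpreter_summary())
```

Note that a 32-bit Python on 64-bit Windows still needs the win32 wheel, which is why the interpreter's own pointer size is reported rather than the operating system's.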
You don't have to know. Use pip - it will select the most specific wheel available.
As a warning, pip._internal isn't a stable API, so you wouldn't want to rely on it. But in case it's helpful (as it was to me) - this answer gives a way of solving the problem:
You can get it in python from pip following this solution:
Since pip version 19.3, TargetPython.get_tags() returns the supported PEP 425 tags to check wheel candidates against (source). The tags are returned in order of preference (most preferred first).
from pip._internal.models.target_python import TargetPython
target_python = TargetPython()
pep425tags = target_python.get_tags()
The class TargetPython encapsulates the properties of a Python interpreter one is targeting for a package install, download, etc.
To avoid using pip._internal, you can use, in the shell (see here):
$ path/to/pythonX.Y -m pip debug --verbose

eclipse pydev - how to install python modules

Just working my way through a (very good) book called Test Driven Development using Python.
This makes use of Python3.4, by the way, and I am running on Windows 7.
I've got all the stuff working using a simple text editor and running from the command line... in the course of which in particular I used "pip install" to install Django and Selenium, as per book's instructions.
This created folders "selenium" and "django" under ...\Python34\Lib\site-packages\ ... so I added these to the PythonPath for my Eclipse/PyDev project.
With the correct interpreter selected I then tried to run a file which runs fine on the command line: "> python3 functional_tests.py"... but I get
File "D:\apps\Python34\lib\site-packages\django\http\__init__.py", line 1, in <module>
from django.http.cookie import SimpleCookie, parse_cookie
File "D:\apps\Python34\lib\site-packages\django\http\cookie.py", line 5, in <module>
from django.utils.six.moves import http_cookies
ImportError: cannot import name 'http_cookies'
... to me this looks like a dependency thing... as though "pip install" handles dependency matters in a way just including a single folder doesn't.
Question boils down to this: what's the "proper" way to install a python module using PyDev?
several days later
wow... nothing? Nothing! I suppose this must mean that you either have to add dependencies manually or use something like Ant, Maven or Gradle within Eclipse itself. These latter are not my strong areas, even outside an IDE. Would still be nice to have an answer from a PyDev expert!
Well, pip install should work for PyDev (it should automatically recognize the dependency)...
I.e.: in your use case, the only folder that should be in the PYTHONPATH is D:\apps\Python34\lib\site-packages (and pip should install packages to that folder -- make sure you don't add extra folders for "D:\apps\Python34\lib\site-packages\django" nor anything else inside the site-packages to the PYTHONPATH).
If it's still not working, please check whether the module django.utils.six.moves.http_cookies is indeed where you expect it to be. Also, you can print the PYTHONPATH being used at runtime, to check that it's really what you expect, with:
import sys
print('\n'.join(sorted(sys.path)))
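To spot the specific mistake described above (folders inside site-packages added to the path), a sketch like the following flags any entry that continues past a site-packages component. The function name is hypothetical, and the check is a heuristic: some legitimate entries, such as .egg files placed inside site-packages, would also be flagged.

```python
import sys
from pathlib import PurePath


def entries_inside_site_packages(paths):
    """Return the path entries that point *below* a site-packages folder.

    Hypothetical helper. An entry ending in site-packages is fine;
    one like .../site-packages/django is the suspicious kind.
    """
    flagged = []
    for entry in paths:
        parts = PurePath(entry).parts
        # site-packages appearing before the last component means the
        # entry is nested inside it.
        if "site-packages" in parts[:-1]:
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    for suspicious in entries_inside_site_packages(sys.path):
        print("Nested entry:", suspicious)
```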

What is the cleanest way to add a directory of third-party packages to the beginning of the Python path?

My context is appengine_config.py, but this is really a general Python question.
Given that we've cloned a repo of an app that has an empty directory lib in it, and that we populate lib with packages by using the command pip install -r requirements.txt --target lib, then:
import os
dirname = 'lib'
dirpath = os.path.join(os.path.dirname(__file__), dirname)
For importing purposes, we can add such a filesystem path to the beginning of the Python path in the following way (we use index 1 because the first position should remain '.', the current directory):
sys.path.insert(1, dirpath)
However, that won't work if any of the packages in that directory are namespace packages.
To support namespace packages we can instead use:
site.addsitedir(dirpath)
But that appends the new directory to the end of the path, which we don't want in case we need to override a platform-supplied package (such as WebOb) with a newer version.
The solution I have so far is this bit of code which I'd really like to simplify:
sys.path, remainder = sys.path[:1], sys.path[1:]
site.addsitedir(dirpath)
sys.path.extend(remainder)
Is there a cleaner or more Pythonic way of accomplishing this?
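For reference, the three-line workaround in the question can at least be wrapped in a small helper so the path juggling lives in one place. This is only a repackaging of the asker's approach, not a different technique, and add_site_dir_first is a made-up name.

```python
import site
import sys


def add_site_dir_first(dirpath):
    """Register `dirpath` as a site dir directly after sys.path[0].

    Hypothetical helper. site.addsitedir() appends to sys.path, so the
    path is trimmed to its first entry for the duration of the call and
    the remaining entries are put back afterwards.
    """
    remainder = sys.path[1:]
    # Mutate in place so other references to sys.path stay valid.
    sys.path[:] = sys.path[:1]
    try:
        site.addsitedir(dirpath)
    finally:
        # Restore the original entries behind whatever addsitedir added.
        sys.path.extend(remainder)
```

Because addsitedir also processes .pth files found in the directory, any paths those files add end up ahead of the restored entries as well.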
For this answer I assume you know how to use setuptools and setup.py.
Assuming you would like to use the standard setuptools workflow for development, I recommend using this code snippet in your appengine_config.py:
import os
import sys
if os.environ.get('CURRENT_VERSION_ID') == 'testbed-version':
    # If we are unittesting, fake the non-existence of appengine_config.
    # The error message of the import error is handled by gae and must
    # exactly match the proper string.
    raise ImportError('No module named appengine_config')
# Imports are done relative because Google app engine prohibits
# absolute imports.
lib_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'libs')
# Add every library to sys.path.
if os.path.isdir(lib_dir):
    for lib in os.listdir(lib_dir):
        if lib.endswith('.egg'):
            lib = os.path.join(lib_dir, lib)
            # Insert to override default libraries such as webob 1.1.1.
            sys.path.insert(0, lib)
And this piece of code in setup.cfg:
[develop]
install-dir = libs
always-copy = true
If you type python setup.py develop, the libraries are downloaded as eggs into the libs directory, and appengine_config inserts them into your path.
We use this at work to include webob==1.3.1 and internal packages which are all namespaced using our company namespace.
You may want to have a look at the answers in the Stack Overflow thread, "How do I manage third-party Python libraries with Google App Engine? (virtualenv? pip?)," but for your particular predicament with namespace packages, you're running up against a long-standing issue I filed against site.addsitedir's behavior of appending to sys.path instead of inserting after the first element. Please feel free to add to that discussion with a link to this use case.
I do want to address something else that you said that I think is misleading:
My context is appengine_config.py, but this is really a general Python question.
The question actually arises from the limitations of Google App Engine and its inability to install third-party packages, which forces you to seek a workaround by manually adjusting sys.path and using site.addsitedir. In general Python development, if your code uses these, you're Doing It Wrong.
The Python Packaging Authority (PyPA) describes the best practices to put third party libraries on your path, which I outline below:
Create a virtualenv
Mark out your dependencies in your setup.py and/or requirements files (see PyPA's "Concepts and Analyses")
Install your dependencies into the virtualenv with pip
Install your project, itself, into the virtualenv with pip and the -e/--editable flag.
Unfortunately, Google App Engine is incompatible with virtualenv and with pip. GAE chose to block this toolset in an attempt to sandbox the environment. Hence, one must use hacks to work around the limitations of GAE to use additional or newer third party libraries.
If you dislike this limitation and want to use standard Python tooling for managing third-party package dependencies, other Platform as a Service providers out there eagerly await your business.

Including a Python Library (suds) in a portable way

I'm using suds (brilliant library, btw), and I'd like to make it portable (so that everyone who uses the code that relies on it can just check out the files and run it).
I have tracked down 'suds-0.4-py2.6.egg' (in python/lib/site-packages), and put it in with my files, and I've tried:
import path.to.egg.file.suds
from path.to.egg.file.suds import *
import path.to.egg.file.suds-0.4-py2.6
The first two complain that suds doesn't exist, and the last one has invalid syntax.
In the __init__.py file, I have:
__all__ = [ "FileOne" ,
"FileTwo",
"suds-0.4-py2.6"]
and have previously tried
__all__ = [ "FileOne" ,
"FileTwo",
"suds"]
but neither works.
Is this the right way of going about it? If so, how can I get my imports to work? If not, how else can I achieve the same result?
Thanks
You must add your egg file to sys.path, like this:
import sys
# insert at 0 instead of appending to end to take precedence
# over system-installed suds (if there is one).
sys.path.insert(0, "suds-0.4-py2.6.egg")
import suds
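Putting the archive itself on sys.path works because Python can import modules straight out of a zip archive found there. The self-contained sketch below demonstrates the mechanism with a throwaway archive; the file name demo_bundle.egg, the module name demo_bundled_mod and the helper import_from_archive are all invented for the example and have nothing to do with suds.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_archive(archive_path, module_name):
    """Put a zip/egg archive first on sys.path and import a module from it.

    Hypothetical helper for illustration only.
    """
    sys.path.insert(0, archive_path)
    # Make sure the import system notices the newly added path entry.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Build a tiny archive containing one module, standing in for a real egg.
tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "demo_bundle.egg")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_bundled_mod.py", "GREETING = 'hello from the archive'\n")

mod = import_from_archive(archive, "demo_bundled_mod")
print(mod.GREETING)
```

This only covers pure-Python code; compiled extension modules cannot be imported directly from inside an archive.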
.egg files are zipped archives; hence you cannot directly import them as you have discovered.
The easy way is to simply unzip the archive and then copy the suds directory into your application's source code directory. Since Python stops at the first matching module it finds on the path, your local copy of suds will be used even if suds is not installed globally.
One step up from that is to add the egg to your path by appending it to sys.path.
However, the proper way would be to package your application for distribution, or to provide a requirements file that lets other people know what external packages your program depends on.
Usually I distribute my program with a requirements.txt file that contains all dependencies and their versions.
The users can then install these libraries with:
pip install -r requirements.txt
I don't think including eggs with your code is a good idea: what if the user runs Python 2.7 instead of Python 2.6?
More info about requirement file: http://www.pip-installer.org/en/latest/requirements.html
