I was looking for something similar to Perl's Dumper functionality in Python. After some googling I found one that serves me well: https://gist.github.com/1071857#file_dumper.pyamazon
So I downloaded and installed it, and it works fine.
But then I came across PyPI: http://pypi.python.org/pypi which looks like the CPAN equivalent for Python.
So I searched for the Dumper module there and could not find it. This seems like a very basic module, so I was expecting it to be listed on PyPI.
So my question is: if I have to install a Python module, should I search PyPI first, and only look elsewhere on Google if I don't find it there?
Or is there any other Python module repository apart from PyPI?
I am learning Python, hence this question.
Thanks.
If you are using pip, pip search package_name does the same as searching on the web interface provided by PyPI.
Once located, installing a Python package is of course as easy as
pip install package_name
Some Python libraries may be in a development stage and may not be directly available on PyPI, or you may want a specific commit hash (git) of a library. If you can find that library's source on github.com or bitbucket.com, for example, you can do
pip install -e git+git://github.com/the/repo/url.git#egg=package_name
And regarding your question about Perl's Dumper: it has two main uses, IIRC:
data persistence
debugging and inspecting objects.
As far as I know, there's no exact equivalent of perl's Dumper in python.
However, I use pickle for data persistence.
And pprint is useful for visually inspecting objects/debug.
Both are standard, built-in Python modules; no third-party libraries are needed for this functionality.
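For example (a minimal sketch of my own, with made-up data; not taken from either module's documentation):

import pickle
import pprint

data = {"name": "example", "values": [1, 2, 3]}

# Persistence: write the object to disk and read it back.
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
with open("data.pkl", "rb") as f:
    restored = pickle.load(f)

# Inspection: pretty-print the structure for debugging.
pprint.pprint(restored)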
If you want to use the code from the gist (https://gist.github.com/1071857#file_dumper.pyamazon):
Copy the code and place it in a local file in your project directory. You can name the file something like pydumper.py, or any name you prefer, as long as it ends with the .py suffix.
In your project, you can import the functions and classes defined in pydumper.py by doing
from pydumper import *
or, if you want to be specific (which is preferred; it's better to be explicit about what you are importing):
from pydumper import Dumper
and you can start using the Dumper class in your own code.
Are you looking for something like easy_install from setuptools? I might have misunderstood your question, as I don't use Perl.
From the Scripts directory in the python installation directory ("c:/python27/Scripts" on my machine), you can install modules from the command line like so:
easy_install modulename
It makes life a lot easier if you add the Scripts directory to your PATH variable.
Ok, so you clone a repo, there's an import
import yaml
OK, so you do pip install yaml and you get:
ERROR: No matching distribution found for yaml
OK, so you look for a package with yaml in it, and there are like a gazillion of them... Usually adding py in front does the job, but...
How on earth should I know which one was used?!
And it's not just yaml, oh no... there's:
import cv2 # python-opencv
import PIL # Pillow
and the list goes on and on...
How can I know which import uses which package? Shouldn't there be a PEP for this? Or a naming convention, e.g. that the import name is always the same as the package name?
There's a similar topic here, if you're not frustrated enough :)
[When I clone a repo,] How can I know which import uses which package?
In short: it is the cloned code's responsibility to explain this, and it is an expected courtesy that the cloned code includes an installer that will take care of it.
If this is just some random person's bundle of .py files on GitHub with no installation instructions, look for notes in the associated documentation; failing that, make an issue on the tracker. (Or just give up. Maybe look for a better-engineered project that does the same thing.)
However, most "serious", contemporary Python projects are meant to be installed by using some form of packaging system. These have evolved over the years, and best practices have changed many times; but generally speaking, a properly "packaged" and "distributed" project will have either a setup.py or (newer; better in many ways, but not universally adopted yet) pyproject.toml file at the top level.
A pyproject.toml file is a config file in TOML format that simply describes a bunch of project metadata. This requires a build backend conforming to PEP 517. For a while, this required third-party tools, such as Poetry; but the standard setuptools can handle this since version 40.8.0. (As of this writing, the current release is 65.7.0.)
A setup.py script is executable code that pip will invoke after downloading a package from PyPI (or another package index). Generally, this script will use either setuptools or distutils (the predecessor to setuptools; it has finally been officially deprecated in 3.10, and will be removed in 3.12) to install the project, by calling a function named setup and passing it a big dict with some project metadata.
Security warning: this file is still executable code. It is arbitrary code, and it doesn't have to be following the standard conventions. Also, the package that is actually downloaded from PyPI doesn't necessarily match the project's source shown on GitHub (or another Git provisioning website), if such is even available. (This problem also affects package managers in other languages and ecosystems, notably npm for Javascript.)
With the setup.py based approach, package dependencies are specified using a keyword argument to the setup function. The specification has changed many times; currently, projects still using a setup.py should use the install_requires keyword argument.
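For illustration, a minimal setup.py might look something like this (the project name and dependency list are made up; this is a sketch, not a recipe from any particular project):

from setuptools import setup, find_packages

setup(
    name="example-project",  # hypothetical project name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "PyYAML >= 5.0",   # provides `import yaml`
        "opencv-python",   # provides `import cv2`
        "Pillow",          # provides `import PIL`
    ],
)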
With the pyproject.toml based approach, using setuptools' backend, dependencies will be an array (using JSON terminology, as TOML is a superset) stored under project.dependencies. This will vary for other backends; for example, Poetry expects this information under tool.poetry.dependencies.
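For comparison, a minimal pyproject.toml fragment using the standard [project] table might look like this (again, the names are illustrative):

[project]
name = "example-project"
version = "0.1.0"
dependencies = [
    "PyYAML >= 5.0",
    "opencv-python",
]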
In any event, pip freeze will output a list of what's installed in the current environment. It's a somewhat common practice for developers to test the code in a virtual environment where the dependencies are installed, dump this output to a requirements.txt file, and include that as documentation.
[When I want to use a third-party library in my own code,] How can I know which import uses which package?
It's worth considering the question the other way around, too: given that we have installed OpenCV for Python using pip install opencv-python, and want to use it in our own code, how do we know to import cv2 specifically?
The answer: there is no convention, and certainly no requirement for the installed package name to match the PyPI name, nor the GitHub etc. repository name. Read the documentation. Everyone who intends for their code to be used as a library, will be more than willing to show how, on at least a basic level.
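To make the name mismatch concrete, a trivial sketch:

# Installed with: pip install opencv-python
# ...but imported under an unrelated name:
import cv2
print(cv2.__version__)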
Watch for requirements.txt. Big projects usually have one; you can install the packages it lists. Otherwise, just google.
Keep in mind that it might not be a pip package.
Probably what is happening is that the main script is trying to import a secondary script (yaml.py, in this case) with functions or utils for the main script to use.
Check whether the repo contains a file named yaml.py. If so, make sure to run the main script while yaml.py is in the same directory.
Also, check for a requirements.txt file.
You can install all the requirements listed in that file by running this line in a shell:
pip install -r <path to your requirements.txt>
Hope this helps.
Any package on PyPI or cloned from some online repository is free to set itself up with whatever base directory name it chooses. That base directory xyz determines the import xyz line. Additionally, a package name on PyPI doesn't have to match the name of the repository where its source code revisions are kept (assuming there is one).
This has the disadvantage that there is no one-to-one relation between package name, repo and/or import line. But the advantage is that you can e.g. install Pillow, which is backwards compatible with PIL, and still use import PIL instead of having to change all your sources.
If the repo you clone has a requirements.txt, look there; you can also look in the setup.py for install_requires / extras_require. But there is no guarantee that these are available, or that they contain the names of the packages to install (e.g. I use a generic setup.py that reads its info from a data structure in the __init__.py file when creating/installing a package).
yaml seems to be a reserved name on PyPI (at least it was when I tried to upload a package with that name a few years ago). So that might be the reason the package is named PyYAML, although the Py prefix is not very informative, as the Python code will not function in another programming language anyway. PyPI's search is not very helpful either, as its relevance ordering is not actually relevant (at least not for yaml).
PyPI has no entry in the metadata for the import line, but you could extract that from a .whl package file, as the import name is the top-level directory that doesn't match *.dist-info. This is normally not possible with a .tar.gz package file. I don't know of any site that does this kind of automatic scraping.
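As a rough sketch of that idea (my own code; the wheel filename is made up): wheels are zip archives, so the top-level import names can be read straight from the archive listing.

import zipfile

def top_level_names(wheel_path):
    # Importable names are the top-level entries that are not the
    # *.dist-info / *.data metadata directories.
    with zipfile.ZipFile(wheel_path) as wf:
        tops = {name.split("/")[0] for name in wf.namelist()}
    return {t for t in tops if not t.endswith((".dist-info", ".data"))}

# Hypothetical local file; PyYAML's wheels ship a top-level yaml package.
print(top_level_names("PyYAML-6.0-cp310-cp310-manylinux1_x86_64.whl"))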
You can click through the packages on PyPI after searching for the import term, and hope you find something that matches the import in the documentation, but that is no guarantee you get the right one.
You might be best off searching for import yaml here on Stack Overflow, and hoping that the question or the answer mentions the package name.
Thank you very much for your help and ideas. Big thanks to Karl Knechtel for his exhaustive answer.
tl;dr: I think using some sort of "package" / "distribution" as a standard, would make everyone's lives easier.
However, my question was half-theoretical, to point out something I'd call an incoherence in Python. You are of course right: there should be setuptools metadata or a requirements.txt, or at least some documentation. But if there isn't any, we're prone to errors or additional browsing.
GospelBG pointed out something important. There could be a script yaml.py in the main folder and we need to check and/or guess.
Most importantly, naming imports differently than packages is just plain misleading. There should be a naming convention or a PEP for this. Again, you can of course eventually track down the proper package, but it's not explicit and obvious, and it should be! Because in programming, we like it that way, don't we?
I'm no seasoned Python dev, and I'm learning C++, but in C++, for example, you include a header file with a particular name and link static or dynamic libraries by their filenames. Now I know this is a very manual, step-by-step method, but at least you use the exact filenames.
At a higher level you have CMake, which would be the equivalent of setuptools, where you can pull in a package or library using find_package or find_library. To be honest, I'm not sure whether all packages have exactly matching names, but at least the ones I used did.
Thanks again for your help and answers! I'm open for discussion and comments :)
I am interested in a reliable and robust procedure for collecting the Python packages and dependencies for the PsychoPy library into a single collection or environment, to make a self-contained and maintainable installation. It would also be good to have some general recommendations on the recommended way to do this, since googling NixOS and Python yields a number of approaches, some of which use poorly documented functions, e.g. myEnvFun.
PsychoPy is a Python package used for psychology experiments. It has several dependencies, most of which are Python packages, but not all (e.g. AVbin); and most of which are in the NixOS package collection, but not all (e.g. pyo and py-parallel).
My goal is to be able to get all the needed pieces together and have a functioning PsychoPy environment with a single installation request. I have figured out how to get PsychoPy installed, but the paths don't play nicely.
For example if the following is saved as ~/pkg/psychopy/default.nix
let
  pkgs = import <nixpkgs> {};
in
{ stdenv ? pkgs.stdenv, python ? pkgs.python, fetchurl ? pkgs.fetchurl }:

with pkgs;

buildPythonPackage {
  name = "psychopy";
  src = fetchurl {
    url = http://sourceforge.net/projects/psychpy/files/PsychoPy/PsychoPy-1.82.02.zip;
    md5 = "52309280bdca4408970aab0952c674e4";
  };
  buildInputs = [
    python27
  ];
}
One can run nix-env -f ~/pkg/ -iA psychopy and PsychoPy will be installed, but it will not be easily usable, because the path to the psychopy library is not seen by any system-wide python2 installation, or even by the Python version that is installed as part of the build inputs.
This leads to the following questions, which, though specifically about PsychoPy, apply more generally to Python and NixOS.
Is the recommended practice to install Python packages that exist in the NixOS expression collections (e.g. numpy and scipy) once, as system- or user-wide packages, or with each particular experimental library?
If one wishes to bundle together a Python collection with more than one library outside the NixOS expression channel (e.g. psychopy, pyo, and pyparallel), what is the recommended procedure? And how does this change if some non-Python software is required, e.g. in this case AVbin (whose installation instructions refer to paths that, to my understanding, are not standard on NixOS, i.e. /usr/lib)?
Can some discussion of handling paths in Python in the context of NixOS be shared?
myEnvFun is deprecated, where did you read about it?
Answers:
It's recommended to create environments with nix-shell; you should be able to run it inside ~/pkg/psychopy/ and get $PYTHONPATH populated.
The idea behind Nix is not to have any global sets of packages, but rather environments for each need.
Just declaring AVbin as a build dependency should be enough. Note that your users will need to install Nix. If you want to avoid that, you'll need to write a wrapper that does something similar to nix-shell.
There isn't much going on here. All Nix packages are built in isolated chroots. Nix has a concept called setup hooks, which are executed for each package in the dependency tree. So for Python packages, https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/interpreters/python/2.7/setup-hook.sh#L15 is called in order to populate $PYTHONPATH. For command-line programs we then wrap the resulting script with $PYTHONPATH hardcoded.
For discussion it's best to join Freenode IRC channel #nixos
I have just started learning Python. I want to understand how some functions work and how modules are organized. How can I read the implementations of built-in modules?
Where the source code of the Python standard library is located will depend on your operating system and on how you installed Python. However, the following locations are common:
Windows - C:\Python27\Lib
Linux - /usr/lib/python2.7/
Note that some builtins such as the math module are missing -- that's because those builtins are written in C and are baked directly into the interpreter for purposes of speed.
You should also consider taking a look at the source code for some popular 3rd party libraries. They'll vary in quality, but might be worth examining. Here's a list to help you get started.
There are many implementations of Python, such as CPython, IronPython, PyPy, and Jython. The most commonly used is CPython. Its source code can be found at hg.python.org.
Your installation also contains source code. For example, to find the source code associated with the collections module, type the following in an interactive session:
>>> import collections
>>> collections
<module 'collections' from '/usr/lib/python2.7/collections.pyc'>
Thus you would look in '/usr/lib/python2.7/collections.py' for the source code associated with the collections module. (Note that you should remove the c in pyc from the path. The .py file is Python source code, the .pyc is byte code.)
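You can also ask Python itself, e.g. with the inspect module (a small sketch; this works for modules implemented in pure Python):

import inspect
import collections

print(inspect.getsourcefile(collections))    # path to collections.py
print(inspect.getsource(collections)[:200])  # first 200 characters of the source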
A clean way to read this code is in the Python Mercurial repository, or in the Git mirror. (I personally find the Git mirror easier to use, but both are equally good for learning the code.)
The Git repository is at https://github.com/python/cpython/tree/2.7
The Mercurial repository is at http://hg.python.org/cpython (click "branches", then click "2.7", then click "browse")
In both of these repositories, the Lib folder is the Python standard library.
After installing the BitTorrent-bencode package, either via easy_install BitTorrent-bencode or pip install BitTorrent-bencode, or by downloading the tarball and installing that via easy_install $tarball, I discover that /usr/local/lib/python2.6/dist-packages/BitTorrent_bencode-5.0.8-py2.6.egg/ contains EGG-INFO/ and test/ directories. Although both of these subdirectories contain files, there are no files in the BitTorr* directory itself. The tarball does contain bencode.py, which is meant to be the actual source for this package, but it's not installed by either of those utils.
I'm pretty new to all of this so I'm not sure if this is a problem with the package or with what I'm doing. The package was packaged a while ago (2007), so perhaps it's using some deprecated configuration aspect that I need to supply a command-line flag for.
I'm more interested in learning what's wrong with either the package or my procedures than in getting this particular package installed; there is another package called hunnyb that seems to do a decent enough job of decoding bencoded data. Mostly I'd like to know how to deal with such problems in other packages. I'd also like to let the package maintainer know if the package needs updating.
edit
@Andrey Popp explains that the problem is likely with the setup.py file. I guess the only way I can really get an answer to my question is by actually R-ing TFM. However, since I likely won't have time to do that thoroughly for a while yet, I've posted the setup.py file here.
A quick browse through the setuptools manual reveals that the function find_packages(), which this module's setup.py makes use of, searches for files named __init__.py within the package. The source code file in question is named bencode.py, so perhaps this is the problem: it should be named __init__.py?
edit 2
Having now learned Python packaging, I gather that the problem is that this module is using setuptools.find_packages, and has its source at the root of its directory structure, but hasn't passed anything in package_dir. It would seem to be fairly trivial to fix. However, the author is not reachable by his PyPI contact info. The module's PyPI page lists a "Package Index Owner" as well. I'm not sure what that's supposed to mean, but I did manage to get in touch with that person, who I think is maybe not in a position to maintain the module. In any case, it's still in the same state as when I posted this question back in June.
Given that the module seems to be more or less abandoned, and that there's a suitable replacement for it in hunnyb, I've accepted that @andreypopp's answer is about as good a one as I'm going to get.
It seems this package's setup.py is broken: it does not define the right package for distribution. I think you need to check the setup.py in the source release, and if that is the case, report a bug to the author of this package.
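For what it's worth, here is a sketch of how such a setup.py could be fixed (the version number is taken from the egg directory above; the rest is illustrative): a single top-level module like bencode.py should be declared with py_modules, since find_packages() only finds directories containing __init__.py.

from setuptools import setup

setup(
    name="BitTorrent-bencode",
    version="5.0.8",
    py_modules=["bencode"],  # install bencode.py as a top-level module
)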
For my project I will be using the argparse library. My question is: how do I distribute it with my project? I am asking because of the technicalities and legalities involved.
Do I just:
1. Put the argparse.py file along with my project, that is, in the tar file for my project?
2. Create a package for it for my distro?
3. Tell the user to install it himself?
What's your target Python version? argparse is included in the standard library from version 2.7 onwards.
If you're building a small library with minimal dependencies, I would consider removing the dependency on an external module and only using facilities offered by the standard Python library. You can access command-line parameters with sys.argv and parse them yourself; it's usually not that hard to do. Your users will definitely appreciate not having to install yet another third-party module just to use your code.
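A minimal sketch of what that can look like (the option names are hypothetical):

import sys

def main(argv):
    if len(argv) < 2:
        print("usage: %s <input-file> [--verbose]" % argv[0])
        return 1
    input_file = argv[1]
    verbose = "--verbose" in argv[2:]
    if verbose:
        print("reading %s" % input_file)
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv))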
It would be best for the user to install it so that only one copy is present on the system and so that it can be updated if there are any issues, but including it with your project is a viable option if you abide by all requirements specified in the license.
Try to import it from the public location, and if that fails then resort to using the included module.
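In code, that pattern looks roughly like this (the location of the bundled copy is hypothetical):

try:
    import argparse  # prefer the system-wide copy if available
except ImportError:
    from myproject.vendored import argparse  # fall back to the bundled copy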
You could go with Ignacio's suggestion.
But... For what it is worth, there's another library for argument parsing built into Python, which is quite powerful. Have you tried optparse? It belongs to the base Python distribution and has been there for a while...
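A minimal optparse example (it ships with the standard library, though note that in 2.7 it was deprecated in favor of argparse):

from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options] filename")
parser.add_option("-v", "--verbose", action="store_true",
                  dest="verbose", default=False,
                  help="print status messages")
(options, args) = parser.parse_args()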
Good luck!