What's the magic "python setup.py some_incantation_here" command to upload a package to PyPI, in a form that can be downloaded to get the original package in its original form?
I have a package with some source and a few image files (as package_data). If I do "setup.py sdist register upload", the .tar.gz has the image files excluded. If I do "setup.py bdist_egg register upload", the egg contains the images but excludes the setup.py file. I want to be able to get a file uploaded that is just the entirety of my project -- aka "setup.py the_whole_freaking_thing register upload".
Perhaps the best way to do this is to manually tar.gz my project directory and upload it using the PyPI web interface?
Caveat: I'm trying to avoid having to store a simple project I just created in my SVN repo as well as on PyPI -- it seems like a waste of work to keep track of its history and files in two places.
When you run the "sdist" command, the list of included files is controlled by the "MANIFEST.in" file sitting next to "setup.py", not by whatever you have listed in "package_data". This has to do with the fragmented nature of today's Python packaging solutions: "sdist" is powered by the distutils in the standard library, while "bdist_egg" is controlled by the setuptools module.
To solve the problem, try creating a MANIFEST.in next to your setup.py file, and give it contents like this:
include *.jpg
Of course, I'm imagining that your "image files" are actual pictures rather than disk images or ISO images or something; you might have to adjust the above line if I've guessed wrong! But check out the "Specifying which files to distribute" section of the distutils docs, and see whether you can't get those files appearing in your .tar.gz source distribution! Good luck.
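For example, a slightly fuller MANIFEST.in might look like this (the package and directory names here are assumptions; adjust them to your own layout):

```
include *.jpg
recursive-include mypackage/images *.jpg *.png
```

After adding it, rerun "python setup.py sdist" and check the generated .tar.gz to confirm the images are now included.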
I have a Python package built from source code in the /Document/pythonpackage directory:
/Document/pythonpackage/> python setup.py install
This creates a folder in the site-packages directory of Python:
import pythonpackage
print(pythonpackage.__file__)
>/anaconda3/lib/python3.7/site-packages/pythonpackage-x86_64.egg/pythonpackage/__init__.py
I am running a script in multiple environments, so the only path I know I will have is pythonpackage.__file__. However, /Document/pythonpackage has some data that is not in site-packages. Is there a way to automatically find the path to /Document/pythonpackage given that you only have access to the module in Python?
Working like that is discouraged. It's generally assumed that after installing a package, the user can remove the installation directory (as most automated package managers would do). Instead, you'd make sure your setup.py copied any data files over into the relevant places, and then your code would pick them up from there.
Assuming you're using the standard setuptools, see the docs on Including Data Files, which say at the bottom:
In summary, the three options allow you to:
include_package_data
Accept all data files and directories matched by MANIFEST.in.
package_data
Specify additional patterns to match files that may or may not be matched by MANIFEST.in or found in source control.
exclude_package_data
Specify patterns for data files and directories that should not be included when a package is installed, even if they would otherwise have been included due to the use of the preceding options.
and then says:
Typically, existing programs manipulate a package’s __file__ attribute in order to find the location of data files. However, this manipulation isn’t compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs. It is strongly recommended that, if you are using data files, you should use the ResourceManager API of pkg_resources to access them
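For instance, loading a data file through pkg_resources might look like the sketch below. The package and file names in the comment are hypothetical; to keep the snippet runnable anywhere setuptools is installed, it reads a resource out of pkg_resources itself:

```python
import pkg_resources  # ships with setuptools

# For your own package you would write something like (names are hypothetical):
#   data = pkg_resources.resource_string("pythonpackage", "data/config.txt")
# resource_string returns the file contents as bytes, even if the package is
# installed as a zipped egg. Demonstrated here against pkg_resources itself:
data = pkg_resources.resource_string("pkg_resources", "__init__.py")
print(len(data) > 0)
```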
Not sure, but you could create a repository for your module and use pip to install it. The egg folder would then have a file called PKG-INFO, which would contain the URL of the repository you installed your module from.
My goal is to make a program I've written easily accessible to potential employers/etc. in order to... showcase my skills.. or whatever. I am not a computer scientist, and I've never written a python module meant for installation before, so I'm new to this aspect.
I've written a machine learning algorithm, and fit parameters to data that I have locally. I would like to distribute the algorithm with "default" parameters, so that the downloader can use it "out of the box" for classification without having a training set. I've written methods which save the parameters to/load the parameters from text files, which I've confirmed work on my platform. I could simply ask users to download the files I've mentioned separately and use the loadParameters method I've created to manually load the parameters, but I would like to make the installation process as easy as possible for people who may be evaluating me.
What I'm not sure is how to package the text files in such a way that they can automatically be loaded in the __init__ method of the object I have.
I have put the algorithm and files on github here, and written a setup.py script so that it can be downloaded from github using pip like this:
pip install --upgrade https://github.com/NathanWycoff/SySE/tarball/master
However, this doesn't seem to install the text files containing the data I need, only the __init__.py python file containing my code.
So I guess the question boils down to: How do I force pip to download additional files aside from just the module in __init__.py? Or, is there a better way to load default parameters?
Yes, there is a better way to distribute data files with a Python package.
First of all, read up on proper Python package structure. For instance, it's not recommended to put code into __init__.py files; they just mark that a directory is a Python package, plus you can do some import statements there. So it's better to put your SySE class into (for instance) a file syse.py in that directory; then in __init__.py you can do from .syse import SySE.
Now to the data files. By default, setuptools will distribute only *.py and several other special files (README, LICENCE and so on). However, you can tell setuptools that you want to distribute other files with the package: use setup's kwarg package_data, more about that here. Also don't forget to include all your data files in MANIFEST.in, more on that here.
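Putting those pieces together, a minimal sketch might look like this (the project layout and file names are assumptions based on the question):

```python
# Assumed layout:
#   setup.py
#   MANIFEST.in        # contains e.g.: recursive-include syse/data *.txt
#   syse/
#       __init__.py    # from .syse import SySE
#       syse.py        # class SySE: ...
#       data/
#           params.txt # the default parameters
from setuptools import setup

setup(
    name="syse",
    version="0.1",
    packages=["syse"],
    # Ship the parameter files that live inside the syse/ package directory.
    package_data={"syse": ["data/*.txt"]},
    include_package_data=True,
)
```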
If you do all of the above correctly, then you can use the pkg_resources package to discover your data files at runtime. pkg_resources handles all the possible situations: your package can be distributed in several ways; it can be installed from a pip server, from a wheel, as an egg, and so on. More on that here.
Lastly, if your package is public, I can only recommend uploading it to PyPI (if it is not public, you can run your own pip server). Register there and upload your package; you could then do just pip install syse to install it from anywhere. It's quite likely the best way to distribute your package.
It's quite a lot of work and reading, but I'm pretty sure you will benefit from it.
Hope this helps.
Currently I'm using autotools to build/install and package a project of mine, but I would really like to move to something that feels more "pythonic".
My project consists of two scripts, one module, two glade GUI descriptions, and two .desktop files. It's currently a pure python project, though that's likely to change soon-ish.
Looking at setuptools I can easily see how to deal with everything except the .desktop files; they have to end up in a specific directory so that Gnome can find them.
Is using distutils/setuptools a good idea to begin with?
I managed to get this to work, but it kinda feels to me more like a workaround.
Don't know what's the preferred way to handle this...
I used the following setup.py file (full version is here):
from setuptools import setup

setup(
    # ...
    data_files=[
        ('share/icons/hicolor/scalable/apps', ['data/mypackage.svg']),
        ('share/applications', ['data/mypackage.desktop'])
    ],
    entry_points={
        'console_scripts': ['startit=mypackage.cli:run']
    }
)
The starter script through entry_points works. But the data_files were put in an egg file and not in the folders specified, so they can't be accessed by the desktop shell.
To work around this, I used the following setup.cfg file:
[install]
single-version-externally-managed=1
record=install.txt
This works. Both data files are created in the right place and the .desktop file is recognized by Gnome.
In general, yes - everything is better than autotools when building Python projects.
I have had good experiences with setuptools so far. However, installing files into fixed locations is not a strength of setuptools; after all, it's a tool for distributing Python libraries, not for building installers for Python apps.
For the installation of files which are not application data files (like images, UI files etc.) but provide integration into the operating system, you are better off using a real packaging format (like RPM or deb).
That said, nothing stops you from having the build process based on setuptools and a small make file for installing everything into its rightful place.
You can try to use python-distutils-extra. The DistUtilsExtra.auto module automatically supports .desktop files, as well as Glade/GtkBuilder .ui files, Python modules and scripts, misc data files, etc.
It should work both with Distutils and Setuptools.
I've created https://pypi.python.org/pypi/install-freedesktop. It creates .desktop files automatically for the gui_scripts entry points, which can be customized through a setup argument, and supports --user as well as system-wide installation. Compared to DistUtilsExtra, it's more narrow in scope and IMHO more pythonic (explicit is better than implicit).
When I'm maintaining and distributing a Python package, should I keep the MANIFEST file that the command
python setup.py sdist
generates under version control, or should I add it to .gitignore?
The file is generated with some commonly used ideas of what files to include in the source distribution. If it's not there, it can easily be regenerated. Typically, if you want to make changes (for example, adding a file that the generation doesn't cover by default), you actually make changes to the MANIFEST.in file. The MANIFEST.in file you SHOULD keep under version control.
Of course, in some cases you might want to create the MANIFEST file yourself and not rely on the auto-generation at all. In those cases, you would want to version control the manifest file.
I have not encountered any need to version control it, but you might want to leave the question open for other comments, as I also don't have much experience with more elaborate package building.
I'm writing a Django application that is using pip & virtualenv to manage its development environment.
One of the dependencies, pkgme, comes with many data files which are its "backends" and are configured in its setup.py with data_files=$FOO (rather than package_data).
When pkgme looks for its backends, it looks in os.path.join(sys.prefix, "share", "pkgme", "backends"). This works great when pkgme has been installed normally, and seems to match the documentation but does not work when pkgme is installed as an egg.
There, the data files are installed under $VIRTUAL_ENV/lib/python2.7/site-packages/pkgme-0.1-py2.7.egg/share rather than the expected $VIRTUAL_ENV/share.
Which leaves me with two questions:
Should I be using something other than the os.path.join above to find the data files regardless of whether we are using an egg installation or a traditional system installation? If so, what?
Should I be distributing my data files differently so as to make them more readily available in an egg?
Note that I know about pkgutil.get_data, but would rather not use it. I'm not interested in the contents of these data files, I want to know their location instead, so I can execute them.
My current plan is to do this:
Use package_data instead of data_files
Change pkgme to look for backends relative to pkgme.__file__ rather than sys.prefix
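The second step of this plan can be sketched as follows. In pkgme itself the expression would use pkgme.__file__; the snippet demonstrates the pattern with the stdlib email package so it runs anywhere, and the backends/ subdirectory is a hypothetical name:

```python
import os
import email  # stands in for pkgme here; in pkgme you'd use pkgme.__file__

# Directory that contains the package's __init__.py.
package_dir = os.path.dirname(os.path.abspath(email.__file__))

# Hypothetical location of the shipped backends, relative to the package.
backends_dir = os.path.join(package_dir, "backends")
print(os.path.isdir(package_dir))
```

Note this only works when the package is installed unzipped; see the accepted answer for the zip-safe alternative.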
Your current plan is essentially correct, or is at any rate a workable option.
When setuptools creates an egg, it checks whether code in the egg makes use of __file__, and if so, it marks the egg as not being installable in compressed form. In this way, when the egg is installed by easy_install, it'll get extracted to an .egg/ directory instead of being left in an .egg file.
If you want to support compressed/drop-in installation (i.e., just dumping the egg in a directory without "installing" it), then you should use the pkg_resources.resource_filename() (docs here) API instead of __file__, but then your package will be dependent on setuptools or distribute in order to have that API available.
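A sketch of the resource_filename() approach (the pkgme backend path in the comment is hypothetical; the live call targets pkg_resources itself so the snippet runs anywhere setuptools is installed):

```python
import os
import pkg_resources  # provided by setuptools or distribute

# In pkgme this would look something like:
#   backend = pkg_resources.resource_filename("pkgme", "backends/debian")
# resource_filename() extracts the resource to a real file if the package is
# zipped, so the returned path can always be handed to subprocess.
path = pkg_resources.resource_filename("pkg_resources", "__init__.py")
print(os.path.exists(path))
```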
I ended up doing the following:
Changed pkgme to use pkg_resources.resource_filename() to find its own included backends
Added an entry point that any backend written in Python can use to publish the location of its own backend scripts
Kept the sys.prefix-based check for any backends that don't want to use Python
The diff can be found here: http://bazaar.launchpad.net/~pkgme-committers/pkgme/trunk/revision/86