I have an egg distribution of a PyQt application which I build myself, and it contains Sphinx-generated documentation. When I call the help file from the application, it opens the Sphinx index.html in a QtWebKit.QWebView window. Apparently, only the index.html file is extracted from the egg into the OS's egg directory (e.g. [..]\Application Data\Python-Eggs\ under Windows).
This results in broken CSS, broken images, and broken links, because the other files don't seem to get unpacked; they are present in the egg file, but not in the egg directory.
Am I missing something here? Is there a way to force unpacking of all HTML, CSS, and image files immediately?
I see that you've already found another way to do it, but for future reference, here's the non-workaround way to do it automatically, from the documentation at http://peak.telecommunity.com/DevCenter/setuptools#automatic-resource-extraction [emphasis added]:
If you are using tools that expect your resources to be "real" files, or your project includes non-extension native libraries or other files that your C extensions expect to be able to access, you may need to list those files in the eager_resources argument to setup(), so that the files will be extracted together
So, in this case, what you want to do is have:
eager_resources=['doc/sphinx/build/html', 'doc/sphinx/build/html/index.html']
in your setup.py, which will cause the 'html' directory to be recursively extracted when you ask for the index.html (assuming that 'doc' in your example is a top-level package).
(You can find out more about the eager_resources keyword in the docs at http://peak.telecommunity.com/DevCenter/setuptools#new-and-changed-setup-keywords)
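A minimal sketch of where that keyword might sit in a setup.py. The project name, version, package list and package_data pattern are hypothetical placeholders; only the eager_resources value is taken from the answer above. The arguments are built as a plain dict so they can be inspected before being handed to setuptools.

```python
# Sketch only: 'myapp', the version and the package list are assumptions.
SETUP_KWARGS = dict(
    name='myapp',
    version='0.1',
    packages=['doc', 'doc.sphinx'],
    # The generated HTML has to be shipped inside the egg in the first place.
    # Assumed pattern; subdirectories such as _static may need their own entries.
    package_data={'doc.sphinx': ['build/html/*']},
    # Ask for the whole html directory to be extracted along with index.html.
    eager_resources=[
        'doc/sphinx/build/html',
        'doc/sphinx/build/html/index.html',
    ],
)

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```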
def get_help_url(self):
    from pkg_resources import resource_filename
    from doc import sphinx
    import os
    from PyQt4.QtCore import QUrl
    html_path = resource_filename(sphinx.__name__, os.path.join('build', 'html'))
    return QUrl(os.path.join(html_path, 'index.html'))
instead of
    html = resource_filename(sphinx.__name__, os.path.join('build', 'html', 'index.html'))
    return QUrl(html)
did the trick.
Probable cause: not all the files are included in the egg in the first place.
Check this by unzipping the .egg (you might need to rename it to a .zip file for that on Windows) and checking whether all the contents are there.
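The same check can be done without renaming anything, since a zipped egg is an ordinary zip archive. A small sketch using only the standard library; the function name and the default suffix list are made up for illustration.

```python
import zipfile


def list_egg_members(egg_path, suffixes=('.html', '.css', '.png', '.js')):
    """Return the names of archive members that end with one of `suffixes`."""
    # zipfile does not care about the .egg extension.
    with zipfile.ZipFile(egg_path) as egg:
        return sorted(name for name in egg.namelist() if name.endswith(suffixes))
```

Calling it on the built egg (for example a hypothetical dist/myapp-0.1-py2.7.egg) and printing the result shows at a glance whether the Sphinx output made it into the archive.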
Look at how you made the egg. Do you use a MANIFEST.in file to tell setuptools which files to include? If not, you're probably relying on setuptools' automatic inclusion of Subversion files. All files under Subversion control automatically end up in the egg, and Python files do too; the rest does not.
The sphinx documentation is probably generated, so it is not in subversion, so it doesn't get included automatically.
Two solutions:
Use a MANIFEST.in file to manually specify (wildcards do work) all the files that should be included. Fail-safe as long as you're complete.
Or specify the html files as package_data, see How does setuptools decide which files to keep for sdist/bdist?
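For the package_data route, the patterns can also be generated rather than typed by hand, which helps when the Sphinx output has nested directories. A rough sketch with the standard library only; the helper name is hypothetical.

```python
import os


def collect_package_data(package_dir, data_subdir):
    """Return all files under package_dir/data_subdir, relative to package_dir."""
    collected = []
    root = os.path.join(package_dir, data_subdir)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # package_data entries are relative to the package and use '/' separators.
            rel = os.path.relpath(full_path, package_dir)
            collected.append(rel.replace(os.sep, '/'))
    return sorted(collected)
```

In setup.py the result could be passed as something like package_data={'doc.sphinx': collect_package_data('doc/sphinx', 'build/html')}, assuming doc/sphinx is the package directory as in the question.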
Related
I have a python package built from source code in /Document/pythonpackage directory
/Document/pythonpackage/> python setup.py install
This creates a folder in site-packages directory of python
import pythonpackage
print(pythonpackage.__file__)
>/anaconda3/lib/python3.7/site-packages/pythonpackage-x86_64.egg/pythonpackage/__init__.py
I am running a script on multiple environments, so the only path I know I will have is pythonpackage.__file__. However, /Document/pythonpackage has some data that is not in site-packages. Is there a way to automatically find the path to /Document/pythonpackage given that you only have access to the module in Python?
Working like that is discouraged. It's generally assumed that after installing a package the user can remove the directory it was installed from (as most automated package managers would do). Instead, you'd make sure your setup.py copies any data files over into the relevant places, and then your code picks them up from there.
assuming you're using the standard setuptools, you can see the docs on Including Data Files, which says at the bottom:
In summary, the three options allow you to:
include_package_data
Accept all data files and directories matched by MANIFEST.in.
package_data
Specify additional patterns to match files that may or may not be matched by MANIFEST.in or found in source control.
exclude_package_data
Specify patterns for data files and directories that should not be included when a package is installed, even if they would otherwise have been included due to the use of the preceding options.
and then says:
Typically, existing programs manipulate a package’s __file__ attribute in order to find the location of data files. However, this manipulation isn’t compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs. It is strongly recommended that, if you are using data files, you should use the ResourceManager API of pkg_resources to access them
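As a sketch of what reading a packaged data file without __file__ can look like: the example below uses the standard library's importlib.resources (Python 3.9+) rather than the pkg_resources API named in the quote, so treat it as a swapped-in equivalent. The helper name is hypothetical.

```python
from importlib import resources  # resources.files() needs Python 3.9+


def read_package_text(package, *path_parts, encoding='utf-8'):
    """Read a text data file shipped inside `package` without using __file__."""
    resource = resources.files(package)
    for part in path_parts:
        # Walk down one path component at a time.
        resource = resource / part
    return resource.read_text(encoding=encoding)
```

Something like read_package_text('pythonpackage', 'data', 'table.csv') would then work wherever the package is installed, provided the (hypothetical) data file was included as package data.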
Not sure, but you could create a repository for your module and use pip to install it. The egg folder would then have a file called PKG-INFO which would contain the URL of the repository you installed your module from.
I would like to create an egg from two directories and want to include .config and .log files. The structure of the directories is the following:
MSKDataDownloader
|_______configs
|________sensors.config
MSKSubscriber
|_______doc
|________dependencies.log
Here's my setup.py file:
from setuptools import setup, find_packages

setup(
    name='MSKDataDownloader',
    version='1.0.0',
    description='Data Downloader',
    packages=find_packages(),
    include_package_data=True,
    package_data={
        'MSKDataDownloader': ['config/*.config'],
        # Patterns for one package share a single key; a repeated key would drop the first list.
        'MSKSubscriber': ['doc/*.log', 'config/*.config'],
    }
)
What am I doing wrong? Why is it not including the .config and .log files in the egg?
The problem is that include_package_data=True doesn't mean what you think it means (or what most reasonable people would think it means). The short version is, just get rid of it.
From the docs:
If set to True, this tells setuptools to automatically include any data files it finds inside your package directories that are specified by your MANIFEST.in file. For more information, see the section below on Including Data Files.
If you follow the link, you'll see that it in fact makes setuptools ignore whatever you told it explicitly in package_data, and instead look for every file mentioned in MANIFEST.in and find it within your directory tree (or source control tree):
If using the setuptools-specific include_package_data argument, files specified by package_data will not be automatically added to the manifest unless they are listed in the MANIFEST.in file.
And, since you don't have a MANIFEST.in, this means you end up with nothing.
So, you want to do one of two things:
Remove include_package_data=True.
Create a MANIFEST.in and remove package_data=….
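A rough sketch of the first option, using the names from the question. The explicit package list stands in for find_packages() and assumes both directories are importable packages; the arguments are kept in a dict so they can be checked before calling setup().

```python
# Sketch of option one: package_data only, no include_package_data.
SETUP_KWARGS = dict(
    name='MSKDataDownloader',
    version='1.0.0',
    description='Data Downloader',
    # Assumption: both directories contain an __init__.py.
    packages=['MSKDataDownloader', 'MSKSubscriber'],
    package_data={
        'MSKDataDownloader': ['config/*.config'],
        # One key per package: a repeated key would silently replace the earlier one.
        'MSKSubscriber': ['doc/*.log', 'config/*.config'],
    },
)

# In setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```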
This is all complicated by the fact that there are lots of examples and blog posts and tutorials left over from the distribute days [1] that are just plain wrong for modern setuptools. In fact, there are a whole lot more out-of-date and wrong posts out there than correct ones.
The obvious answer is to just only use the tutorials and examples from the PyPA on pypa.org… but unfortunately, they haven't yet written tutorials that cover anywhere near everything that you need.
So, often, you pretty much have to read old tutorials, then look up everything they tell you in the reference docs to see which parts are wrong.
1. IIRC, in distribute, the include_package_data=True would cause your extra files to get added to an sdist, just not to anything else. Which still sounds useless, right? Except that you could make your egg and other distributions depend on building the sdist then running a script that generates the MANIFEST.in programmatically. Which was useful for… I forget, something to do with pulling version files from source control maybe?
Using this general structure:
setup.py
/package
    __init__.py
    project.py
    /data
        client.log
I have a script that saves a list of names to client.log, so I don't have to reinitialize that list each time I need access to it or run the module. Before I set up this structure with pkg_resources, I used open('.../data/client.log', 'w') to update the log with explicit paths, but this doesn't work anymore.
Is there any way to edit data files within modules? Or is there a better way to save this list?
No, pkg_resources is for reading resources within a package. You can't use it to write log files, because a package is the wrong place for log files. Your package directory should typically not be writeable by the user that loads the library. Also, your package may in fact be inside a ZIP file.
You should instead store the logs in a log directory. Where to put that depends on a lot of things: the biggest factor is your operating system, but it also matters whether it's system software or user software.
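One way to pick such a directory for user software is sketched below. The locations are common conventions (the XDG state directory on Linux, ~/Library/Logs on macOS, %LOCALAPPDATA% on Windows) chosen here as assumptions, not something any packaging tool mandates, and the function name is made up.

```python
import os
import sys
from pathlib import Path


def user_log_dir(app_name):
    """Return a per-user directory for log files, creating it if needed."""
    home = Path.home()
    if sys.platform.startswith('win'):
        base = Path(os.environ.get('LOCALAPPDATA', home / 'AppData' / 'Local'))
        log_dir = base / app_name / 'Logs'
    elif sys.platform == 'darwin':
        log_dir = home / 'Library' / 'Logs' / app_name
    else:
        # Fall back to the XDG default when XDG_STATE_HOME is not set.
        base = Path(os.environ.get('XDG_STATE_HOME', home / '.local' / 'state'))
        log_dir = base / app_name
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir
```

The list of names could then be written with open(user_log_dir('package') / 'client.log', 'w'), which keeps the installed package itself read-only.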
I have a Python project that has the following structure:
package1
    class.py
    class2.py
    ...
package2
    otherClass.py
    otherClass2.py
    ...
config
    dev_settings.ini
    prod_settings.ini
I wrote a setup.py file that converts this into an egg with the same file structure. (When I examine it using a zip program the structure seems identical.) The funny thing is, when I run the Python code from my IDE it works fine and can access the config files; but when I try to run it from a different Python script using the egg, it can't seem to find the config files in the egg. If I put the config files into a directory relative to the calling Python script (external to the egg), it works - but that sort of defeats the purpose of having a self-contained egg that has all the functionality of the program and can be called from anywhere. I can use any classes/modules and run any functions from the egg as long as they don't use the config files... but if they do, the egg can't find them and so the functions don't work.
Any help would be really appreciated! We're kind of new to the egg thing here and don't really know where to start.
The problem is, the config files are not files anymore - they're packaged within the egg. It's not easy to find the answer in the docs, but it is there. From the setuptools developer's guide:
Typically, existing programs manipulate a package's __file__ attribute in order to find the location of data files. However, this manipulation isn't compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs.
To access them, you need to follow the instructions for the Resource Management API.
In my own code, I had this problem with a logging configuration file. I used the API successfully like this:
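import logging.config
from pkg_resources import resource_stream

# Open the config file through the resource API so it also works inside a zipped egg.
_log_config_file = 'logging.conf'
_log_config_location = resource_stream(__name__, _log_config_file)
# Note: on Python 3 this stream is binary; wrapping it in io.TextIOWrapper may be needed.
logging.config.fileConfig(_log_config_location)
_log = logging.getLogger('package.module')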
See Setuptools' discussion of accessing packaged data files at runtime. You have to get at your configuration file a different way if you want the script to work inside an egg. Also, for that to work, you may need to make your config directory a Python package by tossing in an empty __init__.py file.
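To see the effect in isolation, here is a self-contained sketch that builds a tiny zip archive, imports a package from it, and reads a settings file through the package's loader. It uses the standard library's pkgutil.get_data instead of pkg_resources, and the package name and file contents are invented for the demonstration.

```python
import configparser
import os
import pkgutil
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Create a zip archive holding a hypothetical package with one config file."""
    archive = os.path.join(directory, 'demo_bundle.zip')
    with zipfile.ZipFile(archive, 'w') as zf:
        # The empty __init__.py is what makes the directory an importable package.
        zf.writestr('demo_cfg_pkg/__init__.py', '')
        zf.writestr('demo_cfg_pkg/dev_settings.ini', '[db]\nhost = localhost\n')
    return archive


archive_path = build_demo_archive(tempfile.mkdtemp())
sys.path.insert(0, archive_path)  # import straight from the archive, as with a zipped egg

# The file has no path of its own on disk, but the loader can return its bytes.
raw = pkgutil.get_data('demo_cfg_pkg', 'dev_settings.ini')
parser = configparser.ConfigParser()
parser.read_string(raw.decode('utf-8'))
db_host = parser.get('db', 'host')
```

Opening os.path.join(archive_path, 'demo_cfg_pkg', 'dev_settings.ini') directly would fail here, which is the same failure the question describes.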
What's the magic "python setup.py some_incantation_here" command to upload a package to PyPI, in a form that can be downloaded to get the original package in its original form?
I have a package with some source and a few image files (as package_data). If I do "setup.py sdist register upload", the .tar.gz has the image files excluded. If I do "setup.py bdist_egg register upload", the egg contains the images but excludes the setup.py file. I want to be able to get a file uploaded that is just the entirety of my project -- aka "setup.py the_whole_freaking_thing register upload".
Perhaps the best way to do this is to manually tar.gz my project directory and upload it using the PyPI web interface?
Caveat: I'm trying to avoid having to store a simple project I just created in my SVN repo as well as on PyPI -- it seems like a waste of work to keep track of its history and files in two places.
When you run the "sdist" command, the list of included files is controlled by the "MANIFEST.in" file sitting next to "setup.py", not by whatever you have listed in "package_data". This has something to do with the schizophrenic nature of the Python packaging solutions today; "sdist" is powered by the distutils in the standard library, while "bdist_egg" is controlled by the setuptools module.
To solve the problem, try creating a MANIFEST.in next to your setup.py file, and give it contents like this:
include *.jpg
Of course, I'm imagining that your "image files" are actual pictures rather than disk images or ISO images or something; you might have to adjust the above line if I've guessed wrong! But check out the Specifying which files to distribute section of the distutils docs, and see whether you can't get those files appearing in your .tar.gz source distribution! Good luck.
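To confirm that the MANIFEST.in change took effect, the resulting .tar.gz can be inspected before uploading. A small sketch with the standard library; the function name and the default pattern are just for illustration.

```python
import fnmatch
import tarfile


def sdist_members(sdist_path, pattern='*.jpg'):
    """Return the members of a .tar.gz sdist whose names match `pattern`."""
    with tarfile.open(sdist_path, 'r:gz') as archive:
        # fnmatch's '*' also matches '/', so nested files are found too.
        return sorted(n for n in archive.getnames() if fnmatch.fnmatch(n, pattern))
```

An empty list for the archive produced in dist/ would mean the images are still being left out.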