Inconsistent behaviour of bdist vs sdist when distributing a Python package - python

I have a big project with the following structure. utilities is a collection of small modules that are reused in various places by the different components of big_project (project1, project2, etc.).
big_project/
|-- __init__.py
|-- utilities/
|   |-- mod1.py
|   |-- mod2.py
|-- project1/
|   |-- setup.py
|   |-- __init__.py
|   |-- src/
|   |   |-- __init__.py
|   |   |-- mod1.py
|   |   |-- mod2.py
|   |-- examples/
|   |   |-- __init__.py
|   |   |-- mod.py
|-- project2/
|   |-- ...
|-- project3/
|   |-- ...
I want to distribute project1, including utilities (because I don't want to distribute utilities separately). The distributed package would have the following structure:
project1/
|-- utilities/
|-- src/
|-- examples/
and project1/setup.py looks like this:
setup(
    name = 'project1',
    packages = ['project1.utilities', 'project1.src', 'project1.examples'],
    package_dir = {'project1.utilities': '../utilities/',
                   'project1.src': 'src',
                   'project1.examples': 'examples'}
)
The problem: python setup.py bdist produces a distribution with the right structure, but python setup.py sdist doesn't:
bdist: content of project1-0.1.linux-x86_64.tar.gz:
/./usr/local/lib/python2.7/site-packages/
|-- project1/
|   |-- utilities
|   |-- src
|   |-- examples
sdist: content of project1-0.1.tar.gz:
project1/
|-- src/
|-- examples/
So sdist left out the utilities module, whereas bdist included it at the correct location. Why?
If anyone wants to look at the real project: https://testpypi.python.org/pypi/microscopy where both the bdist and sdist archives are available.
Both setuptools and distutils produce the same result. Because the project is pure Python, I'd rather use sdist...

One way that seems to work is to use bdist_wheel, which despite its name produces a platform-independent built distribution when the content is pure Python. And wheels are supposed to be the new standard.
setup.py also needs to be told about the root package project1, otherwise project1/__init__.py is missing:
setup(
    name = 'project1',
    packages = ['project1',
                'project1.utilities',
                'project1.src',
                'project1.examples'],
    package_dir = {'project1': '.',
                   'project1.utilities': '../utilities/',
                   'project1.src': 'src',
                   'project1.examples': 'examples'}
)
and then
python2.7 setup.py bdist_wheel
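Note that the bdist_wheel command typically requires the wheel package to be installed first (pip install wheel); setuptools of that era did not ship it by itself.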

I suggest updating your MANIFEST.in file to include the utilities folder, e.g.:
recursive-include ../utilities *

Related

Python package not installing submodules

I have created a package with the following structure in the dev branch (not merging to main until I verify the package installs correctly):
mypackage
|-- __init__.py
|-- setup.py
|-- requirements.txt
|-- module.py
|-- subpackage_one
|   |-- __init__.py
|   |-- module_ab.py
|   |   |-- class_aba
|   |   |-- class_abb
|   |-- module_ac.py
|   |   |-- function_aca
|-- subpackage_two
|   |-- __init__.py
|   |-- module_ba.py
|   |   |-- function_baa
Additional information:
The __init__.py files at root and in subpackage_two are both empty
The __init__.py file in subpackage_one contains some additional initialization in the form of from mypackage.subpackage_one.module_xx import class_xxx (or function_xxx)
I am installing the package via pip install git+https://github.com/organization/repo.git#dev
If I am in the root directory of the package, I can import the submodules as expected
The setup.py file is:
import setuptools

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name='mypackage',
    version='0.0.2',
    author='author1, author2',
    author_email='author1_email, author2_email',
    description='My Package',
    long_description=long_description,
    long_description_content_type="text/markdown",
    url='https://github.com/organization/repo',
    packages=['mypackage'],
    install_requires=['requests'],
)
When I run the following snippet:
import pkgutil
import mypackage

for i in pkgutil.iter_modules(mypackage.__path__):
    print(i)
I see:
ModuleInfo(module_finder=FileFinder('/path/to/package/mypackage'), name='module', ispkg=False)
And indeed, the subpackages are not in the mypackage folder.
How can I get the subpackages to install along with the package?
Your issue might be the packages parameter. It needs to list every package and subpackage, not just the top-level one.
setuptools has a convenience function to find them for you; use it like this: packages=setuptools.find_namespace_packages(),
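As a minimal sketch of the setup.py above with that change (metadata trimmed; since every subpackage here has an __init__.py, find_packages() would work too):
import setuptools

setuptools.setup(
    name='mypackage',
    version='0.0.2',
    # find_namespace_packages() walks the source tree and returns
    # 'mypackage', 'mypackage.subpackage_one', 'mypackage.subpackage_two', ...
    packages=setuptools.find_namespace_packages(),
    install_requires=['requests'],
)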

Python module vs sub-module vs package vs sub-package

In Python, what are the differences between a module, a sub-module, a package and a sub-package?
package
|-- __init__.py
|-- module.py
|-- sub_package
|   |-- __init__.py
|   |-- sub_module.py
Consider packages and sub-packages as folders and sub-folders that contain an __init__.py file along with other Python files.
Modules are the Python files inside the package.
Sub-modules are the Python files inside the sub-package.
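Given the layout above, the corresponding imports look like this (a quick sketch, assuming the directory containing package is on sys.path):
import package                        # package: a folder with __init__.py
import package.module                 # module: a .py file inside the package
import package.sub_package            # sub-package: a nested folder with __init__.py
from package.sub_package import sub_module  # sub-module: a .py file in the sub-package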

How to add template Python files in package without compiling them?

I'm trying to create Python package with the following structure:
project/
|-- project/
| |-- __init__.py
| |-- templates/
| | |-- somefile.py
|-- setup.py
somefile.py is just a template file that is not syntactically correct.
My setup.py looks like this:
#!/usr/bin/env python
import os
from setuptools import setup, find_packages

setup(name="...",
      version="1.1",
      ...
      packages=find_packages(),
      package_data={
          'project': ['templates/*'],
      })
This works great for non-Python template files. But with the .py files, setuptools tries to compile somefile.py, which results in a syntax error, since the file is on purpose not syntactically correct. So, how can I add the template Python files to my package without compiling them?
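One workaround (my addition, not from the original thread) is to give the template a non-.py suffix such as somefile.py.tmpl, so that package_data ships it as plain data and nothing ever tries to byte-compile it; it can then be read at runtime, for example:
import pkgutil

# Load the shipped template as raw bytes: it is treated as data and
# never imported, so its intentionally invalid syntax is harmless.
template = pkgutil.get_data('project', 'templates/somefile.py.tmpl')

with open('generated.py', 'wb') as out:
    out.write(template)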

Python multi-project build

I'm in the process of splitting up a monolithic project code base into several smaller projects. I'm having a hard time understanding how to handle dependencies amongst the different projects properly.
The structure looks somewhat like this:
SCM_ROOT
|-- core
| |-- src
| `-- setup.py
|-- project1
| |-- src
| `-- setup.py
|-- project2
| |-- src
| `-- setup.py
`-- project3
|-- src
`-- setup.py
What's the recommended way to handle dependencies between multi-package projects and set up a development environment? I'm using pip, virtualenv and requirements.txt files. Are there any tools that allow me to bootstrap my environment from the repository quickly?
Using a build tool like Pybuilder or Pants was unnecessarily complicating the process. I ended up splitting it up into multiple projects in svn, each with its own trunk/tags/branches directories. Dependencies are handled using a combination of install_requires and requirements.txt files, based on information from here and here. Each project has a fabfile to run common tasks like clean, build, upload to PyPI, etc.
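As an illustrative sketch of that arrangement (names and versions are placeholders, not the poster's actual files): each project declares its sibling dependencies abstractly in install_requires, and a requirements.txt at SCM_ROOT lists the checkouts in editable mode (-e ./core, -e ./project1, ...) so a fresh clone can be bootstrapped with pip install -r requirements.txt inside a virtualenv.
# project1/setup.py -- illustrative sketch only
from setuptools import setup, find_packages

setup(
    name='project1',
    version='0.1',
    package_dir={'': 'src'},
    packages=find_packages('src'),
    # abstract dependency on the sibling 'core' project; pip resolves it
    # from an index or from an editable checkout in requirements.txt
    install_requires=['core'],
)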

Handling util functions in python

In our current C project, we use Python scripts for support and testing purposes such as unit testing, integration testing, benchmarking and communication.
Current folder structure (most files not shown):
.
|-- workingcopy1
|   |-- config
|   |   |-- __init__.py
|   |   |-- parameters_one.py
|   |   |-- parameters_two.py
|   |   |-- parameters_three.py
|   |-- src
|   |   |-- main_application_c_code.c
|   |-- tests
|   |   |-- tools
|   |   |   |-- display_communication_activity.py
|   |   |-- run_all_unit_tests.py
|   |-- tools
|   |   |-- script1.py
|   |   |-- script2.py
|   |   |-- script3.py
|   |-- utils
|   |   |-- python_utils
|   |   |   |-- __init__.py
|   |   |   |-- communication_utils.py
|   |   |   |-- conversion_utils.py
|   |   |   |-- constants.py
|   |   |   |-- time_utils.py
|-- workingcopy2
|   ...
|-- workingcopy3
|   ...
Some Python files are intended to be executed as script files ($ python script1.py) and some are intended to be imported as modules by other Python files.
What we would like to achieve is a structure that gives us parameters and utility functions that can be used by:
Test code
Other utility code
Smaller Python applications used for monitoring our system, i.e. custom benchmarking tools
It should also be possible to have several working copies checked out.
Up until now, all scripts have had the following lines at the top:
import os, sys

current_path = os.path.abspath(os.path.dirname(__file__))
sys.path.append(os.path.join(current_path, '..', 'utils'))   # varies depending on the script's location in the tree
sys.path.append(os.path.join(current_path, '..', 'config'))  # varies depending on the script's location in the tree

import python_utils.constants as constants
import config.parameters_one as parameters
With about 20+ script files this has become hard to maintain. Is there any better way to achieve this?
You should convert your folders into Python packages by adding an empty __init__.py file to each of them.
Also, you can add a Python shebang line so the scripts are executable without explicitly calling the Python interpreter from the shell (see: Should I put #! (shebang) in Python scripts, and what form should it take?).
Once your folders are packages, you only have to add the main source path to sys.path, and you will be able to import the child modules in an easier manner.
Furthermore, you should use a virtual environment (virtualenv), which can handle the paths for you (http://docs.python-guide.org/en/latest/dev/virtualenvs/), and maybe virtualenvwrapper, which adds extra functionality.
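Concretely, once utils and config are packages (each with an __init__.py) and the working-copy root is on the path, for example via PYTHONPATH or a .pth file in the virtualenv, the per-script boilerplate above shrinks to plain imports (a sketch based on the tree in the question):
# no per-script sys.path manipulation needed any more
from utils.python_utils import constants
from config import parameters_one as parameters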
I wanted to add a couple of additional strategies you could use here:
One of the cool things about Python is that everything is an object, so you could import your script modules and pass them as variables to a function that runs them, initialising the appropriate path. Also, the function could "discover" the scripts by looking into the folder and walking through it.
Again, all of this can be easily handled from pre-/post-activate virtualenvwrapper hooks.
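A minimal sketch of that discovery idea (the tools package name is just an example; it assumes tools/ has an __init__.py):
import importlib
import pkgutil

import tools  # hypothetical package containing the script modules


def run_all(package):
    """Import every module found in `package` and call its main(), if any."""
    for finder, name, ispkg in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(package.__name__ + '.' + name)
        if hasattr(module, 'main'):
            module.main()


run_all(tools)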
