Distutils package name - python

I am packaging a python package using distutils.
My structure looks like this:
src
\ __init__.py
| util.py
| client
\ __init__.py
| file1.py
| file2.py
In setup.py my package is:
package_dir={'pkg_name':'src'},
packages=['pkg_name','pkg_name.client']
I would like to be able to use the content of client without out having to import pkg_name.client, but just by importing pkg.
Is that possible? I just read https://docs.python.org/2/distutils/setupscript.html#listing-whole-packages. The examples I saw are still all referencing the directory name (or a different one). But is it possible to remove that, and just have it included in root? I've tried a bunch of variations but they are all failing.
Something like package_dir = {'': 'src', '': 'src/client'}
In python, the directory name that init will still need to be referenced but I am hoping distutil has some trick around that.

When I first packaged, I also tried a bunch of variations, will get it working.
If your directory structure can be changed, my suggestion is to use a tool I created to myself and made it public, called machete. There are others, like cookiecutter.
https://pypi.python.org/pypi/machete/
In a empty folder, you just type:
/home/you $ machete -t app
This will generate all the structure needed, and you just need to understand its logic, and fill it with code. The structure is very different from yours, because a package has some needs.
Creating a package for the first time is kind of frustrating.
Aditionally, one tricky thing machete does is to add this on modules folder:
file: __init__.py
# -*- coding: UTF-8 -*-
import glob
__all__ = []
py_files = glob.glob('./yourproject/modules/*.py')
py_files.remove('./yourproject/modules/__init__.py')
for py_file in py_files:
__all__.append(py_file.split('/')[3].split('.')[0])
This way, importing modules.*, perhaps you can achieve your goal with your directory structure. This one works for machete generated projects.
Hope one of my suggestions helps you out.
Marco

Related

Use __init__.py to modify sys path is a good idea?

I want to ask you something that came to my mind doing some stuff.
I have the following structure:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
I class2.py I want to import class1 to use it. Obviously, I cannot use
from src.class1 import Class1
cause it will produce an error. A workaround that works to me is to define the following in the __init__.py inside folder2:
import sys
sys.path.append('src')
My question is if this option is valid and a good idea to use or maybe there are better solutions.
Another question. Imagine that the project structure is:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
+ errorsFolder
- __init__.py
- errors.py
In class1:
from errorsFolder.errors import Errors
this works fine. But if I try to do in class2 which is at the same level than errorsFolder:
from src.errorsFolder.errors import Errors
It fails (ImportError: No module named src.errorsFolder.errors)
Thank you in advance!
Despite the fact that it's slightly shocking to have to import a "parent" module in your package, your workaround depends on the current directory you're running your application, which is bad.
import sys
sys.path.append('src')
should be
import sys,os
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)),os.pardir))
to add the parent directory of the directory of your current module, regardless of the current directory you're running your application from (your module may be imported by several applications, which don't all require to be run in the same directory)
One correct way to solve this is to set the environment variable PYTHONPATH to the path which contains src. Then import src.class1 will always work, regardless of which directory you start in.
from ..class1 import Class1 should work (at least it does here in a similar layout, using python 2.7.x).
As a general rule: messing with sys.path is usually a very bad idea, specially if this allows a same module to be imported from two different paths (which would be the case with your files layout).
Also, you may want to think twice about your current layout. Python is not Java and doesn't require (nor even encourage) a "one class per module" approach. If both classes need to work together, they might be better in a same module, or at least in modules at the same level in the package tree (note that you can use the top-level package's __init__ as a facade to provide direct access to objects defined in submodules / subpackages). NB : I'm not saying that your current layout is necessarily wrong, just that it might not be the simplest one.
There is also a solution that does not involve the use of the environment variable PYTHONPATH.
In src/ create setup.py with these contents:
from setuptools import setup
setup()
Now you can do pip install -e /path/to/src and import to your hearts content.
-e, --editable <path/url> Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
No, Its not good. Python takes modules in two ways:
Python looks for its modules and packages in $PYTHONPATH.
Refer: https://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH
To find out what is included in $PYTHONPATH, run the following code in python (3):
import sys
print(sys.path)
all folder containing init.py are marked as python package.(if they are subdirectories under PYTHONPATH)
By help of these two ways you can full-fill the need to create a python project.

Python finding some, not all custom packages

I have a project with the following file structure:
root/
run.py
bot/
__init__.py
my_discord_bot.py
dice/
__init__.py
dice.py
# dice files
help/
__init__.py
help.py
# help files
parser/
__init__.py
parser.py
# other parser files
The program is run from within the root directory by calling python run.py. run.py imports bot.my_discord_bot and then makes use of a class defined there.
The file bot/my_discord_bot.py has the following import statements:
import dice.dice as d
import help.help as h
import parser.parser as p
On Linux, all three import statements execute correctly. On Windows, the first two seem to execute fine, but then on the third I'm told:
ImportError: No module named 'parser.parser'; 'parser' is not a package
Why does it break on the third import statement, and why does it only break on Windows?
Edit: clarifies how the program is run
Make sure that your parser is not shadowing a built-in or third-party package/module/library.
I am not 100% sure about the specifics of how this name conflict would be resolved, but it seems like you can potentially a). have your module overridden by the existing module (which seems like it might be happening in your Windows case), or b). override the existing module, which could cause bugs down the road. It seems like b is what commonly trips people up.
If you think this might be happening with one of your modules (which seems fairly likely with a name like parser), try renaming your module.
See this very nice article for more details and more common Python "import traps".
Put run.py outside root folder, so you'll have run.py next to root folder, then create __init__.py inside root folder, and change imports to:
import root.parser.parser as p
Or just rename your parser module.
Anyway you should be careful with naming, because you can simply mess your own stuff someday.

How to properly import python modules from an adjacent folder?

I'm trying to follow Learn Python the Hard Way to teach myself python. I want to use the directory structure described in Exercise 46, which for my question I'll simplify down to this:
bin/
app.py
data/
__init__.py
foobar.py
In exercise 50, he says to start the program from the project's top level directory like this:
$ python bin/app.py
Afterwards stating that you start it from the top level directory so the script can access other resources in the project.
But I can't seem to import modules that are in the data folder from app.py. Am I misunderstanding how to setup the directory structure?
Edit: Here's the bare-bones setup I have to try and figure this out
In app.py I have:
import data.foobar
I have __init__.py in the data directory and foobar.py just contains some nonsense like:
class Test:
x = 0
The directory structure matches that of above.
I'm not sure what the exercise asks to do, but your top-level directory needs to be in the PYTHONPATH. Try:
$ export PYTHONPATH=$PYTHONPATH:$PWD
$ python bin/app.py

python import path: packages with the same name in different folders

I am developing several Python projects for several customers at the same time. A simplified version of my project folder structure looks something like this:
/path/
to/
projects/
cust1/
proj1/
pack1/
__init__.py
mod1.py
proj2/
pack2/
__init__.py
mod2.py
cust2/
proj3/
pack3/
__init__.py
mod3.py
When I for example want to use functionality from proj1, I extend sys.path by /path/to/projects/cust1/proj1 (e.g. by setting PYTHONPATH or adding a .pth file to the site_packages folder or even modifying sys.path directly) and then import the module like this:
>>> from pack1.mod1 import something
As I work on more projects, it happens that different projects have identical package names:
/path/
to/
projects/
cust3/
proj4/
pack1/ <-- same package name as in cust1/proj1 above
__init__.py
mod4.py
If I now simply extend sys.path by /path/to/projects/cust3/proj4, I still can import from proj1, but not from proj4:
>>> from pack1.mod1 import something
>>> from pack1.mod4 import something_else
ImportError: No module named mod4
I think the reason why the second import fails is that Python only searches the first folder in sys.path where it finds a pack1 package and gives up if it does not find the mod4 module in there. I've asked about this in an earlier question, see import python modules with the same name, but the internal details are still unclear to me.
Anyway, the obvious solution is to add another layer of namespace qualification by turning project directories into super packages: Add __init__.py files to each proj* folder and remove these folders from the lines by which sys.path is extended, e.g.
$ export PYTHONPATH=/path/to/projects/cust1:/path/to/projects/cust3
$ touch /path/to/projects/cust1/proj1/__init__.py
$ touch /path/to/projects/cust3/proj4/__init__.py
$ python
>>> from proj1.pack1.mod1 import something
>>> from proj4.pack1.mod4 import something_else
Now I am running into a situation where different projects for different customers have the same name, e.g.
/path/
to/
projects/
cust3/
proj1/ <-- same project name as for cust1 above
__init__.py
pack4/
__init__.py
mod4.py
Trying to import from mod4 does not work anymore for the same reason as before:
>>> from proj1.pack4.mod4 import yet_something_else
ImportError: No module named pack4.mod4
Following the same approach that solved this problem before, I would add yet another package / namespace layer and turn customer folders into super super packages.
However, this clashes with other requirements I have to my project folder structure, e.g.
Development / Release structure to maintain several code lines
other kinds of source code like e.g. JavaScript, SQL, etc.
other files than source files like e.g. documents or data.
A less simplified, more real-world depiction of some project folders looks like this:
/path/
to/
projects/
cust1/
proj1/
Development/
code/
javascript/
...
python/
pack1/
__init__.py
mod1.py
doc/
...
Release/
...
proj2/
Development/
code/
python/
pack2/
__init__.py
mod2.py
I don't see how I can satisfy the requirements the python interpreter has to a folder structure and the ones that I have at the same time. Maybe I could create an extra folder structure with some symbolic links and use that in sys.path, but looking at the effort I'm already making, I have a feeling that there is something fundamentally wrong with my entire approach. On a sidenote, I also have a hard time believing that python really restricts me in my choice of source code folder names as it seems to do in the case depicted.
How can I set up my project folders and sys.path so I can import from all projects in a consistent manner if there are project and packages with identical names ?
This is the solution to my problem, albeit it might not be obvious at first.
In my projects, I have now introduced a convention of one namespace per customer. In every customer folder (cust1, cust2, etc.), there is an __init__.py file with this code:
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
All the other __init__.py files in my packages are empty (mostly because I haven't had the time yet to find out what else to do with them).
As explained here, extend_path makes sure Python is aware there is more than one sub-package within a package, physically located elsewhere and - from what I understand - the interpreter then does not stop searching after it fails to find a module under the first package path it encounters in sys.path, but searches all paths in __path__.
I can now access all code in a consistent manner criss-cross between all projects, e.g.
from cust1.proj1.pack1.mod1 import something
from cust3.proj4.pack1.mod4 import something_else
from cust3.proj1.pack4.mod4 import yet_something_else
On a downside, I had to create an even deeper project folder structure:
/path/
to/
projects/
cust1/
proj1/
Development/
code/
python/
cust1/
__init__.py <--- contains code as described above
proj1/
__init__.py <--- empty
pack1/
__init__.py <--- empty
mod1.py
but that seems very acceptable to me, especially considering how little effort I need to make to maintain this convention. sys.path is extended by /path/to/projects/cust1/proj1/Development/code/python for this project.
On a sidenote, I noticed that of all the __init__.py files for the same customer, the one in the path that appears first in sys.path is executed, no matter from which project I import something.
You should be using the excellent virtualenv and virtualenvwrapper tools.
What happens if you accidentally import code from one customer/project in another and don't notice? When you deliver it will almost certainly fail. I would adopt a convention of having PYTHONPATH set up for one project at a time, and not try to have everything you've ever written be importable at once.
You can use a wrapper script per-project to set PYTHONPATH and start python, or use scripts to switch environments when you switch projects.
Of course some projects well have dependencies on other projects (those libraries you mentioned), but if you intend for the customer to be able to import several projects at once then you have to arrange for the names to not clash. You can only have this problem when you have multiple projects on the PYTHONPATH that aren't supposed to be used together.

Problem with python packages and tests

I have a directory struture like that:
project
| __init__.py
| project.py
| src/
| __init__.py
| class_one.py
| class_two.py
| test/
| __init__.py
| test_class_one.py
Which project.py just instantiate ClassOne and run it.
My problem is in the tests, I don't know how to import src classes. I've tried importing these ways and I got nothing:
from project.src.class_one import ClassOne
and
from ..src.class_one import ClassOne
What am I doing wrong? Is there a better directory structure?
----- EDIT -----
I changed my dir structure, it's like this now:
Project/
| project.py
| project/
| __init__.py
| class_one.py
| class_two.py
| test/
| __init__.py
| test_class_one.py
And in the test_class_one.py file I'm trying to import this way:
from project.class_one import ClassOne
And it still doesn't work. I'm not using the executable project.py inside a bin dir exactly because I can't import a package from a higher level dir. :(
Thanks. =D
It all depends on your python path. The easiest way to achieve what you're wanting to do here is to set the PYTHONPATH environment variable to where the "project" directory resides. For example, if your source is living in:
/Users/crocidb/src/project/
I would:
export PYTHONPATH=/Users/crocidb/src
and then in the test_one.py I could:
import project.src.class_one
Actually I would probably do it this way:
export PYTHONPATH=/Users/crocidb/src/project
and then this in test_one.py:
import src.class_one
but that's just my preference and really depends on what the rest of your hierarchy is. Also note that if you already have something in PYTHONPATH you'll want to add to it:
export PYTHONPATH=/Users/crocidb/src/project:$PYTHONPATH
or in the other order if you want your project path to be searched last.
This all applies to windows, too, except you would need to use windows' syntax to set the environment variables.
From Jp Calderone's excellent blog post:
Do:
name the directory something related to your project. For example, if your
project is named "Twisted", name the
top-level directory for its source
files Twisted. When you do releases,
you should include a version number
suffix: Twisted-2.5.
create a directory Twisted/bin and put your executables there, if you
have any. Don't give them a .py
extension, even if they are Python
source files. Don't put any code in
them except an import of and call to a
main function defined somewhere else
in your projects. (Slight wrinkle:
since on Windows, the interpreter is
selected by the file extension, your
Windows users actually do want the .py
extension. So, when you package for
Windows, you may want to add it.
Unfortunately there's no easy
distutils trick that I know of to
automate this process. Considering
that on POSIX the .py extension is a
only a wart, whereas on Windows the
lack is an actual bug, if your
userbase includes Windows users, you
may want to opt to just have the .py
extension everywhere.)
If your project is expressable as a single Python source file, then put it
into the directory and name it
something related to your project. For
example, Twisted/twisted.py. If you
need multiple source files, create a
package instead (Twisted/twisted/,
with an empty
Twisted/twisted/__init__.py) and place
your source files in it. For example,
Twisted/twisted/internet.py.
put your unit tests in a sub-package of your package (note - this means
that the single Python source file
option above was a trick - you always
need at least one other file for your
unit tests). For example,
Twisted/twisted/test/. Of course, make
it a package with
Twisted/twisted/test/__init__.py.
Place tests in files like
Twisted/twisted/test/test_internet.py.
add Twisted/README and Twisted/setup.py to explain and
install your software, respectively,
if you're feeling nice.
Don't:
put your source in a directory called src or lib. This makes it hard
to run without installing.
put your tests outside of your Python package. This makes it hard to
run the tests against an installed
version.
create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module
instead of a package, it's simpler.
try to come up with magical hacks to make Python able to import your module
or package without having the user add
the directory containing it to their
import path (either via PYTHONPATH or
some other mechanism). You will not
correctly handle all cases and users
will get angry at you when your
software doesn't work in their
environment.

Categories