Python finding some, not all custom packages - python

I have a project with the following file structure:
root/
run.py
bot/
__init__.py
my_discord_bot.py
dice/
__init__.py
dice.py
# dice files
help/
__init__.py
help.py
# help files
parser/
__init__.py
parser.py
# other parser files
The program is run from within the root directory by calling python run.py. run.py imports bot.my_discord_bot and then makes use of a class defined there.
The file bot/my_discord_bot.py has the following import statements:
import dice.dice as d
import help.help as h
import parser.parser as p
On Linux, all three import statements execute correctly. On Windows, the first two seem to execute fine, but then on the third I'm told:
ImportError: No module named 'parser.parser'; 'parser' is not a package
Why does it break on the third import statement, and why does it only break on Windows?
Edit: clarifies how the program is run

Make sure that your parser is not shadowing a built-in or third-party package/module/library.
I am not 100% sure about the specifics of how this name conflict would be resolved, but it seems like you can potentially a). have your module overridden by the existing module (which seems like it might be happening in your Windows case), or b). override the existing module, which could cause bugs down the road. It seems like b is what commonly trips people up.
If you think this might be happening with one of your modules (which seems fairly likely with a name like parser), try renaming your module.
See this very nice article for more details and more common Python "import traps".

Put run.py outside root folder, so you'll have run.py next to root folder, then create __init__.py inside root folder, and change imports to:
import root.parser.parser as p
Or just rename your parser module.
Anyway you should be careful with naming, because you can simply mess your own stuff someday.

Related

Use __init__.py to modify sys path is a good idea?

I want to ask you something that came to my mind doing some stuff.
I have the following structure:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
I class2.py I want to import class1 to use it. Obviously, I cannot use
from src.class1 import Class1
cause it will produce an error. A workaround that works to me is to define the following in the __init__.py inside folder2:
import sys
sys.path.append('src')
My question is if this option is valid and a good idea to use or maybe there are better solutions.
Another question. Imagine that the project structure is:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
+ errorsFolder
- __init__.py
- errors.py
In class1:
from errorsFolder.errors import Errors
this works fine. But if I try to do in class2 which is at the same level than errorsFolder:
from src.errorsFolder.errors import Errors
It fails (ImportError: No module named src.errorsFolder.errors)
Thank you in advance!
Despite the fact that it's slightly shocking to have to import a "parent" module in your package, your workaround depends on the current directory you're running your application, which is bad.
import sys
sys.path.append('src')
should be
import sys,os
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)),os.pardir))
to add the parent directory of the directory of your current module, regardless of the current directory you're running your application from (your module may be imported by several applications, which don't all require to be run in the same directory)
One correct way to solve this is to set the environment variable PYTHONPATH to the path which contains src. Then import src.class1 will always work, regardless of which directory you start in.
from ..class1 import Class1 should work (at least it does here in a similar layout, using python 2.7.x).
As a general rule: messing with sys.path is usually a very bad idea, specially if this allows a same module to be imported from two different paths (which would be the case with your files layout).
Also, you may want to think twice about your current layout. Python is not Java and doesn't require (nor even encourage) a "one class per module" approach. If both classes need to work together, they might be better in a same module, or at least in modules at the same level in the package tree (note that you can use the top-level package's __init__ as a facade to provide direct access to objects defined in submodules / subpackages). NB : I'm not saying that your current layout is necessarily wrong, just that it might not be the simplest one.
There is also a solution that does not involve the use of the environment variable PYTHONPATH.
In src/ create setup.py with these contents:
from setuptools import setup
setup()
Now you can do pip install -e /path/to/src and import to your hearts content.
-e, --editable <path/url> Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
No, Its not good. Python takes modules in two ways:
Python looks for its modules and packages in $PYTHONPATH.
Refer: https://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH
To find out what is included in $PYTHONPATH, run the following code in python (3):
import sys
print(sys.path)
all folder containing init.py are marked as python package.(if they are subdirectories under PYTHONPATH)
By help of these two ways you can full-fill the need to create a python project.

Correct Package Importing in a Pythonic Library Design?

I have a python project structured like this:
repo_dir/
----project_package/
--------__init__.py
--------process.py
--------config.py
----tests/
--------test_process.py
__init__.py is empty
config.py looks like this:
name = 'brian'
USAGE
I use the library by running python process.py from the project/project/ directory, or by specifying the python file path absolutely. I'm running Python 2.7 on Amazon EC2 Linux.
When process.py looks like below, everything works fine and process.py prints brian.
import config
print config.name
When process.py looks like below, I get the error ImportError: No module named project.config.
import project.config
print config.name
When process.py looks like below, I get the error ImportError: No module named project. This makes sense as the same behavior from the previous example should be expected.
from project import config
print config.name
If I add these lines to process.py to include the library root in sys.path, all configurations above, work fine.
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
MY CONFUSION
Many resources suggest setting up python libraries to import modules using project.module_name, but it doesn't seem like sys.path appending is standard, and seems weird that I need it. I can see that the sys.path append added my library root as a path in sys, but I thought that's what the __init__.py in my library root was supposed to do. What gives? What am I missing? I know Python importing creates lots of headaches so I've tried to simplify this as much as possible to wrap my head around it. I'm going crazy and it's Friday before a holiday. I'm bummed. Please help!!
QUESTIONS
How should I set up my libraries? How should I import packages? Where should I have __init__.py files? Do I need to append my library root to sys.path in every project? Why is this so confusing?
Your project setup is alright. I renamed the directories just for clarity
in this example, but the structure is the same as yours:
repo_dir/
project_package/
__init__.py
process.py
config.py
# Declare your project in a setup.py file, so that
# it will be installable, both by users and by you.
setup.py
When you have a module that wants to import from another module in
the same project, the best approach is to use relative imports. For example:
# In process.py
from .config import name
...
While working on the code on your dev box, do your work in a Python virtualenv,
and pip install your project in "editable" mode.
# From the root of your repo:
pip install -e .
With that approach, you'll never need to muck around with sys.path -- which
is almost always the wrong approach.
I think the problem is how you're running your script. If you want the script to be living in a package (the inner project folder), you should run it with python -m project.process, rather than by filename. Then you can make absolute or explicit relative imports to get config from process.
An absolute import would be from project import config or import project.config.
An explicit relative import would be from . import config.
Python 2 also allows implicit relative imports, but they're a really bad misfeature that you should never use. With implicit relative imports, internal package modules can shadow top-level modules. For instance, a project/json.py file would hide the standard library's json module from all the other modules in the package. You can tell Python you want to forbid implicit relative imports by putting from __future__ import absolute_import at the top of the file. It's the standard behavior in Python 3.

Importing Python class from sibling directory

My Directory structure in /VM/repo/project is:
__init__.py
scripts/
getSomething.py
__init__.py
classes/
project.py
db.py
__init__.py
getSomething.py
from ..classes import project
from ..classes import db
project.py
class PROJECT:
def __init__(self):
stuff
db.py
class DB:
def __init__(self):
stuff
When I try to run
python getSomething.py
I get the error
Traceback (most recent call last):
File "scripts/getSomething.py", line 4, in < module >
from ..classes import project
ValueError: Attempted relative import in non-package
What am I missing here?
As stated in the error, you're running getSomething as a main module. But you can't do package relative imports when you aren't in a package. The main module is never in a package. So, if you were to import getSomething as part of a package...:
# /VM/repo/main.py
from project.scripts import getSomething
Then you would not have the import errors.
Perhaps it is helpful to have a quick discussion on python modules and packages. In general, a file that contains python source code and has the .py extension is a module. Typically that module's name is the name of the file (sans extension), but if you run it directly, that module's name is '__main__'. So far, this is all well known and documented. To import a module you just do import module or import package.module and so on. This last import statement refers to something else (a "package") which we'll talk about now...
Packages are directories that you can import. Many directories can't be imported (e.g. maybe they have no python source files -- modules -- in them). So to resolve this ambiguity, there is also the requirement that the directory has an __init__.py file. When you import a directory on your filesystem, python actually imports the __init__.py module and creates the associated package from the things in __init__.py (if there are any).
Putting all of this together shows why executing a file inside a directory that has an __init__.py is not enough for python to consider the module to be part of a package. First, the name of the module is __main__, not package.filename_sans_extension. Second, the creation of a package depends not just on the filesystem structure, but on whether the directory (and therefore __init__.py was actually imported).
You might be asking yourself "Why did they design it this way?" Indeed, I've asked myself that same question on occasion. I think that the reason is because the language designers want certain guarantees to be in place for a package. A package should be a unit of things which are designed to work together for a specific purpose. It shouldn't be a silo of scripts that by virtue of a few haphazard __init__.py files gain the ability to walk around the filesystem to find the other modules that they need. If something is designed to be run as a main module, then it probably shouldn't be part of the package. Rather it should be a separate module that imports the package and relies on it.

How to do imports in python?

I did some websearch, but all I found was frustration.
I have a project in a directory (lets call it) "projectdir", in which I have "main.py".
In projectdir I have a subdirectory called "otherstuff", In which I have "foo.py".
How do I import foo.py, so I can use its contents in main.py, without doing much of the work that python designers/implementors should have, and without relying on boilerplate files?
Or is that impossible in python?
You need to put a __init__.py file in otherstuff subdirectory, to mark it as a package. After, you can import your module using:
import subdirectory.foo
or
from subdirectory import foo
The __init__.py file can be empty. There is no other "clean" way to achieve that in python.
you need to include __init__.py in your otherstuff directory. This is to tell python to search there for imports.
The python documentation explains how the module/package import works. And is def worth the time reading it, despite its kind of long length

Intrapackage module loads in python

I am just starting with python and have troubles understanding the searching path for intra-package module loads. I have a structure like this:
top/ Top-level package
__init__.py Initialize the top package
src/ Subpackage for source files
__init__.py
pkg1/ Source subpackage 1
__init__.py
mod1_1.py
mod1_2.py
...
pkg2/ Source subpackage 2
__init__.py
mod2_1.py
mod2_2.py
...
...
test/ Subpackage for unit testing
__init__.py
pkg1Test/ Tests for subpackage1
__init__.py
testSuite1_1.py
testSuite1_2.py
...
pkg2Test/ Tests for subpackage2
__init__.py
testSuite2_1.py
testSuite2_2.py
...
...
In testSuite1_1 I need to import module mod1_1.py (and so on). What import statement should I use?
Python's official tutorial (at docs.python.org, sec 6.4.2) says:
"If the imported module is not found in the current package (the package of which the current module is a submodule), the import statement looks for a top-level module with the given name."
I took this to mean that I could use (from within testSuite1_1.py):
from src.pkg1 import mod1_1
or
import src.pkg1.mod1_1
neither works. I read several answers to similar questions here, but could not find a solution.
Edit: I changed the module names to follow Python's naming conventions. But I still cannot get this simple example to work.
The module name doesn't include the .py extension. Also, in your example, the top-level module is actually named top. And finally, hyphens aren't legal for names in python, I'd suggest replacing them with underscores. Then try:
from top.src.pkg1 import mod1_1
Problem solved with the help of http://legacy.python.org/doc/essays/packages.html (referred to in a similar question). The key point (perhpas obvious to more experienced python developers) is this:
"In order for a Python program to use a package, the package must be findable by the import statement. In other words, the package must be a subdirectory of a directory that is on sys.path. [...] the easiest way to ensure that a package was on sys.path was to either install it in the standard library or to have users extend sys.path by setting their $PYTHONPATH shell environment variable"
Adding the path to "top" to PYTHONPATH solved the problem.To make the solution portable (this is a personal project, but I need to share it across several machines), I guess having a minimal initialization code in top/setup.py should work.

Categories