I have two directories in my project:
project/
src/
scripts/
"src" contains my polished code, and "scripts" contains one-off Python scripts.
I would like all the scripts to have "../src" added to their sys.path, so that they can access the modules under the "src" tree. One way to do this is to write a scripts/__init__.py file, with the contents:
scripts/__init__.py:
import sys
sys.path.append("../src")
This works, but has the unwanted side-effect of putting all of my scripts in a package called "scripts". Is there some other way to get all my scripts to automatically call the above initialization code?
I could just edit the PYTHONPATH environment variable in my .bashrc, but I want my scripts to work out-of-the-box, without requiring the user to fiddle with PYTHONPATH. Also, I don't like having to make account-wide changes just to accommodate this one project.
Even if you have other plans for distribution, it might be worth putting together a basic setup.py in your src folder. That way, you can run setup.py develop to have setuptools put a link to your code onto your default path (meaning any changes you make will be reflected in-place without having to "reinstall", and all modules will "just work," no matter where your scripts are). It'd be a one-time step, but that's still one more step than zero, so it depends on whether that's more trouble than updating .bashrc. If you use pip, the equivalent would be pip install -e /path/to/src.
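For reference, a minimal setup.py along those lines might look like the sketch below; "my_project" is a placeholder name, not anything from your post:
# file: src/setup.py -- minimal sketch
from setuptools import setup, find_packages

setup(
    name="my_project",
    version="0.1",
    packages=find_packages(),  # picks up any packages (directories with __init__.py) under src/
)
After python setup.py develop (or pip install -e /path/to/src), everything under src/ is importable from any directory, your scripts included.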
The more-robust solution--especially if you're going to be mirroring/versioning these scripts on several developers' machines--is to do your development work inside a controlled virtual environment. It turns out virtualenv even has built-in support for making your own bootstrap customizations. It seems like you'd just need an after_install() hook to either tweak sitecustomize, run pip install -e, or add a plain .pth file to site-packages. The custom bootstrap could live in your source control along with the other scripts, and would need to be run once for each developer's setup. You'd also have the normal benefits of using virtualenv (explicit dependency versioning, isolation from system-wide configuration, and standardization between disparate machines, to name a few).
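For illustration, the plain-.pth variant could be automated with a one-time helper like the sketch below; the file name project_src.pth is made up, and site.getsitepackages() assumes a reasonably recent CPython (older virtualenv releases did not always provide it):
# file: scripts/add_src_pth.py -- hypothetical one-time setup helper; run it
# once with the environment's python
import os
import site

src_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "src"))

# Each path line in a .pth file under site-packages is appended to sys.path
# at interpreter startup.
site_packages = site.getsitepackages()[0]
with open(os.path.join(site_packages, "project_src.pth"), "w") as fh:
    fh.write(src_dir + "\n")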
If you really don't want to have any setup steps whatsoever and are willing to only run these scripts from inside the 'project' directory, then you could plop in an __init__.py as such:
project/
src/
some_module.py
scripts/
__init__.py # special "magic"
some_script.py
And these are what your files could look like:
# file: project/src/some_module.py
print("importing %r" % __name__)
def some_function():
print("called some_function() inside %s" % __name__)
--------------------------------------------------------
# file: project/scripts/some_script.py
import some_module
if __name__ == '__main__':
some_module.some_function()
--------------------------------------------------------
# file: project/scripts/__init__.py
import sys
from os.path import dirname, abspath, join
print("doing magic!")
sys.path.insert(0, join(dirname(dirname(abspath(__file__))), 'src'))
Then you'd have to run your scripts like so:
[~/project] $ python -m scripts.some_script
doing magic!
importing 'some_module'
called some_function() inside some_module
Beware! The scripts can only be called like this from inside project/:
[~/otherdir] $ python -m scripts.some_script
ImportError: No module named scripts
To enable that, you're back to editing .bashrc, or using one of the options above. The last option should really be a last resort; as @Simon said, you're really fighting the language at that point.
If you want your scripts to be runnable (I assume from the command line), they have to be on the path somewhere.
Something sounds odd about what you're trying to do though. Can you show us an example of exactly what you're trying to accomplish?
You can add a file called 'pathHack.py' in the project dir and put something like this into it:
import os
import sys
pkgDir = os.path.dirname(__file__)
sys.path.insert(0, os.path.join(pkgDir, 'scripts'))
Then, in a python file in your project dir, start by:
import pathHack
And now you can import stuff from the scripts dir without the 'scripts.' prefix. If you have only one file in this directory, and you don't care about hiding this kind of thing, you may inline this snippet.
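For concreteness, a hypothetical file in the project dir (the file name main.py and the script name some_script are made up) would then just do:
# file: project/main.py -- hypothetical usage
import pathHack      # side effect: puts project/scripts on sys.path
import some_script   # now found without the 'scripts.' prefix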
Related
I have a project I am working on, let's call it Project, which lives in the directory Project somewhere wholly unknown to me (really it lives both on my local system and on a couple Docker build systems). In that project, I have some source files, source/module1.py and source/module2.py. I also have some example files, some test files, and an __init__.py. So my directory looks something like this:
Project
__init__.py
/source
module1.py
module2.py
/test
testRunner.py
/examples
awesomeExample.py
However, module1 needs some stuff from module2. My naive self thought this could be done by putting an import statement in module1:
import module2
# Do some other interesting stuff
And this works, but only when I am running / importing module 1 from the source directory. If I am, for example, running some unit tests in another directory test/testRunner.py, either from the test directory or in the main Project directory, the import will fail. Same with trying to use it when running an example in the examples directory.
So here is my problem: in general, I don't know where the calling script lives. It might be in the examples directory, it might be in the test directory, or it might be in the main Project directory (for example when trying to import stuff with an __init__.py). How do I ensure that module1 can always import module2 in each of these scenarios?
I am not looking for a solution like "add all those directories to your python path". Initially I just added Project to my python path on my local machine, and then did all my imports relative to that (import Project.source.module2), but this (predictably) caused my builds to fail on the Docker instances. I don't just want this to work on my local machine, but also on the Docker instances I'm using to build and test this software, and on any user's machine that subsequently installs it (i.e. by doing a pip install Project). What is the most robust way to make sure this dependency is satisfied? How can I make sure module1 can import module2 regardless of where module1 itself is imported from? Any Python 3.x solution is welcome.
I figured out a way to do it (credit here) - it's a little inelegant, but extremely robust. Works on my local machine independent of whether using an import statement or running a script directly, as well as my build servers which use github actions and Travis CI.
Basically, I added a file in the source directory, called context.py with the following contents:
import os
import sys
fileLocation = os.path.dirname(os.path.abspath(__file__))
sourceLocation = os.path.abspath(os.path.join(fileLocation, '..', 'source/'))
sys.path.insert(0, sourceLocation)
This finds out the current file being executed from python, and then uses that to add to the python path. And then in my module1.py file, at the top I have:
import context
import module2
Now, whenever module1 is imported, it successfully imports module2. More elegant answers or comments on why this works and in which cases it might fail are appreciated.
Consider the following Python project skeleton:
proj/
├── foo
│ └── __init__.py
├── README.md
└── scripts
└── run.py
In this case foo holds the main project files, for example
# foo/__init__.py
class Foo():
def run(self):
print('Running...')
And scripts holds auxiliary scripts that need to import files from foo, which are then invoked via:
[~/proj]$ python scripts/run.py
There are two ways of importing Foo which both fail:
If a relative import is attempted from ..foo import Foo then the error is ValueError: attempted relative import beyond top-level package
If an absolute import is attempted from foo import Foo then the error is ModuleNotFoundError: No module named 'foo'
My current workaround is to append the running path to sys.path:
import sys
sys.path.append('.')
from foo import Foo
Foo().run()
But this feels like a hack, and has to be added to every new script in scripts/.
Is there a better way to structure scripts in such projects?
There's two ways you could resolve this.
(1) Turn your project into an installable package
Add a proj/setup.py file with the following contents:
import setuptools
setuptools.setup(
name="my-project",
version="1.0.0",
author="You",
author_email="you@example.com",
description="This is my project",
packages=["foo"],
)
create a virtualenv:
python3 -m venv virtualenv # this creates a directory "virtualenv" in your project
source ./virtualenv/bin/activate # this switches you into the new environment
python setup.py develop # this places your "foo" package in the environment
inside the virtualenv, foo behaves as an installed package and is importable via import foo.
So you can use absolute imports in your scripts.
To make them run from anywhere, without needing to activate the virtualenv, you can then specify the path as a shebang.
In scripts/run.py (the first line is important):
#!/path/to/proj/virtualenv/bin/python
from foo import Foo

Foo().run()
(2) Make the scripts part of the foo package
Instead of a separate subdirectory scripts, make a subpackage. In proj/foo/commands/run.py:
from .. import Foo

def main():
    Foo().run()
if __name__ == "__main__":
main()
Then execute the script from the top-level proj/ directory with:
python -m foo.commands.run
If you combine this with (1) and install your package, you can then run python -m foo.commands.run from anywhere.
Solution
There are multiple ways to achieve this. Both require creating a python package by adding a setup.py (building on @matejcik's answer).
Option 1 (recommended): entry_points + console_scripts: register a function in your project as the entry point to script execution (e.g. run = foo.cli:run; a hedged sketch follows the links below).
Option 2: scripts: use this keyword argument in the setup() method to reference the path to your script (e.g. bin/script.py).
Note
I recommend using a CLI library/framework like Click so that your codebase is only concerned with maintaining application-specific business logic rather than robust CLI framework logic. Also, Click recommends the entry_points + console_scripts method of script integration due to cross-platform compatibility.
Setup Tools - Automatic script creation: https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation
Setup Tools - keyword arguments: https://setuptools.readthedocs.io/en/latest/setuptools.html#new-and-changed-setup-keywords
Click GitHub: https://github.com/pallets/click/
Click Setuptools integration: https://click.palletsprojects.com/en/master/setuptools/
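To make Option 1 concrete, here is a hedged sketch combining Click with a console_scripts entry point; the names my-project, foo/cli.py, run, and my-command are illustrative, not anything mandated by Click or setuptools:
# file: foo/cli.py -- illustrative Click command
import click

@click.command()
@click.option("--name", default="world", help="Who to greet.")
def run(name):
    """Placeholder for your application's business logic."""
    click.echo("Hello, %s!" % name)

# file: setup.py -- registers `my-command` on the user's PATH at install time
import setuptools

setuptools.setup(
    name="my-project",
    version="1.0.0",
    packages=["foo"],
    install_requires=["click"],
    entry_points={
        "console_scripts": [
            "my-command=foo.cli:run",  # shell name = package.module:function
        ],
    },
)
After pip install ., running my-command --name you works from any directory, with no sys.path fiddling.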
Best practice? Put a single entry-point in the root
I know this might sound absurd, if you have lots of scripts you want to be able to execute... But it's actually the cleanest option and it's the one that is most often used in big Python projects, like manage.py in Django, for example. It also doesn't need to be a huge undertaking. Even more importantly, it is always more secure to have a single entry point than several smaller ones.
proj/
├── run.py
├── foo
│ └── __init__.py
├── README.md
└── scripts
└── my_script.py
When run.py lives in the root directory, it can be very lightweight... Basically just a wrapper to call the function you need from my_script.py. It just ties everything together so now all of your imports just work.
Just keep in mind that your entrypoint is your root. The parent of a root doesn't exist. So put your entrypoint in the root, and then import packages relative to that root, e.g. import foo or from scripts import my_script.
But how do I call multiple scripts!?
If you need to be able to call multiple scripts, this is a good argument for... Well... arguments! Keep run.py as your single entrypoint/command, and leverage subcommands to pass functionality to the script you care about.
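A rough sketch of such an entry point, assuming (this is an assumption, not something shown above) that scripts/my_script.py exposes a main() function:
# file: proj/run.py -- hedged sketch of a single entry point with subcommands
import argparse

from scripts import my_script  # resolvable because run.py sits in the root

def main():
    parser = argparse.ArgumentParser(description="Project entry point")
    subparsers = parser.add_subparsers(dest="command", required=True)
    subparsers.add_parser("my-script", help="run scripts/my_script.py")
    args = parser.parse_args()

    if args.command == "my-script":
        my_script.main()

if __name__ == "__main__":
    main()
Invoked as python run.py my-script from the project root; each new script becomes one more add_parser() line and one more dispatch branch.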
Reinventing the wheel?
Generally, frameworks have already done the architecture for you to add your own subcommands, such as Django and, for a smaller footprint, Flask.
You can easily wrap up a small project without that help, though, as I've illustrated.
Security
No one ever wishes their code were less refactorable after a few years of working with it. No one ever wishes their codebase had less security. As we drive toward more secure systems in general, it would make sense to create some gatekeeper script that determines what is and isn't a safe operation and by whom. Moving the code to an LDAP-based system and need to lock things down by group? No problem. You can either change the single file or add LDAP security in your codebase, even creating your own internal API.
With distributed scripts, security options are much less flexible and much harder to maintain, and a single vulnerability could leave you wide open to exploit.
Bonus advantage
You're adding abstraction to your script base. If you ever want to change the structure of your codebase (maybe you want scripts to have subfolders with more organization), you/your users don't need to do any refactoring for any dependencies, or change paths to longer, more verbose names. Your package is self-contained, and the only thing a user will ever need to touch is your proj/run.py entry-point.
And, obviously, you don't need to play with Python paths as much!
You need to add __init__.py files to scripts and to proj folders for those to be considered Python packages and for you to be able to import from those.
One way this is also commonly done, is to place your foo and scripts folders into a proj/src folder, which then has a __init__.py file, and thus is a Python package.
If you like simplicity, and there are no additional restrictions on what you asked, add one __init__.py to the scripts folder and to any other sibling folders, making them packages. Then always use the absolute import form (as you said, you do not want proj as a parent package of those, so there is no __init__.py there), and call your scripts from inside the proj folder with:
python -m scripts.run
or whatever name you give to other scripts other than run.py
This is similar to option 2 of @matejcik's answer, but even simpler.
Another solution is to add a .pth file to your Python site-packages directory, with the following content:
# your.pth
# ↓ the directory of proj
C:\...\proj
With that in place, your script can import foo directly:
# scripts.py
from foo import Foo
Foo().run()
It will work well.
Note: if your IDE is PyCharm, then you can use Source Roots to help you too.
Python looks for packages/modules in the directories listed in sys.path. There are several ways of ensuring that your directory of interest, in this case proj, is one of those directories:
Move your scripts to the proj directory. Python adds the directory containing the input script to sys.path.
Put the directory proj into the contents of the PYTHONPATH environment variable.
Make the module part of an installable package and install it, either in a virtual environment or not.
At run time, dynamically add the directory proj to sys.path.
Option 1 is the most logical and requires no source changes. If you are afraid that might break something, you can perhaps make scripts a symbolic link pointing back to proj?
If you are unwilling to do that, then ...
You may consider it a hack, but I would recommend that you do modify your scripts to update sys.path at runtime. Instead of a relative path, though, append an absolute path so that the scripts can be executed regardless of what the current directory is. In your case, directory proj is the parent directory of directory scripts, where the scripts reside, so:
import sys
import os.path
parent_directory = os.path.split(os.path.dirname(__file__))[0]
if parent_directory not in sys.path:
    # The first entry is the directory of the running script, so you could
    # also insert after it at index 1: sys.path.insert(1, parent_directory)
    sys.path.append(parent_directory)
I want to inherit from a class in a file that lies in a directory above the current one.
Is it possible to relatively import that file?
from ..subpkg2 import mod
Per the Python docs: when inside a package hierarchy, use two dots, as the import statement documentation says:
When specifying what module to import you do not have to specify the absolute name of the module. When a module or package is contained within another package it is possible to make a relative import within the same top package without having to mention the package name. By using leading dots in the specified module or package after from you can specify how high to traverse up the current package hierarchy without specifying exact names. One leading dot means the current package where the module making the import exists. Two dots means up one package level. Three dots is up two levels, etc. So if you execute from . import mod from a module in the pkg package then you will end up importing pkg.mod. If you execute from ..subpkg2 import mod from within pkg.subpkg1 you will import pkg.subpkg2.mod. The specification for relative imports is contained within PEP 328.
PEP 328 deals with absolute/relative imports.
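To make the quoted rules concrete, here is a sketch using the hypothetical pkg layout from the docs excerpt:
pkg/
    __init__.py
    subpkg1/
        __init__.py
        mod1.py
    subpkg2/
        __init__.py
        mod2.py

# file: pkg/subpkg1/mod1.py
from ..subpkg2 import mod2  # two leading dots: up one level to pkg, then into subpkg2
Note that this only works when mod1 is loaded as part of the package, e.g. python -m pkg.subpkg1.mod1 from the directory containing pkg/, not when mod1.py is run directly as a script.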
import sys
sys.path.append("..") # Adds higher directory to python modules path.
@gimel's answer is correct if you can guarantee the package hierarchy he mentions. If you can't -- if your real need is as you expressed it, exclusively tied to directories and without any necessary relationship to packaging -- then you need to work on __file__ to find out the parent directory (a couple of os.path.dirname calls will do;-), then temporarily insert said dir at the very start of sys.path, __import__, remove said dir again -- messy work indeed, but, "when you must, you must" (and Python strives to never stop the programmer from doing what must be done -- just like the ISO C standard says in the "Spirit of C" section in its preface!-).
Here is an example that may work for you:
import sys
import os.path
sys.path.append(
os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir)))
import module_in_parent_dir
Import module from a directory which is exactly one level above the current directory:
from .. import module
How to load a module that is a directory up
preface: I did a substantial rewrite of a previous answer with the hope of helping ease people into python's ecosystem, and hopefully giving everyone the best chance of success with python's import system.
This will cover relative imports within a package, which I think is the most probable case for OP's question.
Python is a modular system
This is why we write import foo to load a module "foo" from the root namespace, instead of writing:
import os  # please avoid all of this
foo = dict()  # please avoid doing this
with open(os.path.join(os.path.dirname(__file__), '../foo.py')) as foo_fh:  # please avoid doing this
    exec(compile(foo_fh.read(), 'foo.py', 'exec'), foo)  # please avoid doing this
Python isn't coupled to a file-system
This is why we can embed Python in environments where there isn't a de facto filesystem, such as Jython, without providing a virtual one.
Being decoupled from a filesystem lets imports be flexible; this design allows for things like imports from archive/zip files, import singletons, bytecode caching, cffi extensions, even remote code definition loading.
So if imports are not coupled to a filesystem, what does "one directory up" mean? We have to pick out some heuristics, but we can do that; for example, when working within a package, some heuristics have already been defined that make relative imports like .foo and ..foo work within the same package. Cool!
If you sincerely want to couple your source code loading patterns to a filesystem, you can do that. You'll have to choose your own heuristics and use some kind of importing machinery; I recommend importlib.
Python's importlib example looks something like so:
import importlib.util
import os
import sys

# For illustrative purposes.
file_path = os.path.join(os.path.dirname(__file__), '../foo.py')
module_name = 'foo'
foo_spec = importlib.util.spec_from_file_location(module_name, file_path)
# foo_spec is a ModuleSpec specifying a SourceFileLoader
foo_module = importlib.util.module_from_spec(foo_spec)
sys.modules[module_name] = foo_module
foo_spec.loader.exec_module(foo_module)
foo = sys.modules[module_name]
# foo is the sys.modules['foo'] singleton
Packaging
There is a great example project available officially here: https://github.com/pypa/sampleproject
A python package is a collection of information about your source code that can inform other tools how to copy your source code to other computers, and how to integrate your source code into that system's path so that import foo works for other computers (regardless of interpreter, host operating system, etc.).
Directory Structure
Let's have a package named foo, in some directory (preferably an empty directory).
some_directory/
foo.py # `if __name__ == "__main__":` lives here
My preference is to create setup.py as a sibling to foo.py, because it makes writing the setup.py file simpler. However, you can write configuration to change/redirect everything setuptools does by default if you like; for example, putting foo.py under a "src/" directory is somewhat popular, but not covered here.
some_directory/
foo.py
setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
name="foo",
...
py_modules=['foo'],
)
.
python3 -m pip install --editable ./ # or path/to/some_directory/
"editable" aka -e will yet-again redirect the importing machinery to load the source files in this directory, instead copying the current exact files to the installing-environment's library. This can also cause behavioral differences on a developer's machine, be sure to test your code!
There are tools other than pip; however, I'd recommend pip as the introductory one :)
I also like to make foo a "package" (a directory containing __init__.py) instead of a module (a single ".py" file). Both "packages" and "modules" can be loaded into the root namespace, but packages allow for nested namespaces, which is helpful if we want a "relative one directory up" import.
some_directory/
foo/
__init__.py
setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
name="foo",
...
packages=['foo'],
)
I also like to make a foo/__main__.py; this allows python to execute the package as a module, e.g. python3 -m foo will run foo/__main__.py as __main__ (a minimal sketch of that file follows the setup.py below).
some_directory/
foo/
__init__.py
__main__.py # `if __name__ == "__main__":` lives here, `def main():` too!
setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
name="foo",
...
packages=['foo'],
...
entry_points={
'console_scripts': [
# "foo" will be added to the installing-environment's text mode shell, eg `bash -c foo`
'foo=foo.__main__:main',
]
},
)
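As referenced above, the foo/__main__.py can be as small as this sketch (the body of main() is a placeholder):
# file: foo/__main__.py -- minimal sketch matching the console_scripts entry
def main():
    print("hello from foo")

if __name__ == "__main__":
    main()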
Let's flesh this out with some more modules:
Basically, you can have a directory structure like so:
some_directory/
bar.py # `import bar`
foo/
__init__.py # `import foo`
__main__.py
        baz.py # `import foo.baz`
spam/
__init__.py # `import foo.spam`
eggs.py # `import foo.spam.eggs`
setup.py
setup.py conventionally holds metadata about the source code within, such as:
what dependencies are needed to install, named "install_requires"
what name should be used for package management (install/uninstall "name"); I suggest this match your primary python package name, in our case foo, though substituting hyphens for underscores is popular
licensing information
maturity tags (alpha/beta/etc),
audience tags (for developers, for machine learning, etc),
single-page documentation content (like a README),
shell names (names you type at user shell like bash, or names you find in a graphical user shell like a start menu),
a list of python modules this package will install (and uninstall)
a de facto "run tests" entry point, python ./setup.py test
It's very extensive; it can even compile C extensions on the fly if a source module is being installed on a development machine. For an everyday example I recommend the PyPA Sample Repository's setup.py.
If you are releasing a build artifact, e.g. a copy of the code that is meant to run on nearly identical computers, a requirements.txt file is a popular way to snapshot exact dependency information, whereas "install_requires" is a good way to capture minimum and maximum compatible versions. However, given that the target machines are nearly identical anyway, I highly recommend creating a tarball of an entire python prefix. This can be tricky, and too detailed to get into here. Check out pip install's --target option, or virtualenv aka venv, for leads.
back to the example
how to import a file one directory up:
From foo/spam/eggs.py, if we wanted code from foo/baz we could ask for it by its absolute namespace:
import foo.baz
If we wanted to reserve capability to move eggs.py into some other directory in the future with some other relative baz implementation, we could use a relative import like:
from .. import baz
Here's a three-step, somewhat minimalist version of ThorSummoner's answer for the sake of clarity. It doesn't quite do what I want (I'll explain at the bottom), but it works okay.
Step 1: Make directory and setup.py
filepath_to/project_name/
setup.py
In setup.py, write:
import setuptools
setuptools.setup(name='project_name')
Step 2: Install this directory as a package
Run this code in console:
python -m pip install --editable filepath_to/project_name
Instead of python, you may need to use python3 or something, depending on how your python is installed. Also, you can use -e instead of --editable.
Now, your directory will look more or less like this. I don't know what the egg stuff is.
filepath_to/project_name/
setup.py
test_3.egg-info/
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt
This folder is considered a python package and you can import from files in this parent directory even if you're writing a script anywhere else on your computer.
Step 3. Import from above
Let's say you make two files, one in your project's main directory and another in a sub directory. It'll look like this:
filepath_to/project_name/
top_level_file.py
subdirectory/
subfile.py
setup.py |
test_3.egg-info/ |----- Ignore these guys
... |
Now, if top_level_file.py looks like this:
x = 1
Then I can import it from subfile.py, or really any other file anywhere else on your computer.
# subfile.py OR some_other_python_file_somewhere_else.py
import random # This is a standard package that can be imported anywhere.
import top_level_file # Now, top_level_file.py works similarly.
print(top_level_file.x)
This is different from what I was looking for: I hoped python had a one-line way to import from a file above. Instead, I have to treat the script like a module, do a bunch of boilerplate, and install it globally for the entire python installation to have access to it. It's overkill. If anyone has a simpler method that doesn't involve the above process or importlib shenanigans, please let me know.
A polished version of @alex-martelli's answer, using pathlib:
import pathlib
import sys
_parentdir = pathlib.Path(__file__).parent.parent.resolve()
sys.path.insert(0, str(_parentdir))
import module_in_parent_dir
sys.path.remove(str(_parentdir))
To run python /myprogram/submodule/mymodule.py which imports /myprogram/mainmodule.py, e.g., via
from mainmodule import *
on Linux (e.g., in the python Docker image), I had to add the program root directory to PYTHONPATH:
export PYTHONPATH=/myprogram
It is 2022 and none of the answers really worked for me. Here is what worked in the end:
import sys
sys.path.append('../src')  # my_class.py lives under src/ (see the tree below)
import my_class
My directory structure:
src
--my_class.py
notebooks
-- mynotebook.ipynb
I imported my_class from mynotebook.ipynb.
You can use the sys.path.append() method to add the directory containing the package to the list of paths searched for modules. For example, if the package is located two directories above the current directory, you can use the following code:
import sys
sys.path.append("../../")
If the package is located one directory above the current directory, you can use the code below:
import sys
sys.path.append("..")
Python is a modular system
Python doesn't rely on a file system
To load python code reliably, have that code in a module, and that module installed in python's library.
Installed modules can always be loaded from the top level namespace with import <name>
There is a great sample project available officially here: https://github.com/pypa/sampleproject
Basically, you can have a directory structure like so:
the_foo_project/
setup.py
bar.py # `import bar`
foo/
__init__.py # `import foo`
baz.py # `import foo.baz`
faz/ # `import foo.faz`
__init__.py
daz.py # `import foo.faz.daz` ... etc.
.
Be sure to declare your setuptools.setup() in setup.py,
official example: https://github.com/pypa/sampleproject/blob/master/setup.py
In our case we probably want to export bar.py and foo/__init__.py; my brief example:
setup.py
#!/usr/bin/env python3
import setuptools
setuptools.setup(
...
py_modules=['bar'],
packages=['foo'],
...
entry_points={},
# Note, any changes to your setup.py, like adding to `packages`, or
# changing `entry_points` will require the module to be reinstalled;
    # `python3 -m pip install --upgrade --editable ./the_foo_project`
)
.
Now we can install our module into the python library;
with pip, you can install the_foo_project into your python library in edit mode,
so we can work on it in real time
python3 -m pip install --editable=./the_foo_project
# if you get a permission error, you can always use
# `pip ... --user` to install in your user python library
.
Now from any python context, we can load our shared py_modules and packages
foo_script.py
#!/usr/bin/env python3
import bar
import foo
print(dir(bar))
print(dir(foo))
I have the following folder structure:
myapp\
myapp\
__init__.py
tests\
test_myapp.py
and my pwd is
C:\Users\wwerner\programming\myapp\
I have the following test setup:
import sys
import pprint
def test_cool():
pprint.pprint(sys.path)
assert False
That produces the following paths:
['C:\\Users\\wwerner\\programming\\myapp\\tests',
'C:\\Users\\wwerner\\programming\\envs\\myapp\\Scripts',
'C:\\Windows\\system32\\python34.zip',
'C:\\Python34\\DLLs',
'C:\\Python34\\lib',
'C:\\Python34',
'C:\\Users\\wwerner\\programming\\envs\\myapp',
'C:\\Users\\wwerner\\programming\\envs\\myapp\\lib\\site-packages']
And when I try to import myapp I get the following error:
ImportError: No module named 'myapp'
So it looks like it's not adding the current directory to my path.
By changing my import line to look like this:
import sys
sys.path.insert(0, '.')
import myapp
I am then able to import myapp with no problems.
Why does my current directory not show up in the path when running pytest? Is my only workaround to insert . into the sys.path? (I'm using Python 3.4 if it matters)
Ahah!
After comparing the layout of my cookiecutter repo, it turns out to be way more simple (and better) than that.
tests/
__init__.py
test_myapp.py
A simple addition of the __init__.py file to my test dir allows me to run py.test from my main directory.
Using an installable package
If you have an installable package (setup.py or pyproject.toml file with a build-system defined) then you want to test against the installed package.
pip install --editable .
pytest
The simplest possible way to make the project shown in the question into an installable package would be by adding this setup.py:
from setuptools import setup
setup(
name="myapp",
version="0.1",
packages=["myapp"],
)
This will put the myapp code at /path/to/myapp/.venv/lib/python3.XY/site-packages, which is in the sys.path of the virtual environment. Now myapp can be imported from the site-packages dir, just as it would be for a user installation. It is neither necessary nor desirable for the current working directory to be present on sys.path during test execution.
Not using an installable package
The project shown in the question does not have any installer, so it can't be installed. It can still be tested by making sure the project root (i.e. the directory which contains both myapp and tests as subdirectories) is present on sys.path.
The best way to do this is to use python -m pytest, rather than invoking the bare pytest command. When you use python -m pytest it adds the current working directory to the start of sys.path. That's the normal Python behavior when executing a package as __main__ (documented here) and it's also a documented usage for pytest - see Invoking pytest versus python -m pytest.
Why does adding an __init__.py to the tests subdirectory (not) work?
The directory structure shown in the question is the "Tests outside application code" pattern, documented here. This is also the directory structure I recommend, since it creates a clear distinction between library/application code and test code.
It's not recommended to add __init__.py files inside the test directories when using a "Tests outside application code" structure, since the test files aren't intended to be "packaged" (e.g. test files do not really need to import from other test files, and they do not need to be installed at all for end users of your package).
The reason adding a tests/__init__.py actually allows myapp to be imported by pytest, as described in Wayne's answer, is an accident of the way test discovery prepends to sys.path during the test collection phase. This is described as "problematic" in the docs:
... this introduces a subtle problem: in order to load the test modules from the tests directory, pytest prepends the root of the repository to sys.path, which adds the side-effect that now mypkg is also importable
They go on to strongly recommend using the src-layout if you intend to have __init__.py files inside test directories, to avoid this confusion of the import system.
But perhaps the best reason not to rely on this side-effect is that pytest collection actually can work in multiple modes (see import modes), and Wayne's answer relies upon pytest using the default "prepend" mode. It is currently mentioned that a future version will switch to "importlib" mode as default:
We intend to make importlib the default in future releases.
The accepted answer does not work with pytest --import-mode=importlib and so will stop working altogether at some stage.
sys.path automatically has the script's directory in it, and not the current working directory.
I am guessing that your script is placed in the tests directory. Based on this assumption, your code should look like this:
import sys
import os
ROOT_DIR = os.path.dirname(os.path.dirname(__file__))
sys.path.append(ROOT_DIR)
import myapp # Should work now
Use the environment variable PYTHONPATH.
In Windows:
set PYTHONPATH=.
py.test
In Unix:
PYTHONPATH=. py.test
I'm setting up some code for unittesting. My directory currently looks like this:
project/
src/
__init__.py
sources.py
test/
__init__.py
sources_test.py
In __init__.py for the test directory, I have these two lines:
import sys
sys.path.insert(0, '../')
In the test files, I have the line import src.sources.
When I use nose to run these tests from the project directory, everything works just fine. If I try to run the tests individually it gives me this error:
ImportError: No module named src.sources
I assume that this is because when I run the test from the command line it isn't using __init__.py. Is there a way I can make sure that it will use those lines even when I try to run the tests individually?
I could take the lines out of __init__.py and put them into my test files, but I'm trying to avoid doing that.
To run the tests individually I am running python sources_test.py
You're really trying to abuse packages here, and that isn't a good idea.
The simple solution is to not run the tests from within the test directory. Just cd up a level, then do python test/sources_test.py.
Of course that in itself isn't going to import test/__init__.py. For that, you really need to import the package. So python -m test.sources_test is probably a better idea… except, of course, that if your package is made to be run as a script but not to be imported, that won't work.
Alternatively, you could (on POSIX platforms, at least) do PYTHONPATH=.. python sources_test.py from within test. This is a bit hacky, but it should work.
Or, better, combine the above, and, from outside of test, do PYTHONPATH=. python test/sources_test.py.
A really hacky workaround is to explicitly import __init__. This should basically work for your simple use case, but everything ends up wrong—in particular, you end up with a module named __init__ instead of one named test, and of course your main module isn't named test.sources_test, and in fact there is no test package at all. And if you accidentally re-import anything after modifying sys.path, you may get duplicates of the modules.
If you write
import src.sources
the python interpreter looks into the src directory for an __init__.py file. If it exists, you can use the directory as a package name. If you are not in your project directory, which is the case when you run a test from inside the test directory, then python looks into the directories in the $PYTHONPATH environment variable (the same variable exists on Windows) to see if it can find some directory src with an __init__.py file in it.
Did you set your $PYTHONPATH?