Including extra files in a Python package's unittest test suite

Including extra files in a Python package's unittest test suite - python

I'm working on a small Python package, and I'm writing unit tests for it. I have my package structure laid out in fairly typical Python fashion:
mypackage/
mymodule.py
test/
test_mymodule.py
This particular module is designed to read the contents of a file, do a thing with said contents, and then return a value. In order to test this, I want to include some input files in my test suite.
I suppose I could organize them like this:
mypackage/
mymodule.py
test/
test_mymodule.py
test_input01.txt
test_input02.txt
And then locate the files to read using the __file__ global variable.
My question is: is this a reasonably Pythonic way of organizing things? I'm concerned that guessing paths in this manner won't be reliable in any environment (e.g., if the package is installed via pip or etc.).
Specifically, I'm looking for either 1) some official documentation that supports this (or some other) approach, or 2) a well-regarded existing project that uses this (or some other) approach.

It's pretty much ok to use __file__ for locating the current directory path and a place where the test data lives.
As an example, see pep8 github project, particularly pay attention to how tests are organized in testsuite package - *.py files there with names like E10.py, E11.py are used as target files in tests.
Also, pylint uses similar approach (based on __file__), but the test data is organized in subpackages.
Also, see:
How to make python unit tests always find test data files when run from different working directories?
How can I store testing data for python nosetests?
Hope that helps.

Related

How to work with absolute imports when developing a package

It's been a while that I am struggling with imports in packages. When I develop a package, I read everywhere that it is preferable to use absolute imports in submodules of that package. I understand that and I like it more as well. But then I don't like and I also read that you shouldn't use sys.path.append('/path/to/package') to use your package in development...
So my question is, how do you develop such a package from zero, using directly absolute imports? At the moment I develop the package using relative imports, since then I am able to test the code I am writing before packaging and installing, then I change the imports once I have a release and build the package.
What is the correct way of doing such thing? In Pycharm for example you would mark the folder as 'source roor' and be able to work as if the package folder was in the path. Still I read that this is not the proper way... what am I missing? How do you develop a package while testing its code?

Your mileage may vary but this is what I usually do:
Within a package (foo), absolute (import foo.bar) or relative (import .bar) doesn't matter to me as long as it works. Sometimes, I prefer relative especially when the project is large and one day I might decide to move a number of source files into a subdirectory.
How do I test? My $PYTHONPATH usually has . in it, and my directory hierarchy is like this:
/path/to/foo_project
/setup.py
/foo
/__init__.py
/bar.py
/test
/test1.py
/test2.py
then the script in foo_project/test/test1.py will be like what you normally use the package, using import foo.bar. And when I test my code, I will be in the directory foo_project and run python test/test1.py. Since I have . in my $PYTHONPATH, it will find the directory foo and use it as a package.

Python File Structure on GitHub

I've been looking around at some open source projects on Python, and I'm seeing a lot of files and patterns that I'm not familiar with.
First of all, a lot of projects just have a file called setup.py, which usually contains one function:
setup(blah, blah, blah)
Second, a lot contain a file that is simply called __init__.py and contains next to no information.
Third, some .py files contain a statement similar to this:
if __name__ == "__main__"
Finally, I'm wondering if there are any "best practices" for dividing Python files up in a git repository. With Java, the idea of file division comes pretty naturally because of the class structure. With Python, many scripts have no classes at all, and sometimes a program will have OOP aspects, but a class by class division does not make that much sense. Is it just "whatever makes the code the most readable," or are there some guidelines somewhere about this?

The setup.py is part of Python’s module distribution using the distrubution utilities. It allows for easy installation of the Python module and is useful when, well, you want to distribute your project as a whole Python module.
The __init__.py is used for Python’s package system. An empty file is usually enough to make Python recognize the directory it is in as a package, but you can also define different things in it.
Finally, the __name__ == '__main__' check is to ensure that the current script is run directly (e.g. from the command line) and it is not just imported into some other script. During a Python script execution only a single module’s __name__ property will be equal to __main__. See also my answer here or the more general question on that topic.

The setup.py is part of distutils setup process. You'll want to have one of those if you're distributing a module instead of just a basic script (which even then it's a good idea to have one so you can easily expand into a module later).
The __init__.py part of the python module import process:
Files named init.py are used to mark directories on disk as a
Python package directories. If you have the files
mydir/spam/init.py mydir/spam/module.py and mydir is on your path,
you can import the code in module.py as:
import spam.module or
from spam import module If you remove the init.py file, Python
will no longer look for submodules inside that directory, so attempts
to import the module will fail.
if __name == "__main__" is a way to indicate code that would be executed if the file was run directly instead of imported.
To answer on how to layout your code, the distfiles documentation has a good guide on this.

In addition to #poke's answer, see this related question on what the directory structure of a python project should be. Here is another useful tutorial on how to make your project easily runnable.

How to make python unit tests always find test data files when run from different working directories?

This is a question about testing environment setup.
In my project, I have a few unit tests that access test data files. These unit tests can be run from my project directory via a test runner. Or I can run each test file/module individually, for debugging purposes, for instance.
The problem is that depending on where I run the tests from, the current directory is different. So opening a test data file, as below, by giving a path relative to the current directory will not work when those files are run from the project directory, as the test data file is not in that directory.
f = open('test_data.ext', 'r')
I thought of using __file__ to use a path relative the current test module, but this doesn't work when the test module calling __file__ is the one being run individually.
How do people generally solve this problem?

A number of different ways come to mind:
Set an environment variable for your data directory
Write a small module that you always import that has the sole purpose of having a fixed position relative to your data directory, then call __file__ from there
Generate the data at runtime
Store your data in a database rather than a file
Store your data in a fixed location in the file system rather than a location relative to the package
Don't run your test code directly
The solution that makes the most sense for you depends upon your environment and your specific data and program.

I'm not sure it's your complete solution, but for Unit testing in Python, I always use Nose. It has an excellent Test Discovery procedure . You can find its mechanism here. Maybe you can get some idea to use for Python's traditional unit testing as well ..
py.test also uses a neat discovering mechanism ..

How to deal with relative imports in a Python package

I'm working on a Python project with approximately the following layout
project/
foo/
__init__.py
useful.py
test/
__init__.py
test_useful.py
test_useful.py tries to import project.foo.useful so it can test it, but it doesn't work when I say "python project/foo/test/test_useful.py", but it does work if I copy it into my current directory and run "python test_useful.py".
What is the correct way to handle these imports while developing? It seems like this won't be an issue once installed, because it will be in PYTHONPATH. Should I use distutils to make a build/ folder and add it to my PYTHONPATH?

First of all you need to set up your PYTHONPATH to either include "project" or the parent of "project". This is important while you're developing too :-)
Then you should be able to use an absolute import:
from project.foo import useful
Secondly, I would suggest that instead of running tests by executing the module, you install py.test (pip install pytest). Then you'll be able to use relative imports, as long as your py.test invocation is generic enough (i.e. "py.test foo" will work, but "py.test foo/test/test_useful.py" will not). I would still recommend that you not use relative imports in tests.

Please consider using distutils/setuptools to make your project installable in a Python standard way. (Hint: you'll need to create a setup.py file parallel to the 'foo' directory, also known as a package.)
Doing so will also allow you to then use a number of common Python testing frameworks (nose, py.test, etc.) to make it possible to collect and run tests, where most such frameworks automatically ensure 'foo' is an importable package before running the tests. Your test_useful.py tests can them import 'foo.useful' without a problem.
Also worth noting from your example directory structure is that it seems to be generally recommended that your tests directory NOT be a Python package. i.e. delete the test/init.py file. The framework will ensure the tests are runnable, and not having it as a package will help ensure it only gets distributed in source distributions and not binary ones (where it likely isn't wanted.)

Where do the Python unit tests go? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
If you're writing a library, or an app, where do the unit test files go?
It's nice to separate the test files from the main app code, but it's awkward to put them into a "tests" subdirectory inside of the app root directory, because it makes it harder to import the modules that you'll be testing.
Is there a best practice here?

For a file module.py, the unit test should normally be called test_module.py, following Pythonic naming conventions.
There are several commonly accepted places to put test_module.py:
In the same directory as module.py.
In ../tests/test_module.py (at the same level as the code directory).
In tests/test_module.py (one level under the code directory).
I prefer #1 for its simplicity of finding the tests and importing them. Whatever build system you're using can easily be configured to run files starting with test_. Actually, the default unittest pattern used for test discovery is test*.py.

Only 1 test file
If there has only 1 test files, putting it in a top-level directory is recommended:
module/
lib/
__init__.py
module.py
test.py
Run the test in CLI
python test.py
Many test files
If has many test files, put it in a tests folder:
module/
lib/
__init__.py
module.py
tests/
test_module.py
test_module_function.py
# test_module.py
import unittest
from lib import module
class TestModule(unittest.TestCase):
def test_module(self):
pass
if __name__ == '__main__':
unittest.main()
Run the test in CLI
# In top-level /module/ folder
python -m tests.test_module
python -m tests.test_module_function
Use unittest discovery
unittest discovery will find all test in package folder.
Create a __init__.py in tests/ folder
module/
lib/
__init__.py
module.py
tests/
__init__.py
test_module.py
test_module_function.py
Run the test in CLI
# In top-level /module/ folder
# -s, --start-directory (default current directory)
# -p, --pattern (default test*.py)
python -m unittest discover
Reference
pytest Good Practices for test layout
unittest
Unit test framework
nose
nose2
pytest

A common practice is to put the tests directory in the same parent directory as your module/package. So if your module was called foo.py your directory layout would look like:
parent_dir/
foo.py
tests/
Of course there is no one way of doing it. You could also make a tests subdirectory and import the module using absolute import.
Wherever you put your tests, I would recommend you use nose to run them. Nose searches through your directories for tests. This way, you can put tests wherever they make the most sense organizationally.

We had the very same question when writing Pythoscope (https://pypi.org/project/pythoscope/), which generates unit tests for Python programs. We polled people on the testing in python list before we chose a directory, there were many different opinions. In the end we chose to put a "tests" directory in the same directory as the source code. In that directory we generate a test file for each module in the parent directory.

I also tend to put my unit tests in the file itself, as Jeremy Cantrell above notes, although I tend to not put the test function in the main body, but rather put everything in an
if __name__ == '__main__':
do tests...
block. This ends up adding documentation to the file as 'example code' for how to use the python file you are testing.
I should add, I tend to write very tight modules/classes. If your modules require very large numbers of tests, you can put them in another, but even then, I'd still add:
if __name__ == '__main__':
import tests.thisModule
tests.thisModule.runtests
This lets anybody reading your source code know where to look for the test code.

Every once in a while I find myself checking out the topic of test placement, and every time the majority recommends a separate folder structure beside the library code, but I find that every time the arguments are the same and are not that convincing. I end up putting my test modules somewhere beside the core modules.
The main reason for doing this is: refactoring.
When I move things around I do want test modules to move with the code; it's easy to lose tests if they are in a separate tree. Let's be honest, sooner or later you end up with a totally different folder structure, like django, flask and many others. Which is fine if you don't care.
The main question you should ask yourself is this:
Am I writing:
a) reusable library or
b) building a project than bundles together some semi-separated modules?
If a:
A separate folder and the extra effort to maintain its structure may be better suited. No one will complain about your tests getting deployed to production.
But it's also just as easy to exclude tests from being distributed when they are mixed with the core folders; put this in the setup.py:
find_packages("src", exclude=["*.tests", "*.tests.*", "tests.*", "tests"])
If b:
You may wish — as every one of us do — that you are writing reusable libraries, but most of the time their life is tied to the life of the project. Ability to easily maintain your project should be a priority.
Then if you did a good job and your module is a good fit for another project, it will probably get copied — not forked or made into a separate library — into this new project, and moving tests that lay beside it in the same folder structure is easy in comparison to fishing up tests in a mess that a separate test folder had become. (You may argue that it shouldn't be a mess in the first place but let's be realistic here).
So the choice is still yours, but I would argue that with mixed up tests you achieve all the same things as with a separate folder, but with less effort on keeping things tidy.

I use a tests/ directory, and then import the main application modules using relative imports. So in MyApp/tests/foo.py, there might be:
from .. import foo
to import the MyApp.foo module.

I don't believe there is an established "best practice".
I put my tests in another directory outside of the app code. I then add the main app directory to sys.path (allowing you to import the modules from anywhere) in my test runner script (which does some other stuff as well) before running all the tests. This way I never have to remove the tests directory from the main code when I release it, saving me time and effort, if an ever so tiny amount.

From my experience in developing Testing frameworks in Python, I would suggest to put python unit tests in a separate directory. Maintain a symmetric directory structure. This would be helpful in packaging just the core libraries and not package the unit tests. Below is implemented through a schematic diagram.
<Main Package>
/ \
/ \
lib tests
/ \
[module1.py, module2.py, [ut_module1.py, ut_module2.py,
module3.py module4.py, ut_module3.py, ut_module.py]
__init__.py]
In this way when you package these libraries using an rpm, you can just package the main library modules (only). This helps maintainability particularly in agile environment.

I recommend you check some main Python projects on GitHub and get some ideas.
When your code gets larger and you add more libraries it's better to create a test folder in the same directory you have setup.py and mirror your project directory structure for each test type (unittest, integration, ...)
For example if you have a directory structure like:
myPackage/
myapp/
moduleA/
__init__.py
module_A.py
moduleB/
__init__.py
module_B.py
setup.py
After adding test folder you will have a directory structure like:
myPackage/
myapp/
moduleA/
__init__.py
module_A.py
moduleB/
__init__.py
module_B.py
test/
unit/
myapp/
moduleA/
module_A_test.py
moduleB/
module_B_test.py
integration/
myapp/
moduleA/
module_A_test.py
moduleB/
module_B_test.py
setup.py
Many properly written Python packages uses the same structure. A very good example is the Boto package.
Check https://github.com/boto/boto

How I do it...
Folder structure:
project/
src/
code.py
tests/
setup.py
Setup.py points to src/ as the location containing my projects modules, then i run:
setup.py develop
Which adds my project into site-packages, pointing to my working copy. To run my tests i use:
setup.py tests
Using whichever test runner I've configured.

I prefer toplevel tests directory. This does mean imports become a little more difficult. For that I have two solutions:
Use setuptools. Then you can pass test_suite='tests.runalltests.suite' into setup(), and can run the tests simply: python setup.py test
Set PYTHONPATH when running the tests: PYTHONPATH=. python tests/runalltests.py
Here's how that stuff is supported by code in M2Crypto:
http://svn.osafoundation.org/m2crypto/trunk/setup.py
http://svn.osafoundation.org/m2crypto/trunk/tests/alltests.py
If you prefer to run tests with nosetests you might need do something a little different.

I put my tests in the same directory as the code under test (CUT). In projects where I can tweak pytest with my plugin, for foo.py I use foo.pt for the tests which makes editing a particular module and its test together really easy: vi foo.*.
Where I can't do this, I use foo_ut.py or similar. You can still use vi foo* though that will also catch foobar.py and foobar_ut.py if those exist.
In either case I tweak the test discovery process to find these.
This puts the tests right beside the code in a directory listing, making it obvious that tests are there, and makes opening the tests as easy as it can possibly be when they're in a separate file. (For editors started from the command line, as described above; for GUI systems, click on the code file and the adjacent (or very nearly adjacent) test file.
As others have pointed out, this also makes it easier to refactor and to extract the code for use elsewhere should that ever be necessary.
I really dislike the idea of putting tests in a completely different directory tree; why make it harder than necessary for developers to open up the tests when they're opening the file with the CUT? It's not like the vast majority of developers are so keen on writing or tweaking tests that they'll ignore any barrier to doing that, instead of using the barrier as an excuse. (Quite the opposite, in my experience; even when you make it as easy as possible I know many developers who can't be bothered to write tests.)

We use
app/src/code.py
app/testing/code_test.py
app/docs/..
In each test file we insert ../src/ in sys.path. It's not the nicest solution but works. I think it would be great if someone came up w/ something like maven in java that gives you standard conventions that just work, no matter what project you work on.

If the tests are simple, simply put them in the docstring -- most of the test frameworks for Python will be able to use that:
>>> import module
>>> module.method('test')
'testresult'
For other more involved tests, I'd put them either in ../tests/test_module.py or in tests/test_module.py.

In C#, I've generally separated the tests into a separate assembly.
In Python -- so far -- I've tended to either write doctests, where the test is in the docstring of a function, or put them in the if __name__ == "__main__" block at the bottom of the module.

When writing a package called "foo", I will put unit tests into a separate package "foo_test". Modules and subpackages will then have the same name as the SUT package module. E.g. tests for a module foo.x.y are found in foo_test.x.y. The __init__.py files of each testing package then contain an AllTests suite that includes all test suites of the package. setuptools provides a convenient way to specify the main testing package, so that after "python setup.py develop" you can just use "python setup.py test" or "python setup.py test -s foo_test.x.SomeTestSuite" to the just a specific suite.

I've recently started to program in Python, so I've not really had chance to find out best practice yet.
But, I've written a module that goes and finds all the tests and runs them.
So, I have:
app/
appfile.py
test/
appfileTest.py
I'll have to see how it goes as I progress to larger projects.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.