File naming conventions for unittesting files - python

For python unit tests, is the following a common naming convention?
MyProject/
run_ingest.py
tests/
run_ingest.py
Or is this too redundant? If so, what would be a better naming convention or directory structure to put testing code?

Having a directory called tests is correct but, as per my experience, the test scripts themselves are usually prefixed with test_ so in your case test_run_ingest.py. Make sure to use underscore rather than - as a separator in the names to avoid import problems. As for the structure, you'd also probably want to include __init__.py files both to the top level and /tests folder to simplify imports.
If you are new to unittest this blog might be of interest.

Add a little bit to Alexander's answer. In test_run_ingest.py, write from run_ingest import *, and when testing, cd to MyProject and type python3 -m unittest tests.test_run_ingest

Related

Use __init__.py to modify sys path is a good idea?

I want to ask you something that came to my mind doing some stuff.
I have the following structure:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
I class2.py I want to import class1 to use it. Obviously, I cannot use
from src.class1 import Class1
cause it will produce an error. A workaround that works to me is to define the following in the __init__.py inside folder2:
import sys
sys.path.append('src')
My question is if this option is valid and a good idea to use or maybe there are better solutions.
Another question. Imagine that the project structure is:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
+ errorsFolder
- __init__.py
- errors.py
In class1:
from errorsFolder.errors import Errors
this works fine. But if I try to do in class2 which is at the same level than errorsFolder:
from src.errorsFolder.errors import Errors
It fails (ImportError: No module named src.errorsFolder.errors)
Thank you in advance!
Despite the fact that it's slightly shocking to have to import a "parent" module in your package, your workaround depends on the current directory you're running your application, which is bad.
import sys
sys.path.append('src')
should be
import sys,os
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)),os.pardir))
to add the parent directory of the directory of your current module, regardless of the current directory you're running your application from (your module may be imported by several applications, which don't all require to be run in the same directory)
One correct way to solve this is to set the environment variable PYTHONPATH to the path which contains src. Then import src.class1 will always work, regardless of which directory you start in.
from ..class1 import Class1 should work (at least it does here in a similar layout, using python 2.7.x).
As a general rule: messing with sys.path is usually a very bad idea, specially if this allows a same module to be imported from two different paths (which would be the case with your files layout).
Also, you may want to think twice about your current layout. Python is not Java and doesn't require (nor even encourage) a "one class per module" approach. If both classes need to work together, they might be better in a same module, or at least in modules at the same level in the package tree (note that you can use the top-level package's __init__ as a facade to provide direct access to objects defined in submodules / subpackages). NB : I'm not saying that your current layout is necessarily wrong, just that it might not be the simplest one.
There is also a solution that does not involve the use of the environment variable PYTHONPATH.
In src/ create setup.py with these contents:
from setuptools import setup
setup()
Now you can do pip install -e /path/to/src and import to your hearts content.
-e, --editable <path/url> Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
No, Its not good. Python takes modules in two ways:
Python looks for its modules and packages in $PYTHONPATH.
Refer: https://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH
To find out what is included in $PYTHONPATH, run the following code in python (3):
import sys
print(sys.path)
all folder containing init.py are marked as python package.(if they are subdirectories under PYTHONPATH)
By help of these two ways you can full-fill the need to create a python project.

Using code in init.py when running python from command line

I'm setting up some code for unittesting. My directory currently looks like this:
project/
src/
__init__.py
sources.py
test/
__init__.py
sources_test.py
In __init__.py for the test directory, I have these two lines:
import sys
sys.path.insert(0, '../')
In the test files, I have the line import src.sources.
When I use nose to run these tests from the project directory, everything works just fine. If I try to run the tests individually it gives me this error:
ImportError: No module named src.sources
I assume that this is because when I run the test from the command line it isn't using __init__.py. Is there a way I can make sure that it will use those lines even when I try to run the tests individually?
I could take the lines out of __init__.py and put them into my test files, but I'm trying to avoid doing that.
To run the tests individually I am running python sources_test.py
You're really trying to abuse packages here, and that isn't a good idea.
The simple solution is to not run the tests from within the tests directory. Just cd up a level, then do python tests/sources_test.py.
Of course that in itself isn't going to import test/__init__.py. For that, you really need to import the package. So python -m tests.sources_test is probably a better idea… except, of course, that if your package is made to be run as a script but not to be imported, that won't work.
Alternatively, you could (on POSIX platforms, at least) do PYTHONPATH=.. python sources_test.py from within tests. This is a bit hacky, but it should work.
Or, better, combine the above, and, from outside of tests, do PYTHONPATH=. python tests/sources_test.py.
A really hacky workaround is to explicitly import __init__. This should basically work for you simple use case, but everything ends up wrong—in particular, you end up with a module named __init__ instead of one named test, and of course your main module isn't named test.sources_test, and in fact there is no test package at all. Unless you accidentally re-import anything after modifying sys.path, in which case you may get duplicates of the modules.
If you write
import src.source
the python interpreter looks into the src directory for a __init__.py file. If it exists, you can use the directory as a package name. If your are not in your project directory, which is the case when you are in the src directory, then python looks into the directories in $PYTHONPATH environment variable (at least in linux, windows should also have some environment variable, maybe with another name), if it can find some directory src with a __init__.py file in it.
Did you set your $PYTHONPATH?

Intrapackage module loads in python

I am just starting with python and have troubles understanding the searching path for intra-package module loads. I have a structure like this:
top/ Top-level package
__init__.py Initialize the top package
src/ Subpackage for source files
__init__.py
pkg1/ Source subpackage 1
__init__.py
mod1_1.py
mod1_2.py
...
pkg2/ Source subpackage 2
__init__.py
mod2_1.py
mod2_2.py
...
...
test/ Subpackage for unit testing
__init__.py
pkg1Test/ Tests for subpackage1
__init__.py
testSuite1_1.py
testSuite1_2.py
...
pkg2Test/ Tests for subpackage2
__init__.py
testSuite2_1.py
testSuite2_2.py
...
...
In testSuite1_1 I need to import module mod1_1.py (and so on). What import statement should I use?
Python's official tutorial (at docs.python.org, sec 6.4.2) says:
"If the imported module is not found in the current package (the package of which the current module is a submodule), the import statement looks for a top-level module with the given name."
I took this to mean that I could use (from within testSuite1_1.py):
from src.pkg1 import mod1_1
or
import src.pkg1.mod1_1
neither works. I read several answers to similar questions here, but could not find a solution.
Edit: I changed the module names to follow Python's naming conventions. But I still cannot get this simple example to work.
The module name doesn't include the .py extension. Also, in your example, the top-level module is actually named top. And finally, hyphens aren't legal for names in python, I'd suggest replacing them with underscores. Then try:
from top.src.pkg1 import mod1_1
Problem solved with the help of http://legacy.python.org/doc/essays/packages.html (referred to in a similar question). The key point (perhpas obvious to more experienced python developers) is this:
"In order for a Python program to use a package, the package must be findable by the import statement. In other words, the package must be a subdirectory of a directory that is on sys.path. [...] the easiest way to ensure that a package was on sys.path was to either install it in the standard library or to have users extend sys.path by setting their $PYTHONPATH shell environment variable"
Adding the path to "top" to PYTHONPATH solved the problem.To make the solution portable (this is a personal project, but I need to share it across several machines), I guess having a minimal initialization code in top/setup.py should work.

python import path: packages with the same name in different folders

I am developing several Python projects for several customers at the same time. A simplified version of my project folder structure looks something like this:
/path/
to/
projects/
cust1/
proj1/
pack1/
__init__.py
mod1.py
proj2/
pack2/
__init__.py
mod2.py
cust2/
proj3/
pack3/
__init__.py
mod3.py
When I for example want to use functionality from proj1, I extend sys.path by /path/to/projects/cust1/proj1 (e.g. by setting PYTHONPATH or adding a .pth file to the site_packages folder or even modifying sys.path directly) and then import the module like this:
>>> from pack1.mod1 import something
As I work on more projects, it happens that different projects have identical package names:
/path/
to/
projects/
cust3/
proj4/
pack1/ <-- same package name as in cust1/proj1 above
__init__.py
mod4.py
If I now simply extend sys.path by /path/to/projects/cust3/proj4, I still can import from proj1, but not from proj4:
>>> from pack1.mod1 import something
>>> from pack1.mod4 import something_else
ImportError: No module named mod4
I think the reason why the second import fails is that Python only searches the first folder in sys.path where it finds a pack1 package and gives up if it does not find the mod4 module in there. I've asked about this in an earlier question, see import python modules with the same name, but the internal details are still unclear to me.
Anyway, the obvious solution is to add another layer of namespace qualification by turning project directories into super packages: Add __init__.py files to each proj* folder and remove these folders from the lines by which sys.path is extended, e.g.
$ export PYTHONPATH=/path/to/projects/cust1:/path/to/projects/cust3
$ touch /path/to/projects/cust1/proj1/__init__.py
$ touch /path/to/projects/cust3/proj4/__init__.py
$ python
>>> from proj1.pack1.mod1 import something
>>> from proj4.pack1.mod4 import something_else
Now I am running into a situation where different projects for different customers have the same name, e.g.
/path/
to/
projects/
cust3/
proj1/ <-- same project name as for cust1 above
__init__.py
pack4/
__init__.py
mod4.py
Trying to import from mod4 does not work anymore for the same reason as before:
>>> from proj1.pack4.mod4 import yet_something_else
ImportError: No module named pack4.mod4
Following the same approach that solved this problem before, I would add yet another package / namespace layer and turn customer folders into super super packages.
However, this clashes with other requirements I have to my project folder structure, e.g.
Development / Release structure to maintain several code lines
other kinds of source code like e.g. JavaScript, SQL, etc.
other files than source files like e.g. documents or data.
A less simplified, more real-world depiction of some project folders looks like this:
/path/
to/
projects/
cust1/
proj1/
Development/
code/
javascript/
...
python/
pack1/
__init__.py
mod1.py
doc/
...
Release/
...
proj2/
Development/
code/
python/
pack2/
__init__.py
mod2.py
I don't see how I can satisfy the requirements the python interpreter has to a folder structure and the ones that I have at the same time. Maybe I could create an extra folder structure with some symbolic links and use that in sys.path, but looking at the effort I'm already making, I have a feeling that there is something fundamentally wrong with my entire approach. On a sidenote, I also have a hard time believing that python really restricts me in my choice of source code folder names as it seems to do in the case depicted.
How can I set up my project folders and sys.path so I can import from all projects in a consistent manner if there are project and packages with identical names ?
This is the solution to my problem, albeit it might not be obvious at first.
In my projects, I have now introduced a convention of one namespace per customer. In every customer folder (cust1, cust2, etc.), there is an __init__.py file with this code:
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
All the other __init__.py files in my packages are empty (mostly because I haven't had the time yet to find out what else to do with them).
As explained here, extend_path makes sure Python is aware there is more than one sub-package within a package, physically located elsewhere and - from what I understand - the interpreter then does not stop searching after it fails to find a module under the first package path it encounters in sys.path, but searches all paths in __path__.
I can now access all code in a consistent manner criss-cross between all projects, e.g.
from cust1.proj1.pack1.mod1 import something
from cust3.proj4.pack1.mod4 import something_else
from cust3.proj1.pack4.mod4 import yet_something_else
On a downside, I had to create an even deeper project folder structure:
/path/
to/
projects/
cust1/
proj1/
Development/
code/
python/
cust1/
__init__.py <--- contains code as described above
proj1/
__init__.py <--- empty
pack1/
__init__.py <--- empty
mod1.py
but that seems very acceptable to me, especially considering how little effort I need to make to maintain this convention. sys.path is extended by /path/to/projects/cust1/proj1/Development/code/python for this project.
On a sidenote, I noticed that of all the __init__.py files for the same customer, the one in the path that appears first in sys.path is executed, no matter from which project I import something.
You should be using the excellent virtualenv and virtualenvwrapper tools.
What happens if you accidentally import code from one customer/project in another and don't notice? When you deliver it will almost certainly fail. I would adopt a convention of having PYTHONPATH set up for one project at a time, and not try to have everything you've ever written be importable at once.
You can use a wrapper script per-project to set PYTHONPATH and start python, or use scripts to switch environments when you switch projects.
Of course some projects well have dependencies on other projects (those libraries you mentioned), but if you intend for the customer to be able to import several projects at once then you have to arrange for the names to not clash. You can only have this problem when you have multiple projects on the PYTHONPATH that aren't supposed to be used together.

Nose: test all modules in a given package

I have a package with several sub-packages, one of them for tests (named tests). Since the sub-package name makes it clear that contained modules are test modules, I prefer not to pollute module names with the test-name pattern nose expects to include them for testing. That's the setup I'm thinking of:
- foo
- __init__.py
...
- bar
- __init__.py
...
- baz
- __init__.py
...
- tests
- __init__.py
- something.py
Now, by default nose does not run tests found in foo.tests.something. I know nose accepts the -i option to define regular expressions for additional stuff to search tests in. So nose -i something does the job here. However, I have a bunch of modules in the tests package and do not want to name them explicitely. nose -i tests\..* does not work, it looks like nose only matches against module base-names. As a last resort I could run nose --all-modules, but this also inspects foo.bar and foo.baz -- I'd like to avoid this.
So, how could I instruct nose to look for tests in all modules within a given package (tests in my case)? I could write a nose plugin for this task, but I'm looking for a standard solution.
If you name the files in tests with a test_ prefix (i.e. rename something.py to test_something.py, running nose should pick them up by default.
You say "I prefer not to pollute module names with the test-name pattern nose expects to include them for testing", but "something" isn't descriptive of the file, because the file tests that something. What's the problem with using the non-confusing, standard way of naming your tests?
You should be able to import the files into your __init__.py and have them picked up. I had this setup:
- app:
- __init__.py
- models.py
- views.py
- tests:
- __init__.py
- models.py # contain tests
- views.py # contain tests
In the tests/__init__.py file I had the following:
from app.tests.models import *
from app.tests.views import *
Which defeats one of the benefits of using a regex to find tests (a goal of nose) but worked.
You can also use the decorator #istest to flag an individual def as a test if you want to avoid the naming of methods to match the regex. I've not tried to do this for modules (py files) that don't also match the regex but I doubt it would work without the above import.
Note, I've moved away from the import in __init__.py and just prefix my test methods and filenames with test_ and postfix my classes with Test. I find this makes the code more self-describing as even in a Test class there may be setup methods and helper methods (e.g. generators).
with Nose:
nosetests --all-modules
With py.test it can be done by put this in setup.cfg (configure file):
[pytest]
python_files=*.py
check this doc: pytest docs.

Categories