I am writing a pytest plugin that should test software that's designed to work inside a set of specific environments.
The software I'm writing is run inside a bigger framework, which makes certain Python modules available only when running my Python software inside the framework.
In order to test my software, I'm required to "mock" or fake an entire module (actually, quite a few). I'll need to implement its functionality in some kind of similar-looking way, but my question is how should I make this fake Python module available to my software's code, using a py.test plugin?
For example, let's assume I have the following code in one of my source files:
import fwlib
def fw_sum(a, b):
    return fwlib.sum(a, b)
However, the fwlib module is only made available by the framework I run my software from, and I cannot test inside it.
How would I make sure, from within a pytest plugin, that a module named fwlib is already defined in sys.modules? Granted, I'll need to implement fwlib.sum myself. I'm looking for recommendations on how to do just that.
pytest provides a fixture for this use-case: monkeypatch.syspath_prepend.
You may prepend a path to the sys.path list of import locations. Write a fake fwlib.py and keep it with your tests, prepending its directory as necessary. Like the other test modules, it needn't be included with the distribution.
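A minimal sketch of that approach might look like this (the tests/fakes directory name and the autouse fixture are assumptions, not part of the original setup):

# conftest.py
import pytest

@pytest.fixture(autouse=True)
def fake_fwlib(monkeypatch):
    # tests/fakes/ is a hypothetical directory containing a fake fwlib.py
    monkeypatch.syspath_prepend("tests/fakes")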
After playing with this myself, I couldn't actually figure out how to get the fixture to mock module-level imports correctly from the library code. By the time the tests run, the library code has already been imported and it is too late to patch.
However, I can offer a different solution that works: you may inject the name from within conftest.py, which gets imported first. The subsequent import statement within the code under test will just re-use the object already present in sys.modules.
Package structure:
$ tree .
.
├── conftest.py
├── lib
│   └── my_lib.py
└── tests
    └── test_my_lib.py
2 directories, 3 files
Contents of files:
# conftest.py
import sys
def fwlib_sum(a, b):
    return a + b
module = type(sys)('fwlib')
module.sum = fwlib_sum
sys.modules['fwlib'] = module
library file:
# lib/my_lib.py
import fwlib
def fw_sum(a, b):
    return fwlib.sum(a, b)
test file:
# tests/test_my_lib.py
import my_lib
def test_sum():
    assert my_lib.fw_sum(1, 2) == 3
Just to provide a little more detail on wim's good answer, you can use it with submodules too, like so:
import sys
module = type(sys)("my_module_name")
module.submodule = type(sys)("my_submodule_name")
module.submodule.something = something  # whatever fake object you need
sys.modules["my_module_name"] = module
sys.modules["my_module_name.my_submodule_name"] = module.submodule
Related
I am writing a lot of tests at the moment that all use pytest, so the first line of every file is import pytest. Is there a way to import it somewhere else, for example in the __init__.py?
tests
- unit_tests
  - __init__.py
  - test_service.py
  - test_plan.py
  - test_api.py
An import statement has two purposes:
Define a module, if necessary
Add a name for that module in the current scope.
While putting import pytest in __init__.py would take care of the first one, it does nothing for the second. You have no way of using pytest in each module unless you import sys and use sys.modules['pytest'] everywhere you would have used pytest alone. But that's ugly, so you might think, "Hey, I'll just write
pytest = sys.modules['pytest']
to make a global name pytest refer to the module."
But that's exactly what import pytest already does.
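A small sketch of that point, using the layout from the question:

# unit_tests/__init__.py
import pytest                    # loads pytest once, binds the name here only

# unit_tests/test_service.py
import sys
assert 'pytest' in sys.modules   # already loaded (assuming __init__.py ran first) ...
import pytest                    # ... but this line is still needed to bind the
                                 # name in this module's scope; it is cheap,
                                 # since it just reuses the cached module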
I often end up in a situation where one package needs to use a sibling package. I want to clarify that I'm not asking about how Python allows you to import sibling packages, which has been asked many times. Instead, my question is about a best practice for writing maintainable code.
Let's say we have a tools package, and the function tools.parse_name() depends on tools.split_name(). Initially, both might live in the same file where everything is easy:
# tools/__init__.py
from .name import parse_name, split_name
# tools/name.py
def parse_name(name):
    splits = split_name(name)  # Can access from same file.
    return do_something_with_splits(splits)
def split_name(name):
    return do_something_with_name(name)
Now, at some point we decide that the functions have grown and split them into two files:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
import tools
def parse_name(name):
    splits = tools.split_name(name)  # Won't work because of import order!
    return do_something_with_splits(splits)
# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
The problem is that parse_name.py can't simply import the tools package that it is itself part of. At least, that won't let it use any names listed below its own import line in tools/__init__.py.
The technical solution is to import tools.split_name rather than tools:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
import tools.split_name as tools_split_name
def parse_name(name):
    splits = tools_split_name.split_name(name)  # Works but ugly!
    return do_something_with_splits(splits)
# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
This solution technically works but quickly becomes messy if more than one sibling module is used. Moreover, renaming the package tools to utilities would be a nightmare, since all the module aliases would have to change as well.
I would like to avoid importing functions directly and instead import packages, so that it is clear where a function came from when reading the code. How can I handle this situation in a readable and maintainable way?
I could literally ask you what syntax you need and provide it. I won't, but you can do it yourself too.
"The problem is that parse_name.py can't just import the tools package which is part of itself."
That looks like a wrong and strange thing to do, indeed.
"At least, this won't allow it to use tools listed below its own line in tools/__init__.py"
Agreed, but again, we don't need that, if things are structured properly.
To simplify the discussion and reduce the degrees of freedom, I assumed several things in the example below.
You can then adapt it to different but similar scenarios, because you can modify the code to fit your import syntax requirements.
I give some hints for changes at the end.
Scenario:
You want to build an import package named tools.
You have a lot of functions in there that you want to make available to client code in client.py. That file uses the package tools by importing it. For simplicity, I make all the functions (from everywhere) available under the tools namespace by using the from ... import * form. That is dangerous and should be changed in a real scenario to prevent name clashes with and between subpackage names.
You organize the functions together by grouping them in import packages inside your tools package (subpackages).
The subpackages have (by definition) their own folder and at least an __init__.py inside. I chose to put each subpackage's code in a single module next to its __init__.py. You can have more modules and/or inner packages.
.
├── client.py
└── tools
    ├── __init__.py
    ├── splitter
    │   ├── __init__.py
    │   └── splitter.py
    └── formatter
        ├── __init__.py
        └── formatter.py
I keep the __init__.pys empty, except for the outermost one, which is responsible for making all the wanted names available to client importing code in the tools namespace.
This can be changed of course.
# tools/__init__.py
# note that relative imports avoid using the outer package name
# which is good if later you change your mind for its name
from .splitter.splitter import *
from .formatter.formatter import *
# client.py (lives next to the tools package)
# this is user code
import tools
text = "foo bar"
splits = tools.split(text)    # the two funcs come from different subpackages
text = tools.titlefy(text)
print(splits)
print(text)
# tools/formatter/formatter.py
from ..splitter import splitter  # sibling subpackage splitter, module splitter

def titlefy(name):
    splits = splitter.split(name)
    return ' '.join([s.title() for s in splits])
# tools/splitter/splitter.py
def split(name):
    return name.split()
You can actually tailor the import syntax to your taste; that answers your comment about what the imports look like.
The from ... import form is needed for relative imports. Otherwise, use absolute imports by prefixing the path with tools.
__init__.pys can be used to adjust the names imported into the importer's code, or to initialize the module. They can also be empty, or start out as the only file in the subpackage with all the code in it and later be split into other modules, though I don't like this "everything in __init__.py" approach as much.
They are just code that runs on import.
You can also avoid repeated names in import paths by using different names, by putting everything in __init__.py and dropping the module with the repeated name, by using aliases in the __init__.py imports, or with name assignments there. You may also limit what gets exported when the importer uses the * form by assigning names to an __all__ list.
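For example, a minimal sketch of the __all__ idea in one of the submodules (the helper name is made up):

# tools/splitter/splitter.py (sketch)
__all__ = ['split']        # only 'split' is re-exported by "from ... import *"

def split(name):
    return name.split()

def _normalize(name):      # hypothetical helper, not exported via *
    return name.strip()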
A change you might want for safer readability is to force client.py to specify the subpackage when using names, that is:
name1 = tools.splitter.split('foo bar')
Change the __init__.py to import only the submodules, like this:
from .splitter import splitter
from .formatter import formatter
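With that change, client.py would read something like this (a sketch reusing the names above):

# client.py
import tools

splits = tools.splitter.split('foo bar')      # subpackage is now explicit
text = tools.formatter.titlefy('foo bar')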
I'm not proposing this to be actually used in practice, but just for fun, here is a solution using pkgutil and inspect:
import inspect
import os
import pkgutil
def import_siblings(filepath):
    """Import and combine names from all sibling packages of a file."""
    path = os.path.dirname(os.path.abspath(filepath))
    merged = type('MergedModule', (object,), {})
    for importer, module, _ in pkgutil.iter_modules([path]):
        if module + '.py' == os.path.basename(filepath):
            continue
        sibling = importer.find_module(module).load_module(module)
        for name, member in inspect.getmembers(sibling):
            if name.startswith('__'):
                continue
            if hasattr(merged, name):
                message = "Two sibling packages define the same name '{}'."
                raise KeyError(message.format(name))
            setattr(merged, name, member)
    return merged
The example from the question becomes:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
tools = import_siblings(__file__)
def parse_name(name):
    splits = tools.split_name(name)  # Same usage as if this was an external module.
    return do_something_with_splits(splits)
# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
I currently have a folder structure like this:
.
├── main.py
├── parent.py
└── classes
    ├── subclass1.py
    ├── subclass2.py
    └── subclass3.py
Each of the subclasses is a subclass of parent, and parent is an abstract class. The subclasses need to execute some functions like mix() and meld(), and each subclass must implement mix() and meld().
I would like to write main.py such that the functions in each of the subclasses are executed, without me having to import their files into my program. That is, I'd like something akin to the following to happen:
def main():
    # Note that I don't care about the order in which
    # the subclasses' functions are invoked.
    for each subclass in the classes folder:
        execute `mix()` and `meld()`
Is there any way I could get this to happen?
Essentially, what I want to do is drop a bunch of classes into the classes folder, with only mix() and meld() defined, and let this program run wild.
I've never tried this, but I think it does what you're asking for:
import os
import imp
import runpy
package = 'classes'
_, path, _ = imp.find_module(package)
for m in os.listdir(path):
    if m.endswith('.py'):
        runpy.run_module(
            package + '.' + os.path.splitext(m)[0],
            run_name="Mix_Meld"
        )
Then, inside of your subclasses, you can write:
if __name__ == 'Mix_Meld':
    ClassName.mix()
    ClassName.meld()
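A full subclass module following that pattern might look like this (the Parent base class and the file/class names are assumptions for illustration):

# classes/subclass1.py
from parent import Parent        # hypothetical abstract base defined in parent.py

class Subclass1(Parent):
    @classmethod
    def mix(cls):
        pass    # real mixing logic goes here

    @classmethod
    def meld(cls):
        pass    # real melding logic goes here

if __name__ == 'Mix_Meld':
    Subclass1.mix()
    Subclass1.meld()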
This may result in additional code, granted, but if you ever need to stop the execution in one of the files, doing so is just a matter of commenting out that part of the code.
Another advantage is extensibility and polymorphism; if you need to run the code a bit differently for each one of these modules in the future, you only have to change the behaviour in the specific modules. The callee (main.py) will remain oblivious to those changes and continue calling the modules the usual way.
Try importing from the subfolder in the main.py program file:
import glob
import os

# Assumes each classes/<name>.py defines a class named <Name> (capitalized).
for path in glob.glob(os.path.join('classes', '*.py')):
    name = os.path.splitext(os.path.basename(path))[0]       # e.g. 'subclass1'
    module = __import__('classes.' + name, fromlist=[name])
    cls = getattr(module, name.capitalize())                  # e.g. Subclass1
    cls.mix()
    cls.meld()
Here's what I want to do: I want to build a test suite that's organized into packages like tests.ui, tests.text, tests.fileio, etc. In each __init__.py in these packages, I want to make a test suite consisting of all the tests in all the modules in that package. Of course, getting all the tests can be done with unittest.TestLoader, but it seems that I have to add each module individually. So supposing that test.ui has editor_window_test.py and preview_window_test.py, I want the __init__.py to import these two files and get a list of the two module objects. The idea is that I want to automate making the test suites so that I can't forget to include something in the test suite.
What's the best way to do this? It seems like it would be an easy thing to do, but I'm not finding anything.
I'm using Python 2.5 btw.
Good answers here, but the best thing to do would be to use a 3rd party test discovery and runner like:
Nose (my favourite)
Trial (pretty nice, especially when testing async stuff)
py.test (less good, in my opinion)
They are all compatible with plain unittest.TestCase, and you won't have to modify your tests in any way, nor would you have to use the advanced features in any of them. Just use them for suite discovery.
Is there a specific reason you want to reinvent the nasty stuff in these libs?
Solution to exactly this problem from our django project:
"""Test loader for all module tests
"""
import unittest
import re, os, imp, sys

import myapp.tests  # the package whose test modules are loaded by default

def find_modules(package):
    files = [re.sub(r'\.py$', '', f) for f in os.listdir(os.path.dirname(package.__file__))
             if f.endswith(".py")]
    return [imp.load_module(file, *imp.find_module(file, package.__path__)) for file in files]

def suite(package=None):
    """Assemble test suite for Django default test loader"""
    if not package: package = myapp.tests  # Default argument required for Django test runner
    return unittest.TestSuite([unittest.TestLoader().loadTestsFromModule(m)
                               for m in find_modules(package)])

if __name__ == '__main__':
    unittest.TextTestRunner().run(suite(myapp.tests))
EDIT: The benefit compared to bialix's solution is that you can place this loader anywhere in the project tree; there's no need to modify __init__.py in every test directory.
You can use os.listdir to find all files in the test.* directory and then filter out .py files:
# Place this code in your __init__.py in the test.* directory
import os

modules = []
for name in os.listdir(os.path.dirname(os.path.abspath(__file__))):
    m, ext = os.path.splitext(name)
    if ext == '.py':
        modules.append(__import__(m))
__all__ = modules
The magic variable __file__ contains filepath of the current module. Try
print __file__
to check.
I'm taking a look at how the model system in django works and I noticed something that I don't understand.
I know that you create an empty __init__.py file to specify that the current directory is a package. And that you can set some variable in __init__.py so that import * works properly.
But django adds a bunch of from ... import ... statements and defines a bunch of classes in __init__.py. Why? Doesn't this just make things look messy? Is there a reason that requires this code in __init__.py?
All imports in __init__.py are made available when you import the package (directory) that contains it.
Example:
./dir/__init__.py:
import something
./test.py:
import dir
# can now use dir.something
EDIT: forgot to mention, the code in __init__.py runs the first time you import any module from that directory. So it's normally a good place to put any package-level initialisation code.
EDIT2: dgrant pointed out a possible confusion in my example. In __init__.py, import something can import any module, not necessarily one from the package. For example, we can replace it with import datetime; then in our top-level test.py both of these snippets will work:
import dir
print dir.datetime.datetime.now()
and
import dir.some_module_in_dir
print dir.datetime.datetime.now()
The bottom line is: all names assigned in __init__.py, be they imported modules, functions, or classes, are automatically available in the package namespace whenever you import the package or a module in the package.
It's just personal preference really, and has to do with the layout of your python modules.
Let's say you have a module called erikutils. There are two ways that it can be a module, either you have a file called erikutils.py on your sys.path or you have a directory called erikutils on your sys.path with an empty __init__.py file inside it. Then let's say you have a bunch of modules called fileutils, procutils, parseutils and you want those to be sub-modules under erikutils. So you make some .py files called fileutils.py, procutils.py, and parseutils.py:
erikutils
    __init__.py
    fileutils.py
    procutils.py
    parseutils.py
Maybe you have a few functions that just don't belong in the fileutils, procutils, or parseutils modules. And let's say you don't feel like creating a new module called miscutils. AND, you'd like to be able to call the function like so:
erikutils.foo()
erikutils.bar()
rather than doing
erikutils.miscutils.foo()
erikutils.miscutils.bar()
So because the erikutils module is a directory, not a file, we have to define its functions inside the __init__.py file.
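A minimal sketch of such an __init__.py (the function bodies are placeholders):

# erikutils/__init__.py
def foo():
    return "foo result"    # placeholder body

def bar():
    return "bar result"    # placeholder body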
In django, the best example I can think of is django.db.models.fields. ALL the django *Field classes are defined in the __init__.py file in the django/db/models/fields directory. I guess they did this because they didn't want to cram everything into a hypothetical django/db/models/fields.py module, so they split it out into a few submodules (related.py, files.py, for example) and stuck the *Field definitions in the fields module itself (hence, __init__.py).
Using the __init__.py file allows you to make the internal package structure invisible from the outside. If the internal structure changes (e.g. because you split one fat module into two) you only have to adjust the __init__.py file, but not the code that depends on the package. You can also make parts of your package invisible, e.g. if they are not ready for general usage.
Note that you can use the del command, so a typical __init__.py may look like this:
from somemodule import some_function1, some_function2, SomeObject
del somemodule
Now if you decide to split somemodule the new __init__.py might be:
from somemodule1 import some_function1, some_function2
from somemodule2 import SomeObject
del somemodule1
del somemodule2
From the outside the package still looks exactly as before.
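In other words, importing code like this keeps working unchanged before and after the split (the package name mypackage is assumed):

from mypackage import some_function1, some_function2, SomeObject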
"We recommend not putting much code in an __init__.py file, though. Programmers do not expect actual logic to happen in this file, and much like with from x import *, it can trip them up if they are looking for the declaration of a particular piece of code and can't find it until they check __init__.py. "
-- Python Object-Oriented Programming Fourth Edition Steven F. Lott Dusty Phillips