Unit testing and import - python

My program reads a python script for configuration. So far I'm loading the script called lab.py like this:
self.lab_file = "/not/interesting/path/lab.py"
sys.path.insert(0, os.path.dirname(self.lab_file))
import lab as _config
But when I unit-test it, I see strange behavior:
when I launch only one unit test calling this code, it succeeds
when I launch several unit tests, each one calling this code independently, some tests fail
Tracing the problem with logging, it seems the lab script is imported only the first time. This behavior is consistent with how Python works, but I was assuming that unit tests are isolated from each other. Am I wrong? If tests are not independent with respect to imports, how can I write tests that force my script to be loaded each time?

Try using reload.
For example:
import lab as _config
reload(_config)
In Python 2, reload is a built-in function.
In Python 3.0 to 3.3, reload is in the imp module (imp is deprecated since 3.4 and was removed in 3.12).
In Python 3.4+, reload is in the importlib module.

I would suggest deleting the module from sys.modules before importing it:
import sys
if 'lab' in sys.modules:
    del sys.modules['lab']
import lab as _config
Deleting only the imported name will not work, because import first checks sys.modules to see whether the module has already been loaded.
Importing and then reloading works because the import fetches the cached module from sys.modules into the local namespace, and reload then re-executes the module from its file.
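A minimal sketch of how this could be wrapped for tests, assuming the config script lives in a directory that is on sys.path. The helper name fresh_import and the file name lab_demo_config.py are made up for the illustration, and the second version of the file deliberately has a different length so that a stale bytecode cache cannot be reused:

```python
import importlib
import os
import sys
import tempfile


def fresh_import(name):
    """Import `name` from scratch, ignoring any cached copy in sys.modules."""
    sys.modules.pop(name, None)      # forget the cached module, if any
    importlib.invalidate_caches()    # let the finders notice new or changed files
    return importlib.import_module(name)


# Throwaway config script standing in for lab.py (hypothetical name and contents).
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "lab_demo_config.py")
sys.path.insert(0, config_dir)

with open(config_path, "w") as f:
    f.write("VALUE = 1\n")
first = fresh_import("lab_demo_config")

with open(config_path, "w") as f:
    f.write("VALUE = 'changed'\n")
second = fresh_import("lab_demo_config")

print(first.VALUE, second.VALUE)  # 1 changed
```

Calling such a helper from each test's setUp would give every test its own module object, at the cost of re-running the script's top-level code each time.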

Maybe it helps if you run nose with this flag:
--with-isolation
From the nose documentation:
Enable plugin IsolationPlugin: Activate the isolation plugin to isolate changes to external modules to a single test module or package. The isolation plugin resets the contents of sys.modules after each test module or package runs to its state before the test. PLEASE NOTE that this plugin should not be used with the coverage plugin, or in any other case where module reloading may produce undesirable side-effects. [NOSE_WITH_ISOLATION]

Related

Custom Module with Custom Python Package - Module not found error

I wrote a custom python package for Ansible to handle business logic for some servers I manage. I have multiple files and they reference each other by re-importing the package.
So my package named <MyCustomPackage> has functions <Function1> <Function2> <Function3>, etc all in their own files... Some of these functions reference functions in the same package, so to do that the file has:
import MyCustomPackage
at the top. I did it this way instead of a relative import because I'm also unit testing these and mocking would not work with relative paths because of a __init__ file in the test directory which was needed for test discovery. The only way I could mock was through importing the package itself. Seemed simple enough.
The problem is with Ansible. These packages are in module_utils. I import them with:
from ansible.module_utils.MyCustomPackage import MyCustomPackage
but when I use the commands I get module not found errors - and traced it back to the import MyCustomPackage statement in the package itself.
So - how should I be structuring my package? Should I try again with relative file imports, or have the package modify the path so it's found with the friendly name?
Any tips would be helpful! Or if someone has a module they've written with Python modules in module_utils and unit tests that they'd be willing to share, that'd be great also!
Many people have problems with relative imports and imports in general in Python because they are ambiguous and surprisingly depend on your current working directory (and other things).
Thus I've created an experimental, new import library: ultraimport
It gives you more control over your imports and lets you do file system based, relative imports.
Given that you have a file function1.py, to import a function from function2.py, you would then write:
import ultraimport
Function2 = ultraimport('__dir__/function2.py', 'Function2')
This will always work, no matter how you run your code. It also does not force you to a specific package structure. You can just have any files you like.

How to refresh a Python import in a Jupyter Notebook cell?

I have two files:
MyModule.py
MyNotebook.ipynb
I am using Jupyter Notebook, latest, and Python, latest. I have two code cells in my Notebook.
Cell #1
import some stuff
Run some code
(Everything is kept in the environment; this cell takes about five minutes to run.)
Cell #2
import MyModule
Execute code from MyModule
I would like to make code changes in MyModule.py and rerun Cell #2 but without restarting the kernel (Cell #1 did a fair amount of work which I don't want to rerun each time).
If I simply run the second cell, those changes made to MyModule.py do not propagate through. I did some digging, and I tried using importlib.reload. The actual code for Cell #2 :
from Nash.IOEngineNash import *
import importlib
importlib.reload(Nash.IOEngineNash)
Unfortunately, this isn't quite working. How can I push those changes in MyModule.py (or Nash/IOEngineNash.py in actual fact) into my Notebook without restarting the kernel and running from scratch?
I faced a similar issue while importing a custom script in a Jupyter notebook.
Try importing the module under an alias and then reloading it:
import Nash as nash
from importlib import reload
reload(nash)
To make sure that all references to the old version of the module are updated, you might want to re-import it after reloading, e.g.
import mymodule
reload(mymodule)
import mymodule
Issues may arise due to implicit dependencies, because importlib.reload only reloads the (top-level) module that you ask it to reload, not its submodules or external modules.
If you want to reload a module recursively, you might have a look at this gist, which allows you to simply add a line like this at the top of a notebook cell:
%reload mymodule
import mymodule
This recursively reloads the mymodule and all of its submodules.
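For the layout in the question (a submodule inside a package, with names pulled in by a from-import), a rough sketch of the reload-then-re-import sequence might look like this. The package demo_pkg and its engine module are throwaway stand-ins for Nash/IOEngineNash.py, created in a temporary directory only so the example runs on its own:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package (hypothetical names) in a temp directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
module_path = os.path.join(pkg_dir, "engine.py")
with open(module_path, "w") as f:
    f.write("def answer():\n    return 1\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg.engine                 # binds the name demo_pkg, needed for reload
from demo_pkg.engine import answer
before = answer()

# Simulate editing the source file while the kernel keeps running.
with open(module_path, "w") as f:
    f.write("def answer():\n    return 'edited'\n")

importlib.reload(demo_pkg.engine)      # re-executes engine.py in the same module object
stale = answer()                       # the old from-imported name is still the old function
from demo_pkg.engine import answer     # rebind the name to the reloaded function
after = answer()

print(before, stale, after)  # 1 1 edited
```

The point of the sketch is that reload needs the submodule itself (which requires a plain import of it first), and that names copied in with a from-import have to be imported again afterwards.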

How to understand Python's module lookup

I created two new files, random.py and main.py, in the directory. The code is as follows:
# random.py
if __name__ == "__main__":
    print("random")
# main.py
import random
if __name__ == "__main__":
    print(random.choice([1, 2, 3]))
When I run the main.py file, the program reports an error.
Traceback (most recent call last):
  File "main.py", line 8, in <module>
    print(random.choice([1, 2, 3]))
AttributeError: module 'random' has no attribute 'choice'
main.py imports my own random module.
However, if I create a new sys.py file and a main.py file in the same directory, the code is as follows:
# sys.py
if __name__ == "__main__":
    print("sys")
# main.py
import sys
if __name__ == "__main__":
    print(sys.path)
When I run main.py, it runs successfully.
main.py imports the built-in module sys.
Why is there such a clear difference?
The directory relationship of the script file is as follows:
C:.
    main.py
    random.py
    sys.py
Thank you very much for your answer.
Forgive my poor english.
sys is a built-in module, meaning it's compiled directly into the Python executable itself. Built-in modules outprioritize external files when Python is looking for modules. The standard random module isn't built-in, so it doesn't get that treatment.
Quoting the docs:
When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module...
Python’s default sys.meta_path has three meta path finders, one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).
Since the finder for built-in modules comes before the finder that searches the import path, built-in modules will be found before anything on the import path.
You can see a tuple of the names of all modules your Python has built-in in sys.builtin_module_names.
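As a quick check of the above, something along these lines can be run in any interpreter. The exact contents of both sys.builtin_module_names and sys.meta_path depend on the Python build, so the comments describe what to expect on a standard CPython rather than guaranteed output:

```python
import sys

# Names of the modules compiled into this interpreter (varies by build).
builtin_names = set(sys.builtin_module_names)
print("sys" in builtin_names)     # True on CPython
print("random" in builtin_names)  # False: random is an ordinary .py file

# Finders are consulted in this order when a module is not yet in sys.modules.
for finder in sys.meta_path:
    print(finder)
```

On a default CPython the built-in importer is listed before the path-based finder, which matches the ordering described in the quoted documentation.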
That said, while any built-in module would outprioritize a module loaded from a file, sys has its own special handling. sys is one of the foundational building blocks of Python, and much of the sys module's setup needs to happen before the import system is functional enough for the normal import process to work. sys gets explicitly created during interpreter setup in a way that bypasses the normal import system, and then future imports for sys find it in sys.modules without hitting any meta path finders.
How and where sys is created is an implementation detail that varies from Python version to Python version (and is wildly different in different Python implementations), but in the CPython 3.7.4 code, you can see it beginning on line 755 in Python/pylifecycle.c.
tl;dr Caching
sys is somewhat of a special case among Python modules because it gets loaded at program start, unconditionally (presumably because a lot of the constants, functions, and data within it, such as the streams stdout and stderr, are used by the Python interpreter). As @user2357112 noted in the other answer, this is partly because it's built into the Python executable, but also because it's necessary for running a substantial amount of Python's core functionality (see below how it needs to be loaded for imports to work). random is part of the standard library, but it doesn't get loaded automatically when you execute, which is the primary relevant difference between it and sys, for our purposes.
Looking at python's documentation on the subject clarifies how python resolves imports:
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths.
...
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. However, if the value is None, then a ModuleNotFoundError is raised. If the module name is missing, Python will continue searching for the module.
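The two cache behaviours quoted above can be seen directly with a small experiment. This is only an illustrative sketch: json is used as an arbitrary standard-library module, and blocked_demo_module is a made-up name that does not exist anywhere:

```python
import sys
import json

# A successful import leaves the module object in sys.modules,
# and a repeated import simply returns that cached object.
cached = sys.modules["json"]
import json as json_again
same_object = cached is json is json_again

# A None entry in sys.modules makes the import fail immediately,
# without consulting any finder.
sys.modules["blocked_demo_module"] = None
try:
    import blocked_demo_module
    blocked = False
except ModuleNotFoundError:
    blocked = True
finally:
    del sys.modules["blocked_demo_module"]  # clean up the fake entry

print(same_object, blocked)  # True True
```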
As for where it looks for the module, you can see in your observed behavior that it looks in the local directory first. That is, it searches the local directory first and then the "usual places" afterwards.
The reason for the discrepancy between how sys is handled and how random is handled is caching - sys is cached (so python doesn't even check the path to import), whereas random is not cached (so python does check the path to import it, and imports locally).
There are a few ways you can change this behavior.
First, if you must have a local module called sys, you can use importlib to import it in relative or absolute terms, without running into the ambiguity with the sys that's already cached. I have no idea how this would affect other modules that independently try to import sys, and you really shouldn't be naming your files the same as standard library modules anyway.
Alternatively, if you want Python to check the standard library (for modules such as random) before the local directory, you should be able to do that by modifying sys.path, which lists the directories searched for imports, in order (much like the $PATH environment variable for executables). The first element of sys.path is normally the directory containing the script being run, or the empty string '' in an interactive session, which means the current working directory. So you can simply move that entry to the back of sys.path, to have it searched last instead of first:
sys.path.append(sys.path.pop(0))
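For the first option above (loading a local file whose name clashes with an already-cached module), one possible sketch uses importlib.util to load the file by its path under a different name. The module name local_sys and the file contents are invented for the example, and the file is written to a temporary directory only so the snippet is self-contained:

```python
import importlib.util
import os
import sys
import tempfile

# Write a throwaway sys.py that shadows the built-in name (contents are made up).
demo_dir = tempfile.mkdtemp()
local_path = os.path.join(demo_dir, "sys.py")
with open(local_path, "w") as f:
    f.write("WHO = 'local sys.py'\n")

# Load the file from its path under a different, hypothetical module name.
spec = importlib.util.spec_from_file_location("local_sys", local_path)
local_sys = importlib.util.module_from_spec(spec)
spec.loader.exec_module(local_sys)

print(local_sys.WHO)      # local sys.py
print(local_sys is sys)   # False: the real sys module is untouched
```

Because the module is never registered under the name sys, other code that does import sys keeps getting the real one.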

Code coverage does not cover dynamically imported packages/modules

I am using the following command to run tests:
nosetests --with-coverage --cover-html --cover-package mypackage
I would like the coverage report to be updated, even if a developer adds new, untested, code to the package.
For example, imagine a developer adds a new module to the package but forgets to write tests for it. Since the tests may not import the new module, the code coverage may not reflect the uncovered code. Obviously this is something which could be prevented at the code review stage but it would be great to catch it even earlier.
My solution was to write a simple test which dynamically imports all modules under the top-level package. I used the following code snippet to do this:
import os
import pkgutil
for loader, name, is_pkg in pkgutil.walk_packages([pkg_dirname]):
    mod = loader.find_module(name).load_module(name)
Dynamically importing sub-packages and sub-modules like this does not get picked up by the code coverage plugin in nose.
Can anyone suggest a better way to achieve this type of thing?
The problem seems to be the method for dynamically importing all packages/modules under the top-level package.
Using the method defined here seems to work. The key difference being the use of importlib instead of pkgutil. However, importlib was introduced in python 2.7 and 3.1 so this solution is not appropriate for older versions of python.
I have updated the original code snippet to use __import__ instead of the ImpLoader.load_module method. This also seems to do the trick.
import os
import pkgutil
for loader, name, is_pkg in pkgutil.walk_packages([pkg_dirname]):
    mod = loader.find_module(name)
    __import__(mod.fullname)
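On current Python versions, where find_module and load_module are deprecated or removed, a comparable test could be sketched with pkgutil.walk_packages plus importlib.import_module. The helper name import_all_submodules is made up, and the standard-library email package is used purely as a stand-in for mypackage; whether a given coverage plugin then reports the modules is not something this sketch verifies:

```python
import importlib
import pkgutil

import email  # stand-in for the top-level package under test


def import_all_submodules(package):
    """Import every module below `package` and return their dotted names."""
    imported = []
    prefix = package.__name__ + "."
    for module_info in pkgutil.walk_packages(package.__path__, prefix):
        importlib.import_module(module_info.name)
        imported.append(module_info.name)
    return imported


names = import_all_submodules(email)
print(len(names), "modules imported")
```

Passing the package's own __path__ and a dotted prefix means each module is imported under its full name (for example email.mime.text), which is the name a coverage tool would normally track.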

Where should my Python 3 modules be?

I recently installed Python 3 on my Mac OSX 10.6.8 and I haven't had any problems with modules or imports until now. I'm writing a function that tests whether or not a triangle is right angled based on the length of the sides and the guide that the exercise was in has a bunch of equalities to test so I can see if it works:
testEqual(is_rightangled(1.5,2.0,2.5), True)
testEqual(is_rightangled(4.0,8.0,16.0), False)
testEqual(is_rightangled(4.1,8.2,9.1678787077), True)
testEqual(is_rightangled(4.1,8.2,9.16787), True)
testEqual(is_rightangled(4.1,8.2,9.168), False)
testEqual(is_rightangled(0.5,0.4,0.64031), True)
I should apparently import a function called testEqual(a,b,c) from a module called test, since the example programme in the guide starts with from test import testEqual, but when I typed that into my file I got this message:
from test import testEqual
ImportError: cannot import name testEqual
I suppose I should specify the path to the test module, but I can't find my Python 3 library anywhere on my computer, just the 2.x ones that came installed with the computer, which are in /Library/Python. import turtle and import math worked, so it must be somewhere.
The test module in the Python stdlib doesn't contain a function called testEqual(). Its documentation starts with
Note: The test package is meant for internal use by Python only. It is
documented for the benefit of the core developers of Python. Any use
of this package outside of Python’s standard library is discouraged as
code mentioned here can change or be removed without notice between
releases of Python.
Are you sure that this guide you're following doesn't have its own test.py program that you're supposed to use instead?
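To see which file an import actually resolves to, a small helper like the following may be useful. The function name where_is is made up, and json and math are just examples; calling it with "test" would show which test module the from test import testEqual line is picking up on your machine:

```python
import importlib.util


def where_is(module_name):
    """Return the file (or 'built-in') a top-level module would be loaded from."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


for name in ("json", "math", "no_such_module_demo"):
    print(name, "->", where_is(name))
```

A result of None means Python cannot find the module at all, while a path inside the Python installation means the standard-library version is being used rather than a file from the guide.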
When you write your testEqual() function, make a note of the directory you are working in. For instance, on my Mac I created a directory (folder) in Documents, so my path looks like /Users/myName/Documents/python. Save your function (module) as testEqual.py, and when you write your test.py script, import testEqual after the shebang line. Once you have run your scripts, Python will have created a folder called __pycache__ next to your modules; there is no need to remove it, as it holds the compiled bytecode. As long as you are working in the same directory as your module, you should not need to do anything other than use the import statement.
