Unit test structure and imports - Python

I want to know how to properly import a file in my test file without using an __init__.py in the test root folder. The last sentence of this article states that the test directory should not have an __init__.py file.
I don't know much about Python, so I would like to know:
1) Why not?
2) How do I import the file tested.py into test_main.py in order to test its functionality, without using an __init__.py file as a script that inserts paths into sys.path?
My project has the following structure:
.
├── __init__.py
├── my
│   ├── __init__.py
│   ├── main.py
│   └── my_sub
│       ├── __init__.py
│       └── tested.py
└── test
    └── test_main.py
The files contain the following code:
#/Python_test/__init__.py
import my.main
#/Python_test/my/__init__.py
#empty
#/Python_test/my/main.py
from .my_sub.tested import inc
print(inc(5))
#/Python_test/my/my_sub/__init__.py
#empty
#/Python_test/my/my_sub/tested.py
def inc(x):
    return x+1
#/Python_test/test/test_main.py
from Python_pytest.my.my_sub.tested import func
def test_answer():
    assert func(3) == 3
When I run the code from the command line with python __init__.py, it prints 6, which is correct.
I would like to test the function inc() in tested.py file.
1. I installed the pytest package as a testing framework,
2. created a test file called test_main.py, similar to the one from the tutorial here,
3. added an __init__.py with code that finds the path of the root directory and adds it to sys.path.
It worked well, but then I read that it shouldn't be done this way. So how should it be done? I have a hard time reading unit-tested code from GitHub repositories that are tested and don't use the init file (one of them is boto3); I can't find any clue that suggests how to do it properly.
I also tried to use relative imports this way
from ..my.main import func
but it throws ValueError: attempted relative import beyond top-level package, which makes sense, but I tried it anyway.
Now I don't know how to do this. Tutorials about importing usually state that we should add the paths of imported modules to sys.path (if they are not already present), but how should I do that when there shouldn't be an init file to hold that functionality?
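One common pattern (a sketch of an assumption on my part, not something taken from the linked article) is to move that path setup into a conftest.py at the project root; pytest imports conftest.py files automatically before collecting tests, so no __init__.py is needed in test/:
# /Python_test/conftest.py -- minimal sketch
import os
import sys

# put the project root (the directory containing this file) on sys.path
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
With that in place, test_main.py can simply do from my.my_sub.tested import inc, and you run pytest from the project root.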

Related

Trying to run python code from repo without instructions

I am trying to run code from this repo without success. There are no instructions on how to run it. I suspect I should run FactcheckingRANLP/Factchecking_clean/classification/lstm_train.py and then run .../lstm_test.py.
The problem is that this code uses import statements as if it were run as a module, referencing folders and files that are in different directories. For example, in lstm_train.py:
File "lstm_train.py", line 3, in <module>
from classification.lstm_utils import *
ModuleNotFoundError: No module named 'classification'
This is the tree structure of the classification folder:
.
├── classification
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── lstm_repres.py
│   ├── lstm_test.py
│   ├── lstm_train.py
│   ├── lstm_train.pyc
│   ├── lstm_utils.py
│   ├── lstm_utils.pyc
│   ├── __pycache__
│   │   ├── __init__.cpython-36.pyc
│   │   ├── lstm_train.cpython-36.pyc
│   │   └── lstm_utils.cpython-36.pyc
│   └── svm_run.py
I would like to know how I can make Python run the lstm_train.py/lstm_test.py files in such a way that the import statements contained within them are resolved correctly. I would prefer not to modify the code, as this could generate a lot of errors.
You could add the path pointing to the classification folder to your python path variable.
I suggest using the sys package:
import sys
sys.path.append('<<<PathToRepo>>>/FactcheckingRANLP/Factchecking_clean')
With the repo classification directory added to your python path, the import statements should work.
EDIT:
Correction: in the initial post I suggested adding the path to .../classification to your path variable; instead, the parent folder .../Factchecking_clean is required, because the file imports the module 'classification'.
Also, in Lucas Azevedo's answer, the parent directory path is added in the repository lstm_train file itself. While this definitely works, I still think it should be possible without editing the original repository.
I took a look at the repo in question and files like lstm_train.py are scripts which should be executed with the python working directory set as '<<<PathToRepo>>>/FactcheckingRANLP/Factchecking_clean'.
There are a few ways to do so:
You could open the project in a python IDE and configure your execution to use the directory .../Factchecking_clean as the working directory. In pycharm for example this could be done by importing the repo directory .../Factchecking_clean as a project. The following image shows how to set a working directory for execution in pycharm:
I think the repository was developed with this execution configuration set up.
Another possibility is to execute the Python script from within another Python file. This seems rather inconvenient to me; regardless, you could do so by creating a separate Python file with:
import sys
import os
sys.path.append('<<<PathToRepo>>>/FactcheckingRANLP/Factchecking_clean')
os.chdir('<<<PathToRepo>>>/FactcheckingRANLP/Factchecking_clean')
exec(open('./classification/lstm_train.py').read())
This adds the Factchecking_clean directory to the Python path (using sys.path.append()) so that imports like classification.lstm_utils can be resolved. The working directory is set by os.chdir(), and finally exec(open('<<<filepath>>>').read()) executes the lstm_train file with the correct working directory and path variable set up.
Executing the new python file with the code above works for me (without editing the original repository).
However, as scripts like lstm_train.py actually are used to execute specific parts of the code provided in the rest of the repository modules, I think editing these files for experimental purposes is fine. In general, when working with repositories like this I recommend using an IDE (like pycharm) with a correctly set up configuration (method 1).
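A slightly tidier variant of the same idea, staying in Python, would be to use runpy instead of exec() (just a sketch, reusing the placeholder path from above):
import os
import runpy
import sys

repo_root = '<<<PathToRepo>>>/FactcheckingRANLP/Factchecking_clean'  # placeholder path, as above
sys.path.insert(0, repo_root)  # make the 'classification' package importable
os.chdir(repo_root)            # the scripts expect this working directory
# run the training script as if it had been started with python -m
runpy.run_module('classification.lstm_train', run_name='__main__')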
Ended up having to modify the repository's code, following the suggestion of Jeff.
import sys,os
parent_dir_path = os.path.abspath(__file__+"/../..")
sys.path.append(parent_dir_path)
Adding the path up to classification doesn't work, as the import is done by naming the classification folder; the parent directory is the one that has to be added.
__file__ gives the current file's path, and os.path.abspath resolves the /../.. path navigation.
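For illustration (with a made-up absolute path), the expression resolves like this:
# if __file__ is /repo/Factchecking_clean/classification/lstm_train.py, then
# __file__ + "/../.." is "/repo/Factchecking_clean/classification/lstm_train.py/../..",
# which os.path.abspath() collapses to "/repo/Factchecking_clean" --
# exactly the parent directory that needs to be on sys.path.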

How can I structure my folders to test my code?

I'm working on a project and would like to write some unit tests for my code. Here is a minimal example of my project structure
myproject
└── scripts
    ├── tests
    │   └── test_func.py
    └── tools
        ├── __init__.py
        └── func.py
func.py contains
def f(x):
    return 0
and test_func.py contains
import pytest
from ..tools import f
def test_f():
    assert f(1) == 0
When I try to run my tests with pytest scripts/tests/test_func.py, I get the following error: ImportError: attempted relative import with no known parent package.
Clearly, there is a problem with the structure of my project. There is a solution here which involves adding something to my path but I'd rather not do that if there is a way I can just structure my project to avoid this instead.
You need to add (empty) __init__.py files to the directories scripts/ and scripts/tests/. These files provide the scaffolding that lets import find modules, especially when using relative imports.
The __init__.py files are required to make Python treat directories containing the file as packages.
From https://docs.python.org/3/tutorial/modules.html
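For concreteness, the layout would then look like this (a sketch of the suggestion above, assuming pytest is run from the myproject root):
myproject
└── scripts
    ├── __init__.py
    ├── tests
    │   ├── __init__.py
    │   └── test_func.py
    └── tools
        ├── __init__.py
        └── func.py
With the __init__.py files in place, pytest (in its default import mode) walks up to the first directory without one (myproject), puts it on sys.path, and the relative import then resolves against the scripts package. Note that f itself is defined in func.py, so the test import may still need to be from ..tools.func import f unless tools/__init__.py re-exports it.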

How do you organise a python project that contains multiple packages so that each file in a package can still be run individually?

TL;DR
Here's an example repository that is set up as described in the first diagram (below): https://github.com/Poddster/package_problems
If you could please make it look like the second diagram in terms of project organisation, with all of the following commands still working, then you've answered the question:
$ git clone https://github.com/Poddster/package_problems.git
$ cd package_problems
<do your magic here>
$ nosetests
$ ./my_tool/my_tool.py
$ ./my_tool/t.py
$ ./my_tool/d.py
(or for the above commands, $ cd ./my_tool/ && ./my_tool.py is also acceptable)
Alternatively: Give me a different project structure that allows me to group together related files ('package'), run all of the files individually, import the files into other files in the same package, and import the packages/files into other package's files.
Current situation
I have a bunch of python files. Most of them are useful when callable from the command line i.e. they all use argparse and if __name__ == "__main__" to do useful things.
Currently I have this directory structure, and everything is working fine:
.
├── config.txt
├── docs/
│   ├── ...
├── my_tool.py
├── a.py
├── b.py
├── c.py
├── d.py
├── e.py
├── README.md
├── tests
│   ├── __init__.py
│   ├── a.py
│   ├── b.py
│   ├── c.py
│   ├── d.py
│   └── e.py
└── resources
    ├── ...
Some of the scripts import things from other scripts to do their work, but no script is merely a library; they are all invokable. E.g. I could invoke ./my_tool.py, ./a.py, ./b.py, ./c.py, etc. and they would do useful things for the user.
"my_tool.py" is the main script that leverages all of the other scripts.
What I want to happen
However, I want to change the way the project is organised. The project itself represents an entire program usable by the user and will be distributed as such, but I know that parts of it will be useful in different projects later, so I want to try to encapsulate the current files into a package. In the immediate future I will also add other packages to this same project.
To facilitate this I've decided to re-organise the project to something like the following:
.
├── config.txt
├── docs/
│   ├── ...
├── my_tool
│   ├── __init__.py
│   ├── my_tool.py
│   ├── a.py
│   ├── b.py
│   ├── c.py
│   ├── d.py
│   ├── e.py
│   └── tests
│       ├── __init__.py
│       ├── a.py
│       ├── b.py
│       ├── c.py
│       ├── d.py
│       └── e.py
├── package2
│   ├── __init__.py
│   ├── my_second_package.py
│   ├── ...
├── README.md
└── resources
    ├── ...
However, I can't figure out an project organisation that satisfies the following criteria:
All of the scripts are invokable on the command line (either as my_tool/a.py or cd my_tool && ./a.py)
The tests actually run :)
Files in package2 can do import my_tool
The main problem is with the import statements used by the packages and the tests.
Currently, all of the packages, including the tests, simply do import <module> and it's resolved correctly. But when jiggering things around it doesn't work.
Note that supporting py2.7 is a requirement so all of the files have from __future__ import absolute_import, ... at the top.
What I've tried, and the disastrous results
1
If I move the files around as shown above, but leave all of the import statements as they currently are:
$ ./my_tool/*.py works and they all run properly
$ nosetests run from the top directory doesn't work. The tests fail to import the package's scripts.
pycharm highlights import statements in red when editing those files :(
2
If I then change the test scripts to do:
from my_tool import x
$ ./my_tool/*.py still works and they all run properly
$ nosetests run from the top directory doesn't work. The tests can import the correct scripts, but the imports in those scripts themselves fail when the test scripts import them.
pycharm highlights import statements in red in the main scripts still :(
3
If I keep the same structure and change everything to be from my_tool import then:
$ ./my_tool/*.py results in ImportErrors
$ nosetests runs everything ok.
pycharm doesn't complain about anything
e.g. for the first point:
Traceback (most recent call last):
  File "./my_tool/a.py", line 34, in <module>
    from my_tool import b
ImportError: cannot import name b
4
I also tried from . import x but that just ends up with ValueError: Attempted relative import in non-package for the direct running of scripts.
Looking at some other SO answers:
I can't just use python -m pkg.tests.core_test as
a) I don't have main.py. I guess I could have one?
b) I want to be able to run all of the scripts, not just main?
I've tried:
if __name__ == '__main__' and __package__ is None:
    from os import sys, path
    sys.path.append(path.dirname(path.dirname(path.abspath(__file__))))
but it didn't help.
I also tried:
__package__ = "my_tool"
from . import b
But received:
SystemError: Parent module 'loading_tool' not loaded, cannot perform relative import
adding import my_tool before from . import b just ends up back with ImportError: cannot import name b
Fix?
What's the correct set of magical incantations and directory layout to make all of this work?
Once you move to your desired configuration, the absolute imports you are using to load the modules that are specific to my_tool no longer work.
You need three modifications after you create the my_tool subdirectory and move the files into it:
Create my_tool/__init__.py. (You seem to already do this but I wanted to mention it for completeness.)
In the files directly under my_tool, change the import statements to load the modules from the current package. So in my_tool.py change:
import c
import d
import k
import s
to:
from . import c
from . import d
from . import k
from . import s
You need to make a similar change to all your other files. (You mention having tried setting __package__ and then doing a relative import but setting __package__ is not needed.)
In the files located in my_tool/tests: change the import statements that import the code you want to test to relative imports that load from one package up in the hierarchy. So in test_my_tool.py change:
import my_tool
to:
from .. import my_tool
Similarly for all the other test files.
With the modifications above, I can run modules directly:
$ python -m my_tool.my_tool
C!
D!
F!
V!
K!
T!
S!
my_tool!
my_tool main!
|main tool!||detected||tar edit!||installed||keys||LOL||ssl connect||parse ASN.1||config|
$ python -m my_tool.k
F!
V!
K!
K main!
|keys||LOL||ssl connect||parse ASN.1|
and I can run tests:
$ nosetests
........
----------------------------------------------------------------------
Ran 8 tests in 0.006s
OK
Note that I can run the above both with Python 2.7 and Python 3.
Rather than making the various modules under my_tool directly executable, I suggest using a proper setup.py file to declare entry points and letting setup.py create these entry points when the package is installed. Since you intend to distribute this code, you should use a setup.py to formally package it anyway.
Modify the modules that can be invoked from the command line so that, taking my_tool/my_tool.py as an example, instead of this:
if __name__ == "__main__":
print("my_tool main!")
print(do_something())
You have:
def main():
    print("my_tool main!")
    print(do_something())

if __name__ == "__main__":
    main()
Create a setup.py file that contains the proper entry_points. For instance:
from setuptools import setup, find_packages

setup(
    name="my_tool",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            'my_tool = my_tool.my_tool:main'
        ],
    },
    author="",
    author_email="",
    description="Does stuff.",
    license="MIT",
    keywords=[],
    url="",
    classifiers=[
    ],
)
The file above instructs setup.py to create a script named my_tool that will invoke the main method in the module my_tool.my_tool. On my system, once the package is installed, there is a script located at /usr/local/bin/my_tool that invokes the main method in my_tool.my_tool. It produces the same output as running python -m my_tool.my_tool, which I've shown above.
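For completeness (this is standard setuptools behaviour rather than anything specific to this project), installing the package, for example in editable mode while developing, is what creates that console script:
$ pip install -e .
$ my_tool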
Point 1
I believe it's working, so I don't comment on it.
Point 2
I always used tests at the same level as my_tool, not below it, but they should work if you do this at the top of each test file (before importing my_tool or any other .py file in the same directory):
import os
import sys
sys.path.insert(0, os.path.abspath(__file__).rsplit(os.sep, 2)[0])
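For illustration (with a made-up path), the rsplit strips the last two path components:
# if __file__ is /home/user/project/my_tool/tests/a.py, then
# os.path.abspath(__file__).rsplit(os.sep, 2)[0] == '/home/user/project/my_tool',
# which is the directory that ends up at the front of sys.path.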
Point 3
In my_second_package.py do this at the top (before importing my_tool)
import os
import sys
sys.path.insert(0,
                os.path.abspath(__file__).rsplit(os.sep, 2)[0] + os.sep
                + 'my_tool')
Best regards,
JM
To run it from the command line and have it act like a library, while allowing nosetests to operate in a standard manner, I believe you will have to take a doubled-up approach to imports.
For example, the Python files will require:
try:
    import f
except ImportError:
    import tools.f as f
I went through and made a PR off the github you linked with all test cases working.
https://github.com/Poddster/package_problems/pull/1
Edit: I had forgotten the imports in __init__.py that are needed for it to be properly usable from other packages; they are added now. You should now be able to do:
import tools
tools.c.do_something()
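The __init__.py imports referred to in that edit would presumably look something like this (a guess at the shape, not copied from the PR):
# tools/__init__.py (sketch)
from . import c   # makes tools.c available after "import tools"
# ...and similarly for the other submodules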

Python can't import my package

I have the following directory structure:
myapp
├── a
│   ├── amodule.py
│   └── __init__.py
├── b
│   ├── bmodule.py
│   └── __init__.py
└── __init__.py
In a/amodule.py
I have this snippet which calls a simple function in b/bmodule.py
from myapp.b import bmodule
b.myfunc()
But when I run python a/amodule.py I get this error:
File "a/amodule.py", line 1, in <module>
from myapp.b import bmodule
ImportError: No module named 'myapp'
What am I doing wrong?
You need to put your project root onto your Python path.
You can set the PYTHONPATH environment variable,
or you can alter sys.path before importing,
or you can use an IDE like PyCharm that will do this kind of thing for you
(although it will probably then be from b import blah).
There are likely other ways to resolve this issue as well;
watch out for circular imports ...
(In Python 3 you can also do relative imports, although I am not a big fan of this feature:)
from ..b import blah
The best way to allow
from myapp.b import whatever
would be to edit your .bashrc file so that your parent path is always added to the PYTHONPATH:
export PYTHONPATH=$PYTHONPATH:/home/lee/Code
Now every time you log into the system, Python will treat your Code folder as a default place to look for importable modules, regardless of where the file is executed from.
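If you would rather not touch .bashrc, the "alter sys.path before importing" option mentioned above could look like this (a sketch only; the two-levels-up calculation assumes amodule.py sits at myapp/a/amodule.py and that the directory containing myapp is the one you want on the path):
# a/amodule.py (sketch)
import os
import sys

# put the directory that contains the myapp package on sys.path
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..")))

from myapp.b import bmodule
bmodule.myfunc()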

Import issues in Python

I am having an import error problem in a test script. It looks like it's something to do with the directory structure.
I have the following folder structure:
A
├── F1
│   ├── __init__.py
│   └── Src
│       └── F2
│           └── __init__.py
└── tests1
    └── tests1
        └── test_script.py
So the chain of directories is A/F1/Src/F2.
F1 has an __init__.py at its level, and F2 has an __init__.py at its level.
At the same level as F1 there is another folder, tests1, with the script at tests1/tests1/test_script.py.
In test_script.py, I have a line which says
from F1.src.F2 import C
With the above, I get an error saying there is no module named "F1.src.F2".
Does someone know what is going on here?
from F1.src.F2 import C is an absolute import. In order for it to work, "F1" has to be located somewhere on your Python path (sys.path). Generally this also includes the current directory if you ran Python on the command line.
So if the A directory is not one of the directories on your Python path and is not your current working directory, there is no reason the import would work.
Module names are case sensitive. You have Src in one place and src in another; I'm not sure whether that reflects your actual directory structure or is just what you typed here.
Using a relative import will not work if you are running test_script.py as a script (which is what it sounds like). So what you really want to do is make sure that you either run the script from the A directory, or go whole hog: make your project into a legit package with a setup.py and use a test runner such as tox.
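As a concrete illustration of the first option (assuming the layout above, and that the import is corrected to match the actual case of Src), running from A with the current directory on the Python path makes the absolute import resolvable:
$ cd A
$ PYTHONPATH=. python tests1/tests1/test_script.py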
I just had to create a shared library with the "egg" file.
As simple as that, but it occurred to me late!
