Properly importing modules in Python

How do I set up module imports so that each module can access the objects of all the others?
I have a medium-sized Python application with module files in various subdirectories. I have created modules that append these subdirectories to sys.path and import a group of modules, using import thisModule as tm. Module objects are referred to with that qualification. I then import that module into the others with from moduleImports import *. The code is sloppy right now: it has several of these import-hub modules, which are often duplicative.
First, the application is failing because some module references aren't assigned, even though the same code runs fine under unit tests.
Second, I'm worried that I'm creating a problem with recursive module imports: importing moduleImports imports thisModule, which imports moduleImports, and so on.
What is the right way to do this?

"I have a medium size Python application with modules files in various subdirectories."
Good. Make absolutely sure that each directory includes an __init__.py file, so that it's a package.
"I have created modules that append these subdirectories to sys.path"
Bad. Use PYTHONPATH or install the whole structure in Lib/site-packages. Don't update sys.path dynamically. It's a bad thing: hard to manage and maintain.
"imports a group of modules, using import thisModule as tm."
Doesn't make sense. Perhaps you have one import thisModule as tm for each module in your structure. This is typical, standard practice: import just the modules you need, no others.
"I then import that module into the others with from moduleImports import *"
Bad. Don't blanket import a bunch of random stuff.
Each module should have a longish list of the specific things it needs.
import this
import that
import package.module
Explicit list. No magic. No dynamic change to sys.path.
My current project has hundreds of modules and a dozen or so packages. Each module imports just what it needs. No magic.
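For example, each module might open with a plain, explicit import list like the following (the analytics package and its contents are hypothetical stand-ins for the asker's own names):
# reports.py - one module of the application; all names are hypothetical
import os                         # stdlib
import analytics.stats            # a package reachable via PYTHONPATH or site-packages
from analytics.db import Query    # name exactly what this module uses, nothing more

def build_report(rows):
    summary = analytics.stats.summarize(rows)   # qualified: the origin is obvious
    return Query(summary)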

A few pointers:
You may have already split functionality into various modules. If done correctly, most of the time you will not fall into circular import problems (e.g. if module a depends on b and b on a, you can make a third module c to remove such a circular dependency). As a last resort, in a import b, but in b import a at the point where a is needed, e.g. inside a function; see the sketch after these pointers.
Once functionality is properly in modules, group them in packages under a subdirectory and add an __init__.py file to each so that you can import the package. Keep such packages in a folder, e.g. lib, and then either add that folder to sys.path or set the PYTHONPATH environment variable.
from module import * may not be a good idea. Instead, import whatever is needed; it may be fully qualified, and it doesn't hurt to be verbose, e.g. from packageA.moduleB import CoolClass.
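A minimal sketch of that last-resort deferral, with two hypothetical modules a.py and b.py that would otherwise import each other at the top level:
# a.py
import b                      # safe: b does not import a at module level

def from_a():
    return 'a' + b.from_b()

# b.py
def from_b():
    return 'b'

def needs_a():
    import a                  # deferred: breaks the a <-> b import cycle
    return a.from_a()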

The way to do this is to avoid magic. In other words, if your module requires something from another module, it should import it explicitly. You shouldn't rely on things being imported automatically.
As the Zen of Python (import this) has it, explicit is better than implicit.

You won't get recursion on imports because Python caches each module and won't reload one it already has.
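The cache is visible in sys.modules; a second import of an already-loaded module is just a dictionary lookup, not a reload:
import sys
import json                    # first import: loaded and cached
print('json' in sys.modules)   # True
import json                    # second import: served from sys.modules, no reload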

Related

Custom Module with Custom Python Package - Module not found error

I wrote a custom Python package for Ansible to handle business logic for some servers I manage. I have multiple files, and they reference each other by re-importing the package.
So my package named <MyCustomPackage> has functions <Function1>, <Function2>, <Function3>, etc., all in their own files... Some of these functions reference functions in the same package, so to do that the file has:
import MyCustomPackage
at the top. I did it this way instead of a relative import because I'm also unit testing these, and mocking would not work with relative paths because of an __init__ file in the test directory which was needed for test discovery. The only way I could mock was through importing the package itself. Seemed simple enough.
The problem is with Ansible. These packages are in module_utils. I import them with:
from ansible.module_utils.MyCustomPackage import MyCustomPackage
but when I use the commands I get module-not-found errors, which I traced back to the import MyCustomPackage statement in the package itself.
So - how should I be structuring my package? Should I try again with relative file imports, or have the package modify the path so it's found with the friendly name?
Any tips would be helpful! Or if someone has a module they've written with Python modules in module_utils and unit tests that they'd be willing to share, that'd be great also!
Many people have problems with relative imports and imports in general in Python because they are ambiguous and surprisingly depend on your current working directory (and other things).
Thus I've created an experimental, new import library: ultraimport
It gives you more control over your imports and lets you do file system based, relative imports.
Given that you have a file function1.py, to import a function from function2.py, you would then write:
import ultraimport
Function2 = ultraimport('__dir__/function2.py', 'Function2')
This will always work, no matter how you run your code. It also does not force you to a specific package structure. You can just have any files you like.

Python Standard Library Import Relationships

I am writing an application in C# with Visual Studio and am using IronPython to write some Python scripts for my application. However, IronPython does not ship with the entire standard library by default. So to import some modules (such as os) I need to point my C# code to where the os module actually is. I also understand that it will still be limited to libraries implemented in pure Python.
Ultimately I want to have something that can be installed on another machine. My current workaround is to include a copy of https://github.com/python/cpython/tree/2.7/Lib in the Debug folder where the executable is running, but it seems excessive/unnecessary to include the entire thing. I tried just placing the files I need (for example os.py) there, but obviously it imports other modules, which import other modules, etc. I would have to re-run the code to see which module it couldn't find, add them in one by one, and it was getting too tedious.
I was wondering if there was any sort of resource that specifies the relationships between standard library modules and could tell me exactly what files to copy. Essentially what I'm looking for is the graph of the standard library imports. So if I want to import os in these scripts I know to copy os.py, ntpath.py, ...
Thanks
You probably don't need the imports as a tree, but as a simple list, so you can just copy the needed files. You can get that from sys.modules after you import everything that your script needs; it will contain all modules needed by those that you imported, e.g.:
import sys    # built-in module, adds no file to the list; needed for sys.modules
import os
import time
# import whatever else your script needs

# this gives a list of (module name, file) pairs for every module backed by a file
m = [(name, mod.__file__) for name, mod in sys.modules.items() if hasattr(mod, "__file__")]
for name, path in m:
    print name, path

Project structure leads to redundant dot notation

I have created a Python package which builds on the structure indicated in Kenneth Reitz' "Repository Structure and Python" (1). The main package path is:
/projects-folder (not site-packages)
    /package
        /package
            __init__.py
            Datasets.py
            Draw.py
            Gmaps.py
            ShapeSVG.py
            project.py
        __init__.py
        setup.py
With the current structure, I must use the following module import syntax:
import package.package.Datasets
I would prefer to type the following:
import package.Datasets
I am capable of typing the same word twice, of course, but it feels wrong in a deeper sense, i.e., I am structuring my package incorrectly or misunderstanding how Python interprets that structure.
The outer __init__.py is required for Python to detect this package at all, per the docs (2). But that sets up /package/ as the top level of the package and /package/package/ as a sub-package, forcing me into the unwieldy import syntax above.
To avoid this, it seems that my options are to:
Create a package in which the outer folder contains the top level of package modules.
Add the inner folder to my PYTHONPATH environment variable.
Yet both of these seem like suboptimal workarounds for something that shouldn't be an issue in the first place. What should I do?
You've misunderstood. You have two nested folders named package for some reason, but the source you cite never said to do that. The outer folder, with setup.py, is not supposed to be a package.
It sounds like you're running Python in projects-folder and trying to import your package from there. That's not what you should be doing. You have several options to get your package into the import system. (I'll refer to the folder with setup.py in it as setupfolder, to distinguish it from the inner folder):
Build your package with setup.py, for example python setup.py bdist_wheel --universal, and install the built package with pip.
Skip the build step and just run pip install path/to/setupfolder. Building the package produces an installer useful if you want to distribute your package, but maybe you don't want to do that.
"Install" the package's source tree in development mode with pip install -e path/to/setupfolder, so the Python import system will locate the package's source tree when performing imports. This is handy because you don't have to rebuild and reinstall if you edit the source repository, although you'll still want to restart any running Python processes that are using the package.
Run Python from directly inside the setupfolder.
Any of these options will cause your package to be importable directly as package instead of package.package, as it should be.
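For reference, a minimal sketch of the setup.py that would live in setupfolder, assuming setuptools (the name and version are placeholders):
# setup.py - minimal sketch; metadata values are placeholders
from setuptools import setup, find_packages

setup(
    name='package',
    version='0.1.0',
    packages=find_packages(),   # discovers the inner package/ via its __init__.py
)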
While I do not entirely agree with your package structure, you can make use of __all__ and possibly the one legitimate use for star imports I've seen so far. __init__.py can serve more purposes than just identifying your folder as a package or sub-package.
Using a Star Import
In package/package/__init__.py, add a variable __all__ that declares all the public elements you want to export:
__all__ = ['Datasets', 'Draw', 'Gmaps', 'ShapeSVG', 'project']
In package/__init__.py do from package.package import *. Now all the attributes that were available as package.package.x will also be available as package.x.
If you want to directly copy package.package.__all__ to package.__all__ (which is optional, but will allow you to do from package import * properly), you can do something like
from package.package import *
from package.package import __all__ as _all
__all__ = _all
del _all
Not Using a Star Import
You can accomplish the same thing without using package.package.__all__ at all. Just add __all__ directly to package/__init__.py and use from package.package import x-style imports:
from package.package import (
    Datasets, Draw, Gmaps, ShapeSVG, project
)
# As before, package.__all__ is optional
__all__ = ['Datasets', 'Draw', 'Gmaps', 'ShapeSVG', 'project']
I would still recommend having a package.package.__all__ variable, but it is optional for this particular purpose.
Pros and Cons
Both approaches are pretty legitimate and I have seen both used in major projects. The first approach reduces redundancy. You only define the public exports in one place: package.package.__all__. The star imports and package.__all__ reference that definition directly, leading to one place that you really have to maintain. On the other hand, there are times when you want to separate the "full" package.package.x API from what you expose via package.x to the casual user. In that case, go with the second option. The only downside here is that you have to be more careful to keep package.__all__ and the corresponding imports synchronized properly.
Note
A number of projects I've seen (numpy especially comes to mind) export the attributes of the individual modules to the top level using this technique. For example, if you had a function package.package.Datasets.get_data, it would be listed in package.package.Datasets.__all__, which would be imported into package.package.__init__, appended to package.package.__all__, and then be referenced by the top-level package and package.__all__.
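A tiny sketch of that pattern, using the hypothetical get_data function as above:
# package/package/Datasets.py (hypothetical contents)
__all__ = ['get_data']

def get_data():
    return []

# package/package/__init__.py
from package.package.Datasets import *                   # re-exports get_data
from package.package.Datasets import __all__ as _d_all
__all__ = list(_d_all)   # append the other modules' __all__ lists the same way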

Importing a module which has the same name as a system module

My situation is similar to the one in this question... The difference is:
In our Python/Django project, we have a directory called utils, which keeps basic functions...
Sometimes, we need to test some modules by running them from the console, like:
python myproject/some_module.py
It is all good, until Python tries to import something from our utils directory...
from utils.custom_modules import some_function
ImportError: No module named custom_modules
I checked my Python path, and our project is on the list; each folder under the project directory has an __init__.py file, and when I run ipython within the project directory, everything is OK. Otherwise, Python imports from its own utils directory...
My colleagues use the same method without any problem, but it throws ImportError in my environment... What could we all be missing?
UPDATE: My project directory and each sub-directory have an __init__.py file, and I can import other modules from my project without any problem... When I am in a different folder than my project and I run ipython, such an import works fine:
from someothermodule.submodule import blahblahblah
But when it comes to importing utils, it fails...
UPDATE 2: What caused the problem was the utils directory under the django folder, which is also on the Python path...
See the PEP on absolute and relative imports for the semantics. You probably want
from .utils.custom_modules import some_function
if you're in a file at the top level of your package.
Edit: This can only be done from inside a package. This is for a good reason: if you're importing something that is part of your project, then you're already treating it like a Python package, and you should actually make it one. You do this by adding an __init__.py file to the project directory.
Edit 2: You've completely changed the question. It may be possible to work around the problem, but the correct thing to do is not to refer to your package the same way as a builtin package. You either need to rename utils, or make it a subpackage of another package, so you can refer to it by a non-conflicting name (like from mydjangoapp.utils.custom_modules import some_function).
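That second option implies a layout like this (mydjangoapp stands in for the real app name):
mydjangoapp/
    __init__.py
    utils/
        __init__.py
        custom_modules.py    # from mydjangoapp.utils.custom_modules import some_function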
I'm not going to bother telling you to make sure you don't name your own modules after stdlib modules.
If you want to leave the name as it is, you'll have to use something like this in everything that imports your own utils module:
import sys
import imp

utils = sys.modules.get('utils')
if utils is None:
    # look for a module named 'utils' inside the directory 'utils' (i.e. utils/utils),
    # rather than letting the regular import pick up another module with that name
    utils = imp.load_module('utils', *imp.find_module('utils', ['utils']))
However, unless there's a lot of stuff that you'd have to change after renaming it, I'd suggest you rename it.

import statement fails for one module

OK, I found the problem: it was an environmental issue. I had the same modules (minus options.py) on sys.path, and it was importing from there instead. Thanks everyone for your help.
I have a series of import statements, the last of which will not work. Any idea why? options.py is sitting in the same directory as everything else.
from snipplets.main import MainHandler
from snipplets.createnew import CreateNewHandler
from snipplets.db import DbSnipplet
from snipplets.highlight import HighLighter
from snipplets.options import Options
ImportError: No module named options
my __init__.py file in the snipplets directory is blank.
I suspect that one of your other imports redefined snipplets with an assignment statement. Or one of your other modules changed sys.path.
Edit
"so the flow goes like this: add snipplets packages to path import..."
No.
Do not modify sys.path -- that way lies problems. Modifying sys.path leads to ambiguity about what is -- or is not -- on the path, and what order things are in.
The simplest, most reliable, most obvious, most controllable things to do are the following. Pick exactly one.
Define PYTHONPATH (once, external to your program). A single, simple environment variable that is nearly identical to installing in site-packages.
Install your package in site-packages.
Your master branch doesn't have options.py. Could it be that your dev and master branches are conflicting?
If this is your actual code, then you have an options variable at line 21.
Does the following work?
from snipplets.options import Options
If so, one of your other snipplets files probably sets a global variable named options.
Are you on Windows? You might want to try defining an __all__ list in your __init__.py file, as noted here. It shouldn't make a difference unless you're importing *, but I've seen modules fail to import unless they were defined there.
Secondly, you might try setting up a virtualenv. Using a lot of site-wide Python packages can lead to these kinds of things.
Lastly, make sure the permissions of options.py are set correctly. I've spent hours trying to figure these things out, only to find out it was an issue of me not having permission to import it.
