I have a Python 2.6 Django app which has a folder structure like this:
/foo/bar/__init__.py
I have another couple of directories on the filesystem full of Python modules, like this:
/modules/__init__.py
/modules/module1/__init__.py
/other_modules/module2/__init__.py
/other_modules/module2/file.py
Each module's __init__.py has a class, for example module1Class() and module2Class() respectively. In module2, file.py contains a class called myFileClass().
What I would like to do is put some code in /foo/bar/__init__.py so I can import in my Django project like this:
from foo.bar.module1 import module1Class
from foo.bar.module2 import module2Class
from foo.bar.module2.file import myFileClass
The list of directories which have modules is contained in a tuple in a Django config which looks like this:
module_list = ("/modules", "/other_modules",)
I've tried using __import__ and vars() to dynamically generate variables like this:
import os
import sys
for m in module_list:
    sys.path.insert(0, m)
    for d in os.listdir(m):
        if os.path.isdir(d):
            vars()[d] = getattr(__import__(m.split("/")[-1], fromlist=[d]), d)
But that doesn't seem to work. Is there any way to do this?
Thanks!
I can see at least one problem with your code. The line...
if os.path.isdir(d):
...won't work, because os.listdir() returns relative pathnames, so you'll need to join them with the directory you listed to get absolute pathnames. Otherwise, os.path.isdir() will return False because the path doesn't exist (relative to the current working directory), rather than raising an exception (which would make more sense, IMO).
The following code works for me...
import sys
import os

# Directories to search for packages
root_path_list = ("/modules", "/other_modules",)

# Make a backup of sys.path
old_sys_path = sys.path[:]

# Add all paths to sys.path first, in case one package imports from another
for root_path in root_path_list:
    sys.path.insert(0, root_path)

# Add new packages to current scope
for root_path in root_path_list:
    filenames = os.listdir(root_path)
    for filename in filenames:
        full_path = os.path.join(root_path, filename)
        if os.path.isdir(full_path):
            locals()[filename] = __import__(filename)

# Restore sys.path
sys.path[:] = old_sys_path

# Clean up locals
del sys, os, root_path_list, old_sys_path, root_path, filenames, filename, full_path
Update
Thinking about it, it might be safer to check for the presence of __init__.py rather than using os.path.isdir(), in case you have subdirectories which don't contain such a file; otherwise the __import__() will fail.
So you could change the lines...
full_path = os.path.join(root_path, filename)
if os.path.isdir(full_path):
    locals()[filename] = __import__(filename)
...to...
full_path = os.path.join(root_path, filename, '__init__.py')
if os.path.exists(full_path):
    locals()[filename] = __import__(filename)
...but it might be unnecessary.
We wound up biting the bullet and changing how we do things. The list of directories to find modules in is now passed in the Django config, and each one is added to sys.path (similar to a comment Aya mentioned, and something I had done before but wasn't too happy with). Then, for each directory inside it, we check for an __init__.py; if one exists, we attempt to treat that directory as a module to use inside the app, without the foo.bar piece.
This required some adjustment to how we interact with the modules and to how developers code their modules (they now need to use relative imports within their module instead of the full-path imports they used before), but I think this will be an easier design for developers to use long-term.
We didn't add these to INSTALLED_APPS because we do some exception handling: if we cannot import a module due to dependency issues or bad code, our software will continue running, just without that module. If they were in INSTALLED_APPS, we wouldn't be able to leverage that flexibility in when/how to deal with those exceptions.
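A rough sketch of what this ended up looking like (simplified; the real exception handling is more involved, and module_list comes from the Django config as above):
import os
import sys

for root_path in module_list:
    sys.path.insert(0, root_path)
    for name in os.listdir(root_path):
        # Only treat directories containing an __init__.py as modules
        if os.path.exists(os.path.join(root_path, name, '__init__.py')):
            try:
                __import__(name)  # importable directly, without the foo.bar prefix
            except Exception:
                pass  # skip broken modules; the rest of the app keeps running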
Thanks for all of the help!
Related
I have a Python module which uses some resources in a subdirectory of the module directory. After searching around on Stack Overflow and finding related answers, I managed to direct the module to the resources by using something like
import os
os.path.join(os.path.dirname(__file__), 'fonts/myfont.ttf')
This works fine when I call the module from elsewhere, but it breaks when I call the module after changing the current working directory. The problem is that the contents of __file__ are a relative path, which doesn't take into account the fact that I changed the directory:
>>> mymodule.__file__
'mymodule/__init__.pyc'
>>> os.chdir('..')
>>> mymodule.__file__
'mymodule/__init__.pyc'
How can I encode the absolute path in __file__, or barring that, how can I access my resources in the module no matter what the current working directory is? Thanks!
Store the absolute path to the module directory at the very beginning of the module:
package_directory = os.path.dirname(os.path.abspath(__file__))
Afterwards, load your resources based on this package_directory:
font_file = os.path.join(package_directory, 'fonts', 'myfont.ttf')
And above all, do not modify process-wide resources like the current working directory. There is rarely a real need to change the working directory in a well-written program, so avoid os.chdir().
Building on lunaryorn's answer, I keep a small function at the top of any module in which I have to build multiple paths. This saves me repeatedly typing out joins.
def package_path(*paths, package_directory=os.path.dirname(os.path.abspath(__file__))):
    return os.path.join(package_directory, *paths)
To build the path, call it like this:
font_file = package_path('fonts', 'myfont.ttf')
Or if you just need the package directory:
package_directory = package_path()
I have put the question in the figure below. [Figure: a directory tree for myProject, with script_A1 and script_B2 living in separate subfolders; script_B2 is inside folder_B.]
EDIT
The question put next to the figure is:
How do I make script_A1 import a function from script_B2?
Similar questions have been asked before, but most answers suggest adding the module/script/package (whatever) to the path variable, for example:
sys.path.append('...')
But adding the module to the path variable just feels so wrong. I do not want to alter my system in any way. When my application closes, I want my Python environment to be clean and 'untouched'. I'm afraid that adding uncontrolled modules to the path variable on my system will cause headaches later on.
Thank you for helping me out :-)
You can use the trick of adding the top folder to the path:
import sys
sys.path.append('..')
import folderB.something
You can also use imp.load_source if you prefer.
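For reference, imp.load_source loads a module straight from a file path, bypassing sys.path entirely (a minimal sketch; the relative path here is illustrative):
import imp

# Load script_B2 directly from its file; no sys.path changes needed
script_B2 = imp.load_source('script_B2', '../folderB/script_B2.py')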
I think I solved the issue.
In the following way, you can append the parent directory to sys.path. Put this at the top of script_A1:
import sys
import os
myDir = os.path.dirname(os.path.abspath(__file__))
parentDir = os.path.split(myDir)[0]
if parentDir in sys.path:
    print('parent already in path')
else:
    print('parent directory added')
    sys.path.append(parentDir)
# Now comes the rest of your script
You can verify that the parent directory myProject has indeed been added to sys.path by printing it out:
print(sys.path)
Since the parent directory myProject is now part of sys.path, you can import scripts/modules/whatever from any of its subdirectories. This is how you import script_B2 from folder_B:
import folder_B.script_B2 as script_B2
After closing your application, you can verify that your Python environment is restored to its initial state: changes to sys.path only live inside the running interpreter, so start a new one, print sys.path again, and check that the directory you had appended is gone.
Put this at the top of script_A1:
from folder_B.script_B2 import YourClass as your_class
Here is my current directory structure:
proj/
proj/__init__.py
proj/submodFolder/
proj/submodFolder/submod/
proj/submodFolder/submod/__init__.py
I'm writing a project and I would like to have import submod or even import submodFolder.submod in proj/__init__.py. However, without an __init__.py in submodFolder, this won't work.
Assume submodFolder is a git repository that I have sub-repoed (a third-party library, if you will); adding the requisite __init__.py would break the git subrepo and complicate updating libraries from their master repos.
Assuming submodFolder is an immutable git sub-repo, what is the best way to point Python down the directory tree to the module? Modifying the Python path seemed the nearest solution to me, but none of the questions already asked assumed an immutable submodFolder.
Examples welcome, note relative paths.
If you prefer not to modify the PYTHONPATH environment variable, you can modify sys.path inside proj/__init__.py. The following should work:
import sys
import os
sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'submodFolder'))
import submod
Step-by-step code with comments, so it makes a little more sense:
# get absolute path to proj/__init__.py
script_path = os.path.realpath(__file__)
# strip off the file name to get the absolute path to proj
proj_path = os.path.dirname(script_path)
# join on os.sep to get absolute path to proj/submodFolder
submod_path = os.path.join(proj_path, 'submodFolder')
# add the complete path to proj/submodFolder to sys.path
sys.path.append(submod_path)
I have a directory, let's call it Storage full of packages with unwieldy names like mypackage-xxyyzzww, and of course Storage is on my PYTHONPATH. Since packages have long unmemorable names, all of the packages are symlinked to friendlier names, such as mypackage.
Now, I don't want to rely on filesystem symbolic links to do this. Instead, I tried mucking around with sys.path and sys.modules. Currently I'm doing something like this:
import imp
imp.load_package('mypackage', 'Storage/mypackage-xxyyzzww')
How bad is it to do things this way, and is there a chance this will break in the future? One funny thing is that the imp.load_package function isn't even mentioned in the docs.
EDIT: besides not relying on symbolic links, I can't use the PYTHONPATH variable anymore.
Instead of using imp, you can assign different names to imported modules.
import mypackage_xxyyzzww as mypackage
If you then create an __init__.py file inside of Storage, you can add several of the above lines to make importing easier.
Storage/__init__.py:
import mypackage_xxyyzzww as mypackage
import otherpackage_xxyyzzww as otherpackage
Interpreter:
>>> from Storage import mypackage, otherpackage
importlib may be more appropriate, as it uses/implements the PEP302 mechanism.
Follow the DictImporter example, but override find_module to find the real filename and store it in the dict, then override load_module to get the code from the found file.
You shouldn't need to use sys.path once you've created your Storage module.
#from importlib import abc
import imp
import os
import sys

import logging
logging.basicConfig(level=logging.DEBUG)
dprint = logging.debug

class MyImporter(object):
    def __init__(self, path):
        self.path = path
        self.names = {}

    def find_module(self, fullname, path=None):
        dprint("find_module({fullname},{path})".format(**locals()))
        ml = imp.find_module(fullname, path)
        dprint(repr(ml))
        raise ImportError

    def load_module(self, fullname):
        dprint("load_module({fullname})".format(**locals()))
        return imp.load_module(fullname)
        raise ImportError

def load_storage(path, modname=None):
    if modname is None:
        modname = os.path.basename(path)
    mod = imp.new_module(modname)
    sys.modules[modname] = mod
    assert mod.__name__ == modname
    mod.__path__ = [path]
    #sys.meta_path.append(MyImporter(path))
    mod.__loader__ = MyImporter(path)
    return mod

if __name__ == "__main__":
    load_storage("arbitrary-path-to-code/Storage")
    from Storage import plain
    from Storage import mypkg
Then, when you import Storage.mypackage, Python will immediately use your importer without bothering to look on sys.path.
Update: That doesn't work. The code above does work to import ordinary modules under Storage without requiring Storage to be on sys.path, but both 3.1 and 2.6 seem to ignore the __loader__ attribute mentioned in PEP 302.
If I uncomment the sys.meta_path line, 3.1 dies with a stack overflow, and 2.6 dies with ImportError. Hmmm... I'm out of time now, but may look at it later.
Packages are just entries in the namespace. You should not name your path components with anything that is not a legal Python identifier.
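To illustrate the point: once a package is importable under some legal name, you can alias it under another name via sys.modules (a minimal sketch, assuming the directory has been renamed to the legal identifier mypackage_xxyyzzww):
import sys

import mypackage_xxyyzzww           # works only because the name is a legal identifier
sys.modules['mypackage'] = mypackage_xxyyzzww

import mypackage                    # resolves to the same module object via sys.modules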
I am writing a minimal replacement for mod_python's publisher.py.
The basic premise is that it is loading modules based on a URL scheme:
/foo/bar/a/b/c/d
Whereby /foo/ might be a directory and 'bar' is a method ExposedBar in a publishable class in /foo/index.py. Likewise /foo might map to /foo.py and bar is a method in the exposed class. The semantics of this aren't really important. I have a line:
sys.path.insert(0, path_to_file) # /var/www/html/{bar|foo}
mod_obj = __import__(module_name)
mod_obj.__name__ = req.filename
Then the module is inspected for the appropriate class/functions/methods. When the process gets as far as it can the remaining URI data, /a/b/c is passed to that method or function.
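(Illustratively, that inspection step amounts to something like the following; the class and method names here are hypothetical:)
cls = getattr(mod_obj, 'ExposedClass')   # hypothetical publishable class
method = getattr(cls(), 'bar')           # 'bar' comes from the URL
method('a', 'b', 'c', 'd')               # remaining URI segments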
This was working fine until I had /var/www/html/foo/index.py and /var/www/html/bar/index.py.
When viewing in the browser, it is fairly random which 'index.py' gets selected, even though I set the first search path to '/var/www/html/foo' or '/var/www/html/bar' and then called __import__('index'). I have no idea why it is finding either by seemingly random choice. This is shown by:
__name__ is "/var/www/html/foo/index.py"
req.filename is "/var/www/html/foo/index.py"
__file__ is "/var/www/html/bar/index.py"
The question, then, is: why would __import__ randomly select either index? I would understand this if the path was '/var/www/html', but it isn't. Secondly:
Can I load a module by its absolute path into a module object, without modifying sys.path? I can't find any docs on __import__ or new.module() for this.
Can I load a module by its absolute path into a module object, without modifying sys.path? I can't find any docs on __import__ or new.module() for this.
import imp
import os

def module_from_path(path):
    filename = os.path.basename(path)
    modulename = os.path.splitext(filename)[0]
    with open(path) as f:
        return imp.load_module(modulename, f, path, ('.py', 'U', imp.PY_SOURCE))
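For example (using a path from the question above):
mod = module_from_path('/var/www/html/foo/index.py')
print(mod.__name__)   # prints 'index'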