Find all subclasses of a class in various single-file Python scripts - python

I have a single folder ('root') which contains various subfolders and *.py files. 'root' is not a package just a folder with different scripts. I'm trying to find all scripts in 'root' that contain classes that inherit from a specific baseclass. The goal is to present the user with a specific selection of scripts to choose from.
The most obvious way to find subclasses is to use __subclasses__() which however requires all classes to be loaded (i.e. calling __subclasses__() on a class will only find those subclasses that are currently loaded), which isn't all that great when you have a folder full of individual scripts.
I tried to search for all *.py files in 'root' and then import them one by one using importlib, which is not only painfully slow (as you might imagine); I didn't even get it to work properly.
The baseclass is defined as MyBaseClass(with_metaclass(abc.ABCMeta, SomeOtherClass))
I know very little about abc so I'm not sure if it provides any functionality that could help. (I'm looking into this right now)
Java has a neat Lookup package (http://bits.netbeans.org/7.4/javadoc/org-openide-util-lookup/org/openide/util/Lookup.html) which allows you to tag classes and later lookup all classes with a certain tag. (It usually does its lookup within the confines of a jar, so not exactly the same thing)
My question is: Is there any decent way to get this to work?
EDIT
originally posted as a comment, put here for better visibility:
"Those scripts are different launch scripts & producers for a simulation. They all inherit from 1 of only a few baseclasses. The baseclass can be changed if needed. So I was looking for a way to gather all scripts that inherit from a specific baseclass to present a choice of specific scripts."
The code I currently use to crawl a directory and import all *.py files:
import os
import sys

def all_files(directory):
    # Walk the tree and yield the full path of every *.py file.
    for path, dirs, files in os.walk(directory):
        for f in files:
            if f.endswith('.py'):
                yield os.path.join(path, f)

# Found here:
# https://stackoverflow.com/questions/3137731/is-this-correct-way-to-import-python-scripts-residing-in-arbitrary-folders
def import_from_absolute_path(fullpath, global_name=None):
    script_dir, filename = os.path.split(fullpath)
    script, ext = os.path.splitext(filename)
    sys.path.insert(0, script_dir)
    try:
        module = __import__(script)
        if global_name is None:
            global_name = script
        globals()[global_name] = module
        sys.modules[global_name] = module
    except ModuleNotFoundError as mnf:
        print(mnf)
    except ImportError as ie:
        print(ie)
    except FileNotFoundError as fnf:
        print(fnf)
    finally:
        del sys.path[0]

Related

Load module to invoke its decorators

I have a program consisting of several modules specifying the respective web application handlers and one specifying the respective router.
The library I use can be found here.
Excerpt from webapp.service (there are more such modules):
from webapp.router import ROUTER

@ROUTER.route('/service/[id:int]')
class ServicePermissions(AuthenticatedService):
    """Handles service permissions."""

    NODE = 'services'
    NAME = 'services manager'
    DESCRIPTION = 'Manages services permissions'
    PROMOTE = False
webapp.router:
ROUTER = Router()
When I import the webapp.router module, the webapp.service module obviously does not run. Hence, the @ROUTER.route('/service/[id:int]') decorator is not run and my web application fails with the message that the respective route is not available.
What is the best practice in that case to run the code in webapp.service to "run" the decorators? I do not really need to import the module itself or any of its members.
As stated in the comments for the question, you simply have to import the modules. As for linter complaints, those are the lesser of your problems. Linters are there to help - if they get in the way, just don't listen to them.
So, the simple way just to get your things working is, at the end of your __main__.py or __init__.py, depending on your app structure, to import explicitly all the modules that make use of the view decorator.
If you have a linter, check how to silence it on the import lines - that is usually accomplished with a special comment on the import line.
Python's introspection is fantastic, but it can't find instances of a class, or subclasses, if those are defined in modules that are not imported: such a module is just a text file sitting on the disk, like any data file.
What some frameworks offer as an approach is to have a "discovery" utility that will silently import all "py" files in the project folders. That way your views can "come into existence" without explicit imports.
You could use a function like:
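import importlib.util
import os

def discover(caller_file):
    caller_path = os.path.abspath(caller_file)
    for current, folders, files in os.walk(os.path.dirname(caller_path)):
        # Prune cache folders in place so os.walk never descends into them.
        folders[:] = [f for f in folders if f != "__pycache__"]
        for file in files:
            path = os.path.join(current, file)
            if not file.endswith(".py") or path == caller_path:
                continue  # skip non-Python files and the calling module itself
            # __import__ takes a module name, not a file path, so load by location.
            spec = importlib.util.spec_from_file_location(file[:-3], path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)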
And call it on your main module with discover(__file__)
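To illustrate why the import matters, here is a minimal sketch of a registering decorator. The Router class below is a hypothetical stand-in, not the library used in the question, and exec() on a source string merely simulates what importing webapp.service would do.

```python
class Router:
    """Hypothetical stand-in for the real router; it only stores routes."""

    def __init__(self):
        self.routes = {}

    def route(self, path):
        def decorator(handler):
            # Registration happens here, i.e. only when the decorated
            # definition is actually executed.
            self.routes[path] = handler
            return handler
        return decorator


ROUTER = Router()
routes_before = dict(ROUTER.routes)  # nothing registered yet

# Executing this source stands in for "import webapp.service".
SERVICE_SOURCE = '''
@ROUTER.route('/service/[id:int]')
class ServicePermissions:
    """Handles service permissions."""
'''
exec(SERVICE_SOURCE, {"ROUTER": ROUTER})

registered = ROUTER.routes['/service/[id:int]'].__name__
print(routes_before, registered)
```

The route table stays empty until the module body has run, which is why some form of import, explicit or via discovery, is unavoidable.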

Importing Python files that are in a folder

I've been working on a project that creates its own .py files that store handlers for the method, I've been trying to figure out how to store the Python files in folder and open them. Here is the code I'm using to create the files if they don't already exist, then importing the file:
if os.path.isfile("Btn"+ str(self.ButtonSet[self.IntBtnID].IntPID) +".py") == False:
    TestPy = open("Btn"+ str(self.ButtonSet[self.IntBtnID].IntPID) +".py","w+")
    try:
        TestPy.write(StrHandler)
    except Exception as Error:
        print(Error)
    TestPy.close()
self.ButtonSet[self.IntBtnID].ImpHandler = __import__("Btn" + str(self.IntBtnID))
self.IntBtnID += 1
when I change this line:
self.ButtonSet[self.IntBtnID].ImpHandler = __import__("Btn" + str(self.IntBtnID))
to this:
self.ButtonSet[self.IntBtnID].ImpHandler = __import__("Buttons\\Btn" + str(self.IntBtnID))
the file can't be found and the import ends up throwing an error because it can't find the file in the folder.
I do know why it doesn't work; I just don't know how to get around the issue :/
My question is: how do I open the .py file when it's stored in a folder?
There are a couple of unidiomatic things in your code that may be the cause of your issue. First of all, it is generally better to use the functions in os.path to manipulate paths to files. From your backslash usage, it appears you're working on Windows, but the os.path module ensures consistent behaviour across all platforms.
Also there is importlib.import_module, which is usually recommended over __import__. Furthermore, in case you want to load the generated module more than once during the lifetime of your program, you have to do that explicitly using importlib.reload (imp.reload on older Python versions).
One last tip: I'd factor out the module path to avoid having to change it in more than one place.
You can't reference a path directory when you are importing files. Instead, you want to add the directory to your path and then import the name of the module.
import sys
sys.path.append("Buttons")
__import__("Btn" + str(self.IntBtnID))
See this SO question for more information.
The first argument to the __import__() function is the name of the module, not a path to it. Therefore I think you need to use:
self.ButtonSet[self.IntBtnID].ImpHandler = __import__("Buttons.Btn" + str(self.IntBtnID))
You may also need to put an empty __init__.py file in the Buttons folder to indicate it's a package of modules.
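A detail worth checking with the dotted form: __import__("Buttons.Btn1") returns the top-level Buttons package rather than the submodule, whereas importlib.import_module returns the submodule itself. The sketch below builds a throwaway Buttons package in a temporary folder to show the difference; the handler function and the Btn1 file content are hypothetical.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create Buttons/__init__.py and Buttons/Btn1.py.
    package_dir = os.path.join(root, "Buttons")
    os.mkdir(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, "Btn1.py"), "w") as fh:
        fh.write("def handler():\n    return 'clicked'\n")

    sys.path.insert(0, root)          # the folder *containing* the package
    importlib.invalidate_caches()     # the files were created after startup
    try:
        handler_module = importlib.import_module("Buttons.Btn" + str(1))
        top_level = __import__("Buttons.Btn" + str(1))
    finally:
        sys.path.remove(root)

print(handler_module.__name__)  # the submodule
print(top_level.__name__)       # the top-level package
print(handler_module.handler())
```

With __import__, the handler would have to be reached as top_level.Btn1.handler, which is one reason import_module tends to be the more convenient call here.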

Configuration variables for a collection of scripts in Python

I have a collection of scripts written in Python. Each of them can be executed independently. However, most of the time they should be executed one after the other, so there is a MainScript.py which calls them in the appropriate order. Each script has some configurable variables (let's call them Root_Dir, Data_Dir and LinWinFlag). If this collection of scripts is moved to a different computer, or different data needs to be processed, these variable values need to be changed. As there are many scripts this duplication is annoying and error-prone. I would like to group all configuration variables into a single file.
I tried making Config.py which would contain them as per this thread, but import Config produces ImportError: No module named Config because they are not part of a package.
Then I tried relying on variable inheritance: define them once in MainScript.py which calls all the others. This works, but I realized that each script would not be able to run on its own. To solve this, I tried adding useGlobal=True in MainScript.py and in other files:
if (useGlobal is None or useGlobal==False):
    # define all variables
But this fails when scripts are run standalone: NameError: name 'useGlobal' is not defined. The workaround is to define useGlobal and set it to False when running the scripts independently of MainScript.py. Is there a more elegant solution?
The idea is that Python wants to access files, including Config.py, primarily as part of a package.
The nice thing is that Python makes building packages really easy: you initialize one by creating an __init__.py file in each directory you want as a package, a subpackage, a sub-subpackage, and so on.
So your import should go through if you have created this file.
If you have further questions, look at the excellent python documentation.
The best way to do this is to use a configuration file placed in your home directory (~/.config/yourscript/config.json).
You can then load the file on start and provide default values if the file does not exist:
Example (config.py):
import json
import os

default_config = {
    "name": "volnt",
    "mail": "oh@hi.com"
}

def load_settings():
    settings = dict(default_config)  # copy so the defaults are not mutated
    # open() does not expand "~", so resolve the home directory explicitly.
    config_path = os.path.expanduser("~/.config/yourscript/config.json")
    try:
        with open(config_path, "r") as config_file:
            settings.update(json.load(config_file))
    except IOError:  # file does not exist
        pass
    return settings
For a configuration file it's a good idea to use json and not python, because it makes it easy to edit for people using your scripts.
As suggested by cleros, the ConfigParser module seems to be the closest thing to what I wanted (a one-line statement in each file which would set up multiple variables).
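A minimal sketch of what that could look like with the standard configparser module (named ConfigParser in Python 2). The section name "settings", the default values, the file name config.ini and the helper load_config are all assumptions made for illustration; only the three variable names come from the question.

```python
import configparser
import os
import tempfile

# Hypothetical defaults, used when the file or an option is missing.
DEFAULTS = {"Root_Dir": ".", "Data_Dir": "data", "LinWinFlag": "Lin"}


def load_config(path):
    """Return the settings as a dict, with file values overriding DEFAULTS."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_dict({"settings": DEFAULTS})
    parser.read(path)  # a missing file is silently ignored
    return dict(parser["settings"])


with tempfile.TemporaryDirectory() as folder:
    ini_path = os.path.join(folder, "config.ini")
    with open(ini_path, "w") as fh:
        fh.write("[settings]\nData_Dir = /srv/data\n")
    config = load_config(ini_path)
    fallback = load_config(os.path.join(folder, "missing.ini"))

print(config)
print(fallback)
```

Each script would then need just one call such as config = load_config(path) and could still run on its own, since every value has a default. All values come back as strings.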

Plugin design question

My program is broken down into two parts: the engine, which deals with user interface and other "main program" stuff, and a set of plugins, which provide methods to deal with specific input.
Each plugin is written in its own module, and provides a function that will allow me to send and retrieve data to and from the plugin.
The name of this function is the same across all plugins, so all I need is to determine which one to call and then the plugin will handle the rest.
I've placed all of the plugins in a sub-folder, wrote an __init__.py that imports each plugin, and then I import the folder (I think it's called a package?)
Anyway, currently I explicitly tell it what to import (which is basically "import this", "import that"). Is there a way for me to write it so that it will import everything in that folder that is a plug-in, so that I can add additional plugins without having to edit the init file?
Here is the code I use to do this:
import pkgutil

def _loadPackagePlugins(package):
    "Load plugins from a specified package."
    ppath = package.__path__
    pname = package.__name__ + "."
    for importer, modname, ispkg in pkgutil.iter_modules(ppath, pname):
        # A non-empty fromlist makes __import__ return the submodule itself.
        module = __import__(modname, fromlist = "dummy")
The main difference from Jakob's answer is that it uses pkgutil.iter_modules instead of os.listdir. I used to use os.listdir and changed to doing it this way, but I don't remember why. It might have been that os.listdir failed when I packaged my app with py2exe and py2app.
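For a self-contained illustration of the pkgutil approach, the sketch below builds a throwaway package with two plugins and loads them into a dict. The package name demo_plugins, the plugin names, the shared entry point handle and the helper load_package_plugins are hypothetical, and importlib.import_module is used here in place of __import__.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def load_package_plugins(package):
    """Import every module found directly inside package, keyed by short name."""
    plugins = {}
    prefix = package.__name__ + "."
    for finder, modname, ispkg in pkgutil.iter_modules(package.__path__, prefix):
        plugins[modname[len(prefix):]] = importlib.import_module(modname)
    return plugins


with tempfile.TemporaryDirectory() as root:
    # Two plugins that expose the same function name, as in the question.
    package_dir = os.path.join(root, "demo_plugins")
    os.mkdir(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    bodies = {"upper": "return data.upper()", "reverse": "return data[::-1]"}
    for name, body in bodies.items():
        with open(os.path.join(package_dir, name + ".py"), "w") as fh:
            fh.write("def handle(data):\n    " + body + "\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("demo_plugins")
        plugins = load_package_plugins(package)
    finally:
        sys.path.remove(root)

results = {name: plugin.handle("abc") for name, plugin in sorted(plugins.items())}
print(results)
```

Dropping a new file into the package folder is then enough for it to show up in plugins, with no edit to __init__.py.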
You could always have a dict called plugins, use __import__ to import the modules and store them that way.
e.g.
plugins = {}
# Assumes the 'plugins' folder is on sys.path so the names can be imported.
for plugin in os.listdir('plugins'):
    name, ext = os.path.splitext(plugin)  # strip the '.py' extension
    if ext == '.py':
        plugins[name] = __import__(name)
This is assuming that every plugin is a single file. Personally I would go with something that looks in each folder for a __run__.py file; like an __init__.py in a package, it would indicate a plugin. That code would look more like this:
for root, dirs, files in os.walk('.'):
    for dir in dirs:
        if "__run__.py" in os.listdir(os.path.join(root, dir)):
            plugins[dir] = __import__(dir)
Code written without testing. YMMV

Is this correct way to import python scripts residing in arbitrary folders?

This snippet is from an earlier answer here on SO. It is about a year old (and the answer was not accepted). I am new to Python and I am finding the system path a real pain. I have a few functions written in scripts in different directories, and I would like to be able to import them into new projects without having to jump through hoops.
This is the snippet:
def import_path(fullpath):
    """ Import a file with full path specification. Allows one to
        import from anywhere, something __import__ does not do.
    """
    path, filename = os.path.split(fullpath)
    filename, ext = os.path.splitext(filename)
    sys.path.append(path)
    module = __import__(filename)
    reload(module) # Might be out of date
    del sys.path[-1]
    return module
It's from here:
How to do relative imports in Python?
I would like some feedback as to whether I can use it or not - and if there are any undesirable side effects that may not be obvious to a newbie.
I intend to use it something like this:
script1 = import_path("/home/pydev/path1/script1.py")
script1.func1()
etc
Is it 'safe' to use the function in the way I intend to?
The "official" and fully safe approach is the imp module of the standard Python library.
Use imp.find_module to find the module on your precisely-specified list of acceptable directories -- it returns a 3-tuple (file, pathname, description) -- if unsuccessful, file is actually None (but it can also raise ImportError so you should use a try/except for that as well as checking if file is None:).
If the search is successful, call imp.load_module (in a try/finally to make sure you close the file!) with the above three arguments after the first one which must be the same name you passed to find_module -- it returns the module object (phew;-).
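Note that the imp module was deprecated in Python 3.4 and removed in Python 3.12. On current versions, a similar find-then-load sequence can be sketched with importlib.util instead. This is a swapped-in technique rather than the answer's original code, and the demo file script1.py with its func1 is hypothetical.

```python
import importlib.util
import os
import tempfile


def import_path(fullpath):
    """Load a module from an explicit file path without touching sys.path."""
    name = os.path.splitext(os.path.basename(fullpath))[0]
    # Step 1: describe where the module lives (roughly imp.find_module).
    spec = importlib.util.spec_from_file_location(name, fullpath)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load a module from %r" % fullpath)
    # Step 2: create the module object and run its code (roughly imp.load_module).
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "script1.py")
    with open(script_path, "w") as fh:
        fh.write("def func1():\n    return 42\n")
    script1 = import_path(script_path)

print(script1.func1())
```

A module loaded this way is not registered in sys.modules, so a later plain import of the same name elsewhere would not find it unless it is stored there explicitly.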
As mentioned, please consider thread safety, if appropriate. I prefer something closer to a solution posted in a similar post. The main differences below: the use of insert to specify priority of the import, correct restoration of sys.path using try...finally, and setting the global namespace.
# inspired by Alex Martelli's solution to
# http://stackoverflow.com/questions/1096216/override-namespace-in-python/1096247#1096247
def import_from_absolute_path(fullpath, global_name=None):
    """Dynamic script import using full path."""
    import os
    import sys
    script_dir, filename = os.path.split(fullpath)
    script, ext = os.path.splitext(filename)
    sys.path.insert(0, script_dir)
    try:
        module = __import__(script)
        if global_name is None:
            global_name = script
        globals()[global_name] = module
        sys.modules[global_name] = module
    finally:
        del sys.path[0]
It does feel like a bit of a hack, but at the moment, I can't think of any unintended side effects that are likely to occur, at least not as long as you're just using this for your own scripts. Basically what it does is temporarily add the parent directory of the specified file (in your example, /home/pydev/path1/) to the list of paths that Python checks when it's looking for a module to import.
The only risk I can think of right now would arise in a multithreaded environment, where two or more threads (or processes) are running this function simultaneously. If thread A wants to import module A from path dirA/A.py, and thread B wants to import module B from path dirB/B.py, you'd wind up with both dirA and dirB in sys.path for a short time. And if there is a file named B.py in dirA, it's possible that thread B will find that (dirA/B.py) instead of the file it's looking for (dirB/B.py), thus importing the wrong module. For this reason, I wouldn't use it in production code, or code that you're going to distribute to other people (at least not without warning them that this hack is in here!). In a situation like that, you could write a more complex function that allows you to specify the file to import without messing with the standard set of paths. (That's what mod_python does, for example)
I would be worried that your script name might correspond with a module that shows up earlier in the path. To dispel this fear, I would fully replace the path with a new list containing just the directory containing the module, then put it back once the import has completed. Also, you should wrap this in some sort of lock so that multiple threads trying to do the same thing don't interfere with each other.
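The last suggestion (replace the path entirely, restore it afterwards, and guard the whole thing with a lock) could be sketched roughly as follows. The names import_isolated and _IMPORT_LOCK and the demo file are hypothetical. Two caveats apply: the lock only protects against other callers of this same function, not against threads performing ordinary imports, and while the path is replaced the script cannot import anything from other path-based locations that is not already loaded.

```python
import importlib
import os
import sys
import tempfile
import threading

_IMPORT_LOCK = threading.Lock()  # serialises every sys.path change made here


def import_isolated(fullpath):
    """Import a script with sys.path temporarily reduced to its own folder."""
    script_dir, filename = os.path.split(fullpath)
    name = os.path.splitext(filename)[0]
    with _IMPORT_LOCK:
        saved_path = sys.path[:]
        sys.path[:] = [script_dir]  # only the script's folder is searched
        try:
            importlib.invalidate_caches()
            return importlib.import_module(name)
        finally:
            sys.path[:] = saved_path  # restore even if the import fails


with tempfile.TemporaryDirectory() as folder:
    script = os.path.join(folder, "isolated_demo_script.py")
    with open(script, "w") as fh:
        fh.write("VALUE = 7\n")
    path_before = list(sys.path)
    module = import_isolated(script)
    path_restored = sys.path == path_before

print(module.VALUE, path_restored)
```

Assigning through sys.path[:] keeps the same list object, so other code holding a reference to sys.path sees the restored contents afterwards.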
