Getting attributes from a function in another file - python

I have a main file that looks at the files within a /modules/ folder. It needs to look at every .py file and find all functions that have a specific attribute.
An example module looks like this:
def Command1_1():
    True
Command1_1.command = ['cmd1']
def Command1_2():
    True
The code I am currently using to look through each file and function is this:
for module in glob.glob('modules/*.py'):
    print(module)
    tree = ast.parse(open(module, "rt").read(), filename=PyBot.msggrp + module)
    for item in [x.name for x in ast.walk(tree) if isinstance(x, ast.FunctionDef)]:
        if item is not None:
            print(str(item))
Below is what the code produces but I cannot find a way to show if a function has a ".command" attribute:
modules/Placeholder001.py
Command1_1
Command1_2
modules/Placeholder002.py
Command2_1
Command2_2
Command2_3

The easiest way is to import each file and then look for functions in its global scope. Functions can be identified with the use of callable. Checking if a function has an attribute can be done with hasattr.
The code to import a module from a path is taken from this answer.
from pathlib import Path
import importlib.util
def import_from_path(path):
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
for module_path in Path('modules').glob('*.py'):
    module = import_from_path(module_path)
    for name, value in vars(module).items():
        if callable(value):
            has_attribute = hasattr(value, 'command')
            print(name, has_attribute)
Output:
Command1_1 True
Command1_2 False
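If importing every module is undesirable (importing runs the module's top-level code), the ast approach from the question can be extended instead. The following is only a minimal sketch, not part of the original answer: it assumes the attribute is set by a plain top-level assignment such as Command1_1.command = [...], and the helper name functions_with_attribute is hypothetical.

```python
import ast

def functions_with_attribute(source, attribute="command"):
    """Map each top-level function name to True/False depending on whether
    the source contains a top-level `name.<attribute> = ...` assignment."""
    tree = ast.parse(source)
    # Names of the functions defined at module level
    found = {node.name: False for node in tree.body
             if isinstance(node, ast.FunctionDef)}
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            # Matches assignment targets of the form `SomeFunction.command`
            if (isinstance(target, ast.Attribute)
                    and target.attr == attribute
                    and isinstance(target.value, ast.Name)
                    and target.value.id in found):
                found[target.value.id] = True
    return found

# Same content as the example module from the question
example = (
    "def Command1_1():\n"
    "    True\n"
    "Command1_1.command = ['cmd1']\n"
    "def Command1_2():\n"
    "    True\n"
)
result = functions_with_attribute(example)
```

Attributes set in any other way (inside a decorator, via setattr, or conditionally) would be missed by this static check, which is why importing the module is the more reliable option.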

Related

Python: always import the last revision in the directory

Imagine that we have the following Data Base structure with the data stored in python files ready to be imported:
data_base/
    foo_data/
        rev_1.py
        rev_2.py
    bar_data/
        rev_1.py
        rev_2.py
        rev_3.py
In my main script, I would like to import the last revision of the data available in the folder. For example, instead of doing this:
from data_base.foo_data.rev_2 import foofoo
from data_base.bar_data.rev_3 import barbar
I want to call a method:
import_from_db(path='data_base.foo_data', attr='foofoo', rev='last')
import_from_db(path='data_base.bar_data', attr='barbar', rev='last')
I could take a relative path to the database and use glob.glob to find the last revision, but for that I would need to know the path to the data_base folder, which complicates things (assume that the parent folder of data_base is in sys.path, so from data_base.*** import works).
Is there an efficient way to maybe retrieve a full path knowing only part of it (data_base.foo_data)? Other ideas?
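I think it's better to install the latest version.
That said, to stay with your approach, you can probe the revision modules one by one. getattr on the package only finds submodules that have already been imported, so importing them by name is more reliable:
import importlib
i = 1  # revisions start at rev_1
your_module = None
while True:
    try:
        candidate = importlib.import_module(f'data_base.foo_data.rev_{i}')
    except ModuleNotFoundError:
        break
    your_module = candidate
    i += 1
# Now your_module is the latest rev (None if no revision exists)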
@JohnDoriaN's idea led me to a quite simple solution:
import glob
import importlib
import os
import re
def import_from_db(import_path, attr, rev_id=None):
    """
    Import `attr` from a revision module of the package `import_path`
    into the global namespace. The last revision is used if rev_id is None.
    """
    # Import the package itself, e.g. data_base.foo_data
    db_parent = importlib.import_module(import_path)
    # Get the absolute path of the db_parent folder
    abs_path = list(db_parent.__path__)[0]
    rev_path = os.path.join(abs_path, 'rev_*.py')
    rev_names = [os.path.basename(x) for x in glob.glob(rev_path)]
    # glob returns files in arbitrary order, so sort by revision number
    rev_names.sort(key=lambda name: int(re.search(r'\d+', name).group()))
    if rev_id is None:
        revision = rev_names[-1]
    else:
        revision = rev_names[rev_id]
    revision = revision.split('.')[0]
    # import attribute
    exec(f'from {import_path}.{revision} import {attr}', globals())
Some explanations:
Apparently (I didn't know this), we can import a folder as a module; this module has a __path__ attribute (found using the built-in dir function).
glob.glob matches file names against shell-style wildcard patterns (not regular expressions). Its results are not guaranteed to be ordered, hence the sort by revision number.
importlib.import_module imports the package inside the function only, so it does not pollute the global namespace.
Using exec with globals() allows us to import into the global namespace.
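An alternative without exec, given only as a minimal sketch under stated assumptions: importlib.util.find_spec locates the package folder from its dotted name, revisions are ordered by their number, and the attribute is returned to the caller instead of being injected into globals. The helper name load_from_db and the demo_db package written to a temporary folder are hypothetical and exist only for the demonstration.

```python
import importlib
import importlib.util
import re
import sys
import tempfile
from pathlib import Path

def load_from_db(import_path, attr, rev=None):
    """Return `attr` from revision number `rev` (default: highest) of a package."""
    spec = importlib.util.find_spec(import_path)
    if spec is None or spec.submodule_search_locations is None:
        raise ImportError(f"{import_path!r} is not an importable package")
    # Map revision number -> module name, e.g. {1: 'rev_1', 2: 'rev_2'}
    revisions = {}
    for folder in spec.submodule_search_locations:
        for file in Path(folder).glob("rev_*.py"):
            match = re.fullmatch(r"rev_(\d+)", file.stem)
            if match:
                revisions[int(match.group(1))] = file.stem
    if not revisions:
        raise ImportError(f"no rev_N.py files found in {import_path!r}")
    number = max(revisions) if rev is None else rev
    module = importlib.import_module(f"{import_path}.{revisions[number]}")
    return getattr(module, attr)

# Demo: build a throwaway package with revisions 1, 2 and 10
root = Path(tempfile.mkdtemp())
package = root / "demo_db" / "bar_data"
package.mkdir(parents=True)
(root / "demo_db" / "__init__.py").write_text("")
(package / "__init__.py").write_text("")
for n in (1, 2, 10):
    (package / f"rev_{n}.py").write_text(f"barbar = {n}\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

latest = load_from_db("demo_db.bar_data", "barbar")
second = load_from_db("demo_db.bar_data", "barbar", rev=2)
```

Comparing the revision numbers as integers keeps rev_10 after rev_2, which a plain string sort would not. Note that find_spec imports the parent package in order to locate the subpackage.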

Use inspect module to grab the name of inherited object

I want to look at a file, get the names of its classes, and check whether any of them inherits from "RunConfig". So if a file has
class some_function(RunConfig):
I want to return true.
My code looks like this right now:
for file in list_of_files:
    if file in ['some_file.py']:
        for name,obj in inspect.getmembers(file):
            if inspect.isclass(obj):
                print("NAME",name,"obj",obj)
This returns objects but I don't see anything that says 'RunConfig' on it.
What am I missing here?
Thank you so much in advance!
You can do something like:
import importlib
import inspect
def is_class_inherited_in_file(file_name, class_ref):
    module = importlib.import_module(file_name.split('.')[0])
    # inspect.isclass also accepts classes that use a custom metaclass
    for _, member in inspect.getmembers(module, inspect.isclass):
        # skip class_ref itself, which the file normally imports
        if member is not class_ref and issubclass(member, class_ref):
            return True
    return False
>>> is_class_inherited_in_file('some_file.py', RunConfig)
True
Assumption:
The file is in the working directory. To import from any other directory, see the question "How to import a module given the full path?"
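If the file should not be imported at all, the base class names can also be read from the source text. This is a minimal sketch rather than part of the answer above: it only sees bases written literally in the class statement (no indirect inheritance, no aliases), and the helper name inherits_from is hypothetical.

```python
import ast

def inherits_from(source, base_name="RunConfig"):
    """Return True if any class in `source` lists `base_name` as a direct base."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for base in node.bases:
            # `class A(RunConfig)` yields ast.Name,
            # `class A(pkg.RunConfig)` yields ast.Attribute
            if isinstance(base, ast.Name) and base.id == base_name:
                return True
            if isinstance(base, ast.Attribute) and base.attr == base_name:
                return True
    return False

with_base = inherits_from("class some_function(RunConfig):\n    pass\n")
without_base = inherits_from("class other(object):\n    pass\n")
```

For a real file, the source would be read first, for example with open('some_file.py').read().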

Patching a function in a file where it is defined

I am trying to learn unittest patching. I have a single file that both defines a function, then later uses that function. When I try to patch this function, its return value is giving me the real return value, not the patched return value.
How do I patch a function that is both defined and used in the same file? Note: I did try to follow the advice given here, but it didn't seem to solve my problem.
walk_dir.py
from os.path import dirname, join
from os import walk
from json import load
def get_config():
    current_path = dirname(__file__)
    with open(join(current_path, 'config', 'json', 'folder.json')) as json_file:
        json_data = load(json_file)
    return json_data['parent_dir']
def get_all_folders():
    dir_to_walk = get_config()
    for root, dir, _ in walk(dir_to_walk):
        return [join(root, name) for name in dir]
test_walk_dir.py
from hello_world.walk_dir import get_all_folders
from unittest.mock import patch
@patch('walk_dir.get_config')
def test_get_all_folders(mock_get_config):
    mock_get_config.return_value = 'C:\\temp\\test\\'
    result = get_all_folders()
    assert set(result) == set('C:\\temp\\test\\test_walk_dir')
Try declaring the patch this way:
@patch('hello_world.walk_dir.get_config')
As you can see in this answer to the question you linked, it's recommended that your import statements match your patch statements. In your case, from hello_world.walk_dir import get_all_folders and @patch('walk_dir.get_config') don't match.
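The effect can be reproduced in a single script. This is a minimal sketch under assumptions: the in-memory module demo_walk_dir is a hypothetical stand-in for hello_world.walk_dir, with simplified function bodies, so that the example runs without any files on disk.

```python
import sys
import types
from unittest.mock import patch

# Build a stand-in module in memory (hypothetical name: demo_walk_dir)
source = '''
def get_config():
    return "real/path"

def get_all_folders():
    # get_config is looked up in this module's globals at call time
    return [get_config() + "/sub"]
'''
demo = types.ModuleType("demo_walk_dir")
exec(source, demo.__dict__)
sys.modules["demo_walk_dir"] = demo

from demo_walk_dir import get_all_folders

# The patch target is the module in which get_all_folders looks up get_config
@patch("demo_walk_dir.get_config")
def call_with_patch(mock_get_config):
    mock_get_config.return_value = "fake/path"
    return get_all_folders()

patched_result = call_with_patch()
unpatched_result = get_all_folders()
```

Inside call_with_patch the mocked return value is used; once the decorated function returns, the original get_config is restored.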

Can I handle imports in an Abstract Syntax Tree?

I want to parse and check config.py for admissible nodes.
config.py can import other config files, which also must be checked.
Is there any functionality in the ast module to parse ast.Import and ast.ImportFrom objects to ast.Module objects?
Here is a code example. I am checking a configuration file (path_to_config), but I also want to check any files that it imports:
with open(path_to_config) as config_file:
    ast_tree = ast.parse(config_file.read())
    for script_object in ast_tree.body:
        if isinstance(script_object, ast.Import):
            # Imported file must be checked too
            pass
        elif isinstance(script_object, ast.ImportFrom):
            # Imported file must be checked too
            pass
        elif not _is_admissible_node(script_object):
            raise Exception("Config file '%s' contains unacceptable statements" % path_to_config)
This is a little more complex than you think. from foo import name is a valid way of importing both an object defined in the foo module, and the foo.name module, so you may have to try both forms to see if they resolve to a file. Python also allows for aliases, where code can import foo.bar, but the actual module is really defined as foo._bar_implementation and made available as an attribute of the foo package. You can't detect all of these cases purely by looking at Import and ImportFrom nodes.
If you ignore those cases and only look at the from name, then you'll still have to turn the module name into a filename, then parse the source from the file, for each import.
In Python 2 you can use imp.find_module to get an open file object for the module (*). You want to keep the full module name around when parsing each module, because you'll need it to help you figure out package-relative imports later on. imp.find_module() can't handle package imports so I created a wrapper function:
import imp
_package_paths = {}
def find_module(module):
    # imp.find_module can't handle package paths, so we need to do this ourselves
    # returns an open file object, the filename, and a flag indicating if this
    # is a package directory with __init__.py file.
    path = None
    if '.' in module:
        # resolve the package path first
        parts = module.split('.')
        module = parts.pop()
        for i, part in enumerate(parts, 1):
            name = '.'.join(parts[:i])
            if name in _package_paths:
                path = [_package_paths[name]]
            else:
                _, filename, (_, _, type_) = imp.find_module(part, path)
                if type_ is not imp.PKG_DIRECTORY:
                    # no Python source code for this package, abort search
                    # (three values, because callers unpack three)
                    return None, None, False
                _package_paths[name] = filename
                path = [filename]
    source, filename, (_, _, type_) = imp.find_module(module, path)
    is_package = False
    if type_ is imp.PKG_DIRECTORY:
        # load __init__ file in package
        source, filename, (_, _, type_) = imp.find_module('__init__', [filename])
        is_package = True
    if type_ is not imp.PY_SOURCE:
        return None, None, False
    return source, filename, is_package
I'd also track what module names you already imported so you don't process them twice; use the name from the spec object to make sure you track their canonical names.
Use a stack to process all the modules:
with open(path_to_config) as config_file:
    # stack consists of (modulename, ast) tuples
    stack = [('', ast.parse(config_file.read()))]
seen = set()
while stack:
    modulename, ast_tree = stack.pop()
    for script_object in ast_tree.body:
        if isinstance(script_object, (ast.Import, ast.ImportFrom)):
            names = [a.name for a in script_object.names]
            from_names = []
            if hasattr(script_object, 'level'):  # ImportFrom
                from_names = names
                name = script_object.module
                if script_object.level:
                    package = modulename.rsplit('.', script_object.level - 1)[0]
                    if script_object.module:
                        # relative import: prefix the module with its package
                        name = "{}.{}".format(package, script_object.module)
                    else:
                        name = package
                names = [name]
            for name in names:
                if name in seen:
                    continue
                seen.add(name)
                source, filename, is_package = find_module(name)
                if source is None:
                    continue
                if is_package and from_names:
                    # importing from a package, assume the imported names
                    # are modules
                    names += ('{}.{}'.format(name, fn) for fn in from_names)
                    continue
                with source:
                    module_ast = ast.parse(source.read(), filename)
                stack.append((name, module_ast))
        elif not _is_admissible_node(script_object):
            raise Exception("Config file '%s' contains unacceptable statements" % path_to_config)
In case of from foo import bar imports, if foo is a package then foo/__init__.py is skipped and it is assumed that bar will be a module.
(*) imp.find_module() is deprecated for Python 3 code. On Python 3 you would use importlib.util.find_spec() to get the module loader spec, and then use the ModuleSpec.origin attribute to get the filename. importlib.util.find_spec() knows how to handle packages.
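For Python 3, the lookup described in the footnote could look roughly like the following. This is a minimal sketch, not a drop-in replacement for the wrapper above: the helper name find_module_source is hypothetical, and it returns the file name rather than an open file object. Be aware that importlib.util.find_spec imports the parent packages of a dotted name in order to locate the submodule.

```python
import importlib.util

def find_module_source(name):
    """Return (filename, is_package) for a pure-Python module, else (None, False)."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # parent package could not be imported, or the name is malformed
        return None, False
    if spec is None or not spec.has_location or not spec.origin.endswith(".py"):
        # not found, built-in, namespace package or extension module:
        # there is no single Python source file to parse
        return None, False
    # Packages have a list of search locations; plain modules have None
    is_package = spec.submodule_search_locations is not None
    return spec.origin, is_package

json_file, json_is_package = find_module_source("json")
decoder_file, decoder_is_package = find_module_source("json.decoder")
sys_file, sys_is_package = find_module_source("sys")
```

For a package, the returned file name is its __init__.py; the source can then be passed to ast.parse exactly as in the stack loop.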

Get full module path to class

I'm using a wrapper to dock a Qt window in a program. The issue is the wrapper takes any window class as a str, then initialises it later, making it not very pythonic if you have split things over multiple files (see the original edit for an idea of how it works, not important to the question though).
For example:
import module.submodule.MainWindow: The path is "module.submodule.MainWindow"
import module.submodule.MainWindow as MW: The path is "MW"
from module import submodule: The path is "submodule.MainWindow"
My current workaround is module.MainWindow.show(namespace="module.MainWindow"), but I would prefer not to have to provide it as an argument.
I tried making a function to parse globals() to find the path, and it does work quite nicely at the top level. However, I found out that globals() is unique to each import, so it does nothing when called from within my template class.
from types import ModuleType
import inspect
import site
import sys
site_packages_loc = site.getsitepackages()
default_modules = set([sys.modules[i] for i in set(sys.modules.keys()) & set(sys.builtin_module_names)])
def get_class_namespace(search, globals_dict=None, path=[]):
    if globals_dict is None:
        globals_dict = globals()
    #Find all objects that are modules
    modules = {}
    for k, v in globals_dict.items():
        if v == search:
            return '.'.join(path + [v.__name__])
        if isinstance(v, ModuleType) and v not in default_modules:
            modules[k] = v
    #Check recursively in each module
    for k, v in modules.items():
        #Check it's not a built in module
        module_path = inspect.getsourcefile(v)
        if any(module_path.startswith(i) for i in site_packages_loc):
            continue
        module_globals = {func: getattr(v, func) for func in dir(v)}
        return_val = get_class_namespace(search, module_globals, path=path+[k])
        if return_val is not None:
            return return_val
Is there a better way to go about this, or could I somehow request the very top level globals() from within an import?
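One possible direction, which is not from the original thread: a class already records where it was defined, so its canonical dotted path can be built from __module__ and __qualname__ regardless of how the caller imported it. This is a minimal sketch assuming Python 3 and a class defined at module level; the helper names class_path and resolve_class are hypothetical, and json.JSONDecoder merely stands in for a window class.

```python
import importlib
from json import JSONDecoder  # stand-in for a window class

def class_path(cls):
    """Dotted path of the module that defines cls, plus the class name."""
    return f"{cls.__module__}.{cls.__qualname__}"

def resolve_class(path):
    """Import the defining module and fetch the class back from the string."""
    module_name, _, class_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

path = class_path(JSONDecoder)
same_class = resolve_class(path) is JSONDecoder
```

The string names the defining module (here json.decoder), not the alias used at the import site, so it stays valid no matter which globals() the call comes from.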
