I want to pass arguments to the python module to help me decide whether or not to execute some part of the module initialisation code.
Suppose I have a python module named my_module
import sys

flag = sys.argv[1]
if flag:
    pass  # Do something
else:
    pass  # Do something else

def module_hello():
    print("hello")
However, I don't want a user script to interfere with the arguments. The behaviour has to be based purely on the parameters passed when the script is spawned. In the environment where this will be used, I control how the script is spawned, but the script itself is provided by a user of the module.
Say a user writes a script which imports this module:
import sys
sys.argv[1] = "Change to something unpleasant"
import my_module
I don't want the user to have control over sys.argv. The CLI arguments passed to the script should reach the module unharmed.
Is there a way to achieve this?
If you want to set some global values for a module, you should probably consider encapsulating them in a class, or setting them by calling an initialisation function, so you can pass the parameters that way.
main.py:
import my_module
my_module.init('some values')
my_module.py:
VALUES = None

def init(values):
    global VALUES
    VALUES = values
But why not simply declare some variables in the module and just set the value when you load it?
main.py:
import my_module
my_module.values = 'some values'
my_module.py:
values = None
Or if you just want to read the arguments, it's like any other script:
main.py:
import my_module
my_module.py:
import sys
values = sys.argv[1]
Of course you can get as fancy as you like, read https://docs.python.org/3/library/argparse.html
So, try reading the arguments in your module.
my_module.py
import sys
# Assign my module properties
is_debug = sys.argv[1]
I have many Python files, and sometimes I need to run them individually to check that they work. Depending on how a file is run, it needs to import its libraries in a different way. The code below illustrates the situation.
I use the else branch for files that import XLS.
XLS.py
if __name__ == "__main__":
    import config
else:
    import DataLoaders.config as config
However, I want to add the dataset.py file to the if condition.
if __name__ == "__main__" or __name__ == "dataset.py":
    import config
else:
    import DataLoaders.config as config
I tried the code above but it doesn't work. How can I add the file name to the if condition so that config is imported as required?
Frequently this is better accomplished by either of the following (though logically these end up the same, because they both split the import logic off):
- splitting off functionality into separate files and importing it where needed
- keeping all of the relevant logic together in a custom class or a method of one
An example of this using @classmethod (which largely exists to aid this sort of design) could be
class MyClass:
    @classmethod
    def from_XLS(cls, xls_path):
        import custom_xls_loader
        data = custom_xls_loader.load(xls_path)
        # call any other needed transform methods
        # opportunity to make result an instance of this class
        # (otherwise why not use a normal function)
        return cls(data)
This is used for optional support in many large libraries (where there may be a huge number of busy dependencies that a user may not want or need to install for their use case)
Here's an example from Pandas where some Apache Arrow support is optional https://github.com/pandas-dev/pandas/blob/bbb1cdf13a1e9240b43d691aa0ec3ca1b37afee4/pandas/core/arrays/arrow/dtype.py#L112
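To make the lazy-import idea concrete, here is a minimal runnable sketch. The class name Table and the constructor from_csv_text are made up for illustration, and the standard-library csv module stands in for an optional dependency; it is not how any particular library does it.

```python
class Table:
    """Hypothetical container; the name and constructors are illustrative only."""

    def __init__(self, rows):
        self.rows = rows

    @classmethod
    def from_csv_text(cls, text):
        # The imports run only when this constructor is called, so code that
        # never uses CSV input never needs the dependency to be importable.
        import csv
        import io
        return cls(list(csv.reader(io.StringIO(text))))


table = Table.from_csv_text("a,b\n1,2\n")
print(table.rows)
```

The import statement inside the method is executed at call time, which is what lets a missing optional package go unnoticed until the feature is actually requested.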
Or in your case (referring to config as self.config in the class or as the .config property of an instance)
class MyClass:
    def __init__(self, data, config=None):
        if config is None:
            import config
        self.config = config

    @classmethod
    def from_config_normal(cls, data):
        return cls(data)

    @classmethod
    def from_DataLoaders_config(cls, data):
        import DataLoaders.config as config
        return cls(data, config)
__name__ is set to the name of the module, not to whatever imports the module. So in XLS.py, its value is either "__main__", if the file is executed as a script, or "XLS", if the module is imported by someone else.
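A quick way to check this yourself is sketched below. It writes a throwaway module (the name name_demo is made up) to a temporary directory, then both imports it and runs it as a script, recording the value of __name__ each time.

```python
import importlib
import os
import runpy
import sys
import tempfile

# Throwaway module that simply records its own __name__.
tmp_dir = tempfile.mkdtemp()
module_path = os.path.join(tmp_dir, "name_demo.py")
with open(module_path, "w") as f:
    f.write("seen_name = __name__\n")

sys.path.insert(0, tmp_dir)

# Imported: __name__ is the module's own name, whoever does the importing.
imported = importlib.import_module("name_demo")
name_when_imported = imported.seen_name

# Executed as a script: __name__ is "__main__".
script_globals = runpy.run_path(module_path, run_name="__main__")
name_when_run = script_globals["seen_name"]

print(name_when_imported)
print(name_when_run)
```

Neither value ever contains the name of the importing file, which is why comparing __name__ with "dataset.py" can never succeed.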
The problem here is that config is, to some extent, a parameter for your module, whose value is determined elsewhere. (config if you execute as a script, DataLoaders.config if you import from dataset.py, maybe some other module if you import from elsewhere.)
Python, unfortunately, doesn't allow you to parameterize an import like you can a function. A workaround is to simply leave config undefined in XLS.py, and leave the caller the responsibility to set the value appropriately.
# XLS.py
...
if __name__ == "__main__":
    import config
...
# dataset.py
import XLS
import DataLoaders.config
XLS.config = DataLoaders.config
...
This works as long as XLS doesn't use config at import time; I assume it's just a name that other functions in the module may refer to when they are called.
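The same pattern can be tried without any files on disk. In the sketch below, types.ModuleType objects stand in for XLS.py and DataLoaders.config, and the setting SHEET and the function sheet_name are invented for the demonstration.

```python
import types

# Stand-in for XLS.py: its function refers to a module-level name `config`
# that the module itself never defines.
xls = types.ModuleType("XLS")
exec(
    "def sheet_name():\n"
    "    return config.SHEET\n",
    xls.__dict__,
)

# Before the caller sets config, using it fails with NameError.
try:
    xls.sheet_name()
    error_before = None
except NameError as exc:
    error_before = type(exc).__name__

# Stand-in for DataLoaders.config (SHEET is a made-up setting).
loader_config = types.ModuleType("DataLoaders.config")
loader_config.SHEET = "Sheet1"

# The caller (dataset.py in the answer) injects the dependency after import.
xls.config = loader_config
result = xls.sheet_name()
print(error_before, result)
```

The lookup of config happens each time sheet_name is called, not when the module is created, so assigning XLS.config at any point before the first call is enough.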
__file__ holds the path of the file from which the current module was loaded. You can use the function os.path.basename() to extract only the name of the file.
Here is an example:
import os
filename = os.path.basename(__file__)
print(filename)
It prints:
test.py
You can then use a condition to check the name of the file and import the desired modules.
Let's say I wanted to make a core library for a project, with functions like:
def foo(x):
    """common, useful function"""
and I want to make these functions globally available in my project so that when I call them in a file, I don't need to import them. I have a virtualenv, so I feel like I should be able to modify my interpreter to make them globally available, but I wasn't sure whether there were any established methodologies behind this. I am aware it defies some pythonic principles.
It is possible to create a custom "launcher" that sets up some global variables and executes the code in a python file:
from sys import argv

# we read the code of the file passed as the first CLI argument
with open(argv[1]) as fin:
    code = fin.read()

# just an example: this will be available in the executed python file
def my_function():
    return "World"

global_variables = {
    'MY_CONSTANT': "Hello",      # prepare a global variable
    'my_function': my_function,  # prepare a global function
}
exec(code, global_variables)  # run the file with new global variables
Use it like this: python launcher.py my_dsl_file.py.
Example my_dsl_file.py:
# notice: no imports at all
print(MY_CONSTANT)
print(my_function())
Interestingly, Python (at least CPython) uses a different way to set up some useful functions like help. It runs a file called site.py that adds some values to the builtins module.
import builtins

def my_function():
    return "World"

builtins.MY_CONSTANT = "Hello"
builtins.my_function = my_function
# run your file like above or simply import it
import <your file>
I wouldn't recommend either of these ways. A simple from <your library> import * is a much better approach.
The downside of the first two variants is that no tool will know anything about your injected globals. E.g. mypy, flake8 and all IDEs I know of will report them as undefined names.
I'm trying to dynamically import a python-based SQL query module from a sub-folder, and that folder is obtained by using the argparse module.
My project structure :
main_file.py
Data_Projects/
ITA_data/
__init__.py
sqlfile.py
UK_data/
__init__.py
sqlfile.py
Within main_file.py is the argparse setup, with the argument 'dir' containing the location of the specified directory, i.e.:
parser.add_argument('--dir', default='Data_Projects/ITA_data/', type=str,
                    help="set the data directory")
My understanding thus far is that modules should be imported at the top, and to import just one SQL query module I would use:
from Data_Projects.ITA_data import sqlfile
I know I can't set the import statement after the args have been defined, so how can I keep the format correct with the imports at the top, and yet retrospectively update this with the arguments that get defined afterwards?
Many thanks.
UPDATE
Thanks to the below answer. I've now tried to assign :
sqlfile = __import__(in_arg.dir + 'sqlfile.py')
However I'm getting the following error:
*** ModuleNotFoundError: No module named 'Data_Projects/ITA_data/sqlfile'
I've tried using things like
os.path.join(Path(__file__).resolve().parents[0], in_arg.dir + 'sqlfile')
If it helps, when I try just :
__import__('Data_Projects') - works fine
__import__('Data_Projects/ITA_data') - doesn't work - ModuleNotFound
And as a check to verify I'm not crazy:
os.path.exists('Data_Projects/ITA_Data/sqlfile.py') >>> True
os.path.exists(in_arg.dir + 'sqlfile.py') >>> True
I don't see anything wrong with
import argparse
parser = ...
parser.add_argument('data', choices=['UK', 'ITA'])
args = parser.parse_args()
if args.data == 'UK':
    import UK_data as data
elif args.data == 'ITA':
    import ITA_data as data
else ...
You could refine this with functions and __name__ etc. But a conditional import is ok, just so long as it occurs before the data module is used.
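If the list of choices grows, the if/elif chain can be replaced by a lookup table plus importlib.import_module. The following is only a sketch: the standard-library modules json and csv stand in for UK_data and ITA_data so that it runs anywhere, and the names DATA_MODULES and load_data_module are hypothetical.

```python
import argparse
import importlib

# Stand-in mapping: stdlib modules play the role of UK_data / ITA_data.
DATA_MODULES = {"UK": "json", "ITA": "csv"}


def load_data_module(argv):
    """Parse the positional 'data' argument and import the matching module."""
    parser = argparse.ArgumentParser()
    parser.add_argument("data", choices=sorted(DATA_MODULES))
    args = parser.parse_args(argv)
    return importlib.import_module(DATA_MODULES[args.data])


data = load_data_module(["ITA"])
print(data.__name__)
```

Because argparse rejects anything outside choices, the import only ever sees names that appear in the table.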
You can use the __import__(name: str) function instead of the import statement. It takes a module name (not a file path) and does the same:
# option 1
import foo as bar
# option 2
bar = __import__('foo')
If you need to import from another directory, you need to add that directory to the module search path. There are several ways to achieve that, depending on your version of Python. You can find them all in this great post:
How to import a module given the full path?
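One approach that works in Python 3.5 and later is to build the module from an explicit file path with importlib.util. The sketch below is a minimal version; the helper name load_module_from_path and the QUERY attribute are made up, and a temporary file stands in for the real sqlfile.py.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, file_path):
    """Load and return a module object from an explicit file path."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file; QUERY is a made-up attribute.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sqlfile.py")
with open(path, "w") as f:
    f.write("QUERY = 'SELECT 1'\n")

sqlfile = load_module_from_path("sqlfile", path)
print(sqlfile.QUERY)
```

This leaves sys.path untouched, which avoids accidentally shadowing other modules that happen to share a name with files in the data directory.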
The issue was resolved by using:
import os
import sys
sys.path.insert(0, os.getcwd() + "/" + in_arg.dir)
This adds the directory I want (which changes depending on the argument) to sys.path, the list of locations Python searches for modules.
From there, with grapes' help, it was a case of doing:
sqlfile = __import__('sqlfile')
And from there I could use the variable to perform the relevant SQL query.
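An alternative that keeps the package structure is to convert the directory argument into a dotted module name and hand it to importlib.import_module. This is a sketch under assumptions: the helper dir_to_package is hypothetical, and the layout is recreated in a temporary directory (with an __init__.py in each package) purely so the example can run on its own.

```python
import importlib
import os
import sys
import tempfile


def dir_to_package(dir_arg):
    """Turn 'Data_Projects/ITA_data/' into 'Data_Projects.ITA_data'."""
    parts = [p for p in dir_arg.replace("\\", "/").split("/") if p]
    return ".".join(parts)


# Build a throwaway layout similar to the one in the question.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "Data_Projects", "ITA_data")
os.makedirs(pkg_dir)
for d in (os.path.join(root, "Data_Projects"), pkg_dir):
    open(os.path.join(d, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "sqlfile.py"), "w") as f:
    f.write("QUERY = 'SELECT * FROM ita'\n")

sys.path.insert(0, root)
module_name = dir_to_package("Data_Projects/ITA_data/") + ".sqlfile"
sqlfile = importlib.import_module(module_name)
print(module_name, sqlfile.QUERY)
```

The ModuleNotFoundError in the update came from passing a slash-separated path with a .py suffix; the import machinery expects dots between package names and no file extension.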
I have different config files which are basically python files that define variables and I want to import them in my main program.
Usually I will have a "config_file1.py" and do something like:
import config_file1 as params
# and then I can access the different parameters as
params.var1
params.var2
Now I want to be able to select which config_file I want to use, so I want to pass a parameter when calling my program that tells which config file to use. Something like:
config_filename = sys.argv[1]
import config_filename as params
However, this looks for a module named "config_filename".
Instead, I want to import the file referenced by the value of config_filename.
EDIT:
Basically my program will run a set of experiments, and those experiments need a set of parameters to run.
Eg:
*** config1.py ****
num_iterations = 100
initial_value1 = 10
initial_value2 = 20
So I can run my program loading those variables into memory.
However another config file (config2.py) might have another set of parameters, so I want to be able to select which experiment I want to run by loading the desired config file.
If you really want to do this, you can use the importlib module:
import importlib
import sys
params = importlib.import_module(sys.argv[1])
Then you can use the variables like this:
params.var1
This is in response to your details and not the question.
If you want to load variables such as num_iterations = 100, initial_value1 = 10, initial_value2 = 20 from a file, then I'd really recommend some sort of config file instead of abusing imports for global variables.
JSON would be the easiest way: you load the file and get a dict straight away:
>>> import json
>>> with open(config_file1) as f:  # config_file1: path to a JSON file
...     params = json.load(f)
...
>>> params
{'num_iterations': 100, 'initial_value1': 10, 'initial_value2': 20}
Alternatively you could use ConfigParser, which looks nicer, but I've found it to be quite prone to breaking.
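For comparison, here is a minimal sketch of the same three values read with configparser (the Python 3 name of the ConfigParser module). The section name experiment and the INI text are assumptions made for the example.

```python
import configparser

# Hypothetical INI text holding the values from config1.py.
CONFIG_TEXT = """
[experiment]
num_iterations = 100
initial_value1 = 10
initial_value2 = 20
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Values are stored as strings, so convert them explicitly.
section = parser["experiment"]
params = {key: section.getint(key) for key in section}
print(params)
```

The explicit getint call is the main difference from JSON, where numbers already arrive as ints.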
You can do it like this:
import sys
config_filename = sys.argv[1]
params = __import__(config_filename)
I wouldn't recommend such a risky approach. You relinquish all control at the point where sys.argv is read, and your script can fail if any of the expected attributes doesn't exist in the imported module.
Instead, I would suggest explicitly controlling which modules are supported:
import sys
config_filename = sys.argv[1].lower()
if config_filename == 'os':
    import os as params
elif config_filename == 'shutil':
    import shutil as params
else:  # unhandled modules
    raise ImportError("Unknown module")
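The same whitelist idea can be written as a table lookup, which scales better than a chain of elif branches. This is only a sketch; ALLOWED_MODULES and load_params are hypothetical names, and os and shutil are kept as the example modules from the answer.

```python
import importlib

# Only these names may be requested; the mapping is the whitelist.
ALLOWED_MODULES = {"os": "os", "shutil": "shutil"}


def load_params(requested):
    """Import a whitelisted module by name, rejecting anything else."""
    key = requested.lower()
    if key not in ALLOWED_MODULES:
        raise ImportError(f"Unknown module: {requested!r}")
    return importlib.import_module(ALLOWED_MODULES[key])


params = load_params("shutil")
print(params.__name__)
```

In a real script the argument would come from sys.argv[1]; it is passed directly here so the example runs without command-line input.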
Instead of using the import statement, you can use the __import__ function, like this:
params = __import__('module_name')
Definition:
importlib.__import__(name, globals=None, locals=None, fromlist=(), level=0)
Reference:
https://docs.python.org/3/library/importlib.html#importlib.__import__
As described in this answer (how to import module), one can import a module located in another path this way:
import sys
sys.path.append('PathToModule')
import models.user
My question is:
How can I execute this other module (and also pass parameters to it), if this other module is set up this way:
if __name__ == '__main__':
    do_something()
and do_something() uses argparse.ArgumentParser to work with the parameters supplied?
I ADDED THE FOLLOWING AFTER THE FIRST QUESTIONS/COMMENTS CAME UP
I am able to pass the parameters via
sys.argv[1:] = [
    "--param1", "123",
    "--param2", "456",
    "--param3", "111"
]
so this topic is already covered.
Why do I want to call another module with parameters?
I would like to be able to do a kind of a small regression test for another project. I would like to get this other project via a git clone and have different versions locally available, that I can debug, too, if needed.
But I do not want to be involved too much in that other project (so that forking does not make sense).
AND SO MY REMAINING QUESTION IS
How can I tweak the contents of __name__ when calling the other module?
There are multiple ways to approach this problem.
If the module you want to import is well-written, it should have separate functions for parsing the command line arguments and for actually doing work. It should look something like this:
import argparse

def main(arg1, arg2):
    pass  # do something

def parse_args():
    parser = argparse.ArgumentParser()
    ...  # lots of code
    return vars(parser.parse_args())

if __name__ == '__main__':
    args = parse_args()
    main(**args)
In this case, you would simply import the module and then call its main function with the correct arguments:
import yourModule
yourModule.main('foo', 'bar')
This is the optimal solution.
If the module doesn't define such a main function, you can manually set sys.argv and use runpy.run_module to execute the module:
import runpy
import sys
sys.argv[1:] = ['foo', 'bar']
runpy.run_module('yourModule', run_name='__main__', alter_sys=True)
Note that this only executes the module; it doesn't import it. (I.e. the module won't be added to sys.modules and you don't get a module object that you can interact with.)
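For a regression test against a cloned project, it can help to wrap this in a helper that restores sys.argv afterwards. The sketch below uses runpy.run_path rather than run_module, so that it can point at a file directly; the script contents, the helper name run_as_main and the --param1 option are all invented for the demonstration.

```python
import os
import runpy
import sys
import tempfile

# Throwaway script standing in for the other project's entry point.
SCRIPT = '''
import argparse

def do_something():
    parser = argparse.ArgumentParser()
    parser.add_argument("--param1", type=int)
    return parser.parse_args().param1

if __name__ == "__main__":
    result = do_something()
'''

tmp_dir = tempfile.mkdtemp()
script_path = os.path.join(tmp_dir, "other_script.py")
with open(script_path, "w") as f:
    f.write(SCRIPT)


def run_as_main(path, args):
    """Run a file as __main__ with the given CLI args, then restore sys.argv."""
    saved_argv = sys.argv
    sys.argv = [path] + list(args)
    try:
        return runpy.run_path(path, run_name="__main__")
    finally:
        sys.argv = saved_argv


argv_before = list(sys.argv)
script_globals = run_as_main(script_path, ["--param1", "123"])
print(script_globals["result"])
```

run_path returns the globals of the executed script as a dict, which gives a test something to inspect even though no module object is created.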