I wrote a machine learning script that I want to control from the command line. I have already parsed all the options, for example --optimize 400 to perform 400 iterations over a RandomizedSearchCV grid.
However, I'm struggling with one thing: I want to choose my estimator, for example GradientBoostingRegressor() or Lasso(), from the command line. I tried two things:
def cli():
    p = arg.ArgumentParser(description="Perform ML regression.")
    p.add_argument("-e", "--estimator", default=Lasso(), help="Choose between Lasso() and GradientBoostingRegressor()")
    return p.parse_args()
args = cli()
estimator = args.estimator
But when I try to open the program with:
python script.py -e GradientBoostingRegressor()
I get errors because of the "()". Without the "()" it doesn't work either, because the value is interpreted as a string.
Another thing I tried is:
def cli():
    p = arg.ArgumentParser(description="Perform ML regression.")
    group = p.add_mutually_exclusive_group()
    group.add_argument("-SVR", nargs='?', default=SVR(),
                       help="Choose Support Vector Regression")
    group.add_argument("-GBR", nargs='?', default=GradientBoostingRegressor())
    return p.parse_args()
args = cli()
But now I don't know how to access the estimator. Also, it seems that when I call the program like this:
python script.py -SVR
the namespace "args" holds SVR=None and GBR=GradientBoostingRegressor(default_GBR_options), which is exactly the opposite of what I want.
Ideally I could choose between -SVR and -GBR on the command line, the parser would pass the choice along just like my other options, and I could initialize an object like this:
estimator = args.estimator
I hope anybody has some information on how to do that.
Thank you very much!
Arguments are always just strings. You need to parse the string to get a function which you can call.
import argparse
def Lasso():
    print("Lasso!")
def GradientBoostingRegressor():
    print("GradientBoostingRegressor!")
class GetEstimator(argparse.Action):
    # Map the string typed on the command line to the callable it names
    estimators = {
        "Lasso": Lasso,
        "GBR": GradientBoostingRegressor,
    }
    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, self.estimators[values])
p = argparse.ArgumentParser()
p.add_argument("-e", "--estimator", action=GetEstimator, default=Lasso, choices=GetEstimator.estimators.keys())
args = p.parse_args()
args.estimator()
(This replaces a previous answer that used the type parameter to map a string argument to a function. I misunderstood how type and choices interact.)
While @chepner's use of type is a nice use of argparse, the approach can be difficult to get right and understand.
As written it raises an error in the add_argument method:
Traceback (most recent call last):
File "stack50799294.py", line 18, in <module>
p.add_argument("-e", "--estimator", type=estimators.get, default=Lasso, choices=estimators.keys())
File "/usr/lib/python3.6/argparse.py", line 1346, in add_argument
type_func = self._registry_get('type', action.type, action.type)
File "/usr/lib/python3.6/argparse.py", line 1288, in _registry_get
return self._registries[registry_name].get(value, default)
TypeError: unhashable type: 'dict'
It's testing that the type parameter is either an item in the registry, or that it's a valid function. I'm not sure why it's raising this error.
def mytype(astr):
    return estimators.get(astr)
works better as type=mytype. But there's a further level of difficulty: choices is the keys(), which are strings, while mytype returns a class, producing an error like:
0942:~/mypy$ python3 stack50799294.py -e GBR
usage: stack50799294.py [-h] [-e {Lasso,GBR}]
stack50799294.py: error: argument -e/--estimator: invalid choice: <class '__main__.GradientBoostingRegressor'> (choose from 'Lasso', 'GBR')
To avoid those difficulties, I'd suggest keeping the argument-to-class mapping separate from the parser. This should be easier to understand and to expand:
import argparse
class Lasso():pass
class GradientBoostingRegressor():pass
# Map an easy-to-type string to each function
estimators = {
'Lasso': Lasso,
'GBR': GradientBoostingRegressor
}
p = argparse.ArgumentParser(description="Perform ML regression.")
p.add_argument("-e", "--estimator", default='Lasso', choices=estimators.keys())
args = p.parse_args()
print(args)
estimator = estimators.get(args.estimator, None)
if estimator is not None:
    print(type(estimator()))
samples:
0946:~/mypy$ python3 stack50799294.py -e GBR
Namespace(estimator='GBR')
<class '__main__.GradientBoostingRegressor'>
0946:~/mypy$ python3 stack50799294.py
Namespace(estimator='Lasso')
<class '__main__.Lasso'>
0946:~/mypy$ python3 stack50799294.py -e Lasso
Namespace(estimator='Lasso')
<class '__main__.Lasso'>
0946:~/mypy$ python3 stack50799294.py -e lasso
usage: stack50799294.py [-h] [-e {Lasso,GBR}]
stack50799294.py: error: argument -e/--estimator: invalid choice: 'lasso' (choose from 'Lasso', 'GBR')
const parameter
You could use store_const to choose between 2 classes, a default and a const:
parser.add_argument('-e', action='store_const', default=Lasso(), const=GradientBoostingRegressor())
but that can't be generalized to more. nargs='?' provides a 3-way choice: default, const, and user-provided. But there's still the problem of converting the command-line string to a class object.
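The 3-way choice that nargs='?' gives can be seen in a minimal sketch. The option name -e and the strings "Lasso", "GBR", and "SVR" are only placeholders here, and the values stay plain strings, so the mapping to a class would still have to happen afterwards:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "-e",
    nargs="?",
    default="Lasso",  # used when -e is absent
    const="GBR",      # used when -e is given without a value
)

absent = parser.parse_args([]).e
flag_only = parser.parse_args(["-e"]).e
with_value = parser.parse_args(["-e", "SVR"]).e
print(absent, flag_only, with_value)
```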
Subparsers, https://docs.python.org/3/library/argparse.html#sub-commands, shows how set_defaults can be used to set functions or classes. But to use this you have to define a subparser for each choice.
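A rough sketch of the subparser variant, assuming stand-in classes in place of the real scikit-learn estimators and hypothetical subcommand names "lasso" and "gbr". Each subparser uses set_defaults to attach its class to the namespace:

```python
import argparse

# Stand-ins for the real estimators (assumption for this sketch)
class Lasso:
    pass

class GradientBoostingRegressor:
    pass

parser = argparse.ArgumentParser(description="Perform ML regression.")
subparsers = parser.add_subparsers(dest="command")

# One subparser per choice; set_defaults stores the class itself
lasso_parser = subparsers.add_parser("lasso")
lasso_parser.set_defaults(estimator=Lasso)

gbr_parser = subparsers.add_parser("gbr")
gbr_parser.set_defaults(estimator=GradientBoostingRegressor)

args = parser.parse_args(["gbr"])
estimator = args.estimator()
print(type(estimator).__name__)
```

The cost is visible even with two choices: every new estimator needs its own add_parser and set_defaults pair.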
In general it's better to start with the simple argparse approach, accepting strings and string choices, and doing the mapping after. Using more argparse features can come later.
get error
@chepner's error has something to do with how Python views the d.get method. Even though it looks like a method, it's somehow seeing the dict rather than the method:
In [444]: d = {}
In [445]: d.get(d.get)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-445-c6d679ba4e8d> in <module>()
----> 1 d.get(d.get)
TypeError: unhashable type: 'dict'
In [446]: def foo(astr):
     ...:     return d.get(astr)
     ...:
     ...:
In [447]: d.get(foo)
That could be viewed as a basic Python bug or an argparse bug, but a user-defined function or lambda is an easy workaround.
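For illustration, a user-defined type function might look like the sketch below. The name estimator_type and the stand-in classes are assumptions. It leaves out choices, so the string/class mismatch described earlier does not arise, and it relies on argparse passing a string default through type as well:

```python
import argparse

# Stand-ins for the real estimators (assumption for this sketch)
class Lasso:
    pass

class GradientBoostingRegressor:
    pass

estimators = {"Lasso": Lasso, "GBR": GradientBoostingRegressor}

def estimator_type(name):
    """Convert a command-line string to the class it names."""
    try:
        return estimators[name]
    except KeyError:
        raise argparse.ArgumentTypeError(
            f"unknown estimator {name!r} (choose from {', '.join(estimators)})"
        )

p = argparse.ArgumentParser()
# A string default is converted by the type function too
p.add_argument("-e", "--estimator", type=estimator_type, default="Lasso")

args = p.parse_args(["-e", "GBR"])
print(args.estimator)
```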
Maybe you could separate input from functionality.
First collect the input from the user, then according to the input run the functionality that the user requested.
For example, if the user ran:
python script.py -e GradientBoostingRegressor
you would do:
if args.estimator == "GradientBoostingRegressor":
    do stuff...
Related
I run my bash script my_file.sh in a python file as follows:
import subprocess
def rest_api():
    params = {
        'query': 'indepedence day',
        'formats': '["NEWSPAPER"]',
    }
    subprocess.call(['bash',
                     'my_file.sh',
                     f'QUERY={params.get("query")}',
                     f'DOC_TYPE={params.get("formats")}',
                     f'LANGUAGE={params.get("lang")}',  # returns None!
                     ])
if __name__ == '__main__':
    rest_api()
Several of the arguments I pass to subprocess.call are not always present in the params dictionary (f'LANGUAGE={params.get("lang")}' is one example). I handle a missing value in my_file.sh by initializing the variable with a default, for instance:
if [ -z "$LANGUAGE" ]; then LANGUAGE="${LANGUAGE:-[]}"; fi
What I want is to apply some sort of if/else logic inside the subprocess.call arguments:
if params.get("lang") is None, do not send it to the bash file at all, i.e., treat it as if I never provided that input to my_file.sh.
Therefore, I tried rewriting my code like this:
subprocess.call(['bash',
                 'my_file.sh',
                 f'QUERY={params.get("query")}',
                 f'DOC_TYPE={params.get("formats")}',
                 if params.get("lang"): f'LANGUAGE={params.get("lang")}',  # syntax error
                 ])
which is wrong. I get the following syntax error:
Traceback (most recent call last):
File "nationalbiblioteket_logs.py", line 13, in <module>
from url_scraping import *
File "/home/xenial/WS_Farid/DARIAH-FI/url_scraping.py", line 17, in <module>
from utils import *
File "/home/xenial/WS_Farid/DARIAH-FI/utils.py", line 53
if params.get("lang"): f'LANGUAGE={params.get("lang")}',
^
SyntaxError: invalid syntax
Am I misunderstanding how an if/else statement can be applied to the arguments of a Python function call, or is there an easier or cleaner way to do this?
Cheers,
You can specify the default when calling .get(), so use an empty string.
f'LANGUAGE={params.get("lang", "")}'
If you don't want the LANGUAGE= argument at all when no value is provided, you need to build the list dynamically.
cmd = ['bash',
       'my_file.sh',
       f'QUERY={params.get("query")}',
       f'DOC_TYPE={params.get("formats")}']
if (lang := params.get("lang")) is not None:
    cmd += [f'LANGUAGE={lang}']
subprocess.call(cmd)
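When several arguments may be missing, the same idea can be written as a loop. This is only a sketch: the env_keys mapping is a hypothetical helper, and the subprocess call is left as a comment so the snippet runs without my_file.sh:

```python
params = {
    "query": "indepedence day",
    "formats": '["NEWSPAPER"]',
}

# Hypothetical mapping from the shell variable name to the key in params
env_keys = {"QUERY": "query", "DOC_TYPE": "formats", "LANGUAGE": "lang"}

cmd = ["bash", "my_file.sh"]
for name, key in env_keys.items():
    value = params.get(key)
    if value is not None:  # skip anything the dictionary does not provide
        cmd.append(f"{name}={value}")

print(cmd)
# subprocess.call(cmd) would then run the script with only the available arguments
```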
I have a python app which can load and run extensions. It looks like this:
import importlib, sys
sys.path.append('.')
module = importlib.import_module( 'extension' )
Extension = getattr(module, 'Extension')
instance = Extension()
instance.FunctionCall(arg1='foo')
The users can create their own loadable extension by implementing a class I defined. In this case, it is in extension.py:
class Extension:
    def FunctionCall(self, arg1):
        print(arg1)
This works great for now, but I would like to extend my interface to pass another argument into Extension.FunctionCall:
instance.FunctionCall(arg1='foo', arg2='bar')
How can I do this while maintaining backward compatibility with the existing interface?
Here's what happens to existing users if I make the change:
$ python3 loader.py
Traceback (most recent call last):
File "loader.py", line 9, in <module>
instance.FunctionCall(arg1='foo', arg2='bar')
TypeError: FunctionCall() got an unexpected keyword argument 'arg2'
One option is to catch the TypeError, but is this the best solution? It looks like it could get unmaintainable with several iterations of interfaces.
try:
    instance.FunctionCall(arg1='foo', arg2='bar')
except TypeError:
    instance.FunctionCall(arg1='foo')
Another option is to define the "interface version" as a property of the extension and then use that in a case statement, but it would be nicer to simply detect what is supported instead of pushing overhead to the users.
if hasattr(Extension, 'Standard') and callable(getattr(Extension, 'Standard')):
    standard = instance.Standard()
else:
    standard = 0
if standard == 0:
    instance.FunctionCall(arg1='foo')
elif standard == 1:
    instance.FunctionCall(arg1='foo', arg2='bar')
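One way to "simply detect what is supported" might be to inspect the method's signature before calling it. The sketch below is an assumption, not part of the original loader: call_supported is a hypothetical helper, and the two extension classes are stand-ins that return their arguments instead of printing them:

```python
import inspect

# Stand-ins for an old-style and a new-style user extension (assumption)
class OldExtension:
    def FunctionCall(self, arg1):
        return (arg1,)

class NewExtension:
    def FunctionCall(self, arg1, arg2):
        return (arg1, arg2)

def call_supported(func, **kwargs):
    """Call func with only the keyword arguments its signature accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts everything
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return func(**accepted)

old_result = call_supported(OldExtension().FunctionCall, arg1="foo", arg2="bar")
new_result = call_supported(NewExtension().FunctionCall, arg1="foo", arg2="bar")
print(old_result, new_result)
```

Unlike catching TypeError, this does not hide a TypeError raised inside the extension itself.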
I need to convert an xsd to a PostgreSQL schema, for which I found this Python script. Unfortunately it's a Python 2 script, so I want to convert it to Python 3. It fails on this part however:
parser.add_argument(
    'xsd',
    metavar='FILE',
    type=file,
    nargs='+',
    help='XSD file to base the Postgres Schema on'
)
The error says:
Traceback (most recent call last):
File "xsd2pgsql.py", line 183, in <module>
type=file,
NameError: name 'file' is not defined
I know there's no type file any more in Python 3. How can I fix this?
The type argument will accept any callable which will take the string used on the command line and return some object of the desired type.
So in this case, just doing type=open would be equivalent to the action of type=file in Python 2, which is to return the handle of a file opened in read-only mode.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('infile', type=file)
args = parser.parse_args()
print(args.infile.readline())
gives:
$ python2 test.py /etc/passwd
root:x:0:0:root:/root:/bin/bash
$ python3 test.py /etc/passwd
Traceback (most recent call last):
File "test.py", line 5, in <module>
parser.add_argument('infile', type=file)
NameError: name 'file' is not defined
After changing to type=open
$ python3 test.py /etc/passwd
root:x:0:0:root:/root:/bin/bash
(This will also work with python 2.)
However, a greater range of file opening options is available using argparse.FileType (see documentation).
This needs to be instantiated, so the basic usage is argparse.FileType(), but for example you could use argparse.FileType(mode="w") (or just argparse.FileType("w")) to get a file opened for writing. So this code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('outfile', type=argparse.FileType(mode="w"))
args = parser.parse_args()
args.outfile.write("hello\n")
args.outfile.close()
would give this:
$ python3 test.py myoutput
$ cat myoutput
hello
although this would not be necessary for conversion of the Python 2 example in the question.
argparse comes with a file wrapper that will open a file for you.
parser.add_argument(
    'xsd',
    metavar='FILE',
    type=argparse.FileType('r'),
    help='XSD file to base the Postgres Schema on'
)
This would have been preferred in Python 2 as well.
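The original script also passed nargs='+', in which case the attribute holds a list of open file objects rather than a single one. A small sketch of that, using throwaway temporary files (an assumption, so that it runs anywhere) in place of real XSD files:

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument(
    "xsd",
    metavar="FILE",
    type=argparse.FileType("r"),
    nargs="+",
    help="XSD file to base the Postgres Schema on",
)

# Create two small files to stand in for the XSD inputs
paths = []
for text in ("first\n", "second\n"):
    fd, path = tempfile.mkstemp(suffix=".xsd")
    with os.fdopen(fd, "w") as f:
        f.write(text)
    paths.append(path)

args = parser.parse_args(paths)

# args.xsd is a list of already opened file objects
contents = []
for handle in args.xsd:
    with handle:
        contents.append(handle.read())

for path in paths:
    os.remove(path)
print(contents)
```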
I am trying to implement a hostname-like module, and my target machine is on Amazon EC2. But when I run the script, it gives me the error below:
[ansible-user@ansible-master ~]$ ansible node1 -m edit_hostname.py -a node2
ERROR! this task 'edit_hostname.py' has extra params, which is only allowed in the following modules: meta, group_by, add_host, include_tasks, import_role, raw, set_fact, command, win_shell, import_tasks, script, shell, include_vars, include_role, include, win_command
My module is like this:
#!/usr/bin/python
from ansible.module_utils.basic import *
try:
    import json
except ImportError:
    import simplejson as json
def write_to_file(module, hostname, hostname_file):
    try:
        with open(hostname_file, 'w+') as f:
            try:
                f.write("%s\n" % hostname)
            finally:
                f.close()
    except Exception:
        err = get_exception()
        module.fail_json(msg="failed to write to the /etc/hostname file")
def main():
    hostname_file = '/etc/hostname'
    module = AnsibleModule(argument_spec=dict(name=dict(required=True, type=str)))
    name = module.params['name']
    write_to_file(module, name, hostname_file)
    module.exit_json(changed=True, meta=name)
if __name__ == "__main__":
    main()
I don't know where I am making the mistake. Any help will be greatly appreciated. Thank you.
When developing a new module, I would recommend using the boilerplate described in the documentation. It also shows that you need to use AnsibleModule to define your arguments.
In your main, you should add something like the following:
def main():
    # define available arguments/parameters a user can pass to the module
    module_args = dict(
        name=dict(type='str', required=True)
    )
    # seed the result dict in the object
    # we primarily care about changed and state
    # changed is whether this module effectively modified the target
    # state will include any data that you want your module to pass back
    # for consumption, for example, in a subsequent task
    result = dict(
        changed=False,
        original_hostname='',
        hostname=''
    )
    module = AnsibleModule(
        argument_spec=module_args,
        supports_check_mode=False
    )
    # manipulate or modify the state as needed (this is going to be the
    # part where your module will do what it needs to do)
    result['original_hostname'] = module.params['name']
    result['hostname'] = 'goodbye'
    # use whatever logic you need to determine whether or not this module
    # made any modifications to your target
    result['changed'] = True
    # in the event of a successful module execution, you will want to
    # simply call AnsibleModule.exit_json(), passing the key/value results
    module.exit_json(**result)
Then, you can call the module like so:
ansible node1 -m mymodule -a "name=myname"
ERROR! this task 'edit_hostname.py' has extra params, which is only allowed in the following modules: meta, group_by, add_host, include_tasks, import_role, raw, set_fact, command, win_shell, import_tasks, script, shell, include_vars, include_role, include, win_command
As explained by your error message, an anonymous default parameter is only supported by a limited number of modules. In your custom module, the parameter you created is called name. Moreover, you should not include the .py extension in the module name. As an ad-hoc command, you have to call your module like so:
$ ansible node1 -m edit_hostname -a name=node2
I did not test your module code so you may have further errors to fix.
Meanwhile, I still strongly suggest you use the default boilerplate from the Ansible documentation, as proposed in @Simon's answer.
My project has several categories with multiple scripts inside, each script allows you to perform a unique task.
To run my project, I give my main.py arguments that I retrieve with argparse.
Example: ./main.py --category web --script S1_HTTP_HEADER --url https://qwerty.com --port 1234
However, I need to create an argument to run all my scripts at once.
Example : ./main.py --category web --all --url https://qwerty.com --port 1234
Currently, my main.py looks like this:
if args.category == "web":
    if args.script == 'S1_HTTP_HEADER':
        from scripts.WEB_S1_HTTPHEADER import RequestHeader
        make_request = RequestHeader(args.url, args.port)
        make_request.insert_value()
Do you have solutions to run all scripts at once with only one argument?
For information, each script has a class that I have to instantiate with a URL and a PORT. Of course I have to import my class in my main.py before the operation.
Thanks!
I would define a custom parsing action that loads all the callables for the given category.
That class defines a mapping from each category to its associated callables.
Each entry is a tuple composed of the script name and the associated callable object.
The custom action creates an additional "scripts" attribute holding a list of these entries. It is only set when --all is given.
The import mechanism needs to change a little too.
import argparse
from scripts import WEB_S1_HTTPHEADER, WEB_S2_HTTPHEADER, WEB_S3_HTTPHEADER
class GetScripts(argparse.Action):
    """Store the --all flag and collect the callables of the chosen category."""
    SCRIPTS = {
        "web": [
            ("S1_HTTPHEADER", WEB_S1_HTTPHEADER.RequestHeader),
            ("S2_HTTPHEADER", WEB_S2_HTTPHEADER.RequestHeader),
            ("S3_HTTPHEADER", WEB_S3_HTTPHEADER.RequestHeader),
        ],
    }
    def __init__(self, option_strings, dest, **kwargs):
        super(GetScripts, self).__init__(option_strings, dest, **kwargs)
    def __call__(self, parsobj, namespace, values, option_string=None):
        setattr(namespace, self.dest, values)
        if values:
            # Relies on --category having been parsed before --all
            category = getattr(namespace, "category")
            setattr(namespace, "scripts", self.SCRIPTS.get(category, []))
Define the keyword argument "--all" associated with the custom action:
your_parser.add_argument("--all", nargs="?", default=False, const=True, action=GetScripts)
You can then iterate over the "scripts" argument to run each configured callable.
main.py now looks like:
if args.all:
    for _, func in args.scripts:
        make_request = func(args.url, args.port)
        make_request.insert_value()
Adding a new category or a new script does, however, mean changing the class member SCRIPTS.
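Note that the custom action reads the category from the namespace while parsing, so it only finds the scripts when --category comes before --all on the command line. A possible variant, sketched here with hypothetical stand-in classes and a plain store_true flag, does the lookup after parsing so the order no longer matters:

```python
import argparse

# Hypothetical stand-ins for the RequestHeader classes of each script
class RequestHeader1:
    def __init__(self, url, port):
        self.url, self.port = url, port
    def insert_value(self):
        return f"S1 {self.url}:{self.port}"

class RequestHeader2:
    def __init__(self, url, port):
        self.url, self.port = url, port
    def insert_value(self):
        return f"S2 {self.url}:{self.port}"

SCRIPTS = {
    "web": {"S1_HTTP_HEADER": RequestHeader1, "S2_HTTP_HEADER": RequestHeader2},
}

parser = argparse.ArgumentParser()
parser.add_argument("--category", choices=SCRIPTS, required=True)
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("--script")
group.add_argument("--all", action="store_true")
parser.add_argument("--url", required=True)
parser.add_argument("--port", type=int, required=True)

def run(argv):
    """Parse argv, then run either one script or all scripts of the category."""
    args = parser.parse_args(argv)
    available = SCRIPTS[args.category]
    if args.all:
        selected = list(available.values())
    elif args.script in available:
        selected = [available[args.script]]
    else:
        parser.error(f"unknown script {args.script!r}")
    return [cls(args.url, args.port).insert_value() for cls in selected]

# --all deliberately placed before --category
results = run(["--all", "--category", "web", "--url", "https://qwerty.com", "--port", "1234"])
print(results)
```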
Does this meet your needs?