I need to deal with many arguments/options. I think up to 30-40+.
I am using argparse (Python 3.5) to let people set the arguments/options from the command line and in the if __name__ == "__main__":, I pass all the arguments: MyClass(arg1, arg2, arg3,...).
Is there a better, more elegant way to do it?
Rather than passing lots of arguments to your class, just pass the entire result of parse_args() from argparse. Then it's just a single value which contains everything you need.
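A minimal sketch of that idea. The option names (--input, --verbose) and the body of MyClass are made up for illustration, since the question does not list the real ones:

```python
import argparse


def build_parser():
    # Hypothetical options, standing in for the 30-40 real ones.
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", default="data.txt")
    parser.add_argument("--verbose", action="store_true")
    return parser


class MyClass:
    def __init__(self, args):
        # Keep the whole Namespace and read attributes where they are needed.
        self.args = args

    def describe(self):
        return "input={} verbose={}".format(self.args.input, self.args.verbose)


# An explicit list is used here; call parse_args() with no list to read
# the real command line.
args = build_parser().parse_args(["--input", "x.csv", "--verbose"])
print(MyClass(args).describe())
```

If the class should not depend on argparse at all, vars(args) turns the Namespace into a plain dict that can be passed along instead.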
How can I detect if an argument already exists in an ArgumentParser? For example, I'd like to do something like this:
import argparse
arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("--arg1")
if not has_arg(arg_parser, "--arg1"):
    arg_parser.add_argument("--arg1")
In this example it seems pointless, but I have a use case where has_arg would be useful. I have utility functions that take an arg_parser and add a bunch of arguments to it. These utility functions share some common arguments, and sometimes I need to call more than one of these functions, which would repeat a few arguments and would throw an exception. How can I handle this case? (I'm looking for something more principled than just a generic try except, ideally some function like has_arg as in my example)
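One possible approach, offered as a sketch rather than a definitive answer: ArgumentParser accepts conflict_handler="resolve", which lets a later add_argument call replace an earlier one with the same option string instead of raising. The helper names add_io_args and add_db_args are hypothetical stand-ins for the utility functions described above.

```python
import argparse


def add_io_args(parser):
    # Hypothetical utility function; shares --arg1 with add_db_args.
    parser.add_argument("--arg1")
    parser.add_argument("--infile")


def add_db_args(parser):
    parser.add_argument("--arg1")
    parser.add_argument("--db")


# 'resolve' makes the later definition of --arg1 override the earlier one
# instead of raising argparse.ArgumentError.
arg_parser = argparse.ArgumentParser(conflict_handler="resolve")
add_io_args(arg_parser)
add_db_args(arg_parser)

args = arg_parser.parse_args(["--arg1", "x", "--db", "test.sqlite"])
print(args.arg1, args.db)
```

This does not report whether an argument already exists. It only makes the repeated definition harmless, and the last definition wins.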
I'm parsing system arguments in my Python project using sys.argv. At some point I had to modify the script after having written the logic that parses the arguments. I added a line that basically appends a string to sys.argv so that the parsing logic would not have to change:
sys.argv.append('some string here')
Is it bad practice to modify the system arguments after they have been created for the program?
It is bad practice to modify sys.argv in Python, and its equivalent in other languages.
In these situations I recommend a parsed_args variable which has all your parsed data from sys.argv, any default values that you would like to set, and any modifications that "middleware" would make.
In my opinion, in such a case it would be better to first parse those arguments using argparse.
That way you get a Namespace object, which can be modified and is much easier to maintain than hardcoding sys.argv.
After that you can modify the Namespace any way you want.
Small example:
import argparse
def parse_args():
    parser = argparse.ArgumentParser(description="Description")
    parser.add_argument('--file_path')
    parsed_args = parser.parse_args()
    return parsed_args
if __name__ == '__main__':
    args = parse_args()
    print(args.file_path)  # the argument passed while running script
    args.another_value = "value"
    print(args.another_value)  # value added within code
    setattr(args, 'yet_another', 'value')  # another way to set attributes
Generally, I like to leave the original command-line (argv) alone and work on a copy. That way, I can always see the original. Perhaps I want to compare what the user actually typed out with what I filled in later from environment variables, default values, or whatever. That's pretty handy while tracking down a problem.
Consider the case where you are using argparse. The particular library isn't important and this is a good practice for just about anything. Instead of relying on a default source, such as sys.argv, always supply it.
In this incomplete example, main() takes a list that you supply. That can be from sys.argv or whatever list you like. That's pretty handy for testing as you supply different lists to check various scenarios. This decoupling gives you quite a bit of flexibility:
import argparse
import sys
def main(options):
    parsed_args = process_args(options)
    # ... rest of program ...
def process_args(options):
    # ... do stuff to fix up options, like enable things by env vars ...
    parser = argparse.ArgumentParser(...)
    parser.add_argument(...)
    parsed_args = parser.parse_args(options)  # argument instead of default
    # ... fix up parsed_args for whatever you want to add ...
    return parsed_args
if __name__ == "__main__":
    main(sys.argv[1:])  # remember the program name is the first item
Magic or default variables are nice for quick and dirty things, but those generally constrain you too much for larger projects where consumers might want to do things you hadn't considered. Remember, Pythonic code likes to be explicit rather than implicit (yet, so many opportunities in core to be implicit!).
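To illustrate the testing benefit described above, here is a small sketch. The --verbose flag and the MYPROG_VERBOSE environment variable are invented for the example, and passing the environment in as a parameter is an extra assumption that keeps the function easy to exercise:

```python
import argparse
import os


def process_args(options, env=None):
    # env defaults to the real environment; tests can pass a plain dict.
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parsed_args = parser.parse_args(options)  # explicit list, not sys.argv
    # Fix up the parsed result from a (hypothetical) environment variable.
    if env.get("MYPROG_VERBOSE") == "1":
        parsed_args.verbose = True
    return parsed_args


# Different scenarios, no command line needed:
print(process_args([], env={}).verbose)
print(process_args(["--verbose"], env={}).verbose)
print(process_args([], env={"MYPROG_VERBOSE": "1"}).verbose)
```

Because nothing reads sys.argv or os.environ implicitly, each scenario is just a function call with different inputs.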
I am trying to create a user interface using the argparse module.
One of the arguments needs to be converted, so I use the type keyword:
add_argument('positional', ..., type=myfunction)
and there is another optional argument:
add_argument('-s', dest='switch', ...)
in addition, I have
parsed_argument=parse_args()
However, in myfunction, I hope I can use an additional parameter to control the behavior, which is the optional argument above, i.e.
def myfunction(positional, switch=parsed_argument.switch):
    ...
How can I achieve that?
Simple answer: You can’t. The arguments are parsed separately, and there is no real guarantee that some order is maintained. Instead of putting your logic into the argument type, just store it as a string and do your stuff after parsing the command line:
parser.add_argument('positional')
parser.add_argument('-s', '--switch')
args = parser.parse_args()
myfunction(args.positional, switch=args.switch)
I'm not sure I understood correctly what you want to achieve, but if what you want looks like:
myprog.py cmd1 --switcha
myprog.py cmd2 --switchb
yes you can, you need to use subparsers. I wrote a good example of it for a little PoC I wrote to access stackoverflow's API from CLI. The whole logic is a bit long to put thoroughly here, but mainly the idea is:
create your parser using parser = argparse.ArgumentParser(...)
create the subparsers using subparsers = parser.add_subparsers(...)
add the commands with things like subparsers.add_parser('mycommand', help='Its only a command').set_defaults(func=mycmd_fn), where mycmd_fn takes args as its parameter, which gives you all the switches you issued to the command!
The difference from what you ask is that you'll need one function per command, rather than one function with the positional argument as its first argument. But you can work around that easily by defining mycmd_fn like: mycmd_fn = lambda *args: myfunction('mycmd', *args)
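The steps above can be sketched roughly as follows. The command functions and their return strings are invented for illustration; only the cmd1/--switcha and cmd2/--switchb names come from the example command lines:

```python
import argparse


def cmd1_fn(args):
    # Receives the full Namespace, including this command's switches.
    return "cmd1 switcha={}".format(args.switcha)


def cmd2_fn(args):
    return "cmd2 switchb={}".format(args.switchb)


parser = argparse.ArgumentParser(prog="myprog.py")
subparsers = parser.add_subparsers(dest="command")

cmd1 = subparsers.add_parser("cmd1", help="first command")
cmd1.add_argument("--switcha", action="store_true")
cmd1.set_defaults(func=cmd1_fn)

cmd2 = subparsers.add_parser("cmd2", help="second command")
cmd2.add_argument("--switchb", action="store_true")
cmd2.set_defaults(func=cmd2_fn)

# Equivalent to running: myprog.py cmd1 --switcha
args = parser.parse_args(["cmd1", "--switcha"])
print(args.func(args))
```

Each subparser stores its own function under func, so the main program only has to call args.func(args).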
HTH
From the documentation:
type= can take any callable that takes a single string argument and returns the converted value:
Python functions like int and float are good examples of what a type function should be like. int takes a string and returns a number. If it can't convert the string it raises a ValueError. Your function could do the same. argparse.ArgumentTypeError is another option. argparse isn't going to pass any optional arguments to it. Look at the code for argparse.FileType to see a more elaborate example of a custom type.
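As a small illustration of such a type function, here is a sketch with a made-up positive_int converter. It takes one string, returns the converted value, and signals bad input with ValueError (from int) or argparse.ArgumentTypeError:

```python
import argparse


def positive_int(text):
    # A type function receives one string and returns the converted value.
    value = int(text)  # raises ValueError for non-numeric input
    if value <= 0:
        raise argparse.ArgumentTypeError(
            "{!r} is not a positive integer".format(text))
    return value


parser = argparse.ArgumentParser()
parser.add_argument("count", type=positive_int)

args = parser.parse_args(["3"])
print(args.count)
```

When the converter raises either exception, argparse reports a usage error and exits instead of showing a traceback.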
action is another place where you can customize behavior. The documentation has an example of a custom Action. Its arguments include the namespace, the object where the parser is collecting the values it will return to you. This object contains any arguments that have already been set. In theory your switch value will be available there - if it occurs first.
There are many SO answers that give custom Actions.
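A rough sketch of that idea, with a hypothetical ConvertWithSwitch Action and an invented "upper" switch value. It also shows the ordering caveat: the Action only sees the switch if it was given before the positional.

```python
import argparse


class ConvertWithSwitch(argparse.Action):
    # Hypothetical Action: converts the positional using the value of
    # 'switch' as seen so far in the namespace.
    def __call__(self, parser, namespace, values, option_string=None):
        switch = getattr(namespace, "switch", None)
        converted = values.upper() if switch == "upper" else values
        setattr(namespace, self.dest, converted)


parser = argparse.ArgumentParser()
parser.add_argument("-s", dest="switch")
parser.add_argument("positional", action=ConvertWithSwitch)

# -s comes first, so the Action sees it.
print(parser.parse_args(["-s", "upper", "abc"]).positional)
# -s comes after the positional, so the Action still sees the default (None).
print(parser.parse_args(["abc", "-s", "upper"]).positional)
```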
Subparsers are another good way of customizing the handling of arguments.
Often it is better to check for the interaction of arguments after parse_args. In your case 'switch' could occur after the positional and still have an effect. And parser.error lets you use the argparse error mechanism (e.g. displaying the usage).
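A sketch of checking after parsing, assuming a made-up myfunction that converts to int when the switch is "int". The parse helper is hypothetical; parser.error is the standard ArgumentParser method that prints the usage and exits:

```python
import argparse


def myfunction(positional, switch=None):
    # Hypothetical conversion that depends on the switch value.
    if switch == "int":
        return int(positional)
    return positional


parser = argparse.ArgumentParser()
parser.add_argument("positional")
parser.add_argument("-s", dest="switch")


def parse(argv):
    args = parser.parse_args(argv)
    try:
        args.positional = myfunction(args.positional, switch=args.switch)
    except ValueError as exc:
        # Prints the usage line plus this message to stderr, then exits
        # with status 2.
        parser.error("bad positional value: {}".format(exc))
    return args


# The switch comes after the positional and still takes effect.
print(parse(["42", "-s", "int"]).positional)
```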
Is it possible in argparse to have one argument fall back to the value of another argument when it is not specified?
So in the following code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file', dest='outputfile')
parser.add_argument('-d', '--db', dest='outputDB')
When I run the above script via script.py -f file_name and don't specify outputDB, I want outputDB to get the same value as outputfile. I know the default argument sets a default value, but is it also feasible to set a default value derived from another argument?
Thanks.
The traditional way to do this (the way people did it going back to old Unix getopt in C) is to give outputDB a useless "sentinel" default value. Then, after you do the parse, if outputDB matches the sentinel, use outputfile's value instead.
See default in the docs for full details on all of the options available. But the simplest—as long as it doesn't break any of your other params—seems to be to leave it off, and pass argument_default=SUPPRESS. If you do that, args.outputDB just won't exist if nothing was passed, so you can just check for that with hasattr or in or try.
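A short sketch of the SUPPRESS approach, reusing the option names from the question (the explicit argument list is only there to make the example self-contained):

```python
import argparse

# argument_default=SUPPRESS: options that were not given are left off the
# Namespace entirely instead of being set to None.
parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
parser.add_argument("-f", "--file", dest="outputfile")
parser.add_argument("-d", "--db", dest="outputDB")

args = parser.parse_args(["-f", "file_name"])
if not hasattr(args, "outputDB"):
    args.outputDB = args.outputfile
print(args.outputDB)
```

Note that with this setting outputfile is also missing from the Namespace when -f is not given, so that case needs its own check.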
Alternatively, you can pass an empty string as the default, but then of course a user can always trigger the same thing with --outputDB='', which you may not want to allow.
To get around that, you can not give it a type, which means you can give any default value that isn't a string, and there's no way the user can pass you the same thing. The pythonic way to get a sentinel when None or the falsey value of the appropriate type isn't usable is:
sentinel = object()
x = do_stuff(default_value=sentinel)
# ...
if x is sentinel:
    # x got the default value
    pass
But here I don't think that's necessary. The default None should be perfectly fine (there's no way a user can specify that, since they can only specify strings).
(It's quite possible that argparse has an even cooler way to do this that I haven't discovered, so you might want to wait a while for other answers.)
The simple approach - do your own checking after parse_args.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file', dest='outputfile')
parser.add_argument('-d', '--db', dest='outputDB')
args = parser.parse_args()
if args.outputDB is None:
    args.outputDB = args.outputfile
The default default value is None, which is easy to test for.
Doing something entirely within argparse is useful if it is reflected in the help or you want the standardized error message. But here you are just filling in a default after parsing.
I can imagine doing the same with a custom '-f' Action, but the logic would be more complex.
I'm not sure if this is a good practice, but I tried this on a similar problem. I have adapted the following snippet for your code, hope it helps.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file', dest='outputfile', default='dummy_file')
# parse_known_args() parses what is defined so far (-f) and ignores the not-yet-defined -d
parser.add_argument('-d', '--db', dest='outputDB', default=parser.parse_known_args()[0].outputfile)
If you don't pass any arguments, both 'outputfile' and 'outputDB' will be 'dummy_file'.
I'm writing a Python command-line program which has some interdependent options. I would like the user to be able to enter the options in whichever order they please.
Currently I am using the getopt library to parse the command-line options; unfortunately, that parses them in order. I've thrown together a system of boolean flags to defer the processing of certain command-line arguments until the one they depend on is processed. However, I had the idea of using a priority queue of function calls which would execute after all the command-line options are parsed.
I know that Python can store functions under variable names, but that seems to call the function at the same time.
For example:
help = obj.PrintHelp()
heapq.heappush(commandQ, (0, help))
This will print the help dialog immediately. How would I go about implementing my code such that it won't call PrintHelp() immediately upon assigning it a name?
EDIT:
Oh, I just realized I was pushing into a queue called help, that's my mistake.
Thanks for the tip on removing the () after PrintHelp.
What if I want to now call a function that requires more than the self argument?
myFun = obj.parseFile(path)
heapq.heappush(commandQ, (1, myFun))
Would I just make the tuple bigger and take the command line argument?
If you heappush like this:
myFun = obj.parseFile
heapq.heappush(commandQ, (1, myFun, path))
then to later call the function, you could do this:
while commandQ:
    x = heapq.heappop(commandQ)
    func = x[1]
    args = x[2:]
    func(*args)
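One caveat worth sketching: in Python 3, if two entries have the same priority, the tuple comparison falls through to the function objects, which cannot be ordered and raise TypeError. A common workaround is to add a running counter as a tie-breaker. The push helper and the two command functions below are hypothetical:

```python
import heapq
import itertools

commandQ = []
counter = itertools.count()  # tie-breaker so functions are never compared
log = []


def print_help():
    log.append("help")


def parse_file(path):
    log.append("parse " + path)


def push(priority, func, *args):
    # (priority, sequence number, function, args...): equal priorities are
    # ordered by insertion, and the comparison never reaches the function.
    heapq.heappush(commandQ, (priority, next(counter), func) + args)


push(1, parse_file, "a.txt")
push(0, print_help)
push(1, parse_file, "b.txt")

while commandQ:
    entry = heapq.heappop(commandQ)
    func, args = entry[2], entry[3:]
    func(*args)

print(log)
```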
Use
help = obj.PrintHelp
without the parentheses. This makes help reference the function.
Later, you can call the function with help().
Note also (if I understand your situation correctly), you could just use the optparse or (if you have Python 2.7 or better) argparse modules in the standard library to handle the command-line options in any order.
PS. help is a built-in function in Python. Naming a variable help overrides the built-in, making it difficult (though not impossible) to access the built-in. Generally, it's a good idea not to overwrite the names of built-ins.
Instead of using getopt, I would suggest using optparse (argparse, if you are using a newer Python version): most probably, you will get everything you need, already implemented.
That said, in your example code, you are actually calling the function, while you should simply refer to it by name:
help = obj.PrintHelp
heapq.heappush(commandQ, (0, help))
If you want to store a complete function call in Python, you can do it one of two ways:
# option 1: hold the parameters separately
# I've also skipped saving the function in a 'help' variable
heapq.heappush(commandQ, (0, obj.PrintHelp, param1, param2))
# later:
command = heapq.heappop(commandQ)  # pop returns the smallest item
command[1](*command[2:])  # call the function (second item) with args (remainder of items)
Alternatively, you can use a helper to package the arguments up via lambda:
# option 2: build a no-argument anonymous function that knows what arguments
# to give the real one
# module scope
def makeCall(func, *args):
    return lambda: func(*args)
# now you can:
help = makeCall(obj.PrintHelp, param1, param2)
heapq.heappush(commandQ, (0, help))
If you need keyword arguments, let me know and I'll edit to take care of those too.
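For the keyword-argument case, one possible sketch uses functools.partial from the standard library, which freezes both positional and keyword arguments. The parse_file function and its encoding and strict parameters are hypothetical stand-ins for obj.parseFile:

```python
import functools
import heapq

commandQ = []
results = []


def parse_file(path, encoding="utf-8", strict=False):
    # Hypothetical stand-in for obj.parseFile.
    results.append((path, encoding, strict))


# partial freezes positional and keyword arguments into a no-argument callable.
call = functools.partial(parse_file, "a.txt", encoding="latin-1", strict=True)
heapq.heappush(commandQ, (1, call))

priority, command = heapq.heappop(commandQ)
command()
print(results)
```

As with plain functions, partial objects cannot be ordered, so entries that share a priority would still need a tie-breaking element in the tuple.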