argparse gives me different results depending on whether flags are combined (-x -y vs. -xy). It's hard to explain in words, so I have reduced the problem to the following minimal setup:
# test.py
import argparse

def invalid_argument_type(x):
    raise Exception("can't parse this")  # in my code, it doesn't *always* fail

parser = argparse.ArgumentParser()
parser.add_argument('args', nargs='*', type=invalid_argument_type)  # zero or more positionals
parser.add_argument('-x')
print(parser.parse_args())
Now, erroneously invoking this program yields unexpected results. The first command is correct, the second has an invalid flag, and the third should be the same as the second:
$ python test.py -x foo
Namespace(args=[], x='foo')
$ python test.py -A -x foo
test.py: error: unrecognized arguments: -A
$ python test.py -Ax foo
Exception: can't parse this
It seems that when the flags are combined, the "unknown flag" error swallows -x and foo is treated as a regular argument. Note that if the -A flag existed, both -A and -x would work as expected in every scenario.
This results in highly confusing error messages.
Am I using argparse wrong? Is there a way to fix this, or should I move the error handling into my own hands?
You are seeing this because exceptions raised during argument parsing take precedence over ordinary argument errors. The parser does recognize that -Ax is an invalid flag and simply makes a note of it, to be reported later; but because parsing the args positional fails with an exception, that exception surfaces immediately and the other invalid arguments are never mentioned.
You can confirm this behavior from the source. parse_args delegates the parsing to parse_known_args, which returns the parsed namespace and a list of unrecognized arguments; only afterwards does parse_args report those unrecognized arguments as an error. Inside parse_known_args, however, the type conversion for args raises an exception, and that exception propagates immediately. So the exception aborts the process before parse_args ever gets a chance to display the unrecognized arguments.
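Roughly, parse_args does something like this internally (a simplified sketch, not the actual source):

def parse_args_sketch(parser, argv=None):
    # parse_known_args may raise before it returns, e.g. from a type converter
    namespace, extras = parser.parse_known_args(argv)
    if extras:
        parser.error('unrecognized arguments: %s' % ' '.join(extras))
    return namespace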
Now, your three examples all work differently, which is why you only see the exception for the last one. So let’s check them in detail:
-x foo: Here, -x is a valid argument name, and foo is the value for that. So all is well.
-A -x foo: -A is an unrecognized argument, so that’s noted down. The remaining part is -x foo which is again a valid sequence for the argument x.
-Ax foo: -Ax is an unrecognized argument, so that’s again noted down. The remaining part is foo. As there’s no argument flag, the parser will try to match that to the args parameter—which then raises an exception.
Note that argparse only supports combined flags like -Ax if it can resolve them unambiguously from left to right. Because single-dash flags can also be longer than one character (e.g. -foo would be fine), it cannot safely assume that -Ax means -A -x; you could just as well have defined an actual -Ax argument. So the parser first looks for an option named -Ax; that doesn't exist, and neither does a -A option whose remainder x it could peel off, so the whole token is recorded as unrecognized.
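For illustration, a hypothetical variant of the setup above where -A is defined as a simple boolean flag; once both options exist, the combined form resolves cleanly:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-A', action='store_true')
parser.add_argument('-x')
print(parser.parse_args(['-Ax', 'foo']))  # Namespace(A=True, x='foo')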
I am building a command line tool which should work as follows:
mytool [-h] [-v|--version] [-c|--config FILE] [-l|--list] ACTION
positional arguments:
ACTION the action to be performed
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-l, --list show list of configured actions and exit
-c, --config CONFIG use CONFIG instead of the default configuration file
I am using argparse to parse the command line and I am facing what seems to be a limitation of the library. Let me explain through some use-cases.
Use-case 1
$ mytool -h
$ mytool -c path/to/file -h
$ mytool -l -h
$ mytool some-action -h
All the above invocations of mytool shall print the help message exactly as shown above and exit, most importantly showing ACTION to be mandatory.
Use-case 2
$ mytool -l
$ mytool -c path/to/file --list
$ mytool --list some-action
$ mytool --list --config path/to/file
All the above invocations must list the configured actions based on the content of the configuration file, and exit. The allowed values of ACTION depend on the content of the configuration file; they are not simply hard-coded in the program.
Notice that even if an action is given, it is ignored, because -l|--list has higher precedence, similar to how -h works against other flags and arguments.
Also, please note that solutions such as this one, which implement custom argparse.Action subclasses, won't work, because the action of the listing flag needs the value of the configuration flag, so the parsing must complete (at least partially) before the listing can begin.
Finally, the absence of the otherwise required positional argument ACTION does not cause the parser to abort with an error.
Use-case 3
In the absence of -l|--list the parser works as expected:
$ mytool -c path/to/file # exits with error, positional argument missing
$ mytool some-action # ok
In simple words, I am trying to make the -l|--list flag "disable" the enforcement of the otherwise mandatory positional argument ACTION. Moreover, I am looking for a solution that allows me to perform the listing (the action of the -l|--list flag) after the parsing has (at least partially) completed, so that the value of the -c|--config flag is available.
-h works the way it does because it uses a special action, argparse._HelpAction. (Similarly, -v uses argparse._VersionAction.) This action causes the script to exit while parsing, so no following arguments are processed. (Note that any side effects produced while parsing previous arguments may still occur; the only notion of precedence parse_args has is that arguments are processed from left to right.)
If you want -l to similarly show available actions and exit, you need to define your own custom action. For example,
import argparse

class ListAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        print("Allowable actions are...")
        print("foo")
        print("bar")
        print("baz")
        parser.exit()

p = argparse.ArgumentParser()
p.add_argument("-l", "--list", action=ListAction, nargs=0)  # nargs=0: the flag takes no value
...
args = p.parse_args()
In particular, p.parse_args(["-l", "-h"]) will list actions without displaying the help message, and p.parse_args(["-h", "-l"]) will print the help message but not list actions. Which one gets processed and terminates your script depends solely on which one appears first in the argument list.
As shown in the usage, ACTION will be required, unless the user uses '-h' or '-v', which exit in the middle of parsing.
You could define ACTION as nargs='?', or as a flagged argument, in which case it is optional.
I was going to suggest giving ACTION choices, which will then appear in the help. But if they depend on '--config' value that could be more awkward (though not impossible).
'-l' could have a custom action class that behaves like 'version', but prints the desired list and exits. But then you can't provide an action as well.
Creating arguments that depend on each other in some way is awkward, though not impossible. It is easiest to handle interdependencies after parsing, when all arguments have been read. Keep in mind that argparse tries to accept arguments in any order; '-c' could come before or after '-l', or after ACTION.
In theory, though, your custom list action could check the Namespace for a 'config' value, and base its action on that value. That would work only if you can count on the user providing '-c' before '-l'; argparse won't enforce that for you.
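A rough sketch of that idea, assuming a hypothetical load_actions() helper that reads the configuration file:

import argparse

def load_actions(config_path):
    # hypothetical: read the configuration file and return the action names
    return ['foo', 'bar', 'baz']

class ListAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        config = getattr(namespace, 'config', None)  # whatever -c has set so far, if anything
        for action in load_actions(config):
            print(action)
        parser.exit()

p = argparse.ArgumentParser()
p.add_argument('-c', '--config')
p.add_argument('-l', '--list', action=ListAction, nargs=0)

If '-c' comes after '-l' on the command line, config is still None inside __call__, which is exactly the ordering caveat above.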
As a general rule it's best to think of argparse as a tool for finding out what the user wants, with limited assistance in policing that, and an ability to automatically format usage and help. Don't expect it to help with complicated interactions and logical mixes of arguments.
I am using argparse to parse command line arguments and by default on receiving invalid arguments it prints help message and exit. Is it possible to customize the behavior of argparse when it receives invalid arguments?
Generally I want to catch all invalid arguments and do stuff with them. I am looking for something like:
parser = argparse.ArgumentParser()
# add some arguments here
try:
    parser.parse_args()
except InvalidArgvsError as iae:
    print("handle this invalid argument '{arg}' my way!".format(arg=iae.get_arg()))
So that I can have:
>> python example.py --invalid some_text
handle this invalid argument 'invalid' my way!
You might want to use parse_known_args and then take a look at the second item in the tuple to see what arguments were not understood.
That said, I believe this will only help with extra arguments, not expected arguments that have invalid values.
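For example, a small sketch of that approach (the --known option is made up for illustration):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--known')

args, unknown = parser.parse_known_args(['--known', 'x', '--invalid', 'some_text'])
print(args)     # Namespace(known='x')
print(unknown)  # ['--invalid', 'some_text']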
Some previous questions:
Python argparse and controlling/overriding the exit status code
I want Python argparse to throw an exception rather than usage
and probably more.
The argparse documentation talks about using parse_known_args. This returns a list of arguments that it does not recognize. That's a handy way of dealing with one type of error.
It also talks about writing your own error and exit methods. That error that you don't like passes through those 2 methods. The proper way to change those is to subclass ArgumentParser, though you can monkey-patch an existing parser. The default versions are at the end of the argparse.py file, so you can study what they do.
A third option is to try/except the SystemExit.
import argparse
import sys

try:
    parser = argparse.ArgumentParser()
    args = parser.parse_args()
except SystemExit:
    exc = sys.exc_info()[1]
    print(exc)
This way, error/exit still produce the error message (to sys.stderr) but you can block exit and go on and do other things.
1649:~/mypy$ python stack38340252.py -x
usage: stack38340252.py [-h]
stack38340252.py: error: unrecognized arguments: -x
2
One of the earlier questions complained that parser.error does not get much information about the error; it just gets a formatted message:
import argparse

def myerror(message):
    print('error message')
    print(message)

parser = argparse.ArgumentParser()
parser.error = myerror
args = parser.parse_args()
displays
1705:~/mypy$ python stack38340252.py -x
error message
unrecognized arguments: -x
You could parse that message to find out that -x is the unrecognized string. As an improvement over earlier versions, it can list multiple arguments:
1705:~/mypy$ python stack38340252.py foo -x abc -b
error message
unrecognized arguments: foo -x abc -b
Look up self.error to see all the cases that can trigger an error message. If you need more ideas, focus on a particular type of error.
===============
The unrecognized arguments error is produced by parse_args, which calls parse_known_args and raises this error if the list of extras is not empty. So its special information is the list of strings that parse_known_args could not handle.
parse_known_args, for its part, calls self.error if it traps an ArgumentError. Generally those are produced when a specific argument (Action) has problems. But _parse_known_args also calls self.error if a required Action is missing, or if there's a mutually-exclusive-group error. choices can produce a different error, as can type.
You can try subclassing argparse.ArgumentParser() and overriding the error method.
From the argparse source:
def error(self, message):
"""error(message: string)
Prints a usage message incorporating the message to stderr and
exits.
If you override this in a subclass, it should not return -- it
should either exit or raise an exception.
"""
self.print_usage(_sys.stderr)
self.exit(2, _('%s: error: %s\n') % (self.prog, message))
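For example, a minimal sketch of that approach which raises an exception instead of exiting, so the caller can catch it (ArgumentParserError is a made-up name):

import argparse

class ArgumentParserError(Exception):
    pass

class ThrowingArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        # raise instead of printing usage and exiting
        raise ArgumentParserError(message)

parser = ThrowingArgumentParser()
try:
    parser.parse_args(['--invalid', 'some_text'])
except ArgumentParserError as err:
    print('handle this my way: %s' % err)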
Since the error code 2 is reserved for internal docker usage, I'm using the following to parse arguments in scripts inside docker containers:
import argparse
import sys

ERROR_CODE = 1

class DockerArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        """error(message: string)

        Prints a usage message incorporating the message to stderr and
        exits.

        If you override this in a subclass, it should not return -- it
        should either exit or raise an exception.

        Overrides the error method of the parent class to exit with error
        code 1, since the default value of 2 is reserved for internal
        docker usage.
        """
        self.print_usage(sys.stderr)
        args = {'prog': self.prog, 'message': message}
        self.exit(ERROR_CODE, '%(prog)s: error: %(message)s\n' % args)
I'm having a really strange error with the python subprocess.check_call() function. Here are two tests that should both fail because of permission problems, but the first one only returns a 'usage' (the "unexpected behaviour"):
# Test #1
import subprocess
subprocess.check_call(['git', 'clone', 'https://github.com/achedeuzot/project',
                       '/var/vhosts/project'], shell=True)
# Shell output
usage: git [--version] [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
[-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
[--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
[-c name=value] [--help]
<command> [<args>]
The most commonly used git commands are:
[...]
Now for the second test (the "expected behaviour" one):
# Test #2
import subprocess
subprocess.check_call(' '.join(['git', 'clone', 'https://github.com/achedeuzot/project',
                                '/var/vhosts/project']), shell=True)
# Here, we're making it into a string, but the call should be *exactly* the same.
# Shell output
fatal: could not create work tree dir '/var/vhosts/project'.: Permission denied
This second error is the correct one; indeed, I don't have the permissions. But why is there a difference between the two calls? I thought that using a single string or a list was the same with the check_call() function. I have read the Python documentation and various usage examples, and both look correct.
Has anyone had the same strange error? Or does anyone know why there is a difference in output when the commands should be exactly the same?
Side note: Python 3.4
Remove shell=True from the first one. If you carefully reread the subprocess module documentation, you will see: if shell=False (the default), the first argument is a list containing the command and all of its arguments (or a string naming just the command, with no arguments at all). If shell=True, the first argument is a string representing a shell command line; a shell is executed, which in turn parses the command line for you and splits it on whitespace (plus much more dangerous things you might not want it to do). If shell=True and the first argument is a list, then the first list item is the shell command line, and the rest are passed as arguments to the shell itself, not to the command.
Unless you know you really, really need to, always leave shell=False.
Here's the relevant bit from the documentation:
If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. That is to say, Popen does the equivalent of:
Popen(['/bin/sh', '-c', args[0], args[1], ...])
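So the fix for the first test is simply to drop shell=True and keep the list form, so the arguments are passed directly to git:

import subprocess
subprocess.check_call(['git', 'clone', 'https://github.com/achedeuzot/project',
                       '/var/vhosts/project'])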
Consider the following script:
import argparse
parser1 = argparse.ArgumentParser()
parser1.add_argument('-a')
args1 = parser1.parse_args()
parser2 = argparse.ArgumentParser()
parser2.add_argument('-b')
args2 = parser2.parse_args()
I have several questions:
1. Is parse_args a one-time method, or is there a way to clear the arguments before adding new ones? (e.g. something like args1.clear() or parser1.clear())
2. The result of this script is unusable. Although the script accepts the -a argument, it does not accept any value for 'a', nor does it accept any -b argument. Is there some way to make any of the arguments really work?
3. This is my actual scenario: I have two scripts. Both import the same file which has initialization code (load config files, create loggers, etc.); let's call it init.py. This init.py file also parses the arguments, only because it needs one value from them. The problem is that I need one of the scripts to accept other arguments as well. Since init.py does something with one argument, I cannot postpone parse_args. How can I make it work?
Edit:
Here is the output of my script:
[prompt]# python2.7 myscript.py -a
usage: a.py [-h] [-a A]
myscript.py: error: argument -a: expected one argument
[prompt]# python2.7 myscript.py -a 1
Namespace(a='1')
usage: a.py [-h] [-b B]
myscript.py: error: unrecognized arguments: -a 1
Your scenario is quite unclear, but I guess what you're looking for is parse_known_args.
Here I assume that init.py is called from the other files, say caller1.py and caller2.py.
Also suppose that init.py only parses the -a argument, while the original script parses the rest.
You can do something like this:
In init.py, put this in the do_things method:
import sys
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-a')
parsed, remaining = parser.parse_known_args(sys.argv[1:])
print('From init.py: %s' % parsed.a)
In caller1.py:
import sys
import argparse
import init

init.do_things(sys.argv)
parser = argparse.ArgumentParser()
parser.add_argument('-b')
parsed, remaining = parser.parse_known_args(sys.argv[1:])
print('From caller1.py: %s' % parsed.b)
If you call caller1.py as follows: python caller1.py -a foo -b bar, the result will be:
From init.py: foo
From caller1.py: bar
But if your scenario is not actually like this, I would suggest using @Michael0x2a's answer, which is just to use a single ArgumentParser object in caller1.py and pass the appropriate value into init.py.
This doesn't really make sense, because for all intents and purposes the parser object is stateless. There's nothing to clear, since all it does is take in the console arguments and return a Namespace object (a pseudo-dict) without ever modifying anything in the process.
Therefore, you can consider parse_args() to be idempotent. You can repeatedly call it over and over, and the same output will occur. By default, it will read the arguments from sys.argv, which is where the console arguments are stored.
However, note that you can pipe in custom arguments by passing a list to the parse_args function, so that the parser will use something other than sys.argv as input.
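For instance, a quick sketch of both points (repeated calls, and an explicit argument list instead of sys.argv):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-a')

print(parser.parse_args(['-a', '15']))  # Namespace(a='15')
print(parser.parse_args(['-a', '15']))  # same result; the parser keeps no state between calls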
I'm not sure what you mean. If you call python myscript.py -a 15, args1 will equal Namespace(a='15'). You can then use args1.a to obtain the value '15'. If you want to make the flag act as a toggle, call parser.add_argument('-a', action='store_true'). Here is a list of all available actions.
I would try and confine all the console/interface code into a single module and into a single parser. Basically, remove the code to parse the command line from init.py and the second file into an independent little section. Once you run the parser, which presents a unified interface for everything in your program, pass in the appropriate variables to functions inside init.py. This has the added advantage of keeping the UI separate and more easily interchangeable with the rest of the code.
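A rough sketch of that layout (the module and function names are made up for illustration):

# myscript.py: the only place that touches the command line
import argparse
import init  # the shared initialization module

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-a', help='value needed by the initialization code')
    parser.add_argument('-b', help='value needed only by this script')
    args = parser.parse_args()

    init.initialize(args.a)  # hypothetical function replacing init.py's own parsing
    # ... the rest of the script uses args.b ...

if __name__ == '__main__':
    main()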
I am using the OptionParser from the optparse module to parse a command that I get using raw_input().
I have these questions.
1.) I use OptionParser to parse this input, for example (getting multiple args):
my prompt> -a foo -b bar -c spam eggs
I did this by setting action='store_true' in add_option() for '-c'. Now if there is another option with multiple arguments, say -d x y z, how do I know which arguments come from which option? Also, what if one of the arguments has to be parsed again, like:
my prompt> -a foo -b bar -c spam '-f anotheroption'
2.) If I wanted to do something like this:
my prompt> -a foo -b bar
my prompt> -c spam eggs
my prompt> -d x y z
Now each entry must not affect the options set by the previous command. How do I accomplish this?
For part 2: you want a new OptionParser instance for each line you process. And look at the cmd module for writing a command loop like this.
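A rough sketch of that combination, assuming the prompt is implemented with cmd.Cmd and using the -a/-b options from the question:

import shlex
from cmd import Cmd
from optparse import OptionParser

class MyShell(Cmd):
    prompt = 'my prompt> '

    def default(self, line):
        # lines such as "-a foo -b bar" don't match a named command, so they land here
        parser = OptionParser()  # a fresh parser per line, so nothing leaks between commands
        parser.add_option('-a')
        parser.add_option('-b')
        options, args = parser.parse_args(shlex.split(line))
        print(options, args)

    def do_EOF(self, line):
        return True

if __name__ == '__main__':
    MyShell().cmdloop()

Note that OptionParser still calls sys.exit() on a bad option, so a real loop would want to catch SystemExit around parse_args.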
You can also solve #1 using the nargs option attribute, as follows:
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-c", nargs=2)
parser.add_option("-d", nargs=3)
optparse solves #1 by requiring that an option always takes the same number of arguments (even if that number is 0); options with a variable number of arguments are not allowed:
Typically, a given option either takes an argument or it doesn't. Lots of people want an "optional option arguments" feature, meaning that some options will take an argument if they see it, and won't if they don't. This is somewhat controversial, because it makes parsing ambiguous: if "-a" takes an optional argument and "-b" is another option entirely, how do we interpret "-ab"? Because of this ambiguity, optparse does not support this feature.
You would solve #2 by not passing the previous values object back to parse_args, so that it creates a new values object each time rather than updating the old one.
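A small sketch of the difference, using the -a and -b options from the question:

from optparse import OptionParser

parser = OptionParser()
parser.add_option('-a')
parser.add_option('-b')

first, _ = parser.parse_args(['-a', 'foo'])   # fresh Values object: a='foo', b=None
second, _ = parser.parse_args(['-b', 'bar'])  # fresh Values object again: a=None, b='bar'

# Passing the previous object back via values= updates it in place instead:
merged, _ = parser.parse_args(['-b', 'bar'], values=first)  # a='foo', b='bar'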