Python Argparse: Issue with optional arguments which are negative numbers - python

I'm having a small issue with argparse. I have an option xlim which is the x range of a plot. I want to be able to pass numbers like -2e-5. However, this does not work: argparse does not recognise it as a negative number and treats it as an option instead. If I pass -0.00002 it works: argparse reads it as a negative number. Is it possible to have it read in -2e-3?
The code is below, and an example of how I would run it is:
./blaa.py --xlim -2.e-3 1e4
If I do the following it works:
./blaa.py --xlim -0.002 1e4
The code:
parser.add_argument('--xlim', nargs = 2,
                    help = 'X axis limits',
                    action = 'store', type = float,
                    default = [-1.e-3, 1.e-3])
Whilst I can get it to work this way I would really rather be able to use scientific notation. Anyone have any ideas?
Cheers

One workaround I've found is to quote the value, but adding a space. That is,
./blaa.py --xlim " -2.e-3" 1e4
This way argparse won't think -2.e-3 is an option name because the first character is not a hyphen-dash, but it will still be converted properly to a float because float(string) ignores spaces on the left.
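Both halves of that claim can be checked by handing argparse the argument list directly. This is a minimal sketch: the parser mirrors the one in the question, and the list stands in for what the shell passes on once it has stripped the quotes.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--xlim', nargs=2, type=float, default=[-1.e-3, 1.e-3])

# The shell removes the quotes, so argparse receives ' -2.e-3' with its leading space.
args = parser.parse_args(['--xlim', ' -2.e-3', '1e4'])
print(args.xlim)
```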

As already pointed out in the comments, the problem is that a value with a - prefix is parsed as an option instead of as an argument. One way to work around this is to change the prefix used for options with the prefix_chars argument:
#!/usr/bin/python
import argparse
parser = argparse.ArgumentParser(prefix_chars='#')
parser.add_argument('##xlim', nargs = 2,
                    help = 'X axis limits',
                    action = 'store', type = float,
                    default = [-1.e-3, 1.e-3])
print(parser.parse_args())
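Example output (the option name is quoted because most shells treat an unquoted # at the start of a word as the beginning of a comment):
$ ./blaa.py '##xlim' -2.e-3 1e4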
Namespace(xlim=[-0.002, 10000.0])
Edit: Alternatively, you can keep using - as separator, pass xlim as a single value and use a function in type to implement your own parsing:
#!/usr/bin/python
import argparse
def two_floats(value):
    values = value.split()
    if len(values) != 2:
        # ArgumentTypeError is the exception argparse expects from a type function
        raise argparse.ArgumentTypeError('expected two numbers, got %r' % value)
    # list() so the result is a real list under Python 3 as well
    return list(map(float, values))
parser = argparse.ArgumentParser()
parser.add_argument('--xlim',
                    help = 'X axis limits',
                    action = 'store', type=two_floats,
                    default = [-1.e-3, 1.e-3])
print(parser.parse_args())
Example output:
$ ./blaa.py --xlim "-2e-3 1e4"
Namespace(xlim=[-0.002, 10000.0])

If you specify the value for your option with an equals sign, argparse will not treat it as a separate option, even if it starts with -:
./blaa.py --xlim='-0.002 1e4'
# As opposed to --xlim '-0.002 1e4'
And if the value does not have spaces in it (or other special characters given your shell), you can drop the quotes:
./blaa.py --xlim=-0.002
See: https://www.gnu.org/software/guile/manual/html_node/Command-Line-Format.html
With this, there is no need to write your own type= parser or redefine the prefix character from - to # as the accepted answer suggests.
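A minimal sketch of the equals-sign form, assuming each limit is its own single-valued option. The --xmin and --xmax names are hypothetical and not part of the original script; as far as I can tell, an option declared with nargs=2 cannot take both of its values through a single = form, which is why the sketch splits them.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical single-valued options, used only for this illustration.
parser.add_argument('--xmin', type=float)
parser.add_argument('--xmax', type=float)

# The value after '=' is attached to the option, so its leading '-' is not
# mistaken for the start of another option.
args = parser.parse_args(['--xmin=-2e-3', '--xmax=1e4'])
print(args.xmin, args.xmax)
```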

Here is the code that I use. (It is similar to jeremiahbuddha's but it answers the question more directly since it deals with negative numbers.)
Put this before calling argparse.ArgumentParser()
import sys
for i, arg in enumerate(sys.argv):
    if len(arg) > 1 and arg[0] == '-' and arg[1].isdigit():  # guard against a bare '-'
        sys.argv[i] = ' ' + arg

Another workaround is to pass in the argument using '=' symbol in addition to quoting the argument - i.e., --xlim="-2.3e14"

If you are up to modifying argparse.py itself, you could change the negative number matcher to handle scientific notation:
In class _ActionsContainer.__init__()
self._negative_number_matcher = _re.compile(r'^-(\d+\.?|\d*\.\d+)([eE][+\-]?\d+)?$')
Or after creating the parser, you could set parser._negative_number_matcher to this value. This approach might have problems if you are creating groups or subparsers, but should work with a simple parser.
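The second variant (setting the matcher on an existing parser) might look like the sketch below. It relies on _negative_number_matcher, a private attribute of argparse, so it is an assumption about the internals rather than a supported API and may behave differently across Python versions.

```python
import argparse
import re

parser = argparse.ArgumentParser()
# Private attribute: this depends on argparse internals, not on a documented API.
parser._negative_number_matcher = re.compile(r'^-(\d+\.?|\d*\.\d+)([eE][+\-]?\d+)?$')
parser.add_argument('--xlim', nargs=2, type=float, default=[-1.e-3, 1.e-3])

args = parser.parse_args(['--xlim', '-2.e-3', '1e4'])
print(args.xlim)
```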

Inspired by andrewfn's approach, I created a separate helper function to do the sys.argv fiddling:
def _tweak_neg_scinot():
    import re
    import sys
    p = re.compile('-\\d*\\.?\\d*e', re.I)
    sys.argv = [' ' + a if p.match(a) else a for a in sys.argv]
The regex looks for:
- : a negative sign
\\d* : zero or more digits (for oddly formatted values like -.5e-2 or -4354.5e-6)
\\.? : an optional period (e.g., -2e-5 is reasonable)
\\d* : another set of zero or more digits (for things like -2e-5 and -7.e-3)
e : to match the exponent marker
re.I makes it match both -2e-5 and -2E-5. Using p.match means that it only searches from the start of each string.
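The pattern can be tried against a handful of strings to see which arguments the helper would pad with a space. This is only an illustrative check with a made-up sample list. The last entry shows that a hypothetical short option spelled -e would also match, so the helper is safest in parsers that define no such option.

```python
import re

p = re.compile('-\\d*\\.?\\d*e', re.I)

samples = ['-2e-5', '-2E-5', '-7.e-3', '-.5e-2', '-0.002', '--xlim', '1e4', '-e']
# True means the helper would prepend a space to that argument.
matches = {s: bool(p.match(s)) for s in samples}
for s, hit in matches.items():
    print(repr(s), hit)
```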

Related

Passing multiple sets of arguments [duplicate]

I need to let the end user of my Python script type something like:
script.py -sizes <2,2> <3,3> <6,6>
where each element of the -sizes option is a pair of two positive integers. How can I achieve this with argparse ?
Define a custom type:
def pair(arg):
    # For simplicity, assume arg is a pair of integers
    # separated by a comma. If you want to do more
    # validation, raise argparse.ArgumentTypeError if you
    # encounter a problem.
    return [int(x) for x in arg.split(',')]
then use this as the type for a regular argument:
p.add_argument('--sizes', type=pair, nargs='+')
Then
>>> p.parse_args('--sizes 1,3 4,6'.split())
Namespace(sizes=[[1, 3], [4, 6]])
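If more validation is wanted, a stricter version of the type function could look like the sketch below. The error messages and the decision to return tuples are arbitrary choices made for this illustration, not part of the answer above.

```python
import argparse

def pair(arg):
    # Hypothetical stricter variant: exactly two positive integers, e.g. "2,3".
    parts = arg.split(',')
    if len(parts) != 2:
        raise argparse.ArgumentTypeError('expected two values, got %r' % arg)
    try:
        width, height = int(parts[0]), int(parts[1])
    except ValueError:
        raise argparse.ArgumentTypeError('%r is not a pair of integers' % arg)
    if width <= 0 or height <= 0:
        raise argparse.ArgumentTypeError('both values in %r must be positive' % arg)
    return (width, height)

p = argparse.ArgumentParser()
p.add_argument('--sizes', type=pair, nargs='+')
args = p.parse_args('--sizes 2,2 3,3 6,6'.split())
print(args.sizes)
```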
Argparse doesn't try to cover all possible formats of input data. You can always get the sizes as strings and parse them with a few lines of code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--sizes', nargs='+')
args = parser.parse_args()
try:
    sizes = [tuple(map(int, s.split(',', maxsplit=1))) for s in args.sizes]
except Exception:
    # report the problem and exit instead of falling through to an undefined name
    parser.error('sizes cannot be parsed')
print(sizes)
"Special cases aren't special enough to break the rules."

Make argparse treat dashes and underscore identically

argparse replaces dashes in optional arguments by underscores to determine their destination:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', action='store_true')
args = parser.parse_args(['--use-unicorns'])
print(args) # returns: Namespace(use_unicorns=True)
However the user has to remember whether the option is --use-unicorns or --use_unicorns; using the wrong variant raises an error.
This can cause some frustration as the variable args.use_unicorns in the code does not make it clear which variant was defined.
How can I make argparse accept both --use-unicorns and --use_unicorns as valid ways to define this optional argument?
parser.add_argument('--use-unicorns', action='store_true')
args = parser.parse_args(['--use-unicorns'])
print(args) # returns: Namespace(use_unicorns=True)
argparse translates the '-' to '_' because the use of '-' in flags is well-established POSIX practice, but args.use-unicorns is not valid Python. In other words, it does the translation so the dest will be a valid Python variable or attribute name.
Note that argparse does not perform this translation with positionals. In that case the programmer has full control over the dest parameter, and can choose anything that's convenient. Since the argparse only uses getattr and setattr when accessing the Namespace, the constraints on a valid dest are minimal.
There are two users. There's you, the programmer, and there's your end user. What's convenient for you might not be optimal for the other.
You can also specify a dest when defining an optional, and metavar gives you further control over the help display.
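A small sketch of both points: a positional keeps its dash in dest, and an optional can be given an explicit dest and metavar. The names input-file, unicorns and N are arbitrary choices for this illustration.

```python
import argparse

parser = argparse.ArgumentParser()
# Positional: the name is used as dest unchanged, dash included.
parser.add_argument('input-file')
# Optional with an explicit dest, plus a metavar that only affects the help text.
parser.add_argument('--use-unicorns', dest='unicorns', metavar='N', type=int, default=0)

args = parser.parse_args(['data.txt', '--use-unicorns', '3'])
# 'input-file' is not a valid attribute name, so getattr is needed to read it.
print(getattr(args, 'input-file'))
print(args.unicorns)
```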
It's parser._get_optional_kwargs that performs the '-' replace:
if dest is None:
    ....
    dest = dest.replace('-', '_')
parser.add_argument accepts more than one flag for an argument. One easy way to make the parser accept both variants is to declare the argument as
parser.add_argument('--use-unicorns', '--use_unicorns', action='store_true')
However both options will show up in the help, and it is not very elegant as it forces one to write the variants manually.
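A short sketch of the two-flag declaration in use. Both spellings end up in the same attribute, because argparse derives dest from the first long option string.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', '--use_unicorns', action='store_true')

with_dash = parser.parse_args(['--use-unicorns'])
with_underscore = parser.parse_args(['--use_unicorns'])
print(with_dash, with_underscore)
```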
An alternative is to subclass argparse.ArgumentParser to make the matching invariant to replacing dashes with underscores. This requires a little bit of fiddling, as both argparse.ArgumentParser._parse_optional and argparse.ArgumentParser._get_option_tuples have to be modified to handle this matching and abbreviations, e.g. --use_unic.
I ended up with the following subclassed methods, where the matching of abbreviations is delegated from _parse_optional to _get_option_tuples:
from gettext import gettext as _
import argparse
class ArgumentParser(argparse.ArgumentParser):
    def _parse_optional(self, arg_string):
        # if it's an empty string, it was meant to be a positional
        if not arg_string:
            return None
        # if it doesn't start with a prefix, it was meant to be positional
        if not arg_string[0] in self.prefix_chars:
            return None
        # if it's just a single character, it was meant to be positional
        if len(arg_string) == 1:
            return None
        option_tuples = self._get_option_tuples(arg_string)
        # if multiple actions match, the option string was ambiguous
        if len(option_tuples) > 1:
            options = ', '.join([option_string
                for action, option_string, explicit_arg in option_tuples])
            args = {'option': arg_string, 'matches': options}
            msg = _('ambiguous option: %(option)s could match %(matches)s')
            self.error(msg % args)
        # if exactly one action matched, this segmentation is good,
        # so return the parsed action
        elif len(option_tuples) == 1:
            option_tuple, = option_tuples
            return option_tuple
        # if it was not found as an option, but it looks like a negative
        # number, it was meant to be positional
        # unless there are negative-number-like options
        if self._negative_number_matcher.match(arg_string):
            if not self._has_negative_number_optionals:
                return None
        # if it contains a space, it was meant to be a positional
        if ' ' in arg_string:
            return None
        # it was meant to be an optional but there is no such option
        # in this parser (though it might be a valid option in a subparser)
        return None, arg_string, None
    def _get_option_tuples(self, option_string):
        result = []
        if '=' in option_string:
            option_prefix, explicit_arg = option_string.split('=', 1)
        else:
            option_prefix = option_string
            explicit_arg = None
        if option_prefix in self._option_string_actions:
            action = self._option_string_actions[option_prefix]
            tup = action, option_prefix, explicit_arg
            result.append(tup)
        else: # imperfect match
            chars = self.prefix_chars
            if option_string[0] in chars and option_string[1] not in chars:
                # short option: if single character, can be concatenated with arguments
                short_option_prefix = option_string[:2]
                short_explicit_arg = option_string[2:]
                if short_option_prefix in self._option_string_actions:
                    action = self._option_string_actions[short_option_prefix]
                    tup = action, short_option_prefix, short_explicit_arg
                    result.append(tup)
            underscored = {k.replace('-', '_'): k for k in self._option_string_actions}
            option_prefix = option_prefix.replace('-', '_')
            if option_prefix in underscored:
                action = self._option_string_actions[underscored[option_prefix]]
                tup = action, underscored[option_prefix], explicit_arg
                result.append(tup)
            elif self.allow_abbrev:
                for option_string in underscored:
                    if option_string.startswith(option_prefix):
                        action = self._option_string_actions[underscored[option_string]]
                        tup = action, underscored[option_string], explicit_arg
                        result.append(tup)
        # return the collected option tuples
        return result
A lot of this code is directly derived from the corresponding methods in argparse (from the CPython implementation here). Using this subclass should make the matching of optional arguments invariant to using dashes - or underscores _.

How should a variable be passed within format() in Python?

I am seeking to be able to use variables within the format() parentheses, in order to parameterize it within a function. Providing an example below:
sample_str = 'sample_str_{nvars}'
nvars_test = 'apple'
sample_str.format(nvars = nvars_test) # Successful result: 'sample_str_apple'
But the following does not work -
sample_str = 'sample_str_{nvars}'
nvars_test_2 = 'nvars = apple'
sample_str.format(nvars_test_2) # KeyError: 'nvars'
Would anyone know how to do this? Thanks.
Many thanks for guidance. I did a bit more searching. For anyone who may run into the same problem, please see examples here: https://pyformat.info
sample_str = 'sample_str_{nvars}'
nvars_test_2 = {'nvars':'apple'}
sample_str.format(**nvars_test_2) # Successful result: 'sample_str_apple'
First, I'd recommend checking out the string format examples.
Your first example works as expected. From the documentation, you are permitted to actually name the thing you are passing into {}, and then pass in a same-named variable for str.format():
'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
# returns 'Coordinates: 37.24N, -115.81W'
Your second example doesn't work because you are not passing a variable called nvars in with str.format() - you are passing in a string: 'nvars = apple'.
sample_str = 'sample_str_{nvars}'
nvars_test_2 = 'nvars = apple'
sample_str.format(nvars_test_2) # KeyError: 'nvars'
It's a little more common (I think) to not name those curly-braced parameters - easier to read at least.
print('sample_str_{}'.format("apple")) should print sample_str_apple.
If you're using Python 3.6 or later, you also have access to Python's formatted string literals.
>>> greeting = 'hello'
>>> name = 'Jane'
>>> f'{greeting} {name}'
'hello Jane'
Note that the literal expects the variables to be already present. Otherwise you get an error.
>>> f'the time is now {time}'
NameError: name 'time' is not defined
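When the placeholder name itself is only known at run time, it can be kept in a variable and passed through a dictionary. A minimal sketch, assuming the variable names field_name and field_value (both made up for this example):

```python
sample_str = 'sample_str_{nvars}'

# The placeholder name and its value both live in ordinary variables.
field_name = 'nvars'
field_value = 'apple'

# Unpack a one-item dict into keyword arguments, or hand the dict to format_map.
via_unpacking = sample_str.format(**{field_name: field_value})
via_format_map = sample_str.format_map({field_name: field_value})
print(via_unpacking, via_format_map)
```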

How to truncate and pad in python3 using just str.format?

I would like to do this:
'{pathname:>90}'.format(pathname='abcde')[-2:]
using string formatting instead of array indexing.
So the result would be 'de'
or in the case of pathname='e' the result would be ' e' with a space before e. If the index would be [2:] this question would be answered by How to truncate a string using str.format in Python?
I need this in the following example:
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.INFO,style='{',format='{pathname:>90}{lineno:4d}{msg}')
logging.info('k')
The precision trick (using a precision format) doesn't work here: it can only truncate the end of the string.
A workaround would be to slice the string before passing it to str.format:
>>> '{pathname:>2}'.format(pathname='abcde'[-2:])
'de'
>>> '{pathname:>2}'.format(pathname='e'[-2:])
' e'
Since you cannot control the arguments passed to format, you could create a subclass of str and redefine format so that when it finds pathname in the keyword arguments it truncates it, then calls the original str.format method.
Small self-contained example:
class TruncatePathnameStr(str):
    def format(self,*args,**kwargs):
        if "pathname" in kwargs:
            # truncate
            kwargs["pathname"] = kwargs["pathname"][-2:]
        return str.format(self,*args,**kwargs)
s = TruncatePathnameStr('##{pathname:>4}##')
print(s.format(pathname='abcde'))
That prints:
##  de##
Use it in your real-life example:
logging.basicConfig(stream=sys.stdout, level=logging.INFO,style='{',
                    format=TruncatePathnameStr('{pathname:>90}{lineno:4d}{msg}'))

Parse Args that aren't declared

I'm writing a utility for running bash commands that essentially takes as input a string and a list of optional arguments, and uses the optional arguments to interpolate the string.
I'd like it to work like this:
interpolate.py Hello {user_arg} my name is {computer_arg} %% --user_arg=john --computer_arg=hal
The %% is a separator, it separates the string to be interpolated from the arguments. What follows is the arguments used to interpolate the string. In this case the user has chosen user_arg and computer_arg as arguments. The program can't know in advance which argument names the user will choose.
My problem is how to parse the arguments? I can trivially split the input arguments on the separator but I can't figure out how to get optparse to just give the list of optional args as a dictionary, without specifying them in advance. Does anyone know how to do this without writing a lot of regex?
Well, if you use '--' to separate options from arguments instead of %%, optparse/argparse will just give you the arguments as a plain list (treating them as positional arguments instead of switches). After that it's not 'a lot of' regex, it's a mere split:
for argument in args:
    if not argument.startswith("--"):
        # decide what to do in this case...
        continue
    arg_name, arg_value = argument.split("=", 1)
    arg_name = arg_name[2:]
    # use the argument any way you like
With argparse you could use the parse_known_args method to consume predefined arguments and any additional arguments. For example, using the following script
import sys
import argparse
def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('string', type=str, nargs='*',
                        help="""String to process. Optionally with interpolation
                        (explain this here...)""")
    args, opt_args = parser.parse_known_args(argv)
    print(args)
    print(opt_args)
    return 0
if __name__=='__main__':
    sys.exit(main(sys.argv[1:]))
and calling with
python script.py Hello, my name is {name} --name=chris
yields the following output:
Namespace(string=['Hello,', 'my', 'name', 'is', '{name}'])
['--name=chris']
All that is left to do is to loop through the args namespace looking for strings of the form {...} and replacing them with the corresponding element in opt_args, if present. (I'm not sure if argparse can do argument interpolation automatically, the above example is the only immediate solution which comes to mind).
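That remaining step could be sketched as follows. Instead of looping over the words by hand, this version turns the leftover --key=value items into a dictionary and lets str.format do the substitution. The interpolate function name is made up for this example, and it assumes every leftover argument has the --key=value shape.

```python
import argparse

def interpolate(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('string', nargs='*')
    # Undeclared options such as --name=chris end up in opt_args.
    args, opt_args = parser.parse_known_args(argv)
    values = {}
    for item in opt_args:
        key, _, value = item.partition('=')
        values[key.lstrip('-')] = value
    # Join the words back together and fill in the {placeholders}.
    return ' '.join(args.string).format(**values)

result = interpolate(['Hello,', 'my', 'name', 'is', '{name}', '--name=chris'])
print(result)
```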
For something like this, you really don't need optparse or argparse: the benefits of such libraries (things like lone -v type arguments, checking for invalid options, value validation and so on) are of little use in this circumstance.
def partition_list(lst, sep):
    """Slices a list in two, cutting on index matching "sep"
    >>> partition_list(['a', 'b', 'c'], sep='b')
    (['a'], ['c'])
    """
    if sep in lst:
        idx = lst.index(sep)
        return (lst[:idx], lst[idx+1:])
    else:
        # no separator: everything is the string and there are no arguments
        return (lst[:], [])
def args_to_dict(args):
    """Crudely parses "--blah=123" type arguments into dict like
    {'blah': '123'}
    """
    ret = {}
    for a in args:
        key, _, value = a.partition("=")
        key = key.replace("--", "", 1)
        ret[key] = value
    return ret
if __name__ == '__main__':
    import sys
    # Get stuff before/after the "%%" separator
    string, args = partition_list(sys.argv[1:], "%%")
    # Join input string
    string_joined = " ".join(string)
    # Parse --args=stuff
    d = args_to_dict(args)
    # Do string-interpolation
    print(string_joined.format(**d))
