I have this code that works fine:
import click

@click.command(context_settings=dict(help_option_names=['-h', '--help']))
@click.option('--team_name', required=True, help='Team name')
@click.option('--input_file', default='url_file.txt', help='Input file name for applications, URLs')
@click.option('--output_file', default='test_results_file.txt', help='Output file name to store test results')
def main(team_name, input_file, output_file):
    # function body
    pass

if __name__ == '__main__':
    main()  # how does this work?
As you see, main is being called with no arguments though it is supposed to receive three. How does this work?
As mentioned in the comments, this is taken care of by the decorators. The click.command decorator turns the function into an instance of click.Command.
Each click.option decorator builds an instance of click.Option and attaches it to the click.Command object to be used later.
This click.Command object implements a __call__ method, which is invoked by your call to main().
def __call__(self, *args, **kwargs):
    """Alias for :meth:`main`."""
    return self.main(*args, **kwargs)
It does nothing more than invoke click.Command.main().
Near the top of click.Command.main() is:
if args is None:
    args = get_os_args()
else:
    args = list(args)
This code gets argv from the command line or uses a provided list of args. Further code in this method does, among other things, the parsing of the command line into a context, and the eventual calling of your main() with the values from the click.Option instances built earlier:
with self.make_context(prog_name, args, **extra) as ctx:
    rv = self.invoke(ctx)
This is the source of the mysterious 3 arguments.
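To make the mechanism concrete, here is a minimal sketch of the same idea using only the standard library. MiniCommand, command and option are hypothetical names invented for this illustration. This is not click's actual implementation, and the argument parsing is deliberately naive.

```python
import sys


class MiniCommand:
    """Hypothetical stand-in for click.Command, for illustration only."""

    def __init__(self, callback, options):
        self.callback = callback
        self.options = options  # list of (name, default) pairs

    def __call__(self, *args, **kwargs):
        # Calling the decorated "function" really calls this method.
        return self.main(*args, **kwargs)

    def main(self, args=None):
        # No explicit args: fall back to the process command line.
        args = sys.argv[1:] if args is None else list(args)
        params = dict(self.options)
        tokens = iter(args)
        for token in tokens:
            name = token.lstrip('-')
            if name in params:
                params[name] = next(tokens)
        # The original function finally receives its arguments here.
        return self.callback(**params)


def option(flag, default=None):
    """Record an option on the function so command() can collect it later."""
    def decorator(func):
        func.__dict__.setdefault('_pending_options', []).append(
            (flag.lstrip('-'), default))
        return func
    return decorator


def command(func):
    """Replace the function with a MiniCommand that wraps it."""
    return MiniCommand(func, getattr(func, '_pending_options', []))


@command
@option('--team_name')
@option('--input_file', default='url_file.txt')
def main(team_name, input_file):
    return (team_name, input_file)


# main is now a MiniCommand; it parses the list and fills in the arguments.
result = main(['--team_name', 'red'])
print(result)
```

Passing an explicit list instead of relying on sys.argv keeps the sketch easy to run from an interpreter.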
Related
I am trying to patch __new__ method of a class, and it is not working as I expect.
from contextlib import contextmanager

class A:
    def __init__(self, arg):
        print('A init', arg)

@contextmanager
def patch_a():
    new = A.__new__

    def fake_new(cls, *args, **kwargs):
        print('call fake_new')
        return new(cls, *args, **kwargs)
        # here I get error: TypeError: object.__new__() takes exactly one argument (the type to instantiate)

    A.__new__ = fake_new
    try:
        yield
    finally:
        A.__new__ = new

if __name__ == '__main__':
    A('foo')
    with patch_a():
        A('bar')
    A('baz')
I expect the following output:
A init foo
call fake_new
A init bar
A init baz
But after "call fake_new" is printed I get an error (see the comment in the code).
To me it seems like I just decorate the __new__ method and propagate all args unchanged.
It doesn't work, and the reason is obscure to me.
Also, if I write return new(cls) instead, the call A('bar') works fine. But then A('baz') breaks.
Can someone explain what is going on?
Python version is 3.8
You've run into a complicated part of Python object instantiation, in which the language opted for a design that allows one to create a custom __init__ method with parameters without having to touch __new__.
However, at the base of the class hierarchy, object, both __new__ and __init__ take a single parameter each.
IIRC, it goes this way: if your class has a custom __init__ and you did not touch __new__, any extra parameters given at instantiation (which go to both __init__ and __new__) are ignored by object.__new__, so you don't have to customize it just to swallow the parameters you consume in __init__. The converse is also true: if your class has a custom __new__ with extra parameters and no custom __init__, these are ignored by object.__init__.
With your design, Python sees a custom __new__ and passes it the same extra arguments that are passed to __init__. By using *args, **kw, you forward those to object.__new__, which accepts a single parameter, and you get the error you presented us.
The fix is to not pass those extra parameters to the original __new__ method, unless they are needed there, so you have to make the same check Python's type does when instantiating an object.
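The rule can be observed in isolation with a small sketch. OnlyInit and ForwardsArgs are hypothetical class names used just for this illustration, and the behaviour shown is that of CPython 3.

```python
class OnlyInit:
    # Custom __init__, untouched __new__: the extra argument is accepted.
    def __init__(self, arg):
        self.arg = arg


class ForwardsArgs:
    # Custom __new__ that forwards everything to object.__new__.
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls, *args, **kwargs)

    def __init__(self, arg):
        self.arg = arg


ok = OnlyInit('foo')  # works: object.__new__ tolerates the extra argument

error = None
try:
    ForwardsArgs('foo')  # object.__new__ now rejects the extra argument
except TypeError as exc:
    error = str(exc)

print(ok.arg)
print(error)
```

The only difference between the two classes is the presence of a custom __new__, which is what changes how object.__new__ treats the extra argument.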
And an interesting surprise to top it off: while making the example work, I found out that even if A.__new__ is deleted when restoring the patch, it is still considered "touched" by CPython's type instantiation, and the arguments are passed through.
In order to get your code working, I needed to leave a permanent stub A.__new__ that forwards only the cls argument:
from contextlib import contextmanager

class A:
    def __init__(self, arg):
        print('A init', arg)

@contextmanager
def patch_a():
    new = A.__new__

    def fake_new(cls, *args, **kwargs):
        print('call fake_new')
        if new is object.__new__:
            # object.__new__ accepts only the class, so drop the extra arguments
            return new(cls)
        return new(cls, *args, **kwargs)

    A.__new__ = fake_new
    try:
        yield
    finally:
        del A.__new__
        if new is not object.__new__:
            A.__new__ = new
        else:
            # permanent stub: forward only cls to object.__new__
            A.__new__ = lambda cls, *args, **kw: object.__new__(cls)
        print(A.__new__)

if __name__ == '__main__':
    A('foo')
    with patch_a():
        A('bar')
    A('baz')
(I tried inspecting the original __new__ signature instead of the new is object.__new__ comparison - to no avail: object.__new__ signature is *args, **kwargs - possibly made so that it will never fail on static checking)
I frequently use the following pattern, where I create an instance of a Main object, typically for a batch, then call process() on it. All the necessary options are passed into the __init__. This means that, provided I create a Main with the appropriate parameters, I can run it from anywhere (celery being an example).
(I have learned to minimize in-celery debugging to the greatest extent to keep my sanity - the scripts are debugged fully from the command line first, and only then are Mains created and launched from celery)
This is my working Click approach:
Pretty simple: the decorated run function defines its arguments and passes them to Main.__init__.
Conceptually however, run and its docstring are very separated from that __init__. So on the __init__ I have to look at run to understand what to pass and when I --help the command line, I get the docstring for run.
import click

@click.command()
@click.option('--anoption')
def run(**kwargs):
    mgr = Main(**kwargs)
    mgr.process()

class Main:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

    def process(self):
        print(f"{self}.process({vars(self)})")

if __name__ == "__main__":
    run()
output:
<__main__.Main object at 0x10d5e42b0>.process({'anoption': None})
This is what I would like instead:
import click

class Main:
    @click.command()
    @click.option('--anoption')
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

    def process(self):
        print(f"{self}.process({vars(self)})")

if __name__ == "__main__":
    main = Main(standalone_mode=False)
    main.process()
But that gets me TypeError: __init__() missing 1 required positional argument: 'self'
Using a factory method
The closest I've come is with a factory staticmethod, but that's not great. At least the factory is sitting on the class, but there's still not much of a link to the __init__.
import click

class Main:
    @staticmethod
    @click.command()
    @click.option('--anoption')
    def factory(**kwargs):
        return Main(**kwargs)

    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

    def process(self):
        print(f"{self}.process({vars(self)})")

if __name__ == "__main__":
    main = Main.factory(standalone_mode=False)
    main.process()
output:
<__main__.Main object at 0x10ef59128>.process({'anoption': None})
What's a simple Pythonic way to bring click and that Main.__init__ closer together?
I've written a CLI with click originally as a module and it worked fine. But since my project got bigger I now need to have attributes the CLI can work with, so I tried to turn it into a class, but I'm running into an error doing it. My code is like the following:
import click
import click_repl
import os
from prompt_toolkit.history import FileHistory

class CLI:
    def __init__(self):
        pass

    @click.group(invoke_without_command=True)
    @click.pass_context
    def cli(self, ctx):
        if ctx.invoked_subcommand is None:
            ctx.invoke(self.repl)

    @cli.command()
    def foo(self):
        print("foo")

    @cli.command()
    def repl(self):
        prompt_kwargs = {
            'history': FileHistory(os.path.expanduser('~/.repl_history'))
        }
        click_repl.repl(click.get_current_context(), prompt_kwargs)

    def main(self):
        while True:
            try:
                self.cli(obj={})
            except SystemExit:
                pass

if __name__ == "__main__":
    foo = CLI()
    foo.main()
Without all the selfs and the class CLI: line, the CLI works as expected, but as a class it runs into an error: TypeError: cli() missing 1 required positional argument: 'ctx'. I don't understand why this happens. As far as I know, calling self.cli() should pass self automatically, and obj={} should be passed as ctx.obj, so it shouldn't make any difference to cli whether it's wrapped in a class or not.
Can someone explain to me, why this happens and more important, how I can fix it?
In case it's relevant here is the complete error stack trace:
Traceback (most recent call last):
  File "C:/Users/user/.PyCharmCE2018.2/config/scratches/exec.py", line 37, in <module>
    foo.main()
  File "C:/Users/user/.PyCharmCE2018.2/config/scratches/exec.py", line 30, in main
    self.cli(obj={})
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 1114, in invoke
    return Command.invoke(self, ctx)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\click\decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
TypeError: cli() missing 1 required positional argument: 'ctx'
EDIT: The problem seems to be the pass_context call. Usually pass_context would provide the current context as first parameter to the function, so that obj={} would be passed to the context instance. But since I wrapped the click-group into a class, the first spot is taken by the self-reference, so that the current context can't be passed to the function. Any ideas of how to work around that?
I tried changing def cli() the following way:
@click.group(invoke_without_command=True)
def cli(self):
    ctx = click.get_current_context()
    ctx.obj = {}
    if ctx.invoked_subcommand is None:
        ctx.invoke(self.repl)
This way I don't pass the context in the call, which avoids a conflict with self, but if I run this with self.cli(), I get TypeError: cli() missing 1 required positional argument: 'self'.
Calling it with self.cli(self) runs into TypeError: 'CLI' object is not iterable.
I am afraid the click library is not designed to work as a class. Click makes use of decorators. Don't take decorators too lightly. Decorators literally take your function as argument and return a different function.
For example:
@cli.command()
def foo(self):
    ...
is something along the lines of
foo = cli.command()(foo)
So, I am afraid that click has no support for decorating functions bound to classes, and can only decorate functions that are unbound. So, basically, the answer to your question is: don't use a class.
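One plausible way to see why self never arrives is sketched below. It assumes that the object produced by the click decorators is an ordinary callable instance that does not implement the descriptor protocol (no __get__), which I have not checked against every click version. CallableWrapper and Holder are hypothetical names for this illustration only.

```python
class CallableWrapper:
    """Hypothetical stand-in for an object returned by a decorator."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


class Holder:
    # The method is replaced by a CallableWrapper instance. Plain functions
    # define __get__ and therefore bind self; this object does not.
    @CallableWrapper
    def method(self):
        return "called"


h = Holder()

error = None
try:
    h.method()  # self is NOT passed automatically
except TypeError as exc:
    error = str(exc)

result = h.method(h)  # passing the instance by hand works

print(error)
print(result)
```

Under that assumption, self.cli(obj={}) reaches the command with no bound instance at all, which matches the missing-argument errors in the question.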
You might be wondering how to organize your code now. Most languages present the class as a unit of organization.
Python, however, goes one step further and gives you modules as well. Basically, a file is a module, and everything you put in that file is automatically associated with that module.
So, just name a file cli.py and create your attributes as global variables. This might give you other problems, since you cannot rebind a global variable inside a function without a global declaration, but you can use a class to contain your variables instead.
class Variables:
    pass

variables = Variables()
variables.something = "Something"

def f():
    variables.something = "Nothing"
I am trying to wrap the pyspark Pipeline.__init__ constructor and monkey patch in the newly wrapped constructor. However, I am running into an error that seems to have something to do with the way Pipeline.__init__ uses decorators.
Here is the code that actually does the monkey patch:
def monkeyPatchPipeline():
    oldInit = Pipeline.__init__

    def newInit(self, **keywordArgs):
        oldInit(self, stages=keywordArgs["stages"])

    Pipeline.__init__ = newInit
However, when I run a simple program:
import PythonSparkCombinatorLibrary
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import HashingTF, Tokenizer
PythonSparkCombinatorLibrary.TransformWrapper.monkeyPatchPipeline()
tokenizer = Tokenizer(inputCol="text", outputCol="words")
hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(),outputCol="features")
lr = LogisticRegression(maxIter=10, regParam=0.001)
pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
I get this error:
Traceback (most recent call last):
  File "C:\<my path>\PythonApplication1\main.py", line 26, in <module>
    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
  File "C:\<my path>\PythonApplication1\PythonSparkCombinatorLibrary.py", line 36, in newInit
    oldInit(self, stages=keywordArgs["stages"])
  File "C:\<pyspark_path>\pyspark\__init__.py", line 98, in wrapper
    return func(*args, **kwargs)
  File "C:\<pyspark_path>\pyspark\ml\pipeline.py", line 63, in __init__
    kwargs = self.__init__._input_kwargs
AttributeError: 'function' object has no attribute '_input_kwargs'
Looking into the pyspark interface, I see that Pipeline.__init__ looks like this:
@keyword_only
def __init__(self, stages=None):
    """
    __init__(self, stages=None)
    """
    if stages is None:
        stages = []
    super(Pipeline, self).__init__()
    kwargs = self.__init__._input_kwargs
    self.setParams(**kwargs)
And noting the @keyword_only decorator, I inspected that code as well:
def keyword_only(func):
    """
    A decorator that forces keyword arguments in the wrapped method
    and saves actual input keyword arguments in `_input_kwargs`.
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper
I'm totally confused both about how this code works in the first place, and also why it seems to cause problems with my own wrapper. I see that wrapper is adding an _input_kwargs field to itself, but how is Pipeline.__init__ able to read that field with self.__init__._input_kwargs? And why doesn't the same thing happen when I wrap Pipeline.__init__ again?
Decorator 101. A decorator is a higher-order function which takes a function as its first (and typically only) argument and returns a function. The @ annotation is just syntactic sugar for a simple function call, so the following
@decorator
def decorated(x):
    ...
can be rewritten, for example, as:
def decorated_(x):
    ...
decorated = decorator(decorated_)
So Pipeline.__init__ is actually a wrapper decorated with functools.wraps, which captures the defined __init__ (the func argument of keyword_only) as part of its closure. When it is called, it stores the received kwargs as a function attribute of itself. Basically, what happens here can be simplified to:
def f(**kwargs):
    f._input_kwargs = kwargs  # f is in the current scope
hasattr(f, "_input_kwargs")
False
f(foo=1, bar="x")
hasattr(f, "_input_kwargs")
True
When you further wrap (decorate) __init__, the external function won't have _input_kwargs attached, hence the error. If you want to make it work, you have to apply the same process as used by the original __init__ to your own version, for example with the same decorator:
@keyword_only
def newInit(self, **keywordArgs):
    oldInit(self, stages=keywordArgs["stages"])
but, as I mentioned in the comments, you should rather consider subclassing.
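The failure and the fix can be reproduced without pyspark. In the sketch below, keyword_only is copied from the question, while the Pipeline class is a hypothetical stand-in that only mimics how the real __init__ reads _input_kwargs.

```python
from functools import wraps


def keyword_only(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper


class Pipeline:
    """Hypothetical stand-in for pyspark.ml.Pipeline."""

    @keyword_only
    def __init__(self, stages=None):
        # Looks up _input_kwargs on whatever is *currently* bound as __init__.
        kwargs = self.__init__._input_kwargs
        self.stages = kwargs.get("stages", [])


old_init = Pipeline.__init__


def new_init(self, **keyword_args):
    old_init(self, stages=keyword_args["stages"])


# Plain monkey patch: new_init has no _input_kwargs attribute.
Pipeline.__init__ = new_init
error = None
try:
    Pipeline(stages=[1])
except AttributeError as exc:
    error = str(exc)

# Decorating the replacement attaches _input_kwargs to the new wrapper.
Pipeline.__init__ = keyword_only(new_init)
fixed = Pipeline(stages=[1, 2])

print(error)
print(fixed.stages)
```

The lookup self.__init__ resolves to the outermost function on the class, so that function is the one that needs the attribute.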
In Python's OptionParser, how can I instruct it to ignore undefined options supplied to the parse_args method?
e.g.
I've only defined option --foo for my OptionParser instance, but I call parse_args with list: [ '--foo', '--bar' ]
I don't care if it filters them out of the original list. I just want undefined options ignored.
The reason I'm doing this is because I'm using SCons' AddOption interface to add custom build options. However, some of those options guide the declaration of the targets. Thus I need to parse them out of sys.argv at different points in the script without having access to all the options. In the end, the top level Scons OptionParser will catch all the undefined options in the command line.
Here's one way to have unknown arguments added to the result args of OptionParser.parse_args, with a simple subclass.
from optparse import OptionParser, BadOptionError, AmbiguousOptionError

class PassThroughOptionParser(OptionParser):
    """
    An unknown option pass-through implementation of OptionParser.

    When unknown arguments are encountered, bundle with largs and try again,
    until rargs is depleted.

    sys.exit(status) will still be called if a known argument is passed
    incorrectly (e.g. missing arguments or bad argument types, etc.)
    """
    def _process_args(self, largs, rargs, values):
        while rargs:
            try:
                OptionParser._process_args(self, largs, rargs, values)
            except (BadOptionError, AmbiguousOptionError) as e:
                largs.append(e.opt_str)
And here's a snippet to show that it works:
# Show that the pass-through option parser works.
if __name__ == "__main__":  # pragma: no cover
    parser = PassThroughOptionParser()
    parser.add_option('-k', '--known-arg', dest='known_arg', nargs=1, type='int')
    (options, args) = parser.parse_args(['--shazbot', '--known-arg=1'])
    assert args[0] == '--shazbot'
    assert options.known_arg == 1

    (options, args) = parser.parse_args(['--k', '4', '--batman-and-robin'])
    assert args[0] == '--batman-and-robin'
    assert options.known_arg == 4
By default there is no way to modify the behavior of the call to error() that is raised when an undefined option is passed. From the documentation at the bottom of the section on how optparse handles errors:
If optparse‘s default error-handling behaviour does not suit your needs, you’ll need to
subclass OptionParser and override its exit() and/or error() methods.
The simplest example of this would be:
class MyOptionParser(OptionParser):
    def error(self, msg):
        pass
This would simply make all calls to error() do nothing. Of course this isn't ideal, but I believe that this illustrates what you'd need to do. Keep in mind the docstring from error() and you should be good to go as you proceed:
Print a usage message incorporating 'msg' to stderr and
exit.
If you override this in a subclass, it should not return -- it
should either exit or raise an exception.
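Following that docstring, a variant that raises instead of silently returning might look like the sketch below. RaisingOptionParser is a hypothetical name, and ValueError is an arbitrary choice of exception for this illustration.

```python
from optparse import OptionParser


class RaisingOptionParser(OptionParser):
    def error(self, msg):
        # Do not return: raise so the caller can decide what to do.
        raise ValueError(msg)


parser = RaisingOptionParser()
parser.add_option('--foo', action='store_true')

message = None
try:
    parser.parse_args(['--bar'])  # --bar was never defined
except ValueError as exc:
    message = str(exc)

print(message)
```

The caller can then catch the exception and carry on, rather than having the process exit.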
Python 2.7 (which didn't exist when this question was asked) now provides the argparse module. You may be able to use ArgumentParser.parse_known_args() to accomplish the goal of this question.
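A minimal sketch of the argparse approach, using the --foo and --bar options from the question:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--foo', action='store_true')

# parse_known_args returns the parsed namespace plus the leftover arguments
# instead of exiting with an error on options it does not recognise.
known, unknown = parser.parse_known_args(['--foo', '--bar'])

print(known.foo)
print(unknown)
```

The leftover list can then be handed to whichever parser knows about the remaining options.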
This is the pass_through.py example from the Optik distribution.
#!/usr/bin/env python

# "Pass-through" option parsing -- an OptionParser that ignores
# unknown options and lets them pile up in the leftover argument
# list.  Useful for programs that pass unknown options through
# to a sub-program.

from optparse import OptionParser, BadOptionError

class PassThroughOptionParser(OptionParser):

    def _process_long_opt(self, rargs, values):
        try:
            OptionParser._process_long_opt(self, rargs, values)
        except BadOptionError, err:
            self.largs.append(err.opt_str)

    def _process_short_opts(self, rargs, values):
        try:
            OptionParser._process_short_opts(self, rargs, values)
        except BadOptionError, err:
            self.largs.append(err.opt_str)

def main():
    parser = PassThroughOptionParser()
    parser.add_option("-a", help="some option")
    parser.add_option("-b", help="some other option")
    parser.add_option("--other", action='store_true',
                      help="long option that takes no arg")
    parser.add_option("--value",
                      help="long option that takes an arg")
    (options, args) = parser.parse_args()
    print "options:", options
    print "args:", args

main()
Per synack's request in a different answer's comments, I'm posting my hack of a solution which sanitizes the inputs before passing them to the parent OptionParser:
import optparse
import re
import copy
import SCons

class NoErrOptionParser(optparse.OptionParser):
    def __init__(self, *args, **kwargs):
        self.valid_args_cre_list = []
        optparse.OptionParser.__init__(self, *args, **kwargs)

    def error(self, msg):
        pass

    def add_option(self, *args, **kwargs):
        self.valid_args_cre_list.append(re.compile('^' + args[0] + '='))
        optparse.OptionParser.add_option(self, *args, **kwargs)

    def parse_args(self, *args, **kwargs):
        # filter out invalid options
        args_to_parse = args[0]
        new_args_to_parse = []
        for a in args_to_parse:
            for cre in self.valid_args_cre_list:
                if cre.match(a):
                    new_args_to_parse.append(a)

        # nuke old values and insert the new
        while len(args_to_parse) > 0:
            args_to_parse.pop()
        for a in new_args_to_parse:
            args_to_parse.append(a)

        return optparse.OptionParser.parse_args(self, *args, **kwargs)

def AddOption_and_get_NoErrOptionParser(*args, **kwargs):
    apply(SCons.Script.AddOption, args, kwargs)
    no_err_optparser = NoErrOptionParser(optparse.SUPPRESS_USAGE)
    apply(no_err_optparser.add_option, args, kwargs)
    return no_err_optpars