Sphinx: Autodocs for classes inheriting from base classes implemented in C - python

I am running into an error/warning when inheriting from collections.deque. Sphinx complains:
utils.py:docstring of utils.Test.count:1: WARNING: py:class reference target not found: integer -- return number of occurrences of value
For this example, utils.py only contains
from collections import deque

class Test(deque):
    pass
My best guess is that this might be caused by deque being implemented in C, since integer is not a valid Python type. Looking around a bit, I found this line in the CPython source of collections, which could be exactly that part, should my hypothesis be correct. But I have no clue about CPython, the inner workings of Sphinx autodoc and the boundary between those worlds.
Is there a good solution to correcting this, whether in the annotation of CPython or the autodoc plugin?

It seems I was more or less on the money. integer is used elsewhere and seems to work, but more importantly Sphinx expects docstrings to follow the example outlined in PEP 7, which suggests placing the signature on the first line (followed by a line break before any other documentation). So it looks for a class named by the whole string between -> and \n, which in my case is integer -- return number of occurrences of value.
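To see what autodoc is working with, the first docstring line of the inherited methods can be printed directly. This is only an inspection sketch: the helper name `first_doc_lines` is made up here, and the exact text may differ between CPython versions.

```python
from collections import deque


def first_doc_lines(cls, names):
    """Return the first docstring line of each named attribute of cls."""
    lines = {}
    for name in names:
        doc = getattr(cls, name).__doc__ or ""
        lines[name] = doc.strip().split("\n", 1)[0]
    return lines


# These are some of the methods that show up in the warnings.
for name, line in first_doc_lines(deque, ["count", "index", "reverse"]).items():
    print(f"{name}: {line}")
```

Whatever follows the signature on that first line is the part that ends up in the "reference target not found" warning.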
The workaround is to put all those strings into nitpick_ignore:
nitpick_ignore = [
    # Wrong docstrings in the CPython source of collections.deque
    ("py:class", "-- reverse *IN PLACE*"),
    ("py:class", "integer -- return first index of value."),
    ("py:class", "integer -- return number of occurrences of value"),
    ("py:class", "-- insert object before index"),
    ("py:class", "-- remove first occurrence of value."),
    ("py:class", "-- size of D in memory, in bytes"),
    ("py:class", "-- return a reverse iterator over the deque"),
]
A proper fix would be to change the documentation in the CPython code accordingly.
I was about to do that, when I found out my coworker beat me to it by 2 weeks. But on the issue he opened, a core developer argues that this scheme is nothing but an example and that the documentation is only required to be human readable as opposed to specifying the exact types. So it doesn't look like there is going to be a fix anytime soon.

Related

Keeping alias types simple in Python documentation?

I'm trying to use the typing module to document my Python package, and I have a number of situations where several different types are allowable for a function parameter. For instance, you can either pass a number, an Envelope object (one of the classes in my package), or a list of numbers from which an Envelope is constructed, or a list of lists of numbers from which an envelope is constructed. So I make an alias type as follows:
NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real, Envelope]
Then I write the function:
def example_function(parameter: NumberOrEnvelope):
    ...
And that looks great to me. However, when I create the documentation using Sphinx, I end up with this horrifically unreadable function signature:
example_function(parameter: Union[Sequence[numbers.Real], Sequence[Sequence[numbers.Real]], numbers.Real, expenvelope.envelope.Envelope])
Same thing also with the hints that pop up when I start to try to use the function in PyCharm.
Is there some way I can have it just leave it as "NumberOrEnvelope"? Ideally that would also link in the documentation to a clarification of what "NumberOrEnvelope" is, though even if it didn't it would be way better than what's appearing now.
I had the same issue and used https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_type_aliases, introduced in version 3.3.
In your Sphinx conf.py, insert this line. It does not seem to make much sense at first sight, but it does the trick:
autodoc_type_aliases = dict(NumberOrEnvelope='NumberOrEnvelope')
Warning: It only works in modules that start with from __future__ import annotations
Note: If there is a target in the documentation, type references even have a hyperlink to the definition. I have classes, documented elsewhere with autoclass, which are used as types of function parameters, and the docs show the nice names of the types with links.
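The reason for the warning, as far as I understand it, is that autodoc_type_aliases relies on postponed evaluation of annotations (PEP 563): with the future import, the annotation is stored as the plain string "NumberOrEnvelope" instead of the expanded Union. A minimal sketch of that effect follows; the module source is hypothetical and Envelope is left out so that it runs on its own.

```python
# Hypothetical module source; Envelope is omitted to keep the sketch self-contained.
MODULE_SOURCE = '''
from __future__ import annotations
from numbers import Real
from typing import Sequence, Union

NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real]

def example_function(parameter: NumberOrEnvelope):
    ...
'''

# Execute the source as if it were its own module.
namespace = {}
exec(MODULE_SOURCE, namespace)

# With postponed evaluation the annotation stays the literal alias name.
annotations = namespace["example_function"].__annotations__
print(annotations)
```

Without the future import, the same lookup would return the already expanded Union object, which is what ends up in the rendered signature.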
Support for this appears to be in the works.
See Issue #6518.
That issue can be closed by the recent updates to Pull Request #8007 (under review).
If you want the fix ASAP, you can perhaps try using that build.
EDIT: This doesn't quite work, sadly.
Turns out after a little more searching, I found what I was looking for. Instead of:
NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real, Envelope]
I found that you can create your own compound type that does the same thing:
NumberOrEnvelope = TypeVar("NumberOrEnvelope", Sequence[Real], Sequence[Sequence[Real]], Real, Envelope)
This displays in documentation as "NumberOrEnvelope", just as I wanted.
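One caveat with this approach: to a type checker, a constrained TypeVar is not the same thing as a Union, because it has to resolve to exactly one of the listed types wherever it is used. The sketch below only shows what the object carries at runtime; the name `NumberOrSequence` is made up and Envelope is omitted so the example is self-contained.

```python
from numbers import Real
from typing import Sequence, TypeVar

# A constrained TypeVar keeps its own name, unlike a Union alias.
NumberOrSequence = TypeVar(
    "NumberOrSequence", Sequence[Real], Sequence[Sequence[Real]], Real
)

print(NumberOrSequence.__name__)         # the name a documentation tool can show
print(NumberOrSequence.__constraints__)  # the allowed types
```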

What order should these elements be in a Python module?

When present, what is the order that these elements should be declared in a Python module?
Hash bang (#!/usr/bin/env python)
Encoding (# coding: utf-8)
Future imports (from __future__ import unicode_literals, ...)
Docstring
If declared last, will the docstring work in a call help(module)?
Hash bang. The kernel literally looks at the first two bytes of the file to see if they are equal to #!, so it won't work otherwise.
Encoding. According to the Python Language Reference it must be "on the first or second line".
Docstring. According to PEP 257, a docstring is "a string literal that occurs as the first statement in a module, function, class, or method definition", so it cannot go after any import statements. You can see for yourself that help(module) no longer reports your docstring if you put it in a different place.
Future imports, because they cannot go before any of the above.
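The effect of the ordering on the docstring can be checked without creating files. This is a small sketch using the standard ast module on two made-up module sources; `module_docstring` is a hypothetical helper name.

```python
import ast

# Recommended order: hash bang, encoding, docstring, future imports.
GOOD = '''#!/usr/bin/env python
# coding: utf-8
"""Module docstring."""
from __future__ import unicode_literals
'''

# Docstring placed after the future import: it is just an unused string.
BAD = '''#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
"""Too late to be a docstring."""
'''


def module_docstring(source):
    """Return the docstring Python would attach to a module with this source."""
    return ast.get_docstring(ast.parse(source))


print(module_docstring(GOOD))
print(module_docstring(BAD))
```

The first call returns the docstring text and the second returns None, which matches what help(module) would show.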

Can sphinx validate documentation against the actual implementation

I'm adding sphinx documentation to a couple of projects, but I'm concerned that down the line future developers aren't going to be rigorous in updating documentation when the method interface changes. Is there a way to have sphinx validate the list of arguments being described against what is actually specified in the method definition?
For example, let's say I set up sphinx using sphinx-quickstart, added autodoc, and am trying to build the following "module":
def product(arg_one, arg_two):
    '''
    :param arg_one: an object to be multiplied by arg_two
    :type arg_one: Object
    :param arg_two: an integer which defines the number of arg_one to return
    :type arg_two: integer
    :returns: The product of arg_one and arg_two
    :raises: TypeError
    '''
    return arg_one * arg_two
And, a week from now, Bob Doe updates product as follows:
def product(value, number, catch_errors=False):
    '''
    :param arg_one: an object to be multiplied by arg_two
    :type arg_one: Object
    :param arg_two: an integer which defines the number of arg_one to return
    :type arg_two: integer
    :returns: The product of arg_one and arg_two
    :raises: TypeError
    '''
    try:
        return value * number
    except TypeError as exc:
        if catch_errors:
            return None
        raise
Now, the sphinx documentation isn't correct - it's missing the new catch_errors field and the variables have been renamed. But, running
sphinx-build . ./_build
a second time doesn't catch the problems - it just reports
Running Sphinx v1.6.5
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 0 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
no targets are out of date.
build succeeded.
I would expect (or hope) that Sphinx is capable of validating the docstring against the actual code of the method, and failing (or at least indicating in some way) when the documentation and the implementation aren't in-sync.
If sphinx isn't capable of this, is there an alternative which is? I know there's function annotations in Python 3, but we're currently supporting both Python 2 and Python 3, so this isn't an option.
After some looking, I think the closest tool I could find easily is darglint, but it was recently put into archive mode on GitHub (December 2022), so long-term support is not guaranteed or even likely.
One option, inspired by this discussion from torchvision, is to use Sphinx-Napoleon and pydocstyle with almost everything turned off except D417, which is a check for missing arguments in the docstring.
So pydocstyle can check for missing arguments but it cannot currently check for extra arguments.
Edit: seems pylint can do this with an extension https://pylint.pycqa.org/en/latest/user_guide/checkers/extensions.html#pylint-extensions-docparams but there is some functional overlap between the various linters and style checkers so you might need to just pick which ones you go with.
For catching re-named arguments, you could also enforce 100% coverage in your tests, but that wouldn't catch refactors to non-keyword arguments.
Of some other packages available:
doctest is good for checking if your docstring examples run correctly, but it doesn't do any signature validation. So if you set a default parameter like the example in the question, it will still pass doctest. If you used keyword arguments for all your examples, you could use this as a check, but it's a bit verbose.
interrogate is capable of checking docstring coverage, but not much else (as far as I can tell?) I'm not sure whether other tools actually check coverage though, so this is potentially a good tool to use in tandem.
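If none of these tools fit, the check itself is small enough to sketch by hand. The following is not a Sphinx or pydocstyle feature, just a hypothetical helper (`docstring_mismatches`) that compares `:param` fields against the real signature; it assumes Sphinx-style field lists and relies on inspect.signature, so it only runs on Python 3.

```python
import inspect
import re

# Matches ":param name:" and ":param type name:" fields in a docstring.
PARAM_RE = re.compile(r"^\s*:param\s+(?:\w+\s+)?(\w+)\s*:", re.MULTILINE)


def docstring_mismatches(func):
    """Return (undocumented parameters, documented names that do not exist)."""
    documented = set(PARAM_RE.findall(inspect.getdoc(func) or ""))
    actual = set(inspect.signature(func).parameters)
    return sorted(actual - documented), sorted(documented - actual)


def product(value, number, catch_errors=False):
    """
    :param arg_one: an object to be multiplied by arg_two
    :param arg_two: an integer which defines the number of arg_one to return
    """
    return value * number


missing, stale = docstring_mismatches(product)
print("undocumented:", missing)
print("stale:", stale)
```

A check like this could run in a unit test over a package's public functions, so that a renamed or added argument fails the build instead of silently going out of date.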

How to document a module constant in Python?

I have a module, errors.py in which several global constants are defined (note: I understand that Python doesn't have constants, but I've defined them by convention using UPPERCASE).
"""Indicates some unknown error."""
API_ERROR = 1
"""Indicates that the request was bad in some way."""
BAD_REQUEST = 2
"""Indicates that the request is missing required parameters."""
MISSING_PARAMS = 3
Using reStructuredText how can I document these constants? As you can see I've listed a docstring above them, but I haven't found any documentation that indicates to do that, I've just done it as a guess.
Unfortunately, variables (and constants) do not have docstrings. After all, the variable is just a name for an integer, and you wouldn't want to attach a docstring to the number 1 the way you would to a function or class object.
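A quick way to convince yourself of this is the following sketch: asking the constant for its `__doc__` just returns the docstring of int.

```python
API_ERROR = 1

# The name API_ERROR refers to a plain int object, so __doc__ is int's own docstring.
same_doc = API_ERROR.__doc__ == int.__doc__
print(same_doc)
```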
If you look at almost any module in the stdlib, like pickle, you will see that the only documentation they use is comments. And yes, that means that help(pickle) only shows this:
DATA
    APPEND = b'a'
    APPENDS = b'e'
    …
… completely ignoring the comments. If you want your docs to show up in the built-in help, you have to add them to the module's docstring, which is not exactly ideal.
But Sphinx can do more than the built-in help can. You can configure it to extract the comments on the constants, or use autodata to do it semi-automatically. For example:
#: Indicates some unknown error.
API_ERROR = 1
Multiple #: lines before any assignment statement, or a single #: comment to the right of the statement, work effectively the same as docstrings on objects picked up by autodoc. That includes handling inline rST and auto-generating an rST header for the variable name; there's nothing extra you have to do to make that work.
As a side note, you may want to consider using an enum instead of separate constants like this. If you're not using Python 3.4 (which you probably aren't yet…), there's a backport.enum package for 3.2+, or flufl.enum (which is not identical, but it is similar, as it was the main inspiration for the stdlib module) for 2.6+.
Enum instances (not flufl.enum, but the stdlib/backport version) can even have docstrings:
class MyErrors(enum.Enum):
    """Indicates some unknown error."""
    API_ERROR = 1
    """Indicates that the request was bad in some way."""
    BAD_REQUEST = 2
    """Indicates that the request is missing required parameters."""
    MISSING_PARAMS = 3
Although they unfortunately don't show up in help(MyErrors.MISSING_PARAMS), they are docstrings that Sphinx autodoc can pick up.
If you put a string after the variable, then sphinx will pick it up as the variable's documentation. I know it works because I do it all over the place. Like this:
FOO = 1
"""
Constant signifying foo.
Blah blah blah...
""" # pylint: disable=W0105
The pylint directive tells pylint to avoid flagging the documentation as being a statement with no effect.
This is an older question, but I noted that a relevant answer was missing.
Or you can just include a description of the constants in the docstring of the module via .. py:data::. That way the documentation is also made available via the interactive help. Sphinx will render this nicely.
"""
Docstring for my module.

.. data:: API_ERROR

   Indicates some unknown error.

.. data:: BAD_REQUEST

   Indicates that the request was bad in some way.

.. data:: MISSING_PARAMS

   Indicates that the request is missing required parameters.
"""
You can use hash + colon to document attributes (class or module level).
#: Use this content as input for moo to do bar
MY_CONSTANT = "foo"
This will be picked up by some document generators.
An example here, could not find a better one: Sphinx document module properties
The following worked for me with Sphinx 2.4.4.
In foo.py:
API_ERROR = 1
"""int: Indicates some unknown error."""
Then, to document it (note that automodule takes the module name, not the file name):
.. automodule:: foo
   :members:
I think you're out of luck here.
Python doesn't directly support docstrings on variables: there is no attribute that can be attached to variables and retrieved interactively like the __doc__ attribute on modules, classes and functions.
Source.
The Sphinx Napoleon Python documentation extension allows you to document module-level variables in an Attributes section.
Per https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html :
Attributes
----------
module_level_variable1 : int
    Module level variables may be documented in either the ``Attributes``
    section of the module docstring, or in an inline docstring immediately
    following the variable.

    Either form is acceptable, but the two should not be mixed. Choose
    one convention to document module level variables and be consistent
    with it.
Writing only because I haven't seen this option in the answers so far:
You can also define your constants as functions that simply return the desired constant value when called, so for example:
def get_const_my_const() -> str:
    """Returns 'my_const'."""
    return "my_const"
This way they'll be a bit "more constant" on one hand (less worrying about reassignment) and they'll also provide the opportunity for regular documentation, as with any other function.

Pylint best practices

Pylint looks like a good tool for running analysis of Python code.
However, our main objective is to catch any potential bugs and not coding conventions. Enabling all Pylint checks seems to generate a lot of noise. What is the set of Pylint features you use and is effective?
You can block any warnings/errors you don't like, via:
pylint --disable=error1,error2
I've blocked the following (description from http://www.logilab.org/card/pylintfeatures):
W0511: Used when a warning note as FIXME or XXX is detected
W0142: Used * or ** magic. Used when a function or method is called using *args or **kwargs to dispatch arguments. This doesn't improve readability and should be used with care.
W0141: Used builtin function %r. Used when a black listed builtin function is used (see the bad-function option). Usual black listed functions are the ones like map, or filter, where Python offers now some cleaner alternative like list comprehension.
R0912: Too many branches (%s/%s). Used when a function or method has too many branches, making it hard to follow.
R0913: Too many arguments (%s/%s). Used when a function or method takes too many arguments.
R0914: Too many local variables (%s/%s). Used when a function or method has too many local variables.
R0903: Too few public methods (%s/%s). Used when class has too few public methods, so be sure it's really worth it.
W0212: Access to a protected member %s of a client class. Used when a protected member (i.e. class member with a name beginning with an underscore) is accessed outside the class or a descendant of the class where it's defined.
W0312: Found indentation with %ss instead of %ss. Used when there are some mixed tabs and spaces in a module.
C0111: Missing docstring. Used when a module, function, class or method has no docstring. Some special methods like __init__ don't necessarily require a docstring.
C0103: Invalid name "%s" (should match %s). Used when the name doesn't match the regular expression associated to its type (constant, variable, class...).
To persistently disable warnings and conventions:
Create a ~/.pylintrc file by running pylint --generate-rcfile > ~/.pylintrc
Edit ~/.pylintrc
Uncomment disable= and change that line to disable=W,C
Pyflakes should serve your purpose well.
-E will only flag what Pylint thinks is an error (i.e., no warnings, no conventions, etc.)
Using grep like:
pylint my_file.py | grep -v "^C"
Edit :
As mentioned in the question, to remove the convention messages from the Pylint output, you remove the lines that start with an uppercase C.
According to the Pylint documentation, the output consists of lines in the format
MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE
and the message type can be:
[R]efactor for a “good practice” metric violation
[C]onvention for coding standard violation
[W]arning for stylistic problems, or minor programming issues
[E]rror for important programming issues (i.e. most probably bug)
[F]atal for errors which prevented further processing
Only the first letter is displayed, so you can play with grep to select/remove the level of message type you want.
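The same filtering can be sketched in Python instead of grep. This assumes the line format quoted above (newer Pylint versions print a different default format), and both the helper name `filter_messages` and the sample lines are made up for illustration.

```python
def filter_messages(lines, drop="C"):
    """Drop output lines whose message-type letter is one of the letters in drop."""
    kept = []
    for line in lines:
        # A message line starts with a single type letter followed by a colon.
        is_message = len(line) >= 2 and line[1] == ":"
        if is_message and line[0] in drop:
            continue
        kept.append(line)
    return kept


# Hypothetical output lines in the MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE format.
sample_output = [
    "C: 1: Missing docstring",
    "R: 8:Test.run: Too many branches (15/12)",
    "W: 12:Test.run: Unused variable 'x'",
    "E: 20: Undefined variable 'foo'",
]

for line in filter_messages(sample_output, drop="CR"):
    print(line)
```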
I didn't use Pylint recently, but I would probably use a parameter inside Pylint to do so.
