Can someone tell me the differences between the following docstring parameters?
:type and :param
I've seen both being used to specify the type of method arguments, but I don't think they do exactly the same. Is one of them for the programmer and the other for the IDE or something like that?
:rtype, :return and :returns
Especially :return and :returns seem very similar, so which are to use in which situation?
These conventions are used by the Sphinx documentation tool, which was originally designed for processing Python docs. Its popularity has, however, led it to be extended into other domains, defined in the Sphinx documentation as "a collection of markup (reStructuredText directives and roles) to describe and link to objects belonging together".
According to the linked page :return comes from the Python domain, :returns from the JavaScript domain, and they both appear to be used for the same thing (i.e. documenting the return value of a function or method). In practice :returns appears so infrequently one wonders whether it's a documentation typo.
:rtype specifies the return type, and will create a link to the type definition if that's possible (i.e. if Sphinx can find the definition in the code you are documenting).
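As a rough illustration of how these fields fit together, here is a minimal sketch of a Sphinx-style docstring. The function `scale` and its parameters are invented for this example; only the field names (`:param`, `:type`, `:return`, `:rtype`) come from the Sphinx Python domain.

```python
def scale(values, factor=2):
    """Multiply every element of ``values`` by ``factor``.

    :param values: the numbers to scale
    :type values: list of float
    :param factor: the multiplier applied to each element
    :type factor: int
    :return: a new list with the scaled numbers
    :rtype: list of float
    """
    # The fields above are plain text to Python itself; only a tool such
    # as Sphinx gives them any meaning.
    return [v * factor for v in values]
```

Here `:param` carries the description of an argument and `:type` its expected type, while `:return` and `:rtype` play the same two roles for the return value.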
None of them mean anything by themselves. Various programs will scan a docstring and interpret certain pieces (or tags) specially for formatting, linking, etc. By convention (starting with javadoc?), such tags often begin with :. Beyond that, the specific meaning depends on the program parsing the docstring, and there is no defined standard for what tags should be used. Some programs use :return to document the return value of a function, others use :rtype.
The only real answer to your question is, consult the documentation for the program you expect to process your docstrings.
THE PROBLEM:
I'm looking for a python library that might already implement a text parser that I have written in another language.
I have lines of text that represent either configuration commands in devices or dynamic output from commands run on devices. For simplicity, let's assume each line is parsed independently.
The bottom line is that a line contains fixed keywords and values/variables/parameters. Some keywords are optional and some are mandatory and in specific order. The number and type of variables/values associated with / following a given keyword can vary from one keyword to another.
SOLUTION IN OTHER LANGUAGE:
I wrote generic code in c++ that would parse any line and convert the input into structured data. The input to the code is
1. the line to be parsed and
2. a model/structure that described what keywords to look for, whether they are optional or not, in what order they might appear and what type of values/variables to expect for each (and also how many values/variables).
In c++ the interface allows the user among other things to supply a set of user-defined callback functions (one for each keyword) to be invoked by the parsing engine to supply the results (the parsed parameters associated with the given keyword). The implementation of the callback is user-defined but the callback signature is pre-defined.
WHAT ABOUT PYTHON?
I'm hoping for a simple library in python (or a completely different direction if this is something done differently/better in python) that provides an interface to specify the grammar/syntax/model of a given line (the details of all keywords, their order, what number and type of parameters each requires) and then does the parsing of input lines based on that syntax.
I'm not sure how well argparse fits what I need; this is not about parsing command-line input, though the problem is similar.
AN EXAMPLE:
Here is an example line from the IP networking world but the problem is more generic:
access-list SOMENAME-IN extended permit tcp host 117.21.212.54 host 174.163.16.23 range 5160 7000
In the above line, the keywords and their corresponding parameters are:
key: extended, no parameters
key: permit, no parameters
key: tcp, no parameters
key: host, par1: 117.21.212.54
key: host, par1: 174.163.16.23
key: range, par1: 5160, par2: 7000
This is a form of firewall access control list ACL. In this case the parser would be used to fill a structure that indicates
- the name of the ACL (SOMENAME-IN in the above example)
- the type of ACL (extended in the above example but there are other valid keywords)
- the protocol (tcp in the above example)
- the src host/IP (117.21.212.54 in the example)
- the src port (optional and not present in the above example)
- the dst host/IP (174.163.16.23 in the example)
- the dst port (a range of ports from 5160 to 7000 in the above example)
One can rather easily write a dedicated parser that assumes the specific syntax of the above example and checks for it (this might even be more efficient and clearer, since it targets a single syntax). What I want, though, is general parsing code in which all the keywords and the expected syntax are provided as data (a model) to the parsing engine, which uses it to parse the lines and can also point out errors in the parsed line.
Obviously I'm not looking for a full solution, because that would be a lot, but I hope for thoughts specifically in the context of using Python and reusing any features or libraries Python may have for such parsing.
Thanks,
Al.
If I understand your needs correctly (and it is possible that I don't, because it is hard to know what limits you place on the possible grammars), then you should be able to solve this problem fairly simply with a tokeniser and a map of keyword parsers for each command.
If your needs are very simple, you might be able to tokenise using the split string method, but it is possible that you would prefer a tokeniser which at least handles quoted strings and maybe some operator symbols. The standard Python library provides the shlex module for this purpose.
There is no standard library module which does the parsing. There are a large variety of third-party parsing frameworks, but I think that they are likely to be overkill for your needs (although I might be wrong, and you might want to check them out even if you don't need anything that sophisticated). While I don't usually advocate hand-rolling a parser, this particular application is both simple enough to make that possible and sufficiently different from the possibilities of a context-free grammar to make direct coding useful.
(What makes a context-free grammar impractical is the desire to allow different command options to be provided in arbitrary order without allowing repetition of non-repeatable options. But rereading this answer, I realize that it is just an assumption on my part that you need that feature.)
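The tokeniser-plus-keyword-map idea might be sketched as follows. This is only a minimal illustration under simplifying assumptions: the model `KEYWORD_ARITY` and the function `parse_line` are hypothetical names, the model records nothing but how many values follow each keyword, and ordering and optional/mandatory rules are not checked.

```python
import shlex

# Hypothetical model: keyword -> number of values that must follow it.
KEYWORD_ARITY = {
    "extended": 0,
    "permit": 0,
    "tcp": 0,
    "host": 1,
    "range": 2,
}


def parse_line(line, model=KEYWORD_ARITY):
    """Return (command, name, [(keyword, [values...]), ...]) for one line."""
    tokens = shlex.split(line)  # also copes with quoted strings
    if len(tokens) < 2:
        raise ValueError("expected a command and a name: %r" % line)
    command, name, rest = tokens[0], tokens[1], tokens[2:]
    parsed = []
    i = 0
    while i < len(rest):
        keyword = rest[i]
        if keyword not in model:
            raise ValueError("unknown keyword %r" % keyword)
        arity = model[keyword]
        values = rest[i + 1:i + 1 + arity]
        if len(values) != arity:
            raise ValueError("keyword %r needs %d value(s)" % (keyword, arity))
        parsed.append((keyword, values))
        i += 1 + arity  # skip the keyword and the values it consumed
    return command, name, parsed
```

Called on the access-list line from the question, `parse_line` yields the command, the ACL name, and one `(keyword, values)` pair per keyword. Replacing the arity integers with per-keyword callables would be one way to approximate the callback interface described in the question.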
Both epydoc and Sphinx document generators permit the coder to annotate what the types should be of any/all function parameter.
My question is: Is there a way (or module) that enforces these types (at run-time) when documented in the docstring. This wouldn't be strong-typing (compile-time checking), but (more likely) might be called firm-typing (run-time checking). Maybe raising a "ValueError", or even better still... raising a "SemanticError"
Ideally there would already be something (like a module) similar to the "import antigravity" module as per xkcd, and this "firm_type_check" module would already exist somewhere handy for download.
FYI: The docstring fields for epydoc and Sphinx are as follows:
epydoc:
Functions and Methods parameters:
@param p: ... # A description of the parameter p for a function or method.
@type p: ... # The expected type for the parameter p.
@return: ... # The return value for a function or method.
@rtype: ... # The type of the return value for a function or method.
@keyword p: ... # A description of the keyword parameter p.
@raise e: ... # A description of the circumstances under which a function or method raises exception e.
Sphinx: Inside Python object description directives, reST field lists with these fields are recognized and formatted nicely:
param, parameter, arg, argument, key, keyword: Description of a parameter.
type: Type of a parameter.
raises, raise, except, exception: That (and when) a specific exception is raised.
var, ivar, cvar: Description of a variable.
returns, return: Description of the return value.
rtype: Return type.
The closest thing I could find was mypy, created by Jukka Lehtosalo (see Mypy Examples) and mentioned by Guido on mail.python.org. CMIIW: mypy cannot be imported as a py3 module.
Similar stackoverflow questions that do not use the docstring per se:
Pythonic Way To Check for A Parameter Type
What's the canonical way to check for type in python?
To my knowledge, nothing of the sort exists, for a few important reasons:
First, docstrings are documentation, just like comments. And just like comments, people will expect them to have no effect on the way your program works. Making your program's behavior depend on its documentation is a major antipattern, and a horrible idea all around.
Second, docstrings aren't guaranteed to be preserved. If you run python with -OO, for example, all docstrings are removed. What then?
Finally, Python 3 introduced optional function annotations, which would serve that purpose much better: http://legacy.python.org/dev/peps/pep-3107/ . Python currently does nothing with them (they're documentation), but if I were to write such a module, I'd use those, not docstrings.
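To make the annotation-based alternative concrete, here is a minimal sketch of a run-time check driven by function annotations instead of docstrings. The decorator name `enforce_annotations` and the example function `repeat` are hypothetical, and the sketch only checks annotations that are plain classes.

```python
import functools
import inspect


def enforce_annotations(func):
    """Raise TypeError when an argument does not match its class annotation."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            parameter = signature.parameters[name]
            expected = parameter.annotation
            # *args and **kwargs collect several values, so skip them here.
            if parameter.kind in (inspect.Parameter.VAR_POSITIONAL,
                                  inspect.Parameter.VAR_KEYWORD):
                continue
            # Ignore missing annotations and anything that is not a class.
            if expected is inspect.Parameter.empty or not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError("%s must be %s, got %s"
                                % (name, expected.__name__, type(value).__name__))
        return func(*args, **kwargs)

    return wrapper


@enforce_annotations
def repeat(text: str, times: int) -> str:
    return text * times
```

With this in place, `repeat("ab", 2)` runs normally while `repeat("ab", "2")` raises `TypeError`. Unlike docstrings, the annotations survive `-OO`.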
My honest opinion is this: if you're gonna go through the (considerable) trouble of writing a (necessarily half-baked) static type system for Python, all the time it will take you would be put to better use by learning another programming language that supports static typing in a less insane way:
Clojure (http://clojure.org/) is incredibly dynamic and powerful (due to its nature as a Lisp) and supports optional static typing through core.typed (https://github.com/clojure/core.typed). It is geared towards concurrency and networking (it has STM and persistent data structures <3 ), has a great community, and is one of the most elegantly-designed languages I've seen. That said, it runs on the JVM, which is both a good and a bad thing.
Golang (http://golang.org/) feels sort-of Pythonic (or at least, it's attracting a lot of refugees from Python), is statically typed and compiles to native code.
Rust (http://www.rust-lang.org/) is lower-level than that, but it has one of the best type systems I've seen (type inference, pattern matching, traits, generics, zero-sized types...) and enforces memory and resource safety at compile time. It is being developed by Mozilla as a language to write their next browser (Servo) in, so performance and safety are its main goals. You can think of it as a modern take on C++. It compiles to native code, but hasn't hit 1.0 yet and as such, the language itself is still subject to change. Which is why I wouldn't recommend writing production code in it yet.
I do know that one shouldn't use eval. For all the obvious reasons (performance, maintainability, etc.). My question is more on the side – is there a legitimate use for it? Where one should use it rather than implement the code in another way.
Since it is implemented in several languages and can lead to bad programming style, I assume there is a reason why it's still available.
First, here is MathWorks' list of alternatives to eval.
You could also be clever and use eval() in a compiled application to build your mCode interpreter, but the Matlab compiler doesn't allow that for obvious reasons.
One place where I have found a reasonable use of eval is in obtaining small predicates of code that consumers of my software need to be able to supply as part of a parameter file.
For example, there might be an item called "Data" that has a location for reading and writing the data, but also requires some predicate applied to it upon load. In a Yaml file, this might look like:
Data:
    Name: CustomerID
    ReadLoc: some_server.some_table
    WriteLoc: write_server.write_table
    Predicate: "lambda x: x[:4]"
Upon loading and parsing the objects from Yaml, I can use eval to turn the predicate string into a callable lambda function. In this case, it implies that CustomerID is a long string and only the first 4 characters are needed in this particular instance.
Yaml offers some clunky ways to magically invoke object constructors (e.g. using something like !Data in my code above, and then having defined a class for Data in the code that appropriately uses Yaml hooks into the constructor). In fact, one of the biggest criticisms I have of the Yaml magic object construction is that it is effectively like making your whole parameter file into one giant eval statement. And this is very problematic if you need to validate things and if you need flexibility in the way multiple parts of the code absorb multiple parts of the parameter file. It also doesn't lend itself easily to templating with Mako, whereas my approach above makes that easy.
I think this simpler design, which can be easily parsed with any YAML tools, is better, and using eval lets me allow the user to pass in whatever arbitrary callable they want.
A couple of notes on why this works in my case:
The users of the code are not Python programmers. They don't have the ability to write their own functions and then just pass a module location, function name, and argument signature (although, putting all that in a parameter file is another way to solve this that wouldn't rely on eval if the consumers can be trusted to write code.)
The users are responsible for their bad lambda functions. I can do some validation that eval works on the passed predicate, and maybe even create some tests on the fly or have a nice failure mode, but at the end of the day I am allowed to tell them that it's their job to supply valid predicates and to ensure the data can be manipulated with simple predicates. If this constraint wasn't in place, I'd have to shuck this for a different system.
The users of these parameter files compose a small group mostly willing to conform to conventions. If that weren't true, it would be risky that folks would hi-jack the predicate field to do many inappropriate things -- and this would be hard to guard against. On big projects, it would not be a great idea.
I don't know if my points apply very generally, but I would say that using eval to add flexibility to a parameter file is good if you can guarantee your users are a small group of convention-upholders (a rare feat, I know).
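The predicate-loading step from the parameter-file example might look roughly like this. It is a sketch, not the actual code: the dict `config` stands in for whatever a YAML loader would return, and `load_predicate` is a hypothetical helper name.

```python
# Stand-in for the parsed parameter file; a YAML loader would produce a
# similar nested dict from the example shown earlier in this answer.
config = {
    "Data": {
        "Name": "CustomerID",
        "ReadLoc": "some_server.some_table",
        "WriteLoc": "write_server.write_table",
        "Predicate": "lambda x: x[:4]",
    }
}


def load_predicate(source):
    """Turn a predicate string into a callable, failing early if it is not one."""
    predicate = eval(source)  # executes arbitrary code: trusted input only
    if not callable(predicate):
        raise ValueError("predicate is not callable: %r" % source)
    return predicate


predicate = load_predicate(config["Data"]["Predicate"])
print(predicate("1234567890"))  # prints 1234
```

The `callable` check is the kind of light validation mentioned above; it does nothing to stop a malicious string, which is why the small, trusted user group matters.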
In MATLAB the eval function is useful when functions make use of the name of the input argument via the inputname function. For example, to overload the builtin display function (which is sensitive to the name of the input argument) the eval function is required. For example, to call the built in display from an overloaded display you would do
function display(X)
    eval([inputname(1), ' = X;']);
    eval(['builtin(''display'', ', inputname(1), ');']);
end
In MATLAB there is also evalc. From the documentation:
T = evalc(S) is the same as EVAL(S) except that anything that would
normally be written to the command window, except for error messages,
is captured and returned in the character array T (lines in T are
separated by '\n' characters).
If you still consider this eval, then it is very powerful when dealing with closed source code that displays useful information in the command window and you need to capture and parse that output.
To ask my very specific question I find I need quite a long introduction to motivate and explain it -- I promise there's a proper question at the end!
While reading part of a large Python codebase, sometimes one comes across code where the interface required of an argument is not obvious from "nearby" code in the same module or package. As an example:
def make_factory(schema):
    entity = schema.get_entity()
    ...
There might be many "schemas" and "factories" that the code deals with, and "def get_entity()" might be quite common too (or perhaps the function doesn't call any methods on schema, but just passes it to another function). So a quick grep isn't always helpful to find out more about what "schema" is (and the same goes for the return type). Though "duck typing" is a nice feature of Python, sometimes the uncertainty in a reader's mind about the interface of arguments passed in as the "schema" gets in the way of quickly understanding the code (and the same goes for uncertainty about typical concrete classes that implement the interface). Looking at the automated tests can help, but explicit documentation can be better because it's quicker to read. Any such documentation is best when it can itself be tested so that it doesn't get out of date.
Doctests are one possible approach to solving this problem, but that's not what this question is about.
Python 3 has a "parameter annotations" feature (part of the function annotations feature, defined in PEP 3107). The uses to which that feature might be put aren't defined by the language, but it can be used for this purpose. That might look like this:
def make_factory(schema: "xml_schema"):
    ...
Here, "xml_schema" identifies a Python interface that the argument passed to this function should support. Elsewhere there would be code that defines that interface in terms of attributes, methods & their argument signatures, etc. and code that allows introspection to verify whether particular objects provide an interface (perhaps implemented using something like zope.interface / zope.schema). Note that this doesn't necessarily mean that the interface gets checked every time an argument is passed, nor that static analysis is done. Rather, the motivation of defining the interface is to provide ways to write automated tests that verify that this documentation isn't out of date (they might be fairly generic tests so that you don't have to write a new test for each function that uses the parameters, or you might turn on run-time interface checking but only when you run your unit tests). You can go further and annotate the interface of the return value, which I won't illustrate.
So, the question:
I want to do exactly that, but using Python 2 instead of Python 3. Python 2 doesn't have the function annotations feature. What's the "closest thing" in Python 2? Clearly there is more than one way to do it, but I suspect there is one (relatively) obvious way to do it.
For extra points: name a library that implements the one obvious way.
Take a look at plac that uses annotations to define a command-line interface for a script. On Python 2.x it uses plac.annotations() decorator.
The closest thing is, I believe, an annotation library called PyAnno.
From the project webpage:
"The Pyanno annotations have two functions:
Provide a structured way to document Python code
Perform limited run-time checking "
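For a sense of what a decorator-based substitute for annotations can look like, here is a minimal sketch. The `annotate` decorator is a hypothetical illustration written for this example, not the implementation used by plac or PyAnno; it avoids Python 3 annotation syntax, so the same idea should carry over to Python 2.

```python
def annotate(**annotations):
    """Attach annotations to a function without using Python 3 syntax."""
    def decorator(func):
        # Store the mapping where Python 3 keeps real annotations, so that
        # introspection code can treat both cases the same way.
        func.__annotations__ = dict(annotations)
        return func
    return decorator


@annotate(schema="xml_schema")
def make_factory(schema):
    return schema
```

After decoration, `make_factory.__annotations__` holds `{"schema": "xml_schema"}`, which generic tests or an interface checker could then inspect.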
What, in your opinion, is a meaningful docstring? What do you expect to be described there?
For example, consider this Python class's __init__:
def __init__(self, name, value, displayName=None, matchingRule="strict"):
    """
    name - field name
    value - field value
    displayName - nice display name, if empty will be set to field name
    matchingRule - I have no idea what this does, set to strict by default
    """
Do you find this meaningful? Post your good/bad examples for all to know (and a general answer so it can be accepted).
I agree with "Anything that you can't tell from the method's signature". It might also mean to explain what a method/function returns.
You might also want to use Sphinx (and reStructuredText syntax) for documentation purposes inside your docstrings. That way you can include this in your documentation easily. For an example check out e.g. repoze.bfg which uses this extensively (example file, documentation example).
Another thing one can put in docstrings is also doctests. This might make sense esp. for module or class docstrings as you can also show that way how to use it and have this testable at the same time.
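A small sketch of a doctest inside a docstring follows; the function `total` is invented for this example. The examples in the docstring double as usage documentation and as tests.

```python
import doctest


def total(numbers):
    """Return the sum of ``numbers``.

    >>> total([1, 2, 3])
    6
    >>> total([])
    0
    """
    return sum(numbers)


if __name__ == "__main__":
    # Runs every example found in this module's docstrings.
    doctest.testmod()
```

Running the file directly executes the examples and reports any whose output differs from what the docstring shows.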
From PEP 8:
Conventions for writing good documentation strings (a.k.a. "docstrings") are immortalized in PEP 257.
Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the "def" line.
PEP 257 describes good docstring conventions. Note that most importantly, the """ that ends a multiline docstring should be on a line by itself, and preferably preceded by a blank line.
For one liner docstrings, it's okay to keep the closing """ on the same line.
Check out numpy's docstrings for good examples (e.g. http://github.com/numpy/numpy/blob/master/numpy/core/numeric.py).
The docstrings are split into several sections and look like this:
Compute the sum of the elements of a list.

Parameters
----------
foo: sequence of ints
    The list of integers to sum up.

Returns
-------
res: int
    sum of elements of foo

See also
--------
cumsum: compute cumulative sum of elements
What should go there:
Anything that you can't tell from the method's signature. In this case the only bit useful is: displayName - if empty will be set to field name.
The most striking things I can think of to include in a docstring are the things that aren't obvious. Usually this includes type information, or capability requirements - eg. "Requires a file-like object". In some cases this will be evident from the signature, not so in other cases.
Another useful thing you can put in to your docstrings is a doctest.
I like to use the documentation to describe in as much detail as possible what the function does, especially the behavior at corner cases (a.k.a. edge cases). Ideally, a programmer using the function should never have to look at the source code - in practice, that means that whenever another programmer does have to look at source code to figure out some detail of how the function works, that detail probably should have been mentioned in the documentation. As Freddy said, anything that doesn't add any detail to the method's signature probably shouldn't be in a documentation string.
Generally, the purpose of adding a docstring at the start of a function is to describe the function: what it does, what it returns, and what its parameters mean. You can add implementation details if required. You can even add details about the author who wrote the code, for the benefit of future developers.