Python has a built-in function called help() which, per the documentation, prints the docstrings of modules, methods, and functions. For example, say I were to type this:
>>> help('os')
I get greeted with
Help on module os:
NAME
os - OS routines for NT or Posix depending on what system we're on.
MODULE REFERENCE
https://docs.python.org/3.8/library/os
The following documentation is automatically generated from the Python
source files. It may be incomplete, incorrect or include features that
are considered implementation detail and may vary between Python
implementations. When in doubt, consult the module reference at the
location listed above.
. . .
and so forth. It is quite simple to find the module reference just by looking at this output, but say I were collecting the references for 100 different modules. It would take quite some time and would be very repetitive work.
How can I parse the output of each help() call for the link to the module's documentation? It would involve finding a telltale substring such as https:// or .org or .com.
I'd argue that you don't actually need to do any parsing, since as far as I know, all the standard library Python modules have documentation accessible at the URL https://docs.python.org/<version>/library/<modulename>. Constructing the URL from that pattern would be far more efficient than parsing the help text.
That being said, if you really do want to parse the help text, the re.search function should be useful. You can write a regular expression that matches the URL of a Python documentation page, and presumably the first match will be the result you want.
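A rough sketch of both approaches (assuming standard library modules, and using pydoc.render_doc, which returns the same text help() prints, so it can be searched as a string):

import re
import pydoc

def doc_url(module_name, version="3"):
    # Preferred: construct the URL directly from the known pattern.
    return "https://docs.python.org/{}/library/{}".format(version, module_name)

def doc_url_from_help(module_name):
    # Alternative: capture the help text and search it for the first URL.
    text = pydoc.render_doc(module_name)
    match = re.search(r"https?://\S+", text)
    return match.group(0) if match else None

print(doc_url("os"))            # https://docs.python.org/3/library/os
print(doc_url_from_help("os"))  # the MODULE REFERENCE link from the help text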
Related
So I wanted to check out some implementations of the standard library. I started with the os module, with the code being here on GitHub.
I took one function, for example os.listdir(), and I have absolutely no idea how it is implemented even after looking at the code (pardon this noob). I have the following questions:
os.__all__ does not list this method, but I think it is definitely a method, as print(type(os.listdir)) printed <class 'builtin_function_or_method'>, and I searched Google to find all the built-in functions (which I found on this doc page), and this is not one of them.
There is no function named listdir defined explicitly in the module. From my limited understanding of the code, the name is taken from globals() and put into a supports_fd set. I do not understand how this method is being called.
I think the main problem I have is with how that module is designed, and I was not able to find any resources online that explain it in simpler terms, hence I am asking here for pointers.
EDIT: For those who are asking, I tried the following code on onlinegdb:
import os
if "listdir" in os.__all__:
    print("Yes")
print(os.listdir())
The result is only main.py; it should also print Yes. Maybe the platform onlinegdb is the problem, but it clearly shows the output of listdir as main.py.
After discussion in the comments, I now see that this is a problem with the online Python version rather than an issue with Python or the module itself.
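For reference, a quick way to see where listdir actually lives is to inspect it in a local interpreter (output shown for a Unix CPython build; on Windows the underlying C module is nt instead of posix):

import os

print(type(os.listdir))         # <class 'builtin_function_or_method'>
print(os.listdir.__module__)    # 'posix' - os.py re-exports it from the C module
print("listdir" in os.__all__)  # True on a standard CPython install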
Sphinx allows linking to external documentation (such as the standard library docs) via intersphinx.
Is it possible to link to the definition of special methods like __del__(), without just making a regular link?
Ok, so in my case, I just needed to link to the object.__del__ method:
:py:meth:`__del__() <object.__del__>`
To do this generically:
Use python -m sphinx.ext.intersphinx https://docs.python.org/3/objects.inv to get the inventory of the Python docs. You'll probably want to pipe the output through less or grep, or save it to a file.
Search through the results for the thing you're looking for. For special methods, it'll probably be listed under object.__spam__.
Look at the section the thing is under, and add it to your rst.
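For completeness, all of this assumes intersphinx is enabled and mapped to the Python docs in conf.py; a minimal sketch of that configuration:

# conf.py (minimal sketch of a Sphinx configuration using intersphinx)
extensions = ["sphinx.ext.intersphinx"]

# Map the "python" inventory to the standard library docs so that roles
# like :py:meth:`__del__() <object.__del__>` resolve against it.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
}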
Given a method name, how can I determine which module(s) in the standard library contain this method?
E.g. if I am told about a method called strip(), but told nothing about how it works or that it is part of str, how would I go about finding out which module it belongs to? I obviously mean using Python itself to find out, not Googling "Python strip" :)
The trouble is, strip is not defined in any module. It is not part of the standard library at all, but a method on str, which in turn is a built-in class. So there isn't really any way of iterating through modules to find it.
You could use modulefinder to determine all the loaded modules, then loop through each one and get a list of members using inspect.getmembers, looping through those to find what you are looking for (a rough sketch follows the links below). I don't think there is a built-in way to do this.
https://python.readthedocs.org/en/v2.7.2/library/modulefinder.html
https://docs.python.org/2/library/inspect.html
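A rough sketch of that idea (using sys.modules rather than modulefinder for brevity, and also looking inside classes so that methods like str.strip are found):

import inspect
import sys

def find_owners(method_name):
    # Search every already-imported module for a matching callable,
    # and check each class it contains for a matching method.
    owners = set()
    for mod_name, mod in list(sys.modules.items()):
        if mod is None:
            continue
        try:
            members = inspect.getmembers(mod)
        except Exception:
            continue  # some modules raise on attribute access
        for obj_name, obj in members:
            if obj_name == method_name and callable(obj):
                owners.add(mod_name)
            elif inspect.isclass(obj) and method_name in vars(obj):
                owners.add("{}.{}".format(mod_name, obj_name))
    return sorted(owners)

print(find_owners("strip"))  # includes builtins.str (and builtins.bytes)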
In my current work environment, we produce a large number of Python packages for internal use (tens, if not hundreds). Each package has some dependencies, usually a mixture of internal and external packages, and some of these dependencies are shared.
As we approach dependency hell, updating dependencies becomes a time consuming process. While we care about the functional changes a new version might introduce, of equal (if not more) importance are the API changes that break the code.
Although running unit/integration tests against newer versions of a dependency helps us to catch some issues, our coverage is not close enough to 100% to make this a robust strategy. Release notes and a change log help identify major changes at a high-level, but these rarely exist for internally developed tools or go into enough detail to understand the implications the new version has on the (public) API.
I am looking at other ways to automate this process.
I would like to be able to automatically compare two versions of a Python package and report the API differences between them. In particular, this would include backwards-incompatible changes such as removing functions/methods/classes/modules, adding positional arguments to a function/method/class, and changing the number of items a function/method returns. Based on the report this generates, I as a developer should have a greater understanding of the code-level implications this version change will introduce, and so the time required to integrate it.
Elsewhere, we use the C++ abi-compliance-checker and are looking at the Java api-compliance-checker to help with this process. Is there a similar tool available for Python? I have found plenty of lint/analysis/refactor tools but nothing that provides this level of functionality. I understand that Python's dynamic typing will make a comprehensive report impossible.
If such a tool does not exist, are there any libraries that could help with implementing a solution? For example, my current approach would be to use an ast.NodeVisitor to traverse the package and build a tree where each node represents a module/class/method/function, and then compare this tree to that of another version of the same package.
Edit: since posting the question I have found pysdiff, which covers some of my requirements, but I'm still interested to see alternatives.
Edit: also found Upstream-Tracker, which is a good example of the sort of information I'd like to end up with.
What about using the AST module to parse the files?
import ast

with open("test.py") as f:
    python_src = f.read()
node = ast.parse(python_src)  # Note: parses but doesn't compile the src
print(ast.dump(node))
There's the ast.walk function (described at http://docs.python.org/2/library/ast.html).
The astdump package might work (available on PyPI).
There's also this out-of-date pretty printer: http://code.activestate.com/recipes/533146-ast-pretty-printer/
The documentation tool Sphinx also extracts the information you are looking for. Perhaps give that a look.
So walk the AST and build a tree with the information you want in it. Once you have a tree, you can pickle it and diff it later, or convert the tree to a text representation in a text file you can diff with difflib or some external diff program.
The ast module has parse(), and the built-in compile() accepts AST objects. The only thing is I'm not entirely sure how much information is available to you after parsing (as you don't want to compile()). A sketch of the walking idea is below.
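A minimal sketch of that approach, assuming a single source file test.py and recording only public function/method names with their positional arguments:

import ast

class ApiVisitor(ast.NodeVisitor):
    def __init__(self):
        self.signatures = []
        self._scope = []  # enclosing class names, for qualified names

    def visit_ClassDef(self, node):
        if not node.name.startswith("_"):
            self._scope.append(node.name)
            self.generic_visit(node)  # descend to pick up methods
            self._scope.pop()

    def visit_FunctionDef(self, node):
        # Nested functions are deliberately skipped (no generic_visit here),
        # since they aren't part of the public API.
        if not node.name.startswith("_"):
            args = [a.arg for a in node.args.args]
            qualname = ".".join(self._scope + [node.name])
            self.signatures.append("{}({})".format(qualname, ", ".join(args)))

with open("test.py") as f:
    tree = ast.parse(f.read())

visitor = ApiVisitor()
visitor.visit(tree)
print("\n".join(sorted(visitor.signatures)))  # diff this against another version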
Perhaps you can start by using the inspect module:
import inspect
import types

def genFunctions(module):
    moduleDict = module.__dict__
    for name in dir(module):
        if name.startswith('_'):
            continue
        element = moduleDict[name]
        if isinstance(element, types.FunctionType):
            # getargspec was removed in Python 3.11; getfullargspec is
            # the modern replacement
            argSpec = inspect.getfullargspec(element)
            argList = argSpec.args
            print("{}.{}({})".format(module.__name__, name, ", ".join(argList)))
That will give you a list of "public" (not starting with underscore) functions with their argument lists. You can add more stuff to print the kwargs, classes, etc.
Once you run that on all the packages/modules you care about, in both old and new versions, you'll have two lists like this:
myPackage.myModule.myFunction1(foo, bar)
myPackage.myModule.myFunction2(baz)
Then you can either just sort and diff them, or write some smarter tooling in Python to actually compare all the names, e.g. to permit additional optional arguments but reject new mandatory arguments.
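For the sort-and-diff route, a small sketch (the file names api_old.txt and api_new.txt are hypothetical dumps from genFunctions above, one signature per line):

import difflib

with open("api_old.txt") as f:
    old = sorted(f.readlines())
with open("api_new.txt") as f:
    new = sorted(f.readlines())

# unified_diff flags removed signatures (-) and added ones (+)
for line in difflib.unified_diff(old, new, fromfile="old", tofile="new"):
    print(line, end="")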
Check out zope.interface (you can get it from PyPI). Then you can incorporate tests that your modules support the expected interfaces into your unit tests. It may take a while to retrofit, however - also, it's not a silver bullet.
I couldn't find any documentation about this function...
I specifically want to know what the parameters are and what exactly they represent...
Using Python 3.
The convention in Python is to use a single leading _ for non-public methods and instance variables. So, _siftdown is not intended to be called externally and, thus, is not documented in the standard library documentation. If you want to examine how it works, look at the code. Note that the latest Python 3.2 documentation now includes links to the source code; see the link near the top of the page here.
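Since _siftdown is written in pure Python, one quick way to read its signature and implementation directly from an interpreter (a sketch; the parameter names shown are those from the CPython source):

import heapq
import inspect

# _siftdown is a private helper, but it is still an attribute of the module
print(inspect.signature(heapq._siftdown))  # (heap, startpos, pos)
print(inspect.getsource(heapq._siftdown))  # dumps the implementation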