When using the intersphinx extension to cross-reference objects from external Python packages, the references can get quite long if the module an object is defined in is nested several layers deep. Specifically, I'm working with the SymPy library, which means constantly typing things like :class:`sympy.core.symbol.Symbol` all over the place.
I know that you can:
1. Prepend a ~ character to the reference to make the link appear as just "Symbol" in the generated documentation.
2. Truncate the path from the left, leaving a leading dot, as in .Symbol, .symbol.Symbol, .core.symbol.Symbol, etc.
3. Just use Symbol and count on Sphinx to find it, if I know there won't be any conflicts.
#1 is fine for the generated documentation but is still very verbose in the source files. #2 and #3 are shorter but leave out the most important part, which is the actual root package.
Is there a way to shorten references that removes the intermediate module names but keeps the root package namespace? Something like sympy.*.Symbol or sympy::Symbol?
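For concreteness, here is a hypothetical docstring showing the three options above side by side (the function is invented; only the SymPy paths are real):

def evaluate(expr):
    """Evaluate *expr* numerically (illustrative function only).

    Fully qualified reference: :class:`sympy.core.symbol.Symbol`
    Short link text only:      :class:`~sympy.core.symbol.Symbol`
    Truncated target:          :class:`.Symbol` (or :class:`.symbol.Symbol`)
    """
    return expr.evalf()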
I want to automatically inject some code into all the Python modules in my project.
I used ast.NodeTransformer and managed to make the change quite easily; the problem is that I then want to run the project.
An AST node covers a single module, and I want to change every module in the project and then run it; I have a working example for one module.
The problem is that it applies to one node, i.e. one file. I want to run this file, which imports and uses other files that I want to change too, so I'm not sure how to get it done.
I know I can use some ast-to-code module, like astor, but all are third party and I don't want to deal with bugs and unexpected issues.
I don't really know how to start; any suggestions?
I know I can use some ast-to-code module, like astor, but all are third party and I don't want to deal with bugs and unexpected issues.
From Python 3.9 onward there is ast.unparse, which does the AST-to-source conversion for you after you transform the tree.
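A minimal sketch of how that could look across a whole project, assuming Python 3.9+ (the transformer and directory names are placeholders):

import ast
from pathlib import Path

class MyTransformer(ast.NodeTransformer):
    """Placeholder: substitute the NodeTransformer you already wrote."""

def transform_project(src_dir, out_dir):
    """Rewrite every .py file under src_dir into a parallel tree under out_dir."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    for path in src_dir.rglob("*.py"):
        tree = ast.parse(path.read_text(), filename=str(path))
        new_tree = ast.fix_missing_locations(MyTransformer().visit(tree))
        target = out_dir / path.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(ast.unparse(new_tree))  # AST back to source text

# transform_project("myproject", "myproject_transformed")
# then run the transformed copy, e.g.: python myproject_transformed/main.py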
I’m refactoring a large procedural program (implemented as many files in one folder) and using packages to group the files into an object-oriented structure. This application uses Tkinter (probably a red herring) and is being developed with PyDev on Eclipse Kepler (on and for Win7).
It does use some classes but converting the package structure (see below) into classes is NOT the preferred solution (unless it’s the only reasonable way to get what I want).
At the bottom of a 4 level package nest I’ve defined a constant (“conA”), a function (“funcB”), and a variable (“varC”). Going from the bottom up, the __init__.py file of every level (inside the (nested) folder that implements the package) contains:
from .levelbelowModuleName import conA
from .levelbelowModuleName import funcB
from .levelbelowModuleName import varC
so that the “recursive” imports make the “level4” entities visible by their “level4” names at all levels.
In the higher level packages “read only” references to all entities do not cause warning/error messages. No problem so far.
But when trying to update “varC” inside a higher level package I get two warnings: 1) an “unused import: varC” warning at the top of the file, and 2) inside the updating function (which has a “global varC” statement), an “unused variable: varC” warning at the “varC =” line. OK so far, since the two “varC”s aren’t being associated (but see the nasty side issue below).
I thought (inferring from chapters 23 and 24 of Lutz’s “Learning Python”) that imported names at all levels would refer to the same object (address), and therefore that updating a variable that resides in a “nephew” (brother’s child) package would work. Elevating “varC” to the nearest common ancestor package (the immediate parent of the problem package, grandparent of the “nephew”) seems to be a valid workaround (it eliminates the warnings), but doing this defeats the purpose of the object-oriented package structure.
Converting to absolute imports did not help. Renaming/aliasing “varC” (by using “as” on the imports) did not help.
BTW, the update line used in the higher level module is “varC = X.getTable()”; this returns a matrix (list of lists) of Tkinter IntVars from a custom class.
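To make the layout concrete, here is a stripped-down sketch of the structure being described (all names below are invented; the Tkinter details are omitted):

# Layout on disk (illustrative names only):
#
#   level1/__init__.py                   from .level2 import conA, funcB, varC
#   level1/level2/__init__.py            from .level3 import conA, funcB, varC
#   level1/level2/level3/__init__.py     from .level4mod import conA, funcB, varC
#   level1/level2/level3/level4mod.py    conA, funcB and varC defined here

# In the higher level "problem file":
from level1.level2 import varC     # flagged as "unused import: varC"

def updateTable(X):
    global varC
    varC = X.getTable()            # flagged as "unused variable: varC"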
The nasty side issue: a “read only” reference to “varC” anywhere in the problem file, e.g. “print(varC)” whether at the top of the file or inside the function, will eliminate both warnings, hiding the problem.
Is there a way of having my cake and eating it too? Is there a “non-class” way of having “varC” reside at level4 and still be updateable by the higher level packages? Or is using a common ancestor the only simple-to-understand “non-class” approach that will work?
P.S. Of the related questions that were suggested when entering this question, none seem to apply. A similar, but simpler (not nested) question is:
How to change a module variable from another module?
Added 2015-03-27:
Here are two Eclipse screenshots of the actual files. (For those not familiar with Eclipse, __init__.py appears as a tab that has the package name.) The first shot shows the recursive imports (from bottom to top). The second shows the function, the "unused import" (yellow triangle) warning for "SiteSpecificReports". The line highlighted in blue is where the "unused variable" warning was (it has mysteriously disappeared).
Ignore everything to do with LongitudeReports; it's essentially a clone of MessageCountReports. For clarity, I've hidden all code that is not relevant to the problem (e.g. Tkinter calls). Ignore the red "X" on the file names; all are Tkinter "init time" type mismatches that disappear when the code is run (as per the comment above "SiteSpecificReports" in __init__.py for MessageCountReports).
In the module hierarchy the problem file is highlighted in grey. "A_Mainline.py" is the execute point, everything below that is the code being refactored (some has already been moved into the packages above that file). Finally, excluding CZQX, all subpackages beneath "SiteSpecific" are placeholders and only contain an empty __init__.py file.
Updated 2015-10-23
The point behind all of this was to keep file sizes to a reasonable level by splitting each module into several source files.
The accepted answer (and its included link) provided the clue I needed. My problem was that when I refactored a file/module into several subfiles (each containing the variable definitions and the functions that modify them), I was thinking of each subfile as a class-like "black box" object instead of, somewhat more correctly, a simple "insert this file into the higher level file" operation (like an editor "paste" command).
My thinking was that the recursive imports would, in effect, recursively promote the addresses of the lower level variables into the higher level __init__.py namespaces, thus making those variables visible to all other subfiles of the module (which would only reference those variables), allowing me to have my cake (localized definitions) and eat it too (variables available at the topmost level). This approach has worked for me in other compiled languages, notably Ada 83.
When creating subfiles it seems that, for a variable to be visible to other subfiles, you need to define it in the topmost file instead of the bottommost one, which defeats the "object-ish" approach I was trying to use. It's a bit of a bummer that this approach doesn't work, since the location of the variables makes it awkward to reuse a subfile in other modules. Converting each file to a class would accomplish what I was after, but that seems pointless when all you need is the "insert this code block here" effect.
Anyway, the important thing is that everything's working now.
You're missing the distinction between names and values in Python. Names live in namespaces and point to values. When you say:
from .blah import Foo
you're creating a new name in your current namespace that points to the same value that Foo points to in the blah namespace. If after that you say:
Foo = 1
That changes your local namespace's Foo to point to 1, but it does nothing to blah.Foo; you would have to explicitly say blah.Foo = 1 to change the name Foo that lives in blah.
This blog post is a good read to clarify.
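As a minimal, runnable illustration of the difference (the module is faked with types.ModuleType so the example is self-contained):

import types

# Stand-in for the "blah" module from the answer above.
blah = types.ModuleType("blah")
blah.Foo = [1, 2, 3]

Foo = blah.Foo      # what "from blah import Foo" effectively does
Foo = 1             # rebinds only the local name "Foo"
print(blah.Foo)     # [1, 2, 3] -- the module is untouched

blah.Foo = 1        # assigning through the module is what actually changes it
print(blah.Foo)     # 1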
In my current work environment, we produce a large number of Python packages for internal use (10s if not 100s). Each package has some dependencies, usually on a mixture of internal and external packages, and some of these dependencies are shared.
As we approach dependency hell, updating dependencies becomes a time consuming process. While we care about the functional changes a new version might introduce, of equal (if not more) importance are the API changes that break the code.
Although running unit/integration tests against newer versions of a dependency helps us to catch some issues, our coverage is not close enough to 100% to make this a robust strategy. Release notes and a change log help identify major changes at a high-level, but these rarely exist for internally developed tools or go into enough detail to understand the implications the new version has on the (public) API.
I am looking at other ways to automate this process.
I would like to be able to automatically compare two versions of a Python package and report the API differences between them. In particular, this would include backwards-incompatible changes such as removing functions/methods/classes/modules, adding positional arguments to a function/method/class, and changing the number of items a function/method returns. As a developer, based on the report this generates I should have a greater understanding of the code-level implications this version change will introduce, and so of the time required to integrate it.
Elsewhere, we use the C++ abi-compliance-checker and are looking at the Java api-compliance-checker to help with this process. Is there a similar tool available for Python? I have found plenty of lint/analysis/refactor tools but nothing that provides this level of functionality. I understand that Python's dynamic typing will make a comprehensive report impossible.
If such a tool does not exist, are there any libraries that could help with implementing a solution? For example, my current approach would be to use an ast.NodeVisitor to traverse the package and build a tree where each node represents a module/class/method/function, and then compare this tree to that of another version of the same package.
Edit: since posting the question I have found pysdiff, which covers some of my requirements, but I'm still interested in alternatives.
Edit: I also found Upstream-Tracker, which is a good example of the sort of information I'd like to end up with.
What about using the AST module to parse the files?
import ast

with open("test.py") as f:
    python_src = f.read()

node = ast.parse(python_src)  # note: this parses the source but doesn't compile it
print(ast.dump(node))
There's the ast.walk() function, which takes a node (described at http://docs.python.org/2/library/ast.html).
The astdump package might also work (available on PyPI).
There's also this out-of-date pretty printer: http://code.activestate.com/recipes/533146-ast-pretty-printer/
The documentation tool Sphinx also extracts the information you are looking for. Perhaps give that a look.
So walk the AST and build a tree with the information you want in it. Once you have a tree you can pickle it and diff it later, or convert the tree to a text representation in a text file that you can diff with difflib or some external diff program.
The ast module has parse(), and the built-in compile() can also take an AST. The only thing is I'm not entirely sure how much information is available to you after parsing alone (as you don't want to compile()).
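A rough sketch of that idea, dumping "module.function(args)" lines that can be sorted and diffed later (all the names here are illustrative):

import ast
import difflib

def public_api(path, module_name):
    """Collect 'module.function(arg, ...)' lines for public functions in a file."""
    with open(path) as f:
        tree = ast.parse(f.read(), filename=path)
    lines = []
    for node in ast.walk(tree):
        # ast.walk also visits methods and nested functions; refine as needed.
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append("{}.{}({})".format(module_name, node.name, args))
    return sorted(lines)

# old_api = public_api("old/mymodule.py", "mymodule")
# new_api = public_api("new/mymodule.py", "mymodule")
# print("\n".join(difflib.unified_diff(old_api, new_api, lineterm="")))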
Perhaps you can start by using the inspect module
import inspect
import types

def genFunctions(module):
    moduleDict = module.__dict__
    for name in dir(module):
        if name.startswith('_'):
            continue  # skip private names
        element = moduleDict[name]
        if isinstance(element, types.FunctionType):
            argSpec = inspect.getfullargspec(element)  # getargspec is deprecated
            argList = argSpec.args
            print("{}.{}({})".format(module.__name__, name, ", ".join(argList)))
That will give you a list of "public" (not starting with underscore) functions with their argument lists. You can add more stuff to print the kwargs, classes, etc.
Once you run that on all the packages/modules you care about, in both old and new versions, you'll have two lists like this:
myPackage.myModule.myFunction1(foo, bar)
myPackage.myModule.myFunction2(baz)
Then you can either just sort and diff them, or write some smarter tooling in Python to actually compare all the names, e.g. to permit additional optional arguments but reject new mandatory arguments.
Check out zope.interface (you can get it from PyPI). Then you can incorporate testing that your modules support the interfaces into your unit tests. It may take a while to retrofit, however, and it's not a silver bullet.
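A hedged sketch of that idea, pinning down the expected API as an interface and verifying it in a test (all the names below are invented):

import unittest
from zope.interface import Interface, implementer
from zope.interface.verify import verifyClass

class IParser(Interface):
    """The API contract we expect the dependency to keep providing."""
    def parse(text):
        """Parse *text* and return a document."""

@implementer(IParser)
class Parser:
    """Stand-in for a class from the dependency being checked."""
    def parse(self, text):
        return text.split()

class ApiContractTest(unittest.TestCase):
    def test_parser_still_matches_interface(self):
        # verifyClass raises (failing the test) if a new version drops or renames parse().
        self.assertTrue(verifyClass(IParser, Parser))

if __name__ == "__main__":
    unittest.main()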
Recently I have been working on a Python project with the usual directory structure, and I have received help from someone else who has given me a code snippet (a single function definition, about 30 lines long) that I would like to import into my code. What is the most appropriate directory/location in a Python project for storing borrowed code of this size? Is it best to put the snippet in an entirely separate module and import it from there?
I generally find it easiest to put such code in a separate file, because for clarity you don't want more than one different copyright/licensing term to apply within a single file. So in Python this does indeed mean a separate module. Then the file can contain whatever attribution and other legal boilerplate you need.
As long as your file headers don't accidentally claim copyright on something to which you do not own the copyright, I don't think it's actually a legal problem to mix externally-licensed or public domain code into files you mostly own. I may be wrong, though, which is why I normally avoid giving myself reason to think about it. A comment saying "this is external code from the following source with the following license:" may well be clearer than dividing code into different files that naturally wouldn't be. So I do occasionally do that.
I don't see any definite need for a separate directory (or package) per separate external source. If that's already part of your project structure (that is, it already uses external libraries by incorporating their source) then I suppose you might as well continue the trend.
I usually place scripts I copy off the internet in a folder/package called borrowed so I know all of the code here is stuff that I didn't write myself.
That is, if it's something more substantial than a one or two-liner demonstrating how something works.
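As a small sketch of how such a module might be laid out (the folder name follows the suggestion above; everything else, including the source URL, is invented):

# borrowed/flatten_helper.py
#
# Source:  https://example.com/some-forum-answer   (hypothetical URL)
# Author:  <original author>
# License: <whatever terms the snippet was shared under>
#
# Kept verbatim in its own module so the attribution and licensing notes
# above apply to this file only.

def flatten(nested):
    """The ~30-line borrowed function would live here, unmodified."""
    return [item for sub in nested for item in sub]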
I use Vim+Ctags to write Python, and my problem is that Vim often jumps to the import for a tag, rather than the definition. This is a common issue, and has already been addressed in a few posts here.
This post shows how to remove the imports from the tags file. This works quite well, except that sometimes it is useful to have tags for the imports (e.g. when you want to list all the places where a class/function has been imported).
This post shows how to get to the definition without removing the imports from the tags file. This is basically what I've been doing so far (I just remapped :tjump to a single keystroke). However, you still need to navigate the list of tags that comes up to find the definition entry.
It would be nice if it were possible to just tell Vim to "go to the definition" with a single key chord (e.g. ). Exuberant Ctags annotates the tag entries with the type of entry (e.g. c for classes, i for imports). Does anyone know if there is a way to get Vim to utilize these annotations, so that I could say things like "go to the first tag that is not of type i"?
Unfortunately, there's no way for Vim itself to do that inference business and jump to an import or a definition depending on some context: when searching for a tag in your tags file, Vim stops at the first match whatever it is. A plugin may help but I'm not aware of such a thing.
Instead of <C-]> or :tag foo, you could use g] or :ts foo which shows you a list of matches (with kinds and a preview of the line of each match) instead of jumping to the first one. This way, you are able to tell Vim exactly where you want to go.