Python: how to ensure that an import comes from the defining module?

I think the following is bad style:
# In foo_pkg/bla_mod.py:
magic_value = 42
# In foo_pkg/bar_mod.py:
from .bla_mod import magic_value
# In doit.py:
from foo_pkg.bar_mod import magic_value
Instead, I'd like to always import an object from the module where it is defined, i.e. in this case:
# In doit.py:
from foo_pkg.bla_mod import magic_value
Finding issues of this sort by hand gets tedious very quickly (for each imported object, you have to open the module and check whether it defines the object or imports it from another module).
What is the best way to automate this check? To my surprise, neither pylint nor pyflakes seem to have an appropriate checker, but maybe there's another tool (or even some trick that can be used in Python itself)?
Problem statement in a nutshell: given a bunch of Python source files, find every import of an object from a module that does not itself define the object.
I know there are libraries (including the standard library) where one module provides the main external API and internally imports the necessary symbols from other modules. However, that's not the case in the code base I'm working with; here these are artifacts of refactorings that I really want to eliminate.

Here's a draft script that solves the problem in less than 100 lines: http://pastebin.com/CFsR6b3s
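For reference, here is a minimal sketch of such a checker built on the standard ast module (the flat-source-tree layout and the output format are my assumptions, not taken from the pastebin script; relative imports and package __init__ files are not resolved properly here):

# check_imports.py -- minimal sketch, assuming all modules live under one
# source tree passed as the first command-line argument.
import ast
import sys
from pathlib import Path

def defined_names(tree):
    """Names bound at module top level by def/class/assignment (not by import)."""
    names = set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names

def main(root):
    # Parse every source file once, keyed by its dotted module name.
    trees = {}
    for path in root.rglob("*.py"):
        module = ".".join(path.relative_to(root).with_suffix("").parts)
        trees[module] = ast.parse(path.read_text())
    # Flag every "from X import Y" where X does not itself define Y.
    for module, tree in trees.items():
        for node in ast.walk(tree):
            if isinstance(node, ast.ImportFrom) and node.module in trees:
                origin = defined_names(trees[node.module])
                for alias in node.names:
                    if alias.name != "*" and alias.name not in origin:
                        print(f"{module}: {alias.name} imported from "
                              f"{node.module}, which does not define it")

if __name__ == "__main__":
    main(Path(sys.argv[1]))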

Related

Can I tell pylance in VS Code to assume import of certain symbols globally for all files?

I am working with code that uses Amazon's boto3 library. That library does not define most of its symbols statically and generates a lot of its classes at runtime. As a result, it is not possible to use it for type hints.
There is a library called boto3-stubs; I found it in this answer: https://stackoverflow.com/a/54676985/607407. However, this is a large codebase and there is a certain inertia to adding new libraries, especially ones that do not actually do anything at runtime.
What I can do is use forward type annotations, like this:
def fn(versions: Iterator['ObjectVersion']):
    pass
This will not affect the runtime, and the rest of our team is OK with these names. But it serves little purpose unless Pylance can recognize that I mean mypy_boto3_s3.service_resource.ObjectVersion.
Is there a setting that can tell Pylance: "For every file, automatically import these modules."?
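I don't know of a Pylance setting that injects imports globally, but a common workaround (standard typing machinery, not a Pylance feature) is a typing-only import guarded by typing.TYPE_CHECKING, which adds no runtime dependency; a minimal sketch:

from typing import TYPE_CHECKING, Iterator

if TYPE_CHECKING:
    # Only evaluated by type checkers (Pylance, mypy), never at runtime,
    # so boto3-stubs stays a development-only dependency.
    from mypy_boto3_s3.service_resource import ObjectVersion

def fn(versions: Iterator['ObjectVersion']) -> None:
    pass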

Difference between importing a Python library within a function versus importing globally?

Suppose I want to import a Python library for use inside a function. Is it better to import the library within the function or import it globally?
Do this
def test_func():
    import pandas as pd
    # code implementation
or have the line below at the top of the Python file to import globally?
import pandas as pd
What are the pros and cons of each approach? Which is the best practice in Python?
I am using Python v3.6.
EDIT: Some clarifications to make.
Suppose I have 2 functions.
def func1():
    import pandas as pd
    # code implementation

def func2():
    import pandas as pd
    # code implementation
The Python script runs both functions. Will the library be imported twice, or is the Python interpreter smart enough to import it only once? This has performance implications.
The difference is in name visibility and in when the import executes. A module-level import runs when the file containing it is imported or run itself; a function-local import obviously runs only when the function is called. Likewise, names imported at module level are visible to everything in the file, while a local import is visible only within the function that executes it.
As there is a cost to executing the import statement (a small one, but still), a local import pays it on every call, not just once. It will not fully re-import the module, though: Python caches modules once they have been imported the first time (see importlib.reload and sys.modules).
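A quick sketch that makes the caching visible (the timings will vary; the point is only that the second import is a cheap cache lookup, not a second load):

import sys
import time

def load():
    import json  # this statement runs on every call ...
    return json

t0 = time.perf_counter()
load()          # first call: the module is actually loaded and initialized
t1 = time.perf_counter()
load()          # second call: just a lookup in the sys.modules cache
t2 = time.perf_counter()

print("json" in sys.modules)   # True after the first import
print(t1 - t0, t2 - t1)        # the second duration is typically far smaller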
The best practice clearly is to use module-level imports, and that's what you see in 99.999% of code. A huge reason is maintainability: if you want to understand what dependencies a module has, it's convenient to just look at the top instead of having to comb through all the code.
So when to use function local imports?
There are three scenarios:
you can't use the import earlier. This happens when e.g. a backend for a database or other system/functionality is chosen at runtime through configuration or system inspection (see the sketch after this list).
you otherwise have circular imports. This is a rare case and also a code-smell, so if that is necessary, consider refactoring.
reducing startup-time by deferring module imports. This is very rarely useful though.
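A minimal sketch of the first scenario (postgres_backend is a made-up module name for illustration; sqlite3 is the stdlib fallback):

def get_backend(config):
    # Which module we need is only known at runtime, so the import
    # has to happen inside the function.
    if config.get("db") == "postgres":
        import postgres_backend as backend   # hypothetical third-party module
    else:
        import sqlite3 as backend             # stdlib fallback
    return backend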
So for your case, the answer is a quick and simple "don't do it".
The module is loaded the first time you import it, so if a rarely used module costs a lot of time to initialize, you should import it only where you need it.
Actually, if we care only about performance and not readability, it may always be better to import a module only when we really need it.
But we need to keep our program maintainable. Importing all modules at the top is the most explicit way to tell readers (including the author) which modules are used.
To sum up: if you really have a very costly but rarely used module, import it locally. Otherwise, import everything at the top.

What is the benefit of putting an import statement inside of a class? [duplicate]

I created a module named util that provides classes and functions I often use in Python.
Some of them depend on imported modules. What are the pros and the cons of importing needed things inside a class/function definition? Is it better than importing at the beginning of the module file? Is it a good idea?
It's the most common style to put every import at the top of the file. PEP 8 recommends it, which is a good reason to do it to start with. But that's not a whim; it has advantages (although not critical enough to make everything else a crime). It allows finding all imports at a glance, as opposed to looking through the whole file. It also ensures everything is imported before any other code (which may depend on some imports) is executed. NameErrors are usually easy to resolve, but they can be annoying.
There's no (significant) namespace pollution to be avoided by keeping the module in a smaller scope, since all you add is the actual module (no, import * doesn't count and probably shouldn't be used anyway). Inside functions, you'd import again on every call (not really harmful since everything is imported once, but uncalled for).
PEP 8, the Python style guide, states that:
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Of course this is no hard and fast rule, and imports can go anywhere you want them to. But putting them at the top is the best way to go about it. You can of course import within functions or a class.
But note you cannot do this:
def foo():
    from os import *
because in Python 3 it raises:
SyntaxError: import * only allowed at module level
Like flying sheep's answer, I agree that the others are right. But I put imports in other places, like in __init__() routines and function calls, when I am DEVELOPING code. After my class or function has been tested and proven to work with the import inside of it, I normally give it its own module with the import following PEP 8 guidelines. I do this because sometimes I forget to delete imports after refactoring code or removing old code with bad ideas. By keeping the imports inside the class or function under development, I am specifying its dependencies, should I want to copy it elsewhere or promote it to its own module.
Only move imports into a local scope, such as inside a function definition, if it’s necessary to solve a problem such as avoiding a circular import or are trying to reduce the initialization time of a module. This technique is especially helpful if many of the imports are unnecessary depending on how the program executes. You may also want to move imports into a function if the modules are only ever used in that function. Note that loading a module the first time may be expensive because of the one time initialization of the module, but loading a module multiple times is virtually free, costing only a couple of dictionary lookups. Even if the module name has gone out of scope, the module is probably available in sys.modules.
https://docs.python.org/3/faq/programming.html#what-are-the-best-practices-for-using-import-in-a-module
I believe that it's best practice (according to some PEPs) to keep import statements at the beginning of a module. You can add import statements to an __init__.py file, which will import those modules when the package is imported.
So...it's certainly something you can do the way you're doing it, but it's discouraged and actually unnecessary.
While the other answers are mostly right, there is a reason why Python allows this.
It is not smart to import stuff that isn't needed. So if you want to, e.g., parse XML into an element tree, but don't want to use the slow built-in XML parser if lxml is available, you would need to check for it at the moment you need to invoke the parser.
And instead of memorizing the availability of lxml at the beginning, I would prefer to try importing and using lxml and, if it's not there, fall back to the built-in xml module.
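For example (a common pattern; both modules are real, the shared alias is my choice):

try:
    # Prefer the fast lxml parser if it is installed ...
    from lxml import etree
except ImportError:
    # ... otherwise fall back to the built-in parser.
    import xml.etree.ElementTree as etree

root = etree.fromstring("<root><child/></root>")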

Does Python import order matter

I read here about sorting your import statements in Python, but what if the thing you are importing needs dependencies that have not been imported yet? Is this the difference between compiled and interpreted languages? I come from a JavaScript background, where the order in which you load your scripts matters, whereas Python appears not to care. Thanks.
Import order does not matter. If a module relies on other modules, it needs to import them itself. Python treats each .py file as a self-contained unit as far as what's visible in that file.
(Technically, changing import order could change behavior, because modules can have initialization code that runs when they are first imported. If that initialization code has side effects it's possible for modules to have interactions with each other. However, this would be a design flaw in those modules. Import order is not supposed to matter, so initialization code should also be written to not depend on any particular ordering.)
Python import order does not matter when you are importing standard Python libraries/modules.
But the order can matter for your local application/library-specific imports, as you may get stuck in a circular dependency loop, so do look before importing.
No, it doesn't, because each Python module should be self-contained and import everything it needs. This holds true for importing whole modules and for importing only specific parts of them.
Order can matter for various nefarious reasons, including monkey patching.
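A sketch of how monkey patching makes order matter (patcher, consumer, and main are made-up module names; json is the real stdlib module):

# patcher.py -- hypothetical module that monkey-patches json at import time
import json
_original = json.dumps
json.dumps = lambda obj, **kw: _original(obj, sort_keys=True, **kw)

# consumer.py -- hypothetical module that calls json.dumps while being imported
import json
SNAPSHOT = json.dumps({"b": 1, "a": 2})

# main.py -- SNAPSHOT is '{"a": 2, "b": 1}' if patcher is imported first,
# but '{"b": 1, "a": 2}' if consumer is imported first.
import patcher
import consumer
print(consumer.SNAPSHOT)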

Mocking Python iterables for use with Sphinx

I'm using Sphinx to document a project that depends on wxPython, using the autodoc extension so that it will automatically generate pages from our docstrings. The autodoc extension automatically operates on every module you import, which is fine for our packages but is a problem when we import a large external library like wxPython. Thus, instead of letting it generate everything from wxPython, I'm using the unittest.mock library module (previously the external package Mock). The most basic setup works fine for most parts of wxPython, but I've run into a situation I can't see an easy way around (likely because of my relative unfamiliarity with mock until this week).
Currently, the end of my conf.py file has the following:
MOCK_MODULES = ['wx.lib.newevent'] # I've skipped irrelevant entries...
for module_name in MOCK_MODULES:
    sys.modules[module_name] = mock.Mock()
For all the wxPython modules but wx.lib.newevent, this works perfectly. However, here I'm using the newevent.NewCommandEvent() function[1] to create an event for a particular scenario. In this case, I get a warning on the NewCommandEvent() call with the note TypeError: 'Mock' object is not iterable.
While I can see how one would use patching to handle this for building out unit tests (which I will be doing in the next month!), I'm having a hard time seeing how to integrate that at a simple level in my Sphinx configuration.
Edit: I've just tried using MagicMock() as well; this still produces an error at the same point, though it now produces ValueError: need more than 0 values to unpack. That seems like a step in the right direction, but I'm still not sure how to handle this short of explicitly setting it up for this one module. Maybe that's the best solution, though?
Footnotes
[1] Yes, that's a function, naming convention making it look like a class notwithstanding; wxPython follows the C++ naming conventions used throughout the wxWidgets toolkit.
From the error, it looks like it is actually executing newevent.NewCommandEvent(), so I assume that somewhere in your code you have a top-level line something like this:
import wx.lib.newevent
...
event, binder = wx.lib.newevent.NewCommandEvent()
When autodoc imports the module, it tries to run this line of code, but since NewCommandEvent is actually a Mock object, Python can't unpack its output into the (event, binder) tuple. There are two possible solutions. The first is to change your code so that this is not executed on import, maybe by wrapping it inside if __name__ == '__main__'. I would recommend this solution because creating objects like this on import can often have problematic side effects.
The second solution is to tell the Mock object to return appropriate values thus:
wx.lib.newevent.NewCommandEvent = mock.Mock(return_value=(mock.Mock(), mock.Mock()))
However, if you are doing anything in your code with the returned values you might run into the same problem further down the line.
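For completeness, the pieces combined in conf.py might look like this (a sketch; the two mock.Mock() entries in the tuple exist only to satisfy the event, binder = ... unpacking above):

import sys
from unittest import mock

MOCK_MODULES = ['wx.lib.newevent']  # ... plus the other wx modules to stub out
for module_name in MOCK_MODULES:
    sys.modules[module_name] = mock.MagicMock()

# NewCommandEvent() is called at import time and its result is unpacked,
# so the mock has to return a two-element tuple:
sys.modules['wx.lib.newevent'].NewCommandEvent = mock.Mock(
    return_value=(mock.Mock(), mock.Mock()))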
