This is a question about good Python coding practice and where to place imports.
I have a somewhat large program that uses lots of external modules/packages/functions. Currently I load all of them at the very top of the code because that's how I've seen it done. This is bothersome, for example, when I need to comment out a block of the code for testing, because then I need to go up, look for the modules that block was using, and comment them out too. I know I don't have to do this last part, but I do it for consistency since I don't like to import things I won't be using.
If the imported modules were listed right above the block that makes use of them, this process would be easier and the code would be easier to follow, at least for me.
My question is whether it is recommended to import all modules at the beginning of the code or should I do it throughout the code as necessary?
It's officially recommended to import at the beginning, see PEP8:
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:
standard library imports
related third party imports
local application/library specific imports
You should put a blank line between each group of imports.
Put any relevant __all__ specification after the imports.
Well, I would say do it however is easiest for you. But I think following the PEP8 recommendation (already referenced in another answer) is generally the best method. It's easier to see everything your module references this way all in one place, and that's where new programmers will look to add their imports to your code later.
As an aside, this page gives one reason why you might violate this convention, https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Import_Statement_Overhead:
Note that putting an import in a function can speed up the initial loading of the module, especially if the imported module might not be required. This is generally a case of a "lazy" optimization -- avoiding work (importing a module, which can be very expensive) until you are sure it is required.
If the import of a module is expensive (it's huge, or has side effects), then you might want to put it later. In short, if you want to follow this part of 'The Zen of Python':
Although practicality beats purity.
with respect to your import placement, then do so.
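A minimal sketch of that "lazy" placement, assuming a hypothetical helper named to_json. The json module stands in for any module that is expensive to load and not always needed.

```python
import sys


def to_json(data):
    """Serialize data; the json module is imported on first use only."""
    # Deferred import: nothing is loaded unless this function is called.
    import json

    return json.dumps(data, sort_keys=True)


text = to_json({"b": 1, "a": 2})
# After the first call the module is cached in sys.modules.
loaded = "json" in sys.modules
```

Importing the file that contains to_json costs nothing extra; the price of the import is paid by the first caller instead.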
Related
I created a module named util that provides classes and functions I often use in Python.
Some of them need imported features. What are the pros and the cons of importing needed things inside class/function definition? Is it better than import at the beginning of a module file? Is it a good idea?
It's the most common style to put every import at the top of the file. PEP 8 recommends it, which is a good reason to do it to start with. But that's not a whim, it has advantages (although not critical enough to make everything else a crime). It allows finding all imports at a glance, as opposed to looking through the whole file. It also ensures everything is imported before any other code (which may depend on some imports) is executed. NameErrors are usually easy to resolve, but they can be annoying.
There's no (significant) namespace pollution to be avoided by keeping the module in a smaller scope, since all you add is the actual module (no, import * doesn't count and probably shouldn't be used anyway). Inside functions, you'd import again on every call (not really harmful since everything is imported once, but uncalled for).
PEP8, the Python style guide, states that:
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Of course this is no hard and fast rule, and imports can go anywhere you want them to. But putting them at the top is the best way to go about it. You can of course import within functions or a class.
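But note you cannot do this:
def foo():
    from os import *
Because Python 2 warns about it, and Python 3 rejects it outright with a SyntaxError carrying the same message:
SyntaxWarning: import * only allowed at module level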
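To check how your own interpreter treats this, a small sketch can compile the snippet from a string, so the surrounding file still loads. On Python 3 the compile step should raise a SyntaxError; the variable names here are just for illustration.

```python
# Source of a function that tries a star import in a local scope.
source = "def foo():\n    from os import *\n"

try:
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    # Python 3 refuses to compile a star import inside a function.
    message = exc.msg
else:
    message = None

print(message)
```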
Like flying sheep's answer, I agree that the others are right, but I put imports in other places like in __init__() routines and function calls when I am DEVELOPING code. After my class or function has been tested and proven to work with the import inside of it, I normally give it its own module with the import following PEP8 guidelines. I do this because sometimes I forget to delete imports after refactoring code or removing old code with bad ideas. By keeping the imports inside the class or function under development, I am specifying its dependencies should I want to copy it elsewhere or promote it to its own module...
Only move imports into a local scope, such as inside a function definition, if it’s necessary to solve a problem such as avoiding a circular import or are trying to reduce the initialization time of a module. This technique is especially helpful if many of the imports are unnecessary depending on how the program executes. You may also want to move imports into a function if the modules are only ever used in that function. Note that loading a module the first time may be expensive because of the one time initialization of the module, but loading a module multiple times is virtually free, costing only a couple of dictionary lookups. Even if the module name has gone out of scope, the module is probably available in sys.modules.
https://docs.python.org/3/faq/programming.html#what-are-the-best-practices-for-using-import-in-a-module
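The caching described in the FAQ can be observed directly. This is a minimal sketch using the standard json module; use_json is a hypothetical function name.

```python
import importlib
import sys

import json  # first import: the module is executed and cached

cached = sys.modules["json"]


def use_json():
    # A repeated import is a lookup in sys.modules plus a name binding.
    import json

    return json


# Every route to the module yields the very same object.
same_from_function = use_json() is cached
same_from_importlib = importlib.import_module("json") is cached
```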
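I believe that it's best practice (according to PEP 8) to keep import statements at the beginning of a module. You can also add import statements to a package's __init__.py file, which makes the imported names available as attributes of the package; note that this does not import them into the other modules inside the package, each of which still needs its own imports.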
So...it's certainly something you can do the way you're doing it, but it's discouraged and actually unnecessary.
While the other answers are mostly right, there is a reason why Python allows this.
It is not smart to import redundant stuff which isn't needed. So if you want to, say, parse XML into an element tree, but don't want to use the slow built-in XML parser when lxml is available, you would need to check this at the moment you invoke the parser.
And instead of recording the availability of lxml at the beginning, I would prefer to try importing and using lxml, and if it's not there, fall back to the built-in xml module.
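A minimal sketch of that fallback, assuming a hypothetical helper named parse_xml. It relies only on the fromstring function, which both lxml.etree and the standard xml.etree.ElementTree provide.

```python
def parse_xml(text):
    """Parse an XML string, preferring lxml when it is installed."""
    try:
        # Third-party parser; may not be installed.
        from lxml import etree
    except ImportError:
        # Standard library fallback with a largely compatible API.
        import xml.etree.ElementTree as etree
    return etree.fromstring(text)


root = parse_xml("<a><b>hi</b></a>")
print(root.find("b").text)
```

The decision is made at the point of use, so a program that never parses XML never pays for either import.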
From a performance point of view (time or memory) is it better to do:
import pandas as pd
or
from pandas import DataFrame, TimeSeries
Does the best choice depend on how many classes I'm importing from the package?
Similarly, I've seen people do things like:
def foo(bar):
    from numpy import array
Why would I ever want to do an import inside a function or method definition? Wouldn't this mean that import is being performed every time that the function is called? Or is this just to avoid namespace collisions?
This is micro-optimising, and you should not worry about this.
Modules are loaded once per Python process. All code that then imports the module only needs to bind a name to the module or to objects defined in it. That binding is extremely cheap.
Moreover, the top-level code in your module only runs once too, so the binding takes place just once. An import in a function does the binding each time the function is run, but again, this is so cheap as to be negligible.
Importing in a function makes a difference for two reasons: it won't put that name in the global namespace for the module (so no namespace pollution), and because the name is now local, using that name is slightly faster than using a global.
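A small sketch of the local-name point, using the standard math module and a hypothetical function local_sqrt. Inspecting the function's code object shows that the imported name is an ordinary local variable.

```python
import sys


def local_sqrt(x):
    # Binds the local name "math" each time the function runs.
    import math

    return math.sqrt(x)


result = local_sqrt(9.0)

# The imported name is listed among the function's local variables...
is_local_name = "math" in local_sqrt.__code__.co_varnames
# ...while the module object itself lives in the process-wide cache.
is_cached = "math" in sys.modules
```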
If you want to improve performance, focus on code that is being repeated many, many times. Importing is not it.
Answering the more general question of when to import: imports are dependencies. They are code that may or may not exist, and that is required for the program to function. It is therefore a very good idea to import that code as soon as possible, to prevent dumb errors from cropping up in the middle of execution.
This is particularly true as PyPy becomes more popular, when the module might exist but not be usable under PyPy. Far better to fail early than potentially hours into the execution of the code.
As for "import pandas as pd" vs "from pandas import DataFrame, TimeSeries", this question has multiple concerns (as all questions do), with some far more important than others. There's the question of namespace, there's the question of readability, and there's the question of performance. Performance, as Martijn states, should contribute about 0.0001% of the decision. Readability should contribute about 90%. Namespace only 10%, as it can be mitigated so easily.
Personally, I think both import X as Y and from X import Y are bad practice, because explicit is better than implicit. You don't want to be on line 2000 trying to remember which package "calculate_mean" comes from because it isn't referenced anywhere else in the code. When I first started using numpy I was copy/pasting code from the internet, and couldn't figure out why I couldn't pip install np. This obviously isn't a problem if you already know that "np" is the conventional alias for "numpy", but it's a stupid and pointless confusion for the 3 letters it saves. It came from numpy. Use numpy.
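For comparison, here is a short sketch of the three spellings side by side, using the standard statistics module as a stand-in. They all load the same module; only the name bound in your namespace differs, which is why the choice is mostly about readability.

```python
import statistics
import statistics as st
from statistics import mean

# All three names lead to the same underlying objects.
same_module = st is statistics
same_function = mean is statistics.mean

values = [1, 2, 3, 4]
results = (
    statistics.mean(values),  # origin is explicit at the call site
    st.mean(values),          # reader must know what "st" stands for
    mean(values),             # reader must look up where mean came from
)
```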
There is an advantage to importing a module inside a function that hasn't been mentioned yet: doing so gives you some control over when the module is loaded. In fact, even though @J.J's answer recommends importing all modules as early as possible, this control allows you to postpone loading the module.
Why would you want to do that? Well, while it doesn't improve the actual performance of your program, doing so can improve the perceived performance, and by virtue of this, the user experience:
In part, users perceive whether your app is fast or slow based on how long it takes to start up.
MSDN: Best practices for your app's startup performance
Loading every module at the beginning of your main script can take some time. For example, one of my apps uses the Qt framework, Pandas, Numpy, and Matplotlib. If all these modules are imported right at the beginning of the app, the appearance of the user interface is delayed by several seconds. Users don't like to wait, and they are likely to perceive your app as generally slow because of this wait.
But if for example Matplotlib is imported only from within those functions that are called whenever the user issues a plot command, the startup time is notably reduced. The user doesn't perceive your app to be that sluggish anymore, which may result in a better user experience.
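Before moving imports around, it can help to measure which ones are actually slow. The following is a rough sketch with a hypothetical helper timed_import; the decimal module is only a stand-in for something heavy such as Matplotlib, and the numbers will vary from machine to machine.

```python
import importlib
import time


def timed_import(name):
    """Import a module by name and return it with the elapsed seconds."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    elapsed = time.perf_counter() - start
    return module, elapsed


# The first call may execute the module; the second hits the cache.
module, first_seconds = timed_import("decimal")
_, second_seconds = timed_import("decimal")

print(f"first: {first_seconds:.6f}s, repeat: {second_seconds:.6f}s")
```

Imports that take a noticeable fraction of a second on the first call are the candidates worth deferring into the functions that need them.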
If I were to create a module called, for example, imp_mod.py, containing imports of all the modules I frequently use, would importing it into my main program give me access to the imports contained inside imp_mod.py?
If so, what disadvantages would this bring?
I guess a major advantage would be a reduction of time spent importing, even though it's only a couple of seconds saved...
Yes, it would allow you to access them. If you place these imports in imp_mod.py:
from os import listdir
from collections import defaultdict
from copy import deepcopy
Then, you could do this in another file, say, myfile.py:
import imp_mod
imp_mod.listdir
imp_mod.defaultdict
imp_mod.deepcopy
You're wrong about reduction of importing time, as what happens is the opposite. Python will need to import imp_mod and then import the other modules afterwards, while the first import would not be needed if you were importing these modules in myfile.py itself. If you do the same imports in another file, they will already be in cache, so virtually no time is spent in the next import.
The real disadvantage here is reduced readability. Whoever looks at imp_mod.listdir, for example, will wonder what this function is and why it has the same name as the os module's function. When they have to open imp_mod.py just to find out that it's the same function, they probably won't be happy. I wouldn't be.
As lucasnadalutti mentioned, you can access them by importing your module.
In terms of advantages, it can make your main program care less about where the imports are coming from if imp_mod handles all of them. However, as your program gets more complex and starts to include more namespaces, this approach can get messy. You can handle some of this by using __init__.py files within directories to do a similar thing, but personally, I feel it adds a little more complexity as things grow. I'd rather just know where a module came from so I can look it up.
I read here about sorting your import statements in Python, but what if the thing you are importing needs dependencies that have not been imported yet? Is this the difference between compiled and interpreted languages? I come from a JavaScript background, where the order in which you load your scripts matters, whereas Python appears not to care. Thanks.
Import order does not matter. If a module relies on other modules, it needs to import them itself. Python treats each .py file as a self-contained unit as far as what's visible in that file.
(Technically, changing import order could change behavior, because modules can have initialization code that runs when they are first imported. If that initialization code has side effects it's possible for modules to have interactions with each other. However, this would be a design flaw in those modules. Import order is not supposed to matter, so initialization code should also be written to not depend on any particular ordering.)
Python import order does not matter when you are importing standard Python libraries/modules.
But the order matters for your local application/library-specific imports, as you may get stuck in a circular dependency loop, so do look before importing.
No, it doesn't, because each Python module should be self-contained and import everything it needs. This holds true both for importing whole modules and for importing only specific parts of them.
Order can matter for various nefarious reasons, including monkey patching.
Possible Duplicate:
Perl's AUTOLOAD in Python (getattr on a module)
I'm coming from a PHP background and attempting to learn Python, and I want to be sure to do things the "Python way" instead of how I've developed before.
My question comes from the fact that in PHP5 you can set up your code so that if you attempt to use a class that doesn't exist in the namespace, a function runs first that loads the class and lets you continue as if it were already loaded. The advantages of this are that classes aren't loaded unless they are used, and you don't have to worry about loading classes before using them.
In Python, there's a lot of emphasis on the import statement. Is it bad practice to attempt an auto-importing trick in Python to alleviate the need for an import statement? I've found this module that offers auto importing, but I don't know if that's the best way of doing it, or if auto importing of modules is something that is recommended. Thoughts?
Imports serve at least two other important purposes besides making the modules or contents of the modules available:
They serve as a sort of declaration of intent -- "this module uses services from this other module" or "this module uses services belonging to a certain class" -- e.g. if you are doing a security review for socket-handling code, you can begin by only looking at modules that import socket (or other networking-related modules)
Imports serve as a proxy for the complexity of a module. If you find yourself with dozens of lines of imports, it may be time to reconsider your separation of concerns within the module, or within your application as a whole. This is also a good reason to avoid "from foo import *"-type imports.
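Because imports are explicit statements, both uses above can be automated. As a minimal sketch, the standard ast module can list what a piece of source imports; imported_modules is a hypothetical helper name.

```python
import ast


def imported_modules(source):
    """Return the set of module names imported anywhere in source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import a, b.c
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # from a.b import c (relative "from . import x" has no module)
            names.add(node.module)
    return names


example = "import socket\nfrom os import path\n\ndef f():\n    import json\n"
found = imported_modules(example)
uses_network = "socket" in found
```

A review script could run this over every file and flag the ones that import socket, or report modules whose import count has grown suspiciously large.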
In Python, people usually avoid auto imports, just because it is not worth the effort. You may slightly remove startup costs, but otherwise, there is no (or should be no) significant effect. If you have modules that are expensive to import and do a lot of stuff that doesn't need to be done, rather rewrite the module than delay importing it.
That said, there is nothing inherently wrong with auto imports. Because of the proxy nature, there may be some pitfalls (e.g. when looking at a thing that has not been imported yet). Several auto importing libraries are floating around.
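To make the proxy idea concrete, here is a minimal sketch of a lazy module wrapper. LazyModule is a hypothetical class written for illustration, not the implementation used by any particular auto-import library.

```python
import importlib


class LazyModule:
    """Stand-in object that imports the real module on first attribute use."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


lazy_json = LazyModule("json")
loaded_before = lazy_json._module is not None

text = lazy_json.dumps([1, 2])  # first real use triggers the import
loaded_after = lazy_json._module is not None
```

One pitfall mentioned above shows up immediately: until the first attribute access, lazy_json is not a module at all, so tools that inspect it (or a typo in the module name) only surface at first use rather than at startup.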
If you are learning Python and want to do things the Python way, then just import the modules. It's very unusual to find autoimports in Python code.
You could auto-import the modules, but the most I have ever needed to import was about 10, and that is after I tacked features on top of the original program. You won't be importing a lot, and the names are very easy to remember.