This is a collection of questions meant to clarify things and improve my understanding, rather than an issue I am having.
I apologise in advance if I have got things wrong or if these questions have been answered before; I wasn't able to find them.
First clarification I want to ask is:
Let us assume:
import scipy
First, I have noticed that you cannot in general access a module in a package by doing import package and then trying to access package.module.
For example scipy.io
You often have to do import package.module or even import astropy.io.fits, or you can do from package import module.
My question is: why is this the case, and why does it seem so random, varying from package to package? I can't seem to identify any stable pattern.
Is it due to the fact that some of these libraries (packages) are very big and in order to not have memory problems it only imports the core attributes/modules?
The second question:
It relates to actually checking the size of these packages. Is there any way to see how big they are when imported? Any way of knowing what will work and what won't other than trying it? I guess I could check with sys.modules and try to obtain it from there?
The third and final question:
In the scenario where I am not running my code on a Raspberry Pi and don't necessarily have to worry about memory (if that is the reason direct access isn't allowed), is there any way of actually importing package such that it also loads all the subpackages?
I am just being lazy and wondering if it is possible. I am aware that it isn't good practice, but curiosity killed the cat.
Just to update, and to make the related questions I have looked at visible to people:
This answer gives good advice on good general practice:
What are good rules of thumb for Python imports?
Why can't I use the scipy.io?, which, just like the documentation, explains why the subpackage isn't necessarily imported
Then there is obviously the documentation:
https://docs.python.org/3/reference/import.html#packages
Section 5.2.1 is the reason why import scipy doesn't also import scipy.io, but I was wondering why developers would not make it an automated process.
This question is actually similar to part of my question but doesn't seem to have a clear answer: Python complex subpackage importing
Status of Questions:
Question 1: Good reason in answers
Question 2: Pending
Question 3: Pending
Answer Q1
When you import a package, especially a large one like SciPy, Python runs the package's __init__.py initialisation module, which deliberately avoids importing every subpackage/module automatically in order to save time and memory. I won't go into this further, as it is already mentioned in this question, documented here, and discussed in other answers.
Additionally, if you have questions about scripts vs. modules, this post is incredibly descriptive.
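To see this concretely, here is a quick interactive check. The exact behaviour depends on your SciPy version: older releases raise AttributeError until the subpackage is imported explicitly, while recent releases lazily load some subpackages on first attribute access.

import scipy
# scipy.io  ->  AttributeError on older SciPy releases

import scipy.io   # explicitly load the subpackage
scipy.io.loadmat  # now accessible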
Answer Q2
To find the size of a package I would point you towards this post about finding package directories, and then this post about reporting the size of a particular directory. You could create some combined code to do both for you.
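As a minimal sketch of that combination (assuming the package is a regular on-disk package; scipy is used here, but any installed package will do):

import os
import scipy

def dir_size(path):
    # Total size in bytes of all files under path.
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

pkg_dir = os.path.dirname(scipy.__file__)
print(pkg_dir, dir_size(pkg_dir) / 1e6, "MB on disk")

Note that this measures on-disk size, not the memory an import actually consumes; for the latter, inspecting sys.modules before and after the import, or using tracemalloc, is closer to what you want.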
Answer Q3
Update: I am unsure how to do this, as the normal from package import * works as explained in the documentation (similar to Q1):
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
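For example, a hypothetical package's __init__.py might contain:

# mypackage/__init__.py (hypothetical)
__all__ = ["module_a", "module_b"]  # "from mypackage import *" imports these two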
A package is represented by its __init__.py file. Therefore, the package scipy is represented by scipy/__init__.py. Inside this file you will see a lot of imports like this:
from scipy.version import version as __version__
This is the reason why scipy.__version__ works, even though __version__ actually lives in scipy.version. Not all packages do this, and there is no rule for when such behaviour can be expected; it is entirely up to the package author(s).
The key difference between these import calls is the namespace the module is imported into. Given the following example:
import mypackage
import mypackage.myclass
from mypackage import myclass
The first example imports everything exposed by __init__.py into the package's namespace, i.e. its elements can be accessed as mypackage.myclass(). The second example imports only mypackage.myclass and still binds it in the package's namespace, so it is still accessed as mypackage.myclass() (note that import mypackage.myclass only works when myclass is itself a module or subpackage, not a class defined in __init__.py). The third example imports mypackage.myclass into the current namespace, so it is accessed simply as myclass(), as if you had defined it yourself in the same script. This may shadow things that you have named elsewhere.
One other important use case looks like this:
import mypackage as mp
This lets you set the namespace that you want that package to be imported into, perhaps making it a shorthand or something more convenient.
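A couple of widely used (purely conventional) examples:

import numpy as np               # community shorthand, not required
import matplotlib.pyplot as plt
arr = np.arange(10)              # the package is accessed through the alias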
In the case of your question about why scipy doesn't import everything when you call import scipy, it comes down to the fact that the import only brings in whatever the developers specify in __init__.py. For scipy specifically, if you do:
import scipy
dir(scipy)
You will see that it imports a bunch of classes and functions that are used throughout the package. I suspect that the developers intentionally don't import the submodules so as not to litter your runtime namespace with things you aren't using. There may be a way to import everything automatically, but you probably shouldn't.
Related
Are there any rules or guidelines concerning when to use relative imports in Python? I see them in use all the time, such as in the Flask web framework. When searching for this topic, I only see articles on how to use relative imports, but not why.
So is there some special benefit to using:
from . import x
rather than:
from package import x
Moreover, I noticed that a related SO post mentions that relative imports are discouraged. Yet people still continue to use them.
Check out PEP 328's section on relative imports
The rationale seems to be as written:
Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
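To make the rearrangement point concrete, consider a hypothetical module pkg/sub/mod.py:

# pkg/sub/mod.py (hypothetical layout)
from . import sibling        # still works if "pkg" is renamed
from .. import helpers       # one level up, i.e. pkg.helpers
# versus the absolute form, which hard-codes the package name:
from pkg.sub import sibling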
Suppose I have a package that contains modules:
SWS/
__init__.py
foo.py
bar.py
time.py
and the modules need to refer to functions contained in one another. It seems like I run into problems with my time.py module, since there is a standard module of the same name.
For instance, if my foo.py module requires both my SWS.time module and the standard Python time module, I run into trouble: the interpreter will look inside the package and find my time.py module before it comes across the standard time module.
Is there any way around this? Is this a no-no situation, and should module names simply not be reused?
Any solutions and opinions on package philosophy would be useful here.
Reusing names of standard functions/classes/modules/packages is never a good idea. Try to avoid it as much as possible. However there are clean workarounds to your situation.
The behaviour you see, importing your SWS.time instead of the stdlib time, is due to the semantics of import in old Python versions (2.x). To fix it, add:
from __future__ import absolute_import
at the very top of the file. This will change the semantics of import to that of python3.x, which are much more sensible. In that case the statement:
import time
will only refer to a top-level module. So the interpreter will not consider your SWS.time module when executing that import inside the package; it will only use the standard-library one.
If a module inside your package needs to import SWS.time you have the choice of:
Using an explicit relative import:
from . import time
Using an absolute import:
import SWS.time as time
So, your foo.py would be something like:
from __future__ import absolute_import  # Python 2 only; the default in Python 3

import time                      # the standard-library time module
from . import time as SWS_time   # the package's own time module, renamed
It depends on what version of Python you're using. If your targeted Python version is 2.4 or older (in 2015, I sure hope not), then yes it would be bad practice as there is no way (without hacks) to differentiate the two modules.
However, in Python 2.5+, I think that reusing standard-library module names within a package namespace is perfectly fine; in fact, that is the spirit of PEP 328.
As Python's library expands, more and more existing package internal modules suddenly shadow standard library modules by accident. It's a particularly difficult problem inside packages because there's no way to specify which module is meant. To resolve the ambiguity, it is proposed that foo will always be a module or package reachable from sys.path. This is called an absolute import.
The python-dev community chose absolute imports as the default because they're the more common use case and because absolute imports can provide all the functionality of relative (intra-package) imports -- albeit at the cost of difficulty when renaming package pieces higher up in the hierarchy or when moving one package inside another.
Because this represents a change in semantics, absolute imports will be optional in Python 2.5 and 2.6 through the use of from __future__ import absolute_import
SWS.time is clearly not the same thing as time and as a reader of the code, I would expect SWS.time to not only use time, but to extend it in some way.
So, if SWS.foo needs to import SWS.time, then it should use the absolute path:
# in SWS.foo
# I would suggest renaming *within*
# modules that use SWS.time so that
# readers of your code aren't confused
# with which time module you're using
from SWS import time as sws_time
Or, it should use an explicit relative import as in Bakuriu's answer:
# in SWS.foo
from . import time as sws_time
In the case that you need to import the standard lib time module within the SWS.time module, you will first need to import the future feature (only for Python 2.5+; Python 3+ does this by default):
# inside of SWS.time
from __future__ import absolute_import
import time
time.sleep(28800) # time for bed
Note: from __future__ import absolute_import will only affect import statements within the module in which the future feature is imported, and will not affect any other module (which is just as well, since another module may depend on relative imports).
As others have said, this is generally a bad idea.
That being said, if you're looking for potential workarounds, or a better understanding of the problem, I suggest you read the following SO questions:
Importing from builtin library when module with same name exists
How to access a standard-library module in Python when there is a local module with the same name?
Yeah, there's really no good way around it. Try not to name your modules after standard packages. If you really want to call your module time, I'd recommend using _time.py instead. Even if there were a way to do it, it would make your code hard to read and confusing with the two time modules.
I write a package P. At the root of P there is a module m0.
Somewhere inside P there are modules m1,m2,... that need to import from m0
Of course I can write in each of these modules:
from P.m0 import ...
However if I change the name of P I have to revisit all the places and rewrite such statements.
I could also use relative imports but if I move a module at a different level in the package hierarchy I have to go fix the number of dots.
There are some other reasons too, but the bottom line is: I really want to say "import from the module m0 located at the root of my package". What is the best way to express this?
That's not possible.
However, if you perform a major refactoring in which you move modules between subpackages, having to update some relative imports is not a huge problem.
The same applies to renaming the top-level package name if you do not use relative imports; that could even be done really quickly with search-and-replace over all your files.
If you are willing to modify your question a tiny bit, you can get away with this.
IF the only entrypoints to your package are controlled, e.g. you only run your code by invoking something like testsuite package/.../module.py,
THEN you can make sure that the first thing you do is import firstthing, and in package/firstthing.py you have:
import sys
import os.path

# Note: the directory must come from __file__ (a filesystem path),
# not __name__ (the dotted module name).
packageDir = os.path.dirname(os.path.abspath(__file__))
sys.path[:] = sys.path + [packageDir]  # or maybe you want it first...
The main caveat is that you will not be able to run Python files without going through your entrypoints. I always want to do this for every project I write in Python (to make relative imports work nicely), but I personally find it so inconvenient that I just give up.
There is also a second alternative. It is not that unreasonable to specify that your package requires another package on the Python path. This package could be a utility package that performs a major hack. For example, if the name of the package was "x", you could do import x, which would use the inspect module to perform reflection on the interpreter stack, letting you figure out which module it was being imported from. Then you could do a sort of backwards os.walk, going up parent directories until you found the root of your package (by checking for some special indicator file, manifest, or similar). Finally, the code would programmatically perform the above modification of the Python path via sys.path. It's the same as the above, but you have the liberty to do things like run any Python file without having to go through an awful entrypoint.
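A very rough, untested sketch of that idea; the package name x and the .pkgroot marker file are hypothetical, and the stack inspection has to skip the frames that Python 3's import machinery inserts:

# x/__init__.py (hypothetical): put the caller's package root on sys.path.
import inspect
import os
import sys

def _caller_file():
    # Skip frames belonging to this file and to importlib's bootstrap.
    for frame in inspect.stack()[1:]:
        if "importlib" not in frame.filename and frame.filename != __file__:
            return frame.filename
    raise ImportError("could not locate the importing module")

_dir = os.path.dirname(os.path.abspath(_caller_file()))
while not os.path.exists(os.path.join(_dir, ".pkgroot")):  # hypothetical marker
    _parent = os.path.dirname(_dir)
    if _parent == _dir:
        raise ImportError("no .pkgroot marker found above the importing module")
    _dir = _parent
sys.path.insert(0, _dir)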
If you have extreme control over the shell environment, you can also just augment the $PYTHONPATH to include your package directory, but this is extremely fragile in many ways, and rather inelegant.
I am working on a project wherein I need to use a third-party module in different project files (.py files). The situation is like this.
I have a file "abc.py" which imports the third-party module "common.py". There are a couple of other files which also import "common.py". All these files are also imported in the main project file "main.py".
It seems redundant to import the same module multiple times in different files, since "main.py" also imports all the project files.
I am also not sure how the size of the project is affected by multiple import statements.
Can someone please help me make things a bit simpler?
Importing only ever loads a module once. Any imports after that simply add it to the current namespace.
Just import things in the files you need them to be available and let Python do the heavy-lifting of figuring out loading the modules.
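A small demonstration of that caching:

import sys
import json        # module body executes now
first = json

import json        # cached: no re-execution, the name is simply rebound
assert json is first
assert sys.modules["json"] is first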
Yes, you are right: this behaviour really exists in Python. Namely, user code can import the same module in different ways, for example import a and import A.a (where the file a.py is located in the package A; the first import is done from within the package while the other comes from outside it), and end up with two separate copies of it.
This can easily happen in real life, especially in multi-level packaged Python projects.
I have experienced a side-effect of such behaviour: isinstance checks fail when an object is tested against a class defined in a module that was imported both ways.
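As an illustration (hypothetical layout A/a.py, run with both the A directory and its parent on sys.path):

import a     # loaded and cached in sys.modules as "a"
import A.a   # loaded again, cached separately as "A.a"

# Two distinct module objects, so classes defined in them differ too:
# a is A.a                          ->  False
# isinstance(a.Thing(), A.a.Thing)  ->  False (Thing is hypothetical)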
The solution I can think of is to redefine the __builtin__.__import__ function to perform its work more intelligently.
I'm trying to import a few libraries into my program (which is a google AppEngine application).
Basically, I'm supposed to put all libraries in the root folder, but I've just created another folder called lib and placed them within that folder. (I've created the __init__.py)
Imports normally work fine using import lib.module or from lib import module, but when I try to import a complete package, for instance a folder named pack1 with various modules in it, by calling from lib.pack1 import *, I get an error in one of the modules that accesses another module through an absolute import, i.e. from pack1.mod2 import sth.
What is the easy and clean way to overcome this? Without modifying the libraries themselves.
Edit: Using Python 2.7.
Edit: Error: when using import lib.pack1, I get ImportError: No module named pack1.mod1.
I think that instead of from pack1.mod2 you actually want to say from lib.pack1.mod2.
Edit: and, specifying what version of Python this is would help, since importation semantics have improved gradually over the years!
Edit: Aha! Thank you for your comment; I now understand. You are trying to rename libraries without going inside of them and fixing the fact that their name is now different. The problem is that what you are doing is, unfortunately, impossible. If all libraries used relative imports inside, then you might have some chance of doing it; but, alas, relative imports are both (a) recent and (b) not widely used.
So, if you want to use library p, then you are going to have to put it in your root directory, not inside of lib/p because that creates a library with a different name: lib.p, which is going to badly surprise the library and break it.
But I have two more thoughts.
First, if you are trying to do this to organize your files, and not because you need the import names to be different, then (a) create lib like you are doing but (b) do not put an __init__.py inside! Instead, add the lib directory to your PYTHONPATH or, inside of your program, to sys.path. (Does the GAE let you do something like this? Does it have a PYTHONPATH?)
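A minimal sketch of the sys.path variant (assuming main.py sits next to the lib directory):

# main.py: make lib/ a search location rather than a package (no __init__.py).
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), "lib"))

import pack1                  # now resolves as a top-level package
from pack1.mod2 import sth    # the library's internal absolute imports work too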
Second, I am lying when I say this is not possible. Strictly speaking, you could probably do this by adding an entry to sys.meta_path that intercepts all module lookups and grabs them from inside lib if they exist there. But: yuck.
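For comparison, a simpler (still hacky) trick with a similar effect, keeping lib as a package, is to alias the vendored package in sys.modules after importing it through lib:

import sys
import lib.pack1

# "from pack1.mod2 import sth" now resolves: "pack1" is found in
# sys.modules, and its submodules load via the aliased __path__.
# Caveat: importing the same modules as lib.pack1.* as well would
# create duplicate module objects.
sys.modules["pack1"] = lib.pack1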