What's the purpose of the file "pylab.py" - python

I looked at the file "pylab.py" in matplotlib's directory and found that it contains a large number of imports and then defines a single variable, "bytes", on the last line. Here are the last several lines of this file:
from numpy.fft import *
from numpy.random import *
from numpy.linalg import *
import numpy as np
import numpy.ma as ma
# don't let numpy's datetime hide stdlib
import datetime
# This is needed, or bytes will be numpy.random.bytes from
# "from numpy.random import *" above
bytes = six.moves.builtins.bytes
I wonder what the purpose of such a file is when it only defines a seemingly useless variable. Given that, what's the purpose of writing code like from matplotlib import pylab?

The matplotlib docs say:
pylab is a convenience module that bulk imports matplotlib.pyplot (for plotting) and numpy (for mathematics and working with arrays) in a single name space. Although many examples use pylab, it is no longer recommended.
So for example, you can do
>>> from pylab import *
And you have imported all the names imported by pylab into your local namespace. This is convenient when using the interactive shell.
Additionally, pylab imports datetime and bytes. This is because the from numpy.foo import * statements import numpy objects named bytes and datetime which are not the same as the standard python objects with these names, so they need to be overridden with the standard versions.
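The same kind of shadowing can be reproduced with the standard library alone. This is a minimal sketch, not pylab's own code: it relies on the fact that os exports a function named open, so a star import from os hides the builtin open in much the same way that numpy.random's bytes hides the builtin bytes. The scratch dictionary called namespace is only there to make the effect easy to inspect.

```python
import builtins

# Run the star import in a scratch namespace so the result can be inspected.
namespace = {}
exec("from os import *", namespace)

# os exports a low-level "open", so the name no longer refers to the builtin.
shadowed = namespace["open"] is not builtins.open

# The same repair pylab applies to bytes: rebind the name to the builtin.
exec("import builtins\nopen = builtins.open", namespace)
restored = namespace["open"] is builtins.open

print(shadowed, restored)
```

The order matters: the rebinding has to come after the star imports, which is why pylab does it on its last line.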
The practice of importing names into a module just so other modules can import them from there instead of the original module is not unusual. For example, given this module:
$ cat foo/__init__.py
from bar import *
from baz.quux import *
from spam import eggs
Other modules can do from foo import eggs rather than from foo.spam import eggs. Apart from the convenience of less typing, this approach hides the internal structure of the foo package from its clients. As long as they import from the top level module they need not be concerned that the internal structure of the package may change over time. This is a form of the facade design pattern.
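The re-export idea can be sketched without touching the file system. The names foo_demo and foo_demo.spam below are hypothetical stand-ins built in memory, and the assignment to foo_demo.eggs plays the role of the "from spam import eggs" line in foo/__init__.py.

```python
import sys
import types

# Hypothetical internal module that holds the real implementation.
spam = types.ModuleType("foo_demo.spam")
spam.eggs = lambda: "eggs"

# Hypothetical package facade that re-exports eggs at the top level.
foo_demo = types.ModuleType("foo_demo")
foo_demo.__path__ = []       # an (empty) __path__ marks the module as a package
foo_demo.spam = spam
foo_demo.eggs = spam.eggs    # stands in for "from spam import eggs"

sys.modules["foo_demo"] = foo_demo
sys.modules["foo_demo.spam"] = spam

# Clients can use the short path or the full internal path.
from foo_demo import eggs
from foo_demo.spam import eggs as eggs_direct

same_object = eggs is eggs_direct
print(same_object)
```

Both imports yield the same function object, so clients that use the short path keep working even if spam is later moved or renamed inside the package.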

Related

Python - Importing packages by running a script

I have a script which is importing lots of packages, including import numpy as np.
I have lots of scripts which need to import all of these packages (including some of my own). To make my life easier, I have a file called mysetup.py on my path to import all the packages. It contains the statement "import numpy as np" inside a function.
I run "main.py". It runs the following
from mysetup import *
import_my_stuff()
np.pi()
"mysetup.py"
def import_my_stuff():
    import numpy as np
    return
However, I am unable to use numpy in "main.py" - this code will fail. Any suggestions as to why?
The problem you are facing is a consequence of a very important feature of Python: namespaces.
https://docs.python.org/3/tutorial/classes.html#python-scopes-and-namespaces
https://realpython.com/python-namespaces-scope/
Basically, in your case, when you do the numpy import inside the import_my_stuff function, you bind the name np in that function's local namespace (its scope, if you prefer). The name disappears when the function returns, so from mysetup import * never sees it.
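A small sketch of that scoping behaviour, using the standard-library math module in place of numpy so it runs anywhere. The alias math_alias is made up for the illustration.

```python
def import_my_stuff():
    import math as math_alias  # bound only in this function's local scope
    return sorted(locals())    # names visible inside the function


local_names = import_my_stuff()

# After the call, the alias does not exist at module level.
leaked = "math_alias" in globals()

print(local_names, leaked)
```

The call still loads the module (it ends up in sys.modules); only the name binding stays local to the function.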
To solve your issue (the way you are doing; not the only way), you should simply import everything at the module top level (without a function encapsulating the imports):
mysetup.py:
import numpy as np
# other modules...
main.py:
from mysetup import *
print(np.pi)  # np.pi is a float constant, not a function
Imports in functions are not the best idea.
But you can just define whatever imports you need in top level code of mysetup.py
import numpy as np
and then it will be available when you import * from mysetup
from mysetup import *
print(np.pi)

Why does python import module imports when importing *

Let's say I have a file where I'm importing some packages:
# myfile.py
import os
import re
import pathlib
def func(x, y):
    print(x, y)
If I go into another file and enter
from myfile import *
Not only does it import func, but it also imports os, re, and pathlib,
but I DO NOT want those modules to be imported when I do import *.
Why is it importing the other packages I'm importing and how do you avoid this?
The reason
Because import * imports every public name in the module's namespace. Anything bound to a name that does not start with an underscore, including modules the file itself imported, is exported unless __all__ says otherwise.
How to avoid
First of all, you should almost never be using import *. It's almost always clearer code to either import the specific methods/variables you're trying to use (from module import func), or to import the whole module and access methods/variables via dot notation (import module; ...; module.func()).
That said, if you must use import * from module, there are a few ways to prevent certain names from being exported from module:
Names starting with _ will not be imported by import * from .... They can still be imported directly (i.e. from module import _name), but not automatically. This means you can rename your imports so that they don't get exported, e.g. import os as _os. However, this also means that your entire code in that module has to refer to the _os instead of os, so you may have to modify lots of code.
If a module contains the name __all__: List[str], then import * will export only the names contained in that list. In your example, add the line __all__ = ['func'] to your myfile.py, and then import * will only import func. See also this answer.
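Both rules can be checked with a short experiment. This is a sketch under stated assumptions: star_import is a hypothetical helper, and the modules myfile_demo_a and myfile_demo_b are built in memory from a source string rather than read from a real myfile.py.

```python
import sys
import types

SOURCE = '''
import os
import re as _re    # leading underscore: skipped by a star import

def func(x, y):
    return (x, y)
'''


def star_import(module_source, name):
    """Build a module from source, then list what 'from name import *' binds."""
    module = types.ModuleType(name)
    exec(module_source, module.__dict__)
    sys.modules[name] = module
    namespace = {}
    exec(f"from {name} import *", namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return sorted(namespace)


without_all = star_import(SOURCE, "myfile_demo_a")
with_all = star_import(SOURCE + "__all__ = ['func']\n", "myfile_demo_b")

print(without_all)  # ['func', 'os']
print(with_all)     # ['func']
```

Without __all__, the imported os module leaks through while _re does not; with __all__, only func is exported.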
from myfile import func
Here is the fix :)
When you use import *, you import every name the module defines, which includes whatever the source file itself imported.
It has actually been discussed on Medium, but for simplification, I will answer it myself.
from <module/package> import * is a way to import all the names available in that specific module/package. For this reason, most people avoid import * and stick with import <module> instead.
Python's import essentially just runs the file you point it to import (it's not quite that but close enough). So if you import a module it will also import all the things the module imports. If you want to import only specific functions within the module, try:
from myfile import func
...which would import only myfile.func() instead of the other things as well.

Why do I have to import this from numpy if I am just referencing it from the numpy module

Aloha!
I have two blocks of code, one that will work and one that will not. The only difference is a commented line of code for a numpy module I don't use. Why am I required to import that module when I never reference "npm"?
This command works:
import numpy as np
import numpy.matlib as npm
V = np.array([[1,2,3],[4,5,6],[7,8,9]])
P1 = np.matlib.identity(V.shape[1], dtype=int)
P1
This command doesn't work:
import numpy as np
#import numpy.matlib as npm
V = np.array([[1,2,3],[4,5,6],[7,8,9]])
P1 = np.matlib.identity(V.shape[1], dtype=int)
P1
The above gets this error:
AttributeError: 'module' object has no attribute 'matlib'
Thanks in advance!
Short Answer
This is because numpy.matlib is an optional sub-package of numpy that must be imported separately.
The reason for this feature may be:
In particular for numpy, the numpy.matlib sub-module redefines numpy's functions to return matrices instead of ndarrays, an optional feature that many may not want
More generally, to load the parent module without loading a potentially slow-to-load module which many users may not often need
Possibly, namespace separation
When you import just numpy without the sub-package matlib, then Python will be looking for .matlib as an attribute of the numpy package. This attribute has not been assigned to numpy without importing numpy.matlib (see discussion below)
Sub-Modules and Binding
If you're wondering why np.matlib.identity works without having to use the alias npm, that's because when you import the sub-module matlib, the parent module numpy (named np in your case) will be given an attribute matlib which is bound to the sub-module. This only works because numpy itself has also been imported and bound to the name np.
From the reference:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object.
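The binding rule quoted above can be observed with a throwaway package. This sketch assumes nothing about numpy: parent_demo and its child module are hypothetical names written to a temporary directory.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "parent_demo"
    package_dir.mkdir()
    # The package's __init__.py deliberately does not import the submodule.
    (package_dir / "__init__.py").write_text("")
    (package_dir / "child.py").write_text("VALUE = 42\n")

    sys.path.insert(0, tmp)
    try:
        import parent_demo
        before = hasattr(parent_demo, "child")   # submodule not loaded yet

        import parent_demo.child as child_alias
        after = hasattr(parent_demo, "child")    # binding added by the import
        same_object = parent_demo.child is child_alias
    finally:
        sys.path.remove(tmp)

print(before, after, same_object)
```

The alias and the attribute on the parent refer to the same module object, which is why np.matlib works even though the import line only names npm.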
Importing and __init__.py
The choice of what to import is determined in the modules' respective __init__.py files in the module directory. You can use the dir() function to see what names the respective modules define.
>>> import numpy
>>> 'matlib' in dir(numpy)
# False
>>> import numpy.matlib
>>> 'matlib' in dir(numpy)
# True
Alternatively, if you look directly at the __init__.py file for numpy you'll see there's no import for matlib.
Namespace across Sub-Modules
If you're wondering how the namespace is copied over smoothly:
The matlib source code runs this command to copy over the numpy namespace:
import numpy as np # (1)
...
# need * as we're copying the numpy namespace
from numpy import * # (2)
...
__all__ = np.__all__[:] # copy numpy namespace # (3)
Line (2), from numpy import *, is particularly important. Because of it, every public numpy name is also bound inside numpy.matlib, so after import numpy.matlib you can reach numpy's functions through the sub-module as well (for example numpy.matlib.array).
Without line (2), the names listed by line (3) would not actually exist in the sub-module. Interestingly, you can also write a funny command like this, because line (1) binds np inside numpy.matlib:
import numpy.matlib
numpy.matlib.np.matlib.np.array([1,1])
This works because line (1) makes np (the numpy package) an attribute of numpy.matlib, and numpy in turn has matlib as an attribute once the sub-module has been loaded.
You never use npm but you do use np.matlib, so you could change your 2nd import line to just:
import numpy.matlib
Or you could keep your 2nd import line as is but instead use:
P1 = npm.identity(V.shape[1], dtype=int)
Is there a reason you don't use np.identity?
P1 = np.identity(V.shape[1], dtype=int)
This module contains all functions in the numpy namespace, with the following replacement functions that return matrices instead of ndarrays.
Unless you are wedded to 2d np.matrix subclass, you are better off sticking with the regular ndarray versions.
(Others have pointed out that the reason for the separate import lies in numpy's __init__ specs. numpy imports most, but not all, of its submodules. The ones it does not automatically import are used less often. It's a polite way of saying, "You don't really need this module.")

How to share imports between modules?

My package looks like this:
These helpers, since they are all dealing with scipy, all have common imports:
from matplotlib import pyplot as plt
import numpy as np
I'm wondering if it is possible to extract them out, and put it somewhere else, so I can reduce the duplicate code within each module?
You can create a file called my_imports.py which does all your imports and makes them available as * via the __all__ variable (note that the module names are declared as strings):
File my_imports.py:
import os, shutil
__all__ = ['os', 'shutil']
File your_other_file.py:
from my_imports import *
print(os.curdir)
Although you might want to be explicit in your other files:
File your_other_file.py:
from my_imports import os # or whichever you actually need.
print(os.curdir)
Still, this saves you having to specify the various sources each time — and can be done with a one-liner.
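One point worth checking is that the re-exported names are the very same module objects, not copies. In this sketch my_imports_demo is a hypothetical in-memory stand-in for my_imports.py, and the dictionary consumer stands in for your_other_file.py.

```python
import os
import shutil
import sys
import types

# Hypothetical in-memory stand-in for my_imports.py.
my_imports_demo = types.ModuleType("my_imports_demo")
exec("import os, shutil\n__all__ = ['os', 'shutil']\n", my_imports_demo.__dict__)
sys.modules["my_imports_demo"] = my_imports_demo

# Stand-in for the consuming file's namespace.
consumer = {}
exec("from my_imports_demo import *", consumer)

# The consumer receives the cached module objects themselves.
same_modules = consumer["os"] is os and consumer["shutil"] is shutil
print(same_modules)
```

Because modules are cached in sys.modules, routing imports through a shared file costs nothing extra at run time; the trade-off is only in how easy it is to see where a name comes from.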
Alright, here is my tweak,
Create a file, gemfile.py, in the package directory, like this:
import numpy as np
from matplotlib import pyplot as plt
import matplotlib as mpl
Then, for other files, like app_helper.py
from .gemfile import *
This comes from the question "Can I use __init__.py to define global variables?"

Conventions for 'import ... as'

Typically, one uses import numpy as np to import the module numpy.
Are there general conventions for naming?
What about other modules, in particular from scientific computing like scipy, sympy and pylab or submodules like scipy.sparse.
SciPy recommends import scipy as sp in its documentation, though personally I find that rather useless since it only gives you access to re-exported NumPy functionality, not anything that SciPy adds to that. I find myself doing import scipy.sparse as sp much more often, but then I use that module heavily. Also
import matplotlib as mpl
import matplotlib.pyplot as plt
import networkx as nx
You might encounter more of these as you start using more libraries. There's no registry or anything for these shorthands and you're free to invent new ones as you see fit. There's also no general convention except that import lln as library_with_a_long_name obviously won't occur very often.
Aside from these shorthands, there's a habit among Python 2.x programmers to do things like
# Try to import the C implementation of StringIO; if that doesn't work
# (e.g. in IronPython or Jython), import the pure Python version.
# Make sure the imported module is called StringIO locally.
try:
    import cStringIO as StringIO
except ImportError:
    import StringIO
Python 3.x is putting an end to this, though, because it no longer offers partial C implementations of StringIO, pickle, etc.
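The fallback pattern itself still works in Python 3 for optional dependencies. Below is a minimal sketch with a hypothetical helper, import_first; the first module name is deliberately one that does not exist, so the standard-library json module is chosen.

```python
import importlib


def import_first(*names):
    """Return the first module in names that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(f"none of these modules could be imported: {names}")


# "a_module_that_does_not_exist_demo" is a made-up name, so json is used.
json_impl = import_first("a_module_that_does_not_exist_demo", "json")
print(json_impl.__name__)
```

As in the try/except version above, the rest of the code refers to a single local name (json_impl here) regardless of which implementation was found.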
