Python - Importing packages by running a script - python

I have a script which is importing lots of packages, including import numpy as np.
I have lots of scripts which need to import all of these packages (including some of my own). To make my life easier, I have a file called mysetup.py in my path to import all the packages. It includes the statement in a function called "import numpy as np".
I run "main.py". It runs the following
from mysetup import *
import_my_stuff()
np.pi()
"mysetup.py"
def import_my_stuff():
import numpy as np
return
However, I am unable to use numpy in "main.py" - this code will fail. Any suggestions as to why?

The problem you are facing is a consequence of a very important features of Python: namespaces.
https://docs.python.org/3/tutorial/classes.html#python-scopes-and-namespaces
https://realpython.com/python-namespaces-scope/
Basically, in your case, when you do that (numpy) import inside the (import_my_stuff) function, you are defining the code object numpy/np inside the function namespace. (scope, if you prefer).
To solve your issue (the way you are doing; not the only way), you should simply import everything at the module top level (without a function encapsulating the imports):
mysetup.py:
import numpy as np
# other modules...
main.py:
from mysetup import *
np.pi()

Imports in functions are not the best idea.
But you can just define whatever imports you need in top level code of mysetup.py
import numpy as np
and then it will be available when you import * from mysetup
from mysetup import *
print(np.pi)

Related

Why does python import module imports when importing *

Let's say I have a file where I'm importing some packages:
# myfile.py
import os
import re
import pathlib
def func(x, y):
print(x, y)
If I go into another file and enter
from myfile import *
Not only does it import func, but it also imports os, re, and pathlib,
but I DO NOT want those modules to be imported when I do import *.
Why is it importing the other packages I'm importing and how do you avoid this?
The reason
Because import imports every name in the namespace. If something has a name inside the module, then it's valid to be exported.
How to avoid
First of all, you should almost never be using import *. It's almost always clearer code to either import the specific methods/variables you're trying to use (from module import func), or to import the whole module and access methods/variables via dot notation (import module; ...; module.func()).
That said, if you must use import * from module, there are a few ways to prevent certain names from being exported from module:
Names starting with _ will not be imported by import * from .... They can still be imported directly (i.e. from module import _name), but not automatically. This means you can rename your imports so that they don't get exported, e.g. import os as _os. However, this also means that your entire code in that module has to refer to the _os instead of os, so you may have to modify lots of code.
If a module contains the name __all__: List[str], then import * will export only the names contained in that list. In your example, add the line __all__ = ['func'] to your myfile.py, and then import * will only import func. See also this answer.
from myfile import func
Here is the fix :)
When you import *, you import everything from. Which includes what yu imported in the file your source.
It has actually been discussed on Medium, but for simplification, I will answer it myself.
from <module/package> import * is a way to import all the names we can get in that specific module/package. Usually, everyone doesn't actually use import * for this reason, and rather sticked with import <module>.
Python's import essentially just runs the file you point it to import (it's not quite that but close enough). So if you import a module it will also import all the things the module imports. If you want to import only specific functions within the module, try:
from myfile import func
...which would import only myfile.func() instead of the other things as well.

What's the purpose of the file "pylab.py"

I looked at the file "pylab.py" at matplotlab's directory and found that it contains a great bunch of imports, and then defines a single variable "bytes" at the last line. Here is the last several lines of this file:
from numpy.fft import *
from numpy.raenter code herendom import *
from numpy.linalg import *
import numpy as np
import numpy.ma as ma
# don't let numpy's datetime hide stdlib
import datetime
# This is needed, or bytes will be numpy.random.bytes from
# "from numpy.random import *" above
bytes = six.moves.builtins.bytes
I wonder what's the purpose of such a file when it only defines a seemingly useless variable. As a result, what's the purpose of writing code like from matplotlib import pylab?
The matplotlib docs say:
pylab is a convenience module that bulk imports matplotlib.pyplot (for plotting) and numpy (for mathematics and working with arrays) in a single name space. Although many examples use pylab, it is no longer recommended.
So for example, you can do
>>> from pylab import *
And you have imported all the names imported by pylab into your local namespace. This is convenient when using the interactive shell.
Additionally, pylab imports datetime and bytes. This is because the from numpy.foo import * statements import numpy objects named bytes and datetime which are not the same as the standard python objects with these names, so they need to be overridden with the standard versions.
The practice of importing names into a module just so other modules can import them from there instead of the original module is not unusual. For example, given this module:
$ cat foo/__init__.py
from bar import *
from baz.quux import *
from spam import eggs
Other modules can do from foo import eggs rather than from foo.spam import eggs. Apart from the convenience of less typing, this approach hides the internal structure of the foo package from its clients. As long as they import from the top level module they need not be concerned that the internal structure of the package may change over time. This is a form of the facade design pattern.

Why do I have to import this from numpy if I am just referencing it from the numpy module

Aloha!
I have two blocks of code, one that will work and one that will not. The only difference is a commented line of code for a numpy module I don't use. Why am I required to import that model when I never reference "npm"?
This command works:
import numpy as np
import numpy.matlib as npm
V = np.array([[1,2,3],[4,5,6],[7,8,9]])
P1 = np.matlib.identity(V.shape[1], dtype=int)
P1
This command doesn't work:
import numpy as np
#import numpy.matlib as npm
V = np.array([[1,2,3],[4,5,6],[7,8,9]])
P1 = np.matlib.identity(V.shape[1], dtype=int)
P1
The above gets this error:
AttributeError: 'module' object has no attribute 'matlib'
Thanks in advance!
Short Answer
This is because numpy.matlib is an optional sub-package of numpy that must be imported separately.
The reason for this feature may be:
In particular for numpy, the numpy.matlib sub-module redefines numpy's functions to return matrices instead of ndarrays, an optional feature that many may not want
More generally, to load the parent module without loading a potentially slow-to-load module which many users may not often need
Possibly, namespace separation
When you import just numpy without the sub-package matlib, then Python will be looking for .matlib as an attribute of the numpy package. This attribute has not been assigned to numpy without importing numpy.matlib (see discussion below)
Sub-Modules and Binding
If you're wondering why np.matlib.identity works without having to use the keyword npm, that's because when you import the sub-module matlib, the parent module numpy (named np in your case) will be given an attribute matlib which is bound to the sub-module. This only works if you first define numpy.
From the reference:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in import()) a binding is placed in the parent module’s namespace to the submodule object.
Importing and __init__.py
The choice of what to import is determined in the modules' respective __init__.py files in the module directory. You can use the dir() function to see what names the respective modules define.
>> import numpy
>> 'matlib' in dir(numpy)
# False
>> import numpy.matlib
>> 'matlib' in dir(numpy)
# True
Alternatively, if you look directly at the __init__.py file for numpy you'll see there's no import for matlib.
Namespace across Sub-Modules
If you're wondering how the namespace is copied over smoothly;
The matlib source code runs this command to copy over the numpy namespace:
import numpy as np # (1)
...
# need * as we're copying the numpy namespace
from numpy import * # (2)
...
__all__ = np.__all__[:] # copy numpy namespace # (3)
Line (2), from numpy import * is particularly important. Because of this, you'll notice that if you just import numpy.matlib you can still use all of numpy modules without having to import numpy!
Without line (2), the namespace copy in line (3) would only be attached to the sub-module. Interestingly, you can still do a funny command like this because of line (3).
import numpy.matlib
numpy.matlib.np.matlib.np.array([1,1])
This is because the np.__all__ is attached to the np of numpy.matlib (which was imported via line (1)).
You never use npm but you do use np.matlib, so you could change your 2nd import line to just:
import numpy.matlib
Or you could keep your 2nd import line as is but instead use:
P1 = npm.identity(V.shape[1], dtype=int)
Is there are reason you don't use np.identity?
P1 = np.identity(V.shape[1], dtype=int)
This module contains all functions in the numpy namespace, with the following replacement functions that return matrices instead of ndarrays.
Unless you are wedded to 2d np.matrix subclass, you are better off sticking with the regular ndarray versions.
(Others have pointed out that the import why is based on the __init__ specs for numpy. numpy imports most, but not all of its submodules. The ones it does not automatically import are used less often. It's a polite way of saying, You don't really need this module)

Way to run all packages I want in python script

There are many times that I want to use same packages in my scripts, I mostly copy paste packages I want from my last script. I want to stop this work and run all of theme with one simple function, Today i try this:
def econometrics():
print("Econometrics is starting")
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
print("Econometrics is started")
econometrics()
the function runs without error but when I call some method from packages, I get errors like this:
name 'plt' is not defined
What is wrong with that code? is there anyway to define function to do that?
What is wrong with that code?
Simple answer: Variable scope. plt (and the others) are only accessible from within the econometrics method.
Try making one file, named importer.py, for example
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
Then in your other code (that is in the same directory),
from importer import *
Using an __init__.py is probably the recommended way to approach that, though, but it wasn't clear if you have a module/package layout, or not.
If you do, then use
Relative import (same directory): from . import *
Absolute import (use module name): from some_module import *
Your intent is wrong in python's grammar. Because within your code, the variables range are scoped within the function. So, when you do your imports, you're creating a bunch of variables within the econometrics function range, and thus your variables are only in reach within that function.
So, let's take a simpler example:
>>> def foobar():
... a = 1
... b = 2
...
>>> foobar()
>>> a
NameError: name 'a' is not defined
here a and b only exist within foobar's function scope, so it's out of scope at the main scope.
To do what you want, the way you want it, you should declare your variable as belonging to the global scope:
def econometrics():
global pd, np, smf, sm, plt
print("Econometrics is starting")
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
print("Econometrics is started")
econometrics()
So to get back to the foobar example:
>>> def foobar():
... global a, b
... a = 1
... b = 2
...
>>> foobar()
>>> a
1
>>> b
2
Though, I do not really like that way of doing things, as it's doing things implicitely. Considering you have a python module with just the econometrics function defined, people reading the following code:
from econometrics import econometrics
econometrics()
plt.something()
wouldn't necessary understand that plt has been made available through the econometrics function call. Adding a comment would help, but still is an unnecessary extra step.
Generally speaking, doing globals within any language is wrong, and there's most of the time always a better way to do it. Within the "Zen of python", it is stated that "Explicit is better than implicit", so I believe a more elegant way would be to create a module that does the import, and then you'd import what you need from the module:
econometrics.py:
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
and in your code you'd then import only what you need:
from econometrics import pd, plt
plt.something()
which would be much more elegant and explicit! Then, you'd just have to drop that file in any projects you need your mathematics modules to have all your beloved modules that need - and only them - available in your code!
Then as a step further, you could define your own python module, with a full blown setup.py, and with your econometrics.py file being a __init__.py in the econometrics package directory, to then have it installed as a python package through:
python setup.py install
at the root of your sources. So then any code you work out can be using econometrics as a python package. You might even consider making it a package on pypi!
HTH
You imported the packages into the scope of the function. If you want to use them in the global scope, you have to tell python
def importfunc():
global np
import numpy as np
importfunc()
print np.version.version
On a side note: Are you using some kind of toolchain? I'd think it would be better to use an IDE or to write a script which sets up new projects for you.
The various imports are performed when you call the function, but the names pd, np, etc are local to the function, so they can't be referenced outside the function.
I suppose you could return those names, but importing modules in a function like that makes your code a little harder for readers to follow, IMHO.

Issues with Importing Modules into Python

new to Python programming and have encountered an issue importing modules.
I have a main application (compare.py) with imports as follows :
# import the necessary packages
from skimage.measure import structural_similarity as ssim
import matplotlib.pyplot as plt
import numpy as np
import os
import skimage
from skimage import io
from skimage import color
from epilib import mse
from epilib import compare_images
and I have defined two functions in epilib, one called mse() and one called compare_images().
The code in mse() requires numpy. When I execute 'python compare.py', I get the following error message :
File "C:\Users\Dan\epilib.py", line 7, in mse err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
NameError: name 'np' is not defined
I assumed that because 'import numpy as np' was executed prior to import epilib, that the numpy library would be available to epilib? When I added 'import numpy as np' to the top of epilib, the issue resolved.
I don't see it as very efficient to have to move all the import statements to epilib. I was hoping to have epilib as just a library of functions and I could import into various python programs as required.
Is there a way to accomplish this?
That is not how python works, if you want to use numpy library in a module (in this case in eplib module), you need to import it in that module as well, eplib would get not the numpy module imported in your compare.py .
You should import numpy in eplib.py as -
import numpy as np
I do not think there would be any issue in efficiency, since once python imports a module for the first time, it caches the module in sys.modules , so whenever you re-import it (even if its in a different module) as long as its the same python process , Python would not re-import it, instead it would return the module object from sys.modules .

Categories