IPython import local file - python

I've come across a weird disparity between IPython and the default python interpreter. I have a python file that shadows a built-in module's name: logging.py. Say it has a simple method foo().
If I start up the default python interpreter and call import logging it imports the local file and I can access logging.foo().
If I start up IPython and call import logging it imports the python built-in module. If I change the name to a non-shadow (e.g. import my_logging) then the import will work as expected.
Which is the expected behaviour? The current directory is at the start of sys.path for both interpreters but they differ in which imports have priority.

import sys
print(sys.modules)
IPython starts with most of the standard library imported, including logging. Those modules seem to be imported by full path.
Speculation: IPython already imported those libraries by full path for internal use, now when you import logging again it checks that the module is already imported, regardless of the path, and does nothing.

Related

How to understand Python's module lookup

I created two new files, random.py and main.py, in the directory. The code is as follows:
# random.py
if __name__ == "__main__":
print("random")
# main.py
import random
if __name__ == "__main__":
print(random.choice([1, 2, 3]))
When I run the main.py file, the program reports an error.
Traceback (most recent call last):
File "main.py", line 8, in <module>
print(random.choice([1, 2, 3]))
AttributeError: module 'random' has no attribute 'choice'
Main.py imports my own defined random module.
However, if I create a new sys.py file and a main.py file in the same directory, the code is as follows:
# sys.py
if __name__ == "__main__":
print("sys")
# main.py
import sys
if __name__ == "__main__":
print(sys.path)
When I run the main.py file, successfully.
main.py imports the built-in modules sys.
Why is there such a clear difference?
The directory relationship of the script file is as follows:
C:.
main.py
random.py
sys.py
Thank you very much for your answer.
Forgive my poor english.
sys is a built-in module, meaning it's compiled directly into the Python executable itself. Built-in modules outprioritize external files when Python is looking for modules. The standard random module isn't built-in, so it doesn't get that treatment.
Quoting the docs:
When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module...
Python’s default sys.meta_path has three meta path finders, one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).
Since the finder for built-in modules comes before the finder that searches the import path, built-in modules will be found before anything on the import path.
You can see a tuple of the names of all modules your Python has built-in in sys.builtin_module_names.
That said, while any built-in module would outprioritize a module loaded from a file, sys has its own special handling. sys is one of the foundational building blocks of Python, and much of the sys module's setup needs to happen before the import system is functional enough for the normal import process to work. sys gets explicitly created during interpreter setup in a way that bypasses the normal import system, and then future imports for sys find it in sys.modules without hitting any meta path finders.
How and where sys is created is an implementation detail that varies from Python version to Python version (and is wildly different in different Python implementations), but in the CPython 3.7.4 code, you can see it beginning on line 755 in Python/pylifecycle.c.
tl;dr Caching
sys is somewhat of a special case among other python modules because it gets loaded at program start, unconditionally (presumably because a lot of the constants, functions, and data within - such as the streams stdout and stderr - are used by the python interpreter). As #user2357112 noted in the other answer, this is partly because it's built-in to the python executable, but also because it's necessary for running a substantial amount of python's core functionality (see below how it needs to be loaded for imports to work). random is part of the standard library, but it doesn't get loaded automatically when you execute, which is the primary relevant difference between it and sys, for our purposes
Looking at python's documentation on the subject clarifies how python resolves imports:
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths.
...
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. However, if the value is None, then a ModuleNotFoundError is raised. If the module name is missing, Python will continue searching for the module.
As for where it looks for the module, you can see in your observed behavior that it looks in the local directory first. That is, it searches the local directory first and then the "usual places" afterwards.
The reason for the discrepancy between how sys is handled and how random is handled is caching - sys is cached (so python doesn't even check the path to import), whereas random is not cached (so python does check the path to import it, and imports locally).
There are a few ways you can change this behavior.
First, if you must have a local module called sys, you can use importlib to import it in relative or absolute terms, without running into the ambiguity with the sys that's already cached. I have no idea how this would affect other modules that independently try to import sys, and you really shouldn't be naming your files the same as standard library modules anyway.
Alternatively, if you want the code to check python's built-in modules before checking the local directory, then you should be able to do that by modifying sys.path, which shows the order in which paths are searched for input (the same as the $PATH environment variable, or any other similar language-specific one). The first element of sys.path is usually going to be an empty string '', that would result in searching the current working directory. So you can simply move that to the back of sys.path, to have it searched last instead of first:
sys.path.append(sys.path.pop(0))

Does importing a Python file also import the imported files into shell?

I am running Python 3.6.2 and trying to import other files into my shell prompt as needed. I have the following code inside my_file.py.
import numpy as np
def my_file(x):
s = 1/(1+np.exp(-x))
return s
From my 3.6.2 shell prompt I call
from my_file import my_file
But in my shell prompt if I want to use the library numpy I still have to import numpy into the shell prompt even though I have imported a file that imports numpy. Is this functionality by design? Or is there a way to import numpy once?
import has three completely separate effects:
If the module has not yet been imported in the current process (by any script or module), execute its code (usually from disk) and store a module object with the resulting classes, functions, and variables.
If the module is in a package, (import the package first, and) store the new module as an attribute on the containing package (so that references like scipy.special work).
Assign the module ultimately imported to a variable in the invoking scope. (import foo.bar assigns foo; import baz.quux as frob assigns baz.quux to the name frob.)
The first two effects are shared among all clients, while the last is completely local. This is by design, as it avoids accidentally using a dependency of an imported module without making sure it’s available (which would break later if the other modules changed what they imported). It also lets different clients use different shorthands.
As hpaul noted, you can use another module’s imports with a qualified name, but this is abusing the module’s interface just like any other use of a private name unless (like six.moves, for example, or os.path which is actually not a module at all) the module intends to publish names for other modules.

What is usercustomize

I have seen many mentions of usercustomize throughout the docs. What is it exactly?
I am on Ubuntu 12.0, Python 3.3, using the IDLE interpreter.
Adding a 'usercustomize.py' file to /usr/lib/python3.3 with the following code in it:
import math
I started the IDLE interpreter. Without importing math, I typed math.sqrt(
Typing Ctrl + \ to start the auto complete suggestion, I get a prompt like sqrt(x). This suggests that math in fact has been imported. But actually calling the function raises NameError.
What exactly is going on here?
See the site module for full documenation on what usercustomize is meant to do.
Note that usercustomize is only imported if site.ENABLE_USER_SITE is enabled:
After this, an attempt is made to import a module named usercustomize, which can perform arbitrary user-specific customizations, if ENABLE_USER_SITE is true. This file is intended to be created in the user site-packages directory (see below), which is part of sys.path unless disabled by -s. An ImportError will be silently ignored.
Importing math into usercustomize will not make it available in IDLE; you are not making it a built-in that way. You could add it to the builtins module, but I'd advice against that.
usercustomize is not meant to set up a default IDLE environment, it is meant to add extra entries to the sys.path module search path and other general Python runtime environment changes.

python import statements

In some code I'm looking at there is the following statement:
from math import exp, sqrt, ceil
however there is no folder called math in the project folder and no modules named exp, sqrt, and ceil. My question is basically where are these modules being imported from and how do I see them and other files like them? Thanks in advance.
You have some confused terminology. In this case, math is the module, and exp, sqrt, ceil are functions it defines. Usually it is from <module> import <function/class>. math is a base module which is included with every Python installation. Python has set of specific places it will look for the module. In this case, math will be a dynamically loaded module written in C.
You can find out where it comes from by doing:
import math
math.__file__
Note that this will give an error for anything that is inbuilt into the interpreter, however.
You are importing a standard Python module. See here for the documentation for the math module, and here for the full standard library.
The location of the module on your filesystem varies across environments. Don't bother trying to locate stuff there. Just bookmark the documentation.
You are seeing the Python standard libraries. They are resolved by searching the PYTHONPATH for matching modules. In addition to the PYTHONPATH, you can import from any subfolders of your python script that contain a file called __init__.py
The math module is part of Python's standard library and is always available in any Python installation. However, since the functions are not builtins, they do need to be imported.
When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:
- The directory containing the input script (or the current directory).
- PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
- the installation-dependent default.
After initialization, Python programs can modify sys.path. The directory containing the script being run is placed at the beginning of the search path, ahead of the standard library path. This means that scripts in that directory will be loaded instead of modules of the same name in the library directory. This is an error unless the replacement is intended. See section Standard Modules for more information.
From a shell you can type the following to get the default sys.path
>>> import sys
>>> print sys.path
['', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/local/lib/python2.6/dist-packages']

Import package from file with same name as package [duplicate]

I have a module that conflicts with a built-in module. For example, a myapp.email module defined in myapp/email.py.
I can reference myapp.email anywhere in my code without issue. However, I need to reference the built-in email module from my email module.
# myapp/email.py
from email import message_from_string
It only finds itself, and therefore raises an ImportError, since myapp.email doesn't have a message_from_string method. import email causes the same issue when I try email.message_from_string.
Is there any native support to do this in Python, or am I stuck with renaming my "email" module to something more specific?
You will want to read about Absolute and Relative Imports which addresses this very problem. Use:
from __future__ import absolute_import
Using that, any unadorned package name will always refer to the top level package. You will then need to use relative imports (from .email import ...) to access your own package.
NOTE: The above from ... line needs to be put into any 2.x Python .py files above the import ... lines you're using. In Python 3.x this is the default behavior and so is no longer needed.

Categories