Python pathing to find modules/classes to import - python

I am new to Python and I am looking to find out how Python discovers the path to modules it imports from. This would be equivalent to CLASSPATH in Java and PERL5LIB in Perl.
E.g. the import block in a script I am looking at looks something like this:
import os
import resource
from localnamespace.localmodule import some_class
I understand that os and resource are native to Python (are part of the core language API), but the interpreter must still have some pointer to where to find them. As for localnamespace.localmodule, how do we tell the interpreter where to find this module, given that the directory containing this script does not have a subdirectory named localnamespace?

TLDR
In summary, the search process goes something like:
1) Previously imported in sys.modules?
2) If no, can I find it in the script / interpreter directory?
3) If no, can I find it in any of the directories in the PYTHONPATH environment variable?
4) If no, ImportError
The Longer Answer
According to the documentation, an import statement first looks at sys.modules, which is a dictionary of the modules that have already been loaded.
If it can't find the module there, it consults the finders in sys.meta_path; how these locate modules varies between implementations. In general, import paths are defined in sys.path, which is a list of directories including those in the environment variable PYTHONPATH.
The documentation for sys.path describes itself as:
A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
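The first two steps of the search can be observed from inside Python. The following is only a minimal sketch: describe_import is a hypothetical helper name, and it is meant for top-level module names (for dotted names, importlib.util.find_spec imports the parent package first).

```python
import importlib.util
import sys


def describe_import(name):
    """Report how `import name` would be satisfied (hypothetical helper)."""
    # Step 1: a module that was already imported is served from the cache.
    if name in sys.modules:
        return "cached in sys.modules"
    # Step 2: otherwise the finders are asked. find_spec does not execute
    # the module; it only reports where it was found, or None.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found (import would raise ModuleNotFoundError)"
    return f"found at {spec.origin}"


print(describe_import("sys"))                 # sys is loaded at startup
print(describe_import("no_such_module_xyz"))  # made-up name for illustration
```

Any name that is not yet imported but is reachable through sys.path should report the file it would be loaded from.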

Related

PyCharm ModuleNotFoundError? [duplicate]

I am having a problem running my script in a cmd prompt despite it working in PyCharm. I have a folder structure as such:
MyCode # PyCharm project folder
    /UsefulFunctions
        /Messaging
            /Texter.py
    /DiscordBot
        /DiscordBot.py
Within DiscordBot.py I have an import
from UsefulFunctions.Messaging import Texter
This works when I run it from PyCharm without a problem. However when I try to run from a command prompt located at the DiscordBot level it errors with:
ImportError: No module named 'UsefulFunctions'
So naturally I thought it meant that the UsefulFunctions folder was not on my path. Therefore, I went into my environment variables and added it to my PATH variable (as well as the MyCode folder for good measure). Still it encountered this error. I browsed some posts on here regarding imports (mainly Importing files from different folder) and they recommend doing something like:
import sys
sys.path.insert(0, '/path/to/application/app/folder')
import file
Or adding __init__.py files to each folder in order to get them to register as packages. I went ahead and added __init__ files to each folder and subfolder I was trying to import from, but still could not run from the command prompt. I omitted the sys.path.insert() solution because I see no benefit from this after already explicitly adding it to my PATH variable. Another solution was to add "." before the import because supposedly otherwise it is only searching python's PATH. I attempted this as:
from .UsefulFunctions.Messaging import Texter
ImportError: attempted relative import with no known parent package
And this error shows on PyCharm now as well... I don't get why my initial script would work without a hitch on PyCharm, but the same program cannot seem to find my import when run from a prompt. Can somebody please explain the difference between PyCharm running the program and my prompt? Why will this not work despite having __init__.py files and having added MyCode and UsefulFunctions to my PATH variable on Windows?
From [Python.Docs]: Command line and environment - PYTHONPATH:
Augment the default search path for module files. The format is the same as the shell’s PATH: one or more directory pathnames separated by os.pathsep (e.g. colons on Unix or semicolons on Windows). Non-existent directories are silently ignored.
You can also find more details on [SO]: Strange error while using Pycharm to debug PyQt gui (@CristiFati's answer).
So, in order for Python to be able to load a module (package) without specifying its path, the path must be present in the %PYTHONPATH% environment variable.
You mentioned %PATH% several times in the question, but the relevant variable is %PYTHONPATH% (MyCode must be added to it).
PyCharm does that because of either of the two checkboxes in the run configuration dialog, "Add content roots to PYTHONPATH" and "Add source roots to PYTHONPATH" (the dialog can be opened from the menu: Run -> Edit Configurations...).
If you want to get things working from the command line, you have to do the same thing there as well:
[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q054955891\DiscordBot]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###
[prompt]> set py
Environment variable py not defined
[prompt]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" DiscordBot.py
Traceback (most recent call last):
  File "DiscordBot.py", line 1, in <module>
    from UsefulFunctions.Messaging import Texter
ModuleNotFoundError: No module named 'UsefulFunctions'
[prompt]> set PYTHONPATH=e:\Work\Dev\StackOverflow\q054955891
[prompt]> set py
PYTHONPATH=e:\Work\Dev\StackOverflow\q054955891
[prompt]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" DiscordBot.py
e:\Work\Dev\StackOverflow\q054955891\UsefulFunctions\Messaging\Texter.py imported
Conversely, in PyCharm (with the content roots checkbox mentioned above checked), more content roots can be added (menu: File -> Settings..., select Project Structure, then Add Content Root).
This is useful when some required modules are located deeper in the project tree (and some dirs aren't valid Python package names).
So, when dealing with this type of situation, checking [Python.Docs]: sys.path, [Python.Docs]: os.getcwd() and module path, can save lots of wasted time and headaches:
import os
import sys
print(sys.path)
print(os.getcwd())
import some_module
print(some_module)
As a side note, I personally hate names starting with My (e.g. MyCode). Such a name tells me that the purpose of whatever entity "wears" it was not clear to the person who wrote the code. Try finding a more useful name (e.g. TestBotProject, or something similar) :).
[SO]: PyCharm doesn't recognize installed module (@CristiFati's answer) might also contain some useful info.
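The effect of PYTHONPATH can also be reproduced without touching the system settings, by launching a child interpreter with a modified environment. This is a sketch only: the layout (pkg_demo, app/main.py) and both helper names are made up for illustration and mirror the MyCode structure from the question.

```python
import os
import subprocess
import sys
import tempfile


def build_demo_tree(root):
    """Create root/pkg_demo/helper.py and root/app/main.py (hypothetical)."""
    os.makedirs(os.path.join(root, "pkg_demo"))
    os.makedirs(os.path.join(root, "app"))
    with open(os.path.join(root, "pkg_demo", "__init__.py"), "w"):
        pass
    with open(os.path.join(root, "pkg_demo", "helper.py"), "w") as f:
        f.write("NAME = 'helper'\n")
    script = os.path.join(root, "app", "main.py")
    with open(script, "w") as f:
        f.write("from pkg_demo import helper\nprint(helper.NAME)\n")
    return script


def run_with_pythonpath(script, extra_path=None):
    """Run `script` from its own folder, optionally with PYTHONPATH set."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)          # start from a clean slate
    if extra_path is not None:
        env["PYTHONPATH"] = extra_path   # what PyCharm does for its roots
    return subprocess.run(
        [sys.executable, script],
        cwd=os.path.dirname(script),
        env=env, capture_output=True, text=True,
    )


with tempfile.TemporaryDirectory() as root:
    script = build_demo_tree(root)
    print(run_with_pythonpath(script).returncode)     # non-zero: import fails
    print(run_with_pythonpath(script, root).stdout)   # import succeeds
```

Only the second run sees the project root, so only it can resolve the package that lives next to the app folder rather than inside it.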
Python uses the system variable PYTHONPATH, among other things, to decide what to import.
From the docs:
When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:
The directory containing the input script (or the current directory when no file is specified).
PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
The installation-dependent default.
The reason PyCharm magically imports the module when you run the script is because of the Project Structure -> Content Root value. It points to your project directory, by default.
Check your Interpreter. It is different than your command prompt Interpreter, located in Appdata, whereas the interpreter for PyCharm is in the Workspace folder.
Set your Python path in the system variables so that you can run python --help from any directory.
Then navigate to the project folder
c:\nnnn..\mmm..\MyCode
and run
python c:\nnnn..\mmm..\MyCode\DiscordBot\DiscordBot.py
or
C:\Python27\python.exe "C:\Users\Username\MyCode\DiscordBot\DiscordBot.py"
or
C:\Python27\python.exe C:\Users\Username\MyCode\DiscordBot\DiscordBot.py
Use quotes if the path contains a space.

How to get path from Python module name when module does not exist?

Getting the file system path of a python module by module name is easy if the module exists. For example like this:
importlib.util.find_spec('my.module').origin
But when the module does not exist, how can I still get the theoretical path that would be used to check whether the module exists?
Background: I want to watch the file system to be notified when a module (specified by module name like in an import statement) comes into existence.
Python uses its own path variable to know where to look for modules. The path variable is a list containing all the directories where Python looks for modules, going from start to finish. To get a look at it, you can use the sys module:
import sys
print(sys.path)
The first item in the list is the directory of the script, unless you're in an interactive session, in which case an empty string will be present, representing the current working directory.
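Building on that list, the locations worth watching can be approximated by combining every sys.path entry with the file suffixes the interpreter recognises. This is a rough sketch, not the import system's exact algorithm: candidate_paths is a hypothetical helper, and it ignores zip archives, namespace packages, a parent package's own __path__, and custom import hooks.

```python
import importlib.machinery
import os
import sys


def candidate_paths(module_name, search_path=None):
    """List files whose creation could make `module_name` importable."""
    search_path = sys.path if search_path is None else search_path
    parts = module_name.split(".")
    candidates = []
    for entry in search_path:
        # An empty entry stands for the current working directory.
        base = os.path.join(entry or os.getcwd(), *parts)
        # Either a package directory containing __init__.py ...
        candidates.append(os.path.join(base, "__init__.py"))
        # ... or a single file with any recognised suffix (.py, .pyc, ...).
        for suffix in importlib.machinery.all_suffixes():
            candidates.append(base + suffix)
    return candidates


for path in candidate_paths("my.module")[:5]:
    print(path)
```

A file system watcher could then be pointed at the parent directories of these candidates.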

How to prevent Python from searching the current working directory for modules?

I have a Python script which imports the datetime module. It worked well until one day I ran it in a directory that contains a Python script named datetime.py. Of course, there are a few ways to resolve the issue. First, I can run the script in a directory that does not contain datetime.py. Second, I can rename datetime.py. However, neither approach is ideal: someone who ships a Python script never knows where users will run it. Another possible fix is to prevent Python from searching the current working directory for modules. I tried removing the empty path ('') from sys.path; this works in an interactive Python shell but not in a Python script. The invoked Python script still searches the current path for modules. Is there any way to stop Python from searching the current path for modules?
if __name__ == '__main__':
    if '' in sys.path:
        sys.path.remove('')
    ...
Notice that it doesn't work even if I put the following code at the beginning of the script.
import sys
if '' in sys.path:
    sys.path.remove('')
Below are some related questions on StackOverflow.
Removing path from Python search module path
pandas ImportError C extension when io.py in same directory
initialization of multiarray raised unreported exception python
Are you sure that Python is searching for that module in the current directory, and not in the script directory? I don't think Python adds the current directory to sys.path, except in one case. Doing so could even be a security risk (akin to having . on the UNIX PATH).
According to the documentation:
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first
So, '' as a representation of the current directory happens only if run from interpreter (that's why your interactive shell test worked) or if the script is read from the standard input (something like cat modquest/qwerty.py | python). Neither is a rather 'normal' way of running Python scripts, generally.
I'm guessing that your datetime.py sits side by side with your actual script (in the script directory), and that it just happens that you're running the script from that directory (that is, script directory == current directory).
If that's the actual scenario, and your script is standalone (meaning just one file, no local imports), you could do this:
sys.path.remove(os.path.abspath(os.path.dirname(sys.argv[0])))
But keep in mind that this will bite you in the future, once the script gets bigger and you split it into multiple files, only to spend several hours trying to figure out why it is not importing the local files...
Another option is to use -I, but that may be overkill:
-I
Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.
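The difference that -I makes can be checked with a small experiment. The sketch below is only an illustration: shadow_demo and the SHADOW marker are made-up names. It places a stand-in datetime.py next to a throwaway script and runs that script in a child interpreter with and without the flag.

```python
import os
import subprocess
import sys
import tempfile


def shadow_demo(isolated):
    """Return what a script prints when a local datetime.py sits next to it."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical shadowing module placed in the script directory.
        with open(os.path.join(folder, "datetime.py"), "w") as f:
            f.write("SHADOW = True\n")
        script = os.path.join(folder, "main.py")
        with open(script, "w") as f:
            f.write("import datetime\nprint(hasattr(datetime, 'SHADOW'))\n")
        cmd = [sys.executable] + (["-I"] if isolated else []) + [script]
        result = subprocess.run(cmd, cwd=folder, capture_output=True, text=True)
        return result.stdout.strip()


print("normal run picks up the local file:", shadow_demo(isolated=False))
print("isolated run picks up the local file:", shadow_demo(isolated=True))
```

Keep in mind that -I also ignores PYTHONPATH and the user site-packages, so it only suits scripts that rely on nothing beyond the standard library and installed packages.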

Python's sys.path contains an entry pointing to a non-existent file

I used
>>> import sys
>>> print(sys.path)
and I get this:
['', 'C:\\Users\\HowLo\\AppData\\Local\\Programs\\Python\\Python36-32\\python36.zip',
'C:\\Users\\HowLo\\AppData\\Local\\Programs\\Python\\Python36-32\\DLLs',
'C:\\Users\\HowLo\\AppData\\Local\\Programs\\Python\\Python36-32\\lib',
'C:\\Users\\HowLo\\AppData\\Local\\Programs\\Python\\Python36-32',
'C:\\Users\\HowLo\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages']
I am confused why this is in there:
r'C:\Users\HowLo\AppData\Local\Programs\Python\Python36-32\python36.zip'
when I try to pull it up in the File Explorer, nothing is there.
sys.path stores a list of strings, each of which (as you can tell) is a path to a location on your computer.
Python looks in these places to find modules your program can use (when you do import json, for example, Python gets the json module from one of the locations in sys.path; a few modules such as sys are built into the interpreter itself and are not loaded from a file).
Paths to .zip files are just as valid as paths to folders; Python can import modules directly from inside a zip archive.
Now that you know what sys.path is, we can look at your "problem."
You've said that C:\Users\HowLo\AppData\Local\Programs\Python\Python36-32\python36.zip doesn't exist.
All this means is that python doesn't load any modules from there.
It really has no (meaningful) implications whatsoever.
The sys path can be appended with whatever path you choose. It is after all simply a list.
It is used to search for modules which are not in the current folder. More on that in the docs.
A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
A program is free to modify this list for its own purposes. Only strings and bytes should be added to sys.path; all other data types are ignored during import.
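Both points (zip archives are valid entries, and entries that do not exist are simply skipped) can be tried out with a throwaway archive. The following is a sketch under made-up names: bundle_demo.zip, zipped_mod_demo and import_from_zip are all hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(folder):
    """Import a module straight out of a zip archive listed on sys.path."""
    archive = os.path.join(folder, "bundle_demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_mod_demo.py", "VALUE = 42\n")
    missing = os.path.join(folder, "does_not_exist.zip")
    # The missing entry is skipped; the existing archive is searched.
    sys.path[:0] = [missing, archive]
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("zipped_mod_demo")
        return module.VALUE, module.__file__
    finally:
        # Leave sys.path as it was found.
        sys.path.remove(missing)
        sys.path.remove(archive)


with tempfile.TemporaryDirectory() as folder:
    value, origin = import_from_zip(folder)
    print(value, origin)
```

The reported file location points inside the archive, which shows that the module was read from the zip file rather than extracted to disk first.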

How does python find a module file if the import statement only contains the filename?

Everywhere I see Python code importing modules using import sys or import mymodule
How does the interpreter find the correct file if no directory or path is provided?
http://docs.python.org/3/tutorial/modules.html#the-module-search-path
6.1.2. The Module Search Path
When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:
The directory containing the input script (or the current directory when no file is specified).
PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
The installation-dependent default.
Note: On file systems which support symlinks, the directory containing the input script is calculated after the symlink is followed. In other words the directory containing the symlink is not added to the module search path.
After initialization, Python programs can modify sys.path. The directory containing the script being run is placed at the beginning of the search path, ahead of the standard library path. This means that scripts in that directory will be loaded instead of modules of the same name in the library directory. This is an error unless the replacement is intended. See section Standard Modules for more information.
For information on the "installation-specific default", see documentation on the site module.
Also, you can see what the current path is by using the sys module
import sys
print(sys.path)
It uses the PYTHONPATH, set as an environment variable, to find packages (folders containing __init__.py files) and modules (or, if already loaded once, retrieves the module object from sys.modules).
Python has a path variable just like the one you have inside your terminal. Python looks for modules in folders inside that path, or in the folder where your program is located.
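To see the "built-in first, then sys.path" distinction for a concrete name, something like the following can be used. It is a small sketch for top-level module names only, and where_from is a hypothetical helper name.

```python
import importlib.util
import sys


def where_from(name):
    """Say whether a top-level module is built-in or comes from a file."""
    # Built-in modules are compiled into the interpreter: no file is searched.
    if name in sys.builtin_module_names:
        return "built-in"
    # Everything else is located by the finders, typically via sys.path.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    return spec.origin


for name in ("sys", "json", "no_such_module_xyz"):
    print(name, "->", where_from(name))
```

Which modules count as built-in depends on how the interpreter was compiled, so the result for names other than sys can differ between installations.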
