How can I make Python find this module? - python

I wrote some Python programs about 7 or 8 years ago and I haven't touched them since, although they are still in heavy use by my company.
We have moved to a new server, and I'm trying to set that environment up. I have created all of the code, but apparently I have some sort of path problem that I don't remember how to solve. I haven't written any Python at all in at least 6 years.
At the top of this program I'm trying to execute, I have this:
from mycompany.initializer import Configurator
I'm running this program from a directory called:
/usr/code/myapp/migrate
The "mycompany" python module is located in:
/usr/code/mycompany
What path or other variable do I need to export in my user account so that above program can find the appropriate Python module?

You can manually add the path (like Kasapo shows). Though I think it would be pythonically preferable if you managed to properly install the module so that it becomes part of the python search path. One way to do that is with PYTHONPATH (ie: exporting PYTHONPATH to include '/usr/code' upon shell start up).
http://docs.python.org/using/cmdline.html

You should be able to add the directory to the system path:
import sys
sys.path.append('/usr/code/')
Also, not sure if this changed in python at any point, but you should have __init__.py in the 'mycompany' directory (and any module components/submodules in subdirectories...) to properly make it an importable module.

Related

Python script works in PyCharm but not in terminal

I'm currently trying to import one of my modules from a different folder.
Like this...
from Assets.resources.libs.pout import Printer, ForeColor, BackColor
This import method works completely fine in PyCharm, however, when i try to launch the file in cmd or IDLE, i get this error.
ModuleNotFoundError: No module named 'Assets'
This is my file structure from main.py to pout.py:
- Assets
- main.py
- resources
- libs
- pout.py
Any clue about how i could fix this ?
Any help is appreciated !
Edit: The original answer was based on the assumption that the script you're running is within the folder structure given, which a re-read tells me may not be true. The general solution is to do
sys.path.append('path_to_Assets')
but read below for more detail.
Original answer
The paths that modules can be loaded from will differ between the two methods of running the script.
If you add
import sys
print(sys.path)
to the top of your script before the imports you should be able to see the difference.
When you run it yourself the first entry will be the location of the script itself, followed by various system/environment paths. When you run it in PyCharm you will see the same first entry, followed by an entry for the top level of the project. This is how it finds the modules when run from PyCharm. This behaviour is controlled by the "Add content roots to PYTHONPATH" option in the run configuration.
Adding this path programmatically in a way that will work in all situations isn't trivial because, unlike PyCharm, your script doesn't have a concept of where the top level should be. BUT if you know you'll be running the script from the top level, i.e. your working directory will be the folder containing Assets and you're running something like python Assets/main.py then you can do
sys.path.append(os.path.abspath('.'))
and that will add the correct folder to the path.
Appending sys path didn't work for me on windows, hence here is the solution that worked for me:
Add an empty __init__.py file to each directory
i.e. in Assets, resources, libs.
Then try importing with only the base package names.
Worked for me!

How to prevent Python from search the current working directory for modules?

I have a Python script which imports the datetime module. It works well until someday I can run it in a directory which has a Python script named datetime.py. Of course, there are a few ways to resolve the issue. First, I run the script in a directory that does not contain the script datetime.py. Second, I can rename the Python script datetime.py. However, neither of the 2 approaches are perfect ways. Suppose one ship a Python script, he never knows where users will run the Python script. Another possible fix is to prevent Python from search the current working directory for modules. I tried to remove the empty path ('') from sys.path but it works in an interactive Python shell but not in a Python script. The invoked Python script still searches the current path for modules. I wonder whether there is any way to disable Python from searching the current path for modules?
if __name__ == '__main__':
if '' in sys.path:
sys.path.remove('')
...
Notice that it deosn't work even if I put the following code to the beginning of the script.
import sys
if '' in sys.path:
sys.path.remove('')
Below are some related questions on StackOverflow.
Removing path from Python search module path
pandas ImportError C extension when io.py in same directory
initialization of multiarray raised unreported exception python
Are you sure that Python is searching for that module in the current directory, and not on the script directory? I don't think Python adds the current directory to the sys.path, except in one case. Doing so could even be a security risk (akin to having . on the UNIX PATH).
According to the documentation:
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first
So, '' as a representation of the current directory happens only if run from interpreter (that's why your interactive shell test worked) or if the script is read from the standard input (something like cat modquest/qwerty.py | python). Neither is a rather 'normal' way of running Python scripts, generally.
I'm guessing that your datetime.py stands side by side with your actual script (on the script directory), and that it just happens that you're running the script from that directory (that is script directory == current directory).
If that's the actual scenario, and your script is standalone (meaning just one file, no local imports), you could do this:
sys.path.remove(os.path.abspath(os.path.dirname(sys.argv[0])))
But keep in mind that this will bite you in the future, once the script gets bigger and you split it into multiple files, only to spend several hours trying to figure out why it is not importing the local files...
Another option is to use -I, but that may be overkill:
-I
Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.

Get 'libs' directory from within Python regardless of Python installation structure

While patching a program I wrote I noticed that I had no safe way to actually get the directory of 'libs', or even 'include' from within the Python installation itself. Every single time I tried to find any python used directory, I just assumed they would always be in the same place, and thus hard-coded them in. Now you might wonder why I am having trouble with this, I mean it's just a simple join call after a sys check, right?
import os, sys
python_base_directory = sys.exec_prefix
libs_directory = os.path.join(python_base_directory, 'libs')
This code works most of the time, but will fail when you are using a virtual environment, or in my case, using travis_ci.
I want to be able to get the path to the 'libs' directory without hard coding anything.

Python: runtime shebang problems

Here is the problem I am trying to solve. I don't have a specific question in the title because I don't even know what I need.
We have an ancient Hadoop computing cluster with a very old version of Python installed. What we have done is installed a new version (2.7.9) to a local directory (that we have perms on) visible to the entire cluster, and have a virtualenv with the packages we need. Let's call this path /n/2.7.9/venv/
We are using Hadoopy to distribute Python jobs on the cluster. Hadoopy distributes the python code (the mappers and reducers) to the cluster, which are assumed to be executable and come with a shebang, but it doesn't do anything like activate a virtualenv.
If I hardcode the shebang in the .py files to /n/2.7.9/venv/, everything works. But I want to put the .py files in a library; these files should have some generic shebang like #!/usr/bin/env python. But I tried this and it does not work, because at runtime the virtualenv is not "activated" by the script and therefore it bombs with import errors.
So if anyone has any ideas on how to solve this problem I would be grateful. Essentially I want #!/usr/bin/env python to resolve to /n/2.7.9/venv/ without /n/2.7.9/venv/ being active, or some other solution where I cannot hardcode the shebang.
Currently I am solving this problem by having a run function in the library, and putting a wrapper around this function in the main code (that calls the library) with the hardcoded shebang in it. This is less offensive because the hardcoded shebang makes sense in the main code, but it is still messy because I have to have an executable wrapper file around every function I want to run from the library.
I would change the environment variable PYTHONPATH and also the environment variable PATH. Point PYTHONPATH to your virtual environment and PATH to the directory that contains your new python executable, and make sure the path to your python executable comes first.
I accepted John Schmitt's answer because it led me to the solution. However, I am posting what I actually did, because it might be useful for other Hadoopy users.
What I actually did was :
args['cmdenvs'] = ['export VIRTUAL_ENV=/n/2.7.9/ourvenv','export PYTHONPATH=/n/2.7.9/ourvenv', 'export PATH=/n/2.7.9/ourvenv/bin:$PATH']
and passed args into Hadoopy's launch function. In the executable .py files, I put the generic #!/usr/bin/env python shebang.

Add a directory to Python sys.path so that it's included each time I use Python

Currently, when trying to reference some library code, I'm doing this at the top of my python file:
import sys
sys.path.append('''C:\code\my-library''')
from my-library import my-library
Then, my-library will be part of sys.path for as long as the session is active. If I start a new file, I have to remember to include sys.path.append again.
I feel like there must be a much better way of doing this. How can I make my-library available to every python script on my windows machine without having to use sys.path.append each time?
Simply add this path to your PYTHONPATH environment variable. To do this, go to Control Panel / System / Advanced / Environment variable, and in the "User variables" sections, check if you already have PYTHONPATH. If yes, select it and click "Edit", if not, click "New" to add it.
Paths in PYTHONPATH should be separated with ";".
You should use
os.path.join
to make your code more reliable.
You have already used __my-library__ in the path. So don't use it the second time in import.
If you have a directory structure like this
C:\code\my-library\lib.py and a function in there, e.g.:
def main():
print("Hello, world")
then your resulting code should be
import sys
sys.path.append(os.path.join('C:/', 'code', 'my-library'))
from lib import main
If this is a library that you use throughout your code, you should install it as such. Package it up properly, and either install it in your site-packages directory - or, if it's specific to certain projects, use virtualenv and install it just within the relevant virtualenvs.
To do such a thing, you'll have to use a sitecustomize.py (or usercustomize.py) file where you'll do your sys.path modifications (source python docs).
Create the sitecustomize.py file into the \Lib\site-packages directory of your python installation, and it will be imported each time a python interpreter is launched.
If you are doing this interactively, the best thing to do would be to install ipython and configure your startup settings to include that code. If you intend to have it be part of a script you run from the interpreter, the same thing applies, since it will have access to your namespace.
On the other hand, a stand alone script should not include that automatically. In the future, you or some other maintainer will come along, and all the code should be obvious, and not dependent upon a specific machine setup. The best thing to do would be to set up a skeleton file for new projects that includes all of the basic functionality you need. That, along with oft-used snippets will handle the problem.
All of your code to run the script, will be in the script, and you won't have to think about adding that code every time.
Using jupyter with multiple environments, adding the path to .bashrc didn't work. I had to edit the kernel.json file for that particular kernel and append it to the PYTHONPATH in env section.
This only worked in that kernel but maybe this can help someone else.

Categories