Where to trigger ast modifications in Python? - python

I'm doing some AST modifications in Python, for optimization or other purposes.
Where is the right place to put it, so when you do something like:
python myfile.py
or
python runserver.py myapp
the modifications are applied to every .py file that gets executed?

Put your modifications in a site.py or sitecustomize.py (Python < 2.6) file in the lib/site-packages directory; Python will try to import and run it at interpreter startup.
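As a rough illustration of what such a startup file might end up running, here is a minimal sketch of an AST modification: a toy ast.NodeTransformer that folds integer additions, plus a helper that parses, transforms and compiles source. The names FoldIntAdd and compile_with_transform are made up for this example, and wiring the helper into the import machinery from sitecustomize.py is deliberately left out.

```python
import ast


class FoldIntAdd(ast.NodeTransformer):
    """Toy optimisation (hypothetical): fold `<int> + <int>` into one constant."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform the operands first
        if (
            isinstance(node.op, ast.Add)
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, int)
            and isinstance(node.right.value, int)
        ):
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def compile_with_transform(source, filename="<transformed>"):
    """Parse `source`, apply FoldIntAdd, and return a code object."""
    tree = ast.parse(source, filename)
    tree = FoldIntAdd().visit(tree)
    ast.fix_missing_locations(tree)  # new nodes need line/column info
    return compile(tree, filename, "exec")


namespace = {}
exec(compile_with_transform("x = 1 + 2 + 3"), namespace)
print(namespace["x"])
```

The transformer works on the tree before compilation, so the same helper could in principle be called from whatever hook ends up loading the modules.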

Related

How to prevent Python from searching the current working directory for modules?

I have a Python script which imports the datetime module. It worked well until one day I ran it in a directory that contains a Python script named datetime.py. Of course, there are a few ways to resolve the issue. First, I can run the script in a directory that does not contain datetime.py. Second, I can rename datetime.py. However, neither approach is ideal: someone who ships a Python script never knows where users will run it. Another possible fix is to prevent Python from searching the current working directory for modules. I tried removing the empty path ('') from sys.path; that works in an interactive Python shell but not in a Python script. The invoked script still searches the current path for modules. Is there any way to stop Python from searching the current path for modules?
if __name__ == '__main__':
    if '' in sys.path:
        sys.path.remove('')
    ...
Note that it doesn't work even if I put the following code at the beginning of the script.
import sys
if '' in sys.path:
    sys.path.remove('')
Below are some related questions on StackOverflow.
Removing path from Python search module path
pandas ImportError C extension when io.py in same directory
initialization of multiarray raised unreported exception python
Are you sure that Python is searching for that module in the current directory, and not in the script directory? I don't think Python adds the current directory to sys.path, except in one case. Doing so could even be a security risk (akin to having . on the UNIX PATH).
According to the documentation:
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first
So '' as a representation of the current directory appears only when Python is run interactively (that's why your interactive shell test worked) or when the script is read from standard input (something like cat modquest/qwerty.py | python). Neither is a 'normal' way of running Python scripts.
I'm guessing that your datetime.py sits side by side with your actual script (in the script directory), and that it just happens that you're running the script from that directory (that is, script directory == current directory).
If that's the actual scenario, and your script is standalone (meaning just one file, no local imports), you could do this:
sys.path.remove(os.path.abspath(os.path.dirname(sys.argv[0])))
But keep in mind that this will bite you in the future, once the script gets bigger and you split it into multiple files, only to spend several hours trying to figure out why it is not importing the local files...
Another option is to use -I, but that may be overkill:
-I
Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.
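To make the script-directory point concrete, here is a minimal sketch of a helper that filters a path list instead of mutating sys.path in place. The function name without_script_dir and the example paths are made up for illustration; the assumption is that it would be called at the very top of a standalone script, before any other import.

```python
import os
import sys


def without_script_dir(paths, script):
    """Return a copy of `paths` without the directory that contains `script`.

    An empty entry ('') is treated as the current directory, mirroring how
    Python interprets it in sys.path.
    """
    script_dir = os.path.abspath(os.path.dirname(script))
    return [p for p in paths if os.path.abspath(p or os.curdir) != script_dir]


# Hypothetical example paths, only to show the effect of the filter.
example = ["/opt/tool", "/usr/lib/python3"]
print(without_script_dir(example, "/opt/tool/run.py"))
```

In a real script the call would look like sys.path[:] = without_script_dir(sys.path, sys.argv[0]), with the same caveat as above: local imports from the script's own directory stop working afterwards.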

ModuleNotFoundError when running script from Terminal

I have the following folder structure:
app
__init__.py
utils
__init__.py
transform.py
products
__init__.py
fish.py
In fish.py I'm importing transform as follows: import utils.transform.
When I run fish.py from PyCharm, it works perfectly fine. However, when I run fish.py from the terminal, I get the error ModuleNotFoundError: No module named 'utils'.
The command I use in the terminal, from the app folder: python products/fish.py.
I've already looked into the solutions suggested here: Importing files from different folder. Adding a path to the application folder to sys.path helps. However, I am wondering if there is any other way of making it work without adding two lines of code to fish.py, because I have many scripts in the /products directory and do not want to add two lines of code to each of them.
I looked into some open source projects, and I saw many examples of importing modules from a parallel folder without adding anything into sys.path, e.g. here:
https://github.com/jakubroztocil/httpie/blob/master/httpie/plugins/builtin.py#L5
How can I make it work for my project in the same way?
You probably want to run python -m products.fish. The difference between that and python products/fish.py is that the former is roughly equivalent to doing import products.fish in the shell (but with __name__ set to __main__), while the latter does not have awareness of its place in a package hierarchy.
This expands on @Mad Physicist's answer.
First, assuming app is itself a package (since you added __init__.py to it) and utils and products are its subpackages, you should change the import to import app.utils.transform, and run Python from the root directory (the parent of app). The rest of this answer assumes you've done this. (If it wasn't your intention to make app the root package, tell me in a comment.)
The problem is that you're running app.products.fish as if it were a script, i.e. by giving the full path of the file to the python command:
python app/products/fish.py
This makes Python think this fish.py file is a standalone script that isn't part of any package. As defined in the docs (see here, under <script>), this means that Python will search for modules in the same directory as the script, i.e. app/products/:
If the script name refers directly to a Python file, the directory
containing that file is added to the start of sys.path, and the file
is executed as the __main__ module.
But of course, the app folder is not in app/products/, so it will throw an error if you try to import app or any subpackage (e.g. app.utils).
The correct way to start a script that is part of a package is to use the -m (module) switch (reference), which takes a module path as an argument and executes that module as a script (but keeping the current working directory as a module search path):
If this option is given, [...] the current directory
will be added to the start of sys.path.
So you should use the following to start your program:
python -m app.products.fish
Now when app.products.fish tries to import the app.utils.transform module, it will search for app in your current working directory (which contains the app/... tree) and succeed.
As a personal recommendation: don't put runnable scripts inside packages. Use packages only to store the logic and functionality (functions, classes, constants, etc.) and write a separate script, placed outside the package, to run your application as you wish. This will save you from this kind of problem (including the double import trap), and also has the advantage that you can write several run configurations for the same package by making a separate startup script for each.
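The difference between the two ways of starting fish.py can be reproduced with a small, self-contained sketch. It builds a throwaway copy of the app/ tree in a temporary directory and runs it both ways in a subprocess. The shout function and the file contents are invented for the demonstration; only the layout follows the question.

```python
import os
import subprocess
import sys
import tempfile


def build_demo_tree(root):
    """Create app/utils/transform.py and app/products/fish.py under `root`."""
    for pkg in ("app", "app/utils", "app/products"):
        pkg_dir = os.path.join(root, *pkg.split("/"))
        os.makedirs(pkg_dir, exist_ok=True)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(root, "app", "utils", "transform.py"), "w") as f:
        f.write("def shout(text):\n    return text.upper()\n")  # hypothetical helper
    with open(os.path.join(root, "app", "products", "fish.py"), "w") as f:
        f.write("import app.utils.transform\n"
                "print(app.utils.transform.shout('cod'))\n")


def run_python(args, cwd):
    """Run the current interpreter with `args`, ignoring any PYTHONPATH."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    return subprocess.run([sys.executable, *args], cwd=cwd, env=env,
                          capture_output=True, text=True)


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    # Started as a module: the working directory (parent of app/) is searched.
    as_module = run_python(["-m", "app.products.fish"], root)
    # Started as a script: only app/products/ is put at the front of sys.path.
    as_script = run_python([os.path.join("app", "products", "fish.py")], root)

print("module run exit code:", as_module.returncode)
print("script run exit code:", as_script.returncode)
```

The script run is expected to fail with the ModuleNotFoundError discussed above, assuming no unrelated package named app is installed in the environment.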

Running PyCharm project from command line

I am trying to deploy my project to a server and run it there.
When I try to start a script from the command line, it shows errors when importing scripts that are in parent directories.
I made the project (Python 2.7.10) using PyCharm and it is spread out into multiple directories.
The folders look something like this:
project/dir/subdir/main_dir/script1.py
from dir.subdir.other_dir.script2 import *  # gives error here
project/dir/subdir/other_dir/script2.py
def my_function():
    pass  # do something
I run the script by going to main_dir and running: python script1.py
If you are running your script from main_dir, then main_dir is the reference point for your Python command, so your imports are resolved with main_dir as the root.
This means if we take your script1 for example, your import should look like this:
from other_dir.script2 import *
Chances are your PyCharm project root is actually set to run from
project/
Which is why your references work within PyCharm.
What I suggest is this: if your server is supposed to run within main_dir, reconfigure PyCharm so that its execution root is the same, which removes the confusion.
An alternative solution to this problem in my case was to add a main.py script in the root of the python project which triggers the program.
project/__main__.py:
from dir.subdir.other_dir.script2 import *  # doesn't give errors
This means that when the program is called from the terminal, the working root is correct and every import resolves its folders from the project root.
project/dir/subdir/main_dir/script1.py:
from dir.subdir.other_dir.script2 import *  # also doesn't give errors
Another solution, where you can skip the parent directories while importing (and don't have to change anything in your script when going from a PyCharm execution to a manual execution):
from script2 import *
works when you set the PYTHONPATH variable before running your script, e.g. like this on Windows:
set PYTHONPATH=../other_dir && python script1.py
for Linux (bash) it is:
PYTHONPATH=../other_dir python script1.py
I believe this is also what PyCharm does upon execution: adding the corresponding folders to PYTHONPATH.
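The effect of PYTHONPATH on from script2 import * can be checked with a short sketch that recreates main_dir and other_dir in a temporary directory and runs script1.py with and without the variable. The file contents and the run_script1 helper are invented for the demonstration; only the directory names follow the question.

```python
import os
import subprocess
import sys
import tempfile


def run_script1(root, pythonpath=None):
    """Run script1.py from inside main_dir, optionally with PYTHONPATH set."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    return subprocess.run([sys.executable, "script1.py"],
                          cwd=os.path.join(root, "main_dir"), env=env,
                          capture_output=True, text=True)


with tempfile.TemporaryDirectory() as root:
    for name in ("main_dir", "other_dir"):
        os.mkdir(os.path.join(root, name))
    with open(os.path.join(root, "other_dir", "script2.py"), "w") as f:
        f.write("def my_function():\n    return 'hello from script2'\n")
    with open(os.path.join(root, "main_dir", "script1.py"), "w") as f:
        f.write("from script2 import *\nprint(my_function())\n")

    without_var = run_script1(root)
    # Same relative entry as in the shell commands above: ../other_dir
    with_var = run_script1(root, os.path.join(os.pardir, "other_dir"))

print("without PYTHONPATH, exit code:", without_var.returncode)
print("with PYTHONPATH, output:", with_var.stdout.strip())
```

Because the entry is relative, it only resolves correctly when the script is started from main_dir, which matches the commands shown above.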

Eric5: Update module imports

I'm using eric5 for my python projects and I really like it.
However, I don't understand how to force Eric5 to re-import modules before running a script.
Example of my workflow.
My modules
== mainfile.py ==
import mymodule
# some more code
=================
1.) First run of mainfile.py with "Start->Run Script".
2.) Then I modify mymodule.py and re-run with "Start->Run Script"
3.) I find that the modification in mymodule.py has not been picked up by mainfile.py. The updated file was not re-imported.
4.) Currently I'm helping myself by closing and opening the entire Eric5 project. This works but is not very elegant. I assume that there is a more convenient way?

Handling complicated directory structure with python imports

I've worked on several medium-sized python applications to date, and every time it seems like I cobble together a terrible system of imports from tangential Stack Overflow answers and half-understood blog posts. It's ugly and hard to maintain and ultimately very unsatisfying. With this question I attempt to put all that behind me.
Say I have a python application split into the following files:
app.py
constants.py
ui/window.py
web/connection.py
With the following include requirements:
app.py needs to include window.py and connection.py
window.py needs to include constants.py and connection.py
connection.py needs to include constants.py
app.py is the starting point for the application, but window.py and connection.py are also invokable from the command line to test basic functionality (ideally from within their respective folders).
What combination of __init__.py files, carefully crafted import statements and wacky python path magic will allow me to achieve this structure?
Thanks very much,
--Dan
It really helps if, instead of thinking in terms of "file structure" first and then trying to figure out the packages, you design things in terms of packages, and then lay out your file structure to implement those packages.
But if you want to know how to hack up what you already have: If you put this at the top level (that is, in one of the paths on your sys.path), and create files named ui/__init__.py and web/__init__.py, then:
app.py can be run as a script.
app.py can be run with -m app.
app.py can be imported with import app.
window.py cannot be run directly.
window.py can be run with -m ui.window.
window.py can be imported with import ui.window.
connection.py cannot be run directly.
connection.py can be run with -m web.connection.
connection.py can be imported with import web.connection.
No wacky path magic is needed; you just need the top level (with app.py, constants.py, ui, and web) to be on your sys.path—which it automatically is when you run with that directory as your working directory, or install everything directly into site-packages, or install it as an egg, etc.
That's as close as you're going to get to what you want. You don't ever want to run code with a package directory as your current working directory or otherwise on sys.path, so don't even try. If you think you need that, what you probably want is to separate the runnable code out into a script that you can put at the top level, or somewhere entirely separate. (For example, look at pip or ipython, which install scripts somewhere on your system $PATH that do nothing but import some module and run a function.)
The only other thing you might want to consider is putting all of this into a package, say, myapp. You do that by adding a top-level __init__.py, and then running from the parent directory, and adding myapp. to the start of all your import and -m statements. That means you can no longer run app.py as a script either, so again you will need to split the script code out into a separate file from the module that does all the work.
You can use that structure with just a small modification: add empty __init__.py files to the ui/ and web/ directories. Then, where you would have done import window, do either import ui.window, or from ui import window. Similarly, change import connection to import web.connection or from web import connection.
Rationale: Python doesn't work so much with directories as it does with packages, which are directories with an __init__.py in them. By changing ui and web to be packages, you don't have to do any particular Python path magic to work with them, and you get the benefit of adding some structure to your modules and imports. That will become particularly important if you start having modules with the same name in different directories (e.g. a util.py in both the ui and web directories; not necessarily the cleanest design but you get the idea).
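As a small check of the layout described above, the following sketch writes the files into a temporary directory, puts that top-level directory on sys.path, and imports ui.window through the package path. The file contents (the TIMEOUT constant and the timeout function) are invented for illustration; only the file names and import relationships come from the question.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical file contents; the layout mirrors the one in the question.
FILES = {
    "constants.py": "TIMEOUT = 30\n",
    "ui/__init__.py": "",
    "ui/window.py": "import constants\nfrom web import connection\n",
    "web/__init__.py": "",
    "web/connection.py": ("import constants\n\n"
                          "def timeout():\n"
                          "    return constants.TIMEOUT\n"),
}


def write_tree(root):
    """Write every entry of FILES below `root`, creating directories as needed."""
    for relpath, source in FILES.items():
        path = os.path.join(root, *relpath.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(source)


with tempfile.TemporaryDirectory() as root:
    write_tree(root)
    sys.path.insert(0, root)  # stands in for running from the top-level directory
    try:
        importlib.invalidate_caches()
        window = importlib.import_module("ui.window")
        timeout = window.connection.timeout()
    finally:
        sys.path.remove(root)

print(timeout)
```

With only the top-level directory on the path, both import constants and from web import connection resolve from inside ui/window.py without any further path changes.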
If you invoke window.py or connection.py directly to test them, you need to add the top-level directory to your PYTHONPATH for things to still work – but there is a subtle additional wrinkle. When you run this from the top-level directory:
PYTHONPATH=$PWD python web/connection.py
you now have both the top-level directory on your module path AND the web/ directory. This can cause certain relative imports to do unexpected things.
Another way is to use Python's -m option from the top-level directory:
python -m web.foo
I know many folks like to nail their tests right into the modules like this, but I should also note that there are other ways to structure your tests, particularly with an updated unittest library and tools like nosetests, that will make it a little bit easier to run your tests as your project gets larger. See the skeleton here for a reasonable example:
http://learnpythonthehardway.org/book/ex46.html
