I have a module that I want to keep up to date, and I'm wondering if this is a bad idea:
Have a module (mod1.py) in the site-packages directory that copies a different module from some other location into the site-packages directory, and then imports * from that module.
import shutil
from distutils.sysconfig import get_python_lib
# Network master copy and its local destination inside site-packages.
p_source = r'\\SourceSafeServer\mod1_current.py'
p_local = get_python_lib() + r'\mod1_current.py'
# Refresh the local copy, then re-export everything it defines.
shutil.copyfile(p_source, p_local)
from mod1_current import *
Now I can do this in any module, and it will always be the latest version:
from mod1 import function1
This works.... but is there a better way of doing this?
Update
Here is the current process... there is a project under source control that contains a single module, mod1.py, along with a setup.py. Running setup.py copies mod1.py to the site-packages directory.
Developers that use the module must run setup.py to update it. Sometimes they don't, and not having the latest version causes problems.
I want to be able to just check in a new version, and have any code that imports the module automatically grab the latest version every time, without anyone having to run setup.py.
Do you really want to do this? This means you could very easily roll code to a production app simply by committing to source control. I would consider this a nasty side-effect for someone who isn't aware of your setup.
That being said, this seems like a pretty good solution; you may want to add some exception handling around the network file calls, as those are prone to failure.
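For example, a minimal sketch of that exception handling (the choice to fall back to the stale local copy is an assumption; you may prefer to re-raise):
try:
    shutil.copyfile(p_source, p_local)
except (IOError, OSError) as exc:
    # Assumption: using the previously copied local version is acceptable
    # when the network share is unreachable.
    print('Warning: could not refresh mod1_current: %s' % exc)
from mod1_current import *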
In some cases, we put .pth files in the Python site-packages directory. The .pth files name our various SVN checkout directories.
No install. No copy.
.pth files are described in the documentation for the standard site module.
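For example, a file such as dev_checkouts.pth (the name and checkout paths here are made up for illustration) dropped into site-packages might contain:
# dev_checkouts.pth -- each non-comment line is added to sys.path at startup
C:\svn\project_a\trunk
C:\svn\common_lib\trunk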
The original strategy of having other developers copy mod1.py into their site-packages in order to use the module sounds like the real problem. Why aren't they just using the same source control as you are?
This auto-copying will make it hard to do rollbacks, especially if other developers copy your strategy. Imagine this same system used for dozens and dozens of files. And then imagine you actually do want to use a version of mod1.py that is not the latest for something.
Related
I have been struggling with imports in packages for a while. When I develop a package, I read everywhere that it is preferable to use absolute imports in submodules of that package. I understand that and I like it more as well. But I also read that you shouldn't use sys.path.append('/path/to/package') to use your package in development...
So my question is: how do you develop such a package from scratch, using absolute imports directly? At the moment I develop the package using relative imports, since that way I am able to test the code I am writing before packaging and installing; I then change the imports once I have a release and build the package.
What is the correct way of doing such a thing? In PyCharm, for example, you would mark the folder as 'source root' and be able to work as if the package folder were on the path. Still, I read that this is not the proper way... what am I missing? How do you develop a package while testing its code?
Your mileage may vary but this is what I usually do:
Within a package (foo), absolute (import foo.bar) or relative (from . import bar) doesn't matter to me as long as it works. Sometimes I prefer relative, especially when the project is large and one day I might decide to move a number of source files into a subdirectory.
How do I test? My $PYTHONPATH usually has . in it, and my directory hierarchy is like this:
/path/to/foo_project
    /setup.py
    /foo
        /__init__.py
        /bar.py
    /test
        /test1.py
        /test2.py
Then the script foo_project/test/test1.py imports the package exactly the way a normal consumer would, using import foo.bar. When I test my code, I am in the directory foo_project and I run python test/test1.py. Since I have . in my $PYTHONPATH, the directory foo is found and used as a package.
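Concretely, such a test script might look like this minimal sketch (some_function is a hypothetical name; call whatever foo/bar.py actually defines):
# foo_project/test/test1.py -- run from foo_project/ as: python test/test1.py
import foo.bar

print(foo.bar.some_function())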
To keep a clear file structure in my project, I am using the following code snippet to dynamically add the project's main folder to the PYTHONPATH, ensuring that I can import files even from above a file's location.
import sys
import os
# Prepend the directory containing this file (resolved via realpath) to sys.path.
sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), "."))
Since I did this, when I start my main file, changes to the modules aren't recognized anymore until I manually delete any .pyc files. I therefore assume this somehow prevents Python from checking whether the .pyc files are up to date. Can I overcome this issue in any way?
Adding the path of an already imported module can get you into trouble if module names are no longer unique. Consider that you do import foo, which adds its parent package bar to sys.path - it's now possible to also do import bar.foo. Python will consider both to be different modules, which can mess up anything relying on module identity.
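A minimal sketch of the problem (the layout is hypothetical: bardir/bar/__init__.py and bardir/bar/foo.py, with both bardir and bardir/bar on sys.path):
import foo        # found via the bardir/bar entry
import bar.foo    # found via the bardir entry

print(foo is bar.foo)   # False -- same file on disk, two distinct module objects
# isinstance() checks across the two copies, module-level singletons,
# and pickling can all silently break once they diverge.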
You should really consider why you need to do this hack in the first place. If you have an executable placed inside your package, you should not do
cd bardir/bar
python foo.py
but instead call it as part of the package via
cd bardir
python -m bar.foo
You could try to make Python not write those *.pyc files.
How to avoid .pyc files?
For large projects, skipping .pyc files costs a little start-up performance. It's possible that you don't care about that, in which case you can simply not create them.
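For example, one way to do this from within your main script (sys.dont_write_bytecode is standard since Python 2.6):
# Suppress .pyc generation for everything imported after this point.
import sys
sys.dont_write_bytecode = True
The same effect is available via the interpreter's -B flag or the PYTHONDONTWRITEBYTECODE environment variable.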
I'm working toward adopting Python as part of my team's development tool suite. With the other languages/tools we use, we develop many reusable functions and classes that are specific to the work we do. This standardizes the way we do things and saves a lot of wheel re-inventing.
I can't seem to find any examples of how this is usually handled with Python. Right now I have a development folder on a local drive, with multiple project folders below that, and an additional "common" folder containing packages and modules with re-usable classes and functions. These "common" modules are imported by modules within multiple projects.
Development/
Common/
Package_a/
Package_b/
Project1/
Package1_1/
Package1_2/
Project2/
Package2_1/
Package2_2/
In trying to learn how to distribute a Python application, it seems that there is an assumption that all referenced packages are below the top-level project folder, not collateral to it. The thought also occurred to me that perhaps the correct approach is to develop common/framework modules in a separate project, and once tested, deploy those to each developer's environment by installing to the site-packages folder. However, that also raises questions re distribution.
Can anyone shed light on this, or point me to a resource that discusses this issue?
If you have common code that you want to share across multiple projects, it may be worth storing it in a physically separate project, which is then imported as a dependency into your other projects. This is easily achieved if you host your common code project on GitHub or Bitbucket, where you can use pip to install it in any other project. This approach not only helps you share common code across multiple projects, but also helps protect you from inadvertently creating bad dependencies (i.e. those directed from your common code to your non-common code).
The link below provides a good introduction to using pip and virtualenv to manage dependencies, definitely worth a read if you and your team are fairly new to working with python as this is a very common toolchain used for just this kind of problem:
http://dabapps.com/blog/introduction-to-pip-and-virtualenv-python/
And the link below shows you how to pull in dependencies from github using pip:
How to use Python Pip install software, to pull packages from Github?
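For example (the repository URL is a placeholder):
pip install git+https://github.com/yourteam/common-utils.git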
The must-read-first on this kind of stuff is here:
What is the best project structure for a Python application?
in case you haven't seen it (and follow the link in the second answer).
The key is that each major package be importable as if "." were the top-level directory, which means that it will also work correctly when installed in site-packages. What this implies is that major packages should all be flat within the top directory, as in:
myproject-0.1/
myproject/
framework/
packageA/
sub_package_in_A/
module.py
packageB/
...
Then both you (within your other packages) and your users can import as:
import myproject
import packageA.sub_package_in_A.module
etc
Which means you should think hard about @MattAnderson's comment, but if you want it to appear as a separately-distributable package, it needs to be in the top directory.
Note this doesn't stop you (or your users) from doing an:
import packageA.sub_package_in_A as sub_package_in_A
but it does stop you from allowing:
import sub_package_in_A
directly.
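For instance, a minimal setup.py for that layout might list the flat top-level packages explicitly (a sketch using distutils; the names match the tree above):
# myproject-0.1/setup.py
from distutils.core import setup

setup(
    name='myproject',
    version='0.1',
    packages=[
        'myproject',
        'framework',
        'packageA',
        'packageA.sub_package_in_A',
        'packageB',
    ],
)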
...it seems that there is an assumption that all referenced packages
are below the top-level project folder, not collateral to it.
That's mainly because the current working directory is the first entry in sys.path by default, which makes it very convenient to import modules and packages below that directory.
If you remove it, you can't even import stuff from the current working directory...
$ touch foo.py
$ python
>>> import sys
>>> del sys.path[0]
>>> import foo
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named foo
The thought also occurred to me that perhaps the correct approach is
to develop common/framework modules in a separate project, and once
tested, deploy those to each developer's environment by installing to
the site-packages folder.
It's not really a major issue for development. If you're using version control, and all developers check out the source tree in the same structure, you can easily employ relative path hacks to ensure the code works correctly without having to mess around with environment variables or symbolic links.
However, that also raises questions re distribution.
This is where things can get a bit more complicated, but only if you're planning to release libraries independently of the projects which use them, and/or have multiple project installers share the same libraries. If that's the case, take a look at distutils.
If not, you can simply employ the same relative path hacks used in development to ensure your project works "out of the box".
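Such a relative path hack might look like this (a sketch assuming the Development/ tree from the question, with the script living directly under Project1/):
import os
import sys

# Resolve Development/Common relative to this file rather than the current
# working directory, so the script works no matter where it is launched from.
_dev_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, os.path.join(_dev_root, 'Common'))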
I think that this is the best reference for creating a distributable python package:
link removed as it leads to a hacked site.
Also, don't feel that you need to nest everything under a single directory. You can do things like
platform/
core/
coremodule
api/
apimodule
and then do things like from platform.core import coremodule, etc.
I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists of adding the Library path only once, by doing it in an imported module:
# In module add_Library_path:
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and the order of the imports matters: this makes the code less legible and more fragile. It also forces my distribution of programs to include an add_Library_path.py module that users will not use.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this fails for packages in Library, as they might do something like from xlutils.filter import …, which breaks because xlutils itself is not found in sys.path. So this method works, but only for modules in Library, not packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern package template and Distribute contain a vast array of information and were in part set up to solve your problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location or, if you can, to the system's site-packages. In the former case, the local site-packages directory would then be added to the system's or user's PYTHONPATH; in the latter case, nothing needs to change.
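For example (the prefix path is illustrative):
python setup.py install --user
# or, to a custom prefix that you then add to PYTHONPATH:
python setup.py install --prefix=/opt/mycompany/python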
You could use a batch file to set the Python path as well. Or point the Python executable at a shell script that sets a modified PYTHONPATH and then executes the real interpreter. The latter, of course, means that you need access to the user's machine, which you do not. However, if your users only run scripts and do not import your libraries directly, you could use your own wrapper for scripts:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
# Prepend the bundled library path, then hand all arguments to the real interpreter.
PYTHONPATH=/whatever/lib/path:$PYTHONPATH /usr/bin/python "$@"
I think you should have a look at path import hooks, which allow you to modify the behaviour of Python when searching for modules.
For example, you could do something like KDE's script engine does for Python plugins [1].
It adds a special token to sys.path (like "<plasmaXXXXXX>", with XXXXXX being a random number, just to avoid name collisions), and then when Python tries to import a module and can't find it on the other paths, it calls your importer, which can deal with it.
A simpler alternative is to have a main script that acts as a launcher: it simply adds the path to sys.path and executes the target file (so that you can safely avoid putting the sys.path.append(...) line in every file).
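A sketch of such a launcher, using the standard library's runpy (available since Python 2.7; the Library location matches the question):
# launcher.py -- usage: python launcher.py some_program.py
import os
import runpy
import sys

# Put the bundled Library directory first on the path, once, here only.
_here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(_here, 'Library'))

# Run the requested program as if it had been started directly.
runpy.run_path(sys.argv[1], run_name='__main__')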
Yet another alternative, which works on Python 2.6+, is to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a Linux installation with KDE.
I need to put a Python script somewhere on my computer so that another file can use it. How do I do this, and where do I put it? And where in the Python documentation can I learn how to do this? I'm a beginner and don't use Python much.
Library file MyLib.py, put in a well-known place:
def myfunc():
    pass  # ...function body goes here
The other file, SourceFile.py, located elsewhere, doesn't need to know where MyLib.py is:
import MyLib
something = MyLib.myfunc()
Option 1:
Put your file at:
<Wherever your Python is>/Lib/site-packages/myfile.py
Add this to your code:
import myfile
Pros: Easy
Cons: Clutters site-packages
Option 2:
Put your file at:
<Wherever your Python is>/Lib/site-packages/mypackage/myfile.py
Create an empty text file called:
<Wherever your Python is>/Lib/site-packages/mypackage/__init__.py
Add this to your code:
from mypackage import myfile
Pros: Reduces clutter in site-packages by keeping your stuff consolidated in a single directory
Cons: Slightly more work; still some clutter in site-packages. This isn't bad for stable stuff, but may be regarded as inappropriate for development work, and may be impossible if Python is installed on a shared drive
Option 3
Put your file in any directory you like
Add that directory to the PYTHONPATH environment variable (see the example after this list)
Proceed as with Option 1 or Option 2, except substitute the directory you just created for <Wherever your Python is>/Lib/site-packages/
Pros: Keeps development code out of the site-packages directory
Cons: Slightly more setup
This is the approach I usually use for development work.
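For example, on Unix-like shells (the directory is a placeholder):
export PYTHONPATH=$HOME/dev/pylib:$PYTHONPATH
On Windows, the equivalent is set PYTHONPATH=C:\dev\pylib;%PYTHONPATH% in a command prompt or the environment settings.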
In general, the Modules section of the Python tutorial is a good introduction for beginners on this topic. It explains how to write your own modules and where to put them, but I'll summarize the answer to your question below:
Your Python installation has a site-packages directory; any Python file you put in that directory will be available to any script you write. For example, if you put the file MyLib.py in the site-packages directory, then in your script you can say
import MyLib
something = MyLib.myfunc()
If you're not sure where Python is installed, the Stack Overflow question How do I find the location of my Python site-packages directory will be helpful to you.
Alternatively, you can modify sys.path, which is a list of directories where Python looks for libraries when you use the import statement. Your site-packages directory is already in this list, but you can add (or remove) entries yourself. For example, if you wanted to put your MyLib.py file in /usr/local/pythonModules, you could say
import sys
sys.path.append("/usr/local/pythonModules")
import MyLib
something = MyLib.myfunc()
Finally, you could use the PYTHONPATH environment variable to indicate the directory where your MyLib.py is located.
However, I recommend simply placing your MyLib.py file in the site-packages directory, as described above.
No one has mentioned using .pth files in site-packages to abstract away the location.
You will have to place your MyLib.py somewhere on your load path (that is, the paths in your sys.path variable), and then you'll be able to import it fine. Your code would look like
import MyLib
MyLib.myfunc()
Generally speaking, you should distribute your packages using distutils so that they can be easily installed in the proper locations; that would help you here as well.
Also, you might not want to install packages into your global Python install. It's customary (and recommended) to use virtualenv, which creates small isolated Python environments that can hold local packages.
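A typical virtualenv workflow, as a sketch (some-package is a placeholder):
virtualenv env
source env/bin/activate   # on Windows: env\Scripts\activate
pip install some-package  # installs into env/, not the global Python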
It's best you give the whole thing a shot and then ask further questions if you have them.
The private version, from my .profile:
export PYTHONPATH=${PYTHONPATH}:$HOME/lib/python
That directory has a subdirectory msw, so import msw.primes is self-documenting. Alternatively, add your module to a local directory that is already in sys.path.
The Python tutorial's section 6 talks about modules, and section 6.1.2 talks about PYTHONPATH, which determines where Python will look for modules you try to import. The tutorial: http://docs.python.org/tutorial/modules.html