I'm just beginning Python, and I'd like to use an external RSS class. Where do I put that class and how do I import it? I'd like to eventually be able to share python programs.
About the import statement:
(a good writeup is at http://effbot.org/zone/import-confusion.htm and the python tutorial goes into detail at http://docs.python.org/tutorial/modules.html )
There are two normal ways to import code into a python program.
Modules
Packages
A module is simply a file that ends in .py. In order for Python to import it, it must exist on the search path (as defined in sys.path). The search path usually consists of the directory containing the .py file that is being run, as well as the Python system directories.
Given the following directory structure:
myprogram/main.py
myprogram/rss.py
From main.py, you can "import" the rss classes by running:
import rss
rss.rss_class()
# alternatively, you can use:
from rss import rss_class
rss_class()
Packages provide a more structured way to contain larger python programs. They are simply a directory which contains an __init__.py as well as other python files.
As long as the package directory is on sys.path, then it can be used exactly the same as above.
To find your current path, run this:
import sys
print(sys.path)
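To make the module case concrete, here is a minimal sketch that builds a throwaway project directory, puts it on sys.path, and imports rss from it. The body of rss_class and its title method are hypothetical stand-ins, since the real RSS class is not shown in the question.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the external RSS class; the real one would live in rss.py.
RSS_SOURCE = '''
class rss_class:
    def title(self):
        return "example feed"
'''

with tempfile.TemporaryDirectory() as project_dir:
    Path(project_dir, "rss.py").write_text(RSS_SOURCE)

    # A directory has to be on sys.path before modules inside it can be imported.
    sys.path.insert(0, project_dir)
    importlib.invalidate_caches()
    try:
        import rss

        feed = rss.rss_class()
        feed_title = feed.title()
        print(feed_title)
    finally:
        # Undo the path change so the demo leaves sys.path as it found it.
        sys.path.remove(project_dir)
```

In a real project the sys.path manipulation is usually unnecessary, because the directory of the script being run is already on the search path.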
I don't really like answering so late, but I'm not entirely satisfied with the existing answers.
I'm just beginning Python, and I'd like to use an external RSS class. Where do I put that class and how do I import it?
You put it in a python file, and give the python file an extension of .py . Then you can import a module representing that file, and access the class. Supposing you want to import it, you must put the python file somewhere in your import search path-- you can see this at run-time with sys.path, and possibly the most significant thing to know is that the site-packages (install-specific) and current directory ('') are generally in the import search path. When you have a single homogeneous project, you generally put it in the same directory as your other modules and let them import each other from the same directory.
I'd like to eventually be able to share python programs.
After you have it set up as a standalone file, you can get it set up for distribution using distutils (deprecated since Python 3.10 and removed in 3.12; setuptools is its maintained successor). That way you don't have to worry about where, exactly, it should be installed-- distutils will worry for you. There are many other means of distribution as well, many of them OS-specific-- distutils works for modules, but if you want to distribute a proper program that users are meant to run, other options exist, such as using py2exe for Windows.
As for the modules/packages distinction, well, here it goes. If you've got a whole bunch of classes that you want divided up so that you don't have one big mess of a python file, you can separate it into multiple python files in a directory, and give the directory an __init__.py . The important thing to note is that from Python, there's no difference between a package and any other module. A package is a module, it's just a different way of representing one on the filesystem. Similarly, a module is not just a .py file-- if that were the case, sys would not be a module, since it has no .py file. It's built-in to the interpreter. There are infinitely many ways to represent modules on the filesystem, since you can add import hooks that can create ways other than directories and .py files to represent modules. One could, hypothetically, create an import hook that used spidermonkey to load Javascript files as Python modules.
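A quick way to check these claims from the interpreter is sketched below. It assumes CPython and uses the standard library's json package purely as a convenient example of a package.

```python
import json
import sys
import types

# A built-in module, a package, and a plain submodule are all the same type.
print(isinstance(sys, types.ModuleType))
print(isinstance(json, types.ModuleType))
print(isinstance(json.decoder, types.ModuleType))

# sys is compiled into the interpreter, so there is no .py file behind it.
sys_has_file = hasattr(sys, "__file__")

# Only packages carry __path__, the list of places searched for their submodules.
json_is_package = hasattr(json, "__path__")
decoder_is_package = hasattr(json.decoder, "__path__")

print(sys_has_file, json_is_package, decoder_is_package)
```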
from [module] import [classname]
Where the module is somewhere on your python path.
About modules and packages:
a module is a file ending with .py. You can put your class in such a file. As said by Andy, it needs to be on your Python path (PYTHONPATH). Usually, though, you will put the additional module in the same directory as your script, from where it can be imported directly.
a package is a directory containing an __init__.py (can be empty) and contains module files. You can then import a la from <package>.<module> import <class>. Again this needs to be on your python path.
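The package form can be sketched the same way. The names feeds_pkg, reader, and Reader below are hypothetical and chosen only for illustration; the point is that the directory containing the package, not the package directory itself, goes on the path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    package_dir = Path(root, "feeds_pkg")           # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")    # may be empty
    (package_dir / "reader.py").write_text(
        "class Reader:\n"
        "    kind = 'rss'\n"
    )

    # The directory that *contains* the package must be on the search path.
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        from feeds_pkg.reader import Reader

        reader_kind = Reader.kind
        print(reader_kind)
    finally:
        sys.path.remove(root)
```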
You can find more in the documentation.
If you want to store your RSS file in a different place, use sys.path.append("") with that directory's path between the quotes, put the module in that directory, and use
import or from import *
The first file, where you have created the class, is "first.py"
first.py:
class Example:
...
You create the second file, where you want to use the class contained in the "first.py", which is "second.py"
myprogram/first.py
myprogram/second.py
Then in the second file, to call the class contained in the first file, you simply type:
second.py:
from first import Example
...
I have a Python module called util. I would like to import a script in this package _util.py from another script in scripts folder.
Even if the util package has an empty __init__.py file it does not appear as a Python package but a normal directory, without the small dot on folder image.
How can I import this module?
A preliminary answer to your question is that modules or methods beginning with underscores are meant to be used internally.
_my_method() should only be referenced from within the module holding it
_my_module() should only be referenced from within the package holding it
That being said, this convention is meant to be a hint to other developers, not a strict prohibition. Perhaps the first step you can take to solve the import issue is to rename _util.py to util.py and proceed from there.
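The sketch below illustrates that the underscore is only a convention. It assumes a hypothetical module named _util_demo: an explicit import of the module and of its underscore-prefixed function works, and the only place the interpreter itself honors the prefix is a star import (when the module defines no __all__).

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical module whose own name starts with an underscore.
    Path(root, "_util_demo.py").write_text(
        "def public_helper():\n"
        "    return 'public'\n"
        "\n"
        "def _private_helper():\n"
        "    return 'private'\n"
    )
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import _util_demo                      # explicit import is allowed

        star_namespace = {}
        exec("from _util_demo import *", star_namespace)
    finally:
        sys.path.remove(root)

# Explicit access to an underscore name works; nothing enforces the convention.
private_result = _util_demo._private_helper()

# A star import skips underscore-prefixed names.
star_names = sorted(name for name in star_namespace if not name.startswith("__"))
print(private_result, star_names)
```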
To keep a clear file structure in my project, I am using the following code snippet to dynamically add the project's main folder to the PYTHONPATH, and thereby ensure that I can import files even from above a file's location.
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), "."))
Since I did this, when I start my main file, changes to the modules aren't recognized anymore until I manually delete any .pyc files. Thus I assume this for some reason prevents Python from checking whether the .pyc files are up to date. Can I overcome this issue in any way?
Adding the path of an already imported module can get you into trouble if module names are no longer unique. Consider that you do import foo, which adds its parent package bar to sys.path - it's now possible to also do import bar.foo. Python will consider both to be different modules, which can mess up anything relying on module identity.
You should really consider why you need to do this hack in the first place. If you have an executable placed inside your package, you should not do
cd bardir/bar
python foo
but instead call it as part of the package via
cd bardir
python -m bar.foo
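The difference between the two invocations can be observed with a small sketch. It recreates the bardir/bar/foo.py layout in a temporary directory and runs the file both ways in a subprocess; the one-line body of foo.py is an assumption made for the demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as bardir:
    package_dir = Path(bardir, "bar")
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    # Demo body: report how the interpreter sees this file.
    (package_dir / "foo.py").write_text("print(__name__, __package__)\n")

    # Equivalent of: cd bardir && python -m bar.foo
    as_module = subprocess.run(
        [sys.executable, "-m", "bar.foo"],
        cwd=bardir, capture_output=True, text=True, check=True,
    )
    # Equivalent of: cd bardir/bar && python foo.py
    as_script = subprocess.run(
        [sys.executable, "foo.py"],
        cwd=package_dir, capture_output=True, text=True, check=True,
    )

module_run_output = as_module.stdout.strip()
script_run_output = as_script.stdout.strip()
print(module_run_output)
print(script_run_output)
```

Run with -m, the file knows it belongs to the bar package, so imports relative to bar resolve without touching sys.path; run as a plain script, it has no package at all.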
You could try to make python not write those *.pyc files.
How to avoid .pyc files?
For large projects this would matter slightly from a performance perspective. It's possible that you don't care about that, and then you can just not create the pyc files.
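One way to do this from inside a program is sys.dont_write_bytecode, which has the same effect as starting the interpreter with python -B or setting the PYTHONDONTWRITEBYTECODE environment variable. The sketch below uses a hypothetical module name, no_cache_demo, to show that no __pycache__ directory appears next to it after import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

previous_setting = sys.dont_write_bytecode
sys.dont_write_bytecode = True   # must be set before the imports it should affect

with tempfile.TemporaryDirectory() as root:
    Path(root, "no_cache_demo.py").write_text("VALUE = 1\n")   # hypothetical module
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import no_cache_demo

        # With bytecode writing disabled, no cache directory is created.
        cache_created = Path(root, "__pycache__").exists()
        print(cache_created)
    finally:
        sys.path.remove(root)
        sys.dont_write_bytecode = previous_setting
```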
I've been looking around at some open source projects on Python, and I'm seeing a lot of files and patterns that I'm not familiar with.
First of all, a lot of projects just have a file called setup.py, which usually contains one function:
setup(blah, blah, blah)
Second, a lot contain a file that is simply called __init__.py and contains next to no information.
Third, some .py files contain a statement similar to this:
if __name__ == "__main__":
Finally, I'm wondering if there are any "best practices" for dividing Python files up in a git repository. With Java, the idea of file division comes pretty naturally because of the class structure. With Python, many scripts have no classes at all, and sometimes a program will have OOP aspects, but a class by class division does not make that much sense. Is it just "whatever makes the code the most readable," or are there some guidelines somewhere about this?
The setup.py is part of Python’s module distribution using the distribution utilities. It allows for easy installation of the Python module and is useful when, well, you want to distribute your project as a whole Python module.
The __init__.py is used for Python’s package system. An empty file is usually enough to make Python recognize the directory it is in as a package, but you can also define different things in it.
Finally, the __name__ == '__main__' check is to ensure that the current script is run directly (e.g. from the command line) and it is not just imported into some other script. During a Python script execution only a single module’s __name__ property will be equal to __main__. See also my answer here or the more general question on that topic.
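As a minimal illustration of that check, consider a file with the hypothetical name greeter.py. The greet and main functions are made up for the example.

```python
def greet(name):
    """Reusable function that other modules can import."""
    return f"Hello, {name}!"


def main():
    print(greet("world"))


if __name__ == "__main__":
    # Reached only when the file is executed directly (python greeter.py).
    # A plain "import greeter" from another file skips this branch.
    main()
```

Another module can then do from greeter import greet without triggering the print.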
The setup.py is part of the distutils setup process. You'll want to have one of those if you're distributing a module instead of just a basic script (though even then it's a good idea to have one, so you can easily expand into a module later).
The __init__.py is part of the Python module import process:
Files named __init__.py are used to mark directories on disk as Python package directories. If you have the files
mydir/spam/__init__.py
mydir/spam/module.py
and mydir is on your path, you can import the code in module.py as:
import spam.module
or
from spam import module
If you remove the __init__.py file, Python will no longer look for submodules inside that directory, so attempts to import the module will fail.
if __name__ == "__main__" is a way to indicate code that would be executed if the file was run directly instead of imported.
As for how to lay out your code, the distutils documentation has a good guide on this.
In addition to @poke's answer, see this related question on what the directory structure of a Python project should be. Here is another useful tutorial on how to make your project easily runnable.
I am working on a project wherein I need to use a third party module in different project files(.py files). The situation is like this.
I have a file "abc.py" which imports third party module "common.py". There are couple of other files which also import "common.py". All these files are also imported in main project file "main.py".
It seems redundant to import same module in your project multiple times in different files since "main.py" is also importing all the project files.
I am also not sure how the size of the project gets affected by multiple import statements.
Can someone please help me make things a bit simpler?
Importing only ever loads a module once. Any imports after that simply add it to the current namespace.
Just import things in the files you need them to be available and let Python do the heavy-lifting of figuring out loading the modules.
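This is easy to verify. The sketch below uses a hypothetical module, common_demo, whose body prints a line when it executes; importing it twice runs the body once, and both names end up bound to the same object cached in sys.modules.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for the shared module; its body announces each execution.
    Path(root, "common_demo.py").write_text("print('common_demo body executed')\n")
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import common_demo as first_import     # executes the module body
            import common_demo as second_import    # reuses sys.modules["common_demo"]
    finally:
        sys.path.remove(root)

body_runs = captured.getvalue().count("common_demo body executed")
same_object = first_import is second_import
print(body_runs, same_object)
```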
Yes, you are right, this behavior really exists in Python. It occurs when user code imports the same module in different ways, for example import a and import A.a (where a.py is located in the A package, the first import is done from within the A package, and the other comes from outside it).
This can easily happen in real life, especially for multi-level packaged Python projects.
I have experienced a side effect of this behavior: isinstance does not work as expected when an object is checked against a class defined in a module that was imported this way.
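The effect can be reproduced with a small sketch. The names A_demo, a_mod, and Thing are hypothetical; the key step is that both the package's parent directory and the package directory itself are on sys.path, so the same file is reachable under two module names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    package_dir = Path(root, "A_demo")
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "a_mod.py").write_text("class Thing:\n    pass\n")

    # Both entries on the path: a_mod.py is now importable as a_mod and as A_demo.a_mod.
    sys.path[:0] = [root, str(package_dir)]
    importlib.invalidate_caches()
    try:
        import a_mod
        import A_demo.a_mod
    finally:
        sys.path.remove(root)
        sys.path.remove(str(package_dir))

# Python loaded the file twice, producing two modules and two distinct classes.
same_module = a_mod is A_demo.a_mod
cross_isinstance = isinstance(a_mod.Thing(), A_demo.a_mod.Thing)
print(same_module, cross_isinstance)
```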
The solution I can think of is to redefine the __builtin__.__import__ function to perform its work more intelligently.
I'm trying to import a few libraries into my program (which is a google AppEngine application).
Basically, I'm supposed to put all libraries in the root folder, but I've just created another folder called lib and placed them within that folder. (I've created the __init__.py)
Imports normally work fine using import lib.module or from lib import module, but when I try to import a complete package, for instance a folder named pack1 with various modules in it, by calling from lib.pack1 import *, I get this error in one of the modules that accesses another module statically, i.e. from pack1.mod2 import sth.
What is the easy and clean way to overcome this? Without modifying the libraries themselves.
Edit: Using Python 2.7.
Edit: Error: when using import lib.pack1, I get ImportError: No module named pack1.mod1.
I think that instead of from pack1.mod2 you actually want to say from lib.pack1.mod2.
Edit: and, specifying what version of Python this is would help, since importation semantics have improved gradually over the years!
Edit: Aha! Thank you for your comment; I now understand. You are trying to rename libraries without going inside of them and fixing the fact that their name is now different. The problem is that what you are doing is, unfortunately, impossible. If all libraries used relative imports inside, then you might have some chance of doing it; but, alas, relative imports are both (a) recent and (b) not widely used.
So, if you want to use library p, then you are going to have to put it in your root directory, not inside of lib/p because that creates a library with a different name: lib.p, which is going to badly surprise the library and break it.
But I have two more thoughts.
First, if you are trying to do this to organize your files, and not because you need the import names to be different, then (a) create lib like you are doing but (b) do not put an __init__.py inside! Instead, add the lib directory to your PYTHONPATH or, inside of your program, to sys.path. (Does the GAE let you do something like this? Does it have a PYTHONPATH?)
Second, I am lying when I say this is not possible. Strictly speaking, you could probably do this by adding an entry to sys.meta_path that intercepts all module lookups and tries grabbing them from inside of lib if they exist there. But, yuck.
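The first suggestion (a lib directory without an __init__.py, added to sys.path) can be sketched as follows. The package name pack1_demo and the contents of its modules are assumptions that mirror the layout described in the question; whether App Engine permits modifying sys.path this way is not verified here.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as app_root:
    lib_dir = Path(app_root, "lib")              # note: no lib/__init__.py
    pack_dir = lib_dir / "pack1_demo"            # hypothetical third-party package
    pack_dir.mkdir(parents=True)
    (pack_dir / "__init__.py").write_text("")
    (pack_dir / "mod2.py").write_text("sth = 'from mod2'\n")
    # mod1 refers to its own package by its top-level name, like the unmodified library.
    (pack_dir / "mod1.py").write_text("from pack1_demo.mod2 import sth\n")

    # With lib on the path, the package keeps its original top-level name.
    sys.path.insert(0, str(lib_dir))
    importlib.invalidate_caches()
    try:
        import pack1_demo.mod1

        imported_value = pack1_demo.mod1.sth
        print(imported_value)
    finally:
        sys.path.remove(str(lib_dir))
```

Because the package is still called pack1_demo rather than lib.pack1_demo, its internal absolute imports resolve without any changes to the library.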