Python modules confusion

I started feeling a bit confused about how Python modules work when I looked at the PyMySQL repository, see here: https://github.com/PyMySQL/PyMySQL?files=1
1) Why is there no pymysql.py file, given that the library is imported like import pymysql? Isn't such a file required?
2) I cannot find the method connect, used like pymysql.connect(...), anywhere. Is it possible to rename exported methods somehow?

There's a directory pymysql there. A directory can also be imported as a module*, with the advantage that it can contain submodules. Classically, there's a __init__.py file in the directory that controls what's in the top-level pymysql.* namespace.
So, the connect method you're missing will either be defined directly in pymysql/__init__.py, or defined in one of its siblings in that directory, and then imported from there by pymysql/__init__.py.
*Strictly speaking, a directory that you import like a module is really called a "package". I like to avoid that term—it's potentially confusing because the term is overloaded: what you install with pip is also called a "package" in sense 2, and that might actually contain multiple "packages" in sense 1.
See What is __init__.py for? and the official docs
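For illustration, here is a minimal sketch of how a package's __init__.py can re-export a name from a submodule. The file name connections.py and the function body are assumptions for the example; PyMySQL's actual layout may differ in detail:

    pymysql/
        __init__.py
        connections.py              # hypothetical submodule

    # pymysql/connections.py
    def connect(*args, **kwargs):
        ...                         # real connection logic would live here

    # pymysql/__init__.py
    from .connections import connect   # re-export, so pymysql.connect(...) works

With that in place, import pymysql gives you a module object whose connect attribute is the function defined in the submodule.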

Related

Python package cannot be imported

I have a Python package called util. I would like to import a script _util.py from this package into another script in a scripts folder.
Even though the util package has an empty __init__.py file, it shows up as a normal directory rather than a Python package (the folder icon lacks the small package dot).
How can I import this module?
A preliminary answer to your question is that modules or methods beginning with underscores are meant to be used internally.
_my_method() should only be referenced from within the module holding it
_my_module() should only be referenced from within the package holding it
That being said, this convention is meant to be a hint to other developers, not a strict prohibition. Perhaps the first step you can take to solve the import issue is to rename _util.py to util.py and proceed from there.
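As an illustrative sketch of that convention (module and function names here are made up):

    util/
        __init__.py
        util.py           # public module
        _helpers.py       # leading underscore: intended for use inside the package only

    # some other script, assuming the directory containing util/ is on sys.path
    from util.util import do_something         # fine: public module
    from util._helpers import _internal_bit    # works, but the underscore warns you off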

What does `__import__('pkg_resources').declare_namespace(__name__)` do?

In the __init__.py files of some packages I saw this single line:
__import__('pkg_resources').declare_namespace(__name__)
What does it do, and why do people use it? I suppose it's related to dynamic importing and creating a namespace at runtime.
It boils down to two things:
__import__ is a Python function that will import a package using a string as the name of the package. It returns the module object, so foo = __import__('bar') will import a package named bar and store a reference to its module object in a local variable foo.
From setuptools' pkg_resources documentation, declare_namespace() "Declare[s] that the dotted package name name is a "namespace package" whose contained packages and modules may be spread across multiple distributions."
So __import__('pkg_resources').declare_namespace(__name__) imports the pkg_resources package into a temporary and calls the declare_namespace function on it (the __import__ function is likely used rather than the import statement so that no extra name bound to pkg_resources is left behind). If this code were in my_namespace/__init__.py, then __name__ is 'my_namespace' and this module is included in the my_namespace namespace package.
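As a small illustration of the first point (using the standard-library json module purely as an example):

    import json
    mod = __import__('json')    # same module object, obtained from a string name
    assert mod is json

    # The one-liner pattern calls a function on the returned module without
    # ever binding a name such as "json" in the current namespace:
    __import__('json').dumps({'answer': 42})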
See the setuptools documentation for more details.
See this question for discussion on the older mechanism for achieving the same effect.
See PEP 420 for the standardized mechanism that provides similar functionality beginning with Python 3.3.
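For comparison, a minimal sketch of the PEP 420 style (Python 3.3+): the namespace directory simply has no __init__.py, and separately installed pieces merge under one name. my_namespace is the example name from above; part_a and part_b are made up:

    # location 1 on sys.path:          # location 2 on sys.path:
    #   my_namespace/                  #   my_namespace/
    #       part_a.py                  #       part_b.py
    #   (no __init__.py anywhere)      #   (no __init__.py anywhere)

    import my_namespace.part_a
    import my_namespace.part_b         # both work, spread across the two locations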
This is a way to declare so-called "namespace packages" in Python.
What are these, and what problem do they solve?
Imagine you distribute a software product which has a lot of functionality, and not all people want all of it, so you split it into pieces and ship them as optional plugins.
You want people to be able to do
import your_project.plugins.plugin1
import your_project.plugins.plugin2
...
Which is fine if your directory structure is exactly as above, namely:

    your_project/
        __init__.py
        plugins/
            __init__.py
            plugin1.py
            plugin2.py
But what if you ship those two plugins as separate Python packages, so they are located in two different directories? Then you might want to put __import__('pkg_resources').declare_namespace(__name__) in each package's __init__.py so that Python knows those packages are part of a bigger "namespace package", in our case your_project.plugins.
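As a rough sketch of what that looks like on disk (each distribution ships its own copy of the skeleton, and every shared __init__.py contains only the declare_namespace line):

    # Distribution A ships:
    your_project/
        __init__.py         # __import__('pkg_resources').declare_namespace(__name__)
        plugins/
            __init__.py     # __import__('pkg_resources').declare_namespace(__name__)
            plugin1.py

    # Distribution B ships the same skeleton, but with plugin2.py:
    your_project/
        __init__.py
        plugins/
            __init__.py
            plugin2.py

Once both are installed, Python merges the two directories into a single your_project.plugins namespace, so both import lines above work.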
Please refer to the documentation for more info.

Importing nested modules in Python

I'm trying to import a few libraries into my program (which is a Google App Engine application).
Basically, I'm supposed to put all libraries in the root folder, but I've created another folder called lib and placed them in it (I've created the __init__.py).
Imports generally work fine using import lib.module or from lib import module, but when I try to import a complete package, for instance a folder named pack1 with various modules in it, by calling from lib.pack1 import *, I get an error in one of the modules that imports another module absolutely, i.e. from pack1.mod2 import sth.
What is the easy and clean way to overcome this? Without modifying the libraries themselves.
Edit: Using Python 2.7.
Edit: Error: when using import lib.pack1, I get ImportError: No module named pack1.mod1.
I think that instead of from pack1.mod2 you actually want to say from lib.pack1.mod2.
Edit: and, specifying what version of Python this is would help, since importation semantics have improved gradually over the years!
Edit: Aha! Thank you for your comment; I now understand. You are trying to rename libraries without going inside of them and fixing the fact that their name is now different. The problem is that what you are doing is, unfortunately, impossible. If all libraries used relative imports inside, then you might have some chance of doing it; but, alas, relative imports are both (a) recent and (b) not widely used.
So, if you want to use library p, then you are going to have to put it in your root directory, not inside of lib/p because that creates a library with a different name: lib.p, which is going to badly surprise the library and break it.
But I have two more thoughts.
First, if you are trying to do this to organize your files, and not because you need the import names to be different, then (a) create lib like you are doing but (b) do not put an __init__.py inside! Instead, add the lib directory to your PYTHONPATH or, inside of your program, to sys.path. (Does the GAE let you do something like this? Does it have a PYTHONPATH?)
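A minimal sketch of the sys.path approach, assuming it runs in the application's entry-point module before the libraries are imported:

    import os
    import sys

    # Make the bare "lib" directory (no __init__.py inside it) an extra search
    # location, so the bundled libraries resolve under their own names.
    sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'lib'))

    import pack1                   # now found inside lib/
    from pack1.mod2 import sth     # the library's own absolute imports work too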
Second, I am lying when I say this is not possible. Strictly speaking, you could probably do this by adding an entry to sys.meta_path that intercepts all module lookups and tries grabbing them from inside of lib if they exist there. But, yuck.

python module layout

I'm just getting to the point in my Python projects where I need to start using multiple packages, and I'm a little confused about exactly how everything is supposed to work together. What exactly should go into the __init__.py of a package? Some projects I see just have blank __init__.py files and all of their code is in modules in that package. Other projects implement what seems to be the majority of the package's classes and functions inside the __init__.py.
Is there a document or style guide or something that describes what the python authors had in mind for the use of packages and the __init__ file and such?
Edit:
I know the point of having the __init__.py file, in the simplest sense, is that it makes a folder a package. But why would I put a function there instead of in a module in that same folder (package)?
__init__.py can be empty, but what it really does is make sure Python treats your directory as a package, provide any initialization your package might need when it is imported (configuring the environment or something along those lines), and define __all__ so that Python knows what to do when someone uses from package import *.
Most everything you need to know is described in the docs on Packages. Dive Into Python also has a piece on packaging.
You already know, I guess, that __init__.py files are required to make Python treat the directories as containing packages.
In the simplest case, __init__.py can remain empty.
You can also execute initialization code for the package there.
You can also set the __all__ variable.
[Edit] When you do from package import *, the variable __all__ controls which submodules are imported.
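A minimal sketch of both uses (the package and module names are made up for the example):

    sound/
        __init__.py
        effects.py
        filters.py

    # sound/__init__.py
    print("initializing the sound package")   # runs once, on first import of sound

    __all__ = ["effects"]   # "from sound import *" will then import only sound.effects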
See: http://docs.python.org/tutorial/modules.html

Deploying a python application with shared package

I'm thinking about how to arrange a deployed Python application, which will have:
1) an executable script located in /usr/bin/ which will provide a CLI to functionality implemented in
2) a library installed to wherever the current site-packages directory is.
Now, currently, I have the following directory structure in my sources:
    foo.py
    foo/
        __init__.py
        ...
which I guess is not the best way to do things. During development everything works as expected; however, when deployed, the "from foo import FooObject" code in foo.py seemingly attempts to import foo.py itself, which is not the behaviour I'm looking for.
So the question is: what is the standard practice for handling situations like this? One of the things I could think of is, when installing, renaming foo.py to just foo, which stops it from importing itself, but that seems rather awkward...
Another part of the problem, I suppose, is that it's a naming challenge. Perhaps call the executable script foo-bin.py?
This article is pretty good, and shows you a good way to do it. The second item from the Do list answers your question.
shameless copy paste:
Filesystem structure of a Python project
by Jp Calderone
Do:
- Name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
- Create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your project.
- If your project is expressible as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
- Put your unit tests in a sub-package of your package (note: this means that the single Python source file option above was a trick; you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
- Add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.
Don't:
- Put your source in a directory called src or lib. This makes it hard to run without installing.
- Put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
- Create a package that only has an __init__.py and then put all your code into __init__.py. Just make a module instead of a package; it's simpler.
- Try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
Distutils supports installing modules, packages, and scripts. If you create a distutils setup.py which refers to foo as a package and foo.py as a script, then foo.py should get installed to /usr/local/bin (or whatever the appropriate script install path is on the target OS), and the foo package should get installed to the site-packages directory.
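As a rough sketch of such a setup.py (distutils-era style; the script path scripts/foo is an assumption, and it follows the advice below to drop the .py extension):

    # setup.py
    from distutils.core import setup

    setup(
        name='foo',
        version='0.1',
        packages=['foo'],         # the foo/ package is installed into site-packages
        scripts=['scripts/foo'],  # the CLI script goes into the platform's bin directory
    )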
You should call the executable just foo, not foo.py; then attempts to import foo will not pick up the script.
As for naming it properly: this is difficult to answer in the abstract; we would need to know what specifically it does. For example, if it configures and controls, a name like foo-config or fooctl might be appropriate. If it is a shell API for the library, it should have the same name as the library.
Your CLI module is one thing, the package that supports it is another thing. Don't confuse the names: the module foo (in a file foo.py) and the package foo (in a directory foo with a file __init__.py).
You have two things named foo: a module and a package. What else do you want to name foo? A class? A function? A variable?
Pick a distinctive name for the foo module or the foo package. foolib, for example, is a popular package name.
