I am trying to understand the lower level implementations of python 3. There is one module named _posixsubprocess used by the subprocess module. I tried to find the location of this module in my system and found that it's a stub file.
Could someone guide me? I have no idea what stub files are or how they are implemented at the lower level.
_posixsubprocess
The file you are referencing is a Python module written in C. It's not a "stub" file. The real implementation can be found in the stdlib at Modules/_posixsubprocess.c. You can see how a C/C++ extension is written by having a look at Building C and C++ Extensions, which should help you understand the code in _posixsubprocess.c.
Because that file is an "extension module" (it is written in C), its type hints are provided in a separate "stub" file with the extension .pyi.
That stub can be found in typeshed, which is a collection of stub files. Typeshed also contains stubs for third-party modules, but that is a historical remnant: it is no longer needed since PEP 561 was adopted.
Concerning stub/pyi files
Stub files contain type-hinting information for normal Python modules. The full official documentation can be found in the section about stub files in PEP 484.
For example, if you have a Python module mymodule.py like this:
def myfunction(name):
    return "Hello " + name
Then you can add type-hints via a stub-file mymodule.pyi. Note that here the ellipsis (...) is part of the syntax, so the code-block below really shows the complete file contents:
def myfunction(name: str) -> str: ...
They look very similar to C header files in that they contain only the function signatures, but their use is purely optional.
You can also add type hints directly in the .py module like the following:
def myfunction(name: str) -> str:
    return "Hello " + name
But there are some cases where you want to keep them separate in stubs:
You want to keep your code Python 2 compatible and don't like the # type: ... comment syntax (shown in the sketch after this list)
You use function annotations for something else but still want to use type-hints
You are adding type-hints into an existing code-base and want to keep code-churn in existing files minimal
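For reference, the comment-based annotation syntax mentioned in the first point looks like this (a minimal sketch reusing the mymodule.py example from above):
def myfunction(name):
    # type: (str) -> str
    return "Hello " + name
Because the annotation lives in a comment rather than in the function signature, this form is valid syntax on both Python 2 and Python 3 and is understood by type checkers such as mypy.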
Related
When calling the Python help() function on a package I get the following things:
The content of the docstring specified in __init__.py
A list of the package contents containing all modules
The version of the package
The path of __init__.py
As a provider of customer-oriented software, I would like to restrict this output to the relevant information, i.e. not showing those modules that are not meant to be used by the customer.
So, if my package P contains modules A, B and _c, where A and B are meant to be used as a public interface and _c provides just some utility functionality for A and B then I would like to limit the output of help(P) to:
Help on package P:
Some descriptive text.
PACKAGE CONTENTS
    A
    B

VERSION
    1.0

FILE
    /path/to/P/__init__.py
When trying to achieve something similar for modules, I can define my own __dir__() function, which interestingly is respected by help(module). But trying to apply the same approach to a package (meaning: defining __dir__() in __init__.py) doesn't have the desired effect.
The output of help() comes from the pydoc module.
Looking at the relevant source, the PACKAGE CONTENTS section is generated by pkgutil.iter_modules(), and there's no hook I can see that could hide some modules from there (aside from monkey-patching the whole function to hide the modules you want when the stack indicates it's being called from pydoc, but I would strongly advise against it unless this is an app you're in full control of).
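As a quick way to see what pydoc will put under PACKAGE CONTENTS, you can call pkgutil.iter_modules() yourself; a minimal sketch, using the hypothetical package P from the question:
import pkgutil

import P  # the package with submodules A, B and _c from the question

# pydoc effectively iterates over this, so every submodule found here,
# including the underscore-prefixed _c, ends up in the help() output.
for module_info in pkgutil.iter_modules(P.__path__):
    print(module_info.name)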
You might use the __all__ magic variable to provide the list of names that should be described by help(). Let mymodule.py be a file with this content:
__all__ = ['func1', 'func2']

def func1():
    '''first function for user'''
    return 1

def func2():
    '''second function for user'''
    return 2

def func3():
    '''function for internal usage'''
    return 3
then
import mymodule
help(mymodule)
gives output
Help on module mymodule:
NAME
    mymodule

FUNCTIONS
    func1()
        first function for user

    func2()
        second function for user

DATA
    __all__ = ['func1', 'func2']

FILE
    <path to mymodule.py here>
Let's say the name of the module is available in the form of a string rather than a module object. How do I locate its source code and load the abstract syntax tree (if the source code is present)?
I'd take the problem in three steps:
Import the module by name. This should be relatively easy using importlib.import_module, though you could bodge up your own version with the builtin __import__ if you needed to.
Get the source code for the module. Using inspect.getsource is probably the easiest way (but you could also just try open(the_module.__file__).read() and it is likely to work).
Parse the source into an AST. This should be easy with ast.parse. Even for this step, the library isn't essential, as you can use the builtin compile instead, as long as you pass the appropriate flag (ast.PyCF_ONLY_AST appears to be 1024 on my system, so compile(source, filename, 'exec', 1024) should work).
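Putting the three steps together, a minimal sketch (the module name passed in at the end is just an example):
import ast
import importlib
import inspect

def load_module_ast(module_name):
    # Step 1: import the module by its dotted name
    module = importlib.import_module(module_name)
    # Step 2: fetch its source; this raises TypeError or OSError for modules
    # that have no Python source, e.g. C extension modules
    source = inspect.getsource(module)
    # Step 3: parse the source into an abstract syntax tree
    return ast.parse(source, filename=getattr(module, '__file__', '<unknown>'))

tree = load_module_ast('json.tool')
print(type(tree))  # <class 'ast.Module'>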
I have defined several classes in a single Python file. My wish is to create a library with these. I would ideally like to import the library in such a way that I can use the classes without a prefix (i.e. just myclass() rather than mylibrary.myclass()), if "prefix" is even the right word for it; I am not entirely sure, as I am a beginner.
What is the proper way to achieve this, or otherwise the best approach? Define all the classes in __init__.py? Define them all in a single file as I currently have, like AllMyClasses.py? Or should I have a separate file for every class in the library directory, like FirstClass.py, SecondClass.py etc.?
I realize this is a question that should be easy enough to google, but since I am still quite new to Python and programming in general, I haven't quite figured out what the correct keywords are for a problem in this context (such as my uncertainty about "prefix").
More information can be found in the tutorial on modules (single files) or packages (when in a directory with an __init__.py file) on the Python site.
The suggested way (according to the style guide) is to spell out each class import specifically.
from my_module import MyClass1, MyClass2
object1 = MyClass1()
object2 = MyClass2()
While you can also shorten the module name:
import my_module as mo
object = mo.MyClass1()
Using from my_module import * is best avoided, as it can be confusing (even if it is the recommended way for some things, like tkinter).
If it's for your personal use, you can just put all your classes Class1, Class2, ... in a file myFile.py and, to use them, call import myFile (without the .py extension)
import myFile
myVar1 = myFile.Class1()
myVar2 = myFile.Class2()
from within another script. If you want to be able to use the classes without the file name prefix, import the file like this:
from myFile import *
Note that the file you want to import should be in a directory where Python can find it (the same where the script is running or a directory in PYTHONPATH).
An __init__.py file is needed if you want to create a Python package for distribution. Here are the instructions: Distributing Python Modules
EDIT after checking Python's style guide PEP 8 on imports:
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
So in this example you should have used
from myFile import Class1, Class2
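If the single file later grows into a package directory, one common layout (all names here are hypothetical) is to keep one module per class and re-export the public classes from __init__.py, so callers can still write from mylibrary import FirstClass:
# mylibrary/__init__.py
# Re-export the public classes so callers can write:
#     from mylibrary import FirstClass, SecondClass
from mylibrary.first_class import FirstClass
from mylibrary.second_class import SecondClass

__all__ = ['FirstClass', 'SecondClass']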
In a Python package directory of my own creation, I have an __init__.py file that says:
from _foo import *
In the same directory there is a _foomodule.so which is loaded by the above. The shared library is implemented in C++ (using Boost Python). This lets me say:
import foo
print foo.MyCppClass
This works, but with a quirk: the class is known to Python by the full package path, which makes it print this:
foo._foo.MyCppClass
So while MyCppClass exists as an alias in foo, foo.MyCppClass is not its canonical name. In addition to being a bit ugly, this also makes help() a bit lame: help(foo) will say that foo contains a module _foo, and only if you say help(foo._foo) do you get the documentation for MyCppClass.
Is there something I can do differently in __init__.py or otherwise to make it so Python sees foo.MyCppClass as the canonical name?
I'm using Python 2.7; it would be great if the solution worked on 2.6 as well.
I had the same problem. You can change the module name in your Boost.Python definition:
BOOST_PYTHON_MODULE(_foo)
{
    scope().attr("__name__") = "foo";
    ...
}
The help issue is a separate problem. I think you need to add each item to __all__ to get it exported to help.
When I do both of these, the name of foo.MyCppClass is just that -- foo.MyCppClass -- and help(foo) gives documentation for MyCppClass.
You can solve the help() problem by adding the line
__all__ = ['MyCppClass']
to your __init__.py file.
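Putting both suggestions together, a minimal sketch of the resulting __init__.py (assuming the extension module only exposes MyCppClass):
# foo/__init__.py
from _foo import *  # Python 2 implicit relative import of the Boost.Python module

# As described above, this makes help(foo) document MyCppClass directly
# and limits what "from foo import *" pulls in.
__all__ = ['MyCppClass']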