Python internal implementation [duplicate] - python

I wanted to look up the source of some of the modules in the Python standard library, but wasn't able to find them. I tried looking in the Modules directory after downloading the Python tarball, but it has mainly .c files. I also tried looking at the directory where the Python that comes with the OS (Mac OS X) keeps its modules, and there it seems to have mainly .pyc and .pyo files. I would really appreciate it if someone could help me out.
(I tried what was suggested in the question How do I find the location of Python module sources? with no luck)

In CPython, many modules are implemented in C rather than in Python. You can find those in Modules/, whereas the pure Python ones reside in Lib/.
In some cases (for example the json module), the Python source code provides the module on its own and only uses the C module if it's available (to improve performance). For the remaining modules, you can have a look at PyPy's implementations.
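To see this fallback on your own installation, here is a minimal sketch. It assumes CPython 3, where json.decoder exposes the names c_scanstring and py_scanstring; these are implementation details rather than documented API, so treat the check as illustrative only.

```python
import json
import json.decoder as decoder


def json_accelerator_in_use():
    """Return True if json is using the C scanner from the _json module."""
    # json/decoder.py tries `from _json import scanstring as c_scanstring`
    # and sets c_scanstring to None when the C module is unavailable.
    return decoder.c_scanstring is not None


# Path of the pure-Python part of the package (Lib/json/ in the source tree).
print(json.__file__)
print("C accelerator in use:", json_accelerator_in_use())
```

The pure-Python implementation stays importable either way (here as decoder.py_scanstring), which is why reading Lib/json/ is enough to understand the module's behaviour.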

The canonical repository for CPython is this Mercurial repository. There is also a git mirror on GitHub.

That would depend on what you define as Standard Library.
The Python documentation says:
...this library reference manual describes the standard library that is
distributed with Python. It also describes some of the optional
components that are commonly included in Python distributions.
Python’s standard library is very extensive, offering a wide range of
facilities as indicated by the long table of contents listed below.
The library contains built-in modules (written in C) that provide
access to system functionality such as file I/O that would otherwise
be inaccessible to Python programmers, as well as modules written in
Python that provide standardized solutions for many problems that
occur in everyday programming. Some of these modules are explicitly
designed to encourage and enhance the portability of Python programs
by abstracting away platform-specifics into platform-neutral APIs.
If you take a broad definition, the Python documentation explicitly answers what you're asking for, and I quote:
Exploring CPython’s Internals.
CPython Source Code Layout.
This guide gives an overview of CPython’s code structure. It serves as a summary of file locations for modules and builtins.
For Python modules, the typical layout is:
Lib/<module>.py
Modules/_<module>.c
Lib/test/test_<module>.py
Doc/library/<module>.rst
For extension-only modules, the typical layout is:
Modules/<module>module.c
Lib/test/test_<module>.py
Doc/library/<module>.rst
For builtin types, the typical layout is:
Objects/<builtin>object.c
Lib/test/test_<builtin>.py
Doc/library/stdtypes.rst
For builtin functions, the typical layout is:
Python/bltinmodule.c
Lib/test/test_builtin.py
Doc/library/functions.rst
Some exceptions:
builtin type int is at Objects/longobject.c
builtin type str is at Objects/unicodeobject.c
builtin module sys is at Python/sysmodule.c
builtin module marshal is at Python/marshal.c
Windows-only module winreg is at PC/winreg.c
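The layout quoted above can be turned into a small lookup helper. This is only a sketch: the function name candidate_paths and the kind labels are made up for illustration, and the returned paths are guesses based on the typical layout, not a guarantee that the file exists in a given CPython version.

```python
# Exceptions listed in the guide quoted above.
EXCEPTIONS = {
    "int": "Objects/longobject.c",
    "str": "Objects/unicodeobject.c",
    "sys": "Python/sysmodule.c",
    "marshal": "Python/marshal.c",
    "winreg": "PC/winreg.c",
}


def candidate_paths(name, kind="module"):
    """Return likely source paths, relative to a CPython checkout.

    kind is one of "module", "extension", "type" or "function"
    (hypothetical labels chosen for this sketch).
    """
    if name in EXCEPTIONS:
        return [EXCEPTIONS[name]]
    if kind == "module":
        return ["Lib/%s.py" % name, "Modules/_%s.c" % name]
    if kind == "extension":
        return ["Modules/%smodule.c" % name]
    if kind == "type":
        return ["Objects/%sobject.c" % name]
    if kind == "function":
        return ["Python/bltinmodule.c"]
    raise ValueError("unknown kind: %r" % kind)


print(candidate_paths("heapq"))             # pure Python plus optional C helper
print(candidate_paths("math", "extension"))
print(candidate_paths("int", "type"))       # one of the listed exceptions
```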

You can get the source code of the pure Python modules that are part of the
standard library from the location where Python is installed.
For example, it is at C:\Python27\Lib on Windows if you
used the Windows installer to install Python.

Look it up under the Lib sub-directory of the Python installation directory.
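If you are not sure where the installation directory is, the interpreter can report it. A minimal sketch using the standard sysconfig module; the helper names stdlib_dir and list_stdlib_sources are made up for this example.

```python
import os
import sysconfig


def stdlib_dir():
    """Return the directory holding the pure-Python standard library."""
    return sysconfig.get_path("stdlib")


def list_stdlib_sources(limit=10):
    """Return the first few top-level .py files in the stdlib directory."""
    names = sorted(n for n in os.listdir(stdlib_dir()) if n.endswith(".py"))
    return names[:limit]


print(stdlib_dir())
print(list_stdlib_sources())
```

On a typical installation this prints the Lib directory (or lib/pythonX.Y on Unix-like systems) followed by a handful of module file names.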

The source code for many standard library packages is linked at the top of the package's documentation page in the library documentation, for example, the docs for the random module.
The original commit message for adding these links states
Provide links to Python source where the code is short, readable and
informative adjunct to the docs.

Related

What is "Pure Python?"

Many questions on Stack Overflow refer to "Pure Python" (some random examples from the "similar questions" list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).
I also encounter the concept elsewhere on the web, e.g. in the package documentation for imageio and in tutorials such as "An introduction to Pure Python".
This has led me to believe there must be some universally accepted standard definition of what "Pure Python" is.
However, despite googling to the limits of my ability, I have not yet been able to locate this definition.
Is there a universally accepted definition of "Pure Python," or is this just some elusive concept that means different things to different people?
To be clear, I am asking: Does such a definition exist, yes or no, and if so, what is the acclaimed source? Although I truly appreciate all comments and answers, I am not looking for personal interpretations.
In that imageio package, they mean it's all implemented in Python, and not (as is sometimes done) with parts written in C or other languages. As a result it's guaranteed to work on any system that Python works on.
In that tutorial, it means the Python you get when you download and install Python -- the language and the standard libraries, not any external modules. The chapter after that adds some external libraries, like numpy and scipy, that are used a lot but aren't part of the standard library.
So they mean different things there already.
A "pure-Python" package is a package that only contains Python code, and doesn't include, say, C extensions or code in other languages. You only need a Python interpreter and the Python Standard Library to run a pure-Python package, and it doesn't matter what your OS or platform is.
Pure-Python packages can import or depend on non-pure-Python packages:
Package X contains only Python code and is a pure-Python package.
Package Y contains Python and C code and isn't a pure-Python package.
Package Z imports Package Y, but Package Z is still a pure-Python package.
A good rule of thumb: If you can make a source distribution ("sdist") of your package and it doesn't include any non-Python code, it is a pure-Python package.
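That rule of thumb can be approximated with a quick scan of an unpacked source tree. This is a rough sketch, not an official check: the suffix list is an assumption you would adapt to your project, and the function names are made up for this example.

```python
import os

# Assumed list of suffixes that signal non-Python sources or compiled
# extensions; extend or trim it to suit your project.
NON_PYTHON_SUFFIXES = (".c", ".h", ".cpp", ".pyx", ".so", ".pyd", ".rs", ".f90")


def non_python_files(root):
    """Return all files under root whose suffix suggests non-Python code."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(NON_PYTHON_SUFFIXES):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def looks_pure_python(root):
    """Return True if no file under root looks like non-Python code."""
    return not non_python_files(root)
```

Calling looks_pure_python on the directory of an unpacked sdist gives a first approximation; it says nothing about the package's dependencies, which, as noted above, do not affect whether the package itself is pure Python.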
Pure-Python packages aren't restricted to just the Python Standard Library; packages can import modules from outside the Python Standard Library and still be considered pure-Python.
Additionally, a standalone module is a single .py file that only imports modules from the Python Standard Library. A standalone module is necessarily a pure-Python module.
Note that in Python, package technically refers to a folder with an __init__.py file in it. The things you download and install from PyPI with pip are distributions (such as "source distribution" or "sdist"), though the term "package" is also used as a synonym with "distribution", since that term could be confused with the "Linux distro" usage of the word.
Is there an official definition for "pure-Python"? As of this writing, no, though the Python Packaging User Guide makes heavy use of the term in https://packaging.python.org/overview/
Unfortunately, it seems there is no standardized, formalized definition.
As a Python programmer for almost two decades, my definition of pure Python is: a Python package that implements its core logic only in Python statements, and that only requires pure Python or native packages. This is a recursive definition, so at the far end of your package's dependency tree you end up with Python packages that only require native Python libraries and functions. With this approach, the whole chain of code that accomplishes the tool's main objective can be read and modified using only Python, with no other programming language or tool besides the CPython interpreter.
This can't be overstated: "pure Python" is better understood as an objective, that of being entirely readable and modifiable in the Python language, than as a state. It's similar to what the sibling language Julia aims for, but Julia extends the idea further down the stack, since much of its own implementation is written in Julia. You can say that Julia is "pure" by design, whereas CPython is not (because the CPython interpreter is written in C), but you can still write "pure Python" packages, just like you can write "pure PHP" or "pure Ruby" packages that do not require any package written in another language at any point in the program's logic.

Modular Compiler in Python

I am writing a compiler in Python, using the PLY (Python Lex-Yacc) library to 'compile' the compiler. The compiler has to go through a lot of rules (the number of just the core rules is eventually going to be a little less than a hundred, and they can be extended). So to keep the different types of rules separate, I made many Python modules in a single modules directory.
To include all the rules, I don't have to include the modules in this directory, but I have to include the rules (implemented as Python functions) into the current namespace. Once they simply exist there, the compiler's input will be properly tokenized, parsed, etc.
Here's what I've read about and tried:
using __import__, getattr, and sys.modules (very raw and in general not preferred)
the importlib library (how do I get everything inside the module?)
a lot of fiddling with __init__.py and just trying to from modules import * which will import everything in the modules as well
But none of these seem entirely satisfactory to me. I can't do precisely what I want to do with any of them. So my question is: how can I import some of the attributes of a Python module in a subdirectory into the running namespace of a top-level module?
Thanks for your attention!
You want to use an existing plugin library like stevedore. It will give you the tools to enumerate files that can be imported, and tools to import those modules.
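If you would rather not add a dependency, the same effect can be sketched with the standard library alone. The example below is an assumption-laden sketch, not the stevedore approach: it assumes the rule modules live in an importable package (the name "modules" in the comment is hypothetical) and that rules follow PLY's naming convention of t_ and p_ prefixes. The function name collect_rules is made up for this example.

```python
import importlib
import pkgutil


def collect_rules(package_name, prefixes=("p_", "t_")):
    """Import every module in a package and gather matching attributes."""
    package = importlib.import_module(package_name)
    rules = {}
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(package_name + "." + info.name)
        for name, value in vars(module).items():
            # str.startswith accepts a tuple of prefixes.
            if name.startswith(prefixes):
                rules[name] = value
    return rules


# In the top-level grammar module (hypothetical package name "modules"):
# globals().update(collect_rules("modules"))
```

Later modules silently overwrite earlier ones if two rules share a name, so a real implementation would probably want to detect such collisions.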

Implementation of built-in python modules

I have just started learning Python. I want to understand how some functions work and how modules are organized. How can I read the implementations of built-in modules?
Where the source code of the Python standard library is located will depend on your operating system and on how you installed Python. However, the following locations are common:
Windows - C:\Python27\Lib
Linux - /usr/lib/python2.7/
Note that some builtins such as the math module are missing -- that's because those builtins are written in C and are baked directly into the interpreter for purposes of speed.
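Whether a given module is compiled into the interpreter or shipped as a separate extension file depends on the platform and build, so it can help to ask the interpreter directly. A minimal sketch for Python 3; the function name classify_module and its return labels are made up for this example.

```python
import importlib.machinery
import importlib.util
import sys


def classify_module(name):
    """Roughly classify a top-level module by where its code comes from."""
    if name in sys.builtin_module_names:
        return "built-in"            # compiled into the interpreter itself
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "C extension"         # separate shared library (.so / .pyd)
    if origin.endswith(".py") or origin == "frozen":
        # Frozen modules are Python code embedded in the interpreter binary.
        return "Python source"
    return "unknown"


for name in ("sys", "math", "json"):
    print(name, "->", classify_module(name))
```

The math module may show up as either "built-in" or "C extension" depending on how your Python was built; in both cases its source is C code in the CPython repository rather than a .py file.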
You should also consider taking a look at the source code for some popular 3rd party libraries. They'll vary in quality, but might be worth examining. Here's a list to help you get started.
There are many implementations of Python, such as CPython, IronPython, PyPy, and Jython. The most commonly used one is CPython. Its source code can be found at hg.python.org.
Your installation also contains source code. For example, to find the source code associated with the collections module, type the following in an interactive session:
>>> import collections
>>> collections
<module 'collections' from '/usr/lib/python2.7/collections.pyc'>
Thus you would look in '/usr/lib/python2.7/collections.py' for the source code associated with the collections module. (Note that you should remove the c in pyc from the path. The .py file is Python source code, the .pyc is byte code.)
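The same lookup can be done without editing the path by hand. A small sketch using the standard inspect module; the helper name source_path is made up for this example.

```python
import importlib
import inspect


def source_path(module_name):
    """Return the .py file for a module, or None if it has no Python source."""
    module = importlib.import_module(module_name)
    try:
        return inspect.getsourcefile(module)
    except TypeError:
        # Raised for built-in modules, which have no source file on disk.
        return None


print(source_path("collections"))
print(source_path("sys"))
```

For a pure Python module this prints the path of the .py file; for modules implemented in C it prints None, which is the cue to look in the CPython source tree instead.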
A clean way to read this code is in the Python Mercurial repository, or in the Git mirror. (I personally find the Git mirror easier to use, but both are equally good for learning the code.)
The Git repository is at https://github.com/python/cpython/tree/2.7
The Mercurial repository is at http://hg.python.org/cpython (click "branches", then click "2.7", then click "browse")
In both of these repositories, the Lib folder is the Python standard library.
