I have just started learning Python. I want to understand how some functions work and how modules are organized. How can I read the implementations of built-in modules?
Where the source code of the Python standard library is located will depend on your operating system and on how you installed Python. However, the following locations are common:
Windows - C:\Python27\Lib
Linux - /usr/lib/python2.7/
Note that some modules, such as math, are missing from that directory. That's because they are written in C and are either compiled directly into the interpreter or shipped as compiled extension modules, for speed.
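If you want to check whether a given module is compiled into the interpreter itself, a minimal sketch looks like the following. It relies on sys.builtin_module_names from the standard library; the helper name is_compiled_into_interpreter is made up for illustration. Which modules appear in that tuple depends on the platform and on how Python was built, so math may or may not be listed on your machine.

```python
import sys


def is_compiled_into_interpreter(module_name):
    """Return True if the module is compiled into the interpreter binary.

    The helper name is hypothetical; sys.builtin_module_names is the real
    tuple of module names that are built into this interpreter.
    """
    return module_name in sys.builtin_module_names


# 'sys' is always built in; 'collections' lives in the Lib directory.
for name in ("sys", "collections", "math"):
    print(name, is_compiled_into_interpreter(name))
```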
You should also consider taking a look at the source code for some popular 3rd party libraries. They'll vary in quality, but might be worth examining. Here's a list to help you get started.
There are many implementations of Python, such as CPython, IronPython, PyPy, and Jython. The most commonly used one is CPython. Its source code can be found at hg.python.org (development has since moved to GitHub, at https://github.com/python/cpython).
Your installation also contains source code. For example, to find the source code associated with the collections module, type the following in an interactive session:
>>> import collections
>>> collections
<module 'collections' from '/usr/lib/python2.7/collections.pyc'>
Thus you would look in '/usr/lib/python2.7/collections.py' for the source code associated with the collections module. (Note that you should remove the c in pyc from the path. The .py file is Python source code, the .pyc is byte code.)
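The same lookup can be done programmatically. Here is a small sketch using inspect.getsourcefile, which returns the path of the .py file directly so there is no trailing c to strip. The function name find_source_file is my own, and returning None for modules without Python source is a choice made for this example.

```python
import importlib
import inspect


def find_source_file(module_name):
    """Return the path to a module's .py file, or None if it has none.

    find_source_file is a hypothetical helper name. Modules written in C
    have no Python source, so None is returned for them.
    """
    module = importlib.import_module(module_name)
    try:
        return inspect.getsourcefile(module)
    except (TypeError, OSError):
        # Raised for modules compiled into the interpreter, such as sys.
        return None


print(find_source_file("collections"))
print(find_source_file("sys"))
```

On recent Python versions collections is a package, so the path printed for it ends in collections/__init__.py rather than collections.py.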
A clean way to read this code is in the Python Mercurial repository, or in the Git mirror. (I personally find the Git mirror easier to use, but both are equally good for learning the code.)
The Git repository is at https://github.com/python/cpython/tree/2.7
The Mercurial repository is at http://hg.python.org/cpython (click "branches", then click "2.7", then click "browse")
In both of these repositories, the Lib folder is the Python standard library.
I wanted to look up the source of some of the modules in the Python standard library, but wasn't able to find them. I tried looking in the Modules directory after downloading the Python tarball, but it has mainly .c files. I also tried looking at the directory where the Python that already comes with the OS (Mac OS X) keeps its modules, and there it seems to have mainly .pyc and .pyo files. I would really appreciate it if someone could help me out.
(I tried what was suggested in the question How do I find the location of Python module sources? with no luck)
In CPython, many modules are implemented in C, and not in Python. You can find those in Modules/, whereas the pure Python ones reside in Lib/.
In some cases (for example the json module), the Python source code provides the module on its own and only uses the C module if it's available (to improve performance). For the remaining modules, you can have a look at PyPy's implementations.
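The optional-accelerator pattern described above can be sketched roughly like this. The module name _hypothetical_speedups and the function sum_squares are invented for the example (the import is expected to fail, so the Python fallback is used). The last lines look at json.encoder.c_make_encoder, which the json package sets to None when its C helper could not be imported.

```python
import json.encoder


def py_sum_squares(values):
    """Pure-Python implementation that always works."""
    return sum(v * v for v in values)


try:
    # Hypothetical C extension; it does not exist, so ImportError is raised.
    from _hypothetical_speedups import sum_squares
except ImportError:
    # Fall back to the pure-Python version under the same public name.
    sum_squares = py_sum_squares

print(sum_squares([1, 2, 3]))

# json follows the same idea: c_make_encoder is None without the C helper.
print("json uses C encoder:", json.encoder.c_make_encoder is not None)
```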
The canonical repository for CPython is this Mercurial repository. There is also a git mirror on GitHub.
That would depend on what you define as Standard Library.
The Python documentation says:
...this library reference manual describes the standard library that is
distributed with Python. It also describes some of the optional
components that are commonly included in Python distributions.
Python’s standard library is very extensive, offering a wide range of
facilities as indicated by the long table of contents listed below.
The library contains built-in modules (written in C) that provide
access to system functionality such as file I/O that would otherwise
be inaccessible to Python programmers, as well as modules written in
Python that provide standardized solutions for many problems that
occur in everyday programming. Some of these modules are explicitly
designed to encourage and enhance the portability of Python programs
by abstracting away platform-specifics into platform-neutral APIs.
If you apply broad criteria, the Python documentation explicitly answers what you're asking for, and I quote:
Exploring CPython’s Internals.
CPython Source Code Layout.
This guide gives an overview of CPython’s code structure. It serves as a summary of file locations for modules and builtins.
For Python modules, the typical layout is:
Lib/<module>.py
Modules/_<module>.c
Lib/test/test_<module>.py
Doc/library/<module>.rst
For extension-only modules, the typical layout is:
Modules/<module>module.c
Lib/test/test_<module>.py
Doc/library/<module>.rst
For builtin types, the typical layout is:
Objects/<builtin>object.c
Lib/test/test_<builtin>.py
Doc/library/stdtypes.rst
For builtin functions, the typical layout is:
Python/bltinmodule.c
Lib/test/test_builtin.py
Doc/library/functions.rst
Some exceptions:
builtin type int is at Objects/longobject.c
builtin type str is at Objects/unicodeobject.c
builtin module sys is at Python/sysmodule.c
builtin module marshal is at Python/marshal.c
Windows-only module winreg is at PC/winreg.c
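As a rough illustration of the "typical layout" for Python modules quoted above, the following sketch builds the candidate paths for a module name. The function candidate_paths and its dictionary keys are my own naming; the paths only follow the typical pattern, so a given module may not have all four files (or may be one of the exceptions listed).

```python
def candidate_paths(module):
    """Return the typical CPython source-tree paths for a Python module.

    Hypothetical helper: it only formats the layout quoted from the guide
    and does not check that any of the files actually exist.
    """
    return {
        "python_source": f"Lib/{module}.py",
        "c_accelerator": f"Modules/_{module}.c",
        "tests": f"Lib/test/test_{module}.py",
        "docs": f"Doc/library/{module}.rst",
    }


for kind, path in candidate_paths("heapq").items():
    print(kind, path)
```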
You can get the source code of the pure-Python modules that are part of the
standard library from the location where Python is installed.
For example, at C:\Python27\Lib (on Windows) if you
used the Windows installer to install Python.
Look it up under the Lib sub-directory of the Python installation directory.
The source code for many standard library packages is linked at the top of the package's documentation page in the library documentation, for example, the docs for the random module.
The original commit message for adding these links states
Provide links to Python source where the code is short, readable and
informative adjunct to the docs.
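If you just want to read one function without opening the file or the docs page, inspect.getsource can print it from your own installation. This is a small sketch that assumes the object is implemented in Python (it does not work for functions written in C); random.Random.choice is used only as a convenient example.

```python
import inspect
import random

# getsource returns the source text of a Python-level function or class.
source = inspect.getsource(random.Random.choice)
print(source)
```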
Many questions on Stack Overflow refer to "Pure Python" (some random examples from the "similar questions" list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).
I also encounter the concept elsewhere on the web, e.g. in the package documentation for imageio and in tutorials such as "An introduction to Pure Python".
This has led me to believe there must be some universally accepted standard definition of what "Pure Python" is.
However, despite googling to the limits of my ability, I have not yet been able to locate this definition.
Is there a universally accepted definition of "Pure Python," or is this just some elusive concept that means different things to different people?
To be clear, I am asking: Does such a definition exist, yes or no, and if so, what is the acclaimed source? Although I truly appreciate all comments and answers, I am not looking for personal interpretations.
In that imageio package, they mean it's all implemented in Python, and not (as is sometimes done) with parts written in C or other languages. As a result it's guaranteed to work on any system that Python works on.
In that tutorial, it means the Python you get when you download and install Python -- the language and the standard libraries, not any external modules. The chapter after that adds some external libraries, like numpy and scipy, that are used a lot but aren't part of the standard library.
So they mean different things there already.
A "pure-Python" package is a package that only contains Python code, and doesn't include, say, C extensions or code in other languages. You only need a Python interpreter and the Python Standard Library to run a pure-Python package, and it doesn't matter what your OS or platform is.
Pure-Python packages can import or depend on non-pure-Python packages:
Package X contains only Python code and is a pure-Python package.
Package Y contains Python and C code and isn't a pure-Python package.
Package Z imports Package Y, but Package Z is still a pure-Python package.
A good rule of thumb: If you can make a source distribution ("sdist") of your package and it doesn't include any non-Python code, it is a pure-Python package.
Pure-Python packages aren't restricted to just the Python Standard Library; packages can import modules from outside the Python Standard Library and still be considered pure-Python.
Additionally, a standalone module is a single .py file that only imports modules from the Python Standard Library. A standalone module is necessarily a pure-Python module.
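To see which side of that line a single installed module falls on, here is a minimal sketch using importlib. The function describe_module and its return strings are invented for this example. It only looks at how one module is loaded, so it says nothing about the rest of the package or its dependencies, and a few standard library modules may be reported as "other" because they are frozen into the interpreter.

```python
import importlib.util
from importlib.machinery import ExtensionFileLoader, SourceFileLoader


def describe_module(name):
    """Classify how a module is implemented, based on its import spec.

    Hypothetical helper; it inspects a single module, not a whole package.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    if spec.origin == "built-in":
        return "built-in"  # compiled into the interpreter
    if isinstance(spec.loader, ExtensionFileLoader):
        return "extension"  # compiled shared library (.so / .pyd)
    if isinstance(spec.loader, SourceFileLoader):
        return "python source"  # loaded from a .py file
    return "other"


for name in ("json", "sys", "math"):
    print(name, describe_module(name))
```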
Note that in Python, package technically refers to a folder with an __init__.py file in it. The things you download and install from PyPI with pip are distributions (such as "source distribution" or "sdist"), though the term "package" is also used as a synonym with "distribution", since that term could be confused with the "Linux distro" usage of the word.
Is there an official definition for "pure-Python"? As of this writing, no, though the Python Packaging User Guide makes heavy use of the term in https://packaging.python.org/overview/
Unfortunately, it seems there is no standardized, formalized definition.
Having programmed in Python for almost two decades, my definition of pure Python is: a Python package that implements its core logic only in Python statements that require only pure-Python or native packages. This is a recursive definition, so at the far end of your package's dependency tree you end up with Python packages that require only native Python libraries and functions. With this approach, the whole chain of code logic that accomplishes the tool's main objective can be read and modified using only Python, with no other programming language or tool besides the CPython interpreter.
This can't be overstated: "pure Python" is better understood as an objective (being entirely readable and modifiable in the Python language) than as a state. It's very similar to what the sibling language Julia is trying to do, but Julia generalizes the approach further down the stack, with much of its own implementation written in Julia. You can say that Julia is "pure" by design, whereas CPython is not (because the CPython interpreter is written in C), but you can still write "pure Python" packages, just like you can write "pure PHP" or "pure Ruby" packages that do not require any package written in another language at any point in the program's logic.
I was looking for something similar to perl's Dumper functionality in python. So after googling I found one which serves me well # https://gist.github.com/1071857#file_dumper.pyamazon
So I downloaded and installed it and it works fine.
But then I came across PyPI: http://pypi.python.org/pypi which looks like the CPAN equivalent for Python.
So I searched for the Dumper module there and could not find it. It seems like a very basic module, so I was expecting it to be listed on PyPI.
So my question is: if I have to install a Python module, should I search PyPI first and, if I do not find it there, look in other places via Google?
OR is there any other Python Module repository apart from PyPI?
I am learning python and hence this question.
thanks.
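If you are using pip, pip search package_name was designed to do the same thing as searching on the web interface provided by PyPI. Note that PyPI has since disabled the API this command relies on, so with current versions the command fails and you need to use the web search instead.
Once located, installing a Python package is of course as easy as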
pip install package_name
Some Python libraries may still be in development and not directly available on PyPI, or you may want a specific commit hash (git) of a library. If you can find the library's source on GitHub or Bitbucket, for example, you can do
pip install -e git+git://github.com/the/repo/url.git#egg=package_name
And regarding your question about perl Dumper, perl's Dumper has two main uses iirc -
data persistence
debugging and inspecting objects.
As far as I know, there's no exact equivalent of Perl's Dumper in Python.
However, I use pickle for data persistence.
And pprint is useful for visually inspecting objects while debugging.
Both are standard library modules that ship with Python, so no third-party library is needed for this functionality.
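A quick sketch of both uses, with a made-up record dictionary as sample data: pickle.dumps and pickle.loads round-trip the object through bytes, and pprint.pformat produces the readable text form. Keep in mind that pickle should only be used to load data you trust.

```python
import pickle
import pprint

# Sample data invented for the example.
record = {"name": "example", "tags": ["a", "b"], "nested": {"count": 3}}

# Persistence: serialize to bytes and restore an equal object.
data = pickle.dumps(record)
restored = pickle.loads(data)

# Inspection: build a readable, sorted text representation.
text = pprint.pformat(restored)
print(text)
```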
If you want to use what is here - https://gist.github.com/1071857#file_dumper.pyamazon - copy the code and place it in a local file in your project directory. You can name the file something like pydumper.py, or any name you prefer, as long as it ends with the suffix .py.
In your project, you can import the functions and classes defined in pydumper.py by doing
from pydumper import *
or, if you want to be specific (which is preferred, since it's better to be explicit about what you are importing):
from pydumper import Dumper
and you can start using the Dumper class in your own code.
Are you looking for something like easy_install from setuptools? I might have misunderstood your question as I don't use perl.
From the Scripts directory in the python installation directory ("c:/python27/Scripts" on my machine), you can install modules from the command line like so:
easy_install modulename
It makes life a lot easier if you add the Scripts directory to your PATH variable.