Base language of Python - python

What is the base language Python is written in?

You can't say that Python is written in some programming language, since Python as a language is just a set of rules (like syntax rules, or descriptions of standard functionality). So we might say that it is written in English :). However, those rules can be implemented in some programming language, and the resulting program is called an interpreter. If you send a string like 'import this' to an interpreter, it prints "The Zen of Python".
Since most modern operating systems are written in C, many compilers and interpreters for modern high-level languages are also written in C. Python is no exception: its most popular, "traditional" implementation is called CPython and is written in C.
There are other implementations:
IronPython (Python running on .NET)
Jython (Python running on the Java Virtual Machine)
PyPy (A fast python implementation with a JIT compiler)
Stackless Python (Branch of CPython supporting microthreads)
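To check which of these implementations is actually running a script, the standard library can be asked directly. The following is a minimal sketch; the helper name describe_interpreter is made up for illustration, while platform.python_implementation() and sys.implementation are standard library features.

```python
import platform
import sys


def describe_interpreter():
    """Return the implementation name and version of the running interpreter."""
    # Documented return values include "CPython", "PyPy", "Jython", "IronPython".
    name = platform.python_implementation()
    version = platform.python_version()
    return "{} {}".format(name, version)


print(describe_interpreter())
# Lowercase identifier of the implementation (Python 3.3 and later).
print(sys.implementation.name)
```

Running the same file under different interpreters should print different names, which is a quick way to confirm what is installed.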

The sources are public. Python is written in C (actually the default implementation is called CPython).

Python is written in English. But there are several implementations:
PyPy (written in Python)
CPython (written in C)
IronPython (written in C#)
Jython (written in Java)

It is written in C; that implementation is also called CPython.

You get a good idea if you compile Python from source. Usually it's gcc that compiles the *.c files.

To add to and reframe some of the other good answers:
The specification for Python (question) is written in English, but could be written in a formal semantics, as Standard ML and Scheme are. See Programming language specification.
There are implementations of Python in many languages, as noted in Gandaro's answer.
Updated thanks to the inspiration of @TylerH:
As an aside, when the performance of a compute-intensive application is important, the fastest implementation is often not the standard CPython, which is written in C. For example, in many cases PyPy or Cython (both written in Python) may be faster.

Related

What will I not learn about Python 2.7 and 3.x, if I start with Jython?

If I learn Jython first, does that mean I will also be learning all of the Python language? When I say all, I mean all the basic constructs of the language, nothing else. What will I not learn about Python or CPython, if I start with Jython? Thanks.
There is no drawback in learning Jython: it is a conformant implementation of Python 2's syntax, and the differences from Python 3 are just the ones you will find documented everywhere.
I don't know where Jython stands in terms of implementing Python's stdlib, but I believe it has most of Python 2.7's stdlib in place and working. Some modules won't work, like "ctypes" for example. But as far as the language constructs go, you will be fine.
(IMO it is a good tool, not only for what you want, but also for exploring Java's libraries themselves in an interactive way, since you can use any Java class from the Jython interactive shell.)
As for the comments talking about unavailable modules: those are 3rd party modules installable on CPython. You certainly don't need them to learn the language constructs, which is what you want. It is a trade-off: you lose a lot of the Python ecosystem, but you can use the Java ecosystem in its place. And certainly, when starting a new project, you can just use normal CPython with whatever modules you need: the language is the same.

What does it mean when people say CPython is written in C?

From what I know, CPython programs are compiled into intermediate bytecode, which is executed by the virtual machine. Then how does one identify, without knowing beforehand, that CPython is written in C? Isn't there some common DNA for both which can be matched to identify this?
The interpreter is written in C.
It compiles Python code into bytecode, and then an evaluation loop interprets that bytecode to run your code.
You identify what Python is written in by looking at its source code. See the source for the evaluation loop for example.
Note that the Python.org implementation is but one Python implementation. We call it CPython, because it is implemented in C. There are other implementations too, written in other languages. Jython is written in Java, IronPython in C#, and then there is PyPy, which is written in a (subset of) Python, and runs many tasks faster than CPython.
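The compile-to-bytecode step described above can be observed from within CPython itself. This is a small sketch using the standard dis module; the function add is just an illustrative example, and the exact opcodes printed differ between CPython versions, so no listing is shown here.

```python
import dis


def add(a, b):
    return a + b


# The compiled bytecode lives on the function's code object.
code = add.__code__
print(type(code.co_code))  # the raw bytecode is a bytes object

# Human-readable listing of the instructions the evaluation loop will execute.
dis.dis(add)
```

The bytecode format is an implementation detail of CPython; other implementations such as Jython compile to a different target.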
Python isn't written in C. Arguably, Python is written in an esoteric English dialect using BNF.
However, all the following statements are true:
Python is a language, consisting of a language specification and a bunch of standard modules
Python source code is compiled to a bytecode representation
this bytecode could in principle be executed directly by a suitably-designed processor but I'm not aware of one actually existing
in the absence of a processor that natively understands the bytecode, some other program must be used to translate the bytecode to something a hardware processor can understand
one real implementation of this runtime facility is CPython
CPython is itself written in C, but ...
C is a language, consisting of a language specification and a bunch of standard libraries
C source code is compiled to some bytecode format (typically something platform-specific)
this platform specific format is typically the native instruction set of some processor (in which case it may be called "object code" or "machine code")
this native bytecode doesn't retain any magical C-ness: it is just instructions. It doesn't make any difference to the processor which language the bytecode was compiled from
so the CPython executable which translates your Python bytecode is a sequence of instructions executing directly on your processor
so you have: Python bytecode being interpreted by machine code being interpreted by the hardware processor
Jython is another implementation of the same Python runtime facility
Jython is written in Java, but ...
Java is a language, consisting of a spec, standard libraries etc. etc.
Java source code is compiled to a different bytecode
Java bytecode is also executable either on suitable hardware, or by some runtime facility
The Java runtime environment which provides this facility may also be written in C
so you have: Python bytecode being interpreted by Java bytecode being interpreted by machine code being interpreted by the hardware processor
You can add more layers indefinitely: consider that your "hardware processor" may really be a software emulation, or that hardware processors may have a front-end that decodes their "native" instruction set into another internal bytecode.
All of these layers are defined by what they do (executing or interpreting instructions according to some specification), not how they implement it.
Oh, and I skipped over the compilation step. The C compiler is typically written in C (and getting any language to the stage where it can compile itself is traditionally significant), but it could just as well be written in Python or Java. Again, the compiler is defined by what it does (transforms some source language to some output such as a bytecode, according to the language spec), rather than how it is implemented.
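To make the idea of "bytecode being interpreted by some other program" concrete, here is a toy stack machine. It is only a sketch under invented assumptions: the opcodes PUSH, ADD and MUL and the function run are made up for illustration and are not CPython's real instruction set or evaluation loop.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on a small operand stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: {!r}".format(opcode))
    return stack.pop()


# Hypothetical "bytecode" for the expression (2 + 3) * 4.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program))  # prints 20
```

The loop is defined by what it does to the instruction list, so the same program could be executed by an equivalent loop written in C or Java, which is the layering the answer describes.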
I found a good understanding of my original doubt here:
http://amitsaha.github.io/site/notes/articles/c_python_compiler_interpreter.html

Question about python construction

A friend of mine who is a programmer told me that "Python is written in Python" or something like that. He meant that the Python interpreter is written in Python (I think). I've read on some websites that Python interprets ANY programming language in real time (even C++ and ASM). Is this true?
Could someone explain to me HOW THIS COULD BE?
The only explanation that I came up with after thinking a bit is: Python is at the same "level" as ASM, so it makes sense for Python to interpret any language (that is at a higher level). Am I right? Does this make sense?
I would be grateful if someone could explain a little about it.
Thank you
It's not true. The standard implementation of Python - CPython - is written in C, although much of the standard library is written in Python. There are other implementations in Java (Jython) and .NET (IronPython).
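The split between C parts and Python parts can be seen by inspecting where a module comes from. The sketch below assumes a regular CPython installation; the helper module_kind is made up for illustration, and the classification by file name is a heuristic, not an official API.

```python
import importlib


def module_kind(name):
    """Roughly classify a module by where its code comes from."""
    module = importlib.import_module(name)
    path = getattr(module, "__file__", None)
    if path is None:
        # No file on disk: usually compiled into the interpreter itself.
        return "built-in"
    if path.endswith((".py", ".pyc")):
        # Implemented in Python source code.
        return "python"
    # Anything else is typically a compiled extension (.so, .pyd, ...).
    return "extension"


for name in ("sys", "json"):
    print(name, module_kind(name))
```

On CPython, sys is part of the interpreter binary while json is ordinary Python source, which matches the statement that much of the standard library is written in Python.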
There is a project called PyPy which, among other things, is rewriting the C parts of Python into Python. But the main development of Python is still based on C.
Your friend told you that Python is self-hosting:
The term self-hosting was coined to refer to the use of a computer program as part of the toolchain or operating system that produces new versions of that same program—for example, a compiler that can compile its own source code. Self-hosting software is commonplace on personal computers and larger systems. Other programs that are typically self-hosting include kernels, assemblers, shells and revision control software.
Of course, the very first revision of Python had to be bootstrapped by some other mechanism -- perhaps C or C++ as these are fairly standard targets for lexers and parser generators.
Generally, when someone says language X is written in X, they mean that first a compiler or interpreter for X was written in assembly or other such language, compiled, and then a better compiler or interpreter was written in X.
Additionally, once a very basic compiler/interpreter for X exists, it is sometimes easier to add new language features, classes, etc. to X by writing them in X than to extend the compiler/interpreter itself.
Python is written in C (CPython) as well as Python.
Read about pypy -- that's Python written in Python.
Writing Python in Python is a two-step dance.
Write Python in some other language. C, Java, assembler, COBOL, whatever.
Once you have a working implementation of Python (i.e., passes all the tests) you can then write Python in Python.
When you read about pypy, you'll see that they do something a hair more sophisticated than this. "We are using a subset of the high-level language Python, called RPython, in which we write languages as simple interpreters with few references to and dependencies on lower level details."
So they started with a working Python and then broke the run-time into this RPython kernel which is the smallest nugget of Python goodness. Then they built the rest of Python around the RPython kernel.

Is IronPython a 100% pure Python variant?

I just downloaded the original Python interpreter from Python's site. I just want to learn this language but to start with, I want to write Windows-based standalone applications that are powered by any RDBMS. I want to bundle it like any typical Windows setup.
I searched old posts on SO and found people suggesting wxPython and py2exe. Apart from that, a few suggested IronPython since it is powered by .NET.
I want to know whether IronPython is a pure variant of Python or a modified variant. Secondly, what is the actual use of Python? Is it for web scripting like PHP, or is it like C# (where you can program either Windows-based or web applications)?
IronPython isn't a variant of Python, it is Python. It's an implementation of the Python language based on the .NET framework. So, yes, it is pure Python.
IronPython is caught up to CPython (the implementation you're probably used to) 2.6, so some of the features/changes seen in Python 2.7 or 3.x will not be present in IronPython. Also, the standard library is a bit different (but what you lose is replaced by all that .NET has to offer).
The primary application of IronPython is to script .NET applications written in C# etc., but it can also be used standalone. IronPython can also be used to write web applications using the Silverlight framework.
If you need access to .NET features, use IronPython. If you're just trying to make a Windows executable, use py2exe.
Update
For writing basic RDBMS apps, just use CPython (original Python), it's more extensible and faster. Then, you can use a number of tools to make it stand alone on a Windows PC. For now, though, just worry about learning Python (those skills will mostly carry over to IronPython if you choose to switch) and writing your application.
IronPython is an independent Python implementation written in C# as opposed to the original implementation, often referred to as CPython due to it being written in (no surprise) C.
Python is multi-purpose - you can use it to write web apps (often using a framework such as Django or Pylons), GUI apps (as you've mentioned), command-line tools and as a scripting language embedded inside an app written in another language (for instance, the 3D modelling tool Blender can be scripted using Python).
What does "pure Python" mean? If you mean implemented in Python, in the same sense that a module may be pure Python, then no, and no Python implementation is. If you mean "compatible with CPython", then yes: code written for CPython will work in IronPython, with a few caveats. The one that's likely to matter most is that the libraries are different; for instance, code depending on ctypes or Tkinter won't work. Another difference is that IronPython lags behind CPython by a bit. The latest version as of this writing is 2.6.1, with an alpha version supporting a few of the 2.7 language features available too.
What do you really need? If you want to learn to program with Python, and also want to produce code for Windows, you can use IronPython for that, but you can also use CPython and py2exe; both will work equally well for this, with differences only in the libraries.
IronPython is an implementation of Python using C#. It's just like the implementation of Python using Java by Jython. You might want to note that IronPython and Jython will always lag behind a little bit in development. However, you do get the benefit of having some libraries that are not available in the standard Python libraries. In IronPython, you will be able to get access to some of the .NET stuff, like System.Drawing and such, though by using these non-standard libraries, it will be harder to port your code to other platforms. For example, you will have to install Mono to run apps written in IronPython on Linux (on Windows you will need the .NET Framework).

PyPy -- How can it possibly beat CPython?

From the Google Open Source Blog:
PyPy is a reimplementation of Python in Python, using advanced techniques to try to attain better performance than CPython. Many years of hard work have finally paid off. Our speed results often beat CPython, ranging from being slightly slower, to speedups of up to 2x on real application code, to speedups of up to 10x on small benchmarks.
How is this possible? Which Python implementation was used to implement PyPy? CPython? And what are the chances of a PyPyPy or PyPyPyPy beating their score?
(On a related note... why would anyone try something like this?)
"PyPy is a reimplementation of Python in Python" is a rather misleading way to describe PyPy, IMHO, although it's technically true.
There are two major parts of PyPy.
The translation framework
The interpreter
The translation framework is a compiler. It compiles RPython code down to C (or other targets), automatically adding in aspects such as garbage collection and a JIT compiler. It cannot handle arbitrary Python code, only RPython.
RPython is a subset of normal Python; all RPython code is Python code, but not the other way around. There is no formal definition of RPython, because RPython is basically just "the subset of Python that can be translated by PyPy's translation framework". But in order to be translated, RPython code has to be statically typed (the types are inferred, you don't declare them, but it's still strictly one type per variable), and you can't do things like declaring/modifying functions/classes at runtime either.
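The "one type per variable" restriction can be illustrated with two ordinary Python functions. This is only a sketch based on the description above: the function names are invented, and neither function has been run through PyPy's translation framework, so the comments state an expectation rather than a verified result.

```python
def total_length(words):
    # 'total' is an int on every path, so a single type can be inferred for it.
    total = 0
    for word in words:
        total += len(word)
    return total


def describe(value):
    # Legal Python, but 'result' is an int on one path and a str on the other.
    # Under the restriction described above, this would presumably be rejected.
    if value > 0:
        result = value
    else:
        result = "non-positive"
    return result


print(total_length(["ab", "cde"]))
print(describe(3), describe(-1))
```

Both functions run fine on any regular Python interpreter, which is the sense in which all RPython code is Python code but not the other way around.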
The interpreter then is a normal Python interpreter written in RPython.
Because RPython code is normal Python code, you can run it on any Python interpreter. But none of PyPy's speed claims come from running it that way; this is just for a rapid test cycle, because translating the interpreter takes a long time.
With that understood, it should be immediately obvious that speculations about PyPyPy or PyPyPyPy don't actually make any sense. You have an interpreter written in RPython. You translate it to C code that executes Python quickly. There the process stops; there's no more RPython to speed up by processing it again.
So "How is it possible for PyPy to be faster than CPython" also becomes fairly obvious. PyPy has a better implementation, including a JIT compiler (it's generally not quite as fast without the JIT compiler, I believe, which means PyPy is only faster for programs susceptible to JIT-compilation). CPython was never designed to be a highly optimising implementation of the Python language (though they do try to make it a highly optimised implementation, if you follow the difference).
The really innovative bit of the PyPy project is that they don't write sophisticated GC schemes or JIT compilers by hand. They write the interpreter relatively straightforwardly in RPython, and for all RPython is lower level than Python it's still an object-oriented garbage collected language, much more high level than C. Then the translation framework automatically adds things like GC and JIT. So the translation framework is a huge effort, but it applies equally well to the PyPy python interpreter however they change their implementation, allowing for much more freedom in experimentation to improve performance (without worrying about introducing GC bugs or updating the JIT compiler to cope with the changes). It also means when they get around to implementing a Python3 interpreter, it will automatically get the same benefits. And any other interpreters written with the PyPy framework (of which there are a number at varying stages of polish). And all interpreters using the PyPy framework automatically support all platforms supported by the framework.
So the true benefit of the PyPy project is to separate out (as much as possible) all the parts of implementing an efficient platform-independent interpreter for a dynamic language. And then come up with one good implementation of them in one place, that can be re-used across many interpreters. That's not an immediate win like "my Python program runs faster now", but it's a great prospect for the future.
And it can run your Python program faster (maybe).
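Whether a particular program actually benefits can only be settled by measuring it. The following is a minimal timing sketch using the standard timeit module; the workload sum_of_squares and the iteration counts are arbitrary choices for illustration, and no timings are claimed here.

```python
import platform
import timeit


def sum_of_squares(n):
    """A small CPU-bound loop of the kind a JIT compiler tends to speed up."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Time 20 calls of the workload and label the result with the implementation.
elapsed = timeit.timeit(lambda: sum_of_squares(100000), number=20)
print("{}: {:.3f}s".format(platform.python_implementation(), elapsed))
```

Running the same file under CPython and under PyPy gives two numbers that can be compared directly on the same machine.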
Q1. How is this possible?
Manual memory management (which is what CPython does with its reference counting) can be slower than automatic management in some cases.
Limitations in the implementation of the CPython interpreter preclude certain optimisations that PyPy can do (eg. fine grained locks).
As Marcelo mentioned, the JIT. Being able to on the fly confirm the type of an object can save you the need to do multiple pointer dereferences to finally arrive at the method you want to call.
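The reference counting mentioned in the first point can be observed on CPython with sys.getrefcount. This is a CPython-specific sketch; the helper refcount_delta is made up for illustration, and other implementations may not provide sys.getrefcount at all.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count grows when stored once more."""
    obj = []
    holders = []
    before = sys.getrefcount(obj)
    holders.append(obj)  # one additional reference to the same list
    after = sys.getrefcount(obj)
    return after - before


print(refcount_delta())
```

Every such assignment or container insertion updates a counter on the object, which is the bookkeeping cost that a tracing garbage collector avoids.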
Q2. Which Python implementation was used to implement PyPy?
The PyPy interpreter is implemented in RPython which is a statically typed subset of Python (the language and not the CPython interpreter). - Refer https://pypy.readthedocs.org/en/latest/architecture.html for details.
Q3. And what are the chances of a PyPyPy or PyPyPyPy beating their score?
That would depend on the implementation of these hypothetical interpreters. If one of them, for example, took the source, did some kind of analysis on it and converted it directly into tight target-specific assembly code after running for a while, I imagine it would be quite a bit faster than CPython.
Update: Recently, on a carefully crafted example, PyPy outperformed a similar C program compiled with gcc -O3. It's a contrived case but does exhibit some ideas.
Q4. Why would anyone try something like this?
From the official site. https://pypy.readthedocs.org/en/latest/architecture.html#mission-statement
We aim to provide:
a common translation and support framework for producing implementations of dynamic languages, emphasizing a clean separation between language specification and implementation aspects. We call this the RPython toolchain.
a compliant, flexible and fast implementation of the Python Language which uses the above toolchain to enable new advanced high-level features without having to encode the low-level details.
By separating concerns in this way, our implementation of Python - and other dynamic languages - is able to automatically generate a Just-in-Time compiler for any dynamic language. It also allows a mix-and-match approach to implementation decisions, including many that have historically been outside of a user's control, such as target platform, memory and threading models, garbage collection strategies, and optimizations applied, including whether or not to have a JIT in the first place.
The C compiler gcc was originally implemented in C, and the Haskell compiler GHC is written in Haskell. Do you have any reason for the Python interpreter/compiler to not be written in Python?
PyPy is implemented in Python, but it implements a JIT compiler to generate native code on the fly.
The reason to implement PyPy on top of Python is probably that it is simply a very productive language, especially since the JIT compiler makes the host language's performance somewhat irrelevant.
PyPy is written in Restricted Python. It does not run on top of the CPython interpreter, as far as I know. Restricted Python is a subset of the Python language. AFAIK, the PyPy interpreter is compiled to machine code, so when installed it does not utilize a python interpreter at runtime.
Your question seems to expect the PyPy interpreter is running on top of CPython while executing code.
Edit: Yes, to use PyPy you first translate the PyPy Python code, either to C (built with gcc), to JVM bytecode, or to .NET CLI code. See Getting Started.
